Unoffical empeg BBS

#347389 - 14/09/2011 14:36 Mac OS shell scripting - search and replace within multiple files
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
The biggest caveat in what I'm trying to do is that I have to be able to use the command line tools that are built into Mac OS X by default. This is because I want to include a script as part of an install script and would prefer not to install any extra tools, even if only temporarily.

So far it looks like the Mac OS version of sed doesn't handle replacing newlines, which has thrown a monkey wrench into this.

Here's what I want to do...

Search through an arbitrary number of files in a specific folder which contain XML content, for a specific block of text which includes tabs and newlines.

Then I want to replace that block with a new block of text, which also includes tabs and newlines.

Sample of the text I'm searching for:

Code:
			<key>Action Data</key>
			<dict>
				<key>Display Name</key>
				<string>My App</string>
				<key>Should Repeat</key>
				<false/>
				<key>Type</key>
				<integer>2</integer>
			</dict>




I thought about using AppleScript's support for plists instead, but that seems to require you to specify the exact keys to write under. This data can exist anywhere, under any parent key, so that's a dead end. A more straightforward search and replace is what I need.

I've found a simple example of using sed for text without tabs or newlines, but I don't know how to make it work in Mac OS X because my tests of matching on a newline followed by tabs have been unsuccessful.

Reference:

http://hintsforums.macworld.com/archive/index.php/t-62778.html

So based on a comment in that thread about sed and newlines, I'm trying to use perl instead. And I can easily find tabs and I can even find a newline. But when I search for a newline followed by tabs, I'm not getting anything.

This works

Code:
perl -pe 's!\t\t\t\t<string>My App</string>\n!TEST REPLACEMENT!g'


And this doesn't (it has a tab after the newline, as I'm trying to get the first character from the next line...)

Code:
perl -pe 's!\t\t\t\t<string>My App</string>\n\t!TEST REPLACEMENT!g'



Edited by hybrid8 (14/09/2011 14:52)
_________________________
Bruno
Twisted Melon : Fine Mac OS Software

#347390 - 14/09/2011 14:52 Re: Mac OS shell scripting - search and replace within multiple files [Re: hybrid8]
Daria
carpal tunnel

Registered: 24/01/2002
Posts: 3937
Loc: Providence, RI
[scully:/tmp] root# which perl
/usr/bin/perl
[scully:/tmp] root# uname -a
Darwin scully.local 11.1.0 Darwin Kernel Version 11.1.0: Tue Jul 26 16:07:11 PDT 2011; root:xnu-1699.22.81~1/RELEASE_X86_64 x86_64

#347391 - 14/09/2011 14:54 Re: Mac OS shell scripting - search and replace within multiple files [Re: Daria]
Daria
carpal tunnel

Registered: 24/01/2002
Posts: 3937
Loc: Providence, RI
Cheater, you edited while I was posting. ;)

#347392 - 14/09/2011 15:17 Re: Mac OS shell scripting - search and replace within multiple files [Re: Daria]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
The problem is now that I can't match anything after a newline with perl. It's really puzzling me.

This means no tabs after newlines, no double newlines, etc.

ex: I can find \n but not \n\n.


Edited by hybrid8 (14/09/2011 15:20)
#347393 - 14/09/2011 15:34 Re: Mac OS shell scripting - search and replace within multiple files [Re: hybrid8]
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
Use
Code:
perl -0777 -pe


The problem is that -pe goes line by line, and "lines" are terminated by, well, newlines. The specifics of how the -0 command-line option works are rather arcane (unsurprising when dealing with Perl), but the net effect is that you eat the whole file before trying to match.

Alternatively, setting
Code:
$/=undef;

before your pattern matching code will have the same effect (setting the record separator to consume the whole input stream.)


Edited by tonyc (14/09/2011 15:38)
_________________________
- Tony C
my empeg stuff

#347395 - 14/09/2011 15:48 Re: Mac OS shell scripting - search and replace within multiple files [Re: tonyc]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
Sweet, works like a charm. Just finishing off the pattern now...

Ok, one last glitch with the completed mashup.

I'm trying to use find to send filenames to that perl command - which *was* working - it worked the first time wonderfully. It modified exactly one file, the only file that contained the matching text.

But the issue was that I was also getting a ton of output, basically dumping every file to stdout.

Essentially I did


find . | xargs perl -0777 -pe -i 's!something!else!g'

When I tried sending output to /dev/null like this:

1> /dev/null

I then get a ton of errors that indicate it's breaking with the filenames containing spaces.

And now, I can't seem to get the find to work properly at all, even seemingly using the exact same structure I first tried. If I use the -name param for find I get the errors. If I use the same structure as I did initially I see a lot of files change their modified date and the one file that should have been changed, isn't.

The perl portion of the command is the same except for the -i param which can't be used when testing with a piped cat of the target file.

Further, I've seen strange behavior when I change the format of the params to perl.

Example: -pie versus -pe -i versus -p -i -e


Edited by hybrid8 (14/09/2011 16:42)
#347400 - 14/09/2011 17:17 Re: Mac OS shell scripting - search and replace within multiple files [Re: hybrid8]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
Ok, I think I finally got something going.

I'm now using grep instead to pass filenames to perl, since the perl command can take filenames at the end.

So simply using the following brings me back the filenames I need.

Code:
`grep -rilP '<string>My App</string>\n\t\t\t\t<key>Should.*</key>' *`


However, filenames with spaces will be broken up by bash, so I also had to temporarily re-declare the field separator to remove spaces.

Code:
tmpIFS=$IFS; IFS='\n';perl command `grep -rilP '<string>My App</string>\n\t\t\t\t<key>Should.*</key>' *`;IFS=$tmpIFS


Seems to be working.
#347401 - 14/09/2011 17:34 Re: Mac OS shell scripting - search and replace within multiple files [Re: hybrid8]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
And now for the last problem...

If I specify a path along with grep, that field separator seems to also be matching the character "n" - for example on my user folder it will split the path in two (...brun and o....)

FIXED by replacing the \n newline for the octal representation instead (IFS=$'\012')
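
A minimal sketch of why the first attempt split paths at the letter "n": inside plain single quotes, \n is two characters (a backslash and an "n"), not a newline. The path below is invented, and the sketch uses a literal newline so it also runs in shells without the $'...' quoting that bash provides.

```shell
#!/bin/sh
# Sketch: IFS='\n' is the two characters backslash and "n", not a newline.
path='/Users/bruno/My Files/a.plist'   # hypothetical path with a space and an "n"

count_fields() {   # prints how many words $path splits into under the given IFS
    old_ifs=$IFS
    IFS=$1
    set -- $path   # unquoted on purpose so the shell field-splits it
    IFS=$old_ifs
    echo $#
}

newline='
'
# In bash the same value can be written as $'\012' or $'\n'.
wrong=$(count_fields '\n')         # splits at the "n" in "bruno"
right=$(count_fields "$newline")   # the path has no newline, so it stays whole

printf 'backslash-n IFS: %s fields; newline IFS: %s field\n' "$wrong" "$right"
```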


Edited by hybrid8 (14/09/2011 17:37)
#347403 - 14/09/2011 18:05 Re: Mac OS shell scripting - search and replace within multiple files [Re: hybrid8]
wfaulk
carpal tunnel

Registered: 25/12/2000
Posts: 16706
Loc: Raleigh, NC US
The proper (modern) solution to the find/xargs/spaces problem is "find -print0 | xargs -0".

That said, if you're already using perl, why not just have perl find the files?

This is why commercial Unix software installations suck, BTW. No one packaging it has any idea what they're doing.
_________________________
Bitt Faulk

#347404 - 14/09/2011 18:07 Re: Mac OS shell scripting - search and replace within multiple files [Re: hybrid8]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
And now for the actual last problem. ;)

It all works if there's something to be done, but if there are no matching files, the perl command will just sit there waiting.

To simplify I verified that simply issuing the perl command without a filepattern at the end will do the same thing.

Can anyone think of a way around this? Simply passing in every file instead of only a matching file isn't ideal as all the modified file dates will change.


Edited by hybrid8 (14/09/2011 18:15)
#347405 - 14/09/2011 18:08 Re: Mac OS shell scripting - search and replace within multiple files [Re: wfaulk]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
Originally Posted By: wfaulk
The proper (modern) solution to the find/xargs/spaces problem is "find -print0 | xargs -0".


Which is what I use elsewhere and I did try it here. It didn't work until I finally played with the params for perl (must specify as -p -i -e).

The drawback however is that every file's modified date is changed, not only the files that need to be edited.

It does get around the no files issue - because it tries to pass "." as a file which gets perl to simply output an error.

Originally Posted By: wfaulk
That said, if you're already using perl, why not just have perl find the files?


Because I was trying to avoid creating a brand new script and just wanted to insert the command I'm creating into an existing bash script. If I have to, I suppose I could learn enough Perl to do it all and then execute the perl script from the bash script. Or the installer might be able to execute it on its own after the bash script (something I'd have to look into).


Edited by hybrid8 (14/09/2011 18:42)
#347411 - 14/09/2011 19:06 Re: Mac OS shell scripting - search and replace within multiple files [Re: hybrid8]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
Instead of doing this all as a one-liner I just split it off and added a bit of logic. Now I only call perl if grep finds matching files.

This solves the problem. Finally. I probably should have just attacked this from the script to begin with instead of spending all the initial time testing from the command line.
#355055 - 21/09/2012 16:12 Re: Mac OS shell scripting - search and replace within multiple files [Re: hybrid8]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
Necropost...

Now using Mac OS X 10.8 this fails. Specifically, I know the grep fails, because of the "-P" parameter. I've found mention that 10.8 is using BSD grep instead of GNU grep and therefore lacks -P support.

So I can no longer do this:

Quote:

`grep -rilP '<string>My App</string>\n\t\t\t\t<key>Should.*</key>' *`


How can I get around this and be able to search for newlines and tabs as part of the string?



Edited by hybrid8 (21/09/2012 16:19)
#355059 - 21/09/2012 17:41 Re: Mac OS shell scripting - search and replace within multiple files [Re: hybrid8]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
Ripping my fucking hair out. For a number of instances I don't need regex at all, and that's where I've been concentrating some energy over the past little while, being able to use grep on a pretty simple string.

So get this...

Code:
grep -il '<string>' $HOME/Library/Preferences/*.plist


Brings up barely any matches. It should bring back an enormous list of files since pretty much every plist in that path contains at least one instance of the text "<string>".

If instead I search for some other string which I know exists in one of the files missing from the above, I get a hit. Example:

Code:
grep -il 'Path Finder' $HOME/Library/Preferences/*.plist


The string I searched for there in fact exists as "<string>Path Finder</string>" in the file that I'm not getting in the first grep. Actually in pretty much every one of the 10 or so matches.

Searching for "<string>Path Finder</string>" produces no results at all.

WTF?
#355060 - 21/09/2012 17:44 Re: Mac OS shell scripting - search and replace within multiple files [Re: hybrid8]
tfabris
carpal tunnel

Registered: 20/12/1999
Posts: 31600
Loc: Seattle, WA
Are the greater-than and less-than signs special characters that you need to escape when using grep?
_________________________
Tony Fabris

#355061 - 21/09/2012 17:46 Re: Mac OS shell scripting - search and replace within multiple files [Re: tfabris]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
Not that I know of. But using fgrep or -F to specify fixed string doesn't change anything.

Oh... And it works on a few files when using different text other than "Path Finder" in a sub-directory of the one I'm currently searching.


Edited by hybrid8 (21/09/2012 17:47)
#355062 - 21/09/2012 18:26 Re: Mac OS shell scripting - search and replace within multiple files [Re: hybrid8]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
I've restored the old GNU grep from a backup to a new file and it has the same issues in this OS. Even though I can now use -P with it. :)

If, instead of searching through the Preferences folder, I look through my own app's folder, where I also have a lot of PLIST files, then it matches on all of those without an issue.

If I take one of those files that doesn't come up, load it into a text editor and re-save it, then it will come up in grep's matches. I am not changing the encoding or anything else, and it doesn't matter whether the re-saved file is UTF-8, UTF-8 with BOM, etc. As soon as I cause the system to write to the file again, it no longer shows up in grep for the pattern containing the angle brackets.

Again, WTF.


Edited by hybrid8 (21/09/2012 18:37)
#355063 - 21/09/2012 18:40 Re: Mac OS shell scripting - search and replace within multiple files [Re: hybrid8]
hybrid8
carpal tunnel

Registered: 12/11/2001
Posts: 7738
Loc: Toronto, CANADA
Arrgh. Face-palm.

So I look at the file with a hex editor and it's got binary data in it. But it loads up fine into my text editor, which made me assume it was just a text-based XML file.

This explains why the matching doesn't work, because the bytes I'm looking for don't actually exist as I've typed them.

Seems like binary plist files have been a default in Mac OS for quite a while - at least for the OS services.

Seems like my old code is going to be OK since it works with my own plist files, which are always text. But it doesn't help me solve the main problem I started looking at today, which led me to finding the BSD grep issue.
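
For the binary files, one approach that might work is to check the file signature and, on a Mac, ask plutil for an XML rendering before searching. In this sketch the helper names is_binary_plist and plist_as_xml are made up, and the demo file is only a stand-in that carries the signature bytes, not a real plist.

```shell
#!/bin/sh
# Sketch: tell binary plists from XML ones by their first bytes.
is_binary_plist() {
    # Binary property lists begin with the ASCII signature "bplist".
    [ "$(head -c 6 "$1" 2>/dev/null)" = "bplist" ]
}

plist_as_xml() {
    if is_binary_plist "$1" && command -v plutil >/dev/null 2>&1; then
        plutil -convert xml1 -o - "$1"   # macOS only; prints XML, leaves the file alone
    else
        cat "$1"
    fi
}

bin_demo=$(mktemp "${TMPDIR:-/tmp}/bin.XXXXXX")   # hypothetical stand-in files
xml_demo=$(mktemp "${TMPDIR:-/tmp}/xml.XXXXXX")
printf 'bplist00\001\002' > "$bin_demo"
printf '<plist><string>Path Finder</string></plist>\n' > "$xml_demo"

is_binary_plist "$bin_demo" && bin_kind=binary || bin_kind=text
is_binary_plist "$xml_demo" && xml_kind=binary || xml_kind=text
xml_hit=$(plist_as_xml "$xml_demo" | grep -c '<string>Path Finder</string>')

printf '%s %s %s\n' "$bin_kind" "$xml_kind" "$xml_hit"
rm -f "$bin_demo" "$xml_demo"
```

Searching the output of plist_as_xml instead of the raw file lets the same text pattern apply to both formats, at the cost of one extra process per file.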


Edited by hybrid8 (21/09/2012 18:47)