Mac OS shell scripting - search and replace within multiple files

Posted by: hybrid8

Mac OS shell scripting - search and replace within multiple files - 14/09/2011 14:36

The biggest caveat in what I'm trying to do is that I have to be able to use the command line tools that are built into Mac OS X by default. This is because I want to include a script as part of an install script and would prefer not to install any extra tools, even if only temporarily.

So far it looks like the Mac OS version of sed doesn't handle replacing newlines, which has thrown a monkey wrench into this.

Here's what I want to do...

Search through an arbitrary number of files in a specific folder which contain XML content, for a specific block of text which includes tabs and newlines.

Then I want to replace that block with a new block of text, which also includes tabs and newlines.

Sample of the text I'm searching for:

Code:
			<key>Action Data</key>
			<dict>
				<key>Display Name</key>
				<string>My App</string>
				<key>Should Repeat</key>
				<false/>
				<key>Type</key>
				<integer>2</integer>
			</dict>




I thought about using AppleScript's support for plists instead, but that seems to require you to specify specific keys within which to write - this data can exist anywhere, under any parent key, so that's a dead end. A more straight-forward search and replace is what I need.

I've found a simple example of using sed for text without tabs or newlines, but I don't know how to make it work in Mac OS X because my tests of matching on a newline followed by tabs have been unsuccessful.

Reference:

http://hintsforums.macworld.com/archive/index.php/t-62778.html

So based on a comment in that thread about sed and newlines, I'm trying to use perl instead. And I can easily find tabs and I can even find a newline. But when I search for a newline followed by tabs, I'm not getting anything.

This works

Code:
perl -pe 's!\t\t\t\t<string>My App</string>\n!TEST REPLACEMENT!g'


And this doesn't (it has a tab after the newline, as I'm trying to get the first character from the next line...)

Code:
perl -pe 's!\t\t\t\t<string>My App</string>\n\t!TEST REPLACEMENT!g'

Posted by: Daria

Re: Mac OS shell scripting - search and replace within multiple files - 14/09/2011 14:52

[scully:/tmp] root# which perl
/usr/bin/perl
[scully:/tmp] root# uname -a
Darwin scully.local 11.1.0 Darwin Kernel Version 11.1.0: Tue Jul 26 16:07:11 PDT 2011; root:xnu-1699.22.81~1/RELEASE_X86_64 x86_64
Posted by: Daria

Re: Mac OS shell scripting - search and replace within multiple files - 14/09/2011 14:54

cheater, you edited while i was posting ;)
Posted by: hybrid8

Re: Mac OS shell scripting - search and replace within multiple files - 14/09/2011 15:17

The problem is now that I can't match anything after a newline with perl. It's really puzzling me.

This means no tabs after newlines, no double newlines, etc.

ex: I can find \n but not \n\n.
Posted by: tonyc

Re: Mac OS shell scripting - search and replace within multiple files - 14/09/2011 15:34

Use
Code:
perl -0777 -pe


The problem is -pe is going to go line-by-line, and "lines" are terminated by, well, newlines. The specifics of how the -0 command-line option works are rather arcane (unsurprising when dealing with Perl) but the net effect is you eat the whole file before trying to match.

Alternatively, setting
Code:
$/=undef;

before your pattern matching code will have the same effect (setting the record separator to consume the whole input stream.)
Posted by: hybrid8

Re: Mac OS shell scripting - search and replace within multiple files - 14/09/2011 15:48

Sweet, works like a charm. Just finishing off the pattern now...

Ok, one last glitch with the completed mashup.

I'm trying to use find to send filenames to that perl command - which *was* working - it worked the first time wonderfully. It modified exactly one file, the only file that contained the matching text.

But the issue was that I'm also getting a ton of output, basically dumping every file to stdout.

Essentially I did


find . | xargs perl -0777 -pe -i 's!something!else!g'

When I tried sending output to /dev/null like this:

1> /dev/null

I then get a ton of errors that indicate it's breaking with the filenames containing spaces.

And now, I can't seem to get the find to work properly at all, even seemingly using the exact same structure I first tried. If I use the -name param for find I get the errors. If I use the same structure as I did initially I see a lot of files change their modified date and the one file that should have been changed, isn't.

The perl portion of the command is the same except for the -i param which can't be used when testing with a piped cat of the target file.

Further, I've seen strange behavior when I change the format of the params to perl.

Example: -pie versus -pe -i versus -p -i -e
Posted by: hybrid8

Re: Mac OS shell scripting - search and replace within multiple files - 14/09/2011 17:17

Ok, I think I finally got something going.

I'm now using grep instead to pass filenames to perl, since the perl command can take filenames at the end.

So simply using the following brings me back the filenames I need.

Code:
`grep -rilP '<string>My App</string>\n\t\t\t\t<key>Should.*</key>' *`


However, filenames with spaces will be broken up by bash, so I also had to temporarily re-declare the field separator to remove spaces.

Code:
tmpIFS=$IFS; IFS='\n';perl command `grep -rilP '<string>My App</string>\n\t\t\t\t<key>Should.*</key>' *`;IFS=$tmpIFS


Seems to be working.
Posted by: hybrid8

Re: Mac OS shell scripting - search and replace within multiple files - 14/09/2011 17:34

And now for the last problem...

If I specify a path along with grep, that field separator seems to also be matching the character "n" - for example on my user folder it will split the path in two (...brun and o....)

FIXED by replacing the \n newline for the octal representation instead (IFS=$'\012')
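
For anyone hitting the same thing, a small bash sketch of why the single-quoted form misbehaves. The paths are invented for illustration; the $'\012' form is the one from the fix above.

```shell
# bash: $'...' quoting is not available in every sh.
# Hypothetical list of paths, one per line, one of them containing a space.
path_list=$(printf '%s\n' '/Users/bruno/My File.plist' '/Users/bruno/other.plist')
old_ifs=$IFS

# Single quotes keep the backslash literal, so IFS becomes the two
# characters "\" and "n" and every "n" in a path splits it.
IFS='\n'
set -- $path_list
bad_count=$#

# $'\012' (same as $'\n') is a real newline, so each line stays whole.
IFS=$'\012'
set -- $path_list
good_count=$#

IFS=$old_ifs
echo "fields with IFS='\\n':     $bad_count"
echo "fields with IFS=\$'\\012': $good_count"
```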
Posted by: wfaulk

Re: Mac OS shell scripting - search and replace within multiple files - 14/09/2011 18:05

The proper (modern) solution to the find/xargs/spaces problem is "find -print0 | xargs -0".

That said, if you're already using perl, why not just have perl find the files?

This is why commercial Unix software installations suck, BTW. No one packaging it has any idea what they're doing.
Posted by: hybrid8

Re: Mac OS shell scripting - search and replace within multiple files - 14/09/2011 18:07

And now for the actual last problem. ;)

It all works if there's something to be done, but if there are no matching files, the perl command will just sit there waiting.

To simplify I verified that simply issuing the perl command without a filepattern at the end will do the same thing.

Can anyone think of a way around this? Simply passing in every file instead of only a matching file isn't ideal as all the modified file dates will change.
Posted by: hybrid8

Re: Mac OS shell scripting - search and replace within multiple files - 14/09/2011 18:08

Originally Posted By: wfaulk
The proper (modern) solution to the find/xargs/spaces problem is "find -print0 | xargs -0".


Which is what I use elsewhere and I did try it here. It didn't work until I finally played with the params for perl (must specify as -p -i -e).

The drawback however is that every file's modified date is changed, not only the files that need to be edited.

It does get around the no files issue - because it tries to pass "." as a file which gets perl to simply output an error.

Originally Posted By: wfaulk
That said, if you're already using perl, why not just have perl find the files?


Because I was trying to avoid creating a brand new script and just wanted to insert the command I'm creating into an existing bash script. If I have to, I suppose I could learn enough Perl to do it all and then execute the perl script from the bash script. Or the installer might be able to execute it on its own after the bash script (something I'd have to look into).
Posted by: hybrid8

Re: Mac OS shell scripting - search and replace within multiple files - 14/09/2011 19:06

Instead of doing this all as a one-liner I just split it off and added a bit of logic. Now I only call perl if grep finds matching files.

This solves the problem. Finally. I probably should have just attacked this from the script to begin with instead of spending all the initial time testing from the command line.
Posted by: hybrid8

Re: Mac OS shell scripting - search and replace within multiple files - 21/09/2012 16:12

Necropost...

Now using Mac OS X 10.8 this fails. Specifically, I know the grep fails, because of the "-P" parameter. I've found mention that 10.8 is using BSD grep instead of GNU grep and therefore lacks -P support.

So I can no longer do this:

Quote:

`grep -rilP '<string>My App</string>\n\t\t\t\t<key>Should.*</key>' *`


How can I get around this and be able to search for newlines and tabs as part of the string?

Posted by: hybrid8

Re: Mac OS shell scripting - search and replace within multiple files - 21/09/2012 17:41

Ripping my fucking hair out. For a number of instances I don't need regex at all, and that's where I've been concentrating some energy over the past little while, being able to use grep on a pretty simple string.

So get this...

Code:
grep -il '<string>' $HOME/Library/Preferences/*.plist


Brings up barely any matches. It should bring back an enormous list of files since pretty much every plist in that path contains at least one instance of the text "<string>".

If instead I search for some other string which I know exists in one of the files missing from the above, I get a hit. Example:

Code:
grep -il 'Path Finder' $HOME/Library/Preferences/*.plist


The string I searched for there in fact exists as "<string>Path Finder</string>" in the file that I'm not getting in the first grep. Actually in pretty much every one of the 10 or so matches.

Searching for "<string>Path Finder</string>" produces no results at all.

WTF?
Posted by: tfabris

Re: Mac OS shell scripting - search and replace within multiple files - 21/09/2012 17:44

Are the greater-than and less-than signs a special character that you need to escape-out when using grep?
Posted by: hybrid8

Re: Mac OS shell scripting - search and replace within multiple files - 21/09/2012 17:46

Not that I know of. But using fgrep or -F to specify fixed string doesn't change anything.

Oh... And it works on a few files when using different text other than "Path Finder" in a sub-directory of the one I'm currently searching.
Posted by: hybrid8

Re: Mac OS shell scripting - search and replace within multiple files - 21/09/2012 18:26

I've restored the old GNU grep from a backup to a new file and it has the same issues in this OS. Even though I can now use -P with it. :)

If, instead of searching through the Preferences folder, I look through my own app's folder, where I also have a lot of PLIST files, then it matches on all of those without an issue.

If I take one of those files that doesn't come up, load it into a text editor and re-save it, then it will come up in grep's match. I am not changing the encoding or anything else. It doesn't matter whether the re-saved file is UTF-8, UTF-8 with BOM, etc. As soon as I cause the system to write to the file, then it will no longer show up in grep for the pattern containing the angle brackets.

Again, WTF.
Posted by: hybrid8

Re: Mac OS shell scripting - search and replace within multiple files - 21/09/2012 18:40

Arrgh. Face-palm.

So I look at the file with a hex editor and it's got binary data in it. But it loads up fine into my text editor, which made me assume it was just a text-based XML file.

This explains why the matching doesn't work, because the bytes I'm looking for don't actually exist as I've typed them.

Seems like binary plist files (not XML text at all) have been a default in Mac OS for quite a while - at least for the OS services.

Seems like my old code is going to be OK since it works with my own plist files which are always text. But it doesn't help me solve the main problem I started looking at today which led me to finding the BSD grep issue.
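
A rough way to tell the two kinds of file apart from a script is to check the first bytes, since binary property lists begin with the ASCII text "bplist". The helper name and sample files below are invented for illustration, and the plutil call is left as a comment because it only exists on macOS.

```shell
# Hypothetical helper: succeeds if the file starts with the binary plist magic.
is_binary_plist() {
    [ "$(head -c 6 "$1")" = "bplist" ]
}

tmpdir=$(mktemp -d)
printf 'bplist00\001\002' > "$tmpdir/binary.plist"   # fake header only, not a valid plist
printf '<?xml version="1.0"?>\n<plist/>\n' > "$tmpdir/text.plist"

for f in "$tmpdir"/*.plist; do
    if is_binary_plist "$f"; then
        echo "binary: ${f##*/}"
        # On macOS the file could be converted to XML text before searching:
        #   plutil -convert xml1 "$f"
    else
        echo "text:   ${f##*/}"
    fi
done

if is_binary_plist "$tmpdir/binary.plist"; then binary_result=yes; else binary_result=no; fi
if is_binary_plist "$tmpdir/text.plist";   then text_result=yes;   else text_result=no;   fi
rm -rf "$tmpdir"
```

Converting a preference file in place is not necessarily permanent, since the system may write it back out in binary form later, as seen above.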