# I'm certain there's a much more elegant way to do this, but I'm
# not too handy with bash script.
# Change all the instances of 2013 copyright to 2012
find . -name "*.xml" -exec sed -i.bak -E "s/Copyright .c. 2013/Copyright (c) 2012/g" {} \;
find . -name "*.xml.template" -exec sed -i.bak -E "s/Copyright .c. 2013/Copyright (c) 2012/g" {} \;
find . -name "*.php" -exec sed -i.bak -E "s/Copyright .c. 2013/Copyright (c) 2012/g" {} \;
find . -name "*.css" -exec sed -i.bak -E "s/Copyright .c. 2013/Copyright (c) 2012/g" {} \;
find . -name "*.js" -exec sed -i.bak -E "s/Copyright .c. 2013/Copyright (c) 2012/g" {} \;
find . -name "*.phtml" -exec sed -i.bak -E "s/Copyright .c. 2013/Copyright (c) 2012/g" {} \;
find . -name "*.html" -exec sed -i.bak -E "s/Copyright .c. 2013/Copyright (c) 2012/g" {} \;
# Delete all the backup files that were created
find . -name "*.bak" -exec rm {} \;
What's the -E for? I'm not familiar. I'd have thought -i would be enough
@brendanf, what's the error message you're getting? It's working for me. Maybe it's some crazy difference in our OS version?
@bobbyshaw, it makes sed interpret the pattern as an extended (modern) regular expression rather than a basic one. I had to use it because the default regex interpreter was not liking my pattern for some reason.
:) Other than that, looks good to me. You could replace each . with an escaped parenthesis, \( and \), to be more restrictive rather than matching any character, but it'll be fine.
Hmm, I'm on Mavericks but I don't think that should change anything. When I changed to using "-type f" it started saying:
sed: RE error: illegal byte sequence
I'd be pretty happy making it simple enough to only need to modify the year, and run one command.
Hm, sounds like it's maybe not using the full regular expressions. Try a man sed and see if you have that same -E option. Maybe you're on a different version?
Ya I'm sure there's a much more elegant way to do this in shell script...
The illegal byte error is because "find . -type f...." gets ALL files (including image binaries), which sed can't handle. Naturally @AlanStorm pointed that out, and not me: http://stackoverflow.com/a/19735703/1497746
With his rewrite, I'm able to execute the command without error but I don't see any changes written into the files.
# Executes but no effect on files
find . -name '*.php' -o -name '*.xml' -o -name '*.phtml' -print0 | xargs -0 sed -i '' 's/2013 Magento/2012 Magento/g'
# Executes but no effect on files (with piping braces)
find . -name '*.php' -o -name '*.xml' -o -name '*.phtml' -print0 | xargs -0 sed -i '' 's/2013 Magento/2012 Magento/g' {}
This feels really close. Any more ideas?
OK, I think I've got this. I had only tested my version with a single file. It looks like find on OS X wants each -o to have its own -print0. Give this a try:
find . -name '*.php' -print0 -o -name '*.xml' -print0 -o -name '*.phtml' -print0 | xargs -0 sed -i '' 's/2013 Magento Inc./2012 Magento Inc./g'
and if you need to add additional file types, do it like this
-o -name '*.ext' -print0
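As far as I can tell, the earlier attempts only touched some files because of operator precedence, not because of anything specific to OS X: find's implicit "and" binds tighter than -o, so a single trailing -print0 attaches only to the last -name test. Grouping the tests with escaped parentheses is another way to write it. A minimal sketch, run in a scratch directory with made-up sample files, and using -i.bak (as in the original gist) on the assumption that this form is accepted by both BSD and GNU sed:

```shell
#!/bin/sh
# Scratch directory with three hypothetical sample files.
demo=$(mktemp -d)
for f in a.php b.xml c.phtml; do
    printf 'Copyright (c) 2013 Magento Inc.\n' > "$demo/$f"
done

# Ungrouped: -print0 attaches only to the last -name test.
ungrouped=$(find "$demo" -name '*.php' -o -name '*.xml' -o -name '*.phtml' -print0 |
    tr '\0' '\n' | wc -l | tr -d ' ')

# Grouped: one -print0 applies to every alternative inside \( ... \).
grouped=$(find "$demo" \( -name '*.php' -o -name '*.xml' -o -name '*.phtml' \) -print0 |
    tr '\0' '\n' | wc -l | tr -d ' ')

echo "ungrouped: $ungrouped file(s), grouped: $grouped file(s)"

# Apply the replacement to the grouped matches, then drop the backups.
find "$demo" \( -name '*.php' -o -name '*.xml' -o -name '*.phtml' \) -print0 |
    xargs -0 sed -i.bak 's/(c) 2013 Magento/(c) 2012 Magento/g'
find "$demo" -name '*.bak' -exec rm {} \;
```

With the grouped form, adding a file type means adding one more -o -name '*.ext' inside the parentheses.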
Nice.
So @brendanf, have you tried the exact commands as they are in my original gist? It seems like you've tweaked them a bit each time. The exact commands I have here work for me in OSX.
For example, in your most recent one, you have the -print0, but I'm not using that.
But thanks for the proper way to search multiple file types in a single command @astorm ! Beauty.
Using "find" with -print0 ensures that any filenames that (for whatever reason) include a newline character will not be misinterpreted as two separate files. It writes a null character instead of a newline after each matched file, and then "xargs" uses the -0 param to indicate the file list is piped in with null chars for separators...or so I learned 5 minutes ago.
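To see the difference for yourself, here is a small sketch that counts how many arguments xargs receives in each mode. The file name with an embedded newline is made up for the demo, and it assumes the scratch directory path itself contains no whitespace:

```shell
#!/bin/sh
demo=$(mktemp -d)

# Hypothetical file whose name contains a newline.
name=$(printf 'odd\nname.php')
: > "$demo/$name"

# Count how many arguments xargs hands to the command in each mode.
split_args=$(find "$demo" -name '*.php' -print | xargs sh -c 'echo $#' _)
null_args=$(find "$demo" -name '*.php' -print0 | xargs -0 sh -c 'echo $#' _)

echo "newline-delimited: $split_args argument(s)"
echo "null-delimited:    $null_args argument(s)"
```

The newline-delimited pipeline splits the single file name in two, while the null-delimited one passes it through intact.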
Here's the full command I just tested on MagentoEE 1.13.0.2:
https://gist.github.com/brendanfalkowski/7274294
I get two of this error as output:
sed: RE error: illegal byte sequence
But if I search the directory, there are only 70 files with "(c) 2013 Magento" in them. They are .php.sample, .xml.sample, .css, and .js types. Not sure how to detect where it's choking, but it's processing almost everything.
The sed on my home laptop doesn't suffer the RE error: illegal byte sequence problem, so the following is speculation.
I think sed should recover from RE error: illegal byte sequence and continue processing files. Try running the sed against a few of the individual files to see if they have extra spaces or something that may be tripping up the regular expressions.
sed -i '' 's/2013 Magento Inc./2012 Magento Inc./g' /path/to/individual-file
Re: identifying problem files, give this a try:
find . -name '*.css' -print0 -o -name '*.html' -print0 -o -name '*.js' -print0 -o -name '*.php' -print0 -o -name '*.phtml' -print0 -o -name '*.xml' -print0 -o -name '*.xml.template' -print0 | xargs -t -n 1 -0 sed -i '' 's/(c) 2013 Magento/(c) 2012 Magento/g'
This command adds two arguments to xargs:
xargs -t -n 1
The -n 1 option says to run sed once (1) for every argument piped in. The -t argument tells xargs to print every command before running it. This means your screen will be filled with the output from the command.
Before you run this command, clear your terminal scrollback
View -> Clear Scrollback
Run the command. When it's done, select all the terminal text with
Edit -> Select All
And then copy/paste it into a text editor. Search for which line produces RE error: illegal byte sequence and you'll find your problem files. If they're not binary, they may have an encoding which conflicts with what's set in $LANG.
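If copying the scrollback is awkward, the same trace can probably be captured in a file instead. xargs -t writes each command to stderr, which is also where sed reports its errors, so one redirect collects both. A sketch under those assumptions, using a scratch directory, two made-up sample files, a hypothetical trace.log, and -i.bak instead of -i '' so the same line should run on GNU sed too:

```shell
#!/bin/sh
demo=$(mktemp -d)
log="$demo/trace.log"   # hypothetical log file name

printf '/* (c) 2013 Magento */\n' > "$demo/one.js"
printf '/* (c) 2013 Magento */\n' > "$demo/two.css"

# One sed run per file; the command trace and any sed errors go to the log.
find "$demo" \( -name '*.js' -o -name '*.css' \) -print0 |
    xargs -0 -t -n 1 sed -i.bak 's/(c) 2013 Magento/(c) 2012 Magento/g' 2> "$log"

# Show each error together with the command line printed just before it.
grep -B 1 'illegal byte sequence' "$log" || echo "no sed errors logged"

find "$demo" -name '*.bak' -exec rm {} \;
```

The line before each error in the log names the file sed was given when it failed.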
Thanks Alan, traced those files down:
/skin/frontend/enterprise/default/js/jqzoom/jquery.jqzoom1.0.1.js
/js/extjs/resources/css/ytheme-galdaka.css
/js/tiny_mce/plugins/spellchecker/editor_plugin.js
/js/tiny_mce/plugins/spellchecker/editor_plugin_src.js
- Without inspecting what went wrong, I can see these aren't files I'll need to merge/update in custom work. So it's a quick way to isolate them.
- The first two bugging files contain comments written in Italian and Spanish that use accented characters. I'm guessing the UTF-8 chars are what's causing sed to choke. I didn't notice anything like that in the next two, but it's probably a similar problem.
Regardless, we can use this to rule out the files that can't be processed with sed but are valid file types.
Updated my gist with this info: https://gist.github.com/brendanfalkowski/7274294
One last follow up. It's not strictly UTF-8 characters that are tripping up sed. Per the previous Stack Overflow questions, sed will obey the encoding set in $LANG:
$ echo $LANG
en_US.UTF-8
That means it's fine with UTF-8. The "real" problem is those files aren't UTF-8 encoded. BBEdit reports them as "Western (Mac OS Roman)" on my system (but text encoding is complicated).
So, a better explanation of what's going wrong is that those files contain bytes that aren't valid in the encoding sed has been told to expect. Our text editors and browsers have heuristics to do something smart when they encounter this, but sed is a tool written by C programmers to operate directly on byte streams (sed stands for stream editor). When sed encounters those bytes, it gets upset and bails rather than making a wrong "heuristical" guess.
Ultimately not useful to our task at hand, but interesting if you're interested in C programming.
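For anyone who does need those files processed, here is a rough sketch of two things that may help. Neither comes from this thread: iconv is used to flag files that are not valid UTF-8, and LC_ALL=C is a commonly suggested workaround that should make sed treat its input as raw bytes. The sample files and their names are made up for the demo:

```shell
#!/bin/sh
demo=$(mktemp -d)

# Hypothetical samples: byte 0x8E (octal 216) is "é" in Mac OS Roman,
# but on its own it is not a valid UTF-8 sequence.
printf '/* caf\216 (c) 2013 Magento */\n' > "$demo/legacy.js"
printf '/* plain (c) 2013 Magento */\n' > "$demo/clean.js"

# List files that fail a UTF-8 round trip through iconv.
bad=""
for f in "$demo"/*.js; do
    if ! iconv -f UTF-8 -t UTF-8 "$f" > /dev/null 2>&1; then
        bad="$bad $f"
        echo "not valid UTF-8: $f"
    fi
done

# Assumption: in the C locale sed works byte by byte, so it should not
# stop on sequences that are invalid in UTF-8.
LC_ALL=C sed -i.bak 's/(c) 2013 Magento/(c) 2012 Magento/g' "$demo/legacy.js"
rm -f "$demo/legacy.js.bak"
```

Since the C locale leaves the non-ASCII bytes untouched, only the ASCII copyright string changes and the file keeps its original encoding.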
If all you're needing is a diff, try this:
diff -qrI '@copyright' /path/to/mage-v1 /path/to/mage-v2
This does work, but I'm thinking it could be simplified to check all file types and skip creating the .bak files. My "guess and check" skills are not working. Something like: