Find and rename files/folders using regex

I am trying to find the right regex for file names that start with I0[0-9][0-9]-, e.g. "I097-". I am not familiar with regex, but with the help of online tools I came up with [I][0][\d][\d][-]. I am sure this is not the best pattern for my strings, but I tested it with online regex tools and it works. Now I want to use Linux 'find' to find all the files that match this regex and rename them by removing the matching string.
From:
I071-PTEN-7
./I071-PTEN-7/I071-PTEN-7.txt
To:
PTEN-7
./PTEN-7/PTEN-7.txt
command used:
find . -name "I0*" -type f -o -name "I0*" -type d -exec rename -n "s/[I][0][\d][\d][-]/''/" {} \;
But it doesn't seem to do anything, not sure what is going on. Any help to find the issue or solution would be greatly appreciated. Thanks.

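Use the -execdir option so that each command runs in the directory of the matched entry and sees only its file name. There is also no need to put a character class around every character in your regex. Add -depth so that the contents of a directory are renamed before the directory itself.
find . -depth -name 'I0[0-9][0-9]-*' -execdir rename -n 's/^(\.\/)?I0\d\d-//' {} \;
GNU find passes the name to -execdir as ./name, hence the optional ./ in the pattern. The -n flag only prints what would be renamed; remove it to perform the rename.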
If rename isn't working then you may try this, where -depth makes find handle the contents of a directory before the directory itself:
find . -depth -name 'I0[0-9][0-9]-*' -execdir bash -c 'mv "$1" "${1/I0[0-9][0-9]-/}"' - {} \;
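For a version that does not depend on the rename utility at all, here is a minimal sketch in bash. The function names strip_id_prefix and rename_tree are made up for illustration, and it assumes a find that supports -mindepth and -print0 (GNU and BSD find both do).

```shell
#!/usr/bin/env bash

# strip_id_prefix (hypothetical helper): print NAME without a leading I0NN-.
strip_id_prefix() {
  local name=$1
  printf '%s\n' "${name#I0[0-9][0-9]-}"
}

# rename_tree (hypothetical helper): rename every matching file and
# directory below ROOT. -depth lists a directory after its contents,
# so the paths read from the pipe are still valid when mv runs.
rename_tree() {
  local root=$1
  find "$root" -mindepth 1 -depth -name 'I0[0-9][0-9]-*' -print0 |
    while IFS= read -r -d '' path; do
      dir=${path%/*}    # everything before the last slash
      base=${path##*/}  # the last path component
      mv -- "$path" "$dir/$(strip_id_prefix "$base")"
    done
}
```

Running rename_tree . on the example tree should turn ./I071-PTEN-7/I071-PTEN-7.txt into ./PTEN-7/PTEN-7.txt. Try it on a copy of the data first.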

Related

UNIX find and replace using execdir and bash

I am new to scripting and need to replace special characters in file names with another character.
Below is a file name in which I have to replace every ? with _.
file - 21041159?74DECL?ARAÇÃO14581?5904289?6770700.pdf
result - 21041159_74DECL_ARAÇÃO14581_5904289_6770700.pdf
find . -depth -name '*\?*' -type f -execdir bash -c 'mv "$1" "${1/\?/_}"' -- {} \;
The above command replaces only the first question mark with an underscore, not every one in the file name.
Please suggest what can be done.
A simplified version of your question is:
I can replace the first occurrence of a string in a bash variable with ${var/foo/bar}.
How can I replace all occurrences?
And the answer is to use double slash: ${var//foo/bar}.
In context, it would be:
find . -depth -name '*\?*' -type f -execdir bash -c 'mv "$1" "${1//\?/_}"' -- {} \;
# The only change is the double slash in "${1//\?/_}".
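The difference between the two expansions is easy to check on a plain variable before running find. This is a small sketch; the file name is hypothetical and only there for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical file name, used only for illustration.
name='report?2021?final.pdf'

first_only=${name/\?/_}   # single slash: replaces the first ? only
all=${name//\?/_}         # double slash: replaces every ?

printf '%s\n' "$first_only" "$all"
```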

Bash script find/sed not working

I have the following line in my bash script:
find . -name "*.html" -print |
xargs sed -i 's/http\:\/\/version2\.staging\.myname\.com//g'
and it's giving me the following error:
sed: 1: "./instant/index. ...": invalid command code .
What I'm trying to do is replace any occurrence of http://version2.staging.myname.com with /. How do you do it?
Usually I use something like:
find . -name "*.html" -exec sed -i 's|http://version2\.staging\.myname\.com/|/|g' '{}' ';'
To test this out, you can first insert an echo statement
find . -name "*.html" -exec echo sed -i 's|http://version2\.staging\.myname\.com/|/|g' '{}' ';'
... that will tell you whether the output is what you expect. I always recommend doing a dry run with echo before any mass update. You can also use | as an alternate delimiter for the s command to avoid escaping every / in the paths.
For OSX try this:
find . -name "*.html" -exec sed -i.bak 's#http://version2\.staging\.myname\.com##g' '{}' \; -print
I think you may be using a Mac (and now I see a comment that you are on an iMac). On Mac OS X, the sed -i option requires an argument, which explains your error message: sed is interpreting your s/...//g command as the suffix to use for the backup file; it is then trying to interpret the first file name as the sed script, and fortunately, that is not working.
Additionally, you can avoid most of the escaping issues by using some character other than / as the delimiter for s///. Also, it is generally better (especially on Macs where file paths often end up with spaces in them) to avoid xargs and use -exec in find, along with the + option to do what xargs does — namely group many file names into one command invocation.
This leads to:
find . -name "*.html" -type f \
-exec sed -i .bak -e 's%http://version2.staging.myname.com%%g' {} +
(NB: strictly, that will also match http://version2-staging*myname#com, because an unescaped dot matches any character; if you're really worried about that, by all means escape the dots in the URL.)
If you want to get rid of the .bak files afterwards:
find . -name '*.bak' -type f -exec rm -f {} +
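Since -i behaves differently on GNU and BSD sed, it can help to check the substitution as a plain filter first, with no in-place editing involved. A minimal sketch follows; strip_staging_host is a made-up name and the sample line is invented.

```shell
#!/usr/bin/env bash

# strip_staging_host (hypothetical helper): filter stdin to stdout and
# remove the staging host. No file is modified, so -i is not needed.
strip_staging_host() {
  sed 's|http://version2\.staging\.myname\.com||g'
}

# Preview on a sample line before touching real files.
printf '%s\n' '<a href="http://version2.staging.myname.com/about.html">' |
  strip_staging_host
```

Once the output looks right, the same s command can be dropped into one of the find -exec forms above.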

Linux Command: Find & Replace using regex not working as expected

I am trying to replace a path in all php files using regex command, but it isn't working as expected!
I want to replace '/home/example/public_html with $_SERVER['DOCUMENT_ROOT'] . '
I am using the below command in ssh:
find /var/www/advertise/ -name '*.php' -type f -exec sed -i 's/\'\/home\/example\/public_html/\$\_SERVER\[\'DOCUMENT\_ROOT\'\]\ \.\ \'/g' {} \;
When I enter the command and hit return, a > sign follows, like:
>
>
>
... and so on, as I keep hitting return to execute the command.
Whereas the command below works perfectly (for replacing home/example/public_html with var/www):
find /var/www/advertise/ -name '*.php' -type f -exec sed -i 's/home\/example\/public_html/var\/www/g' {} \;
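You're messing up the quotes: a single-quoted shell string cannot contain a single quote, even after a backslash, so the shell keeps waiting for the closing quote and shows the > continuation prompt.
Use a separator other than / so that you don't need to escape the /.
You don't need to escape those characters in the replacement.
Since you have ' in the replacement, better use "s#..#..#" (i.e. double quotes). However, you'll need to escape the $ in the replacement to prevent the shell from trying to expand it.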
The following might work for you:
find /var/www/advertise/ -name '*.php' -type f -exec sed -i "s#'/home/example/public_html#\$_SERVER['DOCUMENT_ROOT'] . '#g" {} \;
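To check the quoting without editing any PHP file, the same s command can be run as a filter on a sample line. This is only a sketch; rewrite_root is a made-up name and the include line is an invented example.

```shell
#!/usr/bin/env bash

# rewrite_root (hypothetical helper): filter stdin to stdout.
# Double quotes let the script contain single quotes; \$ stops the
# shell from expanding $_SERVER before sed sees it.
rewrite_root() {
  sed "s#'/home/example/public_html#\$_SERVER['DOCUMENT_ROOT'] . '#g"
}

# Invented sample line, for illustration only.
printf '%s\n' "include '/home/example/public_html/inc.php';" | rewrite_root
```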

find all files except e.g. *.xml files in shell

Using bash, how to find files in a directory structure except for *.xml files?
I'm just trying to use
find . -regex ....
with the regex:
'.*^((?!xml).)*$'
but without the expected results.
Or is there another way to achieve this, i.e. without regex matching?
find . ! -name "*.xml" -type f
find . -not -name '*.xml'
Should do the trick.
Sloppier than the find solutions above, and it does more work than it needs to, but you could do
find . | grep -v '\.xml$'
Also, is this a tree of source code? Maybe you have all your source code and some XML in a tree, but you want to only get the source code? If you were using ack, you could do:
ack -f --noxml
with bash:
shopt -s extglob globstar nullglob
# !(*.xml) matches any name that does not end in .xml
for f in **/!(*.xml); do
  [[ -d $f ]] && continue
  # do stuff with $f
done
You can also do it with or-ring as follows:
find . -type f -name "*.xml" -o -type f -print
Try something like this for a regex solution:
find . -regextype posix-extended -not -regex '^.*\.xml$'
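The negated -name test can be tried on a throwaway directory first. Here is a minimal sketch; list_non_xml is a made-up name and the file names are invented.

```shell
#!/usr/bin/env bash

# list_non_xml (hypothetical helper): print regular files below DIR
# whose names do not end in .xml.
list_non_xml() {
  find "$1" -type f ! -name '*.xml'
}

# Throwaway directory with invented file names, for illustration only.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
touch "$demo/a.xml" "$demo/b.txt" "$demo/sub/c.xml" "$demo/sub/d.sh"
list_non_xml "$demo"
rm -rf "$demo"
```

Putting -type f first keeps directories out of the listing, which the bare -not -name form does not do.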

batch process regular expression find and replace on folder and subfolders contents

I have a folder with subfolders that contain text documents (hundreds). The text documents all require a find and replace. The regular expression I am using to find the text is:
^([A-Z])[\r\n]+(\w+)\b
This is being replaced by:
$1$2
How can I batch process this find and replace on a folder with subfolders?
I'm using a mac (osx 10.6.8)
You could use sed for this as well:
cd /path/to/files # make sure you are in the right directory
find . -type f -exec sed -i.bak 's/^([A-Z])[\r\n]+(\w+)\b/$1$2/g' {} \;
Edit: I just realized that the above is a Textmate search/replace string. For sed you'll have to use:
find . -type f -exec sed -i.bak 's/^([A-Z])[\r\n]+(\w+)\b/\1\2/g' {} \;
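The -i.bak option keeps a .bak backup of every file. Note that sed processes its input one line at a time, so a pattern that spans a line break, such as [\r\n]+ here, will not match as written.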
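You could do this using find and perl:
find . -type f -exec perl -0777 -p -i -e 's/^([A-Z])[\r\n]+(\w+)\b/$1$2/mg' {} \;
(-0777 makes perl read each file in one piece so the pattern can span the line break, and /m lets ^ match at the start of every line.)
Warning: untested :)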