How to use UNIX find to find (file1 OR file2)? - regex

In the bash command line, I want to find all files that are named foo or bar. I tried this:
find . -name "foo\|bar"
but that doesn't work. What's the right syntax?

You want:
find . \( -name "foo" -o -name "bar" \)
See the Wikipedia page (of all places).
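For instance (a quick sketch; the foo* and bar* wildcards are just made-up examples), the same grouping also works with wildcards and can be combined with an action, as long as the parentheses stay escaped from the shell:
find . \( -name "foo*" -o -name "bar*" \) -print
find . \( -name "foo" -o -name "bar" \) -exec ls -l {} \;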

I am cheap with find, so I would use this:
find ./ | grep -E 'foo|bar'
That's just my personal preference. I like grep more than find because its syntax is easier to 'get', and once you master it there are more uses than just walking a file tree.
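Note that grep -E 'foo|bar' matches foo or bar anywhere in the path, not just in the file name. If you only want files actually named foo or bar, anchoring the pattern to the last path component should do it (untested sketch):
find ./ | grep -E '/(foo|bar)$'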

Related

Find a string in multiple files using grep

I have a folder with sub-folders inside, all of which contain many types of files. I want to search for a word inside the .css files. I am using Windows 7 and I have grep.
How can I use grep to:
Find a pattern and print it
Give the file name (and path) if the pattern is found
Actually you don't need find. Just use:
grep -R --include=*.css -H pattern .
This will recurse and look at all *.css files in subdirectories, while -H will show the filename.
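For example, with a made-up pattern such as color:, you can also add -n to print the matching line number next to each file name:
grep -Rn --include=*.css -H 'color:' .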
find folder/ -name "*.css" | xargs grep "your-pattern"
You will need to install Cygwin to do this.
If the files we have to look in follow a naming pattern, we can use that. Say I'm looking for the pattern "cardlayout" in files named chap1.lst, chap2.lst and so on; then the command is:
grep -e 'cardlayout' `find . -name "chap??.lst"`
Hope this helps.
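A small caveat: the backtick version breaks if any matched path contains spaces. Letting find invoke grep itself avoids that; this is the same search written that way:
find . -name "chap??.lst" -exec grep -e 'cardlayout' {} +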

Recursive find and replace based on regex

I have changed my directory structure and I want to do the following:
Do a recursive grep to find all instances of a match
Change each one to the updated location string
One example (out of hundreds) would be:
from common.utils import debug --> from etc.common.utils import debug
To get all the instances of what I'm looking for I'm doing:
$ grep -r 'common.' ./
However, I also need to make sure common is preceded by a space. How would I do this find and replace?
It's hard to tell exactly what you want because your refactoring example changes the import as well as the package, but the following will change common. -> etc.common. for all files in a directory:
sed -i 's/\bcommon\./etc.&/' $(egrep -lr '\bcommon\.' .)
This assumes you have GNU sed available, which most Linux systems do. Also, just to let you know, this will fail if there are too many files for sed to handle at one time. In that case, you can do this:
egrep -lr '\bcommon\.' . | xargs sed -i 's/\bcommon\./etc.&/'
Note that it might be a good idea to run the sed command as sed -i'.OLD' 's/\bcommon\./etc.&/' so that you get a backup of the original file.
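If you do keep the .OLD backups and later need to roll back, a loop like this should restore them (untested sketch, and it assumes the file names contain no whitespace):
for f in $(find . -name '*.OLD'); do mv -f "$f" "${f%.OLD}"; done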
If your grep implementation supports Perl syntax (the -P flag, which is usually available on Linux), you can benefit from additional features like word boundaries:
$ grep -Pr '\bcommon\.'
By the way:
grep -r tends to be much slower than piping a find command into grep, as in Rob's example. Furthermore, when you're sure that the file names found do not contain any whitespace, using xargs is much faster than -exec:
$ find . -type f -name '*.java' | xargs grep -P '\bcommon\.'
Or, applied to Tim's example:
$ find . -type f -name '*.java' | xargs sed -i.bak 's/\<common\./etc.common./'
Note that, in the latter example, the replacement is done after creating a *.bak backup for each file changed. This way you can review the command's results and then delete the backups:
$ find . -type f -name '*.bak' | xargs rm
If you've made an oopsie, the following command will restore the previous versions:
$ find . -type f -name '*.bak' | while read LINE; do mv -f "$LINE" "${LINE%.bak}"; done
Of course, if you aren't sure that there's no whitespace in the file names and paths, you should apply the commands via find's -exec parameter.
Cheers!
This is roughly how you would do it using find; it still needs testing on your own tree:
find . -name \*.java -exec sed -i "s/FIND_STR/REPLACE_STR/g" {} \;
This translates as: starting from the current directory, find all files that end in .java and execute sed -i (in-place) on each one, where {} is a placeholder for the currently found file; "s/FIND_STR/REPLACE_STR/g" replaces FIND_STR with REPLACE_STR on every line of that file.
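Before running an in-place replacement like that, it can be worth previewing which lines would be touched. A grep over the same file set (FIND_STR is just the placeholder from above) does that without modifying anything:
find . -name '*.java' -exec grep -n 'FIND_STR' {} +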

Why isn't this regex working: find ./ -regex '.*\(m\|h\)$'

Why isn't this regex working?
find ./ -regex '.*\(m\|h\)$'
I noticed that the following works fine:
find ./ -regex '.*\(m\)$'
But when I add the "or an h at the end of the filename" part by adding \|h, it doesn't work. That is, it should pick up all my *.m and *.h files, but I am getting nothing back.
I am on Mac OS X.
On Mac OS X, you can't use \| in a basic regular expression, which is what find uses by default.
re_format man page
[basic] regular expressions differ in several respects. | is an ordinary character and there is no equivalent for its functionality.
The easiest fix in this case is to change \(m\|h\) to [mh], e.g.
find ./ -regex '.*[mh]$'
Or you could add the -E option to tell find to use extended regular expressions instead.
find -E ./ -regex '.*(m|h)$'
Unfortunately -E isn't portable.
Also note that if you only want to list files ending in .m or .h, you have to escape the dot, e.g.
find ./ -regex '.*\.[mh]$'
If you find this confusing (me too), there's a great reference table that shows which features are supported on which systems.
Regex Syntax Summary [Google Cache]
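To sum up, on Mac OS X either of these should list only the .m and .h files; the second form relies on the non-portable -E switch:
find . -regex '.*\.[mh]$'
find -E . -regex '.*\.(m|h)$'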
A more efficient solution is to use the -o flag:
find . -type f \( -name "*.m" -o -name "*.h" \)
but if you want the regex use:
find . -type f -regex ".*\.[mh]$"
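If the goal is to hand those files to another tool, -print0 together with xargs -0 keeps names containing spaces intact (wc -l here is just a stand-in command):
find . -type f \( -name "*.m" -o -name "*.h" \) -print0 | xargs -0 wc -l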
Okay, this is a little hacky, but if you don't want to wrangle the regex limitations of find on OS X, you can just pipe find's output to grep:
find . | grep '\.\(h\|m\)$'
What’s wrong with
find . -name '*.[mh]' -type f
If you want fancy patterns, then use find2perl and hack the pattern.
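find2perl simply translates a find command line into a Perl script that you can then edit by hand, e.g. to swap in whatever regex you like (sketch; the output file name is made up):
find2perl . -name '*.[mh]' -type f -print > findem.pl
perl findem.pl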

Regular Expression Differences Between ls and find to search for 1 string or another string

I'm having a minor brain-fart that I'm sure someone can answer quickly. I'm using Cygwin to get a bash shell on Windows (in case that has any idiosyncrasies) and am having trouble moving a regular expression between ls and find.
I have a bunch of files that I need to access, some of which start with EA_ and some of which start with FS_, so I can list them with ls like this:
ls -l {EA,FS}_*
and this also works fine with wc, but when I try to use this in find, the pattern doesn't seem to be right:
find . -iname "{EA,FS}_*"
I've tried escaping the { and } but that doesn't seem to work either - what am I doing wrong?
Cheers
MH
Looks like you need a regular expression instead of the usual name glob:
find . -iregex './\(EA\|FS\)_.*'
Remember with this syntax that you have to match the directory too. From your commands it looks like you're doing it all in one directory (no depth), so what I've provided will work. For more recursive searches you'd need a different regex (see the sketch after the test run below).
Test run on Cygwin, Windows 7:
$ find . -iregex './\(RT\|ED\).*' | head
./ED-AT-CK01-A01.xml
./ED-AT-CK02-A01.xml
./ED-AT-CL01-A01.xml
./ED-AT-CL02-A01.xml
./ED-AT-CL03-A01.xml
./ED-AT-CL04-A01.xml
./ED-AT-IL001-A01.xml
./ED-AT-IL01-A01.xml
./ED-AT-IL02-A01.xml
./ED-AT-TB02-A01.xml
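For a search that should also descend into subdirectories, the pattern has to allow any leading path; something like this should work (untested sketch):
find . -iregex '.*/\(EA\|FS\)_[^/]*'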
You can also do this:
find . -type f \( -iname "EA_*" -o -iname "FS_*" \)

Using non-consuming matches in Linux find regex

Here's my problem in a simplified scenario.
Create some test files:
touch /tmp/test.xml
touch /tmp/excludeme.xml
touch /tmp/test.ini
touch /tmp/test.log
I have a find expression that returns me all the XML and INI files:
[root#myserver] ~> find /tmp -name -prune -o -regex '.*\.\(xml\|ini\)'
/tmp/test.ini
/tmp/test.xml
/tmp/excludeme.xml
I now want a way of modifying this -regex to exclude the excludeme.xml file from being included in the results.
I thought this should be possible by using/combining a non-consuming regex (?=expr) with a negated match (?!expr). Unfortunately I can't quite get the format of the command right, so my attempts result in no matches being returned. Here was one of my attempts (I've tried many different forms of this with different escaping!):
find /tmp -name -prune -o -regex '\(?=.*excludeme\.xml\).*\.\(xml\|ini\)'
I can't break down the command into multiple steps (e.g. piping through grep -v) as the find command is assumed as input into other parts of our tool.
This does what you want on linux:
find /tmp -name -prune -o -regex '.*\.\(xml\|ini\)' \! -regex '.*excludeme\.xml'
I'm not sure if the "!" operator is unique to GNU find.
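With GNU find you can also switch to extended regular expressions first, which drops the backslash-escaped groups; this is a sketch of the same exclusion (leaving out the -name -prune clause from the question for brevity):
find /tmp -regextype posix-extended -regex '.*\.(xml|ini)' \! -regex '.*excludeme\.xml'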
Not sure about what escapes you need or if lookarounds work, but these work for Perl:
/^(?!.*\/excludeme\.).*\.(xml|ini)$/
/(?<!\/excludeme)\.(xml|ini)$/
Edit - Just checked the find command; the best you can do with find is to change the regex type with -regextype posix-extended, but that doesn't do stuff like look-arounds. The only way around this looks to be using some GNU stuff, either with find as @unholygeek suggests, or piping find into GNU grep with the -P Perl option. You can use the above regex verbatim if you go with GNU grep. Something like find .... -print | grep -P ...
Sorry, that's the best I can do.
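Putting that together as a single pipeline (this assumes GNU grep with -P support is available):
find /tmp | grep -P '^(?!.*excludeme\.).*\.(xml|ini)$'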