Regex to match logfiles 1 to 11

I would like to simply fetch logfiles 1 to 11 out of 500 with one regex:
log4j-cnode1.log.11
log4j-cnode1.log.10
log4j-cnode1.log.9
log4j-cnode1.log.8
log4j-cnode1.log.7
log4j-cnode1.log.6
log4j-cnode1.log.5
log4j-cnode1.log.4
log4j-cnode1.log.3
log4j-cnode1.log.2
log4j-cnode1.log.1
so I do not want to fetch log4j-cnode1.log.12, log4j-cnode1.log.13, ... , log4j-cnode1.log.500
I was trying this command:
find . -iname "log4j-cnode1*\.log\.(1[0-1]|[1-9])"
why does this not work?
1 to 9 works fine with this:
find . -iname "log4j-cnode1*\.log\.[1-9]"

Because -iname doesn't accept regular expressions (it takes a shell glob), and even if it did, your 1* would probably not do what you want. Use -iregex:
find -regextype posix-extended -iregex '(.*/)?log4j-cnode1.*\.log\.(1[0-1]|[1-9])'
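To see the difference between the two tests, here is a small sketch that can be run in a scratch directory. It assumes GNU find (for -regextype); the temporary directory and the choice of 15 dummy files are made up for the demonstration.

```shell
#!/bin/sh
# Assumes GNU find; the scratch directory and file count are illustrative only.
demo=$(mktemp -d)
i=1
while [ "$i" -le 15 ]; do
    : > "$demo/log4j-cnode1.log.$i"
    i=$((i + 1))
done

# -iname takes a shell glob, so "(", "|" and ")" are literal characters here.
glob_hits=$(cd "$demo" && find . -iname 'log4j-cnode1*\.log\.(1[0-1]|[1-9])' | wc -l)

# -iregex must match the whole path, hence the optional "(.*/)?" prefix.
regex_hits=$(cd "$demo" && find . -regextype posix-extended \
    -iregex '(.*/)?log4j-cnode1.*\.log\.(1[0-1]|[1-9])' | wc -l)

echo "glob hits: $glob_hits, regex hits: $regex_hits"
rm -rf "$demo"
```

The glob version should find nothing, while the regex version should find only the files numbered 1 to 11.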

find . -iname "log4j-cnode1*\.log\.(1?[0-9])"
Your Regex says 1 followed by 0 or 1 followed by 1-9
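A quick way to check what a candidate number pattern really accepts is to run it over a range with grep -E, which uses the same extended syntax as -regextype posix-extended. This is only a sketch; the range 0 to 20 is arbitrary, and -x forces a whole-line match.

```shell
#!/bin/sh
# Feed the numbers 0..20 to each pattern and collect what gets through.
strict=$(seq 0 20 | grep -Ex '(1[0-1]|[1-9])' | tr '\n' ' ')
loose=$(seq 0 20 | grep -Ex '1?[0-9]' | tr '\n' ' ')

echo "(1[0-1]|[1-9]) accepts: $strict"
echo "1?[0-9]        accepts: $loose"
```

Note that 1?[0-9] also lets 0 and 12 to 19 through, and that -iname would treat either pattern as a glob rather than a regex.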

$ find -name 'log4j-cnode1*\.log\.[0-9]*'
./log4j-cnode1.log.1
./log4j-cnode1.log.10
./log4j-cnode1.log.11
./log4j-cnode1.log.2
./log4j-cnode1.log.3
./log4j-cnode1.log.4
./log4j-cnode1.log.5
./log4j-cnode1.log.6
./log4j-cnode1.log.7
./log4j-cnode1.log.8
./log4j-cnode1.log.9

You got it almost right.
But, instead of -iname, use -iregex with -regextype egrep (or awk), like this:
find . -regextype egrep \
-iregex ".*log4j-cnode1.*\.log\.(1[0-1]|[1-9])"
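If you would rather avoid -regextype altogether (it is a GNU extension), the same range can be expressed with two plain globs joined by -o. A minimal sketch, assuming the files are named exactly log4j-cnode1.log.N; the scratch directory and sample numbers are made up:

```shell
#!/bin/sh
# Scratch files are illustrative; only the find expression matters.
demo=$(mktemp -d)
for n in 1 2 9 10 11 12 13 100 110; do
    : > "$demo/log4j-cnode1.log.$n"
done

# One glob for the single digits 1-9, one for 10 and 11.
matches=$(cd "$demo" && find . \( -name 'log4j-cnode1.log.[1-9]' \
    -o -name 'log4j-cnode1.log.1[01]' \) | LC_ALL=C sort)

printf '%s\n' "$matches"
rm -rf "$demo"
```

The parentheses are escaped so that the shell passes them to find, where they group the two -name tests.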

Related

Recursively find filenames of exactly 8 hex characters, but not all 0-9, no lookahead (Mac terminal, bash)

I'm trying to write a regex to find files recursively with Mac Terminal (bash, not zsh even though Catalina wants me to switch over for whatever reason) using the find command. I'm looking for files that are:
Exactly 8 hexadecimal digits (0-9 and A-F)
But NOT only decimal digits (0-9)
In other words, it would match A1234567, ABC12DEF, 12345ABC, and ABCDABCD, but not 12345678 or 09876543.
To find files that are exactly 8 hex digits, I've used this:
find -E . -type f -regex '.*/[A-F0-9]{8}'
The .*/ is necessary to allow the full path name to precede the filename. This is eventually going to get fed to rm, so I have to keep the path.
It SEEMS like this should work to fulfill both of my requirements:
find -E . -type f -regex '.*/(?![0-9]{8})[A-F0-9]{8}'
But that returns an error:
find: -regex: .*/(?![0-9]{8})[A-F0-9]{8}: repetition-operator operand invalid
It seems like the find command doesn't support lookaheads. How can I do this without one?
With any POSIX-compliant find:
find . -type f \
-name '????????' \
! -name '*[![:xdigit:]]*' \
-name '*[![:digit:]]*'
And if you insist on using regexps for this, here you go:
find -E . -type f \
-regex '.*/[[:xdigit:]]{8}' \
! -regex '.*/[[:digit:]]*'
Those who use GNU find should drop -E and insert -regextype posix-extended after paths to make this work.
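The glob-only variant can be tried out safely in a scratch directory. The sketch below uses made-up file names (some from the question, some added as negative cases) and only lists the matches, it does not delete anything.

```shell
#!/bin/sh
# Scratch directory and file names are illustrative.
demo=$(mktemp -d)
mkdir "$demo/x"
for name in A1234567 ABCDABCD 12345678 ABCDEFG1 ABC1234 x/12345ABC x/09876543; do
    : > "$demo/$name"
done

# 1) exactly 8 characters, 2) no character that is not a hex digit,
# 3) at least one character that is not a decimal digit.
matches=$(cd "$demo" && find . -type f \
    -name '????????' \
    ! -name '*[![:xdigit:]]*' \
    -name '*[![:digit:]]*' | LC_ALL=C sort)

printf '%s\n' "$matches"
rm -rf "$demo"
```

One difference from the [A-F0-9] pattern in the question: [[:xdigit:]] also accepts lowercase a-f.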
It's probably easiest to just filter out the results you don't like:
find -E . -type f -regex '.*/[A-F0-9]{8}' -print | egrep -v '.*/[0-9]{8}$'
$ find -E . -type f -regex '.*/[A-F0-9]{8}' -print
./01234567
./ABCDEFAF
./ABCDEF01
./ABCDEF2A
./ABCDEFA2
./x/01234567
./x/ABCDEFAF
./x/ABCDEF01
./x/ABCDEF2A
./x/ABCDEFA2
$ find -E . -type f -regex '.*/[A-F0-9]{8}' -print | egrep -v '.*/[0-9]{8}$'
./ABCDEFAF
./ABCDEF01
./ABCDEF2A
./ABCDEFA2
./x/ABCDEFAF
./x/ABCDEF01
./x/ABCDEF2A
./x/ABCDEFA2
My find didn't understand -E and was inexplicably grumpy about -regex in general, but this still worked:
find . -type f -name '[A-F0-9][A-F0-9][A-F0-9][A-F0-9][A-F0-9][A-F0-9][A-F0-9][A-F0-9]' -a -name '*[A-F]*'
Not as elegant as oguz ismail's, but easier to read for my clogged brain, lol
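Since the matches are eventually meant for rm, here is a hedged sketch of handing them over with -exec instead of a pipe. It reuses the two -name tests from the answer above and runs in a throwaway directory; the file names are invented for the demonstration.

```shell
#!/bin/sh
# Everything happens inside a scratch directory; names are illustrative.
demo=$(mktemp -d)
mkdir "$demo/my dir"
for name in A1234567 12345678 'my dir/ABCDEF01' 'my dir/notes.txt'; do
    : > "$demo/$name"
done

# -exec ... {} + passes the paths to rm directly,
# so paths containing spaces survive intact.
(cd "$demo" && find . -type f \
    -name '[A-F0-9][A-F0-9][A-F0-9][A-F0-9][A-F0-9][A-F0-9][A-F0-9][A-F0-9]' \
    -name '*[A-F]*' \
    -exec rm -- {} +)

# List what is left after the deletion.
remaining=$(cd "$demo" && find . -type f | LC_ALL=C sort)
printf '%s\n' "$remaining"
rm -rf "$demo"
```

Running the same find with -print in place of -exec first is a cheap way to review the list before anything is removed.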

RegEx for finding all files that start with any letter followed by a number

I am trying to make my bash script smarter. I have some code that cleans up tens of thousands of files. What I am trying to work out is how to find all files that have a bunch of letters (a-z or A-Z) in front of 0901*.*
find . -maxdepth 1 -type f -regextype posix-extended -regex '[a-z]\(0901).*'
find . -maxdepth 1 -type f -regextype posix-extended -regex '[a-z]*.*'
returns everything
while
find . -maxdepth 1 -type f -regextype posix-extended -regex '[a-z]\(0901).*'
returns nothing.
How do I solve this problem?
The following option should work with sed as the regex engine:
find . -maxdepth 1 -type f -regextype sed -regex "[A-Za-z]\+0901.*"
I am interpreting "a bunch of letters" as one or more letters, hence I used [A-Za-z]+ in front of the digits. You should not need parentheses here, but if you wanted to use them, you would have to escape them with a backslash:
find . -maxdepth 1 -type f -regextype sed -regex "[A-Za-z]\+\(0901\).*"
Because -regex matches against the entire path (relative, as the case may be), you need to take the leading ./ into account. So, borrowing heavily from Tim's answer:
find . -maxdepth 1 -type f -regextype sed -regex '.*[A-Za-z]\+\(0901\).*'
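For comparison, here is a sketch of the same idea with -regextype posix-extended, which is what the question started from. It assumes GNU find; the scratch directory and file names are made up. Anchoring on \./ is my own tightening, so that only names beginning with letters match.

```shell
#!/bin/sh
# Assumes GNU find; scratch files are illustrative.
demo=$(mktemp -d)
for name in abc0901.txt Z0901.log 0901.txt abc0902.txt 1abc0901.txt; do
    : > "$demo/$name"
done

# In posix-extended syntax "+" needs no backslash, and "\(" would be a literal parenthesis.
# "\./" accounts for the path prefix that find puts in front of each name.
matches=$(cd "$demo" && LC_ALL=C find . -maxdepth 1 -type f \
    -regextype posix-extended -regex '\./[A-Za-z]+0901.*' | LC_ALL=C sort)

printf '%s\n' "$matches"
rm -rf "$demo"
```

With this pattern 1abc0901.txt is skipped because the name does not start with a letter; the .* prefix used in the answer above would accept it.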
I think you have to use this regex:
"[A-Za-z]0901"
Good Luck!

Unix regular expression

I need to use unix regular expression in unix find command:
find "/home/user/somePath/" -maxdepth 1 \
! -regex "/home/user/somePath/someUnwantedPath" \
! -regex "/home/user/somePath/someMoreUnwantedPath"
This works but I need to optimize the regex into a single one because the unwanted paths are more than just a few.
I suppose you can do it with alternation.
/home/user/somePath/(someUnwantedPath|someMoreUnwantedPath)
find "/home/user/somePath/" -maxdepth 1 ! -regex "/home/user/somePath/(someUnwantedPath|someMoreUnwantedPath)"
Just add more paths at the end of the parenthesized group, each starting with a new | as the alternation delimiter, i.e. |AnotherUnwantedPath.
Edit
I'm a "Windows dude", so I'm not that familiar with Unix, but I wanted to try it out on BUW, and it appears you have to escape regex metacharacters. So I guess the correct answer should be
/home/user/somePath/\(someUnwantedPath\|someMoreUnwantedPath\)/.*
find "/home/user/somePath/" -maxdepth 1 ! -regex "/home/user/somePath/\(someUnwantedPath\|someMoreUnwantedPath\)/.*"
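Whether the backslashes are needed depends on the regex syntax find is using. The sketch below, which assumes GNU find, shows both spellings side by side in a scratch directory; the directory names other than the two from the question are invented, and -mindepth 1 is added only to keep the starting directory out of the output.

```shell
#!/bin/sh
# Assumes GNU find; the scratch tree is illustrative.
demo=$(mktemp -d)
mkdir "$demo/keepMe" "$demo/someUnwantedPath" "$demo/someMoreUnwantedPath"

# Default GNU syntax (Emacs style): grouping and alternation are written \( \| \).
default_syntax=$(cd "$demo" && find . -mindepth 1 -maxdepth 1 \
    ! -regex '\./\(someUnwantedPath\|someMoreUnwantedPath\)')

# With -regextype posix-extended the unescaped ( | ) form works.
extended_syntax=$(cd "$demo" && find . -mindepth 1 -maxdepth 1 \
    -regextype posix-extended \
    ! -regex '\./(someUnwantedPath|someMoreUnwantedPath)')

echo "default: $default_syntax"
echo "extended: $extended_syntax"
rm -rf "$demo"
```

The patterns here have no trailing /.* because, with -maxdepth 1, the paths being tested are the directories themselves, not their contents.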
You can use grep -v -f instead of a regex to keep it clean. The alternation (|) operator does not work on many Unix systems. Your exclude file should list all the files to be excluded (including subdirectories, if any).
cat excl_files.txt
/home/user/somePath/someUnwantedPath1
/home/user/somePath/someUnwantedPath2
/home/user/somePath/someUnwantedPath3
..
/home/user/somePath/someUnwantedPathn
find "/home/user/somePath/" -maxdepth 1 | grep -v -f excl_files.txt
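One caveat with grep -v -f: each line of the list is treated as a regex and may match anywhere in a path, so someUnwantedPath1 would also drop someUnwantedPath10. A sketch of a stricter variant using -F (fixed strings) and -x (whole line), both standard grep options; the scratch directory, the list file, and the keepMe name are made up:

```shell
#!/bin/sh
# Scratch tree and exclude list are illustrative.
demo=$(mktemp -d)
excl=$(mktemp)
mkdir "$demo/keepMe" "$demo/someUnwantedPath1" "$demo/someUnwantedPath10"
printf '%s\n' './someUnwantedPath1' > "$excl"

# -F: list entries are literal strings; -x: an entry must equal the whole line.
kept=$(cd "$demo" && find . -mindepth 1 -maxdepth 1 \
    | grep -v -F -x -f "$excl" | LC_ALL=C sort)

printf '%s\n' "$kept"
rm -rf "$demo" "$excl"
```

The entries in the list have to be spelled exactly the way find prints them (here with the leading ./).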

globular vs regex in find

I am trying to find file names similar to this: fsimage_0000000000501205926
This is what I tried :
works: find . -name 'fsimage_???????????????????' -mtime -1
Following another SO post I tried this and it doesn't work:
find . -regextype posix-extended -regex '^fsimage_[0-9]{19}' -mtime -1
*** EDIT:
As suggested escaping the curly braces doesn't work either.
What am I doing wrong with the regex command? I am using GNU findutils 4.4.2.
You can use this regex in find:
find . -regextype posix-extended -regex '.*/fsimage_[0-9]{19}'
PS: If you're on OSX then use:
find -E . -regex '.*/fsimage_[0-9]{19}'
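The reason the original attempt fails is that -regex is matched against the whole path (./fsimage_...), not just the file name. A small sketch that shows both patterns against the same scratch directory, assuming GNU find; the second file name is invented as a negative case:

```shell
#!/bin/sh
# Assumes GNU find; the scratch directory is illustrative.
demo=$(mktemp -d)
name=fsimage_$(printf '%019d' 501205926)   # 19 zero-padded digits
: > "$demo/$name"
: > "$demo/fsimage_123"

# The path starts with "./", so a pattern that begins at "fsimage_" matches nothing.
name_only=$(cd "$demo" && find . -regextype posix-extended -regex '^fsimage_[0-9]{19}')

# Allowing for the directory part with ".*/" makes it match.
full_path=$(cd "$demo" && find . -regextype posix-extended -regex '.*/fsimage_[0-9]{19}')

echo "name-only pattern: '$name_only'"
echo "full-path pattern: '$full_path'"
rm -rf "$demo"
```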

Unix - File Search with Regular expression

find . -iname "abc_v?_test.txt" -print
Which finds all the files
abc_v1_test.txt, abc_v2_test.txt, ..., abc_v9_test.txt
But how can I additionally get abc_v10_test.txt, abc_v11_test.txt, ...?
You can use -regex option as well:
find . -regextype posix-egrep -iregex ".*abc_v[0-9]{1,2}_test\.txt$"
You still have the option to get all files and pass them to grep, ack or ag:
find . | ag 'abc_v\d+_test\.txt'
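Note: you can replace ag with egrep, but then write [0-9]+ instead of \d+, because POSIX extended regular expressions do not define \d.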
In the end, I implemented it this way:
find . -iname "abc_v*_test.txt" -print
Is there any regular expression that accepts only 1 or 2 digits after v?
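If you want to stay with globs, one or two digits can be expressed as two -iname tests joined by -o. This is only a sketch: -iname is a GNU/BSD extension rather than POSIX, and the scratch directory and file names are made up.

```shell
#!/bin/sh
# Scratch files are illustrative; two should match, two should not.
demo=$(mktemp -d)
for name in abc_v1_test.txt ABC_V10_test.txt abc_v100_test.txt abc_vx_test.txt; do
    : > "$demo/$name"
done

# One glob for a single digit, one for exactly two digits.
matches=$(cd "$demo" && find . \( -iname 'abc_v[0-9]_test.txt' \
    -o -iname 'abc_v[0-9][0-9]_test.txt' \) | LC_ALL=C sort)

printf '%s\n' "$matches"
rm -rf "$demo"
```

Unlike abc_v*_test.txt, this rejects names with three or more digits or with non-digits after the v.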