Unix - File Search with Regular expression - regex

find . -iname "abc_v?_test.txt" -print
This finds all the files
abc_v1_test.txt, abc_v2_test.txt, ..., abc_v9_test.txt
But how can I additionally get abc_v10_test.txt, abc_v11_test.txt, ...?

You can use -regex option as well:
find . -regextype posix-egrep -iregex ".*abc_v[0-9]{1,2}_test\.txt$"
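If you want to convince yourself that {1,2} really limits the match to one or two digits, a throwaway directory is enough. This is only a sketch: it assumes GNU find (for -regextype) and mktemp, and the extra file names (abc_v123_test.txt, abc_vX_test.txt) are made up for the test.

```shell
# Scratch directory with a few made-up file names.
tmp=$(mktemp -d)
touch "$tmp/abc_v1_test.txt" "$tmp/abc_v10_test.txt" \
      "$tmp/abc_v123_test.txt" "$tmp/abc_vX_test.txt"

# [0-9]{1,2} accepts one or two digits, so v1 and v10 are listed,
# while v123 (three digits) and vX (no digit) are not.
result=$(cd "$tmp" && find . -regextype posix-egrep \
    -iregex '.*abc_v[0-9]{1,2}_test\.txt$' | LC_ALL=C sort)
printf '%s\n' "$result"

rm -rf "$tmp"
```

The sort is only there to make the listing order stable; find itself returns files in directory order.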

You still have the option to get all files and pass them to grep, ack or ag:
find . | ag 'abc_v\d+_test\.txt'
Note that you can replace ag with egrep, but then write [0-9]+ instead of \d+, since POSIX extended regular expressions have no \d.
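For the grep route, something like the following should do. It is a sketch that assumes a grep supporting -E (any POSIX grep) and uses made-up file names in a scratch directory; [0-9]+ stands in for \d+.

```shell
# Scratch directory with made-up file names.
tmp=$(mktemp -d)
touch "$tmp/abc_v1_test.txt" "$tmp/abc_v10_test.txt" "$tmp/abc_test.txt"

# find lists everything; grep -E keeps only the paths ending in
# abc_v<digits>_test.txt. abc_test.txt has no "v<digits>" part and is dropped.
result=$(cd "$tmp" && find . | grep -E 'abc_v[0-9]+_test\.txt$' | LC_ALL=C sort)
printf '%s\n' "$result"

rm -rf "$tmp"
```

Unlike -regex, grep does not have to match the whole path, which is why no leading .* is needed here.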

In the end, I implemented it this way:
find . -iname "abc_v*_test.txt" -print
Is there any regular expression that accepts only 1 or 2 digits after v?

Related

RegEx for finding all files that start with any letter followed by a number

I am trying to make my bash script smart. I have some code that is doing a clean up of 10000's of files. What I am trying to work out is how can I find all files that have a bunch of letters a-z or A-Z in front of 0901*.*
find . -maxdepth 1 -type f -regextype posix-extended -regex '[a-z]\(0901).*'
find . -maxdepth 1 -type f -regextype posix-extended -regex '[a-z]*.*'
returns everything
while
find . -maxdepth 1 -type f -regextype posix-extended -regex '[a-z]\(0901).*'
returns nothing.
How do I solve this problem?
The following option should work with sed as the regex engine:
find . -maxdepth 1 -type f -regextype sed -regex "[A-Za-z]\+0901.*"
I am interpreting "a bunch of letters" as one or more letters, hence I used [A-Za-z]+ in front of the digits. You should not need parentheses here, but if you wanted to use them, you would have to escape them with a backslash:
find . -maxdepth 1 -type f -regextype sed -regex "[A-Za-z]\+\(0901\).*"
Because -regex is matched against the entire path (relative, in this case), you need to take the leading ./ into account. So, borrowing heavily from Tim's answer:
find . -maxdepth 1 -type f -regextype sed -regex '.*[A-Za-z]\+\(0901\).*'
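The whole-path behaviour is easy to see side by side. The sketch below assumes GNU find and uses made-up file names; the second pattern uses '.*/' rather than the plain '.*' above, which is my own variation so that the letters have to start right after the last slash.

```shell
# Scratch directory with made-up file names.
tmp=$(mktemp -d)
touch "$tmp/abc0901.log" "$tmp/X0901_report.txt" "$tmp/0901.txt" "$tmp/abc0902.log"

# No allowance for "./": the regex must match the path from its first
# character, so nothing is found.
unanchored=$(cd "$tmp" && find . -maxdepth 1 -type f -regextype sed \
    -regex '[A-Za-z]\+0901.*')

# With the directory prefix covered, files that start with letters
# followed by 0901 are found; 0901.txt (no letters) and abc0902.log are not.
anchored=$(cd "$tmp" && find . -maxdepth 1 -type f -regextype sed \
    -regex '.*/[A-Za-z]\+0901.*' | LC_ALL=C sort)

printf 'without prefix: [%s]\n' "$unanchored"
printf '%s\n' "$anchored"

rm -rf "$tmp"
```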
I think you have to use this regex:
"[A-Za-z]0901"
Good Luck!

pattern to match multiple filenames with find utility

How to find multiple filenames with the bash find command?
$ find /path/* -type f -name pattern
The pattern should match a list of file names:
fname1.jpg
fname2.png
myfile.css
example.gif
I tried with
https://alvinalexander.com/linux-unix/linux-find-multiple-filenames-patterns-command-example
find multiple filenames command: finding three filename extensions
find . -type f \( -name "*cache" -o -name "*xml" -o -name "*html" \)
and it works.
Anyway I think it would be cleaner with a -name pattern, rather than with a list of -names.
from
$ man find
-name pattern
I'm searching for something like: -name '[fname2.png|myfile.css|example.gif ]'
A -regex alternative would look as follows:
find . -type f -regextype posix-egrep -regex ".+\.(jpg|png|css)$"
As for -name option:
-name pattern - Base of file name (the path with the leading
directories removed) matches shell pattern.
Shell pattern is not a full-fledged regex pattern.
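If the goal is the exact list of names from the question rather than a set of extensions, the same alternation idea can carry full file names. This is a sketch assuming GNU find; the sub directory and the two non-matching files are made up for the test.

```shell
# Scratch tree: the four names from the question plus two decoys.
tmp=$(mktemp -d)
mkdir "$tmp/sub"
touch "$tmp/fname1.jpg" "$tmp/fname2.png" "$tmp/sub/myfile.css" \
      "$tmp/example.gif" "$tmp/other.txt" "$tmp/xfname1.jpg"

# '.*/' consumes the directory part; the group lists the accepted base names.
# xfname1.jpg is rejected because the name must start right after a slash.
result=$(cd "$tmp" && find . -type f -regextype posix-egrep \
    -regex '.*/(fname1\.jpg|fname2\.png|myfile\.css|example\.gif)' | LC_ALL=C sort)
printf '%s\n' "$result"

rm -rf "$tmp"
```

Whether this is cleaner than a chain of -name ... -o -name ... is a matter of taste; the -name version is more portable, since -regextype is a GNU extension.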
Just mix them:
find -name "aoc*" -regextype awk -regex ".*[0-9]\.(class|scala)"
This searches for files that match the shell pattern aoc* and whose name ends in a digit followed by .class or .scala. The dot before the extension is escaped so that it matches a literal dot only.
For your example:
find -name "fname*" -regextype awk -regex ".*[0-9]\.(png|jpg|css)"
Available types are listed with:
find -regextype -help
However, I first tried "-regextype sed", which is available, but sed itself has options that change the regex style, and the patterns I normally use with sed didn't work. Since the pattern works with awk, that's sufficient for me.

How to find files with regex and list them?

I am new to the whole command-line thing and trying to figure out how to search the current directory and its sub directories for files with a specific filename via regex. Then I want to have the files listed in my command-line.
The regex should match files like:
B2ctes_UCUAAwF-K-large-123x322-132x423.jpg
this_is-a-123-file_name-3124x2445-4235x32.jpeg
file-32x32-64x64.png
The important part is the -[number]x[number]-[number]x[number]
My attempt looks like this:
find . -type f -regex ".+?-\d+x\d+-\d+x\d+\.\w{3,4}" -ls;
There are two problems with this:
-ls shows a lot of information. I just want the filenames.
The regex doesn’t work. I have tried to use .+, but even that does not return anything.
You can use this find with regex:
find . -regextype posix-extended -type f -regex ".*-[[:digit:]]+x[[:digit:]]+-[[:digit:]]+x[[:digit:]]+\.[[:alnum:]]{3,4}"
Or on OSX:
find -E . -type f -regex ".*-[[:digit:]]+x[[:digit:]]+-[[:digit:]]+x[[:digit:]]+\.[[:alnum:]]{3,4}"
And without regex:
find . -type f -name "*-[[:digit:]]*x[[:digit:]]*-[[:digit:]]*x[[:digit:]]*.[[:alnum:]]*"
What about simply:
find . -type f -name '*-[0-9]*x[0-9]*-[0-9]*x[0-9]*'
or
find . -type f -regextype posix-egrep -regex '.*-[0-9]+x[0-9]+-[0-9]+x[0-9]+.*'
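On the first problem (only wanting the file names): dropping -ls already gives one path per line, and with GNU find, -printf '%f\n' prints just the base name. A sketch, assuming GNU find; the img directory and plain-32x32.png are made up, the other two names come from the question.

```shell
# Scratch tree with two matching names and one decoy.
tmp=$(mktemp -d)
mkdir "$tmp/img"
touch "$tmp/img/file-32x32-64x64.png" \
      "$tmp/img/this_is-a-123-file_name-3124x2445-4235x32.jpeg" \
      "$tmp/plain-32x32.png"

# %f is the file name with leading directories removed.
# plain-32x32.png has only one NxN block, so it is not listed.
result=$(cd "$tmp" && find . -type f -regextype posix-extended \
    -regex '.*-[0-9]+x[0-9]+-[0-9]+x[0-9]+\.[[:alnum:]]{3,4}' \
    -printf '%f\n' | LC_ALL=C sort)
printf '%s\n' "$result"

rm -rf "$tmp"
```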

Shell - How to deal with find -regex?

I need to look in a directory for sub-directories whose names all start with "course", followed by a version number. For example
course1.1.0.0
course1.2.0.0
course1.3.0.0
So how should I modify my command to make it give me the right list of directories?
find test -regex "[course*]" -type d
You can do:
find test -type d -regex '.*/course[0-9.]*'
It will match directories whose name is course followed by any number of digits and dots.
For example:
$ ls course*
course1.23.0 course1.33.534.1 course1.a course1.a.2
$ find test -type d -regex '.*course[0-9.]*'
test/course1.33.534.1
test/course1.23.0
You need to remove the brackets, use the proper wildcard syntax for regexes (.*), and allow for the directory part of the path, because -regex is matched against the whole path:
find test -regex ".*/course.*" -type d
You can also use the more familiar shell wildcard syntax, by using the -name option instead of -regex:
find test -name 'course*' -type d
I suggest using a regex for precise matching of version number sub directories:
find . -type d -iregex '^\./course\([0-9]\.\)*[0-9]$'
TESTING:
ls -d course*
course1.1.0.0 course1.1.0.5 course1.2.0.0 course1.txt
find . -type d -iregex '^\./course\([0-9]\.\)*[0-9]$'
./course1.1.0.0
./course1.1.0.5
./course1.2.0.0
UPDATE: To match [0-9]. exactly 3 times use this find command:
find test -type d -regex '.*/course[0-9]\.[0-9]\.[0-9]\.[0-9]$'
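If the version components can have more than one digit, a slightly looser pattern may be handier. This is my own variation rather than something from the answers above: it assumes GNU find with -regextype posix-extended, and the directory names other than course1.1.0.0 and course1.2.0.0 are made up.

```shell
# Scratch tree: two versioned directories, some decoys, and one regular file.
tmp=$(mktemp -d)
mkdir -p "$tmp/test/course1.1.0.0" "$tmp/test/course1.2.0.0" \
         "$tmp/test/course1.a" "$tmp/test/courses" "$tmp/test/other"
touch "$tmp/test/course1.3.0.0"   # regular file, filtered out by -type d

# course, then a number, then any count of ".number" groups, and nothing else.
result=$(cd "$tmp" && find test -type d -regextype posix-extended \
    -regex '.*/course[0-9]+(\.[0-9]+)*' | LC_ALL=C sort)
printf '%s\n' "$result"

rm -rf "$tmp"
```

Because the regex has to cover the whole path, course1.a and courses fall out without needing an explicit $ anchor.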

Regex to match logfiles 1 to 11

I would like to simply fetch logfiles 1 to 11 out of 500 with one regex:
log4j-cnode1.log.11
log4j-cnode1.log.10
log4j-cnode1.log.9
log4j-cnode1.log.8
log4j-cnode1.log.7
log4j-cnode1.log.6
log4j-cnode1.log.5
log4j-cnode1.log.4
log4j-cnode1.log.3
log4j-cnode1.log.2
log4j-cnode1.log.1
so I do not want to fetch log4j-cnode1.log.12, log4j-cnode1.log.13, ... , log4j-cnode1.log.500
I was trying this command:
find . -iname "log4j-cnode1*\.log\.(1[0-1]|[1-9])"
why does this not work?
1 to 9 works fine with this:
find . -iname "log4j-cnode1*\.log\.[1-9]"
Because -iname doesn't accept regular expressions, and even if it did, your 1* would probably not be what you want. Use -iregex:
find -regextype posix-extended -iregex '(.*/)?log4j-cnode1.*\.log\.(1[0-1]|[1-9])'
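find . -regextype posix-extended -iregex ".*/log4j-cnode1.*\.log\.(1[01]|[1-9])"
Your regex says: either 1 followed by 0 or 1, or a single digit from 1 to 9. That part is fine, but it only takes effect with -iregex, because -iname treats the pattern as a shell glob.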
$ find -name 'log4j-cnode1*\.log\.[0-9]*'
./log4j-cnode1.log.1
./log4j-cnode1.log.10
./log4j-cnode1.log.11
./log4j-cnode1.log.2
./log4j-cnode1.log.3
./log4j-cnode1.log.4
./log4j-cnode1.log.5
./log4j-cnode1.log.6
./log4j-cnode1.log.7
./log4j-cnode1.log.8
./log4j-cnode1.log.9
You got it almost right.
But, instead of -iname, use -iregex with -regextype egrep (or awk), like this:
find . -regextype egrep \
-iregex ".*log4j-cnode1.*\.log\.(1[0-1]|[1-9])"
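To check that 12 and above really stay out, you can generate a handful of fake logs and count the hits. A sketch, assuming GNU find and seq; only logs 1 to 13 are created here instead of the 500 from the question.

```shell
# Scratch directory with log4j-cnode1.log and log4j-cnode1.log.1 .. .13
tmp=$(mktemp -d)
touch "$tmp/log4j-cnode1.log"
for i in $(seq 1 13); do
    touch "$tmp/log4j-cnode1.log.$i"
done

# (1[01]|[1-9]) accepts 10, 11, or a single digit 1-9. Since -iregex must
# match the whole path, .12 and .13 cannot sneak in via the [1-9] branch.
matches=$(cd "$tmp" && find . -regextype posix-extended \
    -iregex '.*/log4j-cnode1.*\.log\.(1[01]|[1-9])')
count=$(printf '%s\n' "$matches" | grep -c .)

printf '%s\n' "$matches"
printf 'matched: %s\n' "$count"

rm -rf "$tmp"
```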