Regex: Find files not ending with numeral suffix

I need a command that returns all files without a numeric suffix (*.0, *.123, ...).
For example, given three files:
gg.p qqq.449 rtr55
I want to find only these:
./rtr55
./gg.p
I tried to filter them with grep, but the filter had no effect:
find -type f | grep -v '\.[0-9]+$'
(This command returned:)
./qqq.449
./rtr55
./gg.p
So there is probably an error in the regex. Do you know how to fix it?

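In basic regular expressions, which grep uses by default, + is an ordinary character; it is a repetition operator only in extended regular expressions. There are several ways to fix the command (the \+ form is a GNU extension):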
find -type f | grep -v '\.[0-9]\+$'
find -type f | egrep -v '\.[0-9]+$'
find -type f | grep -E -v '\.[0-9]+$'
find -type f | grep -v '\.[0-9][0-9]*$'
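The difference is easy to reproduce. The sketch below is an illustration rather than part of the original answer: it recreates the question's three files in a temporary directory and runs the failing filter next to one of the fixes. The names demo_dir, bre_result, and ere_result are chosen here, and the sketch assumes a grep that treats an unescaped + in a basic regular expression as a literal character, as POSIX specifies.

```shell
# Recreate the question's three files in a scratch directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/gg.p" "$demo_dir/qqq.449" "$demo_dir/rtr55"

# Basic regex: '+' is a literal plus sign, so no line matches and
# grep -v lets every file name through.
bre_result=$(cd "$demo_dir" && find . -type f | grep -v '\.[0-9]+$' | LC_ALL=C sort)

# Extended regex: '+' means "one or more", so ./qqq.449 is filtered out.
ere_result=$(cd "$demo_dir" && find . -type f | grep -E -v '\.[0-9]+$' | LC_ALL=C sort)

printf 'basic:\n%s\n\nextended:\n%s\n' "$bre_result" "$ere_result"
rm -rf "$demo_dir"
```

The first result still lists all three files; the second lists only ./gg.p and ./rtr55.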

Why would you use grep at all?
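find . -regex '.*\.[0-9][0-9]*' -prune -o -type f -print
The explicit -print matters: without any action, find applies an implicit -print to the whole expression, so the pruned names would be printed as well.
If your expressions are simple enough (or your find doesn't support -regex), you could use -name instead of -regex, but a glob wildcard can't capture an arbitrary number of digits after the dot. Here is a version that handles one or two:
find . -name '*.[0-9]' -prune -o -name '*.[0-9][0-9]' -prune -o -type f -print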
Notice that this isn't purely an efficiency question; grep would simply not do the right thing if you ever come across file names with newlines in them.
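To make the newline problem concrete, here is a small sketch under stated assumptions: the file name containing a newline is a made-up worst case, the variable names are chosen for this example, and -regex is assumed to be available (GNU and BSD find have it, POSIX does not require it).

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/gg.p" "$demo_dir/qqq.449" "$demo_dir/rtr55"
# One file whose name contains a newline (a made-up worst case).
touch "$demo_dir/odd.1
tail"

# Line-based filtering splits that name into "./odd.1" (dropped) and
# "tail" (kept, although no file called "tail" exists).
piped=$(cd "$demo_dir" && find . -type f | grep -v '\.[0-9][0-9]*$' | LC_ALL=C sort)

# find alone tests whole path names; print one 'x' per match and count them.
found_count=$(cd "$demo_dir" &&
  find . -type f ! -regex '.*\.[0-9][0-9]*' -exec printf 'x' \; | wc -c)

printf 'pipeline output:\n%s\nfiles matched by find alone: %d\n' "$piped" "$found_count"
rm -rf "$demo_dir"
```

The pipeline reports a path, tail, that does not exist, while find on its own counts the three real files that lack a numeric suffix.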

Related

Deleting files not containing double digit number and pattern in grep

The pattern below is supposed to delete all files that don't start with 1_, but instead it matches all files that don't contain 1.
For example, it skips both 11_xxx.sql.bz2 and 1_xxx.sql.bz2 but matches all the others correctly.
How can I ensure the pattern matches only the exact number and not any number that contains it?
For example, I would like the script below to skip only 1_xxx.sql.bz2.
ls | grep -P "^[^1]+_([^_]+).+$" | xargs -d"\n" rm

I will need to keep items without a number at the start.
I suggest using find like this to delete all files in the current directory whose names start with a digit, except those that start with 1_:
find . -maxdepth 1 -type f -name '[0-9]*' -not -name '1_*' -delete
If your find doesn't support -delete then use:
find . -maxdepth 1 -type f -name '[0-9]*' -not -name '1_*' -exec rm {} +
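Before deleting anything, it may be worth running the same tests with -print as a dry run. The sketch below does that in a temporary directory; the sample file names are modeled on the question and the variable names are chosen for this example. It assumes a find that supports -maxdepth and -not, as GNU and BSD find do.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/1_xxx.sql.bz2" "$demo_dir/11_xxx.sql.bz2" \
      "$demo_dir/2_xxx.sql.bz2" "$demo_dir/notes.txt"

# Same tests as the delete command, but with -print as a dry run.
would_delete=$(cd "$demo_dir" &&
  find . -maxdepth 1 -type f -name '[0-9]*' -not -name '1_*' -print | LC_ALL=C sort)

printf 'would delete:\n%s\n' "$would_delete"
rm -rf "$demo_dir"
```

Only 11_xxx.sql.bz2 and 2_xxx.sql.bz2 are listed: 1_xxx.sql.bz2 is excluded by the second -name test, and notes.txt never matches the first.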

Use grep -v to invert the match, so that files matching the pattern are excluded:
grep -v '^1_'

Recursively find filenames of exactly 8 hex characters, but not all 0-9, no lookahead (Mac terminal, bash)

I'm trying to write a regex to find files recursively with Mac Terminal (bash, not zsh even though Catalina wants me to switch over for whatever reason) using the find command. I'm looking for files that are:
Exactly 8 hexadecimal digits (0-9 and A-F)
But NOT only decimal digits (0-9)
In other words, it would match A1234567, ABC12DEF, 12345ABC, and ABCDABCD, but not 12345678 or 09876543.
To find files that are exactly 8 hex digits, I've used this:
find -E . -type f -regex '.*/[A-F0-9]{8}'
The .*/ is necessary to allow the full path name to precede the filename. This is eventually going to get fed to rm, so I have to keep the path.
It SEEMS like this should work to fulfill both of my requirements:
find -E . -type f -regex '.*/(?![0-9]{8})[A-F0-9]{8}'
But that returns an error:
find: -regex: .*/(?![0-9]{8})[A-F0-9]{8}: repetition-operator operand invalid
It seems like the find command doesn't support lookaheads. How can I do this without one?

With any POSIX-compliant find
find . -type f \
-name '????????' \
! -name '*[![:xdigit:]]*' \
-name '*[![:digit:]]*'
And if you insist on using regexps for this, here you go
find -E . -type f \
-regex '.*/[[:xdigit:]]{8}' \
! -regex '.*/[[:digit:]]*'
Those who use GNU find should drop -E and insert -regextype posix-extended after paths to make this work.
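The three -name tests in the POSIX variant can be checked on a handful of sample names. This is a sketch for illustration only; the file names and the variable hex_names are invented here. Note that [:xdigit:] also accepts lowercase a-f, which the question's [A-F0-9] does not.

```shell
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
touch "$demo_dir/A1234567" "$demo_dir/ABCDABCD" "$demo_dir/sub/12345ABC" \
      "$demo_dir/12345678" "$demo_dir/ABCDEFG1" "$demo_dir/ABC"

# 1st -name: exactly eight characters.
# 2nd -name (negated): no character outside the hex digits.
# 3rd -name: at least one character that is not a decimal digit.
hex_names=$(cd "$demo_dir" &&
  find . -type f \
    -name '????????' \
    ! -name '*[![:xdigit:]]*' \
    -name '*[![:digit:]]*' | LC_ALL=C sort)

printf '%s\n' "$hex_names"
rm -rf "$demo_dir"
```

12345678 is rejected for being all digits, ABCDEFG1 for containing G, and ABC for being too short.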

It's probably easiest to just filter out the results you don't like:
find -E . -type f -regex '.*/[A-F0-9]{8}' -print | egrep -v '.*/[0-9]{8}$'
$ find -E . -type f -regex '.*/[A-F0-9]{8}' -print
./01234567
./ABCDEFAF
./ABCDEF01
./ABCDEF2A
./ABCDEFA2
./x/01234567
./x/ABCDEFAF
./x/ABCDEF01
./x/ABCDEF2A
./x/ABCDEFA2
$ find -E . -type f -regex '.*/[A-F0-9]{8}' -print | egrep -v '.*/[0-9]{8}$'
./ABCDEFAF
./ABCDEF01
./ABCDEF2A
./ABCDEFA2
./x/ABCDEFAF
./x/ABCDEF01
./x/ABCDEF2A
./x/ABCDEFA2

My find didn't understand -E and was inexplicably grumpy about -regex in general, but this still worked:
find . -type f -name '[A-F0-9][A-F0-9][A-F0-9][A-F0-9][A-F0-9][A-F0-9][A-F0-9][A-F0-9]' -a -name '*[A-F]*'
Not as elegant as oguz ismail's, but easier to read for my clogged brain, lol

pattern to match multiple filenames with find utility

How to find multiple filenames with the bash find command?
$ find /path/* -type f -name pattern
The pattern should match a list of file names:
fname1.jpg
fname2.png
myfile.css
example.gif
I tried the approach from
https://alvinalexander.com/linux-unix/linux-find-multiple-filenames-patterns-command-example
(a find command for multiple filenames, matching three filename extensions):
find . -type f \( -name "*cache" -o -name "*xml" -o -name "*html" \)
and it works.
Still, I think a single -name pattern would be cleaner than a list of -name tests.
From
$ man find
-name pattern
I'm looking for something like: -name '[fname2.png|myfile.css|example.gif ]'

A -regex alternative would look as follows:
find . -type f -regextype posix-egrep -regex ".+\.(jpg|png|css)$"
As for -name option:
-name pattern - Base of file name (the path with the leading directories removed) matches shell pattern.
A shell pattern is not a full-fledged regex pattern.
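Because a shell pattern has no alternation, one portable way to keep the command tidy is to generate the list of -name tests from the wanted names. The following is a minimal sketch, not something the answers above propose: it builds the tests in the positional parameters (so it overwrites "$@"), and the names demo_dir, wanted, and matches are chosen here. Names containing glob characters such as * or [ would need escaping first.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/fname1.jpg" "$demo_dir/fname2.png" "$demo_dir/myfile.css" \
      "$demo_dir/example.gif" "$demo_dir/other.txt" "$demo_dir/fname1.jpg.bak"

# Build "-name A -o -name B ..." in the positional parameters.
set --
for wanted in fname1.jpg fname2.png myfile.css example.gif; do
  if [ "$#" -gt 0 ]; then
    set -- "$@" -o
  fi
  set -- "$@" -name "$wanted"
done

# The parentheses keep -type f applied to every alternative.
matches=$(cd "$demo_dir" && find . -type f \( "$@" \) | LC_ALL=C sort)

printf '%s\n' "$matches"
rm -rf "$demo_dir"
```

Only the four listed names are printed; other.txt and fname1.jpg.bak are skipped because -name must match the whole base name.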

Just mix them:
find -name "aoc*" -regextype awk -regex ".*[0-9]\.(class|scala)"
This finds files whose base name matches the shell pattern aoc* and whose path ends in a digit followed by .class or .scala.
For your example:
find -name "fname*" -regextype awk -regex ".*[0-9]\.(png|jpg|css)"
The available types are listed with:
find -regextype help
I first tried -regextype sed, which is available, but sed itself has options that change its regex dialect, and the patterns I normally use with sed didn't work. Since the pattern works with awk, that is sufficient for me.

Regexp for matching filenames

I have these files:
first.error.log
second1.log
second2.log
FFFpc.log
TR.den.log
bla.error.log
and I would like a pattern that matches all files with error in their names plus a few additional ones, but no more.
For error alone it would be
$FILE_PATTERN="*.error*"
But what if I want to match not only those error files but also all the second* files, FFFpc, and so on?
This does not work:
$FILE_PATTERN="*.error*|^second.*\log$|.*FFPC\.log$"
Thanks in advance for your help
EDIT:
$FILE_PATTERN is later used by:
find /somefolder -type f -name $FILE_PATTERN
EDIT: THIS FILE_PATTERN is in property file that is later used by bash script.

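You need to use find with the -regex option. It matches against the whole path, so the pattern must allow for the leading directories:
find -E /somefolder -type f -regex '.*/([^/]*\.error[^/]*|second[^/]*log|[^/]*FFPC\.log)'
PS: Use -iregex for case-insensitive matching:
find -E /somefolder -type f -iregex '.*/([^/]*\.error[^/]*|second[^/]*log|[^/]*FFPC\.log)'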

$ ls | grep -i '\(.*error.*\)\|\(^second.*\.log$\)\|\(.*FFPC\.log$\)'
bla.error.log
FFFpc.log
first.error.log
second1.log
second2.log
If you want to use it with find, anchor second at a path separator instead of the start of the line, because find prints full paths:
find /somefolder -type f | grep -i '\(.*error.*\)\|\(/second.*\.log$\)\|\(.*FFPC\.log$\)'

If you're in bash I'm assuming you have to grep. Using grep -E or egrep will allow you to use alternation (ORing your searches)
$ stat * | egrep "(error|second)"
File: `first.error.log'
File: `second1.log'
File: `second2.log'
You could use ls instead of stat, but sometimes ls will not give you what you expect. Since you're only searching for filenames, ls should suffice.
$ ls | egrep "(error|second)"
first.error.log
second1.log
second2.log
You can use command substitution to store the output into a bash variable:
FILE_PATTERN=$(ls | egrep "(error|second)")

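# -name takes shell globs, not regular expressions.
FILE_PATTERN=("*.error*" "second*.log" "*FFPC.log")
# Start with the first pattern, then OR in the remaining ones.
ARGS=(-name "${FILE_PATTERN[0]}")
for F in "${FILE_PATTERN[@]:1}"; do
  ARGS+=(-o -name "$F")
done
# Use -iname instead of -name if case should be ignored (e.g. FFFpc.log).
find /somefolder -type f '(' "${ARGS[@]}" ')'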

You were close; there are just a few misplaced symbols.
Here's what I came up with:
.*\.error\..*|^second.*\.log$|.*FF[Pp][Cc]\.log$
Here's a demo of a working modification of your regex:
http://regex101.com/r/rL3rM1/1
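The corrected expression can also be checked locally against the file names from the question. This is a small sketch with variable names chosen here; it uses grep -E because the expression is an extended regular expression, so it would not work as a -name glob.

```shell
# File names from the question, one per line.
file_list='first.error.log
second1.log
second2.log
FFFpc.log
TR.den.log
bla.error.log'

# The corrected extended regular expression from the answer above.
pattern='.*\.error\..*|^second.*\.log$|.*FF[Pp][Cc]\.log$'

# grep keeps the input order, so TR.den.log is the only name dropped.
matched=$(printf '%s\n' "$file_list" | grep -E "$pattern")
printf '%s\n' "$matched"
```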

regextype with find command

I am trying to use the find command with -regextype, but I cannot get it to work.
I am trying to find all .c and .h files and pipe them to grep to search for the name func_foo inside those files. What am I missing?
$ find ./ -regextype sed -regex ".*\[c|h]" | xargs grep -n --color func_foo
Similarly, I tried the following command, but it gives me an error like paths must precede expression:
$ find ./ -type f "*.c" | xargs grep -n --color func_foo

The accepted answer contains some inaccuracies.
On my system, GNU find's manpage says to run find -regextype help to see the list of supported regex types.
# find -regextype help
find: Unknown regular expression type 'help'; valid types are 'findutils-default', 'awk', 'egrep', 'ed', 'emacs', 'gnu-awk', 'grep', 'posix-awk', 'posix-basic', 'posix-egrep', 'posix-extended', 'posix-minimal-basic', 'sed'.
E.g. find . -regextype egrep -regex '.*\.(c|h)' finds .c and .h files.
Your regexp syntax was wrong, you had square brackets instead of parentheses. With square brackets, it would be [ch].
You can just use the default regexp type as well: find . -regex '.*\.\(c\|h\)$' also works. Notice that you have to escape (, |, ) characters in this case (with sed regextype as well). You don't have to escape them when using egrep, posix-egrep, posix-extended.
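The two spellings can be compared side by side. The sketch below assumes GNU find, since -regextype is a GNU extension, and uses sample file names and variable names invented for this example.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/main.c" "$demo_dir/util.h" "$demo_dir/notes.txt" "$demo_dir/arch"

# egrep syntax: grouping and alternation are written unescaped.
with_egrep=$(cd "$demo_dir" &&
  find . -regextype egrep -regex '.*\.(c|h)' | LC_ALL=C sort)

# Default (emacs) syntax: the same operators need backslashes.
with_default=$(cd "$demo_dir" &&
  find . -regex '.*\.\(c\|h\)' | LC_ALL=C sort)

printf 'egrep:\n%s\ndefault:\n%s\n' "$with_egrep" "$with_default"
rm -rf "$demo_dir"
```

Both commands list ./main.c and ./util.h. The file named arch is not matched, because the pattern requires a literal dot before the final c or h.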

Why not just do:
find ./ -name "*.[ch]" | xargs grep -n --color func_foo
and
find ./ -type f -name "*.c" | xargs grep -n --color func_foo
Regarding the valid parameters to find's option -regextype, this comes verbatim from man find:
-regextype type
Changes the regular expression syntax understood by -regex and -iregex tests which occur later on the command line. Currently-implemented types are emacs (this is the default), posix-awk, posix-basic, posix-egrep and posix-extended
In that version of find there is no type sed; newer GNU find releases do accept it, as the list printed by find -regextype help shows.
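The paths must precede expression error in the question comes from passing "*.c" as a bare argument without -name. A variant that also avoids xargs, and with it the trouble xargs has with unusual file names, is to let find run grep itself. This is a sketch with invented sample files and variable names; it uses grep -l to list only the files that contain the name.

```shell
demo_dir=$(mktemp -d)
printf 'int func_foo(void);\n' > "$demo_dir/util.h"
printf '#include "util.h"\nint func_foo(void) { return 0; }\n' > "$demo_dir/main.c"
printf 'int other(void);\n' > "$demo_dir/other.c"
printf 'func_foo is mentioned here too\n' > "$demo_dir/notes.txt"

# -name supplies the glob that the failing command passed as a bare argument;
# -exec ... {} + hands the files to grep directly, without xargs.
hits=$(cd "$demo_dir" &&
  find . -type f -name '*.[ch]' -exec grep -l 'func_foo' {} + | LC_ALL=C sort)

printf '%s\n' "$hits"
rm -rf "$demo_dir"
```

notes.txt mentions func_foo but is never searched, and other.c is searched but contains no match.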
