Using Regular Expression in Find (CentOS 6.7) - regex

How can I use the regular expression \w{7}[D]\w{6}$ in a find command?

You can try:
find -regextype posix-extended -regex 'YOUR REGEX'
The available -regextype options are:
findutils-default, awk, egrep, emacs, gnu-awk, grep, posix-awk, posix-basic, posix-egrep, posix-extended
References:
Find Utils Manual (CentOS)
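As a minimal sketch of how the regex from the question could be plugged in, assuming GNU find (which accepts \w under posix-extended): the file names below are hypothetical, and the pattern gets a leading .* because -regex is matched against the whole path rather than just the file name. The trailing $ is dropped since the match is already anchored at the end.

```shell
# Scratch directory with two hypothetical 14-character file names.
demo_dir=$(mktemp -d)
touch "$demo_dir/abcdefgDhijklm" "$demo_dir/abcdefgXhijklm"

# -regex is tested against the whole path (./name), so the pattern
# starts with .* to absorb the leading "./".
matches=$(cd "$demo_dir" && find . -regextype posix-extended -regex '.*\w{7}D\w{6}')
printf '%s\n' "$matches"

rm -rf "$demo_dir"
```

Only the name with D in the eighth position is listed.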

Related

How to delete files based on the extension in MacOS terminal using regex?

I need to delete a huge number of .zip and .apk files from my project's root folder, and I'd like to do it from the bash terminal (Mac OS X).
So far I've successfully made it with two commands:
$ find . -name \*.zip -delete
$ find . -name \*.apk -delete
But I want to do it in one using regex:
$ find . -regex '\w*.(apk|zip)' -delete
But this regular expression doesn't seem to work because it isn't deleting anything... What am I doing wrong?
MORE INFO:
An example of what I want to delete is android~1~1~sampleproject.zip.
$ find -E . -regex './[~a-zA-Z0-9]+\.(apk|zip)' -delete
find tries to match the whole path, not just the file name, so the regex has to start with ./
I believe find doesn't support \w, \d, etc., so replace them with a bracket expression such as [~a-zA-Z0-9]. Basic regular expressions don't support + or (apk|zip) alternation either, so you need to add -E to enable extended regular expressions.
-E Interpret regular expressions followed by -regex and -iregex primaries as extended (modern) regular expressions rather than basic regular expressions (BRE's). The re_format(7) manual page fully describes both formats.
Example
For example consider the following commands
$ ls *.json
bower.json composer.json package.json
$ find -E . -regex "\./[a-zA-Z0-9]+\.(json)"
./bower.json
./composer.json
./package.json
Note: the above answer is specifically for BSD find. GNU find doesn't support the -E option; it supports -regextype posix-extended instead. With GNU find, the above example can be rewritten as
$ find . -regextype posix-extended -regex "\./\w+\.(json)"
I would use:
find . -type f \( -name "*.zip" -o -name "*.apk" \) -delete
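Since -delete is irreversible, it can help to run the same expression with -print first. The following is only a sketch in a scratch directory; the file names other than the one from the question are made up for illustration.

```shell
# Scratch directory: two files that should go, one that should stay.
demo_dir=$(mktemp -d)
touch "$demo_dir/android~1~1~sampleproject.zip" "$demo_dir/app.apk" "$demo_dir/keep.txt"

# Dry run: list what the expression selects without deleting anything.
would_delete=$(cd "$demo_dir" && find . -type f \( -name '*.zip' -o -name '*.apk' \) -print | LC_ALL=C sort)
printf 'would delete:\n%s\n' "$would_delete"

# Same expression, now with -delete in place of -print.
(cd "$demo_dir" && find . -type f \( -name '*.zip' -o -name '*.apk' \) -delete)

# Check what is left.
remaining=$(cd "$demo_dir" && find . -type f | LC_ALL=C sort)
printf 'remaining:\n%s\n' "$remaining"

rm -rf "$demo_dir"
```

The parentheses matter: without them, -delete would bind only to the second -name test.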

Unix find not respecting regex

I'm trying to do a simple find in my /var/log directory to find all syslog files that are not zipped. What I have so far is the regex:
syslog(\.[0-9]*)?$
So this would find syslog, syslog.1, syslog.999, etc and skip over the gzipped logs like syslog.1.gz or anything else not matching the pattern of the aforementioned syslogs. I'm doing a pretty basic find command, too:
find /var/log -regextype posix-extended -regex "syslog(\\.[0-9]*)?$"
However, I always get an empty result! Now, I thought the regex I wrote was POSIX-extended compatible, but it doesn't seem to be so. Here are variations of the command I ran, to no avail:
find /var/log -regextype posix-extended -regex "syslog(\\.[0-9]*)?$"
sudo find /var/log -regextype posix-extended -regex "syslog(\\.[0-9]*)?$"
find /var/log -regextype posix-extended -regex "syslog"
find /var/log -regextype posix-extended -regex "(syslog)"
The following works as expected by listing all files in the directory, however, so I know my command format is correct.
find /var/log -regextype posix-extended -regex ".*"
What am I doing wrong?
The regex pattern you provide needs to match the whole path. That means that you don't need to anchor it at the beginning and end with ^ and $, it's already implicitly anchored at both ends. But you do need to provide a leading .* or something similar if the rest of your pattern should match somewhere other than the beginning (and remember, find paths always include a directory, even if it's .).
find . -regextype posix-extended -regex '.*syslog(\.[0-9]*)?'
works for me.
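The difference can be reproduced in a scratch directory. This is a sketch assuming GNU find; the three file names stand in for the logs described in the question.

```shell
# Scratch directory with two plain logs and one gzipped log.
demo_dir=$(mktemp -d)
touch "$demo_dir/syslog" "$demo_dir/syslog.1" "$demo_dir/syslog.1.gz"

# Without a leading .* the pattern would have to equal the whole path,
# and no path is exactly "syslog", so nothing is printed.
bare_pattern=$(cd "$demo_dir" && find . -regextype posix-extended -regex 'syslog(\.[0-9]*)?')

# With .* in front, the "./" prefix is absorbed and the logs match.
prefixed_pattern=$(cd "$demo_dir" && find . -regextype posix-extended -regex '.*syslog(\.[0-9]*)?' | LC_ALL=C sort)

printf 'bare: [%s]\n' "$bare_pattern"
printf 'prefixed:\n%s\n' "$prefixed_pattern"

rm -rf "$demo_dir"
```

syslog.1.gz is skipped in both cases because the implicit end anchor leaves no room for the .gz suffix.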

Or condition in find's regex expression

Why does this find nothing when the files exist?
find ./ -regex '.*(jar|war)'
See man find for information about supported syntax for regular expressions:
-regextype type
Changes the regular expression syntax understood by -regex and -iregex tests which occur later on the command line. Currently-implemented types are emacs (this is the default), posix-awk, posix-basic, posix-egrep and posix-extended.
This works:
$ find . -regextype egrep -regex '(.*jar)|(.*war)'
To avoid using -regextype change the expression to an emacs regular expression:
$ find . -regex '.*\(jar\|war\)'
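To see the three variants side by side, here is a small sketch assuming GNU find with its default emacs-style syntax; the file names are hypothetical.

```shell
# Scratch directory with one file per extension plus a non-match.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.jar" "$demo_dir/b.war" "$demo_dir/c.txt"

# Default (emacs) syntax: bare ( | ) are literal characters, so this
# would only match a path containing the text "(jar|war)".
literal_parens=$(cd "$demo_dir" && find . -regex '.*(jar|war)')

# Default syntax with escaped grouping and alternation.
emacs_style=$(cd "$demo_dir" && find . -regex '.*\(jar\|war\)' | LC_ALL=C sort)

# Extended syntax: the unescaped form works as originally written.
posix_style=$(cd "$demo_dir" && find . -regextype posix-extended -regex '.*(jar|war)' | LC_ALL=C sort)

printf 'literal: [%s]\nemacs:\n%s\nposix-extended:\n%s\n' "$literal_parens" "$emacs_style" "$posix_style"

rm -rf "$demo_dir"
```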

Linux $FIND and hex notation of Unicode characters' range?

I am unable to get Unicode hex notation working with the Linux find utility and its -regex functionality. Here is my case.
Given a folder with 5 files such as:
./cmn-我.flac
./cmn-的.flac
./cmn-三.flac
./cmn-a.flac
./cmn-b.flac
To find the files with CJK characters, I tried the following :
find ./ -regex "./cmn-.\.flac" #Finds *ALL* the "*.flac" files, not what I want.
find ./ -regex "./cmn-[\x4e00-\x9fa5]\.flac" #fails
find ./ -regex "./cmn-[\u4e00-\u9fa5]\.flac" #fails
find ./ -regex "./cmn-[\x{4e00}-\x{9fa5}]\.flac" #fails
find ./ -regex "./cmn-[\u{4e00}-\u{9fa5}]\.flac" #fails
find ./ -regex "./cmn-[\U0004e00-\U0009fa5]\.flac" #fails
without success.
How can I find the files with CJK characters using find ./ -regex "[myRegEx]" and a Unicode hex notation regex?
As I explained in "What regex to find files with CJK characters using find command?", find uses POSIX regular expressions, which don't support this kind of pattern.
Explanation
Looking at the -regextype option, I only see POSIX regular expression types: emacs (the default), posix-awk, posix-basic, posix-egrep and posix-extended.
None of them supports custom hex range definitions (compare Perl with POSIX).
Solution
But grep has an experimental -P (--perl-regexp) option that accepts this kind of pattern. Code points above \xff need braces, as in \x{4e00}, and grep has to run in a UTF-8 locale:
find . -name 'cmn-*.flac' -print | grep -P '[\x{4e00}-\x{9fa5}]'
see command explanation.

Wording a regex with bash find and boolean match

I'm not at all sure why this doesn't work. Other posts here suggest that it should. I just want a regex on find to locate all files that match ___orig.png and ___DIFF.png. This will find the first:
find . -type f -regex '.*_____orig\.png'
But this finds nothing:
find . -type f -regex '.*_____(orig|DIFF)\.png'
What is the correct way to phrase the regex to match both? (Yes, I know I can use -or to have a much longer and less maintainable command...)
You need to escape both parens and the pipe, use:
find . -type f -regex '.*_____\(orig\|DIFF\)\.png'
GNU find's -regex uses emacs flavour by default, which I'm not very familiar with. You can change the regex used with -regextype. With -regextype posix-extended your current -regex should work.
The portable way is to use two -name operators.
find . -type f \( -name "*_____orig.png" -o -name "*_____DIFF.png" \) -print
Or, with bash 4.0 or newer, you can use globstar and extglob instead of find:
shopt -s globstar extglob
for file in ./**/*_____@(orig|DIFF).png; do
echo "$file"
done