Unix find not respecting regex

I'm trying to do a simple find in my /var/log directory to find all syslog files that are not zipped. What I have so far is the regex:
syslog(\.[0-9]*)?$
So this would find syslog, syslog.1, syslog.999, etc and skip over the gzipped logs like syslog.1.gz or anything else not matching the pattern of the aforementioned syslogs. I'm doing a pretty basic find command, too:
find /var/log -regextype posix-extended -regex "syslog(\\.[0-9]*)?$"
However, I always get an empty result! Now, I thought the regex I wrote was POSIX-extended compatible, but it doesn't seem to be so. Here are variations of the command I ran, to no avail:
find /var/log -regextype posix-extended -regex "syslog(\\.[0-9]*)?$"
sudo find /var/log -regextype posix-extended -regex "syslog(\\.[0-9]*)?$"
find /var/log -regextype posix-extended -regex "syslog"
find /var/log -regextype posix-extended -regex "(syslog)"
The following, however, works as expected and lists all files in the directory, so I know my command format is correct.
find /var/log -regextype posix-extended -regex ".*"
What am I doing wrong?

The regex pattern you provide needs to match the whole path. That means that you don't need to anchor it at the beginning and end with ^ and $, it's already implicitly anchored at both ends. But you do need to provide a leading .* or something similar if the rest of your pattern should match somewhere other than the beginning (and remember, find paths always include a directory, even if it's .).
find . -regextype posix-extended -regex '.*syslog(\.[0-9]*)?'
works for me.
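To see the whole-path matching in action, here is a small sketch that builds a scratch directory and runs the pattern with and without the leading .* against it. The file names and variable names are made up for illustration, and it assumes GNU find, since -regextype is a GNU extension.

```shell
# Build a scratch directory with sample log names (hypothetical files).
demo=$(mktemp -d)
touch "$demo/syslog" "$demo/syslog.1" "$demo/syslog.1.gz" "$demo/auth.log"

# No leading .*: the regex is tested against the whole path
# (for example "$demo/syslog"), not the basename, so nothing matches.
without_prefix=$(find "$demo" -regextype posix-extended -regex 'syslog(\.[0-9]*)?' | sort)

# Leading .* consumes the directory part of the path.
with_prefix=$(find "$demo" -regextype posix-extended -regex '.*syslog(\.[0-9]*)?' | sort)

printf 'without prefix: [%s]\n' "$without_prefix"
printf 'with prefix:\n%s\n' "$with_prefix"

rm -rf "$demo"
```

The first command prints an empty list, while the second lists syslog and syslog.1 and skips syslog.1.gz.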

Related

Find Regextype non recursive

I'm trying to isolate some infected PHP files, whose names consist of 8 alphanumeric characters, by searching the /home directory recursively.
I can locate them when I'm in the directory itself, with the command:
find ./ -regextype posix-egrep -regex ^./[a-zA-Z0-9]{8}\.php$
Or
find ./ -regextype posix-egrep -regex '^./[a-zA-Z0-9]{8}\.php$'
But as soon as I try from another directory:
find /home -regextype posix-egrep -regex '^./[a-zA-Z0-9]{8}\.php$'
It returns no results.
I have tried adding the -L (-follow) flag, but it still returns no results, and there are many "file system loop" errors.
I have read many answers online, and the problem seems to be related to how glob and find work.
I tried different solutions, such as:
find . -type f -print | egrep '^./[a-zA-Z0-9]{8}\.php$'
Ideally the output should be the full path regardless of depth so I may quickly delete them all.
The main point is that the find command's regex needs to match the entire path, including the file name. So, if there are other folder/directory names before the file name, you need to consume them, too.
Besides, [a-zA-Z0-9] is better replaced with [[:alnum:]]:
find /home -regextype posix-egrep -regex '^.*/[[:alnum:]]{8}\.php$'
Actually, ^ is redundant here:
find /home -regextype posix-egrep -regex '.*/[[:alnum:]]{8}\.php$'
will work, too.
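As a rough illustration of why the starting directory matters, the sketch below uses a scratch tree standing in for /home. All directory and file names are invented, and GNU find is assumed.

```shell
# Scratch tree standing in for /home (all names are made up).
root=$(mktemp -d)
mkdir -p "$root/alice/public_html" "$root/bob"
touch "$root/alice/public_html/a1B2c3D4.php" \
      "$root/bob/Zx9Yw8Vu.php" \
      "$root/bob/index.php" \
      "$root/bob/toolongname1.php"

# "^./" requires the path to be one character, a slash, then the file name.
# Paths printed by find start with "$root/...", so nothing matches.
anchored=$(find "$root" -type f -regextype posix-egrep -regex '^./[a-zA-Z0-9]{8}\.php$')

# ".*/" consumes every leading directory, whatever the depth.
any_depth=$(find "$root" -type f -regextype posix-egrep -regex '.*/[[:alnum:]]{8}\.php' | sort)

printf 'anchored:  [%s]\n' "$anchored"
printf 'any depth:\n%s\n' "$any_depth"

rm -rf "$root"
```

Only the two files with exactly 8 alphanumeric characters before .php are listed, each with its full path. It is worth reviewing such a list before deleting anything.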

Differentiate between .h and .sh with find and regex

I am trying to remove files with certain extensions from a directory. The command I am using is not able to differentiate between .h and .sh. Where can I improve my regex?
This is my current command:
find directory/ -type f -regextype posix-extended -regex '.*.(java|[hc]|cpp|hpp|cc|hh)'
Currently this returns .csh and .sh files. I do not want that to happen. When I remove "[hc]" this fixes the problem, but then I cannot find any .c or .h files. I have also tried
find directory/ -type f -regextype posix-extended -regex '.*.(java|h|c|cpp|hpp|cc|hh)'
but this returns .csh and .sh files as well.
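The culprit is the unescaped dot before the group: in '.*.(...)' it matches any character, so for foo.sh the .* consumes "foo." and the dot matches the "s". Escape the dot so that it only matches a literal ".":
find ... -regex '.*\.(java|h|c|cpp|hpp|cc|hh)$'
Since find's -regex has to match the whole path, the pattern is already anchored at the end and the trailing $ is optional. With the dot escaped, the group has to be the complete extension rather than just its last characters.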
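One way to check which part of the pattern matters is to run the unescaped-dot and escaped-dot versions side by side. This is a sketch on made-up file names in a scratch directory, assuming GNU find.

```shell
# Scratch directory with a mix of source files and shell scripts (hypothetical names).
src=$(mktemp -d)
touch "$src/a.h" "$src/b.c" "$src/c.sh" "$src/d.csh" "$src/e.cpp"

# Unescaped dot: "." may match the "s" in ".sh", so scripts slip through.
loose=$(find "$src" -type f -regextype posix-extended -regex '.*.(java|h|c|cpp|hpp|cc|hh)' | sort)

# Escaped dot: the extension must directly follow a literal ".".
strict=$(find "$src" -type f -regextype posix-extended -regex '.*\.(java|h|c|cpp|hpp|cc|hh)' | sort)

printf 'loose:\n%s\n' "$loose"
printf 'strict:\n%s\n' "$strict"

rm -rf "$src"
```

The loose pattern lists all five files, including c.sh and d.csh. The strict one lists only a.h, b.c and e.cpp.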

Or condition in find's regex expression

Why does this find nothing when the files exist?
find ./ -regex '.*(jar|war)'
See man find for information about supported syntax for regular expressions:
-regextype type
Changes the regular expression syntax understood by -regex and
-iregex tests which occur later on the command line. Currently-
implemented types are emacs (this is the default), posix-awk,
posix-basic, posix-egrep and posix-extended.
This works:
$ find . -regextype egrep -regex '(.*jar)|(.*war)'
To avoid using -regextype change the expression to an emacs regular expression:
$ find . -regex '.*\(jar\|war\)'
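The difference between the two syntaxes can be checked on a scratch directory. The sketch below uses invented file names and assumes GNU find with its default emacs-style regex syntax.

```shell
# Scratch directory with two archives and one unrelated file (hypothetical names).
lib=$(mktemp -d)
touch "$lib/app.jar" "$lib/site.war" "$lib/notes.txt"

# Default syntax: bare ( | ) are literal characters, so nothing matches.
literal=$(find "$lib" -regex '.*(jar|war)')

# Default syntax with escaped grouping and alternation.
emacs=$(find "$lib" -regex '.*\(jar\|war\)' | sort)

# POSIX extended syntax, where ( | ) are operators without backslashes.
extended=$(find "$lib" -regextype posix-extended -regex '.*(jar|war)' | sort)

printf 'literal:  [%s]\n' "$literal"
printf 'emacs:\n%s\n' "$emacs"
printf 'extended:\n%s\n' "$extended"

rm -rf "$lib"
```

The last two commands list the same two files; the first one prints an empty list.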

how to set gnu find use posix-extended regex type as default

When I use find with a regex to find .c, .cpp and .h files,
I have to type
find . -regex ".*\.\(c\|cpp\|h\)"
or use posix-extended regex type
find . -regextype posix-extended -regex ".*\.(c|cpp)"
The first one has so many backslashes that it is hard to read.
The second one requires typing many more characters, although it is the syntax I am familiar with.
Is there any way to make find use posix-extended regex as the default?
I tried to set an alias
alias find='find -regextype posix-extended'
in my .zshrc file, but it doesn't work because find needs the path to come before the expression.
Thanks for any suggestion.
With zsh you have a few options. You can define a global alias:
alias -g reg="-regextype posix-extended"
This will allow you to type find ./ reg -regex ".*\.(c|cpp)" and zsh will do the replacement for you.
The other option is to create a function. Something like:
findr() {
    # First argument is the starting directory; the rest go to find unchanged.
    local dir=$1
    shift
    find "$dir" -regextype posix-extended "$@"
}
You can call it as follows:
findr ./ -regex ".*\.(c|cpp)"

Wording a regex with bash find and boolean match

I'm not at all sure why this doesn't work. Other posts here suggest that it should. I just want a regex on find to locate all files that match ___orig.png and ___DIFF.png. This will find the first:
find . -type f -regex '.*_____orig\.png'
But this finds nothing:
find . -type f -regex '.*_____(orig|DIFF)\.png'
What is the correct way to phrase the regex to match both? (Yes, I know I can use -or to have a much longer and less maintainable command...)
You need to escape both parens and the pipe, use:
find . -type f -regex '.*_____\(orig\|DIFF\)\.png'
GNU find's -regex uses emacs flavour by default, which I'm not very familiar with. You can change the regex used with -regextype. With -regextype posix-extended your current -regex should work.
The portable way is to use two -name operators.
find . -type f \( -name "*_____orig.png" -o -name "*_____DIFF.png" \) -print
Or, with bash 4.0 or newer, you can use globstar and extglob instead of find:
shopt -s globstar extglob
for file in ./**/*_____@(orig|DIFF).png; do
    echo "$file"
done
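To confirm that the escaped regex and the two -name tests select the same files, a sketch like the following can be run on a scratch tree. The file names are made up, and the -regex line assumes GNU find's default emacs-style syntax.

```shell
# Scratch tree with one file per suffix plus one that should not match (hypothetical names).
img=$(mktemp -d)
mkdir -p "$img/run1"
touch "$img/run1/shot_____orig.png" "$img/run1/shot_____DIFF.png" "$img/run1/shot_____new.png"

# Escaped grouping and alternation in the default regex syntax.
by_regex=$(find "$img" -type f -regex '.*_____\(orig\|DIFF\)\.png' | LC_ALL=C sort)

# Portable alternative: two -name tests joined with -o.
by_name=$(find "$img" -type f \( -name '*_____orig.png' -o -name '*_____DIFF.png' \) | LC_ALL=C sort)

printf 'by regex:\n%s\n' "$by_regex"
printf 'by name:\n%s\n' "$by_name"

rm -rf "$img"
```

Both commands print the orig and DIFF files and leave out shot_____new.png.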