Why is this pattern search hanging? - regex

I am running Linux CentOS and I am trying to find some malicious code in my WordPress installation with this command:
grep -r 'php \$[a-zA-Z]*=.as.;' * |awk -F : '{print $1}'
When I hit enter, the process just hangs. I want to double-check that I have the syntax right and that all I have to do is wait.
How can I get some sort of feedback while it's searching?
Thanks

Instead of using grep -r to recursively grep, one option is to use find to get the list of filenames and feed them to grep one at a time. That lets you run other commands alongside the grep, such as echo, to get progress output. For example, you could create a script called is-it-malware.sh that contains this:
#!/bin/bash
if grep 'php \$[a-zA-Z]*=.as.;' "$1" >/dev/null
then
    echo "!!! $1 is malware!!!"
else
    echo "    $1 is fine."
fi
and run this command:
find . -type f -exec ./is-it-malware.sh '{}' ';'
to run your script over every file in the current directory and all of its subdirectories (recursively).
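The same idea can be inlined without a separate script file (a sketch; the progress message goes to stderr so it doesn't mix with the list of matches on stdout):

```shell
# Sketch: per-file progress on stderr, matching filenames on stdout.
find . -type f -exec sh -c '
  echo "checking $1" >&2
  grep -l "php \$[a-zA-Z]*=.as.;" "$1"
' _ {} \;
```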

It's probably taking its time because of the -r * (recursing through all files and directories)?
Consider
find -type f -print0 | xargs -0trn10 grep -l 'php \$[a-zA-Z]*=.as.;'
which will process the files in batches of at most 10 (-n10), printing each command as it goes (-t).
Of course, like that you can probably optimize the heck out of it, with a simple measure like
find -type f -iname '*.php' -print0 | xargs -0trn10 grep -l 'php \$[a-zA-Z]*=.as.;'
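For readability, the bundled -0trn10 can be spelled out using GNU xargs long options (-0 = --null, -t = --verbose, -r = --no-run-if-empty, -n = --max-args):

```shell
# Same pipeline as above, with each xargs flag written out in full.
find . -type f -iname '*.php' -print0 |
  xargs --null --verbose --no-run-if-empty --max-args=10 \
    grep -l 'php \$[a-zA-Z]*=.as.;'
```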
Kind of related:
You can do similar things without find for smaller trees, with recent bash:
shopt -s globstar
grep -l 'pattern' **/*.php

Related

Shell returns all files when regex doesn't match any file in directory?

I use the command below in a shell script, and it works fine in directories where the regex matches.
The problem is that it lists all files when there is no match for the regex. Does anyone know why it behaves this way?
Is there any way to avoid it?
find . -type f -mtime +365 | egrep '.*xxx.yyy*.*'|grep -v "[^.]/" | xargs ls -lrt | tr -s " " | cut -d" " -f6-9
Thanks for your time.
Note: I'm using this script with a Splunk forwarder on Solaris 8.
If the input of xargs is empty, then it will execute ls -lrt in the current folder.
Try xargs -I '{}' ls -lrt '{}' instead. That forces xargs to put the input arguments into a specific place in the command that it executes. If it has no input, it can't do that, and it skips running the command entirely. (The lowercase -i form is deprecated in GNU xargs in favor of -I.)
If you have GNU xargs, you can use the --no-run-if-empty (-r) switch instead.
If that doesn't work, try moving all the grepping into find, so you can use its -ls action to display the list of files. That also avoids running ls when no file matches.
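A sketch of that last suggestion, using GNU find. The -name pattern here is my guess at the intent of the original egrep '.*xxx.yyy*.*'; adjust it to your real naming scheme:

```shell
# Everything inside find: -ls only fires for files that actually match,
# so nothing at all runs when there are no matches.
find . -type f -mtime +365 -name '*xxx.yyy*' -ls
```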

Find and delete all core files in a directory

Core files are generated when a program terminates abnormally. A core file contains the working memory of the process at the moment it crashed, and you can load it into a debugger to investigate the failure. The challenge is:
Delete all core files from a directory (recursive search). Core files can be quite large, so you may want to delete them to save disk space.
Make sure you don't delete any folder named core, or any other file named core that is not actually a memory dump.
After some searching on the internet, I found a nice piece of code to do this. The drawback is that it asks you to confirm each core file, to make sure it's not some other file that just happens to be named core. Source: http://csnbbs.com/
Code:
find . -name core\* -user $USER -type f -size +1000000c -exec file {} \; -exec ls -l {} \; -exec printf "\n\ny to remove this core file\n" \; -exec /bin/rm -i {} \;
Please post if you have better solutions.
To delete all files matching the glob pattern "*.core" you can use:
find . -name "*.core" -type f -delete
find supports many filters like:
-size +1000000c   # larger than 1000000 bytes (~1 MB)
-user $USER       # owned by a specific user
-mtime +3         # modified more than 3 days ago
If you are worried about files ending in "core" that are not actual core dumps, you can filter on the output of the file command, piped through some other commands. For example:
find . -name "*.core" -type f -exec file {} \; | grep 'core file' | awk -F":" '{print $1}' | xargs -n1 -P4 rm -f
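The colon-splitting awk above breaks if a path contains a colon or whitespace. A sketch that sidesteps parsing file's output is to ask file about each candidate individually and delete inside the same -exec:

```shell
# Sketch: run `file` per candidate and delete only real core dumps.
# `file -b` prints the description without the leading "filename:" part,
# so odd characters in the path can't confuse the match.
find . -name '*core*' -type f -exec sh -c '
  file -b "$1" | grep -q "core file" && rm -f "$1"
' _ {} \;
```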

Find & replace recursively except for certain files

With regard to this post, how would I exclude one or more files from the string replacement? Using the aforementioned post as an example, I would like to replace "apples" with "oranges" in all descendant files of a given directory except, say, ./fpd/font/symbol.php.
My idea was to use the -regex switch of find, but unfortunately it has no -v option like grep, so I can't negate the regex to exclude the files where the replacement must not occur.
I use this in my Git repository:
grep -ilr orange . | grep -v ".git" | grep -e "\.php$" | xargs sed -i 's/orange/apple/g'
It will:
Run find and replace only in files that actually have the word to be replaced;
Not process the .git folder;
Process only .php files.
Needless to say you can include as many grep layers you want to filter the list that is being passed to xargs.
Known issues:
At least in my Windows environment it fails to open files that have spaces in the path or name; I never figured that one out. If anyone has an idea of how to fix this, I would like to know.
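One possible fix for the whitespace problem (untested on Windows; requires GNU grep and xargs) is to make every stage NUL-delimited:

```shell
# -Z makes grep -l terminate each filename with a NUL byte; -z makes the
# middle grep treat its input as NUL-separated records; -0 makes xargs
# split on NUL; -r skips sed entirely when nothing matched.
grep -rilZ --include='*.php' orange . |
  grep -zv '/\.git/' |
  xargs -0r sed -i 's/orange/apple/g'
```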
I haven't tested this, but it should work:
find . -path ./fpd/font/symbol.php -prune -o -type f -exec sed -i 's/apple/orange/g' {} \;
(The -type f keeps sed from being invoked on directories.)
You can negate with ! (or -not) combined with -name:
$ find .
.
./a
./a/b.txt
./b
./b/a.txt
$ find . -name \*a\* -print
./a
./b/a.txt
$ find . ! -name \*a\* -print
.
./a/b.txt
./b
$ find . -not -name \*a\* -print
.
./a/b.txt
./b
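Putting the negation together with the question's actual goal (a sketch; ! -path rather than ! -name, so the exclusion hits only that one file and not every file named symbol.php):

```shell
# Replace apples with oranges in every .php file except the excluded one.
find . -type f -name '*.php' ! -path './fpd/font/symbol.php' \
  -exec sed -i 's/apples/oranges/g' {} +
```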

Unix find with wildcard directory structure

I am trying to do a find where I can specify wildcards in the directory structure then do a grep for www.domain.com in all the files within the data directory.
ie
find /a/b/c/*/WA/*/temp/*/*/data -type f -exec grep -l "www.domain.com" {} /dev/null \;
This works fine when there is only one directory level between c/ and /WA/.
How would I go about doing the same thing when there could be multiple levels between them?
So it could be at
/a/b/c/*/*/WA/*/temp/*/*/data
or
/a/b/c/*/*/*/WA/*/temp/*/*/data
There is no defined number of directories between /c/ and /WA/; there could be multiple levels and at each level there could be the /WA/*/temp/*/*/data.
Any ideas on how to do a find such as that?
How about using a for loop to find the WA directories, then go from there:
for DIR in $(find /a/b/c -type d -name WA -print); do
find $DIR/*/temp/*/*/data -type f \
-exec grep -l "www.domain.com" {} /dev/null \;
done
You may be able to get all that in a single command, but I think clarity is more important in the long run.
Assuming no spaces in the paths, then I'd think in terms of:
find /a/b/c -name data -type f |
grep -E '/WA/[^/]+/temp/[^/]+/[^/]+/data' |
xargs grep -l "www.domain.com" /dev/null
This uses find to find the files (rather than making the shell do most of the work), then uses the grep -E (equivalent to egrep) to select the names with the correct pattern in the path, and then uses xargs and grep (again) to find the target pattern.
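GNU find can also express the variable depth directly, since -regex matches against the whole path rather than just the filename (a sketch, assuming data is a directory whose files you want to search, and that the path regex in the previous answer reflects the intended layout):

```shell
# Match any depth between c/ and WA/, then the fixed WA/*/temp/*/*/data
# shape, then grep the files underneath each matching data directory.
find /a/b/c -regextype posix-extended -type f \
  -regex '.*/WA/[^/]+/temp/[^/]+/[^/]+/data/.*' \
  -exec grep -l 'www.domain.com' {} +
```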

remove files when name does NOT contain some words

I am using Linux and intend to remove some files using shell.
I have some files in my folder, some filenames contain the word "good", others don't.
For example:
ssgood.wmv
ssbad.wmv
goodboy.wmv
cuteboy.wmv
I want to remove the files that do NOT contain "good" in the name, so the remaining files are:
ssgood.wmv
goodboy.wmv
How can I do that using rm in the shell? I tried
rm -f *[!good].*
but it doesn't work.
Thanks a lot!
This command should do what you need:
ls -1 | grep -v 'good' | xargs rm -f
(Your glob fails because [!good] is a character class: it matches a single character that is not g, o, or d, not the absence of the word "good".) Note that piping the output of ls breaks on filenames that contain spaces or newlines.
With bash, you can get "negative" matching via the extglob shell option:
shopt -s extglob
rm !(*good*)
You can use find with the -not operator:
find . -not -iname "*good*" -a -not -name "." -exec rm {} \;
I've used -exec to call rm there; it turns out find does have a built-in delete action (see below).
But be very careful with that. Note that in the above I had to put in an -a -not -name "." clause, because otherwise it matched ., the current directory. So I'd test thoroughly with -print before putting in the -exec rm {} \; bit!
Update: Yup, I'd never used it before, but there is indeed a -delete action. So:
find . -not -iname "*good*" -a -not -name "." -delete
Again, be careful and double-check you're not matching more than you want to match first.
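A slightly safer variant of the same idea (a sketch): adding -type f removes the need for the -name "." exclusion, since . is a directory, and running the identical filters with -print first gives exactly the preview recommended above:

```shell
# Preview what would be deleted, then delete using the same filters.
find . -type f ! -iname '*good*' -print
find . -type f ! -iname '*good*' -delete
```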