Searching files in a directory using regex and grep

I have to find every file in the /etc directory whose name starts with a, b, or c.
What I have is: grep -l '/^[a-cA-C].*/g' /etc/* but I keep getting every file in the /etc directory.
I use grep -l to list every matching file (I guess using find or grep doesn't matter).
'/^[a-cA-C].*/g' is meant to match everything that starts with a, b, or c, uppercase or lowercase, followed by zero or more characters, with a global flag so it doesn't stop after the first match.
I know the regex is right because I've checked it with an online regex checker.
EDIT: found the solution --> ls /etc/[a-cA-C]*

Here is my example:
find ./ -type f -exec basename {} \; | grep -Ei '^(a|b|c)'
It searches recursively and finds all files, but prints only the basename of each file. Is that OK for you?

You can try this one:
find | grep '^\./[abc]'
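A small sketch of why the original attempt listed every file: grep tests its pattern against the lines inside each file, not against the file names, so name filtering is better left to the shell glob or to find. The throwaway directory and file names below are made up for illustration, and -maxdepth assumes GNU or BSD find.

```shell
#!/bin/sh
# Throwaway directory with hypothetical file names.
dir=$(mktemp -d)
touch "$dir/alpha.conf" "$dir/Beta.conf" "$dir/crontab" "$dir/delta.conf"

# Shell glob: the shell expands the pattern against file names.
by_glob=$(cd "$dir" && ls -d [a-cA-C]* | LC_ALL=C sort)

# find: -name also takes a glob (not a regex); -maxdepth 1 avoids recursion.
by_find=$(find "$dir" -maxdepth 1 -type f -name '[a-cA-C]*' -exec basename {} \; | LC_ALL=C sort)

printf '%s\n' "$by_glob"
rm -rf "$dir"
```

Both variables should end up holding the same three names, with delta.conf left out.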

How to use grep to find in a directory by a regex?

I tried
grep -R '.*invalidTemplateName.*' -regex './online_admin/.*/UTF-8/.*'
to find all occurrences of possible matches of the '.*invalidTemplateName.*' regex within directories matching the pattern './online_admin/.*/UTF-8/.*', but it doesn't work. I got the message:
grep: ./online_admin/.*/UTF-8/.*: No such file or directory
If I use
grep -R '.*invalidTemplateName.*' .
it searches every subdirectory of the current directory, which is overwhelming. How can I specify a directory pattern in grep? Is it possible?
Find might be a better choice here:
find ./online_admin/*/UTF-8/* -type f -exec grep -H "invalidTemplateName" {} \;
find locates all files in the locations you want, including subdirectories of UTF-8, and then executes grep on each file. The -H option ensures the filename is printed along with the match. If you want only the filename, use the -l option instead.
With find you could do something like this:
find /abs/path/to/directory -maxdepth 1 -name '*invalidTemplateName*'
The -name argument filters by file name. It takes a shell wildcard pattern rather than a regex, so use * instead of .* here. Note that this matches file names, not file contents.
The -maxdepth argument sets how many levels deep find descends. 1 means look only in /abs/path/to/directory; 2 means look in /abs/path/to/directory and in its first level of subdirectories as well.
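To restrict a content search to directories that match a pattern, one option is to let find pick the files by path and hand them to grep. This is only a sketch: the directory layout below is invented to mirror the question, and -path assumes a reasonably recent find.

```shell
#!/bin/sh
# Hypothetical tree mirroring the layout in the question.
root=$(mktemp -d)
mkdir -p "$root/online_admin/en/UTF-8" "$root/online_admin/de/ISO-8859-1"
echo 'error: invalidTemplateName' > "$root/online_admin/en/UTF-8/messages.txt"
echo 'error: invalidTemplateName' > "$root/online_admin/de/ISO-8859-1/messages.txt"

# -path takes a glob matched against the whole path (* also matches "/").
# grep -l prints only the names of files that contain a match.
matches=$(cd "$root" && find ./online_admin -type f -path '*/UTF-8/*' \
    -exec grep -l 'invalidTemplateName' {} +)

printf '%s\n' "$matches"
rm -rf "$root"
```

Only the file under a UTF-8 directory is reported, even though the other file contains the same text.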

How to grep for a file extension

I am currently trying to make a script that would grep input to see if something is of a certain file type (zip for instance), although the text before the file type could be anything, so for instance
something.zip
this.zip
that.zip
would all fall under the category. I am trying to grep for these using a wildcard, and so far I have tried this
grep ".*.zip"
But whenever I do that, it will find the .zip files just fine, but it will still display output if there are additional characters after the .zip so for instance .zippppppp or .zipdsjdskjc would still be picked up by grep. Having said that, what should I do to prevent grep from displaying matches that have additional characters after the .zip?
Test for the end of the line with $ and escape the second . with a backslash so it only matches a period and not any character.
grep ".*\.zip$"
However, ls *.zip is a more natural way to list all the .zip files in the current directory, and find . -name "*.zip" lists all .zip files in the current directory and its subdirectories.
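The difference between the loose and the anchored pattern can be seen on a few sample lines. The names below are made up for illustration:

```shell
#!/bin/sh
# Sample input; the names are hypothetical.
names='something.zip
this.zip
that.zipppp
notazip
archive.zip.bak'

# Unescaped dot and no anchor: any character followed by "zip", anywhere in the line.
loose=$(printf '%s\n' "$names" | grep '.*.zip')

# Escaped dot plus $: the line must end in a literal ".zip".
strict=$(printf '%s\n' "$names" | grep '\.zip$')

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```

The loose pattern keeps all five lines (even notazip, because the dot matches the "a"), while the strict one keeps only something.zip and this.zip.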
On UNIX, try:
find . -type f -name \*.zip
You can also use grep to find all files with a specific extension:
find .|grep -e "\.gz$"
The . means the current folder.
If you want to specify a folder other than the current folder, just replace the . with the path of the folder.
Here is an example: Let's find all files that end with .gz and are in the folder /var/log
find /var/log/ |grep -e "\.gz$"
The output is something similar to the following:
$ find /var/log/ | grep -e "\.gz$"
/var/log//mail.log.1.gz
/var/log//mail.log.0.gz
/var/log//system.log.3.gz
/var/log//system.log.7.gz
/var/log//system.log.6.gz
/var/log//system.log.2.gz
/var/log//system.log.5.gz
/var/log//system.log.1.gz
/var/log//system.log.0.gz
/var/log//system.log.4.gz
The $ anchors the match to the end of the line, so only paths ending in .gz are printed.
I use this to get a listing of the file types inside a folder.
find . -type f | egrep -i -E -o "\.{1}\w*$" | sort -su
Outputs for example:
.DS_Store
.MP3
.aif
.aiff
.asd
.doc
.flac
.jpg
.m4a
.m4p
.m4r
.mp3
.pdf
.png
.txt
.wav
.wma
.zip
BONUS: with
find . -type f | egrep -i -E -o "\.{1}\w*$" | sort | uniq -c
You'll get the file count:
106 .DS_Store
35 .MP3
89 .aif
5 .aiff
525 .asd
1 .doc
60 .flac
48 .jpg
149 .m4a
11 .m4p
1 .m4r
12844 .mp3
1 .pdf
5 .png
9 .txt
108 .wav
44 .wma
2 .zip
You need to do a couple of things. It should look like this:
grep '.*\.zip$'
You need to escape the second dot, so it will just match a dot, and not any character. Using single quotes makes the escaping a bit easier.
You need the dollar sign at the end of the line to indicate that you want the "zip" to occur at the end of the line.
grep -r pattern --include="*.txt" /path/to/dir/
Try: grep -o -E "(\\.([A-Za-z])+)+"
I used this to get multi-dotted/multiple extensions. So if the input was hello.tar.gz, then it would output .tar.gz.
For single dotted, use grep -o -E "\\.([A-Za-z])+$".
Tested on Cygwin/MingW+MSYS.
One more fix/addon of the above example:
# multi-dotted/multiple extensions
grep -oEi "(\\.([a-z0-9])+)+" file.txt
# single dotted
grep -oEi "\\.([a-z0-9])+$" file.txt
This extracts file extensions such as '.mp3'.
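One thing to watch in extension patterns is a range like [A-z]: in ASCII it also covers the punctuation between Z and a ([ \ ] ^ _ and the backtick). A quick check, with LC_ALL=C set so the range is read in plain ASCII order; the input lines are made up:

```shell
#!/bin/sh
# Hypothetical input: the second "extension" contains an underscore.
input='notes.tar.gz
backup.t_r'

# [A-z] spans the ASCII characters between Z and a, underscore included.
wide=$(printf '%s\n' "$input" | LC_ALL=C grep -oE '\.[A-z]+$')

# [A-Za-z] matches letters only.
exact=$(printf '%s\n' "$input" | LC_ALL=C grep -oE '\.[A-Za-z]+$')

printf 'wide:\n%s\n\nexact:\n%s\n' "$wide" "$exact"
```

The wide range reports .gz and .t_r, the letters-only range reports just .gz.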
Just reviewing some of the other answers. The .* isn't necessary, and if you're looking for a certain file extension, it's best to include -i so that it's case-insensitive, in case the file is HELLO.ZIP, for example. The quotes are needed, though: without them the shell strips the backslash and grep sees .zip$, where the dot matches any character.
grep -i '\.zip$'
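Whether the quotes matter can be checked by printing what the shell actually hands to grep. A small sketch with made-up input lines:

```shell
#!/bin/sh
# What grep receives: unquoted, the shell removes the backslash first.
unquoted=$(printf '%s\n' \.zip$)
quoted=$(printf '%s\n' '\.zip$')

# Hypothetical input: only the first line has a real .zip extension.
input='HELLO.ZIP
gzip
unzip'

with_unquoted=$(printf '%s\n' "$input" | grep -i \.zip$)
with_quoted=$(printf '%s\n' "$input" | grep -i '\.zip$')

printf 'unquoted pattern: %s\nquoted pattern:   %s\n' "$unquoted" "$quoted"
```

With the unquoted pattern, gzip and unzip also match because the dot is no longer escaped; the quoted pattern keeps only HELLO.ZIP.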
If you just want to search the current folder, why not use this simple command without grep?
ls *.zip
Simply do:
grep ".*\.zip$"
The "$" indicates the end of the line, and the backslash makes the dot match a literal period.

Find a string in multiple files using grep

I have a folder with sub-folders inside, all have many types of files. I want to search for a word inside the .css-files. I am using Windows 7 and I have grep.
How can I use grep to:
Find pattern and print it
Give file name (and path) if pattern found
Actually you don't need find. Just use:
grep -R --include='*.css' -H pattern .
This recurses into subdirectories and searches only the *.css files, while -H shows the filename. Quoting the glob keeps the shell from expanding it before grep sees it.
find folder/ -name "*.css" |xargs grep "your-pattern"
You will need to install Cygwin to do this.
If the file names follow a pattern, you can use this.
Say I'm looking for the pattern "cardlayout" in files named chap1.lst, chap2.lst, and so on.
Then the command is:
grep -e 'cardlayout' $(find . -name "chap*.lst")
Hope this helps.
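The two approaches can be compared on a small tree. This sketch assumes GNU grep (for --include) and a find/xargs pair that supports -print0 and -0; the file names are invented, and one of them contains a space on purpose:

```shell
#!/bin/sh
# Hypothetical project tree; one directory name contains a space.
root=$(mktemp -d)
mkdir -p "$root/site/theme one"
printf 'body { color: red; }\n' > "$root/site/main.css"
printf 'h1 { color: blue; }\n'  > "$root/site/theme one/extra.css"
printf 'color: not css\n'       > "$root/site/notes.txt"

# Option 1: grep recurses on its own and filters by file name.
with_include=$(cd "$root" && grep -R --include='*.css' -H 'color' . | LC_ALL=C sort)

# Option 2: find selects the files; -print0 / -0 keep names with spaces intact.
with_find=$(cd "$root" && find . -name '*.css' -print0 | xargs -0 grep -H 'color' | LC_ALL=C sort)

printf '%s\n' "$with_include"
rm -rf "$root"
```

Both variants report the two .css files with their paths and skip notes.txt. A plain find | xargs without -print0 / -0 would split "theme one/extra.css" into two arguments.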

List all files not starting with a number

I want to examine all the key files present in my /proc. But /proc has innumerable directories corresponding to the running processes. I don't want these directories to be listed. All these directories' names contain only numbers. As I am poor at regular expressions, can anyone tell me what regex I need to pass to ls so that it does NOT list files/directories which have numbers in their name?
UPDATE: Thanks for all the replies! But I would love to have an ls-only solution instead of an ls+grep solution. The ls-only solutions offered so far don't seem to be working!
You don't need grep, just ls:
ls -ad /proc/[^0-9]*
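If you want to search the whole subdirectory structure, use find (the regex is matched against the whole path, so this lists files whose path contains no digit at all):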
find /proc/ -type f -regex "[^0-9]*" -print
All files and directories in /proc whose names do not start with a digit (in other words, excluding process directories):
ls -d /proc/[^0-9]*
All files recursively under /proc whose names do not start with a digit:
find /proc -regex '.*/[0-9].*' -prune -o -print
But this will also exclude numeric files in subdirectories (for example /proc/foo/bar/123). If you want to exclude only the top-level files with a number:
find /proc -regex '/proc/[0-9].*' -prune -o -print
Hold on again! Doesn't this mean that any regular files created by touch /proc/123 or the like will be excluded? Theoretically yes, but I don't think you can do that. Try creating a file for a PID which does not exist:
$ sudo touch /proc/123
touch: cannot touch `/proc/123': No such file or directory
Use grep with -v, which prints only the lines that do not match the pattern. Both commands below drop every name that contains a digit:
ls /proc | grep -v '[0-9]'
ls /proc | grep -v -E '[0-9]+'
The following regex (PCRE syntax, for example with grep -P) matches names made up entirely of non-digit characters:
^[\D]+?$
Hope it helps!
For the sake of completeness, you can apply Mithandir's answer with find:
find . -name "[^0-9]*" -type f
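The variants differ in what they drop, which is easy to see on a stand-in directory. The sketch below does not touch the real /proc; the entry names are made up, with ext4 included as a name that contains a digit without starting with one:

```shell
#!/bin/sh
# Stand-in for /proc with hypothetical entries.
dir=$(mktemp -d)
mkdir "$dir/1" "$dir/4242"
touch "$dir/cpuinfo" "$dir/meminfo" "$dir/ext4"

# Glob: [!0-9] is the POSIX spelling of the negated class; bash also accepts [^0-9].
by_glob=$(cd "$dir" && ls -d [!0-9]* | LC_ALL=C sort)

# grep on names, anchored at both ends: only all-digit names are dropped.
by_grep=$(ls "$dir" | grep -v -E '^[0-9]+$' | LC_ALL=C sort)

# Unanchored pattern: names that merely contain a digit are dropped too.
unanchored=$(ls "$dir" | grep -v -E '[0-9]+' | LC_ALL=C sort)

printf 'glob: %s\n' $by_glob
printf 'unanchored: %s\n' $unanchored
rm -rf "$dir"
```

The glob and the anchored grep keep cpuinfo, ext4, and meminfo; the unanchored grep also loses ext4.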

How to exclude a directory in a recursive search using grep?

How do I do a recursive search using grep while excluding a particular directory?
Background: I have a large directory of log files that I would like to exclude from the search. The easiest way would be to move the log folder. Unfortunately I cannot do that, as the project mandates the location.
Any idea how to do it?
Are you looking for this?
From the grep man page:
--exclude-dir=DIR
Exclude directories matching the pattern DIR from recursive searches.
As an alternative, if you can use find in your search, it may also be useful:
find [directory] -name "*.log" -prune -o -type f -print|grep ...
The [directory] can actually be the current directory if you want (just a . will do).
The next part, -name "*.log" -prune, belongs together. It matches names with the pattern *.log and leaves them OUT of your results.
Next is -o (for "or")
Then, -type f -print which says "print (to stdout) any type that is a file."
Those results should include every file (no directories are returned) found in [directory] except those that end in .log. Then you can grep the results as you need.
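Both approaches can be tried on a throwaway tree. This is a sketch that assumes GNU grep for --exclude-dir; the directory and file names are invented, and the find variant prunes a directory named log rather than files ending in .log:

```shell
#!/bin/sh
# Hypothetical project with a log directory that should be skipped.
root=$(mktemp -d)
mkdir -p "$root/src" "$root/log"
echo 'TODO: fix parser' > "$root/src/main.c"
echo 'TODO: rotate'     > "$root/log/app.log"

# GNU grep: skip any directory named "log" while recursing.
with_exclude=$(cd "$root" && grep -R -l --exclude-dir=log 'TODO' .)

# find: prune the directory itself, then grep the remaining regular files.
with_prune=$(cd "$root" && find . -type d -name log -prune -o -type f -exec grep -l 'TODO' {} +)

printf '%s\n' "$with_exclude"
rm -rf "$root"
```

Each variant reports only ./src/main.c, although log/app.log contains the same text.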