How to exclude a directory in a recursive search using grep?

How can I do a recursive search with grep while excluding a particular directory?
Background: I have a large directory of log files that I would like to leave out of the search. The easiest way would be to move the log folder, but I cannot do that because the project mandates its location.
Any idea how to do it?

Are you looking for this?
From the grep man page:
--exclude-dir=DIR
Exclude directories matching the pattern DIR from recursive searches.
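A minimal sketch of the option in use, assuming GNU grep (or another grep that supports --exclude-dir). The tree and the names src, logs and needle are made up for illustration:

```shell
# Build a throwaway tree (hypothetical names): src/ is wanted, logs/ is not.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/logs"
echo 'needle in source' > "$demo/src/app.txt"
echo 'needle in log'    > "$demo/logs/run.log"

# -r recurses, -l prints only the names of matching files, and
# --exclude-dir skips every directory whose base name matches "logs".
matches=$(cd "$demo" && grep -rl --exclude-dir=logs needle .)
echo "$matches"

rm -rf "$demo"
```

The argument is a glob matched against directory names, so something like --exclude-dir='log*' should also work.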

As an alternative, if you can use find in your search, it may also be useful:
find [directory] -name "*.log" -prune -o -type f -print|grep ...
The [directory] can be the current directory if you want (just a . will do).
The next part, -name "*.log" -prune, belongs together. It matches names with the pattern *.log and strips them OUT of your results; if a match is a directory, find does not descend into it.
Next is -o (for "or").
Then comes -type f -print, which says "print (to stdout) anything that is a regular file."
The results include every file (no directories are returned) found in [directory] except those whose names end in .log. Note that piping this list into grep filters the file names; to search the file contents instead, pass the names to grep as arguments, for example with xargs grep or find's -exec.
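If the goal is to skip a whole directory and search the contents of everything else, the same -prune idea can be sketched like this. The directory name logs and the search string needle are assumptions for the demo, not taken from the question:

```shell
# Throwaway tree with one wanted file and one file inside logs/.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/logs"
echo 'needle in source' > "$demo/src/app.txt"
echo 'needle in log'    > "$demo/logs/run.log"

# Prune the directory named "logs"; for every other regular file, run grep
# on the file contents (-l prints only the names of matching files).
found=$(cd "$demo" && find . -type d -name logs -prune -o -type f -exec grep -l needle {} +)
echo "$found"

rm -rf "$demo"
```

Using -exec ... {} + hands many file names to a single grep call, which avoids starting one grep process per file.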

Related

How to ignore files with .<numeric>.ext in git?

I have a list of files in my project:
For example:
1. src/index.1.js
2. src/screens/index.1.js
3. src/screens/index.2.js
I want to ignore all the files that have a number in their name.
I have tried using **/*.1.* , **/*.2.*. Is there a way to ignore all the files with a numeric value?
You can use a range. For your example:
**/*.[0-9].js
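This matches a js file in any directory whose name ends with .(single digit).js
Git uses glob patterns to match ignored files. Use the following to ignore all of the files mentioned above, including those with multi-digit numbers. Because * in a glob matches any characters, not just digits, it also ignores names such as index.1foo.js.
**/*.[0-9]*.js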
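To get a feel for what the two patterns accept, here is a rough sketch that tests base names against them with the shell's own case matching. This is an approximation: it assumes gitignore globs behave like shell patterns for plain file names, and the helper name matches is made up for the demo.

```shell
# Print "yes" if NAME ($2) matches the shell pattern PATTERN ($1), else "no".
matches() {
    case $2 in
        $1) echo yes ;;
        *)  echo no ;;
    esac
}

one_digit='*.[0-9].js'    # exactly one digit between the dots
any_digit='*.[0-9]*.js'   # a digit followed by anything

single=$(matches "$one_digit" index.1.js)     # yes
multi=$(matches "$one_digit" index.12.js)     # no, two digits
multi_any=$(matches "$any_digit" index.12.js) # yes
extra=$(matches "$any_digit" index.1foo.js)   # yes, * is not limited to digits
plain=$(matches "$any_digit" index.js)        # no, there is no digit
echo "$single $multi $multi_any $extra $plain"
```

Inside a repository, git check-ignore -v <path> reports which ignore rule, if any, applies to a given path.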
Why don't you run the following find command (GNU find), adapting the \.js part if you want to cover more than just the .js files:
find . -type f -regextype sed -regex '.*\/.*\.[0-9]\+\.js'
./src/screens/index.2.js
./src/screens/index.123.js
./src/index.1.js
Once you have found all the files you are interested in, change your find command into:
find . -type f -regextype sed -regex '.*\/.*\.[0-9]\+\.js' -exec git checkout {} \;
to check out those files.

How to use grep to find in a directory by a regex?

I tried
grep -R '.*invalidTemplateName.*' -regex './online_admin/.*/UTF-8/.*'
to find all occurrences of possible matches of the '.*invalidTemplateName.*' regex within a directory regex pattern './online_admin/.*/UTF-8/.*', but it doesn't work. I got the message:
grep: ./online_admin/.*/UTF-8/.*: No such file or directory
If I use
grep -R '.*invalidTemplateName.*' .
it searches every subdirectory of the current directory, which is overwhelming. How can I specify a directory pattern in grep? Is it possible?
Find might be a better choice here:
find ./online_admin/*/UTF-8/* -type f -exec grep -H "invalidTemplateName" {} \;
find will locate all files in the locations you want, including subdirectories of UTF-8, and then execute grep on each file. The -H argument ensures the filename is printed along with the match. If you want only the filename, use the -l switch instead (-L does the opposite and lists files without a match).
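The key point is that the directory part is a shell glob, not a regex: the shell expands it into real paths before grep or find ever runs. A small sketch on a throwaway tree (the subdirectory names a, b, sub and other are invented for the demo):

```shell
# Two candidate files; only one lives under a UTF-8 directory.
demo=$(mktemp -d)
mkdir -p "$demo/online_admin/a/UTF-8/sub" "$demo/online_admin/b/other"
echo 'invalidTemplateName here' > "$demo/online_admin/a/UTF-8/sub/x.txt"
echo 'invalidTemplateName here' > "$demo/online_admin/b/other/y.txt"

# The shell expands online_admin/*/UTF-8 to the matching directories, and
# grep -r searches only below them; -l prints just the file names.
with_grep=$(cd "$demo" && grep -rl invalidTemplateName online_admin/*/UTF-8)

# The same search driven by find, handing the files to grep in batches.
with_find=$(cd "$demo" && find online_admin/*/UTF-8 -type f -exec grep -l invalidTemplateName {} +)

echo "$with_grep"
rm -rf "$demo"
```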
With find you could do something like this:
find /abs/path/to/directory -maxdepth 1 -name '*invalidTemplateName*'
Using the -name argument you can filter directly by file name (this matches names, not file contents). The filter string is a shell wildcard pattern rather than a regex, so use * instead of .* here.
Using the -maxdepth argument you can specify how many levels of recursion to descend. 1 means to look only at the entries directly in /abs/path/to/directory; 2 means to also look in the first level of subdirectories of /abs/path/to/directory.
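A quick way to see the effect of -maxdepth is to compare two depths on a small tree. This sketch assumes a find that supports -maxdepth (GNU and BSD find do) and writes the name filter as a shell glob; the file names are invented for the demo:

```shell
# One matching file at the top level and one a level further down.
demo=$(mktemp -d)
mkdir -p "$demo/level1"
touch "$demo/top_invalidTemplateName.txt" "$demo/level1/deep_invalidTemplateName.txt"

# -maxdepth 1: only entries directly inside the starting directory.
depth1=$(cd "$demo" && find . -maxdepth 1 -type f -name '*invalidTemplateName*' | LC_ALL=C sort)

# -maxdepth 2: also entries inside its immediate subdirectories.
depth2=$(cd "$demo" && find . -maxdepth 2 -type f -name '*invalidTemplateName*' | LC_ALL=C sort)

echo "$depth1"
echo "$depth2"
rm -rf "$demo"
```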

Find file with at least one number in /usr/include

Using the find command, I want to see the files in /usr/include whose name contains at least one number.
I tried this command :
find /usr/include -type f -regex '.*[0-9].*$'
But the number is not always in the name of the file but sometimes in the path. For example /usr/include/linux/netfilter_ipv4/ipt_ah.h is found.
After that, I tried this command :
find /usr/include -type f -regex '/[^\/]*[0-9][^\/]*$'
But it returns nothing.
How can I resolve this problem?
If you use the -name test instead of the -regex test, it will match only the filename, ignoring the preceding directories (see the man page). Note that -name uses a shell pattern rather than a regex pattern, so the syntax is slightly different. You can use this command to find files which have numbers in the filename:
find /usr/include -type f -name '*[0-9]*'
With -regex itself (the pattern has to match the whole path, which is why the second attempt above returns nothing):
find /usr/include/ -type f -regex ".*/[^/]*[0-9][^/]*"
Here, we look for at least one digit after the last / in the path.
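The difference between matching the path and matching the file name can be checked on a small stand-in for /usr/include. The sketch below assumes a find with -regex support (GNU or BSD); the file if_x25.h is a made-up example of a name that contains a digit:

```shell
demo=$(mktemp -d)
mkdir -p "$demo/netfilter_ipv4" "$demo/linux"
touch "$demo/netfilter_ipv4/ipt_ah.h"   # digit only in the directory name
touch "$demo/linux/if_x25.h"            # digit in the file name itself

# A digit anywhere in the path: both files are reported.
by_path=$(cd "$demo" && find . -type f -regex '.*[0-9].*' | LC_ALL=C sort)

# -name looks at the base name only, using a shell pattern.
by_name=$(cd "$demo" && find . -type f -name '*[0-9]*')

# -regex anchored so that the digit must come after the last slash.
by_regex=$(cd "$demo" && find . -type f -regex '.*/[^/]*[0-9][^/]*')

echo "$by_name"
rm -rf "$demo"
```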

Unix - Using find to List all .html files. (Do not use shell wildcards or the ls command)

I've tried 'find -name .html$', 'find -name .html\>'.
None worked.
I'd like to know why these two are wrong and what's the right one to use with no wildcards?
What you needed was
find -name '*.html'
Or for regex:
find -regex '.*/.*\.html'
To ignore case, use -iname or -iregex:
find -iname '*.html'
find -iregex '.*/.*\.html'
Manual for -name:
-name pattern
Base of file name (the path with the leading directories
removed) matches shell pattern pattern. The metacharacters
(`*', `?', and `[]') match a `.' at the start of the base name
(this is a change in findutils-4.2.2; see section STANDARDS CON‐
FORMANCE below). To ignore a directory and the files under it,
use -prune; see an example in the description of -path. Braces
are not recognised as being special, despite the fact that some
shells including Bash imbue braces with a special meaning in
shell patterns. The filename matching is performed with the use
of the fnmatch(3) library function. Don't forget to enclose
the pattern in quotes in order to protect it from expansion by
the shell.
find . -name '*.html'
You have to single quote the wildcard to keep the shell from globbing it when passing it to find.
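What goes wrong without the quotes can be shown on a small tree. In this sketch (file names invented for the demo) the current directory contains exactly one .html file, so the shell silently rewrites the unquoted pattern to that file name before find sees it:

```shell
demo=$(mktemp -d)
mkdir -p "$demo/sub"
touch "$demo/index.html" "$demo/sub/page.html"

# Unquoted: the shell expands *.html to index.html, so find only looks
# for files named exactly index.html.
unquoted=$(cd "$demo" && find . -name *.html | LC_ALL=C sort)

# Quoted: find receives the pattern itself and applies it at every level.
quoted=$(cd "$demo" && find . -name '*.html' | LC_ALL=C sort)

echo "$unquoted"
echo "$quoted"
rm -rf "$demo"
```

With several .html files in the current directory the unquoted form expands to several words, and find typically stops with an error instead.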
You want
find . -name "*.html"
GNU find uses Emacs-style regexes for -regex by default, not the POSIX syntax you are probably used to.
You are missing a couple of things here. First of all, the path. If you are searching in the local path, use . For example, find . will list every file and directory recursively in the current directory. Second, * is a wildcard, and it has to be quoted so that the shell passes it to find unexpanded. So to find all the .html files under the current directory, try
find . -name '*.html'

List all files not starting with a number

I want to examine all the key files present in my /proc. But /proc has innumerable directories corresponding to the running processes, and I don't want these directories to be listed. All of these directories' names contain only numbers. As I am poor at regular expressions, can anyone tell me what regex I need to pass to ls so that it does NOT list files/directories which have numbers in their name?
UPDATE: Thanks for all the replies! But I would love to have an ls-alone solution instead of an ls+grep solution. The ls-alone solutions offered so far don't seem to be working!
You don't need grep, just ls:
ls -ad /proc/[^0-9]*
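If you want to search the whole subdirectory structure, use find (the regex is matched against the whole path, so this lists files whose path contains no digit at all):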
find /proc/ -type f -regex "[^0-9]*" -print
All files and directories in /proc whose names do not start with a digit (in other words, excluding process directories):
ls -d /proc/[^0-9]*
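The same idea can be tried safely on a stand-in directory. This sketch uses [!0-9], the POSIX spelling of the negated bracket expression; [^0-9] behaves the same way in bash. The entry names are invented to mimic /proc:

```shell
# Fake /proc: two "process" directories plus a few ordinary entries.
demo=$(mktemp -d)
mkdir "$demo/1" "$demo/4242" "$demo/sys"
touch "$demo/cpuinfo" "$demo/meminfo"

# The shell expands the pattern; ls -d then lists the matching entries
# themselves instead of descending into directories.
listing=$(cd "$demo" && ls -d [!0-9]*)
echo "$listing"

rm -rf "$demo"
```

Note that the filtering is done by the shell's pattern expansion, not by ls, which is why no regex is involved.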
All files recursively under /proc which do not start with a number:
find /proc -regex '.*/[0-9].*' -prune -o -print
But this will also exclude numeric files in subdirectories (for example /proc/foo/bar/123). If you want to exclude only the top-level files with a number:
find /proc -regex '/proc/[0-9].*' -prune -o -print
Hold on again! Doesn't this mean that any regular files created by touch /proc/123 or the like will be excluded? Theoretically yes, but I don't think you can do that. Try creating a file for a PID which does not exist:
$ sudo touch /proc/123
touch: cannot touch `/proc/123': No such file or directory
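The two -prune variants can be compared on a throwaway tree that imitates /proc. This is a sketch under assumptions: the entry names are made up, the starting point is . instead of /proc (so the top-level regex is written with \./), and find is expected to support -regex.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/123" "$demo/net/7"
touch "$demo/123/status" "$demo/cpuinfo" "$demo/net/7/stat" "$demo/net/dev"

# Prune every entry whose name starts with a digit, at any depth.
any_depth=$(cd "$demo" && find . -regex '.*/[0-9].*' -prune -o -print | LC_ALL=C sort)

# Prune only top-level entries whose name starts with a digit.
top_only=$(cd "$demo" && find . -regex '\./[0-9][^/]*' -prune -o -print | LC_ALL=C sort)

echo "$any_depth"
echo "$top_only"
rm -rf "$demo"
```

In the first listing net/7 disappears together with 123; in the second only 123 is dropped.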
Use grep with -v, which tells it to print all lines not matching the pattern. Either of these drops every name that contains a digit:
ls /proc | grep -v '[0-9]'
ls /proc | grep -v -E '[0-9]+'
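Whether a name is dropped for containing a digit or only for starting with one depends on anchoring the pattern. A small sketch on a fixed list of names (invented to resemble /proc entries) shows the difference:

```shell
# Stand-in for the output of ls /proc.
names=$(printf '%s\n' 1 4242 cpuinfo ipv4 meminfo)

# Drop every name that contains a digit anywhere.
no_digits=$(printf '%s\n' "$names" | grep -v '[0-9]')

# Drop only names that start with a digit; ^ anchors the match.
no_leading=$(printf '%s\n' "$names" | grep -v '^[0-9]')

echo "$no_digits"
echo "$no_leading"
```

The anchored form keeps an entry like ipv4, which the unanchored form removes.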
The following regex (Perl-style syntax, where \D means "not a digit") matches strings that contain no digits at all:
^[\D]+?$
Hope it helps !
For the sake of completeness, you may apply Mithandir's answer with find:
find . -name "[^0-9]*" -type f