Having trouble with grep and regex

I have a text file that stores combinations of four numbers in the following format:
Num1,Num2,Num3,Num4
Num5,Num6,Num7,Num8
.............
I have a whole bunch of such files, and what I need is to grep for all filenames whose contents contain the pattern described above.
I constructed my grep as follows:
grep -l "{d+},{d+},{d+},{d+}" /some/path/to/file/name
The grep terminates without returning anything.
Can somebody point out what I might be doing wrong with my grep statement?
Thanks

This should do what you want:
egrep -l '[[:digit:]]+,[[:digit:]]+,[[:digit:]]+,[[:digit:]]+' /some/path/to/file/name

One way is using a perl regexp:
grep -Pl "\d+,\d+,\d+,\d+" /some/path/to/file/name
In your pattern, d is a literal letter. It would have to be escaped as \d to mean "digit", but grep's default (basic) regular expressions do not accept \d, which is why -P is used here.
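To see the difference in practice, here is a minimal sketch. The file names and contents are made up for illustration, and it assumes a grep that supports -E and a mktemp that supports -d. It uses [0-9] rather than \d so it does not depend on -P being available.

```shell
# Hypothetical sample files in a throwaway directory.
dir=$(mktemp -d)
printf '1,2,3,4\n55,66,77,88\n' > "$dir/numbers.txt"
printf 'just,some,words\n' > "$dir/other.txt"

# -E enables extended regular expressions, so + means "one or more".
# [0-9] is a plain bracket expression, so no \d support is needed.
matches=$(grep -lE '[0-9]+,[0-9]+,[0-9]+,[0-9]+' "$dir"/*.txt)
printf '%s\n' "$matches"

rm -rf "$dir"
```

Only the file holding the four comma-separated numbers should be listed.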

Related

How to grep file to find lines like <version>1.1.9-beta</version>?

I'm looking for a suggestion on how to use cat file | grep REGEX to get the lines containing <version>anything</version>.
grep -F '<version>1.1.9-beta</version>' file
-F matches your pattern as literal text.
You don't need that useless cat; grep can read the file directly.
If you really mean anything, try grep '<version>.*</version>' file or grep -P '<version>.*?</version>' file. However, searching XML with regex is a bad idea.
Use the -E option to match an extended regular expression:
grep -E "<version>.*</version>" file
Refer to these rules for the regular expression: https://www.gnu.org/savannah-checkouts/gnu/grep/manual/grep.html#Regular-Expressions
For example, to match a typical version format (3.14, 13.14, or 0.1458), you can type:
grep -E "<version>[0-9]+\.[0-9]+</version>" file
You can do:
grep '<version>[^<]*</version>' file.xml
[^<]* will match zero or more characters up to the next <.
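The difference between .* and [^<]* only shows when a line holds more than one tag. A small sketch, assuming a grep that supports -o (GNU and BSD grep do); the input line is invented for illustration:

```shell
# Hypothetical one-line input with two version tags.
line='<version>1.1.9-beta</version><version>2.0</version>'

# .* is greedy: it runs to the last </version> on the line, giving one match.
greedy=$(printf '%s\n' "$line" | grep -o '<version>.*</version>')

# [^<]* cannot cross a '<', so each tag is matched on its own.
bounded=$(printf '%s\n' "$line" | grep -o '<version>[^<]*</version>')

printf 'greedy:\n%s\n\nbounded:\n%s\n' "$greedy" "$bounded"
```

The greedy pattern returns the whole line as a single match, while the bounded one returns the two tags separately.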

Basic grep regex

I'm trying to use grep to find all files whose contents contain ".ple". I tried:
grep -ilr /.\.ple/ *
But it did not work.
Would someone be able to tell me what I did wrong?
Thanks.
Many programming and scripting languages let you write a regex between slashes (/.../), but grep does not. You should wrap your regular expression in quotes instead. I think that is the problem with your grep line. The correct one:
grep -ilr '.\.ple' *
With your command, grep -ilr /.\.ple/ *, grep will look for lines like this:
foo/a.ple/bar
That is, the slashes are treated as literal characters.
What do you mean by it did not work?
Try this:
grep -rlE '\.ple$'
The above lists files that contain a line ending in .ple.
Edit: To search for all files whose contents contain ".ple", try this:
grep -rl '\.ple'
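To check what grep actually receives, you can let the shell print the argument first. This is only a sketch with invented sample lines; it shows that the unquoted backslash is removed by the shell before grep ever sees the pattern.

```shell
# What the shell hands to grep when the pattern is unquoted:
unquoted=$(printf '%s\n' /.\.ple/)
# What it hands over when the pattern is single-quoted:
quoted=$(printf '%s\n' '\.ple')
printf 'unquoted: %s\nquoted:   %s\n' "$unquoted" "$quoted"

# Hypothetical sample lines to search.
sample=$(printf 'foo/a.ple/bar\nexam.ple\nsimple\n')

# Unquoted: the slashes are literal, so only the path-like line matches.
with_slashes=$(printf '%s\n' "$sample" | grep /.\.ple/)

# Quoted: the backslash survives, so the dot is literal and "simple" is skipped.
escaped_dot=$(printf '%s\n' "$sample" | grep '\.ple')

printf 'with slashes:\n%s\n\nescaped dot:\n%s\n' "$with_slashes" "$escaped_dot"
```

Without quotes the pattern arrives as /..ple/, so both the slashes and the lost backslash change what is matched.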

How to use grep with two patterns?

I have followed many links and read the reference documentation, but I am still getting errors.
I want to list all the words starting or ending with 'sx'.
I already found a solution which works fine:
awk '/^sx|sx$/' usr/dict/words
However I tried to use grep or egrep and neither worked.
Then I made it simpler: any word which contains 'ax' or 'sx'. Again, it did not work!
grep 'ax|sx' words
grep -E '^ax\|sx$' words
egrep '^ax\|sx$' words
Error:
grep: illegal option -- E
Usage: grep -hblcnsviw pattern file . . .
Any suggestion appreciated.
This should work for you (with egrep, the alternation operator | is written without a backslash):
egrep '(^ax|sx$)' usr/dict/words
NOTE:
I couldn't find any words ending in sx in the dictionary I used.
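Here is a small sketch of why the plain grep attempt finds nothing useful. The word list is invented, and it assumes a grep that accepts -E; on a system whose grep rejects -E, as in the error above, egrep takes the same pattern.

```shell
# Hypothetical word list; the last entry contains a literal '|'.
words=$(printf 'axe\ntaxi\nfoosx\nax|sx\n')

# Basic regex: '|' is an ordinary character, so only the literal text "ax|sx" matches.
basic=$(printf '%s\n' "$words" | grep 'ax|sx')

# Extended regex: '|' means "or", so lines starting with ax or ending in sx match.
extended=$(printf '%s\n' "$words" | grep -E '^ax|sx$')

printf 'basic:\n%s\n\nextended:\n%s\n' "$basic" "$extended"
```

Note that taxi is not matched by the extended pattern, because the anchors require ax at the start or sx at the end of the line.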

regex for one character or another?

I'm searching for a regex code which lists all the lines that contain an a OR an i.
I tried this:
grep -E '[(a|i)]{1}' testFile.txt
but this gives me the words containing a or i as well as the words that contain both a and i.
What's wrong?
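You can achieve that with the following, which matches lines containing exactly one a or i:
grep -E "^[^ai]*(a|i){1}[^ai]*$" testFile.txt
If you want an exclusive OR (an a without any i, or an i without any a), then try this:
grep -E '^[^i]*a[^i]*$|^[^a]*i[^a]*$' testFile.txt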
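The three readings of "a or i" give different results, which a short sketch makes visible. The word list is made up, and grep -E support is assumed.

```shell
# Hypothetical word list.
words=$(printf 'cat\nsit\nrain\ndog\nbanana\n')

# Any line containing an a or an i (including lines with both).
any=$(printf '%s\n' "$words" | grep '[ai]')

# Lines with exactly one character that is a or i.
exactly_one=$(printf '%s\n' "$words" | grep -E '^[^ai]*(a|i)[^ai]*$')

# Exclusive OR: some a and no i, or some i and no a.
xor=$(printf '%s\n' "$words" | grep -E '^[^i]*a[^i]*$|^[^a]*i[^a]*$')

printf 'any:\n%s\n\nexactly one:\n%s\n\nxor:\n%s\n' "$any" "$exactly_one" "$xor"
```

Here rain is dropped by both stricter patterns because it has an a and an i, while banana passes only the exclusive-OR one because it has several a's but no i.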

Regex: Match only if string A is found and string B is not

I'm trying to write a regular expression that will essentially return true if string A is found and string B is not found.
To be more specific, I'm looking for any file on my server which has the text 'base64_decode' in it, but not 'copyright'.
Thanks!
I'm not sure your real task can be solved purely within the regex passed into grep, since grep processes files line-by-line. I would use the -l (--files-with-matches) and -L (--files-without-match) options along with command substitution backticks, like so:
grep -L copyright `grep -l base64_decode *`
grep -l base64_decode * lists the names of all the files with "base64_decode" in them, and the backticks put that list on the command line after grep -L copyright, which searches those files and lists the subset of them that doesn't contain "copyright".
It's not recommended to do this in one regex, but if you must, you can use lookaheads:
^(?=.*must-have)(?!.*must-not-have)
You may want to do this in single-line/dot-all mode, and the beginning anchor may be \A instead of ^.
(?=…) is positive lookahead; it asserts that a given pattern can be matched. (?!…) is negative lookahead; it asserts that a given pattern can NOT be matched.
References
regular-expressions.info/Lookarounds, Anchors, Dot
Piped greps should be able to achieve that easily:
find -type f -print | xargs grep -l "base64_decode" | xargs grep -L "copyright"
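If file names may contain spaces, a plain loop is one alternative to backticks or xargs. This is a sketch, not a replacement for the commands above: the directory, file names, and contents are invented, and it relies only on grep -q.

```shell
# Hypothetical sample files in a throwaway directory.
dir=$(mktemp -d)
printf '%s\n' '<?php eval(base64_decode($x));' > "$dir/suspicious.php"
printf '%s\n' '// copyright Example' 'echo base64_decode($y);' > "$dir/licensed.php"
printf '%s\n' 'echo "hello";' > "$dir/plain.php"

# Keep a file only if it contains base64_decode and does not contain copyright.
result=$(
    for f in "$dir"/*; do
        if grep -q 'base64_decode' "$f" && ! grep -q 'copyright' "$f"; then
            printf '%s\n' "${f##*/}"   # print the name without the directory part
        fi
    done
)
printf '%s\n' "$result"

rm -rf "$dir"
```

Each file is tested as a whole, so it does not matter whether the two strings appear on the same line.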
Use negative lookahead and lookbehind:
^(?<!.*copyright.*)(base64_decode)(?!.*copyright.*)$
Perl doesn't support this yet, since the lookbehind (?<!.*copyright.*) has unbounded length :-P.
I think it is
^.*[^copyright].*base64_decode.*[^copyright].*$
however this will catch the phrase copyright anywhere, even if it is only part of a longer word.
I was wrong ([^copyright] is a character class that excludes single characters, not the word), but beware that my example below still holds true for most other examples here.
E.g. it will match
This text is non-copyrightable because I said so! but it is not encoded in base64_decode unfortunatly :(