I was going through a bash script and read a line which says:
echo "Some line..." | grep -ioP '(?<=Arguments=\")(.*)(?=":Language=)'
I understand the grep options (grep -ioP), but this is the first time I have come across a pattern like
'(?<=Arguments=\")(.*)(?=":Language=)'.
What does it mean? Does it have a special meaning to grep, or does it simply match a similar string in the echoed text?
Thanks!
These are look-around assertions. (?<=...) is a look-behind (what must precede the match) and (?=...) is a look-ahead (what must follow it). They are useful here because they aren't part of the match itself, so -o won't output them.
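To see the effect, here is a minimal sketch. The input line and the variable names are made up for illustration, and it assumes GNU grep built with PCRE support (the -P option):

```shell
# Hypothetical input line; only the text between Arguments=" and ":Language= is kept.
line='Name=demo:Arguments="--verbose --dry-run":Language=en'

# The look-behind and look-ahead must be present around the match,
# but they are not included in what -o prints.
args=$(printf '%s\n' "$line" | grep -oP '(?<=Arguments=")(.*)(?=":Language=)')

printf '%s\n' "$args"   # prints: --verbose --dry-run
```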
I'm running the Linux command
ls /etc/systemd/system | grep -o -E "[0-9]+"
which should return just numerical values. The only problem is that it also returns unwanted numbers from parts of the names I don't care about. I want only the digits between - and .service, so for test-blah4-1321.service I just want it to return 1321. What am I missing here?
Example:
$ ls /etc/systemd/system
test.service test-blah4-1321.service test-blah2.service test-blah5-1387.service test-blah3-1521.service
GNU grep has the -P option for perl-style regexes, and the -o option to print only what matches the pattern. These can be combined using look-around assertions (described under Extended Patterns in the perlre manpage) to remove part of the grep pattern from what is determined to have matched for the purposes of -o.
Applied to your example this would be:
echo test-blah4-1321.service | grep -oP '(?<=-)\d+(?=\.service)'
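Run against the whole listing from the question, the same pattern might be used like this. It is only a sketch: printf stands in for ls so the example does not depend on your /etc/systemd/system, the variable name ids is arbitrary, and GNU grep with -P support is assumed.

```shell
# File names copied from the question; printf stands in for `ls`.
ids=$(printf '%s\n' test.service test-blah4-1321.service test-blah2.service \
        test-blah5-1387.service test-blah3-1521.service |
      grep -oP '(?<=-)\d+(?=\.service)')

# Names without a -<digits>.service suffix produce no output at all.
printf '%s\n' "$ids"
# prints:
# 1321
# 1387
# 1521
```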
When I need look-ahead or look-behind tests I usually switch to a Perl one-liner.
This should do the trick (it prints only when the line actually matches):
echo test-blah4-1321.service | perl -ne 'print "$1\n" if /(?<=-)(\d+)(?=\.service)/'
I'm looking for a suggestion on how to use cat file | grep REGEX to get the lines containing <version>anything</version>.
grep -F '<version>1.1.9-beta</version>' file
-F matches your pattern as literal text.
You don't need the cat; grep can read the file directly.
If you really mean anything, try grep '<version>.*</version>' file or grep -P '<version>.*?</version>' file. However, searching XML with regular expressions is a bad idea.
Use the -E option to match an extended regular expression:
grep -E "<version>.*</version>" file
Refer to these rules for the regular expression: https://www.gnu.org/savannah-checkouts/gnu/grep/manual/grep.html#Regular-Expressions
For example, to match the typical version format (3.14, or 13.14, or 0.1458) you can type:
grep -E "<version>[0-9]+\.[0-9]+</version>" file
You can do:
grep '<version>[^<]*</version>' file.xml
[^<]* matches zero or more characters up to the next <.
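The difference from .* shows up when a line holds more than one element. A small sketch with an invented one-line XML fragment (the variable names are arbitrary):

```shell
# Hypothetical one-line XML fragment containing two version elements.
xml='<a><version>1.0</version></a><b><version>2.0</version></b>'

# .* is greedy, so it runs from the first <version> to the last </version>.
greedy=$(printf '%s\n' "$xml" | grep -o '<version>.*</version>')

# [^<]* cannot cross a '<', so each element is matched on its own.
bounded=$(printf '%s\n' "$xml" | grep -o '<version>[^<]*</version>')

printf 'greedy:\n%s\n' "$greedy"
printf 'bounded:\n%s\n' "$bounded"
```

With this input the greedy form returns a single long match, while the bounded form returns the two elements on separate lines.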
I have a text file, which contains a date in the form of dd/mm/yyyy (e.g 20/12/2012).
I am trying to use grep to parse the date and show it in the terminal, and it is successful,
until I meet a certain case:
These are my test cases:
grep -E "\d*" returns 20/12/2012
grep -E "\d*/" returns 20/12/2012
grep -E "\d*/\d*" returns 20/12/2012
grep -E "\d*/\d*/" returns nothing
grep -E "\d+" also returns nothing
Could someone explain to me why I get this unexpected behavior?
EDIT: I get the same behavior if I substitute the " (weak quotes) for ' (strong quotes).
The syntax you used (\d) is not recognised by grep's extended regular expressions (POSIX ERE).
Use grep -P instead which uses Perl regex (PCRE). For example:
grep -P "\d+/\d+/\d+" input.txt
grep -P "\d{2}/\d{2}/\d{4}" input.txt # more restrictive
Or, to stick with extended regex, use [0-9] in place of \d:
grep -E "[0-9]+/[0-9]+/[0-9]+" input.txt
grep -E "[0-9]{2}/[0-9]{2}/[0-9]{4}" input.txt # more restrictive
You could also use -P instead of -E, which lets grep use PCRE syntax:
grep -P "\d+/\d+" file
works too.
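grep and egrep/grep -E don't recognize \d as a digit class. Your first three patterns only appear to work because the asterisk makes the preceding element optional, so the pattern can match the empty string or a bare slash. No digit is actually being matched. The fourth pattern needs two slashes with no digits between them, and the fifth needs at least one real \d match, so both fail.
Use [0-9] or [[:digit:]].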
To help troubleshoot cases like this, the -o flag can be helpful as it shows only the matched portion of the line. With your original expressions:
grep -Eo "\d*" returns nothing - a clue that \d isn't doing what you thought it was.
grep -Eo "\d*/" returns / (twice) - confirmation that \d isn't matching while the slashes are.
As noted by others, the -P flag solves the issue by recognizing "\d", but to clarify Explosion Pills' answer, you could also use -E as follows:
grep -Eo "[[:digit:]]*/[[:digit:]]*/" returns 20/12/
EDIT: Per a comment by #shawn-chin (thanks!), --color can be used similarly to highlight the portions of the line that are matched while still showing the entire line:
grep -E --color "[[:digit:]]*/[[:digit:]]*/" returns 20/12/2012 (can't do color here, but the bold "20/12/" portion would be in color)
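Putting these pieces together, a sketch of extracting just the date with -E and -o could look like this. The sample line and the variable names are invented for illustration:

```shell
# Hypothetical sample line containing a dd/mm/yyyy date.
line='Report generated on 20/12/2012 by cron'

# Extended regex with an explicit digit range.
date_range=$(printf '%s\n' "$line" | grep -Eo '[0-9]{2}/[0-9]{2}/[0-9]{4}')

# The same pattern written with the POSIX character class.
date_class=$(printf '%s\n' "$line" |
             grep -Eo '[[:digit:]]{2}/[[:digit:]]{2}/[[:digit:]]{4}')

printf '%s\n%s\n' "$date_range" "$date_class"
# prints 20/12/2012 twice
```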
I have a text file that stores combinations of four numbers in the following format:
Num1,Num2,Num3,Num4
Num5,Num6,Num7,Num8
.............
I have a whole bunch of such files, and what I need is to grep for the names of all files that contain the pattern described above.
I constructed my grep as follows:
grep -l "{d+},{d+},{d+},{d+}" /some/path/to/file/name
The grep terminates without returning anything.
Can somebody point out what I might be doing wrong with my grep statement?
Thanks
This should do what you want:
egrep -l '[[:digit:]]+,[[:digit:]]+,[[:digit:]]+,[[:digit:]]+' /some/path/to/file/name
One way is using a perl regexp:
grep -Pl "\d+,\d+,\d+,\d+" /some/path/to/file/name
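In your pattern, d is a literal letter and the braces do not form a digit class. The letter would need a backslash (\d), but even then grep's basic and extended regexes do not accept it as a digit class; only -P does.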
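To try -l safely, here is a self-contained sketch that builds two throwaway files. The file names and contents are hypothetical, and the -x option (match the whole line) is my own addition rather than part of the answers above:

```shell
# Build two hypothetical files in a temporary directory.
dir=$(mktemp -d)
printf '1,2,3,4\n5,6,7,8\n' > "$dir/numbers.txt"
printf 'a,b,c,d\n' > "$dir/letters.txt"

# -l prints only the names of files with at least one matching line;
# -x (an extra assumption here) requires the whole line to match.
matches=$(grep -lxE '[0-9]+,[0-9]+,[0-9]+,[0-9]+' "$dir"/*.txt)

printf '%s\n' "$matches"   # only the path ending in numbers.txt is listed
rm -rf "$dir"
```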
I have a text file
$ cat test.log
SYB-01001
SYB-18913
SYB-02445
SYB-21356
I want to grep for 01001 and 18913 only. What's the way to do this?
I want the output to be
SYB-01001
SYB-18913
SYB-02445
I tried this, but I'm not sure what's wrong with it:
grep 'SYB-(18913)|0*)' test.log
Use the -E flag for "extended regular expressions" with grep.
e.g.
grep -E 'SYB-(0|18913)' test.log
Other things to be aware of:
parentheses must match (every opening parenthesis needs a closing one)
0* means zero or more 0 characters, so it also matches the empty string; as an alternative it will match every SYB- line
Try this:
grep 'SYB-\(18913\|0*\)' test.log
But you may not want the 0* part to behave like this, since it also matches zero 0 characters. 0\+ (one or more; \+ is a GNU extension in basic regexes) may be better.
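The extended and basic spellings can be compared side by side on the sample data. This is a sketch: it recreates test.log in a temporary file, uses a plain 0 instead of 0* so that only numbers starting with 0 qualify, and relies on \| in basic regexes, which is a GNU grep extension.

```shell
# Recreate the sample log from the question in a temporary file.
log=$(mktemp)
printf '%s\n' SYB-01001 SYB-18913 SYB-02445 SYB-21356 > "$log"

# Extended regex: ( ) and | are special without backslashes.
ere=$(grep -E 'SYB-(0|18913)' "$log")

# Basic regex: the same operators must be escaped.
bre=$(grep 'SYB-\(0\|18913\)' "$log")

printf '%s\n' "$ere"
# prints:
# SYB-01001
# SYB-18913
# SYB-02445
rm -f "$log"
```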
Using awk:
awk '/SYB-(0|18913)/' file
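Your parentheses are unbalanced. Square brackets would not help here, because [...] is a character class that matches a single character. Try
grep -E 'SYB-(18913|0)' test.log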