How to do a grep regex search for single-quotes?

How do you use grep to do a text file search for a pattern like ABC='123'?
I'm currently using:
grep -rnwi some/path -e "ABC\s*=\s*[\'\"][^\'\"]+[\'\"]"
but this only finds text like ABC="123". It misses any instances that use single-quotes. What's wrong with my regex?

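Your pattern uses + as a quantifier, which grep's default basic regular expressions treat as a literal character. Enable PCRE with the -P flag (with GNU grep, -E would also work):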
grep -rnwi some/path -P "ABC\s*=\s*[\'\"][^\'\"]+[\'\"]"
Single quotes do not need a backslash inside the character classes, so your regex can also be written as:
"ABC\s*=\s*['\"][^'\"]+['\"]"
Input file:
ABC="123"
ABC='123'
Run grep with your PCRE:
grep -P "ABC\s*=\s*['\"][^'\"]+['\"]" input.txt
Output:
ABC="123"
ABC='123'
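As a minimal sketch of the same search without PCRE: the pattern uses no Perl-only features, so -E is enough, and [[:space:]] is the POSIX spelling of \s. The temporary file and its three sample lines are made up for this demonstration.

```shell
# Hypothetical sample file, created only for this demonstration.
tmp=$(mktemp)
printf '%s\n' 'ABC="123"' "ABC='123'" 'ABC=123' > "$tmp"

# -E makes + a quantifier; inside double quotes only the " needs a backslash.
matches=$(grep -E "ABC[[:space:]]*=[[:space:]]*['\"][^'\"]+['\"]" "$tmp")
printf '%s\n' "$matches"

rm -f "$tmp"
```

Both quoted lines are printed; the unquoted ABC=123 line is not.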

Related

grep strings between "{{_(" and ")}}"

I want to parse html files to extract strings between "{{_(" and ")}}" using GREP. I tried something like this:
grep '"[^{{_(|)}}$]"' *.html
but it didn't work.
Can someone help me please?
Thanks!
You may use
grep -oP '(?<={{_\().+?(?=\)}})' file
Details
-o - output only matched substrings
-P - enable the PCRE regex engine
(?<={{_\().+?(?=\)}}) match:
(?<={{_\() - a location that is immediately preceded with {{_(
.+? - any 1 or more chars other than line break chars, as few as possible
(?=\)}}) - a location that is immediately followed with )}}.
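If your grep lacks -P (for example a build without PCRE support), a rough equivalent can be sketched with -oE piped into sed. The sample line is made up, and this version assumes the text between the delimiters never contains a ) character.

```shell
# Hypothetical sample file, created only for this demonstration.
tmp=$(mktemp)
printf '%s\n' '<p>{{_("Hello")}} and {{_("Bye")}}</p>' > "$tmp"

# Step 1: -o prints each {{_(...)}} token on its own line;
#         [^)]* stops at the first closing parenthesis.
# Step 2: sed strips the leading "{{_(" and the trailing ")}}".
extracted=$(grep -oE '\{\{_\([^)]*\)\}\}' "$tmp" | sed -e 's/^{{_(//' -e 's/)}}$//')
printf '%s\n' "$extracted"

rm -f "$tmp"
```

This prints "Hello" and "Bye" (quotes included), one per line.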
@Wiktor Stribiżew's answer works really well. However, if you have multiple files, you would get output like this, where the file name is displayed before each match:
foo.html: content abc
foo.html: test 123
bar.html: first match
bar.html: second match
So, if you are only interested in the matching string as output, you can try sed instead
sed -n 's/.*{{_(\(.*\))}}.*/\1/p' *.html
You can also count the unique occurrence of matches and things like that...
Update:
Or just use the -h (--no-filename) option with the grep that @Wiktor Stribiżew has provided.
grep -h -oP '(?<={{_\().+?(?=\)}})' *.html
Or use the -c flag to display the number of matching lines (not individual matches) for each file; -o has no effect on that count:
grep -c -oP '(?<={{_\().+?(?=\)}})' *.html
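Note that -c counts matching lines, so a line with two matches is counted once. A small sketch of the difference, using a made-up three-line file and an ERE pattern that assumes no ) inside the delimiters:

```shell
# Hypothetical sample file: two matches on line 1, one on line 2, none on line 3.
tmp=$(mktemp)
printf '%s\n' '{{_(a)}} {{_(b)}}' '{{_(c)}}' 'no match here' > "$tmp"

# -c reports how many lines contain at least one match.
line_count=$(grep -cE '\{\{_\([^)]*\)\}\}' "$tmp")

# -o prints one match per line, so counting those lines counts matches.
# tr removes the padding that some wc implementations add.
match_count=$(grep -oE '\{\{_\([^)]*\)\}\}' "$tmp" | wc -l | tr -d ' ')

echo "lines=$line_count matches=$match_count"   # lines=2 matches=3

rm -f "$tmp"
```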
As in the previous answers, the same approach can extract the value of an HTML attribute.
placeholder="SOME TEXT_HERE" -> grep -> SOME TEXT_HERE
grep -oP '(?<=placeholder=").+?(?=")' *html

How to grep specific IP Addresses using regex?

My simplified sample file is as follows. The actual file has more text and IP addresses in it; this is just to keep the example simple.
file.txt
10.1.1.9
10.1.1.33
10.1.1.35
I would like to grep only 10.1.1.9 & 10.1.1.33.
If I use grep '10.1.1.[9|33]' file.txt, this will grep everything including .35.
I know this can be achieved with grep -v 35 file.txt, but I want a regex solution because the actual file contains more data than this sample.
What's wrong with my regex and how to fix it?
[user@linux]$ grep '10.1.1.[9|33]' file.txt
10.1.1.9
10.1.1.33
10.1.1.35
[user@linux]$
Desired Output (without .35)
[user@linux]$ grep '10.1.1.[regex here]' file.txt
10.1.1.9
10.1.1.33
[user@linux]$
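This can be done with plain grep. Your pattern fails because [9|33] is a bracket expression that matches a single character (9, | or 3) rather than the alternatives 9 and 33, and the unescaped dots match any character. Use a group with alternation and escape the dots: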
grep -E '10\.1\.1\.(9|33)' Input_file
where the -E option is (from man grep):
-E, --extended-regexp
Interpret PATTERN as an extended regular expression (ERE, see below). (-E is specified by POSIX.)
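One caveat worth sketching: the pattern above is unanchored, so it also matches longer addresses such as 10.1.1.90. Assuming GNU grep, adding -w (whole-word match) is one way to rule those out. The extra 10.1.1.90 line is made up to show the difference.

```shell
# Hypothetical sample file; 10.1.1.90 is added to expose the unanchored match.
tmp=$(mktemp)
printf '%s\n' 10.1.1.9 10.1.1.33 10.1.1.35 10.1.1.90 > "$tmp"

# Unanchored: 10.1.1.90 matches too, because it starts with 10.1.1.9.
unanchored=$(grep -cE '10\.1\.1\.(9|33)' "$tmp")

# -w requires the match to be bounded by non-word characters or line edges.
whole_word=$(grep -wE '10\.1\.1\.(9|33)' "$tmp")

echo "unanchored matched $unanchored lines"   # 3
printf '%s\n' "$whole_word"

rm -f "$tmp"
```

With -w only 10.1.1.9 and 10.1.1.33 are printed.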

How to grep file to find lines like <version>1.1.9-beta</version>?

Looking for a suggestion on how to use cat file | grep REGEX to get the lines containing <version>anything</version>.
grep -F '<version>1.1.9-beta</version>' file
-F matches your pattern as literal text.
You don't need the cat; grep can read the file directly.
If you really mean anything, try grep '<version>.*</version>' file or grep -P '<version>.*?</version>' file. However, searching XML with regex is a bad idea.
Use the -E option to match an extended regular expression:
grep -E "<version>.*</version>" file
Refer to these rules for the regular expression: https://www.gnu.org/savannah-checkouts/gnu/grep/manual/grep.html#Regular-Expressions
For example, to match a typical two-part version number (3.14, or 13.14, or 0.1458) you can type:
grep -E "<version>[0-9]+\.[0-9]+</version>" file
You can do:
grep '<version>[^<]*</version>' file.xml
[^<]* will match zero or more characters up to the next <.
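To compare the two kinds of pattern from these answers, here is a small sketch on a made-up three-line file: [^<]* accepts any version text, while a numeric pattern (written here with + so that multi-digit parts match) accepts only plain major.minor numbers.

```shell
# Hypothetical sample file, created only for this demonstration.
tmp=$(mktemp)
printf '%s\n' \
  '<version>1.1.9-beta</version>' \
  '<version>3.14</version>' \
  '<name>demo</name>' > "$tmp"

# Any content between the tags: counts both <version> lines.
any_version=$(grep -c '<version>[^<]*</version>' "$tmp")

# Digits, a literal dot, digits: only the 3.14 line qualifies.
numeric_only=$(grep -E '<version>[0-9]+\.[0-9]+</version>' "$tmp")

echo "any=$any_version"          # any=2
printf '%s\n' "$numeric_only"    # <version>3.14</version>

rm -f "$tmp"
```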

Bash (grep) regex performing unexpectedly

I have a text file which contains a date in the form dd/mm/yyyy (e.g. 20/12/2012).
I am trying to use grep to parse the date and show it in the terminal, and it is successful,
until I meet a certain case:
These are my test cases:
grep -E "\d*" returns 20/12/2012
grep -E "\d*/" returns 20/12/2012
grep -E "\d*/\d*" returns 20/12/2012
grep -E "\d*/\d*/" returns nothing
grep -E "\d+" also returns nothing
Could someone explain to me why I get this unexpected behavior?
EDIT: I get the same behavior if I substitute the " (weak quotes) for ' (strong quotes).
The syntax you used (\d) is not recognised by grep's extended regex (ERE); it is a Perl-style shorthand.
Use grep -P instead which uses Perl regex (PCRE). For example:
grep -P "\d+/\d+/\d+" input.txt
grep -P "\d{2}/\d{2}/\d{4}" input.txt # more restrictive
Or, to stick with extended regex, use [0-9] in place of \d:
grep -E "[0-9]+/[0-9]+/[0-9]+" input.txt
grep -E "[0-9]{2}/[0-9]{2}/[0-9]{4}" input.txt # more restrictive
You could also use -P instead of -E, which lets grep use PCRE syntax:
grep -P "\d+/\d+" file
works too.
grep and egrep/grep -E don't recognize \d. Your first three patterns only appear to work because the asterisk makes \d optional: each pattern matches with zero occurrences of it, so no digit is actually matched.
Use [0-9] or [[:digit:]].
To help troubleshoot cases like this, the -o flag can be helpful as it shows only the matched portion of the line. With your original expressions:
grep -Eo "\d*" returns nothing - a clue that \d isn't doing what you thought it was.
grep -Eo "\d*/" returns / (twice) - confirmation that \d isn't matching while the slashes are.
As noted by others, the -P flag solves the issue by recognizing "\d", but to clarify Explosion Pills' answer, you could also use -E as follows:
grep -Eo "[[:digit:]]*/[[:digit:]]*/" returns 20/12/
EDIT: Per a comment by @shawn-chin (thanks!), --color can be used similarly to highlight the portions of the line that are matched while still showing the entire line:
grep -E --color "[[:digit:]]*/[[:digit:]]*/" returns 20/12/2012 (can't do color here, but the bold "20/12/" portion would be in color)
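The -o technique can be tried end to end with a short sketch. The sample sentence is made up; the patterns stick to ERE with [0-9] and [[:digit:]], so -P is not needed.

```shell
# Hypothetical sample line containing two dates.
tmp=$(mktemp)
printf '%s\n' 'Report dated 20/12/2012, revised 03/01/2013.' > "$tmp"

# -o prints only the matched text, one match per output line.
dates=$(grep -Eo '[0-9]{2}/[0-9]{2}/[0-9]{4}' "$tmp")

# A partial pattern shows exactly how far the regex gets.
prefixes=$(grep -Eo '[[:digit:]]+/[[:digit:]]+/' "$tmp")

printf '%s\n' "$dates"
printf '%s\n' "$prefixes"

rm -f "$tmp"
```

The first command prints 20/12/2012 and 03/01/2013; the second prints 20/12/ and 03/01/.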

Filter apache log file using regular expression

I have a big apache log file and I need to filter that and leave only (in a new file) the log from a certain IP: 192.168.1.102
I try using this command:
sed -e "/^192.168.1.102/d" < input.txt > output.txt
But "/d" removes those entries, and I need to keep them.
Thanks.
What about using grep?
cat input.txt | grep -e "^192.168.1.102" > output.txt
EDIT: As noted in the comments below, escaping the dots in the regex is necessary to make it correct. Escaping in the regex is done with backslashes:
cat input.txt | grep -e "^192\.168\.1\.102" > output.txt
sed -n 's/^192\.168\.1\.102/&/p'
sed is faster than grep on my machines
I think using grep is the best solution but if you want to use sed you can do it like this:
sed -e '/^192\.168\.1\.102/b' -e 'd'
The b command will skip all following commands if the regex matches and the d command will thus delete the lines for which the regex did not match.
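The approaches above can be checked against each other with a small sketch. The log lines are made up, and the trailing space in the pattern is an added assumption (the IP is followed by a space in common Apache log formats) to pin the match to the whole address.

```shell
# Hypothetical three-line access log, created only for this demonstration.
log=$(mktemp)
printf '%s\n' \
  '192.168.1.102 - - [01/Jan/2024:00:00:01 +0000] "GET / HTTP/1.1" 200 512' \
  '10.0.0.7 - - [01/Jan/2024:00:00:02 +0000] "GET /a HTTP/1.1" 404 0' \
  '192.168.1.102 - - [01/Jan/2024:00:00:03 +0000] "GET /b HTTP/1.1" 200 64' > "$log"

# grep keeps matching lines by default.
with_grep=$(grep '^192\.168\.1\.102 ' "$log")

# sed -n suppresses default output; p prints only the matching lines.
with_sed_p=$(sed -n '/^192\.168\.1\.102 /p' "$log")

# b skips the remaining commands for matching lines; d deletes the rest.
with_sed_b=$(sed -e '/^192\.168\.1\.102 /b' -e 'd' "$log")

kept=$(printf '%s\n' "$with_grep" | wc -l | tr -d ' ')
echo "kept $kept lines"   # kept 2 lines

rm -f "$log"
```

All three variables hold the same two lines, so redirecting any of the commands to output.txt gives the same file.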