My intention is to determine if a whole line is present in a config file.
Here is an example:
ports.conf:
#NameVirtualHost *:80
NameVirtualHost *:80
Now I want to search for NameVirtualHost *:80 but NOT for #NameVirtualHost *:80!
My first thought was, of course, to use grep, like this:
grep -F "NameVirtualHost *:80" ports.conf
That gives me both lines, which is not what I want.
My second thought was to use a regex like this: grep -e "^NameVirtualHost \*:80" ports.conf. But obviously now I have to deal with escaping special characters like *.
This might not be a big deal, but I want to pass in arbitrary search strings and don't want to bother with escaping them when I use my script.
So my question is:
How do I escape special characters? or How can I achieve the same result with different tools?
grep has an option -x which does exactly that:
-x, --line-regexp
Select only those matches that exactly match the whole line. (-x is specified by POSIX.)
So if you change your first command to grep -Fx "NameVirtualHost *:80" ports.conf, you get what you want.
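A minimal sketch of how that could look inside a script. The has_line helper and the temporary sample file are made up for illustration; only the -F and -x options come from the answer above (-q and -- are standard grep usage added for scripting).

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): succeeds if the file
# given as $2 contains a line that is exactly equal to the string in $1.
has_line() {
    # -F: treat the pattern as a fixed string, so * and . need no escaping
    # -x: the match must cover the whole line
    # -q: print nothing, report the result through the exit status
    # --: end of options, in case the search string starts with a dash
    grep -Fxq -- "$1" "$2"
}

# Sample file modelled on the ports.conf from the question.
conf=$(mktemp)
trap 'rm -f "$conf"' EXIT
printf '%s\n' '#NameVirtualHost *:80' 'NameVirtualHost *:80' > "$conf"

# Only the uncommented line is an exact whole-line match.
matches=$(grep -Fx -- 'NameVirtualHost *:80' "$conf")
printf '%s\n' "$matches"

if has_line 'NameVirtualHost *:80' "$conf"; then
    echo "directive is active"
fi
```

Because the search string is never interpreted as a regex, it can be passed in from a variable without any escaping step.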
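Use printf's %q escaping (a bash feature). Note that %q produces shell quoting rather than regex escaping: it backslash-escapes the * as needed here, but it also turns the space into "\ ", which GNU grep still reads as a plain space.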
printf '%q' 'NameVirtualHost *:80'
All together
grep -e "^`printf '%q' 'NameVirtualHost *:80'`$" test
Or
reg="NameVirtualHost *:80"
grep -e "^`printf '%q' "$reg"`$" test
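If you would rather escape for the regex engine than for the shell, one possible sketch is to backslash-escape the basic-regex metacharacters with sed. The escape_bre function name and the sample file are assumptions for illustration, not something from the answer above.

```shell
#!/bin/sh
# Hypothetical helper: backslash-escape the characters that are special
# in a basic regular expression: [ \ . * ^ $
escape_bre() {
    printf '%s\n' "$1" | sed 's/[[\.*^$]/\\&/g'
}

reg='NameVirtualHost *:80'
escaped=$(escape_bre "$reg")
printf '%s\n' "$escaped"

# Sample file modelled on the ports.conf from the question.
conf=$(mktemp)
trap 'rm -f "$conf"' EXIT
printf '%s\n' '#NameVirtualHost *:80' 'NameVirtualHost *:80' > "$conf"

# Anchor the escaped pattern at both ends so it must match the whole line.
found=$(grep -e "^${escaped}\$" "$conf")
printf '%s\n' "$found"
```

This only targets grep's default basic syntax; with -E or -P more characters would need escaping.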
The quickest way I can think of to keep your regexp simple is
grep -F "NameVirtualHost *:80" ports.conf | grep -v "^#\|^\/\/"
How do you use grep to do a text file search for a pattern like ABC='123'?
I'm currently using:
grep -rnwi some/path -e "ABC\s*=\s*[\'\"][^\'\"]+[\'\"]"
but this only finds text like ABC="123". It misses any instances that use single-quotes. What's wrong with my regex?
You are using a PCRE. So, you need the -P flag. So, use this:
grep -rnwi some/path -P "ABC\s*=\s*[\'\"][^\'\"]+[\'\"]"
We don't need a \ before single quotes inside the character classes. So, your regex can also be written as:
"ABC\s*=\s*['\"][^'\"]+['\"]"
Input file:
ABC="123"
ABC='123'
Run grep with your PCRE:
grep -P "ABC\s*=\s*['\"][^'\"]+['\"]" input.txt
Output:
ABC="123"
ABC='123'
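In case -P is not available in your grep build, here is a sketch of the same search with -E and POSIX character classes. The third sample line (ABC=123 without quotes) is an addition of mine to show a non-match.

```shell
#!/bin/sh
input=$(mktemp)
trap 'rm -f "$input"' EXIT
printf '%s\n' 'ABC="123"' "ABC='123'" 'ABC=123' > "$input"

# -E enables extended regex, where + means "one or more".
# [[:space:]] is the POSIX spelling of \s.
quoted=$(grep -E "ABC[[:space:]]*=[[:space:]]*['\"][^'\"]+['\"]" "$input")
printf '%s\n' "$quoted"
```

Both quoted assignments are printed; the unquoted one is skipped.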
I am trying to use sed -i to update all my html forms for url shortening. Basically I need to delete the .php from all the action="..." tags in my html forms.
But I am stuck at just identifying these instances. I am trying this testfile:
action = "yo.php"
action = 'test.php'
action='test.php'
action="upup.php"
And I am using this expression:
grep -R "action\s?=\s?(.*)php(\"|\')" testfile
And grep returns nothing at all.
I've tried a bunch of variations, and I can see that even the \s? isn't working because just this grep command also returns nothing:
grep -R "action\s?=\s?" testfile
grep -R "action\\s?=\\s?" testfile
(the latter I tried thinking maybe I had to escape the \ in \s).
Can someone tell me what's wrong with these commands?
Edit:
Fix 1 - apparently I need to escape the question mark in \s? so that it is treated as "optional" rather than as a literal question mark.
The way you're using it, grep accepts basic POSIX regex syntax. The single quote does not need to be escaped in it [1], but some of the metacharacters you use do: in particular, ?, (), and |. You can use
grep -R "action\s\?=\s\?\(.*\)php\(\"\|'\)" testfile
I recommend, however, that you use extended POSIX regex syntax by giving grep the -E flag:
grep -E -R "action\s?=\s?(.*)php(\"|')" testfile
As you can see, that makes the whole thing much more readable.
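To see the difference between the two syntaxes in isolation, here is a small sketch that counts matching lines in the test file from the question. It uses a literal space instead of \s so that it does not depend on GNU extensions; the variable names are mine.

```shell
#!/bin/sh
testfile=$(mktemp)
trap 'rm -f "$testfile"' EXIT
cat > "$testfile" <<'EOF'
action = "yo.php"
action = 'test.php'
action='test.php'
action="upup.php"
EOF

# Basic syntax: ? is an ordinary character, so this looks for "action ?=".
# grep exits non-zero when nothing matches, hence the "|| true".
bre_count=$(grep -c 'action ?=' "$testfile" || true)

# Extended syntax: ? makes the preceding space optional.
ere_count=$(grep -Ec 'action ?=' "$testfile")

echo "basic: $bre_count, extended: $ere_count"
```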
Addendum: To remove the .php extension from all action attributes in a file, you could use
sed -i 's/\(action\s*=\s*["'\''][^"'\'']*\)\.php\(["'\'']\)/\1\2/g' testfile
Shell strings make this look scarier than it is; the sed code is simply
s/\(action\s*=\s*["'][^"']*\)\.php\(["']\)/\1\2/g
I amended the regex slightly so that in a line action='foo.php' somethingelse='bar.php' the right .php would be removed. I tried to make this as safe as I can, but be aware that handling HTML with sed is always hacky.
Combine this with find and its -exec action to handle a whole directory.
[1] The double quote needs to be escaped only because you use a double-quoted shell string, not because the regex requires it.
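A rough sketch of the find plus sed combination on a throwaway directory. The file names and form markup are invented for the demo, I swapped \s for [[:space:]], and I used -i.bak instead of plain -i so that a backup of each file is kept.

```shell
#!/bin/sh
# Throwaway directory with two made-up HTML files.
site=$(mktemp -d)
trap 'rm -rf "$site"' EXIT
printf '%s\n' '<form action="yo.php" method="post">' > "$site/a.html"
printf '%s\n' "<form action = 'test.php'>" > "$site/b.html"

# Same substitution as in the addendum, with [[:space:]] in place of \s.
# -i.bak edits each file in place and keeps the original as FILE.bak.
find "$site" -type f -name '*.html' -exec \
    sed -i.bak 's/\(action[[:space:]]*=[[:space:]]*["'\''][^"'\'']*\)\.php\(["'\'']\)/\1\2/g' {} +

cat "$site/a.html" "$site/b.html"
```

Run it against a copy of your files first and check the .bak files before deleting them.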
You need to use the -P option to use Perl regexes:
$ grep -P "action\s?=\s?(.*)php(\"|\')" test
action = "yo.php"
action = 'test.php'
action='test.php'
action="upup.php"
Try this unescaped plain regex, which only matches text within quotes:
action\s?=\s?["'](.*)\.php["']
You can fiddle around with it here:
https://regex101.com/r/lN8iG0/1
So on the command line this would be:
grep -P "action\s?=\s?[\"'](.*)\.php[\"']" test
I'm trying to use grep to find all files whose contents contain ".ple". I tried:
grep -ilr /.\.ple/ *
But it did not work.
Would someone be able to tell me what I did wrong?
Thanks.
Many programming and scripting languages let you write a regex between slashes (/.../), but grep does not. With grep you should wrap the regex in quotes instead. I think that is the problem with your grep line. The correct one:
grep -ilr '.\.ple' *
With yours (grep -ilr /.\.ple/ *), grep will look for lines like this:
foo/a.ple/bar
That is, the slashes are treated as literal characters.
What do you mean by it did not work?
Try this:
grep -rlE '\.ple$'
The above should work.
Edit: To search all files whose contents contain ".ple" try this
grep -rl '\.ple'
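Here is a quick sketch of why the escaped dot matters, using a made-up directory tree. It also shows -F, which avoids the escaping question entirely by treating the pattern as a fixed string.

```shell
#!/bin/sh
# Made-up directory tree: one file mentions ".ple", the other does not.
tree=$(mktemp -d)
trap 'rm -rf "$tree"' EXIT
mkdir "$tree/sub"
printf '%s\n' 'see notes.ple for details' > "$tree/hit.txt"
printf '%s\n' 'a simple example' > "$tree/sub/miss.txt"

# Escaped dot: only a literal ".ple" matches.
escaped=$(grep -rl '\.ple' "$tree")

# Fixed-string search: no escaping needed at all.
fixed=$(grep -rlF '.ple' "$tree")

# Unescaped dot matches any character, so "mple" in "simple" also hits.
loose=$(grep -rl '.ple' "$tree" | sort)

printf 'escaped: %s\n' "$escaped"
printf 'loose:\n%s\n' "$loose"
```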
I'm trying to grep for individual quantities in lines like this:
foo=24.587 bar=88 fox=jobs
and extract, say, all the '88' values. The number of columns isn't consistent, so awk followed by a cut won't cut it.
I tried using sed like this:
sed -e 's/.*\s\(bar=.+\)\s.*/\1/g'
and that just dumps the entire line. I'm not sure how to correct this regexp and, more importantly, why doesn't it do what I expect?
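Use -r (extended regex; -E is an equivalent spelling), which tends to behave more like you may expect. In sed's default basic syntax an unescaped + is a literal plus sign, so your pattern never matched and every line was printed unchanged. With -r you have to remove the backslashes from the parens, though: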
$ echo "foo=24.587 bar=88 fox=jobs" | sed -r 's/.*\s(bar=.+)\s.*/\1/g'
bar=88
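As an alternative sketch that does not depend on what surrounds the field, grep -o can print just the matching key=value pair. The variable names are mine, and [^[:space:]]* assumes the value contains no whitespace.

```shell
#!/bin/sh
line='foo=24.587 bar=88 fox=jobs'

# -o prints only the part of the line that matched.
pair=$(printf '%s\n' "$line" | grep -o 'bar=[^[:space:]]*')

# Strip everything up to and including "=" to keep just the value.
value=${pair#*=}

printf '%s\n' "$pair" "$value"
```

Unlike the sed version, this also works when bar= is the first or last field on the line.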
I have a text file, which contains a date in the form of dd/mm/yyyy (e.g 20/12/2012).
I am trying to use grep to parse the date and show it in the terminal, and it is successful,
until I meet a certain case:
These are my test cases:
grep -E "\d*" returns 20/12/2012
grep -E "\d*/" returns 20/12/2012
grep -E "\d*/\d*" returns 20/12/2012
grep -E "\d*/\d*/" returns nothing
grep -E "\d+" also returns nothing
Could someone explain to me why I get this unexpected behavior?
EDIT: I get the same behavior if I substitute the " (weak quotes) for ' (strong quotes).
The syntax you used (\d) is not recognised by grep's extended regex (POSIX ERE).
Use grep -P instead which uses Perl regex (PCRE). For example:
grep -P "\d+/\d+/\d+" input.txt
grep -P "\d{2}/\d{2}/\d{4}" input.txt # more restrictive
Or, to stick with extended regex, use [0-9] in place of \d:
grep -E "[0-9]+/[0-9]+/[0-9]+" input.txt
grep -E "[0-9]{2}/[0-9]{2}/[0-9]{4}" input.txt # more restrictive
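Since the goal was to pull the date out and show it, here is a small sketch that combines the [0-9] form with -o, which prints only the matched text. The sample file contents are made up.

```shell
#!/bin/sh
input=$(mktemp)
trap 'rm -f "$input"' EXIT
printf '%s\n' 'Report generated on 20/12/2012 at noon' 'no date here' > "$input"

# \d is not part of POSIX extended regex; [0-9] is.
# -o prints just the matched date instead of the whole line.
dates=$(grep -Eo '[0-9]{2}/[0-9]{2}/[0-9]{4}' "$input")
printf '%s\n' "$dates"
```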
You could also use -P instead of -E which allows grep to use the PCRE syntax
grep -P "\d+/\d+" file
does work too.
grep and egrep/grep -E don't recognize \d. Your first three patterns only appear to work because the asterisk makes \d optional: \d* is satisfied by an empty match, so no digit is actually matched.
Use [0-9] or [[:digit:]].
To help troubleshoot cases like this, the -o flag can be helpful as it shows only the matched portion of the line. With your original expressions:
grep -Eo "\d*" returns nothing - a clue that \d isn't doing what you thought it was.
grep -Eo "\d*/" returns / (twice) - confirmation that \d isn't matching while the slashes are.
As noted by others, the -P flag solves the issue by recognizing "\d", but to clarify Explosion Pills' answer, you could also use -E as follows:
grep -Eo "[[:digit:]]*/[[:digit:]]*/" returns 20/12/
EDIT: Per a comment by @shawn-chin (thanks!), --color can be used similarly to highlight the portions of the line that are matched while still showing the entire line:
grep -E --color "[[:digit:]]*/[[:digit:]]*/" returns 20/12/2012 (can't do color here, but the bold "20/12/" portion would be in color)