Bash: Using quoted variable for grep within quoted expression


I'm trying to create a function within a bash script that queries a log file. Within the query function, I have something that resembles the following:
if [ -n "$(cat some-log-file.log | grep \"$1\")" ]; then
echo "There is a match."
else
echo "No lines matched the search term."
fi
If I send something I know will be in the log file as $1, like "I don't", I get the output:
$ ./y.sh query "I don't"
grep: don't": No such file or directory
No lines matched the search term.
If I single-quote the $() expression, it is passed as a literal string and always evaluates to true. I'm guessing it has something to do with the way grep interprets backslashes, but I can't figure it out. Maybe I'm overlooking something simple, but I've been at this for hours, looking on forums and plugging in all kinds of strange combinations of quotes and escape characters. Any help or advice is appreciated.

It's actually really easy, if you realize that $() is allowed to have unescaped quotes:
if [ -n "$(cat some-log-file.log | grep "$1")" ]; then
echo "There is a match."
else
echo "No lines matched the search term."
fi
You can actually even skip that step, though, because grep gives an appropriate exit code:
if grep -q "$1" some-log-file.log; then
echo "There is a match."
else
echo "No lines matched the search term."
fi
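In short, the text inside $() is parsed as a command of its own, so quotes there start a fresh quoting context and need no escaping. In the original, \" produced literal quote characters and left $1 unquoted, so it was split into the two words "I and don't", and grep took the second one as a file name. See more about how bash parses commands in the Shell Expansions section of the bash manual.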
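As a minimal sketch of the two points above, the following wraps the check in a function and lets the exit status of grep -q drive the if. The function name log_has_match and the temporary log file are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if any line of the log file matches the term.
log_has_match() {
    # $1 = search term (a basic regular expression), $2 = log file
    # "--" ends option parsing, so a term starting with "-" is not read as an option.
    grep -q -- "$1" "$2"
}

# Throwaway log file, created only for this demonstration.
logfile=$(mktemp)
trap 'rm -f "$logfile"' EXIT
printf '%s\n' "I don't know" "all good" > "$logfile"

if log_has_match "I don't" "$logfile"; then
    echo "There is a match."
else
    echo "No lines matched the search term."
fi

# Quotes inside $() start a fresh quoting context, so they need no escaping.
matches="$(grep -- "I don't" "$logfile")"
echo "$matches"
```

Because the term stays inside double quotes all the way to grep, the space in "I don't" no longer splits it into a pattern and a file name.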

Related

Using bash to extract sentences that contain different tenses of a specific verb

I want to extract sentences from a book that contain a list of verbs and their different tenses.
For example, for the word embellish, I want my program to recognize not only embellish, but also embellishing, embellishes and embellished. Here's what I did in bash:
word="embellish"
echo "It was embellished ..." | grep -E "${word}""ed"
This easily recognizes the sentence It was embellished .... I want to use alternation, as in the command below, to recognize different tenses:
echo "It was embellished ..." | grep -E '"${word}""ed"|"${word}""es"'
However the command failed to recognize the sentence. I tried different combinations of alternatives without any success.
Could you suggest how to use alternatives in the regular expression, so that I can detect the different tenses in one command?
Thanks!

You need to remove all double quotes from the grep argument and then replace the single quotes with double quotes. That's because variables are not expanded within single quotes. In fact, everything between single quotes is taken literally. Thus when you write
echo "It was embellished ..." | grep -E '"${word}""ed"|"${word}""es"'
You are searching for the literal string
"${word}""ed"|"${word}""es"
If, instead, you write
echo "It was embellished ..." | grep -E "${word}ed|${word}es"
the variable word is expanded and you are searching for either embellished or embellishes.
By the way: You can save some typing by grouping the ending ed, es, etc together.
echo "It was embellished ..." | grep -E "$word(ed|es|ing)"
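A small sketch of the grouped form, wrapped in a function so the stem can be passed as an argument. The name match_tenses and the list of endings are assumptions made for illustration; the trailing ? additionally lets the bare stem match.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the lines on stdin that contain the verb stem,
# optionally followed by one of a few common endings.
match_tenses() {
    # $1 = verb stem, e.g. "embellish"
    # Double quotes let the shell expand $1; the group and "|" reach grep unchanged.
    grep -E -- "${1}(ed|es|ing)?"
}

printf '%s\n' "It was embellished ..." "She embellishes it." "Nothing here." |
    match_tenses "embellish"
```

Only the first two input lines are printed. The pattern is unanchored, so longer words that start with the stem would match as well.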

Check if any replacement done by `perl -i -pe`

In GNU sed, I can display the result of a successful substitution of the search pattern. A simple example:
echo -e "nginx.service\nmariadb.service\nphp-fpm.service" > something.conf;
sed -ri 's|(mariadb)(\.service)|postgresql-9.4\2|w sed-output.log' something.conf;
[[ -s sed-output.log ]] && echo "Pattern found and modified. $(cat sed-output.log)" || echo "Pattern not found.";
Because sed has limitations when dealing with multiple lines, I switched to perl.
echo -e "nginx.service\nmariadb.service\nphp-fpm.service" > something.conf;
perl -i -pe 's|(mariadb)(\.service)|postgresql-9.4\2|' something.conf;
The code above does the same as sed, but how can I get the modified content ("postgresql-9.4.service") written to a file, or printed out?
Basically, what I would like to achieve is that, after the script has been executed, it tells me whether it was successful (and what was actually substituted), and if not, it displays a message saying what couldn't be found and replaced.
Edit:
Highlighted that I want to get only the modified content, which indicates that my script was successful. With perl -i -pe 's/pattern/replace/' file, I can't tell whether it returned true or false. Of course I can simply run grep -E "pattern" to find out, but that's not the question.

This code exits with status 0 when a replacement is done:
$ perl -i -pe '$M += s|(mariadb)(\.service)|postgresql-9.4\2|;END{exit 1 unless $M>0}' something.conf
$ echo $?
0
When NO substitution is done, return code will be 1:
$ perl -i -pe '$M += s|(maria)(\.service)|postgresql-9.4\2|;END{exit 1 unless $M>0}' something.conf
$ echo $?
1
From Perl documentation
An END code block is executed as late as possible, that is, after perl
has finished running the program and just before the interpreter is
being exited, even if it is exiting as a result of a die() function.
(But not if it's morphing into another program via exec, or being
blown out of the water by a signal--you have to trap that yourself (if
you can).) You may have multiple END blocks within a file--they will
execute in reverse order of definition; that is: last in, first out
(LIFO). END blocks are not executed when you run perl with the -c
switch, or if compilation fails.
Number of replacements returned from s operator
s/PATTERN/REPLACEMENT/msixpodualngcer
Searches a string for a pattern, and if found, replaces that pattern
with the replacement text and returns the number of substitutions
made.

It isn't as tidy in Perl because you have to open your log file explicitly, and for a one-liner that has to be in a BEGIN block. But Perl's s/// returns the number of changes made, so you can test it for truth.
Note also that $2 is better than \2 in Perl. In the replacement part of s/// Perl still accepts \2 for the sake of sed users, but it warns about it under -w, and in an ordinary double-quoted string \2 represents the character with code point 2, or Unicode U+0002 START OF TEXT.
perl -i -pe ' BEGIN { open F, ">perl-output.log" } print F $_ if s|(mariadb)(\.service)|postgresql-9.4$2| ' something.conf

You can check the output directly if you only print the substituted lines:
if [[ -z $(sed -n 's/mariadb\(\.service\)/postgresql-9.4\1/p' something.conf) ]]; then
echo nope
fi
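The same check can be sketched so that it also reports what the substitution produces. The function name changed_lines and the temporary file are hypothetical; the sed command is the one from the answer above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print only the lines that the substitution changes.
changed_lines() {
    # $1 = file to inspect. -n suppresses the default output, and the p flag
    # prints a line only when the substitution succeeded on it.
    sed -n 's/mariadb\(\.service\)/postgresql-9.4\1/p' "$1"
}

# Throwaway input file, created only for this demonstration.
conf=$(mktemp)
trap 'rm -f "$conf"' EXIT
printf '%s\n' nginx.service mariadb.service php-fpm.service > "$conf"

changed=$(changed_lines "$conf")
if [ -n "$changed" ]; then
    echo "Pattern found; replacement text: $changed"
else
    echo "Pattern not found."
fi
```

Note that this only inspects the file. It does not edit it in place, so the actual modification would still need a separate sed -i or perl -i run.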

Regular Expression won't work in bash, works in other tools

I have the following string:
Started GET "/stuff/search?search_string=Actin&organism_id=9&advanced_design=false&user_ip=172.16.0.1&filter=" for 172.16.0.4 at 2015-06-30 13:58:26 +0200
Parameters: {"search_string"=>"Actin", "organism_id"=>"9", "advanced_design"=>"false", "user_ip"=>"172.16.0.1", "filter"=>""}
Started GET "/stuff/search?search_string=NM_001101&organism_id=9&advanced_design=false&user_ip=172.16.0.1&filter=" for 172.16.0.4 at 2015-06-30 14:00:39 +0200
Parameters: {"search_string"=>"NM_001101", "organism_id"=>"9", "advanced_design"=>"false", "user_ip"=>"172.16.0.1", "filter"=>""}
Started GET "/stuff/search?search_string=ENST00000331789&organism_id=9&advanced_design=false&user_ip=172.16.0.1&filter=" for 172.16.0.4 at 2015-06-30 14:00:49 +0200
Parameters: {"search_string"=>"ENST00000331789", "organism_id"=>"9", "advanced_design"=>"false", "user_ip"=>"172.16.0.1", "filter"=>""}
and I want to extract the value of the "search_string" key. I need to do this in a bash script. For this I have come up with the following regular expression:
"\{(\"search_string\"\=\>\")([a-zA-Z0-9.\-_]+)(.*?)\}"
I have tested this on multiple online regular expression testers, like rubular or regex101.com and it works fine there. However, in bash, the regex does not match the text.
Here is my script (I have shortened the text for this question, but normally the text is in a file that I am grepping):
#!/bin/bash
regex="\{(\"search_string\"\=\>\")([a-zA-Z0-9.\-_]+)(.*?)\}"
string='{"search_string"=>"NM_001101"}'
if [[ $string =~ $regex ]]
then
echo "OK"
else
echo "not OK"
fi
filename="/some/path/search.txt"
if [ -f "$filename" ]
then
result=$(grep -F "$regex" "$filename")
echo "$result"
else
echo "$filename is not a file or it does not exist"
fi
In this case, the script returns "not OK".
Obviously, the script is not ready yet as I am stuck with this regular expression. What am I doing wrong ?
Thanks!

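Escape every backslash one more time, except the ones before the double quotes. The pattern below also drops the backslashes before = and >: neither character needs escaping, and GNU regex implementations read \> as an end-of-word anchor rather than a literal >.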
regex="\\{\"search_string\"=>\"[a-zA-Z0-9._-]+(.*?)\\}"
string='{"search_string"=>"NM_001101"}'
echo "$regex"
if [[ $string =~ $regex ]]
then
echo "OK"
else
echo "not OK"
fi
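To go one step further and actually extract the value, a sketch along these lines should work in bash. The helper name extract_search_string is hypothetical, and the pattern is a simplified variant of the one above that keeps only the part needed for the capture.

```shell
#!/usr/bin/env bash
# Single quotes keep the pattern literal, so no backslashes are needed at all.
# A "-" placed last inside the brackets is an ordinary character.
regex='"search_string"=>"([a-zA-Z0-9._-]+)"'

# Hypothetical helper: print the search_string value found in $1, or fail.
extract_search_string() {
    # The pattern variable is left unquoted on the right of =~ so that bash
    # treats it as a regular expression rather than a literal string.
    if [[ $1 =~ $regex ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"   # first parenthesised group
    else
        return 1
    fi
}

line='Parameters: {"search_string"=>"NM_001101", "organism_id"=>"9"}'
extract_search_string "$line"
```

This prints NM_001101 for the sample line and returns a non-zero status when the key is absent, which makes it easy to use in an if.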

This regex works in awk, so you could make some modifications to your script and use awk for the matching. By default awk reads lines from stdin or from every line of a file; a regex is enclosed in slashes (/.../) and commands are enclosed in braces ({...}). Here I echoed your example, piped it to awk and used the command print "ok" to test whether the regex matched. I think you can take this piece of code to make your script work the way you want in bash.
~$ echo '{"search_string"=>"NM_001101"}' | awk '/\{(\"search_string\"\=\>\")([a-zA-Z0-9.\-_]+)(.*?)\}/{print "ok"}'
ok

Grepping for a sentence from inside a bash script

I have a log file from which I want to grep for some error messages using a bash script, however I am not quite getting how to pass it the sentence and then use it in the grep call.
$./grep_sentence_script.sh "Call to server failed"
grep_sentence.sh
#!/bin/sh
sentence=$1
`grep $sentence logfile.log`
Could someone please help me with it.

Put the variable inside double quotes.
#!/bin/sh
sentence=$1
grep "$sentence" logfile.log

Just this will be sufficient:
#!/bin/bash
grep -iF "$1" logfile.log
It is important to use the -F (fixed string) option in order to avoid regex interpretation of special metacharacters like $, . etc. The -i option makes the match case-insensitive.
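A quick sketch of the difference -F makes. The log lines and the search term here are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Throwaway log file, created only for this demonstration.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
printf '%s\n' "version 1.5 released" "version 105 released" > "$log"

term="1.5"

# Without -F the "." is a regex metacharacter and matches any character,
# so both lines are counted. With -F only the literal text "1.5" matches.
regex_hits=$(grep -c -- "$term" "$log")
fixed_hits=$(grep -c -F -- "$term" "$log")

echo "regex: $regex_hits, fixed: $fixed_hits"
```

This prints "regex: 2, fixed: 1", which is why -F is the safer default when the search term is arbitrary user input.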

How to get a part of a string with a regular expression in a /bin/sh script

I need to extract part of a string in a shell script. The original string is pretty complicated, so I really need a regular expression to select the right part of the original string; just removing a prefix and suffix won't work. Also, the regular expression needs to check the context of the string I want to extract, so I need, for example, the regular expression a\([^b]*\)b to extract 123 from 12a123b23.
The shell script needs to be portable, so I cannot make use of the Bash constructs [[ and BASH_REMATCH.
I want the script to be robust, so when the regular expression does not match, the script should notice this e.g. through a non-zero exit code of the command to be used.
What is a good way to do this?
I've tried various tools, but none of them fully solved the problem:
expr match "$original" ".*$regex.*" works except for the error case. With this command, I don't know how to detect if the regex did not match. Also, expr seems to take the extracted string to determine its exit code - so when I happened to extract 00, expr had an exit code of 1. So I would need to generally ignore the exit code with expr match "$original" ".*$regex.*" || true
echo "$original" | sed "s/.*$regex.*/\\1/" also works except for the error case. To handle this case, I'd need to test if I got back the original string, which is also quite inelegant.
So, isn't there a better way to do this?

You could use the -n option of sed to suppress output of all input lines and add the p option to the substitute command, like this:
echo "$original" | sed -n -e "s/.*$regex.*/\1/p"
If the regular expression matches, the matched group is printed as before. But now if the regular expression does not match, nothing is printed and you will need to test only for the empty string.
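Put together as a portable function, this could look like the sketch below. The name extract is hypothetical, and the code assumes the regex contains exactly one \( \) group and no / character, since / is the sed delimiter here.

```shell
#!/bin/sh
# Hypothetical helper: print the text captured by the group in a BRE,
# or return 1 when the expression does not match.
extract() {
    # $1 = input string, $2 = BRE containing one \( \) group (and no "/")
    result=$(printf '%s\n' "$1" | sed -n "s/.*$2.*/\1/p")
    # Empty output means "no match". A match that captures an empty string
    # is indistinguishable from that and is also reported as a failure.
    [ -n "$result" ] || return 1
    printf '%s\n' "$result"
}

extract "12a123b23" 'a\([^b]*\)b'
```

For the sample input this prints 123, and a call whose regex does not match returns status 1 without printing anything.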

How about grep -o? The only possible problem is portability; otherwise it satisfies all requirements:
➜ echo "hello and other things" | grep -o hello
hello
➜ echo $?
0
➜ echo "hello and other things" | grep -o nothello
➜ echo $?
1
One of the best things is that, since it's grep, you can pick which regex flavor you want: BRE, ERE or Perl.

If egrep is available (it almost always is):
egrep 'YourPattern' YourFile
or
egrep "${YourPattern}" YourFile
If only grep is available:
grep -e 'YourPattern' YourFile
Check the status of the command with a classic [ $? -eq 0 ]; a non-zero status also covers the case where YourFile cannot be read.
For the content itself, extract it with sed or awk (for portability) after the failure test; head keeps only the first match:
Content="$( sed -n -e "s/.*\(${YourPattern}\).*/\1/p" YourFile | head -n 1 )"
