Matching regex in bash - regex

I'm trying to match the parameters of a bash script with a regex
mykill.bash [-l] [-s SIGNO] pattern1 pattern2
I'm using this expression:
regex = ^(-l)?(\s-s\s[0-9]+)?(\s[a-zA-Z0-9]+){1,2}$
if [[ $@ =~ $regex ]]; then echo 'cool'
For example, ./mykill.bash -l -s 33 abc gives $@='-l -s 33 abc', which passes the debuggex.com tests,
but it doesn't work in my script.

You have bash problems, not a regex problem.
When assigning variables in bash: no space around the = please.
Then if you want to preserve backslashes and whitespace in the regex, use single quotes around it, otherwise bash eats them for breakfast. You don't need to quote cool. And close the if with a fi.
regex='^(-l)?(\s-s\s[0-9]+)?(\s[a-zA-Z0-9]+){1,2}$'
if [[ $@ =~ $regex ]]; then echo cool; fi
Or use the simpler form of the conditional:
[[ $@ =~ $regex ]] && echo cool

Some versions of bash do not handle \s the way we are used to in other languages/flavors.
One version that does not handle \s is the one on my MacBook Air: GNU bash, version 3.2.48(1)-release (x86_64-apple-darwin12).
Also, you should not have spaces around = when assigning a value to a variable.
As Ken-Y-N said in the comments to your post, your pattern has some problems as well; I fixed them in my code below.
This should do it if \s is the problem:
#!/bin/bash
re='^(-l\ )?(-s\ [0-9]+\ )?([a-zA-Z0-9]+)(\ [a-zA-Z0-9]+)?$'
if [[ $@ =~ $re ]]; then
echo 'cool'
fi
There's no need to escape the spaces like I did, but I find it easier to read this way.
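The points above (no spaces around =, a single-quoted regex variable, literal spaces instead of \s) can be combined into a small helper. This is only a sketch: the function name check_args is made up for illustration, and it assumes the default IFS so that "$*" joins the arguments with single spaces.

```bash
#!/bin/bash
# Hypothetical helper: succeeds when the arguments look like
#   [-l] [-s SIGNO] pattern1 [pattern2]
check_args() {
    # Single-quoted so bash leaves the pattern alone; plain spaces instead of \s.
    local re='^(-l )?(-s [0-9]+ )?[a-zA-Z0-9]+( [a-zA-Z0-9]+)?$'
    # Join all arguments into one string (assumes the default IFS).
    local args="$*"
    # The variable is left unquoted so it is treated as a regex.
    [[ $args =~ $re ]]
}

check_args -l -s 33 abc && echo cool
check_args -s abc || echo 'check again'
```

Joining the arguments first keeps the test independent of how $@ happens to expand inside [[ ]].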

As per your input, this will match
if [[ $@ =~ -l\ -s\ [0-9]+\ [a-zA-Z]+ ]]
then
echo 'cool'
else
echo 'check again'
fi

Related

Correct way to filter results with if statement in bash loop

I'm trying to work out a loop that will let me ignore some matches. So far I have:
for d in /home/chambres/web/x.org/public_html/2018/js/lib/*.js ; do
if [[ $d =~ /*.min.js/ ]];
then
echo "ignore $d"
else
filename="${d##*/}"
echo "$d"
#echo "$filename"
fi
done
However when I run it, they still seem to get included. What am I doing wrong?
/home/chambres/web/x.org/public_html/2018/js/lib/underscore.js.min.js
/home/chambres/web/x.org/public_html/2018/js/lib/tiny-slider.js
/home/chambres/web/x.org/public_html/2018/js/lib/tiny-slider.js.min.js
/home/chambres/web/x.org/public_html/2018/js/lib/underscore.js
BTW I'm a bit of a newbie with bash, so please be kind ;)
In Bash, regular expressions are not enclosed in /, so you should change your test to:
if [[ $d =~ \.min\.js$ ]]
As well as removing the enclosing /, I have escaped the . (otherwise they would match any character) and added a $ to match the end of the string.
But in fact you can use a simpler (and marginally faster) glob match in this case:
if [[ $d = *.min.js ]]
This matches any string that ends in .min.js.
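As a rough sketch of the glob approach in a loop, the following uses a made-up helper name (is_minified) and made-up relative paths rather than the directory from the question:

```bash
#!/bin/bash
# Hypothetical helper: succeeds when the name ends in .min.js.
is_minified() {
    # With = inside [[ ]], an unquoted right-hand side is a glob pattern.
    [[ $1 = *.min.js ]]
}

# Illustrative file names only; nothing is read from disk.
for d in lib/underscore.js lib/underscore.js.min.js lib/tiny-slider.js; do
    if is_minified "$d"; then
        echo "ignore $d"
    else
        echo "${d##*/}"   # strip the directory part
    fi
done
```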

Extract integers from string with bash

From a variable how to extract integers that will be in format *\d+.\d+.\d+* (4.12.3123) using bash.
filename="xzxzxzxz4.12.3123fsfsfsfsfsfs"
I have tried:
filename="xzxzxzxz4.12.3123fsfsfsfsfsfs"
if [[ "$filename" =~ (.*)(\d+.\d+.\d+)(.*) ]]; then
echo ${BASH_REMATCH}
echo ${BASH_REMATCH[1]}
echo ${BASH_REMATCH[2]}
echo ${BASH_REMATCH[3]}
else
echo 'nej'
fi
which does not work.
The easiest way to work with regexes in Bash, in terms of consistency between Bash versions and escaping, is to put the regex into a single-quoted variable and then use it unquoted, as below:
re='[0-9]+\.[0-9]+\.[0-9]+'
[[ $filename =~ $re ]] && printf '%s\n' "${BASH_REMATCH[@]}"
The main issue with your approach was that you were using the "Perl-style" \d, so in fact you could make your code work with:
if [[ "$filename" =~ (.*)([0-9]+\.[0-9]+\.[0-9]+)(.*) ]]; then
echo "${BASH_REMATCH[2]}"
fi
But this unnecessarily creates 3 capture groups, when you don't even need one. Note that I also changed . (any character) to \. (a literal .).
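If the individual numbers are wanted rather than the whole match, capture groups do become useful. A minimal sketch, assuming a made-up function name (parse_version) and the sample string from the question:

```bash
#!/bin/bash
# Hypothetical helper: prints the three numeric components separated by spaces,
# or returns non-zero when no N.N.N sequence is found.
parse_version() {
    local re='([0-9]+)\.([0-9]+)\.([0-9]+)'
    [[ $1 =~ $re ]] || return 1
    # BASH_REMATCH[0] is the whole match; [1]..[3] are the capture groups.
    printf '%s %s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}"
}

parse_version "xzxzxzxz4.12.3123fsfsfsfsfsfs"   # prints: 4 12 3123
```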
One way to extract it (the -P option requires GNU grep):
grep -oP '\d+\.\d+\.\d+' <<< "$filename"
There is one more way (the three-argument match() requires GNU awk):
$ filename="xzxzxzxz4.12.3123fsfsfsfsfsfs"
$ awk '{ if (match($0, /[0-9]+\.[0-9]+\.[0-9]+/, m)) print m[0] }' <<< "$filename"
4.12.3123

Regex in a bash script

I've got the following text file which contains:
12.3-456, test
test test test
If the line contains xx.x-xxx, then I want to print the line out. (X's are numbers)
I think I have the correct regex and have tested it here:
http://regexr.com/3clu3
I have then used this in a bash script but the line containing the text is not printed out.
What have I messed up?
#!/bin/bash
while IFS='' read -r line || [[ -n "$line" ]]; do
if [[ $line =~ /\d\d.\d-\d\d\d,/g ]]; then
echo $line
fi
done < input.txt
You need to use [0-9] instead of a \d in Bash regex. No regex delimiters are necessary, and the global flag is not necessary either. Also, you can contract it a bit using limiting quantifiers (like {3} that will match 3 occurrences of the pattern next to it). Besides, a dot matches any character in regex, so you need to escape it if you want to match a literal dot symbol.
Use
regex="[0-9]{2}\.[0-9]-[0-9]{3},"
if [[ $line =~ $regex ]]
...
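Put together with the read loop from the question, a complete version might look like the sketch below. The function name print_matching_lines is made up, and the input is piped in rather than read from input.txt so that the example is self-contained.

```bash
#!/bin/bash
# Hypothetical helper: echoes only the lines on stdin that contain xx.x-xxx,
print_matching_lines() {
    # Single quotes keep the backslash in \. intact.
    local regex='[0-9]{2}\.[0-9]-[0-9]{3},'
    local line
    # The || [[ -n ... ]] part also handles a last line without a trailing newline.
    while IFS='' read -r line || [[ -n "$line" ]]; do
        if [[ $line =~ $regex ]]; then
            printf '%s\n' "$line"
        fi
    done
}

printf '%s\n' '12.3-456, test' 'test test test' | print_matching_lines
```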
This works:
#!/bin/bash
#regex="/\d\d.\d-\d\d\d,/g"
regex='[0-9.-]+, [A-Za-z]+'
while IFS='' read -r line || [[ -n "$line" ]]; do
echo $line
if [[ $line =~ $regex ]]; then
echo "match"
fi
done
The regex is: one or more of [0-9, '.', '-'], followed by ',', a space, and one or more letters. It could be refined in a number of ways, e.g. by requiring explicit digit counts before and after the '-'.
Testing indicates:
$ ./sqltrace2.sh < input.txt
12.3-456, test
match
123.3-456, test
match
12.3-456,
test test test
test test test

Bash - correct way to escape dollar in regex

What is the correct way to escape a dollar sign in a bash regex? I am trying to test whether a string begins with a dollar sign. Here is my code, in which I double escape the dollar within my double quotes expression:
echo -e "AB1\nAB2\n\$EXTERNAL_REF\nAB3" | while read value;
do
if [[ ! $value =~ "^\\$" ]];
then
echo $value
else
echo "Variable found: $value"
fi
done
This does what I want for one box which has:
GNU bash, version 3.2.25(1)-release (x86_64-redhat-linux-gnu)
And the verbose output shows
+ [[ ! $EXTERNAL_REF =~ ^\$ ]]
+ echo 'Variable found: $EXTERNAL_REF'
However, on another box which uses
GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)
The comparison is expanded as follows
+ [[ ! $EXTERNAL_REF =~ \^\\\$ ]]
+ echo '$EXTERNAL_REF'
Is there a standard/better way to do this that will work across all implementations?
Many thanks
Why do you use a regular expression here? A glob is enough:
#!/bin/bash
while read value; do
if [[ "$value" != \$* ]]; then
echo "$value"
else
echo "Variable found: $value"
fi
done < <(printf "%s\n" "AB1" "AB2" '$EXTERNAL_REF' "AB3")
Works here with shopt -s compat32.
The regex doesn't need any quotes at all. This should work:
if [[ ! $value =~ ^\$ ]];
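A variant that sidesteps the quoting differences between versions is to keep the pattern in a single-quoted variable and expand it unquoted. This is a sketch only; the function name starts_with_dollar is made up for illustration.

```bash
#!/bin/bash
# Hypothetical helper: succeeds when the argument begins with a literal $.
starts_with_dollar() {
    # Inside single quotes the backslash survives, so the regex engine sees ^\$
    local re='^\$'
    # Unquoted expansion: the value is interpreted as a regex, not a literal string.
    [[ $1 =~ $re ]]
}

for value in AB1 AB2 '$EXTERNAL_REF' AB3; do
    if starts_with_dollar "$value"; then
        echo "Variable found: $value"
    else
        echo "$value"
    fi
done
```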
I would replace the double quotes with single quotes and remove one backslash, so that
$value =~ "^\\$"
can also be used as
$value =~ '^\$'
I never found the solution either, but for my purposes, I settled on the following workaround:
if [[ "$value" =~ ^(.)[[:alpha:]_][[:alnum:]_]+\\b && ${BASH_REMATCH[1]} == '$' ]]; then
echo "Variable found: $value"
else
echo "$value"
fi
Rather than trying to "quote" the dollar-sign, I instead match everything around it and I capture the character where the dollar-sign should be to do a direct-string comparison on. A bit of a kludge, but it works.
Alternatively, I've taken to using variables, but just for the backslash character (I don't like storing the entire regex in a variable because I find it confusing for the regex to not appear in the context where it's used):
bs="\\"
string="test\$test"
if [[ "$string" =~ $bs$ ]]; then
echo "output \"$BASH_REMATCH\""
fi

Why does BASH_REMATCH not work for a quoted regular expression?

The code is like this:
#!/bin/bash
if [[ foobarbletch =~ 'foo(bar)bl(.*)' ]]
then
echo "The regex matches!"
echo $BASH_REMATCH
echo ${BASH_REMATCH[1]}
echo ${BASH_REMATCH[2]}
fi
When I try to run it, it doesn't display anything:
bash-3.2$ bash --version
GNU bash, version 3.2.48(1)-release (x86_64-apple-darwin12)
Copyright (C) 2007 Free Software Foundation, Inc.
bash-3.2$ /bin/bash test_rematch.bash
bash-3.2$
Does anyone have ideas about this?
In your bash REGEX, you should remove quotes. That's why that doesn't work.
If your pattern contains spaces, I recommend doing it this way:
#!/bin/bash
x='foo bar bletch'
if [[ $x =~ foo[[:space:]](bar)[[:space:]]bl(.*) ]]
then
echo The regex matches!
echo $BASH_REMATCH
echo ${BASH_REMATCH[1]}
echo ${BASH_REMATCH[2]}
fi
You can also assign the quoted regexp to a variable:
#!/bin/bash
x='foobarbletch'
foobar_re='foo(bar)bl(.*)'
if [[ $x =~ $foobar_re ]] ; then
echo The regex matches!
echo ${BASH_REMATCH[*]}
echo ${BASH_REMATCH[1]}
echo ${BASH_REMATCH[2]}
fi
This not only supports simple quoting but gives the regexp a name which can help readability.
Thanks to your debugging statement, echo The regex matches!, you should have noticed there is no problem with BASH_REMATCH, since the if statement evaluates to false.
In bash, regular expressions used with =~ must be left unquoted. If the string on the right is quoted, then it is treated as a string literal.
If you want to include whitespace in your regexes, then use the appropriate character classes, or escape the space if you want a literal space.
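The difference can be seen side by side in the sketch below. The two function names are made up, and the behaviour described assumes bash 3.2 or later without the compat31 option set.

```bash
#!/bin/bash
# Hypothetical helpers contrasting a quoted and an unquoted pattern.
matches_quoted() {
    # Quoted: compared as a literal string, so ( ) . * lose their regex meaning.
    [[ $1 =~ 'foo(bar)bl(.*)' ]]
}
matches_unquoted() {
    # Unquoted: treated as a regex, and the groups populate BASH_REMATCH.
    [[ $1 =~ foo(bar)bl(.*) ]]
}

s='foobarbletch'
if matches_quoted "$s"; then echo "quoted: match"; else echo "quoted: no match"; fi
if matches_unquoted "$s"; then echo "unquoted: match, group 1 = ${BASH_REMATCH[1]}"; fi
```

With the quoted form, only a string that literally contains foo(bar)bl(.*) would match.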