Bash regex string variable match

Bash regex string variable match - regex

I have the following script i wrote in perl that works just fine. But i am trying to achieve the same thing using bash.
#!/usr/bin/perl
use 5.010;
use strict;
INIT {
my $string = 'Seconds_Behind_Master: 1';
my ($s) = ($string =~ /Seconds_Behind_Master: ([\d]+)/);
if ($s > 10) {
print "Too long... ${s}";
} else {
print "It's ok";
}
}
__END__
How can i achieve this using a bash script? Basically, i want to be able to read and match the value at the end of the string "Seconds_Behind_Master: N" where N can be any value.

You can use regular expression in bash, just like in perl.
#!/bin/bash
STRING="Seconds_Behind_Master: "
REGEX="Seconds_Behind_Master: ([0-9]+)"
RANGE=$( seq 8 12 )
for i in $RANGE; do
NEW_STRING="${STRING}${i}"
echo $NEW_STRING;
[[ $NEW_STRING =~ $REGEX ]]
SECONDS="${BASH_REMATCH[1]}"
if [ -n "$SECONDS" ]; then
if [[ "$SECONDS" -gt 10 ]]; then
echo "Too Long...$SECONDS"
else
echo "OK"
fi
else
echo "ERROR: Failed to match '$NEW_STRING' with REGEX '$REGEX'"
fi
done
Output
Seconds_Behind_Master: 8
OK
Seconds_Behind_Master: 9
OK
Seconds_Behind_Master: 10
OK
Seconds_Behind_Master: 11
Too Long...11
Seconds_Behind_Master: 12
Too Long...12
man bash #BASH_REMATCH

You can use a tool for it e.g. sed if you want to stay with regexps:
#!/bin/sh
string="Seconds_Behind_Master: 1"
s=`echo $string | sed -r 's/Seconds_Behind_Master: ([0-9]+)/\1/g'`
if [ $s -gt 10 ]
then
echo "Too long... $s"
else
echo "It's OK"
fi

The specific case of "more than a single digit" is particularly easy with just a pattern match:
case $string in
*Seconds_Behind_Master: [1-9][0-9]*) echo Too long;;
*) echo OK;;
esac
To emulate what your Perl code is doing more closely, you can extract the number with simple string substitutions.
s=${string##*Seconds_Behind_Master: }
s=${s%%[!0-9]*}
[ $s -gt 10 ] && echo "Too long: $s" || echo OK.
These are glob patterns, not regular expressions; * matches any string, [!0-9] matches a single character which is not a digit. All of this is Bourne-compatible, i.e. not strictly Bash only (you can use /bin/sh instead of /bin/bash).

Related

Using regular expressions in a ksh Script

I have a file (file.txt) that contains some text like:
000000000+000+0+00
000000001+000+0+00
000000002+000+0+00
and I am trying to check each line to make sure that it follows the format:
character*9, "+", character*3, "+", etc
so far I have:
#!/bin/ksh
file=file.txt
line_number=1
for line in $(cat $file)
do
if [[ "$line" != "[[.]]{9}+[[.]]{3}+[[.]]{1}+[[.]]{2} ]" ]]
then
echo "Invalid number ($line) check line $line_number"
exit 1
fi
let "line_number++"
done
however this does not evaluate correctly, no matter what I put in the lines the program terminates.

When you want line numbers of the mismatches, you can use grep -vn. Be careful with writing a correct regular expression, and you will have
grep -Evn "^.{9}[+].{3}[+].[+].{2}$" file.txt
This is not in the layout that you want, so change the layout with sed:
grep -Evn "^.{9}[+].{3}[+].[+].{2}$" file.txt |
sed -r 's/([^:]*):(.*)/Invalid number (\2) check line number \1./'
EDIT:
I changed .{1} into ..
The sed is also over the top. When you need spme explanation, you can start with echo "Linenr:Invalid line"

I'm having funny results putting the regex in the condition directly:
$ line='000000000+000+0+00'
$ [[ $line =~ ^.{9}\+.{3}\+.\+..$ ]] && echo ok
ksh: syntax error: `~(E)^.{9}\+.{3}\+.\+..$ ]] && echo ok
' unexpected
But if I save the regex in a variable:
$ re="^.{9}\+.{3}\+.\+..$"
$ [[ $line =~ $re ]] && echo ok
ok
So you can do
#!/bin/ksh
file=file.txt
line_number=1
re="^.{9}\+.{3}\+.\+..$"
while IFS= read -r line; do
if [[ ! $line =~ $re ]]; then
echo "Invalid number ($line) check line $line_number"
exit 1
fi
let "line_number++"
done < "$file"
You can also use a plain glob pattern:
if [[ $line != ?????????+???+?+?? ]]; then echo error; fi
ksh glob patterns have some regex-like syntax. If there's an optional space in there, you can handle that with the ?(sub-pattern) syntax
pattern="?????????+???+?( )?+??"
line1="000000000+000+0+00"
line2="000000000+000+ 0+00"
[[ $line1 == $pattern ]] && echo match || echo no match # => match
[[ $line2 == $pattern ]] && echo match || echo no match # => match
Read the "File Name Generation" section of the ksh man page.

Your regex looks bad - using sites like https://regex101.com/ is very helpful. From your description, I suspect it should look more like one of these;
^.{9}\+.{3}\+.{1}\+.{2}$
^[^\+]{9}\+[^\+]{3}\+[^\+]{1}\+[^\+]{2}$
^[0-9]{9}\+[0-9]{3}\+[0-9]{1}\+[0-9]{2}$
From the ksh manpage section on [[ - you would probably want to be using =~.
string =~ ere
True if string matches the pattern ~(E)ere where ere is an extended regular expression.
Note: As far as I know, ksh regex doesn't follow the normal syntax
You may have better luck with using grep:
# X="000000000+000+0+00"
# grep -qE "^[^\+]{9}\+[^\+]{3}\+[^\+]{1}\+[^\+]{2}$" <<<"${X}" && echo true
true
Or:
if grep -qE "^[^\+]{9}\+[^\+]{3}\+[^\+]{1}\+[^\+]{2}$" <<<"${line}"
then
exit 1
fi
You may also prefer to use a construct like below for handling files:
while read line; do
echo "${line}";
done < "${file}"

Trying to write a regex in bash

I am new to regex and I am trying to write a regex in a bash script .
I am trying to match line with a regex which has to return the second word in the line .
regex = "commit\s+(.*)"
line = "commit 5456eee"
if [$line =~ $regex]
then
echo $2
else
echo "No match"
fi
When I run this I get the following error:-
man.sh: line 1: regex: command not found
man.sh: line 2: line: command not found
I am new to bash scripting .
Can anyone please help me fix this .
I just want to write a regex to capture the word that follows commit

You don't want a regex, you want parameter expansion/substring extraction:
line="commit 5456eee"
first="${line% *}"
regex="${line#* }"
if [[ $line =~ $regex ]]
then
echo $2
else
echo "No match"
fi
$first == 'commit', $regex == '5456eee'. Bash provides all the tools you need.

If you really only need the second word you could also do it with awk
line = "commit 5456eee"
echo $line | awk '{ print $2 }'
or if you have a file:
cat filename | awk '{ print $2 }'
Even if it's no bash only solution, awk should be present on most linux os's.

You should remove the spaces around the equals sign, otherwise bash thinks you want to execute the regex command using = and "commit\s+(.*)" as arguments.
Then you should remove the spaces also in the if condition and quote the strings:
$ regex="commit\s+(.*)"
$ line="commit 5456eee"
$ if [ "$line"=~"$regex" ]
> then
> echo "Match"
> else
> echo "No match"
> fi
Match

maybe you didn't start your script with the
#!/bin/sh
or
#!/bin/bash
to define the language you're using... ?
It must be your first line.
then be careful, spaces are consistant in bash. In your "if" statement, it should be :
if [ $line =~ $regex ]
check this out and tell us more about the errors you get

if you make this script to a file like test.sh
and execute like that :
test.sh commit aaa bbb ccc
$0 $1 $2 $3 $4
you can get the arguments eassily by $0 $1...

A simple way to get the resulting capture group that was matched (if there is one) is to use BASH_REMATCH, which puts the match results into it's own array:
regex=$"commit (.*)"
line=$"commit 5456eee"
if [[ $line =~ $regex ]]
then
match=${BASH_REMATCH[1]}
echo $match
else
echo "No match"
fi
Since you have only one capture group it will be defined within the array as BASH_REMATCH[1]. In the above example I've assigned the variable $match to the result of BASH_REMATCH[1] which returns:
5456eee

How to test if string matches a regex in POSIX shell? (not bash)

I'm using Ubuntu system shell, not bash, and I found the regular way can not work:
#!/bin/sh
string='My string';
if [[ $string =~ .*My.* ]]
then
echo "It's there!"
fi
error [[: not found!
What can I do to solve this problem?

The [[ ... ]] are a bash-ism. You can make your test shell-agnostic by just using grep with a normal if:
if echo "$string" | grep -q "My"; then
echo "It's there!"
fi

Using grep for such a simple pattern can be considered wasteful. Avoid that unnecessary fork, by using the Sh built-in Glob-matching engine (NOTE: This does not support regex):
case "$value" in
*XXX*) echo OK ;;
*) echo fail ;;
esac
It is POSIX compliant. Bash have simplified syntax for this:
if [[ "$value" == *XXX* ]]; then :; fi
and even regex:
[[ abcd =~ b.*d ]] && echo ok

You could use expr:
if expr "$string" : "My" 1>/dev/null; then
echo "It's there";
fi
This would work with both sh and bash.
As a handy function:
exprq() {
local value
test "$2" = ":" && value="$3" || value="$2"
expr "$1" : "$value" 1>/dev/null
}
# Or `exprq "somebody" "body"` if you'd rather ditch the ':'
if exprq "somebody" : "body"; then
echo "once told me"
fi
Quoting from man expr:
STRING : REGEXP
anchored pattern match of REGEXP in STRING

How to check if string contains characters in regex pattern in shell?

How do I check if a variable contains characters (regex) other than 0-9a-z and - in pure bash?
I need a conditional check. If the string contains characters other than the accepted characters above simply exit 1.

One way of doing it is using the grep command, like this:
grep -qv "[^0-9a-z-]" <<< $STRING
Then you ask for the grep returned value with the following:
if [ ! $? -eq 0 ]; then
echo "Wrong string"
exit 1
fi
As #mpapis pointed out, you can simplify the above expression it to:
grep -qv "[^0-9a-z-]" <<< $STRING || exit 1
Also you can use the bash =~ operator, like this:
if [[ ! "$STRING" =~ [^0-9a-z-] ]] ; then
echo "Valid";
else
echo "Not valid";
fi

case has support for matching:
case "$string" in
(+(-[[:alnum:]-])) true ;;
(*) exit 1 ;;
esac
the format is not pure regexp, but it works faster then separate process with grep - which is important if you would have multiple checks.

Using Bash's substitution engine to test if $foo contains $bar
bar='[^0-9a-z-]'
if [ -n "$foo" -a -z "${foo/*$bar*}" ] ; then
echo exit 1
fi

How to check whether a string has at least one alphabetic character?

I want to check whether a string has at least one
alphabetic character?
a regex could be like:
"^.*[a-zA-Z].*$"
however, I want to judge whether a string has at least one
alphabetic character?
so I want to use, like
if [ it contains at least one alphabetic character];then
...
else
...
fi
so I'm at a loss on how to use the regex
I tried
if [ "$x"=~[a-zA-Z]+ ];then echo "yes"; else echo "no" ;fi
or
if [ "$x"=~"^.*[a-zA-Z].*$" ];then echo "yes"; else echo "no" ;fi
and test with x="1234", both of the above script output result of "yes", so they are wrong
how to achieve my goal?thanks!

Try this:
#!/bin/bash
x="1234"
y="a1234"
if [[ "$x" =~ [A-Za-z] ]]; then
echo "$x has one alphabet"
fi
if [[ "$y" =~ [A-Za-z] ]]; then
echo "Y is $y and has at least one alphabet"
fi

If you want to be portable, I'd call /usr/bin/grep with [A-Za-z].

Use the [:alpha:] character class that respects your locale, with a regular expression
[[ $str =~ [[:alpha:]] ]] && echo has alphabetic char
or a glob-style pattern
[[ $str == *[[:alpha:]]* ]] && echo has alphabetic char

It's quite common in sh scripts to use grep in an if clause. You can find many such examples in /etc/rc.d/.
if echo $theinputstring | grep -q '[a-zA-Z]' ; then
echo yes
else
echo no
fi

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Bash regex string variable match - regex

You can use a tool for it e.g. sed if you want to stay with regexps: #!/bin/sh string="Seconds_Behind_Master: 1" s=`echo $string | sed -r 's/Seconds_Behind_Master: ([0-9]+)/\1/g'` if [ $s -gt 10 ] then echo "Too long... $s" else echo "It's OK" fi

Related

Using regular expressions in a ksh Script

Trying to write a regex in bash

How to test if string matches a regex in POSIX shell? (not bash)

How to check if string contains characters in regex pattern in shell?

How to check whether a string has at least one alphabetic character?

Categories

Resources