regex match numbers in an array in a shell script - regex

I have an array of values coming from bash, and I just want to check whether each one is a number. A value can contain -, +, digits, and spaces at the start or end, since bash evaluates it as a string.
Since every number is followed by a comma, I added (,) to the regex.
Basically I want to check whether each element is a number or not.
$val looks like this:
[ [ -0.13450142741203308, -0.3073260486125946, -0.15199440717697144, -0.06535257399082184, 0.02075939252972603, 0.03708624839782715, 0.04876817390322685 ] ,[ 0.10357733070850372, 0.048686813563108444, -0.1413831114768982, -0.11497996747493744, -0.08910851925611496, -0.04536910727620125, 0.06921301782131195, 0.02547631226480007 ] ]
This is my code which looks at each value and evaluates each. However it doesn't seem to catch the cases.
re='^[[:space:]][+-]?[0-9]+([.][0-9]+)?(,)[[:space:]]$'
for j in ${val[*]}
do
if ! [[ "$j" =~ $re ]] ; then
echo "Error: Not a number: $j"
fi
done
It also needs to ignore elements such as [ or ] or ],.
Any ideas how to correct this? Thanks for the help.

It is likely that $val is coming to you as a string.
If you don't need to validate each number as a fully legit number, you can use shell logic to filter those things that are obviously not numbers:
val='[ [ -0.13450142741203308, -0.3073260486125946, -0.15199440717697144, -0.06535257399082184, 0.02075939252972603, 0.03708624839782715, 0.04876817390322685 ] ,[ 0.10357733070850372, 0.048686813563108444, -0.1413831114768982, -0.11497996747493744, -0.08910851925611496, -0.04536910727620125, 0.06921301782131195, 0.02547631226480007 ] ]'
for e in $val; do # PURPOSELY no quote to break on spaces
e="${e/,}"
case $e in
''|*[!0-9.\-]*) printf "'%s' is bad\n" "$e" ;;
*) printf "'%s' is good\n" "$e" ;;
esac
done
Prints:
'[' is bad
'[' is bad
'-0.13450142741203308' is good
'-0.3073260486125946' is good
'-0.15199440717697144' is good
'-0.06535257399082184' is good
'0.02075939252972603' is good
'0.03708624839782715' is good
'0.04876817390322685' is good
']' is bad
'[' is bad
'0.10357733070850372' is good
'0.048686813563108444' is good
'-0.1413831114768982' is good
'-0.11497996747493744' is good
'-0.08910851925611496' is good
'-0.04536910727620125' is good
'0.06921301782131195' is good
'0.02547631226480007' is good
']' is bad
']' is bad
That is very fast, but it will wrongly accept malformed 'numbers' such as 123-456.
If you do need to filter out malformed numbers, you can use awk for that:
echo "$val" | awk -v RS="[^0-9.+-]+" '($0+0==$0)'
# prints all legit numbers from the string, one per line
# (a multi-character regex RS is an extension; gawk and mawk support it)

If you populate $val with the given string, it's not an array, it's a string. Using it unquoted would apply word splitting to it which splits it into whitespace separated words. The spaces aren't part of the words, and some of the words (the last one in each bracketed sequence) don't end in a comma:
#! /bin/bash
val='[ [ -0.13450142741203308, -0.3073260486125946, -0.15199440717697144, -0.06535257399082184, 0.02075939252972603, 0.03708624839782715, 0.04876817390322685 ] ,[ 0.10357733070850372, 0.048686813563108444, -0.1413831114768982, -0.11497996747493744, -0.08910851925611496, -0.04536910727620125, 0.06921301782131195, 0.02547631226480007 ] ]'
re='^[+-]?[0-9]+([.][0-9]+)?,?$'
for j in $val ; do
if ! [[ $j =~ $re ]] ; then
echo "Error: Not a number: $j"
fi
done
To use a bash array, declare it with round parentheses and use whitespace to separate the elements:
#! /bin/bash
val=(-0.13450142741203308 -0.3073260486125946 -0.15199440717697144 -0.06535257399082184 0.02075939252972603 0.03708624839782715 0.04876817390322685 0.10357733070850372 0.048686813563108444 -0.1413831114768982 -0.11497996747493744 -0.08910851925611496 -0.04536910727620125 0.06921301782131195 0.02547631226480007)
re='^[+-]?[0-9]+([.][0-9]+)?$'
for j in "${val[@]}" ; do
if ! [[ $j =~ $re ]] ; then
echo "Error: Not a number: $j"
fi
done
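The two answers above can be combined into one small sketch. The function names is_number and report_bad_tokens are made up for illustration, and the sketch assumes the whole bracketed list sits on a single line. Brackets and commas are turned into spaces first, so only real tokens reach the regex.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if $1 is an optionally signed integer or decimal.
is_number() {
    local re='^[+-]?[0-9]+([.][0-9]+)?$'
    [[ $1 =~ $re ]]
}

# Hypothetical helper: prints each token of a bracketed list that is not a number.
# Assumption: the list is a single line of text.
report_bad_tokens() {
    local cleaned=$1 token
    local -a tokens
    cleaned=${cleaned//,/ }     # commas become spaces
    cleaned=${cleaned//\[/ }    # so do opening brackets
    cleaned=${cleaned//\]/ }    # and closing brackets
    read -r -a tokens <<< "$cleaned"   # split on whitespace, no globbing
    for token in "${tokens[@]}"; do
        is_number "$token" || printf 'Not a number: %s\n' "$token"
    done
}

# Reports 12-3 and abc; the other tokens pass.
report_bad_tokens '[ [ -0.13, 0.02 ] ,[ 12-3, abc, +7 ] ]'
```

Using read -a instead of an unquoted for loop avoids accidental pathname expansion if a token ever contains * or ?.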

Related

'$' in regexp in bash

I really don't know what I'm doing.
In variable a, I want to find the first appearance of '$' after the first appearance of 'Bitcoin', and print everything after it until the first newline.
I have the following code:
a = 'something Bitcoin something againe $jjjkjk\n againe something'
if [[ $a =~ .*Bitcoin.*[\$](.*).* ]]; then
echo "${BASH_REMATCH[1]}"
else
echo "no"
fi
In this example I would like to get 'jjjkjk'. All I get is 'no'.
This code might be really flawed, I have no experience in this. I think tho the problem might be with the '$' sign. Please help!
Properly handle newlines in bash with ANSI-C Quoting -- \n sequences become literal newlines.
a=$'something Bitcoin something againe $jjjkjk\n againe something'
regex=$'Bitcoin[^$]*[$]([^\n]+)'
[[ $a =~ $regex ]] && declare -p BASH_REMATCH
declare -ar BASH_REMATCH='([0]="Bitcoin something againe \$jjjkjk" [1]="jjjkjk")'
# .................................................................^^^^^^^^^^^^
To verify the contents contain newlines:
$ printf '%s' "$regex" | od -c
0000000 B i t c o i n [ ^ $ ] * [ $ ] (
0000020 [ ^ \n ] + )
0000026
Here is a working version of your code:
a='something Bitcoin something againe $jjjkjk\n againe something'
r=".*Bitcoin.*[\$]([^\n]*).*"
if [[ $a =~ $r ]]; then
echo "${BASH_REMATCH[1]}"
else
echo "no"
fi
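You need to find 'Bitcoin' and then a '$' somewhere after it, no matter what is in between, so use .* there. When you want to capture text up to a specific character, a negated bracket expression [^...] is the best tool. Note that a is single-quoted here, so it contains a literal backslash followed by n rather than a newline, and inside a POSIX bracket expression the backslash is an ordinary character: [^\n] therefore means "anything except a backslash or the letter n", which happens to stop the capture at the right place for this input.
There was also an issue with your variable assignment: a = "..." is not valid because of the spaces around the =. The correct form is a='...'.
Double quotes would be wrong here too, because the shell would expand $jjjkjk as an (empty) variable.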
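For input that contains a real newline, a minimal sketch along the lines of the ANSI-C quoting answer might look like this. The function name extract_after_dollar is hypothetical; the regex is the one from that answer with * in place of +.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the text between the first '$' that follows
# 'Bitcoin' and the next newline. Fails if there is no such '$'.
extract_after_dollar() {
    # $'...' turns \n into a real newline inside the bracket expression.
    local re=$'Bitcoin[^$]*[$]([^\n]*)'
    [[ $1 =~ $re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

text=$'something Bitcoin something againe $jjjkjk\n againe something'
extract_after_dollar "$text"
```

Keeping the pattern in a variable sidesteps the quoting rules that apply to a regex written directly inside [[ ... ]].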

Bash regex does not accept slash

I am pretty new to bash shell scripting (and Linux too). I'm trying to write a simple script that applies some regexes to a string the user types on the keyboard.
clear
read -p "Insert e-mail > "
if [[ $REPLY =~ ^[.] ]]
then
echo "ERROR (code 1): e-mail cannot start with \".\""
elif [[ $REPLY =~ .[.]$ ]]
then
echo "ERROR (code 2): e-mail cannot end with \".\""
else
if [[ $REPLY =~ ^[0-9][0-9a-zA-Z!#$%^\&\'*+-]+$ ]] #THIS IS WHERE I NEED HELP
then
echo "Good!"
else
echo "Bad!"
fi
fi
What I want is a regex such that the user can't start or end the input with . (I pretty much did that and it's working).
Next, I wanted the string to start with a number, which I did with ^[0-9] (I think this is correct).
After that, the string can be any mix of digits 0-9, letters a-z and A-Z, or these characters: !#$%^&'*+-/
When the user entered 1& (it starts with a number and the rest is in the acceptable characters), it didn't work, because & needs to be written as \& in the regex.
The same problem occurred with the ' character; I again added a backslash in the regex (\') and it worked.
Then I tried the same with the / character (slash), adding \/ to the regex, but when the user entered 1/ (it starts with a number and the rest are acceptable characters), it printed "Bad!" when it should print "Good!".
Why is that happening?
I tried \/ and \\/ but still can't understand why it doesn't work!
The problem is the ! in your character class, which triggers history expansion.
I suggest declaring your regex beforehand like this:
re="^[0-9][0-9a-zA-Z\!#$%^&/*'+-]+$"
Then use it as:
s='1/'
[[ $s =~ $re ]] && echo "good" || echo "bad"
good
Actually, /s work in character classes just fine:
$ [[ "1/" =~ ^[0-9][/]+$ ]]; echo $?
0
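As a sketch of the "declare the regex beforehand" advice, the pattern can also be built from single-quoted pieces so that !, $ and & never need a backslash. The function name is_valid_input is made up for illustration, and the accepted character set is the one listed in the question.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if $1 starts with a digit and continues with
# one or more characters from 0-9 a-z A-Z ! # $ % ^ & ' * + - /
is_valid_input() {
    # Single quotes keep every character literal; the apostrophe is spliced
    # in as "'"; the '-' sits last in the brackets so it is not a range.
    local re='^[0-9][0-9a-zA-Z!#$%^&*+/'"'"'-]+$'
    [[ $1 =~ $re ]]
}

for s in '1/' '1&' "1'" 'a1'; do
    if is_valid_input "$s"; then echo "$s: Good!"; else echo "$s: Bad!"; fi
done
```

Because the pattern contains no dot, input that starts or ends with . is rejected without a separate check.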

shell script in bash using regex in while loop

Hi, I am trying to validate that user input is not empty and is a number, optionally with a decimal part.
re='^[0-9]+$'
while [ "$num" == "" ] && [[ "$num" ~= $re ]]
do
echo "Please enter the price : "
read num
done
It ran smoothly with just the first condition. When I add the second condition, my program won't run.
----EDIT----------
OK, I tried changing it and the program runs. But when I enter a number it still prompts for input.
re='^[0-9]+$'
while [ "$num" == "" ] && [ "$num" != $re ]
do
echo "Please enter the price : "
read num
done
Regular expressions are used with the =~ operator, not ~= as you wrote it. From the bash manual:
An additional binary operator, =~, is available, with the same
precedence as == and !=. When it is used, the string to the right of
the operator is considered an extended regular expression and matched
accordingly (as in regex(3)). The return value is 0 if the string
matches the pattern, and 1 otherwise. If the regular expression is
syntactically incorrect, the conditional expression's return value is
2. If the shell option nocasematch is enabled, the match is performed
without regard to the case of alphabetic characters. Any part of the
pattern may be quoted to force the quoted portion to be matched as a
string. Bracket expressions in regular expressions must be treated
carefully, since normal quoting characters lose their meanings between
brackets. If the pattern is stored in a shell variable, quoting the
variable expansion forces the entire pattern to be matched as a string.
Substrings matched by parenthesized subexpressions within the regular
expression are saved in the array variable BASH_REMATCH. The element
of BASH_REMATCH with index 0 is the portion of the string matching the
entire regular expression. The element of BASH_REMATCH with index n is
the portion of the string matching the nth parenthesized subexpression.
Consider these examples (0 means true/match, 1 means false/no match):
re=^[0-9]+; [[ "1" =~ ${re} ]]; echo $? # 0
re=^[0-9]+; [[ "a" =~ ${re} ]]; echo $? # 1
re=^[0-9]+; [[ "a1" =~ ${re} ]]; echo $? # 1
re=^[0-9]+; [[ "1a" =~ ${re} ]]; echo $? # 0 because it starts with a number
use this one to check for a number
re=^[0-9]+$; [[ "1a" =~ ${re} ]]; echo $? # 1 because checked up to the end
re=^[0-9]+$; [[ "11" =~ ${re} ]]; echo $? # 0 because all nums
UPDATE: If you just want to check whether the user entered a number, combine the lessons above with your needs. I think your conditions do not fit together. Perhaps this snippet solves your issue completely.
#!/bin/bash
re=^[0-9]+$
while ! [[ "${num}" =~ ${re} ]]; do
echo "enter num:"
read num
done
This snippet keeps requesting input as long as ${num} is NOT (!) a number. On the first pass ${num} is unset, so it expands to an empty string, which does not match a regex that requires at least one digit. After that it holds whatever the user entered.
Your error is simple; the variable can't be both empty and a number at the same time. Maybe you mean || "or" instead of && "and".
You can do this with glob patterns as well.
while true; do
read -r -p "Enter a price: " num
case $num in
"" | *[!.0-9]* | *.*.*) echo invalid ;;
*) break;;
esac
done
First off, there is the classic logic trap demonstrated in the OP's question:
while [ "$num" == "" ] && [ "$num" != $re ]
The issue here is the && which pretty much means the moment the left expression is false, the entire expression is false. i.e. the moment somebody types a non empty response, it breaks the loop and the regular expression test is never used. To fix the logic problem, one should consider changing && to ||, i.e.
while [ "$num" == "" ] || [ "$num" != $re ]
The second issue is that we want to loop while the input does not match the regular expression. This takes two steps: first, use [[ "$num" =~ $re ]] for the regular expression test; then negate it by prefixing a !, which yields:
while [ "$num" == "" ] || ! [[ "$num" =~ $re ]]
Having got this far, many people observed that there is actually no need to test for the empty string. That edge condition is already covered by the regular expression itself, so, we optimize out the redundant test. The answer now reduces to:
while ! [[ "$num" =~ $re ]]
In addition to the above observation, here are my notes about regular expression ( some of the observation has been collated from other answers ):
regular expressions can be tested with the [[ "$str" =~ regex ]] syntax
regular expressions match with $? == 0 ( 0 == no error )
regular expressions do not match with $? == 1 ( 1 == error )
regular expressions stop working as regexes when quoted, because a quoted pattern is matched as a literal string. recommend using [0-9] not "[0-9]"
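The notes above about exit status and quoting can be checked with a short sketch. The two function names are hypothetical; the only difference between them is whether the pattern expansion is quoted.

```shell
#!/usr/bin/env bash

# Hypothetical helper: $2 is used as an extended regular expression.
matches_as_regex() { [[ $1 =~ $2 ]]; }

# Hypothetical helper: quoting $2 makes bash match it as plain text.
matches_as_literal() { [[ $1 =~ "$2" ]]; }

re='^[0-9]+$'
matches_as_regex 42 "$re" && echo 'unquoted: 42 matches the regex'
matches_as_literal 42 "$re" || echo 'quoted: 42 does not contain the text ^[0-9]+$'
```

This is why keeping the pattern in a variable and expanding it unquoted is the usual recommendation.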
To implement a number validation, the following pattern seems to work:
str=""
while ! [[ "${str?}" =~ ^[0-9]+$ ]]
do
read -p "enter a number: " str
done
You can mix regular expression filters with regular arithmetic filters for some really nice validation results:
str=""
while ! [[ "${str?}" =~ ^[0-9]+$ ]] \
|| (( str < 1 || str > 15 ))
do
read -p "enter a number between 1 and 15: " str
done
N.B. I used the ${str?} syntax ( instead of $str ) for variable expansion as it demonstrates good practice for catching typos.
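Since the question asked for "a number or with decimal", here is a hedged sketch that separates the check from the prompt loop. The names is_price and prompt_price are invented for illustration, and the pattern assumes a price is digits with an optional fractional part.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds for digits with an optional .digits part.
# An empty string fails, so no separate emptiness test is needed.
is_price() {
    local re='^[0-9]+([.][0-9]+)?$'
    [[ $1 =~ $re ]]
}

# Hypothetical helper: keeps reading lines from stdin until one is a valid
# price, then prints it. Returns 1 if input ends first.
prompt_price() {
    local num=''
    until is_price "$num"; do
        read -r -p 'Please enter the price: ' num || return 1
    done
    printf '%s\n' "$num"
}
```

A script would typically call it as price=$(prompt_price); bash only shows the prompt when stdin is a terminal, so it does not end up in the captured value.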

IF statement in BASH isn't doing what's expected

Got a very simple script which checks barcodes basically. There's two barcodes that have to be checked that they are not confused when being made into a variable.
Basically the first barcode should contain only numbers 0-9, and the second barcode should contain two letters, then some numbers, then two more letters, like AB123456789CD.
If they're confused and read in the wrong order, it plays an error sound. This is what I have so far. The top one's working, though I'm not sure if it's the best solution, and the bottom one doesn't do what I want:
echo -e $BLUE"Please scan the first barcode"$ENDCOLOUR
read -p "Barcode: " BARCODE1
if [[ "$BARCODE1" =~ [a-z] ]] ; then
play -q ./error.wav
else
echo -e $BLUE"Please scan the second barcode"$ENDCOLOUR
read -p "Barcode: " BARCODE1
if [[ "$BARCODE2" =~ [a-z0-9] ]] ; then
play -q ./error.wav
else
echo "'$BARCODE1',$BARCODE2'" >> barcodes.csv
fi
fi
What's wrong? And is there a more optimal means of achieving this?
Only numbers:
if ! [[ $BARCODE1 =~ ^[0-9]+$ ]]; then
Because of the + sign, the if body is also entered for empty strings. + means one or more times and * means zero or more times.
Two letters, numbers, two letters:
if ! [[ $BARCODE2 =~ ^[a-zA-Z][a-zA-Z][0-9]+[a-zA-Z][a-zA-Z]$ ]]; then
Once again, this pattern does not match strings like 'AABB', which have no digits. If you think that 'AABB' is a valid barcode, then use this:
if ! [[ $BARCODE2 =~ ^[a-zA-Z][a-zA-Z][0-9]*[a-zA-Z][a-zA-Z]$ ]]; then
EDIT:
Also, if you know the exact number of digits in a barcode, you can use {n}:
if ! [[ $BARCODE2 =~ ^[a-zA-Z]{2}[0-9]{9}[a-zA-Z]{2}$ ]]; then
This means 2 letters, 9 digits, 2 letters.
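Putting the two anchored patterns together, a minimal sketch of the whole check could look like this. All three function names are hypothetical, and the CSV row format is copied from the question's echo line rather than from any confirmed requirement.

```shell
#!/usr/bin/env bash

# Hypothetical helper: first barcode, digits only.
is_numeric_barcode() {
    local re='^[0-9]+$'
    [[ $1 =~ $re ]]
}

# Hypothetical helper: second barcode, 2 letters, digits, 2 letters.
is_lettered_barcode() {
    local re='^[a-zA-Z]{2}[0-9]+[a-zA-Z]{2}$'
    [[ $1 =~ $re ]]
}

# Hypothetical helper: print a CSV row when both barcodes are valid,
# otherwise fail so the caller can play the error sound.
barcode_row() {
    if is_numeric_barcode "$1" && is_lettered_barcode "$2"; then
        printf "'%s','%s'\n" "$1" "$2"
    else
        return 1
    fi
}

barcode_row 123456789 AB123456789CD
```

A caller could then write barcode_row "$BARCODE1" "$BARCODE2" >> barcodes.csv || play -q ./error.wav, so a swapped pair never reaches the file.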

How should I get bash 3.2 to find a pattern between wildcards

Trying to compare input to a file containing alert words,
read MYINPUT
alertWords=( `cat "AlertWordList" `)
for X in "${alertWords[@]}"
do
# the wildcards in my expression do not work
if [[ $MYINPUT =~ *$X* ]]
then
echo "#1 matched"
else
echo "#1 nope"
fi
done
The =~ operator deals with regular expressions, and so to do a wildcard match like you wanted, the syntax would look like:
if [[ $MYINPUT =~ .*$X.* ]]
However, since this is a regex, that's not needed: a regex can match anywhere in the string unless it's anchored with ^ and/or $. So this should suffice:
if [[ $MYINPUT =~ $X ]]
Be mindful that if your "words" happen to contain regex metacharacters, then this might do strange things.
I'd avoid =~ here because as FatalError points out, it will interpret $X as a regular expression and this can lead to surprising bugs (especially since it's an extended regular expression, so it has more special characters than standard grep syntax).
Instead, you can just use == because bash treats the RHS of == as a globbing pattern:
read MYINPUT
alertWords=($(<"AlertWordList"))
for X in "${alertWords[@]}"
do
# the wildcards in my expression do work :-)
if [[ $MYINPUT == *"$X"* ]]
then
echo "#1 matched"
else
echo "#1 nope"
fi
done
I've also replaced cat with $(<file) in your alertWords assignment, which keeps the file reading inside the shell instead of spawning another process to do it.
If you want to use patterns, not regexes for matching, you can use case:
read MYINPUT
alertWords=( `cat "AlertWordList" `)
for X in "${alertWords[@]}"
do
# the wildcards in my expression do not work
case "$MYINPUT" in
*$X* ) echo "#1 matched" ;;
* ) echo "#1 nope" ;;
esac
done
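A sketch of the glob-based approach wrapped in a function, avoiding mapfile since the question targets bash 3.2. The name contains_alert_word is hypothetical, and unlike the original word-split array this version assumes one alert word per line in the file.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds, printing the first hit, if text $1 contains
# any line of file $2 as a literal substring.
contains_alert_word() {
    local input=$1 file=$2 word
    # '|| [[ -n $word ]]' also handles a last line without a trailing newline.
    while IFS= read -r word || [[ -n $word ]]; do
        [[ -z $word ]] && continue          # skip blank lines
        if [[ $input == *"$word"* ]]; then  # quoted, so matched literally
            printf '%s\n' "$word"
            return 0
        fi
    done < "$file"
    return 1
}
```

Usage might be read -r MYINPUT; contains_alert_word "$MYINPUT" AlertWordList. Quoting "$word" inside the pattern means characters such as . or * in the word list are not treated as wildcards.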