How can I check the last character in a string in bash?

I need to ensure that the last character in a string is a /
x="test.com/"
if [[ $x =~ //$/ ]] ; then
x=$x"extention"
else
x=$x"/extention"
fi
At the moment, the else branch always runs.

Like this, for example:
$ x="test.com/"
$ [[ "$x" == */ ]] && echo "yes"
yes
$ x="test.com"
$ [[ "$x" == */ ]] && echo "yes"
$
$ x="test.c/om"
$ [[ "$x" == */ ]] && echo "yes"
$
$ x="test.c/om/"
$ [[ "$x" == */ ]] && echo "yes"
yes
$ x="test.c//om/"
$ [[ "$x" == */ ]] && echo "yes"
yes

You can index strings in Bash using ${var:index}, and ${#var} gives the length of the string. A negative index counts from the end of the string, so -1 is the index of the last character; the test below computes that same position as ${#x}-1:
if [[ "${x:${#x}-1}" == "/" ]]; then
    echo "last character of x is /"
fi
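As a small sketch of the negative-index form mentioned above: the space before the minus sign is needed, because ${s:-1} would be parsed as the "use a default value" expansion instead. The helper name last_char is only illustrative and not part of the original answer.

```shell
#!/usr/bin/env bash
# last_char is a hypothetical helper name used only for this illustration.
last_char() {
    local s=$1
    # Note the space after the colon: ${s:-1} would mean "1 if s is unset or empty".
    printf '%s\n' "${s: -1}"
}

last_char "test.com/"   # prints /
last_char "test.com"    # prints m
```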

Your condition was slightly incorrect. When using =~, the right-hand side is treated as a regular expression, so you write pattern and not /pattern/.
You would have got the expected result with
if [[ $x =~ /$ ]] ; then
instead of
if [[ $x =~ //$/ ]] ; then
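For completeness, here is a sketch of the question's snippet with the corrected condition, wrapped in a function so it can be tried on several inputs. The function name add_extension is an assumption made for this example; the string literal is copied from the question.

```shell
#!/usr/bin/env bash
# add_extension is a hypothetical wrapper around the snippet from the question.
add_extension() {
    local x=$1
    if [[ $x =~ /$ ]]; then      # unquoted regex: a "/" followed by end of string
        x=$x"extention"
    else
        x=$x"/extention"
    fi
    printf '%s\n' "$x"
}

add_extension "test.com/"   # prints test.com/extention
add_extension "test.com"    # prints test.com/extention
```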

You can do this generically using bash substring expansion, ${string:offset:length}, where length is optional.
${#x} is the length of x.
Therefore:
n=1 # 1 character
last_char=${x:${#x} - $n}
For future reference,
$ man bash
has all the magic
${parameter:offset:length}
Substring Expansion. Expands to up to length characters of parameter
starting at the character specified by offset. If length is
omitted, expands to the substring of parameter starting at the
character specified by offset. length and offset are arithmetic
expressions ...
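The substring expansion quoted above can be sketched as a small function that returns the last n characters. The name last_chars and the clamping of n to the string length are assumptions added for this example, not something the answer specifies.

```shell
#!/usr/bin/env bash
# last_chars STRING [N]: print the last N characters of STRING (N defaults to 1).
# Hypothetical helper; the clamp below is an added safety assumption.
last_chars() {
    local s=$1 n=${2:-1}
    if (( n > ${#s} )); then
        n=${#s}                      # never ask for more than the string holds
    fi
    # The offset is an arithmetic expression: length minus n.
    printf '%s\n' "${s:${#s} - n}"
}

last_chars "test.com/"      # prints /
last_chars "test.com/" 4    # prints com/
```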

Related

Regex not equal operator?

I'm trying to return a function if the NAMESPACE variable is blank or if the VERSION variable doesn't match the correct pattern.
# return usage if namespace is blank or version doesn't match the version format.
if [[ "$NAMESPACE" == "" || "$VERSION" =~ ^([0-9]\.([1-9]|[1-9][0-9])\.[0-9])$ ]];
then
usage
fi
Currently I'm using =~ which returns true if the pattern is x.xx.x or x.x.x. But I'm having trouble finding what the operator would be for not equal (something similar to !=~)
You can negate after the || for a (A OR NOT B):
if [[ "$NAMESPACE" == "" || ! "$VERSION" =~ ^([0-9]\.([1-9]|[1-9][0-9])\.[0-9])$ ]];
then
usage
fi
Note that you need to have spaces around the !.
Alternatively you can reverse the (A OR NOT B) to NOT (NOT A AND B):
if ! [[ "$NAMESPACE" != "" && "$VERSION" =~ ^([0-9]\.([1-9]|[1-9][0-9])\.[0-9])$ ]];
then
usage
fi
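To check that the two forms above really agree, here is a sketch that wraps each one in a function and returns success when usage should be shown. The function names and the version_re variable are hypothetical; the regex is the one from the question.

```shell
#!/usr/bin/env bash
# Regex from the question, stored in a variable so both forms share it.
version_re='^([0-9]\.([1-9]|[1-9][0-9])\.[0-9])$'

# Form 1: A OR NOT B
needs_usage_or() {
    local namespace=$1 version=$2
    [[ "$namespace" == "" || ! "$version" =~ $version_re ]]
}

# Form 2: NOT (NOT A AND B)
needs_usage_not_and() {
    local namespace=$1 version=$2
    ! [[ "$namespace" != "" && "$version" =~ $version_re ]]
}

if needs_usage_or "" "1.12.3"; then echo "usage (blank namespace)"; fi
if needs_usage_or "ns" "1.0.3"; then echo "usage (bad version)"; fi
if ! needs_usage_or "ns" "1.12.3"; then echo "ok"; fi
```

Both functions should return the same status for any pair of inputs, since the second is just De Morgan's law applied to the first.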

BASH: testing that arguments are a list of numbers

I am trying to test that an arbitrary number of arguments ("$@") to a bash script are numbers ("#", "#.#", ".#", "#.") delimited by spaces (i.e. # # # # ...). I have tried:
[ "$@" -eq "$@" ]
similar to what I found in this answer, but I get:
"[: too many arguments"
I have also tried regular expressions, but it seems that once the regular expression is satisfied, anything can come afterwards. Here is my code:
if (($# >=1)) && [[ "$@" =~ ^-?[[:digit:]]*\.?[[:digit:]]+ ]]; then
It also needs to reject "#.." and "..#".
I don't think that [ "$@" -eq "$@" ] is going to work.
A loop like this could help to read each argument and detect whether it is an integer (bash arithmetic does not handle decimals):
for i in "$@"; do
if [ "$i" -eq "$i" ] 2>/dev/null
then
echo "$i is an integer !!"
else
echo "ERROR: not an integer."
fi
done
In your case, to determine whether an argument is a valid integer or decimal number without all those regex checks, we can simply divide the number by itself using the bc program.
If it is a valid number, the result is 1.00. (Note that 0 is rejected by this test, because bc reports a division by zero.)
So in your case this should work:
for i in "$@"; do
if [[ "$(bc <<< "scale=2; $i/$i")" == "1.00" ]] 2>/dev/null;then
echo "$i is a number and thus is accepted"
else
echo "Argument $i not accepted"
fi
done
Output:
root@debian:# ./bashtest.sh 1 3 5.3 0.31 23. .3 ..2 8..
1 is a number and thus is accepted
3 is a number and thus is accepted
5.3 is a number and thus is accepted
0.31 is a number and thus is accepted
23. is a number and thus is accepted
.3 is a number and thus is accepted
Argument ..2 not accepted
Argument 8.. not accepted
$@ is an array of strings. You probably want to process the strings one at a time, not all together.
for i; do
if [[ $i =~ ^-?[[:digit:]]+\.?[[:digit:]]*$ ]] || [[ $i =~ ^-?\.?[[:digit:]]+$ ]]; then
echo yes - $i
else
echo no - $i
fi
done
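If the script only needs a yes/no answer for the whole argument list, the same two regexes can be folded into a function that fails on the first bad argument. This is a sketch; the name all_numeric and the rule that an empty argument list fails are assumptions.

```shell
#!/usr/bin/env bash
# all_numeric ARG...: succeed only if every argument looks like a number.
# Hypothetical helper built from the two regexes in the loop above.
all_numeric() {
    local arg
    (( $# >= 1 )) || return 1            # assumption: no arguments counts as failure
    for arg in "$@"; do
        if ! [[ $arg =~ ^-?[[:digit:]]+\.?[[:digit:]]*$ || $arg =~ ^-?\.?[[:digit:]]+$ ]]; then
            return 1
        fi
    done
    return 0
}

if all_numeric 1 3 5.3 .3 23.; then echo "all numbers"; fi     # prints all numbers
if ! all_numeric 1 ..2 8..; then echo "rejected"; fi           # prints rejected
```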
Bash has extended pattern matching with repetition operators that can help with your problem. Here is a script to validate all arguments:
for ARG ; do
[[ "$ARG" = +([0-9]) ]] && echo "$ARG is integer number" && continue
[[ "$ARG" = +([0-9]).*([0-9]) ]] && echo "$ARG is float number" && continue
[[ "$ARG" = *([0-9]).+([0-9]) ]] && echo "$ARG is float number" && continue
[[ "$ARG" = -+([0-9]) ]] && echo "$ARG is negative integer number" && continue
[[ "$ARG" = -+([0-9]).*([0-9]) ]] && echo "$ARG is negative float number" && continue
[[ "$ARG" = -*([0-9]).+([0-9]) ]] && echo "$ARG is negative float number" && continue
echo "$ARG is not a number."
done
The for loop iterates over the arguments passed to the script, assigning each one to ARG in turn.
Each test in the loop compares the value of the variable against a pattern built from [0-9] with a repetition prefix: +(...) matches one or more occurrences and *(...) matches zero or more. Some tests chain several such patterns.
Here is an example usage with output:
$ ./script.sh 123 -123 1.23 -12.3 1. -12. .12 -.12 . -. 1a a1 a 12345.6789 11..11 11.11.11
123 is integer number
-123 is negative integer number
1.23 is float number
-12.3 is negative float number
1. is float number
-12. is negative float number
.12 is float number
-.12 is negative float number
. is not a number.
-. is not a number.
1a is not a number.
a1 is not a number.
a is not a number.
12345.6789 is float number
11..11 is not a number.
11.11.11 is not a number.
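When the integer/float distinction is not needed, the six tests can be collapsed into one extended pattern. The sketch below uses ?(...) for "optional" and @(a|b) for "exactly one of"; the name is_number is hypothetical, and setting extglob explicitly is a precaution I am assuming is useful on older bash versions.

```shell
#!/usr/bin/env bash
shopt -s extglob   # precaution: make sure extended patterns are parsed

# is_number ARG: succeed if ARG is an optionally negative integer or decimal.
# Hypothetical helper; it accepts "1", "1.", "1.23" and ".12" but not ".".
is_number() {
    [[ $1 == ?(-)@(+([0-9])?(.)*([0-9])|.+([0-9])) ]]
}

for arg in 123 -12. .12 . 11..11; do
    if is_number "$arg"; then
        echo "$arg is a number"
    else
        echo "$arg is not a number"
    fi
done
```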
I shall assume that you mean decimal numbers: either integers or floating-point numbers written with a dot as the decimal point and without a grouping character (as in 1,123,456.00125).
Not included: scientific notation (3e+4), hex (0x22), octal (\033 or 033), other bases (32#wer), or arithmetic expressions (2+2, 9/7, 9**3, etc.).
In that case, a number may contain only digits, one optional sign and one optional dot.
This regex checks most of the above:
regex='^([+-]?)([0]*)(([1-9][0-9]*([.][0-9]+)?)|([.][0-9]+))$'
In words:
An optional sign (+ or -)
Followed by any amount of optional zeros.
Followed by either (…|…):
A digit [1-9] followed by zero or more digits [0-9] (optionally) followed by a dot and digits.
No digits followed by a dot followed by one or more digits.
Like this (since you tagged the question as bash):
regex='^([+-]?)([0]*)(([1-9][0-9]*([.][0-9]+)?)|([.][0-9]+))$'
[[ $n =~ $regex ]] || { echo "A $n is invalid" >&2; }
This accepts 0.0 and .0 as valid, but rejects both "0." and "0".
Of course, that should be done in a loop, like this:
regex='^([+-]?)([0]*)(([1-9][0-9]*([.][0-9]+)?)|([.][0-9]+))$'
for n
do m=${n//[^0-9.+-]} # Only keep digits, dots and sign.
[[ $n != "$m" ]] &&
{ echo "Incorrect characters in $n." >&2; continue; }
[[ $m =~ $regex ]] ||
{ echo "A $n is invalid" >&2; continue; }
printf '%s\n' "${BASH_REMATCH[1]}${BASH_REMATCH[3]}"
done
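To make the role of the capture groups more visible, here is a sketch that validates a single value and prints it without leading zeros. The function name normalize_number is an assumption; the regex is the one from the answer.

```shell
#!/usr/bin/env bash
# normalize_number VALUE: print VALUE without leading zeros, or fail if invalid.
# Hypothetical helper around the regex from the answer.
normalize_number() {
    local regex='^([+-]?)([0]*)(([1-9][0-9]*([.][0-9]+)?)|([.][0-9]+))$'
    [[ $1 =~ $regex ]] || return 1
    # Group 1 is the sign, group 2 the leading zeros, group 3 the remaining number.
    printf '%s\n' "${BASH_REMATCH[1]}${BASH_REMATCH[3]}"
}

normalize_number "+007.50"                        # prints +7.50
normalize_number "-000.25"                        # prints -.25
normalize_number "0" || echo "0 is rejected"      # prints 0 is rejected
```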

Bash Regex for empty string returns true

if [[ " " =~ ^[0-9]*$ ]]; then echo "si"; else echo "no"; fi # echoes "no"
if [[ "" =~ ^[0-9]*$ ]]; then echo "si"; else echo "no"; fi # echoes "si"
Is this a bug or am I missing something?
This is as expected. You specified 0 or more times (*) a digit ([0-9]). An empty string is 0 times that.
Use a + (which means "1 or more times") instead of a *:
if [[ " " =~ ^[0-9]+$ ]]; then echo "si"; else echo "no"; fi # echoes "no"
if [[ "" =~ ^[0-9]+$ ]]; then echo "si"; else echo "no"; fi # echoes "no"
The first string is a space, which does not match the [0-9]* regex.
The second is empty, which matches [0-9]* because * also allows 0 occurrences. If you require at least one occurrence with +, the match fails:
$ if [[ " " =~ ^[0-9]+$ ]]; then echo "si"; else echo "no"; fi;
no
* in a regex means "0 or more", so with nothing in the target string, the regex trivially matches.
[0-9]* matches zero or more digits, so yes, it matches the empty string. If you don't want to match the empty string, use [0-9]+, which matches one or more digits.
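A side-by-side sketch of the two quantifiers on the inputs discussed above; the function names are made up for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: same regex, differing only in the quantifier.
is_digits_star() { [[ $1 =~ ^[0-9]*$ ]]; }   # zero or more digits
is_digits_plus() { [[ $1 =~ ^[0-9]+$ ]]; }   # one or more digits

for value in "" " " "42"; do
    if is_digits_star "$value"; then star=si; else star=no; fi
    if is_digits_plus "$value"; then plus=si; else plus=no; fi
    printf '"%s": star=%s plus=%s\n' "$value" "$star" "$plus"
done
# prints:
# "": star=si plus=no
# " ": star=no plus=no
# "42": star=si plus=si
```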

Run file until the output matches regular expression

I would like to write a bash expression that would run the file "a.out" until the output of the file is equal to "b\na" where "\n" is a newline.
Here you go:
#!/bin/bash
./a.out | while read -r x && read -r y
do
    [[ $x == 'b' && $y == 'a' ]] && break
    echo "$x $y"
done
Tested this in bash on Ubuntu 13.04.
This might also help you quantify your results:
let ab=0 ba=0
for (( i=0; i<1000; ++i )); do
case "$(./a.out)" in
$'a\nb') let ab+=1;;
$'b\na') let ba+=1;;
esac
done
echo "a\\nb: $ab times; b\\na: $ba times"
Tested on Ubuntu 13.04
pcregrep -M matches the multi-line pattern b\na.
The -m 1 flag makes grep exit on the first match.
until ./a.out | pcregrep -M 'b\na' | grep -m 1 a; do :; done
This should work:
./a.out | while read line; do
    [[ $s == 1 && $line == 'a' ]] && break
    s=0 
    [[ $line == 'b' ]] && s=1 
done 
Overkill way:
mkfifo myfifo
./a.out > myfifo &
pp=$!
while read line; do
[[ $s == 1 && $line == 'a' ]] && break
s=0
[[ $line == 'b' ]] && s=1
done < myfifo
kill $pp
rm myfifo
Something like
./a.out | sed '/^b$/!b;:l;n;/^b$/bl;/^a$/q'
Translation: if the current input line does not match ^b$ (beginning of line, b, end of line) start over with the next input line; otherwise, fetch the next input line; as long as we get another ^b$, keep reading, otherwise, if it matches ^a$, stop reading and quit.
:l declares a label so we have somewhere to go back to in the while loop. b without an explicit label branches to the end of the script (which then starts over with the next input line).
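If the goal is to rerun the program from scratch until one complete run prints exactly b and then a, a plain until loop over a command substitution may be enough. This is a sketch under that reading of the question; run_until_match is a hypothetical name, and the command to run is passed in as arguments.

```shell
#!/usr/bin/env bash
# run_until_match EXPECTED CMD [ARGS...]: rerun CMD until its whole output equals EXPECTED.
# Hypothetical helper; it loops forever if the output never matches.
run_until_match() {
    local expected=$1
    shift
    local runs=1
    # $(...) strips trailing newlines, so a final "\n" from the program is ignored.
    until [[ "$("$@")" == "$expected" ]]; do
        runs=$((runs + 1))
    done
    printf 'matched after %d run(s)\n' "$runs"
}

# Usage, assuming ./a.out exists in the current directory:
#   run_until_match $'b\na' ./a.out
```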

Bash: need to find text within matching braces (parentheses) in text

I have some text that looks like this:
(something1)something2
However something1 and something2 might also have some parentheses inside them such as
(some(thing)1)something(2)
I want to extract something1 (including internal parentheses if there are any) to a variable. Since I can count on the text always starting with an opening parenthesis, I'm hoping that I can match the first parenthesis to the correct closing parenthesis and extract the middle.
Everything I have tried so far has the potential to match the wrong ending parentheses.
If you have Perl, then:
perl -MText::Balanced -nlE 'say [Text::Balanced::extract_bracketed( $_, "()" )]->[0]' <<EOF
(something1)something2
(some(thing)1)something(2)
(some(t()()hing)()1)()something(2)
EOF
prints
(something1)
(some(thing)1)
(some(t()()hing)()1)
Since this is apparently impossible with regular expressions, I have resorted to picking up the characters one by one:
first=""
count=0
while test -n "$string"
do
char=${string:0:1} # Get the first character
if [[ "$char" == ")" ]]
then
count=$(( $count - 1 ))
fi
if (( count > 0 ))
then
first="$first$char"
fi
if [[ "$char" == "(" ]]
then
count=$(( $count + 1 ))
fi
string=${string:1} # Trim the first character
if [[ $count == 0 ]]
then
second="$string"
string=""
fi
done
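The same depth-counting idea can be packaged as a function that walks the string by index instead of trimming it. This is only a sketch: extract_balanced is a hypothetical name, and it assumes the input starts with an opening parenthesis, as the question states.

```shell
#!/usr/bin/env bash
# extract_balanced STRING: print the text between the leading "(" and its matching ")".
# Hypothetical helper; fails if STRING does not start with "(" or never balances.
extract_balanced() {
    local s=$1 depth=0 i char
    [[ $s == "("* ]] || return 1
    for (( i = 0; i < ${#s}; i++ )); do
        char=${s:i:1}
        if [[ $char == "(" ]]; then
            depth=$((depth + 1))
        elif [[ $char == ")" ]]; then
            depth=$((depth - 1))
        fi
        if (( depth == 0 )); then
            printf '%s\n' "${s:1:i-1}"   # i is the index of the matching ")"
            return 0
        fi
    done
    return 1
}

extract_balanced "(something1)something2"        # prints something1
extract_balanced "(some(thing)1)something(2)"    # prints some(thing)1
```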
You can do it with perl:
echo "(some(thing)1)something(2)" | perl -ne '$_ =~ /(\((?:\(.*\)|[^(])*\))|\w+/s; print $1;'
awk can do it:
#!/bin/awk -f
{
numLeft = numRight = 0 # reset the counters for every input line
for (i=1; i<=length; ++i) {
if (numLeft == 0 && substr($0, i, 1) == "(") {
leftPos = i
numLeft = 1
} else if (substr($0, i, 1) == "(") {
++numLeft
} else if (substr($0, i, 1) == ")") {
++numRight
}
if (numLeft && numLeft == numRight) {
print substr($0, leftPos, i-leftPos+1)
next
}
}
}
Input:
(something1)something2
(some(thing)1)something(2)
Output:
(something1)
(some(thing)1)