What is a regular expression for a positive integer? I need it in an if clause in a bash script and I tried [[ $myvar == [1-9][0-9]* ]] and I don't get why it says, for instance, that 6 is not an integer and 20O0O0 is.
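The == operator performs pattern matching, not regular expression matching. [1-9][0-9]* matches a string that starts with a character in the range 1-9, followed by a digit in the range 0-9, followed by anything, including an empty string. Here * is not a repetition operator but a wildcard. That is why 6 is rejected (there is no second digit) and 20O0O0 is accepted (the wildcard swallows everything after the first two characters). As such, basic pattern matching is not sufficient.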
You can use extended pattern matching, which can be enabled explicitly, or (in the case of newer versions of bash) is assumed to be enabled for the argument to == and !=.
shopt -s extglob # may not be necessary
if [[ $myvar == [1-9]*([0-9]) ]]; then
The pattern *([0-9]) will match zero or more occurrences of the pattern enclosed in parentheses.
If you want to use a regular expression instead, use the =~ operator. Note that you now need to anchor your regular expression to the beginning and end of the string you are matching; patterns do so automatically.
if [[ $myvar =~ ^[1-9][0-9]*$ ]]; then
Note that some of the confusion stems from the fact that [...] is both a legal regular expression and pattern, and that characters like * are used in both but with slightly different meanings. Also note that extended patterns are equivalent in power to regular expressions (anything you can match with one you can match with the other), but I leave the proof of that as an exercise to the reader.
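As a rough side-by-side sketch of the three approaches above, the following wraps each test in a small function and runs a few sample inputs through all of them. The function names are made up for illustration, and the regex is the anchored ^[1-9][0-9]*$ form.

```shell
#!/usr/bin/env bash
shopt -s extglob   # harmless if your bash already enables it inside [[ ]]

# Hypothetical helpers, one per matching style.
glob_match()    { [[ $1 == [1-9][0-9]* ]]; }     # plain glob: * is a wildcard
extglob_match() { [[ $1 == [1-9]*([0-9]) ]]; }   # extended glob: *(...) repeats
regex_match()   { [[ $1 =~ ^[1-9][0-9]*$ ]]; }   # anchored extended regex

for s in 6 2000 20O0O0 0 007; do
  for f in glob_match extglob_match regex_match; do
    if "$f" "$s"; then r=yes; else r=no; fi
    printf '%-8s %-14s %s\n' "$s" "$f" "$r"
  done
done
```

With these inputs, only the plain glob rejects 6 and accepts 20O0O0; the extended glob and the regex agree with each other on every sample.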
There is no need to use a regex to check for a positive integer. Just combine a glob test with an arithmetic comparison inside [[ ... ]], like this:
isInt() {
# do sanity check for argument if needed
local n="$1"
[[ $n == [1-9]* && $n -gt 0 ]] 2>/dev/null && echo '+ve integer' || echo 'nope'
}
Then use it as:
isInt '-123'
nope
isInt 'abc'
nope
isInt '.123'
nope
isInt '0'
nope
isInt '789'
+ve integer
isInt '0123'
nope
foo=1234
isInt 'foo'
nope
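[[ $myvar =~ ^[+]?[[:digit:]]+$ ]] && echo "Positive Integer"
Wouldn't this do it? It allows one optional leading plus sign and requires at least one digit, so the empty string and a bare + are rejected.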
If a 0 is not a positive number in your description and you are not ready to accept leading zeros or plus, then do
[[ $myvar =~ ^[1-9]+[[:digit:]]*$ ]] && echo "Positive Integer"
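To see where a lenient and a strict regex part ways, here is a small sketch. The names lenient and strict are hypothetical, and the lenient pattern assumes that one optional plus sign and leading zeros are acceptable.

```shell
#!/usr/bin/env bash

# Hypothetical helpers for illustration.
lenient() { [[ $1 =~ ^[+]?[[:digit:]]+$ ]]; }   # optional +, allows 0 and leading zeros
strict()  { [[ $1 =~ ^[1-9][[:digit:]]*$ ]]; }  # no sign, no leading zero, never 0

for s in 42 +42 042 0 '' + 4x; do
  if lenient "$s"; then l=yes; else l=no; fi
  if strict "$s";  then t=yes; else t=no; fi
  printf '%-6s lenient=%-3s strict=%s\n" ' "'$s'" "$l" "$t" | tr -d '"'
done
```

Both versions reject the empty string, a lone plus sign, and 4x; they only disagree on +42, 042, and 0.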
DPHPV=/usr/local/nginx/conf/php81-remi.conf
I am unable to figure out how to match a string that contains any 2 digits:
if [[ "$DPHPV" =~ *"php[:digit:][:digit:]-remi.conf"* ]]
You are not using the right regex here as * is a quantifier in regex, not a placeholder for any text.
Actually, you do not need a regex, you may use a mere glob pattern like
if [[ "$DPHPV" == *php[[:digit:]][[:digit:]]-remi.conf ]]
Note
== - enables glob matching
*php[[:digit:]][[:digit:]]-remi.conf - matches any text with *, then matches php, then two digits (note that the POSIX character classes must be used inside bracket expressions), and then -remi.conf at the end of the string.
See the online demo:
#!/bin/bash
DPHPV='/usr/local/nginx/conf/php81-remi.conf'
if [[ "$DPHPV" == *php[[:digit:]][[:digit:]]-remi.conf ]]; then
echo yes;
else
echo no;
fi
Output: yes.
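If the two digits themselves are needed, a regex with a capture group is one option. This is only a sketch: the function name php_version is made up, and the pattern is kept in a variable so that its backslash and braces reach the regex engine untouched.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the two digits between "php" and "-remi.conf".
php_version() {
  local re='php([[:digit:]]{2})-remi\.conf$'
  [[ $1 =~ $re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

php_version '/usr/local/nginx/conf/php81-remi.conf'
```

Unlike the glob, the regex needs no leading wildcard because =~ already matches anywhere in the string; only the trailing $ anchors it to the end.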
My question is about the Bash binary operator =~ about which the Bash manual page says the following:
When it is used, the string to the right of the operator is considered a POSIX extended regular expression and matched accordingly (as in regex(3)). The return value is 0 if the string matches the pattern, and 1 otherwise.
Under the heading Compound Command the manual says of an expression in the form:
[[ expression ]]
Return a status of 0 or 1 depending on the evaluation of the conditional expression expression. Expressions are composed of the primaries described below under CONDITIONAL EXPRESSIONS...[and] An additional binary operator, =~, is available...
Which seems to indicate that the =~ operator is available within a compound command of the form
[[ <string> =~ <string> ]]
Indeed, the following expression invoked at the Bash command-line prompt:
[[ 'x' =~ 'x' ]]
exits with a return value of 0 which, according to the manual page, indicates the pattern matched. However:
[[ 'x' =~ '.' ]]
returns 1 indicating the pattern does not match. And
[[ 'x' =~ '^' ]]
also returns 1. I have tried this on GNU bash version 5.0.18(1)-release on Debian Linux, and 5.0.17(1)-release on Apple Darwin.
The entry for "regex" in section 7 of the Debian manual (and "re_format" on the Apple machine) begins by indicating that it describes "Regular expressions ("RE"s), as defined in POSIX.2" of which one form is "modern REs (roughly those of egrep; POSIX.2 calls these 'extended' REs)." If the POSIX.2 mentioned in the regex page is the same as the POSIX mentioned in the bash page, then that would mean that the "modern REs" described in the regex page are the same as the "POSIX extended regular expressions" that Bash considers the string to the right of the =~ to be.
The regex manual entry says further:
"A (modern) RE is one or more nonempty branches"
"A branch is one or more pieces"
"A piece is an atom"
"An atom is [inter alia] '.' (matching any single character) [or] '^' (matching the null string at the beginning of a line..."
As noted above, this expression:
[[ 'x' =~ '.' ]]
returns a value 1 indicating no match. Yet if Bash considers the string to the right of the =~ operator to be a POSIX regular expression, and if the single character '.' can be a POSIX regular expression that matches any single character, and 'x' is a single character, then ought not the string '.' to the right of the =~ operator to match the single character 'x' that is to the left of the =~ operator in the above expression? If so, then why is the return value 1?
Similarly, if '^' matches the null string at the beginning of a line, then ought not the string '^' to the right of the =~ operator to match the string 'x' to the left of the =~ operator in the above expression? If so then why does the expression [[ 'x' =~ '^' ]] return 1?
Post-solution Update
chepner's answer (and the comments) provide the working solution. The following is the relevant excerpt from the bash manual page that I had overlooked:
Any part of the pattern may be quoted to force the quoted portion to be matched as a string. Bracket expressions in regular expressions must be treated carefully, since normal quoting characters lose their meanings between brackets. If the pattern is stored in a shell variable, quoting the variable expansion forces the entire pattern to be matched as a string.
Quoted characters in a regular expression are treated literally, not as regex metacharacters. [[ 'x' =~ '.' ]] therefore asks whether 'x' contains a literal dot, much like the pattern match [[ 'x' == *.* ]], and it does not.
Dropping the quotes works as expected:
$ [[ 'x' =~ . ]] && echo works
works
For this reason, you often use an unquoted parameter expansion to specify a regular expression.
$ regex=. # or regex='.'
$ [[ 'x' =~ $regex ]] && echo works
works
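The effect of quoting the expansion can be checked with a sketch like the following. The pattern and the function names as_regex and as_string are invented for the example.

```shell
#!/usr/bin/env bash
re='^x.z$'   # hypothetical example pattern

as_regex()  { [[ $1 =~ $re ]]; }     # unquoted: ^ . $ keep their regex meaning
as_string() { [[ $1 =~ "$re" ]]; }   # quoted: every character is taken literally

for s in xyz '^x.z$'; do
  if as_regex "$s";  then echo "regex  matches: $s"; fi
  if as_string "$s"; then echo "string matches: $s"; fi
done
```

Here xyz matches only when the expansion is unquoted, while the five-character string ^x.z$ matches only when it is quoted.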
So I have this code
function test(){
local output="ASD[test]"
if [[ "$output" =~ ASD\[(.*?)\] ]]; then
echo "success";
else
echo "fail"
fi;
}
And as you can see it's supposed to echo success since the string matches that regular expression. However this ends up returning fail. What did I do wrong?
The ? in ASD\[(.*?)\] doesn't belong there. It looks like you're trying to apply a non-greedy modifier to the *, which is written *? in Perl-compatible syntax, but Bash doesn't support that: =~ uses POSIX extended regular expressions, where a quantifier applied to another quantifier has no defined meaning. What actually happens depends on the system's regex library. Where the library rejects the pattern, $? after the test is not 1 (the normal "string didn't match" result) but 2, which indicates a syntax error in the regular expression.
If you use the simpler pattern ASD\[(.*)\], then the match will succeed. However, if you use that regex on a string which might have later instances of brackets, too much will get captured by the parentheses. For example:
output=ASD[test1],ASD[test2]
[[ $output =~ ASD\[(.*)\] ]] && echo "first subscript is '${BASH_REMATCH[1]}'"
#=> first subscript is 'test1],ASD[test2'
In languages that support the *? syntax, it makes the matching "non-greedy" so that it will match the smallest string it can that makes the overall match succeed; without the ?, such expressions always match the longest possible instead. Since Bash doesn't have non-greediness, your best bet is probably to use a character class that matches everything except a close bracket, making it impossible for the match to move past the first one:
[[ $output =~ ASD\[([^]]*)\] ]] && echo "first subscript is '${BASH_REMATCH[1]}'"
#=> first subscript is 'test1'
Note that this breaks if there are any nested layers of bracket pairs within the subscript brackets - but then, so does the *? version.
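Building on the negated character class, one way to pull out every subscript rather than just the first is to match, print the capture, and then chop the matched text off the front of the string. This is a sketch with a hypothetical function name, and it carries the same no-nested-brackets assumption.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print each ASD[...] subscript on its own line.
subscripts() {
  local rest=$1 re='ASD\[([^]]*)\]'
  while [[ $rest =~ $re ]]; do
    printf '%s\n' "${BASH_REMATCH[1]}"
    # Drop everything up to and including this match, then look again.
    rest=${rest#*"${BASH_REMATCH[0]}"}
  done
}

subscripts 'ASD[test1],ASD[test2]'
```

The quotes around ${BASH_REMATCH[0]} inside the expansion keep the brackets in the matched text from being read as glob characters.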
I want to automate some task in a shell script. Among the code I need to make a comparison between two names that share the same digit but differ in one letter. I have a bunch of strings:
YC1SM YM1SM YC1SN YM1SN
YC4SM YM4SM YC4SN YM4SN
I need to match between the following:
$a=YC1SM
$b=YM1SM
or
$a=YC4SM
$b=YM4SM
or
$a=YC4SN
$b=YM4SN
I need to have an if clause using regular expression basically to do something like this:
if [$a matches $b]; then
command xxx
fi
How can I do this match within bash?
Edit:
The names are all the same length. They all differ in just one letter. This differing letter occur at the same position in the strings (here, the second character).
Edit2:
Added more scenario
Build a pattern from variable a and match b against the pattern.
a=YC1SM
b=YM1SM
pattern="${a:0:1}?${a:2}"
echo "$pattern"
[[ $b == $pattern ]] && echo match
Y?1SM
match
If the character that differs must be a letter, change ? to [[:alpha:]].
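Wrapped in a function, the same idea might look like the sketch below. The name same_except_second is hypothetical, and the parts taken from a are quoted so that any glob characters they contain are matched literally.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed if b equals a except (possibly) for its
# second character, which must be a letter.
same_except_second() {
  local a=$1 b=$2
  [[ $b == "${a:0:1}"[[:alpha:]]"${a:2}" ]]
}

same_except_second YC1SM YM1SM && echo match
same_except_second YC1SM YM4SM || echo "no match"
```

Note that this also succeeds when the two names are identical; add a [[ $a != "$b" ]] check if that case should be excluded.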
You can have this comparison like this using BASH regex:
a=YC123SM
b=YART123JKL
[[ "$a" =~ ([0-9]+) ]] && n1="${BASH_REMATCH[1]}"
[[ "$b" =~ ([0-9]+) ]] && n2="${BASH_REMATCH[1]}"
[[ "$n1" -eq "$n2" ]] && echo "same" || echo "not same"
same
You don't need a regex. Just use the substring operation like this:
c="${a:0:1}${b:1:1}${a:2}"
if [[ "$c" == "$b" ]]; then
command xxx
fi
The substring operator works like this: ${var:first:length}
So the first line takes the first character of a, then the second character of b, then everything from the third character to the end of a.
In your case this will create a copy of a (called c) that will have all of the letters from a except it will contain the second character from b, which is the only character that you say can be different. Since this character is copied from b to make c, c will now match b if that character was the only difference.
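If the position of the differing character is not known in advance, a character-by-character comparison is a possible generalisation. The sketch below uses a made-up function name and assumes that "match" means equal length and exactly one differing position.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed if a and b have the same length and
# differ in exactly one position.
differ_by_one() {
  local a=$1 b=$2 i diffs=0
  (( ${#a} == ${#b} )) || return 1
  for (( i = 0; i < ${#a}; i++ )); do
    # Quote the right-hand side so it is compared as a string, not a glob.
    if [[ ${a:i:1} != "${b:i:1}" ]]; then
      diffs=$(( diffs + 1 ))
    fi
  done
  (( diffs == 1 ))
}

differ_by_one YC1SM YM1SM && echo match
```

This is slower than the substring approach for long strings, but it does not depend on where the difference occurs.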
so I have this function
function test(){
local output="CMD[hahahhaa]"
if [[ "$output" =~ "/CMD\[.*?\]/" ]]; then
echo "LOOL"
else
echo "$output"
fi;
}
however executing test in command line would output $output instead of "LOOL" despite the fact that the pattern should be matching $output...
what did I do wrong?
Don't use quotes ""
if [[ "$output" =~ ^CMD\[.*\]$ ]]; then
The regex operator =~ expects an unquoted regular expression on its RHS and does only a sub-string match unless the anchors ^ (start of input) and $ (end of input) are also used to make it match the whole of the LHS.
Quotations "" override this behaviour and force a simple string match instead i.e. the matcher starts looking for all these characters \[.*?\] literally.
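The difference between the default substring match and an anchored match can be seen in a short sketch. The function names are hypothetical, and the patterns use plain .* rather than the Perl-style .*? because =~ works with POSIX extended regular expressions.

```shell
#!/usr/bin/env bash

# Hypothetical helpers for illustration.
contains_cmd() { local re='CMD\[.*\]';   [[ $1 =~ $re ]]; }  # match anywhere
is_cmd()       { local re='^CMD\[.*\]$'; [[ $1 =~ $re ]]; }  # match the whole string

for s in 'CMD[hahahhaa]' 'xx CMD[a] yy'; do
  if contains_cmd "$s"; then c=yes; else c=no; fi
  if is_cmd "$s";       then w=yes; else w=no; fi
  printf '%-16s contains=%-3s whole=%s\n' "$s" "$c" "$w"
done
```

Both inputs contain a CMD[...] token, but only the first one passes the anchored test.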