Executing git-hooks on windows errors out - regex

I have written a simple pre-push Git hook that works fine on Linux and Mac but fails on Windows.
Script:
It tries to match the commit message against a regular expression and should return 0 on a match, or exit otherwise.
According to the articles I read, the hook should just work.
Command:
if [[ "$message" =~ "$regular_expression" ]];
Error:
.git/hooks/pre-push: line 6: conditional binary operator expected
.git/hooks/pre-push: line 6: syntax error near `=~'
.git/hooks/pre-push: line 6: ` if [[ "$message" =~ "$regular_expression" ]]; then'
So apparently it seems to be failing on "[[" and "]]".
Now I have also tried removing the double brackets and keep only one.
Command:
if [ "$message" =~ "$regular_expression" ];
Error:
.git/hooks/pre-push: line 6: [: =~: binary operator expected
This message is flawed: TRY-1 Sample
Does anybody know how to solve this issue ?

The =~ operator in bash conditional expressions is not supported by the version of bash shipped with Git for Windows. The operator was introduced in bash 3.0, and although Git for Windows uses bash 3.1, it appears to be missing there.
Possibly $(echo $message | grep "$regexp") will work as a substitute, e.g.:
$ bash -c '[[ "hello" =~ "^h" ]]'
bash: -c: line 0: conditional binary operator expected
bash: -c: line 0: syntax error near `=~'
bash: -c: line 0: `[[ "hello" =~ "^h" ]]'
$ bash -c '[ $(echo hello | grep "^h") ] && echo matched || echo nomatch'
matched
Update
Here is an example script that works to match something similar using the Git for Windows bash:
#!/bin/bash
#
# grep returns 0 when it matches something, 1 when it fails to match
msg='TEST-111 Sample'
re='([A-Z]{2,8}-[0-9]{1,4}[[:space:]])+[A-Za-z0-9]+[[:space:]]*[A-Za-z0-9]+$'
rx='^([A-Z]{2,8}-[0-9]{1,4})[[:space:]][[:alnum:]]+$'
echo "$msg" | grep -qE "$rx"
[ $? = 0 ] && echo matched || echo nomatch
This script prints matched for the sample phrase using the second regular expression. It's not really clear what the original expression is trying to match; it looks like multiple words, so I'm not sure why you don't just match .*$. Still, this shows a way to try out the regexp. Note: the pattern uses extended regular expression syntax (unescaped parentheses, +, and {2,8} intervals), so we have to use grep -E. We also have to take some care with quoting, because $ appears in the regexp.
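For use inside a hook, the grep-based check can be wrapped in a small function. This is only a sketch, not the asker's actual hook: the name message_is_valid is made up for illustration, and the pattern is the rx expression from the script above.

```shell
#!/bin/sh
# Sketch only: validate a commit message with grep -E instead of [[ =~ ]].
# message_is_valid is a hypothetical helper name; rx is the pattern from the answer.
rx='^([A-Z]{2,8}-[0-9]{1,4})[[:space:]][[:alnum:]]+$'

message_is_valid() {
    # grep -q prints nothing; its exit status is 0 on a match, 1 otherwise.
    printf '%s\n' "$1" | grep -qE "$rx"
}

if message_is_valid 'TEST-111 Sample'; then
    echo matched
else
    echo nomatch
fi
```

Because the function only relies on grep and printf, it should behave the same whether or not the shell supports =~.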

Related

how to construct regex to compare case insensitive strings in shell script?

I am passing command line arguments to a shell script, where they are compared against a regular expression. The following code is case-sensitive:
[[ $1 =~ ^(cat)|(dog)$ ]] && echo "match" || echo "no match"
How can I modify this regex so that it ignores case? I want to be able to pass cAt and have it match.
I want to use the /i regex flag, as it ignores case. But how do I use it inside a shell script? I have tried [[ $1 =~ /(cat)|(dog)/i ]] but the script exited with a syntax error.
StackOverflow has a similar question but it does not answer my inquiry. I want to use test to compare both strings and am not interested in using shopt -s nocasematch or grep <expression>
just use
shopt -s nocasematch
before your command.
alternatively
shopt -s nocasematch && [[ 'doG' =~ (cat)|(dog) ]] && echo 'hi' || echo 'no match'
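If the concern with shopt is that it changes matching for the rest of the script, one option is to confine it to a subshell. A minimal sketch, assuming bash; matches_ignoring_case is a made-up name. Note also that ^(cat)|(dog)$ means "starts with cat, or ends with dog"; grouping the alternation as ^(cat|dog)$ applies both anchors to both words.

```shell
#!/bin/bash
# Sketch only: keep nocasematch from leaking into the rest of the script.
# matches_ignoring_case is a hypothetical helper name.
matches_ignoring_case() {
    local re=$2
    (
        # The subshell confines the option change; the subshell's exit
        # status is the status of the [[ ]] test.
        shopt -s nocasematch
        [[ $1 =~ $re ]]
    )
}

# Grouping the alternation keeps both anchors attached to both words.
matches_ignoring_case 'cAt' '^(cat|dog)$' && echo "match" || echo "no match"
```

After the call, shopt -q nocasematch still reports the option as off in the calling shell.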

Changing the bash =~ operator to a sh compatible argument

I am running a bash script on CoreOS, so I do not have /bin/bash, but I do have /bin/sh. sh has been fine until I started using someone else's bash script, which contains the following line:
if [[ "$file" =~ ^https?:// ]]; then
and my OS complained with sh: =~: unknown operand. I assume this means that the =~ operator is not compatible with sh, but there has to be some other way to do this. From looking on SO I discovered that =~ is some type of regular expression operator. My question is: can I replace =~ with something? Note: I have grep on my machine.
I have grep on my machine
Going by the above line, you could write a simple conditional using an if statement as
if echo "$file" | grep -Eq "^https?://"; then
printf 'regex matches\n'
fi
grep -E matches using ERE (Extended Regular Expressions), which any POSIX-compliant grep supports. The -q option suppresses normal output, so only the exit status tells you whether the match succeeded.
If your grep does not accept -E, fall back to basic regular expressions. In BRE a bare ? is a literal character; GNU grep accepts \? as the optional quantifier (a GNU extension, not POSIX BRE):
if echo "$file" | grep -q "^https\?://"; then
You could rewrite this as a case command:
case "$file" in
http://* | https://* )
# ...
;;
esac
You can call out to expr, which does basic regex matching (expr prints the length of the match, so discard its output):
if expr match "$file" 'https\?://' >/dev/null; then
The pattern is implicitly anchored to the start of the string.
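To make the case approach reusable from an if statement, it can be wrapped in a function. A minimal sketch for plain sh; is_http_url is a made-up name and the sample URL is a placeholder.

```shell
#!/bin/sh
# Sketch only: POSIX sh test for an http:// or https:// prefix, no regex needed.
# is_http_url is a hypothetical helper name.
is_http_url() {
    case $1 in
        http://*|https://*) return 0 ;;
        *)                  return 1 ;;
    esac
}

file='https://example.com/archive.tar.gz'   # made-up sample value
if is_http_url "$file"; then
    printf 'regex matches\n'
fi
```

This uses glob patterns rather than a regex, which is enough here because the test is a fixed prefix.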

retrieve a word after a regular expression in shell script

I am trying to retrieve specific fields from a text file which has a metadata as follows:
project=XYZ; cell=ABC; strain=C3H; sex=F; age=PQR; treatment=None; id=MLN
And I have the following script for retrieving the field 'cell'
while read -r line
do
cell="$(echo "$line" | cut -d";" -f7 )"
echo "$cell"
done < files.txt
However, the script above retrieves the whole field as cell=ABC, whereas I just want the value 'ABC' from the field. How do I retrieve the value after the regex, in the same line of code?
If extracting one value (or, generally, a non-repeating set of values captured by distinct capture groups) is enough and you're running bash, ksh, or zsh, consider using the regex-matching operator, =~: [[ string =~ regex ]]:
Tip of the hat to @Adrian Frühwirth for the gist of the ksh and zsh solutions.
Sample input string:
string='project=XYZ; cell=ABC; strain=C3H; sex=F; age=PQR; treatment=None; id=MLN'
Shell-specific use of =~ is discussed next; a multi-shell implementation of the =~ functionality via a shell function can be found at the end.
bash
The special BASH_REMATCH array variable receives the results of the matching operation: element 0 contains the entire match, element 1 the first capture group's (parenthesized subexpression's) match, and so on.
bash 3.2+:
[[ $string =~ \ cell=([^;]+) ]] && cell=${BASH_REMATCH[1]} # -> $cell == 'ABC'
bash 4.x:
While the specific command above works, using regex literals in bash 4.x is buggy, notably when involving word-boundary assertions \< and \> on Linux; e.g., [[ a =~ \<a ]] inexplicably doesn't match. Workaround: use an intermediate variable (unquoted!): re='\<a'; [[ a =~ $re ]] works (also on bash 3.2+).
bash 3.0 and 3.1 - or after setting shopt -s compat31:
Quote the regex to make it work:
[[ $string =~ ' cell=([^;]+)' ]] && cell=${BASH_REMATCH[1]} # -> $cell == 'ABC'
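The same mechanism extends to several capture groups at once. A minimal sketch, assuming bash 3.2 or later and the sample string above; keeping the regex in a variable (here re) and expanding it unquoted avoids the version-specific quoting rules.

```shell
#!/bin/bash
# Sketch only: capture two fields with one match (bash 3.2+ semantics).
string='project=XYZ; cell=ABC; strain=C3H; sex=F; age=PQR; treatment=None; id=MLN'

# The regex lives in a variable and is expanded unquoted in [[ ]].
re=' cell=([^;]+); strain=([^;]+)'

if [[ $string =~ $re ]]; then
    cell=${BASH_REMATCH[1]}     # first capture group
    strain=${BASH_REMATCH[2]}   # second capture group
    echo "cell=$cell strain=$strain"
fi
```

This relies on the two fields being adjacent and in that order, as they are in the sample line.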
ksh
The ksh syntax is the same as in bash, except:
the name of the special array variable that contains the matched strings is .sh.match (you must enclose the name in {...} even when just implicitly referring to the first element with ${.sh.match}):
[[ $string =~ \ cell=([^;]+) ]] && cell=${.sh.match[1]} # -> $cell == 'ABC'
zsh
The zsh syntax is also similar to bash, except:
The regex literal must be quoted - for simplicity as a whole, or at least some shell metacharacters, such as ;.
you may, but needn't double-quote a regex provided as a variable value.
Note how this quoting behavior differs fundamentally from that of bash 3.2+: zsh requires quoting only for syntax reasons and always treats the resulting string as a whole as a regex, whether it or parts of it were quoted or not.
There are 2 variables containing the matching results:
$MATCH contains the entire matched string
array variable $match contains only the matches for the capture groups (note that zsh arrays start with index 1 and that you don't need to enclose the variable name in {...} to reference array elements)
[[ $string =~ ' cell=([^;]+)' ]] && cell=$match[1] # -> $cell == 'ABC'
Multi-shell implementation of the =~ operator as shell function reMatch
The following shell function abstracts away the differences between bash, ksh, zsh with respect to the =~ operator; the matches are returned in array variable ${reMatch[@]}.
As @Adrian Frühwirth notes, to write portable (across zsh, ksh, bash) code with this, you need to execute setopt KSH_ARRAYS in zsh so as to make its arrays start with index 0; as a side effect, you also have to use the ${...[]} syntax when referencing arrays, as in ksh and bash.
Applied to our example we'd get:
# zsh: make arrays behave like in ksh/bash: start at *0*
[[ -n $ZSH_VERSION ]] && setopt KSH_ARRAYS
reMatch "$string" ' cell=([^;]+)' && cell=${reMatch[1]}
Shell function:
# SYNOPSIS
# reMatch string regex
# DESCRIPTION
# Multi-shell implementation of the =~ regex-matching operator;
# works in: bash, ksh, zsh
#
# Matches STRING against REGEX and returns exit code 0 if they match.
# Additionally, the matched string(s) is returned in array variable ${reMatch[@]},
# which works the same as bash's ${BASH_REMATCH[@]} variable: the overall
# match is stored in the 1st element of ${reMatch[@]}, with matches for
# capture groups (parenthesized subexpressions), if any, stored in the remaining
# array elements.
# NOTE: zsh arrays by default start with index *1*.
# EXAMPLE:
# reMatch 'This AND that.' '^(.+) AND (.+)\.' # -> ${reMatch[@]} == ('This AND that.', 'This', 'that')
function reMatch {
typeset ec
unset -v reMatch # initialize output variable
[[ $1 =~ $2 ]] # perform the regex test
ec=$? # save exit code
if [[ $ec -eq 0 ]]; then # copy result to output variable
[[ -n $BASH_VERSION ]] && reMatch=( "${BASH_REMATCH[@]}" )
[[ -n $KSH_VERSION ]] && reMatch=( "${.sh.match[@]}" )
[[ -n $ZSH_VERSION ]] && reMatch=( "$MATCH" "${match[@]}" )
fi
return $ec
}
Note:
function reMatch (as opposed to reMatch()) is used to declare the function, which is required for ksh to truly create local variables with typeset.
I would not use cut, since you cannot specify more than one delimiter.
If your grep supports PCRE, then you can do:
$ string='project=XYZ; cell=ABC; strain=C3H; sex=F; age=PQR; treatment=None; id=MLN'
$ grep -oP '(?<=cell=)[^;]+' <<< "$string"
ABC
You can use sed, which in simple terms can be done as -
$ sed -r 's/.*cell=([^;]+).*/\1/' <<< "$string"
ABC
Another option is to use awk. With that you can do the following by specifying list of delimiters you want to consider as field separators:
$ awk -F'[;= ]' '{print $5}' <<< "$string"
ABC
You can certainly put more checks by iterating over the line so that you don't have to hard-code to print 5th field.
Note that if your shell does not support here-string notation <<< then you can echo the variable and pipe it to the command.
$ echo "$string" | cmd
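To avoid hard-coding the field number, awk can iterate over the fields and look the key up by name. A minimal sketch, assuming fields are separated by a semicolon and optional spaces as in the sample line; get_field is a made-up name.

```shell
#!/bin/sh
# Sketch only: look a key up by name instead of hard-coding the field number.
# get_field is a hypothetical helper name.
get_field() {
    # $1 = key to find, $2 = metadata line
    printf '%s\n' "$2" | awk -v key="$1" -F'; *' '{
        for (i = 1; i <= NF; i++) {
            eq = index($i, "=")              # position of the first "="
            if (eq > 0 && substr($i, 1, eq - 1) == key) {
                print substr($i, eq + 1)     # everything after the "="
                exit
            }
        }
    }'
}

string='project=XYZ; cell=ABC; strain=C3H; sex=F; age=PQR; treatment=None; id=MLN'
get_field cell "$string"
```

With the sample line this prints ABC; a key that is not present prints nothing.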
Here's a native shell solution:
$ string='project=XYZ; cell=ABC; strain=C3H; sex=F; age=PQR; treatment=None; id=MLN'
$ cell=${string#*cell=}
$ cell=${cell%%;*}
$ echo "${cell}"
ABC
This removes the shortest leading match up to and including cell= from the string, then removes the longest trailing match starting at the first ;, leaving you with ABC.
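The two expansions can be wrapped in a function that takes the key as an argument. A minimal sketch for plain sh, assuming the "; " separator used in the sample line; get_value is a made-up name, and since POSIX sh has no local variables, line and value are globals here.

```shell
#!/bin/sh
# Sketch only: parameter-expansion lookup wrapped in a function.
# get_value is a hypothetical helper name; line and value are global variables.
get_value() {
    # $1 = key, $2 = metadata line. A leading "; " is prepended so that
    # every key, including the first, is preceded by the same separator.
    line="; $2"
    case $line in
        *"; $1="*) ;;              # key present, carry on
        *) return 1 ;;             # key absent
    esac
    value=${line#*"; $1="}         # drop everything up to and including "; key="
    printf '%s\n' "${value%%;*}"   # drop from the first ";" to the end
}

string='project=XYZ; cell=ABC; strain=C3H; sex=F; age=PQR; treatment=None; id=MLN'
get_value cell "$string"
```

Matching on "; key=" rather than just "key=" keeps a key such as cell from matching inside a longer key name.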
Here's another solution which uses read to split the strings:
$ cat t.sh
#!/bin/bash
while IFS=$'; \t' read -ra attributes; do
for foo in "${attributes[@]}"; do
IFS='=' read -r key value <<< "${foo}"
[ "${key}" = cell ] && echo "${value}"
done
done <<EOF
foo=X; cell=ABC; quux=Z;
foo=X; cell=DEF; quux=Z;
EOF
$ ./t.sh
ABC
DEF
For solutions using external tools see @jaypal's excellent answer.

Regex and if in shell script

My program starts some services and stores their output in the variable tmp. I want to check whether the variable's content starts with the keyword FATAL and, if it does, print Port in use with echo.
For example, tmp might contain FATAL: Exception in startup, exiting.
I can do it with sed: echo $tmp | sed 's/^FATAL.*/"Port in use"/'
but I want to use the builtin if to match the pattern.
How can I use the shell's built-in features to match a regex?
POSIX shell doesn't have a regular expression operator for UNIX ERE or PCRE. But it does have the case keyword:
case "$tmp" in
FATAL*) doSomethingDrastic;;
*) doSomethingNormal;;
esac
You didn't tag the question bash, but if you do have that shell you can do some other kinds of pattern matching or even ERE:
if [[ "$tmp" = FATAL* ]]; then
…
fi
or
if [[ $tmp =~ ^FATAL ]]; then
…
fi
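Applied to the question, the case form might look like the sketch below. It runs in plain sh; describe_startup and the "Started" message are made up for illustration, while "Port in use" and the sample text come from the question.

```shell
#!/bin/sh
# Sketch only: react to a leading FATAL with the POSIX case construct.
# describe_startup and the "Started" message are hypothetical.
describe_startup() {
    case $1 in
        FATAL*) echo "Port in use" ;;
        *)      echo "Started" ;;
    esac
}

tmp='FATAL: Exception in startup, exiting.'
describe_startup "$tmp"
```

The pattern FATAL* is a glob, so it only matches when FATAL is at the very start of the string.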
if [ -z "${tmp%%FATAL*}" ]
then echo "Start with"
else
echo "Does not start with"
fi
This works in ksh and bash under AIX; I think it is also fine under Linux.
It's not a real regex but the limited pattern syntax the shell uses for file name matching (built into the shell, unlike sed, grep, and so on, which have their own regex engines). So * and ? can be used.

bash: assign grep regex results to array

I am trying to assign a regular expression result to an array inside of a bash script but I am unsure whether that's possible, or if I'm doing it entirely wrong. The below is what I want to happen, however I know my syntax is incorrect:
indexes[4]=$(echo b5f1e7bfc2439c621353d1ce0629fb8b | grep -o '[a-f0-9]\{8\}')
such that:
index[1]=b5f1e7bf
index[2]=c2439c62
index[3]=1353d1ce
index[4]=0629fb8b
Any links, or advice, would be wonderful :)
here
array=( $(echo b5f1e7bfc2439c621353d1ce0629fb8b | grep -o '[a-f0-9]\{8\}') )
$ echo ${array[@]}
b5f1e7bf c2439c62 1353d1ce 0629fb8b
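A variant that does not depend on word splitting of an unquoted command substitution is to read grep's output line by line. A minimal sketch, assuming bash 4.0 or later (for mapfile) and a grep that supports -o; note that bash arrays start at index 0, not 1 as in the question's wish list.

```shell
#!/bin/bash
# Sketch only: fill the array one line at a time with mapfile (bash 4.0+).
# grep -o prints each 8-character match on its own line.
mapfile -t array < <(printf '%s\n' b5f1e7bfc2439c621353d1ce0629fb8b | grep -oE '[a-f0-9]{8}')

# Indexes run from 0 to 3.
printf '%s\n' "${array[@]}"
```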
#!/bin/bash
# Bash >= 3.2
hexstring="b5f1e7bfc2439c621353d1ce0629fb8b"
# build a regex to get four groups of eight hex digits
for i in {1..4}
do
regex+='([[:xdigit:]]{8})'
done
[[ $hexstring =~ $regex ]] # match the regex
array=("${BASH_REMATCH[@]}") # copy the match array which is readonly
unset 'array[0]' # so we can eliminate the full match and only use the parenthesized captured matches
for i in "${array[@]}"
do
echo "$i"
done
here's a pure bash way, no external commands needed
#!/bin/bash
declare -a array
s="b5f1e7bfc2439c621353d1ce0629fb8b"
for ((i=0; i<${#s}; i+=8))
do
array+=("${s:i:8}")
done
echo "${array[@]}"
output
$ ./shell.sh
b5f1e7bf c2439c62 1353d1ce 0629fb8b