How to check whether string matches that of a domain - regex

I have written a piece of code to test whether a string matches a domain like this:
host=$1
if [[ $host =~ ^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\\.)+[A-Za-z]{2,6}$ ]] ; then
echo "it is a domain!"
fi
With help from this website but for some reason, the above is not working.
Do you have any idea why?

Bash regex doesn't have lookaround, you can use Perl Regex with grep:
#!/bin/bash
if grep -oP '^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\.)+[A-Za-z]{2,6}$' <<< "$1" >/dev/null 2>&1;then
echo valid
else
echo invalid
fi

Related

Regex operator and grep -E fail

I found the perfect regex for my needs here: Regex for no duplicate characters from a limited character pool Live demo Here
But when I test it with bash regex operator it always fails:
if [[ 'ABC' =~ ^(?!.*(.).*\1)[ABC]+$ ]]; then
echo "success"
else
echo "fail"
fi
I also tried it with grep:
echo "ABC" | grep -E "^(?!.*(.).*\1)[ABC]+$"
But I got "grep: Invalid back reference"
You should use -P of grep :
echo "ABC" | grep -P '^(?!.*(.).*\1)[ABC]+$'
There is no lookaround support in POSIX ERE, so you need to introduce a second condition:
s='ABCC'
rx1='^[ABC]+$'
rx2='(.).*\1'
if [[ "$s" =~ $rx1 && ! "$s" =~ $rx2 ]]; then
echo "success"
else
echo "fail"
fi
See the Bash online demo.
Details:
"$s" =~ ^[ABC]+$ - checks that the whole s string consists of one or more A, B or C chars
&& ! "$s" =~ (.).*\1 - and another condition requires the s string to have no repeating character.

Extract integers from string with bash

From a variable how to extract integers that will be in format *\d+.\d+.\d+* (4.12.3123) using bash.
filename="xzxzxzxz4.12.3123fsfsfsfsfsfs"
I have tried:
filename="xzxzxzxz4.12.3123fsfsfsfsfsfs"
if [[ "$filename" =~ (.*)(\d+.\d+.\d+)(.*) ]]; then
echo ${BASH_REMATCH}
echo ${BASH_REMATCH[1]}
echo ${BASH_REMATCH[2]}
echo ${BASH_REMATCH[3]}
else
echo 'nej'
fi
which does not work.
The easiest way to work with regexes in Bash, in terms of consistency between Bash versions and escaping, is to put the regex into a single-quoted variable and then use it unquoted, as below:
re='[0-9]+\.[0-9]+\.[0-9]+'
[[ $filename =~ $re ]] && printf '%s\n' "${BASH_REMATCH[#]}"
The main issue with your approach were that you were using the "Perl-style" \d, so in fact you could make your code work with:
if [[ "$filename" =~ (.*)([0-9]+\.[0-9]+\.[0-9]+)(.*) ]]; then
echo "${BASH_REMATCH[2]}"
fi
But this unnecessarily creates 3 capture groups, when you don't even need one. Note that I also changed . (any character) to \. (a literal .).
one way to extract:
grep -oP '\d\.\d+\.\d+' <<<$xfilename
There is one more way
$ filename="xzxzxzxz4.12.3123fsfsfsfsfsfs"
$ awk '{ if (match($0, /[0-9].[0-9]+.[0-9]+/, m)) print m[0] }' <<< "$filename"
4.12.3123

how to match regex in bash script with for loop?

I'm trying to match multiple strings from output of a command and do something for each one of them.
#!/usr/bin/env bash
echo 'Howdy, can you please give me the domain (without www)?'
read domain
routes=$(flynn -a shop-app route | grep $domain)
# echo $routes | egrep "http\/\S+"
pattern="http\/[^ ]+"
for word in $routes
do
[[ $word =~ $pattern ]]
if ${BASH_REMATCH[0]}
then
match="${BASH_REMATCH[0]}"
sed -i s/DOMAIN/$domain/g $domain.sh
sed -i s:ROUTE1:$match:g $domain.sh
fi
if ${BASH_REMATCH[1]}
then
match2="${BASH_REMATCH[1]}"
sed -i s:ROUTE2:$match2:g $domain.sh
fi
done
echo $match
update: the regex part works now but the loop is not working. I know the loop will find two match and want to do something with each one
the sample text:
http:www.lipi.ir shop-app-web http/d49ced12-c6ca-46a0-b919-6d97b6580ad3 false false /
http:lipi.ir shop-app-web http/ff919e9d-9bf7-4342-a4b3-ea184c698959 false false /

Regex - validate IPv6 shell script

I am able to validate IPv6 addresses using java with following regex:
([0-9a-fA-F]{0,4}:){1,7}([0-9a-fA-F]){0,4}
But I need to do this in shell script to which I am new.
This regex doesn't seem to work in shell. Have tried some other combinations also but nothing helped.
#!/bin/bash
regex="([0-9a-fA-F]{0,4}:){1,7}([0-9a-fA-F]){0,4}"
var="$1"
if [[ "$var" =~ "$regex" ]]
then
echo "matches"
else
echo "doesn't match!"
fi
It gives output doesn't match! for 2001:0Db8:85a3:0000:0000:8a2e:0370:7334
How can I write this in shell script?
Java regex shown in question would work in bash as well but make sure to not to use quoted regex variable. If the variable or string on the right hand side of =~ operator is quoted, then it is treated as a string literal instead of regex.
I also recommend using anchors in regex. Otherwise it will print matches for invalid input as: 2001:0db8:85a3:0000:0000:8a2e:0370:7334:foo:bar:baz.
Following script should work for you:
#!/bin/bash
regex='^([0-9a-fA-F]{0,4}:){1,7}[0-9a-fA-F]{0,4}$'
var="$1"
if [[ $var =~ $regex ]]; then
echo "matches"
else
echo "doesn't match!"
fi
[[ and =~ won't work with sh, and awk almost works everywhere.
Here is what I did
saved as ./check-ipv6.sh, chmod +x ./check-ipv6.sh
#!/bin/sh
regex='^([0-9a-fA-F]{0,4}:){1,7}[0-9a-fA-F]{0,4}$'
echo -n "$1" | awk '$0 !~ /'"$regex"'/{print "not an ipv6=>"$0;exit 1}'
Or you prefer bash than sh
#!/bin/bash
regex='^([0-9a-fA-F]{0,4}:){1,7}[0-9a-fA-F]{0,4}$'
awk '$0 !~ /'"$regex"'/{print "not an ipv6=>"$0;exit 1}' <<< "$1"
Test
~$ ./check-ipv6.sh 2001:0Db8:85a3:0000:0000:8a2e:0370:7334x
not an ipv6=>2001:0Db8:85a3:0000:0000:8a2e:0370:7334x
~$ echo $?
1
~$ ./check-ipv6.sh 2001:0Db8:85a3:0000:0000:8a2e:0370:7334
~$ echo $?
0

bash if [[ =~ regex compare not working?

I have a value in a variable that may be absolute or relative url, and I need to check which one it is.
I have found that there's a =~ operator in [[, but I can't get it to work. What am I doing wrong?
url="http://test"
if [[ "$url" =~ "^http://" ]];
then echo "absolute.";
fi;
You need to use regex without quote:
url="http://test"
if [[ "$url" =~ ^http:// ]]; then
echo "absolute."
fi
This outputs `absolute. as regex needs to be without quote in newer BASH (after BASH v3.1)
Or avoid regex and use glob matching:
if [[ "$url" == "http://"* ]]; then
echo "absolute."
fi