BASH script to extract UUID/PARTUUID from Variable - regex

I'm trying to develop a bash setup script that includes mounting and migrating a boot drive. I've got most of it working, but would like to populate my /boot/cmdline.txt and fstab files with drive UUID and PARTUUID numbers.
I basically set a variable with the output of blkid:
disk=$(blkid)
echo "${disk}"
RESULT:
/dev/mmcblk0p1: LABEL_FATBOOT="boot" LABEL="boot" UUID="69D5-9B27" TYPE="vfat" PARTUUID="d9b3f436-01"
/dev/mmcblk0p2: LABEL="rootfs" UUID="24eaa08b-10f2-49e0-8283-359f7eb1a0b6" TYPE="ext4" PARTUUID="d9b3f436-02"
/dev/sda1: LABEL="usbfs" UUID="493b6467-7b7b-4291-a86d-dea5e842780b" TYPE="ext4" PARTUUID="83122dbb-cacf-4612-9be2-4301a03e8093"
/dev/mmcblk0: PTUUID="d9b3f436" PTTYPE="dos"
My goal is to set one variable to capture the /dev/sda1 value for UUID and the other for the same drives PARTUUID. My basic premise is to do something like this (based on being able to do this in python:
#sudo code#
Disk=diskInfo
While line in Disk; do
If Line contains /dev/sda1
Then
Do some Regex to set vUUID = "493b6467-7b7b-4291-a86d-dea5e842780b"
Do some Regex to set vPARTUUID = "83122dbb-cacf-4612-9be2-4301a03e8093"
I think I want something like this - - but can't get it to work:
disk=$(blkid)
while read line; do
if [[ $line == '/dev/sda1'* ]]; then
if [[ $line =~ UUID=(["'])(?:(?=(\\?))\2.)*?\1 ]]; then #captures too much
vUUID=${BASH_REMATCH[1]}
fi
if [[ $line =~ PARTUUID=(["'])(?:(?=(\\?))\2.)*?\1 ]]; then #captures too much
vPARTUUID=${BASH_REMATCH[1]}
fi
fi
done <<< "$disk"

You don't need a loop here.
$ IFS=\" read -r _ vUUID _ vPARTUUID _ < <(blkid /dev/sda1 -s UUID -s PARTUUID)
$
$ echo $vUUID
9099-AD46
$
$ echo $vPARTUUID
90afc43c-5b4d-4721-b82a-000e585fef62
If there is no such disk read will silently fail with a non-zero exit status; so you can use it as a condition in an if-else expression.

You are almost there.
Would you please try:
pat='^/dev/sda1.* UUID="([^"]+)".* PARTUUID="([^"]+)"'
while IFS= read -r line; do
if [[ $line =~ $pat ]]; then
vUUID="${BASH_REMATCH[1]}"
vPARTUUID="${BASH_REMATCH[2]}"
fi
done < <(blkid)
result:
echo "vUUID=$vUUID"
vUUID=493b6467-7b7b-4291-a86d-dea5e842780b
echo "vPARTUUOID=$vPARTUUID"
vPARTUUOID=83122dbb-cacf-4612-9be2-4301a03e8093
Hope this helps.

Related

Check if a string contains valid pattern in Bash

I have a file a.txt contains a string like:
Axxx-Bxxxx
Rules for checking if it is valid or not include:
length is 10 characters.
x here is digits only.
Then, I try to check with:
#!/bin/bash
exp_len=10;
file=a.txt;
msg="checking string";
tmp="File not exist";
echo $msg;
if[ -f $file];then
tmp=$(cat $file);
if[[${#tmp} != $exp_len ]];then
msg="invalid length";
elif [[ $tmp =~ ^[A[0-9]{3}-B[0-9]{4}]$]];then
msg="valid";
else
msg="invalid";
fi
else
msg="file not exist";
fi
echo $msg;
But in valid case it doesn't work...
Is there someone help to correct me?
Thanks :)
Other than the regex fix, your code can be refactored as well, moreover there are syntax issues as well. Consider this code:
file="a.txt"
msg="checking string"
tmp="File not exist"
echo "$msg"
if [[ -f $file ]]; then
s="$(<$file)"
if [[ $s =~ ^A[0-9]{3}-B[0-9]{4}$ ]]; then
msg="valid"
else
msg="invalid"
fi
else
msg="file not exist"
fi
echo "$msg"
Changes are:
Remove unnecessary cat
Use [[ ... ]] when using bash
Spaces inside [[ ... ]] are required (your code was missing them)
There is no need to check length of 10 as regex will make sure that part as well
As mentioned in comments earlier correct regex should be ^A[0-9]{3}-B[0-9]{4}$ or ^A[[:digit:]]{3}-B[[:digit:]]{4}$
Note that a regex like ^[A[0-9]{3}-B[0-9]{4}]$ matches
^ - start of string
[A[0-9]{3} - three occurrences of A, [ or a digit
-B - a -B string
[0-9]{4} - four digits
] - a ] char
$ - end of string.
So, it matches strings like [A[-B1234], [[[-B1939], etc.
Your regex checking line must look like
if [[ $tmp =~ ^A[0-9]{3}-B[0-9]{4}$ ]];then
See the online demo:
#!/bin/bash
tmp="A123-B1234";
if [[ $tmp =~ ^A[0-9]{3}-B[0-9]{4}$ ]];then
msg="valid";
else
msg="invalid";
fi
echo $msg;
Output:
valid
Using just grep might be easier:
$ echo A123-B1234 > valid.txt
$ echo 123 > invalid.txt
$ grep -Pq 'A\d{3}-B\d{4}' valid.txt && echo valid || echo invalid
valid
$ grep -Pq 'A\d{3}-B\d{4}' invalid.txt && echo valid || echo invalid
invalid
With your shown samples and attempts, please try following code also.
#!/bin/bash
exp_len=10;
file=a.txt;
msg="checking string";
tmp="File not exist";
if [[ -f "$file" ]]
then
echo "File named $file is existing.."
awk '/^A[0-9]{3}-B[0-9]{4}$/{print "valid";next} {print "invalid"}' "$file"
else
echo "Please do check File named $file is not existing, exiting from script now..."
exit 1;
fi
OR In case you want to check if line in your Input_file should be 10 characters long(by seeing OP's attempted code's exp_len shell variable) then try following code, where an additional condition is also added in awk code.
#!/bin/bash
exp_len=10;
file=a.txt;
msg="checking string";
tmp="File not exist";
if [[ -f "$file" ]]
then
echo "File named $file is existing.."
awk -v len="$exp_len" 'length($0) == len && /^A[0-9]{3}-B[0-9]{4}$/{print "valid";next} {print "invalid"}' "$file"
else
echo "Please do check File named $file is not existing, exiting from script now..."
exit 1;
fi
NOTE: I am using here -f flag to test if file is existing or not, you can change it to -s eg: -s "$file" in case you want to check file is present and is of NOT NULL size.

Using regular expressions in a ksh Script

I have a file (file.txt) that contains some text like:
000000000+000+0+00
000000001+000+0+00
000000002+000+0+00
and I am trying to check each line to make sure that it follows the format:
character*9, "+", character*3, "+", etc
so far I have:
#!/bin/ksh
file=file.txt
line_number=1
for line in $(cat $file)
do
if [[ "$line" != "[[.]]{9}+[[.]]{3}+[[.]]{1}+[[.]]{2} ]" ]]
then
echo "Invalid number ($line) check line $line_number"
exit 1
fi
let "line_number++"
done
however this does not evaluate correctly, no matter what I put in the lines the program terminates.
When you want line numbers of the mismatches, you can use grep -vn. Be careful with writing a correct regular expression, and you will have
grep -Evn "^.{9}[+].{3}[+].[+].{2}$" file.txt
This is not in the layout that you want, so change the layout with sed:
grep -Evn "^.{9}[+].{3}[+].[+].{2}$" file.txt |
sed -r 's/([^:]*):(.*)/Invalid number (\2) check line number \1./'
EDIT:
I changed .{1} into ..
The sed is also over the top. When you need spme explanation, you can start with echo "Linenr:Invalid line"
I'm having funny results putting the regex in the condition directly:
$ line='000000000+000+0+00'
$ [[ $line =~ ^.{9}\+.{3}\+.\+..$ ]] && echo ok
ksh: syntax error: `~(E)^.{9}\+.{3}\+.\+..$ ]] && echo ok
' unexpected
But if I save the regex in a variable:
$ re="^.{9}\+.{3}\+.\+..$"
$ [[ $line =~ $re ]] && echo ok
ok
So you can do
#!/bin/ksh
file=file.txt
line_number=1
re="^.{9}\+.{3}\+.\+..$"
while IFS= read -r line; do
if [[ ! $line =~ $re ]]; then
echo "Invalid number ($line) check line $line_number"
exit 1
fi
let "line_number++"
done < "$file"
You can also use a plain glob pattern:
if [[ $line != ?????????+???+?+?? ]]; then echo error; fi
ksh glob patterns have some regex-like syntax. If there's an optional space in there, you can handle that with the ?(sub-pattern) syntax
pattern="?????????+???+?( )?+??"
line1="000000000+000+0+00"
line2="000000000+000+ 0+00"
[[ $line1 == $pattern ]] && echo match || echo no match # => match
[[ $line2 == $pattern ]] && echo match || echo no match # => match
Read the "File Name Generation" section of the ksh man page.
Your regex looks bad - using sites like https://regex101.com/ is very helpful. From your description, I suspect it should look more like one of these;
^.{9}\+.{3}\+.{1}\+.{2}$
^[^\+]{9}\+[^\+]{3}\+[^\+]{1}\+[^\+]{2}$
^[0-9]{9}\+[0-9]{3}\+[0-9]{1}\+[0-9]{2}$
From the ksh manpage section on [[ - you would probably want to be using =~.
string =~ ere
True if string matches the pattern ~(E)ere where ere is an extended regular expression.
Note: As far as I know, ksh regex doesn't follow the normal syntax
You may have better luck with using grep:
# X="000000000+000+0+00"
# grep -qE "^[^\+]{9}\+[^\+]{3}\+[^\+]{1}\+[^\+]{2}$" <<<"${X}" && echo true
true
Or:
if grep -qE "^[^\+]{9}\+[^\+]{3}\+[^\+]{1}\+[^\+]{2}$" <<<"${line}"
then
exit 1
fi
You may also prefer to use a construct like below for handling files:
while read line; do
echo "${line}";
done < "${file}"

how to match regex in bash script with for loop?

I'm trying to match multiple strings from output of a command and do something for each one of them.
#!/usr/bin/env bash
echo 'Howdy, can you please give me the domain (without www)?'
read domain
routes=$(flynn -a shop-app route | grep $domain)
# echo $routes | egrep "http\/\S+"
pattern="http\/[^ ]+"
for word in $routes
do
[[ $word =~ $pattern ]]
if ${BASH_REMATCH[0]}
then
match="${BASH_REMATCH[0]}"
sed -i s/DOMAIN/$domain/g $domain.sh
sed -i s:ROUTE1:$match:g $domain.sh
fi
if ${BASH_REMATCH[1]}
then
match2="${BASH_REMATCH[1]}"
sed -i s:ROUTE2:$match2:g $domain.sh
fi
done
echo $match
update: the regex part works now but the loop is not working. I know the loop will find two match and want to do something with each one
the sample text:
http:www.lipi.ir shop-app-web http/d49ced12-c6ca-46a0-b919-6d97b6580ad3 false false /
http:lipi.ir shop-app-web http/ff919e9d-9bf7-4342-a4b3-ea184c698959 false false /

Regex - validate IPv6 shell script

I am able to validate IPv6 addresses using java with following regex:
([0-9a-fA-F]{0,4}:){1,7}([0-9a-fA-F]){0,4}
But I need to do this in shell script to which I am new.
This regex doesn't seem to work in shell. Have tried some other combinations also but nothing helped.
#!/bin/bash
regex="([0-9a-fA-F]{0,4}:){1,7}([0-9a-fA-F]){0,4}"
var="$1"
if [[ "$var" =~ "$regex" ]]
then
echo "matches"
else
echo "doesn't match!"
fi
It gives output doesn't match! for 2001:0Db8:85a3:0000:0000:8a2e:0370:7334
How can I write this in shell script?
Java regex shown in question would work in bash as well but make sure to not to use quoted regex variable. If the variable or string on the right hand side of =~ operator is quoted, then it is treated as a string literal instead of regex.
I also recommend using anchors in regex. Otherwise it will print matches for invalid input as: 2001:0db8:85a3:0000:0000:8a2e:0370:7334:foo:bar:baz.
Following script should work for you:
#!/bin/bash
regex='^([0-9a-fA-F]{0,4}:){1,7}[0-9a-fA-F]{0,4}$'
var="$1"
if [[ $var =~ $regex ]]; then
echo "matches"
else
echo "doesn't match!"
fi
[[ and =~ won't work with sh, and awk almost works everywhere.
Here is what I did
saved as ./check-ipv6.sh, chmod +x ./check-ipv6.sh
#!/bin/sh
regex='^([0-9a-fA-F]{0,4}:){1,7}[0-9a-fA-F]{0,4}$'
echo -n "$1" | awk '$0 !~ /'"$regex"'/{print "not an ipv6=>"$0;exit 1}'
Or you prefer bash than sh
#!/bin/bash
regex='^([0-9a-fA-F]{0,4}:){1,7}[0-9a-fA-F]{0,4}$'
awk '$0 !~ /'"$regex"'/{print "not an ipv6=>"$0;exit 1}' <<< "$1"
Test
~$ ./check-ipv6.sh 2001:0Db8:85a3:0000:0000:8a2e:0370:7334x
not an ipv6=>2001:0Db8:85a3:0000:0000:8a2e:0370:7334x
~$ echo $?
1
~$ ./check-ipv6.sh 2001:0Db8:85a3:0000:0000:8a2e:0370:7334
~$ echo $?
0

Pattern matching in if statement in bash

I'm trying to count the words with at least two vowels in all the .txt files in the directory. Here's my code so far:
#!/bin/bash
wordcount=0
for i in $HOME/*.txt
do
cat $i |
while read line
do
for w in $line
do
if [[ $w == .*[aeiouAEIOU].*[AEIOUaeiou].* ]]
then
wordcount=`expr $wordcount + 1`
echo $w ':' $wordcount
else
echo "In else"
fi
done
done
echo $i ':' $wordcount
wordcount=0
done
Here is my sample from a txt file
Last modified: Sun Aug 20 18:18:27 IST 2017
To remove PPAs
sudo apt-get install ppa-purge
sudo ppa-purge ppa:
The problem is it doesn't match the pattern in the if statement for all the words in the text file. It goes directly to the else statement. And secondly, the wordcount in echo $i ':' $wordcount is equal to 0 which should be some value.
Immediate Issue: Glob vs Regex
[[ $string = $pattern ]] doesn't perform regex matching; instead, it's a glob-style pattern match. While . means "any character" in regex, it matches only itself in glob.
You have a few options here:
Use =~ instead to perform regular expression matching:
[[ $w =~ .*[aeiouAEIOU].*[AEIOUaeiou].* ]]
Use a glob-style expression instead of a regex:
[[ $w = *[aeiouAEIOU]*[aeiouAEIOU]* ]]
Note the use of = rather than == here; while either is technically valid, the former avoids building finger memory that would lead to bugs when writing code for a POSIX implementation of test / [, as = is the only valid string comparison operator there.
Larger Issue: Properly Reading Word-By-Word
Using for w in $line is innately unsafe. Use read -a to read a line into an array of words:
#!/usr/bin/env bash
wordcount=0
for i in "$HOME"/*.txt; do
while read -r -a words; do
for word in "${words[#]}"; do
if [[ $word = *[aeiouAEIOU]*[aeiouAEIOU]* ]]; then
(( ++wordcount ))
fi
done
done <"$i"
printf '%s: %s\n' "$i" "$wordcount"
wordcount=0
done
Try:
awk '/[aeiouAEIOU].*[AEIOUaeiou]/{n++} ENDFILE{print FILENAME":"n; n=0}' RS='[[:space:]]' *.txt
Sample output looks like:
$ awk '/[aeiouAEIOU].*[AEIOUaeiou]/{n++} ENDFILE{print FILENAME":"n; n=0}' RS='[[:space:]]' *.txt
one.txt:1
sample.txt:9
How it works:
/[aeiouAEIOU].*[AEIOUaeiou]/{n++}
Every time we find a word with two vowels, we increment variable n.
ENDFILE{print FILENAME":"n; n=0}
At the end of each file, we print the name of the file and the 2-vowel word count n. We then reset n to zero.
RS='[[:space:]]'
This tells awk to use any whitespace as a word separator. This makes each word into a record. Awk reads the input one record at a time.
Shell issues
The use of awk avoids a multitude of shell issues. For example, consider the line for w in $line. This will not work the way you hope. Consider a directory with these files:
$ ls
one.txt sample.txt
Now, let's take line='* Item One' and see what happens:
$ line='* Item One'
$ for w in $line; do echo "w=$w"; done
w=one.txt
w=sample.txt
w=Item
w=One
The shell treats the * in line as a wildcard and expands it into a list of files. Odds are you didn't want this. The awk solution avoids a variety of issues like this.
Using grep - this is pretty simple to do.
#!/bin/bash
wordcount=0
for file in ./*.txt
do
count=`cat $file | xargs -n1 | grep -ie "[aeiou].*[aeiou]" | wc -l`
wordcount=`expr $wordcount + $count`
done
echo $wordcount