Sed Capture Group Regex - regex

I have the following string as output
Config(1) = ( value1:4000 value2:2000 value3:500 value4:1000)
I want to capture all 4 values into 4 different variables in bash and I think the cleanest way to do that is with regex. And I think the best way to use regex for this is with sed.
I have tested the regex and can capture the value1 with
With sed I am trying this based on other answers:
echo "Config(1) = ( value1:4000 value2:2000 value3:500 value4:1000)" | sed -n 's/^\s*value1\:\(\d\+\)\s\?.*/\1/p'
This returns nothing

BASH supports regular expressions natively:
s='Config(1) = ( value1:4000 value2:2000 value3:500 value4:1000)'
pattern='value1:([0-9]+) value2:([0-9]+) value3:([0-9]+) value4:([0-9]+)'
if [[ "$s" =~ $pattern ]]; then
    # BASH_REMATCH[n] holds the text matched by the n-th capture group
    echo "${BASH_REMATCH[1]}"
    echo "${BASH_REMATCH[2]}"
    echo "${BASH_REMATCH[3]}"
    echo "${BASH_REMATCH[4]}"
fi
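For comparison, the sed attempt from the question can be made to work as well. The following is only a sketch, assuming bash (for process substitution and here-strings); the variable names v1 to v4 are made up for illustration. It uses [0-9] because sed does not understand \d, and it drops the ^ anchor because the line starts with "Config(1)" rather than "value1".

```shell
s='Config(1) = ( value1:4000 value2:2000 value3:500 value4:1000)'

# sed has no \d, so digits are written as [0-9]; the pattern is not anchored
# with ^ because the line begins with "Config(1)", not with "value1".
read -r v1 v2 v3 v4 < <(
  sed -n 's/.*value1:\([0-9][0-9]*\) value2:\([0-9][0-9]*\) value3:\([0-9][0-9]*\) value4:\([0-9][0-9]*\).*/\1 \2 \3 \4/p' <<<"$s"
)

echo "$v1 $v2 $v3 $v4"
```

This still spawns an external process, so the native =~ approach is usually the lighter choice when the string is already in a bash variable.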

You could grep for the value with the -o flag to only output the match.
This outputs 4000
echo "Config(1) = ( value1:4000 value2:2000 value3:500 value4:1000)" | grep -Po '(?<=value1:)\d+'
Though it's hard to say whether this is the cleanest way to achieve your goal without more context, a small program (in awk, maybe?) that parses that output format might be worth considering here.
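As a rough idea of what such an awk parser could look like, here is a minimal sketch. It assumes the output always consists of whitespace-separated name:number tokens, with the last one possibly followed by a closing parenthesis; the variable name parsed is hypothetical.

```shell
s='Config(1) = ( value1:4000 value2:2000 value3:500 value4:1000)'

# Walk over the whitespace-separated fields and keep only name:number tokens.
parsed=$(awk '{
  for (i = 1; i <= NF; i++) {
    if ($i ~ /^[A-Za-z0-9_]+:[0-9]+\)?$/) {
      sub(/\)$/, "", $i)        # drop a trailing ")" on the last token
      split($i, kv, ":")        # kv[1] = name, kv[2] = number
      print kv[1] "=" kv[2]
    }
  }
}' <<<"$s")

echo "$parsed"
```

Each output line has the form name=value, which is easy to post-process further.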

This will work
Create a simple two statement script
var=`echo "Config(1) = ( value1:4000 value2:2000 value3:500 value4:1000)" | grep -Eo "\( .*\)"|sed 's/^.\(.*\).$/\1/'`
for v in $var; do
    # each $v looks like value1:4000; print the part after the colon
    echo "$v" | awk -F: '{print $2}'
done
Run as
root#114855-T480:/home/yadav22ji# ./tpr
You can assign these values to variables as you said.
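One possible way to do that assignment is sketched below. It assumes bash arrays are available and that the values are plain numbers, so unquoted word splitting is harmless; the names vals and value1 to value4 are chosen only for illustration.

```shell
s='Config(1) = ( value1:4000 value2:2000 value3:500 value4:1000)'

# Keep every name:number token, then drop the name part with cut.
vals=( $(grep -o '[a-z0-9]*:[0-9]*' <<<"$s" | cut -d: -f2) )

value1=${vals[0]}
value2=${vals[1]}
value3=${vals[2]}
value4=${vals[3]}

echo "$value1 $value2 $value3 $value4"
```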

Parsing and capturing every value to the variable:
result=`echo "Config(1) = ( value1:4000 value2:2000 value3:500 value4:1000)"`
declare -A variables=( ["variableone"]="1" ["variabletwo"]="2" ["variablethree"]="3" ["variablefour"]="4" )
for index in ${!variables[*]}; do
    export $index=$(echo $result | tr ' ' '\n' | sed "s/[()]//g" | grep value | awk -F ":" '{print $2}' | head -"${variables[$index]}" | tail -1)
done
Array key: the name of the variable to export
Array value: the line number passed to head to select the matching value
[root#centos ~]# env | grep variable

I think the following regex could be a shorter form of the one in #hmm's answer


How to use sed to replace every match according to each match?

$ echo 'a,b,c,d=1' | sed '__MAGIC_HERE__'
$ echo 'a,b,c,d=2' | sed '__MAGIC_HERE__'
Can sed cast this spell?
I have to use sed twice to achieve this
v=`echo $s | sed -rn 's/.*([0-9]+)/\1/p'`
echo $s | sed "s/=.*//" | sed -rn "s/([a-z])/\1=$v/gp"
echo $s | sed -rn 's/.*([0-9]+)/\1/p' | { read v;echo $s | sed "s/=.*//" | sed -rn "s/([a-z])/\1=$v/gp"; }
The real use case is shown below, and the content there is multiline. Thanks to #hek2mgl, the awk version is much easier.
My usecase
export LS_COLORS='no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:ex=01;32'
# SED Version
read -rd '' exts < <(
for i in $(echo $exts); do
echo $i | sed -rn 's/.*=(.*)/\1/p' | { read v; echo $i | sed "s/=.*//" | sed -rn "s/([^|]+)\|?/:\*.\1=$v/gp"; }
done | tr -d '\n'
)
export LS_COLORS="$LS_COLORS$exts"
# AWK Version
read -r -d '' exts < <( echo $exts | xargs -n1 | awk -F= '{gsub(/\|/,"="$2":*.")}$2' | tr "\n" ":" )
export LS_COLORS="$LS_COLORS:*.$exts"
unset exts
Final sed version
read -r -d '' exts < <( echo $exts | xargs -n1 | sed -r 's/\|/\n/g;:a;s/\n(.*(=.*))/\2:*.\1/;ta' | sed "s/^/*./g" | tr "\n" ":" )
export LS_COLORS="$LS_COLORS:$exts"
This might work for you (GNU sed):
sed -r 's/,/\n/g;:a;s/\n(.*(=.*))/\2,\1/;ta' file
Convert the separators to newlines (a unique character not found in the file) and then replace each occurrence of the newline by the required string and the original separator.
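To see the loop in action, here is a small sketch that runs the same script on the sample line from the question. It assumes GNU sed (for \n in the pattern and for the ;-separated label) and uses -E, which GNU sed accepts as an equivalent of -r; the variable name out is made up.

```shell
# Step 1: every "," becomes a newline.
# Step 2 (label a): the first remaining newline is replaced by the "=value"
#   suffix plus a "," and the loop repeats (ta) until no newline is left.
out=$(sed -E 's/,/\n/g;:a;s/\n(.*(=.*))/\2,\1/;ta' <<<'a,b,c,d=1')

echo "$out"
```

Each pass through the loop consumes exactly one newline, so the number of iterations equals the number of separators in the line.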
I would use awk:
awk -F= '{gsub(/,/,"="$2",")}1'
-F= splits the input line at =, which lets us access the number as the second field, $2. gsub() replaces all occurrences of , with =$2,. The 1 at the end is an awk idiom: it is a condition that is always true, so awk performs the default action and prints the (modified) line.
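A quick way to check this against both sample inputs from the question is sketched here; the variable names out1 and out2 are only for illustration, and a here-string is assumed to be available (bash/ksh/zsh).

```shell
# The replacement string "=" $2 "," is evaluated before gsub() modifies $0,
# so $2 still holds the number after the "=" sign.
out1=$(awk -F= '{ gsub(/,/, "=" $2 ",") } 1' <<<'a,b,c,d=1')
out2=$(awk -F= '{ gsub(/,/, "=" $2 ",") } 1' <<<'a,b,c,d=2')

echo "$out1"
echo "$out2"
```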
Perl can...
echo 'a,b,c,d=1' | perl -ne 'chomp; my ($val) = m|=(\d+)|; s|\=.*||; print join(",", map {"$_=$val"} split/,/) . "\n";'
perl -ne # Loop over input and run command
chomp; # Remove trailing newline
my ($val) = m|=(\d+)|; # Find numeric value after '='
s|\=.*||; # Remove everything starting with '='
split /,/ # Split input on ',' => ( a, b, c, d )
map {"$_=$val" } # Create strings ( "a=1", "b=1", ... ) from results of split
join(",",...) # Join the results of previous map with ','
print .... "\n" # Print it all out with a newline at the end.
I hope you're not seriously going to use that mush of read/echo/xargs/sed/sed/tr in your code. Just use one small, simple awk script:
$ cat
exts=$( awk -F'=' '
NF {
    # on every non-empty line, turn each | (and the end of $1) into "=<value>:"
    gsub(/\||$/, "="$2":", $1)
    out = out $1
}
END {
    sub(":$", "", out)
    print out
}
' <<<"$exts" )
echo "$exts"
$ ./
Perl, another Perl alternative...
echo 'a,b,c,d=1' | perl -pe '($a)=/(\d+)$/; s/,/=$a,/g;'
echo 'a,b,c,d=2' | perl -pe '($a)=/(\d+)$/; s/,/=$a,/g;'
perl -e # perl one-liner switch
perl -ne # puts an implicit loop for each line of input
perl -pe # as 'perl -ne', but adds an implicit print at the end of each iteration
($a)=/(\d+)$/; # catch the number in d=1 or d=2, assign variable $a
s/,/=$a,/g; # substitute each ',' with '=1,' if $a=1

Regex w/grep against tnsnames.ora

I am trying to print out the contents of a TNS entry from the tnsnames.ora file to make sure it is correct from an Oracle RAC environment.
So if I do something like:
grep -A 4 "" $ORACLE_HOME/network/admin/tnsnames.ora
I will get back: =
(PROTOCOL = TCP)(HOST = = 1521))
Which is what I want. Now I have an environment variable being set for the JDBC connection string by an external program when the shell script gets called like:
export $
So I need to get TNS alias out of the above string. I'm not sure how to do multiple matches and reorder the matches with regex and need some help.
grep #.+: $DB_URL
I assume will get the
but I'm looking for
So I'm stuck at this part. How do I get the TNS alias and then pipe/combine it with the initial grep to display the text for the TNS entry?
#mklement0 #Walter A - I tried your ways but they are not exactly what I was looking for.
echo "" | grep -Po "#\K[^:]*"
echo "" | sed 's/.*#\(.*\):.*/\1/'
echo "" | cut -d"#" -f2 | cut -d":" -f1
echo "" | tr "#:" "\t" | cut -f2
echo "" | awk -F'[#:]' '{ print $2 }'
All these methods get me back:
What I am looking for is actually:
- For brevity, the commands below use bash/ksh/zsh here-string syntax to send strings to stdin (<<<"$var"). If your shell doesn't support this, use printf %s "$var" | ... instead.
The following awk command will extract the desired string ( from $DB_URL (
awk -F '[#:/]' '{ sub("^[^.]+", "", $2); print $4 $2 }' <<<"$DB_URL"
-F'[#:/]' tells awk to split the input into fields by either # or : or /. With your input, this means that the parts of interest are in the second field ($2) and the fourth field ($4). The sub() call removes the first .-separated component from $2, and the print call pieces together the result.
To put it all together:
domain=$(awk -F '[#:/]' '{ sub("^[^.]+", "", $2); print $4 $2 }' <<<"$DB_URL")
grep -F -A 4 "$domain" "$ORACLE_HOME/network/admin/tnsnames.ora"
You don't strictly need intermediate variable $domain, but I've added it for clarity.
Note how -F was added to grep to specify that the search term should be treated as a literal, so that characters such as . aren't treated as regex metacharacters.
Alternatively, for more robust matching, use a regex that is anchored to the start of the line with ^, and \-escape the . chars (using shell parameter expansion) to ensure their treatment as literals:
grep -A 4 "^${domain//./\.}" "$ORACLE_HOME/network/admin/tnsnames.ora"
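The difference between literal and regex matching can be tried out on a throwaway file. The sketch below is purely illustrative: the file contents, the alias DB1.EXAMPLE.COM and the host names are all invented, and it only demonstrates the -F variant.

```shell
# Hypothetical tnsnames.ora content, written to a temporary file.
tns_file=$(mktemp)
cat >"$tns_file" <<'EOF'
DB1.EXAMPLE.COM =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = host1.example.com)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = db1))
  )
DB1XEXAMPLE.COM =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = host2.example.com)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = other))
  )
EOF

domain='DB1.EXAMPLE.COM'   # hypothetical alias

# With -F the dots are literal, so the look-alike DB1XEXAMPLE.COM entry is
# not matched; an unescaped regex "." would match the X as well.
entry=$(grep -F -A 4 "$domain" "$tns_file")
echo "$entry"

rm -f "$tns_file"
```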
You can get a part of a string with
# Only GNU-grep
echo "" | grep -Po "#\K[^:]*"
# or
echo "" | sed 's/.*#\(.*\):.*/\1/'
# or
echo "" | cut -d"#" -f2 | cut -d":" -f1
# or, when the string already is in a var
echo "${DB_URL#*#}" | cut -d":" -f1
# or using a temp var
tmpvar="${DB_URL#*#}"
echo "${tmpvar%:*}"
I had skipped the alternative awk, that was given by #mklement0 already:
echo "" | awk -F'[#:]' '{ print $2 }'
The awk solution is straightforward. If you want to use the same approach without awk, you can do something like
echo "" | tr "#:" "\t" | cut -f2
or the ugly
echo "" | (IFS='#:' read -r _ url _; echo "$url")
What is happening here?
After setting the new IFS, I want to take the second word of the input. The first and third word(s) are caught in the dummy variable _ (you could have named them dummyvar1 and dummyvar2). The pipe | creates a subprocess, so you need the ( ) to keep reading and displaying the variable url in the same process.
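The same splitting can be tried with a here-string instead of a pipe. This is just a sketch with an invented input string (the real URL is not shown here); it assumes bash, ksh or zsh for the <<< syntax.

```shell
input='someuser#db.example.com:1521'   # hypothetical input, for illustration

# IFS is changed only for the read command; the three words land in
# _ (discarded), url, and _ (the remainder, discarded again).
url=$(IFS='#:' read -r _ url _ <<<"$input"; echo "$url")

echo "$url"
```

The command substitution plays the role of the ( ) group: read and echo run in the same subshell, so url is still set when it is printed.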

Grep in bash with regex

I am getting the following output from a bash script:
INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist
and I would like to get only the path (MajorDomo/MajorDomo-Info.plist) using grep. In other words, everything after the equals sign. Any ideas of how to do this?
This job is better suited to awk:
s='INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist'
awk -F' *= *' '{print $2}' <<< "$s"
If you really want grep then use grep -P:
grep -oP ' = \K.+' <<< "$s"
Not exactly what you were asking, but
echo "INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist" | sed 's/.*= \(.*\)$/\1/'
will do what you want.
You could use cut as well:
your_script | cut -d = -f 2-
(where your_script does something equivalent to echo INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist)
If you need to trim the space at the beginning:
your_script | cut -d = -f 2- | cut -d ' ' -f 2-
If you have multiple spaces at the beginning and you want to trim them all, you'll have to fall back to sed: your_script | cut -d = -f 2- | sed 's/^ *//' (or, simpler, your_script | sed 's/^[^=]*= *//')
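A short sketch of the single-sed variant, using a made-up input with several spaces after the equals sign (the variable names line and path are hypothetical):

```shell
line='INFOPLIST_FILE =    MajorDomo/MajorDomo-Info.plist'

# Remove everything up to and including the first "=" plus any spaces after it.
path=$(sed 's/^[^=]*= *//' <<<"$line")

echo "$path"
```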
Assuming your script outputs a single line, there is a shell-only solution:
line=$(your_script)
echo "${line#*= }"
IFS=' =' read -r _ x <<<"INFOPLIST_FILE = MajorDomo/MajorDomo-Info.plist"
printf "%s\n" "$x"

How to use sed to identify a string in brackets?

I want to find the string that is placed within the brackets. How do I use sed to pull out the string?
# cat /sys/block/sdb/queue/scheduler
noop anticipatory deadline [cfq]
I'm not getting the exact result
# cat /sys/block/sdb/queue/scheduler | sed 's/\[*\]//'
noop anticipatory deadline [cfq
I'm expecting an output
It can be easier with grep, especially if the position of the bracketed text within the line can change:
$ grep -Po '(?<=\[)[^]]*' file
This is look-behind: whenever you find a string [, start fetching all the characters up to a ].
See another example:
$ cat a
noop anticipatory deadline [cfq]
hello this [is something] we want to [enclose] yeah
$ grep -Po '(?<=\[)[^]]*' a
cfq
is something
enclose
You can also use awk for this, in case it is always in the same position:
$ awk -F'[][]' '{print $2}' file
It is setting the field separators as [ and ]. And from that, prints the second one.
And with sed:
$ sed 's/[^[]*\[\([^]]*\).*/\1/g' file
It is a bit messy, but basically it looks for the block of text in between [ and ] and prints it back.
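To make the difference from the attempt in the question concrete, here is a small sketch that runs both expressions on the sample line (the variable names attempt and sched are invented for illustration):

```shell
line='noop anticipatory deadline [cfq]'

# The attempt from the question: \[* means "zero or more [ characters",
# so the expression only ever matches and removes the closing "]".
attempt=$(sed 's/\[*\]//' <<<"$line")

# The working version: skip everything up to the first "[", capture all
# characters that are not "]", and discard the rest of the line.
sched=$(sed 's/[^[]*\[\([^]]*\).*/\1/' <<<"$line")

echo "$attempt"
echo "$sched"
```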
I found one possible solution-
cut -d "[" -f2 | cut -d "]" -f1
so the exact solution is
# cat /sys/block/sdb/queue/scheduler | cut -d "[" -f2 | cut -d "]" -f1
Another potential solution is awk:
s='noop anticipatory deadline [cfq]'
awk -F'[][]' '{print $2}' <<< "$s"
Another way, with GNU grep:
grep -Po "\[\K[^]]*" file
with pure shell:
while read line; do [[ "$line" =~ \[([^]]*)\] ]] && echo "${BASH_REMATCH[1]}"; done < file
Another awk
echo 'noop anticipatory deadline [cfq]' | awk '{gsub(/.*\[|\].*/,x)}8'
perl -lne 'print $1 if(/\[([^\]]*)\]/)'

How to extract a number from a string using grep and regex

I cat a file and pipe it through grep with a regular expression, like this:
cat /tmp/tmp_file | grep "toto.titi\[[0-9]\+\].tata=55"
The command displays the following output
Is it possible to modify my grep command so that it outputs only the number 12?
You can grab this in pure BASH using its regex capabilities:
[[ "$s" =~ ^toto.titi\[([0-9]+)\]\.tata=[0-9]+$ ]] && echo "${BASH_REMATCH[1]}"
You can also use sed:
sed 's/toto.titi\[\([0-9]*\)\].tata=55/\1/' <<< "$s"
OR using awk:
awk -F '[\\[\\]]' '{print $2}' <<<"$s"
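All three variants expect the matching line in a variable s. A self-contained sketch of the bash one, assuming the line in question is toto.titi[12].tata=55 and keeping the regex in a variable (re and num are hypothetical names) to avoid quoting surprises:

```shell
s='toto.titi[12].tata=55'

# Storing the pattern in a variable keeps the escaping readable;
# the parentheses capture the digits between the square brackets.
re='^toto\.titi\[([0-9]+)\]\.tata=55$'

if [[ $s =~ $re ]]; then
    num=${BASH_REMATCH[1]}
    echo "$num"
fi
```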
Use a lookbehind:
echo 'toto.titi[12].tata=55' | grep -oP '(?<=\[)\d+'
Without Perl regex, use sed to remove the "[":
echo 'toto.titi[12].tata=55' | grep -o "\[[0-9]\+" | sed 's/\[//g'
Pipe it to sed and use a back reference:
cat /tmp/tmp_file | grep "toto.titi\[[0-9]\+\].tata=55" | sed 's/.*\[\([0-9]*\)\].*/\1/'