sed not able to print matching regex group only

I have some key value pair arguments. I need to print them as is.
Example.
echo $X
(a=b) (c=d) (e=f)
echo "$X" | sed -E 's/([a-zA-Z0-9_]*=[a-zA-Z0-9_]*)/match/1'
echo "$X" | sed -E 's/([a-zA-Z0-9_]*=[a-zA-Z0-9_]*)/\1/1'
echo "$X" | sed -E 's/([a-zA-Z0-9_]*=[a-zA-Z0-9_]*)/\1/2'
echo "$X" | sed -E 's/([a-zA-Z0-9_]*=[a-zA-Z0-9_]*)/\1/3'
With the expressions above, I wanted to print the matching groups one by one. Using .* in the pattern is greedy and prints only the first or the last matching group. How can I print any given matching group this way?
Here is my expected output.
a=b
c=d
e=f

This grep one-liner will do:
grep -o '[^(]*=[^)]*'
example:
kent$ grep -o '[^(]*=[^)]*' <<<'(a=b) (c=d) (e=f)'
a=b
c=d
e=f

Replace ) ( with a newline and remove the remaining parentheses.
echo "$X" | sed 's/) (/\n/g;s/[()]//g'
To print the $nth line, you can pipe the output to
sed -n "$n p"
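The two answers above can be combined into one small helper. This is only a sketch: the function name nth_pair is made up for illustration, and it assumes a grep that supports -o (GNU and BSD grep both do).

```shell
# nth_pair is a hypothetical helper name, not part of the original answers.
# Usage: nth_pair N 'STRING'
nth_pair() {
    # grep -o prints every key=value match on its own line;
    # sed -n "Np" then keeps only the N-th of those lines.
    printf '%s\n' "$2" | grep -o '[^(]*=[^)]*' | sed -n "${1}p"
}

X='(a=b) (c=d) (e=f)'
second=$(nth_pair 2 "$X")
echo "$second"
```

With the sample input this prints c=d; asking for a match number that does not exist prints nothing.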

Related

How do I grep for all words that contain two consecutive e’s, and also contains two y’s

I want to find the set of words that contain two consecutive e’s and also two consecutive y’s.
So far I have /eeyy/

Alternation with ERE:
$ echo evyyree | grep -E '.*ee.*yy|.*yy.*ee'
evyyree
$ echo eveeryy | grep -E '.*ee.*yy|.*yy.*ee'
eveeryy
If the match needs to be in the same word, you can do:
$ echo "eee yyyy" | grep -E 'ee[^[:space:]]*yy|yy[^[:space:]]*ee' # no match
$ echo "eeeyyyy" | grep -E 'ee[^[:space:]]*yy|yy[^[:space:]]*ee'
eeeyyyy
Then only that word:
$ echo 'eeeyy heelo' | grep -Eo 'ee[^[:space:]]*yy|yy[^[:space:]]*ee'
eeeyy

Pipe it:
$ echo eennmmyy | grep ee | grep yy
eennmmyy

awk approach to match all words that contain both ee and yy:
s="eennmmyy heello thees-whyy someyy"
echo $s | awk '{for(i=1;i<=NF;i++) if($i~/ee/ && $i~/yy/) print $i}'
The output:
eennmmyy
thees-whyy

The only sensible and extensible way to do this is with awk:
awk '/ee/&&/yy/' file
Imagine trying to do it the grep way if you also had to find zz. Here's awk:
awk '/ee/&&/yy/&&/zz/' file
and here's grep:
grep -E 'ee.*yy.*zz|ee.*zz.*yy|yy.*ee.*zz|yy.*zz.*ee|zz.*yy.*ee|zz.*ee.*yy' file
Now add a 4th additional string to search for and see what that looks like!
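To illustrate the extensibility argument, here is a minimal sketch that takes the strings as arguments instead of hard-coding them. The helper name has_all is hypothetical, and the sketch assumes plain fixed strings without spaces (it uses index() rather than regex matching).

```shell
# has_all is a hypothetical helper: print the input lines that contain
# every one of the given fixed strings, in any order.
has_all() {
    awk -v list="$*" '
        BEGIN { n = split(list, want, " ") }   # one array entry per string
        {
            for (i = 1; i <= n; i++)
                if (index($0, want[i]) == 0) next   # a string is missing
            print
        }
    '
}

printf '%s\n' eennmmyy heello zzyyee | has_all ee yy
```

Adding a third or fourth string is then just another argument, for example has_all ee yy zz, with no growth in the pattern itself.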

How to use sed to replace every match according to each match?

$ echo 'a,b,c,d=1' | sed '__MAGIC_HERE__'
a=1,b=1,c=1,d=1
$ echo 'a,b,c,d=2' | sed '__MAGIC_HERE__'
a=2,b=2,c=2,d=2
Can sed cast this spell?
EDIT
I have to use sed twice to achieve this:
s='a,b,c,d=2'
v=`echo $s | sed -rn 's/.*([0-9]+)/\1/p'`
echo $s | sed "s/=.*//" | sed -rn "s/([a-z])/\1=$v/gp"
OR
s='a,b,c,d=2'
echo $s | sed -rn 's/.*([0-9]+)/\1/p' | { read v;echo $s | sed "s/=.*//" | sed -rn "s/([a-z])/\1=$v/gp"; }
EDIT
The real use case is below and involves multiline content. Thanks to @hek2mgl, the awk version is much easier.
EDIT
My use case:
export LS_COLORS='no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:ex=01;32'
exts="
tar|tgz|arj|taz|lzh|zip|z|Z|gz|bz2|deb|rpm|jar=01;31
jpg|jpeg|gif|bmp|pbm|pgm|ppm|tga|xbm|xpm|tif|tiff|png=01;34
mov|fli|gl|dl|xcf|xwd|ogg|mp3|wav=01;35
flv|mkv|mp4|mpg|mpeg|avi=01;36
"
# SED Version
read -rd '' exts < <(
for i in $(echo $exts)
do
echo $i | sed -rn 's/.*=(.*)/\1/p' | { read v; echo $i | sed "s/=.*//" | sed -rn "s/([^|]+)\|?/:\*.\1=$v/gp"; }
done | tr -d '\n'
)
export LS_COLORS="$LS_COLORS$exts"
# AWK Version
read -r -d '' exts < <( echo $exts | xargs -n1 | awk -F= '{gsub(/\|/,"="$2":*.")}$2' | tr "\n" ":" )
export LS_COLORS="$LS_COLORS:*.$exts"
unset exts
EDIT
Final sed version
read -r -d '' exts < <( echo $exts | xargs -n1 | sed -r 's/\|/\n/g;:a;s/\n(.*(=.*))/\2:*.\1/;ta' | sed "s/^/*./g" | tr "\n" ":" )
export LS_COLORS="$LS_COLORS:$exts"

This might work for you (GNU sed):
sed -r 's/,/\n/g;:a;s/\n(.*(=.*))/\2,\1/;ta' file
Convert the separators to newlines (a unique character not found in the file) and then replace each occurrence of the newline by the required string and the original separator.
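The loop is easier to follow when the two stages are run separately. This is a sketch that assumes GNU sed, as the answer does (-r, and \n in the replacement); the variable names step1 and result are only for illustration.

```shell
# Stage 1: every comma becomes a newline, so the pattern space holds
# four lines: a, b, c and d=1.
step1=$(echo 'a,b,c,d=1' | sed -r 's/,/\n/g')

# Stage 2: the :a ... ta loop rewrites the first remaining newline on each
# pass. \2 is the "=1" captured at the end of the pattern space, so each
# newline becomes "=1," followed by the rest (\1); ta repeats while the
# substitution still succeeds.
result=$(echo 'a,b,c,d=1' | sed -r 's/,/\n/g;:a;s/\n(.*(=.*))/\2,\1/;ta')

echo "$result"
```

The pattern space goes through a=1,b (then two newlines left), a=1,b=1,c (one left), and finally a=1,b=1,c=1,d=1.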

I would use awk:
awk -F= '{gsub(/,/,"="$2",")}1'
-F= splits the input line on =, which lets us access the number as field two, $2. gsub() replaces all occurrences of , with =$2,. The 1 at the end is an awk idiom: a condition that is always true, so the default action runs and prints the modified line.
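As a quick check of that explanation, the same command can be wrapped in a function and run on both sample inputs. The function name expand is hypothetical, and print is written out in place of the trailing 1.

```shell
# expand is a hypothetical wrapper around the awk one-liner above.
expand() {
    # "=" $2 "," is evaluated before gsub() touches the line,
    # so $2 still holds the value after the "=".
    awk -F= '{ gsub(/,/, "=" $2 ","); print }'
}

echo 'a,b,c,d=1' | expand
echo 'a,b,c,d=2' | expand
```

One caveat under this approach: a value containing & would be treated specially in the gsub() replacement string, so the sketch assumes values like the ones shown.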

Perl can...
echo 'a,b,c,d=1' | perl -ne 'chomp; my ($val) = m|=(\d+)|; s|\=.*||; print join(",", map {"$_=$val"} split/,/) . "\n";'
a=1,b=1,c=1,d=1
Explained
perl -ne # Loop over input and run command
chomp; # Remove trailing newline
my ($val) = m|=(\d+)|; # Find numeric value after '='
s|\=.*||; # Remove everything starting with '='
split /,/ # Split input on ',' => ( a, b, c, d )
map {"$_=$val" } # Create strings ( "a=1", "b=1", ... ) from results of split
join(",",...) # Join the results of previous map with ','
print .... "\n" # Print it all out with a newline at the end.

I hope you're not seriously going to use that mush of read/echo/xargs/sed/sed/tr in your code. Just use one small, simple awk script:
$ cat tst.sh
exts="
tar|tgz|arj|taz|lzh|zip|z|Z|gz|bz2|deb|rpm|jar=01;31
jpg|jpeg|gif|bmp|pbm|pgm|ppm|tga|xbm|xpm|tif|tiff|png=01;34
mov|fli|gl|dl|xcf|xwd|ogg|mp3|wav=01;35
flv|mkv|mp4|mpg|mpeg|avi=01;36
"
exts=$( awk -F'=' '
    NF {
        gsub(/\||$/, "="$2":", $1)
        out = out $1
    }
    END {
        sub(":$", "", out)
        print out
    }
' <<<"$exts" )
echo "$exts"
$ ./tst.sh
tar=01;31:tgz=01;31:arj=01;31:taz=01;31:lzh=01;31:zip=01;31:z=01;31:Z=01;31:gz=01;31:bz2=01;31:deb=01;31:rpm=01;31:jar=01;31:jpg=01;34:jpeg=01;34:gif=01;34:bmp=01;34:pbm=01;34:pgm=01;34:ppm=01;34:tga=01;34:xbm=01;34:xpm=01;34:tif=01;34:tiff=01;34:png=01;34:mov=01;35:fli=01;35:gl=01;35:dl=01;35:xcf=01;35:xwd=01;35:ogg=01;35:mp3=01;35:wav=01;35:flv=01;36:mkv=01;36:mp4=01;36:mpg=01;36:mpeg=01;36:avi=01;36

Another Perl alternative:
d=1:
echo 'a,b,c,d=1' | perl -pe '($a)=/(\d+)$/; s/,/=$a,/g;'
a=1,b=1,c=1,d=1
d=2:
echo 'a,b,c,d=2' | perl -pe '($a)=/(\d+)$/; s/,/=$a,/g;'
a=2,b=2,c=2,d=2
Explanations:
perl -e # perl one-liner switch
perl -ne # puts an implicit loop for each line of input
perl -pe # as 'perl -ne', but adds an implicit print at the end of each iteration
($a)=/(\d+)$/; # catch the number in d=1 or d=2, assign variable $a
s/,/=$a,/g; # substitute each ',' with '=1,' if $a=1

Print a part of string regex bash

From this content (in a file):
myspecificBhost.fqdn.com myspecificaBhost.fqdn.com myspecificzBhost.fqdn.com
I need to print the "B" together with the 4 characters that follow it:
Bhost
I tried:
echo ${var:position1:length}
but position1 is not the same for every hostname.

Using BASH regex:
s='myspecificBhost.fqdn.com myspecificaBhost.fqdn.com myspecificzBhost.fqdn.com'
[[ "$s" =~ (B[a-z][a-z][a-z][a-z]) ]] && echo "${BASH_REMATCH[1]}"
Bhost

try sed command:
sed -nr '/.*c(.{4,6}).*/s//\1/p' input.txt | cut -c2-6
RESULT:
Bhost
With grep command:
grep -o 'B....' input.txt | head -1
RESULT:
Bhost

Try with:
grep -o 'B....' file

Bash using parameter substitution. Outputs the 4 characters
after the first 'B':
text='myspecificBhost.fqdn.com myspecificaBhost.fqdn.com myspecificzBhost.fqdn.com'
text=${text#*B}
text=${text:0:4}
echo "${text}"
Output:
host
To get the leading 'B' use
echo "B${text}"
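The same idea can be applied to every hostname rather than only the first. The sketch below is a variation, not the answer's exact method: it replaces the bash-only ${text:0:4} with a POSIX-only way of taking the first four characters, and the variable names word, rest and four are made up for illustration.

```shell
text='myspecificBhost.fqdn.com myspecificaBhost.fqdn.com myspecificzBhost.fqdn.com'

# $text is left unquoted on purpose so the shell splits it into words.
for word in $text; do
    rest=${word#*B}               # everything after the first "B"
    # ${rest#????} is rest without its first 4 characters; stripping that
    # tail from rest leaves exactly the first 4 characters.
    four=${rest%"${rest#????}"}
    echo "B$four"
done
```

Each of the three hostnames yields Bhost. If fewer than 4 characters follow the "B", this sketch prints just B.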

Can not extract the capture group with either sed or grep

I want to extract the value pair from a key-value pair syntax but I can not.
Example I tried:
echo employee_id=1234 | sed 's/employee_id=\([0-9]+\)/\1/g'
But this gives employee_id=1234 and not 1234 which is actually the capture group.
What am I doing wrong here? I also tried:
echo employee_id=1234| egrep -o employee_id=([0-9]+)
but no success.

1. Use grep -Eo: (as egrep is deprecated)
echo 'employee_id=1234' | grep -Eo '[0-9]+'
1234
2. using grep -oP (PCRE):
echo 'employee_id=1234' | grep -oP 'employee_id=\K([0-9]+)'
1234
3. Using sed:
echo 'employee_id=1234' | sed 's/^.*employee_id=\([0-9][0-9]*\).*$/\1/'
1234

To expand on anubhava's answer number 2, the general pattern to have grep return only the capture group is:
$ regex="$precedes_regex\K($capture_regex)(?=$follows_regex)"
$ echo $some_string | grep -oP "$regex"
so
# matches and returns b
$ echo "abc" | grep -oP "a\K(b)(?=c)"
b
# no match
$ echo "abc" | grep -oP "z\K(b)(?=c)"
# no match
$ echo "abc" | grep -oP "a\K(b)(?=d)"

Using awk
echo 'employee_id=1234' | awk -F= '{print $2}'
1234

use sed -E for extended regex
echo employee_id=1234 | sed -E 's/employee_id=([0-9]+)/\1/g'
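The reason the original attempt printed the line unchanged is that in sed's default basic regex syntax + is an ordinary character, so [0-9]+ looks for a digit followed by a literal plus sign. A small side-by-side sketch (the variable names are only for illustration):

```shell
line='employee_id=1234'

# BRE (default): "+" is literal, nothing matches, the line is unchanged.
bre_plus=$(echo "$line" | sed 's/employee_id=\([0-9]+\)/\1/')

# BRE, portable spelling of "one or more digits".
bre_star=$(echo "$line" | sed 's/employee_id=\([0-9][0-9]*\)/\1/')

# ERE (-E): groups are bare parentheses and "+" is a repetition operator.
ere=$(echo "$line" | sed -E 's/employee_id=([0-9]+)/\1/')

printf '%s\n' "$bre_plus" "$bre_star" "$ere"
```

This prints employee_id=1234 for the first form and 1234 for the other two.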

You are specifically asking for sed, but in case you may use something else - any POSIX-compliant shell can do parameter expansion which doesn't require a fork/subshell:
foo='employee_id=1234'
var=${foo%%=*}
value=${foo#*=}
 
$ echo "var=${var} value=${value}"
var=employee_id value=1234
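The choice of %% and # matters when the value itself contains an = sign. A minimal sketch, with a hypothetical helper split_pair and a made-up second input:

```shell
# split_pair is a hypothetical helper; it sets the variables key and value.
split_pair() {
    key=${1%%=*}     # remove the longest suffix that starts with "="
    value=${1#*=}    # remove the shortest prefix that ends with "="
}

for pair in employee_id=1234 path=a=b; do
    split_pair "$pair"
    echo "key=$key value=$value"
done
```

Both expansions split at the first =, so the second pair gives key=path value=a=b.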

How do I form the correct regular expression to capture everything before parentheses?

Currently I have a set of strings of the format
customName(path/to/the/relevant/directory|file.ext#FileRefrence_12345)
From this I would like to extract customName, the characters before the first parenthesis, using sed.
My best guesses so far are:
echo $s | sed 's/([^(])+\(.*\)/\1/g'
echo $s | sed 's/([^\(])+\(.*\)/\1/g'
However, using these I get the error:
sed: -e expression #1, char 21: Unmatched ( or \(
So how do I form the correct regular expression? And why does it matter that \( is unmatched, if it is just an escaped character in my expression rather than a grouping operator?

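You could delete everything from the opening parenthesis onward, like this (note that in sed's default basic regex syntax a literal parenthesis does not need to be escaped; it is \( and \) that act as grouping operators, which is why an unpaired \( is reported as unmatched):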
echo 'customName(path/to/the/relevant/directory|file.ext#FileRefrence_12345)' |
sed -e 's/(.*//'

grep
kent$ echo "customName(blah)"|grep -o '^[^(]*'
customName
sed
kent$ echo "customName(blah)"|sed 's/(.*//'
customName
note I changed the stuff between the brackets.

Different options:
$ echo $s | sed 's/(.*//' #sed (everything before "(")
customName
$ echo $s | cut -d"(" -f1 #cut (delimiter is "(", print 1st block)
customName
$ echo $s | awk -F"(" '{print $1}' #awk (field separator is "(", print 1st)
customName
$ echo ${s%%(*} #shell parameter expansion (remove everything from the first "(")
customName
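For the shell-only option, the single and double percent forms differ as soon as the string contains more than one opening parenthesis. A short sketch with a made-up nested input (not from the question):

```shell
s='customName(a(b)c)'   # hypothetical input with a nested parenthesis

shortest=${s%(*}    # removes the shortest suffix, i.e. from the LAST "("
longest=${s%%(*}    # removes the longest suffix, i.e. from the FIRST "("

echo "$shortest"
echo "$longest"
```

This prints customName(a and then customName, so %% is the form that matches "everything before the first parenthesis".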

We Keep Coding
