Regex and sed in sh script not evaluating properly

First post here. I'm trying to capture just the integer from an SNMP reply with a regex. I've used a regex tester to come up with the correct pattern, but sed refuses to output the result. This is just a primitive fact-finding script for now; it'll grow into something more complex, but right now this is my stumbling block.
The replies to the two snmpget statements are:
IF-MIB::ifInOctets.1001 = Counter32: 692749329
IF-MIB::ifOutOctets.1001 = Counter32: 3119381688
I want to capture just the value after "Counter32: " and the regex (?<=: )(\d+) accomplishes that in the testers I could find online.
#!/bin/sh
SED_IFACES="-e '/(?<=: )(\d+)/g'"
INTERNET_IN=`snmpget -v 2c -c public 123.45.678.9 1.3.6.1.2.1.2.2.1.10.1001` | eval sed $SED_IFACES
INTERNET_OUT=`snmpget -v 2c -c public 123.45.678.9 1.3.6.1.2.1.2.2.1.16.1001` | eval sed $SED_IFACES
echo $INTERNET_IN
echo $INTERNET_OUT

$ cat file
IF-MIB::ifInOctets.1001 = Counter32: 692749329
IF-MIB::ifOutOctets.1001 = Counter32: 3119381688
$ awk '{print $NF}' file
692749329
3119381688
$ sed 's/.* //' < file
692749329
3119381688

You can do
sed 's/^.*Counter32: \(.*\)$/\1/'
Which captures the value and prints it out with the \1.
Also note that you are using Perl regular expressions in your example, and sed does not support these. It is also missing the substitution "s/" part.
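Putting those points together, here is a minimal sketch of how the script could look. It assumes the reply format shown in the question; fake_snmpget is a hypothetical stand-in for the real snmpget call so the example runs without a device. Note that the pipe sits inside the command substitution: in the original script the backticks close before the pipe, so sed never sees the snmpget output.

```shell
#!/bin/sh
# Hypothetical stand-in for: snmpget -v 2c -c public <host> <oid>
fake_snmpget() {
    echo "IF-MIB::ifInOctets.1001 = Counter32: 692749329"
}

# The whole pipeline goes inside $( ), so the variable receives sed's output.
INTERNET_IN=$(fake_snmpget | sed 's/^.*Counter32: \(.*\)$/\1/')

echo "$INTERNET_IN"
```

With the real command, replace the fake_snmpget call with the snmpget invocation from the question.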

Related

Transform a dynamic alphanumeric string

I have a Build called 700-I20190808-0201. I need to convert it to 7.0.0-I20190808-0201. I can do that with regular expression:
sed 's/\([0-9]\)\([0-9]\)\([0-9]\)\(.\)/\1.\2.\3\4/' abc.txt
But the solution does not work when the build ID is 7001-I20190809-0201. Can we make the regular expression dynamic so that it works for both (700 and 7001)?

Could you please try following.
awk 'BEGIN{FS=OFS="-"}{gsub(/[0-9]/,"&.",$1);sub(/\.$/,"",$1)} 1' Input_file

If you have Perl available, lookahead regular expressions make this straightforward:
$ cat foo.txt
700-I20190808-0201
7001-I20190809-0201
$ perl -ple 's/(\d)(?=\d+\-I)/$1./g' foo.txt
7.0.0-I20190808-0201
7.0.0.1-I20190809-0201

You can implement a simple loop using labels and branching using sed:
$ echo '7001-I20190809-0201' | sed ':1; s/^\([0-9]\{1,\}\)\([0-9][-.]\)/\1.\2/; t1'
7.0.0.1-I20190809-0201
$ echo '700-I20190809-0201' | sed ':1; s/^\([0-9]\{1,\}\)\([0-9][-.]\)/\1.\2/; t1'
7.0.0-I20190809-0201
If your sed supports the -E flag:
sed -E ':1; s/^([0-9]+)([0-9][-.])/\1.\2/; t1'
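A sketch of the same loop wrapped in a shell function, in case it is useful in a script. The function name dotted_build is made up for illustration. Each sed command is passed with its own -e, on the assumption that some sed implementations read a label up to the end of the line and so do not accept ':1; ...' on one line.

```shell
#!/bin/sh
# dotted_build: hypothetical helper that inserts dots between the leading digits.
dotted_build() {
    printf '%s\n' "$1" |
        sed -e ':a' -e 's/^\([0-9]\{1,\}\)\([0-9][-.]\)/\1.\2/' -e 'ta'
}

three=$(dotted_build '700-I20190808-0201')
four=$(dotted_build '7001-I20190809-0201')

echo "$three"
echo "$four"
```

Each pass inserts one dot before the last undotted digit, and the t command jumps back to the label until no substitution is made.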

sed -e 's/\([0-9]\)\([0-9]\)\([0-9]\)\(.\)/\1.\2.\3.\4/' -e 's/\.\-/\-/' abc.txt
This worked for me, very simple one. Just needed to extract it in my ant script using replaceregex pattern

sed: struggling with substitution and regex for ^*=

I am running a Linux bash script. From stdout lines like: /gpx/trk/name=MyTrack1, I want to keep only the end of line after =.
I am struggling to understand why the following sed command is not working as I expect:
echo "/gpx/trk/name=MyTrack1" | sed -e "s/^*=//"
(I also tried)
echo "/gpx/trk/name=MyTrack1" | sed -e "s/^*\=//"
The return is always /gpx/trk/name=MyTrack1 and not MyTrack1

An even simpler way if this is the only structure you are concerned about:
echo "/gpx/trk/name=MyTrack1" | cut -d = -f 2

Simply try:
echo "/gpx/trk/name=MyTrack1" | sed 's/.*=//'
Solution 2nd: With another sed.
echo "/gpx/trk/name=MyTrack1" | sed 's/\(.*=\)\(.*\)/\2/'
Explanation: As per OP's request, here is what the second solution does:
s: Tells sed to perform a substitution.
\(.*=\): The first capture group. It holds everything from the start of the line up to and including the last =, so the text /gpx/trk/name= ends up in group 1.
\(.*\): The second capture group. It holds everything after that =, which here is MyTrack1.
/\2/: Tells sed to replace the whole matched line with the contents of group 2 only, which is MyTrack1.
Solution 3rd: Or with awk, assuming your Input_file looks like the samples shown.
echo "/gpx/trk/name=MyTrack1" | awk -F'=' '{print $2}'
Solution 4th: With awk's match.
echo "/gpx/trk/name=MyTrack1" | awk 'match($0,/=.*$/){print substr($0,RSTART+1,RLENGTH-1)}'

$ echo "/gpx/trk/name=MyTrack1" | sed -e "s/^.*=//"
MyTrack1
The regular expression ^.*= matches anything up to and including the last = in the string.
Your regular expression ^*= would match the literal string *= at the start of a string, e.g.
$ echo "*=/gpx/trk/name=MyTrack1" | sed -e "s/^*=//"
/gpx/trk/name=MyTrack1
The * character in a regular expression usually modifies the immediately preceding expression so that zero or more occurrences of it are matched. When * has nothing to modify, as at the start of a basic regular expression or directly after the ^ anchor, it matches a literal * character.

Not to take you off the sed track, but this is easy with Bash alone:
$ echo "$s"
/gpx/trk/name=MyTrack1
$ echo "${s##*=}"
MyTrack1
The ##*= expansion removes the longest match of the pattern *= from the beginning of the string, that is, everything up to and including the last =:
$ s="1=2=3=the rest"
$ echo "${s##*=}"
the rest
The equivalent in sed would be:
$ echo "$s" | sed -E 's/^.*=(.*)/\1/'
the rest
Whereas #*= removes the shortest match, up to and including the first =:
$ echo "${s#*=}"
2=3=the rest
And in sed:
$ echo "$s" | sed -E 's/^[^=]*=(.*)/\1/'
2=3=the rest
Note the difference between * in Bash parameter expansion and * in a sed regex:
The * in Bash (in this context) is a glob: on its own it matches any sequence of characters.
The * in a regex applies to the preceding item, so to match any sequence of characters you need .*
Bash has extensive string manipulation functions. You can read about Bash string patterns in BashFAQ.
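As a small sketch of how the two expansions might be used in a script, the function below splits on the last = and the variables show the first/last difference. The name value_of is made up for illustration; the ${var#pattern} and ${var##pattern} forms themselves also work in plain sh, not only in Bash.

```shell
#!/bin/sh
# value_of: hypothetical helper that prints the text after the last "=".
value_of() {
    printf '%s\n' "${1##*=}"
}

track=$(value_of "/gpx/trk/name=MyTrack1")

s="1=2=3=the rest"
after_last=${s##*=}    # longest match of *= removed
after_first=${s#*=}    # shortest match of *= removed

echo "$track"
echo "$after_last"
echo "$after_first"
```

No external process is started for the expansion itself, which can matter when this runs once per input line in a loop.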

Extract version using grep/regex in bash

I have a file that has a line stating
version = "12.0.08-SNAPSHOT"
The word version and quoted strings can occur on multiple lines in that file.
I am looking for a single line bash statement that can output the following string:
12.0.08-SNAPSHOT
The version can have RELEASE tag too instead of SNAPSHOT.
So to summarize, given
version = "12.0.08-SNAPSHOT"
expected output: 12.0.08-SNAPSHOT
And given
version = "12.0.08-RELEASE"
expected output: 12.0.08-RELEASE

The following command prints the strings enclosed in quotes in version = "...":
grep -Po '\bversion\s*=\s*"\K.*?(?=")' yourFile
-P enables perl regexes, which allow us to use features like \K and so on.
-o only prints matched parts instead of the whole lines.
\b ensures that version starts at a word boundary and we do not match things like abcversion.
\s stands for any kind of whitespace.
\K lets grep forget that it matched the part before \K. The forgotten part will not be printed.
.*? matches as few characters as possible (the matching part will be printed) ...
(?=") ... until we see a ", which won't be included in the match either (this is called a lookahead).
Not all grep implementations support the -P option. Alternatively, you can use perl, as described in this answer:
perl -nle 'print $& if m{\bversion\s*=\s*"\K.*?(?=")}' yourFile

Seems like a job for cut:
$ echo 'version = "12.0.08-SNAPSHOT"' | cut -d'"' -f2
12.0.08-SNAPSHOT
$ echo 'version = "12.0.08-RELEASE"' | cut -d'"' -f2
12.0.08-RELEASE

Portable solution:
$ echo 'version = "12.0.08-RELEASE"' |sed -E 's/.*"(.*)"/\1/g'
12.0.08-RELEASE
or even:
$ perl -pe 's/.*"(.*)"/$1/g'
$ awk -F"\"" '{print $2}'
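Since the question mentions that quoted strings can occur on other lines too, the commands above would print a value for each of those lines. A possible sketch that restricts the output to the version line uses sed -n with the p flag. It assumes the line starts with optional whitespace followed by version; the function name extract_version and the sample input are made up for illustration.

```shell
#!/bin/sh
# extract_version: hypothetical helper; prints only the quoted value of a
# line that begins with "version =", and nothing for any other line.
extract_version() {
    sed -n 's/^[[:space:]]*version[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p'
}

version=$(printf '%s\n' 'name = "demo"' 'version = "12.0.08-SNAPSHOT"' | extract_version)

echo "$version"
```

This sticks to basic regular expressions, so it does not depend on grep -P or sed -E being available.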

Extract few matching strings from matching lines in file using sed

I have a file with strings similar to this:
abcd u'current_count': u'2', u'total_count': u'3', u'order_id': u'90'
I have to find current_count and total_count for each line of the file. I am trying the command below but it's not working. Please help.
grep current_count file | sed "s/.*\('current_count': u'\d+'\).*/\1/"
It is outputting the whole line but I want something like this:
'current_count': u'3', 'total_count': u'3'

It's printing the whole line because the pattern in the s command doesn't match, so no substitution happens.
sed regexes don't support \d for digits, or x+ for xx*. GNU sed has a -r option to enable extended-regex support so + will be a meta-character, but \d still doesn't work. GNU sed also allows \+ as a meta-character in basic regex mode, but that's not POSIX standard.
So anyway, this will work:
echo -e "foo\nabcd u'current_count': u'2', u'total_count': u'3', u'order_id': u'90'" |
sed -nr "s/.*('current_count': u'[0-9]+').*/\1/p"
# output: 'current_count': u'2'
Notice that I skip the grep by using sed -n s///p. I could also have used /current_count/ as an address:
sed -r -e '/current_count/!d' -e "s/.*('current_count': u'[0-9]+').*/\1/"
Or with just grep printing only the matching part of the pattern, instead of the whole line:
grep -E -o "'current_count': u'[[:digit:]]+'"
(or egrep instead of grep -E). I forget if grep -o is POSIX-required behaviour.

To me this looks like some sort of serialized Python data. Ideally I would find out where that data comes from and parse it properly.
However, while being hackish, sed can also be used here:
sed "s/.*current_count': [a-z]'\([0-9]\+\).*/\1/" input.txt
sed "s/.*total_count': [a-z]'\([0-9]\+\).*/\1/" input.txt
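To get both fields on one line, as the question asks, one option is a single substitution with two capture groups. This is only a sketch: it assumes current_count always appears before total_count on the line, and the function name extract_counts is made up. [0-9][0-9]* is used instead of \+ so that it does not rely on a GNU extension.

```shell
#!/bin/sh
# extract_counts: hypothetical helper; prints both fields for matching lines
# and nothing for lines that do not contain them.
extract_counts() {
    sed -n "s/.*\('current_count': u'[0-9][0-9]*'\).*\('total_count': u'[0-9][0-9]*'\).*/\1, \2/p"
}

line="abcd u'current_count': u'2', u'total_count': u'3', u'order_id': u'90'"
counts=$(printf '%s\n' "foo" "$line" | extract_counts)

echo "$counts"
```

The sed script is in double quotes because it contains single quotes; the backslashes before the parentheses and digits are passed through to sed unchanged.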

regexp (sed) suppress "no match" output

I'm stuck on that and can't wrap my head around it: How can I tell sed to return the value found, and otherwise shut up?
It's really beyond me: Why would sed return the whole string if it found nothing? Do I have to run another test on the returned string to verify it? I tried using "-n" from the (very short) man page but it effectively suppresses all output, including matched strings.
This is what I have now :
echo plop-02-plop | sed -e 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/'
which returns
02 (and that is fine and dandy, thank you very much), but:
echo plop-02plop | sed -e 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/'
returns
plop-02plop (when it should return this = "" nothing! Dang, you found nothing so be quiet!
For crying out loud !!)
I tried checking for a return value, but this failed too ! Gasp !!
$ echo plop-02-plop | sed -e 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/' ; echo $?
02
0
$ echo plop-02plop | sed -e 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/' ; echo $?
plop-02plop
0
$
This last one I cannot even believe. Is sed really the tool I should be using? I want to extract a needle from a haystack, and I want a needle or nothing..?

sed by default prints all lines.
What you want to do is
/patt/!d;s//repl/
In other words, delete the lines that do not match your pattern; for the lines that do match, the empty regex in s// reuses the last regex, so you only need to give the replacement, for instance a capturing group number. In your case it will be:
sed -e '/^.*\(.\)\([0-9][0-9]\)\1.*$/!d;s//\2/'
You can also use the -n option to suppress the automatic printing of every line. A line is then printed only when you ask for it explicitly, here with the p flag on the s command. In practice scripts using -n are usually longer and more cumbersome to maintain. Here it will be:
sed -ne 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/p'
There is also grep, but your example shows, why sed is sometimes better.
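As for the exit status: sed returns 0 whether or not a substitution happened, so a script has to check the output instead. A minimal sketch, with a made-up function name needle, that prints the match and returns non-zero when there is none:

```shell
#!/bin/sh
# needle: hypothetical helper; prints the two digits if they are enclosed by
# the same character on both sides, otherwise prints nothing and fails.
needle() {
    out=$(printf '%s\n' "$1" | sed -n 's/^.*\(.\)\([0-9][0-9]\)\1.*$/\2/p')
    [ -n "$out" ] && printf '%s\n' "$out"
}

found=$(needle 'plop-02-plop')

missing_status=0
missing=$(needle 'plop-02plop') || missing_status=$?

echo "found: '$found'"
echo "missing: '$missing' (status $missing_status)"
```

The caller can then write if value=$(needle "$input"); then ... fi and get either a needle or nothing.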

Perhaps you can use egrep -o?
input.txt:
blooody
aaaa
bbbb
odor
qqqq
E.g.
sehe@meerkat:/tmp$ egrep -o o+ input.txt
ooo
o
o
sehe@meerkat:/tmp$ egrep -no o+ input.txt
1:ooo
4:o
4:o
Of course egrep will have slightly different (better?) regex syntax for advanced constructs (back-references, non-greedy operators). I'll let you do the translation, if you like the approach.
