How to extract everything between two patterns (using sed?)?

I am running a curl command (plus a grep) and I want to extract everything between two patterns from the output.
Here's an example output from the curl (and grep):
Dload Upload Total Spent Left Speed
100 15848 0 15848 0 0 708k 0 --:--:-- --:--:-- --:--:-- 736k
</message><refDesc>PULL Task 8c4d1a50-3e05-4b58-8d1a-503e057b586d 4_Place_All_Users_In_Inactive</refDesc><refKey>8c4d1a50-3e05-4b58-8d1a-503e057b586d</refKey><status>SUCCESS</status></syncope21:exec><syncope21:exec xmlns:syncope21="http://syncope.apache.org/2.1"><end>2020-01-22T01:13:44.512Z</end><start>2020-01-22T01:13:44.506Z</start><jobType>TASK</jobType><key>40e64a39-47e7-4428-a64a-3947e7c4286b</key><message>Users [created/failures]: 0/0 [updated/failures]: 0/0 [deleted/failures]: 0/0 [no operation/ignored]: 0/0
and I want to extract everything between the </message> and the </start>, e.g., from the above, I want:
</message><refDesc>PULL Task 8c4d1a50-3e05-4b58-8d1a-503e057b586d 4_Place_All_Users_In_Inactive</refDesc><refKey>8c4d1a50-3e05-4b58-8d1a-503e057b586d</refKey><status>SUCCESS</status></syncope21:exec><syncope21:exec xmlns:syncope21="http://syncope.apache.org/2.1"><end>2020-01-22T01:13:44.512Z</end><start>2020-01-22T01:13:44.506Z</start>
I have tried following:
curl -X GET ...." | grep xxxxxxx | sed -n -e '/<\/message>/,/<\/start>/p'
but it doesn't seem to be working (it seems to be returning the entire output, rather than extracting the part I want).
Can someone tell me how to do that?
Thanks!
Jim

awk solutions: Could you please try the following, if the data is present in Input_file.
awk 'match($0,/<\/message>.*<\/start>/){print substr($0,RSTART,RLENGTH)}' Input_file
Or, with curl, use it like this:
curl -X GET ...." | awk '{gsub(/\r/,"")} match($0,/<\/message>.*<\/start>/){print substr($0,RSTART,RLENGTH)}'
sed solutions: alternatively, with GNU sed's -z option:
sed -z 's/.*\(<\/message>.*<\/start>\).*/\1\n/' Input_file
With curl + sed:
curl -X GET ...." | sed -z 's/\r//g;s/.*\(<\/message>.*<\/start>\).*/\1\n/'
With the shown sample, the output will be as follows.
</message><refDesc>PULL Task 8c4d1a50-3e05-4b58-8d1a-503e057b586d 4_Place_All_Users_In_Inactive</refDesc><refKey>8c4d1a50-3e05-4b58-8d1a-503e057b586d</refKey><status>SUCCESS</status></syncope21:exec><syncope21:exec xmlns:syncope21="http://syncope.apache.org/2.1"><end>2020-01-22T01:13:44.512Z</end><start>2020-01-22T01:13:44.506Z</start>

Please try the following with GNU sed:
sed -E -n 's#(^</message>.*</start>).*#\1#p'
So, basically, your command would look like:
curl -X GET ...." | grep xxxxxxx | sed -E -n 's#(^</message>.*</start>).*#\1#p'
Note that the ^ anchor requires the line to begin with </message>, as it does in the sample.

This might work for you (GNU sed):
sed '/\n/!{s/<\/message>/\n&/;s/<\/start>/&\n/};/^<\/message>/P;D' file
If a line has not already been amended, insert a newline before </message> and after </start> and print only that part of the line.
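As a side note on why the original attempt prints everything: a sed range address such as /<\/message>/,/<\/start>/ selects whole lines, and here both patterns sit on the same long line. Below is a minimal sketch of the difference, assuming a grep that supports -o (GNU and BSD grep do). The value of `line` is a shortened, made-up stand-in for the real response, not actual server output.

```shell
# Hypothetical, shortened stand-in for the curl output (one long line).
line='<x>junk</x></message><status>SUCCESS</status><start>2020-01-22T01:13:44.506Z</start><jobType>TASK</jobType>'

# A range address selects whole lines, so the entire line comes back.
whole=$(printf '%s\n' "$line" | sed -n '/<\/message>/,/<\/start>/p')

# grep -o prints only the part of the line that matched the regex.
part=$(printf '%s\n' "$line" | grep -o '</message>.*</start>')

printf '%s\n' "$part"
```

Because `.*` is greedy, this keeps everything from the first </message> to the last </start> on the line, which is the same behaviour as the awk and sed answers above.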

Related

Grep first line which contain a date

I'm trying to fetch the first line in a log file that contains a date.
Here is an example of the log file:
SOME
LOG
2021-1-1 21:50:19.0|LOG|DESC1
2021-1-4 21:50:19.0|LOG|DESC2
2021-1-5 21:50:19.0|LOG|DESC3
2021-1-5 21:50:19.0|LOG|DESC4
In this context I need to get the following line:
2021-1-1 21:50:19.0|LOG|DESC1
Another log file example:
SOME
LOG
21-1-3 21:50:19.0|LOG|DESC1
21-1-3 21:50:19.0|LOG|DESC2
21-1-4 21:50:19.0|LOG|DESC3
21-1-5 21:50:19.0|LOG|DESC4
I need to fetch :
21-1-3 21:50:19.0|LOG|DESC1
So far I have tried the following commands:
cat /path/to/file | grep "$(date +"%Y-%m-%d")" | tail -1
cat /path/to/file | grep "$(date +"%-Y-%-m-%-d")" | tail -1
cat /path/to/file | grep -E "[0-9]+-[0-9]+-[0-9]" | tail -1

If you are OK with awk, could you please try the following. It prints the first line matching the regex and then exits, which is faster since it does NOT read the whole Input_file.
awk '
/^[0-9]{2}([0-9]{2})?-[0-9]{1,2}-[0-9]{1,2} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]+/{
print
exit
}' Input_file

Using sed, being not too concerned about exactly how many digits are present:
sed -En '/^[0-9]+-[0-9]+-[0-9]+ [0-9]+:[0-9]+:[0-9]+[.][0-9]+[|]/ {p; q}' file

$ grep -m1 '^[0-9]' file1
2021-1-1 21:50:19.0|LOG|DESC1
$ grep -m1 '^[0-9]' file2
21-1-3 21:50:19.0|LOG|DESC1
If that's not all you need then edit your question to provide more truly representative sample input/output.

A simple grep with -m 1 (to exit after finding first match):
grep -m1 -E '^([0-9]+-){2}[0-9]+ ([0-9]{2}:){2}[0-9]+\.[0-9]+' file1
2021-1-1 21:50:19.0|LOG|DESC1
grep -m1 -E '^([0-9]+-){2}[0-9]+ ([0-9]{2}:){2}[0-9]+\.[0-9]+' file2
21-1-3 21:50:19.0|LOG|DESC1

This sed works with either GNU or POSIX sed:
sed -nE '/^[[:digit:]]{2,4}-[[:digit:]]{1,2}-[[:digit:]]{1,2}/{p;q;}' file
But awk, with the same ERE, is probably better:
awk '/^[[:digit:]]{2,4}-[[:digit:]]{1,2}-[[:digit:]]{1,2}/{print; exit}' file
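One more observation: the attempts in the question end with tail -1, which returns the last matching line rather than the first. A small sketch of the same idea with head -n 1 follows; the function name `first_dated_line` and the temporary file are made up for illustration, and the regex assumes the date is followed by a space as in the samples.

```shell
# Print the first line that starts with a date such as 2021-1-1 or 21-1-3.
first_dated_line() {
  grep -E '^[0-9]{2,4}-[0-9]{1,2}-[0-9]{1,2} ' "$1" | head -n 1
}

# Build a throwaway copy of the first sample log.
log=$(mktemp)
printf '%s\n' 'SOME' 'LOG' \
  '2021-1-1 21:50:19.0|LOG|DESC1' \
  '2021-1-4 21:50:19.0|LOG|DESC2' > "$log"

first=$(first_dated_line "$log")
printf '%s\n' "$first"
rm -f "$log"
```

On large files, grep -m1 or the awk version with exit avoids reading the rest of the file.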

Grep next word after pattern match

I'm trying to use grep/sed to get the following output: "name":"test_backup_1" from the response below:
{"backups":[{"name":"test_backup_1","status":"CORRUPTED","creationTime":"2019-11-08T15:03:49.460","id":"test_backup_1"}]}
I have been trying variations of the following grep -Eo 'name:"\w+\"' but no joy.
I'm not sure if it would be easier to achieve this using grep or sed?
The way I am running this is curling a response from the server and saving it to a local variable, then echo out the variable and pipe grep/sed
example of what I am running
echo ${view_backup} | grep -Eo '"name":"\w+\"'

Referencing @sundeep's answer,
grep -Eo '"name":"[^"]+"'
resulted in the expected output
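If only the value is needed rather than the whole "name":"..." pair, the match can be split on the double quotes. This is a sketch under the assumption that the value itself contains no escaped quotes; the variable names `pair` and `value` are only for illustration.

```shell
# Sample response from the question, stored the same way as in the question.
view_backup='{"backups":[{"name":"test_backup_1","status":"CORRUPTED","creationTime":"2019-11-08T15:03:49.460","id":"test_backup_1"}]}'

# Step 1: keep only the "name":"..." pair.
pair=$(printf '%s\n' "$view_backup" | grep -Eo '"name":"[^"]+"')

# Step 2: split on double quotes; the value is the fourth field.
value=$(printf '%s\n' "$pair" | cut -d'"' -f4)

printf '%s\n' "$pair" "$value"
```

For anything beyond a quick one-off, a real JSON parser is the more robust choice, since regexes do not understand nesting or escaping.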

Make sure to transform the file to one line before grep
and pipe from your curl
echo `curl --silent https://someurl | tr -d '\n' | grep -oP "(?<=name\":\")[^\"]+"`
will return
test_backup_1
If you want more variables, you can chain alternatives in the -oP grep, as in this example where I get some data on a Danish license plate (bt41932):
curl --silent https://www.tjekbil.dk/api/v2/nummerplade/bt41932 | grep -oP -m 1 "(?<=\"RegNr\":\")[^\"]+|(?<=\"MaerkeTypeNavn\":\")[^\"]+|(?<=\"MaksimumHastighed\":)[^,]+"| tr '\n' ' '
returns
BT41932 SKODA 218

Get the following character which match a string

I'm trying to retrieve a specific piece of data returned from a command line. Here is my command line:
snmpwalk -v2c -c community localhost 1.3.6.1.2.1.2 | grep tun0
Which gives me this result:
IF-MIB::ifDescr.4 = STRING: tun0
From this result I want to retrieve 4. I thought of using a regex, but maybe there is an easier way to fetch it.
Regex I tried :
\ifDescr.\s+\K\S+ https://regex101.com/r/9X04MD/1
[\n\r].*ifDescr.\s*([^\n\r]*) https://regex101.com/r/9X04MD/2
I would like to fetch it in a single command line like
snmpwalk -v2c -c community localhost 1.3.6.1.2.1.2 | grep tun0 | ?

There are so many options that don't involve using GNU grep's experimental -P option. For example given just your sample input to work off, here's one way with any sed:
$ echo "$out" | sed 's/.*\.\([0-9]\).*tun0/\1/'
4
or any awk:
$ echo "$out" | awk -F'[. ]' '/tun0/{print $2}'
4
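The sed above captures a single digit, which is all the sample needs. If the interface index can have several digits, a variant along these lines may help. It is only a sketch: the function name `if_index` is made up, and it assumes the interface name contains no regex metacharacters.

```shell
# Print the numeric index of the ifDescr row whose value is the given interface.
# Reads snmpwalk-style lines on stdin; $1 is the interface name.
if_index() {
  sed -n "s/.*ifDescr\.\([0-9][0-9]*\) = STRING: $1\$/\1/p"
}

out='IF-MIB::ifDescr.4 = STRING: tun0'
idx=$(printf '%s\n' "$out" | if_index tun0)
printf '%s\n' "$idx"
```

Since sed -n with the p flag prints only lines where the substitution succeeded, this would also replace the separate grep tun0 step in the pipeline.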

I'd recommend pattern (?<=ifDescr\.)[^ =]+
Explanation:
(?<=ifDescr\.) - positive lookbehind; asserts that what precedes is ifDescr.
[^ =]+ match one or more characters other than space or equal sign =

substring via sed

I have 2 kinds of messages:
Board2Port1TS239.124.3.20:3000
Board4UserTagZDF_pippo_MFPService8011
If I receive the message 1 (it contains Port) the output should be Board2Port1
If I receive the message 2 (it doesn't contain Port) the output should be Board4
The numbers of Board and Port are not fixed.
/bin/echo "Board2Port1TS239.124.3.20:3000" | /bin/sed -e '/Port/ s/???/???/ ; /Port/! s/???/???/'
I can't find a solution... could anyone help me? Thanks.
Many thanks to Novocaine for the perfect solution.
I have another question directly related to the previous one.
From the shell, the solution works:
[root@test3 snmptt]# /bin/echo 'Board2Port1TS239.124.3.21:3000' | /bin/sed -r 's/^(Board.(Port.)*).*/\1/g'
Board2Port1
Now I have to use this command inside a SNMPTT configuration. It doesn't work.
This is the snmptt.debug report
Done performing substitution on PREEXEC line: /bin/echo 'Board2Port1TS239.124.3.21:3000' | /bin/sed -r 's/^(Board.(Port.)*).*/\1/g'
PREEXEC command: /bin/echo 'Board2Port1TS239.124.3.21:3000' | /bin/sed -r 's/^(Board.(Port.)*).*/\1/g'
command output: Board2Port1TS239.124.3.21:3000
The config file command is:
PREEXEC /bin/echo '$p2' | /bin/sed -r 's/^(Board.(Port.)*).*/\1/g'
The output "Board2Port1TS239.124.3.21:3000" is equal to the input ($p2). I don't understand why.
Thanks in advance

sed -r 's/^(Board.(Port.)*).*/\1/g' File

Assuming you receive a single message each time:
sed '/:/{s/\([0-9]\)[^0-9].*:/\1/;q};s/\([0-9]\)[^0-9].*/\1/'
should work:
kent$ sed '/:/{s/\([0-9]\)[^0-9].*:/\1/;q};s/\([0-9]\)[^0-9].*/\1/' <<< "Board2Port1TS239.124.3.20:3000"
Board23000
kent$ sed '/:/{s/\([0-9]\)[^0-9].*:/\1/;q};s/\([0-9]\)[^0-9].*/\1/' <<< "Board4whatever3000"
Board4

Assuming that the input really looks like your example, and that Port and UserTag are fixed strings:
sed -r '/Port/{s/TS.*//;n};s/UserTag.*//'
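Since the question says the Board and Port numbers are not fixed, here is a hedged variant of the first answer that allows multi-digit numbers by matching [0-9]+ instead of a single character. The function name `board_prefix` is hypothetical, and the sketch assumes a sed that accepts -E (GNU and BSD sed both do).

```shell
# Keep "Board<N>" plus an optional "Port<M>" and drop the rest of the line.
board_prefix() {
  sed -E 's/^(Board[0-9]+(Port[0-9]+)?).*/\1/'
}

a=$(printf '%s\n' 'Board2Port1TS239.124.3.20:3000' | board_prefix)
b=$(printf '%s\n' 'Board4UserTagZDF_pippo_MFPService8011' | board_prefix)
c=$(printf '%s\n' 'Board12Port34TS239.124.3.20:3000' | board_prefix)

printf '%s\n' "$a" "$b" "$c"
```

The third input is an invented example to show the multi-digit case; it does not come from the question.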

sed regular expressions address ranges

I have a txt file that looks something like this
-----------------------------------
RUNNING PROCESSES
-----------------------------------
ftpd
kswapd
init
etc..
---------------------------------
HOSTNAME
--------------------------------
mypc.local.com
With sed I want to just get one section of this file. So just the RUNNING PROCESSES section, however I seem to be failing to get my regexp right to do so.
I got this far
sed -n '/^-./,/RUNNING PROCESSES/, /[[:space::]]/p' linux.txt | more
however it keeps complaining about
-e expression #1, char 26: unknown command: `,'
Can anybody help??

Did you mean:
sed -n '/RUNNING PROCESSES/,/HOSTNAME/p' linux.txt |
sed -e '/^[- ]/d' -e '/^$/d'

I would probably prefer to use awk for that:
awk '/RUNNING PROCESSES/ {s=2}
/^---/ {s=s-1}
{if(s>0){print}}' linux.txt
That awk will give you:
RUNNING PROCESSES
-----------------------------------
ftpd
kswapd
init
etc..
You can then pipe that through sed '/^$/d' to filter out the blank lines.

Here is another variant of the accepted answer, without the extra call to another sed process:
sed -n '/RUNNING PROCESSES/,/HOSTNAME/{s/RUNN.*\|HOSTNAME//;s/--*//;/^$/!p}' file
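For comparison, a single awk can print just the body of one section without naming the section that follows it. This is a sketch that assumes every section title sits between two ruler lines made only of dashes, as in the sample; the function name `section_body` and the temporary file are made up for illustration.

```shell
# Print the non-blank body lines of the section whose title is $1 in file $2.
section_body() {
  awk -v title="$1" '
    $0 == title { state = 1; next }     # title found, wait for its closing ruler
    /^-+$/ {
      if (state == 1) state = 2         # ruler right after the title: body starts
      else if (state == 2) exit         # next ruler: body is over
      next
    }
    state == 2 && NF { print }          # inside the body, skip blank lines
  ' "$2"
}

# Throwaway copy of the sample file from the question.
f=$(mktemp)
printf '%s\n' '-----------------------------------' 'RUNNING PROCESSES' \
  '-----------------------------------' 'ftpd' 'kswapd' 'init' 'etc..' \
  '---------------------------------' 'HOSTNAME' \
  '--------------------------------' 'mypc.local.com' > "$f"

procs=$(section_body 'RUNNING PROCESSES' "$f")
host=$(section_body 'HOSTNAME' "$f")
printf '%s\n' "$procs"
rm -f "$f"
```

The same function works for the last section too, because the body simply runs to the end of the file when no further ruler appears.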
