substring via sed - regex

I have two kinds of messages:
Board2Port1TS239.124.3.20:3000
Board4UserTagZDF_pippo_MFPService8011
If I receive message 1 (it contains Port), the output should be Board2Port1.
If I receive message 2 (it doesn't contain Port), the output should be Board4.
The Board and Port numbers are not fixed.
/bin/echo "Board2Port1TS239.124.3.20:3000" | /bin/sed -e '/Port/ s/???/???/ ; /Port/! s/???/???/'
I can't find a solution. Could anyone help me? Thanks.
Many thanks to Novocaine for the perfect solution.
I have another question directly related to the previous one.
In the shell the solution works:
[root@test3 snmptt]# /bin/echo 'Board2Port1TS239.124.3.21:3000' | /bin/sed -r 's/^(Board.(Port.)*).*/\1/g'
Board2Port1
Now I have to use this command inside an SNMPTT configuration, and there it doesn't work.
This is the snmptt.debug report:
Done performing substitution on PREEXEC line: /bin/echo 'Board2Port1TS239.124.3.21:3000' | /bin/sed -r 's/^(Board.(Port.)*).*/\1/g'
PREEXEC command: /bin/echo 'Board2Port1TS239.124.3.21:3000' | /bin/sed -r 's/^(Board.(Port.)*).*/\1/g'
command output: Board2Port1TS239.124.3.21:3000
The config file command is:
PREEXEC /bin/echo '$p2' | /bin/sed -r 's/^(Board.(Port.)*).*/\1/g'
The output "Board2Port1TS239.124.3.21:3000" is equal to the input ($p2). I don't understand why.
Thanks in advance

sed -r 's/^(Board.(Port.)*).*/\1/g' File
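The question says the Board and Port numbers are not fixed, and a single `.` matches exactly one character, so multi-digit numbers would be cut short. A minimal sketch of a variant, assuming the numbers are plain decimal digits (the helper name `board_prefix` is made up for illustration):

```shell
# Hypothetical helper: print the leading BoardN or BoardNPortM prefix.
# Assumes Board and Port are each followed by one or more decimal digits.
board_prefix() {
    printf '%s\n' "$1" | sed -E 's/^(Board[0-9]+(Port[0-9]+)?).*/\1/'
}

board_prefix 'Board2Port1TS239.124.3.20:3000'
board_prefix 'Board4UserTagZDF_pippo_MFPService8011'
board_prefix 'Board12Port34TS239.124.3.20:3000'
```

With the three sample inputs this prints Board2Port1, Board4 and Board12Port34. It relies on a non-digit following the Port number, as in the messages shown.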

Assuming you receive a single message each time:
sed '/:/{s/\([0-9]\)[^0-9].*:/\1/;q};s/\([0-9]\)[^0-9].*/\1/'
should work:
kent$ sed '/:/{s/\([0-9]\)[^0-9].*:/\1/;q};s/\([0-9]\)[^0-9].*/\1/' <<< "Board2Port1TS239.124.3.20:3000"
Board23000
kent$ sed '/:/{s/\([0-9]\)[^0-9].*:/\1/;q};s/\([0-9]\)[^0-9].*/\1/' <<< "Board4whatever3000"
Board4

Assuming that the input really looks like your example, and that Port and UserTag are fixed strings:
sed -r '/Port/{s/TS.*//;n};s/UserTag.*//'

Related

Hard regex with sed

In a script.sh file, I have the following line:
ExecStart=ssh -nN -R 46:192.168.0.1:56 192.168.0.2
I am trying to use sed to replace the second port (56 here), whose value can vary between 1 and 65535.
So I tried this, without success:
sed -i -e "s/:.*[[:space/]]/other port number/2g' script.sh
Could you help me solve my regex?
You may use:
sed -i "s/:[0-9]\{1,5\} /:other port number /" script.sh
$ other_port_number="123"
$ echo "ExecStart=ssh -nN -R 46:192.168.0.1:56 192.168.0.2" | sed "s/:[0-9]\{1,5\} /:$other_port_number /"
ExecStart=ssh -nN -R 46:192.168.0.1:123 192.168.0.2
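The `\{1,5\}` quantifier accepts any run of one to five digits, so it does not by itself enforce the 1 to 65535 range mentioned in the question. A small sketch that checks the new value before substituting; the function name `replace_port` is hypothetical, and it reads the line on standard input:

```shell
# Hypothetical helper: replace the ":<port> " that is followed by a space,
# after checking that the new port is an integer in 1..65535.
replace_port() {
    new_port=$1
    case $new_port in
        # Reject empty input, non-digits, and anything longer than 5 digits.
        ''|*[!0-9]*|??????*) echo "not a valid port: $new_port" >&2; return 1 ;;
    esac
    if [ "$new_port" -lt 1 ] || [ "$new_port" -gt 65535 ]; then
        echo "port out of range: $new_port" >&2
        return 1
    fi
    sed "s/:[0-9]\{1,5\} /:$new_port /"
}

echo 'ExecStart=ssh -nN -R 46:192.168.0.1:56 192.168.0.2' | replace_port 123
```

To edit the file in place, the same check could be kept and the final `sed` given `-i` and the file name instead of reading standard input.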

How to extract everything between two patterns (using sed?)?

I am running a curl command (plus a grep) and I want to extract everything between two patterns from the output.
Here's an example output from the curl (and grep):
Dload Upload Total Spent Left Speed
100 15848 0 15848 0 0 708k 0 --:--:-- --:--:-- --:--:-- 736k
</message><refDesc>PULL Task 8c4d1a50-3e05-4b58-8d1a-503e057b586d 4_Place_All_Users_In_Inactive</refDesc><refKey>8c4d1a50-3e05-4b58-8d1a-503e057b586d</refKey><status>SUCCESS</status></syncope21:exec><syncope21:exec xmlns:syncope21="http://syncope.apache.org/2.1"><end>2020-01-22T01:13:44.512Z</end><start>2020-01-22T01:13:44.506Z</start><jobType>TASK</jobType><key>40e64a39-47e7-4428-a64a-3947e7c4286b</key><message>Users [created/failures]: 0/0 [updated/failures]: 0/0 [deleted/failures]: 0/0 [no operation/ignored]: 0/0
and I want to extract everything between the </message> and the </start>, e.g., from the above, I want:
</message><refDesc>PULL Task 8c4d1a50-3e05-4b58-8d1a-503e057b586d 4_Place_All_Users_In_Inactive</refDesc><refKey>8c4d1a50-3e05-4b58-8d1a-503e057b586d</refKey><status>SUCCESS</status></syncope21:exec><syncope21:exec xmlns:syncope21="http://syncope.apache.org/2.1"><end>2020-01-22T01:13:44.512Z</end><start>2020-01-22T01:13:44.506Z</start>
I have tried following:
curl -X GET ...." | grep xxxxxxx | sed -n -e '/<\/message>/,/<\/start>/p'
but it doesn't seem to be working (it seems to return the entire output rather than extracting the part I want).
Can someone tell me how to do that?
Thanks!
Jim
awk solutions: if the data is in a file (Input_file), you could try the following:
awk 'match($0,/<\/message>.*<start>/){print substr($0,RSTART,RLENGTH)}' Input_file
Or, with curl, use it like this:
curl -X GET ...." | awk '{gsub(/\r/,"")} match($0,/<\/message>.*<start>/){print substr($0,RSTART,RLENGTH)}'
sed solutions: with GNU sed's -z option:
sed -z 's/.*\(<\/message>.*<start>\).*/\1\n/' Input_file
With curl + sed:
curl -X GET ...." | sed -z 's/\r//g;s/.*\(<\/message>.*<start>\).*/\1\n/'
With the sample shown, the output will be as follows:
</message><refDesc>PULL Task 8c4d1a50-3e05-4b58-8d1a-503e057b586d 4_Place_All_Users_In_Inactive</refDesc><refKey>8c4d1a50-3e05-4b58-8d1a-503e057b586d</refKey><status>SUCCESS</status></syncope21:exec><syncope21:exec xmlns:syncope21="http://syncope.apache.org/2.1"><end>2020-01-22T01:13:44.512Z</end><start>
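The patterns in this answer end at `<start>`, so the timestamp and the closing `</start>` tag are cut off. If the closing tag should be included, as the question asks, one possible variant is sketched here. The sample string is a shortened, made-up stand-in for the curl output, and `extract_block` is a hypothetical name:

```shell
# Shortened stand-in for the curl output (hypothetical sample data).
sample='noise</message><refKey>abc</refKey><start>2020-01-22T01:13:44.506Z</start><jobType>TASK</jobType>'

# Print from the first </message> through the last </start> on each line.
extract_block() {
    awk 'match($0, /<\/message>.*<\/start>/) { print substr($0, RSTART, RLENGTH) }'
}

printf '%s\n' "$sample" | extract_block
```

Because `.*` is greedy, a line with several `</start>` tags would be matched up to the last one.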
You could try the following with GNU sed:
sed -E -n 's#(^</message>.*</start>).*#\1#p'
so, basically, your command would look like:
curl -X GET ...." | grep xxxxxxx | sed -E -n 's#(^</message>.*</start>).*#\1#p'
This might work for you (GNU sed):
sed '/\n/!{s/<\/message>/\n&/;s/<\/start>/&\n/};/^<\/message>/P;D' file
If a line has not already been amended, insert a newline before </message> and after </start> and print only that part of the line.

extract a base directory from the output of ps

I am looking to extract a basedir from the output of ps -ef | grep classpath myprog.jar
root 20925 20886 1 17:41 pts/0 00:01:07 /opt/myprog/java/jre/bin -classpath myprog.jar
java is always a sub-dir under the basedir but the install path can vary from server to server e.g.
/usr/local/myprog/java/jre/bin
/opt/test/testing/myprog/java/jre/bin
So once I have my string, how do I extract everything before java, back to the beginning of the path?
That is, /usr/local/myprog or /opt/test/testing/myprog/
Using sed:
$ echo "root 20925 20886 1 17:41 pts/0 00:01:07 /opt/myprog/java/jre/bin -classpath myprog.jar" | sed 's/.*\ \(.*\)\/java.*/\1/'
/opt/myprog
Using grep -P:
ps -ef | grep -oP '\S+(?=/java)'
/opt/myprog
If your grep doesn't support -P then use:
s='root 20925 20886 1 17:41 pts/0 00:01:07 /opt/myprog/java/jre/bin -classpath myprog.jar'
[[ "$s" =~ (/[^[:blank:]]+)/java ]] && echo "${BASH_REMATCH[1]}"
/opt/myprog
Using awk:
echo "root 20925 20886 1 17:41 pts/0 00:01:07 /opt/myprog/java/jre/bin -classpath myprog.jar" | awk '{split($8,a,"/java"); print a[1]}'
Use pgrep to find all of the Java processes instead of using ps -ef | grep .... This way, you don't have to worry about your grep command showing up as one of your items.
Instead of running ps -ef, you can use the -o option to only pull up the desired fields, and most ps commands take --no-header to eliminate the header fields. This way, your script doesn't have to worry about header lines.
Finally, I am using Shell Parameter Expansion which is sometimes way easier than using sed to change a variable:
$ ps -o pid,args --no-headers $(pgrep -f "java .* myprog.jar") | while read -r pid command arguments
do
    directory=${command%/java*}
    echo "The directory for Process ID $pid is $directory"
done
By the way, you could be running multiple commands, so I loop through the ps command.
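The parameter expansion on its own can be tried without any running process. A minimal sketch using the paths from the question (the function name `basedir_of` is made up for illustration):

```shell
# Hypothetical helper: strip "/java" and everything after it from a path.
# "%" removes the shortest matching suffix, so the cut happens at the
# last "/java" in the string; a path without "/java" is printed unchanged.
basedir_of() {
    printf '%s\n' "${1%/java*}"
}

basedir_of /opt/myprog/java/jre/bin
basedir_of /opt/test/testing/myprog/java/jre/bin
```

This prints /opt/myprog and /opt/test/testing/myprog.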
ps axo args | awk '/classpath myprog.jar/{print substr($0, 1, index($0, "java")-1)}'
For example (awk strings are indexed from 1, so the substring starts at position 1):
$ echo '/opt/myprog/java/jre/bin -classpath myprog.jar' \
| awk '/classpath myprog.jar/{print substr($0, 1, index($0, "java")-1)}'
/opt/myprog/
You can (and probably should) switch both of the $0's to $1's if you know for sure that your path will not contain spaces. Or add additional fields to the ps -o list using commas (as in, o pid,args) and use $2 rather than $1.
You can match the following regex:
'((\/\w+)+)\/java'
and the first captured group \1 or $1 will contain the wanted string
Demo: http://regex101.com/r/zU2vV4

Transform mysql 'INSERT' statement into a CSV line

I need to convert mysql dump file to CSV format before importing to a data warehouse server.
INSERT INTO `temp` VALUES (30686631,1346959848246,1346959850865,1346959998054,'18663196147','18663196147','18668839208','17326812123',3372579,'1866319614700','A',1,'','',0,147,30686632,'KeyAd','1101','38.325.Monitor2.1101#10.40.10.170','10.40.10.40',5060,'10.40.10.46',5060,'100038455383251101_Monitor2#10.40.10.170','<sip:+18668839208#10.40.10.46:5060>;tag=sansay507370834rdb810','\"O\'HALLORAE,AEAN\" <sip:+17326812123#10.40.10.40;isup-oli=00>;tag=sansay507370829rdb1779','200',0,'',0,NULL,'','',3398812,NULL,NULL);
I'm using this command to remove mysql insert statement
sed -e 's/^INSERT INTO `temp` VALUES (//' -e 's/);$//' -e 's/(//;s/);//;s/,/|/g;s|["'\'']||g'
There seems to be an issue with names that contain backslash-escaped quotes, and I can't figure out how to fix it.
From the MySQL insert:
'\"O\'HALLORAE,AEAN\"
I can't figure out how to turn that into:
"O'HALLORAN,SEAN"
Desired output:
30686631|1346959848246|1346959850865|1346959998054|18663196147|18663196147|18668839208|17326812123|3372579|1866319614700|A|1|||0|147|30686632|KeyAd|1101|38.325.Monitor2.1101#10.40.10.170|10.40.10.40|5060|10.40.10.46|5060|100038455383251101_Monitor2#10.40.10.170|<sip:+18668839208#10.40.10.46:5060>;tag=sansay507370834rdb810| "O'HALLORAN,SEAN" <sip:+17326812123#10.40.10.40;isup-oli=00>;tag=sansay507370829rdb1779|200|0||0|NULL|||3398812|NULL|NULL
Try this:
$ sed -e 's/INSERT INTO `temp` VALUES (//' -e 's/);$//' -re 's/("[^"]*),([^"]*")/\1\x1\2/g;s/,/|/g;s/\x1/,/g;s/\\([^\])/\1/g' file | sed "s/'|/|/g;s/|'/|/g"
Output:
30686631|1346959848246|1346959850865|1346959998054|18663196147|18663196147|18668839208|17326812123|3372579|1866319614700|A|1|||0|147|30686632|KeyAd|1101|38.325.Monitor2.1101#10.40.10.170|10.40.10.40|5060|10.40.10.46|5060|100038455383251101_Monitor2#10.40.10.170|<sip:+18668839208#10.40.10.46:5060>;tag=sansay507370834rdb810|"O'HALLORAN,SEAN" <sip:+17326812123#10.40.10.40;isup-oli=00>;tag=sansay507370829rdb1779|200|0||0|NULL|||3398812|NULL|NULL
If ruby is an acceptable dependency for you, you can leverage its parser if you can transform the statement into a valid ruby array:
script.sh:
#!/bin/bash
# -r to preserve backslashes
read -r statement
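# Quote the variables so whitespace and glob characters survive unchanged
ruby=$(printf '%s' "$statement" | sed -e 's/^.*VALUES //' -e 's/;$//' -e 's/^(/[/' -e 's/)$/]/' -e 's/NULL/"NULL"/g' -e 's/\\"/"/g')
# col_sep is the keyword option for the field separator in Ruby 1.9 and later
printf '%s\n' "$ruby" | ruby -rcsv -e 'puts CSV.generate_line(eval($stdin.read), col_sep: "|")'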
Usage:
chmod +x script.sh
echo <your statement> | ./script.sh
30686631|1346959848246|1346959850865|1346959998054|18663196147|18663196147|18668839208|17326812123|3372579|1866319614700|A|1|""|""|0|147|30686632|KeyAd|1101|38.325.Monitor2.1101#10.40.10.170|10.40.10.40|5060|10.40.10.46|5060|100038455383251101_Monitor2#10.40.10.170|<sip:+18668839208#10.40.10.46:5060>;tag=sansay507370834rdb810|"""O'HALLORAE,AEAN"" <sip:+17326812123#10.40.10.40;isup-oli=00>;tag=sansay507370829rdb1779"|200|0|""|0|NULL|""|""|3398812|NULL|NULL
This loads as expected in OpenOffice (after setting the delimiter to "|").

sed regular expressions address ranges

I have a txt file that looks something like this
-----------------------------------
RUNNING PROCESSES
-----------------------------------
ftpd
kswapd
init
etc..
---------------------------------
HOSTNAME
--------------------------------
mypc.local.com
With sed I want to get just one section of this file, the RUNNING PROCESSES section, but I can't get my regexp right.
I got this far:
sed -n '/^-./,/RUNNING PROCESSES/, /[[:space::]]/p' linux.txt | more
However, it keeps complaining:
-e expression #1, char 26: unknown command `,'
Can anybody help?
Did you mean:
sed -n '/RUNNING PROCESSES/,/HOSTNAME/p' linux.txt |
sed -e '/^[- ]/d' -e '/^$/d'
I would probably prefer to use awk for that:
awk '/RUNNING PROCESSES/ {s=2}
/^---/ {s=s-1}
{if(s>0){print}}' linux.txt
That awk will give you:
RUNNING PROCESSES
-----------------------------------
ftpd
kswapd
init
etc..
You can then pipe that through sed '/^$/d' to filter out the blank lines.
Here is another variant of the accepted answer that does not need a second sed process:
sed -n '/RUNNING PROCESSES/,/HOSTNAME/{s/RUNN.*\|HOSTNAME//;s/--*//;/^$/!p}' file
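A sed range address takes exactly two addresses, start and end, and both boundary lines are included in the range. A sketch that prints only the process names by deleting the headings and dashed rules inside the range; the temporary file and the function name `running_processes` are assumptions made for the demonstration:

```shell
# Recreate a small version of the sample file in a temporary location.
sample_file=$(mktemp)
cat > "$sample_file" <<'EOF'
-----------------------------------
RUNNING PROCESSES
-----------------------------------
ftpd
kswapd
init
---------------------------------
HOSTNAME
--------------------------------
mypc.local.com
EOF

# Within the range, drop both headings, the dashed rules and blank lines,
# then print whatever is left.
running_processes() {
    sed -n '/RUNNING PROCESSES/,/HOSTNAME/{
        /RUNNING PROCESSES/d
        /HOSTNAME/d
        /^-/d
        /^$/d
        p
    }' "$1"
}

running_processes "$sample_file"
rm -f "$sample_file"
```

On the sample this prints ftpd, kswapd and init, one per line.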