Parsing the highstate output of Salt has proven to be difficult. I don't want to change the output to JSON because I still want it to be human-readable.
What's the best way to convert the Summary into something machine readable?
Summary for app1.domain.com
--------------
Succeeded: 278 (unchanged=12, changed=6)
Failed: 0
--------------
Total states run: 278
Total run time: 7.383 s
--
Summary for app2.domain.com
--------------
Succeeded: 278 (unchanged=12, changed=6)
Failed: 0
--------------
Total states run: 278
Total run time: 7.448 s
--
Summary for app0.domain.com
--------------
Succeeded: 293 (unchanged=13, changed=6)
Failed: 0
--------------
Total states run: 293
Total run time: 7.510 s
For lack of a better idea, I'm using grep and awk to extract the values and write them to a CSV file.
These two work:
cat ${_FILE} | grep Summary | awk '{ print $3} ' | \
tr '\n' ',' | sed '$s/,$/\n/' >> /tmp/highstate.csv;
cat ${_FILE} | grep -oP '(?<=unchanged=)[0-9]+' | \
tr '\n' ',' | sed '$s/,$/\n/' >> /tmp/highstate.csv;
But this one fails, although the same pattern works in Reger:
cat ${_FILE} | grep -oP '(?<=\schanged=)[0-9]+' | \
tr '\n' ',' | sed '$s/,$/\n/' >> /tmp/highstate.csv;
EDIT1: @vintnes @ikegami I agree, I'd much rather take the JSON output and parse that, but Salt doesn't offer a summary of changes when outputting to JSON. So far this is what I have, and while very ugly, it's working.
cat ${_FILE} | grep Summary | awk '{ print $3} ' | \
tr '\n' ',' | sed '$s/,$/\n/' >> /tmp/highstate_tmp.csv;
cat ${_FILE} | grep -oP '(?<=unchanged=)[0-9]+' | \
tr '\n' ',' | sed '$s/,$/\n/' >> /tmp/highstate_tmp.csv;
cat ${_FILE} | grep unchanged | awk -F' ' '{ print $4}' | \
grep -oP '(?<=changed=)[0-9]+' | tr '\n' ',' | sed '$s/,$/\n/' >> /tmp/highstate_tmp.csv;
cat ${_FILE} | { grep "Warning" || true; } | awk -F: '{print $2+0} END { if (!NR) print "null" }' | \
tr '\n' ',' | sed '$s/,$/\n/' >> /tmp/highstate_tmp.csv;
cat ${_FILE} | { grep "Failed" || true; } | awk -F: '{print $2+0} END { if (!NR) print "null" }' | \
tr '\n' ',' | sed '$s/,$/\n/' >> /tmp/highstate_tmp.csv;
csvtool transpose /tmp/highstate_tmp.csv > /tmp/highstate.csv;
sed -i '1 i\instance,unchanged,changed,warning,failed' /tmp/highstate.csv;
Output:
instance,unchanged,changed,warning,failed
app1.domain.com,12,6,,0
app0.domain.com,13,6,,0
app2.domain.com,12,6,,0
Here you go. This will also work if your output contains warnings. Please note that the output is in a different order than you specified; it's the order in which each record occurs in the file. Don't hesitate with any questions.
$ awk -v OFS=, '
BEGIN { print "instance,unchanged,changed,warning,failed" }
/^Summary/ { instance=$NF }
/^Succeeded/ { split($3 $4 $5, S, /[^0-9]+/) }
/^Failed/ { print instance, S[2], S[3], S[4], $2 }
' "$_FILE"
split($3 $4 $5, S, /[^0-9]+/) handles the possibility of warnings by disregarding the first two "words" Succeeded: ### and using any number of non-digits as a separator.
edit: Printed on /^Fail/ instead of using /^Summ/ and END.
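For comparison, here is a minimal sketch that picks up each counter by name instead of by its position in the split() result. It assumes that an optional line starting with "Warning" carries the warning count in its second field (the sample output contains no such line, so that format is a guess), and the function name summary_to_csv is made up for illustration.

```shell
# Sketch: Salt highstate summaries to CSV with POSIX awk.
# Assumption: an optional "Warning..." line holds the count in field 2.
summary_to_csv() {
    awk -v OFS=, '
        BEGIN { print "instance,unchanged,changed,warning,failed" }
        /^Summary for / { instance = $NF; unchanged = changed = warning = "" }
        /^Succeeded:/ {
            for (i = 3; i <= NF; i++) {
                n = $i
                gsub(/[^0-9]/, "", n)              # keep only the digits of "(unchanged=12," etc.
                if ($i ~ /unchanged=/)    unchanged = n
                else if ($i ~ /changed=/) changed = n
            }
        }
        /^Warning/ { warning = $2 + 0 }
        /^Failed:/ { print instance, unchanged, changed, warning, $2 + 0 }
    ' "$@"
}

# Usage: reads the named files, or stdin when no file is given.
summary_to_csv <<'EOF'
Summary for app1.domain.com
--------------
Succeeded: 278 (unchanged=12, changed=6)
Failed:      0
--------------
Total states run:     278
Total run time:     7.383 s
EOF
# prints the header line followed by: app1.domain.com,12,6,,0
```

As in the answer above, the row is printed when the Failed line is reached, so records come out in file order.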
perl -e'
use strict;
use warnings qw( all );
use Text::CSV_XS qw( );
my $csv = Text::CSV_XS->new({ auto_diag => 2, binary => 1 });
$csv->say(select(), [qw( instance unchanged changed warning failed )]);
my ( $instance, $unchanged, $changed, $warning, $failed );
while (<>) {
if (/^Summary for (\S+)/) {
( $instance, $unchanged, $changed, $warning, $failed ) = $1;
}
elsif (/^Succeeded:\s+\d+ \(unchanged=(\d+), changed=(\d+)\)/) {
( $unchanged, $changed ) = ( $1, $2 );
}
elsif (/^Warning:\s+(\d+)/) {
$warning = $1;
}
elsif (/^Failed:\s+(\d+)/) {
$failed = $1;
$csv->say(select(), [ $instance, $unchanged, $changed, $warning, $failed ]);
}
}
'
Provide input via STDIN, or provide path to file(s) from which to read as arguments.
Terse version:
perl -MText::CSV_XS -ne'
BEGIN {
$csv = Text::CSV_XS->new({ auto_diag => 2, binary => 1 });
$csv->say(select(), [qw( instance unchanged changed warning failed )]);
}
/^Summary for (\S+)/ and @row=$1;
/^Succeeded:\s+\d+ \(unchanged=(\d+), changed=(\d+)\)/ and @row[1,2]=($1,$2);
/^Warning:\s+(\d+)/ and $row[3]=$1;
/^Failed:\s+(\d+)/ and ($row[4]=$1), $csv->say(select(), \@row);
'
Improving on the answer from @vintnes, this version produces tab-separated or comma-separated output.
The awk script reads each value by the position of its line within the record and prints each record as soon as it has been read.
script.awk
BEGIN {print("computer","succeeded","unchanged","changed","failed","states run","run time");}
FNR%8 == 1 {arr[1] = $3}
FNR%8 == 3 {arr[2] = $2; arr[3] = extractNum($3); arr[4] = extractNum($4)}
FNR%8 == 4 {arr[5] = $2;}
FNR%8 == 6 {arr[6] = $4;}
FNR%8 == 7 {arr[7] = $4; print arr[1],arr[2],arr[3],arr[4],arr[5],arr[6],arr[7];}
function extractNum(str){match(str,/[[:digit:]]+/,m);return m[0];}
run script
Tab separated CSV output
awk -v OFS="\t" -f script.awk input-1.txt input-2.txt ...
Comma separated CSV output
awk -v OFS="," -f script.awk input-1.txt input-2.txt ...
Output
computer succeeded unchanged changed failed states run run time
app1.domain.com 278 12 6 0 278 7.383
app2.domain.com 278 12 6 0 278 7.448
app0.domain.com 293 13 6 0 293 7.510
computer,succeeded,unchanged,changed,failed,states run,run time
app1.domain.com,278,12,6,0,278,7.383
app2.domain.com,278,12,6,0,278,7.448
app0.domain.com,293,13,6,0,293,7.510
Explanation
BEGIN {print("computer","succeeded","unchanged","changed","failed","states run","run time");}
Print the CSV heading line.
FNR%8 == 1 {arr[1] = $3}
On the 1st line of each 8-line record (the "Summary for" line), store the 3rd field, the host name, in arr[1].
FNR%8 == 3 {arr[2] = $2; arr[3] = extractNum($3); arr[4] = extractNum($4)}
On the 3rd line (Succeeded), store the 2nd field in arr[2] and the numbers found in the 3rd and 4th fields in arr[3] and arr[4].
FNR%8 == 4 {arr[5] = $2;}
On the 4th line (Failed), store the 2nd field in arr[5].
FNR%8 == 6 {arr[6] = $4;}
On the 6th line (Total states run), store the 4th field in arr[6].
FNR%8 == 7 {arr[7] = $4;
On the 7th line (Total run time), store the 4th field in arr[7].
print arr[1],arr[2],arr[3],arr[4],arr[5],arr[6],arr[7];}
Print the collected values once the 7th line of the record has been read. The 8th line is the -- separator and is ignored.
function extractNum(str){match(str,/[[:digit:]]+/,m);return m[0];}
Utility function that extracts the first number from a text field.
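One portability note on extractNum: the three-argument form of match() is a GNU awk extension. If the script has to run under another awk, a sketch along these lines should behave the same, since the two-argument match() sets RSTART and RLENGTH in any POSIX awk. The shell wrapper extract_num exists only to make the example runnable on its own.

```shell
# Sketch: extractNum without the gawk-only third argument to match().
extract_num() {
    printf '%s\n' "$1" | awk '
        function extractNum(str) {
            # match() returns 0 when nothing matches; otherwise RSTART/RLENGTH
            # describe the first run of digits in str.
            if (match(str, /[0-9]+/)) return substr(str, RSTART, RLENGTH)
            return ""
        }
        { print extractNum($0) }'
}

extract_num "(unchanged=12,"   # prints 12
extract_num "changed=6)"       # prints 6
```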
I have the following lines in my files:
UserParameter=cassandra.status[*], curl -s "http://$1:$2/server-status?auto" | grep -e $3 | awk '{ print $$2 }'
UserParameter=ping.status[*],curl -s --retry 3 --max-time 3 'http://localhost:1111/engines?$1' | awk '/last_seen = / {split($$1, a, "/"); print a[2]}; END { if (!NR) print "NO_MATCHING_ENGINES" }' | tr "\n" "
and so on.
I want to display every line where the comma after [*] is missing or where there are any extra characters besides the comma.
For example:
UserParameter=ping.status[*],,,curl -s --retry 3 --max-time 3 'http://localhost:1111/engines?$1' | awk '/last_seen = / {split($$1, a, "/"); print a[2]}; END { if (!NR) print "NO_MATCHING_ENGINES" }' | tr "\n" "
UserParameter=ping.status[*] curl -s --retry 3 --max-time 3 'http://localhost:1111/engines?$1' | awk '/last_seen = / {split($$1, a, "/"); print a[2]}; END { if (!NR) print "NO_MATCHING_ENGINES" }' | tr "\n" "
UserParameter=ping.status[*],;!curl -s --retry 3 --max-time 3 'http://localhost:1111/engines?$1' | awk '/last_seen = / {split($$1, a, "/"); print a[2]}; END { if (!NR) print "NO_MATCHING_ENGINES" }' | tr "\n" "
should be printed, because they contain extra characters or spaces rather than a single comma.
But:
UserParameter=ping.status[*],curl -s --retry 3 --max-time 3 'http://localhost:1111/engines?$1' | awk '/last_seen = / {split($$1, a, "/"); print a[2]}; END { if (!NR) print "NO_MATCHING_ENGINES" }' | tr "\n" "
should not be printed, because there is a single comma after [*].
I tried to develop a pattern for egrep, but it doesn't cover all cases, for example when some character other than a comma follows [*]:
egrep (\[\*\].(|;|:|,|\.|))
I'll appreciate any help! Thank you!
grep -vE '\[\*\],[$/[:alpha:] ]' input
Do not print lines that match the pattern: [*], followed by any of: $, /, alphabetic character, or a space.
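A quick way to try the pattern against the cases from the question is sketched below. The sample lines are shortened, hypothetical stand-ins for the real UserParameter entries, and the extra grep -F stage is an addition of this sketch: it keeps lines without any [*] (comments, other settings) from being reported, which -v alone would do.

```shell
# Sketch: report UserParameter lines whose [*] is not followed by exactly
# one comma and then an allowed character ($, /, a letter, or a space).
find_bad_userparams() {
    grep -F '[*]' "$@" | grep -vE '\[\*\],[$/[:alpha:] ]'
}

find_bad_userparams <<'EOF'
# hypothetical sample entries
UserParameter=a.status[*],curl -s http://localhost/
UserParameter=b.status[*], curl -s http://localhost/
UserParameter=c.status[*],,,curl -s http://localhost/
UserParameter=d.status[*] curl -s http://localhost/
UserParameter=e.status[*],;!curl -s http://localhost/
EOF
# prints the c, d and e lines only
```

Note that the function returns a non-zero status when no bad line is found, which is just grep's usual behaviour.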
I want to extract the IP and the download total from the output of the MikroTik command /queue simple print stat
Here's an example:
0 name="101" target=192.168.10.101/32 rate=0bps/0bps total-rate=0bps
packet-rate=0/0 total-packet-rate=0 queued-bytes=0/0
total-queued-bytes=0 queued-packets=0/0 total-queued-packets=0
bytes=17574842/389197663 total-bytes=0 packets=191226/308561
total-packets=0 dropped=9/5899 total-dropped=0
1 name="102" target=192.168.10.102/32 rate=0bps/0bps total-rate=0bps
packet-rate=0/0 total-packet-rate=0 queued-bytes=0/0
total-queued-bytes=0 queued-packets=0/0 total-queued-packets=0
bytes=65593392/183786457 total-bytes=0 packets=163260/166022
total-packets=0 dropped=175/2403 total-dropped=0
2 name="103" target=192.168.10.103/32 rate=0bps/0bps total-rate=0bps
packet-rate=0/0 total-packet-rate=0 queued-bytes=0/0
total-queued-bytes=0 queued-packets=0/0 total-queued-packets=0
bytes=3263234/67407044 total-bytes=0 packets=41437/52602
total-packets=0 dropped=0/546 total-dropped=0
All I need is:
192.168.10.101 389197663
192.168.10.102 183786457
192.168.10.103 67407044
But I get
target=192.168.10.101/32
bytes=17574842/389197663
target=192.168.10.102/32
bytes=65593392/183786457
target=192.168.10.103/32
bytes=3263234/67407044
I tried it with grep -oP 'target=.*?\ |[^\-]bytes=.*?\ ' | sed 's/^ //g'.
So, how can I parse it?
Just continue your parsing with a few more pipes (the easiest way, I think):
grep -oP 'target=.*?\ |[^\-]bytes=.*?\ ' file | sed 's/^ //g' | sed -r 's/target=([^/]*)[/].*/\1/; s/bytes=[^/]*[/]//' | sed 'N; s/\n/ /'
output
192.168.10.101 389197663
192.168.10.102 183786457
192.168.10.103 67407044
sed '/^[0-9]\{1,\}[[:blank:]]\{1,\}name/,/^[[:blank:]]*$/ {
/^[0-9]/{
s#.*target=\([^/]*\).*#\1#;h;d
}
\#^[[:blank:]]*bytes=[0-9]*/\([0-9]*\).*# !d
s//\1/
G
s/\(.*\)\n\(.*\)/\2 \1/p
}
d
' YourFile
A bit long, but it does the job in a single sed invocation.
awk '{
if ( $3 ~ /target=/ ) split( $3, aIP, "[=/]")
if ( $1 ~ /^[[:blank:]]*bytes=[0-9]*/ ) {
split( $1, aByt, "/")
print aIP[2] " " aByt[2]
}
}' YourFile
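The above is the same approach in awk.
If the structure is always exactly the same and records are separated by blank lines (which RS="" requires):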
awk 'BEGIN{ RS="" }
{ split( $3, aIP, "[=/]"); split( $12, aByt, "/")
print aIP[2] " " aByt[2]
}' YourFile
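If the field positions cannot be relied on, a variant that looks at the key of every field may be more forgiving. This is only a sketch: the function name queue_totals is invented, and it assumes, as the question does, that the value after the slash in bytes= is the download total.

```shell
# Sketch: find target= and bytes= by key rather than by field number.
queue_totals() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^target=/) {
                    ip = $i
                    sub(/^target=/, "", ip)
                    sub(/\/.*/, "", ip)                # drop the /32 prefix length
                } else if ($i ~ /^bytes=/) {           # total-bytes= etc. do not match
                    down = $i
                    sub(/^bytes=[0-9]*\//, "", down)   # keep the value after the slash
                    print ip, down
                }
            }
        }' "$@"
}

queue_totals <<'EOF'
 0 name="101" target=192.168.10.101/32 rate=0bps/0bps total-rate=0bps
   packet-rate=0/0 total-packet-rate=0 queued-bytes=0/0
   total-queued-bytes=0 queued-packets=0/0 total-queued-packets=0
   bytes=17574842/389197663 total-bytes=0 packets=191226/308561
   total-packets=0 dropped=9/5899 total-dropped=0
EOF
# prints: 192.168.10.101 389197663
```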
Suppose we have a string like
"dir1|file1|dir2|file2"
and would like to turn it into
"-f dir1/file1 -f dir2/file2"
Is there an elegant way to do this with sed or awk for a general case of n > 2?
My attempt was to try
echo "dir1|file1|dir2|file2" | sed 's/\(\([^|]\)|\)*/-f \2\/\4 -f \6\/\8/'
An awk solution:
awk -F'|' '{ for (i=1;i<=NF;i+=2) printf "-f %s/%s%s", $i, $(i+1), ((i==NF-1) ? "\n" : " ") }' \
<<<"dir1|file1|dir2|file2"
-F'|' splits the input into fields by |
for (i=1;i<=NF;i+=2) loops over the field indices in increments of 2
printf "-f %s/%s%s", $i, $(i+1), ((i==NF-1) ? "\n" : " ") prints pairs of consecutive fields joined with / and prefixed with -f<space>
((i==NF-1) ? "\n" : " ") terminates each field-pair either with a space, if more fields follow, or a \n to terminate the overall output.
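The loop above assumes an even number of fields. A small sketch of the same idea that refuses input with a dangling dir is shown here; the function name pairs_to_flags and the choice to exit with status 1 on an odd field count are assumptions of this example, not part of the original answer.

```shell
# Sketch: pair up |-separated fields, failing on an odd field count.
pairs_to_flags() {
    printf '%s\n' "$1" | awk -F'|' '
        NF % 2 { exit 1 }                      # a dir without a matching file
        {
            out = ""
            for (i = 1; i < NF; i += 2) {
                sep = (i > 1) ? " " : ""
                out = out sep "-f " $i "/" $(i + 1)
            }
            print out
        }'
}

pairs_to_flags 'dir1|file1|dir2|file2'   # prints: -f dir1/file1 -f dir2/file2
```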
In a comment, the OP suggests a shorter variation, which may be of interest if you don't need/want the output to be \n-terminated:
awk -F'|' '{ for (i=1;i<=NF;++i) printf "%s", (i%2 ? " -f " $i : "/" $i ) }' \
<<<"dir1|file1|dir2|file2"
This might work for you (GNU sed):
sed 's/\([^|]*\)|\([^|]*\)|\?/-f \1\/\2 /g;s/ $//' file
This will work for dir1|file1|dir2|file2|dirn|filen type strings
The regexp defines two back-references (\1 and \2, used in the replacement part of the substitution command s/pattern/replacement/). The first captures a run of non-| characters followed by a |; the second captures another run of non-| characters followed by an optional |. On the first application of the substitution (the g flag makes it repeat along the line), dir1 becomes \1 and file1 becomes \2. The replacement prepends -f, replaces the first | with / and the second | with a space. The trailing space left at the end of the line is removed by the second substitution command.
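The \? in that command is a GNU extension to basic regular expressions. For a sed without it, the interval \{0,1\} expresses the same "optional |" in POSIX syntax. A sketch, with the wrapper function pairs_to_flags_sed added only so the example can be run directly:

```shell
# Sketch: the same substitution using the POSIX interval \{0,1\} instead of \?
pairs_to_flags_sed() {
    printf '%s\n' "$1" |
        sed 's/\([^|]*\)|\([^|]*\)|\{0,1\}/-f \1\/\2 /g;s/ $//'
}

pairs_to_flags_sed 'dir1|file1|dir2|file2'   # prints: -f dir1/file1 -f dir2/file2
```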
$ awk -v RS='|' 'NR%2{p=$0;next} {printf " -f %s/%s", p, $0}' <<< 'dir1|file1|dir2|file2'
-f dir1/file1 -f dir2/file2
A gnu-awk solution:
s="dir1|file1|dir2|file2"
awk 'BEGIN{ FPAT="[^|]+\\|[^|]+" } {
for (i=1; i<=NF; i++) {
sub(/\|/, "/", $i);
if (i>1)
printf " ";
printf "-f %s", $i
};
print ""
}' <<< "$s"
-f dir1/file1 -f dir2/file2
FPAT is used for grabbing dir1|file1 into a single field.
I've got a script producing output from Twitter's streaming API into a format like this
semmelracet_dev | 450587667 | 1 semla till idag! #semmelreport | 569866960802062336 | 1424701845728
Where field 3 is the actual tweet.
What I want to do is grab the integer from that field and insert it into a database as a separate field/column.
To just insert those fields is not a problem, but getting the INT and handling it separately is. Could I enforce usage and split the field after the INT?
Sorry about not including the expected output. Basically I'm constructing a MySQL insert like
"... insert into report values ("semmelracet_dev", 450587667, "1 semla till idag! #semmelreport", 1, 569866960802062336, 1424701845728)"
Any ideas?
EDIT again: or, if that's not doable, maybe keep all the columns and just keep the int from field 3 when inserting them into the database?
EDIT 2
Tried the solution from jeanrjc below with mixed success
cat tweetReport.txt | awk -F"\|" '{n=split($3,s," "); for (i=1;i<=n;i++) if
(s[i] + 0 == s[i]) int_val = s[i]}{print "\""$1"\","$2", \""$3"\",
"int_val", "$4", "$5}')
-bash: syntax error near unexpected token `)'
I then removed the trailing ) and got
cat tweetReport.txt | awk -F"\|" '{n=split($3,s," "); for (i=1;i<=n;i++) if
(s[i] + 0 == s[i]) int_val = s[i]}{print "\""$1"\","$2", \""$3"\",
"int_val", "$4", "$5}'
awk: warning: escape sequence `\|' treated as plain `|'
"semmelracet_dev ", 450587667 , " 1 semla till idag! #semmelreport ", 1,
569866960802062336 , 1424701845728 "",, "", 1, ,
Which is better, but with some gibberish I don't quite understand.
I'm not sure I fully understand what you want, but I guessed that you wanted to extract (or get rid of) the int value of the 3rd field, is that right?
To do so:
awk -F"|" '{print $3}' file | awk '{for (i=1; i<=NF; i++) if ($i + 0 == $i) print $i}'
where ($i + 0 == $i) tests whether the word is numeric and, if so, prints it.
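Keep in mind that the $i + 0 == $i comparison accepts anything awk considers numeric, so a word like 2.5 passes as well. If only whole numbers should count, a regular expression is stricter. A sketch, using a made-up sample string and a hypothetical helper name ints_in_text:

```shell
# Sketch: print only the words that consist entirely of digits.
ints_in_text() {
    printf '%s\n' "$1" | awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9]+$/) print $i
    }'
}

ints_in_text '1 semla till idag! 2.5 #semmelreport 42'
# prints 1 and 42 on separate lines; 2.5 is skipped
```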
I hope you can get what you want from that. Otherwise, please specify your expected output.
EDIT : To obtain desired output:
$ cat tweet.txt
semmelracet_dev | 999999999 | 2 foo bar! #fooreport | 999996696080209999 | 1429999845728
semmelracet_dev | 450587667 | 1 semla till idag! #semmelreport | 569866960802062336 | 1424701845728
$ awk -F"\|" '{n=split($3,s," "); for (i=1;i<=n;i++) if (s[i] + 0 == s[i]) int_val = s[i]}{print "\""$1"\","$2", \""$3"\", "int_val", "$4", "$5}' tweet.txt
"semmelracet_dev ", 999999999 , " 2 foo bar! #fooreport ", 2, 999996696080209999 , 1429999845728
"semmelracet_dev ", 450587667 , " 1 semla till idag! #semmelreport ", 1, 569866960802062336 , 1424701845728
Which you can capture in a variable and then pass it to construct your mysql insert.
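Going one step further, the statement itself could be generated in awk. What follows is a rough sketch under several assumptions: the table is called report as in the question, the count is the integer at the very start of the tweet (NULL otherwise), tweets never contain a | character, and doubling single quotes is enough escaping for your setup. For anything beyond a quick import, a client library with prepared statements would be the safer route. The name tweets_to_inserts is invented for this example.

```shell
# Sketch: one INSERT statement per well-formed input line.
tweets_to_inserts() {
    awk -F'|' -v q="'" '
        function trim(s)  { sub(/^[ \t]+/, "", s); sub(/[ \t]+$/, "", s); return s }
        function quote(s) { gsub(q, q q, s); return q s q }   # double any single quotes
        NF >= 5 {                                             # skips blank or short lines
            tweet = trim($3)
            count = match(tweet, /^[0-9]+/) ? substr(tweet, RSTART, RLENGTH) : "NULL"
            # %s everywhere: the IDs are too large to survive numeric conversion
            printf "INSERT INTO report VALUES (%s, %s, %s, %s, %s, %s);\n",
                   quote(trim($1)), trim($2), quote(tweet), count, trim($4), trim($5)
        }' "$@"
}

tweets_to_inserts <<'EOF'
semmelracet_dev | 450587667 | 1 semla till idag! #semmelreport | 569866960802062336 | 1424701845728
EOF
# prints: INSERT INTO report VALUES ('semmelracet_dev', 450587667, '1 semla till idag! #semmelreport', 1, 569866960802062336, 1424701845728);
```

Skipping lines with fewer than five fields also avoids the stray row of empty values that a trailing blank line in the input produces.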
HTH
I'm using a bashism to feed data to awk, you can use something else:
$ t="semmelracet_dev | 450587667 | 1 semla till idag! #semmelreport | 569866960802062336 | 1424701845728"
$ awk -F'|' '{n=$3;sub(/^ */,"",n);sub(/ .*/,"",n);print n;}' <<<"$t"
1
This simply does a couple of substitutions to "trim" data around the pipe, then remove anything after the first space.
If you want help inserting this number into a database, you'll have to be a bit more explicit about what tools you're using. For example, this might work:
$ n=$(awk -F'|' '{n=$3;sub(/^ */,"",n);sub(/ .*/,"",n);print n;}' <<<"$t")
$ psql -c "$(printf 'INSERT INTO table (n) VALUES (%d);' "$n")"
Or if you'd prefer to get this data from a log file and pipe it through psql, you could do it this way:
awk -F'|' -vfmt="INSERT INTO table (n) VALUES (%d);" '
{
n=$3; sub(/^ */,"",n); sub(/ .*/,"",n);
printf(fmt,n);
}' input.txt \
| psql
awk 'BEGIN{FS="|";} {print($3);}' | sed -r 's/([0-9]+)(.*)/\1/'