Linux Bash Regular Expressions, retrieving data from SNMPGet Output - regex

I've been working on getting a few simple monitoring tools running at home, and decided to be funny and retrieve the printer data along with everything else. Now that the SNMP portion is working quite well, I can't seem to parse the data that my snmpget command retrieves in Linux. The script I am currently using is as follows:
#!/usr/bin/env bash
# RegEx for Strings: "(.+?)"| -?\d+
RegExStr='"(.+?)"| -?\d+'
# ***
# Brother HL-2150N Printer
# ***
# Order Data: Toner Name, Toner Level, Drum Name, Drum Status, Total Pages Printed, Display Status
Input=$(snmpget -v 1 -c public 192.168.16.112 SNMPv2-SMI::mib-2.43.11.1.1.6.1.1 SNMPv2-SMI::mib-2.43.11.1.1.8.1.1 SNMPv2-SMI::mib-2.43.11.1.1.6.1.2 SNMPv2-SMI::mib-2.43.11.1.1.9.1.1 SNMPv2-SMI::mib-2.43.10.2.1.4.1.1 SNMPv2-SMI::mib-2.43.16.5.1.2.1.1 -m BROTHER-MIB)
Output1=( $(echo $Input | egrep -o $RegExStr) )
# Output
echo $Input
echo ${Output1[@]}
Which, oddly enough, does not work. I'm fairly certain my regular expression ( "(.+?)" ) is correct, as I've tested it numerous times in various syntax checkers and testers. It's supposed to select all the data that's between quotation marks ("").
Anyhow, the SNMPGET return is:
SNMPv2-SMI::mib-2.43.11.1.1.6.1.1 = STRING: "Black Toner Cartridge" SNMPv2-SMI::mib-2.43.11.1.1.8.1.1 = INTEGER: -2 SNMPv2-SMI::mib-2.43.11.1.1.6.1.2 = STRING: "Drum Unit" SNMPv2-SMI::mib-2.43.11.1.1.9.1.1 = INTEGER: -3 SNMPv2-SMI::mib-2.43.10.2.1.4.1.1 = Counter32: 13630 SNMPv2-SMI::mib-2.43.16.5.1.2.1.1 = STRING: "SLAAP "
I've tried various things myself. Using grep returns a blank string. To my understanding, grep does not support every regular expression construct by itself, so I started using egrep. While this returns something, it is everything inside the original string split on spaces, starting at the first quotation mark.
Is there anything I'm missing? I've looked around, and adjusted my methods a few times but never seemed to get a usable array in return.
Anyhow, I appreciate any help/pointers you'd be able to give me. I'd like to be able to get this running, even if just for fun and a good learning experience. Thank you in advance though! I'll be fidgeting on with it some more myself, but will check here every now and then.

From your output:
To get all strings:
grep -oP 'STRING: *"\K[^"]*'
Black Toner Cartridge
Drum Unit
SLAAP
To get all integers:
grep -oP '(INTEGER|Counter32): *\K[^ ]*'
-2
-3
13630

With awk you can do this:
awk 'NR%2==0' RS=\" <<< "$Input"
Black Toner Cartridge
Drum Unit
SLAAP
Or into a variable:
Output1=$(awk 'NR%2==0' RS=\" <<< "$Input")
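Since the question asks for a usable array, here is a minimal sketch of one way to fill one. It swaps the lazy `.+?` for a negated character class, because POSIX extended regular expressions (what egrep uses) define neither lazy quantifiers nor `\d`. The array names `Strings` and `Numbers` are made up for illustration, and the sample input is copied from the question rather than fetched with snmpget.

```shell
#!/usr/bin/env bash
# Sample snmpget output copied from the question (normally: Input=$(snmpget ...)).
Input='SNMPv2-SMI::mib-2.43.11.1.1.6.1.1 = STRING: "Black Toner Cartridge" SNMPv2-SMI::mib-2.43.11.1.1.8.1.1 = INTEGER: -2 SNMPv2-SMI::mib-2.43.11.1.1.6.1.2 = STRING: "Drum Unit" SNMPv2-SMI::mib-2.43.11.1.1.9.1.1 = INTEGER: -3 SNMPv2-SMI::mib-2.43.10.2.1.4.1.1 = Counter32: 13630 SNMPv2-SMI::mib-2.43.16.5.1.2.1.1 = STRING: "SLAAP "'

# Quoted strings: match "..." with a negated class, then strip the quotes.
Strings=()
while IFS= read -r item; do
    item=${item#\"}
    item=${item%\"}
    Strings+=("$item")
done < <(grep -o '"[^"]*"' <<< "$Input")

# Numeric values: match "TYPE: number", then keep only the part after the last space.
Numbers=()
while IFS= read -r item; do
    Numbers+=("${item##* }")
done < <(grep -oE '(INTEGER|Counter32): *-?[0-9]+' <<< "$Input")

printf 'Toner: %s (level %s)\n' "${Strings[0]}" "${Numbers[0]}"
printf 'Drum: %s (status %s)\n' "${Strings[1]}" "${Numbers[1]}"
printf 'Pages printed: %s\n' "${Numbers[2]}"
```

Reading line by line with `IFS= read -r` keeps multi-word values such as "Black Toner Cartridge" in a single array element, which the unquoted `( $(...) )` form in the question would split on spaces.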

Related

egrep - regex filtering characters only working when run via cron?

This is baffling to me, please help :-)
I have a program which sometimes runs from the CLI and sometimes through cron, both as the same user and both in bash.
In cron I use SHELL=/bin/bash to force bash.
The offending command within the script is:
egrep -v "$^" playlist.txt | egrep -v "[^ -.[:alnum:]]" >>formattedPlaylist.txt
Basically, it should remove all blank lines from the playlist, then remove any line which contains anything other than [A-Za-z0-9 - .].
For some reason, when run as a user from the CLI, this does not filter out many characters, whereas if cron runs it, it works exactly as expected.
The characters which are not filtered out are:
% $ # ! * & ( ) '
Any ideas??
Try:
sed '/[^-A-Za-z0-9.\x27 ]/d;/''/d;/^\s*$/d' playlist.txt > cleaned_playlist.txt
Input text:
A goat
232423
-sdf-g
Here it goes
'keep me
$ let it go
\ this one too
Output:
A goat
232423
-sdf-g
Here it goes
'keep me
Try setting your locale explicitly.
LC_ALL=C egrep -v "$^|[^ -.[:alnum:]]" playlist.txt >>formattedPlaylist.txt
I also simplified the command by merging the two regular expressions, but the locale fix is the answer to your question.
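One more thing that may be worth checking alongside the locale: inside a bracket expression, ` -.` reads as a range from space to period, and outside the C locale what a range covers depends on collation order. The sketch below is an assumption about what the question intends (letters, digits, space, hyphen and period only); it places the hyphen first so that it is taken literally. The sample playlist lines are made up for illustration.

```shell
#!/usr/bin/env bash
# Made-up playlist lines, including a blank line and lines with unwanted characters.
playlist=$'A goat\n\n232423\n-sdf-g\n$ let it go\nkeep 100% of this?\nHere it goes'

# "-" comes first inside the brackets, so it is a literal hyphen, not a range operator.
# '^$' drops blank lines; LC_ALL=C pins the meaning of [:alnum:] to ASCII.
cleaned=$(LC_ALL=C grep -Ev '^$|[^-. [:alnum:]]' <<< "$playlist")

printf '%s\n' "$cleaned"
```

With the hyphen at the start (or end) of the bracket expression, the result should no longer depend on how the locale orders punctuation.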

Can adding a particular number to a bunch of "time" strings, be done in Regex

I have an "srt" file (the standard movie-subtitle format) like the one at this link: http://pastebin.com/3k8a53SC
Excerpt:
1
00:00:53,000 --> 00:00:57,000
<any text that may span multiple lines>
2
00:01:28,000 --> 00:01:35,000
<any text that may span multiple lines>
But right now the subtitle timing is all wrong: it lags behind by 9 seconds.
Is it possible to add 9 seconds (+9) to every time entry with regex?
It's fine even if the milliseconds are set to 000, but the addition of 9 seconds should adhere to the "60 seconds = 1 minute and 60 minutes = 1 hour" rules.
Also, the subtitle text after each timing entry must not be altered by the regex.
By the way, the format of each time string is "Hours:Minutes:Seconds,Milliseconds".
The quick answer is "no": that's not an application for regex. A regular expression lets you MATCH text, but not change it. Changing things is outside the scope of the regex itself, and falls to the language you're using: perl, awk, bash, etc.
For the task of adjusting the time within an SRT file, you could do this easily enough in bash, using the date command to adjust times.
#!/usr/bin/env bash
# Offset in seconds. Give it an explicit sign (-9 or +9): BSD date's -v option
# adjusts the seconds only when the value is signed, otherwise it sets them.
offset="${1:-0}"
datematch="^(([0-9]{2}:){2}[0-9]{2}),[0-9]{3} --> (([0-9]{2}:){2}[0-9]{2}),[0-9]{3}"
os=$(uname -s)
# IFS= and -r keep leading whitespace and backslashes in the subtitle text intact
while IFS= read -r line; do
    if [[ "$line" =~ $datematch ]]; then
        # Gather the start and end times from the regex
        start=${BASH_REMATCH[1]}
        end=${BASH_REMATCH[3]}
        # Replace the time in this line with a printf pattern
        linefmt="${line//[0-2][0-9]:[0-5][0-9]:[0-5][0-9]/%s}\n"
        # Calculate new times
        case "$os" in
            Darwin|*BSD)
                newstart=$(date -v${offset}S -j -f "%H:%M:%S" "$start" '+%H:%M:%S')
                newend=$(date -v${offset}S -j -f "%H:%M:%S" "$end" '+%H:%M:%S')
                ;;
            Linux)
                newstart=$(date -d "$start today ${offset} seconds" '+%H:%M:%S')
                newend=$(date -d "$end today ${offset} seconds" '+%H:%M:%S')
                ;;
        esac
        # And print the result
        printf "$linefmt" "$newstart" "$newend"
    else
        # No adjustments required, print the line verbatim.
        printf '%s\n' "$line"
    fi
done
Note the case statement. This script should auto-adjust for Linux, OSX, FreeBSD, etc.
You'd use this script like this:
$ ./srtadj -9 < input.srt > output.srt
Assuming you named it that, of course. Or more likely, you'd adapt its logic for use in your own script.
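If calling date twice per subtitle is a concern, the time arithmetic can also be done with bash integer math alone. This is only a sketch of that alternative, not part of the script above; `shift_ts` is a hypothetical helper name, and clamping at 00:00:00 for negative results is an assumption about the desired behaviour.

```shell
#!/usr/bin/env bash
# Hypothetical helper: shift an HH:MM:SS timestamp by a signed number of seconds.
shift_ts() {
    local h m s total
    IFS=: read -r h m s <<< "$1"
    # 10# forces base 10 so that "08" and "09" are not rejected as invalid octal.
    total=$(( (10#$h * 60 + 10#$m) * 60 + 10#$s + $2 ))
    # Assumption: clamp at zero instead of producing a negative time.
    if (( total < 0 )); then
        total=0
    fi
    printf '%02d:%02d:%02d\n' "$(( total / 3600 ))" "$(( total % 3600 / 60 ))" "$(( total % 60 ))"
}

shift_ts 00:00:53 9     # carries into the minutes
shift_ts 00:59:58 9     # carries into the hours
shift_ts 00:00:05 -9    # clamped at zero
```

Carrying seconds into minutes and minutes into hours falls out of converting to a single count of seconds and back, which is the same idea the "60 seconds = 1 minute" requirement describes.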
No, sorry, you can't. Regular expressions describe regular languages (see the Chomsky hierarchy, e.g. https://en.wikipedia.org/wiki/Chomsky_hierarchy), and they cannot calculate.
But with a general-purpose programming language like Perl it will work.
It could be a one-liner like this ;-)))
perl -n -e 'if(/^(\d\d:\d\d:\d\d)([-,\d\s\>]*)(\d\d:\d\d:\d\d)(.*)/) {print plus9($1).$2.plus9($3).$4."\n";}else{print $_} sub plus9{ ($h,$m,$s)=split(/:/,shift); $t=(($h*60+$m)*60+$s+9); $h=int($t/3600);$r=$t-($h*3600);$m=int($r/60);$s=$r-($m*60);return sprintf "%02d:%02d:%02d", $h, $m, $s;}' movie.srt
with movie.srt like
1
00:00:53,000 --> 00:00:57,000
hello
2
00:01:28,000 --> 00:01:35,000
I like perl
3
00:02:09,000 --> 00:02:14,000
and regex
you will get
1
00:01:02,000 --> 00:01:06,000
hello
2
00:01:37,000 --> 00:01:44,000
I like perl
3
00:02:18,000 --> 00:02:23,000
and regex
You can change the +9 in the "sub plus9{...}", if you want another delta.
How does it work?
We look for lines that match
dd:dd:dd something dd:dd:dd something
and then call a sub, which adds 9 seconds to matched group one ($1) and group three ($3). All other lines are printed unchanged.
Addendum:
If you want to put the Perl one-liner in a file, say plus9.pl, you can add newlines ;-) and run it with perl -n plus9.pl movie.srt
if (/^(\d\d:\d\d:\d\d)([-,\d\s\>]*)(\d\d:\d\d:\d\d)(.*)/) {
    print plus9($1).$2.plus9($3).$4."\n";
} else {
    print $_;
}
sub plus9 {
    ($h,$m,$s)=split(/:/,shift);
    $t=(($h*60+$m)*60+$s+9);
    $h=int($t/3600);
    $r=$t-($h*3600);
    $m=int($r/60);
    $s=$r-($m*60);
    return sprintf "%02d:%02d:%02d", $h, $m, $s;
}
Regular expressions strictly do matching and cannot add or subtract. You can match each time string using Python, for example, add 9 seconds to it, and then rewrite the string in the appropriate spot. The regular expression I would use to match it is the following:
(?<hour>\d+):(?<minute>\d+):(?<second>\d+),(?<msecond>\d+)
It has named capture groups, so it's really easy to get each section (you won't need msecond, but it's there for visualization, I guess). Note that Python's re module spells named groups as (?P<hour>...) rather than (?<hour>...).
You can try the pattern on Regex101.

search in shell

Folks,
I have an output file that looks like:
Title: [name of component]
**garbage output**
**garbage output**
Test run: succuess 17 failure 2
**garbage output**
**garbage output**
and there are many components like this. I cannot change the output format, so I just want to grab the title and test-result lines.
My question is, how to write a regular expression to achieve this?
I tried:
cat output | sed -e 'm/Tests run(.*)/g'
but it always complains: unknown command `m'
Other methods except regex would also be appreciated!!!
Thanks a lot
You don't need cat. Try:
grep -E '^Title:|^Test run' fileName
On older systems you may need to use egrep '^Title...'.
Edit, in response to this follow-up:
I want to exclude titles with a certain prefix from the result, like "Title: foo XXX" or "Title: bar XXX".
There is certainly a regex for grep -E that would handle this, but for the first few years of command-line work, and since you appear to be using this to clean up test.log output, it is good to use the Unix toolbox approach: get something working, then add a little more to it, i.e.
grep -E '^Title:|^Test run' fileName | egrep -v '^Title: foo XXX|^Title: bar XXX'
This is the power of the Unix pipeline: got too much output? Then keep adding more grep -v stages to clean it up.
Note that grep -v (like egrep -v) excludes lines that match the given patterns.
I hope this helps.
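Since the original attempt used sed: `m/.../` is Perl's match syntax, which is why sed reports an unknown command. A rough sed equivalent of the grep above is sketched here, assuming a sed that accepts `-E` (GNU and BSD sed both do). The sample report text and the variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Made-up sample in the shape described in the question.
report='Title: [alpha]
**garbage output**
Test run: success 17 failure 2
**garbage output**
Title: foo XXX
**garbage output**
Test run: success 3 failure 0'

# -n suppresses the default printing; /regex/p prints only the matching lines.
summary=$(printf '%s\n' "$report" | sed -n -E '/^(Title:|Test run)/p')

printf '%s\n' "$summary"
```

For this job grep is the simpler tool; sed becomes more useful once the matched lines also need rewriting.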

How to extract a substring matching a pattern from a Unix shell variable

I'm relatively new to Unix shell scripting. Here's my problem. I've used this script...
isql -S$server -D$database -U$userID -P$password << EOF > $test
exec MY_STORED_PROC
go
EOF
echo $test
To generate this result...
Msg 257, Level 16, State 1:
Server 'MY_SERVER', Procedure 'MY_STORED_PROC':
Implicit conversion from datatype 'VARCHAR' to 'NUMERIC' is not allowed. Use
the CONVERT function to run this query.
(1 row affected)
(return status = 257)
Instead of echoing the isql output, I would like to extract the "257" and stick it in another variable so I can return 257 from the script. I'm thinking some kind of sed or grep command will do this, but I don't really know where to start.
Any suggestions?
bash can strip parts from the content of shell variables.
${parameter#pattern} returns the value of $parameter without the part at the beginning that matches pattern.
${parameter%pattern} returns the value of $parameter without the part at the end that matches pattern.
I guess there is a better way to do this, but this should work.
So you could combine this into:
# strip the part before the value:
test=${test#Msg }
# strip the part after the value:
test=${test%, Level*}
echo "$test"
If you're interested in the (return status = xxx) part, it would be:
result=${test#*(return status = }
result=${result%)*}
echo "$result"
The relevant section of the bash manpage is "Parameter Expansion".
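A self-contained sketch of the return-status variant, assuming the isql output has already been captured in a variable (named `output` here for illustration; the sample text is taken from the question). The literal parts of the patterns are quoted so that the parentheses are never treated as glob syntax.

```shell
#!/usr/bin/env bash
# Sample isql output from the question, held in a variable for illustration.
output="Msg 257, Level 16, State 1:
Server 'MY_SERVER', Procedure 'MY_STORED_PROC':
Implicit conversion from datatype 'VARCHAR' to 'NUMERIC' is not allowed. Use
the CONVERT function to run this query.
(1 row affected)
(return status = 257)"

# ## removes the longest prefix ending in "(return status = ".
status=${output##*"(return status = "}
# %% removes the longest suffix starting at the first ")".
status=${status%%")"*}

echo "$status"
```

One caveat if the value is meant to become the script's own exit code: shell exit statuses are limited to 0-255, so `exit 257` would be seen by the caller as 1.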
Here is a quick and dirty hack for you, though you should really start learning this stuff yourself:
RC=$(tail -n 1 "$test" | sed 's/(return status = \([0-9][0-9]*\))/\1/')

making graphs with a shell script

I need to make a graph of numeric values over a time period; the values represent the users online on a web page.
The script will be executed by cron every 30 minutes, and the needed HTML file will be downloaded with wget. But there are some as yet unanswered questions and problems:
- I need to get just the numeric value from the HTML code (but grep returns the whole line). How can I get only the numeric value? I can get the line with grep; it looks like this:
Users online: 24 917 </div>
How can I get just the 24917?
- What would be easier: generating an .svg file with the graph, or saving the values in a .csv file (and generating the graph with OOo or something similar)? Maybe some other good ideas?
Thanks in advance,
-skazhy
You can do the following to get your number:
Set the regular expression:
digits='[[:digit:]]+ *[[:digit:]]*'
followed by these two lines:
num=$(echo "$line" | grep -Eo "$digits")
num=${num// }
or these:
# Bash >= 3.2 (syntax may be different for 3.0/3.1)
[[ $line =~ $digits ]]
num=${BASH_REMATCH[@]// }
to extract the number from the variable $line containing the line in your question.
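Put together as a runnable sketch: the function below wraps the regex approach, and the last line shows one possible CSV row for later plotting. `extract_count` is a hypothetical name, and the timestamp format is an assumption; nothing here downloads the page, the sample line is the one from the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the digits found in a line, with inner spaces removed.
extract_count() {
    local digits='[[:digit:]]+ *[[:digit:]]*'
    # Fail (return 1) when the line contains no digits at all.
    [[ $1 =~ $digits ]] || return 1
    printf '%s\n' "${BASH_REMATCH[0]// /}"
}

line='Users online: 24 917 </div>'
count=$(extract_count "$line")
echo "$count"

# One CSV row per run: timestamp,count (append it to a file with >> when run from cron).
printf '%s,%s\n' "$(date '+%Y-%m-%d %H:%M')" "$count"
```

Appending one row per cron run gives a file that a spreadsheet or Gnuplot can read directly.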
Gnuplot should be readily available for drawing the graph.
Just one process (grep):
array=( $(grep whatever filename ) ) && echo "${array[2]}${array[3]}"