Using a variable between regexp inside foreach

I need to pick out, from a list, all MAC addresses that belong to the Checkpoint and Cisco vendors. I run "show ip arp" on a Cisco router, capture the ARP output and save it to a file. Now I need to check, line by line, which MACs I have.
To be exact, I need to match only the lines containing MAC addresses of the Cisco and Checkpoint vendors. To do that I have to get the IP address and the vendor prefix of the MAC address, that is, the first four hex digits, the dot, and two more hex digits (e.g. 001c.7f or e02f.6d), and print the line to a file. After that, I need to compare "IP-ARP.txt" with "MAC-ADDRESS.txt" (the latter contains the full list of vendor MAC prefixes). If a line matches, it should be saved to another file.
Here is an excerpt of each file:
IP-ARP.txt
Internet 172.20.14.12 0 001c.7f41.186e ARPA
Internet 172.20.14.13 57 001c.7f41.074e ARPA
Internet 172.20.14.14 0 0200.5ebd.e17d ARPA
Internet 172.20.19.11 - 7081.050f.9402 ARPA
Internet 172.20.19.12 54 7cad.7499.e602 ARPA
Internet 172.20.19.13 7 e02f.6d14.c1bf ARPA
Internet 172.20.19.14 104 e02f.6d15.1d7f ARPA
MAC-ADDRESS.txt
001c.7f
001c.ab
001b.de
001b.ff
001c.cd
001c.de
e02f.6c
e02f.7c
Thanks in advance!

(updated again due to changing spec)
What we have here is a list of identifying strings (the OUI MAC address parts written as fragments of a MAC address string) and a list of data strings that need to be checked against this list.
My solution uses the fileutil package from the Tcl library. It's not quite necessary since you could use standard Tcl commands, but it simplifies the script a lot.
package require fileutil
Define some filenames to use.
set filename(macaddr) MAC-ADDRESS.txt
set filename(iparp) IP-ARP.txt
set filename(output) output.txt
If the list of identifying strings is subject to change, you may want to read it from file every time you run your script:
set idlist [::fileutil::cat $filename(macaddr)]
Or if these addresses seldom change, you can just hard-code it in your script and edit when necessary:
set idlist {001c.7f 001c.ab 001b.de 001b.ff 001c.cd 001c.de e02f.6c e02f.7c}
Set the contents of the output file to the empty string.
::fileutil::writeFile $filename(output) {}
To select the lines in your IP-ARP.txt file that match any of these addresses, there are several ways to traverse it. My suggestion is to use the fileutil::foreachLine command. Basic invocation is like this:
::fileutil::foreachLine varName filename script
(The first parameter is an arbitrary variable name: on every iteration the current line will be stored in that variable. The second is the name of the file to traverse, and the third parameter is a script to run once for every line in that file.)
The script calls a command that matches id strings using the string match command. The regexp command could be used instead, but I think that's quite unnecessary in this case. Every line in the IP-ARP.txt file is either blank or a proper Tcl list with five elements, where the MAC address is the fourth. Also, the second element is the ip number, and only those beginning with 172 are to be used. This means that the matching command can be written like this:
proc matchid {idlist line} {
set ipAddr [lindex $line 1]
set macAddr [lindex $line 3]
if {[string match 172* $ipAddr]} {
foreach id $idlist {
if {[string match $id* $macAddr]} {
return "$ipAddr $macAddr\n"
}
}
}
}
(Matching the ip address in this way only works if the address is in dotted decimal form: if it can be in any other form the Tcllib ip module should be used to match it.)
The result of the command is either a line containing the ip address and the MAC address if the line matched, or the empty string if it didn't.
Now let's traverse the contents of the IP-ARP.txt file. For each line, match the contents against the id list and get either an output string or an empty string. If the string isn't empty, append it to the output file.
::fileutil::foreachLine line $filename(iparp) {
set res [matchid $idlist $line]
if {$res ne {}} {
::fileutil::appendToFile $filename(output) $res
}
}
And that's it. The complete program:
package require fileutil
set filename(macaddr) MAC-ADDRESS.txt
set filename(iparp) IP-ARP.txt
set filename(output) output.txt
set idlist [::fileutil::cat $filename(macaddr)]
::fileutil::writeFile $filename(output) {}
proc matchid {idlist line} {
set ipAddr [lindex $line 1]
set macAddr [lindex $line 3]
if {[string match 172* $ipAddr]} {
foreach id $idlist {
if {[string match $id* $macAddr]} {
return "$ipAddr $macAddr\n"
}
}
}
}
::fileutil::foreachLine line $filename(iparp) {
set res [matchid $idlist $line]
if {$res ne {}} {
::fileutil::appendToFile $filename(output) $res
}
}
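For readers who don't use Tcl, the same prefix-matching idea can be sketched in Python. This is only an illustration under assumptions, not part of the answer above: the function name match_id and the inline sample data are hypothetical, lines are split on whitespace instead of being read as Tcl lists, and the IP test requires "172." including the dot.

```python
def match_id(id_list, line):
    """Return "ip mac" if the line's MAC starts with a known OUI prefix, else None."""
    fields = line.split()
    if len(fields) < 4:          # blank or malformed line
        return None
    ip_addr, mac_addr = fields[1], fields[3]
    if not ip_addr.startswith("172."):
        return None
    # str.startswith accepts a tuple and tests every prefix in one call
    if mac_addr.startswith(tuple(id_list)):
        return f"{ip_addr} {mac_addr}"
    return None


# Sample data copied from the question (shortened)
oui_prefixes = ["001c.7f", "001c.ab", "e02f.6c"]
arp_lines = [
    "Internet 172.20.14.12 0 001c.7f41.186e ARPA",
    "Internet 172.20.14.14 0 0200.5ebd.e17d ARPA",
    "Internet 172.20.19.13 7 e02f.6d14.c1bf ARPA",
]

matches = [m for m in (match_id(oui_prefixes, l) for l in arp_lines) if m]
print(matches)
```

Note that e02f.6d14.c1bf is not reported, because the sample list contains e02f.6c and e02f.7c but not e02f.6d.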
Documentation for the Tcllib fileutil module
Documentation: foreach, if, lindex, package, proc, set, string
(Note: the 'Hoodiecrow' mentioned in the comments is me, I used that nick earlier.)

Given a partial mac address (in the form "xxxx.xx") in a variable $mac, to match a line containing a mac address starting with that value:
^.*\b$mac[0-9a-f]{2}\.[0-9a-f]{4}\b.*$
If your language matches lines that contain a pattern, you can omit the wrapping:
\b$mac[0-9a-f]{2}\.[0-9a-f]{4}\b
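One thing to watch when interpolating a variable into a pattern like this is that the dot inside the prefix is itself a regex metacharacter. A minimal Python sketch of the same pattern, assuming a hypothetical helper name mac_line_pattern and escaping the prefix before it is embedded:

```python
import re

def mac_line_pattern(oui):
    """Compile a regex matching a full MAC address that starts with `oui`."""
    # re.escape makes the dot in "001c.7f" literal instead of "any character"
    return re.compile(r"\b" + re.escape(oui) + r"[0-9a-f]{2}\.[0-9a-f]{4}\b")

pattern = mac_line_pattern("001c.7f")
hit = pattern.search("Internet 172.20.14.12 0 001c.7f41.186e ARPA")
miss = pattern.search("Internet 172.20.14.14 0 0200.5ebd.e17d ARPA")
print(hit.group(0) if hit else None, miss)
```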

Based on Hoodiecrow's answer, I did this:
set Ouilist { 0000.0c 0001.42 0001.43 0001.63 0001.64 ... }
set Macs1 [open "IP-ARP.txt" r]
foreach a [split [read -nonewline $Macs1] \n] {
set macAddr [lindex $a 3]
set IP [lindex $a 1]
if { [regexp {(172)\.([0-9]+)\.([0-9]+)\.([0-9]+)} $IP RealIP] } {
# OuiPart receives the vendor prefix (xxxx.xx) of the MAC address
regexp {([0-9a-f])([0-9a-f])([0-9a-f])([0-9a-f])\.([0-9a-f])([0-9a-f])} $macAddr OuiPart
if { $OuiPart in $Ouilist } {
puts "$RealIP $macAddr\r"
}
}
}
close $Macs1
That way I get all IPs that start with 172 and all MAC addresses that belong to the Cisco and Checkpoint vendors.

Related

Need help on a simple TCL script regarding regexp

I'm trying to write a tcl script basically to do following. Based on the syslog below,
LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/23, changed state to down
When that log is seen on the router, the script needs to push out a shell command which includes the interface index number, in this case it is 23. I can use regex to scrape the interface in the syslog by doing this below.
set interface ""
if {[regexp {.* (GigabitEthernet1/0/[0-9]*)} $syslog_msg match interface]} {
if [catch {cli_exec $cli1(fd) "show port_diag unit 1 port 23"} result] {
error $result $errorInfo
}
}
But how can I use only the interface index number (which is 23) in the command above? Do I need to extract [0-9]* from the regexp and store it as a variable or something like that?

Just enclose the expression [0-9]* in parentheses and append
a variable name, say num, which will receive the second capture group.
Here is a code snippet to demonstrate:
if {[regexp {.* (GigabitEthernet1/0/([0-9]*))} $syslog_msg match interface num]} {
puts $num
}
Output:
23
If the result looks okay, modify the command within the curly braces to perform your task as:
if {[regexp {.* (GigabitEthernet1/0/([0-9]*))} $syslog_msg match interface num]} {
if [catch {cli_exec $cli1(fd) "show port_diag unit 1 port $num"} result] {
error $result $errorInfo
}
}
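The nested capture group is not specific to Tcl. As a rough sketch of the same idea in Python (the variable names are only illustrative, and [0-9]+ is used instead of [0-9]* so that an interface without a number does not match):

```python
import re

syslog_msg = ("LINEPROTO-5-UPDOWN: Line protocol on Interface "
              "GigabitEthernet1/0/23, changed state to down")

# Group 1 is the whole interface name, group 2 (nested) is only the port number
m = re.search(r"(GigabitEthernet1/0/([0-9]+))", syslog_msg)
if m:
    interface, num = m.group(1), m.group(2)
    command = f"show port_diag unit 1 port {num}"
    print(interface, num, command)
```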

Perl Range command mismatching similar strings with one ending in a carriage return

The range command in Perl
RANGE
/^ identifier cust_pri/ .. /addr-type-none/
matches lines containing cust_pri as well as cust_pri_sip, where a carriage return immediately follows the string cust_pri (or cust_pri_sip). I want a match only on cust_pri, not on cust_pri_sip.
I tried adding \r\n, and each of the two individually, to no avail. Is there a string or metacharacter I can put at the end of the Perl range pattern to differentiate the two strings?
I need to look at data for both types of interfaces, but the first range command also collects the data that the second range command collects (cust_pri_sip), causing my first script to error out. The second works fine. I cannot change the input data, so I need a way to differentiate the two.
This is a sub script of the main Perl program
WIDTH = 65
DIRECTORY = /home/myfiles/
MASTER Config Lines
identifier cust_pri
description *
addr prefix 0.0.0.0
network interfaces M00|1:\d*
tcp media profile
monitoring filters
node functionality
default location string
alt family realm
addr-type-none
RANGE
/^ identifier cust_pri/ .. /addr-type-none/
#
There is another sub script that is similar to above
RANGE
/^ identifier cust_pri_sip/ .. /addr-type-none/
The first script also collects the data of both scripts because it matches.

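You can explicitly exclude _sip with /^ identifier cust_pri(?!_sip)/, or you can require cust_pri to be at the end of the line, with nothing after it, with /^ identifier cust_pri$/. If the lines really do end in a carriage return before the newline, the second form has to allow for it, e.g. /^ identifier cust_pri\r?$/, because $ does not match in front of a \r.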
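To see how the two approaches behave on lines with CRLF endings, here is a small Python sketch. The sample lines are made up (one leading space, \r\n at the end), and Python's $ behaves like Perl's here in that it matches only at the very end or just before a final newline.

```python
import re

lines = [" identifier cust_pri\r\n", " identifier cust_pri_sip\r\n"]

lookahead = re.compile(r"^ identifier cust_pri(?!_sip)")
plain = re.compile(r"^ identifier cust_pri$")          # fails when \r precedes \n
anchored = re.compile(r"^ identifier cust_pri\r?$")    # \r? tolerates CRLF endings

for line in lines:
    print(repr(line),
          bool(lookahead.search(line)),
          bool(plain.search(line)),
          bool(anchored.search(line)))
```

The negative lookahead does not depend on the line ending at all, which makes it the more robust choice when the input format cannot be changed.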

How to parse csv output requiring multiple matches using one-liner?

I have a scenario where I have to post-process / filter values taken from a DB. I'm using perl -ple for the task. All works well until I come across extracted output (CSV) which contains multiple text tags. See sample here. The code extracts correctly if there is just one text tag, but in my DB there are instances with more than one text tag (i.e. rule conditions).
The code is
echo "COPY (SELECT rule_data FROM custom_rule) TO STDOUT with CSV HEADER" | psql -U qradar -o /tmp/Rules.csv qradar;
perl -ple '
($enabled) = /(?<=enabled="").*?(?="")/g;
($group) = /(?<=group="").*?(?="")/g;
($name) = /(?<=<name>).*?(?=<\/name>)/g;
($text) = /(?<=<text>).*?(?=<\/text>)/g;
$_= "$enabled;$group;$name;$text";
s/<.*?>//g;
' Rules.csv > rules_revised.csv
Just running the code on sample output I get following content in rule_revised file.
true;Flow Property Tests;DoS: Local Flood (Other);when the flow bias
is any of the following outbound
The line is truncated after "outbound"; it should in fact continue with information similar to this:
when at least 3 flows are seen with the same Source IP,
Destination IP in 5 minutes and when the IP protocol is one of the
following IPSec, Uncommon and when the source packets is greater than
60000
I have tried to correct this by making the regex greedy (removing the ? in $text), but then it swallows all the in-between text up to the last text tag, and the final s/<.*?>//g messes up the rest, because the match now includes all the tag characters (i.e. the HTML elements) that I originally intended to exclude.

The reason you are getting a truncated result with multiple matches is that you only store the first one.
($text) = /(?<=<text>).*?(?=<\/text>)/g;
This only stores the first match. If you change that scalar to an array, you will capture all matches:
(@text) = /(?<=<text>).*?(?=<\/text>)/g;
When you interpolate the array, it will insert spaces (the value of $") between the elements. If you do not want that, you can change the value of $" to an acceptable delimiter. To be clear, you would change two characters to get the following lines:
(@text) = /(?<=<text>).*?(?=<\/text>)/g;
...
$_= "$enabled;$group;$name;@text";
If I run your code on your sample with these changes the output looks like this:
false;Flow Property Tests;DoS: Local Flood (Other);when the flow bias is any of the following outbound when at least 3 flows are seen with the same Source IP, Destination IP in 5 minutes when the IP protocol is one of the following IPSec, Uncommon when the source packets is greater than 60000
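The difference between keeping the first match and keeping all of them can also be shown in Python. This is only a sketch on a made-up record (the real rule_data is not shown in the question), using the same lookbehind/lookahead pattern:

```python
import re

# Hypothetical record with two <text> elements
record = ("<name>DoS: Local Flood (Other)</name>"
          "<text>when the flow bias is any of the following outbound</text>"
          "<text>when at least 3 flows are seen</text>")

pattern = r"(?<=<text>).*?(?=</text>)"

first = re.search(pattern, record).group(0)   # first match only
all_texts = re.findall(pattern, record)       # every match
joined = " ".join(all_texts)                  # like interpolating the Perl array

print(first)
print(joined)
```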

Have you tried the s modifier? It makes the dot match newlines:
perl -ple '
($enabled) = /(?<=enabled="").*?(?="")/g;
($group) = /(?<=group="").*?(?="")/g;
($name) = /(?<=<name>).*?(?=<\/name>)/g;
($text) = /(?<=<text>).*?(?=<\/text>)/gs;
# here ___^
$_= "$enabled;$group;$name;$text";
s/<.*?>//g;
' Rules.csv > rules_revised.csv

Stuck on perl regex expression for string with ending white space

Following is a line from an ftp log:
2013-03-05 18:37:31 543.21.12.22 []sent /home/mydomain/public_html/court-9746hd/Chairman-confidential-video.mpeg 226 court-9746hd@mydomain.com 256
I am using a program called Simple Event Correlator (SEC), which pulls values from inside the parentheses of a regular expression and assigns those values to variables.
So, here is an entry in a SEC config file which is supposed to operate on the previous log file line:
pattern=sent \/home\/mydomain\/public_html\/(.*)\/(.*)
This succeeds in pulling out the logged in user, court-9746hd, and setting it to a variable, but fails to properly extract the file name downloaded, or, Chairman-confidential-video.mpeg
Instead, it pulls out the file downloaded as: Chairman-confidential-video.mpeg 226 court-9746hd@mydomain.com 256
So you see, I'm having difficulty getting the second extraction to stop at the first white space after the file name. I've tried:
pattern=sent \/home\/mydomain\/public_html\/(.*)\/(.*)\s
but I only get the same result. Any help would be greatly appreciated.

If you only want to match non-whitespace, replace .* with \S* or if space is the only character you want to exclude then use [^ ]* instead.
Also, man perlre is a good reference.

Rather than using the .* construct, use something narrower in scope, as a general rule. In this case what you want is something which is not a white space, so say that explicitly:
pattern=sent \/home\/mydomain\/public_html\/([^\s]+)\/([^\s]+)
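To make the difference concrete, here is a short Python comparison of the greedy pattern from the question with a narrower one. It is a sketch, not SEC configuration; the narrow pattern additionally excludes slashes from the first group, which is my own choice.

```python
import re

line = ("2013-03-05 18:37:31 543.21.12.22 []sent "
        "/home/mydomain/public_html/court-9746hd/Chairman-confidential-video.mpeg "
        "226 court-9746hd@mydomain.com 256")

# .* runs to the end of the line, so group 2 swallows the trailing fields
greedy = re.search(r"sent /home/mydomain/public_html/(.*)/(.*)", line)
# \S+ stops at the first whitespace character after the file name
narrow = re.search(r"sent /home/mydomain/public_html/([^/\s]+)/(\S+)", line)

print(greedy.groups())
print(narrow.groups())
```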

One option is to first capture the full path from the line, and then use File::Spec to get the user and file info:
use strict;
use warnings;
use File::Spec;
my $line = '2013-03-05 18:37:31 543.21.12.22 []sent /home/mydomain/public_html/court-9746hd/Chairman-confidential-video.mpeg 226 court-9746hd@mydomain.com 256';
my ( $path ) = $line =~ m!\s+(/home\S+)\s+!;
my ( $user, $file ) = ( File::Spec->splitdir($path) )[ -2, -1 ];
print "User: $user\nFile: $file";
Output:
User: court-9746hd
File: Chairman-confidential-video.mpeg
However, if you want to only use a regex, the following will work:
m!/home/.+/.+/([^/]+)/(\S+)!

How can I assign a variable using $expect_out in TCL/EXPECT?

If I want to match DEF_23 using the following regexp:
expect {
-re "DEF_\[0-9]*" {
set result $expect_out(1,string)
}
}
why does it say no such element in array?
How does $expect_out work, and how can I capture the DEF using a regexp and assign it to the variable result?

You're looking for expect_out(0,string) -- the array element 1,string would be populated if you had capturing parentheses in your regular expression.
The expect manpage documents the use of expect_out in the documentation of the expect command:
Upon matching a pattern (or eof or full_buffer), any matching and previously unmatched output is saved in the variable expect_out(buffer). Up to 9 regexp substring matches are saved in the variables expect_out(1,string) through expect_out(9,string). If the -indices flag is used before a pattern, the starting and ending indices (in a form suitable for lrange) of the 10 strings are stored in the variables expect_out(X,start) and expect_out(X,end) where X is a digit, corresponds to the substring position in the buffer. 0 refers to strings which matched the entire pattern and is generated for glob patterns as well as regexp patterns.
There is an illustrative example in the manpage.
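The same distinction between "the whole match" (index 0) and "the first parenthesised group" (index 1) exists in most regex APIs. A loose Python analogy, not Expect code, with a made-up input string:

```python
import re

output = "status DEF_23 ok"

# No parentheses in the pattern: only group 0 (the whole match) exists
whole = re.search(r"DEF_[0-9]*", output)
print(whole.group(0))
try:
    whole.group(1)
except IndexError as exc:
    print("no group 1:", exc)

# With capturing parentheses, groups 1 and 2 become available
captured = re.search(r"(DEF)_([0-9]*)", output)
print(captured.group(0), captured.group(1), captured.group(2))
```

In Expect terms, the second pattern is what would populate expect_out(1,string) with DEF.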

It seems that the above explanation is not precise!
Check this example:
$ cat test.exp
#!/usr/bin/expect
set timeout 5
log_user 0
spawn bash
send "ls -1 db*\r"
expect {
-re "^db.*$" {
set bkpfile $expect_out(0,string)
}
}
send_user "The filename is: $bkpfile\n"
close
$ ls -1 db*
dbupgrade.log
$ ./test.exp
can't read "bkpfile": no such variable
while executing
"send_user "The filename is: $bkpfile\n""
(file "./test.exp" line 15)
$
The result is the same when $expect_out(1,string) or $expect_out(buffer) is used.
Am I missing something, or is this the expected behavior?

Aleksandar - it should work if you change the match to "\ndb.*$".
If you turn on exp_internal 1, you will see the buffer contains something like this: "ls -1 db*\r\ndbupgrade.log\r\n08:46:09"
So, the caret (^) will throw your pattern match off.
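The effect of the caret can be reproduced outside Expect. The sketch below uses Python's re module on the buffer contents quoted above; it is an analogy only, since Python and Tcl regexes differ in detail, which is why \S* is used here to stop at the end of the file name.

```python
import re

buffer = "ls -1 db*\r\ndbupgrade.log\r\n08:46:09"

# ^ anchors at the start of the whole buffer, which begins with the echoed command
anchored = re.search(r"^db\S*", buffer)
# Matching the preceding newline explicitly finds the file name line
after_newline = re.search(r"\ndb\S*", buffer)
# Python-specific alternative: in multiline mode ^ also matches after each \n
multiline = re.search(r"^db\S*", buffer, re.M)

print(anchored)
print(repr(after_newline.group(0)))
print(multiline.group(0))
```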
