Distinguish multiple regex hits in a line?


I'm trying to replace IP-Addresses with random numbers in Perl:
while (my $line = <file>){
$line =~ $regex{'ipadress'};
my $rand0 = int(rand(256));
my $rand1 = int(rand(256));
my $rand2 = int(rand(256));
my $rand3 = int(rand(256));
$& = "$rand0.$rand1.$rand2.$rand3\n";
}
The problem is that in some cases there are multiple IP-Addresses in one line.
How to avoid that they all get the same random numbers?

Well for a start $& is read-only and you can't assign to it like that to modify the target string.
I'm also unsure whether the key to your hash is really ipadress (with one d) but I'm sure you can fix it if not.
I would write something like this. The /e modifier on the substitute operator causes the replacement string to be executed to determine what to replace the match with. The join statement generates four byte values from 0 to 255 and joins them with dots to form a random address.
while (my $line = <$fh>) {
$line =~ s{$regex{ipadress}}{
join '.', map int(rand(256)), 0..3
}eg;
print $line;
}
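For anyone who wants to try this without the original %regex hash, here is a minimal self-contained sketch. The pattern in $ip_re is only an assumed stand-in for $regex{ipadress} (the real pattern isn't shown in the question), and randomize_ips is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $regex{ipadress}: four groups of 1-3 digits.
# It does not check that each group is <= 255.
my $ip_re = qr/\b\d{1,3}(?:\.\d{1,3}){3}\b/;

# Replace every address in a line with a freshly generated random one.
sub randomize_ips {
    my ($line) = @_;
    # /e evaluates the replacement as code once per match; /g repeats for every match
    $line =~ s/$ip_re/join '.', map { int(rand(256)) } 1 .. 4/eg;
    return $line;
}

print randomize_ips("from 10.0.0.1 to 192.168.1.20\n");
```

Because the replacement code runs again for each match, two addresses on the same line are drawn independently.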

This might be helpful:
sub rip { return join(".", map { int(rand(256)) } (1..4) ) }
open my $f, '<', 'input' or die($!);
while (my $line = <$f>){
$line =~ s/$regex{'ipadress'}/rip()/eg;
}
close($f);

These answers are good ways to ensure that new random numbers are picked for each IP address. But the poster's main question is, "How to avoid that they all get the same random numbers?" and it's unclear to me whether they meant "get four random numbers for each IP address in the line" or "guarantee that no two randomly-chosen IP addresses are the same."
In case it's the latter: the probability of getting the same results from four calls of rand(256) twice in a row is one in 2^32, which seems hardly worth worrying about, but if you are required to guarantee that they're different, you can keep a hash of addresses you've already picked, and update it each time you generate a new address. Stealing from @perreal's solution:
sub rip {
my $picked_addrs = shift;
my $new_addr;
do {
$new_addr = join(".", map { int(rand(256)) } (1..4) );
} while defined($picked_addrs->{$new_addr});
$picked_addrs->{$new_addr} = 1;
return $new_addr;
}
open my $f, '<', 'input' or die($!);
while (my $line = <$f>){
my %picked_addrs;
$line =~ s/$regex{'ipadress'}/rip(\%picked_addrs)/eg;
}
close($f);
If you want to make sure that you never pick the same address twice anywhere in the file, just declare %picked_addrs outside the while loop, so it doesn't get reset for each line:
open my $f, '<', 'input' or die($!);
my %picked_addrs;
while (my $line = <$f>){
$line =~ s/$regex{'ipadress'}/rip(\%picked_addrs)/eg;
}
close($f);

Related

Extract specific values from a log file

I want to extract two values on the same line of a log file using Perl.
Network Next Hop metric locprf Path
*|i10.1.5.0/24 10.6.76.242 2 100 0 65000?
*|i10.1.9.0/24 10.6.76.242 2 100 0 64345 63800?
*|i10.2.9.0/25 10.6.76.242 2 100 0?
For each line, I want to extract the network address and the number before the ?
I have this but it extracts only the network address.
open( CONF, '<', 'putty-wan.log' ) or die "\n";
my @ip;
open( FICHE, ">RouterNetwork.txt" ) || die ( "Cannot create the file \"RouterNetwork.txt\"" );
while ( my $line = <CONF> ) {
if ( $line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\/\d{1,2})/ ) {
print FICHE $1, "\n";
}
}
close(FICHE);
close CONF;
Now I want to extend the regular expression, or use any other approach, so that for each line I get both the network address and the number just before the ?.

Given the shown format, you can process the line with
my ($ip, $n) = map { s/^\D*|\D*$//gr } (split ' ', $line)[0,-1];
or, when the line is in the $_ variable
my ($ip, $n) = map { s/^\D*|\D*$//gr } (split)[0,-1];
With the non-destructive /r modifier the new string is returned and the original is left unchanged (which doesn't matter here). It's available since v5.14. If your version of Perl is older use
my ($ip, $n) = map { s/^\D*|\D*$//g; $_ } (split)[0,-1];
As for processing the whole file, you need a way to detect header lines. How to do this depends on details of your file format. Given the sample, perhaps skip lines starting with letter-only words
use warnings;
use strict;
use feature 'say';
my $file = 'putty-wan.log';
open my $fh, '<', $file or die "Can't open $file: $!";
while (<$fh>)
{
next if /^[a-zA-Z]+\b/;
my ($ip, $num) = map { s/^\D*|\D*$//gr } (split)[0,-1];
say "$ip $num";
}
Some comments
Please always start with use warnings; and use strict;
Use the three-argument form of open with a lexical filehandle. The mode can then never be confused with part of the file name, and the handle is closed automatically when it goes out of scope.
Always include $! in your die statements to see the actual error. This is the "default" way to do it, although sometimes other error variables are needed as well.
While there is nothing wrong with using || as you do, or is very handy for flow control because of its suitably low precedence. Above all, be consistent.
It's been clarified that the last part on the line can also be 6500 ? or 65000 i or such.
Then store all fields in an array and process it from the back, looking for the first field with numbers.
while (<$fh>)
{
next if /^[a-zA-Z]+\b/;
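my @fields = split;
my $ip = (shift @fields) =~ s/^\D*//gr; #/# need v5.14 for /r
my $num;
while (defined(my $f = pop @fields)) { # defined, so a bare "0" field doesn't end the loop
($num) = $f =~ /(\d+)/;
last if defined $num; # a captured 0 is a valid result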
}
say "$ip $num";
}
The IP is still obtained from the first field, and cleaned up the same way as before.

There is nothing special to do: just extend the pattern to describe the rest of the line, up to the number you want to capture:
use strict;
use warnings;
open (my $conf, '<', 'putty-wan.log') || die "Don't eat too much Montbéliard sausages\n";
open (my $output, '>', 'RouterNetwork.txt') || die ('Cannot create the file "RouterNetwork.txt"');
while( <$conf> ) { # the current line is stored in $_
print $output "$1\t$2\n" if /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\/\d{1,2}).*\b(\d+)\?/;
}
close $output;
close $conf;
Note the word boundary before the number to be sure to obtain the whole number and not the last digit only.
The pattern can also be shortened to: /([\d.]{7,15}\/\d\d?).*?(\d+)\?/
Avoid old-school programming style and follow current Perl practices (use strict and warnings systematically).
Note that with log files, a field-based approach (splitting the line on whitespace) is sometimes handier.
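The field-based approach mentioned above could look roughly like this. It is only a sketch that assumes the layout of the sample lines in the question (network in the first field, the wanted number glued to a ? in the last field); parse_route is a hypothetical helper name.

```perl
use strict;
use warnings;

# Return (network, number before '?') for a route line, or an empty list.
sub parse_route {
    my ($line) = @_;
    my @fields = split ' ', $line;
    return unless @fields >= 2;
    # first field: skip leading status flags such as "*|i" and capture addr/prefix
    my ($network) = $fields[0]  =~ m{(\d{1,3}(?:\.\d{1,3}){3}/\d{1,2})};
    # last field: digits immediately followed by '?'
    my ($number)  = $fields[-1] =~ /(\d+)\?$/;
    return unless defined $network && defined $number;
    return ($network, $number);
}

my ($net, $num) = parse_route('*|i10.1.9.0/24 10.6.76.242 2 100 0 64345 63800?');
print "$net\t$num\n";
```

Header lines fall out naturally here, since their first field contains no network address and the function then returns an empty list.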

Ignore blank variable return from qx

I'm having difficulty with one little bit of my code.
open ("files","$list");
while (my $sort = <files>) {
chomp $sort;
foreach my $key (sort keys %ips) {
if ($key =~ $sort) {
print "key $key\n";
my $match =qx(iptables -nL | grep $key 2>&1);
print "Match Results $match\n";
chomp $match;
my $banned = $1 if $match =~ (/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/);
print "Banned Results $banned\n";
if ($key =~ $banned) {
print "Already banned $banned\n";
} else {
system ("iptables -A INPUT -s $key -j DROP");
open my $fh, '>>', 'banned.out';
print "Match Found we need to block it $key\n";
print $fh "$key:$timestamp\n";
close $fh;
}
}
}
}
So basically what I'm doing is opening a list of addresses 1 per line.
Next I'm sorting down my key variable from another section of my script and matching it with my list, if it matches then it continues on to the if statement.
Now with that matched key I need to check and see if its blocked already or not, so I'm using a qx to execute iptables and grep for that variable. If it matches everything works perfectly.
If it does not match, in other words my iptables -nL | grep $key returns a blank value instead of moving on to my else statement it "grabs" that blank value for $match and continues to execute.
For the life of me I can't figure out how to strip that blank value out and basically show it as no return.
I know there are modules for iptables etc however I have to keep this script as generic as possible.

The problem is that, when iptables returns no results, $banned is left at its default value of undef. Used as a regex, $banned matches every string, so your condition:
if ($key =~ $banned) {
always matches. I think what you meant to write was probably
if ($key eq $banned) {
which will fail if either $banned is undef (because $match was empty or didn't match the regex) or if the IP address you pulled out with the regex was somehow different from $key.
If you're confident that the first IP in the iptables result will be the same as $key then you could simplify your condition to just
if ($match =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/) {
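Putting both checks together, a defensive version might look like the sketch below. The function name already_banned is made up for illustration, and $match stands in for whatever qx() returned (possibly an empty string).

```perl
use strict;
use warnings;

# Decide whether $key already appears in captured iptables output.
# $match stands in for the text returned by qx(); it may be empty or undef.
sub already_banned {
    my ($key, $match) = @_;
    return 0 unless defined $match && length $match;
    # capture the first dotted quad, if any; $banned stays undef on no match
    my ($banned) = $match =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/;
    return (defined $banned && $banned eq $key) ? 1 : 0;
}

print already_banned('10.0.0.5', ''), "\n";
```

Capturing in list context avoids the my $banned = $1 if ... construct, so $banned is simply undef when nothing matched and the string comparison is never reached.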

I suggest you put the entire output from iptables -nL into an array and grep it using Perl. That way you will be calling the utility only once, and it is easy to detect an empty list.
If you write
my @iptables = qx(iptables -nL);
at the top of your code, then you can query this output by
my @match = grep /\b$key\b/, @iptables;
and if there are no records that contain the IP address then a subsequent
if (@match) { ... }
will fail.
There are a few other problems with your code. Firstly, you must always use strict and use warnings at the start of your program, and declare all variables at their first point of use. This will uncover many simple errors that you may otherwise easily overlook, and applies especially if you are asking for help with your code.
Your open call should look like
open my $fh, '<', $file or die $!;
together with
while (my $sort = <$fh>) { ... }
And you seem to have missed the point of hashes. There is no need to read through all of the keys in a hash looking for a match, as the hash elements can be accessed directly with $ips{$sort}. If the value returned is undef then the element doesn't exist, or you can explicitly check for its existence with if (exists $ips{$sort}) { ... }.
I cannot help further as I have no access to a platform that provides iptables. If you need more help then please post some output from the utility.
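As a rough illustration of the two suggestions above (grep the listing once in Perl, and look keys up directly in the hash), here is a sketch that uses hard-coded sample lines instead of calling iptables. The sample lines, the contents of %ips and the helper is_listed are all assumptions made for the example; the match is also a little stricter than /\b$key\b/ because it quotes the dots.

```perl
use strict;
use warnings;

# Hypothetical sample standing in for: my @iptables = qx(iptables -nL);
my @iptables = (
    "Chain INPUT (policy ACCEPT)\n",
    "DROP       all  --  10.0.0.5             0.0.0.0/0\n",
);
# Hypothetical hash of suspect addresses, keyed by IP
my %ips = ('10.0.0.5' => 3, '10.0.0.50' => 1);

# True if any line of the listing mentions exactly this address.
sub is_listed {
    my ($key, $lines) = @_;
    # \Q...\E keeps the dots literal; the lookarounds reject longer addresses
    return scalar grep { /(?<![\d.])\Q$key\E(?![\d.])/ } @$lines;
}

for my $candidate ('10.0.0.5', '10.0.0.50', '172.16.0.1') {
    next unless exists $ips{$candidate};    # direct hash lookup, no loop over keys
    my $state = is_listed($candidate, \@iptables) ? 'already banned' : 'needs a rule';
    print "$candidate: $state\n";
}
```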

String replace in Perl

I am trying to deobfuscate code. This code uses a lot of long variable names which are substituted with meaningful names at the time of running the code.
How do I preserve the state while searching and replacing?
For instance, with an obfuscated line like this:
${${"GLOBALS"}["ttxdbvdj"]}=_hash(${$urqboemtmd}.substr(${${"GLOBALS"}["wkcjeuhsnr"]},${${"GLOBALS"}["gjbhisruvsjg"]}-${$rrwbtbxgijs},${${"GLOBALS"}["ibmtmqedn"]}));
There are multiple mappings in mappings.txt which match above obfuscated line like:
$rrwbtbxgijs = hash_length;
$urqboemtmd = out;
At the first run, it will replace $rrwbtbxgijs with hash_length in the obfuscated line above. Now, when it comes across the second mapping during the next iteration of the outer while loop, it will replace $urqboemtmd with out in the obfuscated line.
The problem is:
When it comes across first mapping, it does the substitution. However, when it comes across next mapping in the same line for a different matching string, the previous search/replace result is not there.
It should preserve the previous substitution. How do I do that?
I wrote a Perl script, which would pick one mapping from mapping.txt and search the entire obfuscated code for all the occurrences of this mapping and replace it with the meaningful text.
Here is the code I wrote:
#! /usr/bin/perl
use warnings;
($mapping, $input) = @ARGV;
open MAPPING, '<', $mapping
or die "couldn't read from the file, $mapping with error: $!\n";
while (<MAPPING>) {
chomp;
$line = $_;
($key, $value) = split("=", $line);
open INPUT, '<', $input;
while (<INPUT>) {
chomp;
if (/$key/) {
$_=~s/\Q$key/$value/g;
print $_,"\n";
}
}
close INPUT;
}
close MAPPING;

To match the literal meta characters inside your string, you can use quotemeta or:
s/\Q$key\E/$replace/

Just tell Perl not to interpret the characters in $key:
s/\Q$key/$value/g
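A small sketch of why the quoting matters for keys like the ones in the question: without \Q, the leading $ in the key is treated as an end-of-line anchor and the pattern never matches. The helper name replace_literal and the sample line are invented for the example.

```perl
use strict;
use warnings;

# Replace every literal occurrence of $key in $text with $value.
sub replace_literal {
    my ($text, $key, $value) = @_;
    # \Q...\E escapes regex metacharacters in $key, so '$' and '.' match themselves
    $text =~ s/\Q$key\E/$value/g;
    return $text;
}

my $code = 'substr($data, $len - ${$rrwbtbxgijs})';
print replace_literal($code, '$rrwbtbxgijs', 'hash_length'), "\n";
```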

Consider using B::Deobfuscate and gradually enter variable names into its configuration file as you figure out what they do.
I'm a little confused about your request to save state. What exactly are you doing/do you intend to do with the output? Here's an (untested) example of doing all the substitutions in one pass, if that helps?
my %map;
while ( my $line = <MAPPING> ) {
chomp $line;
my ($key, $value) = split("=", $line);
$map{$key} = $value;
}
close MAPPING;
my $search = qr/(@{[ join '|', map quotemeta, sort { length $b <=> length $a } keys %map ]})/;
while ( my $line = <INPUT> ) {
$line =~ s/$search/$map{$1}/g;
print OUTPUT $line;
}
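Here is a self-contained variant of the same one-pass idea that works on in-memory strings instead of file handles. It assumes mapping lines look like the ones quoted in the question ('$name = replacement;', with optional spaces and a trailing semicolon); parse_mappings and apply_mappings are hypothetical names.

```perl
use strict;
use warnings;

# Build a lookup table from mapping lines such as '$rrwbtbxgijs = hash_length;'.
sub parse_mappings {
    my @lines = @_;
    my %map;
    for my $line (@lines) {
        # key: text before '='; value: text after it, minus spaces and a trailing ';'
        my ($key, $value) = $line =~ /^\s*(\S+)\s*=\s*(.*?)\s*;?\s*$/ or next;
        $map{$key} = $value;
    }
    return %map;
}

# Apply every mapping to $text in a single pass.
sub apply_mappings {
    my ($text, %map) = @_;
    return $text unless %map;
    # longest keys first, so a key that is a prefix of another cannot win
    my $alternation = join '|', map { quotemeta($_) }
                      sort { length($b) <=> length($a) } keys %map;
    $text =~ s/($alternation)/$map{$1}/g;
    return $text;
}

my %map = parse_mappings('$rrwbtbxgijs = hash_length;', '$urqboemtmd = out;');
print apply_mappings('_hash(${$urqboemtmd}.substr($x,$y-${$rrwbtbxgijs}))', %map), "\n";
```

Since each line is rewritten once with all mappings, nothing from an earlier substitution can be lost, which is the "preserve state" problem described in the question.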

Perl substitution using a hash

open (FH,"report");
read(FH,$text,-s "report");
$fill{"place"} = "Dhahran";
$fill{"wdesc:desc"} = "hot";
$fill{"dayno.days"} = 4;
$text =~ s/%(\w+)%/$fill{$1}/g;
print $text;
This is the content of the "report" template file
"I am giving a course this week in %place%. The weather is %wdesc:desc%
and we're now onto day no %dayno.days%. It's great group of blokes on the
course but the room is like the weather - %wdesc:desc% and it gets hard to
follow late in the day."
For reasons that I won't go into, some of the keys in the hash I'll be using will have dots (.) or colons (:) in them, but the regex stops working for these, so for instance in the example above only %place% gets correctly replaced. By the way, my code is based on this example.
Any help with the regex greatly appreciated, or maybe there's a better approach...

You could loosen it right up and use "any sequence of anything that isn't a %" for the replaceable tokens:
$text =~ s/%([^%]+)%/$fill{$1}/g;

Good answers so far, but you should also decide what you want to do with %foo% if foo isn't a key in the %fill hash. Plausible options are:
Replace it with an empty string (that's what the current solutions do, since undef is treated as an empty string in this context)
Leave it alone, so "%foo%" stays as it is.
Do some kind of error handling, perhaps printing a warning on STDERR, terminating the translation, or inserting an error indicator into the text.
Some other observations, not directly relevant to your question:
You should use the three-argument version of open.
That's not the cleanest way to read an entire file into a string. For that matter, for what you're doing you might as well process the input one line at a time.
Here's how I might do it (this version leaves unrecognized "%foo%" strings alone):
#!/usr/bin/perl
use strict;
use warnings;
my %fill = ( place => 'Dhahran',
'wdesc:desc' => 'hot',
'dayno.days' => 4 );
my $filename = 'report';
open my $FH, '<', $filename or die "$filename: $!\n";
while (my $line = <$FH>) {
foreach my $key (keys %fill) {
$line =~ s/\Q%$key%/$fill{$key}/g;
}
print $line;
}
And here's a version that dies with an error message if there's an unrecognized key:
#!/usr/bin/perl
use strict;
use warnings;
my %fill = ( place => 'Dhahran',
'wdesc:desc' => 'hot',
'dayno.days' => 4 );
my $filename = 'report';
open my $FH, '<', $filename or die "$filename: $!\n";
while (my $line = <$FH>) {
$line =~ s/%([^%]*)%/Replacement($1)/eg;
print $line;
}
sub Replacement {
my($key) = @_;
if (exists $fill{$key}) {
return $fill{$key};
}
else {
die "Unrecognized key \"$key\" on line $.\n";
}
}
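The remaining option from the list above, printing a warning on STDERR while leaving the unknown token in place, might be sketched like this. expand_template is a made-up name, and the hash is passed by reference purely to keep the function self-contained.

```perl
use strict;
use warnings;

my %fill = (
    place        => 'Dhahran',
    'wdesc:desc' => 'hot',
    'dayno.days' => 4,
);

# Expand %key% tokens; unknown keys trigger a warning and are left untouched.
sub expand_template {
    my ($text, $values) = @_;
    $text =~ s{%([^%]+)%}{
        exists($values->{$1})
            ? $values->{$1}
            : do { warn "Unrecognized key \"$1\"\n"; "%$1%" }
    }eg;
    return $text;
}

print expand_template("Day %dayno.days% in %place%, weather %wdesc:desc%.\n", \%fill);
```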

http://codepad.org/G0WEDNyH
$text =~ s/%([a-zA-Z0-9_\.\:]+)%/$fill{$1}/g;
By default \w equates to [a-zA-Z0-9_], so you'll need to add in the \. and \:.

Perl Regex Problem!

I am reading a string from a file:
2343,0,1,0 ... 500 times ...3
Above is an example of $_ when it is read from a file. It is any number, followed by 500 comma separated 0's/1's then the number 3.
while(<FILE>){
my $string = $_;
chomp($string);
my $a = chop($string);
my $found;
if($string=~m/^[0-9]*\,((0,|1,){$i})/){
$found = $&.$a;
print OTH $found,"\n";
}
}
I am using chop to get the number 3 from the end of the string, then matching the first number followed by $i occurrences of 0, or 1. The problem I'm having is that chop is not working on the string for some reason. In the if statement, when I try to concatenate the match and the chopped number, all I get back is the contents of $&.
I have also tried using my $a = substr $a,-1,1; to get the number 3 and this also hasn't worked.
The thing that's odd is that this code works in Eclipse on Windows, and when I put it onto a Linux server it won't work. Can anyone spot the silly mistake I'm making?

As a rule, I tend always to allow for unseen whitespace in my data. I find that it makes my code more robust expecting that somebody didn't see an extra space at the end of a line or string (as in writing to a log). So I think this would solve your problem:
my ( $a ) = $string =~ /(\S)\s*$/;
Of course, since you know you are looking for a number, it's better to be more precise:
my ( $a ) = $string =~ /(\d+)\s*$/;

Watch out for the end-of-line character. I can't test here, but I assume you are just chopping a newline. Try trimming the string first, then chop it. See for example http://www.somacon.com/p114.php
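One plausible explanation for "works on Windows, fails on Linux" is that the file has Windows (CRLF) line endings: on Linux, chomp removes only the "\n", so chop then returns the leftover "\r" instead of the 3. This is an assumption about the data, not something the question confirms. A sketch of trimming first, with split_last_char as a hypothetical helper:

```perl
use strict;
use warnings;

# Split a record into (body, last character), ignoring any line ending.
sub split_last_char {
    my ($line) = @_;
    # remove "\n", "\r\n" or stray trailing whitespace; chomp alone leaves "\r" on Linux
    $line =~ s/\s+\z//;
    my $last = chop $line;
    return ($line, $last);
}

my ($body, $last) = split_last_char("2343,0,1,0,3\r\n");
print "body=$body last=$last\n";
```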

Instead of trying to do it that way, why not use a regexp to pull out everything you need in one go?
my $x = "4123,0,1,0,1,4";
$x =~ /^[0-9]+,((?:0,|1,){4})([0-9]+)/;
print "$1\n$2\n";
Produces:
0,1,0,1,
4
Which is pretty much what you're looking for. Both sets of needed answers are in the match variables.
Note that I included ?: in the front of the 0,1, matching so that it didn't end up in the output match variables.

I'm really not sure what you are trying to achieve here but I've tried the code on Win32 and Solaris and it works. Are you sure $i is the correct number? Might be easier to use * or ?
use strict;
use warnings;
while(<DATA>){
my $string = $_;
chomp($string);
my $a = chop($string);
print "$string\n";
my $found;
if($string=~m/^[0-9]*\,((0,|1,)*)/){
$found = $&.$a;
print $found,"\n";
}
}
__DATA__
2343,0,1,0,0,1,1,0,0,0,1,1,0,0,0,1,1,0,0,0,1,1,0,0,0,1,1,0,0,0,1,1,0,0,0,1,1,0,0,0,1,1,0,3

I don't see much reason to use a regex in this case, just use split.
use strict;
use warnings;
use autodie; # open will now die on failure
my %data;
{
# limit the scope of $fh
open my $fh, '<', 'test.data';
while(<$fh>){
chomp;
s(\s+){}g; # remove all spaces
my($number,@bin) = split ',', $_;
# uncomment if you want to throw away the 3
# pop @bin if $bin[-1] == 3;
$data{$number} = \@bin;
}
close $fh;
}
If all you want is the 3
while(<$fh>){
# the greedy .* skips ahead to the last number; \b keeps all of its digits
my($last_number) = /.*\b([0-9]+)/;
}
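A quick sketch of how a greedy .* interacts with the capture group when the last number has more than one digit. Without a word boundary, .* gives back only one character, so the group sees just the final digit. last_number is a hypothetical helper name.

```perl
use strict;
use warnings;

# Return the last run of digits in a string, or undef if there is none.
sub last_number {
    my ($text) = @_;
    # greedy .* backtracks only as far as the last word boundary before digits
    my ($n) = $text =~ /.*\b([0-9]+)/;
    return $n;
}

my ($greedy_only) = '2343,0,1,13' =~ /.*([0-9]+)/;   # .* swallows all but one digit
print "without \\b: $greedy_only, with \\b: ", last_number('2343,0,1,13'), "\n";
```

For the single-digit 3 in the question both patterns give the same answer; the difference only shows up with longer trailing numbers.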
