How to use Regex in a While If statement? Perl

How to use Regex in a While If statement? Perl - regex

I'm new to programming and I've run into an issue. We have to use Perl to write a script that opens a file, then loops through each line using a Regex - then print out the results. The opening of the file and the loop I have, but I can't figure out how to implement the Regex. It outputs 0 matched results, when the assignment outline suggests the number to be 338. If I don't use the Regex, it outputs 2987, which is the total number of lines - which is correct. So there's something incorrect with the Regex I just can't figure out. Any help would be greatly appreciated!
Here's what I have thus far:
use warnings;
use strict;
my $i = 0;
my $filename = 'C:\Users\sample.log.txt';
open (fh, '<', $filename) or die $!;
while(<fh>) {
if ($filename=~ /(sshd)/){
$i++;
}
}
close(fh);
print $i;

Consider this piece of code of yours:
while(<fh>) {
if ($filename=~ /(sshd)/){
$i++;
}
}
You are indeed looping through the file lines, but you keep checking if the file name matches your regex. This is clearly not what you intend.
You meant:
while (my $line = <fh>) {
if ($line =~ /sshd/){
$i++;
}
}
Parentheses around the regex seem superfluous (they are meat to capture, while you are only matching).
Since expression while (<fh>) assigns the content of the line to special variable $_ (which is the default argument for regexp matching), this can be shortened as:
while (<fh>) {
$i++ if /sshd/;
}

OP code has some errors which I've correcte
use warnings;
use strict;
use feature 'say';
my $i = 0;
my $filename = 'C:\Users\sample.log.txt';
open my $fh, '<', $filename
or die "Couldn't open $filename";
map{ $i++ if /sshd/ } <$fh>;
close($fh);
say "Found: $i";

Related

Why does this regex in perl work for one word but not another?

I'm new to perl so please excuse me if my question seems obvious. I made a small perl script that just examines itself to extract a particular substring I'm looking for and I'm getting results that I can't explain. Here is the script:
use 5.006;
use strict;
use warnings;
use File::Find;
my #files;
find(
sub { push #files, $File::Find::name unless -d; },
"."
);
my #filteredfiles = grep(/.pl/, #files);
foreach my $fileName (#filteredfiles)
{
open (my $fh, $fileName) or die "Could not open file $fileName";
while (my $row = <$fh>)
{
chomp $row;
if ($row =~ /file/)
{
my ($substring) = $row =~ /file\(([^\)]*)\)/;
print "$substring\n" if $substring;
}
}
close $fh;
}
# file(stuff)
# directory(stuff)
Now, when I run this, I get the following output:
stuff
[^\
Why is it printing the lines out of order? Since the "stuff" line occurs later in the file, shouldn't it print later?
Why is it printing that second line wrong? It should be "\(([^\". It's missing the first 3 characters.
If I change my regex to the following: /directory\(([^\)]*)\)/, I get no output. The only difference is the word. It should be finding the second comment. What is going on here?

use 5.006 kind of odd if you are just beginning to learn Perl ... That is an ancient version.
You should not build a potentially huge list of all files in all locations under the current directory and then filter it. Instead, push only the files you want to the list.
Especially with escaped meta characters, regex patterns can be become hard to read very quickly, so use the /x modifier to insert some whitespace into those patterns.
You do not have to match twice: Just check & capture at the same time.
If open fails, include the reason in the error message.
Your second question above does not make sense. You seem to expect your pattern to match the literal string file\(([^\)]*)\)/, but it cannot.
use strict;
use warnings;
use File::Find;
my #files;
find(
sub {
return if -d;
return unless / [.] pl \z/x;
push #files, $File::Find::name;
},
'.',
);
for my $file ( #files ) {
open my $fh, '<', $file
or die "Could not open file $file: $!";
while (my $line = <$fh>) {
if (my ($substring) = ($line =~ m{ (?:file|directory) \( ([^\)]*) \) }x)) {
print "$substring\n";
}
}
close $fh;
}
# file(stuff)
# directory(other)
Output:
stuff
other

Issue with Perl Regex

new perl coder here.
When I copy and paste the text from a website into a text file and read from that file, my perl script works with no issues. When I use getstore to create a file from the website automatically which is what I want, the output is a bunch of |'s.
The text looks identical when I copy and paste, or download the text with getstore.. I'm unable to figure out the problem. Any help would be highly appreciated.
The output that I desire is as follows:
|www\.arkinsoftware\.in|www\.askmeaboutrotary\.com|www\.assculturaleincontri\.it|www\.asu\.msmu\.ru|www\.atousoft\.com|www\.aucoeurdelanature\.
enter code here
Here is the code I am using:
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
getstore("http://www.malwaredomainlist.com/hostslist/hosts.txt", "malhosts.txt");
open(my $input, "<", "malhosts.txt");
while (my $line = <$input>) {
chomp $line;
$line =~ s/.*\s+//;
$line =~ s/\./\\\./g;
print "$line\|";
}

The bunch of | you get, is from the unfitting comment-lines at the beginning. So the solution is to ignore all "unfitting" lines.
So instead of
$line =~ s/.*\s+//;
use
next unless $line =~ s/^127.*\s+//;
so you would ignore every line except thos starting with 127.

Here's what I'd do:
my $first = 1;
while (<$input>) {
/^127\.0\.0\.1\s+(.+?)\s*$/ or next;
print '|' if !$first;
$first = 0;
print quotemeta($1);
}
This matches your input in a more precise way, and quotemeta takes care of true regex escaping.

I'd probably go with something like:
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
getstore( "http://www.malwaredomainlist.com/hostslist/hosts.txt",
"malhosts.txt" );
open( my $input, "<", "malhosts.txt" );
print join ( "|",
map { m/^\d/ && ! m/localhost/ ?
quotemeta ((split)[1]) : () } <$input> );
Gives:
0koryu0\.easter\.ne\.jp|1\-atraffickim\.tf|10\-trafficimj\.tf|109\-204\-26\-16\.netconnexion\.managedbroadband\.co\.uk|11\-atraasikim\.tf|11\.lamarianella\.info|12\-tgaffickvcmb\.tf| #etc.

Why is this regex logic failing?

I'm trying to build a quick script that will take a url, and check it against a list of PCREs to see if there's a match. However, it doesn't seem to be working. I've tried printing everything to make sure it's output the way I want (including the ARGV[0], passing it with single quotes appears to keep all the characters in tact). But it's still not working.
This is the script
#!/usr/bin/perl
use strict;
use warnings;
if (not($ARGV[0])) {
die "Useage: checkurl.pl \"<url>\"";
}
if ($ARGV[1]) {
die "Too many command line arguments, try checkurl.pl \"<url>\"";
}
$_ = $ARGV[0];
print "$_\n";
my $file = "pcre.txt";
open my $info, $file or die "Could not open $file: $!";
while( my $line = <$info>) {
if (/$line/) {
print "Match found, the url matches the following PCRE: \n";
print "$line\n";
}
}
This is the test URL (warning, this was an actual Angler EK link, I've defanged it, just in case it's still live, so you have to fix it to properly check the PCRE)
hxxp://nosprivsliikeradan.pfgfoxriver-localguide2[.]com/boards/viewforum.php?f=5x827&sid=7q0as14.5i4x8
This is the PCRE in the pcre.txt file that matches the above URL
^http:\/\/(?!www|forums?)[^\.]+\.[^\.]+\.(?:[^\.\x2f]+?|[^\.]+\.[^\.]{2})\/[a-z]+\/?view(?:forum|topic)\.php\?[a-z]=(?=[^\n]{0,64}\.)[0-9a-z\.]{1,6}(?:&[a-z0-9]*=[0-9a-z\.]*){1,2}$

Your pattern is actually /^...$\n/ because you read it from a file and it contains a newline character. You need to chomp the line before interpolating it into the match operator:
while (my $line = <$info>) {
chomp($line);
if (/$line/) {
...
}
}

Regex and creating output file (Perl beginner)

I'm in an intro to Perl course and we are tasked with taking an input.txt file (of the Gettysburg Address - that has all instances of the word 'old' changed to 'new') and creating an output.txt file that switches 'new' back to having 'old'. I've got a general regex that switches all instances of 'new' to 'old', but it needs to work regardless of case in the input file. I'm wondering how I could add that in? Also, I'm looking to verify that I have my output.txt built in correctly? When I run what I have, I get no output.txt file created in my directory. Here is what I have so far:
open(my $getty, "<", "input.txt")
or die "Cannot open < input.txt: $!";
open(my $getty, ">", "output.txt")
or die "Cannot open < output.txt: $!";
while(my $line = <$getty>) {
if ($line =~ 's/new/old/') {
$line =~ s/new/old/;
}
}

You don't need to put an if condition.
while(my $line = <$getty>) {
$line =~ s/new/old/gi;
}

The good recipe is to change smth. from shell by Perl:
perl -pi.orig -e 's{old}{new}g' filename.txt
This produce replacement in file filename.txt wtih an original file filename.txt.orig

I guess your course is over, however here is a complete answer for anyone who might need it:
The if is unnecessary. You need to use print on the output filehandle to create the file. Here is the complete working code (there is no comma after the filehandle in the print statement):
use strict;
open(my $in_file, "<", "input.txt")
or die "Cannot open < input.txt: $!";
open(my $out_file, ">", "output.txt")
or die "Cannot open < output.txt: $!";
while(my $line = <$in_file>) {
$line =~ s/new/old/i;
print $out_file $line;
}
If the case of the input word should be the same in the output word, this is a solution:
use strict;
open(my $in_file, "<", "input.txt")
or die "Cannot open < input.txt: $!";
open(my $out_file, ">", "output.txt")
or die "Cannot open < output.txt: $!";
while(my $line = <$in_file>) {
$line =~ s{(new)}
{
my #chars = split '', $1;
my #old = qw/o l d/;
my #out;
foreach my $char (#chars) {
if($char =~ /\p{Uppercase}/) {
push #out, uc(shift #old);
}
else {
push #out, shift #old;
}
}
join('', #out);
}esi;
print $out_file $line;
}
What happens here is that I use s{pattern}{replacement}. The e modifier makes the replacement part perl code and s makes it possible for me to use whitespace in the expression. In the replecement code I go trough every char of the "new" (captured with braces so I can check it in the variable $1). If the char is uppercase I use the uc function to make the output char uppercase aswell.

How to Circumvent Perl's string escaping the replacement string in s/// when it's read from a file?

This question is similar to my last one, with one difference to make the toy script more similar to my actual one.
Here is the toy script, replace.pl (Edit: now with 'use strict;', etc)
#! /usr/bin/perl -w
use strict;
open(REPL, "<", $ARGV[0]) or die "Couldn't open $ARGV[0]: $!!";
my %replacements;
while(<REPL>) {
chomp;
my ($orig, $new, #rest) = split /,/;
# Processing+sanitizing of orig/new here
$replacements{$orig} = $new;
}
close(REPL) or die "Couldn't close '$ARGV[0]': $!";
print "Performing the following replacements\n";
while(my ($k,$v) = each %replacements) {
print "\t$k => $v\n";
}
open(IN, "<", $ARGV[1]) or die "Couldn't open $ARGV[1]: $!!";
while ( <IN> ) {
while(my ($k,$v) = each %replacements) {
s/$k/$v/gee;
}
print;
}
close(IN) or die "Couldn't close '$ARGV[1]': $!";
So, now lets say I have two files, replacements.txt (using the best answer from the last question, plus a replacement pair that doesn't use substitution):
(f)oo,q($1."ar")
cat,hacker
and test.txt:
foo
cat
When I run perl replace.pl replacements.txt test.txt I would like the output to be
far
hacker
but instead it's '$1."ar"' (too much escaping) but the results are anything but (even with the other suggestions from that answer for the replacement string). The foo turns into ar, and the cat/hacker is eval'd to the empty string, it seems.
So, what changes do I need to make to replace.pl and/or replacements.txt? Other people will be creating the replacements.txt's, so I'd like to make that file as simple as possible (although I acknowledge that I'm opening the regex can of worms on them).
If this isn't possible to do in one step, I'll use macros to enumerate all possible replacement pairs for this particular file, and hope the issue doesn't come up again.

Please don't give us non-working toy scripts that don't use strict and warnings. Because one of the first things people will do in debugging is to turn those on, and you've just caused work.
Second tip, use the 3-argument version of open rather than the 2-argument version. It is safer. Also in your error checking do as perlstyle says (see http://perldoc.perl.org/perlstyle.html for the full advice) and include the file name and $!.
Anyways your problem is that the code you were including was q($1."ar"). When executed this returns the string $1."ar". Get rid of the q() and it works fine. BUT it causes warnings. That can be fixed by moving the quoting into the replace script, and out of the original script.
Here is a fixed script for you:
#! /usr/bin/perl -w
use strict;
open(REPL, "<", $ARGV[0]) or die "Couldn't open '$ARGV[0]': $!!";
my %replacements;
while(<REPL>) {
chomp;
my ($orig, $new) = split /,/;
# Processing+sanitizing of orig/new here
$replacements{$orig} = '"' . $new . '"';
}
close(REPL) or die "Couldn't close '$ARGV[0]': $!";
print "Performing the following replacements\n";
while(my ($k,$v) = each %replacements) {
print "\t$k => $v\n";
}
open(IN, "<", $ARGV[1]) or die "Couldn't open '$ARGV[1]': $!!";
while ( <IN> ) {
while(my($k,$v) = each %replacements) {
s/$k/$v/gee;
}
print;
}
close(IN) or die "Couldn't close '$ARGV[1]': $!";
And the modified replacements.txt is:
(f)oo,${1}ar
cat,hacker

You have introduced one more level of interpolation since the last question.
You can get the right result by either:
Lay a 3rd "e" modifier on your substitution
s/$k/$v/geee; # eeek
Remove a layer of interpolation in replacements.txt by making the first line
(f)oo,$1."ar"

Get rid of the q() in the replacement string;
Should be just
(f)oo,$1."ar"
as in ($k,$v) = split /,/, $_;
Warning: using external input data in evals is very, very dangerous
Or, just make it
(f)oo,"${1}ar"
No modification to the code is necessary either way e.g. s///gee.
Edit #drhorrible, if it doesen't work then you have other problems.
use strict;use warnings;
my $str = "foo";
my $repl = '(f)oo,q(${1}."ar")';
my ($k,$v) = split /,/, $repl;
$str =~ s/$k/$v/gee;
print $str,"\n";
$str = "foo";
$repl = '(f)oo,$1."ar"';
($k,$v) = split /,/, $repl;
$str =~ s/$k/$v/gee;
print $str,"\n";
$str = "foo";
$repl = '(f)oo,"${1}ar"';
($k,$v) = split /,/, $repl;
$str =~ s/$k/$v/gee;
print $str,"\n";
output:
${1}."ar"
far
far

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

How to use Regex in a While If statement? Perl - regex

OP code has some errors which I've correcte use warnings; use strict; use feature 'say'; my $i = 0; my $filename = 'C:\Users\sample.log.txt'; open my $fh, '<', $filename or die "Couldn't open $filename"; map{ $i++ if /sshd/ } <$fh>; close($fh); say "Found: $i";

Related

Why does this regex in perl work for one word but not another?

Issue with Perl Regex

Why is this regex logic failing?

Regex and creating output file (Perl beginner)

How to Circumvent Perl's string escaping the replacement string in s/// when it's read from a file?

Categories

Resources