Retrieving elements in a list perl

Retrieving elements in a list perl - list

I'm trying to retrieve an element from a list using perl using another variable that is assigned an integer.
use File::Copy qw(move)
my #List;
my $newDir = "foo/bar";
my $file = "foo/bar.txt";
push #List, $file;
my $file2 = "foo/bar2.txt";
my $var = 1;
push #List, $file2;
move $List[$var-1], $newDir;
Instead of moving "file" the program fails.

my $newDir = foo/bar;
There are quotes missing around the string:
my $newDir = 'foo/bar';
and similarly for other paths.
Also, are you sure you want to use a file name with a newline at the end in the move command?
Also, something like
use File::Copy;
is missing, too.
Moreover, check the return value of move:
move $List[$var-1], $newDir or die $!;

Firstly, you have no lists in your code. You have an array called #List, but misnaming a variable doesn't change what it is. Arrays are not lists.
Secondly, your code works exactly as expected for me. I've added error reporting on the move call. That might help track down your problem, but I suspect it's caused by files or directories not being where you think they are.
$ cat move
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy qw(move);
my #List;
my $newDir = "foo/bar";
my $file = "foo/bar.txt";
push #List, $file;
my $file2 = "foo/bar2.txt";
my $var = 1;
push #List, $file2;
move $List[$var-1], $newDir or warn $!;
$ find foo
foo
foo/bar
foo/bar.txt
$ perl move
$ find foo
foo
foo/bar
foo/bar/bar.txt

Related

Using Perl to match all words after a particular word

I am using Perl and need to get all domain names from http://www.malwaredomainlist.com/hostslist/hosts.txt into a flat file.
I think the easiest way to do this is to use a regular expression but I can't get my head around how to build the expression.
my code so far:
#!/usr/bin/perl
use LWP::Simple;
$url = 'http://www.malwaredomainlist.com/hostslist/hosts.txt';
$content = get $url;
open(my $fh, '>', '/home/jay/feed.txt');
#logic here
}
close $fh;
I'm not sure if I should loop over each line and perform an expression on that or if I should take the whole file as a string and work with that.

The page is just a text/plain document, so I think I would just copy and paste the page into my editor and remove the unwanted information. However if you would prefer a Perl program then this is all that is necessary. It uses LWP::Simple::get to fetch the text page and a regex to search it for lines starting with digits and dots, returning the second field of each
use strict;
use warnings;
use feature 'say';
use LWP::Simple qw/ get /;
my $url = 'http://www.malwaredomainlist.com/hostslist/hosts.txt';
say for get($url) =~ /^[\d.]+\s+(\S+)/gam;
or as a one-liner
perl -MLWP::Simple=get -E"say for get(shift) =~ /^[\d.]+\s+(\S+)/gam" http://www.malwaredomainlist.com/hostslist/hosts.txt

Unless you have a particular need, iterating by line is the way forward. Otherwise you just tie up memory unnecessarily.
However when you're fetching a url, it's a bit academic - I would suggest that fetching it to a file first isn't a bad thing though, so you can re-process it without needing to refetch.
Given source data sample:
for ( split ( "\n", $content ) ) {
next unless m/^\d/; #skip lines that don't start with a digit.
my ( $IP, $hostname ) = split;
my $domainname = $hostname =~ s/^\w+\.//r;
print $domainname,"\n";
}
This doesn't entirely work with your list though, because in that list you have a mix of hostnames and domain names, and it's not actually all that easy to tell the difference.
After all, the 'tld' at the end might be .com or it might be .org.it

127.0.0.1\s+(.*)
should work fine with global modifier.
Demo

Unless saving the list file locally is a requirement (in which case you might be better off just using wget or curl), there is no need to save it in an external file to process it line-by-line.
You can instead open a filehandle to the string itself.
In the script below, extract_hosts would work the same whether you give it a reference to a string or a filename:
#!/usr/bin/env perl
use strict;
use warnings;
use Carp qw( croak );
use LWP::Simple qw( get );
my $url = 'http://www.malwaredomainlist.com/hostslist/hosts.txt';
my $malware_hosts = get $url;
unless (defined $malware_hosts) {
die "Failed to get content from '$url'\n";
}
my $hosts = extract_hosts(\$malware_hosts);
print "$_\n" for #$hosts;
sub extract_hosts {
my $src = shift;
open my $fh, '<', $src
or croak "Failed to open '$src' for reading: $!";
my #hosts;
while (my $entry = <$fh>) {
next unless $entry =~ /\S/;
next if $entry =~ /^#/;
my (undef, $host) = split ' ', $entry;
push #hosts, $host;
}
close $fh
or croak "Failed to close '$src': $!";
\#hosts;
}
This will give you the list of hosts.

Code to grep the hostnames from the given file.
use LWP::Simple;
my $url = 'http://www.malwaredomainlist.com/hostslist/hosts.txt';
my $content = get $url;
my #server_names = split(/127\.0\.0\.1\s*/, $content);
open(my $fh, '>', '/home/jay/feed.txt');
print $fh "#server_names";
close $fh;

Here is another implementation. It uses HTML::Tiny which is part of the core so you don't have to install anything.
use HTTP::Tiny;
my $response = HTTP::Tiny->new->get('http://www.malwaredomainlist.com/hostslist/hosts.txt');
die "Failed!\n" unless $response->{success};
my #content;
for my $line ( split ( "\n", $response->{content} ) ){
next if ( $line =~ /^#|^$/);
push #content, ((split ( " ", $line ))[1]);
}
print Dumper (\#content);

Perl: unable to get the correct match from the file

I need help with my script. I am writing a script that will check if the username is still existing in /etc/passwd. I know this can be done on BASH but as much as possible I want to avoid using it, and just focus on writing using Perl instead.
Okay, so my problem is that, my script could not find the right match in my $password_file. I still got the No root found error even though it is still in the file.
Execution of the script.
jpd#skriv ~ $ grep root /etc/passwd
root:x:0:0:root:/root:/bin/bash
jpd#skriv ~ $ ~/Copy/documents/scripts/my_perl/test.pl root
Applying pattern match (m//) to #id will act on scalar(#id) at /home/jpd/Copy/documents/scripts/my_perl/test.pl line 16.
No root found!
jpd#skriv ~ $
Also, why do I always get this "Applying pattern match..." warning?
Here's the code:
#!/usr/bin/perl
use strict;
use warnings;
my $stdin = $ARGV[0];
my $password_file = '/etc/passwd';
open (PWD, $password_file) || die "Error: $!\n";
my #lines = (<PWD>);
close PWD;
for my $line (#lines) {
my #list = split /:/, $line;
my #id = "$list[0]\n";
if (#id =~ m/$stdin/) {
die "Match found!\n";
} else {
print "No $stdin found!\n";
exit 0;
}
}
Thanks in advance! :)
Regards,
sedawkgrep
Perl Newbie

I have a few things to point out regarding your code:
Good job using use strict; and use warnings;. They should be included in EVERY perl script.
Pick meaningful variable names.
$stdin is too generic. $username does a better job of documenting the intent of your script.
Concerning your file processing:
Include use autodie; anytime you're working with files.
This pragma will automatically handle error messages, and will give you better information than just "Error: $!\n". Also, if you are wanting to do a manual error messages, be sure to remove the new line from your message or die won't report the line number.
Use Lexical file handles and the three argument form of open
open my $fh, '<', $password_file;
Don't load an entire file into memory unless you need to. Instead, use while loop and process the file line by line
Concerning your comparison: #id =~ m/$stdin/:
Always use a scalar to the left of comparison =~
The comparison operator binds a scalar to a pattern. Therefore the line #id =~ m/$stdin/ is actually comparing the size of #id to your pattern: "1" =~ m/$stdin/. This is obviously a bug.
Be sure to escape the regular expression special characters using quotemeta or \Q...\E:
$list[0] =~ m/\Q$stdin/
Since you actually want a direct equality, don't use a regex at all, but instead use eq
You're exiting after only processing the first line of your file.
In one fork you're dying if you find a match in the first line. In your other fork, you're exiting with the assumption that no other lines are going to match either.
With these changes, I would correct your script to the following:
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
my $username = $ARGV[0];
my $password_file = '/etc/passwd';
open my $fh, '<', $password_file;
while (<$fh>) {
chomp;
my #cols = split /:/;
if ($cols[0] eq $username) {
die "Match found!\n";
}
}
print "No $username found!\n";

#!/usr/bin/perl
use strict;
use warnings;
my $stdin = $ARGV[0];
my $password_file = '/etc/passwd';
open (PWD,"<$password_file");
my #lines = <PWD>;
my #varr = grep (m/root/, #lines);
Then check varr array and split it if you need.

You'd be better off using a hash for key lookups, but with minimal modification this should work:
open my $in, '<', 'in.txt';
my $stdin = $ARGV[0];
while(<$in>){
chomp;
my #list = split(/\:/);
my ($id) = $list[0];
if ($id eq $stdin) {
die "Match found\n";
}
}

Find/Replace in files recursively but touch only files with matches

I would like to quickly search and replace with or without regexp in files recursively. In addition, I need to search only in specific files and I do not want to touch the files that do not match my search_pattern otherwise git will think all the parsed files were modified (it what happens with find . --exec sed).
I tried many solutions that I found on internet using find, grep, sed or ack but I don't think they are really good to match specific files only.
Eventually I wrote this perl script:
#!/bin/perl
use strict;
use warnings;
use File::Find;
my $search_pattern = $ARGV[0];
my $replace_pattern = $ARGV[1];
my $file_pattern = $ARGV[2];
my $do_replace = 0;
sub process {
return unless -f;
return unless /(.+)[.](c|h|inc|asm|mac|def|ldf|rst)$/;
open F, $_ or print "couldn't open $_\n" && return;
my $file = $_;
my $i = 0;
while (<F>) {
if (m/($search_pattern)/o) {$i++};
}
close F;
if ($do_replace and $i)
{
printf "found $i occurence(s) of $search_pattern in $file\n";
open F, "+>".$file or print "couldn't open $file\n" && return;
while (<F>)
{
s/($search_pattern)/($replace_pattern)/g;
print F;
}
close F;
}
}
find(\&process, ".");
My question is:
Is there any better solution like this one below (which not exists) ?
`repaint -n/(.+)[.](c|h|inc|asm|mac|def|ldf|rst)$/ s/search/replacement/g .`
Subsidiary questions:
How's my perl script ? Not too bad ? Do I really need to reopen every files that match my search_pattern ?
How people deal with this trivial task ? Almost every good text editor have a "Search and Replace in files" feature, but not vim. How vim users can do this ?
Edit:
I also tried this script ff.pl with ff | xargs perl -pi -e 's/foo/bar/g' but it doesnt work as I expected. It created a backup .bak even though I didn't give anything after the -pi. It seems it is the normal behaviour within cygwin but with this I cannot really use perl -pi -e
#!/bin/perl
use strict;
use warnings;
use File::Find;
use File::Basename;
my $ext = $ARGV[0];
sub process {
return unless -f;
return unless /\.(c|h|inc|asm|mac|def|ldf|rst)$/;
print $File::Find::name."\n" ;
}
find(\&process, ".");
Reedit:
I finally came across this solution (under cygwin I need to remove the backup files)
find . | egrep '\.(c|h|asm|inc)$' | xargs perl -pi.winsucks -e 's/<search>/<replace>/g'
find . | egrep '\.(c|h|asm|inc)\.winsucks$' | xargs rm

The following is a cleaned up version of your code.
Always include use strict; and use warnings at the top of EVERY perl script. If you're doing file processing, include use autodie; as well.
Go ahead and slurp the entire file. That way you only have to read and write optionally write it once.
Consider using File::Find::Rule for cases like this. Your implmentation using File::Find works, and actually is probably the preferred module in this case, but I like the interface for the latter.
I removed the capture groups from the regex. In ones in the RHS were a bug, and the ones in the LHS were superfluous.
And the code:
use strict;
use warnings;
use autodie;
use File::Find;
my $search_pattern = $ARGV[0];
my $replace_pattern = $ARGV[1];
my $file_pattern = $ARGV[2];
my $do_replace = 0;
sub process {
return if !-f;
return if !/[.](?:c|h|inc|asm|mac|def|ldf|rst)$/;
my $data = do {
open my $fh, '<', $_;
local $/;
<$fh>;
};
my $count = $data =~ s/$search_pattern/$replace_pattern/g
or return;
print "found $count occurence(s) of $search_pattern in $_\n";
return if !$do_replace;
open my $fh, '>', $_;
print $fh $data;
close $fh;
}
find(\&process, ".");

Not bad, but several minor notes:
$do_replace is always 0 so it will not replace
in-place open F, "+>" will not work on cygwin + windows
m/($search_pattern)/o /o is good, () is not needed.
$file_pattern is ignored, you overwrite it with your own
s/($search_pattern)/($replace_pattern)/g;
() is unneeded and will actually disturb a counter in the $replace_pattern
/(.+)[.](c|h|inc|asm|mac|def|ldf|rst)$/ should be written as
/\.(c|h|inc|asm|mac|def|ldf|rst)$/ and maybe /i also
Do I really need to reopen every files that match my search_pattern ?
You don't do.
Have no idea about vim, I use emacs, which has several method to accomplish this.

What's wrong with the following command?
:grep foo **/*.{foo,bar,baz}
:cw
It won't cause any problem with any VCS and is pretty basic Vimming.
You are right that Vim doesn't come with a dedicated "Search and Replace in files" feature but there are plugins for that.

why not just:
grep 'pat' -rl *|xargs sed -i 's/pat/rep/g'
or I didn't understand the Q right?

I suggest find2perl if it doesn't work out of the box, you can tweak the code it generates:
find2perl /tmp \! -name ".*?\.(c|h|inc|asm|mac|def|ldf|rst)$" -exec "sed -e s/aaa/bbb/g {}"
it will print the following code to stdout:
#! /usr/bin/perl -w
eval 'exec /usr/bin/perl -S $0 ${1+"$#"}'
if 0; #$running_under_some_shell
use strict;
use File::Find ();
# Set the variable $File::Find::dont_use_nlink if you're using AFS,
# since AFS cheats.
# for the convenience of &wanted calls, including -eval statements:
use vars qw/*name *dir *prune/;
*name = *File::Find::name;
*dir = *File::Find::dir;
*prune = *File::Find::prune;
sub wanted;
sub doexec ($#);
use Cwd ();
my $cwd = Cwd::cwd();
# Traverse desired filesystems
File::Find::find({wanted => \&wanted}, '/tmp');
exit;
sub wanted {
my ($dev,$ino,$mode,$nlink,$uid,$gid);
(($dev,$ino,$mode,$nlink,$uid,$gid) = lstat($_)) &&
! /^\..*.?\\.\(c|h|inc|asm|mac|def|ldf|rst\)\$\z/s &&
doexec(0, 'sed -e s/aaa/bbb/g {}');
}
sub doexec ($#) {
my $ok = shift;
my #command = #_; # copy so we don't try to s/// aliases to constants
for my $word (#command)
{ $word =~ s#{}#$name#g }
if ($ok) {
my $old = select(STDOUT);
$| = 1;
print "#command";
select($old);
return 0 unless <STDIN> =~ /^y/;
}
chdir $cwd; #sigh
system #command;
chdir $File::Find::dir;
return !$?;
}
If you want to execute, you can pipe it to perl:
find2perl /tmp \! -name ".*?\.(c|h|inc|asm|mac|def|ldf|rst)$" -exec "sed -e s/aaa/bbb/g" | perl

You can try this plugin for Vim:
https://github.com/skwp/greplace.vim
Basically, it allows you to type in a search phases (with/without regex) and ask you for the files to search in.

perl hash array reading from files

I'm trying to read multiple files that have the same format and want to make some statistics based on regex.
i.e I want to count similar items that are within the []
NC_013618 NC_013633 ([T(nad6 trnE ,cob trnT ,)])
C_013481 NC_013479 ([T(trnP ,rrnS trnF trnV rrnL nad1 trnI ,)])
NC_013485 NC_003159 ([T(trnC ,trnY ,)])
NC_013554 NC_013254 ([T(trnR ,trnN ,)])
NC_013607 NC_013618 ([T(nad6 trnE ,cob trnT ,)])
the problem is that i'm not getting right values, below is my code:
use strict;
use warnings;
my %data;
#FILES = glob("../mitos-crex/*.out");
foreach my $file (#FILES) {
local $/ = undef;
open my $fh, '<', $file;
$data{$file} = <$fh>;
}
my #t;
my $c = 0;
foreach my $line (keys %data) {
foreach my $l ($data{$line}) {
print $l."\n";
($t[$c]) = $l =~ m/(\[.*\])/;
$c++;
}
}
#the problem is here the counter is not giving the right value
print $c;
my %counts;
$counts{$_}++ for #t;
thanks in advance

First of all, always use strict and use warnings. This measure is vital for all programming, as it will quickly reveal simple problems that you may otherwise overlook or waste time on debugging. This is especially true and a simple courtesy if you are asking for others' help with your program
You seem to have become confused between slurping an entire file into a single string, and into an array of lines. The way you have written it, each element $data{file} is a single scalar value containing all of the file's data, and then you try to iterate over it with foreach $l ($data{$line}) { ... } which executes just once and so only find the first [...] string in the file
Ordinarily I would say that you shouldn't read in all of your file data in this way, as the problem is likely to have a better streamed solution, but I don't know what else you want to use the captured data for, so my solution follows your own design
I think you need to slurp the data into a virtual array, instead of a scalar, and then iterate over that in your loops. You must leave $/ defined so that the file is read in lines, and build an anonymous array with [ <$fh> ]. Then you can iterate over the lines with foreach my $line (#{ $data{$file} }) { ... }
use strict;
use warnings;
my %data;
my #files = glob("../mitos-crex/*.out");
foreach my $file (#files) {
open my $fh, '<', $file or die $!;
$data{$file} = [ <$fh> ];
}
my $c = 0;
my #t;
foreach my $file (keys %data) {
foreach my $line (#{ $data{$file} }) {
($t[$c]) = $line =~ /(\[.*\])/;
$c++;
}
}
print $c;
my %counts;
$counts{$_}++ for #t;

The counter is giving a correct value. Your problem is that you are slurping the file (reading it all in at once), but then only storing the first value found:
($t[$c]) = $data{$line} =~ m/(\[.*\])/; # only finds first value in file
Either loop over each file properly, and use the above regex for each line, or do something like:
push #t, ($data{$line} =~ m/(\[.*\])/g);
You should always use
use strict;
use warnings;
And solve the errors/warnings that result. Not doing so is a bad idea, and is only hiding the problems in your code -- not solving them.
Also, you should be aware that this statement:
foreach $l ($data{$line}) {
Only iterates once, because each "line" here is an entire file, and $data{$line} is besides a scalar value. Moreover, you iterate using $l as an alias, but you still use $data{$line} inside the loop, which makes the loop completely redundant.

How to pass a replacing regex as a command line argument to a perl script

I am trying to write a simple perl script to apply a given regex to a filename among other things, and I am having trouble passing a regex into the script as an argument.
What I would like to be able to do is somthing like this:
> myscript 's/hi/bye/i' hi.h
bye.h
>
I have produced this code
#!/utils/bin/perl -w
use strict;
use warnings;
my $n_args = $#ARGV + 1;
my $regex = $ARGV[0];
for(my $i=1; $i<$n_args; $i++) {
my $file = $ARGV[$i];
$file =~ $regex;
print "OUTPUT: $file\n";
}
I cannot use qr because apparently it cannot be used on replacing regexes (although my source for this is a forum post so I'm happy to be proved wrong).
I would rather avoid passing the two parts in as seperate strings and manually doing the regex in the perl script.
Is it possible to pass the regex as an argument like this, and if so what is the best way to do it?

There's more than one way to do it, I think.
The Evial Way:
As you basically send in a regex expression, it can be evaluated to get the result. Like this:
my #args = ('s/hi/bye/', 'hi.h');
my ($regex, #filenames) = #args;
for my $file (#filenames) {
eval("\$file =~ $regex");
print "OUTPUT: $file\n";
}
Of course, following this way will open you to some very nasty surprises. For example, consider passing this set of arguments:
...
my #args = ('s/hi/bye/; print qq{MINE IS AN EVIL LAUGH!\n}', 'hi.h');
...
Yes, it will laugh at you most evailly.
The Safe Way:
my ($regex_expr, #filenames) = #args;
my ($substr, $replace) = $regex_expr =~ m#^s/((?:[^/]|\\/)+)/((?:[^/]|\\/)+)/#;
for my $file (#filenames) {
$file =~ s/$substr/$replace/;
print "OUTPUT: $file\n";
}
As you can see, we parse the expression given to us into two parts, then use these parts to build a full operator. Obviously, this approach is less flexible, but, of course, it's much more safe.
The Easiest Way:
my ($search, $replace, #filenames) = #args;
for my $file (#filenames) {
$file =~ s/$search/$replace/;
print "OUTPUT: $file\n";
}
Yes, that's right - no regex parsing at all! What happens here is we decided to take two arguments - 'search pattern' and 'replacement string' - instead of a single one. Will it make our script less flexible than the previous one? No, as we still had to parse the regex expression more-or-less regularly. But now user clearly understand all the data that is given to a command, which is usually quite an improvement. )
#args in both examples corresponds to #ARGV array.

The s/a/b/i is an operator, not simply a regular expression, so you need to use eval if you want it to be interpreted properly.
#!/usr/bin/env perl
use warnings;
use strict;
my $regex = shift;
my $sub = eval "sub { \$_[0] =~ $regex; }";
foreach my $file (#ARGV) {
&$sub($file);
print "OUTPUT: $file\n";
}
The trick here is that I'm substituting this "bit of code" into a string to produce Perl code that defines an anonymous subroutine $_[0] =~ s/a/b/i; (or whatever code you pass it), then using eval to compile that code and give me a code reference I can call from within the loop.
$ test.pl 's/foo/bar/' foo nicefood
OUTPUT: bar
OUTPUT: nicebard
$ test.pl 'tr/o/e/' foo nicefood
OUTPUT: fee
OUTPUT: nicefeed
This is more efficient than putting an eval "\$file =~ $regex;" inside the loop as then it'll get compiled and eval-ed at every iteration rather than just once up-front.
A word of warning about eval - as raina77ow's answer explains, you should avoid eval unless you're 100% sure you are always getting your input from a trusted source...

s/a/b/i is not a regex. It is a regex plus substitution. Unless you use the string eval, make this work might be pretty tough (consider s{a}<b>e and so on).

The trouble is that you are trying to pass a perl operator when all you really need to pass is the arguments:
myscript hi bye hi.h
In the script:
my ($find, $replace, #files) = #ARGV;
...
$file =~ s/$find/$replace/i;
Your code is a bit clunky. This is all you need:
use strict;
use warnings;
my ($find, $replace, #files) = #ARGV;
for my $file (#files) {
$file =~ s/$find/$replace/i;
print "$file\n";
}
Note that this way allows you to use meta characters in the regex, such as \w{2}foo?. This can be both a good thing and a bad thing. To make all characters intepreted literally (disable meta characters), you can use \Q ... \E like so:
... s/\Q$find\E/$replace/i;

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Retrieving elements in a list perl - list

Related

Using Perl to match all words after a particular word

Perl: unable to get the correct match from the file

Find/Replace in files recursively but touch only files with matches

perl hash array reading from files

How to pass a replacing regex as a command line argument to a perl script

Categories

Resources