Can't combine two regex statements to work at once - regex

I have a simple perl code that displays the names of the files in a given directory
opendir my $dir, "./images/sampleImages" or die "Cannot open directory: $!";
my @Images = grep {!/^\./} readdir $dir;
closedir $dir;
I have added a regex statement to remove all single and double dots, since readdir adds them as special characters and this is working perfectly fine. However I want only the files with the extensions of .jpg/.jpeg/.png/.gif to be read. I have found a regex that does that, which is :
(/\.(gif|jpg|jpeg|png)$/i)
But I can't seem to make them work together. I have tried everything I could find on the subject here, combining the statements in different ways but either they give errors, or flat out do not work the way it should. Any help would be greatly appreciated, thanks in advance!

Does this work?
my @Images = grep { !/^\./ && /\.(gif|jpg|jpeg|png)$/i } readdir $dir;

You can use multiple match operators:
my @dirs =
grep { /\.(?:gif|jpe?g|png)\z/i && ! /\A\./ }
readdir $dir;
I typically make one grep per idea so I can disable them individually:
my @dirs =
grep { /\.(?:gif|jpe?g|png)\z/i }
grep { ! /\A\./ }
readdir $dir;
And often I add in a map to make the full path since readdir doesn't do that for you (and usually I do it wrong first then come back to do this):
my @dirs =
map { catfile( $dir, $_ ) }
grep { /\.(?:gif|jpe?g|png)\z/i }
grep { ! /\A\./ }
readdir $dir;
Also, readdir doesn't "add" . and .. to the list. Those are the virtual directories that mean the current directory and the parent directory. If those are the only things that you want to remove, then you can tighten that part to match only those rather than all hidden files:
my @dirs =
map { catfile( $dir, $_ ) }
grep { /\.(?:gif|jpe?g|png)\z/i }
grep { ! /\A\.\.?\z/ }
readdir $dir;
Again, even though I know that and have written it a million times, I still only add it once I see . and .. in the output. Some things never change.
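The chain above can be tried without touching the filesystem. This is a minimal sketch, not the answerer's own code: image_paths is a hypothetical helper, and it assumes you pass the directory name as a string (catfile needs a path, not the directory handle) together with the raw readdir entries.

```perl
use strict;
use warnings;
use File::Spec::Functions qw(catfile);

# Hypothetical helper: takes a directory *name* (not the handle) plus the
# raw entries from readdir, and returns full paths of the image files.
sub image_paths {
    my ($dirname, @entries) = @_;
    return
        map  { catfile($dirname, $_) }       # prepend the directory
        grep { /\.(?:gif|jpe?g|png)\z/i }    # keep image extensions only
        grep { !/\A\.\.?\z/ }                # drop only . and ..
        @entries;
}

my @found = image_paths('images', '.', '..', 'cat.JPG', 'notes.txt', '.hidden.png');
print "$_\n" for @found;
```

Because only . and .. are excluded, the hidden file .hidden.png is kept here; swap the last grep for !/\A\./ if hidden files should go as well.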

The following code keeps only image files, so there is no need to handle . and .. at all
use strict;
use warnings;
use feature 'say';
my $dirname = './images/sampleImages';
opendir my $dir, $dirname
or die "Cannot open directory $dirname: $!";
my @Images = grep { /\.(gif|jpg|jpeg|png)$/i } readdir $dir;
closedir $dir;
say for @Images;

Related

Regex for input file in Perl [duplicate]

This question already has answers here:
How do I read in the contents of a directory in Perl?
(9 answers)
Closed 8 years ago.
Is there a function in Perl that lists all the files and directories in a directory?
I remember that Java has the File.list() to do this? Is there a comparable method in Perl?
If you want to get content of given directory, and only it (i.e. no subdirectories), the best way is to use opendir/readdir/closedir:
opendir my $dir, "/some/path" or die "Cannot open directory: $!";
my @files = readdir $dir;
closedir $dir;
You can also use:
my @files = glob( $dir . '/*' );
But in my opinion it is not as good, mostly because glob is quite a complex tool (it can filter results automatically), and using it just to list every entry of a directory seems like overkill for such a simple task.
On the other hand, if you need to get content from all of the directories and subdirectories, there is basically one standard solution:
use File::Find;
my @content;
find( \&wanted, '/some/path');
do_something_with( @content );
exit;
sub wanted {
push @content, $File::Find::name;
return;
}
This should do it.
my $dir = "bla/bla/upload";
opendir DIR, $dir or die "Cannot open $dir: $!";
my @dir = readdir(DIR);
closedir DIR;
foreach (@dir) {
if (-f $dir . "/" . $_ ){
print $_," : file\n";
}elsif(-d $dir . "/" . $_){
print $_," : folder\n";
}else{
print $_," : other\n";
}
}
readdir() does that.
Check http://perldoc.perl.org/functions/readdir.html
opendir(DIR, $some_dir) || die "can't opendir $some_dir: $!";
@dots = grep { /^\./ && -f "$some_dir/$_" } readdir(DIR);
closedir DIR;
Or File::Find
use File::Find;
finddepth(\&wanted, '/some/path/to/dir');
sub wanted { print };
It'll go through subdirectories if they exist.
If you are a slacker like me, you might like the File::Slurp module. Its read_dir function reads the directory contents into an array, removes the dots, and, if needed, prefixes the returned files with the directory for absolute paths:
my @paths = read_dir( '/path/to/dir', prefix => 1 ) ;
This will list Everything (including sub directories) from the directory you specify, in order, and with the attributes. I have spent days looking for something to do this, and I took parts from this entire discussion, and a little of my own, and put it together. ENJOY!!
#!/usr/bin/perl --
print qq~Content-type: text/html\n\n~;
print qq~<font face="arial" size="2">~;
use File::Find;
# find( \&wanted_tom, '/home/thomas/public_html'); # if you want just one website, uncomment this, and comment out the next line
find( \&wanted_tom, '/home');
exit;
sub wanted_tom {
($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,$atime,$mtime,$ctime,$blksize,$blocks) = stat ($_);
$mode = (stat($_))[2];
$mode = substr(sprintf("%03lo", $mode), -3);
if (-d $File::Find::name) {
print "<br><b>--DIR $File::Find::name --ATTR:$mode</b><br>";
} else {
print "$File::Find::name --ATTR:$mode<br>";
}
return;
}

Excluding a file with perl grep

I want to go over all of the files in the directory, except for files ending with '.py'.
The line in the existing script is:
my @files = sort(grep(!/^(\.|\.\.)$/, readdir($dir_h)));
And I want something like:
my @files = sort(grep(!/^(\.|\.\.|"*.py")$/, readdir($dir_h)));
Can you please help with the exact syntax?
grep uses regular expressions, not globs (aka wildcards). The correct syntax is
my @files = sort(grep(!/^(\.|\.\.|.*\.py)$/, readdir($dir_h)));
or, without the unnecessary parentheses
my @files = sort grep ! /^(\.|\.\.|.*\.py)$/, readdir $dir_h;
As the parentheses in the regular expression aren't used for capturing, but only for precedence, you can change them to non-capturing:
my @files = sort grep ! /^(?:\.|\.\.|.*\.py)$/, readdir $dir_h;
You can express the same in many different ways, e.g.
/^\.{1,2}$|\.py$/
i.e. dot once or twice with nothing around, or .py at the end.
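As a quick check of that last pattern, here is a small sketch applied to a plain list of names instead of a real directory. The helper name without_dots_and_python and the sample names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the alternation-based pattern:
# drop ".", "..", and anything ending in ".py"; keep and sort the rest.
sub without_dots_and_python {
    my @names  = @_;
    my @result = sort grep { !/^\.{1,2}$|\.py$/ } @names;
    return @result;
}

my @kept = without_dots_and_python('.', '..', 'a.txt', 'b.py', 'py', '.bashrc');
print "@kept\n";    # .bashrc a.txt py
```

Note that .bashrc survives: the first alternative only matches names consisting of one or two dots and nothing else.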
Perl's built-in grep is actually very clever: it iterates over a list, applying a condition to each element in turn. It sets $_ to each element.
This condition can be a simple regular expression, but it doesn't have to be.
So you can - for example:
my @files = grep { -f $_ } readdir(DIR);
But because -f defaults to $_ you can also:
my @files = grep { -f } readdir (DIR);
You can also apply a regular expression to $_
my @files = grep { not m/\.py$/ } readdir (DIR);
(Note - this is the same as not $_ =~ m/\.py$/ - patterns apply to $_ by default).
So you can do what you want by:
my @files = sort grep { not m/\.py$/ and -f } readdir (DIR);
Although note - -f tests the bare name that readdir returns, so that only works when DIR is the current working directory, not when reading a separate path. You can use readdir for other directories, but personally I prefer glob, because it fills in the path as well:
my @files = sort grep { not m/\.py$/ and -f } glob ( "$dir/*" );
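If you would rather stay with readdir for a directory other than the current one, the caveat above can be handled by prepending the directory inside the -f test. A minimal sketch, assuming a hypothetical helper non_python_files and a throwaway directory created with File::Temp:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: names of plain files in $dir that do not end in .py.
sub non_python_files {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    # readdir returns bare names, so -f needs the directory prepended.
    my @names = sort grep { !/\.py$/ && -f "$dir/$_" } readdir $dh;
    closedir $dh;
    return @names;
}

# Build a small scratch directory to try it on.
my $tmp = tempdir(CLEANUP => 1);
for my $name (qw(keep.txt skip.py)) {
    open my $fh, '>', "$tmp/$name" or die "Cannot create $name: $!";
    close $fh;
}
mkdir "$tmp/subdir" or die "Cannot mkdir: $!";

print "$_\n" for non_python_files($tmp);    # keep.txt
```

The -f test also takes care of . and .. and of subdir, since none of them is a plain file.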
Check that the directory entries are files and then exclude those that end in .py:
#!/usr/bin/env perl
use warnings;
use strict;
my $dir = "/home/me/somedir";
# good examples in the perldoc:
# perldoc -f readdir
opendir(my $DIR, $dir) || die "Unable to open $dir : $!";
# -f checks that it is a plain file ( perldoc perlfunc )
# !~ means does not match ( perldoc perlre )
# m|\.py$| means a match string that ends in '.py'
my @files = sort grep { -f "$dir/$_" && $_ !~ m|\.py$| } readdir($DIR);

How to use Perl to grep specific information from many files in a list of directories [duplicate]

This question already has answers here:
how to improve grep efficiency in perl when the file number is huge
(2 answers)
Closed 8 years ago.
I have a directory structure like this
$workdir/XXXX/YYYY.log.
where XXXX is the sub directory name (there are many sub directories) and YYYY is the log file name (there are many log files also).
I need to extract some information from all the logs. Currently I use
@Info = qx(grep "information" -r $workdir)
and then output the @Info to a file to do this.
Is there a more efficient way to do this?
I would use something like this in pure Perl. I think a lot of the problem is that there is no reassurance that the process is progressing. This solution prints the name of each subdirectory and each log file to STDERR as it is encountered, but sends all the grepped lines to STDOUT.
You will have to modify the condition in the while loop so that the correct lines are selected.
It wouldn't be too hard to produce a "percent complete" figure, or an estimated time of completion, if that is what you wish.
use strict;
use warnings;
use autodie;
use File::Spec;
my $workdir = '/path/to/work/dir';
opendir my($dh), $workdir;
my @subdirs = grep { /\A[^.]/ and -d "$workdir/$_" } readdir $dh;
closedir $dh;
for my $subdir (@subdirs) {
$subdir = File::Spec->catdir($workdir, $subdir);
print STDERR "$subdir\n";
opendir my($dh), $subdir;
my @logs = grep { /\.log\z/i } readdir $dh;
closedir $dh;
@logs = grep { -f } map { File::Spec->catfile($subdir, $_) } @logs;
for my $log (#logs) {
print STDERR " $log\n";
open my $fh, '<', $log;
while (<$fh>) {
print " $_" if /condition/;
}
}
}
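If the directory layout may be deeper than one level, the same idea can be expressed with the core File::Find module instead of nested opendir loops. This is a sketch of an alternative, not the code above: grep_logs is a hypothetical helper, and the scratch directory and sample log line exist only for the demonstration.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical helper: return "path: line" for every line matching $re
# in any *.log file anywhere below $workdir.
sub grep_logs {
    my ($workdir, $re) = @_;
    my @hits;
    find(
        {
            no_chdir => 1,    # keep $_ as the full path of each entry
            wanted   => sub {
                return unless -f $_ && /\.log\z/i;
                my $path = $_;
                open my $fh, '<', $path
                    or do { warn "Cannot open $path: $!"; return };
                while (my $line = <$fh>) {
                    push @hits, "$path: $line" if $line =~ $re;
                }
                close $fh;
            },
        },
        $workdir,
    );
    return @hits;
}

# Small scratch tree to try it on.
my $workdir = tempdir(CLEANUP => 1);
mkdir "$workdir/run1" or die "Cannot mkdir: $!";
open my $out, '>', "$workdir/run1/job.log" or die "Cannot write: $!";
print {$out} "noise\nsome information here\n";
close $out;

print for grep_logs($workdir, qr/information/);
```

Collecting the hits in an array is convenient for small inputs; for a very large number of logs you would probably print each match as it is found instead.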

Find/Replace in files recursively but touch only files with matches

I would like to quickly search and replace, with or without regexps, in files recursively. In addition, I need to search only in specific files, and I do not want to touch files that do not match my search_pattern; otherwise git will think all the parsed files were modified (which is what happens with find . -exec sed).
I tried many solutions that I found on internet using find, grep, sed or ack but I don't think they are really good to match specific files only.
Eventually I wrote this perl script:
#!/bin/perl
use strict;
use warnings;
use File::Find;
my $search_pattern = $ARGV[0];
my $replace_pattern = $ARGV[1];
my $file_pattern = $ARGV[2];
my $do_replace = 0;
sub process {
return unless -f;
return unless /(.+)[.](c|h|inc|asm|mac|def|ldf|rst)$/;
open F, $_ or print "couldn't open $_\n" && return;
my $file = $_;
my $i = 0;
while (<F>) {
if (m/($search_pattern)/o) {$i++};
}
close F;
if ($do_replace and $i)
{
printf "found $i occurence(s) of $search_pattern in $file\n";
open F, "+>".$file or print "couldn't open $file\n" && return;
while (<F>)
{
s/($search_pattern)/($replace_pattern)/g;
print F;
}
close F;
}
}
find(\&process, ".");
My question is:
Is there a better solution, like the one below (which does not exist)?
`repaint -n/(.+)[.](c|h|inc|asm|mac|def|ldf|rst)$/ s/search/replacement/g .`
Subsidiary questions:
How's my perl script? Not too bad? Do I really need to reopen every file that matches my search_pattern?
How do people deal with this trivial task? Almost every good text editor has a "Search and Replace in files" feature, but not vim. How can vim users do this?
Edit:
I also tried this script ff.pl with ff | xargs perl -pi -e 's/foo/bar/g' but it doesn't work as I expected. It created a .bak backup even though I didn't give anything after the -pi. That seems to be the normal behaviour under cygwin, but it means I cannot really use perl -pi -e.
#!/bin/perl
use strict;
use warnings;
use File::Find;
use File::Basename;
my $ext = $ARGV[0];
sub process {
return unless -f;
return unless /\.(c|h|inc|asm|mac|def|ldf|rst)$/;
print $File::Find::name."\n" ;
}
find(\&process, ".");
Reedit:
I finally came across this solution (under cygwin I need to remove the backup files)
find . | egrep '\.(c|h|asm|inc)$' | xargs perl -pi.winsucks -e 's/<search>/<replace>/g'
find . | egrep '\.(c|h|asm|inc)\.winsucks$' | xargs rm
The following is a cleaned up version of your code.
Always include use strict; and use warnings; at the top of EVERY perl script. If you're doing file processing, include use autodie; as well.
Go ahead and slurp the entire file. That way you only have to read it once and, optionally, write it once.
Consider using File::Find::Rule for cases like this. Your implementation using File::Find works, and File::Find is probably the preferred module in this case, but I like the interface of File::Find::Rule.
I removed the capture groups from the regex. The ones in the RHS were a bug, and the ones in the LHS were superfluous.
And the code:
use strict;
use warnings;
use autodie;
use File::Find;
my $search_pattern = $ARGV[0];
my $replace_pattern = $ARGV[1];
my $file_pattern = $ARGV[2];
my $do_replace = 0;
sub process {
return if !-f;
return if !/[.](?:c|h|inc|asm|mac|def|ldf|rst)$/;
my $data = do {
open my $fh, '<', $_;
local $/;
<$fh>;
};
my $count = $data =~ s/$search_pattern/$replace_pattern/g
or return;
print "found $count occurence(s) of $search_pattern in $_\n";
return if !$do_replace;
open my $fh, '>', $_;
print $fh $data;
close $fh;
}
find(\&process, ".");
Not bad, but several minor notes:
$do_replace is always 0, so it will never replace
the in-place open F, "+>" will not work on cygwin + windows
in m/($search_pattern)/o the /o is good, but the () is not needed
$file_pattern is ignored; you overwrite it with your own
s/($search_pattern)/($replace_pattern)/g;
() is unneeded and will actually disturb a counter in the $replace_pattern
/(.+)[.](c|h|inc|asm|mac|def|ldf|rst)$/ should be written as
/\.(c|h|inc|asm|mac|def|ldf|rst)$/ and maybe /i also
Do I really need to reopen every file that matches my search_pattern?
You don't.
I have no idea about vim; I use emacs, which has several methods to accomplish this.
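The note above about the parentheses in s/($search_pattern)/($replace_pattern)/g is easy to see in a few lines. This is an illustrative sketch with made-up sample strings; the variable names are not from the original script.

```perl
use strict;
use warnings;

my $search  = 'fo+';
my $replace = 'bar';

# Parentheses on the right-hand side are literal text, not grouping,
# so they end up in the output.
(my $with_parens = 'a foo b') =~ s/($search)/($replace)/g;

# Without them the replacement is inserted as-is.
(my $plain = 'a foo b') =~ s/$search/$replace/g;

# In scalar context s///g returns the number of substitutions made,
# which is enough to decide whether a file needs rewriting at all.
my $text  = 'foo fooo fo';
my $count = ($text =~ s/$search/$replace/g);

print "$with_parens\n";    # a (bar) b
print "$plain\n";          # a bar b
print "$count\n";          # 3
```

The return value of s///g is what lets a script skip the write step for files with no matches, so untouched files keep their timestamps.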
What's wrong with the following command?
:grep foo **/*.{foo,bar,baz}
:cw
It won't cause any problem with any VCS and is pretty basic Vimming.
You are right that Vim doesn't come with a dedicated "Search and Replace in files" feature but there are plugins for that.
why not just:
grep 'pat' -rl *|xargs sed -i 's/pat/rep/g'
or I didn't understand the Q right?
I suggest find2perl. If it doesn't work out of the box, you can tweak the code it generates:
find2perl /tmp \! -name ".*?\.(c|h|inc|asm|mac|def|ldf|rst)$" -exec "sed -e s/aaa/bbb/g {}"
it will print the following code to stdout:
#! /usr/bin/perl -w
eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
if 0; #$running_under_some_shell
use strict;
use File::Find ();
# Set the variable $File::Find::dont_use_nlink if you're using AFS,
# since AFS cheats.
# for the convenience of &wanted calls, including -eval statements:
use vars qw/*name *dir *prune/;
*name = *File::Find::name;
*dir = *File::Find::dir;
*prune = *File::Find::prune;
sub wanted;
sub doexec ($@);
use Cwd ();
my $cwd = Cwd::cwd();
# Traverse desired filesystems
File::Find::find({wanted => \&wanted}, '/tmp');
exit;
sub wanted {
my ($dev,$ino,$mode,$nlink,$uid,$gid);
(($dev,$ino,$mode,$nlink,$uid,$gid) = lstat($_)) &&
! /^\..*.?\\.\(c|h|inc|asm|mac|def|ldf|rst\)\$\z/s &&
doexec(0, 'sed -e s/aaa/bbb/g {}');
}
sub doexec ($@) {
my $ok = shift;
my @command = @_; # copy so we don't try to s/// aliases to constants
for my $word (@command)
{ $word =~ s#{}#$name#g }
if ($ok) {
my $old = select(STDOUT);
$| = 1;
print "@command";
select($old);
return 0 unless <STDIN> =~ /^y/;
}
chdir $cwd; #sigh
system @command;
chdir $File::Find::dir;
return !$?;
}
If you want to execute, you can pipe it to perl:
find2perl /tmp \! -name ".*?\.(c|h|inc|asm|mac|def|ldf|rst)$" -exec "sed -e s/aaa/bbb/g" | perl
You can try this plugin for Vim:
https://github.com/skwp/greplace.vim
Basically, it lets you type in a search phrase (with or without regex) and asks you for the files to search in.

Grep in perl never seems to work for me

I have a simple script that reads and lists the files in a directory.
But I don't want to list hidden files, files with a " . " in front.
So I've tried using the grep function to do this, but it returns nothing. I get no listing of files.
opendir(Dir, $mydir);
while ($file = readdir(Dir)){
$file = grep !/^\./ ,readdir Dir;
print "$file\n";
}
I don't think I'm using the regular expression correctly.
I don't want to use an array because the array doesn't format the list correctly.
You can either iterate over directory entries using a loop, or read all the entries in the directory at once:
while (my $file = readdir(Dir)) {
print "$file\n" if $file !~ /^\./;
}
or
my @files = grep { !/^\./ } readdir Dir;
See perldoc -f readdir.
You're calling readdir() twice in a loop. Don't.
or like so:
#!/usr/bin/env perl -w
use strict;
opendir my $dh, '.';
print map {$_."\n"} grep {!/^\./} readdir($dh);
Use glob:
my @files = glob( "$mydir/*" );
print "@files\n";
See perldoc -f glob for details.
while ($file = readdir(Dir))
{
print "\n$file" if ( grep !/^\./, $file );
}
Or you can use a regular expression:
while ($file = readdir(Dir))
{
print "\n$file" unless ( $file =~ /^\./ );
}
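One detail behind the loop in the question: assigning the result of grep to a scalar puts grep in scalar context, where it returns the number of matching elements instead of the elements themselves. The sketch below shows the difference on a made-up list of directory entries.

```perl
use strict;
use warnings;

my @entries = ('.', '..', '.hidden', 'a.txt', 'b.txt');

# Scalar assignment: grep returns how many entries matched, not the entries.
my $count = grep { !/^\./ } @entries;

# List assignment: grep returns the matching entries themselves.
my @visible = grep { !/^\./ } @entries;

print "$count\n";             # 2
print "$_\n" for @visible;    # a.txt, then b.txt
```

In the question's loop the inner readdir is also called in list context, so it consumes all remaining entries on the first pass, which is why calling readdir twice there does not work.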