Regex for input file in Perl [duplicate] - regex

Is there a function in Perl that lists all the files and directories in a directory?
I remember that Java has File.list() for this. Is there a comparable method in Perl?

If you want the contents of a given directory, and only that directory (i.e. no subdirectories), the best way is to use opendir/readdir/closedir:
opendir my $dir, "/some/path" or die "Cannot open directory: $!";
my @files = readdir $dir;
closedir $dir;
You can also use:
my @files = glob( $dir . '/*' );   # here $dir must be the directory path, not the handle
But in my opinion it is not as good, mostly because glob is a fairly complex tool (it can filter results by pattern), and using it just to fetch every entry in a directory seems like overkill for such a simple task.
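As a minimal sketch of the opendir/readdir approach, the following wraps it in a small helper that also drops the . and .. entries. The helper name list_dir and the temporary directory with its file names are assumptions made purely for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# list_dir is a hypothetical helper: names in $path, minus "." and "..".
sub list_dir {
    my ($path) = @_;
    opendir my $dh, $path or die "Cannot open directory $path: $!";
    my @names = sort grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;
    return @names;
}

# Demo on a throwaway directory so the sketch runs anywhere.
my $tmp = tempdir( CLEANUP => 1 );
for my $name (qw(a.txt b.txt)) {
    open my $fh, '>', "$tmp/$name" or die "Cannot create $name: $!";
    close $fh;
}
mkdir "$tmp/sub" or die "Cannot create sub: $!";

print "$_\n" for list_dir($tmp);
```

Note that readdir returns bare names, so anything that needs to stat or open an entry has to prepend the directory path first.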
On the other hand, if you need to get content from all of the directories and subdirectories, there is basically one standard solution:
use File::Find;
my @content;
find( \&wanted, '/some/path');
do_something_with( @content );
exit;
sub wanted {
    push @content, $File::Find::name;
    return;
}
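For a version that can be run as-is, here is a sketch of the same File::Find idea using a closure instead of a global array. The helper name collect_tree and the temporary directory layout are hypothetical; only find and $File::Find::name come from the module itself.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# collect_tree is a hypothetical helper: every path under $root, recursively.
sub collect_tree {
    my ($root) = @_;
    my @content;
    # The anonymous sub is called once per entry, including $root itself.
    find( sub { push @content, $File::Find::name }, $root );
    my @sorted = sort @content;
    return @sorted;
}

# Demo on a throwaway tree: <tmp>/a/b/deep.txt
my $tmp = tempdir( CLEANUP => 1 );
make_path("$tmp/a/b");
open my $fh, '>', "$tmp/a/b/deep.txt" or die "Cannot create file: $!";
close $fh;

print "$_\n" for collect_tree($tmp);
```

The closure keeps the result list private to the helper, which avoids the shared @content variable of the snippet above.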

This should do it.
my $dir = "bla/bla/upload";
opendir DIR, $dir or die "Cannot open $dir: $!";
my @dir = readdir(DIR);
closedir DIR;
foreach (@dir) {
    if ( -f $dir . "/" . $_ ) {
        print $_, " : file\n";
    } elsif ( -d $dir . "/" . $_ ) {
        print $_, " : folder\n";
    } else {
        print $_, " : other\n";
    }
}

readdir() does that.
Check http://perldoc.perl.org/functions/readdir.html
opendir(DIR, $some_dir) || die "can't opendir $some_dir: $!";
@dots = grep { /^\./ && -f "$some_dir/$_" } readdir(DIR);
closedir DIR;

Or File::Find
use File::Find;
finddepth(\&wanted, '/some/path/to/dir');
sub wanted { print };
It'll go through subdirectories if they exist.

If you are a slacker like me, you might like the File::Slurp module. Its read_dir function reads the directory contents into an array, removes the dot entries, and, if needed, prefixes the returned names with the directory to give full paths:
my @paths = read_dir( '/path/to/dir', prefix => 1 ) ;

This will list Everything (including sub directories) from the directory you specify, in order, and with the attributes. I have spent days looking for something to do this, and I took parts from this entire discussion, and a little of my own, and put it together. ENJOY!!
#!/usr/bin/perl --
print qq~Content-type: text/html\n\n~;
print qq~<font face="arial" size="2">~;
use File::Find;
# find( \&wanted_tom, '/home/thomas/public_html'); # if you want just one website, uncomment this, and comment out the next line
find( \&wanted_tom, '/home');
exit;
sub wanted_tom {
    ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,$atime,$mtime,$ctime,$blksize,$blocks) = stat ($_);
    $mode = (stat($_))[2];
    $mode = substr(sprintf("%03lo", $mode), -3);
    if (-d $File::Find::name) {
        print "<br><b>--DIR $File::Find::name --ATTR:$mode</b><br>";
    } else {
        print "$File::Find::name --ATTR:$mode<br>";
    }
    return;
}

Related

Can't combine two regex statements to work at once

I have a simple perl code that displays the names of the files in a given directory
opendir my $dir, "./images/sampleImages" or die "Cannot open directory: $!";
my @Images = grep {!/^\./} readdir $dir;
closedir $dir;
I have added a regex statement to remove all single and double dots, since readdir adds them as special characters and this is working perfectly fine. However I want only the files with the extensions of .jpg/.jpeg/.png/.gif to be read. I have found a regex that does that, which is :
(/\.(gif|jpg|jpeg|png)$/i)
But I can't seem to make them work together. I have tried everything I could find on the subject here, combining the statements in different ways but either they give errors, or flat out do not work the way it should. Any help would be greatly appreciated, thanks in advance!
Does this work?
my @Images = grep { !/^\./ && /\.(gif|jpg|jpeg|png)$/i } readdir $dir;
You can use multiple match operators:
my @dirs =
    grep { /\.(?:gif|jpe?g|png)\z/i && ! /\A\./ }
    readdir $dir;
I typically make one grep per idea so I can disable them individually:
my @dirs =
    grep { /\.(?:gif|jpe?g|png)\z/i }
    grep { ! /\A\./ }
    readdir $dir;
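And often I add a map to build the full path, since readdir doesn't do that for you (and usually I get it wrong first, then come back to add this):
# catfile comes from File::Spec::Functions; $dir_path stands for the
# directory's path string (the handle $dir cannot be used for this)
my @dirs =
    map  { catfile( $dir_path, $_ ) }
    grep { /\.(?:gif|jpe?g|png)\z/i }
    grep { ! /\A\./ }
    readdir $dir;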
And readdir doesn't "add" . and .. to the list. Those are the virtual directories that mean the current directory and the parent directory. If those are the only things you want to remove, you can tighten that part to match only them rather than all hidden files:
# catfile comes from File::Spec::Functions; $dir_path stands for the
# directory's path string (the handle $dir cannot be used for this)
my @dirs =
    map  { catfile( $dir_path, $_ ) }
    grep { /\.(?:gif|jpe?g|png)\z/i }
    grep { ! /\A\.\.?\z/ }
    readdir $dir;
Again, even though I know that and have written it a million times, I still only add it once I see . and .. in the output. Some things never change.
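To see the "one grep per idea" chain in action without touching the filesystem, here is a small sketch that runs it over a made-up list of names. The helper image_names and the sample entries are hypothetical stand-ins for what readdir might return.

```perl
use strict;
use warnings;

# Hypothetical sample of what readdir might return.
my @entries = ( '.', '..', '.hidden.png', 'cat.JPG', 'dog.jpeg', 'notes.txt', 'logo.gif' );

# image_names is a hypothetical helper wrapping the two-stage filter.
sub image_names {
    my @names = @_;
    return
        grep { /\.(?:gif|jpe?g|png)\z/i }   # keep image extensions, any case
        grep { !/\A\.\.?\z/ }               # drop only "." and ".."
        @names;
}

print "$_\n" for image_names(@entries);
```

Because the second filter removes only . and .., a hidden image such as .hidden.png is kept; swapping in ! /\A\./ would drop it along with every other dotfile.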
The following code keeps only image files, so there is no need to filter out . and .. separately:
use strict;
use warnings;
use feature 'say';
my $dirname = './images/sampleImages';
opendir my $dir, $dirname
    or die "Cannot open directory $dirname: $!";
my @Images = grep { /\.(gif|jpg|jpeg|png)$/i } readdir $dir;
closedir $dir;
say for @Images;

Perl script to read a directory and having file name started by a particular name [closed]

I have a directory containing thousands of files.
Suppose I have 3 PDF files with similar names, like:
sample_Q1.pdf
sample_Q2.pdf
sample_Q3.pdf
Now I want to list the files whose names start with "Sample".
I'm currently using:
#!/usr/bin/perl
use strict;
use warnings;
my $dir = '/home/gaurav/Desktop/CSP';
opendir( DIR, $dir ) or die $!;
while ( my $file = readdir(DIR) ) {
    # We only want files
    next unless ( -f "$dir/$file" );
    # Use a regular expression to find files ending in .pdf
    next unless ( $file =~ /\.pdf$/ );
    print "$file\n";
}
closedir(DIR);
exit 0;
I'd suggest that rather than readdir, what you really want is glob. It expands a pattern the same way the shell does and returns the matches. It is particularly useful here because readdir returns bare file names, whereas glob returns paths that include the directory part of the pattern.
E.g.:
#!/usr/bin/perl
use strict;
use warnings;
my $dir = '/home/gaurav/Desktop/CSP';
foreach my $file ( glob ( "$dir/sample*.pdf" ) ) {
    print $file,"\n";
}
If you want to skip any 'non-files', there are two approaches:
next unless -f $file;
Or:
foreach my $file ( grep { -f $_ } glob ( "$dir/sample*.pdf" ) ) {
But that's probably a moot point unless you've got directories suffixed .pdf which would be a bit unusual.
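Here is a runnable sketch of the glob-plus-filter variant, exercised against a temporary directory. The helper name matching_files and every file name in the demo are made up for illustration; the sketch also assumes the directory path contains no whitespace, since glob splits its pattern on spaces.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Temp qw(tempdir);

# matching_files is a hypothetical helper: plain files in $dir named sample*.pdf.
# glob splits its pattern on whitespace, so this assumes $dir has no spaces.
sub matching_files {
    my ($dir) = @_;
    return grep { -f $_ } glob("$dir/sample*.pdf");
}

# Demo on a throwaway directory (all names below are made up).
my $tmp = tempdir( CLEANUP => 1 );
for my $name (qw(sample_Q1.pdf sample_Q2.pdf other.pdf)) {
    open my $fh, '>', "$tmp/$name" or die "Cannot create $name: $!";
    close $fh;
}
# A directory whose name matches the pattern; -f filters it out.
mkdir "$tmp/sample_dir.pdf" or die "Cannot create directory: $!";

print basename($_), "\n" for matching_files($tmp);
```

Since glob already returns paths with the directory included, the -f test works without any extra path joining.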
my $DirectoryName = '......';
chdir( $DirectoryName ) or die "Can't change directory: $!\n";
my @files = glob( "sample*");
for my $file (@files) {
    print $file,"\n";
}
This will list all file names that start with sample and have the extension .pdf:
use warnings;
use strict;
opendir my $dir, "/home/gaurav/Desktop/CSP" or die "Can't open directory: $!";
my @files = grep { /^sample.*\.pdf$/ } readdir $dir;
print "@files";
closedir $dir;
Use a foreach loop if you want to print the file names one per line:
foreach my $file (@files)
{
    print "$file\n";
}
I would do it like this:
$file =~ /^sample.*\.pdf$/i
Here the i flag makes the match case-insensitive, so it matches both 'sample' and 'Sample'.
while ( my $file = readdir(DIR) )
{
    if((-f "$dir/$file" ) && ($file =~ /^sample.*\.pdf$/i))
    {
        print "$file \n";
    }
}
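The effect of the ^ anchor and the i flag is easiest to see side by side. The following sketch applies both forms of the pattern to a made-up list of names; none of these files are from the question.

```perl
use strict;
use warnings;

# Hypothetical file names for illustration.
my @names = qw(sample_Q1.pdf Sample_Q2.PDF my_sample.pdf sample.txt);

# Anchored at the start and case-insensitive: the name must begin with "sample".
my @anchored = grep { /^sample.*\.pdf$/i } @names;

# No ^ anchor and no i flag: "sample" may appear anywhere, lowercase only.
my @unanchored = grep { /sample.*\.pdf$/ } @names;

print "anchored:   @anchored\n";
print "unanchored: @unanchored\n";
```

With this input the anchored pattern keeps sample_Q1.pdf and Sample_Q2.PDF, while the unanchored one keeps sample_Q1.pdf and my_sample.pdf.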

How to use Perl to grep specific information from many files in a list of directories [duplicate]

I have a directory structure like this
$workdir/XXXX/YYYY.log.
where XXXX is the sub directory name (there are many sub directories) and YYYY is the log file name (there are many log files also).
I need to extract some information from all the logs. Currently I use
@Info = qx(grep "information" -r $workdir)
and then write @Info to a file.
Is there a more efficient way to do this?
I would use something like this in pure Perl. I think a lot of the problem is that there is no reassurance that the process is progressing. This solution prints the name of each subdirectory and each log file to STDERR as it is encountered, but sends all the grepped lines to STDOUT.
You will have to modify the condition in the while loop so that the correct lines are selected.
It wouldn't be too hard to produce a "percent complete" figure, or an estimated time of completion, if that is what you wish.
use strict;
use warnings;
use autodie;
use File::Spec;
my $workdir = '/path/to/work/dir';
opendir my($dh), $workdir;
my @subdirs = grep { /\A[^.]/ and -d File::Spec->catdir($workdir, $_) } readdir $dh;
closedir $dh;
for my $subdir (@subdirs) {
    $subdir = File::Spec->catdir($workdir, $subdir);
    print STDERR "$subdir\n";
    opendir my($dh), $subdir;
    my @logs = grep { /\.log\z/i } readdir $dh;
    closedir $dh;
    @logs = grep { -f } map { File::Spec->catfile($subdir, $_) } @logs;
    for my $log (@logs) {
        print STDERR " $log\n";
        open my $fh, '<', $log;
        while (<$fh>) {
            print " $_" if /condition/;
        }
    }
}
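An alternative sketch of the same search, swapping the nested opendir loops for File::Find, is shown below. This is not the answer's method, just a variation under stated assumptions: the helper names grep_logs and write_file, the temporary directory layout and the sample log contents are all hypothetical.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# grep_logs is a hypothetical helper: returns "path: line" for every line of
# every *.log file under $workdir that matches $pattern.
sub grep_logs {
    my ( $workdir, $pattern ) = @_;
    my @hits;
    find(
        {
            no_chdir => 1,    # $_ holds the full path instead of the bare name
            wanted   => sub {
                return unless -f $_ && /\.log\z/i;
                my $path = $File::Find::name;
                open my $fh, '<', $path or die "Cannot open $path: $!";
                while ( my $line = <$fh> ) {
                    chomp $line;
                    push @hits, "$path: $line" if $line =~ $pattern;
                }
                close $fh;
            },
        },
        $workdir
    );
    return @hits;
}

# write_file is a hypothetical helper used only to build the demo tree.
sub write_file {
    my ( $path, $text ) = @_;
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} $text;
    close $fh;
}

my $tmp = tempdir( CLEANUP => 1 );
mkdir "$tmp/run1" or die "Cannot create run1: $!";
mkdir "$tmp/run2" or die "Cannot create run2: $!";
write_file( "$tmp/run1/a.log", "information: ok\nnoise\n" );
write_file( "$tmp/run1/c.txt", "information in a non-log file\n" );
write_file( "$tmp/run2/b.log", "noise\n" );

print "$_\n" for grep_logs( $tmp, qr/information/ );
```

Reading line by line keeps memory use flat regardless of log size, whereas capturing all of grep's output with qx holds everything in memory at once.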

Searching for Files with specific Regex in filename in Perl

Hi all I was wondering how I can go about searching for files in perl.
Right now I have a line with information that I have tokenized with tab as a delimiter stored into an array. (using split) These arrays contain stub text of filenames I want to search for in a directory. For example Engineering_4.txt would just be "Engin" in my array.
If there are two different files... Engineering_4 and Engineering_5, it would search both these files for content and just extract the information I need from one of them (only 1 contains information I want). I would imagine my script will have to search and store all file names that match and then search through each of these files.
My question is how do I go about searching for files in a directory matching a regular expression in Perl? Also is there a way to limit the file types that I want to search for. For example, I just want to only search for ".txt" files.
Thanks everyone
I guess since you already know the directory, you could open it and read it while filtering at the same time:
opendir D, 'yourDirectory' or die "Could not open dir: $!\n";
my @filelist = grep(/yourRegex/i, readdir D);
You can do this using the glob function or the <glob> operator.
while (<Engin*.txt>) {
print "$_\n";
}
The glob function returns a list of the matching files when given a wildcard expression.
This means that the files can also be sorted before processing:
use Sort::Key::Natural 'natsort';
foreach my $file ( natsort glob "*.txt" ) { # Will loop over only txt files
    open my $fh, '<', $file or die $!; # Open file and process
}
You can also use the File::Find module:
#!/usr/bin/env perl
use strict;
use warnings;
use File::Find;
my @dirs = @ARGV ? @ARGV : ('.');
my @list;
find( sub {
    push @list, $File::Find::name if -f $_ && $_ =~ m/.+\.txt\z/ },
    @dirs );
print "$_\n" for @list;
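Since the question starts from stub text such as "Engin", one way to turn a stub into a safe pattern is to quote it with \Q...\E and anchor both ends. The sketch below does this over a made-up list of names; the helper files_for_stub and the sample names are assumptions for illustration.

```perl
use strict;
use warnings;

# files_for_stub is a hypothetical helper: names that start with $stub
# and end in .txt. \Q...\E quotes any regex metacharacters in the stub.
sub files_for_stub {
    my ( $stub, @names ) = @_;
    my $re = qr/\A\Q$stub\E.*\.txt\z/;
    return grep { $_ =~ $re } @names;
}

# Hypothetical directory listing.
my @names = qw(Engineering_4.txt Engineering_5.txt Engine.log Biology_1.txt);

print "$_\n" for files_for_stub( 'Engin', @names );
```

In a real script the @names list would come from readdir or glob, and each returned file could then be opened and searched in turn.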

Grep in perl never seems to work for me

I have a simple script that reads and lists the files in a directory.
But I don't want to list hidden files, i.e. files with a "." in front.
So I tried using the grep function to do this, but it returns nothing. I get no listing of files.
opendir(Dir, $mydir);
while ($file = readdir(Dir)){
$file = grep !/^\./ ,readdir Dir;
print "$file\n";
I don't think I'm using the regular expression correctly.
I don't want to use an array because the array doesn't format the list correctly.
You can either iterate over directory entries using a loop, or read all the entries in the directory at once:
while (my $file = readdir(Dir)) {
print "$file\n" if $file !~ /^\./;
}
or
my @files = grep { !/^\./ } readdir Dir;
See perldoc -f readdir.
You're calling readdir() twice in a loop. Don't.
or like so:
#!/usr/bin/env perl -w
use strict;
opendir my $dh, '.';
print map {$_."\n"} grep {!/^\./} readdir($dh);
Use glob:
my @files = glob( "$mydir/*" );
print "@files\n";
See perldoc -f glob for details.
while ($file = readdir(Dir))
{
    print "\n$file" if ( grep !/^\./, $file );
}
Or you can use a regular expression:
while ($file = readdir(Dir))
{
    print "\n$file" unless ( $file =~ /^\./ );
}
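Part of the confusion in the question likely comes from context: assigning grep to a scalar yields the number of matches, not a file name. A small sketch over a made-up list (the entries are hypothetical) shows the difference.

```perl
use strict;
use warnings;

# Hypothetical directory entries, as readdir might return them.
my @entries = ( '.', '..', '.bashrc', 'a.txt', 'b.txt' );

my $count   = grep { !/^\./ } @entries;   # scalar context: number of matches
my @visible = grep { !/^\./ } @entries;   # list context: the matches themselves

print "count: $count\n";
print "$_\n" for @visible;
```

Printing one name per line from the array, as in the last statement, also addresses the formatting concern about using an array.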