Excluding a file with perl grep

Excluding a file with perl grep - regex

I want to go over all of the files in the directory, except for files ending with '.py'.
The line in the existing script is:
my @files = sort(grep(!/^(\.|\.\.)$/, readdir($dir_h)));
And I want something like:
my @files = sort(grep(!/^(\.|\.\.|"*.py")$/, readdir($dir_h)));
Can you please help with the exact syntax?

grep uses regular expressions, not globs (aka wildcards). The correct syntax is
my @files = sort(grep(!/^(\.|\.\.|.*\.py)$/, readdir($dir_h)));
or, without the unnecessary parentheses
my @files = sort grep ! /^(\.|\.\.|.*\.py)$/, readdir $dir_h;
As the parentheses in the regular expression aren't used for capturing, but only for precedence, you can change them to non-capturing:
my @files = sort grep ! /^(?:\.|\.\.|.*\.py)$/, readdir $dir_h;
You can express the same in many different ways, e.g.
/^\.{1,2}$|\.py$/
i.e. dot once or twice with nothing around, or .py at the end.
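To check that the shorter pattern behaves as described, here is a small sketch that applies it to a hard-coded list instead of a real directory. The sample names are made up for illustration and stand in for whatever readdir would return.

```perl
use strict;
use warnings;

# Hypothetical directory entries, standing in for readdir() output.
my @entries = ('.', '..', 'notes.txt', 'build.py', '.hidden', 'py', 'script.pyc');

# Drop '.', '..' and anything ending in '.py'; keep everything else.
my @files = sort grep { !/^\.{1,2}$|\.py$/ } @entries;

# Prints .hidden, notes.txt, py and script.pyc, one per line.
print "$_\n" for @files;
```

Note that '.hidden', 'py' and 'script.pyc' survive: only the exact names '.' and '..' and names whose last three characters are '.py' are filtered out.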

Perl's built-in grep is actually very clever: it iterates over a list, applying a condition to each element in turn, with $_ set to the current element.
This condition can be a simple regular expression, but it doesn't have to be.
So you can - for example:
my @files = grep { -f $_ } readdir(DIR);
But because -f defaults to $_ you can also:
my @files = grep { -f } readdir (DIR);
You can also apply a regular expression to $_
my @files = grep { not m/\.py$/ } readdir (DIR);
(Note - this is the same as not $_ =~ m/\.py$/ - patterns apply to $_ by default).
So you can do what you want by:
my @files = sort grep { not m/\.py$/ and -f } readdir (DIR);
Note, though, that this only works when DIR is the current working directory, because readdir returns bare file names and -f tests them relative to the current directory. You can use readdir for other directories, but personally I prefer glob, because it fills in the path as well:
my @files = sort grep { not m/\.py$/ and -f } glob ( "$dir/*" );
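The difference between the two approaches can be seen in a self-contained sketch. It builds a throwaway directory with File::Temp (the file names a.txt, b.py and c.pl are invented for the example) and uses bsd_glob from File::Glob rather than plain glob, purely so that a path containing whitespace would not be split; that substitution is my choice, not part of the answer above.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Throwaway directory so the sketch does not depend on any real path.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(a.txt b.py c.pl)) {
    open( my $fh, '>', "$dir/$name" ) or die "Cannot create $dir/$name: $!";
    close($fh);
}

# readdir returns bare names, so -f needs the directory prefixed.
opendir( my $dh, $dir ) or die "Unable to open $dir: $!";
my @from_readdir = sort grep { !/\.py$/ and -f "$dir/$_" } readdir($dh);
closedir($dh);

# glob-style expansion returns the path already attached, so a bare -f works.
my @from_glob = sort grep { !/\.py$/ and -f } bsd_glob("$dir/*");

print "readdir: @from_readdir\n";
print "glob:    @from_glob\n";
```

The readdir list contains a.txt and c.pl as bare names, while the glob list contains the same two files with the temporary directory prefixed.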

Check that the directory entries are files and then exclude those that end in .py:
#!/usr/bin/env perl
use warnings;
use strict;
my $dir = "/home/me/somedir";
# good examples in the perldoc:
# perldoc -f readdir
opendir(my $DIR, $dir) || die "Unable to open $dir : $!";
# -f checks that it is a plain file ( perldoc perlfunc )
# !~ means does not match ( perldoc perlre )
# m|\.py$| means a match string that ends in '.py'
my @files = sort grep { -f "$dir/$_" && $_ !~ m|\.py$| } readdir($DIR);

Related

Script to add line with part of the pattern used to find the line

I'd like to parse all *.php files, and for each line like
$res = $DB -> query($queryVar);
I need to get:
file_put_contents('php://stderr', print_r($queryVar, TRUE));
$res = $DB -> query($queryVar);
The name of the variable $queryVar may change! I need to get it from the code!
My initial idea:
find -not -path "*/\." -name "*.php" -type f -print0 | xargs -0 sed -i 's,SOMETHING,SOMETHING,'
but it seems to be not possible to get the name of the query variable with sed.
I also started looking at Perl: Perl: append a line after the last line that match a pattern (but incrementing part of the pattern)
But I was able to do only this:
perl -pe 's/(-> query\(.*\))/AAAAA $1 AAAAA\n$1/' < filename.php
There are two problems. First, I get the result on standard output, whereas I need something like sed that edits the original file, since I will call it from find | xargs. Second, I get the whole matched line and not only the variable:
$res = $DB AAAAA -> query( $SQL) AAAAA
-> query( $SQL);

Given a file named filename.php, you can run the following command:
perl -pi -e 's/^(.+-> query\((.+?)\).*)$/file_put_contents\("php:\/\/stderr", print_r\($2, TRUE\)\);\n$1/;' filename.php
It will update the file in-place with the substitution you intended to perform.

You can use perl's -i flag to edit the file in place.
To only capture the query variable you need to add a capture group within the () part, as follows:
perl -i -pe 's/^(.*-> query\((.*)\);)$/inserted_code_here($2);\n$1/' x.php
Then replace inserted_code_here with whatever you want to put on the line before the query call.
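As a quick way to try the capture groups before touching any files, the same idea can be run against a single sample line in a script. This is only a sketch: the sample line is the one from the question, and the replacement text is the file_put_contents call the question asks for.

```perl
use strict;
use warnings;

# Sample input line taken from the question.
my $line = '$res = $DB -> query($queryVar);' . "\n";

# $1 is the whole original line, $2 is the argument passed to query().
(my $patched = $line) =~
    s{^(.*-> query\((.+?)\).*)$}{file_put_contents('php://stderr', print_r($2, TRUE));\n$1};

print $patched;
```

The non-greedy (.+?) stops at the first closing parenthesis, so this assumes the argument itself contains no parentheses.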

You can use perl like sed. But really, by doing so you throw away a lot of its potential as a language. I couldn't quite tell from your question - is $queryVar a literal, or is it a variable you need to replace?
Why not try this:
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
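sub process_php {
    # Called by File::Find for every entry; skip anything that is not a .php file.
    return unless m/\.php$/;
    open( my $input, "<", $File::Find::name )
        or do { warn "Cannot read $File::Find::name ($!)"; return };
    open( my $output, ">", $File::Find::name . ".new" )
        or do { warn "Cannot write $File::Find::name.new ($!)"; return };
    while ( my $line = <$input> ) {
        # Capture whatever is passed to query(), e.g. $queryVar.
        my ($query_id) = ( $line =~ m/-> query\((.*)\)/ );
        if ( defined $query_id ) {
            print {$output} "file_put_contents('php://stderr', print_r(",
                $query_id, ", TRUE));\n";
        }
        print {$output} $line;
    }
    close($input);
    close($output);
}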
find( \&process_php, "/path/to/php/files" );
This will:
search all the '*.php' files under the directory path.
traverse them looking for your string.
If it exists, add a new line just before it.
write a '.new' file, with the new content (Once you're happy this works, you can swap 'em over).

Find/Replace in files recursively but touch only files with matches

I would like to quickly search and replace, with or without a regexp, in files recursively. In addition, I need to search only in specific files, and I do not want to touch files that do not match my search_pattern; otherwise git will think all the parsed files were modified (which is what happens with find . -exec sed).
I tried many solutions that I found on internet using find, grep, sed or ack but I don't think they are really good to match specific files only.
Eventually I wrote this perl script:
#!/bin/perl
use strict;
use warnings;
use File::Find;
my $search_pattern = $ARGV[0];
my $replace_pattern = $ARGV[1];
my $file_pattern = $ARGV[2];
my $do_replace = 0;
sub process {
return unless -f;
return unless /(.+)[.](c|h|inc|asm|mac|def|ldf|rst)$/;
open F, $_ or print "couldn't open $_\n" && return;
my $file = $_;
my $i = 0;
while (<F>) {
if (m/($search_pattern)/o) {$i++};
}
close F;
if ($do_replace and $i)
{
printf "found $i occurence(s) of $search_pattern in $file\n";
open F, "+>".$file or print "couldn't open $file\n" && return;
while (<F>)
{
s/($search_pattern)/($replace_pattern)/g;
print F;
}
close F;
}
}
find(\&process, ".");
My question is:
Is there any better solution, like the one below (which does not exist)?
`repaint -n/(.+)[.](c|h|inc|asm|mac|def|ldf|rst)$/ s/search/replacement/g .`
Subsidiary questions:
How's my Perl script? Not too bad? Do I really need to reopen every file that matches my search_pattern?
How do people deal with this trivial task? Almost every good text editor has a "Search and Replace in files" feature, but not vim. How can vim users do this?
Edit:
I also tried this script ff.pl with ff | xargs perl -pi -e 's/foo/bar/g', but it doesn't work as I expected: it created a .bak backup even though I didn't give anything after -pi. This seems to be the normal behaviour under cygwin, but it means I cannot really use perl -pi -e.
#!/bin/perl
use strict;
use warnings;
use File::Find;
use File::Basename;
my $ext = $ARGV[0];
sub process {
return unless -f;
return unless /\.(c|h|inc|asm|mac|def|ldf|rst)$/;
print $File::Find::name."\n" ;
}
find(\&process, ".");
Reedit:
I finally came across this solution (under cygwin I need to remove the backup files)
find . | egrep '\.(c|h|asm|inc)$' | xargs perl -pi.winsucks -e 's/<search>/<replace>/g'
find . | egrep '\.(c|h|asm|inc)\.winsucks$' | xargs rm

The following is a cleaned up version of your code.
Always include use strict; and use warnings; at the top of EVERY Perl script. If you're doing file processing, include use autodie; as well.
Go ahead and slurp the entire file. That way you only have to read it once and, optionally, write it once.
Consider using File::Find::Rule for cases like this. Your implementation using File::Find works, and it is probably the preferred module in this case, but I like the interface of File::Find::Rule.
I removed the capture groups from the regex. The ones in the RHS were a bug, and the ones in the LHS were superfluous.
And the code:
use strict;
use warnings;
use autodie;
use File::Find;
my $search_pattern = $ARGV[0];
my $replace_pattern = $ARGV[1];
my $file_pattern = $ARGV[2];
my $do_replace = 0;
sub process {
return if !-f;
return if !/[.](?:c|h|inc|asm|mac|def|ldf|rst)$/;
my $data = do {
open my $fh, '<', $_;
local $/;
<$fh>;
};
my $count = $data =~ s/$search_pattern/$replace_pattern/g
or return;
print "found $count occurrence(s) of $search_pattern in $_\n";
return if !$do_replace;
open my $fh, '>', $_;
print $fh $data;
close $fh;
}
find(\&process, ".");

Not bad, but several minor notes:
$do_replace is always 0, so nothing will ever be replaced.
The in-place open F, "+>" will not work on cygwin + windows.
In m/($search_pattern)/o the /o is good, but the () is not needed.
$file_pattern is ignored; you override it with your own pattern.
In s/($search_pattern)/($replace_pattern)/g; the () is unneeded and will actually disturb the group numbering in $replace_pattern.
/(.+)[.](c|h|inc|asm|mac|def|ldf|rst)$/ should be written as /\.(c|h|inc|asm|mac|def|ldf|rst)$/, and maybe with /i as well.
Do I really need to reopen every file that matches my search_pattern?
No, you don't.
I have no idea about vim; I use emacs, which has several methods to accomplish this.

What's wrong with the following command?
:grep foo **/*.{foo,bar,baz}
:cw
It won't cause any problem with any VCS and is pretty basic Vimming.
You are right that Vim doesn't come with a dedicated "Search and Replace in files" feature but there are plugins for that.

why not just:
grep 'pat' -rl *|xargs sed -i 's/pat/rep/g'
or I didn't understand the Q right?

I suggest find2perl. If it doesn't work out of the box, you can tweak the code it generates:
find2perl /tmp \! -name ".*?\.(c|h|inc|asm|mac|def|ldf|rst)$" -exec "sed -e s/aaa/bbb/g {}"
it will print the following code to stdout:
#! /usr/bin/perl -w
eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
if 0; #$running_under_some_shell
use strict;
use File::Find ();
# Set the variable $File::Find::dont_use_nlink if you're using AFS,
# since AFS cheats.
# for the convenience of &wanted calls, including -eval statements:
use vars qw/*name *dir *prune/;
*name = *File::Find::name;
*dir = *File::Find::dir;
*prune = *File::Find::prune;
sub wanted;
sub doexec ($@);
use Cwd ();
my $cwd = Cwd::cwd();
# Traverse desired filesystems
File::Find::find({wanted => \&wanted}, '/tmp');
exit;
sub wanted {
my ($dev,$ino,$mode,$nlink,$uid,$gid);
(($dev,$ino,$mode,$nlink,$uid,$gid) = lstat($_)) &&
! /^\..*.?\\.\(c|h|inc|asm|mac|def|ldf|rst\)\$\z/s &&
doexec(0, 'sed -e s/aaa/bbb/g {}');
}
sub doexec ($@) {
my $ok = shift;
my @command = @_; # copy so we don't try to s/// aliases to constants
for my $word (@command)
{ $word =~ s#{}#$name#g }
if ($ok) {
my $old = select(STDOUT);
$| = 1;
print "@command";
select($old);
return 0 unless <STDIN> =~ /^y/;
}
chdir $cwd; #sigh
system @command;
chdir $File::Find::dir;
return !$?;
}
If you want to execute, you can pipe it to perl:
find2perl /tmp \! -name ".*?\.(c|h|inc|asm|mac|def|ldf|rst)$" -exec "sed -e s/aaa/bbb/g" | perl

You can try this plugin for Vim:
https://github.com/skwp/greplace.vim
Basically, it lets you type in a search phrase (with or without a regex) and asks you which files to search in.

Whats wrong with my perl substitution?

I have a directory of files I am trying to split down into subdirectories using perl due to the quantity of files. The filenames are formatted with dates at the start in the form YYYYMMDD and I'm trying to split on that. I am using the following code adapted from this StackOverflow Answer:
#!/usr/bin/perl -w
use strict;
opendir DIR, "." or die "opendir: $!";
my @files = readdir(DIR);
closedir DIR;
foreach my $f (@files) {
-f $f or next;
(my $new_name = $f) =~ s!^((....)(..)(..).*)$!$2/$3/$4/$1/;
-e $new_name and die "$new_name already exists";
rename($f, $new_name);
}
However I get a 'Substitution replacement not terminated at movefiles.pl line 10.' when I try and run this code. As far as I can see I am escaping and terminating the substitution correctly?

You are using ! as a regular expression delimiter. You have one to start it, one to separate the match part from the replace part, but don't have one at the end.
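For illustration, here is the substitution with the closing delimiter in place, applied to a made-up file name (20140321_report.txt is hypothetical) rather than to real directory contents:

```perl
use strict;
use warnings;

# Hypothetical file name starting with a YYYYMMDD date.
my $f = '20140321_report.txt';

# Three '!' delimiters: one opens the pattern, one separates it from the
# replacement, and one closes the replacement.
(my $new_name = $f) =~ s!^((....)(..)(..).*)$!$2/$3/$4/$1!;

print "$new_name\n";   # prints 2014/03/21/20140321_report.txt
```

Keep in mind that rename does not create the intermediate directories, so the year/month/day directories would have to exist before the original script could move anything into them.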

Escaping brackets in file names

I've got a few files named stuff like this: file (2).jpg. I'm writing a little Perl script to rename them, but I get errors due to the brackets not being replaced. So. Can someone tell me how to escape all the brackets (and spaces, if they cause a problem) in a string so I can pass it to a command. The script is below:
#Load all jpgs into an array.
@pix = `ls *.JPG`;
foreach $pix (@pix) {
#Let you know it's working
print "Processing photo ".$pix;
$pix2 = $pix;
$pix2 =~ \Q$pix\E; # Problem line
#Use the program exiv2 to rename the file with timestamp
system("exiv2 -r %Y_%m%d_%H%M%S $pix2");
}
The error is this:
Can't call method "Q" without a package or object reference at script.sh line [problem line].
This is my first time with regex, so I'm looking for the answers that explain what to do as well as giving an answer. Thanks for any help.

Why not simply use find?
find . -name \*.JPG -exec exiv2 -r "%Y_%m%d_%H%M%S" "{}" \;
Ps:
\Q disables pattern metacharacters until \E inside a regex.
For example, if you want match a path "../../../somefile.jpg", you can write:
$file =~ m:\Q../../../somefile.jpg\E:;
instead of
$file =~ m:\.\./\.\./\.\./somefile\.jpg:; # i.e. escaping every dot, which is a regex metacharacter.
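A short sketch of how this relates to the original problem, under a few assumptions: quotemeta is the function form of \Q...\E, and the list form of system avoids the shell altogether, so a name like file (2).jpg needs no escaping at all. Since exiv2 may not be installed, the running perl ($^X) stands in for the external command here; that stand-in is only for demonstration.

```perl
use strict;
use warnings;

$| = 1;    # flush our output before the child process writes its own

my $file = 'file (2).jpg';

# quotemeta() backslashes every non-word character, like \Q...\E does.
my $escaped = quotemeta($file);
print "$escaped\n";    # prints file\ \(2\)\.jpg

# Inside a pattern, \Q...\E makes the parentheses and the dot match literally.
print "literal match\n" if 'photos/file (2).jpg' =~ /\Q$file\E$/;

# List-form system() passes each argument directly to the program,
# with no shell in between, so spaces and brackets are safe.
my $status = system( $^X, '-e', 'print "got: $ARGV[0]\n"', $file );
```

With exiv2 the call would take the same list form, with the program name, the -r option, the format string and the file name as separate arguments.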

I found this Perl renaming script that was written by Larry Wall a while back. It does what you need and so much more. I keep it in my $PATH and use it daily.
#!/usr/bin/perl -w
use Getopt::Std;
getopts('ht', \%cliopts);
do_help() if( $cliopts{'h'} );
#
# rename script examples from lwall:
# pRename.pl 's/\.orig$//' *.orig
# pRename.pl 'y/A-Z/a-z/ unless /^Make/' *
# pRename.pl '$_ .= ".bad"' *.f
# pRename.pl 'print "$_: "; s/foo/bar/ if <stdin> =~ /^y/i' *
$op = shift;
for (@ARGV) {
$was = $_;
eval $op;
die $@ if $@;
unless( $was eq $_ ) {
if( $cliopts{'t'} ) {
print "mv $was $_\n";
} else {
rename($was,$_) || warn "Cannot rename $was to $_: $!\n";
}
}
}
sub do_help {
my $help = qq{
Usage examples for the rename script example from Larry Wall:
pRename.pl 's/\.orig\$//' *.orig
pRename.pl 'y/A-Z/a-z/ unless /^Make/' *
pRename.pl '\$_ .= ".bad"' *.f
pRename.pl 'print "\$_: "; s/foo/bar/ if <stdin> =~ /^y/i' *
CLI Options:
-h This help page
-t Test only, do not move the files
};
die "$help\n";
return 0;
}

Grep in perl never seems to work for me

I have a simple script that reads and list file from a directory.
But I don't want to list hidden files, files with a " . " in front.
So I've tried using the grep function to do this, but it returns nothing. I get no listing of files.
opendir(Dir, $mydir);
while ($file = readdir(Dir)){
$file = grep !/^\./ ,readdir Dir;
print "$file\n";
I don't think I'm using the regular expression correctly.
I don't want to use an array cause the array doesn't format the list correctly.

You can either iterate over directory entries using a loop, or read all the entries in the directory at once:
while (my $file = readdir(Dir)) {
print "$file\n" if $file !~ /^\./;
}
or
my @files = grep { !/^\./ } readdir Dir;
See perldoc -f readdir.

You're calling readdir() twice in a loop. Don't.
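There is a second reason the original line prints nothing useful: assigning grep to a scalar puts it in scalar context, where it returns the number of matches rather than the matching names, and its readdir argument consumes all remaining entries in one go. The sketch below shows the two contexts side by side, using an invented list of entries in place of a real directory handle.

```perl
use strict;
use warnings;

# Hypothetical directory entries, standing in for readdir() output.
my @entries = ('.', '..', '.bashrc', 'a.txt', 'b.txt');

# Scalar context: grep returns how many elements matched, not the names.
my $count = grep { !/^\./ } @entries;

# List context: grep returns the matching elements themselves.
my @visible = grep { !/^\./ } @entries;

print "count: $count\n";      # prints count: 2
print "$_\n" for @visible;    # prints a.txt and b.txt, one per line
```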

or like so:
#!/usr/bin/env perl -w
use strict;
opendir my $dh, '.';
print map {$_."\n"} grep {!/^\./} readdir($dh);

Use glob:
my @files = glob( "$mydir/*" );
print "@files\n";
See perldoc -f glob for details.

while ($file = readdir(Dir))
{
print "\n$file" if ( grep !/^\./, $file );
}
OR you can use a regular expression:
while ($file = readdir(Dir))
{
print "\n$file" unless ( $file =~ /^\./ );
}
