Escaping brackets in file names - regex

I've got a few files named like this: file (2).jpg. I'm writing a little Perl script to rename them, but I get errors because the brackets aren't being escaped. So, can someone tell me how to escape all the brackets (and spaces, if they cause a problem) in a string so I can pass it to a command? The script is below:
# Load all jpgs into an array.
@pix = `ls *.JPG`;
foreach $pix (@pix) {
    # Let you know it's working
    print "Processing photo " . $pix;
    $pix2 = $pix;
    $pix2 =~ \Q$pix\E; # Problem line
    # Use the program exiv2 to rename the file with timestamp
    system("exiv2 -r %Y_%m%d_%H%M%S $pix2");
}
The error is this:
Can't call method "Q" without a package or object reference at script.sh line [problem line].
This is my first time with regex, so I'm looking for the answers that explain what to do as well as giving an answer. Thanks for any help.

Why not use something simple like this?
find . -name \*.JPG -exec exiv2 -r "%Y_%m%d_%H%M%S" "{}" \;
Ps:
The \Q disables pattern metacharacters until \E inside the regex.
For example, if you want to match the path "../../../somefile.jpg", you can write:
$file =~ m:\Q../../../somefile.jpg\E:;
instead of
$file =~ m:\.\./\.\./\.\./somefile\.jpg:; # i.e. escaping all the dots, which are metacharacters in a regex.

I found this Perl renaming script that was written by Larry Wall a while back... it does what you need and so much more. I keep it in my $PATH and use it daily...
#!/usr/bin/perl -w
use Getopt::Std;
getopts('ht', \%cliopts);
do_help() if( $cliopts{'h'} );
#
# rename script examples from lwall:
# pRename.pl 's/\.orig$//' *.orig
# pRename.pl 'y/A-Z/a-z/ unless /^Make/' *
# pRename.pl '$_ .= ".bad"' *.f
# pRename.pl 'print "$_: "; s/foo/bar/ if <stdin> =~ /^y/i' *
$op = shift;
for (@ARGV) {
    $was = $_;
    eval $op;
    die $@ if $@;
    unless( $was eq $_ ) {
        if( $cliopts{'t'} ) {
            print "mv $was $_\n";
        } else {
            rename($was,$_) || warn "Cannot rename $was to $_: $!\n";
        }
    }
}
sub do_help {
    my $help = qq{
Usage examples for the rename script example from Larry Wall:
  pRename.pl 's/\.orig\$//' *.orig
  pRename.pl 'y/A-Z/a-z/ unless /^Make/' *
  pRename.pl '\$_ .= ".bad"' *.f
  pRename.pl 'print "\$_: "; s/foo/bar/ if <stdin> =~ /^y/i' *
CLI Options:
  -h  This help page
  -t  Test only, do not move the files
};
    die "$help\n";
    return 0;
}

Related

Shell script regex rename

Hi I am trying to rename 1000 of files from
xyzPL 7-1-16 page1+2(1).xlsx into 7-1-16.xlsx
xyzPL 12-1-16 page1+2(1).xls into 12-1-16.xls
xyzPL 12-10-16 page1+2(1).xls into 12-10-16.xls
So far I have the following for loop:
for f in *.xls; do echo mv "$f" "${f/_*_/_}"; done
What expression should I put in place of ${f/_*_/_}?
Thank you!
I'd suggest looking into the rename (or, on some platforms, prename) facility.
It's not part of bash itself but should be available in all the regular distros.
It allows Perl regular expressions to be used to rename files and will almost certainly be a good sight quicker than a bash-based for loop.
By way of example, the following command should handle the three cases you've shown:
rename -n 's/^xyzPL (\d+-\d+-\d+).*?\.(xlsx?)$/$1.$2/' xyzPL*.xls xyzPL*.xlsx
It captures the n-n-n bit into $1 and the file extension into $2 and then performs a simple (as if anything in Perl could be considered simple) substitution.
Note particularly the -n flag: it prints out what the command would do without actually doing anything. It's very useful for checking what will happen before committing to it.
Once you're satisfied it won't screw up everything, just run it again without the -n. Of course, being the paranoid type, I'd tend to back up the entire directory anyway.
A slightly souped-up version of the Perl-based rename command, originally from the 1st Edition of the Camel Book (Programming Perl, by Larry Wall).
#!/usr/bin/env perl
#
# @(#)$Id: rename.pl,v 1.8 2011/06/03 22:30:22 jleffler Exp $
#
# Rename files using a Perl substitute or transliterate command
use strict;
use warnings;
use Getopt::Std;
my(%opts);
my($usage) = "Usage: $0 [-fnxV] perlexpr [filenames]\n";
my($force) = 0;
my($noexc) = 0;
my($trace) = 0;
die $usage unless getopts('fnxV', \%opts);
if ($opts{V})
{
    printf "%s\n", q'RENAME Version $Revision: 1.8 $ ($Date: 2011/06/03 22:30:22 $)';
    exit 0;
}
$force = 1 if ($opts{f});
$noexc = 1 if ($opts{n});
$trace = 1 if ($opts{x});
my($op) = shift;
die $usage unless defined $op;
if (!@ARGV) {
    @ARGV = <STDIN>;
    chop(@ARGV);
}
for (@ARGV)
{
    if (-e $_ || -l $_)
    {
        my($was) = $_;
        eval $op;
        die $@ if $@;
        next if ($was eq $_);
        if ($force == 0 && -f $_)
        {
            print STDERR "rename failed: $was - $_ exists\n";
        }
        else
        {
            print "+ $was --> $_\n" if $trace;
            print STDERR "rename failed: $was - $!\n"
                unless ($noexc || rename($was, $_));
        }
    }
    else
    {
        print STDERR "$_ - $!\n";
    }
}
Without using the rename utility, you can do this in pure bash:
for file in *.xls*; do
    f="${file#* }"
    mv "$file" "${f/ *./.}"
done
You could do a subshell with the file name piped to awk...
$(echo "$f" | awk -F '[ .]' '{ print $2 "." $4 }')
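For reference, the two bash parameter expansions in the loop above do this (sample name from the question; bash-specific syntax):

```shell
file='xyzPL 7-1-16 page1+2(1).xlsx'
f="${file#* }"       # strip up to the first space    → '7-1-16 page1+2(1).xlsx'
echo "${f/ *./.}"    # replace ' page1+2(1).' with '.' → '7-1-16.xlsx'
```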

Excluding a file with perl grep

I want to go over all of the files in the directory, except for files ending with '.py'.
The line in the existing script is:
my @files = sort(grep(!/^(\.|\.\.)$/, readdir($dir_h)));
And I want something like:
my @files = sort(grep(!/^(\.|\.\.|"*.py")$/, readdir($dir_h)));
Can you please help with the exact syntax?
grep uses regular expressions, not globs (aka wildcards). The correct syntax is
my @files = sort(grep(!/^(\.|\.\.|.*\.py)$/, readdir($dir_h)));
or, without the unnecessary parentheses
my @files = sort grep ! /^(\.|\.\.|.*\.py)$/, readdir $dir_h;
As the parentheses in the regular expression aren't used for capturing, but only for precedence, you can change them to non-capturing:
my @files = sort grep ! /^(?:\.|\.\.|.*\.py)$/, readdir $dir_h;
You can express the same in many different ways, e.g.
/^\.{1,2}$|\.py$/
i.e. dot once or twice with nothing around, or .py at the end.
Perl's built-in grep is actually very clever - it iterates over an array, applying a condition to each element in turn, setting each element to $_ as it goes.
This condition can be a simple regular expression, but it doesn't have to be.
So you can - for example:
my @files = grep { -f $_ } readdir(DIR);
But because -f defaults to $_ you can also:
my @files = grep { -f } readdir (DIR);
You can also apply a regular expression to $_
my @files = grep { not m/\.py$/ } readdir (DIR);
(Note - this is the same as not $_ =~ m/\.py$/ - patterns apply to $_ by default).
So you can do what you want by:
my @files = sort grep { not m/\.py$/ and -f } readdir (DIR);
Although note - that will work in the current working directory, not for reading a separate path. You can use readdir for different directories, but personally I prefer glob - because it fills in the path as well:
my @files = sort grep { not m/\.py$/ and -f } glob ( "$dir/*" );
Check that the directory entries are files and then exclude those that end in .py:
#!/usr/bin/env perl
use warnings;
use strict;
my $dir = "/home/me/somedir";
# good examples in the perldoc:
# perldoc -f readdir
opendir(my $DIR, $dir) || die "Unable to open $dir : $!";
# -f checks that it is a plain file ( perldoc perlfunc )
# !~ means does not match ( perldoc perlre )
# m|\.py$| means a match string that ends in '.py'
my @files = sort grep { -f "$dir/$_" && $_ !~ m|\.py$| } readdir($DIR);

Find/Replace in files recursively but touch only files with matches

I would like to quickly search and replace, with or without regexes, in files recursively. In addition, I need to search only in specific files, and I do not want to touch the files that do not match my search_pattern, otherwise git will think all the parsed files were modified (which is what happens with find . -exec sed).
I tried many solutions that I found on the internet using find, grep, sed or ack, but I don't think they are really good at matching specific files only.
Eventually I wrote this perl script:
#!/bin/perl
use strict;
use warnings;
use File::Find;
my $search_pattern = $ARGV[0];
my $replace_pattern = $ARGV[1];
my $file_pattern = $ARGV[2];
my $do_replace = 0;
sub process {
    return unless -f;
    return unless /(.+)[.](c|h|inc|asm|mac|def|ldf|rst)$/;
    open F, $_ or print "couldn't open $_\n" && return;
    my $file = $_;
    my $i = 0;
    while (<F>) {
        if (m/($search_pattern)/o) {$i++};
    }
    close F;
    if ($do_replace and $i)
    {
        printf "found $i occurence(s) of $search_pattern in $file\n";
        open F, "+>".$file or print "couldn't open $file\n" && return;
        while (<F>)
        {
            s/($search_pattern)/($replace_pattern)/g;
            print F;
        }
        close F;
    }
}
find(\&process, ".");
My question is:
Is there a better solution, like this hypothetical one below (which does not exist)?
`repaint -n/(.+)[.](c|h|inc|asm|mac|def|ldf|rst)$/ s/search/replacement/g .`
Subsidiary questions:
How's my Perl script? Not too bad? Do I really need to reopen every file that matches my search_pattern?
How do people deal with this trivial task? Almost every good text editor has a "Search and Replace in files" feature, but not Vim. How do Vim users do this?
Edit:
I also tried this script ff.pl with ff | xargs perl -pi -e 's/foo/bar/g' but it doesn't work as I expected. It created a backup .bak even though I didn't give anything after the -pi. It seems this is the normal behaviour under Cygwin, but it means I cannot really use perl -pi -e:
#!/bin/perl
use strict;
use warnings;
use File::Find;
use File::Basename;
my $ext = $ARGV[0];
sub process {
    return unless -f;
    return unless /\.(c|h|inc|asm|mac|def|ldf|rst)$/;
    print $File::Find::name."\n" ;
}
find(\&process, ".");
Reedit:
I finally came across this solution (under Cygwin I need to remove the backup files):
find . | egrep '\.(c|h|asm|inc)$' | xargs perl -pi.winsucks -e 's/<search>/<replace>/g'
find . | egrep '\.(c|h|asm|inc)\.winsucks$' | xargs rm
The following is a cleaned up version of your code.
Always include use strict; and use warnings; at the top of EVERY Perl script. If you're doing file processing, include use autodie; as well.
Go ahead and slurp the entire file. That way you only have to read it once and, optionally, write it once.
Consider using File::Find::Rule for cases like this. Your implementation using File::Find works, and is probably the preferred module in this case, but I like the interface of the latter.
I removed the capture groups from the regex. The ones in the RHS were a bug, and the ones in the LHS were superfluous.
And the code:
use strict;
use warnings;
use autodie;
use File::Find;
my $search_pattern = $ARGV[0];
my $replace_pattern = $ARGV[1];
my $file_pattern = $ARGV[2];
my $do_replace = 0;
sub process {
    return if !-f;
    return if !/[.](?:c|h|inc|asm|mac|def|ldf|rst)$/;

    my $data = do {
        open my $fh, '<', $_;
        local $/;
        <$fh>;
    };

    my $count = $data =~ s/$search_pattern/$replace_pattern/g
        or return;

    print "found $count occurrence(s) of $search_pattern in $_\n";

    return if !$do_replace;

    open my $fh, '>', $_;
    print $fh $data;
    close $fh;
}
find(\&process, ".");
Not bad, but several minor notes:
$do_replace is always 0, so it will never replace.
The in-place open F, "+>" will not work on Cygwin + Windows.
m/($search_pattern)/o - the /o is good, the () is not needed.
$file_pattern is ignored; you overwrite it with your own pattern.
s/($search_pattern)/($replace_pattern)/g;
The () is unneeded and will actually disturb a counter in the $replace_pattern.
/(.+)[.](c|h|inc|asm|mac|def|ldf|rst)$/ should be written as
/\.(c|h|inc|asm|mac|def|ldf|rst)$/, and maybe with /i as well.
Do I really need to reopen every file that matches my search_pattern?
No, you don't.
I have no idea about Vim; I use Emacs, which has several methods to accomplish this.
What's wrong with the following command?
:grep foo **/*.{foo,bar,baz}
:cw
It won't cause any problem with any VCS and is pretty basic Vimming.
You are right that Vim doesn't come with a dedicated "Search and Replace in files" feature but there are plugins for that.
why not just:
grep 'pat' -rl *|xargs sed -i 's/pat/rep/g'
or I didn't understand the Q right?
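One caveat with that pipeline: file names containing spaces will break the plain xargs handoff. With GNU grep, xargs and sed, null-delimiting fixes it. A sketch on a throwaway directory:

```shell
d=$(mktemp -d)
printf 'pat here\n' > "$d/a file.txt"   # name with a space on purpose
printf 'nothing\n'  > "$d/b.txt"

# -Z / -0 pass names NUL-separated, so spaces survive;
# only files that actually match are rewritten.
(cd "$d" && grep -rlZ 'pat' . | xargs -0 sed -i 's/pat/rep/g')

cat "$d/a file.txt"
# → rep here
```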
I suggest find2perl. If it doesn't work out of the box, you can tweak the code it generates:
find2perl /tmp \! -name ".*?\.(c|h|inc|asm|mac|def|ldf|rst)$" -exec "sed -e s/aaa/bbb/g {}"
it will print the following code to stdout:
#! /usr/bin/perl -w
eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
    if 0; #$running_under_some_shell
use strict;
use File::Find ();
# Set the variable $File::Find::dont_use_nlink if you're using AFS,
# since AFS cheats.
# for the convenience of &wanted calls, including -eval statements:
use vars qw/*name *dir *prune/;
*name = *File::Find::name;
*dir = *File::Find::dir;
*prune = *File::Find::prune;
sub wanted;
sub doexec ($@);
use Cwd ();
my $cwd = Cwd::cwd();
# Traverse desired filesystems
File::Find::find({wanted => \&wanted}, '/tmp');
exit;
sub wanted {
    my ($dev,$ino,$mode,$nlink,$uid,$gid);
    (($dev,$ino,$mode,$nlink,$uid,$gid) = lstat($_)) &&
    ! /^\..*.?\\.\(c|h|inc|asm|mac|def|ldf|rst\)\$\z/s &&
    doexec(0, 'sed -e s/aaa/bbb/g {}');
}
sub doexec ($@) {
    my $ok = shift;
    my @command = @_; # copy so we don't try to s/// aliases to constants
    for my $word (@command)
        { $word =~ s#{}#$name#g }
    if ($ok) {
        my $old = select(STDOUT);
        $| = 1;
        print "@command";
        select($old);
        return 0 unless <STDIN> =~ /^y/;
    }
    chdir $cwd; #sigh
    system @command;
    chdir $File::Find::dir;
    return !$?;
}
If you want to execute it, you can pipe it to perl:
find2perl /tmp \! -name ".*?\.(c|h|inc|asm|mac|def|ldf|rst)$" -exec "sed -e s/aaa/bbb/g" | perl
You can try this plugin for Vim:
https://github.com/skwp/greplace.vim
Basically, it allows you to type in a search phrase (with/without regex) and asks you for the files to search in.

How can I strip all comments from a Perl script except for the shebang line?

I have a Perl script that strips comments from other Perl scripts:
open (INFILE, $file);
@data = <INFILE>;
foreach $data (@data)
{
    $data =~ s/#.*/ /g;
    print "$data";
}
The problem is, this code also removes the shebang line:
#!/usr/bin/perl
How can I strip comments except for the shebang?
Writing code to strip comments is not trivial, since the # character can be used in other contexts than just comments. Use perltidy instead:
perltidy --delete-block-comments --delete-side-comments foo
will strip # comments (but not POD) from file foo and write the output to foo.tdy. The shebang is not stripped.
There is a method PPR::decomment() that can be used:
use strict;
use warnings;
use PPR;
my $document = <<'EOF';
print "\n###################################\n";
print '\n###################################\n';
print '\nFollowed by comment \n'; # The comment
return $function && $function !~ /^[\s{}#]/;
EOF
my $res = PPR::decomment( $document );
print $res;
Output:
print "\n###################################\n";
print '\n###################################\n';
print '\nFollowed by comment \n';
return $function && $function !~ /^[\s{}#]/;
perltidy is the way to do this if it's anything but an exercise. There's also PPI for parsing Perl; you could use the PPI::Token::Comment token to do something more complicated than just stripping.
However, to answer your direct question: don't try to solve everything in a single regex. Instead, break your problem up into logical pieces. In this instance, if you want to skip the first line, do so by using line-by-line processing, which conveniently sets the current line number in $.
use strict;
use warnings;
use autodie;
my $file = '... your file...';
open my $fh, '<', $file;
while (<$fh>) {
if ($. != 1) {
s/#.*//;
}
print;
}
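The same skip-line-one logic works as a one-liner; here it is fed a two-line sample on stdin ($. is Perl's current input line number):

```shell
# Line 1 (the shebang) passes through untouched;
# the trailing comment on line 2 is stripped.
printf '#!/usr/bin/perl\nprint "hi"; # a comment\n' |
    perl -pe 's/#.*// if $. != 1'
```

The output keeps the #!/usr/bin/perl line intact while "# a comment" is removed.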
Disclaimer
The approach of using regexes for this problem is definitely flawed, as everyone has already said. However, I'm going to give your instructor the benefit of the doubt and assume she/he is aiming to teach by intentionally giving you a problem that is outside the purview of a regex's ability. Good luck finding all of those edge cases and figuring out how to deal with them.
Whatever you do, don't try to solve them using a single regex. Break your problem up and use lots of ifs and elsifs.
Since you asked for a regex solution:
'' =~ /(?{
system("perltidy", "--delete-block-comments", "--delete-side-comments", $file);
die "Can't launch perltidy: $!\n" if $? == -1;
die "perltidy killed by signal ".( $? & 0x7F )."\n" if $? & 0x7F;
die "perltidy exited with error ".( $? >> 8 )."\n" if $? >> 8;
});
It seems like you are leaning towards using the following:
#!/usr/bin/perl
while (<>) {
if ($. != 1) {
s/#.*//;
}
print;
}
But it doesn't work on itself:
$ chmod u+x stripper.pl
$ stripper.pl stripper.pl >stripped_stripper.pl
$ chmod u+x stripped_stripper.pl
$ stripped_stripper.pl stripper.pl
Substitution pattern not terminated at ./stripped_stripper.pl line 4.
$ cat stripped_stripper.pl
#!/usr/bin/perl
while (<>) {
if ($. != 1) {
s/
}
print;
}
It also fails to remove comments on the first line:
$ cat >first.pl
# This is my first Perl program!
print "Hello, World!\n";
$ stripper.pl first.pl
# This is my first Perl program!
print "Hello, World!\n";

Using BASH - Find CSS block or definition and print to screen

I have a number of .css files spread across some directories. I need to find those .css files, read them and if they contain a particular class definition, print it to the screen.
For example, I'm looking for ".ExampleClass" and it exists in /includes/css/MyStyle.css. I would want the shell command to print:
.ExampleClass {
color: #ff0000;
}
Use find to filter on all CSS files and execute a sed script on those files, printing lines between two regular expressions:
find ${DIR} -type f -name "*.css" -exec sed -n '/\.ExampleClass *{/,/}/p' \{\} \+
Considering that the CSS file can have multiline class definitions, and that there can be several occurrences in the same file, I'd bet Perl is the way to go.
For example:
# input: css filename, and css class name without dot (in the example, ExampleClass)
my ($filen,$classn) = @ARGV;
my $out = findclassuse($filen,$classn);
# print filename and result if non empty
print ("===== $filen : ==== \n" . $out . "\n") if ($out);

sub findclassuse {
    my ($filename,$classname) = @_;
    my $output = "";
    open(my $fh, '<', $filename) or die $!;
    $/ = undef; # so that I read the full file content
    my $css = <$fh>;
    $css =~ s#/\*.*?\*/# #g; # strip out comments
    close $fh;
    while($css =~ /([^}{]*\.$classname\b.*?{.*?})/gs) {
        $output .= "\n\n" . $1;
    }
    return $output;
}
But this is not 100% foolproof, there remains some issues with comments, and the css parsing is surely not perfect.
find /starting/directory -type f -name '*.css' | xargs -ti grep '\.ExampleClass' {}
will find all the css files, print the filename and search string and then print the results of the grep. You could pipe the output through sed to remove any unnecessary text.
ETA: the regex needs work if we want to catch multiline expressions. Likely the EOL character should be set to } so that complete classes are considered one line. If this were done, then piping the find output to perl -e rather than grep would be more effective.
Assuming you never do anything weird, like putting the opening brace on a separate line, or putting an unindented (nested) closing brace before the intended one, you can do this:
sed -n '/\.ExampleClass *{/,/^}/p' *.css
And if the files are all over a directory structure:
find . -name '*.css' | xargs sed ...
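A self-contained check of the range print, using a sample stylesheet invented for the demo:

```shell
cat > /tmp/demo.css <<'EOF'
.Other { color: blue; }
.ExampleClass {
  color: #ff0000;
}
EOF

# print from the line matching the selector to the next closing brace
sed -n '/\.ExampleClass *{/,/}/p' /tmp/demo.css
```

This prints only the .ExampleClass block, leaving .Other out.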
This version handles multi-line blocks as well as blocks on a single line:
sed -n '/^[[:space:]]*\.ExampleClass[[:space:]]*{/{p;q}; /^[[:space:]]*\.ExampleClass[[:space:]]*{/,/}/p'
Examples:
foo { bar }
or
foo {
    bar
}