How to use a variable name in a Perl Regex bash script - regex

I am trying to write a bash shell script to rename a bunch of photos to my own numbering system. All images filenames are like "IMG_0000.JPG" and I can get the script to match and rename(overwrite) all the photos with the following Perl-regex code:
#!/bin/bash
rename -f 's/\w{4}\d{4}.JPG/replacement.jpg/' *.JPG;
But when I try to use a variable as the name of the replacement, as I keep seeing on other posts here and elsewhere on the internet, nothing happens:
#!/bin/bash
$replacement = "000.jpg";
rename -f 's/\w{4}\d{4}.JPG/$replacement/' *.JPG;
How can I get such a variable to work correctly in my bash script? (NOTE: I am not looking to simply strip the "IMG_" from the filename)

Take the replacement out of single quotes:
#!/bin/bash
$replacement="000.jpg"
rename -f 's/\w{4}\d{4}.JPG/'$replacement'/' *.JPG
Bash does not inspect single quoted strings for interpolation.

Using double quotes and correct variable assignment:
#!/bin/bash
replacement="000.jpg"
rename -f "s/\w{4}\d{4}\.JPG/$replacement/" *.JPG
Note that this can cause trouble, e.g. when renaming two files with names like IMG_0001.JPG and FOO_9352.JPG: The first file will be renamed to 000.jpg, then the second file will also be renamed to 000.jpg, overwriting the first.

Related

How to update path in Perl script in multiple files

I am working on creating some training material where I am using perl. One of the things I want to do is have the scripts be set up for the student correctly, regardless of where they extra the compressed files. I am working on a Windows batch file that will copy the perl templates to the working location and then update path in the copy of the perl template files to the correct location. The perl template have this as the first line:
#!_BASE_software/perl/bin/perl.exe
The batch file looks like this:
SET TRAINING=%~dp0
copy %TRAINING%\template\*.pl %TRAINING%work
%TRAINING%software\perl\bin\perl -pi.bak -e 's/_BASE_/%TRAINING%/g' %TRAINING%work\*.pl
I have a few problems with this:
Perl doesn't seem to like the wildcard in the filename
It turns out that %TRAINING% is going to expand into a string with backslashes which need to be converted into forwardslashes and needs to be escaped within the regex.
How do I fix this?
First of all, Windows doesn't use the shebang line, so I'm not sure why you're doing any of this work in the first place.
Perl will read the shebang line and look for options if perl is found in the path, even on Windows, but that means that #!perl is sufficient if you want to pass options via the shebang line (e.g. #!perl -n).
Now, it's possible that you use Cygwin, MSYS or some other unix emulation instead of Windows to run the program, but you are placing a Windows path in the shebang line (C:...) rather than a unix path, so that doesn't make sense either.
There are three additional problems with the attempt:
cmd uses double-quotes for quoting.
cmd doesn't perform wildcard expansion like sh, so it's up to your program do it.
You are trying to generate Perl code from cmd. ouch.
If we go ahead, we get:
"%TRAINING%software\perl\bin\perl" -MFile::DosGlob=glob -pe"BEGIN { #ARGV = map glob, #ARGV; $base = $ENV{TRAINING} =~ s{\\}{/}rg } s/_BASE_/$base/g" -i.bak -- %TRAINING%work\*.pl
If we add line breaks for readability, we get the following (that cmd won't accept):
"%TRAINING%software\perl\bin\perl"
-MFile::DosGlob=glob
-pe"
BEGIN {
#ARGV = map glob, #ARGV;
$base = $ENV{TRAINING} =~ s{\\}{/}rg
}
s/_BASE_/$base/g
"
-i.bak -- %TRAINING%work\*.pl

How to run list of perl regex from file in terminal

I'm fairly new to the whole coding game, and am very grateful for every answer!
I am working on a directory with many .txt files in them and have a file with looong list of regex like "perl -p -i -e 's/\n\n/\n/g' *.xml" they all work if I copy them to terminal. But is there a possibility to run them straight from the file?
I tried ./unicode.sh but that resulted in:
No such file or directory.
Any ideas?
Thank you so much!
Here's a (mostly) equivalent Perl script to the oneliner perl -p -i -e 's/\n\n/\n/g' *.xml (one main difference being that this has strict and warnings enabled, which is strongly recommended), which you could expand upon by putting more code to modify the current line in the body of the while loop.
#!/usr/bin/env perl
use warnings;
use strict;
if (!#ARGV) { # if no files on command line
#ARGV = glob('*.xml'); # get a default list of files
}
local $^I = ''; # enable inplace editing (like perl -i)
while (<>) { # read each line of each file into $_
s/\n\n/\n/g; # modify $_ with a regex
# more regexes here...
print; # write the line $_ back out
}
You can save this script in a file such as process.pl, and then run it with perl process.pl, or do chmod u+x process.pl and then run it via ./process.pl.
On the other hand, you really shouldn't modify XML files with regular expressions, there are lots of Perl modules to do XML processing - I wrote about that some more here. Also, in the example you showed, s/\n\n/\n/g actually won't have any effect, since when reading files line-by-line, no string will contain two \n's (you can change how Perl reads files, but I don't see any mention of that in the question).
Edit: You've named the script in your example unicode.sh - if you're processing Unicode files, then Perl has very powerful features to help with that, although the code won't necessarily end up as nice and short as I've showed above. You'll have to tell us some more about what you're doing, and show some example input and output, to get suggestions about that. See also e.g. perlunitut.
It's likely if you got no such file or directory, your problem was you forgot to make unicode.sh executable, as in chmod +x unicode.sh, assuming that's a script that you wrote.
Of course the normal way to run multiple perl commands is this thing that looks like runme.pl which you write, i.e., a perl script.
That said, yes, everything will work from the terminal, you just need to be careful about escaping that bash performs.

Mass rename in shell script

I have a bunch of files which are of this format:
blabla.log.YYYY.MM.DD
Where YYYY.MM.DD is something like (2016.01.18)
I have quite a few folders with about 1000 files in each, so I wanted to have a simple script to rename them. I want to rename them to
blabla.log
So basically, I'm just stripping the date at the end. Here is what I have:
for f in [a-zA-Z]*.log.[0-9][0-9][0-9][0-9].[0-9][0-9].[0-9][0-9]; do
mv -v $f ${f#[0-9][0-9][0-9][0-9].[0-9][0-9].[0-9][0-9]};
done
This script outputs this:
mv: `blabla.log.2016.01.18' and `blabla.log.2016.01.18' are the same file
For more information:
I'm on windows, but I run this script in gitbash
For some reason, my gitbash doesn't recognize the "rename" command
Some regex patterns (like [0-9]{4} don't seem to work)
I'm really at a lost. Thanks.
EDIT: I need to rename every single file that has a date at the end and that is of the from: *.log.2016.01.18. They all need to keep their original names. All that should change is the removal of the date.
You have to use % instead of #: you want to remove from the end, not the start of your string.
Also, you're missing a . in what has to be removed, you don't want to end up with blabla.log..
Quoting the variable names prevents surprises when file names contain special characters.
Together:
mv -v "$f" "${f%.[0-9][0-9][0-9][0-9].[0-9][0-9].[0-9][0-9]}"

batch renaming of files with perl expressions

This should be a basic question for a lot of people, but I am a biologist with no programming background, so please excuse my question.
What I am trying to do is rename about 100,000 gzipped data files that have existing name of a code (example: XG453834.fasta.gz). I'd like to name them to something easily readable and parseable by me (example: Xanthomonas_galactus_str_453.fasta.gz).
I've tried to use sed, rename, and mmv, to no avail. If I use any of those commands on a one-off script then they work fine, it's just when I try to incorporate variables into a shell script do I run into problems. I'm not getting any errors, just no names are changed, so I suspect it's an I/O error.
Here's what my files look like:
#! /bin/bash
# change a bunch of file names
file=names.txt
while IFS=' ' read -r r1 r2;
do
mmv ''$r1'.fasta.gz' ''$r2'.fasta.gz'
# or I tried many versions of: sed -i 's/"$r1"/"$r2"/' *.gz
# and I tried many versions of: rename -i 's/$r1/$r2/' *.gz
done < "$file"
...and here's the first lines of my txt file with single space delimiter:
cat names.txt
#find #replace
code1 name1
code2 name2
code3 name3
I know I can do this with python or perl, but since I'm stuck here working on this particular script I want to find a simple solution to fixing this bash script and figure out what I am doing wrong. Thanks so much for any help possible.
Also, I tried to cat the names file (see comment from Ashoka Lella below) and then use awk to move/rename. Some of the files have variable names (but will always start with the code), so I am looking for a find & replace option to just replace the "code" with the "name" and preserve the file name structure.
I suspect I am not escaping the variable within the single tick of the perl expression, but I have poured over a lot of manuals and I can't find the way to do this.
If you're absolutely sure than the filenames doesn't contain spaces of tabs, you can try the next
xargs -n2 < names.txt echo mv
This is for DRY run (will only print what will do) - if you satisfied with the result, remove the echo ...
If you want check the existence ot the target, use
xargs -n2 < names.txt echo mv -i
if you want NEVER allow overwriting of the target use
xargs -n2 < names.txt echo mv -n
again, remove the echo if youre satisfied.
I don't think that you need to be using mmv, a simple mv will do. Also, there's no need to specify the IFS, the default will work for you:
while read -r src dest; do mv "$src" "$dest"; done < names.txt
I have double quoted the variable names as it is generally considered good practice but in this case, a space in either of the filenames will result in read not working as you expect.
You can put an echo before the mv inside the loop to ensure that the correct command will be executed.
Note that in your file names.txt, the .fasta.gz suffix is already included, so you shouldn't be adding it inside the loop aswell. Perhaps that was your problem?
This should rename all files in column1 to column2 of names.txt. Provided they are in the same folder as names.txt
cat names.txt| awk '{print "mv "$1" "$2}'|sh

Linux cp with a regexp

I would like to copy some files in a directory, renaming the files but conserving extension. Is this possible with a simple cp, using regex ?
For example :
cp ^myfile\.(.*) mydir/newname.$1
So I could copy the file conserving the extension but renaming it. Is there a way to get matched elements in the cp regex to use it in the command ?
If not, I'll do a perl script I think, or if you have another way...
Thanks
Suppose you have myfile.a, myfile.b, myfile.c:
for i in myfile.*; do echo mv "$i" "${i/myfile./newname.}"; done
This creates (upon removal of echo) newname.a, newname.b, newname.c.
The shell doesn't understand general regexes; you'll have to outsource to auxiliary programs for that. The classical scripty way to solve your task would be something like
for a in myfile.* ; do
b=`echo $a | sed 's!^myfile!mydir/newname!'`
cp $a $b
done
Or have a perl script generate a list of commands that you then source into the shell.
I really like the regex syntax of the rename perl script (by Robin Barker and Larry Wall), e.g.:
rename "s/OldFile/NewFile/" OldFile*
OldFile.c and OldFile.h are renamed to NewFile.c and NewFile.h, respectively
I simply wanted the exact same thing with a copy command:
copy "s/OldFile/NewFile/" OldFile*
So I duplicated that script and changed the rename statement to copy via File::Copy. Et voila! A copy command with perl-regex syntax:
https://gist.github.com/jcward/0ead33bd79f2061c68728cc82582241f