Using rename and grep patterns to rename files in Terminal - regex

I am trying to change the name of a number of files in a directory using rename, but the pattern that I'm using is not working with rename, even though it works in my text editor (BBEdit). I would like to know how to modify the rename command I'm using or the pattern so that I can get rid of the long prefix that each file has.
My directory listing looks like this:
bos012_attempt_2018-02-15-01-52-18_KIC Document 0001.pdf
gem512_attempt_2018-02-14-20-30-11_Geo HW 2.pdf
kgs252_attempt_2018-02-14-23-35-03_kgs252_hw2.pdf
nrs728_attempt_2018-02-15-10-04-42_mids.png
oko018_attempt_2018-02-15-23-57-57_Hw2.pdf
I want to change this to
KIC Document 0001.pdf
Geo HW 2.pdf
kgs252_hw2.pdf
mids.png
Hw2.pdf
Using rename -vs 's/\D\D\D\d\d\d_attempt_2018-\d\d-\d\d-\d\d-\d\d-\d\d_/''/g' *
produces no changes in the names. Nevertheless, changing the pattern \D\D\D\d\d\d_attempt_2018-\d\d-\d\d-\d\d-\d\d-\d\d_ to no character works just fine in my text editor. I've tried different things, e.g.
rename -vs \D\D\D\d\d\d_attempt_2018-\d\d-\d\d-\d\d-\d\d-\d\d_ '' *
and nothing.
mydir $ rename -nvs \D\D\D\d\d\d_attempt_2018-\d\d-\d\d-\d\d-\d\d-\d\d_ '' *
returns
Using expression: sub { use feature ':5.18';
s/\Q${\"DDDddd_attempt_2018\-dd\-dd\-dd\-dd\-dd_"}// }
'bos012_attempt_2018-02-15-01-52-18_KIC_Document_0001.pdf' unchanged
'gem512_attempt_2018-02-14-20-30-11_Geo_HW_2.pdf' unchanged
'nrs728_attempt_2018-02-15-10-04-42_mids.png' unchanged
'oko018_attempt_2018-02-15-23-57-57_Hw2.pdf' unchanged

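Your attempts fail because -s asks rename for a plain-text substitution. The dry-run output shows the pattern wrapped in \Q, so nothing in it is treated as a regex (and, where the pattern was unquoted, the shell had already removed the backslashes). Drop -s and pass a quoted Perl expression instead. Depending on your version of rename, you may use this command: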
rename -n 's/.*_attempt_\d{4}(-\d{2}){5}_//' *.{pdf,png}
'bos012_attempt_2018-02-15-01-52-18_KIC Document 0001.pdf' would be renamed to 'KIC Document 0001.pdf'
'gem512_attempt_2018-02-14-20-30-11_Geo HW 2.pdf' would be renamed to 'Geo HW 2.pdf'
'kgs252_attempt_2018-02-14-23-35-03_kgs252_hw2.pdf' would be renamed to 'kgs252_hw2.pdf'
'oko018_attempt_2018-02-15-23-57-57_Hw2.pdf' would be renamed to 'Hw2.pdf'
'nrs728_attempt_2018-02-15-10-04-42_mids.png' would be renamed to 'mids.png'
If you are satisfied with the output, remove the -n (dry-run) option to actually rename the files.
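If rename is not installed, or behaves differently on your system, the same prefix can be stripped with plain bash parameter expansion. This is only a sketch and not part of the answer above. The function name strip_attempt_prefix and its "run" argument are made up for illustration, and it assumes every prefix has the shape shown in the listing (note that ? matches any character, not only digits).

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip "<id>_attempt_YYYY-MM-DD-hh-mm-ss_" from the
# names of files in the current directory.
# Pass "run" to rename; with any other argument it only prints the plan.
strip_attempt_prefix() {
    local mode=${1:-dry} f new
    for f in *_attempt_????-??-??-??-??-??_*; do
        [ -e "$f" ] || continue      # the glob matched nothing
        # '#' removes the shortest match of the pattern from the front
        new=${f#*_attempt_????-??-??-??-??-??_}
        if [ "$mode" = run ]; then
            mv -n -- "$f" "$new"     # -n: never overwrite an existing file
        else
            printf '%s -> %s\n' "$f" "$new"
        fi
    done
}

strip_attempt_prefix        # dry run; "strip_attempt_prefix run" renames
```

Because the shortest match is removed, a name such as kgs252_attempt_2018-02-14-23-35-03_kgs252_hw2.pdf keeps the second kgs252 and becomes kgs252_hw2.pdf.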

Related

Rename multiple files with different names to same name and different numbers

I have multiple pictures of trucks with random messy names and different formats (jpeg, jpg, png etc.) and I want to rename them to "truck1.jpeg", "truck2.jpg", "truck3.png" and so on. How do I do it using the rename command?
It's probably easier to use bash and mv, since AFAIK you need something like bash to generate the number sequence. In bash
i=1
for x in *; do
echo "$x -> truck$i.${x##*.}"
mv "$x" "truck$i.${x##*.}" && i=$((i+1))
done
The for x in * operates on all files whose names do not begin with a dot and are in the current directory. You can adjust the glob to be more exclusive, but this script will need modification if the files are in other directories. Again, probably easier to collect the files in one directory, or maybe put it in a script file and execute it in multiple directories using find ... -exec.
This uses i as a counter to generate the digits. The trick is the ${x##*.} expression which takes the file name and deletes everything up to the final dot. This allows you to preserve and reattach the file extension to the new name. You have to be careful to set i correctly or you will overwrite old truck1 files with new ones.
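One way to avoid the overwrite problem is to let the loop skip numbers that are already taken. The following is a sketch under a few assumptions, not a drop-in replacement: the function name number_trucks is made up, files without a dot in their name are ignored, and anything already called truck<digit>... is left alone.

```shell
#!/usr/bin/env bash
# Hypothetical variant of the loop above that does not clobber existing
# truckN.* files in the current directory.
number_trucks() {
    local i=1 x
    for x in *.*; do
        [ -f "$x" ] || continue
        case $x in truck[0-9]*.*) continue ;; esac   # already numbered
        # Advance i until no file named truck$i.<anything> exists.
        # "set --" puts the glob result into $1 (function-local arguments).
        while set -- truck"$i".* && [ -e "$1" ]; do
            i=$((i+1))
        done
        mv -n -- "$x" "truck$i.${x##*.}"             # -n: never overwrite
        i=$((i+1))
    done
}
```

The glob *.* is expanded once, before the first rename, so files created by the loop are not picked up again in the same run.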

rsync --exclude-from 'list' file not working

I am trying to use rsync to complete an unfinished transfer from a remote server to a local machine using
rsync -a user@domain.com:~/source/ /dest/
where /dest/ is the location of the partially completed transfer. However, due to bandwidth concerns I need to run rsync to a /tmp_dest/ on a different machine that does not have a copy of /dest/, from where I can then later move /tmp_dest/ to /dest/
The solution I have come up with thus far is to use rsync's --exclude-from option, using a file containing a complete list of files from /dest/.
The command would look something like this
rsync -a --exclude-from 'list.txt' user@domain.com:~/source/ /tmp_dest/
At this point I feel as though I have scoured everywhere for a solution and tried every variant I came across.
This included relative and absolute paths for the 'list.txt'
relative:
path 1/file 1
path 2/file 2
--or--
absolute:
/absolute/source/path 1/file 1
/absolute/source/path 2/file 2
I have tried the above with combinations of including - to explicitly exclude that line (where I have seen examples of people wanting to also + other files)
- /absolute/source/path 1/file 1
- /absolute/source/path 2/file 2
I have tried putting leading **/ in front of the file paths to rectify the relative path problem
**/path 1/file 1
**/path 2/file 2
I have also tried navigating to the directory containing 'list' and executing rsync from there, to avoid the issue where rsync looks for
/path/to/the/list/something1/to.exclude
/path/to/the/list/something2/to.exclude
/path/to/the/list/something3/to.exclude
and undoubtedly finding nothing
I have also ensured that the correct line breaks are being used in the 'list' file, i.e. LF (Unix) line breaks.
I have tried to create the 'list' with the following command
find . -type f | tee list.txt
this initially created a file looking something like this
./yyyy-mm-dd folder 1/sub folder [foo]/file.a
./(yyyy) folder 2 {foo2}/file.b
./folder, 3/sub-folder 3/file.c
As you can see, there are spaces and other characters in the file paths. From my current understanding this shouldn't matter, but perhaps I am mistaken and need to escape any characters with special meaning, in which case I may need help with that.
I then perform a replace on ./ in Notepad++ (or some other text editor that preserves the LF (Unix) line breaks) to get the desired result.
For example, as above, I've tried replacing ./ with nothing, with /absolute/path/for/source/ (note the leading slash), and even with double wildcards to match any parent tree structure containing the files.
The only thing I feel that I haven't tried is escaping the spaces in the file names and paths, but I have read that this shouldn't be an issue.
Perhaps I am overlooking something and any help would be appreciated.
Here is how the rsync man page describes --exclude-from:
--exclude-from=FILE read exclude patterns from FILE
Use the following command:
rsync -a --exclude-from=list.txt user@domain.com:~/source/ /tmp_dest/
It is also better to give the full path of the list.txt file.
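On the question of special characters: in rsync exclude patterns, *, ? and [ are wildcards, and a pattern that starts with / is anchored to the root of the transfer. A path such as "sub folder [foo]/file.a" is therefore read as a character class unless the brackets are escaped. Below is a sketch of how the list could be generated with that in mind. It is not taken from the answer above: make_exclude_list is a made-up name, the paths in the usage comment are placeholders, and file names containing newlines are not handled.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one anchored exclude pattern per file under "$1".
make_exclude_list() {
    ( cd "$1" && find . -type f ) |
        sed -e 's|^\./|/|' \
            -e '/[][*?]/ s/[][*?\\]/\\&/g'
    # 1st expression: "./a/b" becomes "/a/b" (anchored at the transfer root)
    # 2nd expression: only on lines that contain a wildcard character,
    #                 put a backslash before each of [ ] * ? and backslash
}

# Assumed usage (run the first line where /dest/ lives, copy the list over,
# then check with -n, which is rsync's dry-run option, before the real run):
#   make_exclude_list /dest > /full/path/to/list.txt
#   rsync -a -n -v --exclude-from=/full/path/to/list.txt user@domain.com:~/source/ /tmp_dest/
```

Spaces need no escaping because each line of the file is one pattern. The leading / also keeps a path that happens to begin with "- " or "+ " from being read as a rule prefix.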

Iterating over directory with specified path in Bash

pathToBins=$1
bins="${pathToBins}contigs.fa.metabat-bins-*"
for fileName in $bins
do
echo $fileName
done
My goal is to attach a path to my file name. I can iterate over a folder and get the file name when I don't attach the path. My challenge is that when I add the path and echo fileName, my regular expression no longer works and I get "/home/erikrasmussen/Desktop/Script/realLargeMetaBatBinscontigs.fa.metabat-bins-*", where the regular expression '*' is treated like a string. How can I get the path and also the full file name while iterating over a folder of files?
Although I don't really know how your files are arranged on your hard drive, a casual glance at "/home/erikrasmussen/Desktop/Script/realLargeMetaBatBinscontigs.fa.metabat-bins-*" suggests that it is missing a / before contigs. If that is the case, then you should change your definition of bins to:
bins="${pathToBins}/contigs.fa.metabat-bins-*"
However, it is much more robust to use bash arrays instead of relying on filenames to not include whitespace and metacharacters. So I would suggest:
bins=("${pathToBins}"/contigs.fa.metabat-bins-*)
for fileName in "${bins[@]}"
do
echo "$fileName"
done
Bash normally does not expand a pattern which doesn't match any file, so in that case you will see the original pattern. If you use the array formulation above, you could set the bash option nullglob, which will cause the unmatched pattern to vanish instead, leaving an empty array.
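A sketch of the nullglob variant, wrapped in a function so the option can be put back to whatever it was before. The name list_bins is made up for illustration. It also drops one trailing slash from the argument so that both /some/dir and /some/dir/ work.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the array loop above.
# Prints each matching path on its own line, or nothing if there is no match.
list_bins() {
    local pathToBins=${1%/}                  # drop one trailing slash, if any
    local restore bins fileName
    restore=$(shopt -p nullglob) || true     # remember the current setting
    shopt -s nullglob                        # unmatched pattern expands to nothing
    bins=("$pathToBins"/contigs.fa.metabat-bins-*)
    eval "$restore"                          # put nullglob back as it was
    for fileName in "${bins[@]}"; do
        printf '%s\n' "$fileName"
    done
}
```

Calling list_bins "$1" in the original script would then print the full paths, and an empty or mistyped directory produces no output instead of the literal pattern.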

Mass rename in shell script

I have a bunch of files which are of this format:
blabla.log.YYYY.MM.DD
Where YYYY.MM.DD is something like (2016.01.18)
I have quite a few folders with about 1000 files in each, so I wanted to have a simple script to rename them. I want to rename them to
blabla.log
So basically, I'm just stripping the date at the end. Here is what I have:
for f in [a-zA-Z]*.log.[0-9][0-9][0-9][0-9].[0-9][0-9].[0-9][0-9]; do
mv -v $f ${f#[0-9][0-9][0-9][0-9].[0-9][0-9].[0-9][0-9]};
done
This script outputs this:
mv: `blabla.log.2016.01.18' and `blabla.log.2016.01.18' are the same file
For more information:
I'm on windows, but I run this script in gitbash
For some reason, my gitbash doesn't recognize the "rename" command
Some regex patterns (like [0-9]{4}) don't seem to work
I'm really at a loss. Thanks.
EDIT: I need to rename every single file that has a date at the end and that is of the form: *.log.2016.01.18. They all need to keep their original names. All that should change is the removal of the date.
You have to use % instead of #: you want to remove from the end, not the start of your string.
Also, you're missing a . in what has to be removed, you don't want to end up with blabla.log..
Quoting the variable names prevents surprises when file names contain special characters.
Together:
mv -v "$f" "${f%.[0-9][0-9][0-9][0-9].[0-9][0-9].[0-9][0-9]}"
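To see the difference between # and % in isolation, and then the corrected line inside a complete loop, here is a small sketch. The variable date_glob and the function name strip_log_date are made up for illustration, and the -n flag on mv is an addition not present in the answer.

```shell
#!/usr/bin/env bash
date_glob='[0-9][0-9][0-9][0-9].[0-9][0-9].[0-9][0-9]'

f=blabla.log.2016.01.18
echo "${f#$date_glob}"     # '#' strips from the front: no match, name unchanged
echo "${f%.$date_glob}"    # '%' strips from the end: prints blabla.log

# Hypothetical helper: remove the trailing ".YYYY.MM.DD" from every dated
# log file in the current directory.
strip_log_date() {
    local f
    for f in *.log.$date_glob; do
        [ -e "$f" ] || continue               # the glob matched nothing
        mv -v -n -- "$f" "${f%.$date_glob}"   # -n: keep an existing target
    done
}
```

If a folder holds several dates for the same log name, they all map to the same target. With -n only the first one is renamed and the rest are left as they are, so such folders need a different naming scheme.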

Rename files using regular expression in linux

I have a set of files named like
20151016_174721.jpg
and I want to rename them like
2015-10-16 17.47.21.jpg
I tried using rename using the following:
rename -n "s/(\d{4})(\d{2})(\d{2})_(\d{2})(\d{2})(\d{2}).*$/$1-$2-$3 $4.$5.$6.jpg/" *.jpg
But it ends up telling me
20151016_174721.jpg renamed as -- ...jpg
And I cannot understand why.
You can use:
rename 's/(\d{4})(\d{2})(\d{2})_(\d{2})(\d{2})(\d{2})(.*)$/$1-$2-$3 $4.$5.$6$7/' *.jpg
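Make sure to use single quotes around the expression so that the shell does not try to expand $1, $2, etc. Inside double quotes the shell replaces them with its own (empty) positional parameters before rename ever sees them, which is why you got "-- ...jpg".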
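The quoting effect can be reproduced without rename at all. This small sketch only uses the replacement part of the expression. The variable names are made up, and "set --" is there to clear the positional parameters, as they would be in a fresh interactive shell.

```shell
#!/usr/bin/env bash
set --    # no positional parameters, so $1..$6 are empty for the shell

# Double quotes: the shell expands $1..$6 itself, before any command runs.
replacement_double="$1-$2-$3 $4.$5.$6.jpg"

# Single quotes: the text is passed through untouched, for Perl to interpret.
replacement_single='$1-$2-$3 $4.$5.$6.jpg'

printf '%s\n' "$replacement_double"   # prints: -- ...jpg
printf '%s\n' "$replacement_single"   # prints: $1-$2-$3 $4.$5.$6.jpg
```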