Bash multiple copy latest files with timestamp using regex - regex

please, I need a help with rename of multiple files. One application in our generating everyday 3 reports with filemask OPEN_REPORTn_yyyymmddHH24Miss.csv, e.g listing like this one:
/mnt/server/OPEN_REPORT1_20180604130922.csv
/mnt/server/OPEN_REPORT2_20180604130922.csv
/mnt/server/OPEN_REPORT3_20180604130922.csv
I want this files copy as
/mnt/server/OPEN_REPORT1.csv
/mnt/server/OPEN_REPORT2.csv
/mnt/server/OPEN_REPORT3.csv
and keep original files without change the name (so, that means that I must list only 3 last files)
I have this solution:
cp $(ls -t /mnt/server/OPEN_REPORT1_* | head -n1) /mnt/server/OPEN_REPORT1.csv
cp $(ls -t /mnt/server/OPEN_REPORT2_* | head -n1) /mnt/server/OPEN_REPORT2.csv
cp $(ls -t /mnt/server/OPEN_REPORT3_* | head -n1) /mnt/server/OPEN_REPORT3.csv
But this solution is not too effective because I'm using more cp command as I need. I want copy those files with only one use cp command and with regular expressions.
I'm trying solution like this one:
for file in $(ls -t /mnt/server/OPEN_REPORT?_??????????????.csv | head -n3); do echo ${file} | sed 's/OPEN_REPORT([0-9]{1})/$1/'; done
but result for echo doesn't looks fine.
Please any help with solution? Thanks for any advice
SOLUTION (thanks to David Peltier):
for file in $(ls -t /mnt/server/OPEN_REPORT?_??????????????.csv | head -n3); do cp $file ${file%_*}.csv; done

try this
for file in $(ls -1 /mnt/server/*.csv); do cp /mnt/server/$file /mnt/server/${file%_*}.csv;done
Bash can do replacement and you no longer need to use sed.
${var%Pattern}, ${var%%Pattern}
${var%Pattern} Remove from $var the shortest part of $Pattern that matches the back end of $var.
${var%%Pattern} Remove from $var the longest part of $Pattern that matches the back end of $var.
https://www.tldp.org/LDP/abs/html/parameter-substitution.html

Related

Pass sed output to mv

I'm trying to batch rename text files according to a string they contain.
I used sed to isolate the pattern with \( and \) as I couldn't get this to work in grep.
sed -i '' 's/<title>\(.*\)<\/title>/&/g' *.txt | mv *.txt $sed.txt
(the text I want to use as filename is between html title tags)`
Where I wrote $sed would be the output of sed.
hope that's clear!
A simple loop in bash can accomplish this. If each file is valid HTML, meaning you have only one <title> tag in the file, you can rename them all this way:
for file in *.txt; do
mv "$file" `sed -n 's/<title>\([^<]*\)<\/title>/\1/p;' $file| sed -e 's/[ ][ ]*/_/g'`.txt
done
So, if you have files 1.txt, 2.txt and 3.txt, each with cat, dog and my hippo in their TITLE tags, you'll end up with cat.txt, dog.txt and my_hippo.txt after the above loop.
EDIT: quoted initial $file in case there are spaces in filenames; and added a second sed to convert any spaces in the <title> tag to _'s in resulting filenames. NOTE the whitespace inside the []'s in the second sed command is a literal space and tab character.
You can enclose expression in grave accent characters (`) to make it insert its output to the place you want. Try:
mv *.txt `sed -i '' 's/<title>\(.*\)<\/title>/&/g' *.txt`.txt
It is rather not flexible, but should work.
(I haven't used it in a while and cannot test it now, so I might be wrong).
Here is the command I would use:
for i in *.txt ; do
sed "s=<title>\(.*\)</title>=mv '$i' '\1'=e" $i
done
The sed substitution search for pattern in each one of your .txt files. For each file it creates string mv 'file_name' 'found_pattern'.
With the e command at the end of sed commands, this resulting string is directly executed in terminal, thus it renames your files.
Some hints:
Note the use of =s instead of /s as delimiters for sed substition: it's more readable as you already have /s in your pattern (you could use many other symbols if you don't like =). And in this way you don't have to escape the / in your pattern.
The e command for sed executes the created string.
(I'm speaking of this one below:
sed "s=<title>\(.*\)</title>=mv '$i' '\1'=e" $i
^
)
So use it with caution! I would recommand to first use the line without final e: it won't execute any mv command, but just print instead what would be executed if you were to add the e.
What I read from your question is:
you have a number of text (html) files in a directory
each file contains at least the tag <title> ... </title>
you want to extract the content (elements.text) and use it as filename
last you want to rename that file to the extracted filename
Is this correct?
So, then you need to loop through the files, e.g. with xargs or find
ls '*.txt' | xargs -i\{\} command "{}" ...
find -maxdepth 1 -type f -name '*.txt' -exec command "{}" ... \;
I always replace the xargs substitues by -i\{\} because the resulting command is compatible if I use it sometimes with find and its substitute {}.
Next the -maxdepth option will help find not to dive deeper in directory, if no subdir, you can leave it out.
command could be something very simple like echo "Testing File: {}" or a really small script if you use it with bash:
find . -name '*.txt' -exec bash -c 'CUR_FILE="{}"; echo "Working on: $CUR_FILE"; ls -l "$CUR_FILE";' \;
The big decision for your question is: how to get the text from title element.
A simple solution (suitable if opening and closing tag is on same textline) would be by grep
A more solid solution is to use a HTML Parser and navigate by DOM operation
The simple solution base on:
get the title line
remove the everything before and after title content
So do it together:
ls *.txt | xargs -i\{\} bash -c 'TITLE=$(egrep "<title>[^<]*</title>" "{}"); NEW_FNAME=$(echo "$TITLE" | sed -e "s#.*<title>\([^<]*\)</title>.*#\1#"); mv -v "{}" "$NEW_FNAME.txt"'
Same with usage of find:
find . -maxdepth 1 -type f -name '*.txt' -exec bash -c 'TITLE=$(egrep "<title>[^<]*</title>" "{}"); NEW_FNAME=$(echo "$TITLE" | sed -e "s#.*<title>\([^<]*\)</title>.*#\1#"); mv -v "{}" "$NEW_FNAME.txt"' \;
Hopefully it is what you expected.

Find directories with names matching pattern and move them

I have a bunch of directories like 001/ 002/ 003/ mixed in with others that have letters in their names. I just want to grab all the directories with numeric names and move them into another directory.
I try this:
file */ | grep ^[0-9]*/ | xargs -I{} mv {} newdir
The matching part works, but it ends up moving everything to the newdir...
I am not sure I understood correctly but here is at least something to help.
Use a combination of find and xargs to manipulate lists of files.
find -maxdepth 1 -regex './[0-9]*' -print0 | xargs -0 -I'{}' mv "{}" "newdir/{}"
Using -print0 and -0 and quoting the replacement symbol {} make your script more robust. It will handle most situations where non-printable chars are presents. This basically says it passes the lines using a \0 char delimiter instead of a \n.
mv is not powerfull enough by itself. It cannot work on patterns.
Try this approach: Rename multiple files by replacing a particular pattern in the filenames using a shell script
Either use a loop or a rename command.
With loop and array,
Your script would be something like this:
#!/bin/bash
DIR=( $(file */ | grep ^[0-9]*/ | awk -F/ '{print $1}') )
for dir in "${DIR[#]}"; do
mv $dir /path/to/DIRECTORY
done

Regex to rename all files recursively removing everything after the character "?" commandline

I have a series of files that I would like to clean up using commandline tools available on a *nix system. The existing files are named like so.
filecopy2.txt?filename=3
filecopy4.txt?filename=33
filecopy6.txt?filename=198
filecopy8.txt?filename=188
filecopy3.txt?filename=19
filecopy5.txt?filename=1
filecopy7.txt?filename=5555
I would like them to be renamed removing all characters after and including the "?".
filecopy2.txt
filecopy4.txt
filecopy6.txt
filecopy8.txt
filecopy3.txt
filecopy5.txt
filecopy7.txt
I believe the following regex will grab the bit I want to remove from the name,
\?(.*)
I just can't figure out how to accomplish this task beyond this.
A bash command:
for file in *; do
mv $file ${file%%\?filename=*}
done
find . -depth -name '*[?]*' -exec sh -c 'for i do
mv "$i" "${i%[?]*}"; done' sh {} +
With zsh:
autoload zmv
zmv '(**/)(*)\?*' '$1$2'
Change it to:
zmv -Q '(**/)(*)\?*(D)' '$1$2'
if you want to rename dot files as well.
Note that if filenames may contain more than one ? character, both will only trim from the rightmost one.
If all files are in the same directory (ignoring .dotfiles):
$ rename -n 's/\?filename=\d+$//' -- *
If you want to rename files recursively in a directory hierarchy:
$ find . -type f -exec rename -n 's/\?filename=\d+$//' {} +
Remove -n option, to do the renaming.
I this case you can use the cut command:
echo 'filecopy2.txt?filename=3' | cut -d? -f1
example:
find . -type f -name "*\?*" -exec sh -c 'mv $1 $(echo $1 | cut -d\? -f1)' mv {} \;
You can use rename if you have it:
rename 's/\?.*$//' *
I use this after downloading a bunch of files where the URL included parameters and those parameters ended up in the file name.
This is a Bash script.
for file in *; do
mv $file ${file%%\?*};
done

BASH - find specific folder with find and filter with regex

I have a folder containing many folders with subfolder (/...) with the following structre:
_30_photos/combined
_30_photos/singles
_47_foo.bar
_47_foo.bar/combined
_47_foo.bar/singles
_50_foobar
With the command find . -type d -print | grep '_[0-9]*_' all folder with the structure ** will be shown. But I have generate a regex which captures only the */combined folders:
_[0-9]*_[a-z.]+/combined but when I insert that to the find command, nothing will be printed.
The next step would be to create for each combined folder (somewhere on my hdd) a folder and copy the content of the combined folder to the new folder. The new folder name should be the same as the parent name of the subfolder e.g. _47_foo.bar. Could that be achieved with an xargs command after the search?
You do not need grep:
find . -type d -regex ".*_[0-9]*_.*/combined"
For the rest:
find . -type d -regex "^\./.*_[0-9]*_.*/combined" | \
sed 's!\./\(.*\)/combined$!& /somewhere/\1!' | \
xargs -n2 cp -r
With basic grep you will need to escape the +:
... | grep '_[0-9]*_[a-z.]\+/combined'
Or you can use the "extended regexp" version (egrep or grep -E [thanks chepner]) in which the + does not have to be escaped.
xargs may not be the most flexible way of doing the copying you describe above, as it is tricky to use with multiple commands. You may find more flexibility with a while loop:
... | grep '_[0-9]*_[a-z.]\+/combined' | while read combined_dir; do
mkdir some_new_dir
cp -r ${combined_dir} some_new_dir/
done
Have a look at bash string manipulation if you want a way to automate the name of some_new_dir.
target_dir="your target dir"
find . -type d -regex ".*_[0-9]+_.*/combined" | \
(while read s; do
n=$(dirname "$s")
cp -pr "$s" "$target_dir/${n#./}"
done
)
NOTE:
this fails if you have linebreaks "\n" in your directory names
this uses a subshell to not clutter your env - inside a script you don't need that
changed the regex slightly: [0-9]* to [0-9]+
You can use this command:
find . -type d | grep -P "_[0-9]*_[a-z.]+/combined"

BASH script to *create* filenames with spaces from filenames with "%20"

First, I know this sounds ass backwards. It is. But I'm looking to convert (on the BASH command line) a bunch of script-generated thumbnail filenames that do have a "%20" in them to the equivalent without filenames. In case you're curious, the reason is because the script I'm using created the thumbnail filenames from their current URLs, and it added the %20 in the process. But now WordPress is looking for files like "This%20Filename.jpg" and the browser is, of course, removing the escape character and replacing it with spaces. Which is why one shouldn't have spaces in filenames.
But since I'm stuck here, I'd love to convert my existing thumbnails over. Next, I will post a question for help fixing the problem in the script mentioned above. What I'm looking for now is a quick script to do the bad thing and create filenames with spaces out of filenames with "%20"s.
Thanks!
If you only want to replace each literal %20 with one space:
for i in *; do
mv "$i" "${i//\%20/ }"
done
(for instance this will rename file%with%20two%20spaces to file%with two spaces).
You'll probably need to apply %25->% too though, and other similar transforms.
convmv can do this, no script needed.
$ ls
a%20b.txt
$ convmv --unescape *.txt --notest
mv "./a%20b.txt" "./a b.txt"
Ready!
$ ls
a b.txt
personally, I don't like file names with spaces - beware you will have to treat them specially in future scripts. Anyway, here is the script that will do what you want to achieve.
#!/bin/sh
for fname in `ls *%20*`
do
newfname=`echo $fname | sed 's/%20/ /g'`
mv $fname "$newfname"
done;
Place this to a file, add execute permission and run this from the directory where you have file with %20 in their names.
Code :
#!/bin/bash
# This is where your files currently are
DPATH="/home/you/foo/*.txt"
# This is where your new files will be created
BPATH="/home/you/new_foo"
TFILE="/tmp/out.tmp.$$"
[ ! -d $BPATH ] && mkdir -p $BPATH || :
for f in $DPATH
do
if [ -f $f -a -r $f ]; then
/bin/cp -f $f $BPATH
sed "s/%20/ /g" "$f" > $TFILE && mv $TFILE "$f"
else
echo "Error: Cannot read $f"
fi
done
/bin/rm $TFILE
Not bash, but for the more general case of %hh (encoded hex) in names.
#!/usr/bin/perl
foreach $c(#ARGV){
$d=$c;
$d=~s/%([a-fA-F0-9][a-fA-F0-9])/my $a=pack('C',hex($1));$a="\\$a"/eg;
print `mv $c $d` if ($c ne $d);
}