Regex to rename all files recursively removing everything after the character "?" commandline - regex

I have a series of files that I would like to clean up using commandline tools available on a *nix system. The existing files are named like so.
filecopy2.txt?filename=3
filecopy4.txt?filename=33
filecopy6.txt?filename=198
filecopy8.txt?filename=188
filecopy3.txt?filename=19
filecopy5.txt?filename=1
filecopy7.txt?filename=5555
I would like them to be renamed removing all characters after and including the "?".
filecopy2.txt
filecopy4.txt
filecopy6.txt
filecopy8.txt
filecopy3.txt
filecopy5.txt
filecopy7.txt
I believe the following regex will grab the bit I want to remove from the name,
\?(.*)
I just can't figure out how to accomplish this task beyond this.

A bash command:
for file in *\?filename=*; do
mv -- "$file" "${file%%\?filename=*}"
done

find . -depth -name '*[?]*' -exec sh -c 'for i do
mv "$i" "${i%[?]*}"; done' sh {} +
With zsh:
autoload zmv
zmv '(**/)(*)\?*' '$1$2'
Change it to:
zmv -Q '(**/)(*)\?*(D)' '$1$2'
if you want to rename dot files as well.
Note that if filenames may contain more than one ? character, both will only trim from the rightmost one.
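To make the "rightmost ?" behaviour concrete, here is a minimal sketch of the two suffix-removal expansions. The sample name with two ? characters is made up for illustration.

```shell
# Hypothetical file name containing two "?" characters.
name='report.txt?filename=3?copy=1'

# %  removes the shortest matching suffix: trims from the rightmost "?".
short="${name%\?*}"
# %% removes the longest matching suffix: trims from the leftmost "?".
long="${name%%\?*}"

printf '%s\n' "$short" "$long"
```

If everything after the first ? should go, %% is the variant to use in the loops above.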

If all files are in the same directory (ignoring .dotfiles):
$ rename -n 's/\?filename=\d+$//' -- *
If you want to rename files recursively in a directory hierarchy:
$ find . -type f -exec rename -n 's/\?filename=\d+$//' {} +
Remove the -n option to actually perform the renaming.

In this case you can use the cut command:
echo 'filecopy2.txt?filename=3' | cut -d? -f1
example:
find . -type f -name "*\?*" -exec sh -c 'mv "$1" "$(echo "$1" | cut -d\? -f1)"' mv {} \;
You can use rename if you have it:
rename 's/\?.*$//' *

I use this after downloading a bunch of files where the URL included parameters and those parameters ended up in the file name.
This is a Bash script.
for file in *\?*; do
mv -- "$file" "${file%%\?*}"
done

Related

append epoch date at the beginning of a file in bash

I have a list of 20 files; 10 of them already have 1970-01-01- at the beginning of the name and 10 do not (the remaining ones all start with a lowercase letter).
So my task is to add the epoch date to the beginning of the files that do not have it yet. Using bash, the code below works, but I could not solve it using a regular expression, for example with rename. I had to extract the basename and then run mv on it. An elegant solution would use just one pipe instead of two.
Works
find ./ -regex './[a-z].*' | xargs -I {} basename {} | xargs -I {} mv {} 1970-01-01-{}
Hence looking for a solution with just one xargs or -exec?
You can just use a single rename command:
rename -n 's/^([a-z])/1970-01-01-$1/' *
Assuming you're operating on all the files present in current directory.
Note that -n flag (dry run) will only show intended actions by rename command but won't really rename any files.
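If you want to combine it with find, then use the following (GNU find's -execdir passes each name as ./name, hence the optional ./ in the pattern):
find . -maxdepth 1 -type f -name '[a-z]*.txt' -execdir rename -n 's/^(?:\.\/)?/1970-01-01-/' {} +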
I always prefer readable code over short code.
r() {
base=$(basename "$1")
dir=$(dirname "$1")
if [[ "$base" =~ ^1970-01-01- ]]
then
: "ignore, already has correct prefix"
else
echo mv "$1" "$dir/1970-01-01-$base"
fi
}
export -f r
find . -type f -exec bash -c 'r "$1"' _ {} \;
This also just prints out what would be done (for testing). Remove the echo before the mv to do the real thing.
Mind that mv will overwrite existing files (for example, if there is a ./a/b/c and a ./a/b/1970-01-01-c already). Pass the -i option to mv to be safe from this.
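The collision described above can be seen with a small sketch that only computes target names and renames nothing. The helper name prefixed is hypothetical, and it assumes paths in the ./dir/name form that find prints.

```shell
# prefixed PATH: print PATH with 1970-01-01- added to its basename,
# or PATH unchanged if the basename already carries the prefix.
# Assumes PATH contains at least one "/" (as find output does).
prefixed() {
    dir=${1%/*} base=${1##*/}
    case $base in
        1970-01-01-*) printf '%s\n' "$1" ;;
        *)            printf '%s/1970-01-01-%s\n' "$dir" "$base" ;;
    esac
}

prefixed ./a/b/c
prefixed ./a/b/1970-01-01-c
```

Both calls print the same target path, which is exactly the situation where a plain mv would silently replace the existing file.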

Filename match and export

I have several files in a folder. I have to match the filenames using a regex pattern. In the regex pattern I have a word which would be a variable. I want all the files matched with the pattern to be moved to a separate directory with an alternate filename replacing the string with which I had made the match.
Eg,
I have many files with filenames having the word foo in the directory like,
gadgeagfooafsa
fsafsaffooarwf
fasfsfoofsafff
I have to list these files and copy them to another directory, replacing the word foo in each name. I have specified the new pattern to be "kuh", so the above files should be copied to the new folder as
gadgeagkuhafsa
fsafsafkuharwf
fasfskuhfsafff
Finally, can I pipe different commands together to execute these in one line? :)
I had tried this command, but it didn't work, somehow the copy is failing.
ls | grep ".*foo[} ].*" | xargs cp -t work/
find + bash solution:
find . -type f -name "*foo*" -exec bash -c 'fn=${0##*/}; cp "$0" "new_dest/${fn//foo/kuh}"' {} \;
fn=${0##*/} - extracting file basename
${fn//foo/kuh} - substituting foo with kuh in filename
Replace/adjust new_dest with your current destination directory name.
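The two expansions can be tried on their own, without touching any files. This is a minimal bash sketch; the sample path is made up to resemble what find would pass to the inline script.

```shell
# Hypothetical path, in the form find would hand over.
path='./sub/dir/gadgeagfooafsa'

fn=${path##*/}        # strip the longest prefix ending in "/": the basename
new=${fn//foo/kuh}    # replace every occurrence of foo with kuh (bash syntax)

printf '%s -> %s\n' "$fn" "$new"
```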
I chose /tmp as the new destination, and only used two of the example files
newdest="/tmp"; fp="foo"; np="kuh"; for f in $(find . -type f -name "*$fp*"); do new=$(echo $f| sed "s/$fp/$np/g"); cp -f $f $newdest/$new ; done
which copies the files under their new names:
ls /tmp/*kuh*
/tmp/fsafsafkuharwf /tmp/gadgeagkuhafsa
If all the files are in same folder
with bash
for i in *foo* ;do mv "$i" /tmp/"${i/foo/kuh}";done

Bash script to Rename multiple files in subfolder to their folder name

I have the following file structure:
Applications/Snowflake/applications/Salford_100/wrongname_120.nui; wrongname_200_d.nui
Applications/Snowflake/applications/Salford_900/wrongname_120.nui; wrongname_200_d.nui
Applications/Snowflake/applications/Salford_122/wrongname_120.nui; wrongname_200_d.nui
And I want to rename the files to the same name as the directories they're in, but the files with "_d" at the end should retain their last 2 characters. The file pattern would always be "salford_xxx" where xxx is always 3 digits. So the resulting files would be:
Applications/Snowflake/applications/Salford_100/Salford_100.nui; Salford_100_d.nui
Applications/Snowflake/applications/Salford_900/Salford_900.nui; Salford_900_d.nui
Applications/Snowflake/applications/Salford_122/Salford_122.nui; Salford_122_d.nui
The script would run from a different location in
Applications/Snowflake/Table-updater
I imagine this would require a for loop and a sed regex, but I'm open to any suggestions.
(Thanks @ghoti for your advice)
I've tried this, which does not account for files with "_d" yet, and I get only one file renamed correctly. Some help would be appreciated.
cd /Applications/snowflake/table-updater/Testing/applications/salford_*
dcomp="$(basename "$(pwd)")"
for file in *; do
ext="${file##*.}"
mv -v "$file" "$dcomp.$ext"
done
I've now updated the script following @varun's advice (thank you), and it now also searches through all directories in the parent dir that contain salford in the name, leaving out the parent itself. Please see below
#!/bin/sh
#
# RenameToDirName2.sh
#
set -e
cd /Applications/snowflake/table-updater/Testing/Applications/
find salford* -maxdepth 1 -type d \( ! -name . \) -exec sh -c '(cd {} &&
(
dcomp="$(basename "$(pwd)")"
for file in *;
do ext="${file#*.}"
zz=$(echo $file|grep _d)
if [ -z $zz ]
then
mv -v "$file" "$dcomp.$ext"
else
mv -v "$file" "${dcomp}_d.$ext"
fi
done
)
)' ';'
The thing is, I've just realised that in these salford subdirectories there are other files with different extensions that I don't want renamed. I've tried putting in an else-if statement to restrict it to *.nui files only, using my $dcomp variable, like this
else
if file in $dcomp/*.nui
then
#continue...
But I get errors. Where should this go in my script and also do I have the correct syntax for this loop? Can you help?
You can write:
(
cd ../applications/ && \
for name in Salford_[0-9][0-9][0-9] ; do
mv "$name"/*_[0-9][0-9][0-9].nui "$name/$name.nui"
mv "$name"/*_[0-9][0-9][0-9]_d.nui "$name/${name}_d.nui"
done
)
(Note: the (...) is a subshell, to restrict the scope of the directory-change and of the name variable.)
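A minimal sketch of that scoping, with made-up variable values: whatever happens between the parentheses stays there.

```shell
name=outer
start=$PWD
(
    cd / && name=inner   # both changes happen only inside the subshell
)
# Back in the parent shell, neither change is visible.
printf '%s %s\n' "$name" "$PWD"
```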
@eggfoot, I have modified my script so that it looks through all the directories in the applications folder and picks those that have Salford in their name.
So you can call my script like this
./rename.sh /home/username/Applications/Snowflake e/applications
#!/bin/bash
# set -x
path=$1
dir_list=$(find $path/ -type d)
for index_dir in $dir_list
do
aa=$(echo $index_dir|grep Salford)
if [ ! -z $aa ]
then
files_list=$(find $index_dir/ -type f)
for index in $files_list
do
xx=$(basename $index)
z=$(echo $xx|grep '_d')
if [ -z $z ]
then
result=$(echo $index | sed 's/\/\(.*\)\/\(.*\)\/\(.*\)\(\..*$\)/\/\1\/\2\/\2\4/')
mv "$index" "$result"
else
result=$(echo $index | sed 's/\/\(.*\)\/\(.*\)\/\(.*\)_d\(\..*$\)/\/\1\/\2\/\2_d\4/')
mv "$index" "$result"
fi
done
fi
done
Regarding sed, the script uses the s command to substitute the directory name for the file name, keeping the extension as it is.
Regarding your script, you need to use the grep command to find the files which have _d, and then you can use parameter substitution, with one mv for files with _d and another for files without _d.
dcomp="$(basename "$(pwd)")"
for file in *; do
ext="${file##*.}"
zz=$(echo "$file" | grep _d)
if [ -z "$zz" ]
then
mv -v "$file" "$dcomp.$ext"
else
mv -v "$file" "${dcomp}_d.$ext"
fi
done
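As an alternative to calling grep for every file, the _d check can be done with a case pattern. This is only a sketch: new_name is a hypothetical helper, and unlike grep _d it treats a file as a "_d" file only when the part before the last dot ends in _d.

```shell
# new_name DIR FILE: print the name FILE should get inside DIR.
# (new_name is a hypothetical helper, not part of the scripts above.)
new_name() {
    dir=$1 file=$2
    ext=${file##*.}                                 # text after the last dot
    case ${file%.*} in
        *_d) printf '%s_d.%s\n' "$dir" "$ext" ;;    # stem ends in _d: keep the suffix
        *)   printf '%s.%s\n' "$dir" "$ext" ;;
    esac
}

new_name Salford_100 wrongname_120.nui
new_name Salford_100 wrongname_200_d.nui
```

Inside the loop, the result would be used as mv -v "$file" "$(new_name "$dcomp" "$file")". Looping over *.nui instead of * would leave files with other extensions alone.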

Pass sed output to mv

I'm trying to batch rename text files according to a string they contain.
I used sed to isolate the pattern with \( and \) as I couldn't get this to work in grep.
sed -i '' 's/<title>\(.*\)<\/title>/&/g' *.txt | mv *.txt $sed.txt
(the text I want to use as filename is between html title tags)
Where I wrote $sed would be the output of sed.
hope that's clear!
A simple loop in bash can accomplish this. If each file is valid HTML, meaning you have only one <title> tag in the file, you can rename them all this way:
for file in *.txt; do
mv "$file" "$(sed -n 's/<title>\([^<]*\)<\/title>/\1/p' "$file" | sed -e 's/[[:space:]][[:space:]]*/_/g').txt"
done
So, if you have files 1.txt, 2.txt and 3.txt, whose <title> tags contain cat, dog and my hippo respectively, you'll end up with cat.txt, dog.txt and my_hippo.txt after the above loop.
EDIT: quoted $file in case there are spaces in filenames, and added a second sed to convert any run of whitespace in the <title> text to _ in the resulting filename. NOTE: [[:space:]] in the second sed command matches any whitespace character, including space and tab.
You can enclose expression in grave accent characters (`) to make it insert its output to the place you want. Try:
mv *.txt `sed -i '' 's/<title>\(.*\)<\/title>/&/g' *.txt`.txt
It is rather not flexible, but should work.
(I haven't used it in a while and cannot test it now, so I might be wrong).
Here is the command I would use:
for i in *.txt ; do
sed "s=<title>\(.*\)</title>=mv '$i' '\1'=e" $i
done
The sed substitution searches for the pattern in each of your .txt files. For each file it creates the string mv 'file_name' 'found_pattern'.
With the e flag at the end of the s command, the resulting string is executed directly as a shell command, thus renaming your files. (The e flag is a GNU sed extension.)
Some hints:
Note the use of = instead of / as the delimiter for the sed substitution: it's more readable as you already have / in your pattern (you could use many other symbols if you don't like =). This way you don't have to escape the / in your pattern.
The e flag executes the created string.
(I'm speaking of this one below:
sed "s=<title>\(.*\)</title>=mv '$i' '\1'=e" $i
^
)
So use it with caution! I would recommend first running the line without the final e: it won't execute any mv command, but will just print what would be executed if you were to add the e.
What I read from your question is:
you have a number of text (html) files in a directory
each file contains at least the tag <title> ... </title>
you want to extract the content (elements.text) and use it as filename
last you want to rename that file to the extracted filename
Is this correct?
So, then you need to loop through the files, e.g. with xargs or find
ls *.txt | xargs -i\{\} command "{}" ...
find -maxdepth 1 -type f -name '*.txt' -exec command "{}" ... \;
I always write the xargs placeholder as -i\{\} because the resulting command line then stays compatible with find, which uses the same {} placeholder.
Next, the -maxdepth option keeps find from descending into subdirectories; if there are none, you can leave it out.
command could be something very simple like echo "Testing File: {}" or a really small script if you use it with bash:
find . -name '*.txt' -exec bash -c 'CUR_FILE="{}"; echo "Working on: $CUR_FILE"; ls -l "$CUR_FILE";' \;
The big decision for your question is: how to get the text from title element.
A simple solution (suitable if opening and closing tag is on same textline) would be by grep
A more solid solution is to use a HTML Parser and navigate by DOM operation
The simple solution is based on:
get the title line
remove everything before and after the title content
So do it together:
ls *.txt | xargs -i\{\} bash -c 'TITLE=$(egrep "<title>[^<]*</title>" "{}"); NEW_FNAME=$(echo "$TITLE" | sed -e "s#.*<title>\([^<]*\)</title>.*#\1#"); mv -v "{}" "$NEW_FNAME.txt"'
Same with usage of find:
find . -maxdepth 1 -type f -name '*.txt' -exec bash -c 'TITLE=$(egrep "<title>[^<]*</title>" "{}"); NEW_FNAME=$(echo "$TITLE" | sed -e "s#.*<title>\([^<]*\)</title>.*#\1#"); mv -v "{}" "$NEW_FNAME.txt"' \;
Hopefully it is what you expected.
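The two steps of the simple solution can also be folded into a single sed call and tested without renaming anything. This is a sketch under the same assumption as above (opening and closing tag on one line); title_of is a hypothetical helper name and the sample HTML is made up.

```shell
# title_of FILE: print the text of the first <title>...</title> that sits on one line.
# (title_of is a hypothetical helper name.)
title_of() {
    sed -n 's#.*<title>\([^<]*\)</title>.*#\1#p' "$1" | head -n 1
}

# Demo on a throwaway file; only prints the mv that would be run.
tmp=$(mktemp)
printf '%s\n' '<html><head><title>my hippo</title></head></html>' > "$tmp"
title=$(title_of "$tmp")
echo "would run: mv \"$tmp\" \"$title.txt\""
rm -f "$tmp"
```

An empty result from title_of means no title was found on a single line, so a real script should skip such files instead of renaming them to ".txt".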

How to rename all files in a folder removing everything after space character in linux?

Hello, I'm not good with regular expressions and I have been searching the Internet all day.
I have a folder with many pictures:
50912000 Bicchiere.jpg
50913714 Sottobottiglia Bernini.jpg
I'm using Mac OS X, but I can also try on Ubuntu. I would like to write a bash script that removes all the characters after the first space, to get a result like this:
50912000.jpg
50913714.jpg
For all the files in the folder.
Any help is appreciated.
Regards
Use pure BASH:
f='50912000 Bicchiere.jpg'
mv "$f" "${f/ *./.}"
Or use find to fix all the files at once:
find . -type f -name "* *" -exec bash -c 'f="$1"; s="${f/_ / }"; mv -- "$f" "${s/ *./.}"' _ '{}' \;
Use sed,
sed 's/ .*\./\./g'
Notice the space before .*
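The sed expression only transforms text, so it has to be combined with mv to rename anything. A minimal sketch of that combination follows; strip_after_space is a hypothetical helper name, and the loop only echoes the commands it would run.

```shell
# strip_after_space NAME: drop everything from the first space up to the last dot.
# (strip_after_space is a hypothetical helper name.)
strip_after_space() {
    printf '%s\n' "$1" | sed 's/ .*\././'
}

for f in '50912000 Bicchiere.jpg' '50913714 Sottobottiglia Bernini.jpg'; do
    # Remove the echo to actually rename; in a real folder, loop over *\ * instead.
    echo mv -- "$f" "$(strip_after_space "$f")"
done
```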
You can use a combination of find and a small script.
prompt> find . -name "* *" -exec move_it {} \;
mv "./50912000 Bicchiere.jpg" ./50912000
mv "./50913714 Sottobottiglia Bernini.jpg" ./50913714
prompt> cat move_it
#!/bin/sh
dst=`echo $1 | cut -c 1-10`
# remove the echo in the line below to actually rename the file
echo mv '"'$1'"' $dst
With rename
rename 's/\s.*\././' *