Expression that removes files except for some - regex

This list of files:
FILE0001
FILE0002
FILE0003
FILE0004
FILE0005
FILE0006
FILE0007
FILE0008
FILE0009
FILE0010
I want to delete all except the following:
FILE0001
FILE0008
FILE0010
How do I write this expression?
A time-consuming expression is acceptable, because the files are large.
There are other files in that directory that must not be affected or removed, even ones with similar names.
Example:
FILE0001.1
FILE0002.2

bash patterns (http://www.gnu.org/software/bash/manual/bashref.html#Pattern-Matching)
shopt -s extglob
echo rm FILE00!(01|08|10)
remove the "echo" if you're satisfied.

GLOBIGNORE="FILE0001:FILE0008:FILE0010"
echo rm *
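For illustration (my sketch): GLOBIGNORE removes the listed names from every glob expansion, so the keepers vanish from *. Note that * still matches every other file in the directory, so this is only safe when nothing else lives there:

```shell
# Sketch: the keepers no longer appear in the expansion of *.
GLOBIGNORE="FILE0001:FILE0008:FILE0010"
echo *              # expands to everything except the three keepers
unset GLOBIGNORE    # setting GLOBIGNORE also changes dot-file matching; reset it
```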

Something like this should do the trick, assuming all the files are in the same directory where you execute the command and there are no other files or paths that you need to exclude:
find . -type f ! -name 'FILE0001' ! -name 'FILE0008' ! -name 'FILE0010' -exec rm {} +
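A tighter variant (my sketch): matching exactly four trailing digits and staying in the top-level directory protects names like FILE0002.2 and files in subdirectories:

```shell
# Assumes a find with -maxdepth (GNU/BSD); '{} +' batches the arguments.
find . -maxdepth 1 -type f -name 'FILE[0-9][0-9][0-9][0-9]' \
     ! -name 'FILE0001' ! -name 'FILE0008' ! -name 'FILE0010' \
     -exec rm {} +
```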


Is there a globbing pattern to match by file extension, both PWD and recursively?

I need to match files with one specific extension under all nested directories, including the PWD, with BASH using "globbing".
I do not need to match all files under all nested directories while excluding the PWD (a related question).
I need the match to work with commands other than grep when searching all directories by filename extension.
I do not need to grep recursively only in files with certain extensions (plural); that, too, is a different question.
shopt -s globstar; ls **/*.* is great for all files (not my question).
ls **/*.php does not match in the PWD.
shopt -s globstar; ls **/*.php returns duplicate files.
grep -r --include=\*.php "find me" ./ is specifically for grep, not globbing (consider this Question). It seems grep has --include=GLOB because this is not possible using globbing.
From this Answer (here), I believe there may not be a way to do this using globbing.
tl;dr
I need:
A glob expression
To match any command where simple globs can be used (ls, sed, cp, cat, chown, rm, et cetera)
Mainly in BASH, but other shells would be interesting
Both in the PWD and all subdirectories recursively
For files with a specific extension
I'm using grep & ls only as examples, but I need a glob expression that applies to other commands also.
grep -r --include=GLOB is not a glob expression for, say, cp; it is a workaround specific to grep and is not a solution.
find is not a glob, but it may be a workaround for non-grep commands if there is no such glob expression. It would need a pipe or a while loop, et cetera.
Examples
Suppose I have these files, all containing "find me":
./file1.js
./file2.php
./inc/file3.js
./inc/file4.php
./inc.php/file5.js
./inc.php/file6.php
I need to match only/all .php one time:
./file2.php
./inc/file4.php
./inc.php/file6.php
Duplicates returned: shopt -s globstar; ... **/*.php
This changes the problem; it does not solve it.
Dup: ls
Before entering shopt -s globstar as a single command...
ls **/*.php returns:
inc/file4.php
inc.php/file6.php
file2.php does not return.
After entering shopt -s globstar as a single command...
ls **/*.php returns:
file2.php
inc/file4.php
inc.php/file6.php
inc.php:
file5.js
file6.php
inc.php/file6.php returns twice.
Dup: grep
Before entering shopt -s globstar as a single command...
grep -R "find me" **/*.php returns:
inc/file4.php: find me
inc.php/file6.php: find me
file2.php does not return.
After entering shopt -s globstar as a single command...
grep -R "find me" **/*.php returns:
file2.php: find me
inc/file4.php: find me
inc.php/file5.js: find me
inc.php/file6.php: find me
inc.php/file6.php: find me
inc.php/file6.php returns twice.
Having seen the duplicate in the ls output, we know why.
Current solution: faulty misuse of && logic
grep -r "find me" *.php && grep -r "find me" */*.php
ls -l *.php && ls -l */*.php
Please no! If the first command fails, the && means the second one never runs.
Desired solution: single command via globbing
grep -r "find me" [GLOB]
ls -l [GLOB]
Insight from grep
grep does have the --include flag, which achieves the same result, but it is a flag specific to grep. ls has no --include option. This leads me to believe that there is no such glob expression, which is why grep has this flag.
With bash, you can first do a shopt -s globstar to enable recursive matching, and then the pattern **/*.php will expand to all the files in the current directory tree that have a .php extension.
zsh and ksh93 also support this syntax. Other commands that take a glob pattern as an argument and do their own expansion of it (like your grep --include) likely won't.
With shell globbing it is possible to get only directories by adding a / at the end of the glob, but there's no way to get only files (zsh being an exception)
Illustration:
With the given tree:
file.php
inc.php/include.php
lib/lib.php
Supposing that the shell supports the non-standard ** glob:
**/*.php/ expands to inc.php/
**/*.php expands to file.php inc.php inc.php/include.php lib/lib.php
For getting file.php inc.php/include.php lib/lib.php, you cannot use a glob.
=> with zsh it would be **/*.php(.)
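In bash, which has no equivalent of zsh's (.) qualifier, the closest workaround (my sketch) is to expand the glob and filter out non-files by hand:

```shell
# Sketch, bash only: keep plain files, drop directories such as inc.php/.
shopt -s globstar nullglob
files=()
for p in **/*.php; do
  [ -f "$p" ] && files+=("$p")
done
printf '%s\n' "${files[@]}"
```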
Standard work-around (any shell, any OS)
The POSIX way to recursively get the files that match a given standard glob and then apply a command to them is to use find -type f -name ... -exec ...:
ls -l <all .php files> would be:
find . -type f -name '*.php' -exec ls -l {} +
grep "finde me" <all .php files> would be:
find . -type f -name '*.php' -exec grep "finde me" {} +
cp <all .php files> ~/destination/ would be:
find . -type f -name '*.php' -exec sh -c 'cp "$@" ~/destination/' _ {} +
remark: This one is a little trickier because ~/destination/ needs to come after the file arguments, and find's syntax doesn't allow find -exec ... {} ~/destination/ +
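If GNU cp is available, its -t/--target-directory option names the destination first and sidesteps the ordering problem entirely (GNU-specific, so not portable):

```shell
# GNU coreutils cp only: -t puts the target up front, so find can
# append the batched file list at the end.
find . -type f -name '*.php' -exec cp -t ~/destination/ {} +
```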
Suggesting a different strategy:
Use an explicit find command to build bash command(s) on the selected files using the -printf option.
Inspect the commands for correctness, then run them.
1. prepare bash commands on the selected files
find . -type f -name "*.php" -printf "cp %p ~/destination/ \n"
2. inspect the output, correct the command or the filter, test
cp ./file2.php ~/destination/
cp ./inc/file4.php ~/destination/
cp ./inc.php/file6.php ~/destination/
3. execute the prepared find output
bash <<< $(find . -type f -name "*.php" -printf "cp %p ~/destination/ \n")
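One caution about this generated-script approach (my note): filenames containing spaces or shell metacharacters will break the pasted commands. A null-delimited pipeline avoids the quoting problem:

```shell
# -print0 and xargs -0 keep each filename intact, spaces and all.
find . -type f -name '*.php' -print0 | xargs -0 -I{} cp {} ~/destination/
```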

Removing files using if statement in bash script when a certain condition is met

I am currently using the following code:
#!/bin/bash
rm /media/external/archive/auth-settings.tar.raw
rm /media/external/archive/bak-settings.tar.raw
rm /media/external/archive/cont-settings.tar.raw
rm /media/external/archive/data-data.tar.raw
rm /media/external/archive/data-settings.tar.raw
rm /media/external/archive/mon-data.tar.raw
rm /media/external/archive/mon-settings.tar.raw
rm /media/external/archive/mail-data.tar.raw
rm /media/external/archive/mail-settings.tar.raw
rm /media/external/archive/portal-settings.tar.raw
rm /media/external/archive/webserver-data.tar.raw
rm /media/external/archive/webserver-settings.tar.raw
for f in /media/external/archive/*.tar.raw;
do mv "$f" "${f%.*.tar.raw}.tar.raw";
done
to remove old backups once the new ones have been archived. However, if for some reason the archiving fails, this script still runs and removes all the archives, leaving nothing behind.
How can I modify the script so that each rm command runs only if the corresponding archive exists with a count number in its file name, skipping the deletion if the numbered archive does not exist? For example:
auth-settings.tar.raw
should be removed only if there is a
auth-settings.number.tar.raw
You can use an if statement in the script. For example:
if [ -e /media/external/archive/auth-settings.*.tar.raw ]
then
rm /media/external/archive/auth-settings.tar.raw
fi
The "*" character gives you a wildcard to insert between the two dots.
This way, you delete /media/external/archive/auth-settings.tar.raw only if "auth-settings.number.tar.raw" exists.
This will return a list of the files whose names contain a number in the format you provided, but with the number stripped, so you delete only files that appear in this output:
find /media/external/archive/ -type f -regextype sed -regex ".*\.[0-9]\+\.tar\.raw" -print0 | xargs --null -L 1 basename | sed -E "s/(.*)(\.[0-9]+)(.*)/\1\3/g"
Files in the directory :
more-files.03242.tar.raw
somethingwithdot.3.tar.raw
'with spaces.09434.tar.raw'
Output :
somethingwithdot.tar.raw
more-files.tar.raw
with spaces.tar.raw
Now you can safely iterate and delete these files as you can be sure they have a backup.
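To close the loop (my sketch, reusing the same GNU find regex): iterating over the numbered backups directly and deriving each base name avoids parsing the stripped list at all:

```shell
# Assumes GNU find (-regextype sed, regex matched against the whole path).
archive=/media/external/archive
find "$archive" -type f -regextype sed -regex ".*\.[0-9]\+\.tar\.raw" -print0 |
while IFS= read -r -d '' backup; do
  base=$(basename "$backup" | sed -E 's/\.[0-9]+\.tar\.raw$/.tar.raw/')
  rm -f "$archive/$base"    # remove the un-numbered original; the backup stays
done
```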

Bash script to Rename multiple files in subfolder to their folder name

I have the following file structure:
Applications/Snowflake/applications/Salford_100/wrongname_120.nui; wrongname_200_d.nui
Applications/Snowflake/applications/Salford_900/wrongname_120.nui; wrongname_200_d.nui
Applications/Snowflake/applications/Salford_122/wrongname_120.nui; wrongname_200_d.nui
And I want to rename the files to the same name as the directories they're in, but the files with "_d" at the end should keep that suffix. The file pattern is always "Salford_xxx", where xxx is always three digits. So the resulting files would be:
Applications/Snowflake/applications/Salford_100/Salford_100.nui; Salford_100_d.nui
Applications/Snowflake/applications/Salford_900/Salford_900.nui; Salford_900_d.nui
Applications/Snowflake/applications/Salford_122/Salford_122.nui; Salford_122_d.nui
The script would run from a different location in
Applications/Snowflake/Table-updater
I imagine this would require a for loop and a sed regex, but I'm open to any suggestions.
(Thanks @ghoti for your advice.)
I've tried this, which does not yet account for files with "_d", and I only get one file renamed correctly. Some help would be appreciated.
cd /Applications/snowflake/table-updater/Testing/applications/salford_*
dcomp="$(basename "$(pwd)")"
for file in *; do
ext="${file##*.}"
mv -v "$file" "$dcomp.$ext"
done
I've now updated the script following @varun's advice (thank you), and it now searches all the directories in the parent dir that contain salford in the name, without hard-coding the parent name. Please see below.
#!/bin/sh
#
# RenameToDirName2.sh
#
set -e
cd /Applications/snowflake/table-updater/Testing/Applications/
find salford* -maxdepth 1 -type d \( ! -name . \) -exec sh -c '(cd {} &&
(
dcomp="$(basename "$(pwd)")"
for file in *;
do ext="${file#*.}"
zz=$(echo $file|grep _d)
if [ -z $zz ]
then
mv -v "$file" "$dcomp.$ext"
else
mv -v "$file" "${dcomp}_d.$ext"
fi
done
)
)' ';'
The thing is, I've just realised that these salford subdirectories contain other files with different extensions that I don't want renamed. I've tried putting in an else-if statement to restrict the renaming to *.nui files only, using my $dcomp variable, like this:
else
if file in $dcomp/*.nui
then
#continue...
But I get errors. Where should this go in my script and also do I have the correct syntax for this loop? Can you help?
You can write:
(
cd ../applications/ && \
for name in Salford_[0-9][0-9][0-9] ; do
mv "$name"/*_[0-9][0-9][0-9].nui "$name/$name.nui"
mv "$name"/*_[0-9][0-9][0-9]_d.nui "$name/${name}_d.nui"
done
)
(Note: the (...) is a subshell, to restrict the scope of the directory-change and of the name variable.)
@eggfoot, I have modified my script. It will look into all the directories in the applications folder and pick out the ones with Salford in the name.
So you can call my script like this:
./rename.sh /home/username/Applications/Snowflake/applications
#!/bin/bash
# set -x
path=$1
dir_list=$(find $path/ -type d)
for index_dir in $dir_list
do
aa=$(echo "$index_dir" | grep Salford)
if [ -n "$aa" ]
then
files_list=$(find "$index_dir"/ -type f)
for index in $files_list
do
xx=$(basename "$index")
z=$(echo "$xx" | grep '_d')
if [ -z "$z" ]
then
result=$(echo $index | sed 's/\/\(.*\)\/\(.*\)\/\(.*\)\(\..*$\)/\/\1\/\2\/\2\4/')
mv "$index" "$result"
else
result=$(echo $index | sed 's/\/\(.*\)\/\(.*\)\/\(.*\)_d\(\..*$\)/\/\1\/\2\/\2_d\4/')
mv "$index" "$result"
fi
done
fi
done
Regarding sed, it uses the s command to substitute the directory name into the file name, keeping the extension as it is.
Regarding your script, you need the grep command to detect files which have _d, and then you can use parameter substitution, with one mv for files with _d and another for files without.
dcomp="$(basename "$(pwd)")"
for file in *; do
ext="${file##*.}"
zz=$(echo "$file" | grep _d)
if [ -z "$zz" ]
then
mv -v "$file" "$dcomp.$ext"
else
mv -v "$file" "${dcomp}_d.$ext"
fi
done
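To address the follow-up about leaving other extensions alone (my sketch): restricting the glob to *.nui and branching with case skips everything else:

```shell
dcomp="$(basename "$(pwd)")"
for file in *.nui; do           # only .nui files enter the loop
  [ -e "$file" ] || continue    # skip the literal pattern when nothing matches
  case $file in
    *_d.nui) mv -v "$file" "${dcomp}_d.nui" ;;
    *)       mv -v "$file" "$dcomp.nui" ;;
  esac
done
```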

How to rename all files in a folder removing everything after space character in linux?

Hello, I don't know regular expressions well, and I've been searching the Internet all day.
I have a folder with many pictures:
50912000 Bicchiere.jpg
50913714 Sottobottiglia Bernini.jpg
I'm using Mac OS X, but I can also try on Ubuntu. I would like a bash script that removes all the characters after the first space, producing results like this:
50912000.jpg
50913714.jpg
For all the files in the folder.
Any help is appreciated.
Regards
Use pure BASH:
f='50912000 Bicchiere.jpg'
mv "$f" "${f/ *./.}"
Or using find fix all the files at once:
find . -type f -name "* *" -exec bash -c 'f="$1"; mv -- "$f" "${f/ *./.}"' _ '{}' \;
Use sed,
sed 's/ .*\./\./g'
Notice the space before .*
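Applied in a loop (my sketch), that sed expression turns each name into its leading token plus the extension:

```shell
# Only names containing a space are touched.
for f in *\ *; do
  [ -e "$f" ] || continue
  mv -- "$f" "$(printf '%s\n' "$f" | sed 's/ .*\./\./')"
done
```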
You can use a combination of find and a small script.
prompt> find . -name "* *" -exec move_it {} \;
mv "./50912000 Bicchiere.jpg" ./50912000
mv "./50913714 Sottobottiglia Bernini.jpg" ./50913714
prompt> cat move_it
#!/bin/sh
dst=`echo $1 | cut -c 1-10`
# remove the echo in the line below to actually rename the file
echo mv '"'$1'"' $dst
With rename
rename 's/ .*\./\./' -- *

Regex to rename all files recursively removing everything after the character "?" commandline

I have a series of files that I would like to clean up using commandline tools available on a *nix system. The existing files are named like so.
filecopy2.txt?filename=3
filecopy4.txt?filename=33
filecopy6.txt?filename=198
filecopy8.txt?filename=188
filecopy3.txt?filename=19
filecopy5.txt?filename=1
filecopy7.txt?filename=5555
I would like them to be renamed removing all characters after and including the "?".
filecopy2.txt
filecopy4.txt
filecopy6.txt
filecopy8.txt
filecopy3.txt
filecopy5.txt
filecopy7.txt
I believe the following regex will grab the bit I want to remove from the name,
\?(.*)
I just can't figure out how to accomplish this task beyond this.
A bash command:
for file in *; do
mv -- "$file" "${file%%\?filename=*}"
done
find . -depth -name '*[?]*' -exec sh -c 'for i do
mv "$i" "${i%[?]*}"; done' sh {} +
With zsh:
autoload zmv
zmv '(**/)(*)\?*' '$1$2'
Change it to:
zmv -Q '(**/)(*)\?*(D)' '$1$2'
if you want to rename dot files as well.
Note that if filenames may contain more than one ? character, both will only trim from the rightmost one.
If all files are in the same directory (ignoring .dotfiles):
$ rename -n 's/\?filename=\d+$//' -- *
If you want to rename files recursively in a directory hierarchy:
$ find . -type f -exec rename -n 's/\?filename=\d+$//' {} +
Remove -n option, to do the renaming.
In this case you can use the cut command:
echo 'filecopy2.txt?filename=3' | cut -d? -f1
example:
find . -type f -name "*\?*" -exec sh -c 'mv "$1" "$(echo "$1" | cut -d\? -f1)"' sh {} \;
You can use rename if you have it:
rename 's/\?.*$//' *
I use this after downloading a bunch of files where the URL included parameters and those parameters ended up in the file name.
This is a Bash script.
for file in *\?*; do
mv -- "$file" "${file%%\?*}";
done