How to -find -exec cp files from a list containing a portion of the filename - regex

I've been searching for a long time and can't find an answer that works. I have a list with partial filenames (the first few letters of the filenames). If I place the file names individually as follows it works:
find ~/directory/to/search -name "filename*" -print -exec cp '{}' ~/directory/to/copyto \;
If I try to include the list in this scenario it does not:
cat ~/directory/List.txt | while read line
do
echo "Text read from file - $line"
find ~/directory/to/search -name "$line*" -type f
done
neither does this:
cat ~/directory/List.txt | while read line
do
echo "Text read from file - $line"
find ~/directory/to/search -name "$line&*" -type f
done
Ultimately, I'd like to add:
-exec cp '{}' ~/directory/to/copy/to \;
And copy over all files matching the find criteria.
I've tried using grep, but the files are huge, so it would take forever. I've tried all sorts of combinations of find, xargs, cp, grep and regex from previous searches, with no luck.
Is the only solution to write a long script with a bunch of if/then statements? I've been using Linux, but it would be cool if it worked on a Mac as well.

Here is a crude attempt at getting away with a single find invocation.
predicates=()
or=()
while read -r line; do
    # join the individual -name tests with -o (logical OR)
    predicates+=("${or[@]}" -name "$line*")
    or=(-o)
done < ~/directory/list.txt
find ~/directory/to/search -type f \( "${predicates[@]}" \) \
    -exec cp -t ~/directory/to/copy/to {} +
Arrays require an extended shell (Bash, ksh, etc.); this won't work with a plain /bin/sh.
cp -t is a GNU extension; if you don't have it, use your original -exec cp {} dir \; instead, though it will be less efficient. Some old versions of find also don't support -exec ... +.
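For reference, the portable ending would look like this, reusing the same predicates array (one cp invocation per file instead of a batched copy, so slower on large trees):
find ~/directory/to/search -type f \( "${predicates[@]}" \) \
    -exec cp {} ~/directory/to/copy/to \;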

Related

Strip unwanted characters from a series of text files

Hi, I've got a list of CSV files which need to be formatted properly by getting rid of some unwanted characters.
original:
9: ["2019-4-24",-7.101458109105941]
10: ["2019-5-6",-7.050609022950812]
100: ["2019-5-6",-7.050609022950812]
I'd like to modify as:
2019-4-24,-7.101458109105941
2019-5-6,-7.050609022950812
2019-5-6,-7.050609022950812
There are dozens of files in this format, and I was thinking of writing a sed command which makes a series of null substitutions for all the files in the directory, but these don't seem to work.
find ./ -type f -exec sed -i '' -e "s/^[[:space:]]*//" {} \;
find ./ -type f -exec sed -i '' -e "s/\[//" {} \;
find ./ -type f -exec sed -i '' -e "s/\]//" {} \;
Many thanks for suggestions.
I found this to work on my Linux machine.
find ./ -type f -exec sed -i "s/^.\+\[//;s/\"//g;s/\]//" {} \;
Which needs two adjustments for macOS: BSD sed requires an explicit (here empty) suffix argument after -i, and its basic regexes have no \+ operator, so the portable \{1,\} interval is used instead:
find ./ -type f -exec sed -i '' "s/^.\{1,\}\[//;s/\"//g;s/\]//" {} \;
It comprises three substitutions (separated by semicolons):
s/^.\+\[// deletes everything from the start to the "[" character.
s/\"//g deletes all occurrences of the double quote character.
s/\]// deletes the final "]" at the end.
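As a quick sanity check, you can pipe one of the sample lines through the script (GNU sed syntax, as in the Linux command above):
echo '9: ["2019-4-24",-7.101458109105941]' | sed 's/^.\+\[//;s/\"//g;s/\]//'
2019-4-24,-7.101458109105941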
And please make a backup or something if you are going to use sed -i.

Passing loop variable to BASH find -regex argument?

I am trying to copy images selected by a REGEX expression via find. The command below runs but does not copy the files. How can I pass a loop variable (e.g. reading lines of a file) to the find -regex argument?
cat hit_images_regex_list.txt | while read line;
do
find . -regex "${line}" -exec cp --parents {} /destination_dir \;
done
The hit_images_regex_list.txt file where I am pulling REGEX expressions from looks like this:
".*B - 12.*tif"
".*D - 09.*tif"
".*G - 03.*tif"
".*G - 12.*tif"
...
Using find with each of these REGEX expressions works, but the loop pulling REGEX expressions from my .txt file doesn't do anything.
for i in $(cat hit_images_regex_list.txt); do find -name "$i" | cat -n | while read n f; do cp "$f" /newfolder/path/"$n".tif; done; done
This should open the file and, for each regex (line), find all files that match and copy them to a new folder. Does this help?
Try this.
while IFS= read -r line; do
find . -regex "$line" -exec cp --parents {} /destination_dir \;
done < hit_images_regex_list.txt
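One thing worth checking: the sample list above shows each pattern wrapped in double quotes. If those quotes are literally present in the file, find receives them as part of the regex and nothing will match. A sketch that strips a surrounding quote pair first (assuming the quotes are not meant to be matched):
while IFS= read -r line; do
    line=${line#\"}    # drop a leading double quote, if present
    line=${line%\"}    # drop a trailing double quote, if present
    find . -regex "$line" -exec cp --parents {} /destination_dir \;
done < hit_images_regex_list.txt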

append epoch date at the beginning of a file in bash

I have a list of 20 files; 10 of them already have 1970-01-01- at the beginning of the name and 10 do not (the remaining ones all start with a lowercase letter).
So my task was to rename the files that do not yet have the epoch date at the beginning so that they carry it too. Using Bash, the code below works, but I could not solve it with a regular expression, for example using rename; I had to extract the basename and then mv. An elegant solution would use just one pipe instead of two.
Works
find ./ -regex './[a-z].*' | xargs -I {} basename {} | xargs -I {} mv {} 1970-01-01-{}
Hence looking for a solution with just one xargs or -exec?
You can just use a single rename command:
rename -n 's/^([a-z])/1970-01-01-$1/' *
Assuming you're operating on all the files present in the current directory.
Note that the -n flag (dry run) will only show the actions rename would take; it won't actually rename any files.
If you want to combine with find then use:
find . -maxdepth 1 -type f -name '[a-z]*.txt' -execdir rename -n 's/^/1970-01-01-/' {} +
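If the Perl rename is not installed (it often isn't on macOS or minimal Linux systems), a plain Bash loop over the same glob does the job; a minimal sketch that, like -n, only echoes first:
for f in [a-z]*; do
    [ -f "$f" ] && echo mv -i -- "$f" "1970-01-01-$f"   # drop echo to actually rename
done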
I always prefer readable code over short code.
r() {
    base=$(basename "$1")
    dir=$(dirname "$1")
    if [[ "$base" =~ ^1970-01-01- ]]
    then
        : "ignore, already has correct prefix"
    else
        echo mv "$1" "$dir/1970-01-01-$base"
    fi
}
export -f r
# pass the file name as a positional parameter rather than splicing {} into
# the script, so names with spaces or quotes are handled safely
find . -type f -exec bash -c 'r "$1"' _ {} \;
This also just prints out what would have been done (for testing). Remove the echo before the mv to do the real thing.
Mind that the mv will overwrite existing files (if there are a ./a/b/c and a ./a/b/1970-01-01-c already). Use the -i option to mv to be safe from this.
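Alternatively, mv -n (no-clobber, supported by both GNU and BSD/macOS mv) silently skips targets that already exist, which is handy when the script runs unattended; the changed line inside r() would be:
echo mv -n "$1" "$dir/1970-01-01-$base"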

Bash script find/sed not working

I have the following line in my bash script:
find . -name "*.html" -print |
xargs sed -i 's/http\:\/\/version2\.staging\.myname\.com//g'
and it's giving me the following error:
sed: 1: "./instant/index. ...": invalid command code .
What I'm trying to do is replace any occurrence of http://version2.staging.myname.com with /. How do you do it?
Usually I use something like:
find . -name "*.html" -exec sed -i 's|http://version2\.staging\.myname\.com/|/|g' '{}' ';'
To test this out, you can first insert an echo statement
find . -name "*.html" -exec echo sed -i 's|http://version2\.staging\.myname\.com/|/|g' '{}' ';'
... that will tell you if the output will be what you expect. I always recommend doing a dry run with echo before any mass update. Also, you can use | as an alternative delimiter for s/// to avoid escaping the many '/' characters in the URL.
For OSX try this:
find . -name "*.html" -exec sed -i.bak 's#http://version2\.staging\.myname\.com##g' '{}' \; -print
I think you may be using a Mac (and now I see a comment that you are on an iMac). On Mac OS X, the sed -i option requires an argument. That makes sense of your error message. The sed command is interpreting your s/...//g command as the suffix to use for the back up file; it is then trying to interpret the first file name as the sed script, and fortunately, that is not working.
Additionally, you can avoid most of the escaping issues by using some character other than / as the delimiter for s///. Also, it is generally better (especially on Macs where file paths often end up with spaces in them) to avoid xargs and use -exec in find, along with the + option to do what xargs does — namely group many file names into one command invocation.
This leads to:
find . -name "*.html" -type f \
-exec sed -i .bak -e 's%http://version2.staging.myname.com%%g' {} +
(NB: strictly, because the dots are unescaped, that pattern will also match strings like http://version2-staging*myname#com; if you're really worried about that, by all means escape the dots in the URL.)
If you want to get rid of the .bak files afterwards:
find . -name '*.bak' -type f -exec rm -f {} +
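On Linux with GNU sed, the suffix is optional but must be attached to -i, so the backups (and the cleanup pass) can be skipped entirely; a sketch of the same command, assuming GNU sed:
find . -name "*.html" -type f \
    -exec sed -i -e 's%http://version2\.staging\.myname\.com%%g' {} +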

Search for lines with specific text in files and output file names (file command and grep)

I know this question may be simple to some of you, but I have tried several combinations and googled a lot, without success.
Problem: I have a bunch of files with a given file name, but in different directories.
For example, I have a file called 'THEFILE.txt' in directories a, b, c, d. I am in a directory that has these as subdirectories. In each of 'THEFILE.txts' I am looking for lines with the following pattern :'Has this property blah blah blah _apple'. So what I know for sure about the line is that it starts with 'Has this property ' and ends with '_apple'.
I tried:
find . -name 'THEFILE.txt' -exec grep -l 'Has this property' {} \;
This works, but I get each and every line with 'Has this property'. I only want the ones with _apple at the end.
So I tried:
find . -name 'THEFILE.txt' -exec grep -l 'Has this property*_apple' {} \;
//Does not work, and from my Google searches, I don't expect it to.
So, next I tried:
find . -name 'THEFILE.txt' -exec grep -l 'Has this property[!-~]*_apple' {} \;
//DOES NOT WORK
find . -name 'THEFILE.txt' -exec grep 'Has this property' {} \; | grep '_apple$'
//This outputs all matching lines, but not the file names
find . -name 'THEFILE.txt' -exec grep 'Has this property' {} \; | grep -l '_apple$'
//Says the file is stdin
Expected output: (say files a and c have desired lines)
./a/THEFILE.txt
./c/THEFILE.txt
Your second attempt was almost there. With a little adjustment:
find . -name THEFILE.txt -exec grep -q '^Has this property.*_apple$' '{}' ';' -print
it is more precise than recursive grepping and simpler (no pipeline).
The reason why the above works (as opposed to grep -l), is that the -exec action evaluates to whatever the exit status of its command was.
grep will exit with 0 status (which is true) if it finds what it looked for, that will make -exec yield true, and that in turn will cause the next action (-print) to be taken too.
You can just use grep -r, and pipe it into awk if you only want the filenames.
grep -r "_apple$" . | awk -F: '{print $1}'
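Since -l is what prints file names, you can also let grep do the whole job and skip awk, assuming GNU grep for the --include filter:
grep -rl --include=THEFILE.txt '^Has this property.*_apple$' .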