Expand command line exclude pattern with zsh - regex

I'm trying to pass a complicated regex as an ignore pattern. I want to ignore all subfolders of locales/ except locales/US/en/*. I may need to fall back to using a .agignore file, but I'm trying to avoid that.
I'm using the Silver Searcher (similar to ack and grep), with zsh as my shell.
This works really well and ignores all locale subfolders except locales/US:
ag -g "" --ignore locales/^US/ | fzf
I also want to ignore all locales/US/* except for locales/US/en
What I want is this, but it does not work:
ag -g "" --ignore locales/^US/^en | fzf
Thoughts?

Add multiple --ignore options. For instance:
ag -g "" --ignore locales/^US/ --ignore locales/US/^en

The following can work as well:
find locales/* -maxdepth 0 -name 'US' -prune -o -exec rm -rf '{}' ';'
From the find man pages:
-prune
True; if the file is a directory, do not descend into it. If -depth is given, false; no effect. Because -delete implies -depth, you cannot usefully use -prune and -delete together.
-prune lets you filter entries out of find's traversal.
-exec command ;
Execute command; true if 0 status is returned. All following arguments to find are taken to be arguments to the command until an argument consisting of ';' is encountered. The string '{}' is replaced by the current file name being processed everywhere it occurs in the arguments to the command, not just in arguments where it is alone, as in some versions of find. Both of these constructions might need to be escaped (with a '\') or quoted to protect them from expansion by the shell. See the EXAMPLES section for examples of the use of the -exec option. The specified command is run once for each matched file. The command is executed in the starting directory. There are unavoidable security problems surrounding use of the -exec action; you should use the -execdir option instead.
-exec lets you execute a command on any results find returns.
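The prune-and-delete pattern above can be sketched in a throwaway directory (the locale names below are made up for the demo):

```shell
# Build a disposable locales/ tree, then delete every subdirectory
# except locales/US using -prune and -exec.
tmp=$(mktemp -d)
mkdir -p "$tmp/locales/US/en" "$tmp/locales/FR/fr" "$tmp/locales/DE/de"

# -maxdepth 0 restricts find to the glob-expanded arguments themselves;
# -name 'US' -prune keeps US out of the action list, and everything
# else falls through to -exec rm -rf.
find "$tmp/locales/"* -maxdepth 0 -name 'US' -prune -o -exec rm -rf '{}' ';'

ls "$tmp/locales"   # only US remains
rm -rf "$tmp"
```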

Related

Match and replace string with sed in makefile

I want to search through a bunch of MD and HTML files for a line that begins with "id" and replace any instances of a string "old" with the string "new". I had this working from the command line with the following.
find \( -name '*.md' -o -name '*.html' \) -exec sed -i '/id:/s/old/new/g' '{} \;
However, I need to run the command from a Makefile. I have never done anything with make before. When I drop this same command into the Makefile and try to execute it from there, it fails. That's when I realized how little I know about make because I naively thought if it worked from the command line it would work from make. I was wrong. So I was looking in this Makefile for some examples of sed commands that do something similar and I came up with the following. This does not error out but it also does not do anything to my files. So, I am at a loss. Any help is appreciated!
switch_old_for_new:
find \( -name '*.md' -o '*.html' \) -exec sed -i 's#^\(id: \)$(OLD)#\1$(NEW)#' '{}' \;
NOTE: as you can probably see, I need to be able to pass in two actual values for "old" and "new" from the command line, so I also need to have variables in the sed. So I would execute it like this:
make switch_old_for_new OLD=old NEW=new
It seems it was late and you ran out of coffee when copying the command line to make ;)
The only thing that was fishy in your first example was a superfluous ' right before {}. Everything else runs unchanged in make. In a recipe, \ has no special meaning to make; that is, if make finds it in a tab-preceded line after a target:, it really is passed verbatim to the shell, exactly as on a standalone command line. The only notable exception is a \ right before the line break, i.e. something like:
target:
echo a very long \
line with a \+newline in it
In this case make takes the \(newline) as an indication that it shall pass the current line together with the next line (and all subsequent \(newline)-concatenated lines) to the shell in one call, instead of the default of a separate shell call for each recipe line. (Note: only the tab, but not the \(newline), is deleted from the string given to the shell; you need to trick around with variables a bit if that \(newline) gets in the way.)
Also, the quoting characters ', " and the backtick, as well as the glob characters * and ?, don't invoke any special behaviour in make; they are passed to the shell as they are.
So your make file could look like:
switch_old_for_new:
find . \( -name '*.md' -o -name '*.html' \) -exec sed -i '/id:/s/$(OLD)/$(NEW)/g' {} \;
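Before wiring this into make, the find/sed line can be sanity-checked in a plain shell session, with shell variables standing in for make's $(OLD) and $(NEW) (GNU sed's -i is assumed; the file name and contents below are invented for the demo):

```shell
# Sketch: verify the sed invocation itself before putting it in a recipe.
# $OLD / $NEW play the role of make's $(OLD) / $(NEW).
OLD=old NEW=new
tmp=$(mktemp -d)
printf 'id: something old\nbody: old stays here\n' > "$tmp/note.md"

# Only lines matching /id:/ have "old" replaced with "new".
find "$tmp" \( -name '*.md' -o -name '*.html' \) \
    -exec sed -i "/id:/s/$OLD/$NEW/g" {} \;

cat "$tmp/note.md"   # the id: line changed, the body line did not
rm -rf "$tmp"
```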

How to Execute Python File In Unix Find Command

Okay. Let's say that I am in the main directory of my computer. How can I search for a file.py and execute it with Unix in one line? Two lines is okay, but assume we do not know the file path.
It's a simple question, but I am unable to find an answer.
Updated
Per kojiro's comment, a better method is to use the -exec argument to find.
$ find ./ -name 'file.py' -exec python '{}' \;
The manpage for find explains its usage better than I can; see the entry for -exec command ;. In short, it will call command for each result with any arguments up to the \;, replacing '{}' with the file path of the result.
Also in the man page for find, it's worth looking at the notes relating to the -print and -print0 flags if you're using the below approach.
Original Answer
Does something like the following do what you want?
$ cd /path/to/dir/
$ find ./ -name 'file.py' | xargs -L 1 python
which is a pretty useful pattern where
find ./ -name 'file.py'
will list all the paths to files with names matching file.py in the current directory or any subdirectory.
Pipe the output of this into xargs, which passes each line from its stdin as an argument to the program given to it, in this case python. However, we want to execute python once for every line given to xargs. From the Wikipedia article for xargs:
one can also invoke a command for each line of input at a time with -L 1
However, this will match all files under the current path that are named 'file.py'. You can probably limit this to the first result with a flag to find if you want.
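The per-invocation behaviour of xargs -L 1 can be seen with echo standing in for python, so nothing actually gets executed (the directory layout below is invented for the demo):

```shell
# Sketch of the find | xargs -L 1 pattern, with echo in place of python:
# each matched path produces a separate invocation.
tmp=$(mktemp -d)
mkdir -p "$tmp/a" "$tmp/b"
touch "$tmp/a/file.py" "$tmp/b/file.py"

# One "run:" line per match, i.e. one command invocation per file.
find "$tmp" -name 'file.py' | xargs -L 1 echo run:
rm -rf "$tmp"
```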

How to Recursively Remove Files of a Certain Type

I misread the gzip documentation, and now I have to remove a ton of ".gz" files from many directories inside one another. I tried using 'find' to locate all .gz files. However, whenever there's a file with a space in the name, rm interprets that as another file. And whenever there's a dash, rm interprets that as a new flag. I decided to use 'sed' to replace the spaces with "\ " and the space-dashes with "\ -", and here's what I came up with.
find . -type f -name '*.gz' | sed -r 's/\ /\\ /g' | sed -r 's/\ -/ \\-/g'
When I run the find/sed query on a file that, for example, has a name of "Test - File - for - show.gz", I get the output
./Test\ \-\ File\ \-\ for\ \-\ show.gz
Which appears to be acceptable for rm, but when I run
rm $(find . -type f -name '*.gz'...)
I get
rm: cannot remove './Test\\': No such file or directory
rm: cannot remove '\\-\\': No such file or directory
rm: cannot remove 'File\\': No such file or directory
rm: cannot remove '\\-\\': No such file or directory
...
I haven't made extensive use of sed, so I have to assume I'm doing something wrong with the regular expressions. If you know what I'm doing wrong, or if you have a better solution, please tell me.
Adding backslashes before spaces protects the spaces against expansion in shell source code. But the output of a command in a command substitution does not undergo shell parsing, it only undergoes wildcard expansion and field splitting. Adding backslashes before spaces doesn't protect them against field splitting.
Adding backslashes before dashes is completely useless since it's rm that interprets dashes as special, and it doesn't interpret backslashes as special.
The output of find is ambiguous in general — file names can contain newlines, so you can't use a newline as a file name separator. Parsing the output of find is usually broken unless you're dealing with file names in a known, restricted character set, and it's often not the simplest method anyway.
find has a built-in way to execute external programs: the -exec action. There's no parsing going on, so this isn't subject to any problem with special characters in file names. (A path beginning with - could still be interpreted as an option, but all paths begin with . since that's the directory being traversed.)
find . -type f -name '*.gz' -exec rm {} +
Many find implementations (Linux, Cygwin, BSD) can delete files without invoking an external utility:
find . -type f -name '*.gz' -delete
See Why does my shell script choke on whitespace or other special characters? for more information on writing robust shell scripts.
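A quick sanity check, in a throwaway directory, that -exec copes with exactly the names that broke the sed pipeline (spaces and dashes; the file names below are invented for the demo):

```shell
tmp=$(mktemp -d)
touch "$tmp/Test - File - for - show.gz" "$tmp/plain.gz" "$tmp/keep.txt"

# No word splitting and no option parsing on the names: find hands each
# path to rm as a single argument, and every path starts with "$tmp/",
# so nothing can be mistaken for a flag.
find "$tmp" -type f -name '*.gz' -exec rm {} +

ls "$tmp"   # only keep.txt remains
rm -rf "$tmp"
```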
There is no need to pipe to sed, etc. Instead, you can make use of the -exec flag on find, that allows you to execute a command on each one of the results of the command.
For example, for your case this would work:
find . -type f -name '*.gz' -exec rm {} \;
which is approximately the same as:
find . -type f -name '*.gz' -exec rm {} +
The last one passes many results to a single rm invocation instead of running rm once per file, which makes it faster.
From man find:
-exec command ;
Execute command; true if 0 status is returned. All following
arguments to find are taken to be arguments to the command until an
argument consisting of `;' is encountered. The string `{}' is
replaced by the current file name being processed everywhere it occurs
in the arguments to the command, not just in arguments where it is
alone, as in some versions of find. Both of these constructions
might need to be escaped (with a `\') or quoted to protect them from
expansion by the shell. See the EXAMPLES section for examples of the
use of the -exec option. The specified command is run once for
each matched file. The command is executed in the starting directory.
There are unavoidable security problems surrounding use of the -exec
action; you should use the -execdir option instead.
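The difference between \; and + can be made visible by echoing the argument count from a small sh -c wrapper: one invocation per file with \;, one batched invocation with + (file names below are invented for the demo):

```shell
tmp=$(mktemp -d)
touch "$tmp/a.gz" "$tmp/b.gz" "$tmp/c.gz"

# With \; the command runs once per file, so $# is 1 each time
# (three lines of output). With + the paths are batched into as few
# invocations as possible, like xargs, so $# is 3 (one line).
find "$tmp" -name '*.gz' -exec sh -c 'echo "args: $#"' sh {} \;
find "$tmp" -name '*.gz' -exec sh -c 'echo "args: $#"' sh {} +
rm -rf "$tmp"
```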

Recursively rename directories and files based on a regular expression

I am trying to strip all "?" characters from file names in a given directory, which has many subdirectories that in turn contain further subdirectories. I've tried using a simple Perl regex script with system calls, but it fails to recurse over each subdirectory, and renaming manually would waste too much time. How can I solve my problem?
You can use the find command to search the filenames with "?" and then use its exec argument to run a script which removes the "?" characters from the filename. Consider this script, which you could save to /usr/local/bin/rename.sh, for example (remember to give it +x permission):
#!/bin/sh
mv "$1" "$(echo "$1" | tr -d '?')"
Then this will do the job:
find . -name "*\?*" -exec rename.sh {} \;
Try this :
find -name '*\?*' -exec prename 's/\?//g' {} +
See https://metacpan.org/module/RMBARKER/File-Rename-0.06/rename.PL (this is the default rename command on Ubuntu distros)
Find all the names containing '?' and strip that character from each of them. The -exec option could probably be used as well, but it would require an additional script. (Note that the word-splitting in this loop still breaks on file names containing whitespace.)
for f in $(find "$dir" -name "*\?*" -a -type f) ; do
mv "$f" "${f//\?/}"
done
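A sketch that does the rename inside the command run by -exec, so that names containing whitespace survive; -depth renames deeper entries first, and the {} + form hands several paths to one shell (file names below are invented for the demo):

```shell
# Strip "?" from file names recursively without parsing find's output.
tmp=$(mktemp -d)
mkdir -p "$tmp/sub dir"
touch "$tmp/what?.txt" "$tmp/sub dir/why??.txt"

# The inline sh loops over the batched paths in "$@"; each path is
# quoted throughout, so spaces are handled safely.
find "$tmp" -depth -name '*\?*' -exec sh -c '
    for f do mv "$f" "$(printf "%s" "$f" | tr -d "?")"; done
' sh {} +

find "$tmp" -name '*\?*'   # nothing left to match
rm -rf "$tmp"
```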

Using non-consuming matches in Linux find regex

Here's my problem in a simplified scenario.
Create some test files:
touch /tmp/test.xml
touch /tmp/excludeme.xml
touch /tmp/test.ini
touch /tmp/test.log
I have a find expression that returns me all the XML and INI files:
[root#myserver] ~> find /tmp -name -prune -o -regex '.*\.\(xml\|ini\)'
/tmp/test.ini
/tmp/test.xml
/tmp/excludeme.xml
I now want a way of modifying this -regex to exclude the excludeme.xml file from being included in the results.
I thought this should be possible by using/combining a non-consuming regex (?=expr) with a negated match (?!expr). Unfortunately I can't quite get the format of the command right, so my attempts result in no matches being returned. Here was one of my attempts (I've tried many different forms of this with different escaping!):
find /tmp -name -prune -o -regex '\(?=.*excludeme\.xml\).*\.\(xml\|ini\)'
I can't break down the command into multiple steps (e.g. piping through grep -v) as the find command is assumed as input into other parts of our tool.
This does what you want on linux:
find /tmp -name -prune -o -regex '.*\.\(xml\|ini\)' \! -regex '.*excludeme\.xml'
I'm not sure if the "!" operator is unique to gnu find.
Not sure about what escapes you need or if lookarounds work, but these work for Perl:
/^(?!.*\/excludeme\.).*\.(xml|ini)$/
/(?<!\/excludeme)\.(xml|ini)$/
Edit - Just checked the find command; the best you can do with find is to change the regex type with -regextype posix-extended, but that doesn't do stuff like look-arounds. The only way around this looks to be using some GNU stuff, either as @unholygeek suggests with find, or piping find into GNU grep with the -P (Perl) option. You can use the above regex verbatim if you go with GNU grep. Something like find .... -print | grep -P ...
Sorry, that's the best I can do.
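For completeness, here is a sketch of the find-into-GNU-grep idea (a grep built with PCRE support, i.e. one that accepts -P, is assumed; the file names below match the test files created earlier):

```shell
# Filter find's output through a PCRE with a negative lookahead:
# keep .xml/.ini paths, drop any path containing /excludeme.
tmp=$(mktemp -d)
touch "$tmp/test.xml" "$tmp/excludeme.xml" "$tmp/test.ini" "$tmp/test.log"

find "$tmp" -type f | grep -P '^(?!.*/excludeme\.).*\.(xml|ini)$'
rm -rf "$tmp"
```

Note this only works when the rest of the pipeline can consume a plain list of paths; as discussed above, it is not safe for file names containing newlines.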