grep filename* returns unexpected result - regex

I have a basic question about the ls command.
Suppose in a directory I have 4 files named
run
run1
running
run.sh
So, if I do ls -l | grep run* then I get no result.
But if I do ls -l | grep run.* then I get run.sh as a result.
However, I expected grep to list all of the files in both cases.
Could you help me understand what is going on behind the scenes?

This is because the asterisk is special to the shell and gets expanded. To avoid this, you have to quote the regex for grep to see it unexpanded:
ls -l|grep 'run*'
And note that this is not what you want, because 'run*' as a regexp means 'ru followed by any number of n'. This will also list files named rubber and so on. To list files that match a shell glob pattern (which is different from a regexp), why not simply use
ls -l run*
ls -l run.*
and avoid the useless grep process entirely?
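A quick way to see that pitfall, feeding grep some hypothetical sample lines through a pipe so no shell expansion is involved:

```shell
# As a regex, run* means "ru" followed by zero or more "n" characters,
# so it also matches lines such as "rubber".
hits=$(printf 'run.sh\nrubber\nrack\n' | grep -c 'run*')
echo "$hits"    # 2: run.sh and rubber match, rack does not
```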

As far as I understand, the "*" is expanded by the shell before the command itself is executed, so your grep receives all the matching file names as arguments instead of a pattern! On the other hand, grep expects a regular expression, so the "*" is not interpreted as you expect.
The direct solution would be:
$ ls -l run*
Or, if you want to use grep, then escape the "*" and provide a regular expression:
$ ls -l|grep run.\*
$ ls -l|grep 'run.*'

Before the shell even runs grep, it searches through your command for any unquoted file globbing characters, and performs filename expansion on those arguments.
So when you enter this command:
ls -l | grep run*
the shell uses the pattern run* to search for files in the current directory, and finds run, run1, running and run.sh. It then rewrites the grep command with those arguments:
ls -l | grep run run1 running run.sh
which causes grep to search run1, running and run.sh for the string run.
As noted, the solution is to quote the argument to grep so the shell does not try to perform filename expansion on it:
ls -l | grep 'run.*'
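The rewriting can be reproduced in a scratch directory (the four file names come from the question; mktemp just keeps the experiment isolated):

```shell
# Create the four files from the question in a throwaway directory.
dir=$(mktemp -d)
cd "$dir"
touch run run1 running run.sh

# Count the words the shell produces from the unquoted glob:
set -- run*
expanded=$#          # 4: the glob became four file-name arguments

# Quoted, grep receives the pattern itself and matches every name:
matched=$(ls | grep -c 'run.*')
```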

ls and regular expression linux

I have two directories:
run.2016-02-25_01.
run.2016-02-25_01.47.04
Both these directories are present under a common directory called gte.
I want to list only the directory whose name does not end with a dot character (.).
I am using the following command, however, I am not able to make it work:
ls run* | grep '.*\d+'
The command is not able to find anything.
The negated character set in shell globbing uses ! not ^:
ls -d run*[!.]
(The ^ was at one time an archaic synonym for |.) The -d option lists directory names, not the contents of those directories.
Your attempt using:
ls run* | grep '.*\d+'
requires a PCRE-enabled grep and the PCRE regex option (-P), and you are looking for zero or more of any character followed by one or more digits, which isn't what you said you wanted. You could use:
ls -d run* | grep '[^.]$'
which doesn't require the PCRE regexes, but simply having the shell glob the right names is probably best.
If you're worried that there might not be a name starting run and ending with something other than a dot, you should consider shopt -s nullglob, as mentioned in Anubhava's answer. However, note the discussion below between hek2mgl and myself about the potentially confusing behaviour of, in particular, the ls command in conjunction with shopt -s nullglob. If you were using:
for name in run*[!.]
do
…
done
then shopt -s nullglob is perfect; the loop iterates zero times when there's no match for the glob expression. It isn't so good when the glob expression is an argument to commands such as ls that provide a default behaviour in the absence of command line arguments.
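A bash sketch of that difference, in an empty scratch directory so the glob has nothing to match:

```shell
# With nullglob set, an unmatched glob disappears entirely.
dir=$(mktemp -d)
cd "$dir"
shopt -s nullglob

count=0
for name in run*[!.]
do
    count=$((count + 1))
done
# count stays 0: the loop body never ran.

# A command like ls, however, falls back to listing the current
# directory when it receives zero arguments, silently hiding the
# "no match" case.
```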
You don't need grep. Just use:
shopt -s nullglob
ls -d run*[0-9]
If your directories don't always end with digits, then use extglob:
shopt -s nullglob extglob
ls -d run*+([^.])
or, to list all entries inside the run* directories ending without a DOT:
printf "%s\n" run*+([^.])/*
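A sketch reproducing the two directories from the question (bash-specific, since extglob and nullglob are bash features):

```shell
shopt -s nullglob extglob
dir=$(mktemp -d)
cd "$dir"
mkdir run.2016-02-25_01. run.2016-02-25_01.47.04

# +([^.]) requires at least one trailing non-dot character,
# so only the second directory survives.
set -- run*+([^.])
howmany=$#
first=$1
```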
This works...
ls|grep '.*[^.]$'
That is saying: match any amount of anything, but the last character before the line ending must be anything except a period.
To list the directories that don't end with a dot:
ls -d run* |grep "[^.]$"
I would use find. Note that -regex matches the whole path, not just the file name:
find . -regextype posix-awk -maxdepth 1 -type d -regex '.*/run.*[[:digit:]]+'

Regex with inotifywait to compile two types of file in golang

I use a script to auto-compile in golang with inotifywait. But this script only checks files with the extension .go. I want to also add the .tmpl extension, but the script uses regular expressions. What kind of changes do I have to make to this line to get the desired result?
inotifywait -q -m -r -e close_write -e moved_to --exclude '[^g][^o]$' $1
I've tried to concatenate with | or & and other things like ([^t][^m][^p][^l]|[^g][^o])$ but nothing seems to work.
Rather than trying to use a regex to exclude two types of file, why don't you just only watch those files?
inotifywait -q -m -r -e close_write -e moved_to /path/**/*.{go,tmpl}
To use the ** (which does a recursive match), you may have to enable bash's globstar:
shopt -s globstar
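A sketch of globstar recursion with a made-up file layout (bash 4+):

```shell
shopt -s globstar nullglob
dir=$(mktemp -d)
cd "$dir"
mkdir -p src/pkg
touch main.go src/pkg/view.tmpl src/pkg/notes.txt

# ** recurses into subdirectories, and the brace expansion produces
# one glob per extension, so notes.txt is never matched.
set -- ./**/*.{go,tmpl}
count=$#    # main.go and src/pkg/view.tmpl
```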

Why doesn't ls -R (recursing down) work with a regular expression?

In my case, the directory tree is the following:
[peter#CentOS6 a]$ tree
.
├── 2.txt
└── b
└── 1.txt
1 directory, 2 files
Why do the following two commands only get 2.txt?
[peter#CentOS6 a]$ ls -R *.txt
2.txt
[peter#CentOS6 a]$ ls -R | grep *.txt
2.txt
In both cases, your shell is expanding *.txt into 2.txt before the argument hits the command. So, you are in effect running
ls -R 2.txt
ls -R | grep 2.txt
You can't tell ls to look for a file pattern - that's what find is for. In the second case, you should quote your expression and use a proper regex:
ls -R | grep '\.txt'
You can use find as follows to list all matching files in current and sub directories
find . -name "*.txt"
It isn't clear if you are asking "why" meaning "explain the output" or "how should it be done". Steephen has already answered the latter, this is an answer to the former.
The reason for that is called "shell expansion". When you type *.txt in the command line, the program doesn't get it as a parameter, but rather the shell expands it and then passes the results.
*.txt expands to be "all files in the current directory with arbitrarily many symbols in the beginning, ending with '.txt' and not starting with '.'".
This means that when you type "ls -R *.txt" the command that actually executes is "ls -R 2.txt"; and when you do "ls -R | grep *.txt" it actually executes "ls -R | grep 2.txt".
This is the exact reason why Steephen has put quotation marks around the wildcard in the answer provided. It is necessary to stop this expansion. In fact, you could also do so with single quotes or by placing a backslash before any special character. Thus any of the following will work:
find . -name "*.txt"
or
find . -name '*.txt'
or
find . -name \*.txt
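A quick check on the tree from the question that the quoted and backslash-escaped forms behave identically:

```shell
# Rebuild the tree: 2.txt at the top, 1.txt inside b/
dir=$(mktemp -d)
cd "$dir"
mkdir b
touch 2.txt b/1.txt

# Each quoting style delivers the literal *.txt to find:
n_double=$(find . -name "*.txt" | wc -l)
n_escaped=$(find . -name \*.txt | wc -l)
```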
The other problem that nobody has mentioned yet is that, beyond the fact that the shell intercepts the * before grep sees it, the shell treats * differently from grep.
The shell uses file globbing, and * means "any number of characters".
grep uses regular expressions, and * means "any number of the preceding item".
What you need to do is
ls -R | grep .\*\\.txt
which will
escape the * so your shell does not intercept it
properly format the regular expression the way grep expects
properly escape the . in .txt to ensure that you have file extensions
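Both spellings hand grep the identical regex .*\.txt, which can be verified on the question's tree:

```shell
dir=$(mktemp -d)
cd "$dir"
mkdir b
touch 2.txt b/1.txt

# Backslash-escaped and single-quoted forms produce the same output:
escaped=$(ls -R | grep .\*\\.txt)
quoted=$(ls -R | grep '.*\.txt')
# Both hold the two matching lines, 2.txt and 1.txt
```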

How to call grep on pattern files?

I'm trying to grep over files which have names matching a regexp. But the following:
#!/bin/bash
grep -o -e "[a-zA-Z]\{1,\}" $1 -h --include=$2 -R
is working only in some cases. When I call this script like that:
./script.sh dir1/ [A-La-l]+
it doesn't work. But following:
./script.sh dir1/ \*.txt
works fine. I have also tried passing arguments within double-quotes and quotes but neither worked for me.
Do you have any ideas how to solve this problem?
grep's --include option does not accept a regex but a glob (such as *.txt), which is considerably less powerful. You will have to decide whether you want to match regexes or globs -- *.txt is not a valid regex (the equivalent regex is .*\.txt) while [A-La-l]+ is not a valid glob.
If you want to do it with regexes, you will not be able to do it with grep alone. One thing you could do is to leave the file selection to a different tool such as find:
find "$1" -type f -regex "$2" -exec grep -o -e '[a-zA-Z]\{1,\}' -h '{}' +
This will construct and run a command grep -o -e '[a-zA-Z]\{1,\}' -h list of files in $1 matching the regex $2. If you replace the + with \;, it will run the command for each file individually, which should yield the same results (very) slightly more slowly.
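A sketch of that find + grep pipeline on made-up files; -regextype posix-extended is added here so that + is available regardless of find's default regex flavour:

```shell
dir=$(mktemp -d)
cd "$dir"
printf 'Alpha beta 42\n' > Able.txt
printf 'should be skipped\n' > zz.txt

# -regex matches the whole path, hence the leading \./ in the pattern.
# Able.txt matches [A-La-l]+, zz.txt does not.
words=$(find . -regextype posix-extended -type f \
             -regex '\./[A-La-l]+\.txt' \
             -exec grep -o -e '[a-zA-Z]\{1,\}' -h '{}' +)
```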
If you want to do it with globs, you can carry on as before; your code already does that. You should put double quotes around $1 and $2, though.

Safe search & replace on Linux

Let's say I have files located in different subfolders and I would like to search, test, and replace something in these files.
I would like to do it in three steps:
Search of a specific pattern (with or without regexp)
Test to replace it with something (with or without regexp)
Apply the changes only to the concerned files
My current solution is to define some aliases in my .bashrc in order to easily use grep and sed:
alias findsrc='find . -name "*.[ch]" -or -name "*.asm" -or -name "*.inc"'
alias grepsrc='findsrc | xargs grep -n --color '
alias sedsrc='findsrc | xargs sed '
Then I use
grepsrc <pattern> to search my pattern
(no solution found yet)
sedsrc -i 's/<pattern>/replace/g'
Unfortunately this solution does not satisfy me. The first issue is that sed touches all the files even if nothing changes. Also, the need to use aliases does not look very clean to me.
Ideally I would like have a workflow similar to this one:
Register a new context:
$ fetch register 'mysrcs' --recurse *.h *.c *.asm *.inc
Context list:
$ fetch context
1. mysrcs --recurse *.h *.c *.asm *.inc
Extracted from ~/.fetchrc
Find something:
$ fetch files mysrcs /0x[a-f0-9]{3}/
./foo.c:235 Yeah 0x245
./bar.h:2 Oh yeah 0x2ac hex
Test a replacement:
$ fetch test mysrcs /0x[a-f0-9]{3}/0xabc/
./foo.c:235 Yeah 0xabc
./bar.h:2 Oh yeah 0xabc hex
Apply the replacement:
$ fetch subst --backup mysrcs /0x[a-f0-9]{3}/0xabc/
./foo.c:235 Yeah 0xabc
./bar.h:2 Oh yeah 0xabc hex
Backup number: 242
Restore in case of mistake:
$ fetch restore 242
This kind of tool looks pretty standard to me. Everybody needs to search and replace. What alternative can I use that is standard on Linux?
#!/bin/ksh
# Call the script with the two pattern values (search, then replace) as arguments,
# assuming the two patterns are sed-compliant regexes
SearchStr="$1"
ReplaceStr="$2"
# It starts the search from the current folder and takes any file;
# if more filtering is needed, add a find in front with a pipe
grep -l -r "$SearchStr" . | while read ThisFile
do
sed -i -e "s/${SearchStr}/${ReplaceStr}/g" "${ThisFile}"
done
This should be a base script to adapt to your needs.
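Its effect can be checked on two throwaway files: only the file containing the pattern is rewritten, the other is left alone (GNU sed assumed for -i):

```shell
dir=$(mktemp -d)
cd "$dir"
printf 'hello foo\n' > a.txt
printf 'nothing here\n' > b.txt

# grep -l lists only the files that match, so sed -i never touches b.txt.
grep -l -r 'foo' . | while read -r ThisFile
do
    sed -i -e 's/foo/bar/g' "$ThisFile"
done

changed=$(cat a.txt)
untouched=$(cat b.txt)
```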
I often have to perform such maintenance tasks. I use a mix of find, grep, sed, and awk.
And instead of aliases, I use functions.
For example:
# i. and ii.
function grepsrc {
find . \( -name "*.[ch]" -or -name "*.asm" -or -name "*.inc" \) -exec grep -Hn "$1" {} +
}
# iii.
function sedsrc {
grepsrc "$1" | awk -F: '{print $1}' | uniq | while read f; do
sed -i "s/$1/$2/g" "$f"
done
}
Usage example:
sedsrc "foo[bB]ar*" "polop"
for F in $(grep -Rl <pattern>) ; do sed 's/search/replace/' "$F" | sponge "$F" ; done
grep with the -l argument just lists files that match
We then iterate over just those matching files, running each through sed
We use the sponge program from the moreutils package to write the processed stream back to the same file
This is simple and requires no additional shell functions or complex scripts.
If you want to make it safe as well... check the folder into a Git repository. That's what version control is for.
Yes, there is a tool doing exactly what you are looking for: Git. Why do you want to manage backups of your files in case of mistakes when specialized tools can do that job for you?
You split your request into 3 subquestions:
How to quickly search in a subset of my files?
How to apply a substitution temporarily, then go back to the original state?
How to substitute in your subset of files?
We first need to do some setup in your workspace. You need to init a Git repository, then add all your files to it:
$ cd my_project
$ git init
$ git add **/*.h **/*.c **/*.inc
$ git commit -m "My initial state"
Now, you can quickly get the list of your files with:
$ git ls-files
To do a replacement, you can use either sed, perl or awk. Here is the example using sed:
$ git ls-files | xargs sed -i -e 's/search/replace/'
If you are not happy with this change, you can roll-back anytime with:
$ git checkout -- .
This allows you to test your change and step-back anytime you want to.
Now, we have not simplified the commands yet, so I suggest adding an alias to your Git configuration file, usually located at ~/.gitconfig. Add this:
[alias]
sed = ! git grep -z --full-name -l '.' | xargs -0 sed -i -e
So now you can just type:
$ git sed s/a/b/
It's magic...
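A quick check of the alias in a scratch repository (file name made up; git init alone is enough, since git grep searches tracked files):

```shell
dir=$(mktemp -d)
cd "$dir"
git init -q
printf 'aaa\n' > f.txt
git add f.txt

# Register the alias locally instead of editing ~/.gitconfig:
git config alias.sed '! git grep -z --full-name -l . | xargs -0 sed -i -e'

# The argument is appended to the shell command, landing after -e:
git sed 's/a/b/g'
result=$(cat f.txt)
```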