Escaping # in BRE regex [duplicate] - regex

This question already has answers here:
Escaping the exclamation point in grep?
(2 answers)
Closed 5 years ago.
I want to find files that are scripts and I need to get from these files the list of all interpreters like Bash, sh, etc.
To find that, I use:
grep "#!/bin/*" ./*
But it displays that:
-bash: !/bin/*": event not found
I assume I need to escape the # symbol somehow, but I could not find anything about escaping it in the BRE documentation.
And how can I find files that contain this pattern only on the first line of the file?

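The # is not the problem; you need to escape the !. In Bash, ! starts a history expansion and must be followed by an event designator, such as ! (so !! reruns the previous command) or a number giving the index of the command in the history. (Thanks to Aaron for the correction.)
You may also want to change * into .*, since in a regex * only repeats the preceding character:
grep "#\!/bin/.*"
Note that inside double quotes Bash keeps the backslash, so grep receives #\!/bin/.* as the pattern.
If you don't want to escape the !, use single quotes:
grep '#!/bin' ....
You can also disable regex matching entirely with -F, which treats the pattern as a fixed string.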
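To make the quoting difference concrete, here is a small sketch that prints the exact pattern string grep would receive under each quoting style. The variable names are only for illustration, and printf stands in for grep so the argument can be inspected.

```shell
# Show the exact pattern string that would be passed to grep.
double_quoted=$(printf '%s' "#\!/bin")   # the backslash survives inside double quotes
single_quoted=$(printf '%s' '#!/bin')    # single quotes: nothing to escape
printf 'double quotes: %s\n' "$double_quoted"
printf 'single quotes: %s\n' "$single_quoted"

# With -F the pattern is a fixed string, so no character is a regex operator.
fixed_match=$(printf '#!/bin/sh\n' | grep -F '#!/bin')
printf 'fixed-string match: %s\n' "$fixed_match"
```

Single quotes are usually the simplest choice here, because the pattern reaches grep exactly as typed.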

Could you please try the following find command and let me know if it helps?
find . -type f -exec grep -l '#!/bin' {} \+

Escape the ! with \:
grep "#\!/bin/*" ./*
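None of the commands above restricts the match to the first line, which the question also asked about. One possible sketch is to read only line 1 of each file and compare it against a shell pattern. The helper name list_interpreters and the sample files are made up for this example.

```shell
# Hypothetical sandbox: two scripts and one text file that mentions a shebang on line 2.
demo_dir=$(mktemp -d)
printf '#!/bin/bash\necho hi\n' > "$demo_dir/a.sh"
printf '#!/bin/sh\necho hi\n' > "$demo_dir/b.sh"
printf 'plain text\n#!/bin/bash appears on line 2\n' > "$demo_dir/notes.txt"

# list_interpreters is an illustrative helper, not a standard command.
# It prints "name: interpreter" for files whose first line starts with #!/bin/.
list_interpreters() {
  for f in "$1"/*; do
    [ -f "$f" ] || continue
    first=$(head -n 1 "$f")
    case $first in
      '#!/bin/'*) printf '%s: %s\n' "${f##*/}" "${first#??}" ;;
    esac
  done
}

interpreters=$(list_interpreters "$demo_dir")
printf '%s\n' "$interpreters"
rm -rf "$demo_dir"
```

The case pattern is a shell glob rather than a regex, so neither # nor ! needs escaping inside the single quotes.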

Related

Find command : special characters do not work as expected [duplicate]

This question already has answers here:
How to use regex with find command?
(9 answers)
Closed 5 months ago.
I am trying to search for files in a specific directory called Dico with the find command.
For example, I want to find all files ending with g: find ./Dico -name "*g" works, but find ./Dico -name ".*g" and find ./Dico -name ".*g$" do not. Can someone explain why?
Another example: find all files whose names start with a number followed by exactly 5 lowercase characters, using find ./Dico -name "\[0-9\]+[a-z]\{5\}" or find ./Dico -name "\d+[a-z]{5}". In that case +, \d+ and {n} seem to do nothing. I've tried both {5} and \{5\} (Emacs syntax), but the special characters still do not work as expected.
I am on Ubuntu 20.04. Thank you in advance.
First of all, -name does not use regex.
-name pattern
Base of file name (the path with the leading directories removed) matches shell pattern pattern.
A matching shell pattern could look like this:
find ./Dico -name '[0-9][a-z][a-z][a-z][a-z][a-z]'
To use regex, you need to instead use the -regex option (and -regextype to select the regex dialect you want).
-regex pattern
File name matches regular expression pattern. This is a match on the whole path, not a search.
I selected the regex dialect egrep:
find ./Dico -regextype egrep -regex '.*/[0-9]+[a-z]{5}'
You can do find -regextype help (or just about any invalid dialect) to get a list of the supported regex dialects.
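The difference between the two options can be checked in a throwaway directory. This is only a sketch: the file names are invented, and -regextype is assumed to be available, which is the case for GNU find (as on Ubuntu) but not for every find implementation.

```shell
# Hypothetical sandbox with four sample file names.
demo_dir=$(mktemp -d)
touch "$demo_dir/1abcde" "$demo_dir/12abcde" "$demo_dir/1abcd" "$demo_dir/xabcde"

# -name takes a shell glob matched against the base name: one digit, five letters.
glob_hits=$(find "$demo_dir" -name '[0-9][a-z][a-z][a-z][a-z][a-z]' -exec basename {} \; | LC_ALL=C sort)

# -regex must match the whole path, hence the leading .*/ (GNU find assumed).
regex_hits=$(find "$demo_dir" -regextype egrep -regex '.*/[0-9]+[a-z]{5}' -exec basename {} \; | LC_ALL=C sort)

printf 'glob:  %s\n' $glob_hits
printf 'regex: %s\n' $regex_hits
rm -rf "$demo_dir"
```

The glob finds only 1abcde, while the regex also accepts 12abcde because + allows more than one digit.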

grep utf8/unicode support/ u modifier [duplicate]

This question already has an answer here:
grep -w finds partial match in words with non-latin letters
(1 answer)
Closed 12 months ago.
I'm trying to validate VTT files against a particular format. The regex works, but UTF-8 characters are causing issues. I tried using (?u), with no luck.
The regex I'm using is:
grep -P '(?m)^(\d+:\d+[.]\d+\s*-->\s*\d+:\d+[.]\d+|\s*[\w\s]+)|^\s*$' . -r -v
The u flag allows the regex to work as expected here, https://regex101.com/r/21HW2A/1, but I can't find a way to do that in grep. Do I need to swap the \w to all allowed alphanumeric chars or can the u modifier be used in grep somehow?
The \w can be replaced with \p{L}, which does not need the u modifier for Unicode support. Note that \p{L} matches letters only, so add \p{N} if digits must be allowed as well.
Full solution:
grep -P '(?m)^(\d+:\d+[.]\d+\s*-->\s*\d+:\d+[.]\d+|\s*[\p{L}\s]+)|^\s*$' . -r -v
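Here is a rough way to try the pattern on a single file instead of a whole tree. It assumes a grep built with PCRE support (-P), and the sample file is invented and kept ASCII-only so that the result does not depend on the locale. In a UTF-8 locale the same \p{L} class should also cover non-Latin letters.

```shell
# Hypothetical sample file: header, blank line, timestamp, cue text, and one bad line.
demo_dir=$(mktemp -d)
printf 'WEBVTT\n\n00:01.000 --> 00:04.000\nHello world\n### not a cue\n' > "$demo_dir/sample.vtt"

# -v prints the lines that do NOT fit the expected format.
invalid_lines=$(grep -P -v '(?m)^(\d+:\d+[.]\d+\s*-->\s*\d+:\d+[.]\d+|\s*[\p{L}\s]+)|^\s*$' "$demo_dir/sample.vtt")
printf '%s\n' "$invalid_lines"
rm -rf "$demo_dir"
```

Only the line starting with ### is reported, since it is neither a timestamp, nor text starting with a letter or whitespace, nor blank.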

Use variable in replace regex [duplicate]

This question already has answers here:
Sed replace variable in double quotes
(3 answers)
Closed 3 years ago.
I've already looked at the following:
Escape a string for a sed replace pattern
Is it possible to escape regex metacharacters reliably with sed
Neither gives an easy-to-understand answer.
version=1.2.3
sed -i -z -e 's/"version": "[A-Za-z0-9_.-]*"/"version": "$(version)"/' package.json
I'm trying to use a variable in the replacement part of a regex substitution on a file. I don't have to use sed per se; as long as it works on macOS and Linux distributions, I'm OK with it.
So I just had to use double quotes so that the shell expands the $version variable before sed sees it, and then escape the double quotes that are part of the pattern.
sed -i -z -e "s/\"version\": \"[A-Za-z0-9_.-]*\"/\"version\": \"$version\"/" package.json
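A minimal sketch of the same substitution on a throwaway package.json follows. It writes to a temporary file instead of using -i, because GNU sed and BSD sed (macOS) disagree on how -i takes its argument. The sample file contents are invented, and the sketch assumes $version contains no / or & characters.

```shell
# Hypothetical sandbox with a tiny package.json.
demo_dir=$(mktemp -d)
printf '{\n  "name": "demo",\n  "version": "0.0.1"\n}\n' > "$demo_dir/package.json"

version=1.2.3

# Double quotes let the shell expand $version before sed runs;
# the literal double quotes in the pattern are backslash-escaped.
sed "s/\"version\": \"[A-Za-z0-9_.-]*\"/\"version\": \"$version\"/" \
  "$demo_dir/package.json" > "$demo_dir/package.json.new"
mv "$demo_dir/package.json.new" "$demo_dir/package.json"

version_line=$(grep '"version"' "$demo_dir/package.json")
printf '%s\n' "$version_line"
rm -rf "$demo_dir"
```

If the value could contain / or &, it would need escaping first, or a different delimiter for the s command.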

Regex not working in Bash

I have a regex that should match something like this:
org.package;version="[1.0.41, 1.0.51)" and "," optionally if it is not last element.
After the package name I added .* because the package could be "org.package.util.something" up to ";version".
I tried it in an online regex tool, where it works like this:
org.package(.*.*)?;version="[[0-9].[0-9].[0-9][0-9],\s[0-9].[0-9].[0-9][0-9])",?
But I don't know what I should change so that it works in Bash:
package="org.package"
sed -i "s/"$$package.*;version="\[[0-9].[0-9].[0-9][0-9],[[:space:]][0-9].[0-9].[0-9][0-9]\)",?"//g" "$file"
Replace the double quotes around the sed command with single quotes. For the expansion of $package, close the single quotes, put the variable in double quotes, and reopen the single quotes. Also add -E: in basic regex mode \) closes a group and ? is a literal character, so the pattern would not work as written.
package="org.package"
sed -E -i 's/'"$package"'.*;version="\[[0-9].[0-9].[0-9][0-9],[[:space:]][0-9].[0-9].[0-9][0-9]\)",?//g' "$file"
Before using the command with the -i option, check that the output is correct.
There is more than one problem:
$$ is replaced by Bash with its PID, which is probably not what you want.
Online regex evaluators usually use extended regex or Perl regex syntax.
sed -r (or the more portable sed -E) enables extended regex mode; for grep there are -E and -P.
You use . where you want to match literal dots. You should use \. instead, because . means "any character" in regular expressions.
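Putting those points together, a possible version looks like the sketch below. It reads one sample line from a pipe instead of editing a file in place. The sample line and the variable name package_re are made up for the example.

```shell
# Escape the dot in the package name too; package_re is an illustrative name.
package_re='org\.package'
line='Import-Package: org.package.util;version="[1.0.41, 1.0.51)",org.other'

# -E enables extended regex: \[ and \) are literal brackets, \. is a literal dot,
# and ,? makes the trailing comma optional.
cleaned=$(printf '%s\n' "$line" |
  sed -E 's/'"$package_re"'.*;version="\[[0-9]\.[0-9]\.[0-9][0-9],[[:space:]][0-9]\.[0-9]\.[0-9][0-9]\)",?//g')
printf '%s\n' "$cleaned"

# $$ expands to the shell's PID, so this is the PID followed by the word "package".
pid_demo="$$package"
printf '%s\n' "$pid_demo"
```

Because .* is greedy, a line holding several ;version= entries may lose more than intended, so it is worth checking the output on real data before adding -i.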

Use grep to find strings at the beginning of a line or after a delimiter in Git Bash for Windows

I have such file:
blue|1|red|2
green|3|blue|4
darkblue|0|yellow|3
I want to use grep to find anything containing blue| at the beginning of a line or |blue| anywhere, but not darkblue|, |darkblue| or |blueberry|.
I tried grep [^|\|]blue\| but Git Bash gives me an error:
$ grep [^|\|]blue\| *.*
grep: Unmatched [ or [^
sh.exe": |]blue|: command not found
What did I do wrong? What's the proper way to do it?
Here's a quick & dirty one:
grep -E '(^|\|)blue\|' *
Matches start of line or |, followed by blue|. The important note is that you need extended regular expressions (via egrep or the -E flag) to use the | (or) construct.
Also, note the single quotes around the regular expression.
So, in answer to the OP's "What did I do wrong?":
You forgot to put the regexp in single quotes;
You chose the wrong type of brackets to enclose the alternate expressions; and finally
You forgot to use egrep or the -E flag.
It's always easier to see other people's errors; I wish I were as quick to spot my own :-|
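For anyone who wants to check this against the sample data, here is a small sketch. It uses a temporary file with the three lines from the question plus one invented blueberry line.

```shell
# Sample data from the question, plus a hypothetical blueberry line.
demo_file=$(mktemp)
printf '%s\n' 'blue|1|red|2' 'green|3|blue|4' 'darkblue|0|yellow|3' 'red|1|blueberry|2' > "$demo_file"

# (^|\|) means "start of line or a literal |"; the single quotes keep the shell
# from treating | as a pipe.
matches=$(grep -E '(^|\|)blue\|' "$demo_file")
printf '%s\n' "$matches"
rm -f "$demo_file"
```

Only the first two lines are printed. In the darkblue line, blue is preceded by k rather than by | or the start of the line, and in the blueberry line, blue is not followed by |.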