Regex to find empty functions in a JS file using grep

I would like to find empty functions in several files, like:
fLocalEvent(){
}
I would like to find them using grep.
I tried:
grep -Rlz "function fLocalEventoBotoes.*[\n\s]*?}" sob/SOB910C.js
grep -Rlz "fLocalEventoBotoes\((?:(?!\)\s*\{).)*\)\s*\{\s*\}" sob/SOB910C.js
The second command returns: -bash: !\: event not found

You are better off using something like Perl for this:
echo '
fLegit1(x){ something }
empty1(){}
legit2(){ something }
fLocalEvent(){
}' | perl -0777 -lnE 'say $1 while (/(\w+\(\s*\)\s*\{\s*\})/g)'
Prints:
empty1(){}
fLocalEvent(){
}
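With GNU grep, you can use the -Pzo switches to apply the same regex: -P enables Perl-compatible regexes, -z treats the whole input as a single NUL-terminated record so the pattern can span line endings, and -o prints only the matched section: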
echo '
fLegit1(x){ something }
empty1(){}
legit2(){ something }
fLocalEvent(){
}' | grep -Pzo '\w+\(\s*\)\s*{\s*}\s*'
# same output...

This grep should work:
grep -Poz '\w+\(\){\s*}\n?' data
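The "event not found" error in the question comes from bash history expansion: in an interactive shell, ! is still special inside double quotes, while single quotes suppress it. The sketch below combines that with the -P and -z switches from the answers above. It assumes GNU grep built with PCRE support, and the function name find_empty_functions is made up for illustration:

```shell
# Hypothetical helper: list files under the given paths that contain
# a parameterless function with an empty body, e.g. "fLocalEvent(){ }".
find_empty_functions() {
    # -R recurse, -l print only file names, -P use PCRE,
    # -z read each file as one record so \s can cross newlines.
    # Single quotes keep bash from interpreting "!" or "$".
    grep -RlPz '\w+\(\s*\)\s*\{\s*\}' "$@"
}

# Usage: find_empty_functions sob/
```

The pattern is deliberately simple: it also matches an empty method call followed by an empty block, so treat the file list as candidates to review rather than a definitive answer.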

Related

How to grep for a string with punctuation characters included?

I'm new to regular expressions and am trying to grep for the string on the right side of the following assignment in a source tree:
some_var = %1$s %2$s ID
I have tried:
grep -ri '[[:punct:]]1[[:punct:]]s [[:punct:]]2[[:punct:]]s ID' .
grep -ri "'[[:punct:]]1[[:punct:]]s'\|'[[:punct:]]2[[:punct:]]s ID'" .
I ran:
grep -ri some_var .
This returned some_var, but I'm having trouble figuring out how to return the other side of the assignment operator.
I've read through the GNU grep documentation on character classes and bracket expressions, but it's still not clear to me.
Use awk:
awk -F'[[:space:]]*=[[:space:]]*' '$1 == "some_var" { print $2 }'
or sed:
sed -n 's/^some_var[[:space:]]*=[[:space:]]*\(.*\)/\1/p'
If you want to use grep you need to use GNU grep with perl regexes:
grep -oP '^some_var\s*=\s*\K.*'
But as I said, that works only with GNU grep and is not standard.
To remove the left side with GNU sed (single quotes around the input keep the shell from expanding $s):
$ sed 's/^some_var\s*=\s*//' <<< 'some_var = %1$s %2$s ID'
%1$s %2$s ID
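Two details matter when the text contains punctuation such as % and $: the shell expands $s inside double quotes, and grep treats some punctuation as regex syntax. A minimal sketch, assuming the assignment sits at the start of a line; rhs_of is a hypothetical helper name, and its argument is assumed to contain no regex metacharacters:

```shell
# Hypothetical helper: print the right-hand side of "NAME = value" lines.
rhs_of() {
    # Double quotes are needed here so that $1 (the variable name) expands;
    # the pattern itself contains no other "$".
    sed -n "s/^$1[[:space:]]*=[[:space:]]*//p"
}

# Single quotes keep %1$s literal; with double quotes the shell would expand $s.
printf '%s\n' 'some_var = %1$s %2$s ID' | rhs_of some_var

# To search for the literal string itself, -F turns off regex interpretation.
printf '%s\n' 'some_var = %1$s %2$s ID' | grep -F '%1$s %2$s ID'
```

With -F there is no need for [[:punct:]] classes at all, since every character in the pattern is matched literally.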

Find regular expression in a file matching a given value

I have some basic knowledge on using regular expressions with grep (bash).
But I want to use regular expressions the other way around.
For example I have a file containing the following entries:
line_one=[0-3]
line_two=[4-6]
line_three=[7-9]
Now I want to use bash to figure out to which line a particular number matches.
For example:
grep 8 file
should return:
line_three=[7-9]
Note: I am aware that the example of "grep 8 file" doesn't make sense, but I hope it helps to understand what I am trying to achieve.
Thanks for your help,
Marcel
As others have pointed out, awk is the right tool for this:
awk -F'=' '8~$2{print $0;}' file
... and if you want this tool to feel more like grep, a quick bash wrapper:
#!/bin/bash
awk -F'=' -v seek_value="$1" 'seek_value~$2{print $0;}' "$2"
Which would run like:
./not_exactly_grep.sh 8 file
line_three=[7-9]
My first impression is that this is not a task for grep, maybe for awk.
Trying to do things with grep I only see this:
for line in $(cat file); do echo 8 | grep "${line#*=}" && echo "${line%=*}" ; done
Using while for file reading (following comments):
while IFS= read -r line; do echo 8 | grep "${line#*=}" && echo "${line%=*}" ; done < file
This can be done in native bash using the syntax [[ $value =~ $regex ]] to test:
find_regex_matching() {
  local value=$1
  while IFS= read -r line; do        # read from input line-by-line
    [[ $line = *=* ]] || continue    # skip lines not containing an =
    regex=${line#*=}                 # prune everything before the = for the regex
    if [[ $value =~ $regex ]]; then  # test whether we match...
      printf '%s\n' "$line"          # ...and print if we do.
    fi
  done
}
...used as:
find_regex_matching 8 <file
...or, to test it with your sample input inline:
find_regex_matching 8 <<'EOF'
line_one=[0-3]
line_two=[4-6]
line_three=[7-9]
EOF
...which properly emits:
line_three=[7-9]
You could replace printf '%s\n' "$line" with printf '%s\n' "${line%%=*}" to print only the key (contents before the =), if so inclined. See the bash-hackers page on parameter expansion for a rundown on the syntax involved.
This is not built-in functionality of grep, but it's easy to do with awk, with a change in syntax:
/[0-3]/ { print "line one" }
/[4-6]/ { print "line two" }
/[7-9]/ { print "line three" }
If you really need to, you could programmatically change your input file to this syntax, if it doesn't contain any characters that need escaping (mainly / in the regex or " in the string):
sed -e 's#\(.*\)=\(.*\)#/\2/ { print "\1" }#'
As I understand it, you are looking for a range that includes some value.
You can do this in gawk (the three-argument form of match() is a gawk extension):
$ cat /tmp/file
line_one=[0-3]
line_two=[4-6]
line_three=[7-9]
$ awk -v n=8 'match($0, /([0-9]+)-([0-9]+)/, a){ if (a[1]<=n && a[2]>=n) print $0 }' /tmp/file
line_three=[7-9]
Since the digits are being treated as numbers (vs a regex) it supports larger ranges:
$ cat /tmp/file
line_one=[0-3]
line_two=[4-6]
line_three=[75-95]
line_four=[55-105]
$ awk -v n=92 'match($0, /([0-9]+)-([0-9]+)/, a){ if (a[1]<=n && a[2]>=n) print $0 }' /tmp/file
line_three=[75-95]
line_four=[55-105]
If you are just looking to interpret the right hand side of the = as a regex, you can do:
$ awk -F= -v tgt=8 'tgt~$2' /tmp/file
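One caveat when the right-hand side is used as a regex: the match is unanchored, so a value such as 18 also matches [7-9] (and [0-3]). The sketch below anchors the stored regex so it must cover the whole value. It assumes the regex itself contains no = character; match_whole is a hypothetical name:

```shell
# Hypothetical helper: match_whole VALUE FILE
# Prints the key=regex lines whose regex matches the entire value.
match_whole() {
    # Inside the single quotes, $2 is awk's second field (the regex);
    # the trailing "$2" is the shell's second argument (the file).
    awk -F= -v tgt="$1" 'tgt ~ ("^(" $2 ")$")' "$2"
}

# Usage: match_whole 8 /tmp/file
```

Wrapping the field in ^( ... )$ keeps alternations such as a|b anchored as a whole rather than anchoring only the first and last branch.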
You would like to do something like
grep -Ef <(cut -d= -f2 file) <(echo 8)
This will grep what you want but will not show which line matched.
With sed you can print a message instead:
echo "8" | sed -n '/[7-9]/ s/.*/Found it in line_three/p'
Now you would like to transfer your regexp file into such commands:
sed 's#\(.*\)=\(.*\)#/\2/ s/.*/Found at \1/p#' file
Store these commands in a virtual command file and you will have
echo "8" | sed -nf <(sed 's#\(.*\)=\(.*\)#/\2/ s/.*/Found at \1/p#' file)

Output the first regex match against STDIN

In bash, I have the following:
#!/bin/bash
curl $1 | tac | tac | perl -e '/(\d\d(?=:\d\d))/g; print $1' > $2
All I want is to take the first match from the output of curl and print it to the output file. I run the script with ./scriptname url outputfile.txt, but nothing is printed. My regex is valid on http://regexr.com, so I'm sure it's something I don't know about Perl. What am I doing wrong? Thanks.
You can use the following:
#!/bin/bash
curl "$1" | perl -nle'print for /\d\d(?=:\d\d)/g' > "$2"
If you change the match to /script/g, you can see it working with something like
./scriptname http://www.ucsd.edu outputfile.txt
I suppose this means perl -ne is reading the input line by line. Is there a simple way to have perl return only the first result?
Consider using sed:
... | sed '/^.*\([[:digit:]]\{2\}\):[[:digit:]]\{2\}.*/{s//\1/;q};d'
In Perl, that would be:
... | perl -nle 'if (s/^.*(\d\d):\d\d.*/$1/) { print; exit }'
And with GNU grep built with PCRE support (the -P / --perl-regexp option):
... | grep -m1 -Po '\d\d(?=:\d\d)'
There are a few problems:
You never read from STDIN.
You don't stop trying to match after the first match.
You print unconditionally.
If you want all matches (as per your original question):
perl -nle'print for /\d\d(?=:\d\d)/g'
If you want the first match (as per your comment):
perl -nle'if (/\d\d(?=:\d\d)/) { print $&; exit }'
perl -nle'if (/(\d\d):\d\d/) { print $1; exit }'
grep -Pom1 '\d\d(?=:\d\d)'
Notes:
-n wraps the code with a loop that reads from STDIN.
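Note that grep -m1 stops after the first matching line, while -o still prints every match found on that line. A minimal sketch that returns only the very first match, using plain ERE so it does not depend on -P; first_hour is a hypothetical name:

```shell
# Hypothetical helper: print the two digits before the first "dd:dd" on stdin.
first_hour() {
    # -o prints each dd:dd match on its own line, head keeps the first,
    # cut drops the ":dd" part that the lookahead excluded in the PCRE version.
    grep -oE '[0-9]{2}:[0-9]{2}' | head -n 1 | cut -d: -f1
}

# Usage: curl "$1" | first_hour > "$2"
```

The design trades the lookahead for a cut step so that it also works with a grep that lacks Perl regex support.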

How to extract value from the string in bash?

I have an input string in the following format:
bugfix/ABC-12345-1-00
I want to extract "ABC-12345". Regex for that format in C# looks like this:
.*\/([A-Z]+-[0-9]+).*
How can I do that in a bash script? I've tried sed and awk but had no success because I need to extract value from the capturing group and skip the rest.
If your grep supports -P then you could use the below grep commands.
$ echo 'bugfix/ABC-12345-1-00' | grep -oP '/\K[A-Z]+-\d+'
ABC-12345
\K keeps the text matched so far out of the overall regex match.
$ echo 'bugfix/ABC-12345-1-00' | grep -oP '(?<=/)[A-Z]+-\d+'
ABC-12345
(?<=/) Positive lookbehind which asserts that the match must be preceded by a / symbol.
Through sed,
$ echo 'bugfix/ABC-12345-1-00' | sed 's~.*/\([A-Z]\+-[0-9]\+\).*~\1~'
ABC-12345
echo "bugfix/ABC-12345-1-00"| perl -ane '/.*?([A-Z]+\-[0-9]+).*/;print $1."\n"'
You could try something like:
echo "bugfix/ABC-12345-1-00" | grep -Eo '[A-Z]+-[0-9]+'
OUTPUT:
ABC-12345
If you would rather not use a regex, this awk prints everything after the last / (note that it keeps the trailing -1-00):
echo "bugfix/ABC-12345-1-00" | awk -F\/ '{print $NF}'
ABC-12345-1-00
Or just this:
awk -F\/ '$0=$NF'
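Since the question asks about capture groups in a bash script, the same extraction can also be sketched with bash's built-in =~ operator, which stores groups in the BASH_REMATCH array. This assumes bash (not plain sh); extract_key is a hypothetical function name:

```shell
# Hypothetical helper: print the "ABC-12345" style key that follows a slash.
extract_key() {
    # Keeping the pattern in a variable avoids quoting problems with =~.
    local re='/([A-Z]+-[0-9]+)'
    if [[ $1 =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"   # first capture group
    else
        return 1                              # no key found
    fi
}

extract_key 'bugfix/ABC-12345-1-00'
```

No external process is started, which can matter when the function runs in a loop over many branch names.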

Simple Grep Mismatch problem

I am using Ubuntu 10.10 and using Grep to process some HTML files.
Here is the HTML snippet:
<a href="video.php?video=one-hd.mov"><img src="/1.jpg"><a href="video.php?video=normal.mov"><img src="/2.jpg"><a href="video.php?video=another-hd.mov">
I would like to extract one-hd.mov and another-hd.mov but ignore normal.mov.
Here is my code:
example='<a href="video.php?video=one-hd.mov"><img src="/1.jpg"><a href="video.php?video=normal.mov"><img src="/2.jpg"><a href="video.php?video=another-hd.mov">'
echo $example | grep -Po '(?<=video.php\?video=).*?(?=-hd.mov">)'
The result is:
one
normal.mov"><img src="/2.jpg"><a href="video.php?video=another
But I want
one
another
There is a mismatch there.
Is this because of the so-called Greedy Regular Expression?
I am using grep, but any command-line tool (sed, etc.) that solves this problem is welcome.
Thanks a lot.
You want to use Perl regexes with grep, so why not use Perl directly?
echo "$example" | perl -nle 'm/.*?video.php\?video=([^"]+)">.*video.php\?video=([^"]+)".*/; print "=$1=$2="'
will print
=one-hd.mov=another-hd.mov=
Here is a solution using xmlstarlet:
$ example='<a href="video.php?video=one-hd.mov"><img src="/1.jpg"><a href="video.php?video=normal.mov"><img src="/2.jpg"><a href="video.php?video=another-hd.mov">'
$ echo $example | xmlstarlet fo -R 2>/dev/null | xmlstarlet sel -t -m "//*[substring(@href, string-length(@href) - 6, 7) = '-hd.mov']" -v 'substring(@href,17, string-length(@href) - 17 - 3)' -n
one-hd
another-hd
$
Solution using awk (saved as h.awk):
{
    for(i=1;i<NF;i++) {
        if ($i ~ /mov/) {
            if ($i !~ /normal/){
                sub(/^.*=/, "", $i)
                print $i
            }
        }
    }
}
outputs:
$ awk -F'"' -f h.awk html
one-hd.mov
another-hd.mov
But I strongly advise you to use an HTML parser for this instead, something like BeautifulSoup.
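As for the original grep attempt: the mismatch is not caused by greediness. A lazy .*? still starts at the leftmost position where the lookbehind holds and keeps extending, across quotes and tags, until the lookahead succeeds. Restricting the middle part to non-quote characters keeps each match inside one attribute value. A minimal sketch, assuming GNU grep with -P support; hd_names is a hypothetical name:

```shell
# Hypothetical helper: print the base names of "-hd.mov" videos from HTML on stdin.
hd_names() {
    # [^"]* cannot cross the closing quote of the href attribute,
    # so normal.mov can no longer be swallowed into a later -hd.mov match.
    grep -Po '(?<=video\.php\?video=)[^"]*(?=-hd\.mov")'
}

example='<a href="video.php?video=one-hd.mov"><img src="/1.jpg"><a href="video.php?video=normal.mov"><img src="/2.jpg"><a href="video.php?video=another-hd.mov">'
printf '%s\n' "$example" | hd_names
```

For the sample input this prints one and another on separate lines. It remains a text-level shortcut, so the advice to use a real HTML parser for anything less regular still applies.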