Regular expression to find active console.log

I am trying to build a regular expression that finds all active console.log() calls.
I don't want to match the inactive ones (those commented out with //).
In my sample, I want to match numbers 1, 4, 5, 6 and 9.
console.log('1');
// console.log('2');
// console.log("3");
console.log("4");
console.log('5');
console.log('6');
// console.log('7');
//console.log('8');
console.log("9");
I came up with this pattern:
^(?!\s*\/\/)console\.log\(
But it matches only if console.log is at the very beginning of a line, even though I specified \s* to match any amount of whitespace.
https://regex101.com/r/f4wYnG/1
What is wrong with my regular expression?

How about:
^\s*console\.log\(
This works only if nothing but whitespace precedes console.log on the line.
Or, keeping the negative lookahead:
^(?!.*\/\/).*console\.log\(
Both will find the lines you mentioned. I think the second one is more accurate.
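As for why the original pattern fails: a lookahead is zero-width, so (?!\s*\/\/) only inspects what follows the start of the line and consumes nothing. The \s* inside it never moves the match position, so console must still begin at column 0. A minimal sketch of the difference, assuming GNU grep built with PCRE support (-P); the sample lines, their indentation, and the function names are made up for illustration:

```shell
# Hypothetical sample: two indented lines added to show the whitespace case.
sample() {
  printf '%s\n' \
    "console.log('1');" \
    "// console.log('2');" \
    "  console.log(\"4\");" \
    "  // console.log('7');" \
    "//console.log('8');"
}

# Original pattern: after the lookahead, "console" must start at column 0.
original() { sample | grep -P '^(?!\s*//)console\.log\('; }

# Adjusted pattern: consume the leading whitespace once the lookahead has passed.
fixed() { sample | grep -P '^(?!\s*//)\s*console\.log\('; }

original   # prints only the unindented console.log('1') line
fixed      # also prints the indented console.log("4") line
```

The only change is the extra \s* after the lookahead, which does the consuming that the lookahead cannot.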

With grep:
$ grep -v '//' file
console.log('1');
console.log("4");
console.log('5');
console.log('6');
console.log("9");
Then, to match numbers:
$ grep -v '//' file | grep -oE '[0-9]+'
1
4
5
6
9
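One caveat with grep -v '//': it drops every line that contains // anywhere, including an active call whose argument happens to contain a URL. A hedged sketch that treats // as a comment only when it is the first non-blank text on the line; the sample lines and the function names are assumptions for illustration:

```shell
# Hypothetical input; the URL line is added to show the caveat.
sample() {
  printf '%s\n' \
    "console.log('1');" \
    "// console.log('2');" \
    "console.log('http://example.com');" \
    "  //console.log('8');"
}

# Drop lines whose first non-blank characters are //, then keep console.log calls.
active_logs() {
  sample | grep -vE '^[[:space:]]*//' | grep -F 'console.log('
}

active_logs
```

This keeps the URL line that a plain grep -v '//' would discard. Trailing comments and block comments are still not handled.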

/^[/]+|^\s+[/]+/g
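The regex above matches one or more slashes at the start of a line, optionally preceded by whitespace, so it finds '//' comments.
^\s*\w+
The regex above matches a line that starts with optional whitespace followed by word characters.
I hope this helps.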

Related

Unable to match multiple digits in regex

I am simply trying to print the 5- or 6-digit number present in each line.
cat file.txt
Random_something xyz ...64763
Random2 Some String abc-778986
Something something 676347
Random string without numbers
cat file.txt | sed 's/^.*\([0-9]\{5,6\}\+\).*$/\1/'
Current Output
64763
78986
76347
Random string without numbers
Expected Output
64763
778986
676347
The regex doesn't seem to work as intended with 6-digit numbers. It skips the first digit of the 6-digit number for some reason, and it prints the last line, which I don't need because it doesn't contain any 5- or 6-digit number.
grep is a better fit for this, with the -o option that prints only the matched string:
grep -Eo '[0-9]{5,6}' file
64763
778986
676347
-E is for enabling extended regex mode.
If you really want sed, this should work:
sed -En 's/(^|.*[^0-9])([0-9]{5,6}).*/\2/p' file
64763
778986
676347
Details:
-n: Suppress normal output
(^|.*[^0-9]): Match the start of the line, or any text that ends with a non-digit
([0-9]{5,6}): Match 5 or 6 digits in capture group #2
.*: Match the remaining text
\2: The replacement, which puts back only the digits captured in group #2
/p: Print the line if a substitution was made
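To see why the original command drops a digit: the leading .* is greedy and takes as much as it can while still leaving five digits for the group, so the first digit of a six-digit number ends up inside .*. A small sketch contrasting the two behaviours on lines from the question (the function names are illustrative, and the odd \+ after the interval is left out of the greedy version):

```shell
sample() {
  printf '%s\n' \
    "Random2 Some String abc-778986" \
    "Random string without numbers"
}

# Greedy prefix: .* also swallows the first digit, and lines without a match
# are printed unchanged because -n is not used.
greedy() { sample | sed 's/^.*\([0-9]\{5,6\}\).*$/\1/'; }

# Prefix must be empty or end in a non-digit; -n plus /p prints only on a match.
anchored() { sample | sed -En 's/(^|.*[^0-9])([0-9]{5,6}).*/\2/p'; }

greedy
anchored
```

Forcing the prefix to end in a non-digit is what stops it from eating into the number.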
With awk, you could try the following. It uses awk's match function with a regex that matches 5 to 6 digits in each line; if a match is found, it prints the matched part.
awk 'match($0,/[0-9]{5,6}/){print substr($0,RSTART,RLENGTH)}' Input_file

Highlight all keys that look like '&name=' in a text with grep console [duplicate]

I want to grep the shortest match and the pattern should be something like:
<car ... model=BMW ...>
...
...
...
</car>
... means any character and the input is multiple lines.
You're looking for a non-greedy (or lazy) match. To get a non-greedy match in regular expressions you need to use the modifier ? after the quantifier. For example you can change .* to .*?.
By default grep doesn't support non-greedy modifiers, but you can use grep -P to use the Perl syntax.
Actually, .*? only works with Perl-style regular expressions. I am not sure what the equivalent extended regexp syntax would be. Fortunately, you can use Perl syntax with grep: grep -P would work, but grep -E, which is the same as egrep, would not (it would be greedy).
See also: http://blog.vinceliu.com/2008/02/non-greedy-regular-expression-matching.html
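A quick way to see the difference, assuming GNU grep built with PCRE support for -P; the sample string and the function names are made up for illustration:

```shell
line='say "one" and "two"'

# ERE has no lazy quantifier: .* runs to the last quote on the line.
greedy() { printf '%s\n' "$line" | grep -oE '".*"'; }

# PCRE lazy quantifier: .*? stops at the nearest closing quote.
lazy() { printf '%s\n' "$line" | grep -oP '".*?"'; }

greedy   # one match spanning both quoted words
lazy     # two matches, one per quoted word
```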
grep
For non-greedy match in grep you could use a negated character class. In other words, try to avoid wildcards.
For example, to fetch all links to jpeg files from the page content, you'd use:
grep -o '"[^" ]\+\.jpg"'
To deal with multiple lines, pipe the input through xargs first. For performance, use ripgrep.
My grep that works after trying out stuff in this thread:
echo "hi how are you " | grep -shoP ".*? "
Just make sure you append a space to each one of your lines
(Mine was a line by line search to spit out words)
Sorry I am 9 years late, but this might work for the viewers in 2020.
So suppose you have a line like "Hello my name is Jello".
Now you want to find the words that start with 'H' and end with 'o', with any number of characters in between. And we don't want whole lines, just the words. For that we can use the expression (with -o so that only the matched part is printed):
grep -o "H[^ ]*o" file
This returns all such words. It works because [^ ]* allows any character except a space in between, so a match cannot span multiple words on the same line.
Now you can replace the space character with any other character you want.
Suppose the initial line was "Hello-my-name-is-Jello", then you can get the words using the expression:
grep -o "H[^-]*o" file
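A small runnable sketch of the negated character class approach, assuming grep's -o option is used to print only the matched text; the sample lines and function names are made up for illustration:

```shell
# -o prints only the matched text; without it grep prints the whole line.
space_words() { printf '%s\n' "Hello my name is Hugo" | grep -o 'H[^ ]*o'; }
dash_words()  { printf '%s\n' "Hello-my-name-is-Hugo" | grep -o 'H[^-]*o'; }

space_words   # Hello and Hugo, one per line
dash_words    # Hello and Hugo, one per line
```

Note that the match is still greedy within a single word; the negated class only prevents it from running past the delimiter.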
The short answer is to use the following regular expression:
(?s)<car .*? model=BMW .*?>.*?</car>
(?s) - lets . match newlines too, so the match can span multiple lines
.*? - matches any character, any number of times, in a lazy way (minimal match)
A (little) more complicated answer is:
(?s)<([a-z\-_0-9]+?) .*? model=BMW .*?>.*?</\1>
This makes it possible to match both car1 and car2 in the following text:
<car1 ... model=BMW ...>
...
...
...
</car1>
<car2 ... model=BMW ...>
...
...
...
</car2>
(..) represents a capturing group
\1 in this context matches the same text as most recently matched by capturing group number 1
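Since grep normally works one line at a time, the (?s) pattern needs the whole input as a single record. One way to try it, assuming GNU grep with PCRE support (-P) and its -z option, which treats the input as NUL-separated; the sample contents and the function names are invented for illustration:

```shell
sample() {
  printf '%s\n' \
    '<car id=1 model=BMW year=2020>' \
    '  <owner>Ann</owner>' \
    '</car>' \
    '<car id=2 model=Audi year=2019>' \
    '  <owner>Bob</owner>' \
    '</car>'
}

# -z: read the whole input as one record; -o: print only the match.
# tr turns the NUL that -z writes after each match into a newline.
bmw_cars() {
  sample | grep -Pzo '(?s)<car .*? model=BMW .*?>.*?</car>' | tr '\0' '\n'
}

bmw_cars
```

One caveat: if a non-BMW element comes before a BMW one, the lazy .*? inside the opening tag can still stretch from the earlier <car to the later model=BMW. Replacing those .*? with [^>]* keeps the match inside a single tag.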
I know it's a bit of a dead post, but I just noticed that this works. It removed both clean-up and cleanup from my output.
> grep -v -e 'clean\-\?up'
> grep --version
grep (GNU grep) 2.20

How can this regex let a line like "0.0083" pass? grep -ioE '([0-9]{1,3}.){3}[0-9]{1,3}'

I am trying to make a bash script for an active scan of a network. It seems I don't have the hang of regex. The code looks like this:
#! /bin/bash
cd /home/pi/int_lib
for word in $(nmap -sn 192.168.1.0/24 | grep -ioE '([0-9]{1,3}.){3}[0-9]{1,3}' |
grep -v -)
do
mac=$(arp $word | grep -ioE '([A-Fa-f0-9]{2}:){5}[A-Fa-f0-9]{2}')
echo $word: $mac
done
I just want to know how it is possible that a line like "0.0083" can pass the first regex. nmap gives the response time for each host, and in exactly one case the mentioned line passes the filter. Why?
The regex
([0-9]{1,3}.){3}[0-9]{1,3}
matches 1-3 digits followed by any character, 3 times, followed by 1-3 digits. That sums up to at least 7 characters/digits. Illustrated with n as digits, it can look like this
n.n.n.n
where . is any character, up to its longest form
nnn.nnn.nnn.nnn
Since 0.0083 is only 6 characters long, it can never match that regex.
But simply adding a digit, e.g. 0.00831, makes it match.
Finally, I believe what you're after is the same pattern with the . escaped, so that it matches only a literal dot:
([0-9]{1,3}\.){3}[0-9]{1,3}
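The claims above can be checked directly with grep -oE. A minimal sketch; the variable and function names are illustrative:

```shell
loose='([0-9]{1,3}.){3}[0-9]{1,3}'     # . matches any character
strict='([0-9]{1,3}\.){3}[0-9]{1,3}'   # \. matches only a literal dot

# Print the part of the string ($2) that matches the pattern ($1), if any.
check() { printf '%s\n' "$2" | grep -oE "$1"; }

check "$loose"  "0.00831"                      # matches the whole string
check "$strict" "0.00831" || echo "no match"   # rejected once the dot is escaped
check "$strict" "192.168.1.10"                 # a real address still matches
```

The escaped version still accepts out-of-range octets such as 999, so it filters by shape only, not by validity.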

Get string between two characters occurring many times in a line

I am trying to extract a single string out of a line that has many segments in key-value order, but I can't get it right, as my pattern matches much more than I want.
This is my example line:
|SEGA~1~MAGIC~DESCRIPTION~~~M~TEST~|SEGB~34~12.11.2011~3~M~O~|SEGC~HELLO~WORLD~|
This line is a kind of concatenation of many segments into one line. Now I want to extract the string at index 2 in the segment starting with SEGA.
So what I do is grep for this:
egrep -o 'SEGA(.*?)\~\|'
But it gives me the whole line; only sometimes does it give me just the segment I am looking for. With the match I would then split that segment on the ~ character and take the third field.
Since I use .*? with the question mark, I expected egrep to match only the content between SEGA and the very first occurrence of ~|, which is right before SEGB, and not the one at the end of SEGC or SEGB.
How can I tell grep to search for SEGA and give the whole content starting right after SEGA until THE VERY FIRST occurrence of ~|?
You can use the -P(--perl-regexp) option in grep:
grep -oP '(?<=SEGA).*?(?=~\|)' file
If you want to include the trailing ~|, please remove the lookahead (?=...).
I think .*? (lazy) does not exist in egrep.
I'd suggest you break the line into lines on | and then grep from those:
$ echo "|SEGA~1~MAGIC~DESCRIPTION~~~M~TEST~|SEGB~34~12.11.2011~3~M~O~|SEGC~HELLO~WORLD~|" | sed -e 's/|/\n/g' | grep ^SEGA
SEGA~1~MAGIC~DESCRIPTION~~~M~TEST~
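Building on the split-into-segments idea, the field itself can be pulled out in the same pipeline. A sketch using tr and awk instead of a lazy regex; it assumes the wanted value is the third ~-separated field of the SEGA segment, as described in the question, and the function name is illustrative:

```shell
line='|SEGA~1~MAGIC~DESCRIPTION~~~M~TEST~|SEGB~34~12.11.2011~3~M~O~|SEGC~HELLO~WORLD~|'

# Put each |-delimited segment on its own line, then split on ~ and print
# the third field of the segment whose first field is exactly SEGA.
sega_field() {
  printf '%s\n' "$line" | tr '|' '\n' | awk -F'~' '$1 == "SEGA" { print $3 }'
}

sega_field   # prints MAGIC
```

Comparing the first field for equality, rather than grepping for ^SEGA, avoids picking up a segment whose name merely starts with SEGA.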
