Perl Regular expression look ahead [duplicate] - regex

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 8 years ago.
What is the use of ?= in a Perl regex? Please explain its exact meaning and give a regex example.

(?=...) is a positive lookahead, a type of zero-width assertion. It says that the match must be followed by whatever is within the parentheses, but that part is not consumed and does not become part of the match.
Example:
.*(?=bar)
This pattern matches all the characters up to the string bar and stops just before it. If a line contains more than one bar, it matches up to the last one, because .* is greedy.
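To see the greedy behaviour concretely, here is a small sketch. It uses Python's re module rather than Perl (an assumption made only so the snippet is easy to run; the lookahead syntax is the same), and the sample strings are made up.

```python
import re

pattern = re.compile(r".*(?=bar)")

# One "bar": everything before it is matched; "bar" itself is not consumed.
one_bar = pattern.search("foobar").group()
print(one_bar)    # foo

# Several "bar"s: the greedy .* runs up to the last one.
two_bars = pattern.search("foobarbazbar").group()
print(two_bars)   # foobarbaz

# No "bar" at all: the lookahead can never succeed, so there is no match.
no_bar = pattern.search("foobaz")
print(no_bar)     # None
```

If you want to stop at the first bar instead, a lazy .*? in place of .* should do it.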

Related

Negate matched results of regex expression [duplicate]

This question already has answers here:
Regular expression to match a line that doesn't contain a word
(34 answers)
Closed 2 years ago.
I would like to come up with a regex that negates the matches of this regex: .google.*search. Is it possible to build it from the regex I am trying to negate?
Test data
[1] https://www.google.com/search?newwindow=1&sxsrf=ALeKk02MzEfbUp3jO4Np
[2] https://github.com/redis/redis-rb
[3] https://web.whatsapp.com/
Expected result
Rows 2 and 3 match the regex pattern and are part of the results.
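The following regex does the trick:
^(?!.+google.*search)
It matches the beginning of the line, then uses a negative lookahead, (?!...), to reject any line on which your regex would match. The match itself is zero-width; append .* if you need the matched text to be the whole line.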
You may use a negative lookahead here:
https?:\/\/(?!.*\.google\..*search).*
The "secret sauce" here is (?!.*\.google\..*search), which asserts that .google. followed by search does not occur anywhere within the URL to the right of the https:// portion.
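Both suggestions can be checked against the test data from the question. This is only a sketch in Python's re module (my choice of language, not something the question specifies); the variable names are made up.

```python
import re

urls = [
    "https://www.google.com/search?newwindow=1&sxsrf=ALeKk02MzEfbUp3jO4Np",
    "https://github.com/redis/redis-rb",
    "https://web.whatsapp.com/",
]

# First answer: zero-width check at the start of the line.
anchored = re.compile(r"^(?!.+google.*search)")
# Second answer: consume the URL unless ".google." ... "search" follows the scheme.
url_only = re.compile(r"https?://(?!.*\.google\..*search).*")

kept_anchored = [u for u in urls if anchored.search(u)]
kept_url_only = [u for u in urls if url_only.fullmatch(u)]

print(kept_anchored)  # rows 2 and 3
print(kept_url_only)  # rows 2 and 3
```

The first pattern only tells you whether a line qualifies (its match is empty), while the second also returns the URL text as the match.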

Find DATE match starting from end of string [duplicate]

This question already has answers here:
Regex Last occurrence?
(7 answers)
Closed 3 years ago.
I have the following RegEx syntax that will match the first date found.
([0-9]+)/([0-9]+)/([0-9]+)
However, I would like to start from the end of the content and search backwards. In other words, in the below example, my syntax will always match the first date, but I want it to match the last instead.
Some Text here
01/02/15
Some additional
text here.
10/04/14
Ending text
here
I believe this is possible by using a negative lookahead, but all my attempts failed at this because I don't understand RegEx enough. Help would be appreciated.
Note: my application uses PCRE regex.
You could make the dot match newlines, for example with the inline modifier (?s), so that .* first runs to the end of the string.
The engine then backtracks until it reaches the last occurrence of the date-like pattern. The word boundary \b before the first digit keeps the greedy .* from swallowing the leading digits of that date.
Use \K to discard what was matched so far, so that only the date-like pattern is reported as the match.
^(?s).*\b\K[0-9]+/[0-9]+/[0-9]+
Note that the pattern is very broad and does not validate the date itself.
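For anyone who wants to try the idea outside PCRE, here is a rough Python sketch. Python's re module has no \K, so a capture group is swapped in for it, and re.DOTALL is used in place of the inline (?s); the sample text is the one from the question.

```python
import re

text = (
    "Some Text here\n"
    "01/02/15\n"
    "Some additional\n"
    "text here.\n"
    "10/04/14\n"
    "Ending text\n"
    "here"
)

# re.DOTALL plays the role of (?s): the greedy .* first runs to the end of the
# string, then backtracks to the last place where the rest can match.
# The capture group stands in for \K, which Python's re module lacks.
last_date = re.compile(r"^.*\b([0-9]+/[0-9]+/[0-9]+)", re.DOTALL)
with_boundary = last_date.search(text).group(1)
print(with_boundary)     # 10/04/14

# Without \b the greedy .* also swallows the leading "1" of the last date.
no_boundary = re.compile(r"^.*([0-9]+/[0-9]+/[0-9]+)", re.DOTALL)
without_boundary = no_boundary.search(text).group(1)
print(without_boundary)  # 0/04/14
```

The second pattern is there only to show why the word boundary matters.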

RegEx: what does "?!:" mean? [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 5 years ago.
codes image
The question: what does "?!:" mean in this regular expression?
This is a "Negative Lookahead Assertion". This code is saying "this regular expression matches only if it begins with /wiki/ and is not followed by a colon".
Consider reading through https://www.regular-expressions.info and in particular the Lookahead and Lookbehind Zero-Length Assertions.
? --> right after an opening parenthesis, this starts a special group (here it is not the "zero or one" quantifier)
! --> negation; together, (?! opens a negative lookahead
: --> simple colon
So... (?!:) would mean 'the current position is not followed by a colon'
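Since the original code is only available as an image, here is a small sketch with a hypothetical pattern, ^/wiki/(?!:), which is my own stand-in and not necessarily the one in the question. It is written for Python's re module and just illustrates "starts with /wiki/ and is not followed by a colon".

```python
import re

# Hypothetical pattern, not the one from the question's image.
wiki_link = re.compile(r"^/wiki/(?!:)")

plain = wiki_link.search("/wiki/Perl")
with_colon = wiki_link.search("/wiki/:Special")

print(bool(plain))       # True
print(bool(with_colon))  # False
# The lookahead is zero-width: the character it inspects is not consumed.
print(plain.group())     # /wiki/
```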

Regex for string containing one string, but not another [duplicate]

This question already has answers here:
Regular expression for a string containing one word but not another
(5 answers)
Closed 3 years ago.
We have a regex in our project that matches any URL containing the string "/pdf/":
(.+)/pdf/.+
We need to modify it so that it won't match URLs that also contain "help".
Example:
Shouldn't match: "/dealer/help/us/en/pdf/simple.pdf"
Should match: "/dealer/us/en/pdf/simple.pdf"
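If lookarounds are supported, this is very easy to achieve. Anchor the pattern at the start (or use a whole-string match), otherwise a search can begin just past the h of help and still succeed:
^(?=.*/pdf/)(?!.*help)(.+)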
See a demo on regex101.com.
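As a quick check against the two example URLs, here is a sketch in Python's re module (my choice, the question does not name an engine). It uses the lookaround pattern with a leading ^, and also shows what an unanchored search would return for the URL that should be rejected.

```python
import re

should_not_match = "/dealer/help/us/en/pdf/simple.pdf"
should_match = "/dealer/us/en/pdf/simple.pdf"

# Anchored version: both lookaheads are evaluated from the start of the string.
anchored = re.compile(r"^(?=.*/pdf/)(?!.*help)(.+)")
rejected = anchored.search(should_not_match)
accepted = anchored.search(should_match).group(1)
print(rejected)   # None
print(accepted)   # /dealer/us/en/pdf/simple.pdf

# Unanchored version: the search restarts one character into "help",
# where the rest of the string no longer contains "help".
unanchored = re.compile(r"(?=.*/pdf/)(?!.*help)(.+)")
leaked = unanchored.search(should_not_match).group(1)
print(leaked)     # elp/us/en/pdf/simple.pdf
```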
(?:^|\s)((?:[^h ]|h(?!elp))+\/pdf\/\S*)(?:$|\s)
First, match either a space or the start of a line:
(?:^|\s)
Then we match anything that is not a space or h, or any h that is not followed by elp, one or more times (+), until we find /pdf/; then we match non-space characters (\S) any number of times (*).
((?:[^h ]|h(?!elp))+\/pdf\/\S*)
If we also want to reject help after the /pdf/, we can reuse the same alternation there:
((?:[^h ]|h(?!elp))+\/pdf\/(?:[^h ]|h(?!elp))+)
Finally, we match a space or the end of the line/string ($):
(?:$|\s)
The full match will include leading/trailing spaces, and should be stripped. If you use capture group 1, you don't need to strip the ends.
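Here is a rough Python sketch of both variants, reading the result from capture group 1. The sample sentence and the variable names are made up for illustration; the forward slashes are left unescaped because Python does not need \/.

```python
import re

# Variant 1: "help" is rejected only before /pdf/.
before_only = re.compile(r"(?:^|\s)((?:[^h ]|h(?!elp))+/pdf/\S*)(?:$|\s)")
# Variant 2: the same alternation after /pdf/ rejects "help" there too.
both_sides = re.compile(
    r"(?:^|\s)((?:[^h ]|h(?!elp))+/pdf/(?:[^h ]|h(?!elp))+)(?:$|\s)"
)

text = "see /dealer/help/us/en/pdf/simple.pdf and /dealer/us/en/pdf/simple.pdf here"
found = [m.group(1) for m in before_only.finditer(text)]
print(found)  # ['/dealer/us/en/pdf/simple.pdf']

late_help = "/dealer/us/en/pdf/help.pdf"
print(before_only.search(late_help).group(1))  # /dealer/us/en/pdf/help.pdf
print(both_sides.search(late_help))            # None
```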

regex that matches everything except a constant [duplicate]

This question already has answers here:
Regular expression to match a line that doesn't contain a word
(34 answers)
Closed 8 years ago.
I need a regexp that will match everything except a single constant (case ignored).
For example, for the constant ALL it should match words like dog, MOUSE, mall, alligator, but it shouldn't match all, ALL, alL.
(?si)^(?!all$).*
will match any string except all (case-insensitively).
(?i) makes the regex case-insensitive, (?s) allows the dot to match any character, including newlines. If you don't expect newlines in your input, you can remove the s.
See it live on regex101.com.
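The words from the question can be run through the pattern like this. It is a sketch in Python's re module (an assumption, the question does not name an engine); the list names are made up.

```python
import re

not_all = re.compile(r"(?si)^(?!all$).*")

should_pass = ["dog", "MOUSE", "mall", "alligator"]
should_fail = ["all", "ALL", "alL"]

accepted = [w for w in should_pass if not_all.search(w)]
rejected = [w for w in should_fail if not_all.search(w) is None]

print(accepted)  # ['dog', 'MOUSE', 'mall', 'alligator']
print(rejected)  # ['all', 'ALL', 'alL']
```

The $ inside the lookahead is what lets alligator through: all is only rejected when nothing follows it.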