Understanding Regex expression [duplicate] - regex

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 6 years ago.
I have a file where the application is configured to check the following Regex
[\x00-\x1F\x7F&&[^\x0A]&&[^\x0D]]
Can anyone please tell me exactly what this regex means? I do know that it ignores line feed and carriage return. I even validated my file on http://regexr.com/ with the regex above and it shows no match, so I do not understand why the regex matches in the application.
FYI: I do not want the regex to match my file, as a match stops my processing.

The likely cause is that in Java and Ruby, && inside a character class denotes character class intersection, while http://regexr.com/ doesn't support that syntax and tries to match literal & symbols instead. The regex you posted matches any single character from \x00 to \x1F, or \x7F, as long as it is not \x0A (line feed) or \x0D (carriage return).
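To see which character triggers the match, the same set can be written without intersection and run against the file contents. This is a minimal sketch in Python, whose re module does not support &&, so the two excluded characters are left out of the ranges by hand. The sample string and the helper name find_control_chars are made up for illustration.

```python
import re

# Equivalent of the Java class [\x00-\x1F\x7F&&[^\x0A]&&[^\x0D]] written
# with explicit ranges: every control character except LF (\x0A) and CR (\x0D).
CONTROL_CHARS = re.compile(r"[\x00-\x09\x0B\x0C\x0E-\x1F\x7F]")

def find_control_chars(text):
    """Return (offset, hex code) for each offending character (hypothetical helper)."""
    return [(m.start(), hex(ord(m.group()))) for m in CONTROL_CHARS.finditer(text)]

# Hypothetical sample: CR/LF are allowed, but NUL, TAB and DEL are reported.
sample = "ok line\r\nbad\x00byte\ttab\x7f"
hits = find_control_chars(sample)
print(hits)
```

Note that a tab (\x09) falls inside the matched range, so a file that looks clean in an editor can still trigger the check.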

Related

Regular Expression - select string between 2 expressions [duplicate]

This question already has answers here:
Regex Match all characters between two strings
(16 answers)
RegEx match open tags except XHTML self-contained tags
(35 answers)
Closed 2 years ago.
I would like to match all text between two given strings with a regular expression.
Example:
https://regex101.com/r/Etfpol/1
I want the regular expression to match the following text:
Solution changed from
Resolved Time changed
Updated By changed from
Thanks
You can use positive lookbehind and positive lookahead to check the tags.
(?<=<Name>Description<\/Name><Value>).*?(?=<\/Value>)
Match results
Solution changed from
Resolved Time changed
Updated By changed from
If you prefer not to use lookarounds, this will work as well, but the full match will include the strings before and after your desired string; the desired text itself is in capture group 1.
(?:<Name>Description<\/Name><Value>)(.*?)(?:<\/Value>)
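The difference between the two patterns can be sketched in Python. The input behind the regex101 link is not reproduced in the question, so the XML-like sample below is an assumption built only to contain the three strings the asker wants.

```python
import re

# Hypothetical sample; the real input is behind the regex101 link.
sample = (
    "<Name>Description</Name><Value>Solution changed from</Value>\n"
    "<Name>Other</Name><Value>ignored</Value>\n"
    "<Name>Description</Name><Value>Resolved Time changed</Value>\n"
    "<Name>Description</Name><Value>Updated By changed from</Value>\n"
)

# Lookaround version: the full match is only the wanted text.
lookaround = re.compile(r"(?<=<Name>Description<\/Name><Value>).*?(?=<\/Value>)")
lookaround_matches = lookaround.findall(sample)

# Group version: the full match includes the tags, group 1 holds the wanted text.
grouped = re.compile(r"(?:<Name>Description<\/Name><Value>)(.*?)(?:<\/Value>)")
first = grouped.search(sample)
grouped_matches = grouped.findall(sample)

print(lookaround_matches)
print(first.group(0))  # full match, tags included
print(first.group(1))  # wanted text only
```

Python only accepts fixed-width lookbehinds, which is fine here because the text inside (?<=...) is a literal string.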

RegEx to capture what's between opening and closing square brackets [duplicate]

This question already has answers here:
JavaScript regex get all matches in a string
(2 answers)
Closed 2 years ago.
I'm trying to build a regular expression that captures anything between square brackets like the following numbers.
[phone]010101[/phone] [phone]434343[/phone]
[phone]3443434[/phone]
so the matches should be 010101, 434343, 3443434
I built cow([\s\S]*?)milk to experiment, and this seems to capture multiple matches and works fine across multiple lines, which is exactly what I need.
However when I attempted to build the actual regex using this: \[phone\]([\s\S]*?)\[\/phone\] , it would only capture the first single match.
What could be wrong with my expression?
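Another approach: capture the digits between a closing ] and the next opening [. A greedy .* in the group would over-match when one line holds more than one tag pair, so the digits are matched explicitly here.
\](\d+)\[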
The regex is correct, but the global flag is missing. In JavaScript, adding the g (global) flag makes the regex return every match instead of only the first. The m (multiline) flag only changes how ^ and $ behave, so it is harmless but optional for this pattern.
const str = `[phone]010101[/phone] [phone]434343[/phone]
[phone]3443434[/phone]`;
const reg = /\[phone\]([\s\S]*?)\[\/phone\]/gm;
// matchAll requires the g flag; x[1] is the first capture group of each match
[...str.matchAll(reg)].map(x => x[1]); // ["010101", "434343", "3443434"]
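For comparison, here is a minimal sketch of the same extraction in Python, where re.findall returns all matches by default and so needs no equivalent of the g flag. The variable names are illustrative only.

```python
import re

text = """[phone]010101[/phone] [phone]434343[/phone]
[phone]3443434[/phone]"""

# Same pattern as in the question; [\s\S] also matches newlines,
# so no extra flag is needed for multi-line input.
phone_re = re.compile(r"\[phone\]([\s\S]*?)\[/phone\]")

# With exactly one capture group, findall returns the group contents.
numbers = phone_re.findall(text)
print(numbers)
```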

Different behavior between two regex patterns [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 3 years ago.
I am trying to match the letters 'C' or 'c' as they appear in a file.
They must be stand alone and NOT followed by a '+' or '.'.
The following two patterns give me the same result using Regex101, but I get a different result
in the Dataquest IDE and my home PC.
The two patterns are:
pattern = r'\b[Cc]\b(?!\+|\.)'
pattern = r"\b[Cc]\b[^.+]"
The problem line in question is: (Line 223 from the hacker_news.csv file)
MemSQL (YC W11) Raises $36M Series C
On my home PC and in Dataquest's IDE:
The regex using the negative lookahead matches that line.
The other regex does not.
On Regex101 they both match that line.
I am NOT supposed to match it.
I wrote the lookahead regex, which fails in Dataquest's IDE.
The non-lookahead version is their answer, which passes.
I think they should both yield the same result, but they do not.
I am running Python 3.7.6
What am I missing?
(?!\+|\.) is a negative lookahead. It doesn't consume any characters or add them to the match; it only asserts that whatever comes next is not a . or a +. That assertion also holds when nothing comes next at all. In your input string, the C at the end of the line is not followed by either character, so the match succeeds.
[^.+] matches exactly one character that is not a . or a +, so it requires a character to be present. There are no characters after the C, so the match fails.
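The difference can be reproduced with the line from the question. This is a small sketch; the variable names are made up, and the trailing-newline case is added only to show when the two patterns agree.

```python
import re

line = "MemSQL (YC W11) Raises $36M Series C"

lookahead = re.compile(r"\b[Cc]\b(?!\+|\.)")  # asserts, consumes nothing
consuming = re.compile(r"\b[Cc]\b[^.+]")      # must consume one more character

lookahead_hit = lookahead.search(line)
consuming_hit = consuming.search(line)
print(lookahead_hit)  # a match object for the final "C"
print(consuming_hit)  # None: nothing follows the final "C"

# If the same line is followed by any other character, for example a
# trailing newline, [^.+] has something to consume and both patterns match.
consuming_hit_with_newline = consuming.search(line + "\n")
print(consuming_hit_with_newline)
```

Whether a test input ends with a newline is therefore one thing to check when two tools disagree on these patterns.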

Checkpoint regex [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 5 years ago.
I found these two expressions in documentation:
To match subdomains of mydomain.com: (^|.*\.)mydomain\.com
To match domain and subdomains of mydomain.com: (^|.*\.)*mydomain\.com
I can't understand why those expressions mean what they say they mean. Can anybody explain both expressions please?
First, these are not good regular expressions (they accept things they should not), but I will explain (^|.*\.)mydomain\.com and you can work out the second one from there.
Between the parentheses:
^ matches the starting position of the line
| acts like a Boolean OR between the expression before and the expression after the operator
. matches any character except line breaks
* matches the preceding element zero or more times
\. matches a literal dot . character
So the group matches either the start of the line or any run of characters ending in a dot, and mydomain\.com must follow it directly.
For more information you could read the wiki documentation and use a good regex tool.
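A quick way to check what the first expression accepts is to run it over a few host names. This is a sketch in Python; the host names are invented, and whether the product that uses these expressions anchors them at the end is not stated in the question, so both an unanchored search and a full match are shown.

```python
import re

pattern = re.compile(r"(^|.*\.)mydomain\.com")

# Hypothetical host names for illustration.
hosts = [
    "mydomain.com",           # bare domain, matched through the ^ alternative
    "www.mydomain.com",       # subdomain, matched through .*\.
    "notmydomain.com",        # different domain with the same suffix
    "mydomain.com.evil.org",  # wanted text followed by something else
]

# For each host: (unanchored search result, full-string match result)
results = {
    host: (bool(pattern.search(host)), bool(pattern.fullmatch(host)))
    for host in hosts
}
for host, (found, full) in results.items():
    print(f"{host:24} search={found} fullmatch={full}")
```

The last host shows how the expression accepts more than intended when nothing anchors the end of the match.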

VB.NET Regular Expressions [duplicate]

This question already has answers here:
RegEx match open tags except XHTML self-contained tags
(35 answers)
Closed 9 years ago.
I have this HTML code:
<td class="Class 1">Example</td><td class="Class2">Other Example</td>
and I am trying to use Regular Expressions in VB.NET to extract "Example" and "Other Example"
Dim parsedtext As MatchCollection = Regex.Matches(htmlcode, ">(.+)<")
(the htmlcode variable contains the html code mentioned above as a string.)
However, parsedtext(0).Groups(0) returns ">Example</td><td class="Class2">Other Example<". I do not understand why this is happening, and I have tried many other pattern strings and cannot figure this problem out. How would one extract all text between two specific characters such as > and < in the example above?
I agree with @ColeJohnson (no one on SO is allowed to believe otherwise, at this point), but it's a good example for teaching the concept of greedy versus non-greedy matching.
By default, regular expression quantifiers (+, *, ?) "eat up" as much as possible, and only give characters back when a later part of the pattern fails to match. That's called greedy matching. To make it non-greedy, you use the non-greedy quantifiers: +?, *?, ??.
That is,
">(.+?)<"
In other words, your .+ first consumed as many characters as possible, all the way to the end of the string, and then backtracked only as far as it needed to find a <. The first < it meets while backing up is the last one in the string, so your output was to be expected.
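The greedy and non-greedy behaviour can be compared on the HTML from the question. The sketch below uses Python rather than VB.NET; the quantifier semantics shown are the same in both engines. The third pattern, a negated character class, is an addition that is not part of the answer above.

```python
import re

html = '<td class="Class 1">Example</td><td class="Class2">Other Example</td>'

# Greedy: .+ runs to the end, then backs up to the LAST "<".
greedy = re.findall(r">(.+)<", html)

# Non-greedy: .+? stops at the first "<" it can. The second match starts at
# the ">" of "</td>", and because .+? needs at least one character it
# swallows the "<" of the next tag before stopping.
lazy = re.findall(r">(.+?)<", html)

# Negated class (an assumption, not from the answer): never crosses "<" or ">".
negated = re.findall(r">([^<>]+)<", html)

print(greedy)
print(lazy)
print(negated)
```

On this input the non-greedy pattern fixes the first match, but its second match still includes a tag because the cells are directly adjacent; the negated class returns only the two cell texts.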