Regex negative range algorithm [duplicate] - regex

This question already has an answer here:
How to implement regular expression NFA with character ranges?
(1 answer)
Closed 7 years ago.
I need to implement positive and negative ranges in my regex matcher.
It does not look difficult for a positive range:
[1-3] == (1|2|3)
But I do not understand how to convert a negative range such as [^1-3] to a simple regex string.
Is it possible?
Thanks!
Update
No, it seems to be impossible.
OK, so how do regex libraries process negative ranges in this case?

If the regex engine you're using supports negative lookahead, you can do it like this:
(?!1|2|3).
?! is the negative lookahead operator. It says "the characters that follow this position must not match this expression." It asserts the negative match without advancing the cursor. Here it's followed by a . which then consumes any single character.
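As a quick check of the equivalence described above, here is a minimal sketch. It assumes Python's `re` module as the engine (the question does not name one), and the sample characters are arbitrary. One caveat it illustrates: `.` skips newlines by default, so the lookahead form needs the DOTALL flag to cover the same characters as `[^1-3]`.

```python
import re

# Assumption: Python's `re` stands in for "an engine that supports lookahead".
negated_class = re.compile(r"[^1-3]")
# DOTALL lets "." match "\n" as well, so both forms accept the same characters.
lookahead_form = re.compile(r"(?!1|2|3).", re.DOTALL)

samples = ["0", "1", "2", "3", "4", "a", " ", "\n"]

# For every sample, record whether both patterns agree on accept/reject.
agreement = {
    ch: bool(negated_class.fullmatch(ch)) == bool(lookahead_form.fullmatch(ch))
    for ch in samples
}
# Characters the lookahead form refuses to match.
rejected = [ch for ch in samples if not lookahead_form.fullmatch(ch)]

print(agreement)
print(rejected)
```

In a hand-written matcher it may be simpler to keep the class as a set of ranges plus a "negated" flag and invert the membership test, rather than rewriting the pattern text.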

Related

Confusion in JavaScript RegExp ?= Quantifier [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 5 years ago.
What is the difference between
(?=.\d)(?=.[a-z])(?=.[A-Z])
and
(.\d)(.[a-z])(.[A-Z])
When I test the string a2A only the first RegExp returns true. Can anyone explain this for me?
The difference is the lookahead operator wrapped around each term of the regex. The lookahead (LA) operator matches the sub-regex it guards as usual, but it does not consume input, so the next portion of the regex starts matching from the same position.
This means that the first regex should not match (contrary to your tests; which engine did you use?). Given any initial matching position, the second character would have to be a digit, a lowercase letter, and an uppercase letter, all at the same time.
Observe that this will not happen if the . ('any char') is quantified:
(?=.*\d)(?=.*[a-z])(?=.*[A-Z])
Each LA term may skip an arbitrary amount of material before matching the character class, and this amount may differ between the subexpressions.
The second alternative (with or without quantification) will never match here, because its groups consume input: it requires a digit, then a lowercase letter, then an uppercase letter, in that order, which the test string a2A does not provide.
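The four variants discussed in this answer can be compared side by side. This is a small sketch that assumes Python's `re` module rather than JavaScript's RegExp; the lookahead and group syntax used here is the same in both, but the result names are made up for the example.

```python
import re

text = "a2A"

patterns = {
    # Lookaheads without a quantifier: all three inspect the same 2nd character.
    "lookahead_unquantified": r"(?=.\d)(?=.[a-z])(?=.[A-Z])",
    # Lookaheads with .*: each one may skip a different amount of text.
    "lookahead_quantified": r"(?=.*\d)(?=.*[a-z])(?=.*[A-Z])",
    # Consuming groups: the three parts must appear one after another.
    "groups_unquantified": r"(.\d)(.[a-z])(.[A-Z])",
    "groups_quantified": r"(.*\d)(.*[a-z])(.*[A-Z])",
}

# True if the pattern matches anywhere in the test string.
results = {name: bool(re.search(p, text)) for name, p in patterns.items()}

for name, matched in results.items():
    print(f"{name}: {matched}")
```

Only the quantified lookahead version reports a match for a2A, which is consistent with the explanation above.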

How to build a regular expression which prohibits hyphens from appearing at the start and end of a string? [duplicate]

This question already has answers here:
RegEx for allowing alphanumeric at the starting and hyphen thereafter
(4 answers)
Closed 5 years ago.
I want to build a regular expression which only matches [A-Za-z0-9\-] with an additional rule that hyphens (-) are not allowed to appear at the start and at the end.
For example:
my-site is matched.
m is matched.
mysite- is not matched.
-mysite is not matched.
Currently, I've come up with ^[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9]+$.
But this doesn't match m.
How can I change my regular expression so that it fits my needs?
Use lookarounds:
^(?!-)[A-Za-z0-9-]*(?<!-)$
The reason this works is that lookarounds don't consume input, so the lookahead and the lookbehind can both assert on the same character.
Note that you don't need to escape the dash within the character class if it's the first or last character.
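Here is a short sketch that runs the lookaround pattern against the examples from the question. It assumes Python's `re` module; the `strict` variant with `+` is my own addition, not part of the answer, and only shows one way to also reject the empty string, which the `*` version accepts.

```python
import re

# Pattern from the answer: no hyphen at the start, no hyphen at the end.
pattern = re.compile(r"^(?!-)[A-Za-z0-9-]*(?<!-)$")
# Hypothetical stricter variant: "+" requires at least one character.
strict = re.compile(r"^(?!-)[A-Za-z0-9-]+(?<!-)$")

cases = ["my-site", "m", "mysite-", "-mysite", "-", ""]

results = {s: bool(pattern.match(s)) for s in cases}
strict_results = {s: bool(strict.match(s)) for s in cases}

for s in cases:
    print(repr(s), results[s], strict_results[s])
```

The single character m passes because the lookahead and the lookbehind both inspect that same character without consuming it.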

Tricky Regular Expression with an Alphanumeric pattern in uppercase [duplicate]

This question already has answers here:
Can you make just part of a regex case-insensitive?
(5 answers)
Closed 3 years ago.
Okay, this might not be tricky at all for some, but at the moment it is really messing with my head.
First of all, I don't know what engine I am dealing with, but it doesn't seem to distinguish uppercase from lowercase.
I have a string for example
Circuit Ref
Service Type
A End Address
Z End Address
52GD J32SD41 O2AE EVC001
Evolve Internet
And I am only trying to extract the string "52GD J32SD41 O2AE EVC001". I have already tried quite a few combinations like
[0-9A-Z]{4}\s[0-9A-Z]+\s[0-9A-Z]+\s[0-9A-Z]+
[A-Z0-9]{4}\s\W+\s\W+\s\W+
[A-Z0-9]{4}\s[A-Z0-9\s]*[A-Z0-9\s]*[A-Z0-9\s]*
Nothing seems to work. I want to keep the expression fairly flexible, as the order of the letters and digits can change, but the pattern is mostly the same. Any nudge in the right direction will be greatly appreciated.
Thanks
This is a wild guess, but please try the following things:
in front of the regex add (?-i) (Related question, regular-expressions.info, net page about regex)
enclose regex with (?-i: ... )
enclose regex with (?I: ... )
BTW, regarding the 2nd pattern that you tried: [A-Z0-9]{4}\s\W+\s\W+\s\W+.
It seems that you tried to use \W as "uppercase word character", but that is not what it means.
\W means anything that is not \w, that is, any non-word character.
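To illustrate the second suggestion, here is a sketch that assumes Python's `re` module (3.6 or later, where scoped flags such as `(?-i:...)` are supported). The unknown engine from the question is simulated by passing `re.IGNORECASE`; that is an assumption about how it behaves, not something the question confirms.

```python
import re

text = """Circuit Ref
Service Type
A End Address
Z End Address
52GD J32SD41 O2AE EVC001
Evolve Internet"""

# First pattern from the question, unchanged.
body = r"[0-9A-Z]{4}\s[0-9A-Z]+\s[0-9A-Z]+\s[0-9A-Z]+"

# Assumption: the engine ignores case globally, so [A-Z] also accepts a-z.
insensitive = re.search(body, text, re.IGNORECASE)
# (?-i: ... ) turns case-insensitivity off for the enclosed part only.
sensitive = re.search(f"(?-i:{body})", text, re.IGNORECASE)

print(repr(insensitive.group()))
print(repr(sensitive.group()))

# \W matches non-word characters only, never letters or digits.
non_word = re.findall(r"\W", "52GD J32-SD41")
print(non_word)
```

With case folding switched off inside the group, the match lands on the circuit reference line instead of starting somewhere in the mixed-case header lines.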

RegEx to check if specified word is not in string [duplicate]

This question already has answers here:
How to negate specific word in regex? [duplicate]
(12 answers)
Closed 7 years ago.
I am trying to learn RegEx and build a regular expression that checks whether a specified word is NOT in the provided string. So far I have tried Regular Expression Info and RexxEgg, testing everything on Regular Expression Online, but I did not find the answer to my question.
I have tried conditionals and lookarounds. Let's say I want to build an expression that tests for the word myword and passes only when the word is NOT in the string. I used the expression
(?(?!myword).*)
but the regex passes regardless of whether myword is present: both This is the text and This is myword the text pass the test.
My intent was that the negative lookahead inside the conditional is true only when myword does not occur. The lookahead is zero-length, so .* would then return the whole string.
Hope someone can help :)
^(?(?!\bmyword\b).)*$
You can try this. See the demo. Also use \b to match exactly myword and not mywords.
https://regex101.com/r/hI0qP0/7
You should use anchors and negative lookahead:
^(?!.*?myword).*$
(?!.*?myword) is a negative lookahead that will fail the match if myword is found anywhere in the input string.
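The anchored negative lookahead can be tried against the two strings from the question. This sketch assumes Python's `re` module; the `whole_word` variant combines this answer with the \b hint from the other answer and is my own combination, not a pattern either answer gives verbatim.

```python
import re

# Pattern from the answer: fail if "myword" occurs anywhere in the string.
anywhere = re.compile(r"^(?!.*?myword).*$")
# Assumed variant: only reject "myword" when it stands as a whole word.
whole_word = re.compile(r"^(?!.*?\bmyword\b).*$")

samples = [
    "This is the text",
    "This is myword the text",
    "This is mywords the text",
]

anywhere_results = [bool(anywhere.match(s)) for s in samples]
whole_word_results = [bool(whole_word.match(s)) for s in samples]

for s, a, w in zip(samples, anywhere_results, whole_word_results):
    print(f"{s!r}: anywhere={a}, whole_word={w}")
```

The third sample shows the difference: mywords is rejected by the plain version but accepted once word boundaries are required.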

Regular expression to match floats only [duplicate]

This question already has answers here:
Matching numbers with regular expressions — only digits and commas
(10 answers)
regular expression for finding decimal/float numbers?
(9 answers)
Closed 7 years ago.
I need a regular expression that matches floats only. What I have so far is the following:
[\-\+]?[0-9]*(\.[0-9]+)?
But this also matches the following:
123123132 ,
05/03/1994
I only want to match numbers with a decimal point.
Your regex is almost correct for your purpose.
It finds 123123132, because the last part is optional. Removing the ? solves that.
[-+]?[0-9]*(\.[0-9]+)
With that adjustment, it might still find matches inside strings like .12/39/3239. If you don't want that to happen, enforce matching over the complete string by inserting ^ and $:
^[-+]?[0-9]*(\.[0-9]+)$
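The effect of dropping the ? and adding the anchors can be checked with a few inputs. This is a sketch assuming Python's `re` module; the sample strings beyond the two from the question are made up for illustration.

```python
import re

# Original pattern from the question: the fractional part is optional.
original = re.compile(r"[\-\+]?[0-9]*(\.[0-9]+)?")
# Adjusted pattern from this answer: fraction required, whole string anchored.
anchored = re.compile(r"^[-+]?[0-9]*(\.[0-9]+)$")

samples = ["3.14", "-0.5", ".75", "123123132", "05/03/1994", "1."]

anchored_results = {s: bool(anchored.match(s)) for s in samples}
# The original accepts a plain integer as a complete match.
original_accepts_integer = bool(original.fullmatch("123123132"))

for s, ok in anchored_results.items():
    print(f"{s!r}: {ok}")
print(original_accepts_integer)
```

Note that 1. is rejected because the pattern requires at least one digit after the decimal point.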
How about:
([+-]?[0-9]*\.[0-9]*)
You can see it working here
Here is a regexp that also handles exponents:
[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?
Debuggex Demo
Additionally, you should force the whole string to be matched, to avoid matches within your date values.
^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$
By the way here is a nice tutorial about matching floating point numbers using regular expressions: http://www.regular-expressions.info/floatingpoint.html.
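A brief sketch of how the anchored exponent pattern behaves, assuming Python's `re` module and a few made-up inputs. One thing worth noticing is that the decimal point is optional in this pattern, so plain integers are accepted too, which may or may not be what the question wants.

```python
import re

# Anchored pattern from the answer above, including an optional exponent.
with_exponent = re.compile(r"^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$")

samples = ["1.5e10", "-2E-3", "3.14", "42", "05/03/1994", "1e", "1."]

results = {s: bool(with_exponent.match(s)) for s in samples}

for s, ok in results.items():
    print(f"{s!r}: {ok}")
```

If integers must be excluded, the \.? would need to become a mandatory \. as in the earlier answer.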