How do I exclude specific numbers from a Regex expression? - regex

I have the following regex expression.
[1-9][0-9]\.[0-9][0-9]
It matches numbers from 10.00 to 99.99 in increments of .01 (e.g., 11.00, 11.44, or 65.90).
Let's say I have two numbers that I would like to exclude from this range: 13.00 and 44.51.
How would I add this condition to the expression?
Twist: I'm using Google Forms to validate this, and it doesn't recognize negative lookaheads (?!...).
Does anyone know how I would go about solving this?

Negative lookaheads are fine for this job, but unfortunately not all regex engines support them. A workaround is to also match the unwanted values and then discard those matches.
E.g.:
(?:13\.00|44\.51)|([1-9][0-9]\.[0-9][0-9])
Accept the input only if group 1 matched; otherwise discard it (that means one of the excluded numbers matched instead).
PS: This answer assumes you can test which part of the regex matched and discard the input if group 1 did not match. It is not supposed to work without additional programming logic; only negative lookaheads could achieve that.
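A minimal sketch of the match-and-discard idea in Python, assuming the surrounding code can inspect capture groups (a bare validator such as Google Forms cannot, as the PS notes). The function name is_allowed is hypothetical.

```python
import re

# The excluded values come first; only the second alternative fills group 1.
PATTERN = re.compile(r"(?:13\.00|44\.51)|([1-9][0-9]\.[0-9][0-9])")

def is_allowed(text: str) -> bool:
    """Accept only when the whole string matched through capture group 1."""
    m = PATTERN.fullmatch(text)
    return m is not None and m.group(1) is not None

print(is_allowed("11.44"))  # True
print(is_allowed("13.00"))  # False: matched, but not via group 1
```

The order of the alternatives matters here: the excluded values have to be tried first, otherwise the general pattern would capture them in group 1.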

Related

Regex: split number into optional first group of up to three then last group of up to three

I have two 1-6 digit numbers separated by a slash. I want these split up into groups of at most 3 digits, taking from the right.
For example:
0/1 -> [,0,,1]
1234/3 -> [1,234,,3]
12345/1234 -> [12,345,1,234]
123456/789123 -> [123,456,789,123]
I need to use a regular expression to do this because I want to do this for a location in NGINX. It's possible to do this with application logic, but for performance reasons that is not what I'm asking about.
Similar question which solves part of this was here using a negative lookahead: Regular expression to match last number in a string
What regex can achieve this split?
UPDATE:
This regex comes close to what I want (https://regex101.com/r/bQtNdK/3):
(?<prefix1>\d{0,3}?)(?<threes1>\d{0,3})\/(?<prefix2>\d{0,3}?)(?=\d)(?<threes2>\d{0,3})
It fails matching if the second number behind the slash is more than 3 digits long.
UPDATE2:
Now this regex works for most combinations (https://regex101.com/r/bQtNdK/5):
(?<prefix1>\d{0,3}?)(?<threes1>\d{1,3})\/(?<prefix2>\d{0,3})(?<threes2>\d{3})
I don't understand why this starts to fail if I use the same regex for prefix2/threes2 like prefix1/threes1 (i.e. make prefix2 also lazy). Any ideas how to solve this? So close...
I don't know that it's possible unless the regex engine can remember every intermediate match of a group that matched an arbitrary number of times (.NET can do this; I'm not sure about others). PCRE will apparently only remember the last match for each group; otherwise you could use something like this: (?<prefix1>\d{0,2})(?:(?<threes1>\d{3})*)\/(?<prefix2>\d{0,2})(?<threes2>\d{3})*\s
This regex seems to be correct now (regex101):
(?<prefix1>\d{0,3}?)(?<suffix1>\d{1,3})\/(?<prefix2>\d{0,3}?)(?<suffix2>\d{1,3})\/
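As a rough check, the final pattern can be run against the examples from the question. This sketch uses Python, which spells named groups (?P<name>...), and assumes the input carries the trailing slash that the pattern requires. The helper name split_groups is hypothetical.

```python
import re

# Same pattern as above, with Python's (?P<name>...) named-group spelling.
SPLIT = re.compile(
    r"(?P<prefix1>\d{0,3}?)(?P<suffix1>\d{1,3})/"
    r"(?P<prefix2>\d{0,3}?)(?P<suffix2>\d{1,3})/"
)

def split_groups(path: str):
    """Return [prefix1, suffix1, prefix2, suffix2], or None if there is no match."""
    m = SPLIT.fullmatch(path)
    return list(m.groups()) if m else None

print(split_groups("1234/3/"))      # ['1', '234', '', '3']
print(split_groups("12345/1234/"))  # ['12', '345', '1', '234']
```

The lazy prefix starts out empty and only grows when the greedy suffix of at most three digits cannot reach the following slash, which is what pushes the overflow digits to the left.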

Using Regex to find repeating groups in phone numbers

I'm looking for a way to use regex to search for obviously false phone numbers that have the same digit repeating. The numbers are all formatted and stored as follows:
(111)111-1111
I'm not able to alter the text in any way.
I've tried modifying a few of the regex lines I've seen such as:
^([0-9])\1{2}.\1{3}.\1{4}$
which was for finding repeating digits with a period in between the numbers. However, I haven't figured out how to get around the first character as a parenthesis.
Any help would be appreciated!
You misunderstand the purpose of the dot operator (.). It does not match a period; it matches any character. In that (quite bad) regex, it serves only to skip the separators, and because it matches anything, the regex will also match something like 111211131111.
Use this regexp instead:
^\(?([0-9])\1{2}\)?\1{3}-?\1{4}$
This allows optional parentheses around the first group, so it still works without them, and it allows an optional dash between the second and third groups of digits.
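A short Python sketch of how the pattern above could be applied to stored numbers. The function name is hypothetical, and the sample numbers are made up for illustration.

```python
import re

# \1 refers back to the first captured digit, so every digit must equal it.
REPEATED = re.compile(r"^\(?([0-9])\1{2}\)?\1{3}-?\1{4}$")

def is_repeated_digit_number(phone: str) -> bool:
    """True when all ten digits of the phone number are the same digit."""
    return REPEATED.match(phone) is not None

print(is_repeated_digit_number("(111)111-1111"))  # True
print(is_repeated_digit_number("(123)456-7890"))  # False
```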

Regex: Email contain numbers but they are not on year pattern

I'm new to regex and, after some tutorials, I'm finding it difficult to implement a match criterion like "Email contains numbers, but they are not in a year pattern".
So, I have this simple regex for "contain numbers":
\d+(?=#)
Given that the e-mail address does contain numbers, I would like to get a match only when it does NOT fit one of the expressions below:
\w*(19|20)\d{2}\D*(?=#)
\w*[6-9][0-9]\D*(?=#)
\w*[0-1][0-9]\D*(?=#)
How, in regex, can I express this?
Example matching inputs:
foo123#gmail.com
a22oo#hotmail.com
hoo567#outlook.com
Example non-matching inputs:
foo#gmail.com
johndoe88#hotmail.com
john1976#outlook.com
Regex is difficult to invert, i.e. to make it not match something.
In your simple case I would just parse an arbitrarily long number and then do the check in code, preferably after converting it to an integer.
But to answer your question, the following would invert the cases; just OR them together:
(\d)| 1 digit
([2345]\d)| 2 digits not starting with 0,1,6,7,8,9
(\d\d\d)| 3 digits
((1[^9]|2[^0]|[03-9]\d)\d\d)| 4 digits not starting with 19 or 20
(\d\d\d\d\d*) 5+ digits
Something like this. I'm sure someone can make it prettier.
EDIT
Here is the full regex now tested properly with all possible cases I can think of matching your specified criteria, and proper boundary tests (see https://regex101.com/r/sM5aF7/1):
(\b|[^\d\s])(\d|[2345]\d|\d{3}|(1[^9]|2[^0]|[03-9]\d)\d\d|\d{5,})(\D*?#|#)
This regex passes your examples:
\D(?![6-9]\d\D)\d{2,3}\D
See live demo.
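The "do the check in code" suggestion from the first answer can be sketched in Python as follows. This is only one reading of the question's criteria (year-like means 19xx or 20xx, or two digits in 00-19 or 60-99), it compares digit strings rather than converting to integers, and both function names are hypothetical.

```python
import re

def is_year_like(digits: str) -> bool:
    """Year-like per the question: 19xx/20xx, or two digits in 00-19 or 60-99."""
    if len(digits) == 4:
        return digits.startswith(("19", "20"))
    if len(digits) == 2:
        return digits[0] in "016789"
    return False

def has_non_year_number(address: str) -> bool:
    """True if the part before '#' contains a digit run that is not year-like."""
    local_part = address.split("#", 1)[0]
    runs = re.findall(r"\d+", local_part)
    return any(not is_year_like(run) for run in runs)

print(has_non_year_number("foo123#gmail.com"))      # True
print(has_non_year_number("john1976#outlook.com"))  # False
```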

Add two decimal digits to a number range regex

I've created a Regexp to validate a direction in degrees, between -359 and +359 (with optional sign). This is my regex:
const QString xWindDirectionPattern("[+-]{0,1}([0-9]{1,2}|[12][0-9]{2}|3[0-5][0-9])");
Now, I want to add two decimal digits, in order to write numbers from -359.99 to +359.99. I've tried something like appending \.[0-9]{1,2}|[0-9]{1,3} but it does not work.
I'd like to have optional decimal point so I can have
23.3 valid
23.33 valid
23 valid
23.333 not valid
I've read some other questions, like this one, but I'm not able to modify the example to match a number range, like in my case.
How can I achieve this result?
Thanks in advance for your replies.
"How can I achieve this?"
"I've created a Regexp to validate a direction in degrees, between -359 and +359"
No, you can't. You shouldn't. You are using the wrong tool. Regex cannot do the kinds of validation that require it to dig into the semantics of the characters.
Regex can only process and match text; it cannot identify what the text actually means. Basically, regexes are good for parsing regular languages and bad for almost everything else.
For example:
A regex can match 3 digits, but it would be extremely impractical to use it to match 3 digits that fall in the range [259, 634]. For that you would need to know the meaning of each individual digit in the number.
A regex can match a date-like pattern such as \d\d/\d\d/\d\d, but it cannot identify which part is the day and which part is the month.
Similarly, it can find you two numbers x and y, but it cannot tell whether x < y.
Tasks like the ones above require you to understand the meaning of the text. Regex can't do that.
Of course, you have come up with a regex that works, but as you can see it is highly inflexible. A small change in your requirements will screw both the regex and you.
You would do better to use the corresponding language features, constructs like if-else, to make sure you are reading degrees in that range, rather than regex.
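One way to follow this advice, sketched in Python rather than the question's Qt/C++: a deliberately simple regex checks only the textual shape (optional sign, digits, at most two decimals), and an ordinary comparison checks the range. The function name is hypothetical, and the bounds are taken from the question.

```python
import re

# Shape only: optional sign, digits, optional fraction of one or two digits.
NUMBER_SHAPE = re.compile(r"[+-]?[0-9]+(?:\.[0-9]{1,2})?")

def is_valid_direction(text: str) -> bool:
    """Check the textual shape with a regex and the numeric range in code."""
    if NUMBER_SHAPE.fullmatch(text) is None:
        return False
    return abs(float(text)) <= 359.99

print(is_valid_direction("23.33"))   # True
print(is_valid_direction("23.333"))  # False: three decimals
print(is_valid_direction("360"))     # False: out of range
```

If the range ever changes, only the comparison has to be edited, not the pattern.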
You can do this:
[+-]{0,1}((?:[0-9]{1,2}|[12][0-9]{2}|3[0-5][0-9])(?:\.[0-9]{1,2})?)
This will allow a decimal point followed by one or two digits. You'll probably also want to use start and end anchors (^ / $) to ensure that there are no characters other than this pattern in your string. Without them, 23.333 would be allowed because 23.33 matches the above pattern:
^[+-]{0,1}((?:[0-9]{1,2}|[12][0-9]{2}|3[0-5][0-9])(?:\.[0-9]{1,2})?)$
You can test it out here.
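The effect of the anchors can be seen in a small Python sketch (Python's re module stands in for QRegExp here, which is an assumption about how the pattern is applied):

```python
import re

DEGREES = r"[+-]{0,1}((?:[0-9]{1,2}|[12][0-9]{2}|3[0-5][0-9])(?:\.[0-9]{1,2})?)"

# Without anchors the pattern is free to match just a prefix of the input.
unanchored = re.search(DEGREES, "23.333")
# With ^ and $ the whole string has to fit the pattern.
anchored = re.search("^" + DEGREES + "$", "23.333")

print(unanchored.group(0))  # 23.33
print(anchored)             # None
```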
Try [+-]?([1-9]\d?|[12]\d{2}|3[0-5]\d)(\.\d{1,2})?.
[+-]? Optional Sign
[1-9]\d? 1 to 99 (one or two digits with no leading zero, so 0 itself is not matched)
[12]\d{2} 100 to 299
3[0-5]\d 300 to 359
(\.\d{1,2})? Optional decimal point followed by one or two digits

Regex for detecting numbers away from each other?

Basically, I want to detect whether, in strings like these:
Hello 123 My 222 dear 112 troll 12 8889
192.1.1.254:10000
the numbers are in a format like this:
[0 to 255][ANYTHING][0 to 255][ANYTHING][0 to 255][ANYTHING][0 to 255][ANYTHING][0 to 65535]
Does anyone know how I can build such a regex?
It is for detecting if anyone posts an IP:Port in unusual format to bypass default ip:port filters.
Edit: As for the first comment: I do not know regex and what I have tried is:
if(regex_match("192.168 najlepszy serwer SAMP!!1 1 join1!! 8080","/^[0-2](*)?[0-5](*)?[0-5](*).(*)[0-2](*)?[0-5](*)?[0-5](*).(*)[0-2](*)?[0-5](*)?[0-5](*).(*)[0-2](*)?[0-5](*)?[0-5](*)?$/"))
{
print("Cannot send message");
}
else
{
print("New message for everyone! :)");
}
and some other regexes that don't work.
If you don't want to complicate your life checking the exact ranges, the simple regex would be:
/^.*(\d)+.+(\d)+.+(\d)+.+(\d)+.+(\d)+.*$/
The first four (\d)+ parts can be replaced with a more complicated check for the 0-255 range:
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
and the last (\d)+ can be replaced with the following port range check:
(6553[0-5]|655[0-2]\d|65[0-4]\d\d|6[0-4]\d\d\d|[1-5]\d\d\d\d|[1-9]\d{0,3})
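The two range patterns above can be checked exhaustively. The following Python sketch tests each pattern on its own against every integer in a range; the helper name accepted is hypothetical.

```python
import re

OCTET = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
PORT = r"(6553[0-5]|655[0-2]\d|65[0-4]\d\d|6[0-4]\d\d\d|[1-5]\d\d\d\d|[1-9]\d{0,3})"

def accepted(pattern: str, upper: int) -> set:
    """Integers in 0..upper whose decimal form the pattern matches in full."""
    return {n for n in range(upper + 1) if re.fullmatch(pattern, str(n))}

print(accepted(OCTET, 999) == set(range(256)))         # True
print(accepted(PORT, 99999) == set(range(1, 65536)))   # True
```

Note that the port pattern accepts 1 to 65535 and rejects 0. Inside the larger unanchored expression the patterns can also match part of a longer digit run, so this check says nothing about the combined regex.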
An exact, simple, and direct representation of your pattern as a regular expression is not possible in the general case. The reason is the number ranges. Something like "at this place any integral number with a value from a to b" is just too complex. A regular expression is executed by a finite state machine and these (theoretical) beasts are (basically) only able to look at strings character by character. Therefore you can match something like "ignore all characters until you find the first digit, then check whether the first digit is followed by at most two more digits".
As a workaround you may try to build a list of alternations of possible digit patterns that covers your desired range of values (in the extreme case list every single value like \b(?:1|2|3|4|...|154|155|...|255)\b). I have a pattern for the range 0-255, but I have none for the range of possible port numbers. So a first approximation may be (really, this is only an approximation and not thoroughly tested):
\b(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b.*\b(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b.*\b(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b.*\b(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b[^0-9]*[0-9]{1,5}
In the above pattern (?: .... ) means a shy (non-capturing) group, which is not remembered for back references, and \b means word boundary.
I'd suggest you read up on Regex syntax. For starters, . is special and matches any character. Also, something like [0-2][0-5][0-5] won't catch a number like 192, as 9 is not within 0-5.
According to your requirements, here's a Regex that should roughly do what you want:
([0-2]?\d{1,2}).*([0-2]?\d{1,2}).*([0-2]?\d{1,2}).*([0-2]?\d{1,2}).*(\d{1,5})?
Each of the ([0-2]?\d{1,2}) portions will match 1 or 2 digits, optionally preceded by a 0, 1, or 2. Each () will capture a group which you can then examine using a Regex engine. You will need to examine these groups, as the Regex for each of those portions will also match numbers above 255 (specifically 256-299).
The last group (\d{1,5})? is to catch the port number, again you will have to examine this as it will catch any 1 to 5 digit number (hence the {1,5}). The ? makes the group optional, remove it if you want it to have to match against a port number.
As far as doing Regex in C, I haven't had much experience but there should be a way to get all the grouped matches and inspect them. Unfortunately they will be strings so you will have to convert them to integers to examine them.
Are you sure you need regex for this? In my opinion, you do not.
Just split the numbers into groups that are separated by non-numeric characters, then analyze them.
What language?
As for actually checking for a valid range, take a look at this:
http://www.regular-expressions.info/numericranges.html
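The split-and-analyze suggestion can be sketched in Python (the question's own language is not stated, so this is only an illustration). The function name is hypothetical, and the rule that any five consecutive numbers in range count as a hit is an assumption; it will flag some harmless messages too.

```python
import re

def looks_like_ip_port(message: str) -> bool:
    """True if five consecutive numbers fit the octet, octet, octet, octet, port ranges."""
    numbers = [int(run) for run in re.findall(r"\d+", message)]
    for i in range(len(numbers) - 4):
        *octets, port = numbers[i:i + 5]
        if all(n <= 255 for n in octets) and port <= 65535:
            return True
    return False

print(looks_like_ip_port("Hello 123 My 222 dear 112 troll 12 8889"))  # True
print(looks_like_ip_port("hello 300 1 1 1 80"))                       # False
```

The range checks are plain integer comparisons here, so no numeric-range regex is needed.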
I would do this simple regex
((\d|\D)+)*