Add two decimal digits to a number range regex - regex

I've created a Regexp to validate a direction in degrees, between -359 and +359 (with optional sign). This is my regex:
const QString xWindDirectionPattern("[+-]{0,1}([0-9]{1,2}|[12][0-9]{2}|3[0-5][0-9])");
Now, I want to add two decimal digits, in order to write numbers from -359.99 to +359.99. I've tried something like appending \.[0-9]{1,2}|[0-9]{1,3} but it does not work.
I'd like to have optional decimal point so I can have
23.3 valid
23.33 valid
23 valid
23.333 not valid
I've read some other questions, like this one, but I'm not able to modify the example to match a number range, like in my case.
How can I achieve this result?
Thanks in advance for your replies.

I've created a Regexp to validate a direction in degrees, between -359 and +359
No, you can't. You shouldn't. You are using the wrong tool. Regex is not suited to validation that requires digging into the semantics of the characters.
Regex can only process and match text; it cannot identify what the text actually means. Basically, regexes are good for parsing regular languages and bad for almost everything else.
For example:
A regex can match 3 digits, but it would be extremely impractical to use it to match 3 digits that fall in the range [259, 634]. For that you would need to know the meaning of each individual digit in that number.
A regex can match a date pattern like \d\d/\d\d/\d\d, but it cannot identify which part is the day and which part is the month.
Similarly, it can find you two numbers x and y, but it cannot tell whether x < y or not.
Tasks like the ones above require you to understand the meaning of the text. Regex can't do that.
Well, of course you have come up with a regex, but as you can see it is highly inflexible. A small change in your requirements will screw up both the regex and you.
You would be better off using the language's own features, constructs like if-else, rather than regex, to make sure the degrees you read fall in that range.
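The "parse it, then check the range in code" approach suggested above could look roughly like this. This is only a sketch in Python, not the asker's Qt code; the function name is_valid_direction is made up for illustration, and using Decimal to count decimal places is my own assumption about how to enforce the "at most two decimals" rule.

```python
from decimal import Decimal, InvalidOperation

LIMIT = Decimal("359.99")  # range taken from the question: -359.99 .. +359.99


def is_valid_direction(text: str) -> bool:
    """Return True if text parses as a number in [-359.99, 359.99]
    with at most two digits after the decimal point."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        return False  # not a number at all
    if not value.is_finite():
        return False  # rejects NaN and Infinity
    # The exponent is minus the number of decimal places, e.g. -3 for "23.333".
    if value.as_tuple().exponent < -2:
        return False
    return -LIMIT <= value <= LIMIT


for sample in ["23.3", "23.33", "23", "23.333", "-359.99", "360"]:
    print(sample, is_valid_direction(sample))
```

Note that Decimal also accepts forms such as "1E-1" or surrounding whitespace, so if the input must be strictly "digits, optional dot, digits", a cheap format check is still needed before the range check.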

You can do this:
[+-]{0,1}((?:[0-9]{1,2}|[12][0-9]{2}|3[0-5][0-9])(?:\.[0-9]{1,2})?)
This will allow an optional decimal point followed by one or two digits. You'll probably also want to use start and end anchors (^ / $) to ensure that there are no characters other than this pattern in your string. Without them, 23.333 would be allowed because 23.33 matches the above pattern:
^[+-]{0,1}((?:[0-9]{1,2}|[12][0-9]{2}|3[0-5][0-9])(?:\.[0-9]{1,2})?)$
You can test it out here.
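To see the effect of the anchors, here is a small check in Python (the question uses Qt, so treat this purely as an illustration; the names CORE and is_direction are mine). The pattern itself is the one given above.

```python
import re

# The pattern from this answer, without and with anchors.
CORE = r"[+-]{0,1}((?:[0-9]{1,2}|[12][0-9]{2}|3[0-5][0-9])(?:\.[0-9]{1,2})?)"
UNANCHORED = re.compile(CORE)
ANCHORED = re.compile("^" + CORE + "$")


def is_direction(text: str) -> bool:
    """True if the whole string is a direction between -359.99 and +359.99."""
    return ANCHORED.search(text) is not None


for sample in ["23.3", "23.33", "23", "23.333", "-359.99", "360"]:
    print(sample, is_direction(sample))

# Without anchors the regex happily finds a valid prefix inside "23.333".
print(UNANCHORED.search("23.333").group(0))  # 23.33
```

In Python, $ also matches just before a trailing newline, so re.fullmatch is the stricter choice if the input may end with one.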

Try [+-]?([1-9]\d?|[12]\d{2}|3[0-5]\d)(\.\d{1,2})?.
[+-]? Optional Sign
[1-9]\d? 1 to 99 (no leading zero, so a bare 0 is not matched; use \d{1,2} instead if 0 must be valid)
[12]\d{2} 100 to 299
3[0-5]\d 300 to 359
(\.\d{1,2})? Optional decimal point followed by 1 or 2 digits

Related

Regex: Email contain numbers but they are not on year pattern

I'm new to regex and, after some tutorials, I'm having difficulty implementing a match criterion like "email contains numbers, but they are not in a year pattern".
So, I have this simple regex for "contain numbers":
\d+(?=#)
Given that the e-mail address does contain numbers, I would like a match only when it does NOT fit one of the patterns below:
\w*(19|20)\d{2}\D*(?=#)
\w*[6-9][0-9]\D*(?=#)
\w*[0-1][0-9]\D*(?=#)
How, in regex, can I express this?
Example matching inputs:
foo123#gmail.com
a22oo#hotmail.com
hoo567#outlook.com
Example non-matching inputs:
foo#gmail.com
johndoe88#hotmail.com
john1976#outlook.com
Regex is difficult to invert, i.e. to not match something.
In your simple case I would just parse an arbitrarily long number, and then do the check in code, preferably after converting it to an integer.
But to answer your question, the following alternatives invert the cases; just OR them together:
(\d)| 1 digit
([2345]\d)| 2 digits not starting with 0,1,6,7,8,9
(\d\d\d)| 3 digits
((1[^9]|2[^0]|[03-9]\d)\d\d)| 4 digits not starting with 19 or 20
(\d\d\d\d\d*) 5+ digits
Something like this. I'm sure someone can make it prettier.
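For the "do the check in code" route mentioned at the top of this answer, a rough Python sketch could be the following. It follows the five cases listed above and makes two assumptions of mine: only the last run of digits before the # is looked at, and the check is done on the digit string rather than on an integer so that leading zeros are not lost. The function name has_non_year_number is hypothetical.

```python
import re


def has_non_year_number(address: str, separator: str = "#") -> bool:
    """True if the part before the separator contains a number that
    does not look like a year (2-digit 00-19 / 60-99, or 4-digit 19xx / 20xx)."""
    local_part = address.split(separator, 1)[0]
    runs = re.findall(r"\d+", local_part)
    if not runs:
        return False  # no digits at all
    digits = runs[-1]  # assumption: only the last digit run matters
    if len(digits) == 2:
        return digits[0] in "2345"  # not starting with 0, 1, 6, 7, 8, 9
    if len(digits) == 4:
        return digits[:2] not in ("19", "20")
    return True  # 1, 3 or 5+ digits


for sample in ["foo123#gmail.com", "a22oo#hotmail.com", "johndoe88#hotmail.com"]:
    print(sample, has_non_year_number(sample))
```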
EDIT
Here is the full regex now tested properly with all possible cases I can think of matching your specified criteria, and proper boundary tests (see https://regex101.com/r/sM5aF7/1):
(\b|[^\d\s])(\d|[2345]\d|\d{3}|(1[^9]|2[^0]|[03-9]\d)\d\d|\d{5,})(\D*?#|#)
This regex passes your examples:
\D(?![6-9]\d\D)\d{2,3}\D
See live demo.

Unsure of regex for numeric which can include unit chars

I have the following regex which only allows numerics
^[0-9]+$
Trouble is, I also need to allow the user to enter a decimal aspect and unit
So these would all need to be valid
123
123456789
1.25M
1.2K
1.5B
12345.53M
0.5M
If anyone can help I'd be most grateful
Does this work for all your cases, and exclude all the cases that should not match?
^\d*\.?\d+[GMKB]?$
Explanation:
^\d* - Start with zero or more digits
\.? - Allow a decimal point, if there is one
\d+ - Require at least one digit (which might be after a decimal point)
[GMKB]? - Allow one of these 4 letters
$ - Don't allow any more characters after this sequence
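A quick way to answer the "does this work for all your cases" question is to run the pattern over the sample inputs. A sketch in Python (the question does not name a language, so this is just for illustration; the invalid samples are my own guesses at what should be rejected):

```python
import re

AMOUNT = re.compile(r"^\d*\.?\d+[GMKB]?$")


def is_amount(text: str) -> bool:
    """True if text is digits, optional decimal part, optional unit letter."""
    # fullmatch so that a trailing newline is not silently accepted
    return AMOUNT.fullmatch(text) is not None


valid = ["123", "123456789", "1.25M", "1.2K", "1.5B", "12345.53M", "0.5M"]
invalid = ["", "M", "1.M", "1.2.3", "12 K", "1.5X"]

for sample in valid + invalid:
    print(repr(sample), is_amount(sample))
```

One thing to be aware of: the pattern also accepts ".5", since the digits before the decimal point are optional.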
Since you didn't specify one, I assume you are using a Perl-compatible regex engine. You can use this:
/^([0-9]*\.)?[0-9]+(B|K|M|G)?$/
I also assume that numbers like 0.1 can be written like .1. Having the units encapsulated in a capturing group (B|K|M|G) makes it easy to extract them from the results afterwards.
You can test the regex here
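To illustrate the point about the capturing group, here is a rough Python sketch that pulls the unit out of group 2 and treats everything before it as the number. The helper name split_amount is made up, and Python's re module stands in for whatever engine is actually being used.

```python
import re
from typing import Optional, Tuple

# Same pattern as above, minus the /.../ delimiters.
PATTERN = re.compile(r"^([0-9]*\.)?[0-9]+(B|K|M|G)?$")


def split_amount(text: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (number_text, unit) for valid input, or None if it does not match."""
    match = PATTERN.fullmatch(text)
    if match is None:
        return None
    unit = match.group(2)  # None when no unit letter was given
    number = text if unit is None else text[: match.start(2)]
    return number, unit


print(split_amount("1.25M"))  # ('1.25', 'M')
print(split_amount("123"))    # ('123', None)
print(split_amount("1.5X"))   # None
```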

Intelligent RegEx Replacement

I'm setting up a system to parse a string with very specified syntax and fix user errors. For example, the syntax requires dates in a m/d/yy format (no leading 0s), so I need to make the following substitutions:
10/01/13 -> 10/1/13
10/10/13 -> no change
10/1/13 -> no change
01/10/13 -> 1/10/13
I have a lot of rules like this by which I need to find portions of a string and fix those portions. I can easily use RegEx to identify what needs to be corrected. For an easier example, I want to find CBUx[2-9], but then I need to replace it with something like CBU x [2-9] (spaces around x if preceded by CBU and followed by a digit). Example:
input text: "blah blah CBUx3"
matched: "CBUx3"
replace: "CBU x 3"
output text: "blah blah CBU x 3"
Is this possible? Note that I'm fully aware I could write code to find the slashes and digits. I'm specifically trying to do this with an "intelligent RegEx replace". I have a lot of different types of corrections that I can match with RegEx, and I would like to avoid writing specific correction procedures for each.
Maybe something like this for the leading zeroes:
\b0+([1-9])
And replace with $1 (or \1 depending on the language, though \1 is less common nowadays).
But something a bit better might be with the use of a negative lookbehind:
(?<![.,])\b0+([1-9])
So that the zeroes in 10,001.002 are left alone instead of being stripped to give 10,1.2.
regex101 demo
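In case it helps, here is how the two patterns above behave with a regex substitution in Python (an arbitrary choice of language, since the question doesn't name one; Python uses \1 for the backreference). The function name strip_leading_zeros is just for illustration.

```python
import re

SIMPLE = re.compile(r"\b0+([1-9])")
GUARDED = re.compile(r"(?<![.,])\b0+([1-9])")  # skips zeros right after . or ,


def strip_leading_zeros(text: str, guarded: bool = True) -> str:
    """Remove leading zeros from numbers, keeping the first non-zero digit."""
    pattern = GUARDED if guarded else SIMPLE
    return pattern.sub(r"\1", text)


for sample in ["10/01/13", "10/10/13", "10/1/13", "01/10/13"]:
    print(sample, "->", strip_leading_zeros(sample))

print(strip_leading_zeros("10,001.002", guarded=False))  # 10,1.2
print(strip_leading_zeros("10,001.002", guarded=True))   # 10,001.002
```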
The word boundary, \b, makes sure that the zeroes are at the beginning of the number, and the negative lookbehind covers decimals and thousands separators, assuming that you have floating-point numbers in the string. Note, however, that this prevents the removal of zeroes in a date format such as 11.01.13. A more complex regex can be built on the assumption that such a date always has at least one digit after a second dot (which itself comes after 2 digits, since days and months take at most 2 digits), with nothing but digits in between, which makes the regex look like...
(?<![.,](?![0-9]{2}\.[0-9]))\b0+([1-9])
And which renders to something like this.
For the CBUx[2-9], you can use a capture group as well:
CBUx([2-9])
And replace with: CBU x $1 (or \1)
There might be some tweaks I didn't consider for the leading zero removal part, but that's what I can think about right now.
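Since the question mentions having many such corrections, one way to avoid a separate procedure per rule is to keep the (pattern, replacement) pairs in a table and apply them in order. A minimal sketch in Python, assuming the rules are independent of each other; RULES and apply_rules are names I made up, and the two entries are the patterns from this answer.

```python
import re

# Each rule is (compiled pattern, replacement using \1 backreferences).
RULES = [
    (re.compile(r"(?<![.,])\b0+([1-9])"), r"\1"),  # drop leading zeros
    (re.compile(r"CBUx([2-9])"), r"CBU x \1"),     # space out CBUx<digit>
]


def apply_rules(text: str) -> str:
    """Apply every correction rule in order and return the fixed text."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


print(apply_rules("blah blah CBUx3"))     # blah blah CBU x 3
print(apply_rules("due 01/10/13 CBUx4"))  # due 1/10/13 CBU x 4
```

The order of the rules matters as soon as one rule's output can be matched by another, so that is something to watch when the table grows.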

Regex for detecting numbers away from each other?

so basically I want to detect if in these strings:
Hello 123 My 222 dear 112 troll 12 8889
192.1.1.254:10000
the numbers are in a format like this:
[0 to 255][ANYTHING][0 to 255][ANYTHING][0 to 255][ANYTHING][0 to 255][ANYTHING][0 to 65536]
Does anyone know how I can build such a regex?
It is for detecting if anyone posts an IP:Port in unusual format to bypass default ip:port filters.
Edit: As for the first comment: I do not know regex and what I have tried is:
if(regex_match("192.168 najlepszy serwer SAMP!!1 1 join1!! 8080","/^[0-2](*)?[0-5](*)?[0-5](*).(*)[0-2](*)?[0-5](*)?[0-5](*).(*)[0-2](*)?[0-5](*)?[0-5](*).(*)[0-2](*)?[0-5](*)?[0-5](*)?$/"))
{
    print("Cannot send message");
}
else
{
    print("New message for everyone! :)");
}
and some other not working regexes.
If you don't want to complicate your life checking the exact ranges, the simple regex would be:
/^.*(\d)+.+(\d)+.+(\d)+.+(\d)+.+(\d)+.*$/
The first four (\d)+ parts can be replaced with more complicated check for 0-255 range:
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
Replace the last (\d)+ with the following to check the port range:
(6553[0-5]|655[0-2]\d|65[0-4]\d\d|6[0-4]\d\d\d|[1-5]\d\d\d\d|[1-9]\d{0,3})
An exact, simple, and direct representation of your pattern as a regular expression is not possible in the general case. The reason is the number ranges. Something like "at this place, any integer with a value from a to b" is just too complex. A regular expression is executed by a finite state machine, and these (theoretical) beasts are (basically) only able to look at strings character by character. So what you can match is something like "ignore all characters until you find the first digit, then check whether the first digit is followed by at most two more digits".
As a workaround you may try to build a list of alternations of possible digit patterns that covers your desired range of values (in the extreme case list every single value like \b(?:1|2|3|4|...|154|155|...|255)\b). I have a pattern for the range 0-255, but I have none for the range of possible port numbers. So a first approximation may be (really, this is only an approximation and not thoroughly tested):
\b(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b.*\b(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b.*\b(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b.*\b(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b[^0-9]*[0-9]{1,5}
In the above pattern (?: .... ) means a shy group (not remembered for back references) and \b means word boundary.
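The long pattern above is easier to read (and to get right) when the repeated 0-255 part is written once and the whole thing is assembled in code. A sketch in Python, which is my choice and not the asker's language; OCTET and looks_like_ip_port are made-up names. Like the original, it does not range-check the port.

```python
import re

# One 0-255 number, delimited by word boundaries.
OCTET = r"\b(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b"

# Four octets separated by anything, then non-digits and a 1-5 digit port.
PATTERN = re.compile(".*".join([OCTET] * 4) + r"[^0-9]*[0-9]{1,5}")


def looks_like_ip_port(message: str) -> bool:
    """True if the message contains something shaped like IP:port."""
    return PATTERN.search(message) is not None


print(looks_like_ip_port("Hello 123 My 222 dear 112 troll 12 8889"))
print(looks_like_ip_port("192.1.1.254:10000"))
print(looks_like_ip_port("Hello 12 world 34"))
```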
I'd suggest you read up on Regex syntax. For starters . is special and matches any character. Also doing something like [0-2][0-5][0-5] won't catch something like 192 as 9 is not within 0-5.
According to your requirements here's a Regex that should roughly do what you want
([0-2]?\d{1,2}).*([0-2]?\d{1,2}).*([0-2]?\d{1,2}).*([0-2]?\d{1,2}).*(\d{1,5})?
Each of the ([0-2]?\d{1,2}) portions will match 1 or 2 digits preceded optionally with a 0,1, or 2. Each () will capture a group which you can then examine using a Regex engine. You will need to examine this group as the Regex for each of those portions will match numbers above 255 (specifically 256-299).
The last group (\d{1,5})? is to catch the port number, again you will have to examine this as it will catch any 1 to 5 digit number (hence the {1,5}). The ? makes the group optional, remove it if you want it to have to match against a port number.
As far as doing Regex in C, I haven't had much experience but there should be a way to get all the grouped matches and inspect them. Unfortunately they will be strings so you will have to convert them to integers to examine them.
Are you sure you need regex for this? In my opinion, you do not need regex for this.
Just split the numbers into groups which are separated by non-numeric characters. Then analyze.
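A rough sketch of that idea in Python (the language is my assumption, as is the exact rule: five consecutive numbers, the first four at most 255 and the fifth at most 65535; contains_ip_port is a made-up name):

```python
import re


def contains_ip_port(message: str) -> bool:
    """True if five consecutive numbers in the message could be IP + port."""
    # Pull out every run of digits, ignoring whatever separates them.
    numbers = [int(run) for run in re.findall(r"\d+", message)]
    # Slide a window of five numbers over the list.
    for i in range(len(numbers) - 4):
        *octets, port = numbers[i : i + 5]
        if all(octet <= 255 for octet in octets) and port <= 65535:
            return True
    return False


print(contains_ip_port("Hello 123 My 222 dear 112 troll 12 8889"))
print(contains_ip_port("192.1.1.254:10000"))
print(contains_ip_port("call me at 5 pm"))
```

The range checks are plain integer comparisons here, which is exactly the part that is painful to express inside the regex itself.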
What language?
As for actually checking for a valid range, take a look at this:
http://www.regular-expressions.info/numericranges.html
I would do this simple regex
((\d|\D)+)*

What's wrong with this number extracting Regex?

I have a string like the following:
<br><b>224h / 15.45 verbuchte Stunden</b>
I want to extract the numbers and have created the following Regex:
([0-9]\.?[0-9]{0,2})h\s\/\s([0-9]\.?[0-9]{0,2})
But for the preceding string this gives me the numbers 224 and 15 instead of 15.45.
What's wrong with this Regex?
Because you allow only one digit before the dot.
Try this, I used {1,2} as quantifier before the dot, change it to your needs. Probably + would be a better choice, it allows one or more.
([0-9]\.?[0-9]{0,2})h\s\/\s([0-9]{1,2}\.?[0-9]{0,2})
A better regex could be this
([0-9]+(?:\.[0-9]{1,2})?)h\s*\/\s*([0-9]+(?:\.[0-9]{1,2})?)
Here I made the whole fractional part optional, requiring at least one digit before the dot and one or two digits after it.
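For comparison, this is what the original and the improved regex capture from the sample string, checked in Python (the question doesn't say which language is used, so this is just a quick illustration; extract_hours is a made-up helper):

```python
import re
from typing import Optional, Tuple

TEXT = "<br><b>224h / 15.45 verbuchte Stunden</b>"

ORIGINAL = re.compile(r"([0-9]\.?[0-9]{0,2})h\s\/\s([0-9]\.?[0-9]{0,2})")
IMPROVED = re.compile(r"([0-9]+(?:\.[0-9]{1,2})?)h\s*\/\s*([0-9]+(?:\.[0-9]{1,2})?)")


def extract_hours(pattern: "re.Pattern[str]", text: str) -> Optional[Tuple[str, ...]]:
    """Return the two captured numbers as strings, or None if nothing matches."""
    match = pattern.search(text)
    return match.groups() if match else None


print(extract_hours(ORIGINAL, TEXT))  # ('224', '15')
print(extract_hours(IMPROVED, TEXT))  # ('224', '15.45')
```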
The answer is given by stema.
If your regex engine supports shorthand character classes, it could be a little more compact, like this:
(\d{1,2}\.?\d{0,2})h\s/\s(\d{1,2}\.?\d{0,2})
\d is a shorthand character class for [0-9]