I need to validate that percent complete is < 90.00, so basically 89.99 and below.
I was able to get regex to work with whole numbers:
^(?:[0-9]|(?:[1-8][0-9]))$
However, I also need to match the decimal part and find the number within the string below.
Endpointgroup Name::ALL::Endpointgroup Description::::SQLRun TS::2017-06-19 14:15:02::ORIGINAL_NODE=CE01::ORIGINAL_NODE=CE01::Total EP::940256::Completed EP::869655::Job Status::W::Percent Complete::92.49
This will match strings where the number following Complete:: is less than or equal to 89.99. You can get the number from the first capturing group.
Complete::([0-8]?[0-9]\.[0-9][0-9])
It is quite ugly to do with regex. See the posts
Validate max-min with regex and Use regex to compare numbers
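If the surrounding code can do the comparison, a less fragile route is to let the regex only extract the number and compare it numerically. A minimal sketch in Python, assuming the field is always labelled Percent Complete:: as in the sample string; the names PERCENT_RE and percent_below are made up for illustration:

```python
import re

# Extract the number after "Percent Complete::"; the decimal part is optional.
PERCENT_RE = re.compile(r"Percent Complete::(\d{1,3}(?:\.\d+)?)")

def percent_below(line, limit=90.0):
    """Return True if the line carries a Percent Complete value below limit."""
    match = PERCENT_RE.search(line)
    if match is None:
        return False  # no percentage found in the line
    return float(match.group(1)) < limit

print(percent_below("Job Status::W::Percent Complete::92.49"))
print(percent_below("Job Status::W::Percent Complete::89.99"))
```

This way the threshold is an ordinary number that can change without rewriting the pattern.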
Related
I need to validate with regex a date in format yyyy-mm-dd (2019-12-31) that should be within the range 2019-12-20 - 2020-01-10.
What would be the regex for this?
Thanks
A regex only deals with characters, so we have to work out which characters are valid at each position in the date.
The first part is easy: the first two characters have to be 20.
Now it gets complicated. The next character can be a 1 or a 2, but what follows depends on which of the two it is, so we split the rest of the regex into two alternatives: the first applies if the third character is 1, the second if it is 2.
If the third character is a 1, then what must follow is the characters 9-12-, as the range starts at 2019-12-20. Now for the day part. The 9th character is the tens digit of the day; it can only be 2 or 3, as we are already in the last month and the minimum date is the 20th. The last character can be any digit 0-9. This gives us a day match of [23][0-9]. Putting this together, we now have a pattern for dates in 2019: 19-12-[23][0-9]
If the third character is a 2, then we can again match up to the day part of the date, as the range ends in January. This gives us a partial match of 20-01-, leaving us to work on the day part. Here we know that the first character of the day can be either a 0 or a 1; if it's a 1 then the last character must be a 0, and if it's a 0 then the last character can only be in the range 1 to 9. This gives us another alternation, (?:0[1-9]|10). Putting the second part together we get 20-01-(?:0[1-9]|10).
Combining these together gives the final regex 20(?:19-12-[23][0-9]|20-01-(?:0[1-9]|10))
Note that I'm assuming that the date you are testing against is a validly formatted date.
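A quick way to check the combined pattern against the boundaries, sketched in Python. The helper name in_range is made up; fullmatch is used so that nothing may precede or follow the date:

```python
import re

# Pattern from the answer above, unchanged.
DATE_RANGE_RE = re.compile(r"20(?:19-12-[23][0-9]|20-01-(?:0[1-9]|10))")

def in_range(text):
    """Return True if text is accepted by the range pattern."""
    return DATE_RANGE_RE.fullmatch(text) is not None

for sample in ("2019-12-19", "2019-12-20", "2019-12-31",
               "2020-01-10", "2020-01-11"):
    print(sample, in_range(sample))
```

Because of the assumption about valid input, the pattern also accepts an impossible day such as 2019-12-35; tightening the day part to (?:2[0-9]|3[01]) would rule that out.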
Try this:
(2019|2020)\-(12|01)\-([0-3][0-9]|[0-9])
But be aware that for the dd value this allows any number whose first digit is between zero and three and whose second digit is between zero and nine. You could instead list all the day numbers you want to allow (20 to 31 and 1 to 10) like this: (20|21|22|23|24|25|26|27|28|29|30|31|01|1|02|2|03|3|04|4|05|5|06|6|07|7|08|8|09|9|10).
(2019|2020)\-(12|01)\-(20|21|22|23|24|25|26|27|28|29|30|31|01|1|02|2|03|3|04|4|05|5|06|6|07|7|08|8|09|9|10)
But honestly... regular expressions are not the right tool for this. A regex describes the shape of a string, not its logical meaning. Use a regex to extract the values from a string and validate those values in the surrounding code.
The second regex above will, for example, match your dates, but also values outside of the range, since nothing ties the first group (2019|2020) to the second group (12|01): it matches 2019-12-21 but also 2020-12-21.
To match only the values you want, you would need a really large regex like this (inner brackets only if you need them): ((2019)-(12)-(20)|(2019)-(12)-(21)|(2019)-(12)-(22)|...), continuing with all possible dates. Ask yourself: what would you do if you found such a regex in a project you had to work with ;)
Better solution (quick and dirty, there might be better solutions):
(?<yyyy>20[0-9]{2})\-(?<mm>[01][0-9]|[0-9])\-(?<dd>[0-3][0-9]|[0-9])
This way you have three named groups (yyyy, mm, dd) you can access and validate the matched values... The regex is smaller, you have a better association between code and regex and both are easier to maintain.
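As a rough illustration of "extract with the regex, validate in code", here is a sketch in Python. Python spells named groups (?P<name>...) rather than (?<name>...), and the pattern here is slightly looser than the one above because the date constructor does the real checking. The names date_in_range, START and END are made up for the example:

```python
import re
from datetime import date

DATE_RE = re.compile(r"(?P<yyyy>\d{4})-(?P<mm>\d{1,2})-(?P<dd>\d{1,2})")
START, END = date(2019, 12, 20), date(2020, 1, 10)

def date_in_range(text, start=START, end=END):
    """Return True if text is a real date between start and end, inclusive."""
    match = DATE_RE.fullmatch(text)
    if match is None:
        return False
    try:
        parsed = date(int(match.group("yyyy")),
                      int(match.group("mm")),
                      int(match.group("dd")))
    except ValueError:
        return False  # not a real calendar date, e.g. 2019-12-35
    return start <= parsed <= end

print(date_in_range("2019-12-31"))
print(date_in_range("2020-12-11"))
```

Changing the range then only means changing START and END, not the pattern.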
I'm using an online tool to create contests. In order to send prizes, there's a form in there asking for user information (first name, last name, address,... etc).
There's an option to use regular expressions to validate the data entered in this form.
I'm struggling with the regular expression to put for the street number (I'm located in Belgium).
A street number can be the following:
1234
1234a
1234a12
begins with a number (max 4 digits)
can have letters after the number (max 2 characters)
can have numbers after the letter(s) (max 3 digits)
I came up with the following expression:
^([0-9]{1,4})([A-Za-z]{1,2})?([0-9]{1,3})?$
But the problem is that as letters and second part of numbers are optional, it allows to enter numbers with up to 8 digits, which is not optimal.
1234 (first group)(no letters in the second group) 5678 (third group)
If one of you can tip me on how to achieve the expected result, it would be greatly appreciated !
You might use this regex:
^\d{1,4}([a-zA-Z]{1,2}\d{1,3}|[a-zA-Z]{1,2}|)$
where:
\d{1,4} - 1-4 digits
([a-zA-Z]{1,2}\d{1,3}|[a-zA-Z]{1,2}|) - optional group, which can be
[a-zA-Z]{1,2}\d{1,3} - 1-2 letters + 1-3 digits
or
[a-zA-Z]{1,2} - 1-2 letters
or
empty
\d{0,4}[a-zA-Z]{0,2}\d{0,3}
\d{0,4} The first group matches a number with at most 4 digits
[a-zA-Z]{0,2} The second group matches at most 2 letters
\d{0,3} The third group matches a number with at most 3 digits
You have to keep the last two groups together, not allowing the last one to be present, if the second isn't, e.g.
^\d{1,4}(?:[a-zA-Z]{1,2}\d{0,3})?$
or a little less optimized (but showing the approach a bit better)
^\d{1,4}(?:[a-zA-Z]{1,2}(?:\d{1,3})?)?$
As you are using this for a validation I assumed that you don't need the capturing groups and replaced them with non-capturing ones.
You might want to change the first number check to [1-9]\d{0,3} to disallow leading zeros.
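To see the effect of keeping the letters and the trailing digits in one optional group, here is a small check in Python. It is only a sketch: the helper name is_street_number is made up, the leading-zero variant is used, and fullmatch takes the place of the ^ and $ anchors:

```python
import re

# 1-4 digits without a leading zero, optionally followed by
# 1-2 letters and then 0-3 digits.
STREET_NO_RE = re.compile(r"[1-9]\d{0,3}(?:[a-zA-Z]{1,2}\d{0,3})?")

def is_street_number(text):
    """Return True if text is an acceptable street number."""
    return STREET_NO_RE.fullmatch(text) is not None

for sample in ("1234", "1234a", "1234a12", "12345678", "0123"):
    print(sample, is_street_number(sample))
```

The 8-digit input is rejected because the second run of digits can only appear after at least one letter.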
Thank you so much for your answers! I tried Sebastian's solution:
^\d{1,4}(?:[a-zA-Z]{1,2}\d{0,3})?$
And it works like a charm! I still don't really understand what the ":" stands for, but I'll try to figure it out next time I have to fiddle with regex!
Have a nice day,
Stan
The first digit cannot be 0.
There shouldn't be other symbols before and after the number.
So:
^[1-9]\d{0,3}(?:[a-zA-Z]{1,2}\d{0,3})?$
The ?: combination makes the () group non-capturing: it groups the pattern without creating a captured substring.
Here is the regex with tests for it.
I have thousands of article descriptions containing numbers.
they look like:
ca.2760h3x1000.5DIN345x1500e34
the resulting numbers should be:
2760
1000.5
1500
h3 (or 3) shall not be a result of the parsing, since h3 is only a tolerance.
The same goes for e34.
DIN345 is a norm and needs to be excluded (as does every number preceded by DIN or BN).
My current REGEX is:
[^hHeE]([-+]?([0-9]+\.[0-9]+|[0-9]+))
This solves everything BUT the norm. How can I get this "DIN" and "BN" treated the same way as a single character ?
Thanx, TomE
Try using this regular expression:
(?<=x)[+-]?0*[0-9]+(?:\.[0-9]+)?|[+-]?0*[0-9]+(?:\.[0-9]+)?(?=h|e)
It looks like every number you want to match in your test case, except the first, is preceded by x. That is what the first part of the regex matches: (?<=x)[+-]?0*[0-9]+(?:\.[0-9]+)? The second part of the regex matches a number that is followed by h or e: [+-]?0*[0-9]+(?:\.[0-9]+)?(?=h|e)
The subpattern [+-]?0*[0-9]+(?:\.[0-9]+)?, which appears in both parts, matches the number itself.
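Here is how that pattern behaves on the sample description, sketched in Python. The names NUMBER and PATTERN are made up for the example; the regex itself is the one above, assembled from its two parts:

```python
import re

# The number subpattern shared by both alternatives.
NUMBER = r"[+-]?0*[0-9]+(?:\.[0-9]+)?"
# Either a number preceded by x, or a number followed by h or e.
PATTERN = re.compile(rf"(?<=x){NUMBER}|{NUMBER}(?=h|e)")

description = "ca.2760h3x1000.5DIN345x1500e34"
numbers = PATTERN.findall(description)
print(numbers)  # ['2760', '1000.5', '1500']
```

Keep in mind that this relies on the shape of the test case (wanted numbers sit after an x or before a tolerance letter), so it is worth running against more of the real descriptions before trusting it.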
If we can assume that the numbers are always going to be four digits long, you can use the regex:
(\d{4}\.\d+|\d{4})
DEMO
Depending on the language you might need to replace \d with [0-9].
There's a long natural number that can be grouped to smaller numbers by the 0 (zero) delimiter.
Example: 4201100370880
This would divide to Group1: 42, Group2: 110, Group3: 370880
There are 3 groups; groups never start with 0 and are at least 1 character long. Also, the last group is taken "as is", meaning it's not terminated by a trailing 0.
This is what I came up with, but it only works for certain inputs (like 420110037880):
(\d+)0([1-9][0-9]{1,2})0([1-9]\d+)
This shows I'm attempting to declare the 2nd group's length to min2 max3, but I'm thinking the correct solution should not care about it. If the delimiter was non-numeric I could probably tackle it, but I'm stumped.
All right, factoring in comment information, try splitting on a regex (this may vary based on what language you're using - .split(/.../) in JavaScript, preg_split in PHP, etc.)
The regex you want to split on is: 0(?!0). This translates to "a zero that is not followed by a zero". I believe this will solve your splitting problem.
If your language allows a limit parameter (PHP does), set it to 3. If not, you will need to do something like this (JavaScript):
result = input.split(/0(?!0)/);
result = result.slice(0,2).concat(result.slice(2).join("0"));
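For comparison, Python's re.split takes a maxsplit argument, so the limit can be expressed directly. A small sketch (the function name split_groups is made up):

```python
import re

def split_groups(number_text):
    """Split on the first two zeros that are not followed by another zero."""
    # maxsplit=2 stops after two delimiters, so the third group is kept as is.
    return re.split(r"0(?!0)", number_text, maxsplit=2)

print(split_groups("4201100370880"))  # ['42', '110', '370880']
```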
The following one should suit your needs:
^(.*?)0(?!0)(.*?)0(?!0)(.*)$
Visualization by Debuggex
The following regex works:
(\d+?)0(?!0) with the g modifier
Demo: http://regex101.com/r/rS4dE5
For only three matches, you can do:
(\d+?)0(?!0)(\d+?)0(?!0)(.*)
I want to create a regex that will match any of these values
7-5
6-6 ((0-99) - (0-99))
6-4
6-3
6-2
6-1
6-0
0-6
1-6
2-6
3-6
4-6
the 6-6 example is a special case, here are some examples of values:
6-6 (23-8)
6-6 (4-25)
6-6 (56-34)
Is it possible to make one regex that can do this?
If so, is it possible to further extend that regex for the 6-6 special case such that the difference between the two numbers within the parentheses is equal to 2 or -2?
I could easily write this with procedural code, but I'm really curious if someone can devise a regex for this.
Lastly, if it could be further extended such that the individual numbers were in their own match groups I'd be amazed. An example would be for 7-5, where I could have a match group that just had the value 7, and another that had the value 5. However, for 6-6 (24-26) I'd like a match group that had the first 6, a match group for the second 6, a match group for the 24 and a match group for the 26.
This may be impossible, but some of you can probably get this part of the way there.
Good luck, and thanks for the help.
NO. The answer is "We can't," and the reason is because you're trying to use a hammer to dig a hole.
The problem with writing one long "clever" (this word causes a knee-jerk reaction in many people who are far more anti-regex than I) regex is that, six months from now, you'll have forgotten those clever regex features that you used so heavily, and you'll have written six months worth of code related to something else, and you'll get back to your impressive regex and have to tweak one detail, and you'll say, "WTF?"
This is what (I understand) you want, in Perl:
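# data is in $_
if (/7-5|6-[0-4]|[0-4]-6|6-6 \((\d{1,2})-(\d{1,2})\)/) {
    # $1 and $2 are only set when the 6-6 (x-y) alternative matched;
    # test with defined so that a value of 0 is not treated as false
    if (defined $1 and defined $2 and abs($1 - $2) == 2) {
        # we have the right difference
    }
}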
Some might say that the given regex is a bit much, but I don't think it's too bad. If the \d{1,2} bit is a little too obscure you could use \d\d? (which is what I used at first, but didn't like the repetition).
You can do it like this:
7-5|6-[0-4]|[0-4]-6|6-6 \(\d\d?-\d\d?\)
Just add parens to get your match groups.
Off the top of my head (there may be some errors but the principle should be good):
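6-6 \(\d+-\d+\)|\d-\d
And like with any regexp, you can surround what you want to extract with parentheses for match groups (the literal parentheses are escaped, and the 6-6 alternative comes first so that the plain \d-\d does not match it on its own):
(6)-(6) \((\d+)-(\d+)\)|(\d)-(\d)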
In the 6-6 case, the first two parentheses should get the sixes, and the second two should get the multi-digit values that come afterwards.
Here is one that will match only the numbers you want and let you get each digit by name:
p = r'(?P<a>[0-4]|6|7)-(?P<b>[0-4]|6|5) *(\((?P<c>\d{1,2})-(?P<d>\d{1,2})\))?'
To get each digit you could use:
values = re.search(p, string).group('a', 'b', 'c', 'd')
This returns a four-element tuple with the values you are looking for; c and d are None when there is no parenthesized part. Note that re.search itself returns None if no match was found, so check for that before calling group.
One problem with this pattern is that it will match the stuff in the parentheses whether or not there was a match to '6-6'. This one will only match the final parentheses if 6-6 is matched:
p = r'(?P<a>[0-4]|(?P<tmp_a>6)|7)-(?P<b>(?(tmp_a)(?P<tmp_b>6)|([0-4]|5)))(?(tmp_b) *(\((?P<c>\d{1,2})-(?P<d>\d{1,2})\))?)'
I don't know of any way to look for a difference between the numbers in the parenthesis; regex only knows about strings, not numerical values . . .
(I am assuming python syntax here; the perl syntax is slightly different, though perl supports the python way of doing things.)
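Putting the advice from this thread together (a regex for the shape, ordinary code for the arithmetic), a sketch in Python could look like the following. This is only one possible reading of the requirements: the names SCORE_RE, PLAIN_SCORES and parse_score are made up, and the difference-of-2 rule is enforced for every 6-6 score:

```python
import re

# Shape only: digit-digit, optionally followed by " (n-n)".
SCORE_RE = re.compile(r"(\d)-(\d)(?: \((\d{1,2})-(\d{1,2})\))?")
# Scores allowed without a parenthesized part: 7-5, 6-0..6-4, 0-6..4-6.
PLAIN_SCORES = {(7, 5)} | {(6, n) for n in range(5)} | {(n, 6) for n in range(5)}

def parse_score(text):
    """Return the numbers as a tuple of ints, or None if the score is not allowed."""
    match = SCORE_RE.fullmatch(text)
    if match is None:
        return None
    a, b = int(match.group(1)), int(match.group(2))
    if match.group(3) is None:
        return (a, b) if (a, b) in PLAIN_SCORES else None
    c, d = int(match.group(3)), int(match.group(4))
    # The parenthesized part is only valid after 6-6 and needs a difference of 2.
    if (a, b) == (6, 6) and abs(c - d) == 2:
        return (a, b, c, d)
    return None

print(parse_score("7-5"))          # (7, 5)
print(parse_score("6-6 (24-26)"))  # (6, 6, 24, 26)
print(parse_score("6-6 (23-8)"))   # None
```

Each number ends up in its own tuple slot, which covers the "own match group" wish without any conditional constructs in the pattern.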