Trying to check input textbox for time - regex

I have to build an overview of questions in which the user must be able to enter a time.
To do this I made 2 textboxes: 1 for the hour input and 1 for the minute input.
What I want to do now is check that the values aren't too high to be valid.
Example:
The hour value can't be higher than 23 and the minute can't be higher than 59.
What is the best method for checking this?
I've been thinking about if statements, but maybe there is a much more efficient way to get this done?
Maybe regular expressions, although I wouldn't know the correct syntax for this.
Thanks in advance.

If it has to be a regex:
^(?:2[0-3]|[01]?[0-9])$
will validate the hour and
^[0-5]?[0-9]$
will validate the minute.
Explanation of the "hours" regex (you can easily figure out the minutes one yourself):
^ # Match start of string
(?: # Match either...
2[0-3] # 2, followed by 0, 1, 2 or 3,
| # or...
[01]? # 0 or 1 (optional; the empty string is OK, too), followed by
[0-9] # any digit
) # End of group
$ # Match end of string
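As a rough illustration, the two patterns above could be applied in Python like this. This is only a sketch: it assumes the textbox values arrive as plain strings, and the names HOUR_RE, MINUTE_RE and is_valid_time are made up for the example.

```python
import re

# Patterns taken from the answer above.
HOUR_RE = re.compile(r"^(?:2[0-3]|[01]?[0-9])$")
MINUTE_RE = re.compile(r"^[0-5]?[0-9]$")


def is_valid_time(hour_text: str, minute_text: str) -> bool:
    """Return True when both strings are an in-range hour and minute."""
    # fullmatch() is used instead of match() because in Python `$` on its
    # own would also accept a value with a trailing newline.
    return (HOUR_RE.fullmatch(hour_text) is not None
            and MINUTE_RE.fullmatch(minute_text) is not None)
```

With these patterns "7" and "07" are both accepted as an hour, while "24", an empty string, or anything containing a non-digit is rejected.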

If statements are definitely the way to go. There's no reason to use a regular expression for something so simple... it's like using a sledgehammer to place a small nail into a wall. If statements are also very efficient and easy to read... there's no reason to use regex for what you're doing.
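For comparison, the plain if-statement approach might look something like the following in Python. It is a sketch under the assumption that the inputs are strings of at most two digits; time_in_range is a hypothetical name.

```python
def time_in_range(hour_text: str, minute_text: str) -> bool:
    """Validate the two textbox values with plain comparisons."""
    for text in (hour_text, minute_text):
        # Accept only one or two ASCII digits; this rejects empty strings,
        # signs, spaces and letters before int() is ever called.
        if not (1 <= len(text) <= 2 and text.isascii() and text.isdigit()):
            return False
    # Digits-only strings cannot be negative, so only the upper bounds matter.
    return int(hour_text) <= 23 and int(minute_text) <= 59
```

The range limits are stated directly as numbers here, which is the readability argument made above.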

Related

Regular Expression Extracting Text from a group

I have a filename like this:
0296005_PH3843C5_SEQ_6210_QTY_BILLING_D_DEV_0000000000000183.PS.
I needed to break down the name into groups which are separated by an underscore, which I did like this:
(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)(\d{16})(.*)
So far so good.
Now I need to extract characters from one of the groups. For example, in group 2 I need the first 3 and 8 digits (keep in mind they could be letters too).
So I tried something like this:
(.*?)_([38]{2})(.*?) _(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)(\d{16})(.*)
It didn’t work but if I do this:
(.*?)_([PH]{2})(.*?) _(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)(\d{16})(.*)
It will pull the PH into a group, but not the 38. So I’m lost at this point.
Any help would be great.
Try the regex below to match any first 3 letters/digits followed by one digit:
(.*?)_([A-Z0-9]{3}[0-9]{1})(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)
Try the regex below to match any first 3 letters/digits followed by one letter/digit:
(.*?)_([A-Z0-9]{3}[A-Z0-9]{1})(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)
It will match any 3 letters/digits followed by 1 letter/digit.
If the first two letters are a constant like "PH", then try this:
(.*?)_([PH]+[0-9A-Z]{2})(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)
I am assuming that you are trying to match group 2 starting with digits. If that is the case, then you have to change the source string to something like
0296005_383843C5_SEQ_6210_QTY_BILLING_D_DEV_0000000000000183.PS.
It works, check it out at https://regex101.com/r/zem3vt/1
Using [^_]* performs much better in your case than .*? since it doesn't backtrack. So changing your original regex from:
(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)_(.*?)(\d{16})(.*)
to:
([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_(.*?)(\d{16})(.*)
reduces the number of steps from 114 to 42 for your given string.
The best method might be to actually split your string on _ and then test the second element to see if it contains 38. Since you haven't specified a language, I can't help to show how in your language, but most languages employ a contains or indexOf method that can be used to determine whether or not a substring exists in a string.
Using regex alone, however, this can be accomplished using the following regular expression.
Ensuring 38 exists in the second part:
([^_]*)_([^_]*38[^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_(.*?)(\d{16})(.*)
Capturing the 38 in the second part:
([^_]*)_([^_]*)(38)([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_(.*?)(\d{16})(.*)
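The split-based alternative mentioned above could look roughly like this in Python. The language is an assumption, since the question does not name one, and second_field_contains is a hypothetical helper; the sample name is the one from the question.

```python
FILENAME = "0296005_PH3843C5_SEQ_6210_QTY_BILLING_D_DEV_0000000000000183.PS"


def second_field_contains(filename: str, needle: str) -> bool:
    """Split on underscores and look for `needle` in the second field."""
    parts = filename.split("_")
    # A name without any underscore has no second field to inspect.
    return len(parts) > 1 and needle in parts[1]
```

Calling second_field_contains(FILENAME, "38") inspects only "PH3843C5", so a 38 elsewhere in the name cannot produce a false positive.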

*NIX REGEXP number series

I'm playing around with regexps, but this one is giving me a headache. I have a dynamic number which needs a suffix. The suffix is always 0 to 9, 99 or 999.
Example:
I have the number 461200 and now I want to create a regexp that will match 461200 to 461209. From what I've learned, it should be ^46120[0-9]$. Is this correct, or somewhere to the left of hell?
Ok, let us assume it is correct and I now want to match 461200 - 461299. This is where I get lost.
^4612[0-9]{2}?
It cannot be. I am yet to figure this out.
Any help appreciated.
For 1 digit after the prefix 4612 you need:
^4612[0-9]$
2 digits at the end:
^4612[0-9]{2}$
3 digits at the end:
^4612[0-9]{3}$
The number in braces {} is the number of times the preceding character or set has to be repeated.
Ok, let us assume it is correct and I now want to match 461200 - 461299?
You can either repeat the desired character class by saying [0-9][0-9] or use quantifiers [0-9]{2}.
It can be either:
^4612[0-9][0-9]$
or
^4612[0-9]{2}$
Both would work.
Maybe try this regex:
^4612\d{2}$
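A quick way to convince yourself that the quantifier form covers exactly the intended range is to build the pattern from a prefix and a digit count and test it over the numbers. The sketch below uses Python purely for illustration (the question is about *NIX regexps in general), and suffix_pattern is a hypothetical helper.

```python
import re


def suffix_pattern(prefix: str, digits: int) -> re.Pattern:
    """Build ^<prefix>[0-9]{digits}$ for a fixed prefix and suffix length."""
    # re.escape() keeps the prefix literal; the tripled braces in the
    # f-string produce a literal {n} quantifier.
    return re.compile(rf"^{re.escape(prefix)}[0-9]{{{digits}}}$")
```

For example, suffix_pattern("4612", 2) matches every number from 461200 to 461299 and nothing outside that range, while suffix_pattern("46120", 1) covers 461200 to 461209.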

Regular Expression (RegEx) For Hours with Increments

I need to only accept input that meets these rules...
0.25-24
Increments of .25 (.00, .25, .50, .75)
First digit doesn't have to be required.
Would like trailing zeros to be optional.
Examples of some valid entries:
0.25
.50
.5
1
1.0
5.50
23.75
24 (max allowed)
UPDATE: nothing at all, null/blank, should also be accepted as valid
Example of some invalid entries:
0
.0
.00
0.0
0.00
24.25
-1
I understand that RegEx is a pattern-matching language, so it's not great for ranges or for less-than and greater-than checks. So checking that the value is less than or equal to 24 means I'd have to find a pattern, right? There are 24 possible patterns, which would make this a long RegEx; am I understanding this correctly? I could use ColdFusion to check that it's in the 0-24 range. It's not the end of the world if I have to use ColdFusion for this part, but it'd be nice to get it all into the RegEx if that doesn't make it too long. This is what I have so far:
^\d{0,2}((\.(0|00|25|5|50|75))?)$
http://regex101.com/r/iS7zM3
This handles pretty much all of it except for the 0-24 range check or the check for just a zero. I'll keep plugging away at it but any help would be appreciated. Thanks!
Change \d{0,2} to (?:1[0-9]?|2[0-4]?|[3-9])? and it'll match from 1 to 24 (or nothing).
You can also simplify the second part to (?:\.(?:00?|25|50?|75))? - you could go further to (?:\.(?:[05]0?|[27]5))? but that might obfuscate the intent a bit too far.
To exclude 24.25 you could perhaps use a negative lookahead (?!24\.[^0]) to prevent anything other than 24.0 or 24.00, but it's probably simpler to just exclude 24 from the main pattern and include a specific check for 24/24.0/24.00 at the start:
(?x)
# checks for 24
^24$|^24\.00?$
|
# integer part
^
(?:1[0-9]?|2[0-3]?|[3-9]|0(?=\.[^0])|(?=\.[^0]))
# decimal part
(?:\.(?:00?|25|50?|75))?
$
That also includes a check for 0(?=\.[^0]), which uses a positive lookahead to only allow an initial 0 if the next char is a . followed by a non-zero (so 0.0 and 0.00 aren't allowed).
The (?x) flag allows whitespace to be ignored, allowing readable regex in your code - obviously preferable to squashing it all onto a single line - and also enables the use of # to start line comments to explain parts of a pattern. (Literal whitespaces and hashes can be escaped with backslash, or encoded via e.g. \x23 for hash.)
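To try the commented pattern outside ColdFusion, the same text can be used with Python's re.VERBOSE flag, which plays the role of (?x). This is only a sketch; HOURS_RE and is_valid_hours are names invented for the example.

```python
import re

# Same pattern as above; re.VERBOSE replaces the inline (?x) flag.
HOURS_RE = re.compile(
    r"""
    # checks for 24
    ^24$ | ^24\.00?$
    |
    # integer part
    ^
    (?:1[0-9]?|2[0-3]?|[3-9]|0(?=\.[^0])|(?=\.[^0]))
    # decimal part
    (?:\.(?:00?|25|50?|75))?
    $
    """,
    re.VERBOSE,
)


def is_valid_hours(text: str) -> bool:
    """Return True if `text` is a quarter-hour value from 0.25 to 24."""
    return HOURS_RE.fullmatch(text) is not None
```

Note that this pattern rejects a blank string, so a caller that wants blank input to count as valid would need to test for that separately.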
For comparison, here's a pure-CFML way of doing it:
IsNumeric(Num)
AND Num GT 0
AND Num LTE 24
AND NOT find('.',Num*4)
Now, are you really sure it's better as a regex...
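The same arithmetic idea can be sketched in Python with the decimal module, which avoids binary floating-point surprises. This is an illustration rather than a translation of the CFML: hours_in_range is a hypothetical name, and the multiple-of-0.25 test uses a remainder instead of looking for a '.' in Num*4.

```python
from decimal import Decimal, InvalidOperation


def hours_in_range(text: str) -> bool:
    """Return True if `text` is a number in (0, 24] that is a multiple of 0.25."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        # Not parseable as a number at all (including the empty string).
        return False
    if not value.is_finite():
        # Decimal also parses "NaN" and "Infinity"; neither is a valid entry.
        return False
    return 0 < value <= 24 and value % Decimal("0.25") == 0
```

This check is looser than the regex about formatting: anything Decimal can parse, such as "0.250" or "1e1", is judged purely on its numeric value.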
You could try this regex (broken down):
^
(?:
(?:[1-9]|1\d|2[0-3])(?:\.(?:[05]0?|[27]5))? # Non-zeros with optional decimal
|
0?(?:\.(?:50?|[27]5)) # Decimals under 1
|
24(?:\.00?)? # The maximum
)
$
In one line:
^(?:(?:[1-9]|1\d|2[0-3])(?:\.(?:[05]0?|[27]5))?|0?(?:\.(?:50?|[27]5))|24(?:\.00?)?)$
^([0-1]?[0-9]|2[0-4])((\.(0|00|25|5|50|75))?)$
This means the ones place can be 0-9 if the tens place is missing, 0, or 1.
If the tens place is a 2, then the ones place can be 0-4.
The second part is great, it's simple and readable too. It has an extra set of parens though that can be removed, reducing it to this:
^([0-1]?[0-9]|2[0-4])(\.(0|00|25|5|50|75))?$

Simplify regular expression for time literals (like "10h50m")

I am writing lexer rules for a custom description language using pyLR1 which shall include time literals like for example:
10h30m # meaning 10 hours + 30 minutes
5m30s # meaning 5 minutes + 30 seconds
10h20m15s # meaning 10 hours + 20 minutes + 15 seconds
15.6s # meaning 15.6 seconds
The order of specification for hour, minute and second parts shall be fixed to h, m, s. To specify this in detail, I want the following valid combinations hms, hm, h, ms, m and s (with numbers between the different segments of course).
As a bonus the regex should check for decimal (i.e. non-natural) numbers in the segments and only allow these in the segment with least significance.
So I have for all but the last group a number match like:
([0-9]+)
And for the last group even:
([0-9]*\.[0-9]+|[0-9]+(\.[0-9]*)?) # to allow for .5 and 0.5 and 5.0 and 5
Going through all the combinations of h, m and s a cute little python script gives me the following regex:
(([0-9]*\.[0-9]+|[0-9]+(\.[0-9]*)?)h|([0-9]+)h([0-9]*\.[0-9]+|[0-9]+(\.[0-9]*)?)m|([0-9]+)h([0-9]+)m([0-9]*\.[0-9]+|[0-9]+(\.[0-9]*)?)s|([0-9]*\.[0-9]+|[0-9]+(\.[0-9]*)?)m|([0-9]+)m([0-9]*\.[0-9]+|[0-9]+(\.[0-9]*)?)s|([0-9]*\.[0-9]+|[0-9]+(\.[0-9]*)?)s)
Obviously, this is a bit of a horror of an expression. Is there any way to simplify it? The answer must work with Python's re module, and I will also accept answers which do not work with pyLR1 if that is due to its restricted subset of regular expressions.
You can factorise your regular expression. Using the notation h, m, s to denote each of the subregexes, the most basic version is:
h|hm|hms|ms|m|s
which is what you have currently. You can break this into:
(h|hm|hms)|(ms|m)|s
and then pulling out h from the first expression and m from the second we get (using (x|) == x?):
h(m|ms)?|ms?|s
Continuing on we get to
h(ms?)?|ms?|s
which is probably simpler (and probably the simplest).
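The factorised form h(ms?)?|ms?|s can be turned into a concrete pattern by substituting a sub-pattern for each letter. The Python sketch below does this for integer-only segments; H, M, S and TIME_RE are illustrative names, and decimals are deliberately left out.

```python
import re

# Integer-only sub-patterns for each segment (decimals are ignored here).
H = r"[0-9]+h"
M = r"[0-9]+m"
S = r"[0-9]+s"

# h(ms?)?|ms?|s with each letter replaced by its sub-pattern.
TIME_RE = re.compile(rf"(?:{H}(?:{M}(?:{S})?)?|{M}(?:{S})?|{S})")
```

Used with TIME_RE.fullmatch, this accepts the six combinations hms, hm, h, ms, m and s, and rejects both the empty string and an hours-plus-seconds literal such as 10h30s.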
Adding in the regex d to denote decimals (as in \.[0-9]+), this could be written as
h(d|m(d|sd?)?)?|m(d|sd?)?|sd?
(i.e. at each stage optionally have either decimals, or a continuation to the next of h m or s.)
This would result in something like (for just hours and minutes):
[0-9]+((\.[0-9]+)?h|h[0-9]+(\.[0-9]+)?m)|[0-9]+(\.[0-9]+)?m
Looking at this, it might not be possible to get this into a form amenable to pyLR1, so doing the parsing with decimals in every spot and then a secondary check might be the best way to do this.
The representation below should be understandable. I don't know the exact regex syntax you're using, so you will have to "translate" it to valid syntax yourself.
Your hours:
[0-9]{1,2}h
Your minutes:
[0-9]{1,2}m
Your seconds:
[0-9]{1,2}(\.[0-9]{1,3})?s
You want all of those in order, and want to be able to omit any of them (wrap each with ?):
([0-9]{1,2}h)?([0-9]{1,2}m)?([0-9]{1,2}(\.[0-9]{1,3})?s)?
This, however, also matches things like 10h30s.
That is, the valid combinations are hms, hm, hs, h, ms, m and s.
In other words, minutes can be omitted while hours and seconds are still present.
The other problem is that the empty string is matched, as all three ? make that valid. So you have to work around this somehow.
Looking at dbaupp's h(ms?)?|ms?|s, you can take the above and match:
h: [0-9]{1,2}h
m: [0-9]{1,2}m
s: [0-9]{1,2}(\.[0-9]{1,3})?s
So you get to:
h(ms?)?: [0-9]{1,2}h([0-9]{1,2}m([0-9]{1,2}(\.[0-9]{1,3})?s)?)?
ms? : [0-9]{1,2}m([0-9]{1,2}(\.[0-9]{1,3})?s)?
s : [0-9]{1,2}(\.[0-9]{1,3})?s
All of those OR'd together give you a big but easy-to-break-down regex:
[0-9]{1,2}h([0-9]{1,2}m([0-9]{1,2}(\.[0-9]{1,3})?s)?)?|[0-9]{1,2}m([0-9]{1,2}(\.[0-9]{1,3})?s)?|[0-9]{1,2}(\.[0-9]{1,3})?s
which gets you away from both the empty-string problem and the match of hs.
Looking at Donal Fellows' comment on dbaupp's answer, I'll also do (h?m)?S|h?M|H:
(h?m)?s: (([0-9]{1,2}h)?[0-9]{1,2}m)?[0-9]{1,2}(\.[0-9]{1,3})?s
h?m : ([0-9]{1,2}h)?[0-9]{1,2}m
h : [0-9]{1,2}h
Merged together, you end up with something smaller than the above:
(([0-9]{1,2}h)?[0-9]{1,2}m)?[0-9]{1,2}(\.[0-9]{1,3})?s|([0-9]{1,2}h)?[0-9]{1,2}m|[0-9]{1,2}h
Now we have to find a way to match the .xx decimal representation.
Here is a short Python expression that works:
(\d+h)?(\d+m)?(\d*\.\d+|\d+(\.\d*)?)(?(2)s|(?(1)m|[hms]))
Inspired by Cameron Martins answer based on conditionals.
Explained:
(\d+h)? # optional int "h" (capture 1)
(\d+m)? # optional int "m" (capture 2)
(\d*\.\d+|\d+(\.\d*)?) # int or decimal
(?(2) # if "m" (capture 2) was matched:
s # "s"
| (?(1) # else if "h" (capture 1) was matched:
m # "m"
| # else (nothing matched):
[hms])) # any of the "h", "m" or "s"
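A small harness for trying the conditional pattern above with Python's re module might look like this. It is a sketch: TIME_RE and is_time_literal are names chosen for the example, and fullmatch is used so that the whole literal has to match.

```python
import re

# The conditional pattern from the answer above, unchanged.
TIME_RE = re.compile(
    r"(\d+h)?(\d+m)?(\d*\.\d+|\d+(\.\d*)?)(?(2)s|(?(1)m|[hms]))"
)


def is_time_literal(text: str) -> bool:
    """Return True if the whole string is a valid h/m/s time literal."""
    return TIME_RE.fullmatch(text) is not None
```

Because only the last number may carry a decimal part and the unit that follows it depends on which earlier groups matched, literals such as 10h30s or 1.5h30m are rejected while 10h30.5m is accepted.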
You may have hours, minutes, and seconds.
/(\d{1,2}h)*(\d{1,2}m)*(\d{1,2}(\.\d+)*s)*/
should do the work. Depending on the regex library, you will get your items in order, or you will have to parse them further to check for h, m or s.
In this latter case, see also what is returned by
/(\d{1,2}(h))*(\d{1,2}(m))*(\d{1,2}(\.\d+)*(s))*/
The last group should be:
([0-9]*\.[0-9]+|[0-9]+(\.[0-9]+)?)
unless you want to match 5.
You could use regex ifs, like so:
(([0-9]+h)?([0-9]+m)?([0-9]+s)?)(?(?<=h)(([0-9]*\.[0-9]+|[0-9]+(\.[0-9]*)?)m)?|(?(?<=m)(([0-9]*\.[0-9]+|[0-9]+(\.[0-9]*)?)s)?|\b(([0-9]*\.[0-9]+|[0-9]+(\.[0-9]*)?)[hms])?))
Here - http://regexr.com?31dmj
I haven't checked that this works, but it tries to match just integers for hours, minutes, then seconds first. Then, if the last thing matched is hours, it allows fractional minutes; otherwise, if the last thing matched is minutes, it allows fractional seconds.

Create shortest possible regex

I want to create a regex that will match any of these values
7-5
6-6 ((0-99) - (0-99))
6-4
6-3
6-2
6-1
6-0
0-6
1-6
2-6
3-6
4-6
the 6-6 example is a special case, here are some examples of values:
6-6 (23-8)
6-6 (4-25)
6-6 (56-34)
Is it possible to make one regex that can do this?
If so, is it possible to further extend that regex for the 6-6 special case such that the difference between the two numbers within the parentheses is equal to 2 or -2?
I could easily write this with procedural code, but I'm really curious if someone can devise a regex for this.
Lastly, if it could be further extended such that the individual digits were in their own match groups, I'd be amazed. An example would be for 7-5: I could have a match group that just had the value 7, and another that had the value 5. However, for 6-6 (24-26) I'd like a match group that had the first six, a match group for the second 6, a match group for the 24 and a match group for the 26.
This may be impossible, but some of you can probably get this part of the way there.
Good luck, and thanks for the help.
NO. The answer is "We can't," and the reason is because you're trying to use a hammer to dig a hole.
The problem with writing one long "clever" (this word causes a knee-jerk reaction in many people who are far more anti-regex than I) regex is that, six months from now, you'll have forgotten those clever regex features that you used so heavily, and you'll have written six months worth of code related to something else, and you'll get back to your impressive regex and have to tweak one detail, and you'll say, "WTF?"
This is what (I understand) you want, in Perl:
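# data is in $_
if (/7-5|6-[0-4]|[0-4]-6|6-6 \((\d{1,2})-(\d{1,2})\)/) {
    # $1 and $2 are only set when the 6-6 alternative matched;
    # test with defined() so that a captured 0 is not treated as false
    if (defined $1 and defined $2 and abs($1 - $2) == 2) {
        # we have the right difference
    }
}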
Some might say that the given regex is a bit much, but I don't think it's too bad. If the \d{1,2} bit is a little too obscure you could use \d\d? (which is what I used at first, but didn't like the repetition).
You can do it like this:
7-5|6-[0-4]|[0-4]-6|6-6 \(\d\d?-\d\d?\)
Just add parens to get your match groups.
Off the top of my head (there may be some errors but the principle should be good):
\d-\d|6-6 \(\d+-\d+\)
And like with any regexp, you can surround what you want to extract with parentheses for match groups:
(\d)-(\d)|(6)-(6) \((\d+)-(\d+)\)
In the 6-6 case, the first two parentheses should get the sixes, and the second two should get the multi-digit values that come afterwards.
Here is one that will match only the numbers you want and let you get each digit by name:
p = r'(?P<a>[0-4]|6|7)-(?P<b>[0-4]|6|5) *(\((?P<c>\d{1,2})-(?P<d>\d{1,2})\))?'
To get each digit you could use:
values = re.search(p, string).group('a', 'b', 'c', 'd')
This returns a four-element tuple with the values you are looking for; groups that did not take part in the match come back as None. If there is no match at all, re.search itself returns None, so check for that before calling .group().
One problem with this pattern is that it will match the stuff in the parentheses whether or not there was a match for '6-6'. This one will only match the final parentheses if 6-6 is matched:
p = r'(?P<a>[0-4]|(?P<tmp_a>6)|7)-(?P<b>(?(tmp_a)(?P<tmp_b>6)|([0-4]|5)))(?(tmp_b) *(\((?P<c>\d{1,2})-(?P<d>\d{1,2})\))?)'
I don't know of any way to look for a difference between the numbers in the parentheses; regex only knows about strings, not numerical values . . .
(I am assuming Python syntax here; the Perl syntax is slightly different, though Perl supports the Python way of doing things.)
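Since the regex can only check the shape of the text, one way to combine the pieces discussed in this thread is to let the pattern capture the numbers and do the difference check in ordinary code. The sketch below is one possible arrangement, not the method of any answer above; SCORE_RE, parse_score and the require_diff_2 parameter are all hypothetical.

```python
import re

# One alternative per allowed shape; every number sits in its own group.
SCORE_RE = re.compile(
    r"(7)-(5)|(6)-([0-4])|([0-4])-(6)|(6)-(6) \((\d{1,2})-(\d{1,2})\)"
)


def parse_score(text: str, require_diff_2: bool = False):
    """Return the captured numbers as a tuple of ints, or None if invalid."""
    match = SCORE_RE.fullmatch(text)
    if match is None:
        return None
    # Only the groups of the alternative that matched are not None.
    values = tuple(int(g) for g in match.groups() if g is not None)
    if require_diff_2 and len(values) == 4 and abs(values[2] - values[3]) != 2:
        # The numeric comparison is done here because a regex cannot do it.
        return None
    return values
```

For a plain score the result has two elements, and for the 6-6 case it has four, which gives each number its own slot as asked for in the question.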