Regex to allow separator only when both parts are present - regex

Currently I'm writing a regex to validate datetime strings. The basic format is
(Date)(?: *,?)(Time)
I've worked out the date and time portions of the regex just fine, but I'm having trouble making the regex allow for all three of these cases:
date and no separator and no time
date and separator and time
no date and no separator and time
The easy way (making all three parts optional) has the unintended side-effect of allowing the case
no date and separator and no time
Is this something that can be achieved with a lookaround?
Example valid input:
January 25 2004, 10:30 PM
2004-1-25
10:30 PM
Invalid input:
,
(The Date and Time regexes handle a bunch of cases, and I've got that worked out - it's preventing that invalid case while allowing for all three forms that I still need.)

If you include the separator with either the date or time group, then a separator by itself will not match:
^(DATE(?: *,? *))?(TIME)?$
It has an optional date and separator, followed by an optional time.
See demo: http://regex101.com/r/dH6nI2/1

You could use alternation:
((Date)((?: *,?)(Time))?|((Date)(?: *,?))?(Time))
Though it would lead to some duplication.

I think I found a solution which works. I use a lookahead at the beginning to verify that it doesn't start with a separator... and then make the separator an optional part of the time section.
/^(?=[A-Za-z0-9])(Date)(?:(?:Separator)?(Time))$/
As an example, to validate datetimes with the date in the format YYYY-MM-DD and times in the format HH:ii and Separator , *| + (optional comma and one or more spaces), it would be
/^(?=[0-9])(?:([0-9]{4})-([0-9]{2})-([0-9]{2}))?(?:(?:, *| +)?([0-9]{2}:[0-9]{2}))?$/

Related

Regex to validate date format

I'm figuring out a way to validate my input date in Altva Mapforce which is in the format "YYYYMMDD".
I know to verify year I can use [0-9]{4} but I'm having trouble figuring out a way to "restrict" date range to "01-31" and month to "01-12". Please note "01" is valid while "1" should be invalid.
Can someone please provide a regex expression to validate this sort of input?
From searching internet I got one for month: ([1-9]|[12]\d|3[01]) but this one is valid for range 1-31. I want 01-31 and so on.
For a month from 1-12 adding a zero for a single digit 1-9:
(?:0[1-9]|1[012])
For a day 1-31 adding a zero for a single digit 1-9:
(?:0[1-9]|[12]\d|3[01])
Putting it all together with 4 digits to match a year (note that \d{4} can also match 0000 and 9999), enclosed in word boundaries \b to prevent a partial match with leading or trailing digits / word characters:
\b\d{4}(?:0[1-9]|1[012])(?:0[1-9]|[12]\d|3[01])\b
A variation, limiting the scope for a year to for example 1900 - 2099
\b(?:19|20)\d{2}(?:0[1-9]|1[012])(?:0[1-9]|[12]\d|3[01])\b
But note that this does not validate the date itself, it can for example also match 20210231. To validate a date, use a designated api for handling a date in the tool or code.

regular expression for a match to a specific date

I have a need to find a match for any kind of legit date format, but - for a specific given date only, that I am given as a parameter.
for example: 01-05-2020
I need it to match as many formats as possible, such as, but not only : 01/05/2020, 1/5/20, 05-01-2020 2020-05-01, and so on.
not match: 02-05-2020, or any other date that is not the first of May 2020.
thanks,
Dani
As upog already said, you must have a constant format for the month and date. 5th jan and 1st may can easily switch places. However given you have a constant format we can construct a regex.
For the format - Date/Month/Year and Date-Month-Year. You can construct a regex to match 1st may of 2020 like so-
^0?1[\/\-]0?5[\/\-](?:20)?20$
Live demo
For the format - Month/Date/Year and Month-Date-Year. You can construct a regex to match 1st may of 2020 like so-
^0?5[\/\-]0?1[\/\-](?:20)?20$
Live demo

Regex expression for date within dates range

I need to validate with regex a date in format yyyy-mm-dd (2019-12-31) that should be within the range 2019-12-20 - 2020-01-10.
What would be the regex for this?
Thanks
Regex only deal with characters. so we have to work out at each position in the date what are the valid characters.
The first part is easy. The first two characters have to be 20
Now it gets complicated the next character can be a 1 or a 2 but what follows depends on the value of that character so we split the rest of the regex into two sections the first if the third character matches 1 and the second if it matches 2
We know that if the third character is a 1 then what must follow is the characters 9-12- as the range starts at 2019-12-20 now for the day part. The 9th character is the tens for the day this can only be 2 or 3 as we are already in the last month and the minimum date is 20. The last character can be any digit 0-9. This gives us a day match of [23][0-9]. Putting this together we now have a pattern for years starting 2019 as 19-12-[23][0-9]
It the third character is a 2 then we can match up to the day part of the date a gain as the range ends in January. This gives us a partial match of 20-01- leaving us to work on the day part. Hear we know that the first character of the day can either be a 1 or 0 however if it's a 1 then the last character must be a 0 and if it's a 0 then the last character can only be in the range 1 to 9. This give us another alteration (?:0[1-9]|10) Putting the second part together we get 20-01-(?:0[1-9]|10).
Combining these together gives the final regex 20(?:19-12-[23][0-9]|20-01-(?:0[1-9]|10))
Note that I'm assuming that the date you are testing against is a validly formatted date.
Try this:
(2019|2020)\-(12|01)\-([0-3][0-9]|[0-9])
But be aware that this will allow number up to where the first digit is between zero and three and the second digit between zero and nine for the dd value. You could specify all numbers you want to allow (from 20 to 10) like this (20|21|22|23|24|25|26|27|28|29|30|31|01|1|02|2|03|3|04|4|05|5|06|6|07|7|08|8|09|9|10).
(2019|2020)\-(12|01)\-(20|21|22|23|24|25|26|27|28|29|30|31|01|1|02|2|03|3|04|4|05|5|06|6|07|7|08|8|09|9|10)
But honestly... Regular-Expressions are not the right tool for this. RegExp gives a mask to something, not a logical context. Use regex to extract the data/value from a string and validate those values using another language.
The above 2nd Regex will, f.e. match your dates, but also values outside of this range since there is no context between 2019|2020 and the second group 12|01 so they match values like 2019-12-11 but also 2020-12-11.
To only match the values you want this will be a really large regex like this (inner brackets only if you need them) ((2019)-(12)-(20)|(2019)-(12)-(21)|(2019)-(12)-(22)|...) and continue with all possible dates - and ask yourself: what would you do if you find such a regex in a project you have to work with ;)
Better solution (quick and dirty, there might be better solutions):
(?<yyyy>20[0-9]{2})\-(?<mm>[01][0-9]|[0-9])\-(?<dd>[0-3][0-9]|[0-9])
This way you have three named groups (yyyy, mm, dd) you can access and validate the matched values... The regex is smaller, you have a better association between code and regex and both are easier to maintain.

How to creating a regex pattern in VBA to extract dates from string and exclude false matches

I am trying to use Regex to parse a series of strings to extract one or more text dates that may be in multiple formats. The strings will look something like the following:
24 Aug 2016: nno-emvirt010a/b; 16 Aug 2016 nnt-emvirt010a/b nnd-emvirt010a/b COSI-1.6.5
24.16 nno-emvirt010a/b nnt-emvirt010a/b nnd-emvirt010a/b EI.01.02.03\
9/23/16: COSI-1.6.5 Logs updated at /vobs/COTS/1.6.5/files/Status_2016-07-27.log, Status_2016-07-28.log, Status_2016-08-05.log, Status_2016-08-08.log
I am not concerned about validating the individual date fields; just extracting the date string. The part I am unable to figure out is how to not match on number sequences that match the pattern but aren’t dates (‘1.6.5’ in ex. (1) and 01.02.03 in ex. (2)) and dates that are part of a file name (2016-07-27 in ex. (3)). In each of these exception cases in my input data, the initial numbers are preceded by either a period(.), underscore (_) or dash (-), but I cannot determine how to use this to edit the pattern syntax to not match these strings.
The pattern I have that partially works is below. It will only ignore the non date matches if it starts with 1 digit as in example 1.
/[^_\.\(\/]\d{1,4}[/\-\.\s*]([1-9]|0[1-9]|[12][0-9]|3[01]|[a-z]{3})[/\-\.\s*]\d{1,4}/ig`
I am not sure about vba check if this works . seems they have given so much options : https://www.safaribooksonline.com/library/view/regular-expressions-cookbook/9781449327453/ch04s04.html
^(?:(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])|↵
(3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9]))/(?:[0-9]{2})?[0-9]{2}$
^(?:
# m/d or mm/dd
(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])
|
# d/m or dd/mm
(3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9])
)
# /yy or /yyyy
/(?:[0-9]{2})?[0-9]{2}$
According to the test strings you've presented, you can use the following regex
See this regex in use here
(?<=[^a-zA-Z\d.]|^)((?:\d{1,2}\s*[A-Z][a-z]{2}\s*\d+)|(?:(?:\d{1,2}\/){2}\d+)|(?:\d+(?:-\d{2}){2})|\d{2}\.\d{2})(?=[^a-zA-Z\d.])
This regex ensures that specific date formats are met and are preceded by nothing (beginning of the string) or by a non-word character (specifically a-z, A-Z, 0-9) or dot .. The date formats that will be matched are:
24 Aug 2016
24.16
9/23/16
The regex could be further manipulated to ensure numbers are in the proper range according to days/month, etc., however, I don't feel that is really necessary.
Edits
Edit 1
Since VBA doesn't support lookbehinds, you can use the following. The date is in capture group 1.
(?:[^a-zA-Z\d.]|^)((?:\d{1,2}\s*[A-Z][a-z]{2}\s*\d+)|(?:(?:\d{1,2}\/){2}\d+)|(?:\d+(?:-\d{2}){2})|\d{2}\.\d{2})(?=[^a-zA-Z\d.])
Edit 2
As per bulbus's comment below
(?:[^\w.]|^)((?:\d{1,2}\s*[A-Z][a-z]{2}\s*\d{2,4})|(?:(?:\d{‌1,2}\/){2}\d{2,4})|(‌​?:\d{2,4}(?:-\d{2}){‌​2})|\d{2}\.\d{2})
Took liberty to edit that a bit.
replaced [^a-zA-Z\d.] with [^\w.], comes with added advantage of excluding dates with _2016-07-28.log
Due to 1 removed trailing condition (?=[^a-zA-Z\d.]).
Forced year digits from \d+ to \d{2,4}
Edit 3
Due to added conditions of the regex, I've made the following edits (to improve upon both previous edits). As per the OP:
The edited pattern above works in all but 2 cases:
it does not find dates with the year first (ex. 2016/07/11)
if the date is contained within parenthesis in the string, it returns the left parenthesis as part of the date (ex. match = (8/20/2016)
Can you provide the edit to fix these?
In the below regexes, I've changed years to \d+ in order for it to work on any year greater than or equal to 0.
See the code in use here
(?:[^\w.]|^)((?:\d{1,2}\s+[A-Z][a-z]{2}\s+\d+)|(?:(?:\d{1,2}\/){2}\d+)|(?:\d+(?:\/\d{1,2}){2})|(?:\d+(?:-\d{2}){2})|\d{2}\.\d+)
This regex adds the possibility of dates in the XXXX/XX/XX format where the date may appear first.
The reason you are getting ( as a match before the regex is the nature of the Full Match. You need to, instead, grab the value of the first capture group and not the whole regex result. See this answer on how to grab submatches from a regex pattern in VBA.
Also, note that any additional date formats you need to catch need to be explicitly set in the regex. Currently, the regex supports the following date formats:
\d{1,2}\s+[A-Z][a-z]{2}\s+\d+
12 Apr 17
12 Apr 2017
(?:\d{1,2}\/){2}\d+
1/4/17
01/04/17
1/4/2017
01/04/2017
\d+(?:\/\d{1,2}){2}
17/04/01
2017/4/1
2017/04/01
17/4/1
\d+(?:-\d{2}){2}
17-04-01
2017-04-01
\d{2}\.\d+ - Although I'm not sure what this date format is even used for and how it could be considered efficient if it's missing month
24.16

Regular Expression to match valid dates

I'm trying to write a regular expression that validates a date. The regex needs to match the following
M/D/YYYY
MM/DD/YYYY
Single digit months can start with a leading zero (eg: 03/12/2008)
Single digit days can start with a leading zero (eg: 3/02/2008)
CANNOT include February 30 or February 31 (eg: 2/31/2008)
So far I have
^(([1-9]|1[012])[-/.]([1-9]|[12][0-9]|3[01])[-/.](19|20)\d\d)|((1[012]|0[1-9])(3[01]|2\d|1\d|0[1-9])(19|20)\d\d)|((1[012]|0[1-9])[-/.](3[01]|2\d|1\d|0[1-9])[-/.](19|20)\d\d)$
This matches properly EXCEPT it still includes 2/30/2008 & 2/31/2008.
Does anyone have a better suggestion?
Edit: I found the answer on RegExLib
^((((0[13578])|([13578])|(1[02]))[\/](([1-9])|([0-2][0-9])|(3[01])))|(((0[469])|([469])|(11))[\/](([1-9])|([0-2][0-9])|(30)))|((2|02)[\/](([1-9])|([0-2][0-9]))))[\/]\d{4}$|^\d{4}$
It matches all valid months that follow the MM/DD/YYYY format.
Thanks everyone for the help.
This is not an appropriate use of regular expressions. You'd be better off using
[0-9]{2}/[0-9]{2}/[0-9]{4}
and then checking ranges in a higher-level language.
Here is the Reg ex that matches all valid dates including leap years. Formats accepted mm/dd/yyyy or mm-dd-yyyy or mm.dd.yyyy format
^(?:(?:(?:0?[13578]|1[02])(\/|-|\.)31)\1|(?:(?:0?[1,3-9]|1[0-2])(\/|-|\.)(?:29|30)\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:0?2(\/|-|\.)29\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:(?:0?[1-9])|(?:1[0-2]))(\/|-|\.)(?:0?[1-9]|1\d|2[0-8])\4(?:(?:1[6-9]|[2-9]\d)?\d{2})$
courtesy Asiq Ahamed
I landed here because the title of this question is broad and I was looking for a regex that I could use to match on a specific date format (like the OP). But I then discovered, as many of the answers and comments have comprehensively highlighted, there are many pitfalls that make constructing an effective pattern very tricky when extracting dates that are mixed-in with poor quality or non-structured source data.
In my exploration of the issues, I have come up with a system that enables you to build a regular expression by arranging together four simpler sub-expressions that match on the delimiter, and valid ranges for the year, month and day fields in the order you require.
These are :-
Delimeters
[^\w\d\r\n:]
This will match anything that is not a word character, digit character, carriage return, new line or colon. The colon has to be there to prevent matching on times that look like dates (see my test Data)
You can optimise this part of the pattern to speed up matching, but this is a good foundation that detects most valid delimiters.
Note however; It will match a string with mixed delimiters like this 2/12-73 that may not actually be a valid date.
Year Values
(\d{4}|\d{2})
This matches a group of two or 4 digits, in most cases this is acceptable, but if you're dealing with data from the years 0-999 or beyond 9999 you need to decide how to handle that because in most cases a 1, 3 or >4 digit year is garbage.
Month Values
(0?[1-9]|1[0-2])
Matches any number between 1 and 12 with or without a leading zero - note: 0 and 00 is not matched.
Date Values
(0?[1-9]|[12]\d|30|31)
Matches any number between 1 and 31 with or without a leading zero - note: 0 and 00 is not matched.
This expression matches Date, Month, Year formatted dates
(0?[1-9]|[12]\d|30|31)[^\w\d\r\n:](0?[1-9]|1[0-2])[^\w\d\r\n:](\d{4}|\d{2})
But it will also match some of the Year, Month Date ones. It should also be bookended with the boundary operators to ensure the whole date string is selected and prevent valid sub-dates being extracted from data that is not well-formed i.e. without boundary tags 20/12/194 matches as 20/12/19 and 101/12/1974 matches as 01/12/1974
Compare the results of the next expression to the one above with the test data in the nonsense section (below)
\b(0?[1-9]|[12]\d|30|31)[^\w\d\r\n:](0?[1-9]|1[0-2])[^\w\d\r\n:](\d{4}|\d{2})\b
There's no validation in this regex so a well-formed but invalid date such as 31/02/2001 would be matched. That is a data quality issue, and as others have said, your regex shouldn't need to validate the data.
Because you (as a developer) can't guarantee the quality of the source data you do need to perform and handle additional validation in your code, if you try to match and validate the data in the RegEx it gets very messy and becomes difficult to support without very concise documentation.
Garbage in, garbage out.
Having said that, if you do have mixed formats where the date values vary, and you have to extract as much as you can; You can combine a couple of expressions together like so;
This (disastrous) expression matches DMY and YMD dates
(\b(0?[1-9]|[12]\d|30|31)[^\w\d\r\n:](0?[1-9]|1[0-2])[^\w\d\r\n:](\d{4}|\d{2})\b)|(\b(0?[1-9]|1[0-2])[^\w\d\r\n:](0?[1-9]|[12]\d|30|31)[^\w\d\r\n:](\d{4}|\d{2})\b)
BUT you won't be able to tell if dates like 6/9/1973 are the 6th of September or the 9th of June. I'm struggling to think of a scenario where that is not going to cause a problem somewhere down the line, it's bad practice and you shouldn't have to deal with it like that - find the data owner and hit them with the governance hammer.
Finally, if you want to match a YYYYMMDD string with no delimiters you can take some of the uncertainty out and the expression looks like this
\b(\d{4})(0[1-9]|1[0-2])(0[1-9]|[12]\d|30|31)\b
But note again, it will match on well-formed but invalid values like 20010231 (31th Feb!) :)
Test data
In experimenting with the solutions in this thread I ended up with a test data set that includes a variety of valid and non-valid dates and some tricky situations where you may or may not want to match i.e. Times that could match as dates and dates on multiple lines.
I hope this is useful to someone.
Valid Dates in various formats
Day, month, year
2/11/73
02/11/1973
2/1/73
02/01/73
31/1/1973
02/1/1973
31.1.2011
31-1-2001
29/2/1973
29/02/1976
03/06/2010
12/6/90
month, day, year
02/24/1975
06/19/66
03.31.1991
2.29.2003
02-29-55
03-13-55
03-13-1955
12\24\1974
12\30\1974
1\31\1974
03/31/2001
01/21/2001
12/13/2001
Match both DMY and MDY
12/12/1978
6/6/78
06/6/1978
6/06/1978
using whitespace as a delimiter
13 11 2001
11 13 2001
11 13 01
13 11 01
1 1 01
1 1 2001
Year Month Day order
76/02/02
1976/02/29
1976/2/13
76/09/31
YYYYMMDD sortable format
19741213
19750101
Valid dates before Epoch
12/1/10
12/01/660
12/01/00
12/01/0000
Valid date after 2038
01/01/2039
01/01/39
Valid date beyond the year 9999
01/01/10000
Dates with leading or trailing characters
12/31/21/
31/12/1921AD
31/12/1921.10:55
12/10/2016 8:26:00.39
wfuwdf12/11/74iuhwf
fwefew13/11/1974
01/12/1974vdwdfwe
01/01/99werwer
12321301/01/99
Times that look like dates
12:13:56
13:12:01
1:12:01PM
1:12:01 AM
Dates that runs across two lines
1/12/19
74
01/12/19
74/13/1946
31/12/20
08:13
Invalid, corrupted or nonsense dates
0/1/2001
1/0/2001
00/01/2100
01/0/2001
0101/2001
01/131/2001
31/31/2001
101/12/1974
56/56/56
00/00/0000
0/0/1999
12/01/0
12/10/-100
74/2/29
12/32/45
20/12/194
2/12-73
Maintainable Perl 5.10 version
/
(?:
(?<month> (?&mon_29)) [\/] (?<day>(?&day_29))
| (?<month> (?&mon_30)) [\/] (?<day>(?&day_30))
| (?<month> (?&mon_31)) [\/] (?<day>(?&day_31))
)
[\/]
(?<year> [0-9]{4})
(?(DEFINE)
(?<mon_29> 0?2 )
(?<mon_30> 0?[469] | (11) )
(?<mon_31> 0?[13578] | 1[02] )
(?<day_29> 0?[1-9] | [1-2]?[0-9] )
(?<day_30> 0?[1-9] | [1-2]?[0-9] | 30 )
(?<day_31> 0?[1-9] | [1-2]?[0-9] | 3[01] )
)
/x
You can retrieve the elements by name in this version.
say "Month=$+{month} Day=$+{day} Year=$+{year}";
( No attempt has been made to restrict the values for the year. )
To control a date validity under the following format :
YYYY/MM/DD or YYYY-MM-DD
I would recommand you tu use the following regular expression :
(((19|20)([2468][048]|[13579][26]|0[48])|2000)[/-]02[/-]29|((19|20)[0-9]{2}[/-](0[4678]|1[02])[/-](0[1-9]|[12][0-9]|30)|(19|20)[0-9]{2}[/-](0[1359]|11)[/-](0[1-9]|[12][0-9]|3[01])|(19|20)[0-9]{2}[/-]02[/-](0[1-9]|1[0-9]|2[0-8])))
Matches
2016-02-29 | 2012-04-30 | 2019/09/31
Non-Matches
2016-02-30 | 2012-04-31 | 2019/09/35
You can customise it if you wants to allow only '/' or '-' separators.
This RegEx strictly controls the validity of the date and verify 28,30 and 31 days months, even leap years with 29/02 month.
Try it, it works very well and prevent your code from lot of bugs !
FYI : I made a variant for the SQL datetime. You'll find it there (look for my name) : Regular Expression to validate a timestamp
Feedback are welcomed :)
Sounds like you're overextending regex for this purpose. What I would do is use a regex to match a few date formats and then use a separate function to validate the values of the date fields so extracted.
Perl expanded version
Note use of /x modifier.
/^(
(
( # 31 day months
(0[13578])
| ([13578])
| (1[02])
)
[\/]
(
([1-9])
| ([0-2][0-9])
| (3[01])
)
)
| (
( # 30 day months
(0[469])
| ([469])
| (11)
)
[\/]
(
([1-9])
| ([0-2][0-9])
| (30)
)
)
| ( # 29 day month (Feb)
(2|02)
[\/]
(
([1-9])
| ([0-2][0-9])
)
)
)
[\/]
# year
\d{4}$
| ^\d{4}$ # year only
/x
Original
^((((0[13578])|([13578])|(1[02]))[\/](([1-9])|([0-2][0-9])|(3[01])))|(((0[469])|([469])|(11))[\/](([1-9])|([0-2][0-9])|(30)))|((2|02)[\/](([1-9])|([0-2][0-9]))))[\/]\d{4}$|^\d{4}$
if you didn't get those above suggestions working, I use this, as it gets any date I ran this expression through 50 links, and it got all the dates on each page.
^20\d\d-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)-(0[1-9]|[1-2][0-9]|3[01])$
This regex validates dates between 01-01-2000 and 12-31-2099 with matching separators.
^(0[1-9]|1[012])([- /.])(0[1-9]|[12][0-9]|3[01])\2(19|20)\d\d$
var dtRegex = new RegExp(/[1-9\-]{4}[0-9\-]{2}[0-9\-]{2}/);
if(dtRegex.test(date) == true){
var evalDate = date.split('-');
if(evalDate[0] != '0000' && evalDate[1] != '00' && evalDate[2] != '00'){
return true;
}
}
Regex was not meant to validate number ranges(this number must be from 1 to 5 when the number preceding it happens to be a 2 and the number preceding that happens to be below 6).
Just look for the pattern of placement of numbers in regex. If you need to validate is qualities of a date, put it in a date object js/c#/vb, and interogate the numbers there.
I know this does not answer your question, but why don't you use a date handling routine to check if it's a valid date? Even if you modify the regexp with a negative lookahead assertion like (?!31/0?2) (ie, do not match 31/2 or 31/02) you'll still have the problem of accepting 29 02 on non leap years and about a single separator date format.
The problem is not easy if you want to really validate a date, check this forum thread.
For an example or a better way, in C#, check this link
If you are using another platform/language, let us know
Perl 6 version
rx{
^
$<month> = (\d ** 1..2)
{ $<month> <= 12 or fail }
'/'
$<day> = (\d ** 1..2)
{
given( +$<month> ){
when 1|3|5|7|8|10|12 {
$<day> <= 31 or fail
}
when 4|6|9|11 {
$<day> <= 30 or fail
}
when 2 {
$<day> <= 29 or fail
}
default { fail }
}
}
'/'
$<year> = (\d ** 4)
$
}
After you use this to check the input the values are available in $/ or individually as $<month>, $<day>, $<year>. ( those are just syntax for accessing values in $/ )
No attempt has been made to check the year, or that it doesn't match the 29th of Feburary on non leap years.
If you're going to insist on doing this with a regular expression, I'd recommend something like:
( (0?1|0?3| <...> |10|11|12) / (0?1| <...> |30|31) |
0?2 / (0?1| <...> |28|29) )
/ (19|20)[0-9]{2}
This might make it possible to read and understand.
/(([1-9]{1}|0[1-9]|1[0-2])\/(0[1-9]|[1-9]{1}|[12]\d|3[01])\/[12]\d{3})/
This would validate for following -
Single and 2 digit day with range from 1 to 31. Eg, 1, 01, 11, 31.
Single and 2 digit month with range from 1 to 12. Eg. 1, 01, 12.
4 digit year. Eg. 2021, 1980.
A slightly different approach that may or may not be useful for you.
I'm in php.
The project this relates to will never have a date prior to the 1st of January 2008. So, I take the 'date' inputed and use strtotime(). If the answer is >= 1199167200 then I have a date that is useful to me. If something that doesn't look like a date is entered -1 is returned. If null is entered it does return today's date number so you do need a check for a non-null entry first.
Works for my situation, perhaps yours too?