Regex to match an ISO 8601 datetime string - regex

Does anyone have a good regex pattern for matching ISO datetimes?
For example: 2010-06-15T00:00:00

For the strict, full datetime, including milliseconds, per the W3C's take on the spec:
//-- Complete precision:
/\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d\.\d+([+-][0-2]\d:[0-5]\d|Z)/
//-- No milliseconds:
/\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z)/
//-- No Seconds:
/\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z)/
//-- Putting it all together:
/(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d\.\d+([+-][0-2]\d:[0-5]\d|Z))|(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z))|(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z))/
Additional variations allowed by the actual ISO 8601:2004(E) doc:
/********************************************
** No time-zone variants:
*/
//-- Complete precision:
/\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d\.\d+/
//-- No milliseconds:
/\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d/
//-- No Seconds:
/\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d/
//-- Putting it all together:
/(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d\.\d+)|(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d)|(\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d)/
WARNING: This all gets messy fast, and it still allows certain nonsense such as a 14th month.
Additionally, ISO 8601:2004(E) allows several other variants.
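To make the warning concrete, here is a small sketch that contrasts the "no milliseconds" pattern above with a tightened variant. The tightened ranges and the names LOOSE and TIGHT are my own illustration, not part of the answer, and the tightened pattern still ignores days per month and leap years.

```javascript
// The "no milliseconds" pattern from the answer, anchored to the whole string.
const LOOSE = /^\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z)$/;

// Hypothetical tightened variant: months 01-12, days 01-31, hours 00-23.
const TIGHT = /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])T([01]\d|2[0-3]):[0-5]\d:[0-5]\d([+-]([01]\d|2[0-3]):[0-5]\d|Z)$/;

const samples = [
  "2010-06-15T00:00:00Z", // sensible
  "2010-14-15T00:00:00Z", // 14th month
  "2010-06-15T29:00:00Z", // 29th hour
];

for (const s of samples) {
  console.log(s, "loose:", LOOSE.test(s), "tight:", TIGHT.test(s));
}
```

Only the first sample passes both patterns; the other two pass LOOSE and are rejected by TIGHT.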
"2010-06-15T00:00:00" isn't legal, because it doesn't have the time-zone designation.

For matching just ISO date, like 2017-09-22, you can use this regexp:
^\d{4}-([0]\d|1[0-2])-([0-2]\d|3[01])$
It will match any numeric year, any month specified by two digits in the range 00-12, and any day specified by two digits in the range 00-31.

I reworked the top answer into something a bit more concise. Instead of writing out each of the three optional patterns, the elements are nested as optional statements.
/[+-]?\d{4}(-[01]\d(-[0-3]\d(T[0-2]\d:[0-5]\d:?([0-5]\d(\.\d+)?)?[+-][0-2]\d:[0-5]\dZ?)?)?)?/
I'm curious if there are downsides to this approach?
You can find tests for my suggested answer here: http://regexr.com/3e0lh
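One downside that is easy to check: the pattern has no ^ or $ anchors, and once a time is present it requires a numeric offset, so a string ending in a bare Z (or with no offset at all) still "matches", but only up to the date part. A small sketch, where the helper name matchedPart is mine:

```javascript
// The nested-optional pattern from the answer, copied unchanged (no ^ or $ anchors).
const NESTED = /[+-]?\d{4}(-[01]\d(-[0-3]\d(T[0-2]\d:[0-5]\d:?([0-5]\d(\.\d+)?)?[+-][0-2]\d:[0-5]\dZ?)?)?)?/;

// Returns the substring the pattern actually matched, or null if nothing matched.
function matchedPart(input) {
  const m = input.match(NESTED);
  return m === null ? null : m[0];
}

console.log(matchedPart("2010-06-15T00:00:00+02:00")); // 2010-06-15T00:00:00+02:00
console.log(matchedPart("2010-06-15T00:00:00Z"));      // 2010-06-15
console.log(matchedPart("2010-06-15T00:00:00"));       // 2010-06-15
```

If the goal is validation rather than searching, comparing the matched part with the whole input (or adding anchors) exposes the partial matches.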

I made this regex; it validates dates as they come out of JavaScript's .toISOString() method.
^[0-9]{4}-((0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|(0[469]|11)-(0[1-9]|[12][0-9]|30)|(02)-(0[1-9]|[12][0-9]))T(0[0-9]|1[0-9]|2[0-3]):(0[0-9]|[1-5][0-9]):(0[0-9]|[1-5][0-9])\.[0-9]{3}Z$
Contemplated:
Proper symbols ('-', 'T', ':', '.', 'Z') in proper places.
Consistency with months of 29, 30 or 31 days.
Hours from 00 to 23.
Minutes and seconds from 00 to 59.
Milliseconds from 000 to 999.
Not contemplated:
Leap years.
Example date: 2019-11-15T13:34:22.178Z
Example to run directly in Chrome console: /^[0-9]{4}-((0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|(0[469]|11)-(0[1-9]|[12][0-9]|30)|(02)-(0[1-9]|[12][0-9]))T(0[0-9]|1[0-9]|2[0-3]):(0[0-9]|[1-5][0-9]):(0[0-9]|[1-5][0-9])\.[0-9]{3}Z$/.test("2019-11-15T13:34:22.178Z");
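Since leap years are not covered, 2019-02-29 passes the regex. One way to close that gap, sketched here under the assumption that the input is meant to be exactly what .toISOString() would produce, is to parse the string and compare it with the round-tripped output. The function name isRealIsoInstant is hypothetical.

```javascript
// The answer's pattern, copied unchanged.
const ISO_UTC_MS = /^[0-9]{4}-((0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|(0[469]|11)-(0[1-9]|[12][0-9]|30)|(02)-(0[1-9]|[12][0-9]))T(0[0-9]|1[0-9]|2[0-3]):(0[0-9]|[1-5][0-9]):(0[0-9]|[1-5][0-9])\.[0-9]{3}Z$/;

function isRealIsoInstant(text) {
  // Step 1: shape check with the regex.
  if (!ISO_UTC_MS.test(text)) return false;
  // Step 2: some engines reject impossible dates, others roll them over
  // (29 Feb 2019 becomes 1 Mar 2019). Both cases fail one of the checks below.
  const parsed = new Date(text);
  if (Number.isNaN(parsed.getTime())) return false; // toISOString() would throw on an invalid date
  return parsed.toISOString() === text;
}

console.log(isRealIsoInstant("2019-11-15T13:34:22.178Z")); // true
console.log(isRealIsoInstant("2020-02-29T13:34:22.178Z")); // true  (leap year)
console.log(isRealIsoInstant("2019-02-29T13:34:22.178Z")); // false (not a leap year)
```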

Here is a regular expression to check ISO 8601 date format including leap years and short-long months. To run this, you'll need to "ignore white-space". A compacted version without white-space is on regexlib: http://regexlib.com/REDetails.aspx?regexp_id=3344
There's more to ISO 8601 - this regex only cares for dates, but you can easily extend it to support time validation which is not that tricky.
Update: This works now with javascript (without lookbehinds)
^(?:
(?=
[02468][048]00
|[13579][26]00
|[0-9][0-9]0[48]
|[0-9][0-9][2468][048]
|[0-9][0-9][13579][26]
)
\d{4}
(?:
(-|)
(?:
(?:
00[1-9]
|0[1-9][0-9]
|[1-2][0-9][0-9]
|3[0-5][0-9]
|36[0-6]
)
|
(?:01|03|05|07|08|10|12)
(?:
\1
(?:0[1-9]|[12][0-9]|3[01])
)?
|
(?:04|06|09|11)
(?:
\1
(?:0[1-9]|[12][0-9]|30)
)?
|
02
(?:
\1
(?:0[1-9]|[12][0-9])
)?
|
W(?:0[1-9]|[1-4][0-9]|5[0-3])
(?:
\1
[1-7]
)?
)
)?
)$
|
^(?:
(?!
[02468][048]00
|[13579][26]00
|[0-9][0-9]0[48]
|[0-9][0-9][2468][048]
|[0-9][0-9][13579][26]
)
\d{4}
(?:
(-|)
(?:
(?:
00[1-9]
|0[1-9][0-9]
|[1-2][0-9][0-9]
|3[0-5][0-9]
|36[0-5]
)
|
(?:01|03|05|07|08|10|12)
(?:
\2
(?:0[1-9]|[12][0-9]|3[01])
)?
|
(?:04|06|09|11)
(?:
\2
(?:0[1-9]|[12][0-9]|30)
)?
|
(?:02)
(?:
\2
(?:0[1-9]|1[0-9]|2[0-8])
)?
|
W(?:0[1-9]|[1-4][0-9]|5[0-3])
(?:
\2
[1-7]
)?
)
)?
)$
To cater for time, add something like this to the mixture (from: http://underground.infovark.com/2008/07/22/iso-date-validation-regex/ ):
([T\s](([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)?(\15([0-5]\d))?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?
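As far as I know, JavaScript's RegExp has no built-in "ignore white-space" flag, so a whitespace-formatted pattern like the one above has to be compacted before compiling. A minimal sketch, using a hypothetical helper compileVerbose and only the leap-year alternatives from the lookahead as a short stand-in for the full pattern:

```javascript
// Strips ALL whitespace from a verbose pattern and compiles it.
// Caveat: a pattern that needs a literal space must write it as \x20 or [ ],
// because plain spaces are removed here.
function compileVerbose(source, flags) {
  return new RegExp(source.replace(/\s+/g, ""), flags);
}

// The leap-year alternatives from the answer's lookahead, anchored on their own.
const LEAP_YEAR = compileVerbose(String.raw`
  ^(?:
      [02468][048]00
    | [13579][26]00
    | [0-9][0-9]0[48]
    | [0-9][0-9][2468][048]
    | [0-9][0-9][13579][26]
  )$
`);

for (const year of ["2000", "2020", "2019", "1900"]) {
  console.log(year, LEAP_YEAR.test(year));
}
```

This prints true for 2000 and 2020 and false for 2019 and 1900. The full date pattern can be fed through the same helper.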

The ISO 8601 specification allows a wide variety of date formats. There's a mediocre explanation as to how to do it here. There is a fairly minor discrepancy between JavaScript's date input formatting and the ISO formatting for simple dates that do not specify timezones, and it can be easily mitigated using a string substitution. Fully supporting the ISO 8601 specification is non-trivial.
Here is a reference example which I do not guarantee to be complete, although it parses the non-duration dates from the aforementioned Wikipedia page.
Below is an example, and you can also see its output on ideone. Unfortunately, it does not work to specification as it does not properly implement weeks. The definition of week number 01 in ISO 8601 is non-trivial and requires some browsing of the calendar to determine where week one begins, and what exactly it means in terms of the number of days in the specified year. This can probably be fairly easily corrected (I'm just tired of playing with it).
function parseISODate (input) {
var iso = /^(\d{4})(?:-?W(\d+)(?:-?(\d+)D?)?|(?:-(\d+))?-(\d+))(?:[T ](\d+):(\d+)(?::(\d+)(?:\.(\d+))?)?)?(?:Z(-?\d*))?$/;
var parts = input.match(iso);
if (parts == null) {
throw new Error("Invalid Date");
}
var year = Number(parts[1]);
if (typeof parts[2] != "undefined") {
/* Convert weeks to days, months 0 */
var weeks = Number(parts[2]) - 1;
/* The weekday is optional; Number(undefined) is NaN, so test the captured group itself */
var days = (typeof parts[3] != "undefined") ? Number(parts[3]) : 0;
days += weeks * 7;
var months = 0;
}
else {
if (typeof parts[4] != "undefined") {
var months = Number(parts[4]) - 1;
}
else {
/* it's an ordinal date... */
var months = 0;
}
var days = Number(parts[5]);
}
if (typeof parts[6] != "undefined" &&
typeof parts[7] != "undefined")
{
var hours = Number(parts[6]);
var minutes = Number(parts[7]);
if (typeof parts[8] != "undefined") {
var seconds = Number(parts[8]);
if (typeof parts[9] != "undefined") {
/* Interpret the digits after the decimal point as a fraction of a second */
var fractional = Number("0." + parts[9]);
var milliseconds = Math.round(fractional * 1000);
}
else {
var milliseconds = 0;
}
}
else {
var seconds = 0;
var milliseconds = 0;
}
}
else {
var hours = 0;
var minutes = 0;
var seconds = 0;
var fractional = 0;
var milliseconds = 0;
}
if (typeof parts[10] != "undefined") {
/* Timezone adjustment, offset the minutes appropriately */
var localzone = -(new Date().getTimezoneOffset());
var timezone = parts[10] * 60;
minutes = Number(minutes) + (timezone - localzone);
}
return new Date(year, months, days, hours, minutes, seconds, milliseconds);
}
print(parseISODate("2010-06-29T15:33:00Z-7"))
print(parseISODate("2010-06-29 06:14Z"))
print(parseISODate("2010-06-29T06:14Z"))
print(parseISODate("2010-06-29T06:14:30.2034Z"))
print(parseISODate("2010-W26-2"))
print(parseISODate("2010-180"))
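For the week handling that the parser above leaves unfinished, here is a sketch based on the ISO 8601 rule that week 01 is the week containing 4 January, with weeks starting on Monday. The function name isoWeekToDate is my own, it works in UTC, it does no range checking, and it assumes a four-digit year (Date.UTC treats years 0-99 specially).

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns a Date at 00:00 UTC for an ISO week date.
// week: 1-53, weekday: 1 (Monday) to 7 (Sunday).
function isoWeekToDate(year, week, weekday) {
  // Week 01 is the week that contains 4 January.
  const jan4 = new Date(Date.UTC(year, 0, 4));
  // getUTCDay() gives 0 = Sunday ... 6 = Saturday; shift so that Monday = 0.
  const daysSinceMonday = (jan4.getUTCDay() + 6) % 7;
  const week1Monday = jan4.getTime() - daysSinceMonday * MS_PER_DAY;
  // UTC has no daylight saving, so plain millisecond arithmetic is safe here.
  return new Date(week1Monday + ((week - 1) * 7 + (weekday - 1)) * MS_PER_DAY);
}

console.log(isoWeekToDate(2010, 26, 2).toISOString().slice(0, 10)); // 2010-06-29
console.log(isoWeekToDate(2009, 53, 7).toISOString().slice(0, 10)); // 2010-01-03
```

The first result agrees with the ordinal date 2010-180 used in the examples above.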

yyyy-MM-dd
Most of the answers here have too much explanation. Here's a short variation of @Sergey's answer that addresses some weird scenarios (like 2020-00-00); this RegExp only cares about the yyyy-MM-dd date:
// yyyy-MM-dd
^\d{4}-([0][1-9]|1[0-2])-([0-2][1-9]|[1-3]0|3[01])$
This one also doesn't check the number of days per month, so it accepts 2020-11-31 even though November has only 30 days.
My use case was to convert a String into a Date (from an API param), and I only needed to know that the input string didn't contain strange stuff; I do the next validation against an actual Date object.
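A rough sketch of that two-step approach in JavaScript: the regex checks the shape, then a Date object checks that the day really exists. The name isRealDate is hypothetical, and the answer's own second step may have looked different.

```javascript
// Shape check: the yyyy-MM-dd pattern from the answer, unchanged.
const DATE_SHAPE = /^\d{4}-([0][1-9]|1[0-2])-([0-2][1-9]|[1-3]0|3[01])$/;

function isRealDate(text) {
  if (!DATE_SHAPE.test(text)) return false;
  const [year, month, day] = text.split("-").map(Number);
  // setUTCFullYear avoids the 0-99 year remapping done by Date.UTC.
  const d = new Date(0);
  d.setUTCFullYear(year, month - 1, day);
  // An impossible day (for example 31 November) rolls over into the next month,
  // so the components no longer match the input.
  return d.getUTCFullYear() === year &&
         d.getUTCMonth() === month - 1 &&
         d.getUTCDate() === day;
}

console.log(isRealDate("2020-02-29")); // true
console.log(isRealDate("2020-11-31")); // false
console.log(isRealDate("2020-00-00")); // false
```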

Here is my take on this:
^\d{4}-(?:0[1-9]|1[0-2])-(?:[0-2][1-9]|[1-3]0|3[01])T(?:[0-1][0-9]|2[0-3])(?::[0-6]\d)(?::[0-6]\d)?(?:\.\d{3})?(?:[+-][0-2]\d:[0-5]\d|Z)?$
Examples for a match:
2016-12-31T23:59:60+12:30
2021-05-10T09:05:12.000Z
3015-01-01T23:00+02:00
1001-01-31T23:59:59Z
2023-12-20T20:20
The minutes and seconds part could be refined more, but this is good enough for me.

Not sure if it's relevant to the underlying problem you are trying to solve, but you can pass an ISO date string as a constructor arg to Date() and get an object out of it. The constructor is actually very flexible in terms of coercing a string into a Date.
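A minimal sketch of that approach, with a hypothetical wrapper parseIso that turns an unparseable string into null. Keep in mind that this leniency cuts both ways: what counts as parseable beyond the ISO format is up to the engine, and a date-time string without an offset is interpreted as local time.

```javascript
// Returns a Date for strings the engine can parse, or null otherwise.
function parseIso(text) {
  const d = new Date(text);
  // An invalid Date reports NaN from getTime().
  return Number.isNaN(d.getTime()) ? null : d;
}

console.log(parseIso("2010-06-15T00:00:00Z")); // a Date for midnight UTC
console.log(parseIso("2010-06-15T00:00:00"));  // no offset: interpreted as local time
console.log(parseIso("not a date"));           // null
```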

With 02/29 validation, for the years 1900 to 2999:
(((2000|2400|2800|((19|2[0-9])(0[48]|[2468][048]|[13579][26])))-02-29)|(((19|2[0-9])[0-9]{2})-02-(0[1-9]|1[0-9]|2[0-8]))|(((19|2[0-9])[0-9]{2})-(0[13578]|10|12)-(0[1-9]|[12][0-9]|3[01]))|(((19|2[0-9])[0-9]{2})-(0[469]|11)-(0[1-9]|[12][0-9]|30)))T([01][0-9]|[2][0-3]):[0-5][0-9]:[0-5][0-9]\.[0-9]{3}Z

Brock's answers are good, but the patterns should start with ^ and end with $ so as not to allow prefix/suffix characters if all you are trying to match is the date string alone.
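A quick illustration of the difference, using the "no milliseconds" pattern from that answer. Wrapping the source in (?:...) before adding the anchors is my own precaution, so that the anchors apply to the whole pattern even when it contains a top-level alternation.

```javascript
// The "no milliseconds" pattern, as posted (unanchored).
const UNANCHORED = /\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d([+-][0-2]\d:[0-5]\d|Z)/;

// Same pattern, required to span the entire string.
const ANCHORED = new RegExp("^(?:" + UNANCHORED.source + ")$");

const noisy = "xx2010-06-15T00:00:00Zyy";
console.log(UNANCHORED.test(noisy)); // true  (finds the date inside the junk)
console.log(ANCHORED.test(noisy));   // false
console.log(ANCHORED.test("2010-06-15T00:00:00Z")); // true
```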

While using QRegExp with IsoDateWithMs, the millisecond patterns here did not work. Instead, the following saved the day:
\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d\.\d{1,3}
(I know this is a JS entry, but it pops up first and would be helpful for C++ devs.)

Related

How to match sequences of consecutive Date like characters string in Dart?

I have strings of consecutive digits that represent dates, like 20210215 and 14032020.
I am trying to convert them to date strings like 2021.02.15 and 14.03.2020.
My first problem is that the digits come in two formats:
1) 20210215
2) 14032020
And my second problem to convert them to date string without changing the format. Like:
1) 2021.02.15
2) 14.03.2020
When I searched for a regex, I couldn't find any pattern that converts consecutive digits such as 20210215 into a date string such as 2021.02.15.
What is the correct regex pattern to convert both formats described above in Dart?
UPDATE-1:
I need to turn the string "20210215" into "2021.02.15" as a string, not a DateTime. I also need to turn the string "14032020" into "14.03.2020", again without converting it to a DateTime.
First I need to detect whether the year is at the beginning or the end of the string, then put a dot (.) between the day, month and year.
UPDATE-2:
This is the best I could find, but it turns a day or month of 02 into 2. I need it kept as it is.
var timestampString = '13022020';//'20200213';
var re1 = RegExp(
r'^'
r'(?<year>\d{4})'
r'(?<month>\d{2})'
r'(?<day>\d{2})'
r'$',
);
var re2 = RegExp(
r'^'
r'(?<day>\d{2})'
r'(?<month>\d{2})'
r'(?<year>\d{4})'
r'$',
);
var dateTime;
var match1 = re1.firstMatch(timestampString);
if (match1 == null) {
var match2 = re2.firstMatch(timestampString);
if (match2 == null) {
//throw FormatException('Unrecognized timestamp format');
dateTime = '00.00.0000';
print('DATE_TIME: $dateTime');
} else {
var _day = int.parse(match2.namedGroup('day'));
var _month = int.parse(match2.namedGroup('month'));
var _year = int.parse(match2.namedGroup('year'));
dateTime = '$_day.$_month.$_year';
print('DATE_TIME(match2): $dateTime');
}
} else {
var _year = int.parse(match1.namedGroup('year'));
var _month = int.parse(match1.namedGroup('month'));
var _day = int.parse(match1.namedGroup('day'));
dateTime = '$_year.$_month.$_day';
print('DATE_TIME(match1): $dateTime');
}
Output:
DATE_TIME: 2020.2.13
But I need to get output as 2020.02.13.
Second, match1 also matches the input 13022020 and prints 1302.20.20. If I remove the match2 section and the format is like 20200213, it works, but it doesn't print the leading 0, as shown above.
You can use
text.replaceAllMapped(RegExp(r'\b(?:((?:19|20)\d{2})(0?[1-9]|1[0-2])(0?[1-9]|[12][0-9]|3[01])|(0?[1-9]|[12][0-9]|3[01])(0?[1-9]|1[0-2])((?:19|20)\d{2}))\b'), (Match m) => m[4] == null ? "${m[1]}.${m[2]}.${m[3]}" : "${m[4]}.${m[5]}.${m[6]}")
The \b(?:((?:19|20)\d{2})(0?[1-9]|1[0-2])(0?[1-9]|[12][0-9]|3[01])|(0?[1-9]|[12][0-9]|3[01])(0?[1-9]|1[0-2])((?:19|20)\d{2}))\b regex matches
\b - a word boundary
(?: - start of a non-capturing group:
((?:19|20)\d{2}) - year from 20th and 21st centuries
(0?[1-9]|1[0-2]) - month
(0?[1-9]|[12][0-9]|3[01]) - day
| - or
(0?[1-9]|[12][0-9]|3[01]) - day
(0?[1-9]|1[0-2]) - month
((?:19|20)\d{2}) - year
) - end of the group
\b - word boundary.
See the regex demo.
See a Dart demo:
void main() {
final text = '13022020 and 20200213 20111919';
print(text.replaceAllMapped(RegExp(r'\b(?:((?:19|20)\d{2})(0?[1-9]|1[0-2])(0?[1-9]|[12][0-9]|3[01])|(0?[1-9]|[12][0-9]|3[01])(0?[1-9]|1[0-2])((?:19|20)\d{2}))\b'), (Match m) =>
m[4] == null ? "${m[1]}.${m[2]}.${m[3]}" : "${m[4]}.${m[5]}.${m[6]}"));
}
Returning 13.02.2020 and 2020.02.13 20.11.1919.
If Group 4 is null, the first alternative matched, so we need to use Group 1, 2 and 3. Else, we join Group 4, 5 and 6 with a dot.
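For readers who don't use Dart, the same replacement can be sketched in JavaScript with String.prototype.replace and a callback. The regex is the one from this answer with a g flag added; the function name dotDates is mine. In JavaScript an alternative that did not take part in the match leaves its groups undefined rather than null.

```javascript
const DATE_DIGITS = /\b(?:((?:19|20)\d{2})(0?[1-9]|1[0-2])(0?[1-9]|[12][0-9]|3[01])|(0?[1-9]|[12][0-9]|3[01])(0?[1-9]|1[0-2])((?:19|20)\d{2}))\b/g;

function dotDates(text) {
  // Groups 1-3 are year, month, day (year first); groups 4-6 are day, month, year.
  return text.replace(DATE_DIGITS, (whole, y1, m1, d1, d2, m2, y2) =>
    d2 === undefined ? `${y1}.${m1}.${d1}` : `${d2}.${m2}.${y2}`);
}

console.log(dotDates("13022020 and 20200213 20111919"));
// 13.02.2020 and 2020.02.13 20.11.1919
```

Because the captured text is reused as-is and never parsed as a number, leading zeros such as 02 are preserved.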

Regex date formats

I need help with with three regular expressions for date validation. The date formats to validate against should be:
- MMyy
- ddMMyy
- ddMMyyyy
Further:
I want the regular expressions to match the exact number of digits in the formats above. For instance, January should be 01, NOT 1:
060117 // ddMMyy format: Ok
06117 // ddMMyy format: NOT Ok
Hyphens and slashes are NOT allowed, like: 06-01-17, or 06/01/17.
Below are the regexes that I use. I cannot get them quite right though.
string regex_MMyy = @"^(1[0-2]|0[1-9]|\d)(\d{2})$";
string regex_ddMMyy = @"^(0[1-9]|[12]\d|3[01])(1[0-2]|0[1-9]|\d)(\d{2})$";
string regex_ddMMyyyy = @"^(0[1-9]|[12]\d|3[01])(1[0-2]|0[1-9]|\d)(\d{4})$";
var test_MMyy_1 = Regex.IsMatch("0617", regex_MMyy); // Pass
var test_MMyy_2 = Regex.IsMatch("617", regex_MMyy); // Pass, do NOT want this to pass.
var test_ddMMyy_1 = Regex.IsMatch("060117", regex_ddMMyy); // Pass
var test_ddMMyy_2 = Regex.IsMatch("06117", regex_ddMMyy); // Pass, do NOT want this to pass.
var test_ddMMyyyy_1 = Regex.IsMatch("06012017", regex_ddMMyyyy); // Pass
var test_ddMMyyyy_2 = Regex.IsMatch("0612017", regex_ddMMyyyy); // Pass, do NOT want this to pass.
(If anyone could take allowed days for each month, and leap years into account, that would be a huge bonus :)).
Thanks,
Best Regards
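The unwanted passes appear to come from the trailing |\d alternative in the month group, which lets a single digit stand in for the month. Here is a sketch with that alternative removed, written as JavaScript regex literals for a quick check (the patterns themselves should carry over to .NET unchanged). It does not address days per month or leap years.

```javascript
// Month group is now strictly two digits: 01-12.
const MMyy     = /^(0[1-9]|1[0-2])(\d{2})$/;
const ddMMyy   = /^(0[1-9]|[12]\d|3[01])(0[1-9]|1[0-2])(\d{2})$/;
const ddMMyyyy = /^(0[1-9]|[12]\d|3[01])(0[1-9]|1[0-2])(\d{4})$/;

console.log(MMyy.test("0617"), MMyy.test("617"));                   // true false
console.log(ddMMyy.test("060117"), ddMMyy.test("06117"));           // true false
console.log(ddMMyyyy.test("06012017"), ddMMyyyy.test("0612017"));   // true false
```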

Parse time string using regex

My time string may be in one of the following formats (x and y are integer numbers, h and m are literal symbols):
xh ym
xh
ym
y
Examples:
1h 20m
45m
2h
120
What regular expression should I write to get x and y numbers from such string?
(\d+)([mh]?)(?:\s+(\d+)m)?
You can then inspect groups 1-3. For your examples those would be
('1', 'h', '20')
('45', 'm', '')
('2', 'h', '')
('120', '', '')
As always, you might want to use some anchors ^, $, \b...
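A sketch of how the three groups could be turned into a number of minutes, here in JavaScript with the anchors added. The function name toMinutes and the decision to treat a bare number as minutes are assumptions of mine.

```javascript
// The answer's pattern with ^ and $ added.
const DURATION = /^(\d+)([mh]?)(?:\s+(\d+)m)?$/;

// Returns the total number of minutes, or null if the text is not recognised.
function toMinutes(text) {
  const m = DURATION.exec(text.trim());
  if (m === null) return null;
  const [, first, unit, trailingMinutes] = m;
  if (unit === "h") {
    // "xh" or "xh ym": group 3 is undefined when the minutes part is absent.
    return Number(first) * 60 + Number(trailingMinutes || 0);
  }
  // "ym" or bare "y": a second "ym" part after these makes no sense.
  return trailingMinutes === undefined ? Number(first) : null;
}

console.log(toMinutes("1h 20m")); // 80
console.log(toMinutes("45m"));    // 45
console.log(toMinutes("2h"));     // 120
console.log(toMinutes("120"));    // 120
```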
I'm going to assume you're using .NET due to your username. :)
I think in this case, it's easier to use TimeSpan.ParseExact for this task.
You can specify a list of permitted formats (see here for the format for these) and ParseExact will read in the TimeSpan according to them.
Here is an example:
var formats = new[]{"h'h'", "h'h 'm'm'", "m'm'", "%m"};
// I have assumed that a single number means minutes
foreach (var item in new[]{"23","1h 45m","1h","45m"})
{
TimeSpan timespan;
if (TimeSpan.TryParseExact(item, formats, CultureInfo.InvariantCulture, out timespan))
{
// valid
Console.WriteLine(timespan);
}
}
Output:
00:23:00
01:45:00
01:00:00
00:45:00
The only problem with this is that it is rather inflexible. Additional whitespace in the middle will fail to validate. A more robust solution using Regex is:
var items = new[]{"23","1h 45m", "45m", "1h", "1h 45", "1h 45", "1h45m"};
foreach (var item in items)
{
var match = Regex.Match(item, @"^(?=\d)((?<hours>\d+)h)?\s*((?<minutes>\d+)m?)?$", RegexOptions.ExplicitCapture);
if (match.Success)
{
int hours;
int.TryParse(match.Groups["hours"].Value, out hours); // hours == 0 on failure
int minutes;
int.TryParse(match.Groups["minutes"].Value, out minutes);
Console.WriteLine(new TimeSpan(0, hours, minutes, 0));
}
}
Breakdown of the regex:
^ - start of string
(?=\d) - must start with a digit (do this because both parts are marked optional, but we want to make sure at least one is present)
((?<hours>\d+)h)? - hours (optional, capture into named group)
\s* - whitespace (optional)
((?<minutes>\d+)m?)? - minutes (optional, capture into named group, the 'm' is optional too)
$ - end of string
I would say that mhyfritz's solution is simple, efficient and good if your input is only what you have shown.
If you ever need to handle corner cases, you can use a more discriminative expression:
^(\d+)(?:(h)(?:\s+(\d+)(m))?|(m?))$
But it can be overkill...
(get rid of ^ and $ if you need to detect such pattern in a larger body of text, of course).
Try this one: ^(?:(\d+)h\s*)?(?:(\d+)m?)?$:
var s = new[] { "1h 20m", "45m", "2h", "120", "1m 20m" };
foreach (var ss in s)
{
var m = Regex.Match(ss, @"^(?:(\d+)h\s*)?(?:(\d+)m?)?$");
int hour = m.Groups[1].Value == "" ? 0 : int.Parse(m.Groups[1].Value);
int min = m.Groups[2].Value == "" ? 0 : int.Parse(m.Groups[2].Value);
if (hour != 0 || min != 0)
Console.WriteLine("Hours: " + hour + ", Mins: " + min);
else
Console.WriteLine("No match!");
}
In bash:
echo "$string" | awk '{for(i=1;i<=NF;i++) print $i}' | sed 's/[hm]//g'

UK Date Regular Expression [duplicate]

I'm trying to create a regular expression that validates UK date format. I have the following:
(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)\d\d
This works great for validating: 09/12/2011. But if the date is: 9/12/2011 it will not validate correctly. Is there a regular expression that allows me to use a single number and two numbers for the day section? For example "09" and "9".
Just make the leading 0 optional:
(0?[1-9]|[12][0-9]|3[01])[- /.](0?[1-9]|1[012])[- /.](19|20)\d\d
You will need an additional validation step, though - this regex of course won't check for invalid dates like 31-02-2000 etc. While it's possible to do this in regex, it's not recommended because it's much easier to do this programmatically, and that regex is going to be monstrous. Here is a date validating regex (that uses the mmddyyyy format, though) to show what I mean.
My preference goes to a combination of the simple regex, (\d{1,2})[-/.](\d{1,2})[-/.](\d{4}), with some code that validates that this is indeed a correct date. You will have to have that code anyways, unless you want to make a monstrous regex that rejects "29-02-2011" but not "29-02-2008".
Anyway, here's a breakdown of that regex so you can see what's going on:
\d{1,2}: this part matches one or two ({1,2}) digits (\d), making up the day portion of the date.
[-/.]: this matches one of the characters inside the brackets, i.e, either a ., a /, or a -.
\d{1,2}: again, this matches one or two digits from the month.
[-/.]: another separator...
\d{4}: this matches exactly four ({4}) digits for the year portion.
Note that the day, month, and year portion of the regular expression are inside parentheses. This is to create groups. Each of those three portions will be captured into a group that you can retrieve from the match. Groups are identified with a number, starting with 1, from left to right. This means that the day will be group 1, the month group 2, and the year group 3. There is also a group 0 that always contains the entire text matched.
You can use the groups to perform the second part of the validation and reject invalid dates like "30-02-2011", "31-4-2011", or "32-13-2011".
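A sketch of that second validation step in JavaScript, using the captured groups and a days-per-month table. The function names are my own; the leap-year rule is the Gregorian one (divisible by 4, except century years not divisible by 400).

```javascript
// The simple pattern from above, anchored; groups: 1 = day, 2 = month, 3 = year.
const UK_DATE = /^(\d{1,2})[-\/.](\d{1,2})[-\/.](\d{4})$/;

function isLeapYear(year) {
  return (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
}

function daysInMonth(year, month) {
  const lengths = [31, isLeapYear(year) ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];
  return lengths[month - 1];
}

function isValidUkDate(text) {
  const m = UK_DATE.exec(text);
  if (m === null) return false;
  const day = Number(m[1]);
  const month = Number(m[2]);
  const year = Number(m[3]);
  if (month < 1 || month > 12) return false;
  return day >= 1 && day <= daysInMonth(year, month);
}

for (const s of ["29-02-2008", "29-02-2011", "30-02-2011", "31-4-2011", "32-13-2011"]) {
  console.log(s, isValidUkDate(s));
}
```

Only 29-02-2008 is accepted; the other four are rejected.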
If you want to reject inputs that use two different separators, like "31-12.2011", you can use a slightly more advanced feature called backreferences:
(\d{1,2})([-/.])(\d{1,2})\2(\d{4})
Note that now I placed the first separator inside a group. This changes the month to group 3, and the year to group 4. The separator is matched by group 2. The backreference is that \2 part between the month and the year. It matches whatever text group 2 matched, that is, the first separator. If that group matched a ., the backreference will match only a . as well; if it matched a -, the backreference will match only a -; and so on.
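A short JavaScript check of the backreference version, anchored with ^ and $ for the test. The constant name is mine.

```javascript
// Groups: 1 = day, 2 = separator, 3 = month, 4 = year; \2 repeats the separator.
const SAME_SEPARATOR = /^(\d{1,2})([-\/.])(\d{1,2})\2(\d{4})$/;

const m = SAME_SEPARATOR.exec("31-12-2011");
console.log(m[1], m[2], m[3], m[4]);            // 31 - 12 2011
console.log(SAME_SEPARATOR.test("31/12/2011")); // true
console.log(SAME_SEPARATOR.test("31-12.2011")); // false (mixed separators)
```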
What is "the UK date format" anyway?
Officially, it's 2011-02-21 today, see BS EN 28601 / ISO 8601.
On the web, you should all be using the format defined in RFC 3339.
Correct way to check for the day is to ban the [4-9]. numbers too.
Something like 0[0-9]|[12][0-9]|3[01]|[^0-9][0-9]|^[0-9]
Yes. {n,m} is the quantifier that says "at least n elements, at most m elements". So you can write \d{1,2} (matches 1 or 2 digits). Complete date: \d{1,2}/\d{1,2}/\d{4}
Alternative: Make the leading zero optional:
0?\d/0?\d/\d{4}
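The question mark says that the element before it is optional. Note that 0?\d only matches a single digit with an optional leading zero, so this form will not match two-digit days or months such as 12 or 31.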
Use this code; it validates everything for the date:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class FinalDateValidator {
private Pattern pattern;
private Matcher matcher;
public boolean isValidDate(final String date) {
Pattern pattern;
Matcher matcher;
final String DATE_PATTERN = "([0-9]{4})/(0?[1-9]|1[012])/(0[1-9]|[12][0-9]|3[01]|[1-9])";
pattern = Pattern.compile(DATE_PATTERN);
matcher = pattern.matcher(date);
if (matcher.matches()) {
matcher.reset();
if (matcher.find()) {
int year = Integer.parseInt(matcher.group(1));
String month = matcher.group(2);
String day = matcher.group(3);
System.out.println("__________________________________________________");
System.out.println("year : "+year +" month : "+month +" day : "+day);
if (day.equals("31")
&& (month.equals("4") || month.equals("6")
|| month.equals("9") || month.equals("11")
|| month.equals("04") || month.equals("06") || month
.equals("09"))) {
return false; // only 1,3,5,7,8,10,12 has 31 days
} else if (month.equals("2") || month.equals("02")) {
// leap year: divisible by 4, except century years not divisible by 400
if ((year % 4 == 0 && year % 100 != 0) || year % 400 == 0) {
if (day.equals("30") || day.equals("31")) {
return false;
} else {
return true;
}
} else {
if (day.equals("29") || day.equals("30")
|| day.equals("31")) {
return false;
} else {
return true;
}
}
} else {
return true;
}
} else {
return false;
}
} else {
return false;
}
}
public static void main(String argsp[]){
FinalDateValidator vs = new FinalDateValidator();
System.out.println("1: 1910/12/10---"+vs.isValidDate("1910/12/10"));
System.out.println("2: 2010/2/29---"+vs.isValidDate("2010/02/29"));
System.out.println("3: 2011/2/29---"+vs.isValidDate("2011/02/29"));
System.out.println("3: 2011/2/30---"+vs.isValidDate("2011/02/30"));
System.out.println("3: 2011/2/31---"+vs.isValidDate("2011/02/31"));
System.out.println("4: 2010/08/31---"+vs.isValidDate("2010/08/31"));
System.out.println("5: 2010/3/10---"+vs.isValidDate("2010/03/10"));
System.out.println("6: 2010/03/33---"+vs.isValidDate("2010/03/33"));
System.out.println("7: 2010/03/09---"+vs.isValidDate("2010/03/09"));
System.out.println("8: 2010/03/9---"+vs.isValidDate("2010/03/9"));
System.out.println("9: 1910/12/00---"+vs.isValidDate("1910/12/00"));
System.out.println("10: 2010/02/01---"+vs.isValidDate("2010/02/01"));
System.out.println("11: 2011/2/03---"+vs.isValidDate("2011/02/03"));
System.out.println("12: 2010/08/31---"+vs.isValidDate("2010/08/31"));
System.out.println("13: 2010/03/39---"+vs.isValidDate("2010/03/39"));
System.out.println("14: 201011/03/31---"+vs.isValidDate("201011/03/31"));
System.out.println("15: 2010/032/09---"+vs.isValidDate("2010/032/09"));
System.out.println("16: 2010/03/922---"+vs.isValidDate("2010/03/922"));
}
}
Enjoy...
I ran into similar requirements.
Here is the complete regular expression, including leap year validation.
Format: dd/MM/yyyy
(3[01]|[12]\d|0[1-9])/(0[13578]|10|12)/((?!0000)\d{4})|(30|[12]\d|0[1-9])/(0[469]|11)/((?!0000)\d{4})|(2[0-8]|[01]\d|0[1-9])/(02)/((?!0000)\d{4})|
29/(02)/(1600|2000|2400|2800|00)|29/(02)/(\d\d)(0[48]|[2468][048]|[13579][26])
It can be easily modified to US format or other EU formats.

Does anyone know of a reg expression for uk date format

Hi, does anyone know a regex for a UK date format, e.g. dd/mm/yyyy?
The dd or mm can be 1 character, e.g. 1/1/2010, but the year must always be 4 characters.
Thanks in advance
^\d{1,2}/\d{1,2}/\d{4}$
will match 1/1/2000, 07/05/1999, but also 99/77/8765.
So if you want to do some rudimentary plausibility checking, you need
^(0?[1-9]|[12][0-9]|3[01])/(0?[1-9]|1[012])/\d{4}$
This will still match 31/02/9999, so if you want to catch those, it's getting hairier:
^(?:(?:[12][0-9]|0?[1-9])/0?2|(?:30|[12][0-9]|0?[1-9])/(?:0?[469]|11)|(?:3[01]|[12][0-9]|0?[1-9])/(?:0?[13578]|1[02]))/\d{4}$
But this still won't catch leap years. So, modifying a beast of a regex from regexlib.com:
^(?:(?:(?:(?:31\/(?:0?[13578]|1[02]))|(?:(?:29|30)\/(?:0?[13-9]|1[0-2])))\/(?:1[6-9]|[2-9]\d)\d{2})|(?:29\/0?2\/(?:(?:(1[6-9]|[2-9]\d)(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))|(?:0?[1-9]|1\d|2[0-8])\/(?:(?:0?[1-9])|(?:1[0-2]))\/(?:(?:1[6-9]|[2-9]\d)\d{2}))$
will match
1/1/2001
31/5/2010
29/02/2000
29/2/2400
23/5/1671
01/1/9000
and fail
31/2/2000
31/6/1800
12/12/90
29/2/2100
33/3/3333
All in all, regular expressions may be able to match dates; validating them is not their forte, but if they are all you can use, it's certainly possible. But looks horrifying :)
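The match and fail lists above can be checked mechanically. A small JavaScript harness, with the leap-year-aware pattern copied from this answer and array names of my own:

```javascript
const UK_DATE_STRICT = /^(?:(?:(?:(?:31\/(?:0?[13578]|1[02]))|(?:(?:29|30)\/(?:0?[13-9]|1[0-2])))\/(?:1[6-9]|[2-9]\d)\d{2})|(?:29\/0?2\/(?:(?:(1[6-9]|[2-9]\d)(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))|(?:0?[1-9]|1\d|2[0-8])\/(?:(?:0?[1-9])|(?:1[0-2]))\/(?:(?:1[6-9]|[2-9]\d)\d{2}))$/;

const shouldMatch = ["1/1/2001", "31/5/2010", "29/02/2000", "29/2/2400", "23/5/1671", "01/1/9000"];
const shouldFail = ["31/2/2000", "31/6/1800", "12/12/90", "29/2/2100", "33/3/3333"];

for (const s of shouldMatch) {
  console.log(s, UK_DATE_STRICT.test(s) ? "ok (matched)" : "UNEXPECTED: no match");
}
for (const s of shouldFail) {
  console.log(s, UK_DATE_STRICT.test(s) ? "UNEXPECTED: matched" : "ok (rejected)");
}
```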
Regex is not the right tool for this job.
It is very difficult (but possible) to come up with a regex that matches only valid dates. Things like ensuring February has 29 days only in a leap year are not easily doable in regex.
Instead check if your language library provides any function for validating dates.
PHP has one such function called checkdate :
bool checkdate ( int $month , int $day , int $year)
\b(0?[1-9]|[12][0-9]|3[01])[/](0?[1-9]|1[012])[/](19|20)?[0-9]{2}\b
Match :
1/1/2010
01/01/2010
But also invalid dates such as February 31st
^\d{1,2}/\d{1,2}/\d{4}$
In braces there is min and max char count. \d means digit, ^ start, and $ end of string.
\b(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)\d\d
This won't validate the date, but you can use it to check the format.
I ran into similar requirements.
Here is the complete regular expression, including leap year validation.
Format: dd/MM/yyyy
(3[01]|[12]\d|0[1-9])/(0[13578]|10|12)/((?!0000)\d{4})|(30|[12]\d|0[1-9])/(0[469]|11)/((?!0000)\d{4})|(2[0-8]|[01]\d|0[1-9])/(02)/((?!0000)\d{4})|
29/(02)/(1600|2000|2400|2800|00)|29/(02)/(\d\d)(0[48]|[2468][048]|[13579][26])
It can be easily modified to US format or other EU formats.
edited:
(3[01]|[12]\d|0[1-9])/(0[13578]|10|12)/((?!0000)\d{4})|(30|[12]\d|0[1-9])/(0[469]|11)/((?!0000)\d{4})|(2[0-8]|[01]\d|0[1-9])/(02)/((?!0000)\d{4})|29/(02)/(1600|2000|2400|2800|00)|29/(02)/(\d\d)(0[48]|[2468][048]|[13579][26])
There are two things you want to do, which in my view are best considered separately
1) You want to make sure that the date is a real, actual date.
For example the 2019-02-29 isn't a real date whereas 2020-02-29 is a real date because 2020 is a leap year
2) You want to check that the date is in the correct format (so dd/mm/yyyy)
The second point can be done easily enough with a simple RegEx, plenty of examples of that.
To complicate matters, if you ask Firefox whether 2019-02-29 is a real date, it'll return NaN, which is what you'd expect.
Chrome, on the other hand, will say it is a real date and give you back the 1st of March 2019, which will validate.
Chrome will also accept a single-digit number as a proper date for some strange reason: feed it "2" and it'll give you back a full date in 2001, which will validate.
So first step is to create a function which attempts to decipher a date (no matter the format) and works cross-browser to return a boolean indicating if the date is valid or not
function validatableDate(value)
{
Date.prototype.isValid = function()
{ // An invalid date object returns NaN for getTime() and NaN is the only
// object not strictly equal to itself.
return this.getTime() === this.getTime();
};
var minTwoDigits = function(n)
{ //pads any digit less than 10 with a leading 0
return (parseInt(n, 10) < 10 ? '0' : '') + parseInt(n, 10);
};
var valid_date = false;
var iso_array = null;
// check if there are date dividers (gets around chrome allowing single digit numbers)
if ((value.indexOf('/') != -1) || (value.indexOf('-') != -1)) { //if we're dealing with - dividers we'll do some pre-processing and swap them out for /
if (value.indexOf('-') != -1) {
var dash_parts = value.split('-');
value = dash_parts.join("/");
//if we have a leading year, we'll put it at the end and work things out from there
if (dash_parts[0].length > 2) {
value = dash_parts[1] + '/' + dash_parts[2] + '/' + dash_parts[0];
}
}
var parts = value.split('/');
if (parts[0] > 12) { //convert to ISO from UK dd/mm/yyyy format
iso_array = [parts[2], minTwoDigits(parts[1]), minTwoDigits(parts[0])]
} else if (parts[1] > 12) { //convert to ISO from American mm/dd/yyyy format
iso_array = [parts[2], minTwoDigits(parts[0]), minTwoDigits(parts[1])]
} else //if a date is valid in either UK or US (e.g. 12/12/2017 , 10/10/2017) then we don't particularly care what format it is in - it's valid regardless
{
iso_array = [parts[2], minTwoDigits(parts[0]), minTwoDigits(parts[1])]
}
if (Array.isArray(iso_array)) {
value = iso_array.join("-");
var d = new Date(value + 'T00:00:01Z');
if (d.isValid()) //test if it is a valid date (there are issues with this in Chrome with Feb)
{
valid_date = true;
}
//if the month is Feb we need to do another step to cope with Chrome peculiarities
if (parseInt(iso_array[1]) == 2) {
var month_info = new Date(iso_array[0], iso_array[1], 0);
//if the day inputed is larger than the last day of the February in that year
if (iso_array[2] > month_info.getDate()) {
valid_date = false;
}
}
}
}
return valid_date;
}
That can be compressed down to
function validatableDate(t) {
Date.prototype.isValid = function () {
return this.getTime() === this.getTime()
}, minTwoDigits = function (t) {
return (parseInt(t) < 10 ? "0" : "") + parseInt(t)
};
var a = !1,
i = null;
return -1 == t.indexOf("/") && -1 == t.indexOf("-") || (-1 != t.indexOf("-") && (dash_parts = t.split("-"), t = dash_parts.join("/"), dash_parts[0].length > 2 && (t = dash_parts[1] + "/" + dash_parts[2] + "/" + dash_parts[0])), parts = t.split("/"), i = parts[0] > 12 ? [parts[2], minTwoDigits(parts[1]), minTwoDigits(parts[0])] : (parts[1], [parts[2], minTwoDigits(parts[0]), minTwoDigits(parts[1])]), Array.isArray(i) && (t = i.join("-"), new Date(t + "T00:00:01Z").isValid() && (a = !0), 2 == parseInt(i[1]) && (month_info = new Date(i[0], i[1], 0), i[2] > month_info.getDate() && (a = !1)))), a
}
That gets you a cross-browser test as to whether the date can be validated or not and it'll read & decipher dates in formats
yyyy-mm-dd
dd-mm-yyyy
mm-dd-yyyy
dd/mm/yyyy
mm/dd/yyyy
Once you've validated that the date is a real, proper one, you can then test the format with a regex. So for UK dd/mm/yyyy:
function dateUK(value) {
  // declare locals with var so they do not leak into the global scope
  var valid_uk_date = false;
  var valid_date = validatableDate(value);
  if (valid_date && value.match(/^(0?[1-9]|[12][0-9]|3[01])[\/](0?[1-9]|1[012])[\/]\d{4}$/)) {
    valid_uk_date = true;
  }
  return valid_uk_date;
}
You then know that the date is a real one and that it's in the correct format.
For yyyy-mm-dd format, you'd do:
function dateISO(value) {
  // declare locals with var so they do not leak into the global scope
  var valid_iso_date = false;
  var valid_date = validatableDate(value);
  if (valid_date && value.match(/^\d{4}[\/\-]\d{1,2}[\/\-]\d{1,2}$/)) {
    valid_iso_date = true;
  }
  return valid_iso_date;
}
It depends how thorough you want to be of course, for a rough check of format sanity a RegEx may be enough for your purposes. If however you want to test if the date is a real one AND if the format is valid then this will hopefully help point you along the way
Thanks