Date regex which validates 3 different formats

I need a regex for date string which validates
YYYY:MM:DD:HH
YYYY:MM:DD:HH:mm
YYYY:MM:DD:HH:mm:ss
All 3 formats should be valid.
Can someone help me with this?
I have
^\d\d\d\d:(0\d|1[012]):([012]\d|3[01]):([01]\d|2[0-3])$ YYYY:MM:DD:HH
^\d\d\d\d:(0\d|1[012]):([012]\d|3[01]):([01]\d|2[0-3]):[0-5]\d$ YYYY:MM:DD:HH:mm
^\d\d\d\d:(0\d|1[012]):([012]\d|3[01]):([01]\d|2[0-3]):[0-5]\d:[0-5]\d$ YYYY:MM:DD:HH:mm:ss
These 3 regexes need to be combined into one.

This is the shape of your pattern:
YYYY:MM:DD:HH(:mm(:ss)?)?
? means the preceding group appears 0 or 1 times.

I kept your year, month, day and hour expression ^\d\d\d\d:(0\d|1[012]):([012]\d|3[01]):([01]\d|2[0-3]). Since your minute and second expressions were the same, :[0-5]\d, I just required that group to appear zero, one or two times with {0,2}.
The resulting expression is:
^\d\d\d\d:(0\d|1[012]):([012]\d|3[01]):([01]\d|2[0-3])(:[0-5]\d){0,2}$
This expression by francis-gagnon is a slight modification to prevent edge cases where the day or month is expressed as 00.
^\d\d\d\d:(0[1-9]|1[012]):(0[1-9]|[12]\d|3[01]):([01]\d|2[0-3])(:[0-5]\d){0,2}$
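As a quick check, here is a minimal Python sketch that runs the expression above against the three formats from the question. The names DATE_RE and is_valid_format are made up for this illustration. It only checks the shape of the string, so a calendar-impossible value such as 2023:02:31:10 still passes.

```python
import re

# Combined pattern from the answer above: HH is required, while the
# (:[0-5]\d){0,2} group covers the optional :mm and :ss parts.
DATE_RE = re.compile(
    r"^\d\d\d\d:(0[1-9]|1[012]):(0[1-9]|[12]\d|3[01]):([01]\d|2[0-3])(:[0-5]\d){0,2}$"
)


def is_valid_format(value: str) -> bool:
    """Return True if value looks like YYYY:MM:DD:HH[:mm[:ss]]."""
    # fullmatch also rejects a trailing newline, which a bare $ would tolerate.
    return DATE_RE.fullmatch(value) is not None


for sample in ("2023:12:31:23", "2023:12:31:23:59", "2023:12:31:23:59:59",
               "2023:00:31:23", "2023:12:31"):
    print(sample, is_valid_format(sample))
```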
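If you also want to check that the date itself is valid, you could use something like this monster, which checks that each date field is valid (29 February is only accepted in leap years) and that the time fits into a 24-hour clock. Note that it is more permissive than the formats in the question: it also accepts /, - and . as date separators and single-digit months and days.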
^(?:(?:(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00)))(:|\/|-|\.)(?:0?2\1(?:29)))|(?:(?:(?:1[6-9]|[2-9]\d)?\d{2})(:|\/|-|\.)(?:(?:(?:0?[13578]|1[02])\2(?:31))|(?:(?:0?[13-9]|1[0-2])\2(?:29|30))|(?:(?:0?[1-9])|(?:1[0-2]))\2(?:0?[1-9]|1\d|2[0-8]))))(?::(?:[01]\d|2[0-3]))?(?::[0-5]\d){0,2}$
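If the calendar check does not have to live inside the regex, a simpler route is to let the regex capture the fields and hand them to a date library. The sketch below is an alternative to the long expression above, not an equivalent of it. It assumes Python's standard datetime module, and the names PARTS_RE and is_real_timestamp are invented for the example.

```python
import re
from datetime import datetime

# Shape only: YYYY:MM:DD:HH with optional :mm and :ss (ASCII digits only).
PARTS_RE = re.compile(
    r"(\d{4}):(\d{2}):(\d{2}):(\d{2})(?::(\d{2}))?(?::(\d{2}))?", re.ASCII
)


def is_real_timestamp(value: str) -> bool:
    """Return True if value has the right shape and is a real date and time."""
    m = PARTS_RE.fullmatch(value)
    if m is None:
        return False
    # Missing minutes/seconds come back as None; treat them as 0.
    year, month, day, hour, minute, second = (
        int(g) if g else 0 for g in m.groups()
    )
    try:
        # datetime() raises ValueError for 30 February, hour 24, and so on.
        datetime(year, month, day, hour, minute, second)
    except ValueError:
        return False
    return True


print(is_real_timestamp("2024:02:29:12"))  # leap year
print(is_real_timestamp("2023:02:29:12"))  # not a leap year
```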

\d{4}:[0-1][0-9]:[0-3][0-9]:[0-2][0-9](?::[0-5][0-9](?::[0-5][0-9])?)?

Related

RegEx for PostgreSQL 'interval' function

I need to implement regex validation for value that will be used in my server side to get data where certain timestamp is older (smaller) than now() - interval 'myValue'.
In short, a PostgreSQL interval can have values like 2 days, 3 years or 12 hours, but you can also combine several values, like 2 days 6 hours 30 minutes.
I currently have a regex /^\d+\s(seconds?|minutes?|hours?|days?|weeks?|months?|years?)$/i that accepts only one value (e.g. 2 days), but can't figure out how to allow multiple values, and set a rule that a certain string from this group can only be repeated once or not at all.
This regex /^\d+\s(seconds?|minutes?|hours?|days?|weeks?|months?|years?)(\s\d+\s(seconds?|minutes?|hours?|days?|weeks?|months?|years?))*$/i allows nesting but also allows repetition of values e.g. 2 days 12 hours 6 hours 2 minutes which will result in a fatal error in pSQL query.
I tried restricting repetition of values in this group with \1 and {0,1} combination of regex operators but I just can't nail it precisely enough.
NOTE: Regex is unfortunately the only way I can validate this value, since I don't have access to the server-side controller which receives this value, nor do I have access to the client-side frontend of this form. I can't just throw exceptions or skip the query because it is part of an important cron job and must be stable at all times.
(All I have access to is json schema of this value, and therefore can only define regex pattern for it)
Any help is appreciated, thanks.
You can use
^(?!.*(second|minute|hour|day|week|month|year).*\1)\d+\s+(?:second|minute|hour|day|week|month|year)s?(?:\s+\d+\s+(?:second|minute|hour|day|week|month|year)s?)*$
Details
^ - start of string
(?!.*(second|minute|hour|day|week|month|year).*\1) - no repetition of second, minute, hour, day, week, month or year is allowed anywhere in the string
\d+\s+(?:second|minute|hour|day|week|month|year)s? - 1 or more digits, one or more whitespaces, then either second, minute, hour, day, week, month or year, and then an optional s letter
(?:\s+\d+\s+(?:second|minute|hour|day|week|month|year)s?)* - zero or more repetitions of one or more whitespaces followed by the pattern described above
$ - end of string.
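To see the negative lookahead at work, here is a small Python sketch of the same pattern. The names UNITS, INTERVAL_RE and is_valid_interval are invented for the example, and the IGNORECASE flag is an assumption carried over from the /i in the question's own patterns.

```python
import re

UNITS = "second|minute|hour|day|week|month|year"

INTERVAL_RE = re.compile(
    rf"^(?!.*({UNITS}).*\1)"           # reject if any unit name occurs twice
    rf"\d+\s+(?:{UNITS})s?"            # first "<number> <unit>" pair
    rf"(?:\s+\d+\s+(?:{UNITS})s?)*$",  # any further pairs
    re.IGNORECASE,
)


def is_valid_interval(value: str) -> bool:
    """Return True if value is a list of number/unit pairs without repeats."""
    return INTERVAL_RE.fullmatch(value) is not None


for sample in ("2 days", "2 days 6 hours 30 minutes",
               "2 days 12 hours 6 hours 2 minutes"):
    print(sample, "->", is_valid_interval(sample))
```

Only the last sample is rejected, because hour appears twice.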
Forget it. The only complete documentation of the supported values for interval is the implementation (the guts are in ParseDateTime).
Consider these:
SELECT INTERVAL '12 00:12:00';
interval
══════════════════
12 days 00:12:00
(1 row)
SELECT INTERVAL '12 d 12 mins';
interval
══════════════════
12 days 00:12:00
(1 row)
SELECT INTERVAL '3-2';
interval
════════════════
3 years 2 mons
(1 row)
What I would do in your place is to write a function that casts the string to interval and catches and reports an error:
CREATE FUNCTION interval_ok(text) RETURNS boolean
LANGUAGE plpgsql AS
$$BEGIN
PERFORM CAST ($1 AS interval);
RETURN TRUE;
EXCEPTION
WHEN invalid_datetime_format THEN
RETURN FALSE;
END;$$;

Regex expression for date within dates range

I need to validate with regex a date in format yyyy-mm-dd (2019-12-31) that should be within the range 2019-12-20 - 2020-01-10.
What would be the regex for this?
Thanks
Regexes only deal with characters, so we have to work out which characters are valid at each position in the date.
The first part is easy: the first two characters have to be 20.
Now it gets complicated. The next character can be a 1 or a 2, but what follows depends on the value of that character, so we split the rest of the regex into two alternatives: the first for when the third character is 1 and the second for when it is 2.
If the third character is a 1, then what must follow is the characters 9-12-, as the range starts at 2019-12-20. Now for the day part. The 9th character is the tens digit of the day; this can only be 2 or 3, as we are already in the last month and the minimum date is the 20th. The last character can be any digit 0-9. This gives us a day match of [23][0-9]. Putting this together, we now have a pattern for dates in 2019: 19-12-[23][0-9]
If the third character is a 2, then we can again match up to the day part of the date, as the range ends in January. This gives us a partial match of 20-01-, leaving us to work on the day part. Here we know that the first character of the day can be either a 0 or a 1; if it's a 1 then the last character must be a 0, and if it's a 0 then the last character can only be in the range 1 to 9. This gives us another alternation, (?:0[1-9]|10). Putting the second part together we get 20-01-(?:0[1-9]|10).
Combining these together gives the final regex 20(?:19-12-[23][0-9]|20-01-(?:0[1-9]|10))
Note that I'm assuming that the string you are testing is already a validly formatted date (the pattern would also accept 2019-12-39, for example).
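One way to gain confidence in a hand-built range regex like this is to compare it against real date arithmetic for every day around the range. The following Python sketch does that for 2019 and 2020. The names RANGE_RE and regex_agrees_with_calendar are invented for the illustration.

```python
import re
from datetime import date, timedelta

RANGE_RE = re.compile(r"20(?:19-12-[23][0-9]|20-01-(?:0[1-9]|10))")
START, END = date(2019, 12, 20), date(2020, 1, 10)


def regex_agrees_with_calendar() -> bool:
    """Check every real date in 2019-2020: regex match <=> date is in range."""
    day = date(2019, 1, 1)
    while day <= date(2020, 12, 31):
        matched = RANGE_RE.fullmatch(day.isoformat()) is not None
        if matched != (START <= day <= END):
            return False
        day += timedelta(days=1)
    return True


print(regex_agrees_with_calendar())
# The regex only sees characters, so an impossible date still matches:
print(RANGE_RE.fullmatch("2019-12-39") is not None)
```

Both lines print True, which is why the assumption about validly formatted input matters.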
Try this:
(2019|2020)\-(12|01)\-([0-3][0-9]|[0-9])
But be aware that this allows any dd value whose first digit is between zero and three and whose second digit is between zero and nine. You could instead list every day number you want to allow (20 to 31 and 1 to 10) like this: (20|21|22|23|24|25|26|27|28|29|30|31|01|1|02|2|03|3|04|4|05|5|06|6|07|7|08|8|09|9|10).
(2019|2020)\-(12|01)\-(20|21|22|23|24|25|26|27|28|29|30|31|01|1|02|2|03|3|04|4|05|5|06|6|07|7|08|8|09|9|10)
But honestly... regular expressions are not the right tool for this. A regex describes the shape of a string, not a logical context. Use regex to extract the values from a string and validate those values in your programming language.
The 2nd regex above will, for example, match your dates, but also values outside of the range, since there is no connection between the first group 2019|2020 and the second group 12|01: it matches 2019-12-25 but also 2020-12-25.
To only match the values you want this will be a really large regex like this (inner brackets only if you need them) ((2019)-(12)-(20)|(2019)-(12)-(21)|(2019)-(12)-(22)|...) and continue with all possible dates - and ask yourself: what would you do if you find such a regex in a project you have to work with ;)
Better solution (quick and dirty, there might be better solutions):
(?<yyyy>20[0-9]{2})\-(?<mm>[01][0-9]|[0-9])\-(?<dd>[0-3][0-9]|[0-9])
This way you have three named groups (yyyy, mm, dd) you can access and validate the matched values... The regex is smaller, you have a better association between code and regex and both are easier to maintain.
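A rough Python version of this extract-then-validate approach could look like the sketch below. Python's re module spells named groups as (?P<name>...) rather than (?<name>...), so the pattern is adapted accordingly. The range bounds come from the question, and the names DATE_RE and in_range are invented for the example.

```python
import re
from datetime import date

# Same idea as the regex above, with Python's named-group syntax.
DATE_RE = re.compile(
    r"(?P<yyyy>20[0-9]{2})-(?P<mm>[01][0-9]|[0-9])-(?P<dd>[0-3][0-9]|[0-9])"
)
START, END = date(2019, 12, 20), date(2020, 1, 10)


def in_range(value: str) -> bool:
    """Extract the fields with the regex, then validate them in code."""
    m = DATE_RE.fullmatch(value)
    if m is None:
        return False
    try:
        parsed = date(int(m["yyyy"]), int(m["mm"]), int(m["dd"]))
    except ValueError:  # e.g. month 13 or day 32
        return False
    return START <= parsed <= END


print(in_range("2019-12-31"))  # True
print(in_range("2020-12-25"))  # False
```

Changing the allowed range now means editing two date constants instead of rewriting the regex.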

Regex for month and day before specific symbols

I'm trying to get the day and month from strings such as:
5月2日 or 4月22日 or 12月2日
However I can't seem to figure out the correct regex:
I've tried \d{1,2}[^月] and \d{1,2}[^日] however this only returns something if there is a double digit in the day or month.
Any ideas what I'm missing?
Thanks.
\d{1,2} is matching one digit and [^月] is matching the next one. Your current regex matches one or two digits followed by any character except 月, so it can only succeed when there are two digits: the second digit gets consumed by [^月].
The correct way to ensure the 月 follows is to use a lookahead: \d{1,2}(?=月)
Assuming you have 12 months per year and up to 31 days per month, this will get you close. You'll still have to do bounds checking after you determine the syntax is correct (read: month 19 day 37 will be valid syntax here).
1?\d月[123]?\d日
Edit: Here's a better regex that doesn't need to be bounds checked and doesn't require lookahead;
^(1[012]|[1-9])月(3[01]|[12]\d|[1-9])日$
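For illustration, here is a short Python sketch that uses the bounds-checked expression to pull out the month and day as integers. The names MONTH_DAY_RE and parse_month_day are invented for the example.

```python
import re
from typing import Optional, Tuple

# Month 1-12, then 月, then day 1-31, then 日.
MONTH_DAY_RE = re.compile(r"^(1[012]|[1-9])月(3[01]|[12]\d|[1-9])日$")


def parse_month_day(text: str) -> Optional[Tuple[int, int]]:
    """Return (month, day) for strings like 4月22日, or None if no match."""
    m = MONTH_DAY_RE.fullmatch(text)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


print(parse_month_day("5月2日"))    # (5, 2)
print(parse_month_day("12月2日"))   # (12, 2)
print(parse_month_day("19月37日"))  # None
```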

Regex pattern for a number within a number

Can anyone think of a better way to write this? It works but it is a little ugly.
Input data looks like this: 125100001
The first two numbers are the year, next two are the week number, and last 5 are the serial. I want to validate that the week number is not over 52 for an angular input[number] pattern option. Basically just to leverage the $error field :)
So here it is:
^\d\d(0[0-9]|1[0-9]|2[0-9]|3[0-9]|4[0-9]|5[0-2]){1}\d{5}$
Use this:
^(\d{2})([0-4][1-9]|[1-5]0|5[12])(\d{5})$
Notes
The first set of parentheses (\d{2}) captures the two-digit year
The second set of parentheses ([0-4][1-9]|[1-5]0|5[12]) validates the week: 01-52
If you wish, you can retrieve each component with groups 1, 2 and 3
Just for the week part:
[0-4]\d|5[0-2]
so the entire regex would be:
^\d\d([0-4]\d|5[0-2])\d{5}$
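The two answers differ in one detail: the first week pattern rejects week 00, while the shorter one accepts it. A small Python sketch makes this visible by trying every two-digit week number. The names STRICT_RE, LOOSE_RE and accepted_weeks are invented for the example, and the 12 prefix and 00001 serial are taken from the sample input in the question.

```python
import re

STRICT_RE = re.compile(r"^(\d{2})([0-4][1-9]|[1-5]0|5[12])(\d{5})$")  # weeks 01-52
LOOSE_RE = re.compile(r"^\d\d([0-4]\d|5[0-2])\d{5}$")                 # weeks 00-52


def accepted_weeks(pattern):
    """Return every week number 0-99 that the pattern accepts."""
    return [w for w in range(100) if pattern.fullmatch(f"12{w:02d}00001")]


print(accepted_weeks(STRICT_RE) == list(range(1, 53)))  # True
print(accepted_weeks(LOOSE_RE) == list(range(0, 53)))   # True
```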

Regular Expression to match dates in dd/mm/yy format and check for valid values

Does anyone have a regular expression available which only accepts dates in the format dd/mm/yy but also has strict checking to make sure that the date is valid, including leap year support?
I am coding in vb.net and am struggling to work this one out.
I don't think the leap year support is doable in a regex without using some ugly regex.
You will have to check the date validity after validating input with the regex.
As hinted by Keeper, you could use the DateTime.ParseExact method to validate your date :
Public Function IsValidDate(ByVal dateString As String) As Boolean
    Try
        DateTime.ParseExact(dateString, "dd/MM/yy", System.Globalization.CultureInfo.InvariantCulture)
        Return True
    Catch ex As FormatException
        Return False
    End Try
End Function
Apart from the fact that such a regex would be a long, dirty, unmaintainable thing if it existed, you can't even tell for sure whether a year in YY format is a leap year or not. A year ending in 00 is a leap year if and only if the full year is a multiple of 400: 2000 was a leap year, 1900 wasn't.
The following regex makes sure that date is between 01 and 31, month is between 01 and 12 and year is between 1900 and 2099. Delete the (?:19|20) part to make it dd/mm/yy format: then year can be anything from 00 to 99. Do the real validation using standard date-time libraries - use the regex for just client side validations (to save a trip to server - assuming you're doing date-time validation at server), or as a screening test before feeding to the real validator.
^(0[1-9]|[12]\d|3[01])/(0[1-9]|1[0-2])/((?:19|20)\d{2})$
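The screening-then-real-validation idea can be sketched in a few lines of Python (chosen here only for brevity; the same structure applies in VB.NET with a date parsing method). The sketch uses the four-digit-year version of the regex, and the names SCREEN_RE and is_valid_date are invented for the example.

```python
import re
from datetime import datetime

# Screening regex from above: dd/mm/yyyy with years 1900-2099.
SCREEN_RE = re.compile(r"^(0[1-9]|[12]\d|3[01])/(0[1-9]|1[0-2])/((?:19|20)\d{2})$")


def is_valid_date(value: str) -> bool:
    """Cheap regex screen first, then the real check by the date library."""
    if SCREEN_RE.fullmatch(value) is None:
        return False
    try:
        datetime.strptime(value, "%d/%m/%Y")
    except ValueError:  # e.g. 31 April, or 29 February in a non-leap year
        return False
    return True


print(is_valid_date("29/02/2008"))  # True
print(is_valid_date("29/02/2009"))  # False
```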
It will be hard, or ugly and a maintenance nightmare, or even impossible.
Just do a check in code after Regex validation.
No need to use a regex because there's already a date parsing function: DateTime.ParseExact
I think it is extremely hard to check whether a year is a leap year with a regular expression. Please take a look at this article about your problem. Here is a quote from it:
Again, how complex you want to make your regular expression depends on the data you are using it on, and how big a problem it is if an unwanted match slips through. If you are validating the user's input of a date in a script, it is probably easier to do certain checks outside of the regex. For example, excluding February 29th when the year is not a leap year is far easier to do in a scripting language. It is far easier to check if a year is divisible by 4 (and not divisible by 100 unless divisible by 400) using simple arithmetic than using regular expressions.
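The arithmetic mentioned in the quote fits in one line. Here is a minimal Python sketch, where is_leap is a name invented for the example and the standard library's calendar.isleap is used only as a cross-check.

```python
import calendar


def is_leap(year: int) -> bool:
    """Divisible by 4, and not divisible by 100 unless divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


print(is_leap(2000), is_leap(1900), is_leap(2008), is_leap(2009))
# Cross-check against the standard library over a few centuries.
print(all(is_leap(y) == calendar.isleap(y) for y in range(1600, 2401)))
```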
You'd probably be better off just doing the format-validation in regex and handling the date-validation separately.
You're trying to use the regex hammer to solve an eminently non-nail shaped problem.
Would it not be better to extract the numbers using regular expressions, but validate them programmatically?
There is no need to verify the format because the "parse" methods will do this for you. The Parse method compares the string you pass in against all date format strings in DateTimeFormatInfo. The ParseExact method only compares it against the format strings you pass to the method.
Imports System.Globalization
Module Sample
    Public Function IsValidDateString1(ByVal s As String) As Boolean
        Return Date.TryParseExact(s, "dd/MM/yy", CultureInfo.InvariantCulture, DateTimeStyles.None, Nothing)
    End Function
    Public Function IsValidDateString2(ByVal s As String) As Boolean
        Static _dateFormats() As String = New String() {"dd/MM/yy", "d/M/yy", "d/M/yyyy"}
        Return Date.TryParseExact(s, _dateFormats, CultureInfo.InvariantCulture, DateTimeStyles.None, Nothing)
    End Function
    Public Sub Main()
        Debug.WriteLine("single")
        Debug.WriteLine(IsValidDateString1("31/12/2001")) 'wrong format
        Debug.WriteLine(IsValidDateString1("31/12/01"))
        Debug.WriteLine(IsValidDateString1("29/2/08")) '<-be careful
        Debug.WriteLine(IsValidDateString1("29/ 2/08")) '<-be careful
        Debug.WriteLine(IsValidDateString1("29/02/08"))
        Debug.WriteLine(IsValidDateString1("29/02/09")) 'invalid date
        Debug.WriteLine("multiple")
        Debug.WriteLine(IsValidDateString2("31/12/2001"))
        Debug.WriteLine(IsValidDateString2("31/12/01"))
        Debug.WriteLine(IsValidDateString2("29/2/08"))
        Debug.WriteLine(IsValidDateString2("29/ 2/08")) '<-be careful
        Debug.WriteLine(IsValidDateString2("29/02/08"))
        Debug.WriteLine(IsValidDateString2("29/02/09")) 'invalid date
    End Sub
End Module