I need to match any legitimate date format, but only for one specific date that I am given as a parameter.
For example: 01-05-2020
I need it to match as many formats as possible, such as (but not only) 01/05/2020, 1/5/20, 05-01-2020, 2020-05-01, and so on.
It should not match 02-05-2020, or any other date that is not the first of May 2020.
thanks,
Dani
As upog already said, you must have a fixed order for the month and day, because 5th Jan and 1st May can easily switch places. Given a fixed order, however, we can construct a regex.
For the formats Day/Month/Year and Day-Month-Year, you can construct a regex to match the 1st of May 2020 like so:
^0?1[\/\-]0?5[\/\-](?:20)?20$
Live demo
For the formats Month/Day/Year and Month-Day-Year, you can construct a regex to match the 1st of May 2020 like so:
^0?5[\/\-]0?1[\/\-](?:20)?20$
Live demo
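Since the date arrives as a parameter, the two patterns above could be generated rather than typed by hand. A minimal Python sketch, assuming the day/month order is already known and the year has four digits; the helper name build_date_regex and its day_first flag are made up for illustration:

```python
import re
from datetime import date


def build_date_regex(d: date, day_first: bool = True) -> re.Pattern:
    """Build a pattern for one specific date (hypothetical helper)."""
    # A leading zero is optional only for single-digit values.
    day = f"0?{d.day}" if d.day < 10 else str(d.day)
    month = f"0?{d.month}" if d.month < 10 else str(d.month)
    # Accept YYYY or YY by making the century digits optional.
    century, short_year = divmod(d.year, 100)
    year = f"(?:{century})?{short_year:02d}"
    first, second = (day, month) if day_first else (month, day)
    return re.compile(rf"^{first}[/\-]{second}[/\-]{year}$")


pattern = build_date_regex(date(2020, 5, 1))
for text in ["01/05/2020", "1/5/20", "01-05-2020", "02-05-2020"]:
    print(text, bool(pattern.match(text)))
```

Passing day_first=False gives the Month/Day/Year variant. As with the hand-written patterns, mixed separators such as 01/05-2020 are accepted.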
Right now, my regex accepts 200011 as a valid date (January 1st, 2000),
but I want to restrict it to the YYYYMMDD format so that only 20000101 is accepted as a valid date. How can I achieve this?
My code:
^(?:(?:(?:(?:(?:[1-9]\d)(?:0[48]|[2468][048]|[13579][26])|(?:(?:[2468][048]|[13579][26])00))([-\/.]?)(?:0?2\1(?:29)))|(?:(?:[1-9]\d{3})([-\/.]?)(?:(?:(?:0?[13578]|1[02])\2(?:31))|(?:(?:0?[13-9]|1[0-2])\2(?:29|30))|(?:(?:0?[1-9])|(?:1[0-2]))\2(?:0?[1-9]|1\d|2[0-8])))))$
You need to remove ? after all 0s:
^(?:(?:(?:(?:(?:[1-9]\d)(?:0[48]|[2468][048]|[13579][26])|(?:(?:[2468][048]|[13579][26])00))([-\/.]?)(?:02\1(?:29)))|(?:(?:[1-9]\d{3})([-\/.]?)(?:(?:(?:0[13578]|1[02])\2(?:31))|(?:(?:0[13-9]|1[0-2])\2(?:29|30))|(?:(?:0[1-9])|(?:1[0-2]))\2(?:0[1-9]|1\d|2[0-8])))))$
See the regex demo
For example, the last 0?[1-9] would match 0 one or zero times, and then a non-zero digit. When you remove the ? quantifier, the 0 becomes required.
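To see the difference, here is a small Python check that runs both patterns from this thread against a few inputs. It is only a sketch: the names LENIENT and STRICT are mine, and the patterns are split across lines purely for readability.

```python
import re

# Original pattern from the question: leading zeros are optional (0?).
LENIENT = re.compile(
    r"^(?:(?:(?:(?:(?:[1-9]\d)(?:0[48]|[2468][048]|[13579][26])"
    r"|(?:(?:[2468][048]|[13579][26])00))([-\/.]?)(?:0?2\1(?:29)))"
    r"|(?:(?:[1-9]\d{3})([-\/.]?)(?:(?:(?:0?[13578]|1[02])\2(?:31))"
    r"|(?:(?:0?[13-9]|1[0-2])\2(?:29|30))"
    r"|(?:(?:0?[1-9])|(?:1[0-2]))\2(?:0?[1-9]|1\d|2[0-8])))))$"
)

# Pattern from the answer: every ? after a 0 removed, so zeros are required.
STRICT = re.compile(
    r"^(?:(?:(?:(?:(?:[1-9]\d)(?:0[48]|[2468][048]|[13579][26])"
    r"|(?:(?:[2468][048]|[13579][26])00))([-\/.]?)(?:02\1(?:29)))"
    r"|(?:(?:[1-9]\d{3})([-\/.]?)(?:(?:(?:0[13578]|1[02])\2(?:31))"
    r"|(?:(?:0[13-9]|1[0-2])\2(?:29|30))"
    r"|(?:(?:0[1-9])|(?:1[0-2]))\2(?:0[1-9]|1\d|2[0-8])))))$"
)

for text in ["200011", "20000101", "20000229", "20010229"]:
    print(text, bool(LENIENT.match(text)), bool(STRICT.match(text)))
```

Note that both patterns still allow an optional separator, so 2000-01-01 is accepted as well as 20000101.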
I am trying to use Regex to parse a series of strings to extract one or more text dates that may be in multiple formats. The strings will look something like the following:
24 Aug 2016: nno-emvirt010a/b; 16 Aug 2016 nnt-emvirt010a/b nnd-emvirt010a/b COSI-1.6.5
24.16 nno-emvirt010a/b nnt-emvirt010a/b nnd-emvirt010a/b EI.01.02.03\
9/23/16: COSI-1.6.5 Logs updated at /vobs/COTS/1.6.5/files/Status_2016-07-27.log, Status_2016-07-28.log, Status_2016-08-05.log, Status_2016-08-08.log
I am not concerned about validating the individual date fields; just extracting the date string. The part I am unable to figure out is how to not match on number sequences that match the pattern but aren’t dates (‘1.6.5’ in ex. (1) and 01.02.03 in ex. (2)) and dates that are part of a file name (2016-07-27 in ex. (3)). In each of these exception cases in my input data, the initial numbers are preceded by either a period(.), underscore (_) or dash (-), but I cannot determine how to use this to edit the pattern syntax to not match these strings.
The pattern I have that partially works is below. It will only ignore the non date matches if it starts with 1 digit as in example 1.
/[^_\.\(\/]\d{1,4}[/\-\.\s*]([1-9]|0[1-9]|[12][0-9]|3[01]|[a-z]{3})[/\-\.\s*]\d{1,4}/ig
I am not sure about VBA, so check whether this works. They give quite a few options here: https://www.safaribooksonline.com/library/view/regular-expressions-cookbook/9781449327453/ch04s04.html
^(?:(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])|(3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9]))/(?:[0-9]{2})?[0-9]{2}$
^(?:
# m/d or mm/dd
(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])
|
# d/m or dd/mm
(3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9])
)
# /yy or /yyyy
/(?:[0-9]{2})?[0-9]{2}$
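The commented layout above only works in a regex flavor with a free-spacing mode. As an illustration outside VBA, Python's re.VERBOSE flag accepts it as written; the name DATE_RE is just for this sketch:

```python
import re

# Free-spacing version of the pattern: whitespace and # comments are ignored.
DATE_RE = re.compile(
    r"""
    ^(?:
      # m/d or mm/dd
      (1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])
    |
      # d/m or dd/mm
      (3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9])
    )
    # /yy or /yyyy
    /(?:[0-9]{2})?[0-9]{2}$
    """,
    re.VERBOSE,
)

for text in ["9/23/16", "31/12/2016", "13/13/2016"]:
    print(text, bool(DATE_RE.match(text)))
```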
According to the test strings you've presented, you can use the following regex
See this regex in use here
(?<=[^a-zA-Z\d.]|^)((?:\d{1,2}\s*[A-Z][a-z]{2}\s*\d+)|(?:(?:\d{1,2}\/){2}\d+)|(?:\d+(?:-\d{2}){2})|\d{2}\.\d{2})(?=[^a-zA-Z\d.])
This regex ensures that specific date formats are met and are preceded either by nothing (beginning of the string) or by a character that is not a letter (a-z, A-Z), a digit (0-9), or a dot (.); the trailing lookahead requires the same kind of character after the date. The date formats that will be matched are:
24 Aug 2016
24.16
9/23/16
The regex could be further manipulated to ensure numbers are in the proper range according to days/month, etc., however, I don't feel that is really necessary.
Edits
Edit 1
Since VBA doesn't support lookbehinds, you can use the following. The date is in capture group 1.
(?:[^a-zA-Z\d.]|^)((?:\d{1,2}\s*[A-Z][a-z]{2}\s*\d+)|(?:(?:\d{1,2}\/){2}\d+)|(?:\d+(?:-\d{2}){2})|\d{2}\.\d{2})(?=[^a-zA-Z\d.])
Edit 2
As per bulbus's comment below
(?:[^\w.]|^)((?:\d{1,2}\s*[A-Z][a-z]{2}\s*\d{2,4})|(?:(?:\d{1,2}\/){2}\d{2,4})|(?:\d{2,4}(?:-\d{2}){2})|\d{2}\.\d{2})
I took the liberty of editing that a bit:
Replaced [^a-zA-Z\d.] with [^\w.], which has the added advantage of excluding dates such as the one in _2016-07-28.log.
Because of the first change, removed the trailing condition (?=[^a-zA-Z\d.]).
Forced the year digits from \d+ to \d{2,4}.
Edit 3
Due to added conditions of the regex, I've made the following edits (to improve upon both previous edits). As per the OP:
The edited pattern above works in all but 2 cases:
it does not find dates with the year first (ex. 2016/07/11)
if the date is contained within parenthesis in the string, it returns the left parenthesis as part of the date (ex. match = (8/20/2016)
Can you provide the edit to fix these?
In the below regexes, I've changed years to \d+ in order for it to work on any year greater than or equal to 0.
See the code in use here
(?:[^\w.]|^)((?:\d{1,2}\s+[A-Z][a-z]{2}\s+\d+)|(?:(?:\d{1,2}\/){2}\d+)|(?:\d+(?:\/\d{1,2}){2})|(?:\d+(?:-\d{2}){2})|\d{2}\.\d+)
This regex adds the possibility of dates in the XXXX/XX/XX format, where the year may appear first.
The reason you are getting ( at the start of the match is that you are reading the full match, which includes the character consumed before the date. Instead, grab the value of the first capture group rather than the whole regex result. See this answer on how to grab submatches from a regex pattern in VBA.
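The full-match versus capture-group difference is easy to see outside VBA too. A rough Python sketch using the Edit 3 pattern; the names DATE_RE and extract_dates and the sample sentence are my own:

```python
import re

# Edit 3 pattern, split per alternative for readability.
DATE_RE = re.compile(
    r"(?:[^\w.]|^)("
    r"(?:\d{1,2}\s+[A-Z][a-z]{2}\s+\d+)"  # 12 Apr 2017
    r"|(?:(?:\d{1,2}\/){2}\d+)"           # 1/4/2017
    r"|(?:\d+(?:\/\d{1,2}){2})"           # 2017/04/01
    r"|(?:\d+(?:-\d{2}){2})"              # 2017-04-01
    r"|\d{2}\.\d+"                        # 24.16
    r")"
)


def extract_dates(text):
    """Return only capture group 1 of each match, not the full match."""
    return [m.group(1) for m in DATE_RE.finditer(text)]


sample = "Released (8/20/2016); rebuilt 2016/07/11 from Status_2016-07-28.log"
first = DATE_RE.search(sample)
print(repr(first.group(0)))  # full match, includes the leading parenthesis
print(repr(first.group(1)))  # capture group 1, the date only
print(extract_dates(sample))
```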
Also, note that any additional date formats you need to catch need to be explicitly set in the regex. Currently, the regex supports the following date formats:
\d{1,2}\s+[A-Z][a-z]{2}\s+\d+
12 Apr 17
12 Apr 2017
(?:\d{1,2}\/){2}\d+
1/4/17
01/04/17
1/4/2017
01/04/2017
\d+(?:\/\d{1,2}){2}
17/04/01
2017/4/1
2017/04/01
17/4/1
\d+(?:-\d{2}){2}
17-04-01
2017-04-01
\d{2}\.\d+ - Although I'm not sure what this date format is even used for and how it could be considered efficient if it's missing month
24.16
Currently I'm writing a regex to validate datetime strings. The basic format is
(Date)(?: *,?)(Time)
I've worked out the date and time portions of the regex just fine, but I'm having trouble making the regex allow for all three of these cases:
date and no separator and no time
date and separator and time
no date and no separator and time
The easy way (making all three parts optional) has the unintended side-effect of allowing the case
no date and separator and no time
Is this something that can be achieved with a lookaround?
Example valid input:
January 25 2004, 10:30 PM
2004-1-25
10:30 PM
Invalid input:
,
(The Date and Time regexes handle a bunch of cases, and I've got that worked out - it's preventing that invalid case while allowing for all three forms that I still need.)
If you include the separator with either the date or time group, then a separator by itself will not match:
^(DATE(?: *,? *))?(TIME)?$
It has an optional date and separator, followed by an optional time.
See demo: http://regex101.com/r/dH6nI2/1
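A quick way to try this out in Python. The DATE and TIME sub-patterns below are deliberately simplified stand-ins that I made up for the sketch; substitute your real ones:

```python
import re

# Simplified stand-ins for the real sub-patterns (assumptions for this sketch).
DATE = r"\d{4}-\d{1,2}-\d{1,2}"
TIME = r"\d{1,2}:\d{2} [AP]M"

# The separator lives inside the optional date group, so it cannot appear alone.
DATETIME_RE = re.compile(rf"^({DATE}(?: *,? *))?({TIME})?$")

for text in ["2004-1-25, 10:30 PM", "2004-1-25", "10:30 PM", ","]:
    print(repr(text), bool(DATETIME_RE.match(text)))
```

One thing to keep in mind: since both groups are optional, the empty string also matches, so you may want to reject that case separately.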
You could use alternation:
((Date)((?: *,?)(Time))?|((Date)(?: *,?))?(Time))
Though it would lead to some duplication.
I think I found a solution which works. I use a lookahead at the beginning to verify that it doesn't start with a separator... and then make the separator an optional part of the time section.
/^(?=[A-Za-z0-9])(Date)?(?:(?:Separator)?(Time))?$/
As an example, to validate datetimes with the date in the format YYYY-MM-DD and times in the format HH:ii and Separator , *| + (optional comma and one or more spaces), it would be
/^(?=[0-9])(?:([0-9]{4})-([0-9]{2})-([0-9]{2}))?(?:(?:, *| +)?([0-9]{2}:[0-9]{2}))?$/
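For anyone who wants to check the concrete pattern, here is a small Python sketch that runs it against the three valid shapes plus the invalid separator-only input (the name DATETIME_RE is mine):

```python
import re

# Lookahead rejects input starting with a separator (and the empty string);
# the separator itself is an optional part of the time section.
DATETIME_RE = re.compile(
    r"^(?=[0-9])(?:([0-9]{4})-([0-9]{2})-([0-9]{2}))?"
    r"(?:(?:, *| +)?([0-9]{2}:[0-9]{2}))?$"
)

for text in ["2004-01-25, 10:30", "2004-01-25", "10:30", ",", ""]:
    m = DATETIME_RE.match(text)
    print(repr(text), m.groups() if m else None)
```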
I have the following pattern which I'm trying to use to match credit card expiration dates:
(0[1-9]|1[0-2])\/?(([0-9]{4})|[0-9]{2}$)
and I'm testing on the following strings:
02/13
0213
022013
02/2013
02/203
02/2
02/20322
It should only match the first four strings, and the last 3 should not be a match as they are invalid. However the current pattern is also matching the last string. What am I doing wrong?
You're missing the start-of-line anchor ^, and the parentheses are misplaced: the $ only applies to the [0-9]{2} alternative.
This should work:
re = /^(0[1-9]|1[0-2])\/?([0-9]{4}|[0-9]{2})$/;
OR using word boundaries:
re = /\b(0[1-9]|1[0-2])\/?([0-9]{4}|[0-9]{2})\b/;
Working Demo: http://regex101.com/r/gN5wH2
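Here is the anchored pattern run against the seven test strings from the question, as a Python sketch (EXPIRY_RE and samples are names chosen for the example):

```python
import re

# Anchored version: the year group is either four or two digits, nothing more.
EXPIRY_RE = re.compile(r"^(0[1-9]|1[0-2])/?([0-9]{4}|[0-9]{2})$")

samples = ["02/13", "0213", "022013", "02/2013", "02/203", "02/2", "02/20322"]
for s in samples:
    print(s, "match" if EXPIRY_RE.match(s) else "no match")
```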
Since we're talking about a credit card expiration date, once you have validated the input date string using one of the fine regex expressions in the other answers, you'll certainly want to confirm that the date is not in the past.
To do so:
Express your input date string as YYYYMM. For example: 201409
Do the same for the current date. For example: 201312
Then simply compare the date strings lexicographically: For example: 201409 ge 201312.
In Perl, ge is the greater than or equal to string comparison operator. Note that as @Dan Cowell advised, credit cards typically expire on the last day of the expiry month, so it would be inappropriate to use the gt (greater than) operator.
Alternatively, if your language doesn't support comparing strings in this fashion, convert both strings to integers and instead do an arithmetic comparison.
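The steps above could look like this in Python. This is a sketch, not a drop-in validator: the function name is_unexpired is made up, and it assumes two-digit years belong to the 2000s.

```python
import re
from datetime import date

EXPIRY_RE = re.compile(r"^(0[1-9]|1[0-2])/?([0-9]{4}|[0-9]{2})$")


def is_unexpired(expiry: str, today: date) -> bool:
    """Compare expiry and current month as YYYYMM strings (hypothetical helper)."""
    m = EXPIRY_RE.match(expiry)
    if m is None:
        raise ValueError(f"not a valid expiry date: {expiry!r}")
    month, year = m.group(1), m.group(2)
    if len(year) == 2:
        year = "20" + year  # assumption: two-digit years are 20YY
    expiry_key = year + month
    today_key = f"{today.year:04d}{today.month:02d}"
    # >= rather than >: a card is still valid during its expiry month.
    return expiry_key >= today_key


print(is_unexpired("09/14", date(2013, 12, 1)))
print(is_unexpired("1113", date(2013, 12, 1)))
```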
Move a right parenthesis:
^(0[1-9]|1[0-2])\/?(([0-9]{4}|[0-9]{2})$)
The end anchor wasn't being applied to the [0-9]{4} option, so more numbers were allowed.
Given a value, I want to validate it to check if it is a valid year. My criterion is simple: the value should be an integer with 4 digits. I know this is not the best solution, as it will not allow years before 1000 and will allow years such as 5000. This criterion is adequate for my current scenario.
What I came up with is
\d{4}$
While this works it also allows negative values.
How do I ensure that only positive integers are allowed?
Years from 1000 to 2999
^[12][0-9]{3}$
For 1900-2099
^(19|20)\d{2}$
You need to add a start anchor ^ as:
^\d{4}$
Your regex \d{4}$ will match strings that end with 4 digits. So input like -1234 will be accepted.
By adding the start anchor you match only those strings that begin and end with 4 digits, which effectively means they must contain only 4 digits.
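A short Python illustration of the difference the start anchor makes (the helper name matches is only for this example):

```python
import re


def matches(pattern: str, value: str) -> bool:
    """True if the pattern is found anywhere in value."""
    return re.search(pattern, value) is not None


for pattern in (r"\d{4}$", r"^\d{4}$"):
    for value in ("2020", "-1234", "12345"):
        print(pattern, repr(value), matches(pattern, value))
```

Without ^ the pattern only checks that the string ends in four digits, so -1234 and 12345 both slip through.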
The "accepted" answer to this question is both incorrect and myopic.
It is incorrect in that it will match strings like 0001, which is not a valid year.
It is myopic in that it will not match any values above 9999. Have we already forgotten the lessons of Y2K? Instead, use the regular expression:
^[1-9]\d{3,}$
If you need to match years in the past, in addition to years in the future, you could use this regular expression to match any positive integer:
^[1-9]\d*$
Even if you don't expect dates from the past, you may want to use this regular expression anyway, just in case someone invents a time machine and wants to take your software back with them.
Note: This regular expression will match all years, including those before the year 1, since they are typically represented with a BC designation instead of a negative integer. Of course, this convention could change over the next few millennia, so your best option is to match any integer—positive or negative—with the following regular expression:
^-?[1-9]\d*$
This works for 1900 to 2099:
/(?:(?:19|20)[0-9]{2})/
Building on @r92's answer, for years 1970-2019:
(19[789]\d|20[01]\d)
To test a year in a string which contains other words along with the year you can use the following regex: \b\d{4}\b
In theory the 4 digit option is right. But in practice it might be better to have 1900-2099 range.
Additionally, it needs to be a non-capturing group. Many comments and answers propose a capturing group, which is not proper IMHO: it might work for matching, but when extracting matches it will also return the two-digit numbers (19 and 20) captured by the parentheses, in addition to the 4-digit numbers.
This will work for exact matching using non-capturing groups:
(?:19|20)\d{2}
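The extraction difference shows up clearly with Python's re.findall, which returns the group contents whenever the pattern contains a capturing group. A small sketch with a made-up sample sentence:

```python
import re

text = "Released in 1999, rewritten in 2020."

# Capturing group: findall returns only what the group captured.
capturing = re.findall(r"(19|20)\d{2}", text)

# Non-capturing group: findall returns the whole match.
non_capturing = re.findall(r"(?:19|20)\d{2}", text)

print(capturing)
print(non_capturing)
```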
Use:
^(19|[2-9][0-9])\d{2}$
for years 1900 - 9999.
No need to worry for 9999 and onwards - A.I. will be doing all programming by then !!! Hehehehe
You can test your regex at https://regex101.com/
Also, more info about non-capturing groups (mentioned in one of the comments above) here: http://www.manifold.net/doc/radian/why_do_non-capture_groups_exist_.htm
You can go with something like [^-]\d{4}$, which prevents a minus sign - from appearing directly before your 4 digits. Note that [^-] has to consume a character, so a bare 2020 will not match and 12345 will.
You can also use ^\d{4}$, with ^ anchoring the beginning of the string. It depends on your scenario.
/^\d{4}$/
This will check if a string consists of exactly 4 digits. In this scenario, to input the year 989, you can give 0989 instead.
You could convert your integer into a string. As the minus sign will not match the digits, you will have no negative years.
I use this regex in Java ^(0[1-9]|1[012])[/](0[1-9]|[12][0-9]|3[01])[/](19|[2-9][0-9])[0-9]{2}$
Works from 1900 to 9999
If you need to match YYYY or YYYYMMDD you can use:
^((?:(?:(?:(?:(?:[1-9]\d)(?:0[48]|[2468][048]|[13579][26])|(?:(?:[2468][048]|[13579][26])00))(?:0?2(?:29)))|(?:(?:[1-9]\d{3})(?:(?:(?:0?[13578]|1[02])(?:31))|(?:(?:0?[13-9]|1[0-2])(?:29|30))|(?:(?:0?[1-9])|(?:1[0-2]))(?:0?[1-9]|1\d|2[0-8])))))|(?:19|20)\d{2})$
You can also use this one for DD/MM/YYYY with years 1970-2019:
(0[1-9]|[12][0-9]|3[01])\/(0[1-9]|1[0-2])\/(19[789]\d|20[01]\d)
In my case I wanted to match a string which ends with a year (4 digits) like this for example:
Oct 2020
Nov 2020
Dec 2020
Jan 2021
It'll return true with this one:
var sheetName = 'Jan 2021';
// Backslashes must be doubled inside a string passed to RegExp.
var yearRegex = new RegExp("\\b\\d{4}$");
var isMonthSheet = yearRegex.test(sheetName);
Logger.log('isMonthSheet = ' + isMonthSheet);
The code above is used in Apps Script.
Here's the link to test the Regex above: https://regex101.com/r/SzYQLN/1
You can try the following to capture valid year from a string:
.*(19\d{2}|20\d{2}).*
Works from 1950 to 2099, where the value is an integer with 4 digits:
^(?:19[5-9]\d|20\d{2})$