RegEx: date/time (sub)string - regex

I am trying to validate, with a RegEx, a user input to a date/time field in the following format: yyyy-mm-dd HH:mm. Since a user would type this one character at a time, the RegEx also needs to allow for all the substrings (e.g., yyyy-mm, yyyy-mm-dd H, etc.).
For example, the following date 2022-01-29 11:59 or any substring of it, such as (empty string), 202, 2022-01-, 2022-01-29, etc. should match.
Strings like 2022-13, 222- or 2024-01-01 23:111 should not match.
I am trying something like,
[0-9]{4}?-?(0[1-9]|1[0-2])-?(0[1-9]|[1-2][0-9]|3[0-1])? ?(2[0-3]|[01][0-9])?:?[0-5]?[0-9]?$
making every character optional, but that does not work because the regex can skip whole parts of the string; for example, 2022-29 returns a match.

It becomes a bit convoluted if you want to match number ranges properly as you type. Here is a regex solution with test cases that has many nested groups (to allow partial input) and validates the ranges of month, day, hours, and minutes:
const regex = /^([0-9]{0,3}|[0-9]{4}(-([01]|(0[1-9]|1[0-2])(-([0-3]|(0[1-9]|[12][0-9]|3[01])( ([0-2]|([0-1][0-9]|2[0-3])(:([0-5]|([0-4][0-9]|5[0-9]))?)?)?)?)?)?)?)?)$/;
[
'',
'2',
'20',
'202',
'2022',
'2022-',
'2022-0',
'2022-01',
'2022-01-',
'2022-01-2',
'2022-01-29',
'2022-01-29 ',
'2022-01-29 1',
'2022-01-29 11',
'2022-01-29 11:',
'2022-01-29 11:5',
'2022-01-29 11:59',
'202-01',
'2022-1-1',
'2022-13',
'2022-01-32',
'2024-01-01 24',
'2024-01-01 01:60',
'2024-01-01 23:111'
].forEach(str => console.log(str, '=>', regex.test(str)));
Output:
=> true
2 => true
20 => true
202 => true
2022 => true
2022- => true
2022-0 => true
2022-01 => true
2022-01- => true
2022-01-2 => true
2022-01-29 => true
2022-01-29 => true
2022-01-29 1 => true
2022-01-29 11 => true
2022-01-29 11: => true
2022-01-29 11:5 => true
2022-01-29 11:59 => true
202-01 => false
2022-1-1 => false
2022-13 => false
2022-01-32 => false
2024-01-01 24 => false
2024-01-01 01:60 => false
2024-01-01 23:111 => false
Hierarchy of regex:
^(                                          -- start of string & group
  [0-9]{0,3}                                -- 0 to 3 digits for partial year
  |                                         -- logical or
  [0-9]{4}                                  --
  (                                         --
    -                                       -- dash
    (                                       --
      [01]                                  -- one digit of month
      |                                     -- logical or
      (0[1-9]|1[0-2])                       -- two digits of month
      (                                     --
        -                                   -- dash
        (                                   --
          [0-3]                             -- one digit of day
          |                                 -- logical or
          (0[1-9]|[12][0-9]|3[01])          -- two digits of day
          (                                 --
                                            -- space
            (                               --
              [0-2]                         -- one digit of hours
              |                             -- logical or
              ([0-1][0-9]|2[0-3])           -- two digits of hours
              (                             --
                :                           -- colon
                (                           --
                  [0-5]                     -- one digit of minutes
                  |                         -- logical or
                  ([0-4][0-9]|5[0-9])       -- two digits of minutes
                )?                          --
              )?                            --
            )?                              --
          )?                                --
        )?                                  --
      )?                                    --
    )?                                      --
  )?                                        --
)$                                          -- end of string and group
Learn more about regex: https://twiki.org/cgi-bin/view/Codev/TWikiPresentation2018x10x14Regex
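Because the nesting above follows one repeating shape (separator, then either a partial value or a full value followed by the next level), the pattern can also be generated instead of written by hand. The following is only a sketch of that idea: buildPartialRegex and the field descriptions are hypothetical names introduced here, not part of the answer above, and the groups are non-capturing.

```javascript
// Hypothetical helper: builds a "valid so far" regex from field descriptions.
// Each field has a `partial` pattern (an incomplete but still plausible value)
// and a `full` pattern (a complete value); later fields also have a separator.
function buildPartialRegex(first, rest) {
  let tail = '';
  // Build from the innermost (last) field outwards.
  for (let i = rest.length - 1; i >= 0; i--) {
    const { sep, partial, full } = rest[i];
    tail = `(?:${sep}(?:${partial}|(?:${full})${tail})?)?`;
  }
  return new RegExp(`^(?:${first.partial}|(?:${first.full})${tail})$`);
}

const dateTimeSoFar = buildPartialRegex(
  { partial: '[0-9]{0,3}', full: '[0-9]{4}' },                  // year
  [
    { sep: '-', partial: '[01]', full: '0[1-9]|1[0-2]' },          // month
    { sep: '-', partial: '[0-3]', full: '0[1-9]|[12][0-9]|3[01]' }, // day
    { sep: ' ', partial: '[0-2]', full: '[01][0-9]|2[0-3]' },      // hours
    { sep: ':', partial: '[0-5]', full: '[0-5][0-9]' },            // minutes
  ]
);

console.log(dateTimeSoFar.test('2022-01-2')); // true
console.log(dateTimeSoFar.test('2022-29'));   // false
```

Keeping the fields in a list makes it easier to see which part of the pattern rejects a given input, at the cost of one extra helper.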

Related

How to write a regular expression that contains the day and month?

I need to write a regular expression for my .htaccess file that matches the following:
gallery/a-day-like-today
gallery/a-day-like-today/
gallery/a-day-like-today/2-18
gallery/a-day-like-today/2-18/
Basically, the numbers (2-18) represent the month and the day. Each can also be a single digit, as in 9-8 (September the 8th).
If there is a date, then both the month and the day are mandatory.
I started on something but I'm stuck. Any ideas how to achieve it?
^gallery/a-day-like-today(?:/(\d{1,12})-(\d{1,31})?)?$
My solution doesn't work because it picks days that are out of the range {1,31}
My .htaccess looks like this:
RewriteRule ^gallery/a-day-like-today(?:/(\d{1,12})-(\d{1,31})?)?$ gallery/a_day_like_today.php?month=$1&day=$2 [L]
https://regex101.com/r/zkBQSJ/1
From (\d{1,12}) I can tell that you are trying to do a range check for months 1 to 12. Your regex actually means a number that is 1 to 12 digits long. You can do a range check in a regex, but it's a bit convoluted, so you might prefer to extract the numbers and do the range check on the extracted values.
Here is a regex solution for your range check for month and day:
const regex = /^gallery\/a-day-like-today(?:\/?|\/([1-9]|1[0-2])-([1-9]|[12][0-9]|3[01])\/?)$/;
[
'gallery/a-day-like-today',
'gallery/a-day-like-today/',
'gallery/a-day-like-today/2-18',
'gallery/a-day-like-today/2-18/',
'gallery/a-day-like-today/12-1/',
'gallery/a-day-like-today/13-1/',
'gallery/a-day-like-today/1-31/',
'gallery/a-day-like-today/1-32/'
].forEach(str => {
let m = str.match(regex);
console.log(str, '==>', m);
});
Output:
gallery/a-day-like-today ==> [
"gallery/a-day-like-today",
undefined,
undefined
]
gallery/a-day-like-today/ ==> [
"gallery/a-day-like-today/",
undefined,
undefined
]
gallery/a-day-like-today/2-18 ==> [
"gallery/a-day-like-today/2-18",
"2",
"18"
]
gallery/a-day-like-today/2-18/ ==> [
"gallery/a-day-like-today/2-18/",
"2",
"18"
]
gallery/a-day-like-today/12-1/ ==> [
"gallery/a-day-like-today/12-1/",
"12",
"1"
]
gallery/a-day-like-today/13-1/ ==> null
gallery/a-day-like-today/1-31/ ==> [
"gallery/a-day-like-today/1-31/",
"1",
"31"
]
gallery/a-day-like-today/1-32/ ==> null
Explanation of regex:
^gallery\/a-day-like-today        -- literal text at beginning of string
(?:                               -- non-capture group start
  \/?                             -- optional slash
  |                               -- logical OR
  \/                              -- literal slash
  ([1-9]|1[0-2])                  -- capture group 1 for a single digit 1..9, or two digits 10...12
  -                               -- literal dash
  ([1-9]|[12][0-9]|3[01])         -- capture group 2 for a single digit 1..9, or two digits 10...29, or 30...31
  \/?                             -- optional slash
)                                 -- non-capture group end
$                                 -- end of string
Learn more about regex: https://twiki.org/cgi-bin/view/Codev/TWikiPresentation2018x10x14Regex
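The alternative mentioned at the start of this answer, extracting the numbers and checking the ranges in code, might look like the sketch below. parseGalleryPath and DAYS_IN_MONTH are hypothetical names introduced here. Two assumptions go beyond the regex above: leading zeros such as 02-08 are accepted, and the day is checked against the month's length (February is allowed 29 days because no year is known).

```javascript
// Longest possible length of each month (February counted as 29 days).
const DAYS_IN_MONTH = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];

// Returns { month, day } (both null when the path has no date part),
// or null when the path does not match or the date is out of range.
function parseGalleryPath(path) {
  const m = /^gallery\/a-day-like-today(?:\/(\d{1,2})-(\d{1,2}))?\/?$/.exec(path);
  if (!m) return null;
  if (m[1] === undefined) return { month: null, day: null };

  const month = Number(m[1]);
  const day = Number(m[2]);
  if (month < 1 || month > 12) return null;
  if (day < 1 || day > DAYS_IN_MONTH[month - 1]) return null;
  return { month, day };
}

console.log(parseGalleryPath('gallery/a-day-like-today/2-18/')); // { month: 2, day: 18 }
console.log(parseGalleryPath('gallery/a-day-like-today/13-1/')); // null
console.log(parseGalleryPath('gallery/a-day-like-today/2-30'));  // null
```

The regex only has to describe the shape of the URL here, so the numeric rules stay readable and can be changed without touching the pattern.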
This doesn't check if the numbers make sense as a month day, but it's simple so maybe it will work for your case.
gallery/a-day-like-today/?(\d{1,}-\d{1,})?/?
BTW. Why not just check for gallery/a-day-like-today and not worry about what goes after?
This regex matches all your conditions plus I added more test cases
^gallery/a-day-like-today/{0,1}\d{0,2}-{0,1}\d{0,2}/{0,1}$
https://regex101.com/r/XqcYf6/1
If you want to return the month and day (when present), you can use the regex
^gallery/a-day-like-today/{0,1}(\d{0,2})-{0,1}(\d{0,2})/{0,1}$
https://regex101.com/r/8WYO0t/1
Test cases
gallery/a-day-like-today
gallery/a-day-like-today/
gallery/a-day-like-today/1-1
gallery/a-day-like-today/1-1/
gallery/a-day-like-today/1-31
gallery/a-day-like-today/1-31/
gallery/a-day-like-today/12-1
gallery/a-day-like-today/12-1/
gallery/a-day-like-today/12-31
gallery/a-day-like-today/12-31/
NOTE: This does not check whether the month-day pair are valid, only matches the pattern. For example, /99-99/ would pass but not be a valid month-day pair. You'll have to tell me if this is OK or not.

pattern match in ruby for time

I built a pattern to match the following time formats:
4:05am or 11:03pm
It works on Rubular, but irb does not find a match. How can I fix it?
irb(main):568:0> "4:00am" =~ /[\d+:\d{2}(a|p)m]/
=> 0
As noted in comments, you have put your regex in brackets which indicate a character class. Easy enough to fix.
irb(main):001:0> "4:00am" =~ /\d+:\d{2}(a|p)m/
=> 0
But this would also work for:
irb(main):002:0> "4:70am" =~ /\d+:\d{2}(a|p)m/
=> 0
Instead you may want to specify that the : can be followed by the digits 0-5 and then by any digit.
irb(main):005:0> "4:50am" =~ /\d+:[0-5]\d(a|p)m/
=> 0
irb(main):006:0> "4:70am" =~ /\d+:[0-5]\d(a|p)m/
=> nil
But this can still detect a crazy time like "14:40am". Given that it picks up on AM/PM, we can assume a 12 hour clock and modify your regular expression still further:
irb(main):009:0> "14:40am" =~ /^(\d|1[0-2]):[0-5]\d(a|p)m$/
=> nil
irb(main):010:0> "12:40am" =~ /^(\d|1[0-2]):[0-5]\d(a|p)m$/
=> 0
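One gap remains in the last pattern: \d also accepts an hour of 0, so "0:40am" still matches. A possible tightening is sketched below in JavaScript for quick testing; the name time12h is made up for this example. The same pattern should carry over to Ruby, where \A and \z are the stricter anchors because ^ and $ match at line boundaries there.

```javascript
// 12-hour clock: hours 1-12, minutes 00-59, followed by am or pm.
const time12h = /^(?:[1-9]|1[0-2]):[0-5]\d[ap]m$/;

for (const s of ['4:05am', '11:03pm', '12:40am', '0:40am', '14:40am', '4:70am']) {
  console.log(s, '=>', time12h.test(s));
}
```

The first three inputs should report true and the last three false.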
If you want a complete regex for matching time values, we can use the one from the Regexp::Common::time Perl module. You can get this by installing the module and printing $RE{time}{hms}.
(?:(?<!\d)(?:(?:(?=\d)(?:[01]\d|2[0123]|(?<!\d)\d)))[:.\x20](?:(?:[0-5]\d))(?:[:.\x20](?:\d\d))?(?:\s?(?:(?:(?=[AaPp])(?:[ap](?:m|\.m\.)?|[AP](?:M|\.M\.)?))))?)
The hour must be in the range 0 to 23 (the pattern above accepts [01]\d, 2[0123], or a single digit). The minute and second values must be in the range 0 to 59, and must be two digits (i.e., they must have leading zeroes if less than 10).
The hour, minute, and second components may be separated by colons (:), periods, or spaces.
The "seconds" value may be omitted.
The time may be followed by an "am/pm" indicator; that is, one of the following values:
a am a.m. p pm p.m. A AM A.M. P PM P.M.
There may be a space between the time and the am/pm indicator.
3.1.2 :018 > re = %r{(?:(?<!\d)(?:(?:(?=\d)(?:[01]\d|2[0123]|(?<!\d)\d)))[:.\x20](?:(?:[0-5]\d))(?:[:.\x20](?:\d\d))?(?:\s?(?:(?:(?=[AaPp])(?:[ap](?:m|\.m\.)?|[AP](?:M|\.M\.)?))))?)}
=> /(?:(?<!\d)(?:(?:(?=\d)(?:[01]\d|2[0123]|(?<!\d)\d)))[:.\x20](?:(?:[0-5]\d))(?:[:.\x20](?:...
3.1.2 :019 > re.match('12:59 a.m.')
=> #<MatchData "12:59 a.m.">
3.1.2 :020 > re.match('12:59 a.m. and stuff')
=> #<MatchData "12:59 a.m.">
3.1.2 :021 > re.match('12:59 z.m.')
=> #<MatchData "12:59">
3.1.2 :022 > re.match('13:59 a.m.')
=> #<MatchData "13:59 a.m.">
3.1.2 :023 > re.match('99:59 a.m.')
=> nil
3.1.2 :024 > re.match('11:60 a.m.')
=> nil
3.1.2 :026 > re.match('4:12')
=> #<MatchData "4:12">
3.1.2 :027 > re.match('4:1')
=> nil

Preg_match / split barcode

I am struggling with reading a GS1-128 barcode, and trying to split it up into the segments it contains, so I can fill out a form automatically.
But I can't figure it out. Scanning my barcode gives me the following:
]d2010704626096200210KT0BT2204[GS]1726090021RNM5F8CTMMBHZSY7
So I tried starting with preg_match and made the following:
/]d2[01]{2}\d{14}[10|17|21]{2}(\w+)/
Which gives me this result:
Array ( [0] => ]d2010704626096200210KT0BT2204 [1] => KT0BT2204 )
Now [1] is actually correct, but [0] isn't, so I have run into a wall.
In the end, this is the result I would like (without 01,10,17,21):
(01) 07046260962002
(10) KT0BT2204
(17) 260900
(21) RNM5F8CTMMBHZSY7
01 - Always 14 chars after
17 - Always 6 chars after
10 can be up to 20 chars, but always has end delimiter <GS> - But if barcode ends with 10 <GS> is not present
21 can be up to 20 chars, but always has end delimiter <GS> - But if barcode ends with 21 <GS> is not present
I tried following this question: GS1-128 and RegEx
But I couldn't figure it out.
Can anyone help me?
This regex should do what you want (note I've split it into separate lines for clarity, you can use it like this with the x (extended) flag, or convert it back to one line):
^]d2(?:
01(?P<g01>.{14})|
10(?P<g10>(?:(?!\[GS]).){1,20})(?:\[GS]|$)|
17(?P<g17>.{6})|
21(?P<g21>(?:(?!\[GS]).){1,20})(?:\[GS]|$)
)+$
It looks for
start-of-line ^ followed by a literal ]d2 then one or more of
01 followed by 14 characters (captured in group g01)
10 followed by up to 20 characters, terminated by either [GS] or end-of-line (captured in group g10)
17 followed by 6 characters (captured in group g17)
21 followed by up to 20 characters, terminated by either [GS] or end-of-line (captured in group g21)
finishing with end-of-line $
Note that we need to use tempered greedy tokens to avoid the situation where a 10 or 21 code might swallow a following code (as in the second example in the regex demo below).
Demo on regex101
In PHP:
$barcode = ']d201070462608682672140097289158930[GS]10101656[GS]17261130';
preg_match_all('/^]d2(?:
01(?P<g01>.{14})|
10(?P<g10>(?:(?!\[GS]).){1,20})(?:\[GS]|$)|
17(?P<g17>.{6})|
21(?P<g21>(?:(?!\[GS]).){1,20})(?:\[GS]|$)
)+$/x', $barcode, $matches);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => ]d201070462608682672140097289158930[GS]10101656[GS]17261130
)
[g01] => Array
(
[0] => 07046260868267
)
[1] => Array
(
[0] => 07046260868267
)
[g10] => Array
(
[0] => 101656
)
[2] => Array
(
[0] => 101656
)
[g17] => Array
(
[0] => 261130
)
[3] => Array
(
[0] => 261130
)
[g21] => Array
(
[0] => 40097289158930
)
[4] => Array
(
[0] => 40097289158930
)
)
Demo on 3v4l.org
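The same rules (fixed length for 01 and 17, up to 20 characters ended by [GS] or end of input for 10 and 21) can also be applied by walking through the string one element at a time instead of using a single regex. The sketch below is written in JavaScript purely as an illustration; parseGs1, AI_RULES and GS are hypothetical names, and the literal text [GS] is assumed to be the separator exactly as it appears in the scans above.

```javascript
const GS = '[GS]';
// Rules taken from the question: fixed-length or variable-length elements.
const AI_RULES = {
  '01': { fixed: 14 },
  '17': { fixed: 6 },
  '10': { max: 20 },
  '21': { max: 20 },
};

// Returns an object keyed by application identifier, or null if parsing fails.
function parseGs1(barcode) {
  if (!barcode.startsWith(']d2')) return null;
  const result = {};
  let pos = 3; // skip the "]d2" symbology prefix
  while (pos < barcode.length) {
    const ai = barcode.slice(pos, pos + 2);
    const rule = AI_RULES[ai];
    if (!rule) return null; // unknown identifier
    pos += 2;
    let value;
    if (rule.fixed !== undefined) {
      value = barcode.slice(pos, pos + rule.fixed);
      if (value.length !== rule.fixed) return null;
      pos += rule.fixed;
    } else {
      // Variable length: read up to the next separator or the end of input.
      const end = barcode.indexOf(GS, pos);
      value = end === -1 ? barcode.slice(pos) : barcode.slice(pos, end);
      if (value.length < 1 || value.length > rule.max) return null;
      pos += value.length + (end === -1 ? 0 : GS.length);
    }
    result[ai] = value;
  }
  return result;
}

console.log(parseGs1(']d2010704626096200210KT0BT2204[GS]1726090021RNM5F8CTMMBHZSY7'));
```

For the barcode from the question this should yield 07046260962002 for 01, KT0BT2204 for 10, 260900 for 17 and RNM5F8CTMMBHZSY7 for 21. Supporting another identifier then means adding one entry to the rules table.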
]d2[01]{2}(\d{14})(?:10|17|21)(\w+)\[GS\](\w+)(?:10|17|21)(\w+)
You can try something like this.
See demo..
https://regex101.com/r/Bw238X/1

Filebeat regex - whitespace before digits

I have the following config in my filebeat.yml :
- type: log
  close_renamed: true
  paths:
    - /logs/example.log
  multiline:
    pattern: '^[A-Za-z]{3} [A-Za-z]{3} [0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{4}'
    negate: true
    match: after
    timeout: 3s
  fields_under_root: true
  fields:
    type: oracle
    sourcetype: oracle
  tags: ["oracle"]
Example.log (truncated) :
...
Thu Oct 1 23:01:00 2020 +00:00
LENGTH : '275'
ACTION :[7] 'CONNECT'
DATABASE USER:[1] '/'
PRIVILEGE :[6] 'SYSDBA'
CLIENT USER:[9] 'test_user'
CLIENT TERMINAL:[5] 'pts/0'
STATUS:[1] '0'
DBID:[10] '1762369616'
SESSIONID:[10] '4294967295'
USERHOST:[21] 'testdevserver'
CLIENT ADDRESS:[0] ''
ACTION NUMBER:[3] '100'
Thu Oct 1 23:01:00 2020 +00:00
LENGTH : '296'
ACTION :[29] 'SELECT STATUS FROM V$INSTANCE'
DATABASE USER:[1] '/'
PRIVILEGE :[6] 'SYSDBA'
CLIENT USER:[9] 'test_user'
CLIENT TERMINAL:[5] 'pts/0'
STATUS:[1] '0'
DBID:[10] '1762369616'
SESSIONID:[10] '4294967295'
USERHOST:[21] 'testdevserver'
CLIENT ADDRESS:[0] ''
ACTION NUMBER:[1] '3'
I noticed that this pattern '^[A-Za-z]{3} [A-Za-z]{3} [0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{4}' doesn't work for the example above, Thu Oct 1 23:01:00 2020 +00:00, because there is whitespace after Oct and before 1.
How do I remove this whitespace so the pattern would match accordingly?
Thanks!
J
If there can be multiple spaces instead of a single space, you can use + after the space to match 1 or more.
On its own that would still not give the desired match, because the day is a single digit, 1. You can update the day part to match 1 or 2 digits using [0-9]{1,2}:
^[A-Za-z]{3} [A-Za-z]{3} +[0-9]{1,2} [0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{4}
Regex demo
You might for example make the pattern a bit more precise for the time and year part. You could extend it to also make the days and months an exact match.
^[A-Za-z]{3} +[A-Za-z]{3} +(?:[1-9]|[12]\d|3[01]) +(?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d +(?:19|20)\d{2}\b
Regex demo
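A quick way to compare the three patterns is to run them against sample lines. The check below uses JavaScript's regex engine rather than the one in Filebeat, so it is only an approximation; the patterns use constructs that, as far as I know, both engines support. The sample line with two spaces before the day is an assumption about how a padded single-digit day appears in the real log, and the variable names are made up for this example.

```javascript
const original = /^[A-Za-z]{3} [A-Za-z]{3} [0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{4}/;
const relaxed  = /^[A-Za-z]{3} [A-Za-z]{3} +[0-9]{1,2} [0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{4}/;
const strict   = /^[A-Za-z]{3} +[A-Za-z]{3} +(?:[1-9]|[12]\d|3[01]) +(?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d +(?:19|20)\d{2}\b/;

const samples = [
  'Thu Oct 1 23:01:00 2020 +00:00',   // single space before the day
  'Thu Oct  1 23:01:00 2020 +00:00',  // assumed padded form with two spaces
  'Thu Oct 12 23:01:00 2020 +00:00',  // two-digit day
  "LENGTH : '275'",                   // continuation line, must not match
];

for (const line of samples) {
  console.log(JSON.stringify(line), original.test(line), relaxed.test(line), strict.test(line));
}
```

The original pattern should only accept the two-digit day, while the relaxed and strict patterns should accept all three timestamp lines and reject the continuation line.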

HTML5 Form Input Pattern Currency Format

Using HTML5 I have an input field that should validate against a dollar amount entered. Currently I have the following markup:
<input type="number" pattern="(\d{3})([\.])(\d{2})">
This works great for an amount that is greater than 100.00 and less than 1,000.00. I am trying to write the pattern (regex) to accept different dollar amounts. Maybe upwards of 100,000.00. Is this possible?
The best we could come up with is this:
^\\$?(([1-9](\\d*|\\d{0,2}(,\\d{3})*))|0)(\\.\\d{1,2})?$
I realize it might seem too much, but as far as I can test it matches anything that a human eye would accept as valid currency value and weeds out everything else.
It matches these:
1 => true
1.00 => true
$1 => true
$1000 => true
0.1 => true
1,000.00 => true
$1,000,000 => true
5678 => true
And weeds out these:
1.001 => false
02.0 => false
22,42 => false
001 => false
192.168.1.2 => false
, => false
.55 => false
2000,000 => false
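The doubled backslashes in the pattern above are what it looks like when written inside a quoted string in code; in a regex literal or directly in an HTML pattern attribute each \\ becomes a single \. Also note that, per the HTML specification, the pattern attribute is honored on text-like inputs such as type="text" but not on type="number". The following sketch re-runs the two lists above in JavaScript; the variable names are made up for this example.

```javascript
// Same pattern as above, written as a regex literal (single backslashes).
const currency = /^\$?(([1-9](\d*|\d{0,2}(,\d{3})*))|0)(\.\d{1,2})?$/;

// Equivalent construction from a string, where each backslash is doubled.
const currencyFromString = new RegExp('^\\$?(([1-9](\\d*|\\d{0,2}(,\\d{3})*))|0)(\\.\\d{1,2})?$');

const accepted = ['1', '1.00', '$1', '$1000', '0.1', '1,000.00', '$1,000,000', '5678'];
const rejected = ['1.001', '02.0', '22,42', '001', '192.168.1.2', ',', '.55', '2000,000'];

for (const s of [...accepted, ...rejected]) {
  console.log(s, '=>', currency.test(s), currencyFromString.test(s));
}
```

Both forms should agree on every input, which is a quick check that the escaping was converted correctly.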
If you want to allow a comma delimiter which will pass the following test cases:
0,00 => true
0.00 => true
01,00 => true
01.00 => true
0.000 => false
0-01 => false
then use this:
^\d+(\.|\,)\d{2}$
How about :
^\d+\.\d{2}$
This matches one or more digits, a dot and 2 digits after the dot.
To match also comma as thousands delimiter :
^\d+(?:,\d{3})*\.\d{2}$
Another answer for this would be
^((\d+)|(\d{1,3})(\,\d{3}|)*)(\.\d{2}|)$
This will match a string of:
one or more digits without thousands separators (\d+)
up to 3 digits followed by any number of commas, each of which must be followed by 3 digits (\d{1,3})(\,\d{3}|)*
Either of which can have a decimal point, which must be followed by 2 digits (\.\d{2}|)
I like to give the users a bit of flexibility and trust, that they will get the format right, but I do want to enforce only digits and two decimals for currency
^[$\-\s]*[\d\,]*?([\.]\d{0,2})?\s*$
Takes care of:
$ 1.
-$ 1.00
$ -1.0
.1
.10
-$ 1,000,000.0
Of course it will also match:
$$--$1,92,9,29.1 => anyway after cleanup => -192,929.10
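The cleanup step mentioned in the last line is not shown in this answer. One possible version is sketched below; cleanCurrency is a hypothetical helper, and it assumes that any minus sign anywhere in the input means a negative amount and that extra fraction digits are simply cut off.

```javascript
// Normalizes loosely formatted input such as "$$--$1,92,9,29.1" to "-192,929.10".
function cleanCurrency(input) {
  const negative = input.includes('-');
  // Keep only digits and the decimal point, then split into whole and fraction.
  const [intPart = '', fracPart = ''] = input.replace(/[^\d.]/g, '').split('.');
  const whole = (intPart.replace(/^0+(?=\d)/, '') || '0')
    .replace(/\B(?=(\d{3})+(?!\d))/g, ','); // regroup thousands
  const cents = fracPart.padEnd(2, '0').slice(0, 2);
  return `${negative ? '-' : ''}${whole}.${cents}`;
}

console.log(cleanCurrency('$$--$1,92,9,29.1')); // -192,929.10
console.log(cleanCurrency('.1'));               // 0.10
console.log(cleanCurrency('$ 1.'));             // 1.00
```

Validating loosely and normalizing afterwards keeps the input pattern permissive while still storing one consistent format.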
I wrote this price pattern, which rejects a zero price.
(0\.((0[1-9]{1})|([1-9]{1}([0-9]{1})?)))|(([1-9]+[0-9]*)(\.([0-9]{1,2}))?)
Valid For:
1.00
1
1.5
0.10
0.1
0.01
2500.00
Invalid For:
0
0.00
0.0
2500.
2500.0000
Check my code online: http://regexr.com/3binj
This is my currency pattern: '[0-9]+(,[0-9]{1,2})?$'. It is used with input type text.
valid for:
1
0,01
1,5
0,10
0,1
0,01
2500,00
Use this pattern "^\d*(\.\d{2}$)?"