Regex Match for Valid Dates only - regex

I am trying to create a regex that only matches valid dates (in MM/DD or MM/DD/YY(YY) format).
My current regex, (\d+)/(\d+)/?(\d+)?, is very simple, but it matches any number that has a / before or after it. For example, in the string 2015/2016 12/25 it matches both parts, but I only want the 12/25 portion.
Here is a link to some sample RegEx.

You can add word boundaries (\b) to make sure you match the date string as a whole "word" (so that the match does not start in the middle of a number) and use limiting quantifiers to restrict how many digits each \d part matches:
\b(\d{2})/(\d{1,2})/?(\d{4}|\d{2})?\b
See the regex demo
The regex breakdown:
\b - word boundary to make sure there is a non-word character or start of string right before the digit
(\d{2}) - match and capture exactly 2 digits
/ - match a literal /
(\d{1,2}) - match and capture 1 to 2 digits
/? - match 1 or 0 /
(\d{4}|\d{2})? - optionally match and capture either 4 or 2 digits
\b - trailing word boundary
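
As a quick check, the pattern can be run against the sample string from the question. This is only a minimal Python sketch; the helper name find_dates is made up for illustration. Keep in mind that the pattern only constrains the shape of the match, so something like 99/99 would still match, and real calendar validation would need an extra step.

```python
import re

# Pattern from the answer above: two-digit month, one- or two-digit day,
# optional two- or four-digit year.
DATE_RE = re.compile(r"\b(\d{2})/(\d{1,2})/?(\d{4}|\d{2})?\b")

def find_dates(text):
    """Return the full text of every date-shaped match (hypothetical helper)."""
    return [m.group(0) for m in DATE_RE.finditer(text)]

print(find_dates("2015/2016 12/25"))   # ['12/25']
print(find_dates("due 12/25/2015"))    # ['12/25/2015']
```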

Validate string data using a regex that allows only digits and - (hyphen), of length 6, where the hyphen must not be at the end and the digits must not all be the same

I need to validate a string that may contain a - (hyphen) using a regex.
Requirement: the string contains only digits and - (hyphens), does not end with a - (hyphen), and its digits are not all the same.
^([0-9-])(?!\1+$)[0-9-]{5}$
The pattern above allows only digits and hyphens, but it does not prevent the string from ending with a hyphen, and it does not check whether all the digits are the same.
Examples:
1111-1 Not allowed because all digits are the same
1111-2 Allowed
11112- Not allowed because it ends with a - (hyphen)
-12345 Not allowed because it starts with a - (hyphen)
You might write the pattern as
^(\d)(?!(?:\1|-)+$)(?!\d*-\d*-)[\d-]{4}\d$
Explanation
^ Start of string
(\d) Capture a single digit in group 1
(?! Negative lookahead
(?:\1|-)+$ Check that to the right there is not only the group 1 value or hyphens
) Close lookahead
(?!\d*-\d*-) Negative lookahead, assert that there are not 2 hyphens to the right
[\d-]{4} Match 4 digits or hyphens
\d Match a digit
$ End of string
Regex demo
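
To see how the pattern behaves on the four examples from the question, here is a small Python sketch. The variable names are only illustrative.

```python
import re

# The first digit is captured; the rest may not consist solely of that
# digit and hyphens, at most one hyphen is allowed, and the last
# character must be a digit.
PATTERN = re.compile(r"^(\d)(?!(?:\1|-)+$)(?!\d*-\d*-)[\d-]{4}\d$")

samples = ["1111-1", "1111-2", "11112-", "-12345"]
results = {s: bool(PATTERN.match(s)) for s in samples}

for s, ok in results.items():
    print(s, "allowed" if ok else "not allowed")
```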
If there should be at least 1 hyphen:
^(\d)(?!(?:\1|-)+$)(?=\d*-)[\d-]{4}\d$
Regex demo
My 2 cents, to allow zero or one hyphen:
^(?=.{6}$)(\d)(?=.*(?!\1)\d)\d+(?:-\d+)?$
See an online demo
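
A rough Python check of this last pattern, under the assumption that a string without any hyphen (such as 123456) should also be accepted, which is how I read "zero or one hyphen":

```python
import re

# Exactly 6 characters, starts with a digit, contains at least one digit
# that differs from the first one, and has at most one hyphen that sits
# between two runs of digits.
PATTERN = re.compile(r"^(?=.{6}$)(\d)(?=.*(?!\1)\d)\d+(?:-\d+)?$")

samples = ["1111-1", "1111-2", "11112-", "-12345", "123456"]
checks = {s: bool(PATTERN.match(s)) for s in samples}
print(checks)
```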

Regex exclude whitespaces from a group to select only a number

I need to take only a number (a float number) from a text, but I can't remove the whitespaces...
Update:
I have a problem with this method: as a rule, I only need to consider the digits and ',' between '- EUR' and 'Fee'.
You can use
- EUR\W*(.*?)\W*Fee
See the regex demo.
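Variations of the regex that might work in different regex engines (the first relies on \K, which PCRE supports; the second needs a lookbehind of unbounded length, which .NET supports):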
- EUR\W*\K.*?(?=\W*Fee)
(?<=- EUR\W*).*?(?=\W*Fee)
Details:
- EUR - literal text
\W* - zero or more non-word chars
(.*?) - Group 1: any zero or more chars other than line break chars, as few as possible
\W* - zero or more non-word chars
Fee - literal text.
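
For illustration, this is how the capturing version could be used from Python. The input string here is an assumption, since the original test string is not shown in the question.

```python
import re

# Hypothetical input; the real text from the question is not available.
text = "Transfer - EUR   1.234,56   Fee applied"

m = re.search(r"- EUR\W*(.*?)\W*Fee", text)
# Group 1 holds the value without the surrounding whitespace, because the
# two \W* parts consume it on both sides.
amount = m.group(1) if m else None
print(amount)  # 1.234,56
```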
You could also match the number format in capture group 1
- EUR\b\D*(\d+(?:,\d+)?)\s+Fee\b
- EUR\b Match - EUR and a word boundary
\D* Match 0+ times any char except a digit
( Capture group 1
\d+(?:,\d+)? Match 1+ digits with an optional decimal part
) Close group 1
\s+Fee\b Match 1+ whitespace chars, Fee and a word boundary
Regex demo
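
If the goal is an actual float, the captured text still has to be converted. A minimal sketch, assuming the comma is a decimal separator (as in 12,50) and using a made-up helper name extract_amount:

```python
import re

AMOUNT_RE = re.compile(r"- EUR\b\D*(\d+(?:,\d+)?)\s+Fee\b")

def extract_amount(text):
    """Return the amount as a float, or None when the pattern does not match."""
    m = AMOUNT_RE.search(text)
    if m is None:
        return None
    # Assumption: the comma is a decimal separator, e.g. "12,50" -> 12.5
    return float(m.group(1).replace(",", "."))

print(extract_amount("Total - EUR   12,50   Fee"))  # 12.5
```

Note that this pattern does not handle thousands separators such as 1.234,56; those would need a different number sub-pattern.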
This is working; I removed the , from (.) in the test string.
Regex example - working

Regex for extracting digits in a string not in a word and not separated by a symbol?

I want to extract an ID from a search query but I don't know the length of the ID.
From this input I want to get the numbers that are not in the words and the numbers that are not separated by symbols.
12 11231390 good123e41 12he12o1 1391389 dajue1290a 12331 12-10 1.2 test12.0why 12+12 12*6 2d1139013 09`29 83919 1
Here I want to return
12 11231390 1391389 12331 83919 1
So far I've tried /\b[^\D]\d*[^\D]\b/gm but I get the numbers in between the symbols and I don't get the 1 at the end.
You could repeatedly match digits between whitespace boundaries. Using a word boundary \b would give you partial matches.
Note that [^\D] is the same as \d and would expect at least a single character.
Your pattern can be written as \b\d\d*\d\b and you can see that you don't get the 1 at the end as your pattern matches at least 2 digits.
(?<!\S)\d+(?:\s+\d+)*(?!\S)
The pattern matches:
(?<!\S) Negative lookbehind, assert a whitespace boundary to the left
\d+(?:\s+\d+)* Match 1+ digits and optionally repeat matching 1+ whitespace chars and 1+ digits.
(?!\S) Negative lookahead, assert a whitespace boundary to the right
Regex demo
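
Because the pattern also consumes whitespace between adjacent numbers, neighbouring IDs come back as one match. A short Python sketch on the input from the question, splitting each match afterwards to get the individual numbers (the splitting step is my addition, not part of the pattern):

```python
import re

text = ("12 11231390 good123e41 12he12o1 1391389 dajue1290a 12331 "
        "12-10 1.2 test12.0why 12+12 12*6 2d1139013 09`29 83919 1")

# Each match is a run of whitespace-separated numbers.
runs = re.findall(r"(?<!\S)\d+(?:\s+\d+)*(?!\S)", text)

# Split every run to get the individual numbers.
numbers = [n for run in runs for n in run.split()]

print(runs)     # ['12 11231390', '1391389', '12331', '83919 1']
print(numbers)  # ['12', '11231390', '1391389', '12331', '83919', '1']
```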
If lookarounds are not supported, you could use a match with a capture group
(?:^|\s)(\d+(?:\s+\d+)*)(?:$|\s)
Regex demo

Working with regex for alphanumeric

I'm trying to write a regex for an alphanumeric string of length 7 (with letters at positions 1, 3, 4 and digits at positions 2, 5, 6, 7).
[a-zA-Z]|[0-9]|[a-zA-Z]|[a-zA-Z]|[0-9]|[0-9]|[0-9]
Can someone help me?
The sequence "character, digit, character, character, digit, digit, digit" is expressed in regex as
[a-zA-Z][0-9][a-zA-Z]{2}[0-9]{3}
If you're working in PCRE (with say, PHP):
^([a-zA-Z])([0-9])(?1){2}(?2){3}$
Breakdown:
^ - from the start of the string
([a-zA-Z]) - match and capture a single character in the ranges given: a-z, A-Z
([0-9]) - match and capture a single character in the ranges given: 0-9
(?1){2} - redo the regex in the first group twice (recursive subpattern)
(?2){3} - redo the regex in the second group 3 times (recursive subpattern)
$ - the end of the string
If you want to match this in the middle of a sentence, exchange ^ and $ for \b - which will match a word boundary
See the demo
If you're not using PCRE:
^[a-zA-Z][0-9][a-zA-Z]{2}[0-9]{3}$
This does the same thing, but involves some copy-paste.
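
Python's built-in re module does not support the (?1) subroutine syntax, so a Python version would use the plain form. A minimal sketch, with the function name is_valid_code chosen only for this example:

```python
import re

# letter, digit, two letters, three digits
CODE_RE = re.compile(r"[a-zA-Z][0-9][a-zA-Z]{2}[0-9]{3}")

def is_valid_code(value):
    # fullmatch requires the whole string to match, which plays the role
    # of the ^ and $ anchors in the pattern above.
    return CODE_RE.fullmatch(value) is not None

print(is_valid_code("A1bc234"))   # True
print(is_valid_code("11bc234"))   # False
```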

RegEx to match 2 or more digits in a string

Suppose I have strings like:
ABC-L-W7P-1423
ABC-L-W7E-87
CH-L-W7-756
I need to grab the number at the end. That number might be 2, 3 or 4 digits. But currently what I have is:
=REGEXREPLACE(B2,"[^0-9]","")
Which of course also grabs the '7' in 'W7P' which I don't want.
EDIT:
I also need to match something like this:
CH-M-311-MM
So always a 2, 3 or 4 (or 5) digit number, but I need single digits excluded.
You can use =REGEXEXTRACT with \b[0-9]{2,4}\b:
=REGEXEXTRACT(B2, "\b[0-9]{2,4}\b")
See the regex demo.
Details:
\b - a leading word boundary
[0-9]{2,4} - 2 to 4 digits (use {2,5} if 5-digit numbers can occur)
\b - trailing word boundary
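
The formula above is for Google Sheets, but the same pattern can be checked outside the spreadsheet. A small Python sketch on the sample strings from the question; extract_number is a hypothetical helper name:

```python
import re

samples = ["ABC-L-W7P-1423", "ABC-L-W7E-87", "CH-L-W7-756", "CH-M-311-MM"]

def extract_number(s):
    """Return the first standalone 2-4 digit number, or None."""
    m = re.search(r"\b[0-9]{2,4}\b", s)
    return m.group(0) if m else None

extracted = [extract_number(s) for s in samples]
print(extracted)  # ['1423', '87', '756', '311']
```

The 7 in W7P is skipped because there is no word boundary between a letter and a digit.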
In case your 2-4 digits are always preceded by -, you may use
=REGEXREPLACE(B2,"^.*-([0-9]{2,4})\b.*","$1")
See this regex demo
Details:
^ - start of string
.*- - any 0+ chars up to the last - that is followed by...
([0-9]{2,4}) - (Group 1 referred to with $1 in the replacement pattern) - 2 to 4 digits
\b - a trailing word boundary
.* - any chars up to the end of string.
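
The replace-based approach translates to Python roughly as follows. One difference to be aware of: Python's re.sub refers to the group as \1 in the replacement, not $1. The helper name last_number is an assumption for this sketch.

```python
import re

def last_number(s):
    # Replace the whole string with the captured 2-4 digit group.
    # If the pattern does not match, re.sub returns s unchanged.
    return re.sub(r"^.*-([0-9]{2,4})\b.*", r"\1", s)

for s in ["ABC-L-W7P-1423", "ABC-L-W7E-87", "CH-L-W7-756", "CH-M-311-MM"]:
    print(s, "->", last_number(s))
```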
I'm not sure which language you use, but if it supports lookarounds, you can assert that there is a - (dash) on the left side.
(?<=-)\d+
See: https://regex101.com/r/sI9zR9/1