Regex for month with optional leading 0 - regex

I am trying to match various months, that may be in the form of:
01
1
12
13
09
All of the above inputs are valid except for 13.
The current regex I have for this is:
0?(?#optional leading 0, for example 04)
\d(?#followed by any number, 01, 2, 09, etc.)
|(?#or 10,11,12)
1[012]
What's wrong with the above regex? Here's an example link: https://regex101.com/r/cujCmD/1

I would phrase the regex as:
^(?:0?[1-9]|1[012])$
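Your pattern has two problems: \d also accepts 0, so 00 passes, and without anchors the regex can match just part of the input, such as the 1 in 13. The group and the anchors are needed to ensure that whichever alternative is chosen gets applied to the entire input.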
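To see the effect of the anchors, here is a minimal Python sketch. Python's `re` module is an assumption (the question does not name a language), and the helper name `is_valid_month` is made up for illustration.

```python
import re

# Anchored pattern from the answer: 1-9 with an optional leading 0, or 10-12.
MONTH_RE = re.compile(r"^(?:0?[1-9]|1[012])$")

# The question's pattern with its (?#...) comments removed, for comparison.
ORIGINAL_RE = re.compile(r"0?\d|1[012]")


def is_valid_month(text):
    """Return True if text is a month number 1-12, optionally zero-padded."""
    return MONTH_RE.match(text) is not None


for sample in ["01", "1", "12", "13", "09"]:
    print(sample, is_valid_month(sample))

# Without anchors, the original pattern still finds a partial match in "13".
print(ORIGINAL_RE.search("13").group())
```

Only 13 is rejected by the anchored pattern, while the unanchored original settles for the first digit of 13.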

Related

Match street number from different formats without suffixes

We have a "street_number" field which has been freely filled in over the years and which we want to format. Using regular expressions, we'd like to extract the real "street_number" and the "street_number_suffix".
Ex: 17 b, "street_number" would be 17, and "street_number_suffix" would be b.
As there are a dozen different patterns, I'm having trouble tuning the regular expression correctly. I'm considering using 2 different regexes, one to extract the "street_number" and another to extract the "street_number_suffix".
Here's an exhaustive set of patterns we'd like to format and the expected output:
# Extract street_number using PCRE
input    street_number    street_number_suffix
19-21    19               null
2 G      2                G
A        null             A
1 bis    1                bis
3 C      3                C
N°10     10               null
17 b     17               b
76 B     76               B
7 ter    7                ter
9/11     9                null
21.3     21               3
42       42               null
I know I could use an expression that matches digits up to a hyphen, \d+(?=\-).
It could be extended to match up to a hyphen OR a slash using \d+(?=\-|\/), though once I add \s to this pattern, the 21 in 19-21 matches as well. Adding conditions may not be that simple, which is why I'm asking for help.
Could anyone give me a hand with this? If it helps, here's a draft: https://regex101.com/r/jGK5Sa/4
Edit: at the time I'm editing, here's the closest regex I could find:
(?:(N°|(?<!\-|\/|\.|[a-z]|.{1})))\d+
However, the full match for N°10 is N°10 rather than 10 (and our ETL doesn't support capturing groups, so I can't use /......(\d+)/)
To get the street numbers, you could update the pattern to:
(?<![-/.a-z\d])\d+
Explanation
(?<! Negative lookbehind
[-/.a-z\d] Match any of the listed characters using a character class
) Close the negative lookbehind
\d+ Match 1+ digits
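As a rough check of the lookbehind pattern against some of the inputs from the question, here is a minimal Python sketch. The question targets PCRE, so Python's `re` module is an assumption here, and the helper name `street_number` is hypothetical. It relies on the full match only, with no capturing groups.

```python
import re

# Digits not preceded by '-', '/', '.', a lowercase letter, or another digit.
STREET_NUMBER_RE = re.compile(r"(?<![-/.a-z\d])\d+")


def street_number(raw):
    """Return the first street number in raw, or None if there is none."""
    match = STREET_NUMBER_RE.search(raw)
    return match.group() if match else None


for raw in ["19-21", "2 G", "A", "N°10", "9/11", "21.3", "42"]:
    print(repr(raw), "->", street_number(raw))
```

The lookbehind is a single character wide, so it also works in engines that only allow fixed-width lookbehinds.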

Regex to match some dates matching non-dates

I'm using some Regex to find date strings of the form Jan 12, 2015 or Feb 3, 1999.
The regex I'm using is \w+\s\d{1,2},\s\d{4} and it works, but the file also contains strings of the form
Weg 58, 4047 or Strasse 1, 4482, and those match too.
How can I avoid those non-date matches? My approach is:
The first string (the one of the month, Jan, Feb, etc.) has to have always length 3.
The year has to start with 1 or 2.
The thing is that I don't know how to add these two conditions to my regex. Any help please?
You can make the test right here: https://regex101.com/r/bN2pO0/1
Thanks in advance.
Since the month names won't change (i.e. the values are always January through December), we can list their first 3 characters.
We can then use an OR | operator to select years starting with 1 or 2:
/((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s\d{1,2},\s(1|2)\d{3})/ig
https://regex101.com/r/bN2pO0/3
Just as you used \d{1,2} to match a digit 1 or 2 times and \d{4} to match a digit 4 times, you can use \w{3} to match a word character 3 times.
For the year, you can use the pipe "or" operator |.
\w{3}\s\d{1,2},\s(?:1|2)\d{3}
However, this will also match non-dates of the form Abc xy, 1xyz (where x, y and z are digits).
If you want, you can go with a brute-force approach, or drop the regex and capture the dates in code.
Brute force:
(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s(?:[0-2]?[0-9]|3[01]),\s[12]\d{3}
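The month-alternation approach can be tried out with a short Python sketch. Python's `re` module, the helper name `find_dates`, the added `\b` word boundaries, and the sample text are all assumptions for illustration; the day part is written so that it admits 30 and 31 as well.

```python
import re

# Month abbreviation, day (1-2 digits, up to 31), comma, year starting with 1 or 2.
DATE_RE = re.compile(
    r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"
    r"\s(?:[0-2]?\d|3[01]),\s[12]\d{3}\b",
    re.IGNORECASE,
)


def find_dates(text):
    """Return every date-like substring found in text."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return DATE_RE.findall(text)


sample = "Jan 12, 2015 Weg 58, 4047 Feb 3, 1999 Strasse 1, 4482 Mar 31, 2001"
print(find_dates(sample))
```

The street-address strings are skipped because neither starts with a month abbreviation. The pattern still does not check calendar validity (Feb 31 would pass), so parsing the match in code remains the stricter option.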

how to make a regular expressions to accept only digits not started with zero or zero only?

I'm trying to create a regex to accept digits not starting with zero or a single zero digit.
Example matches
0
50
798
Example rejects
01
046
0014
00
0001
My attempt was to use /[0]|[1-9][0-9]*/ to match the values in the following text:
0, 50, 798
01, 046, 0014, 00, 0001
This attempt can be run at http://regexr.com/3bb00
Use the following regex:
^(0|[1-9]\d*)$
see Demo https://regex101.com/r/zT8uI2/2
This regex has two parts: 0, or [1-9]\d*, which is a number that doesn't start with zero.
Note that if you want to match your numbers within other text, you need word boundaries instead of start and end anchors:
\b(0|[1-9]\d*)\b
see demo https://regex101.com/r/zT8uI2/3
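Both variants can be checked against the question's sample values with a minimal Python sketch. Python's `re` module and the helper names `is_valid_number` and `find_numbers` are assumptions for illustration.

```python
import re

# Anchored form: the whole string must be "0" or a number without a leading zero.
WHOLE_RE = re.compile(r"^(0|[1-9]\d*)$")

# Word-boundary form: finds such numbers inside a longer text.
IN_TEXT_RE = re.compile(r"\b(0|[1-9]\d*)\b")


def is_valid_number(text):
    """Return True if text is 0 or digits that do not start with 0."""
    return WHOLE_RE.match(text) is not None


def find_numbers(text):
    """Return all valid numbers embedded in text."""
    # With a single capturing group, findall returns that group's text,
    # which here is the same as the full match.
    return IN_TEXT_RE.findall(text)


print([s for s in ["0", "50", "798", "01", "046", "00"] if is_valid_number(s)])
print(find_numbers("0, 50, 798, 01, 046, 0014, 00, 0001"))
```

The word boundaries matter here: without them, a search would pick up the 0 and the 1 of 01 as two separate matches.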
It seems that you have two cases in your regex:
Match a single zero
Match digits that don't start with zero.
The first case is easy: /0/
The second case is also pretty easy /[1-9]\d*/. The [1-9] matches the digit that is not 0. Then, we can have 0 or more digits.
To cover both cases, join them with the alternation bar |:
/0|[1-9]\d*/
Hmm, why not something like..
if(input[0] == '0' && input.size() > 1) // reject
else //accept
Please check this http://regexr.com/3bb09
Took the tip from https://www.safaribooksonline.com/library/view/regular-expressions-cookbook/9781449327453/ch06s06.html
and improved it to negate numbers starting with 0.
RE: \b[^0,]*([1-9][0-9]*|0)\b
Text: 0, 50, 798, 01, 046, 0014, 00, 0001
Matched only 0, 50 and 798
Thanks
Venkat

What's the best Regular Expression to use for returning some phone numbers, but not all?

I'm new to regular expressions and working on something that will return all UK phone numbers with an area code beginning 01, 02, 03 or 07 only. It must not match 08 or 09. It also has to take into account the different grouping styles. But here's the kicker... it's got to be 80 characters or less.
This was my best shot:
(01|02|03|07|44\D*1|44\D*2|44\D*3|44\D*7|)(\d\D*){9}
The problem is that it's returning any 9 digit or less number and I can't figure out why.
Any help would be grand!
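The trailing | at the end of your first group adds an empty alternative, so the prefix is optional and any run of 9 digits matches. Without it, (01|02|03|07|44\D*1|44\D*2|44\D*3|44\D*7) matches either 0 or 44\D* followed by 1, 2, 3 or 7, which simplifies to: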
(?:44\D*|0)[1237]
Putting that with the rest gives:
(?:44\D*|0)[1237](\D*\d\D*){9}
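A quick way to compare the two patterns is the Python sketch below. Python's `re` module is an assumption, and the sample numbers are made up for illustration.

```python
import re

# The question's pattern: note the empty alternative just before the first ')'.
ORIGINAL_RE = re.compile(r"(01|02|03|07|44\D*1|44\D*2|44\D*3|44\D*7|)(\d\D*){9}")

# The simplified pattern from the answer, with a mandatory prefix.
FIXED_RE = re.compile(r"(?:44\D*|0)[1237](\D*\d\D*){9}")

samples = ["020 7946 0958", "+44 20 7946 0958", "0812345678"]
for s in samples:
    print(s, bool(ORIGINAL_RE.search(s)), bool(FIXED_RE.search(s)))

# The fixed pattern stays well inside the 80-character limit.
print(len(FIXED_RE.pattern))
```

The 08 sample is accepted by the original pattern through its empty alternative, and rejected by the fixed one.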

Regular expression for matching numbers and ranges of numbers

In an application I have the need to validate a string entered by the user.
One number
OR
a range (two numbers separated by a '-')
OR
a list of comma separated numbers and/or ranges
AND
any number must be between 1 and 999999.
A space is allowed before and after a comma and/or '-'.
I thought the following regular expression would do it.
(\d{1,6}\040?(,|-)?\040?){1,}
This matches the following (which is excellent). (\040 in the regular expression is the character for space).
00001
12
20,21,22
100-200
1,2-9,11-12
20, 21, 22
100 - 200
1, 2 - 9, 11 - 12
However, I also get a match on:
!!!12
What am I missing here?
You need to anchor your regex
^(\d{1,6}\040?(,|-)?\040?){1,}$
Otherwise you will get a partial match on "!!!12": the regex matches only the trailing digits.
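The difference the anchors make can be seen in a minimal Python sketch. Python's `re` module is an assumption; it reads \040 as the octal escape for a space, the same as in the question.

```python
import re

# The question's pattern, searched anywhere in the input.
UNANCHORED_RE = re.compile(r"(\d{1,6}\040?(,|-)?\040?){1,}")

# The same pattern, forced to cover the whole input.
ANCHORED_RE = re.compile(r"^(\d{1,6}\040?(,|-)?\040?){1,}$")

for s in ["1, 2 - 9, 11 - 12", "100 - 200", "!!!12"]:
    print(repr(s), bool(UNANCHORED_RE.search(s)), bool(ANCHORED_RE.search(s)))
```

Anchoring only fixes the partial-match problem. The pattern still accepts a trailing separator such as "12," and values like 0, so the 1 to 999999 range would need a separate check.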
/\d*[-]?\d*/
I have tested this with Perl:
> cat temp
00001
12
20,21,22
100-200
1,2-9,11-12
20, 21, 22
100-200
1, 2-9, 11-12
> perl -lne 'push @a,/\d*[-]?\d*/g;END{print "@a"}' temp
00001 12 20 21 22 100-200 1 2-9 11-12 20 21 22 100-200 1 2-9 11-12
As the result above shows, this puts all the regex matches into an array and finally prints the array elements.