Data annotation regular expression - regex

I need a regular expression for a string like this:
ex. 1234-1234-12345
where the first two numbers must be between 01-18 and the whole string must be 15 characters long
example: 0511-xxxx-xxxxx.
I tried using [RegularExpression(@"^[0-9]{1,18}$",
ErrorMessage = "Invalid Id.")]
but it doesn't work; it even gives me an error saying that a ',' is missing.
Let's make it even easier: a numeric string 13 characters long where the first two digits must be between 01 and 18.
Ex. 1234567890123
(I would prefer the first format, but this one works too.)
I don't know how to use Regex so if someone can kindly give me a link to somewhere I can learn I would very much appreciate it.
And, most importantly, if there is a better way to get around this without using Regex I would appreciate it as well.
Apparently, my request is a little unclear. What I want is that the first two digits (XXxx-xxxx-xxxxx) be one of 01, 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, 12, 13, 14, 15, 16, 17, 18.

Your "first two numbers" is a little unclear, but how about:
var pattern = @"(0[1-9]|1[0-8])\d\d-\d{4}-\d{5}";
If you want to match the whole string and not just find a substring, you need
var pattern = @"^(0[1-9]|1[0-8])\d\d-\d{4}-\d{5}$";
If the groups aren't separated by hyphens, use:
var pattern = @"^(0[1-9]|1[0-8])\d{11}$";
You can use it like
Regex.IsMatch(aString, pattern)
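As a rough cross-check in Python (not the C# attribute itself), the sketch below validates the hyphenated format once with a regex and once without, since the question also asks for a non-regex route. It assumes 00 should be rejected, as the list 01 to 18 in the question suggests. The names ID_PATTERN, is_valid_id and is_valid_id_no_regex are made up for illustration.

```python
import re

# Assumption: the first two digits must be 01..18, so 00 is rejected.
# re.ASCII keeps \d limited to the characters 0-9.
ID_PATTERN = re.compile(r"(0[1-9]|1[0-8])\d{2}-\d{4}-\d{5}", re.ASCII)


def is_valid_id(text):
    """Regex version: the whole string must match, not just a substring."""
    return ID_PATTERN.fullmatch(text) is not None


def is_valid_id_no_regex(text):
    """Same rule without a regex: split on hyphens and check each part."""
    parts = text.split("-")
    # Three groups of 4, 4 and 5 characters (13 digits + 2 hyphens = 15).
    if [len(p) for p in parts] != [4, 4, 5]:
        return False
    # Every group must consist of plain ASCII digits.
    if not all(p.isascii() and p.isdigit() for p in parts):
        return False
    # The first two digits, read as a number, must lie in 1..18.
    return 1 <= int(parts[0][:2]) <= 18


for candidate in ["0511-1234-12345", "1811-1234-12345",
                  "1911-1234-12345", "0011-1234-12345"]:
    print(candidate, is_valid_id(candidate), is_valid_id_no_regex(candidate))
```

The two functions should agree on every input; the non-regex version is longer but may be easier to maintain if the range rule changes.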

Related

Regex for the number that should not start with some specific two digit numbers

I am trying to build a regex for the following rules:
1. No letters or other characters; only digits may be entered
2. The number must contain at least 8 digits
3. Numbers and combinations starting as follows should not be allowed:
[3.1] - Any number starting with 0 (zero) and with 1 (one);
[3.2] - Numbers starting with 20, 21, 22, 23, 24, 25, 26, and 27.
I am able to achieve regex for points 1, 2 and 3.1 like this
^[2-9]{1}[0-9]{7,}$
But I am not able to find a solution for point 3.2.
This is one of the options I started with; this regex matches a string that must start with 20, 21, 22, 23, 24, 25, 26, or 27.
^([2][0-7])[2-9]{1}[0-9]{7,}$
I just have to find the negation of this first condition.
Please help!
Rather than trying to negate the 20-27 condition, just match numbers that start with 28 or 29 instead:
^(?:2[89]|[3-9]\d)\d{6,}$
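For comparison, here is a small Python sketch that checks the pattern above against a literal negation written as a negative lookahead. The lookahead variant and the helper names are my own illustration, not part of the answer; the two patterns should accept exactly the same inputs.

```python
import re

# The answer's approach: spell out the allowed two-digit prefixes.
ALLOWED_PREFIX = re.compile(r"(?:2[89]|[3-9]\d)\d{6,}")

# Illustrative alternative: reject the forbidden prefixes with a lookahead,
# then require at least 8 digits in total.
NEGATED_PREFIX = re.compile(r"(?![01]|2[0-7])\d{8,}")


def is_allowed(number):
    """True if the whole string satisfies rules 1, 2, 3.1 and 3.2."""
    return ALLOWED_PREFIX.fullmatch(number) is not None


def is_allowed_lookahead(number):
    """Same check, expressed as a negation of the forbidden prefixes."""
    return NEGATED_PREFIX.fullmatch(number) is not None


for sample in ["28123456", "30000000", "27123456", "12345678", "2912345"]:
    print(sample, is_allowed(sample), is_allowed_lookahead(sample))
```

The lookahead form stays closer to the wording of the rules, while the answer's form works in engines without lookahead support.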

Match if something is not preceded by something else

I'm trying to parse a string and extract some numbers from it. Basically, any 2-3 digits should be matched, except the ones that have "TEST" before them. Here are some examples:
TEST2XX_R_00.01.211_TEST => 00, 01, 211
TEST850_F_11.22.333_TEST => 11, 22, 333
TESTXXX_X_12.34.456 => 12, 34, 456
Here are some of the things I've tried:
(?<!TEST)[0-9]{2,3} - ignores only the first digit after TEST
_[0-9]{2,3}|\.[0-9]{2,3} - matches the numbers correctly, but matches the character before them (_ or .) as well.
I know this might be a duplicate to regex for matching something if it is not preceded by something else but I could not get my answer there.
Unfortunately, a single Lua pattern cannot match a string that is not preceded by some sequence. You also can't rely on capturing only the alternative you need: TEST%d+|(%d+) will not work, because Lua patterns do not support alternation.
You may remove all substrings consisting of TEST followed by digits, and then extract the remaining digit chunks:
local s = "TEST2XX_R_00.01.211_TEST"
for x in string.gmatch(s:gsub("TEST%d+",""), "%d+") do
    print(x)
end
Here, s:gsub("TEST%d+","") removes every TEST that is followed by one or more digits (together with those digits), and the %d+ pattern used with string.gmatch extracts all the digit chunks that remain.
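The same remove-then-extract idea can be sketched in Python to check it against all three sample strings from the question. This is a transposition for illustration, not Lua code, and the function name digit_chunks is hypothetical.

```python
import re


def digit_chunks(text):
    """Drop every 'TEST<digits>' run, then collect the remaining digit runs."""
    # Step 1: remove TEST together with the digits directly after it.
    cleaned = re.sub(r"TEST\d+", "", text)
    # Step 2: extract whatever digit chunks are left.
    return re.findall(r"\d+", cleaned)


for sample in ["TEST2XX_R_00.01.211_TEST",
               "TEST850_F_11.22.333_TEST",
               "TESTXXX_X_12.34.456"]:
    print(sample, "=>", digit_chunks(sample))
```

Python's re module could also do this in one pass with a lookbehind plus a word-boundary-style guard, but the two-step version mirrors the Lua approach more directly.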

Complex Regular Expression, PEG, or Multiple Passes?

I am trying to extract some data from the following examples:
Name 789, 10-mill 12-27b
Manufacturer XY-2822, 10-mill, 17-25b
Other Manufacturer 16b Part
Another Manufacturer FER M9000, 11-mill, 11-40
18b Part
Maker 11-31, 10-mill
Maker 1x or 2x; max size 1x (34b), 2x (38/24b)
Maker REC6 15/18/26b. Square.
Producer FC-40 11-13-16-19-22-25-27-30-34b
What I'd like my results to be respectively are:
12, 27
17, 25
16
11, 40
18
11-31
34, 38, 24 (optional, it's fine if only the latter two are provided)
15, 18, 26
11, 13, 16, 19, 22, 25, 27, 30, 34
I am happy to do this in multiple passes or with a parsing expression grammar, though I don't think the latter will really help.
I'm having trouble using lookaheads and lookbehinds to grab that data and exclude things like "11-mill" and "XY-2822". What I find happening is that I am able to exclude those matches but end up truncating good results for other matches.
What is the best way to go about this?
My current regex is
/(?:(\d+)[b\b\/-])([b\d\b]*)[^a-z]/i
which is capturing the letter 'b' (which is okay) but not capturing 34b in the final example
I'm not sure what your exact requirements/formats are, but you can try this:
/(?:\G(?!^)[-\/]|^(?:.*[^\d\/-])?)\K\d++(?![-\/]\D)/
http://rubular.com/r/WJqcCNe2pr
details:
# two possible starts:
(?:                   # next occurrences
    \G                # anchor for the position after the previous match
    (?!^)             # not at the start of the line
    [-\/]
  |                   # first occurrence
    ^
    (?:.*[^\d\/-])?   # (note the greedy quantifier here,
                      # to obtain the last result of the line)
)
\K                    # discards characters matched before from the whole match
\d++                  # several digits with a possessive quantifier to forbid backtracking
(?![-\/]\D)           # not followed by a hyphen or a slash and then a non-digit
You can improve the pattern if you replace (?:.*[^\d\/-])? with [^-\d\/\n]*+(?>[-\d\/]+[^-\d\/\n]+)* (remove the \n if you work line by line.). The goal of this change is to limit the backtracking (that occurs atomic group by atomic group, instead of character by character for the first version).
Perhaps, you can replace the negative lookahead with this kind of positive lookahead: (?=[-\/]\d|b|$)
Perhaps this:
(?<=\d-)\d+|\d+(?=-\d+)|\d+(?=(?:\/\d+)*b)
https://regex101.com/r/nR3eS9/1
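To see how that last pattern behaves on the sample lines, here is a minimal Python sketch. It only uses constructs Python's re module supports (a fixed-width lookbehind and lookaheads); the \G and \K constructs from the first answer are not available in Python's re module, so that pattern is not reproduced here. The name extract_sizes is hypothetical.

```python
import re

# Three alternatives: digits right after "<digit>-", digits followed by
# "-<digits>", or digits followed by optional "/<digits>" groups and a "b".
SIZES = re.compile(r"(?<=\d-)\d+|\d+(?=-\d+)|\d+(?=(?:/\d+)*b)")


def extract_sizes(line):
    """Return the digit runs that look like sizes, as strings."""
    return SIZES.findall(line)


for line in ["Name 789, 10-mill 12-27b",
             "Other Manufacturer 16b Part",
             "Maker REC6 15/18/26b. Square.",
             "Producer FC-40 11-13-16-19-22-25-27-30-34b"]:
    print(line, "=>", extract_sizes(line))
```

Note that "Maker 11-31, 10-mill" comes back as two separate values, 11 and 31, rather than the single "11-31" shown in the question.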

What's the best Regular Expression to use for returning some phone numbers, but not all?

I'm new to regular expressions and working on something that will return all UK phone numbers with an area code beginning with 01, 02, 03 or 07 only. It must not match 08 or 09. It also has to take into account the different grouping styles. But here's the kicker... it's got to be 80 characters or less.
This was my best shot:
(01|02|03|07|44\D*1|44\D*2|44\D*3|44\D*7|)(\d\D*){9}
The problem is that it's returning any 9 digit or less number and I can't figure out why.
Any help would be grand!
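The trailing | inside your first group adds an empty alternative, so the prefix becomes optional and any run of 9 digits matches. Without it, (01|02|03|07|44\D*1|44\D*2|44\D*3|44\D*7) matches either 0 or 44\D* followed by 1, 2, 3 or 7, which simplifies to: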
(?:44\D*|0)[1237]
Putting that with the rest gives:
(?:44\D*|0)[1237](\D*\d\D*){9}
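A quick way to see the difference is to run both patterns over a few strings. The Python sketch below does that; the sample numbers are invented for illustration and the helper name finds_number is hypothetical.

```python
import re

# The question's pattern, including the empty alternative before ")".
ORIGINAL = re.compile(r"(01|02|03|07|44\D*1|44\D*2|44\D*3|44\D*7|)(\d\D*){9}")

# The simplified pattern with a mandatory prefix.
FIXED = re.compile(r"(?:44\D*|0)[1237](\D*\d\D*){9}")


def finds_number(pattern, text):
    """True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None


# Made-up sample inputs: a bare 9-digit run, two wanted prefixes, one unwanted.
for text in ["123456789", "01234 567890", "+44 20 7946 0958", "0800 123 4567"]:
    print(text, finds_number(ORIGINAL, text), finds_number(FIXED, text))

# The fixed pattern also fits the length limit from the question.
print(len(FIXED.pattern))
```

The bare 9-digit run is accepted only by the original pattern, which is the behaviour described in the question.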

python regex repetition with capture question

Using Python 3's regex capabilities, is it possible to capture a variable number of groups, based on the number of repetitions found? For instance, in the following search strings, I want to capture all the digit strings with the same regex.
search string 1(trying to capture: 89, 45):
zzz89zzz45.mp3
search string 2(trying to capture: 98, 67, 89, 45):
zzz98zzz67zzz89zzz45.mp3
search string 3(trying to capture: 98, 67, 89, 45, 55, 111):
zzz98zzz67zzz89zzz45vdvd55lplp111.mp3
The following regex will match all the repetitions, though not all the values are available for later use (only one digit string is captured):
((\d+)\D*)*\.mp3$
The other two options are writing a different regex for every case or using findall(). Is there a way to adjust the above regex to capture every digit string for later use, with a varying number of repetitions, using just regex facilities? Or are you forced to use findall() to do this in Python 3?
Most or all regular expression engines in common use, including in particular those based on the PCRE syntax (like Python's), label their capturing groups according to the numerical index of the opening parenthesis, as the regex is written. So no, you cannot use capturing groups alone to extract an arbitrary, variable number of subsequences from a string.
The closest you can get (as far as I know) is to manually write out a certain number of capturing groups, something like this:
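import re

s = ...  # the string to search
# each digit group is optional, so inputs with fewer than 25 groups still match
res = re.match(r'\D*' + 25 * r'(?:(\d+)\D+)?', s)
numbers = [r for r in res.groups() if r is not None]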
This will get you up to 25 groups of digits. If you need more, replace 25 with some higher number.
I wouldn't be surprised if this were less efficient than the iterative approach with findall(), although I haven't tested it.
This will match all the numbers before the last dot:
import re

s = "zzz98zzz67zzz89zzz45vdvd55lplp111.mp3"
# digit runs that still have a literal dot somewhere after them
res = re.findall(r"[0-9]+(?=.*\.)", s)
print(res)
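If the iterative route is acceptable, one way to keep the ".mp3" requirement from the question is to check the suffix first and then run findall() on the rest. The sketch below assumes that split is what is wanted; the function name digit_runs is hypothetical.

```python
import re


def digit_runs(filename):
    """Return every digit run in the name, or [] if it does not end in .mp3."""
    suffix = ".mp3"
    if not filename.endswith(suffix):
        return []
    # Search only the part before the extension so the 3 in "mp3" is ignored.
    stem = filename[:-len(suffix)]
    return re.findall(r"\d+", stem)


for name in ["zzz89zzz45.mp3",
             "zzz98zzz67zzz89zzz45.mp3",
             "zzz98zzz67zzz89zzz45vdvd55lplp111.mp3"]:
    print(name, "=>", digit_runs(name))
```

This handles any number of repetitions with one short pattern, at the cost of doing the suffix check outside the regex.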