I've been struggling to figure out how to best do this regular expression.
Here are my requirements:
Up to 8 characters
Can only be alphanumeric
Can only contain up to three alpha characters [a-z] (zero alpha characters are valid too)
Any ideas would be appreciated.
This is what I've got so far, but it only looks for contiguous letter characters:
^(\d|([A-Za-z])(?!([A-Za-z]{3,}))){0,8}$
I'd write it like this:
^(?=[a-z0-9]{0,8}$)(?:\d*[a-z]){0,3}\d*$
It has two parts:
(?=[a-z0-9]{0,8}$)
Looks ahead and matches up to 8 alphanumeric characters, through to the end of the string
(?:\d*[a-z]){0,3}\d*$
Essentially allowing injection of up to 3 [a-z] among \d*
Rubular
On rubular.com
12345678 // matches
123456789
#(#*#$
12345 // matches
abc12345 // matches
abcd1234
12a34b5c // matches
12ab34cd
123a456 // matches
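If you'd rather try the pattern locally than on Rubular, here is a minimal Python sketch. The sample list and the variable names are my own picks for illustration; only the pattern comes from the answer.

```python
import re

# Pattern from the answer above, unchanged.
PATTERN = re.compile(r"^(?=[a-z0-9]{0,8}$)(?:\d*[a-z]){0,3}\d*$")

# Hypothetical sample inputs, chosen for illustration.
samples = ["12345678", "123456789", "#(#*#$", "12345",
           "abcd1234", "12a34b5c", "12ab34cd", "123a456"]

# Map each sample to True/False depending on whether the pattern matches.
results = {s: bool(PATTERN.match(s)) for s in samples}

for s, ok in results.items():
    print(f"{s!r}: {'matches' if ok else 'no match'}")
```

Note that in Python `$` also matches just before a trailing newline, so `re.fullmatch` may be the safer call if the input is not already stripped.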
Alternatives
I do think regex is the best solution for this, but since the string is short, it would be a lot more readable to do this in two steps as follows:
It must match [a-z0-9]{0,8}
Then, delete all \d
The length must now be <= 3
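The three steps above could be sketched in Python roughly like this. The function name `is_valid` is hypothetical, and I'm assuming lowercase-only input as in the [a-z] requirement.

```python
import re

def is_valid(text: str) -> bool:
    # Step 1: only lowercase letters and digits, at most 8 characters.
    if re.fullmatch(r"[a-z0-9]{0,8}", text) is None:
        return False
    # Step 2: delete every digit; whatever remains must be letters.
    letters_only = re.sub(r"\d", "", text)
    # Step 3: at most three letters may remain.
    return len(letters_only) <= 3

print(is_valid("12a34b5c"), is_valid("12ab34cd"))
```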
Do you have to do this in exactly one regular expression? It is possible to do that with standard regular expressions, but the regular expression will be rather long and complicated. You can do better with some of the Perl extensions, but depending on what language you're using, they may or may not be supported. The cleanest solution is probably to check whether the string matches:
^[A-Za-z0-9]{0,8}$
but doesn't match:
([A-Za-z].*){4}
i.e. it's an alphanumeric string of up to 8 characters (first regular expression), but doesn't contain 4 or more alpha characters, possibly separated by other characters (second regular expression).
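A minimal sketch of the "matches one, doesn't match the other" check in Python. The helper name `passes_both_checks` is made up for this example; the two patterns are the ones quoted above, with the anchors replaced by `fullmatch`.

```python
import re

ALNUM_UP_TO_8 = re.compile(r"[A-Za-z0-9]{0,8}")
FOUR_LETTERS = re.compile(r"([A-Za-z].*){4}")

def passes_both_checks(text: str) -> bool:
    # Must be entirely alphanumeric and no longer than 8 characters...
    if ALNUM_UP_TO_8.fullmatch(text) is None:
        return False
    # ...and must NOT contain four letters anywhere in it.
    return FOUR_LETTERS.search(text) is None

print(passes_both_checks("a1b2c3"), passes_both_checks("a1b2c3d4"))
```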
/^(?!(?:\d*[a-z]){4})[a-z0-9]{0,8}$/i
Explanation:
[a-z0-9]{0,8} matches up to 8 alphanumerics.
Lookahead should be placed before the matching happens.
The (?:\d*[a-z]) matches one alphabetic character, optionally preceded by digits. The {4} repeats that four times, so the negative lookahead stops the regex from matching whenever 4 alphabetic characters can be found (i.e. it limits the count to ≤3).
It's better not to exploit regex like this. Suppose you use this solution, are you sure you will know what the code is doing when you revisit it 1 year later? A clearer way is just check rule-by-rule, e.g.
# note: isalnum() is False for an empty string, so "" is rejected here
if len(theText) <= 8 and theText.isalnum():
    if sum(1 for c in theText if c.isalpha()) <= 3:
        pass  # valid
The easiest way to do this would be in multiple steps:
Test the string against /^[a-z0-9]{0,8}$/i -- the string is up to 8 characters and only alphanumeric
Make a copy of the string, delete all non-alphabetic characters
See if the resulting string has a length of 3 or less.
If you want to do it in one regular expression, you can use something like:
/^(?=\d*(?:[a-z]?\d*){0,3}$)[a-z0-9]{0,8}$/i
Which looks for an alphanumeric string between length 0 and 8 (^[a-z0-9]{0,8}$), but first uses a lookahead ((?=\d*(?:[a-z]?\d*){0,3}$)) to make sure that the string has at most 3 alphabetic characters.
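Since several single-regex variants have been posted in this thread, here is a rough Python sketch that compares them against a plain rule-by-rule check. The `reference` function, the labels and the sample list are my own assumptions, and I apply the case-insensitive flag to all three patterns for a like-for-like comparison. An empty list means the patterns agree with the reference on these samples only; it is not a proof of equivalence.

```python
import re

candidates = {
    "interleaved": r"^(?=[a-z0-9]{0,8}$)(?:\d*[a-z]){0,3}\d*$",
    "negative lookahead": r"^(?!(?:\d*[a-z]){4})[a-z0-9]{0,8}$",
    "positive lookahead": r"^(?=\d*(?:[a-z]?\d*){0,3}$)[a-z0-9]{0,8}$",
}

def reference(text: str) -> bool:
    """Rule-by-rule baseline: ASCII letters and digits only, <= 8 chars, <= 3 letters."""
    letters = sum(c.isascii() and c.isalpha() for c in text)
    digits = sum(c.isascii() and c.isdigit() for c in text)
    return len(text) <= 8 and letters + digits == len(text) and letters <= 3

# Hypothetical samples, chosen to hit each rule at least once.
samples = ["", "12345678", "123456789", "abc12345", "abcd1234",
           "12a34b5c", "12AB34cd", "a-1", "A1b2C3"]

# Collect every (pattern label, sample) pair where a regex disagrees with the baseline.
disagreements = [
    (name, s)
    for name, pattern in candidates.items()
    for s in samples
    if bool(re.match(pattern, s, re.IGNORECASE)) != reference(s)
]
print(disagreements)
```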
Related
Is there a regular expression for the following?
String of length 8
First two characters fixed 'UE' or 'ue'
Remaining 6 characters must be digits [0-9]
Eg: https://regex101.com/r/PufypE/1
The expression I tried:
\^(UE|ue){2}[0-9]{6}\
but it's not working (no match found!)
You want:
\b(UE|ue)[0-9]{6}\b
You don't need the {2} next to (UE|ue), since the alternation already spells out both characters. The \b is a word boundary, so this will match each item in a list like the one in your comment: UE123456,ue654321. A good site for experimenting with this kind of regex is http://regex101.com
Regex should be:
^[Uu][Ee][0-9]{6}$
(UE|ue){2} in your regex would match 2 consecutive occurrences of UE or ue, e.g. UEue.
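One difference between the two suggestions is worth checking: the character-class form also accepts mixed case such as Ue, while the alternation does not. A small Python sketch; the sample strings are mine, and `DOUBLED` is the asker's attempt with the stray backslashes dropped, which is an assumption about what was intended.

```python
import re

DOUBLED = re.compile(r"^(UE|ue){2}[0-9]{6}$")       # asker's attempt, backslashes removed
GROUPED = re.compile(r"^(UE|ue)[0-9]{6}$")          # alternation, anchored
CHAR_CLASSES = re.compile(r"^[Uu][Ee][0-9]{6}$")    # character classes

# Hypothetical inputs for illustration.
samples = ["UE123456", "ue654321", "Ue123456", "UEue123456", "UE12345"]

for s in samples:
    print(s, bool(DOUBLED.match(s)), bool(GROUPED.match(s)), bool(CHAR_CLASSES.match(s)))
```

If only 'UE' and 'ue' should be accepted, the alternation form is the stricter of the two.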
I need a regular expression for 4 characters. The first 3 characters must be a number and the last 1 must be a letter or a digit.
I came up with this one, but it is not working:
^([0-9]{3}+(([a-zA-Z]*)|([0-9]*)))?$
Some valid matches: 889A, 777B, 8883
In short: the first 3 characters must be digits and the last one must be a letter or a digit.
This regex should work:
^[0-9]{3}[a-zA-Z0-9]$
This assumes string is only 4 characters in length. If that is not the case remove end of line anchor $ and use:
^[0-9]{3}[a-zA-Z0-9]
Try this
This will match it anywhere.
\d{3}[a-zA-Z0-9]
This will match only beginning of a string
^\d{3}[a-zA-Z0-9]
You can also try this website: http://gskinner.com/RegExr/
It makes it very easy to create and test your regex.
Just take the stars out...
^([0-9]{3}+(([a-zA-Z])|([0-9])))?$
The stars mean zero or more of something before it. You are already using an or (|) so you want to match exactly one of the class, or one of the other, not zero or more of the class, or zero or more of the other.
Of course, it can be simplified further:
^\d{3}[a-zA-Z\d]$
Which literally means... three digits, followed by a character from either lowercase or uppercase a-z or any digit.
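For anyone testing this in Python, a minimal sketch follows. The function name `is_code` is hypothetical. I'm passing `re.ASCII` because Python's `\d` otherwise also accepts non-ASCII Unicode digits, and using `fullmatch` because `$` alone would tolerate a trailing newline.

```python
import re

# Simplified pattern from the answer; re.ASCII limits \d to 0-9.
CODE = re.compile(r"^\d{3}[a-zA-Z\d]$", re.ASCII)

def is_code(text: str) -> bool:
    # fullmatch requires the match to span the entire string.
    return CODE.fullmatch(text) is not None

for s in ["889A", "777B", "8883", "88A3", "8883A"]:
    print(s, is_code(s))
```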
I am trying to make a regular expression for consumer products models.
I have this regular expression: ([a-z]*-?[0-9]+-?[a-z]*-?){4,}
which I expect to limit this whole special string to 4 or more but what happens is that the limit is applied to only the digits.
So this example matches: E1912H while this does not: EM24A1BF although both should match.
Can you tell me what I am doing wrong or how can I make the limit to the whole special string not only the digits?
Limitations:
1- String contains at least 1 digit
2- String can contain characters
3- string can contain "-"
4- minimum length = 4
Summary of your conditions so far:
require at least 1 digit [0-9]
require at least 4 symbols {4,}
can have characters [a-zA-Z]
can have short dash [-]
The following regexp meets them all:
^(?=.*\d)([A-Za-z0-9-]+){4,}$
Note: the ^ and $ symbols mean the entire input string is validated. Alter this if that's not what you need.
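The group ([A-Za-z0-9-]+){4,} nests one quantifier inside another. As far as I can tell, a single character class with {4,} accepts the same strings and leaves less room for heavy backtracking on long inputs that fail. A Python sketch comparing the two on a few short inputs; `FLAT` is my own simplification, not part of the answer, and the samples are illustrative.

```python
import re

NESTED = re.compile(r"^(?=.*\d)([A-Za-z0-9-]+){4,}$")  # pattern from the answer
FLAT = re.compile(r"^(?=.*\d)[A-Za-z0-9-]{4,}$")       # assumed-equivalent simplification

# Hypothetical samples: valid models, no digit, too short, illegal character.
samples = ["E1912H", "EM24A1BF", "eM-24A-1BF", "ABCD", "E1", "EM 24"]

# For each sample, record (nested result, flat result).
comparison = {s: (bool(NESTED.match(s)), bool(FLAT.match(s))) for s in samples}

for s, pair in comparison.items():
    print(s, pair)
```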
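It can't match. The {4,} repeats the whole group, and every repetition needs at least one digit ([0-9]+), so the string must contain at least 4 digits. E1912H has four; EM24A1BF has only three.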
Something like this
[a-z]*-?\d+-?[a-z]*-?\d*[a-z]+
matches both of your examples and all of these:
E1912H
EM24A1BF
eM24A1BF
eM-24A-1BF
eM-24A-
eM24A-1BF
eM-24A1BF
To be sure your string meets both of your requirements (the characters' position and composition AND the length requirement), you need to use a non-consuming regular expression.
Check this out
([\w-]*\d+[\w-]*){4,}
it matches the following
32ES5200G
LE32K900
N55XT770XWAU3D
The following regex will match the range 9-11 digits: /\d{9,11}/
What is the best way to write a regex matching exactly 9 or 11 digits (excluding 10)?
I'm using the pattern attribute of an input element, so the regex must match the entire value of the input field. I want to accept any number containing 9 or 11 digits.
Well, you could try something like:
^\d{9}(\d{2})?$
This matches exactly nine digits followed by an optional extra-two-digits (i.e., 9 or 11 digits).
Alternatively,
^(\d{9}|\d{11})$
may work as well.
But remember that not everything necessarily has to be done with regular expressions. It may be just as easy to check the string matches ^\d*$ and that the string length itself is either 9 or 11 (using something like strlen, for example).
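Here is a small Python sketch that puts the two regexes and the plain length check side by side. The names are hypothetical; `fullmatch` stands in for the ^...$ anchors, and `re.ASCII` keeps `\d` to 0-9.

```python
import re

OPTIONAL_TAIL = re.compile(r"\d{9}(\d{2})?", re.ASCII)
ALTERNATION = re.compile(r"\d{9}|\d{11}", re.ASCII)

def by_length(text: str) -> bool:
    # No regex: ASCII digits only, and exactly 9 or 11 of them.
    return text.isascii() and text.isdigit() and len(text) in (9, 11)

def checks(text: str) -> tuple:
    # Results of the three approaches, in the order they appear above.
    return (
        OPTIONAL_TAIL.fullmatch(text) is not None,
        ALTERNATION.fullmatch(text) is not None,
        by_length(text),
    )

for n in range(8, 13):
    print(n, checks("1" * n))
```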
This regex would do
^(\d{9}|\d{11})$
or, if you don't want to match the whole string exactly:
\D(\d{9}|\d{11})\D
/[^\d](\d{9}|\d{11})[^\d]/
Depending on which tool you are using, you may need to escape the (, | and ) characters.
Note that in order to not match 8, or any other number other than 9 or 11, the regex must be bounded with something to indicate that the match is surrounded by non-digit characters. Ideally, it would be some kind of word-boundary character, but the syntax for that would vary depending on the tool.
/\b(\d{9}|\d{11})\b/
works with some tools. What are you working with?
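A caveat on the unanchored variants: \D needs an actual non-digit character on each side, so it misses a number at the very start or end of the input, and \b finds no boundary when the number touches a letter or underscore. Where lookarounds are supported, they are one alternative. The sketch below is Python; the `LOOKAROUNDS` pattern and the `find` helper are my own additions, not from the answers above.

```python
import re

NON_DIGIT_NEIGHBOURS = re.compile(r"\D(\d{9}|\d{11})\D")
WORD_BOUNDARIES = re.compile(r"\b(\d{9}|\d{11})\b")
LOOKAROUNDS = re.compile(r"(?<!\d)(\d{9}|\d{11})(?!\d)")  # swapped-in alternative

def find(pattern, text):
    # Return the captured digit run, or None when nothing matches.
    m = pattern.search(text)
    return m.group(1) if m else None

# Hypothetical inputs: bare number, 10 digits in text, 11 digits glued to letters.
for text in ["123456789", "id 1234567890 x", "ref12345678901"]:
    print(repr(text),
          find(NON_DIGIT_NEIGHBOURS, text),
          find(WORD_BOUNDARIES, text),
          find(LOOKAROUNDS, text))
```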
I have a barcode of the format 123456########. That is, the first 6 digits are always the same followed by 8 digits.
How would I check that a variable matches that format?
You haven't specified a language, but regexp. syntax is relatively uniform across implementations, so something like the following should work: 123456\d{8}
\d indicates a numeric character and is typically equivalent to the set [0-9].
{8} indicates repetition of the preceding character set precisely eight times.
Depending on how the input is coming in, you may want to anchor the regexp. thusly:
^123456\d{8}$
Where ^ matches the beginning of the line or string and $ matches the end. Alternatively, you may wish to use word boundaries, to ensure that your bar-code strings are properly separated:
\b123456\d{8}\b
Where \b matches the empty string but only at the edges of a word (normally defined as a sequence consisting exclusively of alphanumeric characters plus the underscore, but this can be locale-dependent).
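Since no language was specified, here is a hedged Python sketch of both variants: the anchored form for validating a single variable, and the word-boundary form for pulling barcodes out of longer text. The function names and the sample text are made up for illustration.

```python
import re

WHOLE = re.compile(r"^123456\d{8}$")       # the variable must be exactly one barcode
EMBEDDED = re.compile(r"\b123456\d{8}\b")  # barcodes inside surrounding text

def is_barcode(value: str) -> bool:
    return WHOLE.match(value) is not None

def find_barcodes(text: str) -> list:
    # findall returns the full matches because the pattern has no groups.
    return EMBEDDED.findall(text)

print(is_barcode("12345600000001"))
print(find_barcodes("scan 12345612345678, then 12345687654321."))
```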
123456\d{8}
123456 # Literals
\d # Match a digit
{8} # 8 times
You can change the {8} to any number of digits depending on how many are after your static ones.
Regexr will let you try out the regex.
123456\d{8}
should do it. This breaks down to:
123456 - the fixed bit. Obviously, substitute this for whatever your fixed bit is, and remember to escape any regex special characters in it, although with just numbers you should be fine
\d - a digit
{8} - the number of times the previous element must be repeated, 8 in this case.
The braces can also take two numbers if you have a minimum and a maximum, so you could write {6,8} if the previous element had to be repeated between 6 and 8 times.
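If the fixed part or the digit count varies, the pattern can be assembled at runtime. A minimal Python sketch; `barcode_pattern` and its parameters are hypothetical names. `re.escape` takes care of any special characters in the prefix, and the result is meant to be used with `fullmatch`.

```python
import re

def barcode_pattern(prefix: str, min_digits: int, max_digits: int = None):
    """Compile: literal prefix followed by a bounded run of ASCII digits."""
    if max_digits is None:
        count = f"{{{min_digits}}}"                # e.g. {8}
    else:
        count = f"{{{min_digits},{max_digits}}}"   # e.g. {6,8}
    return re.compile(re.escape(prefix) + "[0-9]" + count)

exact = barcode_pattern("123456", 8)
ranged = barcode_pattern("123456", 6, 8)

print(bool(exact.fullmatch("12345612345678")))
print(bool(ranged.fullmatch("1234561234567")))
```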
The way you describe it, it's just
^123456[0-9]{8}$
...where you'd replace 123456 with your 6 known digits. I'm using [0-9] instead of \d because I don't know what flavor of regex you're using, and \d allows non-Arabic numerals in some flavors (if that concerns you).
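Python 3 is one flavor where this matters: for str patterns, `\d` matches any Unicode decimal digit unless `re.ASCII` is given. A short sketch to illustrate; the test string built from U+0661 (ARABIC-INDIC DIGIT ONE) is my own example.

```python
import re

# "123456" followed by eight ARABIC-INDIC DIGIT ONE characters (U+0661).
candidate = "123456" + "\u0661" * 8

unicode_d = re.fullmatch(r"123456\d{8}", candidate)             # \d, default Unicode mode
ascii_class = re.fullmatch(r"123456[0-9]{8}", candidate)        # explicit [0-9]
ascii_flag = re.fullmatch(r"123456\d{8}", candidate, re.ASCII)  # \d limited to 0-9

print(unicode_d is not None, ascii_class is not None, ascii_flag is not None)
```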