Regex exactly 4 Letters 3 Digits any order - regex

I am trying to write a regex that handles these cases:
Correct:
IUG4455
I4UG455
A4U345A
Wrong:
IUGG453
IIUG44555
The string needs to have exactly 4 letters (in any order) and exactly 3 digits (in any order).
I tried this expression:
[A-Z]{3}\\d{4}
but it only accepts strings that start with the letters, followed by the digits.

You have a couple of options for this:
Option 1:
\b(?=(?:\d*[A-Z]){3})(?=(?:[A-Z]*\d){4})[A-Z\d]{7}\b
\b Assert position as a word boundary
(?=(?:\d*[A-Z]){3}) Positive lookahead ensuring the following matches
(?:\d*[A-Z]){3} Match the following exactly 3 times
\d* Match any digit any number of times
[A-Z] Match any uppercase ASCII character
(?=(?:[A-Z]*\d){4}) Positive lookahead ensuring the following matches
(?:[A-Z]*\d){4} Match the following exactly 4 times
[A-Z]* Match any uppercase ASCII character any number of times
\d Match any digit
[A-Z\d]{7} Match any digit or uppercase ASCII character exactly 7 times
\b Assert position as a word boundary
If speed needs to be taken into consideration, you can expand the above option and use the following:
\b(?=\d*[A-Z]\d*[A-Z]\d*[A-Z])(?=[A-Z]*\d[A-Z]*\d[A-Z]*\d[A-Z]*\d)[A-Z\d]{7}\b
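As a quick check, Option 1 can be run against the examples from the question. This is only a minimal Python sketch: the helper name is_plate is made up for illustration, and re.ASCII is an added assumption so that \d only matches 0-9.

```python
import re

# Option 1: 7 characters in total, at least 3 letters and at least 4 digits,
# which together force exactly 3 letters and exactly 4 digits.
OPTION_1 = re.compile(
    r"\b(?=(?:\d*[A-Z]){3})(?=(?:[A-Z]*\d){4})[A-Z\d]{7}\b",
    re.ASCII,  # assumption: restrict \d and \b to ASCII
)


def is_plate(candidate: str) -> bool:
    """Hypothetical helper: True if the whole string satisfies Option 1."""
    return OPTION_1.fullmatch(candidate) is not None


for s in ["IUG4455", "I4UG455", "A4U345A", "IUGG453", "IIUG44555"]:
    print(s, is_plate(s))
```

The first three strings should be accepted and the last two rejected, matching the "Correct" and "Wrong" lists in the question.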
Option 2:
\b(?=(?:\d*[A-Z]){3}(?!\d*[A-Z]))(?=(?:[A-Z]*\d){4}(?![A-Z]*\d))[A-Z\d]+\b
Similar to Option 1, but uses negative lookahead to ensure an extra character (uppercase ASCII letter or digit) doesn't exist in the string.
Two positive lookaheads back-to-back act as a logical AND: both subpatterns must be satisfied starting at the same position. Since you have two conditions (3 uppercase ASCII letters and 4 digits), you need two lookaheads.
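To see Option 2 pick tokens out of a longer text, here is a small Python sketch. The sample text is just the question's examples joined with spaces, and re.ASCII is an assumption added to keep \d and \b ASCII-only.

```python
import re

OPTION_2 = re.compile(
    r"\b(?=(?:\d*[A-Z]){3}(?!\d*[A-Z]))"  # 3 letters, and no 4th letter after them
    r"(?=(?:[A-Z]*\d){4}(?![A-Z]*\d))"    # 4 digits, and no 5th digit after them
    r"[A-Z\d]+\b",
    re.ASCII,  # assumption: restrict \d and \b to ASCII
)

text = "IUG4455 IUGG453 I4UG455 IIUG44555 A4U345A"
matches = OPTION_2.findall(text)
print(matches)  # ['IUG4455', 'I4UG455', 'A4U345A']
```

Because the pattern contains only non-capturing groups, findall returns the full matches.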

As an alternative,
(?:(?<d>\d)|(?<c>[A-Z])){7}(?<-d>){3}(?<-c>){4}
doesn't require any lookarounds, but it relies on balancing groups, which are specific to the .NET regex engine. It matches seven letters or digits and then checks that it found 3 digits and 4 letters.
Adjust the 3 and 4 to taste... your examples have 4 digits and 3 letters.
Also add word boundaries or anchors depending on whether you are trying to match whole words or a whole string.
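Python's built-in re module has no balancing groups, so the same "match, then check the counts" idea has to be written differently there. The sketch below swaps the regex for plain character counting; has_counts and its parameters are hypothetical names, not part of any library.

```python
def has_counts(candidate: str, letters: int = 4, digits: int = 3) -> bool:
    """Hypothetical helper: exactly `letters` A-Z and `digits` 0-9, in any order."""
    n_letters = sum("A" <= ch <= "Z" for ch in candidate)
    n_digits = sum("0" <= ch <= "9" for ch in candidate)
    # Every character must be a letter or a digit, with the exact counts.
    return (
        n_letters == letters
        and n_digits == digits
        and n_letters + n_digits == len(candidate)
    )


print(has_counts("IUGG453"))                       # 4 letters, 3 digits
print(has_counts("IUG4455", letters=3, digits=4))  # 3 letters, 4 digits
```

As with the regex, the two counts are parameters, so the version stated in the question (4 letters, 3 digits) and the version shown in its examples (3 letters, 4 digits) are both covered.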

Related

Python regex for sequence containing at least two digits/letters

using the Python module re, I would like to detect sequences that contain at least two letters (A-Z) and at least two digits (0-9) from a text, e.g., from the text
"N03FZ467 other text N03671"
precisely the sub-string "N03FZ467" shall be matched.
The best I have got so far is
(?=[A-Z]*\d)[A-Z0-9]{4,}
which detects sequences of length at least 4 that contain only letters A-Z and digits 0-9, and at least one digit and one letter.
How can I make sure I respectively get at least two?
If you want to match full words, start matching at word boundaries \b.
Check the first condition (two upper) by a lookahead: (?=(?:\d*[A-Z]){2})
If this succeeds, match the second requirement, two digits: (?:[A-Z]*\d){2}
Finally match any remaining [A-Z\d]* until another \b.
Putting it together:
\b(?=(?:\d*[A-Z]){2})(?:[A-Z]*\d){2}[A-Z\d]*\b
Note that a lookahead is a zero-length assertion; it does not consume characters. If you don't specify a starting point such as \b, the lookahead will be attempted at every position, which is less efficient.
Also note that the minimum length of four is implied by the two requirements.
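A minimal Python sketch of the pattern above, run on the text from the question. re.ASCII is an assumption added so that \d and \b stay ASCII-only.

```python
import re

pattern = re.compile(
    r"\b(?=(?:\d*[A-Z]){2})(?:[A-Z]*\d){2}[A-Z\d]*\b",
    re.ASCII,  # assumption: restrict \d and \b to ASCII
)

text = "N03FZ467 other text N03671"
found = pattern.findall(text)
print(found)  # ['N03FZ467']
```

"N03671" is skipped because the lookahead cannot find a second uppercase letter in it.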
Use lookaheads, one for each requirement:
^(?=(.*\d){2})(?=(.*[A-Z]){2}).*
Regex breakdown:
(?=(.*\d){2}) is "2 digits somewhere ahead"
(?=(.*[A-Z]){2}) is "2 letters somewhere ahead"
The more efficient version:
^(?=(?:.*?\d){2})(?=(?:.*?[A-Z]){2}).*
It's more efficient because it doesn't capture (uses non-capturing groups (?:...)) and it uses the reluctant quantifier .*? which matches as early as possible in the input, whereas .* will scan ahead to the end then backtrack to find a match.
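Note that this pattern is anchored with ^ and ends in .*, so it validates a whole string rather than extracting a token from it. A short Python sketch of that behaviour (the sample strings come from the question):

```python
import re

whole_string = re.compile(r"^(?=(?:.*?\d){2})(?=(?:.*?[A-Z]){2}).*")

for s in ["N03FZ467", "N03671", "N03FZ467 other text N03671"]:
    m = whole_string.match(s)
    print(repr(s), "->", m.group(0) if m else None)
```

The third sample is matched in full, because the two digits and two letters may appear anywhere in the string.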
If you only want to match chars A-Z and 0-9, you can use a single lookahead (if supported) to make sure 2 digits are present, and then match 2 letters A-Z while consuming the string. Note that, as written, the pattern requires the two digits to be adjacent and the two letters to be adjacent.
Since 2 chars are asserted and 2 other chars are matched, the length is automatically at least 4.
\b(?=[A-Z\d]*\d\d)[A-Z\d]*[A-Z]{2}[A-Z\d]*\b
Explanation
\b A word boundary to prevent a partial word match
(?=[A-Z\d]*\d\d) Positive lookahead, assert 2 digits to the right
[A-Z\d]* Match optional chars A-Z or digits
[A-Z]{2} Match 2 uppercase chars A-Z
[A-Z\d]* Match optional chars A-Z or digits
\b A word boundary
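A small Python sketch of this single-lookahead pattern. It also shows a limitation worth knowing about: \d\d and [A-Z]{2} each need two adjacent characters, so a token such as "A1B2" (my own test input, not from the question) is rejected even though it contains two letters and two digits.

```python
import re

single_lookahead = re.compile(
    r"\b(?=[A-Z\d]*\d\d)[A-Z\d]*[A-Z]{2}[A-Z\d]*\b",
    re.ASCII,  # assumption: restrict \d and \b to ASCII
)

for token in ["N03FZ467", "N03671", "A1B2"]:
    print(token, bool(single_lookahead.fullmatch(token)))
```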
I would enhance the given answer like this:
(?=\b(?:\D+\d+){2}\b)(?=\b(?:[^a-z]+[a-z]+){2}\b)\S+
This contains two lookaheads, each validating one rule:
(?=\b(?:\D+\d+){2}\b) - a lookahead asserting that what follows is a word boundary \b, then non-digits followed by digits (\D+\d+), repeated to ensure we have at least two such groups, then a word boundary again, to be sure we are within one "word".
The other lookahead is the same, but instead of digits and non-digits it uses letters [a-z] and non-letters [^a-z]: (?=\b(?:[^a-z]+[a-z]+){2}\b)
At the end, we match the whole 'word' with \S+, which matches all non-whitespace characters (since we asserted our 'word' earlier, this is sufficient).

How to write regex to prevent passwords with consecutive characters?

I have to validate a password that must have at least 3 capital letters, 3 lower case letters, 2 digits, at least one of those characters (!##$*), and, the trickiest one, it cannot have the same character twice in a row. For example, "beer" is not allowed.
This is what I have so far, but it doesn't do much:
(?=[0-9]{2})&(?=[a-z]{3})&(?=[A-Z]{3})&(?=[!##$*])&(?:(?!.([a-z]|[0-9]|[A-Z]|[!##$*]{2})))
You may use the following pattern:
^(?=(?:.*[A-Z]){3})(?=(?:.*[a-z]){3})(?=(?:.*[0-9]){2})(?=.*[!##$*])(?!.*(.)\1).*$
Breakdown:
^ - Beginning of string.
(?=(?:.*[A-Z]){3}) - A positive Lookahead to assert at least 3 capital letters.
(?=(?:.*[a-z]){3}) - A positive Lookahead to assert at least 3 lowercase letters.
(?=(?:.*[0-9]){2}) - A positive Lookahead to assert at least 2 digits.
(?=.*[!##$*]) - A positive Lookahead to assert at least one symbol.
(?!.*(.)\1) - A negative Lookahead to reject the same character repeated twice in a row.
.*$ - Match zero or more characters until the end of the string.
Note: If you want to prevent the user from using any additional characters, you may replace the final .* part with:
[A-Za-z0-9!##$*]*
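Here is a hedged Python sketch of the pattern above, split across lines so each rule is visible. The symbol class is copied verbatim from the answer; is_acceptable and the sample passwords are made up for illustration.

```python
import re

PASSWORD = re.compile(
    r"^(?=(?:.*[A-Z]){3})"  # at least 3 uppercase letters
    r"(?=(?:.*[a-z]){3})"   # at least 3 lowercase letters
    r"(?=(?:.*[0-9]){2})"   # at least 2 digits
    r"(?=.*[!##$*])"        # at least one symbol (class copied from the answer)
    r"(?!.*(.)\1)"          # no character twice in a row
    r".*$"
)


def is_acceptable(password: str) -> bool:
    """Hypothetical helper: True if the password satisfies every rule."""
    return PASSWORD.match(password) is not None


for pw in ["AbCdEf1!2", "AbCdEf1!22", "AbCdEf12", "beerABC1!2"]:
    print(pw, is_acceptable(pw))
```

Only the first sample should pass: the second repeats "2", the third has no symbol, and the fourth contains "ee".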

regex to match exactly 16 consecutive digits in a potentially larger string [duplicate]

I need a regular expression that will match exactly 16 consecutive digits, no more or less, regardless of what, if anything, is surrounding it. I've gone through a couple iterations but all have had issues:
\d{16} will match any 16 consecutive digits, including 16 digits within a longer string of digits
^\d{16}$ will match a line that is exactly 16 consecutive digits, but if there is anything else in the string, the match will fail
\D\d{16}\D will match a string of 16 consecutive digits, but only if it is surrounded by non-digit characters. If the string of 16 digits is alone on the line, it fails
\D?\d{16}\D? will match a longer string of consecutive digits
[\D^]\d{16}[\D$] does not treat ^ and $ as their special meanings, but rather treats them as literal characters.
How can I create the regex I need?
Edit: These are PCRE regex
You can use lookaround
(?<!\d)\d{16}(?!\d)
(?<!\d) - Match should not be preceded by digit
\d{16} - Match digits (0 - 9) 16 times
(?!\d) - Match should not be followed by digit
This is close to what you want: \D\d{16}\D, except, as you noted, it also grabs the non-digits surrounding the 16-digit sequence. Modify it with a lookbehind and a lookahead so that the non-digit checks act as anchors without being included in the match:
(?<!\d)\d{16}(?!\d)
(?<=\D|^)\d{16}(?=\D|$)
The key here is positive lookarounds. These can verify non-digit characters without consuming them.
(?<=\D|^) Ensure that behind the match is either a non-digit character or the start of the string
\d{16} Match exactly 16 digits
(?=\D|$) Ensure that following the match is either a non-digit character or the end of the string
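The question is about PCRE, but the behaviour is easy to check in Python as well. This sketch uses the negative-lookaround form (?<!\d)\d{16}(?!\d), because Python's built-in re module only accepts fixed-width lookbehinds and so would reject (?<=\D|^). The sample strings are my own.

```python
import re

EXACTLY_16 = re.compile(r"(?<!\d)\d{16}(?!\d)")

samples = [
    "1234567890123456",         # alone on the line
    "card: 1234567890123456.",  # surrounded by non-digits
    "12345678901234567",        # 17 digits: too long
    "123456789012345",          # 15 digits: too short
]
for s in samples:
    print(repr(s), EXACTLY_16.findall(s))
```

Only the first two samples should yield a match, and the match is the 16 digits alone, without the surrounding characters.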

Regexp struggling

I am trying to match a string (length =4) with lower case letters and digits. That could be 4 digits but not 4 letters. For example I want to match:
d4rt
df5h
34d6
4567
But not 'erty'.
I came up with the pattern ([a-z]+|[0-9]+){4}, but it still matches the 4-letter case.
Your regex ([a-z]+|[0-9]+){4} uses an alternation which will either match 1+ lowercase characters or 1+ digits in a capturing group and repeat that 4 times. That would also match 4 letters.
If lookarounds are supported, you could use a negative lookahead to assert that what follows are not 4 lowercase characters.
To match a string with length of 4, you could use anchors to assert the start ^ and the end $ of the string.
^(?![a-z]{4})[a-z0-9]{4}$
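A quick Python check of this pattern against the strings from the question (a minimal sketch; the variable name is arbitrary):

```python
import re

NOT_ALL_LETTERS = re.compile(r"^(?![a-z]{4})[a-z0-9]{4}$")

for s in ["d4rt", "df5h", "34d6", "4567", "erty"]:
    print(s, bool(NOT_ALL_LETTERS.match(s)))
```

The first four strings should match, and "erty" should be rejected by the negative lookahead.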
Your expression repeats, four times {4}, a group that matches either one or more lowercase letters [a-z]+ or one or more digits [0-9]+. Therefore, it also matches strings with more than 4 letters or digits.
Your problem can be solved with lookaheads.
(?=[a-z]{0,3}[0-9])[a-z0-9]{4}
(?=[a-z]{0,3}[0-9]) looks ahead for up to three letters followed by a digit, so a digit must occur within the next four characters. When it finds such a sequence, matching continues from where the lookahead started. Think of it as a sort of "pre-match".
[a-z0-9]{4} This part matches four digits or lowercase letters, but we are already sure that there is at least one digit among them because of the lookahead.
Since your requirement says the string should contain at least one digit, and the rest can be digits or lowercase letters for a total of exactly 4 characters, you can use this regex:
^(?=.*\d)[a-z0-9]{4}$
Explanation:
^ --> Start of input
(?=.*\d) --> Look ahead to ensure the input contains at least one digit
[a-z0-9]{4} --> Match exactly four characters, each a lowercase letter or a digit
$ --> End of input
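"At least one digit" and "not four letters" are the same condition once the string is limited to four characters from [a-z0-9]. The Python sketch below checks that this pattern and the negative-lookahead pattern ^(?![a-z]{4})[a-z0-9]{4}$ from the earlier answer agree on every 4-character string over a small alphabet. The alphabet "ab12" is an arbitrary choice to keep the search tiny.

```python
import re
from itertools import product

at_least_one_digit = re.compile(r"^(?=.*\d)[a-z0-9]{4}$")
not_all_letters = re.compile(r"^(?![a-z]{4})[a-z0-9]{4}$")

# Small illustrative alphabet (an assumption, to keep the search tiny).
alphabet = "ab12"
candidates = ["".join(chars) for chars in product(alphabet, repeat=4)]

# Collect every string on which the two patterns disagree.
disagreements = [
    s
    for s in candidates
    if bool(at_least_one_digit.match(s)) != bool(not_all_letters.match(s))
]
print(len(candidates), disagreements)
```

An empty list of disagreements means the two patterns accept exactly the same strings in this sample.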

Regex pattern - matching capital letters with combination of numbers and a hyphen

I need help with a regular expression. I have some strings with these possible forms. The string will always start with a capital letter followed by 3 digits, then a hyphen, followed by more digits:
A012-123
B001-012
C023-456
I've tried [A-Z0-9]-[0-9] and I can't get a match. Can someone show me how to construct this correctly?
[A-Z][0-9]{3}-[0-9]{3}
The {3} means "match exactly three times". This will match any string which starts with a capital letter, followed by 3 digits, a hyphen and 3 digits.
But if the number of digits after - can be anything, then you can use
[A-Z][0-9]{3}-[0-9]+
This matches any string which starts with a capital letter, followed by 3 digits, a hyphen and one or more digits.
Note: Instead of writing [0-9], you can use \d. For ASCII input they behave the same, though in some engines \d also matches non-ASCII digits. So your first regex becomes
[A-Z]\d{3}-\d{3}
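A short Python sketch comparing the fixed-length and open-ended variants. fullmatch is used so that the whole string has to fit the pattern, and re.ASCII is an assumption added so that \d means exactly [0-9]. The last two sample strings are my own.

```python
import re

FIXED = re.compile(r"[A-Z]\d{3}-\d{3}", re.ASCII)  # exactly 3 digits after the hyphen
OPEN_ENDED = re.compile(r"[A-Z][0-9]{3}-[0-9]+")   # one or more digits after the hyphen

for code in ["A012-123", "B001-012", "C023-456", "A012-1", "a012-123"]:
    print(code, bool(FIXED.fullmatch(code)), bool(OPEN_ENDED.fullmatch(code)))
```

"A012-1" only fits the open-ended variant, and "a012-123" fits neither because it starts with a lowercase letter.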