Single regex for complex password validation [duplicate] - regex

This question already has answers here:
Regexp Java for password validation
(17 answers)
Closed 8 years ago.
I have to validate passwords so that they meet these rules:
A) The password must contain characters from 3 of the following 4 classes:
English Upper Case Letters A, B, C, ... Z
English Lower Case Letters a, b, c, ... z
Westernised Arabic Numerals 0, 1, 2, ... 9
Non-alphanumeric (“special characters”)
For example, punctuation, symbols.
{},.<>;:'?/|`~!@#$%^&*()_-+= space
B) The password must be at least 8 characters long.
Can this be done in a single regex? What would that regex be?

This task isn't suitable for doing with a regular expression.
It can be done in a regular expression, but it'd be so convoluted and complicated that you're better off doing the check in some other way.
Just because something can be done with regular expressions doesn't mean it's a good idea.

I don't think a complicated regular expression is something to use at all costs. In this case, a simple method with four booleans will be easier to write, easier to read and probably also faster.
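A minimal sketch of the four-boolean approach in Python, assuming "special" means any character outside A-Z, a-z and 0-9. The function names and the default thresholds are illustrative, not taken from the question.

```python
import string

ALPHANUMERIC = string.ascii_letters + string.digits


def count_classes(password: str) -> int:
    """Count how many of the four character classes appear at least once."""
    has_upper = any(c in string.ascii_uppercase for c in password)
    has_lower = any(c in string.ascii_lowercase for c in password)
    has_digit = any(c in string.digits for c in password)
    # Assumption: everything that is not an ASCII letter or digit is "special".
    has_special = any(c not in ALPHANUMERIC for c in password)
    return sum([has_upper, has_lower, has_digit, has_special])


def is_valid(password: str, min_length: int = 8, min_classes: int = 3) -> bool:
    """Rule A (3 of 4 classes) and rule B (minimum length) from the question."""
    return len(password) >= min_length and count_classes(password) >= min_classes
```

Each boolean can be inspected or reported on its own, which a single regex does not allow.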

You could check that it is:
not purely letters and digits (this is slightly more aggressive than your conditions say);
not purely lowercase letters and special characters.
A single regular expression to check this would be something like
(?![A-Za-z0-9]+$|[a-z{},.<>;:'?/|`~!@#$%^&*()_+= -]+$).{8,}
I intentionally ignored your exact specification. In particular, I did not want to allow Pass1234, and I don't think it makes sense to set a maximum length, and I did not restrict the set of allowed characters at all (i.e. there are minimum requirements, but you can go wild and use control characters or accented characters if you like). These things are easy enough to fix if you disagree.
To strictly implement your spec, you could check that the password does not consist of purely any two groups; so not all upper and lower case, and not all lowercase and numbers, and not all uppercase and numbers, and not all numbers and specials, and not all lowercase and specials, and not all uppercase and specials, but again, this is somewhat tedious and IMHO counter-productive.
You are not saying which regex flavor you are using. I have assumed you have the Perl negative lookahead (?!...) at your disposal. This is significantly harder if you are restricted to traditional BRE or ERE syntax.
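The strict variant described above (reject any password drawn from only two of the four groups) can be sketched in Python as follows. This assumes "special" means everything outside A-Z, a-z and 0-9, so that each pair of groups can be written as a single character class. The names PAIRS, STRICT and is_valid are illustrative.

```python
import re

# One character class per pair of groups. A negated class covers
# "special plus one other group", e.g. [^a-z0-9] is upper + special.
PAIRS = [
    "A-Za-z",   # upper + lower
    "A-Z0-9",   # upper + digit
    "a-z0-9",   # lower + digit
    "^a-z0-9",  # upper + special
    "^A-Z0-9",  # lower + special
    "^A-Za-z",  # digit + special
]

# Negative lookahead: the whole password must not fit inside any single pair.
STRICT = re.compile(
    r"(?!(?:" + "|".join(f"[{p}]+" for p in PAIRS) + r")\Z).{8,}",
    re.DOTALL,
)


def is_valid(password: str) -> bool:
    return STRICT.fullmatch(password) is not None
```

Unlike the shorter pattern, this accepts Pass1234, because it contains upper case, lower case and digits.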

I think you can achieve a very close result with a single regular expression. Here is an example:
^((?=.*[!@#$%&,()_=/\.\-\*\+\?])[A-Za-z0-9!@#$%&,()_=/\.\-\*\+\?]{8,20})$
This says:
At least 1 special character
Can contain alphanumeric characters
Is between 8 and 20 characters long

Related

Activate "char_classes" in boost regex library [duplicate]

How do I create a regular expression that detects hexadecimal numbers in a text?
For example, ‘0x0f4’, ‘0acdadecf822eeff32aca5830e438cb54aa722e3’, and ‘8BADF00D’.
How about the following?
0[xX][0-9a-fA-F]+
Matches an expression starting with a 0, followed by either a lowercase or uppercase x, followed by one or more characters in the ranges 0-9, a-f, or A-F.
The exact syntax depends on your exact requirements and programming language, but basically:
/[0-9a-fA-F]+/
Or, more simply, with the i flag to make it case-insensitive:
/[0-9a-f]+/i
If you are lucky enough to be using Ruby, you can do:
/\h+/
EDIT - Steven Schroeder's answer made me realise my understanding of the 0x bit was wrong, so I've updated my suggestions accordingly.
If you also want to match 0x, the equivalents are
/0[xX][0-9a-fA-F]+/
/0x[0-9a-f]+/i
/0x[\h]+/i
ADDED MORE - If 0x needs to be optional (as the question implies):
/(0x)?[0-9a-f]+/i
Not a big deal, but many regex engines (though not all; JavaScript, for example, does not) support the POSIX character classes, and there's [:xdigit:] for matching hex characters, which is simpler than the common 0-9a-fA-F stuff.
So, the regex as requested (ie. with optional 0x) is: /(0x)?[[:xdigit:]]+/
It's worth mentioning that detecting an MD5 (which is one of the examples) can be done with:
[0-9a-fA-F]{32}
This will match with or without 0x prefix
(?:0[xX])?[0-9a-fA-F]+
If you're using Perl or PHP, you can replace
[0-9a-fA-F]
with:
[[:xdigit:]]
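As a rough illustration, the optional-prefix pattern can be used to pull hex tokens out of running text. This Python sketch wraps it in \b word boundaries, which is an addition not present in the answer above; find_hex is a hypothetical helper name.

```python
import re

# Optional 0x/0X prefix, then one or more hex digits, bounded by word breaks.
HEX_TOKEN = re.compile(r"\b(?:0[xX])?[0-9a-fA-F]+\b")


def find_hex(text: str) -> list:
    """Return every token in text that looks like a hexadecimal number."""
    return HEX_TOKEN.findall(text)


sample = "flags 0x0f4, id 8BADF00D, sum 0acdadecf822eeff32aca5830e438cb54aa722e3"
print(find_hex(sample))
```

Note that plain words made only of the letters a-f (such as "add") and plain decimal numbers also match, since they are valid hex digit strings.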
Just for the record I would specify the following:
/^[xX]?[0-9a-fA-F]{6}$/
It differs in that the string must contain exactly six valid hex characters, optionally preceded by a lowercase or uppercase x.
Another example: hexadecimal values for CSS colors start with a pound sign, or hash (#), followed by six characters that can each be a digit or a letter between A and F, inclusive.
^#[0-9a-fA-F]{6}
If you are looking for a specific character in the middle of the string, you can use "\xhh", where hh is the character's code in hexadecimal. I've tried it and it works. I use the Qt framework for C++, but it can solve problems in other cases too, depending on the flavor you need to use (PHP, JavaScript, Python, Golang, etc.).
This answer was taken from: http://ult-tex.net/info/perl/
This one matches exactly three pairs of valid characters:
(([a-fA-F]|[0-9]){2}){3}
When the pattern is anchored (for example with ^ and $), any more or fewer than three pairs of valid characters fail to match.
In Java this is allowed:
(?:0x?)?[\p{XDigit}]+$
As you see the 0x is optional (even the x is optional) in a non-capturing group.
In case you need this within an input where the user can type 0 and 0x too but not a hex number without the 0x prefix:
^0?[xX]?[0-9a-fA-F]*$
First, instead of ^ and $, use \b, as this is a word boundary and helps when the hash is not the only string on the line.
I came here looking for a similar but more specialized regex and came up with this:
\b(\d+[a-f]+\d+[\da-f]*|[a-f]+\d+[a-f]+[\da-f]*)\b
I needed to detect hashes like git commit identifiers (and similar) in console output, and more than matching all possible hashes I prioritize NOT matching random words or numbers like EB or 12345678.
So the heuristic I use is to assume that a hash alternates between digits and letters reasonably often, and that runs of only digits or only letters are short.
Another important fact is that an MD5 hash is 32 characters long (as mentioned by @Adaddinsane) and git displays a shortened version with only 10 characters, so the example above can be modified as follows.
For 10-character hashes I assume the groups will be at most 3 characters long:
\b(\d+[a-f]+\d+[\da-f]{1,7}|[a-f]+\d+[a-f]+[\da-f]{1,7})\b
For hashes of up to 32 characters I assume the groups will be at most 5 characters long:
\b(\d+[a-f]+\d+[\da-f]{17,29}|[a-f]+\d+[a-f]+[\da-f]{17,29})\b
You can easily change a-f to a-fA-F for case insensitivity, or add 0[xX] at the front to match the 0x prefix.
These examples will obviously not match exotic but valid hashes that start with very long sequences of only digits or only letters, or extreme hashes such as all 0s.
But this way I can match hashes and significantly reduce accidental false-positive matches, like a directory name or a line number.
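To see the heuristic at work, here is a small Python sketch that runs the first (unbounded) pattern over a made-up console line. The sample text and the find_hashes name are invented for illustration.

```python
import re

# Digits-letters-digits or letters-digits-letters at the start, then any hex.
HASH_LIKE = re.compile(r"\b(\d+[a-f]+\d+[\da-f]*|[a-f]+\d+[a-f]+[\da-f]*)\b")


def find_hashes(text: str) -> list:
    """Return tokens that alternate between digits and a-f letters early on."""
    return [m.group(0) for m in HASH_LIKE.finditer(text)]


line = "commit 3f2a91c0de fixed line 12345678 in dir cafe"
print(find_hashes(line))
```

The pure number 12345678 and the pure-letter word cafe are both skipped, because neither alternates between digits and letters.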

Regex to have two out of three character types [duplicate]

My client has requested that passwords on their system must follow a specific set of validation rules, and I'm having great difficulty coming up with a "nice" regular expression.
The rules I have been given are...
Minimum of 8 characters
Allow any character
Must have at least one instance from three of the four following character types...
Upper case character
Lower case character
Numeric digit
"Special Character"
When I pressed more, "Special Characters" are literally everything else (including spaces).
I can easily check for at least one instance for all four, using the following...
^(?=.*?[A-Z])(?=.*?[a-z])(?=.*?\d)(?=.*?[^a-zA-Z0-9]).{8,}$
The following works, but it's horrible and messy...
^((?=.*?[A-Z])(?=.*?[a-z])(?=.*?\d)|(?=.*?[A-Z])(?=.*?[a-z])(?=.*?[^a-zA-Z0-9])|(?=.*?[A-Z])(?=.*?\d)(?=.*?[^a-zA-Z0-9])|(?=.*?[a-z])(?=.*?\d)(?=.*?[^a-zA-Z0-9])).{8,}$
So you don't have to work it out yourself, the above is checking for (1,2,3|1,2,4|1,3,4|2,3,4) which are the 4 possible combinations of the 4 groups (where the number relates to the "types" in the set of rules).
Is there a "nicer", cleaner or easier way of doing this?
(Please note, this is going to be used in an <asp:RegularExpressionValidator> control in an ASP.NET website, so therefore needs to be a valid regex for both .NET and javascript.)
It's not much of a better solution, but you can reduce [^a-zA-Z0-9] to [\W_], since a word character is any letter, digit or the underscore character. I don't think you can avoid the alternation when trying to do this in a single regex. I think you pretty much have the best solution.
One slight optimization is that \d*[a-z]\w_*|\d*[A-Z]\w_* ~> \d*[a-zA-Z]\w_*, so I could remove one of the alternation sets. If you allowed exactly 3 out of 4 (and rejected all 4) this wouldn't work, but since \d*[A-Z][a-z]\w_* was implicitly allowed it works.
(?=.{8,})((?=.*\d)(?=.*[a-z])(?=.*[A-Z])|(?=.*\d)(?=.*[a-zA-Z])(?=.*[\W_])|(?=.*[a-z])(?=.*[A-Z])(?=.*[\W_])).*
Extended version:
(?=.{8,})(
(?=.*\d)(?=.*[a-z])(?=.*[A-Z])|
(?=.*\d)(?=.*[a-zA-Z])(?=.*[\W_])|
(?=.*[a-z])(?=.*[A-Z])(?=.*[\W_])
).*
Because of the fourth condition specified by the OP, this regular expression treats even unprintable characters as special characters. If this is unacceptable, modify the set that contains \W to allow a more specific set of special characters.
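If the alternation has to stay, it can at least be generated rather than typed by hand. The following Python sketch builds the question's "horrible and messy" form from the four lookaheads with itertools.combinations. It only illustrates the structure; the target engines in the question are .NET and JavaScript, and [0-9] is used here in place of \d as an assumption to keep digits ASCII-only.

```python
import re
from itertools import combinations

# One lookahead per character type, in the question's order (1, 2, 3, 4).
LOOKAHEADS = [
    r"(?=.*?[A-Z])",
    r"(?=.*?[a-z])",
    r"(?=.*?[0-9])",
    r"(?=.*?[^a-zA-Z0-9])",
]

# Every way of picking 3 of the 4 lookaheads: (1,2,3|1,2,4|1,3,4|2,3,4).
alternation = "|".join("".join(combo) for combo in combinations(LOOKAHEADS, 3))
PATTERN = re.compile(rf"^(?:{alternation}).{{8,}}$")


def is_valid(password: str) -> bool:
    return PATTERN.search(password) is not None
```

Changing the rule to "2 out of 4" then only means changing the 3 passed to combinations.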
I'd like to improve the accepted solution with this one
^(?=.{8,})(
(?=.*[^a-zA-Z\s])(?=.*[a-z])(?=.*[A-Z])|
(?=.*[^a-zA-Z0-9\s])(?=.*\d)(?=.*[a-zA-Z])
).*$
The above regex worked well for most scenarios, except for strings such as "AAAAAA1$" and "$$$$$$1a".
This may be an issue only on iOS (Objective-C and Swift), where "\d" in the regex causes problems.
The following fix worked on iOS, i.e. changing \d to [0-9] for digits:
^((?=.*?[A-Z])(?=.*?[a-z])(?=.*?[0-9])|(?=.*?[A-Z])(?=.*?[a-z])(?=.*?[^a-zA-Z0-9])|(?=.*?[A-Z])(?=.*?[0-9])(?=.*?[^a-zA-Z0-9])|(?=.*?[a-z])(?=.*?[0-9])(?=.*?[^a-zA-Z0-9])).{8,}$
Password must meet at least 3 out of the following 4 complexity rules:
at least 1 uppercase character (A-Z)
at least 1 lowercase character (a-z)
at least 1 digit (0-9)
at least 1 special character (do not forget to treat space as a special character too)
It must also have:
at least 10 characters
at most 128 characters
not more than 2 identical characters in a row (e.g., 111 not allowed)
^(?!.*(.)\1{2})((?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])|(?=.*[a-z])(?=.*[A-Z])(?=.*[^a-zA-Z0-9])|(?=.*[A-Z])(?=.*[0-9])(?=.*[^a-zA-Z0-9])|(?=.*[a-z])(?=.*[0-9])(?=.*[^a-zA-Z0-9])).{10,128}$
Broken down:
(?!.*(.)\1{2})
(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])
(?=.*[a-z])(?=.*[A-Z])(?=.*[^a-zA-Z0-9])
(?=.*[A-Z])(?=.*[0-9])(?=.*[^a-zA-Z0-9])
(?=.*[a-z])(?=.*[0-9])(?=.*[^a-zA-Z0-9])
.{10,128}
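A quick way to check these rules is to run them against a few sample passwords. This Python sketch assumes that every lookahead is meant to scan the whole string with .* and that the upper bound is 128, matching the "at most 128 characters" rule; RULES and is_valid are illustrative names.

```python
import re

RULES = re.compile(
    r"^(?!.*(.)\1{2})"                             # no 3 identical chars in a row
    r"(?:(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])"        # lower + upper + digit
    r"|(?=.*[a-z])(?=.*[A-Z])(?=.*[^a-zA-Z0-9])"   # lower + upper + special
    r"|(?=.*[A-Z])(?=.*[0-9])(?=.*[^a-zA-Z0-9])"   # upper + digit + special
    r"|(?=.*[a-z])(?=.*[0-9])(?=.*[^a-zA-Z0-9]))"  # lower + digit + special
    r".{10,128}$"                                  # 10 to 128 characters
)


def is_valid(password: str) -> bool:
    return RULES.search(password) is not None
```

The back-reference \1 points at the single character captured inside the negative lookahead, so the alternation is written as a non-capturing group here to keep the group numbering obvious.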


Regex for at least 8 + upper and lower+numbers or other non-alphabetic

Contains at least 8 characters.
Contains upper and lower case letters.
Contains numbers or other non-alphabetic characters.
What could be the regex for the above criteria?
I am creating a check for stronger passwords :)
I am using C#.
This should do it:
(?=.*?[a-z])(?=.*?[A-Z])(?=.*?[^a-zA-Z]).{8,}
See here: rubular
Explained:
(?=.*?[a-z]) //lookahead, there has to be a lower case alphabetic char
(?=.*?[A-Z]) //lookahead, there has to be a upper case alphabetic char
(?=.*?[^a-zA-Z]) //lookahead, there has to be a non-alphabetic char
.{8,} // any character at least 8 times
Don't try to use one regexp for all rules: it's hard, and more importantly it will be hard for future programmers to read and modify. Instead, write one function for each rule. Use a string length function for the first rule, then use separate regular expressions (or a simple scan of the string) for uppercase letters, lowercase letters and numbers.
Your test then becomes something like:
if (len(password) >= 8 &&
contains_lower(password) &&
contains_upper(password) &&
contains_number(password)) {
...
}
Your code becomes absolutely clear in its intent, and if you have to change just one piece of the algorithm you don't have to reinvent a complex regular expression. Plus, you'll be able to unit test each rule independently.
Compare that to an example someone wrote in another answer to this question:
(?=.*?[a-z])(?=.*?[A-Z])(?=.*?[^a-zA-Z]).{8,}
Which of these two answers looks easier to understand, easier to modify and easier to test? You can't even guess what the regex is doing until you spend a few (or many) moments studying it. And what if the requirement changes to ".. and has at least one underscore"? How do you change the pattern, especially when you weren't the one who came up with the pattern to begin with?
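For comparison, here is a runnable Python sketch of the one-function-per-rule idea. The contains_* names follow the pseudocode above; long_enough and is_strong are hypothetical names, and the asker's actual language is C#, so treat this as an illustration of the structure only.

```python
import re


def long_enough(password: str, minimum: int = 8) -> bool:
    return len(password) >= minimum


def contains_lower(password: str) -> bool:
    return re.search(r"[a-z]", password) is not None


def contains_upper(password: str) -> bool:
    return re.search(r"[A-Z]", password) is not None


def contains_number(password: str) -> bool:
    return re.search(r"[0-9]", password) is not None


def is_strong(password: str) -> bool:
    """Combine the independent rules; each one can be unit tested alone."""
    return (
        long_enough(password)
        and contains_lower(password)
        and contains_upper(password)
        and contains_number(password)
    )
```

Adding a requirement such as "has at least one underscore" becomes one more small function and one more line in is_strong.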

RegEx question for password strength validation

I'm looking for a single regular expression for our password requirements. Passwords:
Must be at least 8 characters
Cannot contain spaces
Contain both lowercase and UPPERCASE characters
Contain at least one numeric digit
Contain at least one special character (i.e. any character not 0-9,a-z,A-Z)
It'll probably be easier to code the logic. Regex is used for matching patterns. Passwords tend to be somewhat random strings, so the problem doesn't lend itself easily to be solved by a regex. It's possible but will be cryptic to read and hard to maintain.
Idea and most of the work taken from http://www.zorched.net/2009/05/08/password-strength-validation-with-regular-expressions/
^\S*(?=\S{8,})(?=\S*[a-z])(?=\S*[A-Z])(?=\S*[\d])(?=\S*[\W])\S*$
I used the basic answer at the bottom of his post, but replaced all the dots with \S to rule out space characters, and moved around some of the assertions.
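The pattern can be exercised with a few sample inputs, for example in Python. The is_valid name is illustrative, and fullmatch is used so that nothing may follow the password.

```python
import re

NO_SPACE_PASSWORD = re.compile(
    r"^\S*(?=\S{8,})(?=\S*[a-z])(?=\S*[A-Z])(?=\S*[\d])(?=\S*[\W])\S*$"
)


def is_valid(password: str) -> bool:
    return NO_SPACE_PASSWORD.fullmatch(password) is not None


for candidate in ["Passw0rd!", "Pass w0rd!", "Passw0rd_", "Pw0rd!"]:
    print(candidate, is_valid(candidate))
```

One thing this shows is that an underscore does not count as a special character here, because \W excludes it; use [\W_] if it should.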