HTML5 - Password Regex - regex

I am trying to write a valid HTML5 pattern for a password. It should be at least 9 characters long and contain at least one uppercase letter, one lowercase letter, one digit and one special character from this list:
()[]{}?!$%&/=*+~,.;:<>-_
I made this regex but it doesn't work. Can anyone fix it?
pattern="^(?=.*\d)(?=.*[a-z])(?=.*[()[]{}?!$%&/=*+~,.;:<>-_])(?=.*[A-Z]).{9,}?$"

pattern="(?=.*\d)(?=.*[a-z])(?=.*[()\[\]{}?!$%&/=*+~,.;:<>_-])(?=.*[A-Z]).{9,}"
There are several errors:
^(?=.*\d)(?=.*[a-z])(?=.*[()[]{}?!$%&/=*+~,.;:<>-_])(?=.*[A-Z]).{9,}?$
# In [()[]{}...]: the ] needs to be escaped, otherwise it closes the character class
#   (escape the [ too, though for no logical reason).
# In <>-_: the - is used to define a character range;
#   the range >-_ gives >?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_
# In {9,}?: there's no reason to make this quantifier non-greedy.
In addition, the anchors ^ and $ are implicit in an HTML5 pattern attribute (the whole value must match), so you don't have to write them.
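Note that using ranges, you can also write the pattern like this (the brackets still have to be listed explicitly, since no range in the list covers them cleanly):
pattern="(?=.*\d)(?=.*[a-z])(?=.*[!$%&(-/:-?\[\]_{}~])(?=.*[A-Z]).{9,}"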
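As a rough way to check the corrected pattern outside the browser, here is a minimal Python sketch. It assumes that `re.fullmatch` is a fair stand-in for the implicit anchoring of the HTML5 pattern attribute; the helper name `is_valid_password` and the sample passwords are made up for illustration.

```python
import re

# Corrected pattern from the answer above (escaped brackets, hyphen moved last).
PASSWORD_RE = re.compile(
    r"(?=.*\d)(?=.*[a-z])(?=.*[()\[\]{}?!$%&/=*+~,.;:<>_-])(?=.*[A-Z]).{9,}"
)

def is_valid_password(candidate: str) -> bool:
    """Hypothetical helper: fullmatch mimics the implicit ^...$ of pattern=""."""
    return PASSWORD_RE.fullmatch(candidate) is not None

print(is_valid_password("Abcdefg1!"))   # all four classes, 9 characters
print(is_valid_password("Abcdefgh1"))   # no special character
print(is_valid_password("Abcdef1!"))    # only 8 characters

# The accidental range >-_ from the original attempt also accepts letters:
print(re.fullmatch(r"[>-_]", "Q") is not None)
```

The last line illustrates why the hyphen has to be escaped or moved to the end of the class: left in the middle, it silently widens the set of accepted "special" characters.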

Related

Regex find characters in string

Given the following string
2010-01-01XD2010-01-02XX2010-01-03NX2010-01-04XD2010-01-05DN
I am trying to find all instances of a date followed by one or two characters, e.g. 2010-01-01XD, but not where the characters are XX.
I have tried
(2010-01-02[^X]{2})|(2010-01-08[^X]{2})|(2010-01-07[^X]{2})|(2010-01-05[^X]{2})|(2010-01-15[^X]{2})
This works only if neither character is X. I have also tried
(2010-01-02[^X]{1,2})|(2010-01-08[^X]{1,2})|(2010-01-07[^X]{1,2})|(2010-01-05[^X]{1,2})|(2010-01-15[^X]{1,2})
This works for DX but not XD.
To be a little clearer:
2010-01-01XD
2010-01-01DX
2010-01-01ND
All above should be picked up
2010-01-01XX
And this ignored
You can use this regex based on negative lookahead:
(20[0-9]{2}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12][0-9]|3[01])(?!XX)[A-Z]{2})
Easiest way is to use a lookahead assertion (if available).
# (2010-01-01|2010-01-02|2010-01-08|2010-01-07|2010-01-05|2010-01-15)(?!XX)(?i:([a-z]){1,2})
( # (1 start), One of these dates
2010-01-01
| 2010-01-02
| 2010-01-08
| 2010-01-07
| 2010-01-05
| 2010-01-15
) # (1 end)
(?! XX ) # Look ahead assertion, cannot match XX here
(?i: # 1 or 2 of any U/L case letter
( [a-z] ){1,2} # (2)
)
You could likely use a simple pattern with a negative lookahead such as this:
\d{4}-\d{2}-\d{2}(?!XX)[A-Z]{1,2}
example: http://regex101.com/r/dI1nW4/2
To allow any non-digit characters, including Unicode ones (with the exception of XX), you could use:
\d{4}-\d{2}-\d{2}(?!XX)\D{1,2}
example: http://regex101.com/r/yB5fI0/1
20[0-9]{2}-[01][0-9]-[0-3][0-9]([A-Z][A-WYZ]|[A-WYZ][A-Z])
A negative look ahead is the easiest way to assert the letters not being XX, but there are some simplifications you can make to the alternation by recognising the parts of the date shared by all dates you're trying to match, making this shorter regex:
2010-01-(02|08|07|05|15)(?!XX)[A-Z]{1,2}
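The lookahead approach from the answers above can be tried against the sample string from the question. This is only a sketch in Python (the question does not name a language), using the generic date pattern rather than the fixed list of dates.

```python
import re

sample = "2010-01-01XD2010-01-02XX2010-01-03NX2010-01-04XD2010-01-05DN"

# Date, then a negative lookahead rejecting "XX", then one or two capitals.
pattern = re.compile(r"\d{4}-\d{2}-\d{2}(?!XX)[A-Z]{1,2}")

matches = pattern.findall(sample)
print(matches)
# ['2010-01-01XD', '2010-01-03NX', '2010-01-04XD', '2010-01-05DN']
```

The lookahead is checked before any letter is consumed, so XD and DX both pass while XX is rejected as a pair.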

Confusion in regex pattern for search

I am learning regex in bash and am trying to fetch all lines that end with .com.
Initially I did:
cat patternNpara.txt | egrep "^[[:alnum:]]+(.com)$"
Why: + matches one or more occurrences, so placing it after [[:alnum:]] should fetch any digits, letters or signs, but apparently this logic fails.
Then I did this (purely trial and error, not applying any real logic) and it worked:
cat patternNpara.txt | egrep "^[[:alnum:]].+(.com)$"
What's confusing me: . matches only a single character, so how am I getting the output? How is it really matching the pattern?
Question: what's the difference between [[:alnum:]]+ and [[:alnum:]].+ (the one with . in it) in the patterns above, and how does it work?
PS: I am looking for an explanation, not a "try it this way" answer. :)
Some test lines from the file patternNpara.txt that are fetched as output:
valid email = abc#abc.com
invalid email = ab#abccom
another invalid = abc#.com
1 : abc,s,11#gmail.com
2: abc.s.11#gmail.com
Looking at your screenshot, it seems you're trying to match email addresses that also contain the # character, which is not included in your regex. You can use this regex:
egrep "[#[:alnum:]]+(\.com)" patternNpara.txt
Difference between the two regexes:
[[:alnum:]] matches only [a-zA-Z0-9]. If you have # or , then you need to include them in the character class as well.
Your second pattern includes .+ which means one or more of ANY CHARACTER.
If you want to match any lines that end with '.com', you should use
egrep ".*\.com$" file.txt
To match all the following lines
valid email = abc#abc.com
invalid email = ab#abccom
another invalid = abc#.com
1 : abc,s,11#gmail.com
2: abc.s.11#gmail.com
^[[:alnum:]].+(.com)$ will work, but ^[[:alnum:]]+(.com)$ will not. Here are the reasons:
^[[:alnum:]].+(.com)$ matches strings that start with one character from a-zA-Z or 0-9, followed by two or more arbitrary characters (the .+ and then the unescaped .), and end with 'com' (not '.com').
^[[:alnum:]]+(.com)$ matches strings that consist of one or more characters from a-zA-Z or 0-9, followed by one character that could be anything, and end with 'com' (not '.com'). The sample lines contain spaces, = and #, none of which [[:alnum:]] matches, so this pattern fails on them.
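To make the difference concrete, here is a small Python sketch. Python's `re` module has no POSIX classes, so `[A-Za-z0-9]` is used as an assumed stand-in for `[[:alnum:]]`; the variable names are illustrative only.

```python
import re

# [A-Za-z0-9] stands in for [[:alnum:]], which Python's re does not support.
strict = re.compile(r"^[A-Za-z0-9]+(.com)$")   # alnum run, any char, "com"
loose = re.compile(r"^[A-Za-z0-9].+(.com)$")   # one alnum, then anything
escaped = re.compile(r"\.com$")                # literal ".com" at end of line

lines = [
    "valid email = abc#abc.com",
    "invalid email = ab#abccom",
    "another invalid = abc#.com",
]

for line in lines:
    print(
        line,
        bool(strict.search(line)),
        bool(loose.search(line)),
        bool(escaped.search(line)),
    )
```

Only the escaped pattern tells `abccom` apart from `abc.com`; the unescaped dot in the other two accepts any character before `com`.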
Try this (with a positive lookahead; it needs a PCRE-capable tool such as grep -P, since egrep does not support lookaheads):
.+(?=\.com)
Demo :
http://regexr.com?38bo0

Regex match repetition of a character OR single character

I need to escape more than one # in a line or any commas, dots.
Example:
Not valid: test##test test,test test#te,st test#t.e,st
Valid: test#test test#te#st
The following pattern does exactly what I want (it checks whether a line contains ## or , or . so the result is true/false):
/(#)\1+|[,.]/
but I don't like the | sign here.
How can I fix it to use [ ] only? Or is there another way to do this?
I don't like | sign here. How can I fix it to use [ ] only?
You can't. Using | is the only way if you want to have different patterns for # and the other characters.
You can simplify your expression a bit. For example:
##+|[,.]
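A quick way to confirm that the simplified expression behaves like the original is to run it over the examples from the question. The sketch below uses Python purely for illustration; the list names are made up.

```python
import re

# Two or more consecutive '#', or any comma or dot.
invalid_re = re.compile(r"##+|[,.]")

not_valid = ["test##test", "test,test", "test#te,st", "test#t.e,st"]
valid = ["test#test", "test#te#st"]

for text in not_valid + valid:
    print(text, "rejected" if invalid_re.search(text) else "accepted")
```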

Password validator without special characters

I am a newbie to RegEx and have done a lot of searching already but have not found anything specific.
I am writing a regular expression that validates a password string.
The acceptable string must have at least 3 of 4 character types: digits, lowercase, uppercase, special char[<+$*)], but must not include another special set of characters(|;{}).
I got an idea regarding the inclusion(that is if it is the right way).
It looks like this:
^((a-z | A-Z |\d)(a-z|A-Z|[<+$*])(a-z|[<+$*]|\d)(A-Z|[<+$*]|\d)).*$
How do I ensure that user does not enter special chars(|;{})
This is what I tried with the exclusion string:
^(?:(?![|;{}]).)*$
I have tried a bit of tricks to combine the two in a single regEx but can't get it to work.
Any input on how to do this right?
Don't try to do it all in one regex. Make two different checks.
Say you're working in Perl (since you didn't specify language):
$valid_pw =
( $pw =~ /^((a-z | A-Z |\d)(a-z|A-Z|[<+$*])(a-z|[<+$*]|\d)(A-Z|[<+$*]|\d)).*$/ ) &&
( $pw !~ /[|;{}]/ ); # character class: any one of the forbidden characters
You're saying "If the PW matches all the inclusion rules, and the PW does NOT match any of the excluded characters, then the password is valid."
Look how much clearer that is than something like @Jerry's response above of:
^(?![^a-zA-Z]*$|[^a-z0-9]*$|[^a-z<+$*]*$|[^A-Z0-9]*$|[^A-Z<+$*]*$|[^0-9<+$*]*$|.*[|;{}]).*$
I don't doubt that Jerry's version works, but which one do you want to maintain?
In fact, you could break it down even further and be extremely clear:
my $cases_matched = 0;
$cases_matched++ if ( $pw =~ /\d/ ); # digits
$cases_matched++ if ( $pw =~ /[a-z]/ ); # lowercase
$cases_matched++ if ( $pw =~ /[A-Z]/ ); # uppercase
$cases_matched++ if ( $pw =~ /[<+\$*]/ ); # special characters (any one of them)
my $is_valid = ($cases_matched >= 3) && ($pw !~ /[|;{}]/); # At least 3, and none forbidden.
Sure, that takes up 6 lines instead of one, but in a year when you go back and have to add a new rule, or figure out what the code does, you'll be glad you wrote it that way.
Just because you can do it in one regex doesn't mean you should.
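For readers not working in Perl, the same step-by-step idea might look like this in Python. This is a hypothetical port, not the answerer's code: the function name is invented, and the special set is taken as `<+$*` as in the Perl snippet.

```python
import re

def is_valid_password(pw: str) -> bool:
    """Hypothetical Python port of the step-by-step Perl check above."""
    # One simple pattern per character type.
    classes = [r"\d", r"[a-z]", r"[A-Z]", r"[<+$*]"]
    cases_matched = sum(1 for cls in classes if re.search(cls, pw))
    # Any single forbidden character invalidates the password.
    has_forbidden = re.search(r"[|;{}]", pw) is not None
    return cases_matched >= 3 and not has_forbidden

print(is_valid_password("Abc123"))    # upper, lower, digit
print(is_valid_password("abc123"))    # only two types
print(is_valid_password("Abc123;"))   # contains a forbidden character
```

Adding a new rule later means adding one entry to `classes` or one more check, which is the maintainability argument the answer makes.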
Your current regex will not work for enforcing the at least 3 of 4 requirement. Using regex for this gets pretty complicated, but in my opinion the best way to do this is to use a negative lookahead that contains all of the failure cases, so that the entire match will fail if any of the negative cases are met. In this case the "at least 3 of 4" requirement can also be described as "fail if any 2 groups are not found". This also makes it very easy to add the final requirement to ensure that no characters from [|;{}] are found:
^ # beginning of string anchor
(?! # fail if
[^a-zA-Z]*$ # no [a-z] or [A-Z] anywhere
| # OR
[^a-z0-9]*$ # no [a-z] or [0-9] anywhere
| # OR
[^a-z<+$*]*$ # no [a-z] or [<+$*] anywhere
| # OR
[^A-Z0-9]*$ # no [A-Z] or [0-9] anywhere
| # OR
[^A-Z<+$*]*$ # no [A-Z] or [<+$*] anywhere
| # OR
[^0-9<+$*]*$ # no [0-9] or [<+$*] anywhere
| # OR
.*[|;{}] # a character from [|;{}] exists
)
.*$ # made it past the negative cases, match the entire string
Here it is as a single line:
^(?![^a-zA-Z]*$|[^a-z0-9]*$|[^a-z<+$*]*$|[^A-Z0-9]*$|[^A-Z<+$*]*$|[^0-9<+$*]*$|.*[|;{}]).*$
Example: http://rubular.com/r/4YV6Aj0vqh
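If you want to try the single-line version locally rather than on Rubular, a minimal Python sketch could look like the following. The sample passwords are invented, and Python's `re` is assumed to treat this pattern the same way as the Ruby engine used in the linked demo.

```python
import re

# Single-line version from above, split across two string literals.
PASSWORD_RE = re.compile(
    r"^(?![^a-zA-Z]*$|[^a-z0-9]*$|[^a-z<+$*]*$|[^A-Z0-9]*$"
    r"|[^A-Z<+$*]*$|[^0-9<+$*]*$|.*[|;{}]).*$"
)

samples = ["Abc123", "abc+1", "abc123", "ABCDEF", "Abc123;"]

for pw in samples:
    print(pw, "ok" if PASSWORD_RE.match(pw) else "rejected")
```

Each alternative inside the lookahead describes one way to fail, so a password passes only when none of them applies.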
This is for accepting only the characters you mentioned:
^(?:(?=.*[0-9])(?=.*[a-z])(?=.*[<+$*)])|(?=.*[a-z])(?=.*[<+$*)])(?=.*[A-Z])|(?=.*[0-9])(?=.*[A-Z])(?=.*[<+$*)])|(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z]))[0-9A-Za-z<+$*)]+$
And this one for all the characters you mentioned, and any special characters except |;{}.
^(?:(?=.*[0-9])(?=.*[a-z])(?=.*[<+$*)])|(?=.*[a-z])(?=.*[<+$*)])(?=.*[A-Z])|(?=.*[0-9])(?=.*[A-Z])(?=.*[<+$*)])|(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z]))(?!.*[|;{}].*$).+$
(One difference is that the first regex doesn't accept the special char # but the second does).
I have also used + since passwords logically can't be 0 width.
However, it's quite long, longer than F.J's regex, oh well. That's because I'm using positive lookaheads, which require more checks.

Creating a regex to parse a build version

I'm trying to grab a build version from a file that contains the following line:
<Assembly: AssemblyVersion("004.005.0862")>
and I would like it to return
4.5.862
I'm using sed in DOS and got the following to spit out 004.005.0862:
echo "<Assembly: AssemblyVersion("004.005.0862")>" | sed "s/[^0-9,.]//g"
How do I get rid of the leading zeros for each part of the build number?
The regex to do this in a single step looks like this:
^.*"0*([0-9]+\.)0*([0-9]+\.)0*([0-9]+).*
With sed-specific escaping and as a full substitution expression, it becomes a little longer:
s/^.*"0*\([0-9]\+\.\)0*\([0-9]\+\.\)0*\([0-9]\+\).*/\1\2\3/g
The regex breaks down as
^ # start-of-string
.*" # anything, up to a double quote
0*([0-9]+\.) # any number of zeros, then group 1: at least 1 digit and a dot
0*([0-9]+\.) # any number of zeros, then group 2: at least 1 digit and a dot
0*([0-9]+) # any number of zeros, then group 3: at least 1 digit
.* # anything up to the end of the string
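The breakdown above can be checked without sed. The following Python sketch applies the same regex with `re.sub`; the helper name `strip_leading_zeros` is hypothetical, and the input is the line from the question.

```python
import re

line = '<Assembly: AssemblyVersion("004.005.0862")>'

# Same regex as above, using Python group syntax instead of sed escaping.
VERSION_RE = re.compile(r'^.*"0*([0-9]+\.)0*([0-9]+\.)0*([0-9]+).*')

def strip_leading_zeros(text: str) -> str:
    """Hypothetical helper: rewrite the line to the bare version number."""
    return VERSION_RE.sub(r"\1\2\3", text)

print(strip_leading_zeros(line))  # prints 4.5.862
```

Because each group requires at least one digit, a part that is all zeros keeps a single 0 instead of disappearing.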
Maybe ... | sed "s/[^0-9]*0*([1-9][0-9,.]*)/\1/g". I'm using a subpattern to filter out the part you need, ignoring leading zeros and non-numeric characters.
There are probably many more clever ways, but one that works (and is reasonably easy to understand) is to pipe it through additional calls:
echo "version(004.005.0862)" | sed "s/[^0-9,.]//g" | sed "s/^0*//g" | sed "s/\.0*/./g"