Regex for passphrase

I'm a newbie to regex and trying to create a regex to validate pass phrases. I would like to validate that the pass phrase has:
n number of words or more
words should be separated by spaces
words, each with n characters or more
a number in at least one of the words
at least one special character in one of the words
This is what I have so far:
^(?=.*?((?:.*?\b[a-zA-Z0-9]{2,40}\b)\s*){3,})(?=.*?[##$%^&+=])(?=.*?[0-9]).*$
It matches the phrase Pe2sI#sHy?ThYulU#, which does not have at least 3 words (it contains no spaces). What am I doing wrong?

You should use \s+ instead of \s*. The latter allows zero spaces, the former requires at least one. But your regex is overly complicated. Try this:
^ # Start of string
(?=.*\d) # Assert at least one digit
(?=.*[##$%^&+=]) # Assert at least one special char
\s* # Optional leading whitespace
(?: # Match...
\S{2,} # at least 2 non-spaces
\s+ # at least 1 whitespace
){2,} # at least 2 times
\S{2,} # Match at least 2 non-spaces (making 3 "words" minimum)
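To see the difference in practice, here is a minimal Python sketch that compiles both the question's pattern and the free-spacing pattern above. The helper name `is_valid_passphrase` and the sample phrases other than the one from the question are illustrative assumptions. The special-character class is written with a single `#`, which matches the same characters as the class in the post.

```python
import re

# The question's pattern: \s* lets "words" follow each other with no space.
ORIGINAL = re.compile(
    r"^(?=.*?((?:.*?\b[a-zA-Z0-9]{2,40}\b)\s*){3,})(?=.*?[#$%^&+=])(?=.*?[0-9]).*$"
)

# The answer's pattern, transcribed in free-spacing (verbose) mode.
SIMPLIFIED = re.compile(
    r"""
    ^                   # start of string
    (?=.*\d)            # at least one digit somewhere
    (?=.*[#$%^&+=])     # at least one special character somewhere
    \s*                 # optional leading whitespace
    (?:
        \S{2,}          # a "word" of at least 2 non-space characters
        \s+             # followed by at least 1 whitespace character
    ){2,}               # at least 2 such words
    \S{2,}              # a final word, making 3 words minimum
    """,
    re.VERBOSE,
)


def is_valid_passphrase(phrase: str) -> bool:
    """Return True if the phrase satisfies the simplified pattern."""
    return SIMPLIFIED.match(phrase) is not None


no_spaces = "Pe2sI#sHy?ThYulU#"
print(ORIGINAL.match(no_spaces) is not None)   # True: the reported problem
print(is_valid_passphrase(no_spaces))          # False: no whitespace at all
print(is_valid_passphrase("ab1 cd# ef"))       # True: three words
print(is_valid_passphrase("ab1 cd#"))          # False: only two words
```

The question's pattern accepts the no-space phrase because `\b` finds word boundaries at the punctuation characters, so three alphanumeric runs are found even though nothing separates them with whitespace.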

A little late to this, so this is just an observation.
This is a take-off on @Tim Pietzcker's method.
Although 'words' can be anything, if you want to require that at least
3 words contain an embedded run of [a-zA-Z0-9]{2,40} characters, you could do this.
^ # String start
(?=.*[##$%^&+=]) # Assert 1 special char
(?=.*\d) # Assert 1 digit
(?: # Special 'Word Group' -- Need 2 words
.* # Any char, 0 or more times
[a-zA-Z0-9]{2,40} # Alpha/num char, 2 to 40 times
.* # Any char, 0 or more times
\s # a whitespace, only 1 required
){2} # 'Word Group' end, do 2 times
.* # Any char, 0 or more times
[a-zA-Z0-9]{2,40} # Alpha/num char, 2 to 40 times -- Need 1 word
This should match at least 3 special [a-zA-Z0-9]{2,40} words separated by at least 1 space
including a digit and special character.
Update:
Yes, you can combine it into a single group done {3} times in 2 ways I know of.
Use a capture buffer as a flag
^(?=.*[##$%^&+=])(?=.*\d)(?:(?:(?!\1)|\s).*[a-zA-Z0-9]{2,40}().*){3}
^ ^
---------------------------------------
^ # String start
(?=.*[##$%^&+=]) # Assert 1 special char
(?=.*\d) # Assert 1 digit
(?: # Special 'Word Group'
(?: #.. grping start ....
(?!\1) # Either capt group 1 is UN-DEFINED
| \s # OR, require a whitespace
) #.. grping end ....
.* # Any char, 0 or more times
[a-zA-Z0-9]{2,40} # Alpha/num char, 2 to 40 times
() # DEFINE Capture group 1
.* # Any char, 0 or more times
){3} # 'Word Group' end, do 3 times
Or, by using a conditional
^(?=.*[##$%^&+=])(?=.*\d)(?:(?(1)\s).*([a-zA-Z0-9]{2,40}).*){3}
^ ^
---------------------------------------
^ # String start
(?=.*[##$%^&+=]) # Assert 1 special char
(?=.*\d) # Assert 1 digit
(?: # Special 'Word Group'
(?(1)\s) # Conditional, require a whitespace if capture group 1 captured anything
.* # Any char, 0 or more times
([a-zA-Z0-9]{2,40}) # Capture group 1, Alpha/num char, 2 to 40 times
.* # Any char, 0 or more times
){3} # 'Word Group' end, do 3 times
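The conditional variant can be tried in Python, whose `re` module supports the `(?(1)...)` syntax. This is a sketch under the assumption that Python accepts a conditional referring to a group defined later in the pattern. The helper name `has_three_words` and the sample phrases are illustrative, and the special-character class is written with a single `#`.

```python
import re

# Conditional form: a whitespace is required before every word group
# except the first, because group 1 is still unset on the first pass.
CONDITIONAL = re.compile(
    r"^(?=.*[#$%^&+=])(?=.*\d)"
    r"(?:(?(1)\s).*([a-zA-Z0-9]{2,40}).*){3}"
)


def has_three_words(phrase: str) -> bool:
    """Return True if the phrase passes the conditional pattern."""
    return CONDITIONAL.match(phrase) is not None


print(has_three_words("ab1 cd# ef"))         # True
print(has_three_words("ab1 cd#"))            # False: only one whitespace
print(has_three_words("Pe2sI#sHy?ThYulU#"))  # False: no whitespace
```

Support for conditionals and for references to not-yet-defined groups varies between regex flavors, so the behavior should be checked in whichever engine is actually used.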

Related

How can I allow one space in a regular expression

The following Regex checks for a number which starts with 6, 8 or 9 and has to be exactly 8 digits long.
/^(6|8|9)\d{7}$/
Now I want to accept one space in between digits as well, but don't know where to start.
For example both 61234567 and 6123 4567 should be allowed, but only the first one passes my current regex.
Can you help me create it?
You may use
^(?!.*(?:\s\d+){2})[689](?:\s?\d){7}$
See the regex demo
Details
^ - start of string
(?!.*(?:\s\d+){2}) - a negative lookahead that fails the match if, after any zero or more characters other than line break characters (as many as possible), there are two occurrences of a whitespace character followed by one or more digits
[689] - 6, 8 or 9
(?:\s?\d){7} - seven occurrences of an optional whitespace character followed by a single digit
$ - end of string.
To allow leading/trailing whitespace, add \s? (1 or 0) or \s* (0 or more) right after ^ and before $.
To allow a single chunk of one or more whitespace characters in the digit string, use
^(?!.*(?:\s+\d+){2})[689](?:\s*\d){7}$
See this regex demo.
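A short Python sketch shows what the negative lookahead contributes. The helper name `is_valid_number` and the test inputs beyond the two from the question are illustrative assumptions.

```python
import re

# First pattern from the answer: at most one whitespace between digits.
ONE_SPACE = re.compile(r"^(?!.*(?:\s\d+){2})[689](?:\s?\d){7}$")

# Same pattern without the lookahead, for comparison only.
NO_LOOKAHEAD = re.compile(r"^[689](?:\s?\d){7}$")


def is_valid_number(number: str) -> bool:
    """Return True if the number passes the lookahead-guarded pattern."""
    return ONE_SPACE.match(number) is not None


print(is_valid_number("61234567"))     # True
print(is_valid_number("6123 4567"))    # True
print(is_valid_number("6123 45 67"))   # False: two separate spaces
print(is_valid_number("71234567"))     # False: starts with 7
print(NO_LOOKAHEAD.match("6123 45 67") is not None)  # True without the guard
```

Without the lookahead, `(?:\s?\d){7}` allows an optional whitespace before every digit, so the guard is what limits the input to a single space.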
You could use the regular expression
/^[689](?:\d{7}|(?=.{8}$)\d* \d+)$/
We can make this self-documenting by writing it in free-spacing mode:
/
^ # match beginning of line
[689] # match '6', '8' or '9'
(?: # begin non-capture group
\d{7} # match 7 digits
| # or
(?=.{8}$) # require the remainder of the line to have 8 chars
\d*\ \d+ # match 0+ digits, a space, 1+ digits
) # end non-capture group
$ # match end of line
/x # free-spacing regex definition mode
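The free-spacing form carries over to Python with `re.VERBOSE`, where an escaped space (`\ `) is also treated as a literal space. The following is a sketch only; the helper name `accepts` and the extra test inputs are assumptions for illustration.

```python
import re

NUMBER = re.compile(
    r"""
    ^
    [689]            # first digit is 6, 8 or 9
    (?:
        \d{7}        # either 7 more digits
      |
        (?=.{8}$)    # or exactly 8 more characters...
        \d*\ \d+     # ...made of digits, one space, then digits
    )
    $
    """,
    re.VERBOSE,
)


def accepts(number: str) -> bool:
    """Return True if the number matches the free-spacing pattern."""
    return NUMBER.match(number) is not None


print(accepts("61234567"))     # True
print(accepts("6123 4567"))    # True
print(accepts("6123 456"))     # False: only 7 characters after the 6
print(accepts("6123 45 67"))   # False: 9 characters after the 6
```

The length lookahead `(?=.{8}$)` is what keeps the digit count at eight here: seven digits plus one space after the leading digit.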

Regex allow only one dash or only one space

I want an expression that allows number and one dash OR number and one space. Space or dash are optional.
I tried this
/^([0-9]+(-[0-9]+)?)|([0-9]+(\s[0-9]+)?)$/
Inputs that should be accepted:
11-222
444 99
You can put the OR in the middle of your expression: ^([0-9]+)(\s|-)([0-9]+)$ works with your examples in Notepad++.
Let's explain your regex.
^ # beginning of line
( # start group 1
[0-9]+ # 1 or more digits
( # start group 2
- # a hyphen
[0-9]+ # 1 or more digits
)? # end group 2, optional
) # end group 1
| # OR
( # start group 3
[0-9]+ # 1 or more digits
( # start group 4
\s # a space
[0-9]+ # 1 or more digits
)? # end group 4, optional
) # end group 3
$ # end of line
The OR separates two alternatives: group 1, anchored only at the beginning of the line, and group 3, anchored only at the end of the line. But you want both group 1 and group 3 anchored at the beginning and at the end.
Add a group over group 1 and 3:
^(([0-9]+(-[0-9]+)?)|([0-9]+(\s[0-9]+)?))$
You can use non-capturing groups (more efficient) instead of capturing groups:
^(?:(?:[0-9]+(?:-[0-9]+)?)|(?:[0-9]+(?:\s[0-9]+)?))$
Combine the hyphen and the space in a character class and remove the superfluous groups:
^[0-9]+(?:[-\s][0-9]+)?$
If your regex flavour supports it, change the [0-9] into \d. Finally your regex becomes:
^\d+(?:[-\s]\d+)?$
Much simpler, no?
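The anchoring problem is easy to reproduce. This Python sketch compares the question's pattern with the final simplified one; the inputs with surrounding letters are made up for illustration.

```python
import re

# Question's pattern: ^ binds only to the first alternative, $ only to the second.
ORIGINAL = re.compile(r"^([0-9]+(-[0-9]+)?)|([0-9]+(\s[0-9]+)?)$")

# Final pattern from the answer: both anchors apply to the whole expression.
FIXED = re.compile(r"^\d+(?:[-\s]\d+)?$")

print(ORIGINAL.search("11-222abc").group())   # 11-222 (trailing junk ignored)
print(ORIGINAL.search("abc 444 99").group())  # 444 99 (leading junk ignored)

for text in ["11-222", "444 99", "123", "11-222abc", "abc 444 99", "11-22-33"]:
    print(repr(text), FIXED.search(text) is not None)
```

With the fixed pattern only the first three inputs are accepted.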

Using regex on a file to pull data out. Having issues with multi-line

I am looking to get to the next line of data within a text file. Here is an example of data from the file I am working with.
0519 ABF 244 AN A1 ADV STUFF 1.0 2.0 Somestuff 018 0155 MTWTh 10:30A 11:30A 20 20 0 6.7
Somestuff 011 0145 MTWTh 12:30P 1:30P
I have tried several approaches to move to the next line: adding a newline (\n), using \s+ to consume the large gap after 6.7, and adding the m modifier (//m). None of them has produced a result yet.
Here is some example code
while !regex_file.eof?
  line = regex_file.gets.chomp
  if line =~ /^.*?\d{4}\s+[A-Z]+\s+\d{3}.+$/
    puts line
  end
end
Using https://rubular.com/ this particular set of code matches my desired output for the first line
0519 ABF 244 AN A1 ADV STUFF 1.0 2.0 Somestuff 018 0155 MTWTh 10:30A 11:30A 20 20 0 6.7
but it does not match the next line, and I haven't figured out how to make it do so.
Somestuff 011 0145 MTWTh 12:30P 1:30P
Try something like this. The \n matches the newline, and you can add your own rules for whatever comes after it:
^.*\d{4}\s+[A-Z]+\s+\d{3}.+\n.*$
I've made an arbitrary assumption about the requirements for matching the second line. It is more demanding than the requirements for matching the first that are reflected in your regex, but I thought the additional complexity would have some educational value for you.
Here is a regular expression (untested) for matching both lines. Note you don't need ^.*? at the beginning of the regex and for the part of the regex that matches the first line .+$ adds nothing, so I removed it. After all you are just matching each line separately (line), and will display the entire line if there's a match. As well, the end-of-string anchor \z is more appropriate than the end-of-line anchor ($), though either can be used.
r = /
(?: # begin non-capture group
\d{4} # match 4 digits
\s+ # match > 0 whitespaces
[A-Z]+ # match > 0 uppercase letters
\s+ # match > 0 whitespaces
\d{3} # match 3 digits
| # or
\b # match a (zero-width) word break
[A-Z] # match 1 uppercase letter
[a-z]* # match >= 0 lowercase letter
\s+ # match > 0 whitespaces
\d{3} # match 3 digits
\s+ # match > 0 whitespaces
\d{4} # match 4 digits
\s+ # match > 0 whitespaces
[A-Za-z]+ # match > 0 letters
(?: # begin non-capture group
\s+ # match > 0 whitespaces
(?: # begin a non-capture group
0?[1-9] # match 1-9, optionally preceded by 0
| # or
1[012] # match 1 followed by 0, 1 or 2
) # end non-capture group
: # match a colon
[0-5][0-9] # match 0-5 followed by 0-9
[AP] # match the A or P suffix
){2} # end non-capture group and execute twice
) # end non-capture group
/x # free-spacing regex definition mode
This regular expression is conventionally written as follows.
r = /(?:\d{4}\s+[A-Z]+\s+\d{3}|\b[A-Z][a-z]*\s+\d{3}\s+\d{4}\s+[A-Za-z]+(?:\s+(?:0?[1-9]|1[012]):[0-5][0-9][AP]){2})/
You might go through the file putsing matching lines as follows:
File.foreach(fname) { |line| puts line if line.match? r }
See IO::foreach, which is a very convenient method for reading files line-by-line. Note IO class methods (such as foreach) are commonly invoked with File as their receiver. That's OK, as File.superclass #=> IO, so File inherits those methods from IO.
When used without a block foreach returns an enumerator, which is often convenient as well. If, for example, you wished to return an array of matching lines (rather than puts them), you could write:
File.foreach(fname).with_object([]) do |line, arr|
  arr << line.chomp if line.match? r
end
Your current regex:
^.*?\d{4}\s+[A-Z]+\s+\d{3}.+$
matches in this order:
the beginning of the line (^)
zero or more characters non-greedy .*?
four digits (\d{4})
one or more spaces (\s+)
one or more capital letters ([A-Z]+)
one or more spaces
three digits (\d{3})
one or more characters (.+)
the end of the line ($)
The second line of your file is:
Somestuff 011 0145 MTWTh 12:30P 1:30P
The regex starts matching at 0145 MTWT, but then \s+ fails on the lowercase h, so it never gets as far as \d{3}.
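This trace can be checked with a short script. The sketch below uses Python rather than the question's Ruby, purely for illustration, and the two sample lines are copied from the question with their whitespace as shown there.

```python
import re

# The question's pattern, unchanged.
PATTERN = re.compile(r"^.*?\d{4}\s+[A-Z]+\s+\d{3}.+$")

first = ("0519 ABF 244 AN A1 ADV STUFF 1.0 2.0 Somestuff 018 0155 "
         "MTWTh 10:30A 11:30A 20 20 0 6.7")
second = "Somestuff 011 0145 MTWTh 12:30P 1:30P"

print(PATTERN.search(first) is not None)    # True
print(PATTERN.search(second) is not None)   # False

# How far the beginning of the pattern gets on the second line.
partial = re.search(r"\d{4}\s+[A-Z]+", second)
print(partial.group())                      # 0145 MTWT
```

The partial match stops right before the lowercase h, which is where the full pattern gives up on the second line.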

Specific password regular expression

I am having problems creating a regular expression. It needs to fulfill the following:
1) Has 8-12 characters
2) At least 1 uppercase letter
3) At least 3 lowercase letters
4) At least 1 number
5) At least 1 special character
6) Has to start with a lowercase letter, an uppercase letter or a digit
7) Maximum of 2 repeating characters
Thanks in advance!
This should work
^(?=.*[A-Z])(?=(?:.*[a-z]){3})(?=.*[0-9])(?=.*[!"#$%&'()*+,\-./:;<=>?@[\]^_`{|}~])(?=(?:(.)(?!\1\1))+$)[a-zA-Z0-9].{7,11}$
Explained / Expanded
^ # BOS
(?= .* [A-Z] ) # 1 upper
(?=
(?: .* [a-z] ){3} # 3 lower
)
(?= .* [0-9] ) # 1 number
(?=
.* [!"#$%&'()*+,\-./:;<=>?@[\]^_`{|}~] # 1 special
)
(?= # Maximum 2 repeating
(?:
( . ) # (1)
(?! \1 \1 )
)+
$
)
[a-zA-Z0-9] # First alnum
.{7,11} # 8 to 12 max chars
$ # EOS
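As a rough check, the pattern can be assembled in Python. This sketch assumes the special-character class is meant to cover ASCII punctuation, which is what Python's `string.punctuation` contains. The helper name `is_valid_password` and the sample passwords are made up for illustration.

```python
import re
import string

# Assumption: "special character" means any ASCII punctuation character.
SPECIAL = "[" + re.escape(string.punctuation) + "]"

PASSWORD = re.compile(
    r"^(?=.*[A-Z])"             # at least 1 uppercase letter
    r"(?=(?:.*[a-z]){3})"       # at least 3 lowercase letters
    r"(?=.*[0-9])"              # at least 1 digit
    r"(?=.*" + SPECIAL + r")"   # at least 1 special character
    r"(?=(?:(.)(?!\1\1))+$)"    # no character three times in a row
    r"[a-zA-Z0-9].{7,11}$"      # starts alphanumeric, 8 to 12 chars total
)


def is_valid_password(password: str) -> bool:
    """Return True if the password satisfies all seven rules."""
    return PASSWORD.match(password) is not None


print(is_valid_password("aBcd1#ef"))   # True
print(is_valid_password("aaBcd1#e"))   # True: a doubled letter is allowed
print(is_valid_password("aaaBcd1#"))   # False: 'a' three times in a row
print(is_valid_password("aBcd1#e"))    # False: only 7 characters
print(is_valid_password("#aBcd1ef"))   # False: starts with a special char
```

Building the class with `re.escape` avoids hand-escaping `[`, `]`, `\` and `-` inside the brackets.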
What have you got so far?
Also, which regex flavor are you using?
I'd start with the length of the expression.
Restrict it to be 8-12, something like [a-zA-Z]{8,12}
For the requirements on the first one you can use a []+
For the other requirements it's a little trickier.

Matching words that may contain 1-2 digits

I use the following regex for matching words with a length of 4 that has 1 number and 3 capital letters:
\b(?=[A-Z]*\d[A-Z]*\b)[A-Z\d]{4}\b
What I would like to know is how I need to modify the expression to filter out words with a length of 10 that contain 0-2 digits.
\b(?=[A-Z]*\d[A-Z]*\b)[A-Z\d]{10}\b
This will work for 1 number occurrence, but how do I extend it to filter 0 and 2 numbers as well?
Sample: http://regexr.com?32u40
Put the length check into the lookahead:
\b(?=[A-Z\d]{10}\b)(?:[A-Z]*\d){0,2}[A-Z]*\b
Explanation:
\b # Start at a word boundary
(?= # Assert that...
[A-Z\d]{10} # 10 A-Z/digits follow
\b # until the next word boundary.
) # (End of lookahead)
(?: # Match...
[A-Z]* # Any number of ASCII uppercase letters
\d # and exactly one digit
){0,2} # repeat 0, 1 or 2 times.
[A-Z]* # Match any number of letters
\b # until the next word boundary.
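A quick Python sketch of the pattern on a made-up sample string follows. When tracing the pattern by hand, one detail stands out: on a ten-character word with more than two digits it can still succeed with an empty match at the start of the word, because the closing `\b` is satisfied at the same position as the opening one. The sketch therefore discards empty matches. The variable names and sample words are assumptions for illustration.

```python
import re

WORD = re.compile(r"\b(?=[A-Z\d]{10}\b)(?:[A-Z]*\d){0,2}[A-Z]*\b")

# 0, 1, 2 and 3 digits in ten-character words, plus one nine-character word.
sample = "ABCDEFGHIJ AB1DEFGHIJ A1B2CDEFGH A1B2C3DEFG ABCDEFGHI"

# Keep only non-empty matches.
words = [m for m in WORD.findall(sample) if m]
print(words)   # ['ABCDEFGHIJ', 'AB1DEFGHIJ', 'A1B2CDEFGH']

# The three-digit word yields an empty match rather than no match.
print(repr(WORD.search("A1B2C3DEFG").group()))   # ''
```

If the pattern is used for yes/no validation rather than highlighting, the match should be checked for non-zero length.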