Regex for pattern with hyphens - regex

Is there a regex in Java that allows alphanumeric characters (both upper and lower case), has to start with a letter, could end with a letter or a digit and also contain hyphens in the middle?
I have ^[a-zA-Z][A-Za-z0-9-]$ but not sure if it could work for all cases.

^[A-Za-z]([A-Za-z0-9-]*[A-Za-z0-9])?$
^[A-Za-z]: starting with a letter
(...)?$: optionally followed by this group, and end in it
[A-Za-z0-9-]*: any number of letters, digits and hyphens
[A-Za-z0-9]: one letter or digit
You need point 2 or you'll miss single-letter sequences, which are also valid according to your description.
With Python, I do this:
(?i)^[a-z]([a-z\d-]*[a-z\d])?$
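To see how the pattern from the answer behaves on edge cases, here is a minimal sketch in Python rather than Java (this particular expression uses no syntax that differs between the two). The helper name is_valid and the sample strings are made up for illustration.

```python
import re

# Pattern from the answer: a letter, optionally followed by any run of
# letters/digits/hyphens that ends in a letter or digit.
PATTERN = re.compile(r"^[A-Za-z]([A-Za-z0-9-]*[A-Za-z0-9])?$")

# The asker's original attempt, kept for comparison: it only ever
# matches exactly two characters.
ATTEMPT = re.compile(r"^[a-zA-Z][A-Za-z0-9-]$")


def is_valid(text: str) -> bool:
    """Return True if text satisfies the rules described in the question."""
    # fullmatch is used because "$" in Python also matches just before
    # a trailing newline.
    return PATTERN.fullmatch(text) is not None


# Hypothetical sample inputs, one per rule.
for sample in ["a", "a1", "ab-cd", "a-b-9", "a-", "1abc", "-abc", ""]:
    print(f"{sample!r:10} {is_valid(sample)}")
```

The asker's attempt accepts a- (trailing hyphen) and rejects ab-cd (longer than two characters), which is why the optional group with a separate final character class is needed.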

Related

Regex How can I find names only written in upper-case letters(mandatory), and the names may also contain numbers, literal spaces, and hyphens

If I want to find "KFC", "EU 8RF", and IK-OTP simultaneously, what should the code look like?
My code is :
db.business.find({name:/^[A-Z\s?\d?\-?]*$/}, {name:1}).sort({"name":1})
but it will return the name that is whole number, such as 1973, 1999. How should I improve my code? TIA
Use a lookahead to require at least one letter.
^(?=.*[A-Z])[A-Z\d\s-]+$
You can use
^(?=[\d -]*[A-Z])[A-Z\d]+(?:[ -][A-Z\d]+)*$
See the regex demo.
Details:
^ - start of string
(?=[\d -]*[A-Z]) - a positive lookahead that requires an uppercase ASCII letter after any zero or more digits, spaces or hyphens immediately to the right of the current location
[A-Z\d]+ - one or more uppercase ASCII letters or digits
(?:[ -][A-Z\d]+)* - zero or more repetitions of a space or - and then one or more uppercase ASCII letters or digits
$ - end of string.
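The two answers differ in how strict they are about separators. The sketch below compares them in Python instead of the MongoDB shell. The names LOOSE and STRICT and the extra samples KFC- and "EU  8RF" (two spaces) are assumptions added for illustration.

```python
import re

# First answer: any mix of uppercase letters, digits, whitespace and
# hyphens, as long as at least one uppercase letter is present.
LOOSE = re.compile(r"^(?=.*[A-Z])[A-Z\d\s-]+$")

# Second answer: a single space or hyphen is only allowed between two
# alphanumeric chunks.
STRICT = re.compile(r"^(?=[\d -]*[A-Z])[A-Z\d]+(?:[ -][A-Z\d]+)*$")

samples = ["KFC", "EU 8RF", "IK-OTP", "1973", "KFC-", "EU  8RF"]

results = {
    name: (bool(LOOSE.fullmatch(name)), bool(STRICT.fullmatch(name)))
    for name in samples
}

for name, (loose, strict) in results.items():
    print(f"{name!r:10} loose={loose} strict={strict}")
```

Both patterns accept the three names from the question and reject 1973. Only the stricter one also rejects a trailing hyphen or doubled spaces.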

Regex To Validate A String, But The String Can't Contain n Number Of A Specific Character

Recently I ran into a validation situation I've been trying to solve with regex. The rules are as such:
Must start with a capital letter
Center of the string may be of any length
Center of the string may have any combination of upper and lower case letters and numbers
Center of the string may have up to one underscore
Must end with a number
I have attempted to match this string with the following regex:
^(?!_{2,})([A-Z][a-zA-Z0-9_]*[0-9])$
and
^(?<=_{0,1})([A-Z][a-zA-Z0-9_]*[0-9])$
Both of these attempts still match cases where there is more than one underscore present, e.g. App_l_e9 or App__le9.
How can you check whether your regex match, i.e. the ([A-Z][a-zA-Z0-9_]*[0-9]) part, contains zero or one underscore anywhere within the middle of the string?
The simplest approach would probably be this
^[A-Z][a-zA-Z0-9]*_?[a-zA-Z0-9]*[0-9]$
Explanation:
^[A-Z] Must start with an uppercase letter
[a-zA-Z0-9]* A combination of uppercase and lowercase letters and numbers of any length (also 0-length)
_? Either zero or one underscore character
[a-zA-Z0-9]* Again A combination of uppercase and lowercase letters and numbers of any length (also 0-length)
[0-9]$ Must end with a number
This will accept A_9 or AA0_xY8 but for instance not aXY_34 or Aasf1__asdf5
If the underscore must not be the first or last character of the middle part, you can replace each * with a + like this.
^[A-Z][a-zA-Z0-9]+_?[a-zA-Z0-9]+[0-9]$
This won't accept A_9 anymore. Because both middle parts now need at least one character, the shortest accepted strings have four characters (for instance Axd9), or five with an underscore (for instance Ax_d9).
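A short Python sketch of the two variants, to make the difference between * and + concrete. The names AT_MOST_ONE and INNER_ONLY and the sample list are assumptions for illustration only.

```python
import re

# Variant with "*": the underscore may touch the first or last character.
AT_MOST_ONE = re.compile(r"^[A-Z][a-zA-Z0-9]*_?[a-zA-Z0-9]*[0-9]$")

# Variant with "+": at least one letter/digit on each side of the
# optional underscore, so the minimum length grows to four.
INNER_ONLY = re.compile(r"^[A-Z][a-zA-Z0-9]+_?[a-zA-Z0-9]+[0-9]$")

samples = ["A_9", "AA0_xY8", "Ax_d9", "Axd9", "A9", "aXY_34",
           "Aasf1__asdf5", "App_l_e9"]

results = {
    s: (bool(AT_MOST_ONE.fullmatch(s)), bool(INNER_ONLY.fullmatch(s)))
    for s in samples
}

for s, (star, plus) in results.items():
    print(f"{s!r:16} star={star} plus={plus}")
```

Note that the + variant also rejects short strings without any underscore, such as A9, which may or may not be what you want.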
You might also start the match with an uppercase A-Z and immediately assert, with a positive lookahead, that the string ends with a digit 0-9. Inputs that do not end in a digit then fail right away, before the rest of the pattern is tried.
^[A-Z](?=.*[0-9]$)[a-zA-Z0-9]*_?[a-zA-Z0-9]*$
^ Start of string
[A-Z] Match an uppercase char A-Z
(?=.*[0-9]$) Positive lookahead to assert a digit 0-9 at the end of the string
[a-zA-Z0-9]* Optionally match any of the listed
_? Match an optional _
[a-zA-Z0-9]* Optionally match any of the listed
$ End of string
Or with an optional group
^[A-Z](?=.*[0-9]$)[a-zA-Z0-9]*(?:_[a-zA-Z0-9]*)?$
Regex demo
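As a sanity check, the lookahead variants can be compared with the plain pattern on a handful of inputs. This is only a sketch in Python: the names PLAIN, LOOKAHEAD and OPTIONAL_GROUP and the samples are assumptions, and agreement on these samples is not a proof of equivalence.

```python
import re

PLAIN = re.compile(r"^[A-Z][a-zA-Z0-9]*_?[a-zA-Z0-9]*[0-9]$")
LOOKAHEAD = re.compile(r"^[A-Z](?=.*[0-9]$)[a-zA-Z0-9]*_?[a-zA-Z0-9]*$")
OPTIONAL_GROUP = re.compile(
    r"^[A-Z](?=.*[0-9]$)[a-zA-Z0-9]*(?:_[a-zA-Z0-9]*)?$"
)

samples = ["A9", "A_9", "Apple9", "App_le9", "App_l_e9", "App__le9",
           "Apple", "apple9", "A_"]

# For each sample, record the verdict of all three patterns.
results = {
    s: tuple(bool(p.fullmatch(s)) for p in (PLAIN, LOOKAHEAD, OPTIONAL_GROUP))
    for s in samples
}

for s, verdicts in results.items():
    print(f"{s!r:12} {verdicts}")
```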

Extend regular expression

I want to find invoice numbers with a regex. The string has to be longer than 3 characters. It may contain signs like {., , /, _}, all numbers, and it may contain one or two capital letters, which can stand alone or follow each other. This is what I'm currently trying, without success:
`([0-9-\.\\\/_]{,3})([A-Z]{0,2})?`
Here I have two examples, which should be matched:
019S836/03717008
DR094255
This should not be matched:
DRF094255
Can somebody help me please?
You can use
^(?!(?:[^A-Z]*[A-Z]){3})(?=\D*\d)[0-9A-Z.\\\/_-]{4,}$
See the regex demo.
Details:
^ - start of string
(?!(?:[^A-Z]*[A-Z]){3}) - a negative lookahead that fails the match if, immediately to the right of the current location (i.e. from the start of string), there are three occurrences of any zero or more chars other than uppercase ASCII letters followed with one uppercase ASCII letter
(?=\D*\d) - there must be at least one digit in the string
[0-9A-Z.\\\/_-]{4,} - four or more occurrences of digits, uppercase letters, ., \, /, _ or -
$ - end of string.
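The pattern can be checked against the examples from the question with a few lines of Python. This sketch uses the {4,} quantifier described in the details, in line with the "longer than 3 characters" requirement. The name INVOICE and the extra samples A12, ABCD and 12-34 are assumptions for illustration.

```python
import re

INVOICE = re.compile(
    r"^(?!(?:[^A-Z]*[A-Z]){3})"   # fewer than three uppercase letters
    r"(?=\D*\d)"                  # at least one digit
    r"[0-9A-Z.\\\/_-]{4,}$"       # four or more allowed characters
)

should_match = ["019S836/03717008", "DR094255", "12-34"]
should_fail = ["DRF094255", "A12", "ABCD"]

for number in should_match + should_fail:
    print(f"{number!r:20} {bool(INVOICE.fullmatch(number))}")
```

DRF094255 is rejected by the negative lookahead (three uppercase letters), A12 by the length requirement, and ABCD by both the uppercase limit and the missing digit.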

Validating an obfuscation token

I am building a secured algorithm to get rid of obfuscation attacks. The user is validated with the token which should satisfy following condition:
username in lowercase letters only and username is at least 5 characters long.
username is followed with #.
After # first two characters are important. A digit and a character always. This part contains at least a digit, a lowercase and an upperCase Letter.
In between there could be any number of digits or letters only.
At the end, the digit and character should exactly match the digit and character from point 3.
It should end with #.
The characters in the middle of two # should be at least 5 characters long.
The complete token consists only of two #, lowercase and uppercase letters and digits. And
I don't know about regular expression but my guide told me this task is easily achieved at validation time by regular expressions. After I looked for long on the internet and found some links which are similar and tried to combine them and got this:
^[a-z]{5,}#[a-zA-Z0-9]{2}[A-Z][0-9A-Za-z]*[a-zA-Z0-9]{2}#$
But this only matches 1 test case. I don't know how I can achieve the middle part of two hashes. I tried to explain my problem as per my english. Please help.
Below test cases should pass
userabcd#4a39A234a#
randomuser#4A39a234A#
abcduser#2Aa39232A#
abcdxyz#1q39A231q#
randzzs#1aB1a#
Below test cases should fail:
randuser#1aaa1a#
randuser#1112#
randuser#a1a1##
randuser#1aa#
u#4a39a234a#
userstre#1qqeqe123231q$
user#1239a23$a#
useabcd#4a39a234a#12
You may try:
^[a-z]{5,}#(?=[^a-z\n]*[a-z])(?=[^A-Z\n]*[A-Z])(\d[a-zA-Z])[a-zA-Z\d]*\1#$
Explanation of the above regex:
^, $ - Represents start and end of the line respectively.
[a-z]{5,} - Matches the username: five or more lowercase letters.
# - Matches # literally.
(?=[^a-z\n]*[a-z]) - A positive lookahead asserting at least one lowercase letter.
(?=[^A-Z\n]*[A-Z]) - A positive lookahead asserting at least one uppercase letter.
(\d[a-zA-Z]) - A capturing group matching the first two characters, i.e. a digit and a letter. If you want them the other way round, use [a-zA-Z]\d.
[a-zA-Z\d]* - Matching zero or more of the characters in mentioned character set.
\1 - Represents back-reference exactly matching the captured group.
You can find the demo of the above regex in here.
Note: The \n in the character sets only matters when testing several lines at once. If you validate one string at a time, as you would in practice, you can remove \n from the character sets.
You can use this regex as an alternative.
^[a-z]{5,}#(?=.*?[a-z])(?=.*?[A-Z])(\d[a-zA-Z])[a-zA-Z\d]*\1#$
Recommended reading: Principle of contrast
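The test cases from the question can be run against the alternative pattern with a small Python script. This is a sketch, not the asker's actual validation code. The names TOKEN and is_valid_token are assumptions.

```python
import re

TOKEN = re.compile(
    r"^[a-z]{5,}#"        # username: five or more lowercase letters, then #
    r"(?=.*?[a-z])"       # at least one lowercase letter after the #
    r"(?=.*?[A-Z])"       # at least one uppercase letter after the #
    r"(\d[a-zA-Z])"       # group 1: a digit followed by a letter
    r"[a-zA-Z\d]*"        # any letters or digits in between
    r"\1#$"               # the same digit+letter pair again, then #
)


def is_valid_token(token: str) -> bool:
    return TOKEN.fullmatch(token) is not None


should_pass = [
    "userabcd#4a39A234a#", "randomuser#4A39a234A#", "abcduser#2Aa39232A#",
    "abcdxyz#1q39A231q#", "randzzs#1aB1a#",
]
should_fail = [
    "randuser#1aaa1a#", "randuser#1112#", "randuser#a1a1##", "randuser#1aa#",
    "u#4a39a234a#", "userstre#1qqeqe123231q$", "user#1239a23$a#",
    "useabcd#4a39a234a#12",
]

for token in should_pass + should_fail:
    print(f"{token!r:28} {is_valid_token(token)}")
```

The minimum length of five for the middle part is not written as a quantifier. It follows from the pattern: the pair appears twice (four characters), and since both copies have the same letter case, at least one more character is needed to supply the other case.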

RegEx - Password Strength

I'm trying to make a regex for allowing only strong passwords, strong in this case being defined as:
Must start with a letter (either uppercase or lowercase)
Must have at least 8 and up to 12 characters
Must have at least one uppercase letter
Must have at least three lowercase letters
Must have at least two numbers
Must have at least two special characters
Maximum number of identical consecutive characters is two
Now, last one is giving me trouble. How do I count consecutive characters?
For example, FOOfoo!?123 should work, but FOOOfoo!?12 should not (because of the three consecutive O's).
What I've got so far:
^[A-Za-z]{1}(?=.*[A-Z]{1,})(?=.*[a-z]{3,})(?=.*[0-9]{2,})(?=.*[!?#*#&$]{2,}).{8,12}$
One more thing: something is amiss, because my regex above claims strings like FooFoo!?123 are invalid. I think it's because it only checks for one or more uppercase letters or three or more lowercase letters or numbers or specials in a row, but I don't want that. If the password contains three lowercase letters in total, it should be valid. How do I do that?
When you have so many conditions, it might be a good idea - provided your environment allows that - to split the regex and check each condition separately.
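Splitting the conditions could look like the following Python sketch. The function name is_strong and the constant SPECIALS are assumptions. The special characters are taken from the character class in the question, and the consecutive-character rule follows the question's examples (three identical characters in a row are rejected).

```python
import re

# Special characters copied from the question's character class.
SPECIALS = "!?#*&$"


def is_strong(password: str) -> bool:
    """Check each rule from the question as a separate condition."""
    checks = [
        8 <= len(password) <= 12,                      # length 8 to 12
        re.match(r"[A-Za-z]", password) is not None,   # starts with a letter
        len(re.findall(r"[A-Z]", password)) >= 1,      # 1+ uppercase
        len(re.findall(r"[a-z]", password)) >= 3,      # 3+ lowercase
        len(re.findall(r"[0-9]", password)) >= 2,      # 2+ digits
        sum(ch in SPECIALS for ch in password) >= 2,   # 2+ specials
        re.search(r"(.)\1{2}", password) is None,      # no 3 identical in a row
    ]
    return all(checks)


for candidate in ["FOOfoo!?123", "FOOOfoo!?12", "FooFoo!?123", "Foo!?12"]:
    print(f"{candidate!r:16} {is_strong(candidate)}")
```

Each condition stays readable on its own, and a failing check can be reported to the user individually, which is hard to do with a single combined regex.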
If you cannot do that, here is a free-spacing version of the fixed regex:
^ # start of string
(?=[^A-Z]*[A-Z]) # At least 1 uppercase ASCII letter
(?=(?:[^a-z]*[a-z]){3}) # at least 3 lowercase ASCII letters
(?=(?:[^0-9]*[0-9]){2}) # at least 2 ASCII digits
(?=(?:[^!?#*#&$]*[!?#*#&$]){2}) # at least 2 special symbols
(?!.*(.)\1{2}) # No 3 identical consecutive characters
[A-Za-z] # An ASCII letter
.{7,11} # 7 to 11 any characters but newline
$ # end of string
As a one-liner:
^(?=[^A-Z]*[A-Z])(?=(?:[^a-z]*[a-z]){3})(?=(?:[^0-9]*[0-9]){2})(?=(?:[^!?#*#&$]*[!?#*#&$]){2})(?!.*(.)\1{2})[A-Za-z].{7,11}$
See the regex demo
Notes:
Must have at least three lowercase letters and similar conditions are implemented using the principle of contrast, i.e. before [a-z], we may have 0+ opposite chars matched with [^a-z].
To match the 3 letters globally, not consecutively, we need to use a limiting quantifier on the grouping, not on the character class, thus, [a-z]{3,} (=consecutive 3 or more lowercase letters) is turned into (?:[^a-z]*[a-z]){3} (=3 sequences of non-lowercase letters followed with 1 lowercase letter).
The condition you needed is (?!.*(.)\1{2}) - a negative lookahead ((?!...)) that checks for the presence of any character captured with (.) that is repeated twice after it with the \1 backreference and {2} limiting quantifier set on the backreference. And .* means that the repeated characters may appear anywhere in the string.
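For reference, the free-spacing version can be tried out in Python with the re.VERBOSE flag, where a # inside a character class is still a literal character. This is a sketch; the name STRONG and the last sample string are assumptions.

```python
import re

STRONG = re.compile(
    r"""
    ^                                 # start of string
    (?=[^A-Z]*[A-Z])                  # at least 1 uppercase ASCII letter
    (?=(?:[^a-z]*[a-z]){3})           # at least 3 lowercase ASCII letters
    (?=(?:[^0-9]*[0-9]){2})           # at least 2 ASCII digits
    (?=(?:[^!?#*#&$]*[!?#*#&$]){2})   # at least 2 special symbols
    (?!.*(.)\1{2})                    # no 3 identical consecutive characters
    [A-Za-z]                          # first character: an ASCII letter
    .{7,11}                           # 7 to 11 more characters (no newline)
    $                                 # end of string
    """,
    re.VERBOSE,
)

for candidate in ["FOOfoo!?123", "FOOOfoo!?12", "FooFoo!?123", "Foo!?12"]:
    print(f"{candidate!r:16} {bool(STRONG.fullmatch(candidate))}")
```

FooFoo!?123 is accepted here because the three lowercase letters are counted across the whole string rather than required as one consecutive run.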