Regex with more than one OR/AND operator - regex

I'm trying to match text that is:
a combination of numbers and letters, and might contain [:,.]
OR
a * character plus at least one number OR letter (not necessarily in this order)
Meaning my regex should match all these
Bf1305020008401 6798ubbii230693
Nettbank til: Troij iudh Betalt: 03.05.13
7509*30.04
*87589
but not these:
0205
252,25

Yes, regex alternation with | does not have the meaning in a character group (e.g. [a-z|0-9]) that it does elsewhere in a pattern. (Think of it as implied between characters & character ranges within a character group, making it redundant.)
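A quick way to check this is sketched below using Python's re module. The choice of Python is an assumption, since the question does not name a language, and the variable names are purely illustrative.

```python
import re

# Inside a character class, | is just another literal character.
char_class = re.compile(r"[a-z|0-9]")

# The class accepts a lone pipe, so | is not acting as alternation here.
pipe_in_class = char_class.fullmatch("|") is not None

# Outside a class, | separates alternatives and a lone pipe is rejected.
alternation = re.compile(r"[a-z]|[0-9]")
pipe_in_alternation = alternation.fullmatch("|") is not None

print(pipe_in_class, pipe_in_alternation)
```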
Pattern
This pattern should do what you need:
^((?=^.{0,}[0-9])(?=^.{0,}[a-zA-Z])[0-9a-zA-Z :,.]{2,}|(?!^\*$)(?=^[0-9.a-zA-Z]{0,}\*[0-9.a-zA-Z]{0,})(?!^[0-9.a-zA-Z]{0,}\*[0-9.a-zA-Z]{0,}\*)[*0-9.a-zA-Z]{2,})$
It matches...
Bf1305020008401 6798ubbii230693
Nettbank til: Troij iudh Betalt: 03.05.13
7509*30.04
*87589
...and does not match...
0205
252,25
...as you require.
You can try the pattern with the inputs you specified in a regex fiddle.
Explanation
Some explanation for the 1st subpattern (on the left side of the |) matching your 1st set of match criteria:
(?=^.{0,}[0-9]) - Assert that a number appears in the string.
(?=^.{0,}[a-zA-Z]) - Assert that a letter also (i.e. AND) appears in the string.
[0-9a-zA-Z :,.]{2,} - "a combination of numbers and letters, and might contain [ :,.]" (assuming the aforementioned assertions)
Similarly, some explanation for the 2nd subpattern (on the right side of the |) matching your 2nd set of match criteria:
(?!^\*$) - Assert that the string is not just *.
(?=^[0-9.a-zA-Z]{0,}\*[0-9.a-zA-Z]{0,}) - Assert that the string contains *.
(?!^[0-9.a-zA-Z]{0,}\*[0-9.a-zA-Z]{0,}\*) - Assert that the string does not contain more than one *.
[*0-9.a-zA-Z]{2,} - "a * character plus at least one number OR letter (not necessarily in this order)" (assuming the aforementioned assertions)
There is probably room to sand & polish the pattern - especially the lookahead assertions for * in the second subpattern I suspect; but it works and conveys the strategy I employed of multiple lookahead assertions to constrain each of the two subpatterns to fit your requirements.
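If no regex fiddle is at hand, the pattern can be checked with a small script. This is a minimal sketch in Python's re module (an assumption, as the question does not say which engine is in use); the helper name is_valid is hypothetical.

```python
import re

# The pattern from the answer, split over three string literals for readability.
PATTERN = re.compile(
    r"^((?=^.{0,}[0-9])(?=^.{0,}[a-zA-Z])[0-9a-zA-Z :,.]{2,}"
    r"|(?!^\*$)(?=^[0-9.a-zA-Z]{0,}\*[0-9.a-zA-Z]{0,})"
    r"(?!^[0-9.a-zA-Z]{0,}\*[0-9.a-zA-Z]{0,}\*)[*0-9.a-zA-Z]{2,})$"
)


def is_valid(text: str) -> bool:
    """Return True when the whole string satisfies either subpattern."""
    return PATTERN.match(text) is not None


should_match = [
    "Bf1305020008401 6798ubbii230693",
    "Nettbank til: Troij iudh Betalt: 03.05.13",
    "7509*30.04",
    "*87589",
]
should_not_match = ["0205", "252,25"]

for sample in should_match + should_not_match:
    print(repr(sample), is_valid(sample))
```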

As you commented below, I think you do want a full-line match, and by "numbers and letters" I take it you mean that both digits and letters must occur in a valid match.
And from "a * character plus at least one number OR letter" I assume that "*" occurs only once in the match.
Maybe you could try this one:
(^(?=.*[a-zA-Z]+)(?=.*[0-9]+)[0-9a-zA-Z :,.]+$)|(^[a-zA-Z0-9.]*\*[a-zA-Z0-9.]+$)|(^[a-zA-Z0-9.]+\*[a-zA-Z0-9.]*$)
It matches:
Bf1305020008401 6798ubbii230693
Nettbank til: Troij iudh Betalt: 03.05.13
7509*30.04
*87589
123456*
.*.
test123
123test
But won't match any of:
0205
252,25
*
123*345*789
rebound
test
123
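The two lists above can be verified with a short script. This is a sketch in Python's re module, chosen here as an assumption; the names matches, expected_hits and expected_misses are illustrative only.

```python
import re

# Three anchored alternatives: digits-and-letters text, "*" followed by
# at least one allowed character, or at least one allowed character then "*".
PATTERN = re.compile(
    r"(^(?=.*[a-zA-Z]+)(?=.*[0-9]+)[0-9a-zA-Z :,.]+$)"
    r"|(^[a-zA-Z0-9.]*\*[a-zA-Z0-9.]+$)"
    r"|(^[a-zA-Z0-9.]+\*[a-zA-Z0-9.]*$)"
)


def matches(text: str) -> bool:
    return PATTERN.match(text) is not None


expected_hits = [
    "Bf1305020008401 6798ubbii230693",
    "Nettbank til: Troij iudh Betalt: 03.05.13",
    "7509*30.04",
    "*87589",
    "123456*",
    ".*.",
    "test123",
    "123test",
]
expected_misses = ["0205", "252,25", "*", "123*345*789", "rebound", "test", "123"]

for sample in expected_hits + expected_misses:
    print(repr(sample), matches(sample))
```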
Original:
This should work
(^[A-Za-z0-9 ]*(([A-Za-z]+[ ]*[0-9]+)|([0-9]+[ ]*[A-Za-z]+))[A-Za-z0-9 ]*$)|(^\*[A-Za-z0-9]+$)

Related

How to make regex that can take at most one asterisk in character class?

I want to create a regular expression that matches a string that starts with an optional minus sign - and ends with a minus sign. The part in between must begin with a letter (upper or lower case), which can be followed by any combination of letters and numbers and may contain at most one asterisk (*).
So far I have come up with this
[-]?[a-zA-Z]+[a-zA-Z0-9(*{0,1})]*[-]
Some examples of what I am trying to achieve.
"-yyy-" // valid
"-u8r*y75-" // valid
"-u8r**y75-" // invalid
Code
See regex in use here
^-?[a-z](?!(?:.*\*){2})[a-z\d*]*-$
Alternatively, you can use the following regex to achieve the same results without using a negative lookahead.
See regex in use here
^-?[a-z][a-z\d]*(?:\*[a-z\d]*)?-$
Results
Input
** VALID **
-yyy-
-u8r*y75-
** INVALID **
-u8r**y75-
Output
-yyy-
-u8r*y75-
Explanation
^ Assert position at the start of the line
-? Match zero or one of the hyphen character -
[a-z] Match a single ASCII alpha character between a and z. Note that the i modifier is turned on, thus this will also match uppercase variations of the same letters
(?!(?:.*\*){2}) Negative lookahead ensuring what follows doesn't match
(?:.*\*){2} Match any characters followed by an asterisk *, twice (i.e. two asterisks anywhere in the rest of the string)
[a-z\d*]* Match any ASCII letter between a and z, or a digit, or the asterisk symbol * literally, any number of times
- Match this character literally
$ Assert position at the end of the line
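Both variants can be compared side by side. The sketch below assumes Python's re module and uses re.IGNORECASE to stand in for the i modifier mentioned in the explanation; the names WITH_LOOKAHEAD, WITHOUT_LOOKAHEAD and check are hypothetical.

```python
import re

# Variant 1: a negative lookahead rejects a second asterisk anywhere ahead.
WITH_LOOKAHEAD = re.compile(r"^-?[a-z](?!(?:.*\*){2})[a-z\d*]*-$", re.IGNORECASE)

# Variant 2: the structure itself allows at most one asterisk.
WITHOUT_LOOKAHEAD = re.compile(r"^-?[a-z][a-z\d]*(?:\*[a-z\d]*)?-$", re.IGNORECASE)


def check(text: str) -> tuple[bool, bool]:
    """Return the verdict of both variants for one input."""
    return (
        WITH_LOOKAHEAD.match(text) is not None,
        WITHOUT_LOOKAHEAD.match(text) is not None,
    )


for sample in ["-yyy-", "-u8r*y75-", "-u8r**y75-", "yyy-"]:
    print(sample, check(sample))
```

The second variant avoids the lookahead altogether, which may be easier to read and to port to engines with limited lookaround support.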
Try this one:
-(((\w|\d)*)(\*?)((\w|\d)*))-
You can try it here:
https://regex101.com/
(-)?(\w)+(\*(?!\*)|\w+)(-)
I used grouping to make it clearer. I changed [a-zA-Z0-9] to \w, which is close but not identical: \w also matches the underscore.
(\*(?!\*)|\w+)
This is the important change. Explained in words:
Match a star \* only if the following char is not a star (?!\*) (this is a negative lookahead: it inspects what comes next without consuming it), or else match one or more \w characters.
Use this site to test: https://regexr.com/
They have a pretty good explanation on the left menu under "Reference".

RegEx expression to handle multiple conditions of breaking sentences

I am trying to make a regex that is used in an exception.
Therefore it must return false for these sentences (the leading digits are included in the strings):
3.{17} this is italics and should break.{18} 
4. this is another sentence and should break. 
5. This is another sentence and should break. 
And it must return true for these:
There are 2 reasons for this 1. you are here and 2. you are communicating. 
Is it 2? they wanted to know. 
1 digit at the beginning but with 1. with a period should return true.
In other words, if the beginning of the string is a number followed by a period, it should return false (even if "\{\d+\}" follows it optionally) and the character following the space does not matter. And it must return true if the number and period (or ! or ?) is embedded in the sentence followed by a lower case character, in other cases it must be false.
As a further note: this goes into a java properties file, and the value is then passed to a perl5 regex engine to return broken text.
I try to express it in one expression, but somehow I cannot get it right.
This is what I have come up with so far:
^([^0-9\.]+[\.]|
[^\.!\?]*[\?!]+[\?!\.]+|
[0-9]+[^\?!\.]+[\?!\.]+|
[^0-9]*[0-9]+[^\?!\.]+[\?!\.]+)
(\{\d+\}[\u0020\u00A0]|
[\u0020\u00A0]*)[a-z]
I seem to have arrived at an impasse and can't see what I have wrong.
Thanks for any advice.
Update:
A simpler format with look-ahead: ^(?!\d+\.)[^.!?]*[.!?]+(\{\d+\}\s|\s*)\p{Ll} based on the comments.
You may use
^(?!\d+\.)[^.!?]*[.!?]+(\{\d+\}\s|\s*)\p{Ll}
See the regex demo.
The pattern matches:
^ - start of string anchor
(?!\d+\.) - a negative lookahead that will fail the match if its pattern is matched at the start of the string: 1+ digits followed with a dot
[^.!?]* - 0+ chars other than ., ! and ?
[.!?]+ - 1 or more ., ! or ? symbols
(\{\d+\}\s|\s*) - either a { + 1 or more digits + } followed by a whitespace character, or 0+ whitespaces (if you are not interested in the value captured with this capturing group, you may turn it into a non-capturing one by adding ?: after the opening ().
\p{Ll} - a lowercase letter (if a u modifier is used, it will also match all Unicode lowercase letters).
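The behaviour on the sample sentences can be sketched as follows. Note the substitution: Python's built-in re module does not support \p{Ll}, so this sketch swaps in [a-z], which covers ASCII lowercase only. The target here is a Perl 5 engine, where \p{Ll} is available, so treat this as an approximation. The helper name should_not_break is hypothetical.

```python
import re

# Same structure as the answer's pattern, with [a-z] standing in for \p{Ll}.
PATTERN = re.compile(r"^(?!\d+\.)[^.!?]*[.!?]+(\{\d+\}\s|\s*)[a-z]")


def should_not_break(sentence: str) -> bool:
    """True when the exception applies, i.e. the pattern matches."""
    return PATTERN.match(sentence) is not None


must_be_false = [
    "3.{17} this is italics and should break.{18} ",
    "4. this is another sentence and should break. ",
    "5. This is another sentence and should break. ",
]
must_be_true = [
    "There are 2 reasons for this 1. you are here and 2. you are communicating. ",
    "Is it 2? they wanted to know. ",
    "1 digit at the beginning but with 1. with a period should return true.",
]

for s in must_be_false + must_be_true:
    print(should_not_break(s), repr(s))
```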

Matching any password except one containing repeating characters [duplicate]

Edit: Thanks for the advice to make my question clearer :)
The Match is looking for 3 consecutive characters:
Regex Match =AaA653219
Regex Match = AA5556219
The code is ASP.NET 4.0. Here is the whole function:
public ValidationResult ApplyValidationRules()
{
    ValidationResult result = new ValidationResult();
    // Verbatim string literal (@ prefix), so the backslashes need no escaping.
    Regex regEx = new Regex(@"^(?=.*\d)(?=.*[a-zA-Z]).{8,20}$");
    bool valid = regEx.IsMatch(_Password);
    if (!valid)
        result.Errors.Add("Passwords must be 8-20 characters in length, contain at least one alpha character and one numeric character");
    return result;
}
I've tried for over 3 hours to make this work, referencing the below with no luck =/
How can I find repeated characters with a regex in Java?
.net Regex for more than 2 consecutive letters
I have started with this for 8-20 characters a-Z 0-9 :
^(?=.*\d)(?=.*[a-zA-Z]).{8,20}$
As Regex regEx = new Regex(@"^(?=.*\d)(?=.*[a-zA-Z]).{8,20}$");
I've tried adding variations of the below with no luck:
/(.)\1{9,}/
.*([0-9A-Za-z])\\1+.*
((\\w)\\2+)+".
Any help would be much appreciated!
http://regexr.com?34vo9
The regular expression:
^(?=.{8,20}$)(([a-z0-9])\2?(?!\2))+$
The first lookahead ((?=.{8,20}$)) checks the length of your string. The second portion does your double character and validity checking by:
(
([a-z0-9]) Matching a character and storing it in a back reference.
\2? Optionally match one more EXACT COPY of that character.
(?!\2) Make sure the upcoming character is NOT the same character.
)+ Do this ad nauseam.
$ End of string.
Okay. I see you've added some additional requirements. My basic formula still works, but we have to give you more of a step-by-step approach. So:
^...$
Your whole regular expression will be dropped into start and end characters, for obvious reasons.
(?=.{n,m}$)
Length checking. Put this at the beginning of your regular expression with n as your minimum length and m as your maximum length.
(?=(?:[^REQ]*[REQ]){n,m})
Required characters. Place this at the beginning of your regular expression with REQ as your required character to require n to m of your character. You may drop the (?: ..){n,m} to require just one of that character.
(?:([VALID])\1?(?!\1))+
The rest of your expression. Replace VALID with your valid Characters. So, your Password Regex is:
^(?=.{8,20}$)(?=[^A-Za-z]*[A-Za-z])(?=[^0-9]*[0-9])(?:([\w\d*?!:;])\1?(?!\1))+$
'Splained:
^
(?=.{8,20}$) 8 to 20 characters
(?=[^A-Za-z]*[A-Za-z]) At least one Alpha
(?=[^0-9]*[0-9]) At least one Numeric
(?:([\w\d*?!:;])\1?(?!\1))+ Valid Characters, not repeated thrice.
$
http://regexr.com?34vol Here's the new one in action.
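The assembled password pattern can also be checked offline. The question uses .NET, but the sketch below runs the same pattern through Python's re module, on the assumption that lookaheads and backreferences behave the same way for these inputs. The helper name is_acceptable and the sample passwords other than the two from the question are made up for illustration.

```python
import re

PASSWORD = re.compile(
    r"^(?=.{8,20}$)"            # 8 to 20 characters in total
    r"(?=[^A-Za-z]*[A-Za-z])"   # at least one letter
    r"(?=[^0-9]*[0-9])"         # at least one digit
    r"(?:([\w\d*?!:;])\1?(?!\1))+$"  # allowed chars, never three in a row
)


def is_acceptable(password: str) -> bool:
    return PASSWORD.match(password) is not None


samples = {
    "AaA653219": True,      # from the question: no run of three
    "AA5556219": False,     # from the question: "555" is a run of three
    "Pass1122word": True,   # pairs are allowed
    "Ab1": False,           # too short
    "abcdefgh": False,      # no digit
    "12345678": False,      # no letter
}

for pw, expected in samples.items():
    print(pw, is_acceptable(pw), expected)
```

Note that the backreference comparison is case-sensitive here, so "AaA" does not count as a repeated character.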
Tightened up the matching criteria, as they were too broad; for example, "not A-Za-z" matches a lot more than is intended. The previous regex was matching on the string "ThiIsNot". For the most part, passwords are only going to contain alphanumeric and punctuation characters, so I limited the scope, which made all matches more accurate. Used character classes for human readability. Added an exclusion list, and differentiated upper and lower case letters.
^(?=.{8,20}$)(?!(?:.*[01IiLlOo]))(?=(?:[\[[:digit:]\]\[[:punct:]\]]*[\[[:alpha:]\]]){2})(?=(?:[\[[:digit:]\]\[[:punct:]\]\[[:upper:]\]]*[\[[:lower:]\]]){1})(?=(?:[\[[:digit:]\]\[[:punct:]\]\[[:lower:]\]]*[\[[:upper:]\]]){1})(?=(?:[\[[:alpha:]\]\[[:punct:]\]]*[\[[:digit:]\]]){1})(?=(?:[\[[:alnum:]\]]*[\[[:punct:]\]]){1})(?:([\[[:alnum:]\]\[[:punct:]\]])\1?(?!\1))+$
The breakdown:
^(?=.{8,20}$) - Positive lookahead that the string is between 8 and 20 chars
(?!(?:.*[01IiLlOo])) - Negative lookahead for any blacklisted chars
(?=(?:[\[[:digit:]\]\[[:punct:]\]]*[\[[:alpha:]\]]){2}) - Verify that at least 2 alpha chars exist
(?=(?:[\[[:digit:]\]\[[:punct:]\]\[[:upper:]\]]*[\[[:lower:]\]]){1}) - Verify that at least 1 lowercase alpha exists
(?=(?:[\[[:digit:]\]\[[:punct:]\]\[[:lower:]\]]*[\[[:upper:]\]]){1}) - Verify that at least 1 uppercase alpha exists
(?=(?:[\[[:alpha:]\]\[[:punct:]\]]*[\[[:digit:]\]]){1}) - Verify that at least 1 digit exists
(?=(?:[\[[:alnum:]\]]*[\[[:punct:]\]]){1}) - Verify that at least 1 special/punctuation char exists
(?:([\[[:alnum:]\]\[[:punct:]\]])\1?(?!\1))+$ - Verify that no char is repeated more than twice in a row

Regular Expression to match set of arbitrary codes

I am looking for some help on creating a regular expression that would work with a unique input in our system. We already have some logic in our keypress event that will only allow digits, and will allow the letter A and the letter M. Now I need to come up with a RegEx that can match the input during the onblur event to ensure the format is correct.
I have some examples below of what would be valid. The letter A represents an age, so it is always followed by up to 3 digits. The letter M can only occur at the end of the string.
Valid Input
1-M
10-M
100-M
5-7
5-20
5-100
10-20
10-100
A5-7
A10-7
A100-7
A10-20
A5-A7
A10-A20
A10-A100
A100-A102
Invalid Input
a-a
a45
4
This matches all of the samples.
/A?\d{1,3}-A?\d{0,3}M?/
Not sure if 10-A10M should or shouldn't be legal, or even if M can appear with numbers. If M is only there without numbers:
/A?\d{1,3}-(A?\d{1,3}|M)/
Use the brute force method if you have a small amount of well defined patterns so you don't get bad corner-case matches:
^(\d+-M|\d+-\d+|A\d+-\d+|A\d+-A\d+)$
Here are the individual regexes broken out:
\d+-M <- matches anything like '1-M'
\d+-\d+ <- 5-7
A\d+-\d+ <- A5-7
A\d+-A\d+ <- A10-A20
/^[A]?[0-9]{1,3}-[A]?[0-9]{1,3}[M]?$/
Matches anything of the form:
A(optional)[1-3 numbers]-A(optional)[1-3 numbers]M(optional)
^A?\d+-(?:A?\d+|M)$
An optional A followed by one or more digits, a dash, and either another optional A and some digits or an M. The '(?: ... )' notation is a Perl 'non-capturing' set of parentheses around the alternatives; it means there will be no '$1' after the regex matches. Clearly, if you wanted to capture the various bits and pieces, you could - and would - do so, and the non-capturing clause might not be relevant any more.
(You could replace the '+' with '{1,3}' as JasonV did to limit the numbers to 3 digits.)
^A?\d{1,3}-(M|A?\d{1,3})$
^ -- the match must be done from the beginning
A? -- "A" is optional
\d{1,3} -- between one and 3 digits; [0-9]{1,3} also works
- -- A "-" character
(...|...) -- Either one of the two expressions
(M|...) -- Either "M" or...
(...|A?\d{1,3}) -- an optional "A" followed by at least one and at most three digits
$ -- the match should be done to the end
Some consequences of changing the format. If you do not put "^" at the beginning, the match may ignore an invalid beginning. For example, "MAAMA0-M" would be matched at "A0-M".
If, likewise, you leave $ out, the match may ignore an invalid trail. For example, "A0-MMMMAAMAM" would match "A0-M".
Using \d is usually preferred, as is \w for alphanumerics, \s for spaces, \D for non-digit, \W for non-alphanumeric or \S for non-space. But you must be careful that \d is not being treated as an escape sequence. You might need to write it \\d instead.
{x,y} means the last match must occur between x and y times.
? means the last match must occur once or not at all.
When using (), it is treated as one match. (ABC)? will match ABC or nothing at all.
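The effect of the anchors described above can be reproduced with a short script. The question targets a JavaScript onblur handler; Python's re module is used here as an assumption, since the pattern contains nothing engine-specific. The names ANCHORED, UNANCHORED, valid_inputs and invalid_inputs are illustrative.

```python
import re

ANCHORED = re.compile(r"^A?\d{1,3}-(M|A?\d{1,3})$")
UNANCHORED = re.compile(r"A?\d{1,3}-(M|A?\d{1,3})")

valid_inputs = [
    "1-M", "10-M", "100-M", "5-7", "5-20", "5-100", "10-20", "10-100",
    "A5-7", "A10-7", "A100-7", "A10-20", "A5-A7", "A10-A20", "A10-A100",
    "A100-A102",
]
invalid_inputs = ["a-a", "a45", "4"]

for text in valid_inputs + invalid_inputs:
    print(text, ANCHORED.match(text) is not None)

# Without ^ and $, a valid fragment inside garbage is still found.
leading_garbage = UNANCHORED.search("MAAMA0-M").group(0)
trailing_garbage = UNANCHORED.search("A0-MMMMAAMAM").group(0)
print(leading_garbage, trailing_garbage)
```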
I’d use this regular expression:
^(?:[1-9]\d{0,2}-(?:M|[1-9]\d{0,2})|A[1-9]\d{0,2}-A?[1-9]\d{0,2})$
This matches either:
<number>-M or <number>-<number>
A<number>-<number> or A<number>-A<number>
Additionally <number> must not begin with a 0.
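A small sketch of how this stricter pattern behaves, again using Python's re module as an assumed stand-in for the JavaScript engine. The inputs "05-7", "5-A7" and "A5-M" do not come from the question; they are added here only to show where this pattern is stricter than the simpler ones.

```python
import re

STRICT = re.compile(
    r"^(?:[1-9]\d{0,2}-(?:M|[1-9]\d{0,2})"    # <number>-M or <number>-<number>
    r"|A[1-9]\d{0,2}-A?[1-9]\d{0,2})$"        # A<number>-<number> or A<number>-A<number>
)


def is_code(text: str) -> bool:
    return STRICT.match(text) is not None


accepted = ["1-M", "100-M", "5-7", "10-100", "A5-7", "A10-A20", "A100-A102"]
rejected = ["a-a", "a45", "4", "05-7", "5-A7", "A5-M"]

for text in accepted + rejected:
    print(text, is_code(text))
```

Here "5-A7" and "A5-M" are rejected because the A prefix on the right is only allowed when the left side also has one, and M is only allowed after a plain number.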