I have to verify that strings match the following format before the first whitespace (if there is one):
Up to 3 leading letters
At least 4 consecutive digits
Up to 3 trailing letters
To give examples, the following are valid:
1234
Abc123456DeF
1234 blah+
XyZ01234
I'm having trouble avoiding this case however: 123a+b blah
So far I have (^\w{0,3}\d{4}\w{0,3})\s* but the problem lies in making sure a non-letter isn't caught in the first section.
I can see a couple solutions:
Run regex twice, first getting the string up to the first whitespace ([^\s]+) then apply regex again to that making sure it ends in up to 3 letters (^\w{0,3}\d{4}\w{0,3}$). This is what I do now, but surely there's a way to do this in one expression - I just can't figure out how
Make sure no non-letters exist between the (potential) 3 trailing letters and the (potential) whitespace. (^\w{0,3}\d{4}\w{0,3}no non-letters)\s*
I've tried negative lookahead (?!.*) but that doesn't seem to do anything.
This regex satisfy your specifications.
Regex: ^\w{0,3}\d{4,}\w{0,3}\s?$
Explanation:
According to your specifications.
\w{0,3}? Up to 3 leading letters
\d{4,} At least 4 consecutive digits
\w{0,3}? Up to 3 trailing letters
I have to verify that strings match the following format before the first whitespace (if there is one):
\s? hence an optional space.
Regex101 Demo
Note:- I am keeping this as stroked out because there were many shortcomings pointed out in comments. So to maintain the context of comments.
Solution:
Like I said in my comment.
#JCK: Problem is . . even whitespace is optional. Thus making it difficult to differentiate between first and second part.
Now employing a lookahead solves this problem. Complete regex goes like this.
Regex: ^(?=.*[0-9]{4,}[A-Za-z]{0,3}(?:\s|$))[A-Za-z]{0,3}[0-9]{4,}[A-Za-z]{0,3}\s*?(?:\S*\s*)*$
Explanation:
(?=.*[0-9]{4,}[A-Za-z]{0,3}(?:\s|$)) This positive lookahead makes sure that the first part defined by your specifications is matched. It looks for mentioned specs and either a \s or $ i.e end of string. Thus matching the first part.
[A-Za-z]{0,3}[0-9]{4,}[A-Za-z]{0,3}\s*?(?:\S*\s*)* Rest of the regex is as per the specifications.
Check by entering strings one by one.
Regex: (^[A-Za-z]{0,3}\d{4,}[A-Za-z]{0,3})(?:$|\s+)
\w is same as [A-Za-z0-9_], so to match just letters you should use [A-Za-z].
(?:$|\s+) matches end of string or at least one whitespace (hence ignoring the rest of the string).
Related
I have a string that the following structure:
ABCD123456EFGHIJ78 but sometimes it's missing a number or a character like:
ABC123456EFGHIJ78 or
ABCD123456E or
ABCD12345EFGHIJ78
etc.
That's why I need regular expressions.
What I want to extract is the first letter of the third group, in this case 'E'.
I have the following regex:
(\D+)+(\d+)+(\D{1})\3
but I don't get the letter E.
This seems to work for the example cases you provided.
^(?:[A-Za-z]+)(?:\d+)(.)
It assumes that the first group is only letters and that the second group is only digits.
There's already a nice answer.
But for the records, your initial proposal was very close to work. You just needed to say that the character matching the 3rd group can repeat several times by adding a star:
^(\D+)(\d+)(\D{1})\3*
The main weakness is that \D matches any char except digits, so also spaces. Making it more robust leads us to explicit the range of chars accepted:
^([A-Za-z]+)(\d+)([A-Za-z]{1})\3*
It's much better, but my favourite uses \w to match at the end of the pattern any non white character:
([A-Za-z]+)(\d+)([A-Za-z]{1})\w*
I really don't use RegEx that much. You could say I am RegEx n00b. I have been working on this issue for a half a day.
I am trying to write a pattern that looks backward from a number character. For example:
1. bob1 => bob
2. cat3 => cat
3. Mary34 => Mary
So far I have this (?![A-Z][a-z]{1,})([A-Za-z_])
It only matches for individual characters, I want all the characters before the number character. I tried to add the ^ and $ into my pattern and using an online simulator. I am unsure where to put the ^ and $.
NOTE: I am using RegEx for the .NET Framework
You may use a regex like
[\p{L}_]+(?=\d)
or
[\w-[\d]]+(?=\d)
See the regex demo
Pattern details
[\p{L}_]+ - any 1 or more letters (both lower- and uppercase) and/or _
OR
[\w-[\d]]+ - 1 or more word chars except digits (the -[] inside a character class is a character class subtraction construct)
(?=\d) - a positive lookahead that requires a digit to appear immediately to the right of the current location
If we break down your RegEx, we see:
(?![A-Z][a-z]{1,}) which says "look ahead to find a string that is NOT one uppercase letter followed one or more lowercase letters" and ([A-Za-z_]) which says "match one letter or underscore". This should end up matching any single lowercase letter.
If I understand what you want to achieve, then you want all of the letters before a number. I would write something like that as:
\b([a-zA-Z]+)[0-9]
This will start at a word boundary \b, match one or more letters, and require a digit right after the matched string.
(The syntax I used seems to match this document about .NET RegEx: https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expressions)
In light of Wiktor Stribizew's comment, here is a pure match RegEx:
\b[a-zA-Z_]+(?=[0-9])
This matches the pattern and then looks ahead for the digit. This is better than my first lookahead attempt. (Thank you Wiktor.)
http://www.rexegg.com/regex-lookarounds.html
I'm try to build regex pattern which requires the string to contain multicase letters together, but there's no success.
Here's what I have, but it doesn't work:
(?=[A-Z]+)(?=[a-z]+)(?=[0-9]+)
In other words, the string should to match only if it contains uppercase and lowercase and digits in any order like that:
MyPass777 <-- match
Mypass777 <-- match
MyPass <-- no match
mypass777 <-- no match
So, how to let this work?
Your positive lookaheads must also use .* before your conditions to allow for any arbitrary number of character before letter or numbers:
\b(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9])[A-Za-z0-9]+\b
RegEx Demo
Also note use of \b (word boundary) on either side of your regex to make sure to match complete words only.
If you want a yes/no test, then use alternation.
Require something that has a upper and eventually a lower OR something that has a lower and eventually a upper.
With spaces added for clarity
(?: [a-z].*[A-Z] | [A-Z].*[a-z] )
With a third requirement, numbers, it gets combinatorially more expensive.
You're better off testing in three phases. Does this have a uppercase? If not, fail. Does it have a lowercase? If not, fail. Does it have a number? If not, fail. Else, it's okay.
Use separate regexes instead of single regex to gain additional benefits.
With this approach, you do not limit user to enter uppercase+lowercase+digits, but if they use for example uppercase+lowercase+punctation, the password will be considered equally good.
Test 4 cases:
[A-Z]
[a-z]
[0-9]
[\!+\-*##$%\^&*[\]{}:";'<>?,./] ' or refer to Unicode character class P (punctuation) instead
Now count matching cases.
1-2 cases: weak password.
3 cases: good password.
4 cases: strong password.
This pattern does forward lookahead and requires that the next character be an uppercase letter, a lowercase letter, and a digit at the same time. It never matches.
You want something like
(?=\w*[A-Z])(?=\w*[a-z])(?=\w*[0-9])(\w+\b)
At least, that's my best understanding of your problem: You want a string of alphanumeric characters that contains at least one uppercase letter, at least one lowercase letter, and at least one digit.
I need a regex that will match strings of letters that do not contain two consecutive dashes.
I came close with this regex that uses lookaround (I see no alternative):
([-a-z](?<!--))+
Which given the following as input:
qsdsdqf--sqdfqsdfazer--azerzaer-azerzear
Produces three matches:
qsdsdqf-
sqdfqsdfazer-
azerzaer-azerzear
What I want however is:
qsdsdqf-
-sqdfqsdfazer-
-azerzaer-azerzear
So my regex loses the first dash, which I don't want.
Who can give me a hint or a regex that can do this?
This should work:
-?([^-]-?)*
It makes sure that there is at least one non-dash character between every two dashes.
Looks to me like you do want to match strings that contain double hyphens, but you want to break them into substrings that don't. Have you considered splitting it between pairs of hyphens? In other words, split on:
(?<=-)(?=-)
As for your regex, I think this is what you were getting at:
(?:[^-]+|-(?<!--)|\G-)+
The -(?<!--) will match one hyphen, but if the next character is also a hyphen the match ends. Next time around, \G- picks up the second hyphen because it's the next character; the only way that can happen (except at the beginning of the string) is if a previous match broke off at that point.
Be aware that this regex is more flavor dependent than most; I tested it in Java, but not all flavors support \G and lookbehinds.
I am implementing the following problem in ruby.
Here's the pattern that I want :
1234, 1324, 1432, 1423, 2341 and so on
i.e. the digits in the four digit number should be between [1-4] and should also be non-repetitive.
to make you understand in a simple manner I take a two digit pattern
and the solution should be :
12, 21
i.e. the digits should be either 1 or 2 and should be non-repetitive.
To make sure that they are non-repetitive I want to use $1 for the condition for my second digit but its not working.
Please help me out and thanks in advance.
You can use this (see on rubular.com):
^(?=[1-4]{4}$)(?!.*(.).*\1).*$
The first assertion ensures that it's ^[1-4]{4}$, the second assertion is a negative lookahead that ensures that you can't match .*(.).*\1, i.e. a repeated character. The first assertion is "cheaper", so you want to do that first.
References
regular-expressions.info/Lookarounds and Backreferences
Related questions
How does the regular expression (?<=#)[^#]+(?=#) work?
Just for a giggle, here's another option:
^(?:1()|2()|3()|4()){4}\1\2\3\4$
As each unique character is consumed, the capturing group following it captures an empty string. The backreferences also try to match empty strings, so if one of them doesn't succeed, it can only mean the associated group didn't participate in the match. And that will only happen if string contains at least one duplicate.
This behavior of empty capturing groups and backreferences is not officially supported in any regex flavor, so caveat emptor. But it works in most of them, including Ruby.
I think this solution is a bit simpler
^(?:([1-4])(?!.*\1)){4}$
See it here on Rubular
^ # matches the start of the string
(?: # open a non capturing group
([1-4]) # The characters that are allowed the found char is captured in group 1
(?!.*\1) # That character is matched only if it does not occur once more
){4} # Defines the amount of characters
$
(?!.*\1) is a lookahead assertion, to ensure the character is not repeated.
^ and $ are anchors to match the start and the end of the string.
While the previous answers solve the problem, they aren't as generic as they could be, and don't allow for repetitions in the initial string. For example, {a,a,b,b,c,c}. After asking a similar question on Perl Monks, the following solution was given by Eily:
^(?:(?!\1)a()|(?!\2)a()|(?!\3)b()|(?!\4)b()|(?!\5)c()|(?!\6)c()){6}$
Similarly, this works for longer "symbols" in a string, and for variable length symbols too.