This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 2 years ago.
I'd like to use Regex to ensure that there are at least two upper case characters in a string, in any position, together or not.
The following gives me two together:
([A-Z]){2}
Environment - Classic ASP VB.
You can use the simple regex
[A-Z].*[A-Z]
which matches an upper case letter, followed by any number of anything (except linefeed), and another uppercase letter.
If you need it to allow linefeeds between the letters, you'll have to set the single line flag. If you're using JavaScript (you should always include the flavor/language tag when asking regex-related questions), note that it only gained that flag (s) in ES2018, so older engines don't have it. In that case the solution suggested by Wiktor S in a comment to another answer should work.
[A-Z].*[A-Z]
A to Z, any symbols in between, then A to Z again.
update
As Wiktor mentioned in comments:
This regex will check for 2 letters on a line (in most regex flavors), not in a string.
So
[A-Z][^A-Z]*[A-Z]
Should do the thing (In most regex flavors/tools)
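To make the difference between the two patterns concrete, here is a minimal Python sketch (Python is used only for illustration; the question is about Classic ASP, and the helper name has_two_uppercase is hypothetical). It assumes the default behavior where . does not match a newline.

```python
import re

# Both patterns are taken from the answers above.
SAME_LINE = re.compile(r"[A-Z].*[A-Z]")      # '.' stops at a newline by default
ANY_GAP = re.compile(r"[A-Z][^A-Z]*[A-Z]")   # '[^A-Z]' also matches newlines


def has_two_uppercase(text: str) -> bool:
    """Return True if text contains at least two ASCII uppercase letters."""
    return ANY_GAP.search(text) is not None


for sample in ["aBcD", "AB", "abcD", "A\nb\nC"]:
    # Print whether each pattern finds two uppercase letters in the sample.
    print(repr(sample), SAME_LINE.search(sample) is not None, has_two_uppercase(sample))
```

The last sample has its two uppercase letters on different lines, so only the [^A-Z]* variant finds them.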
I believe what you're looking for is something like this:
.*([A-Z]).*([A-Z]).*
Broken out into segments, that's:
.* //Any number of characters (including zero)
([A-Z]) //A capital letter
.* //Any number of characters (including zero)
([A-Z]) //A second capital letter
.* //Any number of characters (including zero)
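As a rough illustration of the segments above, this Python sketch (Python chosen only for the example) shows what the two capture groups end up holding. Because the leading .* is greedy, the groups capture the last two capital letters on the line, not the first two.

```python
import re

# Pattern from the answer above, with two capture groups.
PATTERN = re.compile(r".*([A-Z]).*([A-Z]).*")

match = PATTERN.fullmatch("aXbYcZ")
# The first '.*' grabs as much as it can, so the groups land on 'Y' and 'Z'.
captured = match.groups() if match else None
print(captured)

# Only one capital letter: the pattern cannot match at all.
print(PATTERN.fullmatch("abcD"))
```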
This question already has answers here:
Matching Unicode letter characters in PCRE/PHP
(5 answers)
Closed 2 years ago.
I am trying to write a regex to match full words with or without an apostrophe.
I did this:
\b[a-zA-Z']+\b
However, it is matching some of the letters in Jönas, while the desired behavior is to not match the word Jönas at all because of the ö in it.
The right matches should go for anything in a-zA-Z'
Thus following cases should match in full:
Jonas
Don't
hasn'T
But not for:
Jönas
Dön't
Hélló
demo here: https://regex101.com/r/2sVN5S/1/ (where Jönas and Hélton should not be matched at all not even partially)
How to fix the regex, to follow this exact match?
UPDATE. Anubhava and Wiktor Stribiżew pointed out that using \b[a-zA-Z']+\b in Unicode mode is enough (fiddle 1 and fiddle 2).
As Wiktor said, there is no use case where the answer below is relevant (no engine supports lookaround groups without also supporting Unicode mode), so this answer is no longer relevant.
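The effect of Unicode mode on \b can be sketched in Python, where str patterns are Unicode-aware by default and re.ASCII switches back to ASCII-only word characters. This is only an illustration of the idea, not the PCRE setup from the question; the sample text reuses words from the question.

```python
import re

WORD = r"\b[a-zA-Z']+\b"

# \u00f6 is ö, \u00e9 is é, \u00f3 is ó (written as escapes to keep them precomposed).
text = "Jonas Don't hasn'T J\u00f6nas H\u00e9ll\u00f3"

# Unicode mode: ö, é and ó count as word characters, so there is no
# word boundary next to them and the accented words are skipped entirely.
unicode_matches = re.findall(WORD, text)

# ASCII mode: accented letters are non-word characters, so \b matches
# next to them and fragments of the accented words are returned.
ascii_matches = re.findall(WORD, text, flags=re.ASCII)

print(unicode_matches)
print(ascii_matches)
```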
You can use this regex:
\b(?<![\x80-\xFF])[a-zA-Z']+(?![\x80-\xFF])\b
Here, [\x80-\xFF] stands for the range of character codes above the 7-bit ASCII set (where non-English letters lie). Basically, it looks for:
a sequence of English letters with or without apostrophes ...
not preceded by non-English letters (negative lookbehind group (?<!...))
not followed by non-English letters (negative lookahead group (?!...))
Working Regex101.com sample.
This question already has answers here:
Regex not to allow double underscores
(3 answers)
Closed 3 years ago.
I have tried different regular expressions already, but I am not sure how to make it accept single underscores while rejecting doubled ones. If there are two together, the input must be invalid.
The first character must be a capital letter, followed by any word characters; the problem is the underscore.
I have this: (^[A-Z])(\w{6,30} ?=*(_))
This regex may work for you with a negative lookahead condition:
^[A-Z](?![^_]*__)\w{6,30}$
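(?![^_]*__) is a negative lookahead condition that fails the match if __ appears after the first capital letter. Note that [^_]* cannot skip over an underscore, so this only catches a __ at the first underscore in the string; (?!\w*__) rejects a __ anywhere.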
RegEx Demo
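A quick way to check the lookahead's behavior is to run it against a few inputs. The Python sketch below does that; the second pattern, using (?!\w*__), is an illustrative variation and not part of the original answer. It is included because [^_]* stops at the first underscore, so the original lookahead only inspects that first underscore.

```python
import re

# Pattern from the answer: the lookahead stops scanning at the first underscore.
FIRST_ONLY = re.compile(r"^[A-Z](?![^_]*__)\w{6,30}$")

# Illustrative variation (an assumption, not from the answer):
# \w* can cross single underscores, so a later "__" is also rejected.
ANYWHERE = re.compile(r"^[A-Z](?!\w*__)\w{6,30}$")

samples = ["Hello_World", "Hello__World", "He_llo__World", "Short", "hello_world"]
for sample in samples:
    # Print whether each pattern accepts the sample.
    print(sample, FIRST_ONLY.match(sample) is not None, ANYWHERE.match(sample) is not None)
```

The two patterns only disagree on He_llo__World, where the double underscore comes after an earlier single one.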
If you mean a pattern which is a word starting with a capital letter followed by some groups consisting of a single underscore and a word:
^[A-Z]\w{6,30}(_\w{6,30})*$
This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 4 years ago.
I'm currently using this website to create some regular expressions for a programming language I want to build, at the moment I'm just setting up an expression for identifiers.
In my language, identifiers are expressed like most languages:
They cannot begin with a digit, or special character other than an underscore
After the first character they can contain alphanumeric and underscore characters
Given those rules I've come up with the following expression by myself:
^\D\w+$
Obviously, it doesn't account for special characters, however the following expression does (which I didn't make myself):
^(?!\d)\w+$
Why does the second expression account for special characters? Shouldn't they be producing the same results?
I will explain why the second regex works.
The second regex uses a negative lookahead. After matching the start of the string, the engine checks whether the next character is a digit, but it does not consume it! This is important: if the next character is not a digit, the engine then tries to match that same character with \w, which fails if the character is a symbol. If it is a digit, the negative lookahead fails and nothing is matched.
\D on the other hand, will match the character if it is not a digit, and \w will match whatever comes after that. That means all symbols are accepted.
This ^(?!\d)\w+$ means a string consisting of word characters [a-zA-Z0-9_] that doesn't start with a digit.
This ^\D\w+$ means a non-digit character followed by at least one character from the [a-zA-Z0-9_] set.
So #ab01 is matched by ^\D\w+$ while ^(?!\d)\w+$ rejects it.
(?!\d)\w+ means "match a word which does not start with a digit". Wrapped in ^ and $, it is ^\w+$ with the extra condition that the first character is not a digit, which is obviously not the same as ^\D\w+$. ^(?!\d).\w+$ (note the "." in the middle) would behave the same as ^\D\w+$, apart from newline handling.
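The difference between the two expressions is easy to see by running both against a few candidate identifiers. This is a small Python sketch for illustration only (the question does not name a regex flavor); the constant names are hypothetical.

```python
import re

NON_DIGIT_FIRST = re.compile(r"^\D\w+$")   # \D consumes any non-digit, symbols included
LOOKAHEAD = re.compile(r"^(?!\d)\w+$")     # the lookahead consumes nothing; \w must match every character

results = {}
for candidate in ["abc", "_abc", "a", "1abc", "#ab01"]:
    results[candidate] = (
        NON_DIGIT_FIRST.match(candidate) is not None,
        LOOKAHEAD.match(candidate) is not None,
    )
    print(candidate, results[candidate])
```

Besides accepting #ab01, ^\D\w+$ also needs at least two characters, so it rejects the one-letter identifier a, which ^(?!\d)\w+$ accepts.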
This question already has answers here:
Regex lookahead, lookbehind and atomic groups
(5 answers)
Closed 5 years ago.
I'm trying to learn more advanced regular expressions for a password validator I'm working on, because I think using regular expressions would be the best approach. I am using Java as my programming language.
So for my pattern people suggested this (?=.*?[A-Z]) as a way to say "at least one upper case in the string". I have tried searching for it, but nothing seems to make clear how the ?=.*? part makes sure at least one is there.
here is the whole pattern ^(?=.*?[A-Z])(?=.*?[a-z])(?=.*?[0-9])(?=.*?[#?!#$%^&*-]).{8,}$
from what i understand
? means optional and occurs once
= means well i don't know yet
. is a wildcard
[A-Z] is the range of uppercase letters from A-Z
TLDR: So my question is, how does this (?=.*?[A-Z]) make sure at least one uppercase letter is included? Any in-depth explanation?
(?= is the start of a look-ahead group — the question mark does not mean the same as a ? elsewhere
.*? is a non-greedy match against anything or nothing. The question-mark here also does not mean 'optional'.
[A-Z] is a character set containing the upper case ASCII letters A through to Z.
) is the end of the look-ahead group
So the net result is:
"Look ahead and see if, after maybe some characters, there is an upper case letter."
Your full expression, ^(?=.*?[A-Z])(?=.*?[a-z])(?=.*?[0-9])(?=.*?[#?!#$%^&*-]).{8,}$, can be read as:
"Match if the string contains an upper case letter, and a lower case letter, and a digit, and one of the listed special characters, and there are at least 8 characters in total."
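That reading can be checked by running the full expression against a few passwords. The sketch below uses Python for illustration (the question uses Java, but this pattern uses only syntax both flavors share); the sample passwords are made up.

```python
import re

# Full pattern from the question, copied as-is.
PASSWORD = re.compile(
    r"^(?=.*?[A-Z])(?=.*?[a-z])(?=.*?[0-9])(?=.*?[#?!#$%^&*-]).{8,}$"
)


def is_valid(password: str) -> bool:
    """Return True if every lookahead succeeds and the length is at least 8."""
    return PASSWORD.match(password) is not None


for sample in ["Passw0rd!", "password1!", "PASSWORD1!", "Password!!", "Passw0rdX", "Pw0!"]:
    print(sample, is_valid(sample))
```

Each lookahead is tried at the start of the string and consumes nothing, so the four conditions are checked independently before .{8,} matches the actual characters.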
The regex is using a feature named positive lookahead, which is one of the regex lookarounds:
Positive lookahead: (?=...). Ex: a(?=b) matches a if followed by b
Negative lookahead: (?!...). Ex: a(?!b) matches a if not followed by b
Positive lookbehind: (?<=...). Ex: (?<=a)b matches b if preceded by a
Negative lookbehind: (?<!...). Ex: (?<!a)b matches b if not preceded by a
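The four lookarounds listed above can be tried out directly. This Python sketch (for illustration; Java's java.util.regex uses the same syntax for these constructs) records where each pattern matches in a short string. The helper name starts is hypothetical.

```python
import re


def starts(pattern: str, text: str) -> list:
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]


positive_ahead = starts(r"a(?=b)", "ab ac")    # only the 'a' followed by 'b'
negative_ahead = starts(r"a(?!b)", "ab ac")    # only the 'a' not followed by 'b'
positive_behind = starts(r"(?<=a)b", "ab cb")  # only the 'b' preceded by 'a'
negative_behind = starts(r"(?<!a)b", "ab cb")  # only the 'b' not preceded by 'a'

print(positive_ahead, negative_ahead, positive_behind, negative_behind)
```

In every case the match itself is a single character: the lookaround only tests its neighbor without including it in the match.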
For your whole regex, you can see easily your pattern with this diagram:
Diagram link
Regarding (?=.*?[A-Z]): it is used right after the ^, so the check starts at the beginning of the string. So ^(?=.*?[A-Z]).*$ means match a line that starts and ends with anything, as long as it has an uppercase character somewhere in it.
This question already has an answer here:
Learning Regular Expressions [closed]
(1 answer)
Closed 7 years ago.
I have the following regex pattern:
^[A-Za-z][A-Za-z0-9_-]+$
It is used to match alphanumeric characters, underscores and dashes, with the first character being alphabetical.
This works as expected, but I also need it to be able to match single characters. For example, an input of a fails.
How can I modify the pattern to make a single alphabetical character pass?
The + means "one or more". Replace it with * for "zero or more".
^[A-Za-z][A-Za-z0-9_-]*$
This should do it for you.
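A short Python sketch (used here only for illustration, since the question does not name a language) shows the effect of swapping + for *:

```python
import re

PLUS = re.compile(r"^[A-Za-z][A-Za-z0-9_-]+$")  # original: needs at least two characters
STAR = re.compile(r"^[A-Za-z][A-Za-z0-9_-]*$")  # fixed: a single letter is enough

outcome = {}
for sample in ["a", "ab-1_c", "1abc", ""]:
    outcome[sample] = (PLUS.match(sample) is not None, STAR.match(sample) is not None)
    print(repr(sample), outcome[sample])
```

Only the single-letter input changes from rejected to accepted; inputs starting with a digit and the empty string are still rejected by both.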