What does this regex do? - regex

This line comes from an NHibernate automapping configuration. I wonder what it does.
return string.Format("{0}_", Regex.Replace(member.Name, "([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))", "$1_").ToUpper());

Ok, let's break it up.
//This is the start
([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))
[a-z](?=[A-Z]) //this means to match one lower case a-z followed by an uppercase A-Z
| //or
[A-Z](?=[A-Z][a-z]) //One uppercase A-Z followed by one uppercase and one lowercase a-z
//The replace
$1_ //Replace the match with "the match plus underscore".
//aBxx would become a_Bxx and ABcxx would be A_Bcxx
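As a rough illustration, the same replacement can be sketched in Python. The function name to_column_name and the last two sample member names are made up for this example; only the pattern and the "$1_" replacement (written \1_ in Python) come from the question.

```python
import re

# Pattern copied from the question; "$1_" in .NET becomes r"\1_" in Python.
BOUNDARY = re.compile(r"([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))")

def to_column_name(member_name: str) -> str:
    """Hypothetical helper mirroring the C# line:
    insert underscores at case boundaries, upper-case, append '_'."""
    return "{0}_".format(BOUNDARY.sub(r"\1_", member_name).upper())

print(to_column_name("aBxx"))         # A_BXX_
print(to_column_name("ABcxx"))        # A_BCXX_
print(to_column_name("CustomerID"))   # CUSTOMER_ID_
print(to_column_name("HTTPRequest"))  # HTTP_REQUEST_
```

The lookaheads only inspect the following characters without consuming them, so each uppercase letter is still available to be matched on the next step.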

Let's break it down:
([a-z](?=[A-Z])
The first section matches any lowercase character that is immediately followed by an uppercase character
|[A-Z](?=[A-Z][a-z]))
Or any uppercase character that is immediately followed by an uppercase and then a lowercase character
For example this will match
AAb with the first 'A' being the match or
aB
with the 'a' being the match.
The regex uses lookahead, which checks the following characters without including them in the match: http://www.regular-expressions.info/lookaround.html

Related

I feel like this regex pattern should work but it doesn't

I'm new to regex and honestly not that experienced.
I got this regex pattern that I want to try and use.
/(a..e.)([a-zA-Z])/gi
The plan is that it should match any word that follows the pattern, so I can loop over a list of words with A locked in the first slot and E in the second-to-last slot, and find all the words that match. However, I've run into an issue: I expect it to match the word ADDER, but it doesn't. When I remove the last period, so that the pattern becomes
/(a..e)([a-zA-Z])/gi
it does work. Shouldn't these two basically be the same, since we're using a wildcard dot?
I'm testing with https://regexr.com/
The (a..e.)([a-zA-Z]) pattern looks for an a, followed by any two chars (other than line break chars), then an e, then any single char other than a line break char, and finally one ASCII letter. This pattern neither guarantees that you match a whole word, nor that the chars matched with . are letters.
/(a..e.)([a-zA-Z])/gi is not equivalent to /(a..e)([a-zA-Z])/gi, as they match and consume different strings. The first pattern needs at least six characters (a, two chars, e, one char, one letter), so the five-letter word ADDER cannot match. The second pattern has no . after the e, so it needs only five characters: the letter matched by the last part must come directly after the e.
To match words starting with the a letter, followed with two more letters, then an e letter, and then one more letter you can use
/\ba[a-z]{2}e[a-z]\b/gi
Details:
/gi - match all occurrences (g) in a case insensitive way (i)
\b - matches a word boundary
a - a / A
[a-z]{2} - two ASCII letters
e - an e letter
[a-z] - any ASCII letter
\b - matches a word boundary.
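To see the difference concretely, here is a small Python sketch comparing the three patterns. The sample word list is invented for the illustration, and re.IGNORECASE stands in for the i flag (findall plays the role of g).

```python
import re

with_dot = re.compile(r"(a..e.)([a-zA-Z])", re.IGNORECASE)     # original pattern
without_dot = re.compile(r"(a..e)([a-zA-Z])", re.IGNORECASE)   # last dot removed
whole_word = re.compile(r"\ba[a-z]{2}e[a-z]\b", re.IGNORECASE) # suggested pattern

# ADDER has five characters; the first pattern needs at least six.
print(with_dot.search("ADDER"))              # None
print(without_dot.search("ADDER").group(0))  # ADDER

# The word-boundary version only accepts whole five-letter words.
print(whole_word.findall("ADDER asked alter badger adders"))
# ['ADDER', 'asked', 'alter']
```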

Regex to disallow a capital letter followed by lowercase letters, followed by a number, followed by a special character

The regex needs to enforce the following rules:
minimum 1 uppercase letter
minimum 1 lowercase letter
minimum 1 number
minimum 1 special character
no more than two identical characters in a row
In addition, we don't want to allow one specific pattern: an initial-cap word followed by a number followed by a special character (e.g., Fall2015!), that is, an uppercase letter followed by lowercase letters followed by a number followed by a special character.
(?=.{8,24}$)(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[_.!#$*=-?#])(([A-Za-z0-9_.!#$*=-?#])\2?(?!\2))
Try this:
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[_.!#$*=?#-])(?!.*(.)\1\1)(?!.*[A-Z][a-z]+\d+[_.!#$*=?#-])[\w.!#$*=?#-]{8,24}$
The key changes are:
^ to anchor the expression to start
(?!.*(.)\1\1) which prevents tripled chars
(?!.*[A-Z][a-z]+\d+[_.!#$*=?#-]) to prevent input like “Fall2015!”
[\w.!#$*=?#-]{8,24}$ to restrict input to only these chars and only 8-24 length
Moving the hyphen to the end of the char class so it is a literal hyphen (not a range)
Note also the introduction of \d as shorthand for [0-9] and \w as shorthand for [a-zA-Z0-9_].
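A quick way to sanity-check the suggested expression is to run it against a few inputs. The sketch below uses Python; the function name is_acceptable and all sample passwords except Fall2015! are invented for the illustration, and re.ASCII is an assumption made here so that \w and \d behave as the ASCII shorthands described above.

```python
import re

# Pattern copied from the answer. re.ASCII keeps \w and \d ASCII-only.
PASSWORD = re.compile(
    r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[_.!#$*=?#-])"
    r"(?!.*(.)\1\1)"                         # no character three times in a row
    r"(?!.*[A-Z][a-z]+\d+[_.!#$*=?#-])"      # no "Fall2015!"-style run anywhere
    r"[\w.!#$*=?#-]{8,24}$",
    re.ASCII,
)

def is_acceptable(candidate: str) -> bool:
    """Hypothetical helper: True when the candidate passes every rule."""
    return PASSWORD.match(candidate) is not None

print(is_acceptable("Fall2015!"))   # False: the forbidden pattern
print(is_acceptable("xFall2015!"))  # False: forbidden pattern inside the string
print(is_acceptable("2015!Fall"))   # True
print(is_acceptable("aaaB1!xyz"))   # False: 'a' three times in a row
print(is_acceptable("aB1!"))        # False: shorter than 8
```

Note that the "Fall2015!" check is not anchored, so the forbidden sequence is rejected wherever it appears in the input, not only when it is the whole password.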

Check uppercase AND lowercase

I've been searching for a while for a regular expression that matches a word of 8 to 20 characters combining uppercase AND lowercase letters.
Examples of valid input: AaAaAaAaA or aaaaaaaaA
Examples of invalid input: aaaaaaaaa or AAAAAAAA
What I have so far is ^[a-z][A-Z]{8,20}$, but as you can see, that doesn't work.
Does anybody have an idea for this?
Best regards
Credit to anubhava, lookahead was the way to go.
\b(?=[a-z]+[A-Z]+|[A-Z]+[a-z]+)[a-zA-Z]{8,20}\b
The look ahead ensures that there is at least one capital and lowercase letter in any order in the match. This will match whole words that are 8-20 chars long and contain at least 1 upper case and at least 1 lower case letter.
^(?=[a-z]+[A-Z]+|[A-Z]+[a-z]+)[a-zA-Z]{8,20}$
Will anchor to the beginning and end of the string, and thus match only a single word.
You can see it in action here http://regex101.com/r/uE5lT4/4
Edit: First version did not match a word if the only capital was the last letter.
Your character class needs to include both a-z and A-Z ranges. Regex should be:
^[a-zA-Z]{8,20}$
Your regex:
^[a-z][A-Z]{8,20}$
is actually checking for one lowercase letter followed by 8 to 20 uppercase letters.
Update: to make sure you match at least one lowercase and one uppercase letter, use lookaheads like this:
^(?=[A-Z]*[a-z])(?=[a-z]*[A-Z])[a-zA-Z]{8,20}$
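As a small check, the two anchored lookahead variants from the answers can be run against the examples in the question. This is a Python sketch; the variable names are invented for the illustration.

```python
import re

# Anchored variant from the first answer: one alternation inside a single lookahead.
single_lookahead = re.compile(r"^(?=[a-z]+[A-Z]+|[A-Z]+[a-z]+)[a-zA-Z]{8,20}$")
# Variant from the second answer: one lookahead per required letter case.
two_lookaheads = re.compile(r"^(?=[A-Z]*[a-z])(?=[a-z]*[A-Z])[a-zA-Z]{8,20}$")

samples = ["AaAaAaAaA", "aaaaaaaaA", "aaaaaaaaa", "AAAAAAAA", "aA"]
for word in samples:
    print(word, bool(single_lookahead.match(word)), bool(two_lookaheads.match(word)))
# AaAaAaAaA True True
# aaaaaaaaA True True
# aaaaaaaaa False False
# AAAAAAAA False False
# aA False False
```

Both variants agree on these inputs: the lookaheads enforce the mixed case, and [a-zA-Z]{8,20} enforces the length and the letters-only rule.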

Regex to find Upper case character at beginning of each word in a field

I created a function that compares a field against a regex and returns 0 if it doesn't match the pattern and 1 if it does. I've already created the class, so I can create a UDF for the pattern matching.
function(expression,rexex) //If it matches it
I have been researching regex in SQL server for a bit this weekend and am at a bit of a crossroad.
I basically need the following pattern, with 1 meaning pass and 0 meaning fail. I want the first letter of every word to be capitalized:
the dog is bad - 0
The Dog Is Bad - 1
I'm ashamed to say that it's taken me all day just to figure out how to identify the first letter of each word and see if it's a capital.
Here is what I have so far.
[\p{Lu}\p{Lt}]
Any help or nudge in the right direction would be appreciated.
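Start of string (^), then one or more groups ((...)+) of a capital letter ([A-Z]) followed by zero or more word characters (\w*) and then either one or more whitespace characters or the end of the string ((\s+|$)), and finally the end of the string ($). Without the final $, a string such as "The dog" would still match on its first word.
/^([A-Z]\w*(\s+|$))+$/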
This assumes letters only, and only one space per word:
^((?:\b[A-Z][a-z]*\b) {0,1})+$
Free spaced:
^ //Start of line
( //(Capture)
(?: //(Non-capture)
\b // Followed by word boundary
[A-Z] // Followed by a capital letter
[a-z]* // Followed by zero or more lowercase letters
\b // Followed by word boundary
) {0,1} // Followed by either no space, or one space
)+ // One or more times
$ //End of line
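The free-spaced breakdown can be turned into something runnable. Below is a Python sketch using re.VERBOSE, which allows real # comments inside the pattern. The function name is_title_cased is invented here to mimic the 1/0 result the question asks for; it is not the asker's actual UDF.

```python
import re

TITLE_CASE = re.compile(r"""
    ^                 # start of string
    (
      (?:
        \b [A-Z]      # a capital letter at a word boundary
        [a-z]*        # zero or more lowercase letters
        \b            # word boundary
      )
      [ ]{0,1}        # no space or one space ([ ] because verbose mode ignores bare spaces)
    )+                # one or more words
    $                 # end of string
""", re.VERBOSE)

def is_title_cased(text: str) -> int:
    """Hypothetical stand-in for the UDF: 1 if every word starts with a capital, else 0."""
    return 1 if TITLE_CASE.match(text) else 0

print(is_title_cased("the dog is bad"))  # 0
print(is_title_cased("The Dog Is Bad"))  # 1
print(is_title_cased("The dog Is Bad"))  # 0
```

As the answer states, this version assumes ASCII letters only and a single space between words, so input with digits, punctuation, or double spaces returns 0.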
You can use a Negative Lookahead (?!) to validate the line/sentence:
/(?!.*?\b[a-z].*?\b)^.*?$/gm
This will not pass on any string or line which has a word that begins with a lowercase letter.
Since it seems you want Unicode support, I'd use:
(?:^|\s+)(\p{Lu}\p{Ll}*)

How to regex a string - length 8, first character letter and remaining numeric

I am trying to create a RegEx to match a string with the following criteria
Length 8
First character must be a letter a-z or A-Z
The remaining 7 must be numeric 0-9
examples
a5554444
B9999999
c0999999
This is what I have so far
^[0-9]{8}$
What am I missing to check the first character? I tried
^[a-zA-Z][0-9]{8}$
but that's not working.
I think this is what you want:
^[a-zA-Z][0-9]{7}$
The {...} quantifier applies only to the element immediately before it, which in your case is [0-9]. The regex reads as follows:
start at the beginning of the string (^)
match one letter, a-z or A-Z ([a-zA-Z]), in the first position only
match a digit 0-9 starting at the second position ([0-9])
the [0-9] from step 3 must occur exactly 7 times ({7})
the string must end there ($)
With {8}, as in your attempt, the regex requires a total length of 9: one letter (either case) followed by 8 digits.
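The difference between {7} and {8} is easy to confirm with a short Python sketch. The variable names and the two extra test strings (12345678 and a55544449) are invented for the illustration; the first three inputs are the examples from the question.

```python
import re

SEVEN_DIGITS = re.compile(r"^[a-zA-Z][0-9]{7}$")  # pattern from the answer
EIGHT_DIGITS = re.compile(r"^[a-zA-Z][0-9]{8}$")  # the asker's attempt

for s in ["a5554444", "B9999999", "c0999999"]:
    # Each 8-character example matches {7} and fails {8}.
    print(s, bool(SEVEN_DIGITS.match(s)), bool(EIGHT_DIGITS.match(s)))  # True False

print(bool(SEVEN_DIGITS.match("12345678")))   # False: first character is not a letter
print(bool(EIGHT_DIGITS.match("a55544449")))  # True: 9 characters, one letter + 8 digits
```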