Regular Expression for email to check repetitive characters - regex

I'm validating an email address with a regular expression. I would like to test for the following conditions:
a minimum of 3 characters in the name, the symbol #, a minimum of 3 characters in the first part of the domain, a dot, and no more than 3 repeated characters in a row.
I tried this regular expression and it works for all cases except the last one.
/^[A-Za-z0-9._%+-]{3,}\#[A-Za-z0-9.-]{3,}\.[A-Za-z]{2,4}$/
It doesn't check for a repeated character (any character) after the dot (.).
Not OK: test#test.ccccom, test#test.coooom
OK: test#test.com
I don't know what is wrong with the last portion of my regex.
Any input will be appreciated.

You can use the following regex:
^(?!.*([A-Za-z0-9])\1{3})[A-Za-z0-9._%+-]{3,}\#[A-Za-z0-9-]{3,}\.[A-Za-z]{2,4}$
Changes made:
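(?!.*([A-Za-z0-9])\1{3}) - This is a negative lookahead that makes sure that no letter or digit repeats more than three times in a row. \1{3} requires three further copies of the captured character, so the lookahead rejects four or more identical characters in a row.
The rest of the regex is unchanged, except for the removal of the . from the second character class.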
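As a quick check, the pattern above can be exercised with Python's re module. This is only a sketch: the helper name is_valid_address is made up for illustration, and # is used as the separator exactly as in the question.

```python
import re

# Pattern from the answer above: the lookahead rejects any letter or digit
# that appears four or more times in a row anywhere in the string.
ADDRESS_PATTERN = re.compile(
    r"^(?!.*([A-Za-z0-9])\1{3})"
    r"[A-Za-z0-9._%+-]{3,}\#[A-Za-z0-9-]{3,}\.[A-Za-z]{2,4}$"
)


def is_valid_address(address: str) -> bool:
    """Hypothetical helper: True if the string fits the pattern."""
    return ADDRESS_PATTERN.match(address) is not None


for sample in ("test#test.com", "test#test.ccccom",
               "test#test.coooom", "teeeest#test.com"):
    print(sample, is_valid_address(sample))
```

Only test#test.com is accepted here. Note that teeeest#test.com is rejected as well, because this lookahead scans the whole string and not just the part after the dot.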
If you only want to disallow repeated characters after the last dot, then you could use the following instead:
^[A-Za-z0-9._%+-]{3,}\#[A-Za-z0-9-]{3,}\.(?!([A-Za-z0-9])\1{3})[A-Za-z]{2,4}$

This won't allow a character to repeat more than three times in a row after the last dot:
^[A-Za-z0-9._%+-]{3,}\#[A-Za-z0-9.-]{3,}\.(?:(?!(.)\1{3})[a-zA-Z]){2,4}$
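A small Python sketch of this variant, assuming the same # separator as the question (the helper name has_valid_ending is hypothetical). Here the lookahead is checked before every character of the last part, so repeats elsewhere in the address are not restricted.

```python
import re

# The lookahead (?!(.)\1{3}) is tested before each of the 2 to 4 letters
# that follow the last dot.
ENDING_PATTERN = re.compile(
    r"^[A-Za-z0-9._%+-]{3,}\#[A-Za-z0-9.-]{3,}\."
    r"(?:(?!(.)\1{3})[a-zA-Z]){2,4}$"
)


def has_valid_ending(address: str) -> bool:
    """Hypothetical helper: True if the string fits the pattern."""
    return ENDING_PATTERN.match(address) is not None


print(has_valid_ending("test#test.com"))     # True
print(has_valid_ending("test#test.cccc"))    # False: four c's after the dot
print(has_valid_ending("test#teeeest.com"))  # True: repeats before the dot are allowed
```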

Related

Regex check for name Initials

I am trying to create a regex that checks if one or more middle-name initials have the following structure:
INITIAL.[BLANK]INITIAL.[BLANK]INITIAL.
There can be multiple Initials as long as they are followed by a dot (.) - blank spaces are only allowed between two initials (e.g. L. B.)
It should not be possible to have a space after an initial if there's no other initial following.
At the moment, I have the following Regex which doesn't work perfectly as of now:
([A-Z]\. (?=[A-Z]|$))+
Using regex101, this is an example:
As you can see, it still matches the string even though there's a blank space at the end, without having another Initial following.
I am not sure why this is happening. I am just learning regex and would be glad if anyone could provide me with a solution to my problem :)
The error you're seeing occurs because, at the last step, your expression reads in [A-Z]\. together with the trailing space, then looks ahead for $ (and finds it). I would express the pattern this way: (?:[A-Z]\. )*[A-Z]\.$. Treat the last initial specially because it does not have a final space.
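To see the difference concretely, here is a small Python sketch comparing the two patterns. It uses re.fullmatch so that the whole input has to fit; the function names are made up for illustration.

```python
import re

# The pattern from the question: every initial must be followed by a space.
ORIGINAL = re.compile(r"([A-Z]\. (?=[A-Z]|$))+")
# The suggested pattern: the last initial is matched separately, without a space.
SUGGESTED = re.compile(r"(?:[A-Z]\. )*[A-Z]\.$")


def fits_original(text: str) -> bool:
    return ORIGINAL.fullmatch(text) is not None


def fits_suggested(text: str) -> bool:
    return SUGGESTED.fullmatch(text) is not None


for sample in ("L. B.", "L. B. "):
    print(repr(sample), fits_original(sample), fits_suggested(sample))
# 'L. B.'  False True
# 'L. B. ' True False
```

The original pattern accepts exactly the input with the trailing space, which is the behaviour the question describes.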
The pattern you tried, ([A-Z]\. (?=[A-Z]|$))+, uses a repeated capturing group, which will give you the value of the last iteration only.
In that repetition, the literal space after \. is part of every iteration, so it has to be present in the match, even after the last initial.
Instead, you could repeat 0 or more times an uppercase char [A-Z] followed by a dot and a space, to match multiple occurrences.
Then match a final [A-Z] and a dot, asserting that what is on the right is not a non-whitespace char.
\b(?:[A-Z]\. )*[A-Z]\.(?!\S)
If there can be multiple spaces but it should not match a newline:
\b(?:[A-Z]\.[^\S\r\n]*)*[A-Z]\.(?!\S)
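A short Python sketch of how this last pattern behaves when searching inside a longer string. The helper name find_initials is hypothetical, and re.search is used on the assumption that the initials may be embedded in other text.

```python
import re

# [^\S\r\n]* matches any run of whitespace except carriage returns and newlines.
INITIALS = re.compile(r"\b(?:[A-Z]\.[^\S\r\n]*)*[A-Z]\.(?!\S)")


def find_initials(text: str):
    """Hypothetical helper: return the first run of initials, or None."""
    match = INITIALS.search(text)
    return match.group(0) if match else None


print(repr(find_initials("L. B. ")))    # 'L. B.' (trailing space is not part of the match)
print(repr(find_initials("L.   B.")))   # 'L.   B.'
print(repr(find_initials("L.\nB.")))    # 'L.' (the match does not cross the newline)
```

Since the whitespace part is optional here, an input such as L.B. is also matched as a whole; use a + instead of the * after the character class if at least one space is required.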

No period in first part of regular expression

This is what I'm currently working with:
((?i)(\w|^){0,25}[0-9]{3})[^\.]*#(gmail)\.com
What I'm attempting to do is block any email that is any amount of characters but with 3 numbers trailing the characters.
This works. HOWEVER, when Google creates a username for people, it usually chooses firstname.lastname####gmail.com. I don't want an email with a period before the #gmail.com to be included.
I have played and played with this expression, and I can't get it. So for example john.doe123#gmail.com, the expression is tagging everything after the period. I need for the regex to check the ENTIRE email and check to see if it follows the expression. I know there is this tidbit ^[^\.]*$ but I have no idea where to put it.
You could match 0-25 word characters followed by 3 digits \w{0,25}[0-9]{3} and use anchors to assert the start ^ and the end $ of the string.
^\w{0,25}[0-9]{3}#gmail\.com$
If you want to make use of a negated character class, you could match 0-25 chars that are anything except a whitespace char, # or a dot, followed by 3 digits, using [^\s#.]{0,25}[0-9]{3}
^[^\s#.]{0,25}[0-9]{3}#gmail\.com$
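The two anchored patterns can be compared with a small Python sketch. The function name is_blocked and the sample addresses are made up for illustration; # is used as the separator as in the question.

```python
import re

# Up to 25 word characters, then exactly three digits, then the domain.
WORD_CHARS = re.compile(r"^\w{0,25}[0-9]{3}#gmail\.com$")
# Same idea, but any character except whitespace, # and dot is allowed.
NO_DOT = re.compile(r"^[^\s#.]{0,25}[0-9]{3}#gmail\.com$")


def is_blocked(address: str, pattern=WORD_CHARS) -> bool:
    """Hypothetical helper: True if the whole address fits the pattern."""
    return pattern.search(address) is not None


print(is_blocked("johndoe123#gmail.com"))           # True
print(is_blocked("john.doe123#gmail.com"))          # False: contains a dot
print(is_blocked("john-doe123#gmail.com"))          # False: \w does not match "-"
print(is_blocked("john-doe123#gmail.com", NO_DOT))  # True
```

The only practical difference between the two is which characters are allowed before the digits. The (?i) flag from the question is left out here; re.IGNORECASE could be passed to re.compile if case-insensitive matching of the domain is wanted.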

Weird in a regular expression

I tried the following regular expression:
Pattern: ((.[^[0-9])+)(([0-9]{1,3}([.][0-9]{3})+)|([0-9]+))
My goal is to match any string (excluding digit number) followed by a specified number, e.g. MG2999, dasdassa33232
I used the above regular expression.
It's weird as follows:
V375 (not matched)
Vv375 (matched)
Vvv375 (only vv375 is matched; the first character is not matched)
Vvvv375 (matched)
...
I don't understand why the first character is never matched. Could you help me?
For your quick test, please try: http://regex101.com/
Thanks in advance!
--
Vu
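((.[^[0-9])+) matches any character (.), followed by any character except digits and [, repeatedly. Because it consumes characters two at a time, only an even number of characters can be matched before the digits.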
You probably want [^0-9]+ here – or, simpler, \D+.
The rest of the regular expression has similar problems, but since I don’t know the number format you want to match, I cannot correct that.
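The pairing effect can be reproduced with a short Python sketch. The [ inside the character class is escaped for Python, which does not change its meaning. The second pattern is only an assumed simplification (non-digits followed by plain digits), since the intended number format is not known.

```python
import re

# The question's pattern: the first group consumes characters in pairs.
ORIGINAL = re.compile(r"((.[^\[0-9])+)(([0-9]{1,3}([.][0-9]{3})+)|([0-9]+))")
# Assumed simplification: one or more non-digits, then one or more digits.
SIMPLER = re.compile(r"(\D+)([0-9]+)")


def matched_text(pattern, text):
    """Hypothetical helper: return the matched substring, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None


for sample in ("V375", "Vv375", "Vvv375", "Vvvv375"):
    print(sample, matched_text(ORIGINAL, sample), matched_text(SIMPLER, sample))
# V375 None V375
# Vv375 Vv375 Vv375
# Vvv375 vv375 Vvv375
# Vvvv375 Vvvv375 Vvvv375
```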

Regular expression to correct email address

I need help in writing one regular expression where I want to remove unwanted characters in the start and end of the email address. For example:
z>user1#hotmail.com<kt
z>user2#hotmail.pk<kt
z>puser3#yahoo.com<kt
z>npuser4#yaoo.uk<kt
After applying regular expression my emails should look like:
user1#hotmail.com
user2#hotmail.pk
puser3#yahoo.com
npuser4#yaoo.uk
The regular expression should not be applied if the email address is already correct.
You can try deleting matches of
^[^>]*>|<[^>]*$
Find ^[^>]*>([^<]*)<*.*$ and replace it with \1
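Both suggestions can be tried in Python with re.sub. This is a sketch only, and the function names are made up for illustration.

```python
import re

# First suggestion: delete everything up to the first ">" and
# everything from the trailing "<" to the end of the string.
WRAPPER = re.compile(r"^[^>]*>|<[^>]*$")
# Second suggestion: capture the middle part and replace the whole string with it.
CAPTURE = re.compile(r"^[^>]*>([^<]*)<*.*$")


def strip_wrapper(text: str) -> str:
    return WRAPPER.sub("", text)


def keep_captured(text: str) -> str:
    return CAPTURE.sub(r"\1", text)


print(strip_wrapper("z>user1#hotmail.com<kt"))  # user1#hotmail.com
print(keep_captured("z>npuser4#yaoo.uk<kt"))    # npuser4#yaoo.uk
print(strip_wrapper("user1#hotmail.com"))       # user1#hotmail.com (unchanged)
```

An address that is already clean contains neither > nor <, so neither pattern matches and the string is returned as it is.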
I think you might be missing the point of a regular expression slightly. A regular expression defines the 'shape' of a string and returns whether or not the string conforms to that shape. A simple expression for an email address might be something like:
[a-zA-Z0-9]*\.?[a-zA-Z0-9]+#[a-zA-Z0-9]*\.[a-z]+
But it is not simple to write one catch-all regular expression for an email address. Really, what you need to do to check it properly is:
Ensure there is one and only one '#'-sign.
Check that the part before the #-sign conforms to a regular expression for this part:
Characters
Digits
Extended characters: .-'_ (that list may not be complete)
Check that the part after the #-sign conforms to the reg-ex for domain names:
Characters
Digits
Extended characters: . -
Must start with character or digit and must end with a proper domain name ending.
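The checklist above could be sketched in Python roughly as follows. Everything concrete here is an assumption made for illustration: the function name, the exact character sets, and the reading of "proper domain name ending" as a dot followed by at least two letters. The # separator is the one used in the question.

```python
import re

# Assumed character sets, following the lists above.
LOCAL_PART = re.compile(r"[A-Za-z0-9.'_-]+")
# Starts with a letter or digit and ends with a dot plus two or more letters.
DOMAIN = re.compile(r"[A-Za-z0-9][A-Za-z0-9.-]*\.[A-Za-z]{2,}")


def looks_like_address(address: str, separator: str = "#") -> bool:
    """Hypothetical helper that applies the checks one after another."""
    # Step 1: exactly one separator.
    if address.count(separator) != 1:
        return False
    local, domain = address.split(separator)
    # Steps 2 and 3: each part must fit its own pattern completely.
    return (LOCAL_PART.fullmatch(local) is not None
            and DOMAIN.fullmatch(domain) is not None)


print(looks_like_address("user1#hotmail.com"))   # True
print(looks_like_address("user1##hotmail.com"))  # False: two separators
print(looks_like_address("user1#hotmail"))       # False: no domain ending
```

Splitting first and checking each part separately keeps the individual patterns short, which is the main point of the checklist.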
Try using a capturing group on anything between the characters you don't want. For example,
/>([\w\d]+#[\w\d]+\.\w+)</
Basically, any part that the regexp inside () matches is saved in a capturing group. This one matches anything that's inside >here< that starts with a bunch of word characters or digits, has an #, has one or more word or digit characters, then a period, then some word characters. It should match simple addresses like the ones in the question, though not every valid email address.
If you need characters besides >< to be matched, make a character class. That's what those square bracketed bits are. If you replace > with [.,></?;:'"] it'll match any of those characters.
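In Python, the capturing-group idea might look like the sketch below. The pattern is written with the dot escaped, and the function name unwrap is made up for illustration. If nothing matches, the input is returned unchanged, so addresses that are already clean are left alone.

```python
import re

# Group 1 captures the address found between ">" and "<".
WRAPPED = re.compile(r">([\w\d]+#[\w\d]+\.\w+)<")


def unwrap(text: str) -> str:
    """Hypothetical helper: return the captured address, or the input as is."""
    match = WRAPPED.search(text)
    return match.group(1) if match else text


print(unwrap("z>user1#hotmail.com<kt"))  # user1#hotmail.com
print(unwrap("user1#hotmail.com"))       # user1#hotmail.com (no match, unchanged)
```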

Condition for max character limit and on minimum character putting condition

I am trying to do the following match using regex.
The input should consist of 2 to 10 capital letters.
If it is exactly 2 characters long, allow it only if neither the first nor the second character is A, E, I, O or U.
I tried:
[B-DF-HJ-NP-TV-XZ]{2,10}
It works well, but I am not too sure if this is the right and most efficient way to do regex here.
All credit to Jerry, for his answer:
^(?:(?:(?![AEIOU])[A-Z]){2}|[A-Z]{3,10})$
Explanation:
^ = "start of string", and $ = "end of string". This is useful for preventing false matches (e.g. a 10-character match from an 11 character input, or "MR" matching in "AMRXYZ").
(?![AEIOU]) is a negative look-ahead for the characters A, E, I, O and U: it fails if the next character is a vowel. It is grouped together with [A-Z] and the group is repeated twice, so the check runs before each of the two characters. This is only applied to the first half of the "OR" (|) alternation, so vowels are still allowed in longer matches.
The rest is fairly obvious, given the understanding of regex you've already demonstrated in your question above.
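A quick Python sketch to check the behaviour described above. It uses a form of the pattern in which the look-ahead is applied to each of the two characters; the function name is_valid_code is made up for illustration.

```python
import re

# Two consonants, or any 3 to 10 capital letters.
CODE_PATTERN = re.compile(r"^(?:(?:(?![AEIOU])[A-Z]){2}|[A-Z]{3,10})$")


def is_valid_code(text: str) -> bool:
    """Hypothetical helper: True if the input fits the pattern."""
    return CODE_PATTERN.match(text) is not None


for sample in ("MR", "AB", "BA", "ABC", "B", "ABCDEFGHIJK"):
    print(sample, is_valid_code(sample))
# MR True
# AB False
# BA False
# ABC True
# B False
# ABCDEFGHIJK False
```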