No period in first part of regular expression - regex

This is what I'm currently working with:
((?i)(\w|^){0,25}[0-9]{3})[^\.]*#(gmail)\.com
What I'm attempting to do is block any email address that consists of any number of characters followed by 3 trailing digits.
This works. HOWEVER, when Google creates a username for people, it usually chooses firstname.lastname####gmail.com. I don't want an email with a period before the #gmail.com to be included.
I have played and played with this expression, and I can't get it. For example, with john.doe123#gmail.com, the expression tags everything after the period. I need the regex to check the ENTIRE email and see whether it follows the expression. I know there is this tidbit ^[^\.]*$ but I have no idea where to put it.

You could match 0-25 word characters followed by 3 digits \w{0,25}[0-9]{3} and use anchors to assert the start ^ and the end $ of the string.
^\w{0,25}[0-9]{3}#gmail\.com$
If you want to make use of the negated character class [^ you could match 0-25 characters other than a whitespace char, # or a dot, followed by 3 digits, using [^\s#.]{0,25}[0-9]{3}
^[^\s#.]{0,25}[0-9]{3}#gmail\.com$
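To see what the anchors change, here is a minimal Python sketch that runs the two patterns from this answer next to an unanchored variant. The names WORD_PATTERN, NEGATED_PATTERN, UNANCHORED and is_blocked are made up for the illustration, and re.IGNORECASE is assumed as the stand-in for the (?i) in the question.

```python
import re

# Patterns copied from the answer; "#" is the separator used in this thread.
WORD_PATTERN = re.compile(r"^\w{0,25}[0-9]{3}#gmail\.com$", re.IGNORECASE)
NEGATED_PATTERN = re.compile(r"^[^\s#.]{0,25}[0-9]{3}#gmail\.com$", re.IGNORECASE)

# Same as WORD_PATTERN but without ^ and $, to show the partial-match problem.
UNANCHORED = re.compile(r"\w{0,25}[0-9]{3}#gmail\.com", re.IGNORECASE)


def is_blocked(address: str) -> bool:
    """Return True when the whole address fits the anchored pattern."""
    return WORD_PATTERN.search(address) is not None


if __name__ == "__main__":
    for sample in ("johndoe123#gmail.com", "john.doe123#gmail.com", "johndoe12#gmail.com"):
        print(sample, is_blocked(sample))

    # Without anchors the engine is free to start matching after the period.
    partial = UNANCHORED.search("john.doe123#gmail.com")
    print(partial.group())  # doe123#gmail.com
```

Only the first sample is blocked. The unanchored variant still finds doe123#gmail.com inside the dotted address, which is the behaviour described in the question.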

Related

Regular Expression for email formatting without hyphen at first and last

I have created a regular expression that accepts an email address in the following format:
abc#xyz.com.in
Regular Expression
/^(?!-)[\w-\.]+#([\w-]+\.)+[\w-]{2,4}/
I want it to reject an email that has a hyphen at the start or at the end.
Invalid Format
-abc#xyz.com
abc#xyz.com-
valid format
abc#xyz.com
abc#xyz.com.in
Your regex can be edited in a simple way (see a demo at Regex101):
/^[\w\.]+[\w\.\-]*#[\w\.]+\.[\w\.]{2,4}$/
^: This is the beginning of the line
[\w\.]+: The first part of the email, before the #, must begin with one or more word characters (\w) or dots (\.).
[\w\.\-]*: After that, the same characters from the list before can occur including the dash (\-) and as many times as you want. Remember, the dash has to be escaped if used in the list between [ and ], otherwise it represents a range instead of the dash itself.
#: This matches itself.
[\w\.]+: After the # character, there must be at least one character from the list.
\.: Then followed by the dot literally.
[\w\.]{2,4}: Finally the last 2-4 characters.
$: And the end of a line.
The difference between this and your Regex is just a little:
/^[\w\.]+[\w\.\-]*#[\w\.]+\.[\w\.]{2,4}$/
/^(?!-)[\w-\.]+#([\w-]+\.)+[\w-]{2,4}/
I avoided the negative look-ahead and instead specified (whitelisted) the characters that can occur at each position; I generally try to avoid blacklisting characters unless it is really needed. The rest of the Regex is quite similar, except that you should escape the dash - character between the list brackets [ and ].
Finally, I omitted the capturing groups ( and ) and left it up to you to place them wherever you need.
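As a quick check, the pattern from this answer can be run against the four addresses in the question. This is only a sketch in Python: the / delimiters are dropped because Python does not use them, and EMAIL_PATTERN, SAMPLES and is_valid are names invented for the example.

```python
import re

# Pattern from the answer above, without the surrounding / delimiters.
EMAIL_PATTERN = re.compile(r"^[\w\.]+[\w\.\-]*#[\w\.]+\.[\w\.]{2,4}$")

# Addresses from the question, mapped to the result the asker expects.
SAMPLES = {
    "abc#xyz.com": True,
    "abc#xyz.com.in": True,
    "-abc#xyz.com": False,   # hyphen at the start
    "abc#xyz.com-": False,   # hyphen at the end
}


def is_valid(address: str) -> bool:
    """Return True when the whole address fits EMAIL_PATTERN."""
    return EMAIL_PATTERN.search(address) is not None


if __name__ == "__main__":
    for address, expected in SAMPLES.items():
        print(f"{address!r}: got {is_valid(address)}, expected {expected}")
```

One thing to keep in mind: the part after # uses [\w\.] only, so an address with a hyphen in the domain, such as abc#my-site.com, is rejected by this pattern.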
Add \w to each end of your regex, and include the end anchor $
^\w[\w.-]+#([\w-]+\.)+[\w-]{2,4}\w$
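Note also that the dot doesn't need escaping within a character class. Be aware that the trailing \w consumes one extra character, so the last label must now be 3 to 5 characters long and abc#xyz.com.in no longer matches; use [\w-]{1,3}\w instead of [\w-]{2,4}\w if you want to keep the original length of 2 to 4.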
A complete email RegEx:
/^(([^<>()[\]\\.,;:\s#"]+(\.[^<>()[\]\\.,;:\s#"]+)*)|(".+"))#((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/

Altering Regex to allow apostrophe in email address

At work, our current ValidationExpression looks horrible and very confusing for me. We are using the WebForms <asp:RegularExpressionValidator> user control which looks like this:
<asp:RegularExpressionValidator ID="regEmail" runat="server"
ValidationGroup="EditEmails"
Text="*" ErrorMessage="Invalid email address."
ControlToValidate="txtAdd"
Display="Dynamic"
ValidationExpression="^(([A-Za-z0-9]+_+)|([A-Za-z0-9]+\-+)|([A-Za-z0-9]+\.+)|([A-Za-z0-9]+\++))*[A-Za-z0-9]+#((\w+\-+)|(\w+\.))*\w{1,63}\.[a-zA-Z]{2,6}$"/>
I need to somehow alter this to allow apostrophes ( ' ) inside an email because at the moment this expression is failing.
Example of an email that needs to pass this validation: Test.O'neill#example.co.uk
I am unsure what the expression does but I'm sure this could be made shorter (maybe not simpler but that does not matter as long as it works).
Anyone know of a better regular expression I could use which works against valid emails and takes this into consideration? Thank you!
EDIT: My question is different because the proposed duplicate question does not work for VB.Net RegularExpression Validator user control.
See regex in use here
^([a-z\d]+[_+.-])*[a-z\d']+#(\w+[.-])*\w{1,63}\.[a-z]+$
^ Assert position at the start of the line
([a-z\d]+[_+.-])* Capture the following any number of times
[a-z\d]+ Match any ASCII letter or digit one or more times (also matches uppercase variants with i flag enabled)
[_+.-] Match any character in the set
[a-z\d']+ Match any ASCII letter, digit, or apostrophe one or more times
# Match this literally
(\w+[.-])* Capture the following any number of times
\w+ Match any word character one or more times
[.-] Match any character in the set
\w{1,63} Match any word character between one and 63 times
\. Match a literal dot .
[a-z]+ Match any ASCII letter one or more times
$ Assert position at the end of the line
To implement the above pattern in a case-insensitive manner, add the RegexOptions.IgnoreCase flag.
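The effect of the case-insensitive flag can be tried outside of WebForms. Below is a rough Python sketch, not the validator itself: re.IGNORECASE is used as the Python counterpart of RegexOptions.IgnoreCase, and PATTERN, CASE_SENSITIVE and accepts are names assumed for the example.

```python
import re

RAW = r"^([a-z\d]+[_+.-])*[a-z\d']+#(\w+[.-])*\w{1,63}\.[a-z]+$"

# Case-insensitive version, as the answer recommends.
PATTERN = re.compile(RAW, re.IGNORECASE)
# Same pattern without the flag, to show why the flag matters.
CASE_SENSITIVE = re.compile(RAW)


def accepts(address: str) -> bool:
    """Return True when the address fits the case-insensitive pattern."""
    return PATTERN.search(address) is not None


if __name__ == "__main__":
    sample = "Test.O'neill#example.co.uk"
    print(accepts(sample))                            # True
    print(CASE_SENSITIVE.search(sample) is not None)  # False: "T" and "O" are uppercase
    print(accepts("bad..name#example.com"))           # False: two separators in a row
```

Note that the apostrophe is only allowed in the last segment before the #, so O'neill.Test#example.co.uk is rejected by this pattern. If that matters, the apostrophe would have to be added to the first character class as well.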

Use Regular Expressions to find URLs without certain word patterns

I am trying to write a Regular Expression that can match URLs that don't have a certain pattern. The URLs I am trying to filter out shouldn't have an ID in them, which is 40 Hex uppercase characters.
For example, If I have the following URLs:
/dev/api/appid/A1B2C3D4E5A1B2C3D4E5A1B2C3D4E5A1B2C3D4E5/users
/dev/api/apps/list
/dev/api/help/apps/applicationname/apple/osversion/list/
(urls are made up, but the idea is that there are some endpoints with 40-length IDs, and some endpoints that don't, and some endpoints that are really long in total characters)
I want to make sure that the regular expression is only able to match the last 2 URLs, and not the first one.
I wrote the following regex,
\S+(?:[0-9A-F]{40})\S+
and it matches endpoints that do have the long ID in them, but skips over the ones that should be filtered. If I try to negate the regex,
\S+(?![0-9A-F]{40})\S+
It matches all endpoints, because some URLs have lengths that are greater than what the ID should be (40 characters).
How can I use a regular expression to filter out exactly the URLs I need?
Try this regex:
^(?!.*\/[0-9A-F]{40}\/).*$
Explanation:
^ - asserts the start of the string/url
(?!.*\/[0-9A-F]{40}\/) - Negative Lookahead to check for the presence of a / followed by exactly 40 HEX characters followed by / somewhere in the string. Since it is a negative lookahead, any string/url containing this pattern will not be matched.
.* - matches 0+ occurrences of any character except a newline character
$ - asserts the end of the string
^((?![A-F0-9]{40}).)*$
Uses a negative lookahead to match any line that doesn't have 40 hex digits in a row.
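Both suggested patterns can be compared on the URLs from the question with a short Python sketch. HEX_ID, URLS and without_id are names made up here, and the fourth URL (an ID with no trailing slash) is an extra case added only to show where the two patterns differ.

```python
import re

# First answer: the ID must sit between two slashes.
SLASH_DELIMITED = re.compile(r"^(?!.*\/[0-9A-F]{40}\/).*$")
# Second answer: any run of 40 hex characters disqualifies the line.
ANY_RUN = re.compile(r"^((?![A-F0-9]{40}).)*$")

HEX_ID = "A1B2C3D4E5" * 4  # the 40-character ID from the question

URLS = [
    f"/dev/api/appid/{HEX_ID}/users",
    "/dev/api/apps/list",
    "/dev/api/help/apps/applicationname/apple/osversion/list/",
]


def without_id(urls, pattern):
    """Keep only the URLs that the given pattern matches."""
    return [url for url in urls if pattern.search(url)]


if __name__ == "__main__":
    print(without_id(URLS, SLASH_DELIMITED))  # the last two URLs
    print(without_id(URLS, ANY_RUN))          # the last two URLs

    # Extra case: the ID is the last path segment, with no trailing slash.
    trailing = f"/dev/api/appid/{HEX_ID}"
    print(SLASH_DELIMITED.search(trailing) is not None)  # True: no closing "/"
    print(ANY_RUN.search(trailing) is not None)          # False
```

So if an ID can appear at the very end of a URL, the second pattern is the stricter choice; the first one only filters IDs that are followed by another slash.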

Regular expression lets periods in (.)

My regular expression lets in periods for some reason. How can I keep that from happening?
Rules:
4-15 characters
Any alphanumeric characters
Underscore as long as it's not first or last
[A-Za-z][A-Za-z0-9_]{3,14}
I don't want "bad.example" to work.
Edit: changed to 4-15 characters
Your regex matches example as a substring of bad.example. Use anchors to prevent that:
^[A-Za-z][A-Za-z0-9_]{1,12}[A-Za-z]$
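Note that (like your regex) this regex also prevents digits from matching in the first and last position - if they should be allowed (as per your specs), just add 0-9 at the end of the character classes. Also note that {1,12} gives a total length of 3 to 14 characters; for the 4-15 rule in the edited question, the quantifier would be {2,13}.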
^[A-Za-z][A-Za-z0-9_]{3,14}$
try this
This will match any alphanumeric at the beginning and end. In the middle it will accept from one up to twelve alphanumerics including an underscore:
^[a-zA-Z\d]\w{1,12}[a-zA-Z\d]$
Your regex does not match bad.example as a whole; it matches only example, because it accepts any run of 4 to 15 allowed characters. See here:
http://regex101.com/r/xV4eL5/5
To prevent it you need to match the whole input and not make partial matches. Put a ^ start anchor and a $ end anchor.
Use
\A[A-Za-z0-9][\w]{1,12}[A-Za-z0-9]\Z
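The partial match on bad.example is easy to reproduce. Here is a small Python sketch: ORIGINAL is the regex from the question, while ANCHORED is a hypothetical variant written for the rules as edited (4-15 characters, alphanumerics, underscore not first or last), so it uses {2,13} rather than the quantifiers shown in the answers. is_valid_username is also just a name for this example.

```python
import re

# The regex from the question: no anchors, so it can match inside a longer string.
ORIGINAL = re.compile(r"[A-Za-z][A-Za-z0-9_]{3,14}")

# Assumed variant for the edited rules: 1 + (2 to 13) + 1 = 4 to 15 characters,
# with no underscore in the first or last position.
ANCHORED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9_]{2,13}[A-Za-z0-9]$")


def is_valid_username(name: str) -> bool:
    """Return True when the whole name fits the anchored pattern."""
    return ANCHORED.search(name) is not None


if __name__ == "__main__":
    # The unanchored regex skips "bad." and matches the rest.
    print(ORIGINAL.search("bad.example").group())  # example

    for name in ("bad.example", "ab_d", "_abc", "abc_", "abc"):
        print(name, is_valid_username(name))
```

With the anchors in place only ab_d passes: bad.example contains a period, _abc and abc_ have the underscore in a forbidden position, and abc is too short.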

regular expression not working as expected with the plus quantifier

I have
/\d+/
Using the string "tom666tom"
It matches the 666. Shouldn't it fail when it hits the first t in tom?
How exactly is the regex engine working here? I know the plus sign means one or more.
It will fail if you tell the regex that the string should start and end with a number, like so:
/^\d+$/
The ^ defines the start of the string and $ the end.
The pattern searches for one or more digits (+) anywhere in the input string.
You are not telling your expression to match the entire string. If any part of the string contains one or more digits, it will match. Use the ^ (zero-length start of line marker) and $ (zero-length end of line marker) to delimit your regex and indicate that the only thing on the line should be digits: /^\d+$/.
It shouldn't fail when it encounters the first t in "tom", because a +
matches 1 or more of the preceding token. This is a greedy match, and
will match as many characters as possible before satisfying the next
token.
In your regex /\d+/, the + is placed after \d which matches any digit.
As said in the definition, the regex engine is working perfectly, because it matches the previous token (\d) as many times as it can.
When \d fails at the first t, the engine does not give up; it retries from the next position until it reaches 666, and then it keeps matching digits until it encounters a mismatch.
So the preceding token here is \d and hence the regex engine is working fine.
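The behaviour described in these answers can be checked with a few lines of Python. This is only an illustration using the standard re module, where re.search looks for the pattern anywhere in the string; text, partial and whole are names chosen for the example.

```python
import re

text = "tom666tom"

# Unanchored: the engine retries at each position until \d+ can match.
partial = re.search(r"\d+", text)
print(partial.group(), partial.span())  # 666 (3, 6)

# Anchored: the whole string must consist of digits, so this fails.
whole = re.search(r"^\d+$", text)
print(whole)  # None

# A string made only of digits passes the anchored pattern.
print(re.search(r"^\d+$", "666") is not None)  # True
```

The span (3, 6) shows that the match starts after the first tom and, because + is greedy, takes all three digits before stopping at the second t.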