Improve exim regex to catch everything but specified addresses - regex

I'm using this regex to catch any incoming e-mails, excluding mails from specific people.
^(.(?!(zulgrib#exemple.com|zulgrib#example.org)).)*$/i
This regex correctly lets through these inputs:
Zulgrib at example.com <Zulgrib#example.com>
<Zulgrib#example.com>
<Zulgrib#example.com> In behalf of Robot
The regex correctly catches these kinds of headers:
Associate#example.org
Your Associate Associate#example.com
If an excluded e-mail address stands alone, the regex catches it, and I would like to prevent that. Example:
zulgrib#exemple.org
What should be modified to make this work, and why is my current method not correct?
If I understand the documentation, . matches any character and an empty string is not a character, but using * is not working.

First, some issues in your current regex:
exemple has a different spelling than example
Literal dots need to be escaped: \.com instead of .com.
There are two dots (.) in the outermost group, which means you only capture text with an even number of characters, and don't exclude the case where the email addresses start at the beginning of the string. The first dot should not be there.
To make an exception for when the email address is the only thing in the input, I fear you'll have to specify that as a separate alternative in which (unfortunately) you'll have to repeat those email addresses:
^(?:zulgrib#example\.com|zulgrib#example\.org)$|^(?!(?:.*(?:zulgrib#example\.com|zulgrib#example\.org))).*$
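As a rough check, the combined pattern can be run against the inputs from the question. This is only a sketch: Python's re module stands in for the regex engine exim uses, re.IGNORECASE stands in for the /i suffix, and the names ADDRESSES, PATTERN and matches are made up for illustration. Building the pattern from one variable is one way to avoid typing the address list twice.

```python
import re

# The excluded addresses, written once and reused in both alternatives.
ADDRESSES = r"zulgrib#example\.com|zulgrib#example\.org"

PATTERN = re.compile(
    rf"^(?:{ADDRESSES})$|^(?!(?:.*(?:{ADDRESSES}))).*$",
    re.IGNORECASE,
)


def matches(header: str) -> bool:
    """Return True when the pattern matches the given header text."""
    return PATTERN.search(header) is not None


for header in [
    "zulgrib#example.org",                   # address alone: first alternative
    "Associate#example.org",                 # no listed address: second alternative
    "<Zulgrib#example.com>",                 # listed address plus other text
    "Zulgrib at example.com <Zulgrib#example.com>",
]:
    print(f"{header!r}: {matches(header)}")
```

With these inputs, the first two lines match and the last two do not.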

Related

Regex match domain that contain certain subdomain

I have this regex (not mine, taken from here)
^[^\.]+\.example\.org$
The regex will match *.example.org (e.g. sub.example.org) but leaves out sub-subdomains (e.g. sub.sub.example.org), which is exactly what I want.
But I have another requirement: I want to match subdomains that contain a specific string, in this case press. So the regex should match the following (literally any subdomain that has the word press in it).
free-press.example.org
press.example.org
press23.example.org
I have trouble finding the right syntax. I have looked on the internet, but most of what I found works only for standalone text and not for domains like this.
Ok, let's break down what the "subdomain" part of your regex does:
[^\.]+ means "any character except for ., at least once".
You can break your "desired subdomain" up into three parts: "the part before press", "press itself", and "the part after press".
For the parts before and after press, the pattern is basically the same as before, except that you want to change the + (one or more) to a * (zero or more), because there might not be anything before or after press.
So your subdomain pattern will look like [^\.]*press[^\.]*.
Putting it all together, we have ^[^\.]*press[^\.]*\.example\.org$. If we put that into Regex101 we see that it works for your examples.
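The same check can be reproduced locally. A minimal sketch in Python (the language is an assumption, since the question does not name one; PRESS_HOST is a made-up name):

```python
import re

# Subdomain must contain "press" and must not contain a dot.
PRESS_HOST = re.compile(r"^[^\.]*press[^\.]*\.example\.org$")

for host in [
    "free-press.example.org",
    "press.example.org",
    "press23.example.org",
    "sub.example.org",        # no "press" in the subdomain
    "press.sub.example.org",  # sub-subdomain
]:
    print(host, bool(PRESS_HOST.match(host)))
```

The first three hosts match; the last two do not.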
Note that this isn't a very strict check for valid domains. It might be worth thinking about whether regexes are actually the best tool for the "subdomain checking" part of this task. You might instead want to do something like this:
Use a generic, more thorough, domain-validation regex to check that the domain name is valid.
Split the domain name into parts using String.split('.').
Check that the number of parts is correct (i.e. 3), and that the parts meet your requirements (i.e. the first contains the substring press, the second is example, and the third is org).
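The three steps above could look roughly like this in Python. The LABEL pattern is a deliberately simplified, hypothetical stand-in for the "generic, more thorough" validation regex, and is_press_subdomain is a made-up name.

```python
import re

# Simplified label check (an assumption, not a full domain validator):
# letters, digits and hyphens, no leading or trailing hyphen, at most 63 chars.
LABEL = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?", re.IGNORECASE)


def is_press_subdomain(host: str) -> bool:
    parts = host.split(".")
    # Exactly three parts: subdomain, "example", "org".
    if len(parts) != 3:
        return False
    # Every part must look like a plausible domain label.
    if not all(LABEL.fullmatch(part) for part in parts):
        return False
    sub, domain, tld = (part.lower() for part in parts)
    return "press" in sub and domain == "example" and tld == "org"
```

Splitting first keeps each requirement readable on its own, at the cost of a few more lines than the single regex.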
If you're looking for a regex that matches URLs whose subdomains contain the word press then use
^[^\.]*press[^\.]*\.example\.org$
See the demo

Google Analytics IP Filter Exclude

Could someone help me with some REGEX...
I have been blocking internal traffic using the filter pattern:
10.*..
This just bit me in the foot as this is blocking all referral traffic between our sites.
What I want to do now is block everything except 10.103..
Do I need to apply two separate ranges, or can I accomplish this with one filter?
If you want to block everything but 10.103.xxx.xxx, use an include filter instead of the usual exclude filter.
NOTE ABOUT REGEXES MATCHING IPs IN ANALYTICS
I am not sure if the filter I suggested above uses regex or not (literal string match), but it doesn't make a difference because there's no way the expression 10.103. could be misinterpreted in an IP address.
Your original pattern, on the other hand, is bogus and is probably hurting you. That's because in a regex the dot . is not a literal dot, but represents any character. Your expression, in fact, excludes every single IP that merely starts with 10 (not just 10. that is ten-dot), including 100.xxx, 101.xxx etc.
The correct version of your original excluding regex would be 10\..*, which contains an escaped dot (\.), then proceeds to any characters after that (.*).
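The difference between the two patterns is easy to see outside Analytics. A small Python sketch; the leading ^ is added here to express "starts with" and is an assumption, since how the filter anchors its pattern is not stated above, and LOOSE and STRICT are made-up names.

```python
import re

LOOSE = re.compile(r"^10.*")     # unescaped dot: "10" followed by anything
STRICT = re.compile(r"^10\..*")  # escaped dot: "10." followed by anything

for ip in ["10.103.1.2", "100.20.3.4", "101.1.1.1"]:
    print(ip, bool(LOOSE.search(ip)), bool(STRICT.search(ip)))
```

LOOSE matches all three addresses, while STRICT matches only 10.103.1.2.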
Regular expressions are explained very well in the Google Analytics Help (here).
For multiple IPs, there is this little helper, which generates the REGEXP for you.
If you want to block internal traffic, just ADD NEW FILTER, choose CUSTOM, then EXCLUDE, and enter the IP as a regular expression in the pattern field. That's it.

validate email addresses using a regex. [duplicate]

This question already has answers here:
How can I validate an email address using a regular expression?
I am trying to validate email addresses using a regex. This is what I have now:
^([\w-.]+)#([\w-]+).(aero|asia|be|biz|com.ar|co.in|co.jp|co.kr|co.sg|com|com.ar|com.mx|com.sg|com.ph|co.uk|coop|de|edu|fr|gov|in|info|jobs|mil|mobi|museum|name|net|net.mx|org|ru)*$
I found many solutions using non-capturing groups but do not know why they use them. Also, can you tell me whether this regex is correct, and whether there are valid emails that it rejects or invalid ones that it accepts?
Don’t bother; there are many ways to validate an email address. Now that internationalized domain names exist, there’s no point in listing TLDs. On the other hand, if you want to limit your acceptance to only a selection of domains, you’re on the right track. Regarding your regex:
You have to escape dots so they become literals: . matches almost anything, \. matches “.”
In the domain part, you use [\w-] (without dot) which won’t work for “#mail.example.com”.
You probably should take a look at the duplicate answer.
This article shows you a monstrous, yet RFC 5322 compliant regular expression, but also tells you not to use it.
I like this one: /^.+#.+\...+$/. It tests for one or more characters, an at sign, one or more characters, a dot, and then at least two more characters. This will suffice to check the general format of an entered email address. In all likelihood, users will make typing errors that are impossible to prevent, like typing john#hotmil.com. They won’t get your mail, but you successfully validated the address format.
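To see what this loose format check accepts, here is a short Python sketch (Python is an assumption, and FORMAT and looks_like_address are made-up names). The pattern is copied as written above, including its separator character.

```python
import re

# Something, the separator, something, a dot, then at least two characters.
FORMAT = re.compile(r"^.+#.+\...+$")


def looks_like_address(text: str) -> bool:
    return FORMAT.match(text) is not None


for candidate in ["john#hotmil.com", "john#x.co", "john#example", "#example.com", "john#x.c"]:
    print(candidate, looks_like_address(candidate))
```

The first two candidates pass, including the misspelled hotmil.com; the other three fail because the dot, the part before the separator, or the second character after the dot is missing.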
In response to your comment: if you use a non-capturing group by writing (?:…) instead of (…), the text matched by that group is not captured. For instance, all email addresses have an at sign, so you don’t need to capture it. Hence, (john)(?:#)(example\.com) will provide the name and the server, not the at sign. Non-capturing groups are a general regex feature; they have nothing to do with email validation specifically.
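A quick way to see the difference is to compare the captured groups. A minimal Python sketch (the variable names are illustrative only):

```python
import re

text = "john#example.com"

# Middle group is non-capturing: only name and server are returned.
non_capturing = re.match(r"(john)(?:#)(example\.com)", text)
print(non_capturing.groups())  # ('john', 'example.com')

# Same pattern with a capturing middle group: the separator is returned too.
capturing = re.match(r"(john)(#)(example\.com)", text)
print(capturing.groups())  # ('john', '#', 'example.com')
```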

Regex for multiple specific email addresses

I am still figuring my way around regex and have come across a problem that I am trying to solve. How do I validate for multiple specific email addresses?
For example, I want to only allow testdomain.com, realdomain.com, gooddomain.com to be validated. All other email addresses are not allowed.
annie#testdomain.com OK
aaron1#realdomain.com OK
amber#gooddomain.com OK
annie#otherdomain.com NOT OK
But I'm still unclear on how to add multiple specific email addresses for the regex.
Any and all help would be appreciated.
Thank you,
Do you mean to include various legitimate domains in one regex?
\b[A-Z0-9._%-]+#(testdomain|gooddomain|realdomain)\.com\b
You didn't specify which language you're using, but most regex implementations have a notion of logical operators, so the domain part of your pattern would have something like:
(domain1|domain2|domain3)
\b[A-Z0-9._%-]+#(testdomain|realdomain|gooddomain)\.com\b
Assuming the above works for testdomain:
\b[A-Z0-9._%-]+#(?:testdomain|realdomain|gooddomain)\.com\b
Also, please note that you will have to add a case insensitive i modifier for this to work with your test cases, or use [A-Za-z0-9._%-] instead of [A-Z0-9._%-]
See here
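The effect of the case-insensitive modifier can be checked with the test cases from the question. This is a sketch in Python, which is an assumption because the question does not name a language; DOMAINS, with_flag and without_flag are made-up names.

```python
import re

DOMAINS = r"\b[A-Z0-9._%-]+#(?:testdomain|realdomain|gooddomain)\.com\b"

with_flag = re.compile(DOMAINS, re.IGNORECASE)
without_flag = re.compile(DOMAINS)

for address in [
    "annie#testdomain.com",
    "aaron1#realdomain.com",
    "amber#gooddomain.com",
    "annie#otherdomain.com",
]:
    print(address, bool(with_flag.search(address)), bool(without_flag.search(address)))
```

With the flag, the first three addresses match and the last one does not. Without it, lowercase addresses such as annie#testdomain.com are rejected because [A-Z0-9._%-] only covers capital letters.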
To make this expandable to many domains, I would probably capture the domain name and then compare that captured domain name with your whitelist in code.
.+#(.+)
First, ".+" will match any number (more than 0) of any characters up until the last "#" symbol in the string.
Second, "#" will match the "#" symbol.
Third, "(.+)" will match and capture (capture because of the parenthesis) any character string after the "#" symbol.
Then, depending on the language you are using, you can get the captured string. Then you can see if that captured string is in your domain whitelist. Note, you'll want to do a case insensitive comparison in this last step.
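In Python, for example, the capture-then-compare approach might look like the sketch below. ALLOWED_DOMAINS and is_allowed are hypothetical names, and the whitelist contents come from the question.

```python
import re

ALLOWED_DOMAINS = {"testdomain.com", "realdomain.com", "gooddomain.com"}


def is_allowed(address: str) -> bool:
    # Capture everything after the last "#" in the address.
    match = re.fullmatch(r".+#(.+)", address)
    if match is None:
        return False
    # Case-insensitive comparison against the whitelist.
    return match.group(1).lower() in ALLOWED_DOMAINS
```

Adding a domain then means adding one entry to the set, and the regex itself stays unchanged.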
The official standard is known as RFC 2822 (since superseded by RFC 5322).
Use OR operator | for all domain names you want to allow. Do not forget to escape . in the domain.
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*#(?:testdomain\.com|realdomain\.com|gooddomain\.com)
Also use case-insensitivity modifier/flag to allow capital letters in the address.

Regex no special char in specific place

Right now I have:
(?!.*([._-])\1)(?=.*#)[\w.#-]+
which finds test#foo
I want to make it so that test cannot start or end with a special character.
For instance, I want it to find:
tes-t#foo
test#-foo
but not:
-test#foo
test-#foo
-test-#foo
You can define a character class that doesn't include specific characters using ^, e.g. [^a] will match anything apart from an a.
I would split the regex that matches the pre-# word into three sections; one to match the leading character, one to match the middle, and one to match the last character. You'll also need to handle the special case of the pre-# word only having a single character.
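One possible reading of that suggestion, sketched in Python (the language and the names EDGE, MIDDLE, PATTERN and is_acceptable are assumptions). It treats ., _ and - as the special characters, keeps the original "no doubled special character" lookahead, and tests the whole string rather than searching inside it.

```python
import re

EDGE = r"[A-Za-z0-9]"  # leading or trailing character: not . _ or -
MIDDLE = r"[\w.-]*"    # anything from the original class in between

PATTERN = re.compile(
    r"(?!.*([._-])\1)"             # original rule: no doubled special character
    rf"{EDGE}(?:{MIDDLE}{EDGE})?"  # a single character, or first + middle + last
    r"#[\w.-]+"                    # separator and the part after it
)


def is_acceptable(text: str) -> bool:
    return PATTERN.fullmatch(text) is not None


for sample in ["tes-t#foo", "test#-foo", "-test#foo", "test-#foo", "-test-#foo"]:
    print(sample, is_acceptable(sample))
```

The optional group handles the single-character case: t#foo is accepted because the first character alone satisfies the leading and trailing rule.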
This is not an area where you should be reinventing the wheel: there’s too much to get wrong.
I’m not sure what you really want to do. Addresses like president#whitehouse.gov and a plain old postmaster are probably both deliverable but highly unlikely to do what you want.
The only reasonable way to validate a mail address is to send mail to that address and get back a non-automatable reply showing that it is the right human at the other end. But this cannot be done in real time. Which means the best you can do is make them type it twice to try to weed out typos. That’s not much help, really. That’s why signing up for something over the web always involves a negotiated handshake.
However, if all you need is to validate an RFC 5322–compliant address, you may use this pattern.
Just understand that testing an address for compliance with the RFC should never be confused with validating that mail address — which is something else altogether.