Trying to find a regexp that Gmail routing settings will accept

I'm trying to set up a "catch-all" rule in Gmail. The rule catches all incoming messages to the domain, but I want to exclude mail sent to specific addresses from it.
I followed this guide:
https://robbettis.blog/setting-a-catchall-email-for-g-suite-in-2018/
with one variation: under "1. Specify envelope recipients to match" I changed "All Recipients" to "Pattern Match",
and then used the approach from:
Match everything except for specified strings
to try something like:
^(?!(red|green|blue)$).+$
as the pattern to match. Google apparently uses a different regexp standard, though, and reports that my regexp has invalid syntax.
I don't have a strong regexp background, so any advice is appreciated.
Can someone please help me find an expression Google's system will accept, or an alternative way to achieve the same outcome?

The flavour of regex used here is RE2 – https://github.com/google/re2/wiki/Syntax
which doesn't support lookaheads, so a negative lookahead such as (?!example) is rejected as invalid syntax.
As for a solution: a rule with a lower order number is applied first. So perhaps you could add an earlier rule that matches the ^(red|green|blue).+$ addresses and does x with them, then let the next rule (the catch-all) do the desired action y for everything else.
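The two-rule ordering idea can be sketched in Python as follows. This only illustrates first-match-wins ordering and is not Gmail's routing engine: the example.com addresses, the action names and the `route` helper are hypothetical, and it assumes processing stops at the first matching rule, which is what the suggestion above relies on. Neither pattern uses a lookahead, so both stay within RE2 syntax.

```python
import re

# Hypothetical ordered rule list: (pattern, action).
# A lower index is evaluated first, mirroring "lower order number first".
RULES = [
    # Rule 1: the addresses to keep out of the catch-all.
    (re.compile(r"^(red|green|blue)@example\.com$"), "deliver-normally"),
    # Rule 2: the catch-all for everything else at the domain.
    (re.compile(r"^.+@example\.com$"), "redirect-to-catch-all"),
]


def route(recipient: str) -> str:
    """Return the action of the first rule whose pattern matches the recipient."""
    for pattern, action in RULES:
        if pattern.search(recipient):
            return action
    return "no-rule-matched"
```

Because the first rule is anchored at both ends, an address such as redwood@example.com is not treated as an exception and still falls through to the catch-all.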

You can achieve this by excluding certain emails using an Address List. To access this feature, your catch-all rule needs to be defined in the parent organisational unit's Routing. Default Routing does not provide the "Address lists" option.
I recently had this explained to me step by step by a Google Support staff member so thought I'd share the process for others.
Navigate to "Routing", it's the one at the bottom:
Choose your parent organisational unit and click "ADD ANOTHER RULE". You may not have any child organisational units, in which case it will be the only one that appears here:
Under "Email messages to affect", choose "Inbound" and "Internal – Receiving":
Under "For the types of messages above, do the following" enter the preferred action to take. For example, "Modify message" > "Change envelope recipient":
Click "Show options" to reveal options for "Address lists" and "Account types to affect"
Choose "Apply address lists to recipients" ("correspondents" means senders)
Choose "Bypass this setting for specific addresses/domains"
Add or create a list of emails that you want to exclude from this catch-all rule. This is the magical part :). In the example below, I've created a list called "Exclude from catch-all rules". You can use this list across multiple catch-all rules for different domains.
Under "Account types to affect", untick "Users", tick "Unrecognised/catch-all"
Under "Envelope filter", choose "Only affect specific envelope recipients" and select "Pattern match" to enter your catch-all regular expression:

Related

Mod_security rule exception for url/arg

An image on our site is triggering a ModSecurity rule, and I am trying to add a rule exception for only that occurrence. The number at the start of the flagged string is a session number, so I have added a regex to my rule.
I've tried various permutations but had no joy and would appreciate some advice.
Blocked URI:
https://www.website.com/application/login?0--preLoginHeaderPanel-companyLogo
Modsec log snippet:
[file "/usr/share/modsecurity-crs/rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf"] [line "65"] [id "942100"] [msg "SQL Injection Attack Detected via libinjection"] [data "Matched Data: 1c found within ARGS_NAME:0--preLoginHeaderPanel-companyLogo: 0--preLoginHeaderPanel-companyLogo"]
Attempted exceptions (within apache.conf):
SecRuleUpdateTargetById 942100 !ARGS_NAMES:'[0-9][0-9]?--preLoginHeaderPanel-companyLogo'
Core Rule Set Dev on Duty here. Rule 942100 is one of our 'LibInjection' rules. LibInjection is quite opaque (it's a third party library/operator), so you're correct that a rule exclusion is the way to fix this issue.
The use of regular expressions in this context follows a specific form. They need to be sandwiched inside forward slashes, like so:
SecRuleUpdateTargetById 942100 "!ARGS_NAMES:/^[0-9][0-9]?--preLoginHeaderPanel-companyLogo/"
I added in a starting anchor at the beginning of the regular expression. You might want to think whether anchoring at the end is a good idea, as well.
For more examples and information, we have some great documentation on this here: https://coreruleset.org/docs/configuring/false_positives_tuning/#support-for-regular-expressions
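To see what the end anchor changes, here is a small Python sketch. It assumes Python's `re` module behaves the same way as ModSecurity's regex engine for this simple pattern; the `is_excluded` helper and the sample argument names in the note below are hypothetical.

```python
import re

# Pattern from the exclusion above: anchored at the start only.
START_ONLY = re.compile(r"^[0-9][0-9]?--preLoginHeaderPanel-companyLogo")
# Variant with an end anchor added as well.
BOTH_ENDS = re.compile(r"^[0-9][0-9]?--preLoginHeaderPanel-companyLogo$")


def is_excluded(pattern: re.Pattern, arg_name: str) -> bool:
    """True if the argument name would be covered by the exclusion pattern."""
    return pattern.search(arg_name) is not None
```

With only the start anchor, a name such as 12--preLoginHeaderPanel-companyLogoEXTRA is still excluded from the rule; with both anchors it is not, so the tighter form exempts less from inspection.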

Google Data Studio: Show users for certain URL-path

I want to show and compare user traffic from Google Analytics in Data Studio. I need to break up the traffic between localized versions of our page.
Page paths and user groups
The base language is German, hosted at www.domain.ltd/. The URL path for English content is www.domain.ltd/en/, and for Polish content we use www.domain.ltd/pl/.
I want to separate the user traffic for each language path and compare it in a line chart.
RegEx
I set up a new field with RegEx:
CASE
  WHEN REGEXP_MATCH(Page, '^.*garbe-industrial\\.de\\/en\\/.*') THEN "Traffic EN"
  WHEN REGEXP_MATCH(Page, '^.*garbe-industrial\\.de\\/pl\\/.*') THEN "Traffic PL"
  ELSE "Traffic DE"
END
I added the new field to a line chart, but the chart does not show any data.
I tried different approaches:
RegEx: .*\/en\/.* instead of the domain-path version
I wonder if your regular expression is too specific in the code example. I would think that just putting in .*/en/.* and .*/pl/.* would work to capture those pages. I do something similar for our Spanish pages. This works on my pages:
case when regexp_match(page, '.*/es/.*') then "Spanish"
else "English"
end
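One possible reason the broader pattern works where the hostname-specific one does not can be sketched in Python. This assumes that REGEXP_MATCH has to match the whole field value (modelled here with `re.fullmatch`) and that the Page field holds only the path, with no hostname, as is typical for the Google Analytics Page dimension. The sample paths, the labels and the `classify` helper are hypothetical.

```python
import re

# Hostname-specific pattern, written without the extra escaping layer.
HOST_PATTERN = r"^.*garbe-industrial\.de/en/.*"


def classify(page: str) -> str:
    """Mimic the CASE expression: full-match the page against each language pattern."""
    if re.fullmatch(r".*/en/.*", page):
        return "EN"
    if re.fullmatch(r".*/pl/.*", page):
        return "PL"
    return "DE"


def host_pattern_matches(page: str) -> bool:
    """Check whether the hostname-specific pattern matches a given Page value."""
    return re.fullmatch(HOST_PATTERN, page) is not None
```

Under those assumptions, a path-only value such as /en/products/ is classified as EN by the short pattern but is never matched by the hostname-specific one, so every row would fall through to the default branch.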

Regex Expression to replace email address domain, for users email address

I am trying to solve an email domain co-existence problem with Exchange Online. Basically, when a message is sent to one tenant (domain.com) and forwarded to another tenant (newdomain.com), I need the To and/or CC headers to be replaced with the destination (newdomain.com) email addresses before the message is delivered to its final destination.
For Example:
1) A Gmail (or any other) user sends an email to sally.sue#domain.com. The MX record for that domain is looked up, and the message is delivered to the Office 365 tenant for domain.com.
2) That same Office 365 tenant is set to forward emails to sally.sue#newdomain.com (a different tenant).
3) When the message arrives for Sally Sue at newdomain.com and she hits "Reply All", the original sender AND her own address (sally.sue#domain.com) are added to the To: line in the email.
The way to fix that is to use Header Replacement with Proofpoint, which, as described below, works on a single-user basis. The rest of this question is me trying to get it to work using regex (as that's the only solution) for a large number of users.
I need to convert each user's email address as follows:
username#domain.com to username#newdomain.com
This has to be done using Proofpoint, which is a cloud-hosted MTA. Their support has provided some sort of an answer, but it's not working.
Proofpoint support has suggested using this:
Header Name : To
Find Value : domain\.com$
Replace : newdomain\.com$ or just newdomain.com
Neither of the above works. In both cases the values are just completely ignored.
This seems to find the values:
Header Name : To
Find Value : \b[A-Z0-9._%-]+#[A-Z0-9.-]+\.[A-Z]{2,4}\b
Replace : $1#fake.com
But the above simply replaces the whole To: line (in the email) with the literal string: $1#fake.com
I would also need to match lowercase letters and numbers in email addresses; I believe the above example only matches capitals.
I need it to do the following:
Header Name : To
Find Value : \b[A-Z0-9._%-]+#[A-Z0-9.-]+\.[A-Z]{2,4}\b (find users email address, domain)
Replace : user.name#newdomain.com
This is for a large number of users, so there is no way to manually update or create separate rules for each user.
If I do create an individual rule, it works as expected, but as stated that requires manually typing out each user's To: address and their new desired To: address.
This solution here almost worked: Regex to replace email address domains?
I have a couple of observations from general experience, although I have not worked with Office365 specifically.
First, a regex used for replacement usually needs to have a "capture group". This is often expressed with parentheses, as in:
match : \b([A-Z0-9._%-]+)#domain.com$
replacement : $1#newdomain.com
The idea is that the $1 in the replacement pattern is replaced with whatever was found within the () in the matching pattern.
Note that some regex engines use a different symbol for the replacement, so it might be \1#newdomain.com or some such. Note also that some regex engines need the parentheses escaped, so the matching pattern might be something like \b\([A-Z0-9._%-]+\)#domain.com$
Second, if you want to include - inside a "character class" set (that is, inside square brackets []), the safest place for it is first. Elsewhere it can be ambiguous, because - is also used for a range of characters (most engines also treat a trailing -, as in your pattern, as literal). The regex engine in question might not care, but I suggest writing your matching pattern as:
\b([-A-Z0-9._%]+)#domain.com$
This way, the first - is unambiguously itself, because there is nothing before it to indicate the start of a range.
Third, for lowercase letters, it's easiest to just expand your character class set to include them, like so:
[-A-Za-z0-9._%]
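Putting the three observations together, a minimal Python sketch of the capture-group replacement looks like this. It only illustrates the regex mechanics and says nothing about how Proofpoint itself evaluates the fields. Python spells the backreference \1 rather than $1. The # separator is kept exactly as the addresses are written in this thread, and the `rewrite_domain` helper is hypothetical.

```python
import re

# Group 1 captures the local part. "-" comes first in the class so it is literal,
# and a-z is included so lowercase addresses match too.
# The "#" separator mirrors the addresses as written in this thread.
PATTERN = re.compile(r"\b([-A-Za-z0-9._%]+)#domain\.com$")


def rewrite_domain(address: str) -> str:
    """Swap domain.com for newdomain.com while keeping the captured local part."""
    # \1 is replaced with whatever group 1 matched.
    return PATTERN.sub(r"\1#newdomain.com", address)
```

Addresses on any other domain are returned unchanged, because the pattern only matches when the text after the separator is exactly domain.com.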

How to create Gmail filter searching for text only at start of subject line?

We receive regular automated build messages from Jenkins build servers at work.
It'd be nice to ferret these away into a label, skipping the inbox.
Using a filter is of course the right choice.
The desired identifier is the string [RELEASE] at the beginning of a subject line.
Attempting to specify any of the following regexes causes emails with the string release in any case anywhere in the subject line to be matched:
\[RELEASE\]*
^\[RELEASE\]
^\[RELEASE\]*
^\[RELEASE\].*
From what I've read subsequently, Gmail doesn't have standard regex support, and from experimentation it seems, as with google search, special characters are simply ignored.
I'm therefore looking for a search parameter which can be used, maybe something like atstart:mystring in keeping with their has:, in: notations.
Is there a way to force the match only if it occurs at the start of the line, and only in the case where square brackets are included?
Sincere thanks.
Regex is not on the list of search features. It was, more or less, on the list of pre-canned feature requests (as "Better message search functionality (i.e. Wildcard and partial word search)"), so the answer is "you cannot do this via the Gmail web UI" :-(
There are no current Labs features which offer this. SIEVE filters would be another way to do this, but they were not supported either, and there no longer seems to be any definitive statement on SIEVE support in the Gmail help.
Updated for link rot: the pre-canned list of feature requests was, er, canned. The original is on archive.org, dated 2012; now you just get redirected to a dumbed-down page telling you how to give feedback. Lack of SIEVE support was covered in answer 78761, "Does Gmail support all IMAP features?"; since some time in 2015 that answer silently redirects to the answer about IMAP client configuration, and archive.org has a copy dated 2014.
With the current search facility, brackets of any form () {} [] are used for grouping; they have no observable effect if there's just one term within. (aaa|bbb) and [aaa|bbb] are equivalent and will both find the words aaa or bbb. Most other punctuation characters, including \, are treated as a space or a word separator; + - : and " do have special meaning though, see the help.
As of 2016, only the form "{term1 term2}" is documented for this, and is equivalent to the search "term1 OR term2".
You can do regex searches on your mailbox (within limits) programmatically via Google docs: http://www.labnol.org/internet/advanced-gmail-search/21623/ has source showing how it can be done (copy the document, then Tools > Script Editor to get the complete source).
You could also do this via IMAP as described here:
Python IMAP search for partial subject
and script something to move messages to a different folder. The IMAP SEARCH verb only supports substrings, not regex (and Gmail search is further limited to complete words, not substrings), so further processing of the matches would be needed to apply a regex.
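The post-processing step could look roughly like the Python sketch below. It assumes the candidate subjects have already been fetched (for example via an IMAP substring search, which is left out here); the `keep_release_subjects` helper and the sample subjects in the note below are hypothetical.

```python
import re

# Anchored pattern: "[RELEASE]" must be the very first thing in the subject.
RELEASE_PREFIX = re.compile(r"^\[RELEASE\]")


def keep_release_subjects(subjects: list[str]) -> list[str]:
    """Post-filter candidate subjects with the anchored, bracket-aware regex."""
    return [s for s in subjects if RELEASE_PREFIX.match(s)]
```

A subject such as "Re: [RELEASE] build 42" would come back from a substring search but is dropped here, because the bracketed tag is not at the start of the line.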
For completeness, one last workaround is: Gmail supports plus addressing, if you can change the destination address to youraddress+jenkinsrelease#gmail.com it will still be sent to your mailbox where you can filter by recipient address. Make sure to filter using the full email address to:youraddress+jenkinsrelease#gmail.com. This is of course more or less the same thing as setting up a dedicated Gmail address for this purpose :-)
Using Google Apps Script, you can use this function to filter email threads by a given regex:
function processInboxEmailSubjects() {
  // Change this to whatever regex you want; this one should cover OP's scenario.
  const regex = /^\[RELEASE\]/;
  // addLabel() expects a GmailLabel object, so look up the already existing label by name.
  const label = GmailApp.getUserLabelByName("customLabel");
  const threads = GmailApp.getInboxThreads();
  for (let i = 0; i < threads.length; i++) {
    const subject = threads[i].getFirstMessageSubject();
    if (regex.test(subject)) {
      Logger.log(subject);
      // Now do what you want to do with the email thread. For example, skip the inbox and add the label:
      threads[i].moveToArchive().addLabel(label);
    }
  }
}
As far as I know, unfortunately there isn't a way to trigger this with every new incoming email, so you have to create a time trigger like so (feel free to change it to whatever interval you think best):
function createTrigger() { // you only need to run this once, then the trigger executes the function every hour in perpetuity
  ScriptApp.newTrigger('processInboxEmailSubjects').timeBased().everyHours(1).create();
}
The only option I have found is to find some exact wording and put that under the "Has the words" option. It's not the best option, but it works.
I was wondering how to do this myself; it seems Gmail has since silently implemented this feature. I created the following filter:
Matches: subject:([test])
Do this: Skip Inbox
And then I sent a message with the subject
[test] foo
And the message was archived! So it seems all that is necessary is to create a filter for the subject prefix you wish to handle.

Google analytics generalized mail.foo.tld regex filter?

I'm trying to write a Google Analytics regex that will take all the sources of xxxxx.xxxxx.xxx.mail.foo.tld and rewrite them to mail.foo.tld
Currently I have the main two set up: mail.live.com and mail.yahoo.com.
Field A -> Extract A -- Campaign source -- .*\.mail\.yahoo\.com$
Output To -> Constructor -- Campaign source -- mail.yahoo.com
But, I have half a dozen other xxxx.xxxxx.mail.foo.tld that I would like to rewrite.
This is what I have so far:
Field A -> Extract A -- Campaign source -- .*\.mail\.(\w+)\.(\w+).*$
Output To -> Constructor -- Campaign source -- mail.$A1.$A2
I'm hoping to have $A1 be the domain name and $A2 be the TLD (.com, .net, .co.uk, etc).
I'm particularly concerned that .co.uk and similar don't turn into garbage, because once they're garbage I have no way to go back and edit the GA records. Any suggestions?
Depending on what patterns you are trying to match this could work:
Field A -> Extract A -- Campaign source -- .+\.mail\.([\w\.]+)$
Output To -> Constructor -- Campaign source -- mail.$A1
This would mean:
.+\. - this bit requires some kind of subdomain before mail
mail\. - this requires a mail. subdomain in there
([\w\.]+)$ - this requires something to be after the mail. and captures the
whole lot into a single capture group. It doesn't matter if
this is a .com .co.uk etc
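Since bad rewrites cannot be undone in GA, it may be worth trying the pattern offline first. The following Python sketch does that with the single-capture pattern above. It assumes Python's `re` treats this pattern the same way the GA filter does; the `normalise_source` helper and the sample hostnames in the note below are hypothetical.

```python
import re

# Pattern from the answer above; group 1 captures everything after "mail.".
SOURCE_PATTERN = re.compile(r".+\.mail\.([\w\.]+)$")


def normalise_source(source: str) -> str:
    """Rewrite xxx.mail.foo.tld to mail.foo.tld; leave non-matching sources untouched."""
    match = SOURCE_PATTERN.match(source)
    # Equivalent of the constructor "mail.$A1".
    return f"mail.{match.group(1)}" if match else source
```

A source like us.mg5.mail.yahoo.co.uk comes out as mail.yahoo.co.uk, so multi-part TLDs survive. Note that \w does not include a hyphen, so a host such as a.mail.my-isp.com would not match and would be left as it is.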
When I say "depending on what patterns", what I'm thinking is: is there anything that would get in the way of a match when the end is anchored ($) in this manner? If there are query strings etc. tagged on the end, this could have problems and you should use a different technique (if you could post some example strings you need to match, it would help).
In fact if that is the case you could just make the second tld optional:
Field A -> Extract A -- Campaign source -- .+\.mail\.(\w+\.\w+(\.\w+)?)$
Output To -> Constructor -- Campaign source -- mail.$A1
In this case it is saying the same as before, but \w+\.\w+ requires a domain name plus a tld, and the (\.\w+)? means an optional second .tld (such as the .uk in .co.uk)
I usually use this one: it anchors the criteria to the end of the domain string, expects a TLD of 2 to 4 characters, and handles any number of subdomain levels:
(messag|courrier|zimbra|imp|mail)(.*)\.(.*)\..{2,4}$
You can customize the list of services you want to catch, in addition to the ones mentioned here, depending on ISPs present in your area.