Google Analytics filters, only two countries - regex

I want to create a filter that includes only two countries, for example United Kingdom and Russia.
I have two filters. The first excludes all countries: its pattern is the regex '.'. The second includes only these two countries, with the pattern United Kingdom|Russia.
But now no results are displayed. What's wrong with my regex?

Your regex is fine. You need to select the field against which the regex filter will be evaluated. In your case, choose Country (Pays in French).
Preview (Sorry for the French):
PS: I have tested it on my G account.
Edit:
As per your comment below, the . on its own matches a single character, and there are no countries with a one-character name. If you want to match all countries, your regex pattern would look like .+, which leaves this question: if you want to match all countries, why use a filter in the first place?

If you set a filter to exclude all countries, there will be nothing in your reports. It does not matter what any other filters do, because filters cannot cancel each other out.
You only need the include filter, because it already works as "include only".
The RegEx you have already seems to be working on both filters, once again the problem is that your exclude filter is excluding everything.
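To make the interaction between the two filters concrete, here is a minimal sketch in Python. It is not Google Analytics code: apply_filters is a hypothetical helper, and it assumes that filters run one after another on whatever the previous filter left, and that a pattern matches if it is found anywhere in the field (partial match).

```python
import re

def apply_filters(countries, filters):
    """Apply (kind, pattern) filters in order; kind is 'include' or 'exclude'.

    Hypothetical stand-in for a chain of view filters, assuming
    partial (unanchored) regex matching on the Country field.
    """
    kept = list(countries)
    for kind, pattern in filters:
        regex = re.compile(pattern)
        if kind == "include":
            # Keep only rows whose country matches the pattern.
            kept = [c for c in kept if regex.search(c)]
        else:
            # Drop every row whose country matches the pattern.
            kept = [c for c in kept if not regex.search(c)]
    return kept

countries = ["United Kingdom", "Russia", "France", "Germany"]

# Exclude '.' first: every non-empty name matches, so nothing is left
# for the include filter to work on.
both_filters = apply_filters(
    countries, [("exclude", "."), ("include", "United Kingdom|Russia")]
)

# The include filter on its own is already "include only".
include_only = apply_filters(countries, [("include", "United Kingdom|Russia")])
```

Under those assumptions, both_filters ends up empty while include_only contains just the two wanted countries.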

How to use Postgres Regex Replace with a capture group

As the title says, I am trying to reference a capture group in a regex replace in a Postgres query. I have read that regexp_replace does not support regex capture groups. The regex I am using is
r"(?:[\s\(\)\=\)\,])(username)(?:[\s\(\)\=\)\,])?"gm
The above regex almost does what I need, but I need to find out how to allow a match only if the surrounding groups also match something. A "username" should never be matched if it just happens to be a substring of a word. By ensuring it's surrounded by one of the characters above, I can be much more confident that it's a username.
An example application of the regex would be something like this in postgres (of course I would be doing an update vs a select):
select *, REGEXP_REPLACE(reqcontent,'(?:[\s\(\)\=\)\,])(username)(?:[\s\(\)\=\)\,])?' ,'NEW-VALUE', 'gm') from table where column like '%username%' limit 100;
If any more context would help, please let me know. I have also found similar posts (postgresql regexp_replace: how to replace captured group with evaluated expression (adding an integer value to capture group)), but that one is more about splicing values back in and I don't think it quite answers my question.
More context and example values for the regex to work against. The text below may look familiar: these are JQL filters in Jira. We are looking to update our usernames and all their occurrences in the table that contains the filters. Below are a few examples of filters. Originally we were just doing a find and replace, but that doesn't work because some of our usernames are only two characters long and they matched non-usernames (e.g. the username je would put the new value where the word project is found, which completely malforms the JQL string, resulting in something like proNEW-VALUEct = blah blah).
type = bug AND status not in (Closed, Executed) AND assignee in (test, username)
assignee=username
assignee = username
Definition of Answered:
Regex that will only match a 'username' if it's surrounded by one of the special characters
A way to regex/replace that username in a postgres query.
Capturing groups are used to keep the important bits of information matched with a regex.
Either put capturing groups around the parts of the string you want to keep in the result and use their placeholders in the replacement:
REGEXP_REPLACE(reqcontent,'([\s\(\)\=\)\,])username([\s\(\)\=\)\,])?' ,'\1NEW-VALUE\2', 'gm')
Or use lookarounds:
REGEXP_REPLACE(reqcontent,'(?<=[\s\(\)\=\)\,])(username)(?=[\s\(\)\=\)\,]|$)' ,'NEW-VALUE', 'gm')
Or, in this case, use word boundaries to ensure you only replace username when it appears as a whole word:
REGEXP_REPLACE(reqcontent,'\yusername\y' ,'NEW-VALUE', 'g')
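The three approaches can be tried out quickly outside the database. The sketch below uses Python's re module as a stand-in, so the syntax differs slightly from Postgres (Python writes the word boundary as \b where Postgres uses \y). The function names and the DELIMS constant are made up for illustration, and the trailing delimiter is written as "delimiter or end of string" so that a username at the very end of a filter is still replaced.

```python
import re

# Characters allowed to surround a username (taken from the question).
DELIMS = r"[\s()=,]"

def replace_with_groups(text, old, new):
    """Capture the delimiters and put them back around the new value."""
    pattern = rf"({DELIMS}){re.escape(old)}({DELIMS}|$)"
    return re.sub(pattern, lambda m: m.group(1) + new + m.group(2), text)

def replace_with_lookarounds(text, old, new):
    """Assert the delimiters without consuming them."""
    pattern = rf"(?<={DELIMS}){re.escape(old)}(?={DELIMS}|$)"
    return re.sub(pattern, lambda m: new, text)

def replace_with_word_boundaries(text, old, new):
    """Replace the name only when it stands alone as a whole word."""
    return re.sub(rf"\b{re.escape(old)}\b", lambda m: new, text)

jql = "project = ABC AND assignee in (test, je)"
print(replace_with_groups(jql, "je", "NEW-VALUE"))
print(replace_with_lookarounds(jql, "je", "NEW-VALUE"))
print(replace_with_word_boundaries(jql, "je", "NEW-VALUE"))
```

In all three cases the je inside project is left alone, because it is not preceded by one of the delimiter characters (or, in the last variant, by a word boundary).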

Grok custom pattern for space delimited file

I'm trying to load a file into a structured table in Athena. I am using a GROK pattern to load it into the table, but I am not able to find the correct pattern. The file format is as below:
L1127 ACTUALS 214171 ON 27649075 -00000000000000000409618.02 601 MBS DAILY VISION - CAN OS
L1127 ACTUALS 412821 ON 27649075 002060 -00000000000000000002657.33 521 MBS DAILY VISION - CAN OS
GROK pattern I'm using:
(?<BusinessUnit>.{5})%{SPACE}(?<Type>.{7})%{SPACE}(?<PSGLAccountNumber>.{6})%{SPACE}(?<Province>.{2})%{SPACE}(?<DepartmentId>.{8})%{SPACE}(?<ProductId>.{6})%{SPACE}(?<Amount>.{27})%{SPACE}(?<TransCode>.{3})%{SPACE}(?<Feed>.{35})
I'm having trouble when the ProductId has no value.
Any help would be appreciated.
(?<ProductId>.{6})%{SPACE} means that you expect the ProductId field to be exactly six characters followed by any number of spaces. From the data you posted it seems to me that what should happen is that in the first row ProductId would end up as six spaces.
If the problem is that it becomes six spaces and you want it to be an empty string, you could for example use (?<ProductId>\S*)%{SPACE} (\S* matches zero or more non-space characters).
If this does not solve your problem, perhaps you could describe in some more detail what trouble you are having, and what you want to happen?
Update: in a comment you indicated that the problem with this solution is that the ProductId column becomes "-00000". The reason is that the %{SPACE} pattern before (?<ProductId>… consumes all the spaces between the DepartmentId and Amount fields. To solve this you could, for example, limit the number of spaces that can appear between the DepartmentId and ProductId fields. In the example data you posted there are two spaces, and since the fields are fixed-width I assume this is always the case. Using a pattern like …(?<DepartmentId>.{8})\s{2}(?<ProductId>\S*)%{SPACE}(?<Amount>.{27})… should fix the problem.
I was able to make it work using the pattern below
%{WORD:BusinessUnit}%{SPACE}%{WORD:Type}%{SPACE}%{POSINT:PSGLAccountNumber}%{SPACE}%{WORD:Province}%{SPACE}%{POSINT:DepartmentId}%{SPACE}%{custompat:ProductId}%{SPACE}%{NUMBER:Amount}%{SPACE}%{NUMBER:TransCode}%{SPACE}(?<Feed>[A-Za-z0-9\-\s]{26})
And using custom pattern:
custompat ([0-9]{6}|\s{6})
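For anyone who wants to check the idea outside of Athena, here is a rough Python equivalent that combines the two suggestions above: a fixed two-space gap after DepartmentId, and a ProductId that is either six digits or six blanks. The sample lines are rebuilt by hand with assumed column spacing (the spacing in the question may not have survived pasting), and LINE_RE and parse_line are illustrative names, not part of Grok.

```python
import re

# Python named groups standing in for the Grok captures.
LINE_RE = re.compile(
    r"(?P<BusinessUnit>\S+)\s+"
    r"(?P<Type>\S+)\s+"
    r"(?P<PSGLAccountNumber>\d+)\s+"
    r"(?P<Province>\S+)\s+"
    r"(?P<DepartmentId>\d{8})"
    r"\s{2}"                       # assumed fixed two-space gap
    r"(?P<ProductId>\d{6}|\s{6})"  # six digits, or six blanks when absent
    r"\s+"
    r"(?P<Amount>-?\d+\.\d+)\s+"
    r"(?P<TransCode>\d+)\s+"
    r"(?P<Feed>.*)"
)

def parse_line(line):
    """Return the fields of one record as a dict, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["ProductId"] = fields["ProductId"].strip()  # six blanks -> ""
    fields["Feed"] = fields["Feed"].rstrip()
    return fields

# Hand-built samples; the exact column widths are an assumption.
with_product = (
    "L1127 ACTUALS 412821 ON 27649075  002060 "
    "-00000000000000000002657.33 521 MBS DAILY VISION - CAN OS"
)
without_product = (
    "L1127 ACTUALS 214171 ON 27649075  " + " " * 6 + " "
    "-00000000000000000409618.02 601 MBS DAILY VISION - CAN OS"
)

print(parse_line(with_product)["ProductId"])
print(repr(parse_line(without_product)["ProductId"]))
```

Because the gap before ProductId is fixed, the blank ProductId column is matched by the \s{6} branch instead of being swallowed by the preceding whitespace, and Amount still lines up.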

Regex to identify category pages but exclude products

I want to see data only for URLs that contain collection + category in Google Analytics, i.e. URLs that contain /collections/category, for example: https://baileynelson.com.au/collections/glasses
However, I don't want to see data for products, for example: https://baileynelson.com.au/collections/glasses/products/adler
The regex I created is ^/collections/(.*?)$ but it seems to be including product URLs.
Any ideas on how to create a regex that matches collection pages like https://baileynelson.com.au/collections/glasses and https://baileynelson.com.au/collections/sunglasses but excludes product URLs?
Cheers!
Try using this regex here.
https:\/\/baileynelson\.com\.au\/collections\/[\w]+
The first part: https:\/\/baileynelson\.com\.au\/collections\/ This matches the domain and the path collections. The /s and .s are escaped.
Second part: [\w]+ This matches word characters (letters, digits and underscore), and the + means one or more of them.
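One thing worth checking is the end of the pattern: without a $ anchor, a pattern that matches /collections/glasses also matches the start of /collections/glasses/products/adler. The Python sketch below shows an anchored variant that allows exactly one path segment after /collections/. It assumes the value being matched is the page path (as in the question's own attempt) rather than the full URL, and CATEGORY_PATH and is_category_page are illustrative names.

```python
import re

# Exactly one path segment after /collections/, with an optional trailing slash.
CATEGORY_PATH = re.compile(r"^/collections/[^/]+/?$")

def is_category_page(path):
    """True for /collections/<category>, False for anything deeper."""
    return CATEGORY_PATH.search(path) is not None

# The pattern from the question: .*? is free to grow across slashes
# until it reaches the end, so product pages match as well.
ORIGINAL = re.compile(r"^/collections/(.*?)$")

for path in ["/collections/glasses",
             "/collections/sunglasses",
             "/collections/glasses/products/adler"]:
    print(path, is_category_page(path), ORIGINAL.search(path) is not None)
```

The [^/]+ part is what keeps products out: it refuses to cross a slash, so anything with a further /products/... segment fails the match.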

REGEX: select everything to the left until the first specified delimiter

I'm using ColdFusion functions to query an Active Directory Database, return Membership information for a user, then REGEX functions to search the output for specified groups. I made "|" a delimiter.
Anyway, here's some example output:
CN=Group One,OU=Distribution Lists,DC=Domain,DC=org|CN=Group Two,OU=Distribution Lists,DC=Domain,DC=org|CN=Group Three,OU=Distribution Lists,DC=Domain,DC=org|CN=Group Four,OU=Distribution Lists,DC=Domain,DC=org|CN=Group Five,OU=Distribution Lists,DC=Domain,DC=org
What I would like to capture is this:
CN=Group Three,OU=Distribution Lists,DC=Domain,DC=org
Here is what I've tried so far:
^|CN=(.*Group? Three)
Here's a link to the example: http://rubular.com/r/DIGZOPwTag
What's my problem?
Well, this doesn't work all that great... It goes to the left, but it goes too far! How do I stop it at the first occurrence of |CN= to the left?
Thank you in advance for your time. It is appreciated.
!!Clarification!!
Better Example Output:
CN=Pay Band 50,OU=Distribution Lists,DC=Domain,DC=org|CN=Human Resources,OU=Distribution Lists,DC=Domain,DC=org|CN=SiteA Staff,OU=Distribution Lists,DC=Domain,DC=org|CN=SiteB Additional Staff,OU=Distribution Lists,DC=Domain,DC=org|CN=Executives,OU=Distribution Lists,DC=Domain,DC=org
Desired matches:
I'm looking for specific groups:
Site Name w/Possible Spaces Staff
Site Name w/Possible Spaces Additional Staff
It would be awesome to return: "SiteAlpha Staff", "Site Beta Additional Staff". It would also be acceptable to include the "CN=" prefix because I could use it to do queries later.
"Staff" and "Additional Staff" will always be part of the group(s) I want to match.
What I've tried, again
^|CN=[^|CN=]*? Staff|Additional? Staff
This new example is not quite perfect, as it doesn't grab all of "Site Beta". "Site Beta" could be the name of any building, for example.
example link: http://rubular.com/r/vq5JcrvaBR
It is not really clear what you want to extract: only the "Group Three" CN value, or all CN values.
You can extract every CN value with this regex:
CN=([^,]*)
This regex begins extracting after each "CN=" occurrence and continues until the first comma (,).
A regex to fit your demands is:
^.*?\|
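As a rough illustration of the CN=([^,]*) approach applied to the "better example output" from the question, here is a short Python sketch (the question uses ColdFusion, so treat this as a stand-in for the idea rather than drop-in code). It pulls out every CN value first and then keeps the ones ending in " Staff", which also covers "Additional Staff". The function names are made up for the example.

```python
import re

def common_names(membership):
    """Return every CN value: the text after 'CN=' up to the next comma."""
    return re.findall(r"CN=([^,]*)", membership)

def staff_groups(membership):
    """Keep only the groups whose name ends in ' Staff'."""
    return [name for name in common_names(membership) if name.endswith(" Staff")]

membership = (
    "CN=Pay Band 50,OU=Distribution Lists,DC=Domain,DC=org|"
    "CN=Human Resources,OU=Distribution Lists,DC=Domain,DC=org|"
    "CN=SiteA Staff,OU=Distribution Lists,DC=Domain,DC=org|"
    "CN=SiteB Additional Staff,OU=Distribution Lists,DC=Domain,DC=org|"
    "CN=Executives,OU=Distribution Lists,DC=Domain,DC=org"
)

print(staff_groups(membership))
```

Splitting the job into "extract all names" and "filter by suffix" avoids having to make a single regex stop at the right |CN= on the left.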

Google Analytics regex filtering: exclude except

I'm quite new to regex and trying to do some Google Analytics filtering.
I have a website: www.domain.com (international version) and I have many country versions in subdirectories. e.g. www.domain.com/se
Now I want to create a filter to only show the international site. Therefore I have 15 exclude filters (one per country), which work OK, except for my "services" pages. The Swedish exclude also filters out www.domain.com/services.
How can I exclude "/se" and "/se/" and "/se/*" without losing my /services pages?
I'm not sure what Google Analytics filters look like, but if your filter for France looks like:
yoururl/fr
For Sweden you could probably use:
yoururl/se(/|$)
This says: match /se followed by either a forward slash or the end of input (/|$). You should probably add that to the end of all your filters to avoid excluding other pages unintentionally, e.g. yoururl/friendslist for the French example above.
See it on RegExr
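Here is a small Python check of the /se(/|$) idea against the URLs from the question. It assumes the filter is a partial (unanchored) regex match on the URL, and is_swedish_page is just an illustrative helper name.

```python
import re

# /se must be followed by a slash or by the end of the URL.
SWEDISH_SECTION = re.compile(r"/se(/|$)")

# The naive version, which also catches /services.
NAIVE = re.compile(r"/se")

def is_swedish_page(url):
    """True when the URL is /se itself or anything below /se/."""
    return SWEDISH_SECTION.search(url) is not None

for url in ["www.domain.com/se",
            "www.domain.com/se/",
            "www.domain.com/se/products",
            "www.domain.com/services"]:
    print(url, is_swedish_page(url), NAIVE.search(url) is not None)
```

The first three URLs are caught by both patterns, but only the naive one also catches www.domain.com/services.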