REGEX for Google Analytics Filters


I have a link with many query parameters. I want to drop all of them except one and replace the one I keep with a name.
The link has the structure below:
https://www.abc.ab/something/somethingelse?card_type=status&a-bunch-of-trailing
card_type can take different values in place of "status".
Ideally I would like to keep:
https://www.abc.ab/something/somethingelse?card_type=status and replace ?card_type=status with "/card_type".
I attempted this in GA:
Search string
/*(.*?)\card_type\=*
Replace string:
/card_type
But this isn't working at all

You could match ?card_type= followed by .* to match any char 0+ times. If the match should begin at the start of the string, you could add an anchor ^ at the start of the pattern.
In the replacement use the first capturing group followed by the replacement string.
(.*?)\?card_type=.*
(.*?) Capture group 1 matching any char 0+ times non greedy
\? Match a ? by escaping it
card_type= Match literally
.* Match any char 0+ times (greedy), consuming the rest of the string
Replace with
$1/card_type
To get a bit more precise match for the url instead of using .*?, you might match the protocol:
^(https?:\/\/\S+\/[^?]*)\?card_type=.*
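Both patterns can be tried outside GA before saving the filter. The following is a minimal Python sketch, assuming Python's re module behaves closely enough to the GA engine for these two patterns; the names LAZY, STRICT and rewrite are made up for illustration.

```python
import re

# Both search patterns from the answer; group 1 holds everything before "?card_type=".
LAZY = re.compile(r"(.*?)\?card_type=.*")
STRICT = re.compile(r"^(https?:\/\/\S+\/[^?]*)\?card_type=.*")

def rewrite(url: str, pattern: re.Pattern = LAZY) -> str:
    # Python writes the group reference as \1 where the answer's replace string uses $1.
    return pattern.sub(r"\1/card_type", url)

url = "https://www.abc.ab/something/somethingelse?card_type=status&a-bunch-of-trailing"
print(rewrite(url))          # https://www.abc.ab/something/somethingelse/card_type
print(rewrite(url, STRICT))  # https://www.abc.ab/something/somethingelse/card_type
```

A URL without ?card_type= is returned unchanged, because neither pattern matches it.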

Related

How to utilize a URL regex match inside of the last grouping?

This is my regex
(?:!\(([elementType | inlineType]+),([\w\d\s\#]+),([\w\d\s]+),([\w\d\s\-]+),([\s\w\d\.\:\/\-\_\?\=]+)\))
This regex has 5 groups separated by commas, whereas the last one needs to match any URL. I'm having a bit of difficulty with it, as it doesn't seem like I can paste a URL matcher that I find on the internet inside of it. It seems to only be matching one thing at a time.

Some notes about your pattern:
\w also matches \d
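This notation [elementType | inlineType]+ is a character class: it matches 1+ of any of the listed characters (including the space and the |), not the words elementType or inlineType. As you also want to match block, you can use \w for that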
You don't have to escape these characters in a character class . : _ ? =
Using 5 capture groups, where the 5th one matches an url like pattern:
!\(([\w#-]+),\s*([\w#-]+),\s*([\w-]+),\s*([\w-]+),\s*(https?:\/\/\S+)\)
Explanation
! Match literally
\( Match (
([\w#-]+),\s*([\w#-]+),\s*([\w-]+),\s*([\w-]+),\s* 4 capture groups where the character class [\w#-]+ matches 1+ occurrences of the allowed listed characters
(https?:\/\/\S+) Capture group 5, match the protocol with an optional s, then :// and 1+ non whitespace chars
\) Match )
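To see what each of the five groups captures, the pattern can be run against a sample line. This is only a sketch: the input string below is hypothetical, made up to fit the pattern, and is not taken from the question.

```python
import re

# The five-group pattern from the answer, unchanged.
PATTERN = re.compile(
    r"!\(([\w#-]+),\s*([\w#-]+),\s*([\w-]+),\s*([\w-]+),\s*(https?:\/\/\S+)\)"
)

# Hypothetical input, invented for illustration.
sample = "!(elementType, heading#1, block, some-id, https://example.com/path?x=1)"

match = PATTERN.search(sample)
fields = match.groups() if match else ()
print(fields)
# ('elementType', 'heading#1', 'block', 'some-id', 'https://example.com/path?x=1')
```

The greedy \S+ first consumes the closing parenthesis too, then backtracks one character so that the final \) can match.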

Match group followed by group with different ending

For example, let's say I have a list of words:
words.txt
accountable
accountant
accountants
accounted
I want to match "accountant\naccountants"
I've tried /(\n\w+){2}s/, but \w+ seems to be perfectly matching different things.
My RegEx also matches the following undesirable texts:
action
actionables
actionable
actions
Am I reaching out too far in what regex can do?

You could for example use a capture group, and match a newline followed by a backreference to the same captured text and an s char.
If the first word can also be at the start of the string, instead of being preceded by a newline, you can use an anchor ^ instead.
^(\w+)\n\1s$
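^ Start of string (or of each line when the multiline flag is set, which a word list with one word per line needs)
(\w+) Capture group 1, match 1+ word chars
\n\1s Match a newline, backreference \1 to match the same text as group 1 and an s char
$ End of string (or of each line with the multiline flag)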
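As a quick check of the backreference, here is a small Python sketch run over the words from the question, including the ones that should not match. It assumes the whole file is read into one string, so re.MULTILINE is set to let ^ and $ match at every line.

```python
import re

# The word list from the question, followed by the undesirable words.
words = (
    "accountable\naccountant\naccountants\naccounted\n"
    "action\nactionables\nactionable\nactions\n"
)

# \1 must repeat exactly what group 1 captured on the previous line.
pattern = re.compile(r"^(\w+)\n\1s$", re.MULTILINE)

matches = [m.group(0) for m in pattern.finditer(words)]
print(matches)  # ['accountant\naccountants']
```

The pair actionable and actions is skipped because the second line is not the first line plus an s.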

REGEX: Select all text between last underscore and dot

I'm having trouble retrieving specific information of a string.
The string is as follows:
20190502_PO_TEST.pdf
This includes the .pdf part. I need to retrieve the part between the last underscore (_) and the dot (.) leaving me with TEST
I've tried this:
[^_]+$
This however, returns:
TEST.PDF
I've also tried this:
_(.+)\.
This returns:
PO_TEST

The pattern [^_]+$ matches 1+ chars other than an underscore up to the end of the string, so it also matches the dot and the extension.
In the pattern _(.+)\. the dot has to be escaped to match it literally, and the match is then in the first capturing group. That group contains PO_TEST because _ matches the first underscore and .+ is greedy.
What you also might use:
^.*_\K[^.]+
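^.*_ Match any char 0+ times (greedy), up to and including the last underscore
\K Forget what was matched so far (supported by PCRE-style engines)
[^.]+ Match 1+ times any char other than a dot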
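Not every engine supports \K. Python's standard re module, for example, does not, so the sketch below swaps \K for a capture group while keeping the rest of the pattern. The function name last_segment is made up for illustration.

```python
import re
from typing import Optional

def last_segment(filename: str) -> Optional[str]:
    # Greedy .* runs to the last underscore; group 1 then takes 1+ non-dot chars.
    m = re.search(r"^.*_([^.]+)", filename)
    return m.group(1) if m else None

print(last_segment("20190502_PO_TEST.pdf"))  # TEST
```

If the string contains no underscore, the function returns None instead of raising.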

Regex Extract a string between two words containing a particular string

I have the below string
abc-12d-ef-oy-5678-xyz--**--20190120075439322am--**--ghi-66d-ef-oy-8877-sdf--**--sfdfdsgfg--**--20190120075765487am
It is a multi-character delimited string, delimited by '--**--'. I am trying to extract the first and second words that have the -oy- tag in them. This is a column in a table. I am using the regex_extract method, but I am not able to extract a string that contains one string and ends with another.
Here is one pattern that I tried: .*(.*oy.*)--

If the -oy- can not be at the start or at the end, you could use this pattern to match the 2 hyphen delimited strings with -oy-:
[a-z0-9]+(?:-[a-z0-9]+)*-oy(?:-[a-z0-9]+)+
Regex details
[a-z0-9]+ Match 1+ times a-z0-9
(?: Non capturing group
-[a-z0-9]+ Match - and 1+ times a-z0-9
)* Close group and repeat 0+ times
-oy Match literally
(?:-[a-z0-9]+)+ Repeat 1+ times a group which will match - and 1+ times a-z0-9
You can extend the character class, for example to [A-Za-z0-9], to also allow other chars you want to match, such as uppercase chars.
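The pattern can be checked against the sample value from the question. This is a minimal Python sketch; the real column is presumably queried with regex_extract, so treat it only as a way to test the pattern itself.

```python
import re

value = (
    "abc-12d-ef-oy-5678-xyz--**--20190120075439322am--**--"
    "ghi-66d-ef-oy-8877-sdf--**--sfdfdsgfg--**--20190120075765487am"
)

# No capture groups, so findall returns the full text of each match.
pattern = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*-oy(?:-[a-z0-9]+)+")

found = pattern.findall(value)
print(found)  # ['abc-12d-ef-oy-5678-xyz', 'ghi-66d-ef-oy-8877-sdf']
```

The timestamp fields and sfdfdsgfg are skipped because they contain no -oy segment.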
If the matches should be between delimiters, you could use a positive lookbehind and positive lookahead and an alternation (written here with the doubled backslashes of a Java string literal):
(?<=^|--\\*\\*--)[a-z0-9]+(?:-[a-z0-9]+)*-oy(?:-[a-z0-9]+)+(?=--\\*\\*--|$)

You can use this regex, which matches the two strings containing -oy- and captures them in group 1 and group 2.
^.*?(\w+(?:-\w+)*-oy-\w+(?:-\w+)*).*?(\w+(?:-\w+)*-oy-\w+(?:-\w+)*)
This regex basically matches two delimiter-separated strings containing -oy-, using (\w+(?:-\w+)*-oy-\w+(?:-\w+)*) to capture each one.
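The two groups can be read back as follows. This is a short Python sketch, using the sample value from the question and assuming the pattern is applied to the whole column value at once.

```python
import re

value = (
    "abc-12d-ef-oy-5678-xyz--**--20190120075439322am--**--"
    "ghi-66d-ef-oy-8877-sdf--**--sfdfdsgfg--**--20190120075765487am"
)

# The same sub-pattern is used twice, once per capture group.
FIELD = r"(\w+(?:-\w+)*-oy-\w+(?:-\w+)*)"
pattern = re.compile(r"^.*?" + FIELD + r".*?" + FIELD)

m = pattern.search(value)
first, second = m.groups() if m else (None, None)
print(first)   # abc-12d-ef-oy-5678-xyz
print(second)  # ghi-66d-ef-oy-8877-sdf
```

Because the pattern needs both groups, a value with only one -oy- field does not match at all.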

Are you able to select values from capture groups?
(?:--\*\*--|^)(.*?-oy-.*?)(?:--\*\*--|$)
(?: - Non-capturing group; matches the delimiter, the beginning of the line, or the end of the line without creating a capture group
.*? - Lazy match, so you only grab the contents of the field
https://regex101.com/r/aUAvcx/1
--- Second stab at this follows ---
This is convoluted. Hopefully you can use Lookahead and Lookbehind. The last problem I had was the final record was being "Greedy" and sucking up the field before it too. So I had to add an exclusion in the capture group for your delimiter.
See if this works for you.
(?<=--\*\*--|^)((?:(?:(?!--\*\*--).)*)-oy-(?:(?:(?!--\*\*--).)*))(?=--\*\*--|$)
https://regex101.com/r/aUAvcx/3
Basically, the (?: groups are there so we don't get too many capture groups to work with.
There are three parts to this:
The lookbehind - Make sure the field is framed by the delimiter (or start of line)
The capture group - Grab the contents of the field, making sure a delimiter isn't sucked up into it
The lookahead - Make sure the field is framed by the delimiter (or end of line)
As far as the capture group goes, I check the left and right side of the -oy- to make sure the delimiter isn't there.
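Some engines reject a lookbehind whose alternatives differ in length. Python's standard re module is one of them, so (?<=--\*\*--|^) does not compile there. Where that is a problem, a plain split on the delimiter gives the same fields without a regex. This is a different technique from the pattern above, sketched under the assumption that the delimiter never occurs inside a field; the function name fields_with_tag is made up for illustration.

```python
DELIMITER = "--**--"

def fields_with_tag(value: str, tag: str = "-oy-") -> list:
    # Split on the literal delimiter, then keep only the fields containing the tag.
    return [field for field in value.split(DELIMITER) if tag in field]

value = (
    "abc-12d-ef-oy-5678-xyz--**--20190120075439322am--**--"
    "ghi-66d-ef-oy-8877-sdf--**--sfdfdsgfg--**--20190120075765487am"
)
print(fields_with_tag(value))  # ['abc-12d-ef-oy-5678-xyz', 'ghi-66d-ef-oy-8877-sdf']
```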

Regex to capture everything up to (but not including the 1st space and hyphen)

Here is my sample string
Google Chrome-Helper -type=renderer -field-trial-handle=1
But I want just Google Chrome-Helper
I've tried: ^.*[ ][-] but it captures up to the last parameter.

You need to use lazy dot matching and either use capturing or a lookahead:
^(.*?)\s+-
(your value will be in Group 1) or
^.*?(?=\s+-)
Details:
^ - start of string anchor
.*? - any 0+ chars other than a newline, as few as possible (i.e. the subsequent subpatterns are tried first, this one is skipped, the regex engine only comes back here if they fail to find a match)
(?=\s+-) - a positive lookahead that requires 1+ whitespace and then a hyphen.
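Both variants can be compared side by side on the sample string from the question. The variable names in this Python sketch are made up for illustration.

```python
import re

sample = "Google Chrome-Helper -type=renderer -field-trial-handle=1"

# Variant 1: the value is in capture group 1.
captured = re.match(r"^(.*?)\s+-", sample)
with_group = captured.group(1) if captured else None

# Variant 2: the lookahead keeps the whitespace and hyphen out of the match.
looked = re.match(r"^.*?(?=\s+-)", sample)
with_lookahead = looked.group(0) if looked else None

print(with_group)      # Google Chrome-Helper
print(with_lookahead)  # Google Chrome-Helper
```

The hyphen inside Chrome-Helper does not end the match, because it is not preceded by whitespace.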
