Google Analytics Regex - Matching Specific Words, but Not Others - regex

I don't usually use Regex. I'm working on Google Analytics Goals and I want to create a step in the funnel that will match URLs containing /resource/ and the word ebook or report, but do not include thank or thanks.
It would match:
/resource/example-ebook-request
/resource/research-report-2018/
It would not match:
/resource/example-ebook-request/thank-you/
/resource/research-report-2018/thanks/
/some-other-ebook-no-resource-subfolder/
I'm having a hard time getting the combination of this correct in a way that will work for Google Analytics since it doesn't support look behind. Any suggestions?

Try Regex: \/resource\/[^\/]*(?:ebook|report)[^\/]*\/?$
Demo
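As a quick sanity check, the pattern can be run against the sample URLs from the question. This is only a sketch in Python, whose re module is not the engine Google Analytics uses; the pattern relies on basic syntax only (no lookarounds), so the behaviour should carry over. The helper name matches_resource is made up for this illustration.

```python
import re

# Pattern from the answer: "ebook" or "report" must appear in the path segment
# directly after /resource/, and only an optional trailing slash may follow.
PATTERN = re.compile(r"\/resource\/[^\/]*(?:ebook|report)[^\/]*\/?$")

def matches_resource(path: str) -> bool:
    """Return True if the path satisfies the funnel-step pattern."""
    return PATTERN.search(path) is not None

samples = [
    "/resource/example-ebook-request",
    "/resource/research-report-2018/",
    "/resource/example-ebook-request/thank-you/",
    "/resource/research-report-2018/thanks/",
    "/some-other-ebook-no-resource-subfolder/",
]

for path in samples:
    print(path, matches_resource(path))
```

Note that the exclusion works by position rather than by keyword: the thank-you pages are rejected because they add a further path segment, so a path with "thanks" inside the same segment as "ebook" would still match.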

Related

Google analytics regex goal not working correctly

I have a regex to track signups to my site. There could be multiple addresses for a goal.
Here is my regex:
(\/membership\/signed-up\/|\/membership\/campagin\/(?!.*(not-this-campaign)).[-\w]+\/signed-up\/)
I want to match these addresses:
/membership/signed-up/
/membership/campagin/random-campaign/signed-up/
/membership/campagin/other-random-campaign/signed-up/
But I want to exclude this address:
/membership/campagin/not-this-campaign/signed-up/
It works, but Google Analytics also matches this address:
/membership/signed-up/step-2/
When I test it at http://regexr.com it matches only the strings I want, so why is Google Analytics matching more?
Try this:
(\/MEMBERSHIP\/SIGNED\-UP\/(?!.*(STEP\-2))|\/MEMBERSHIP\/CAMPAGIN\/(?!.*(NOT\-THIS\-CAMPAIGN)).[-\w]+\/SIGNED\-UP\/)
Your regex is almost correct, but you need to ensure it doesn't match step-2.
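One likely reason for the extra match is that the original pattern is unanchored, so it only has to match somewhere inside the path, and /membership/signed-up/step-2/ begins with /membership/signed-up/. The sketch below illustrates this in Python. Two assumptions: Python's re module is not the engine Google Analytics uses, and the lookahead is kept only because the question says the pattern works there. The anchored variant is an alternative for illustration, not the answer's method, and the helper hits is a made-up name.

```python
import re

# Original pattern from the question (unanchored).
ORIGINAL = (
    r"(\/membership\/signed-up\/"
    r"|\/membership\/campagin\/(?!.*(not-this-campaign)).[-\w]+\/signed-up\/)"
)
# Hypothetical variant: the same alternatives, required to span the whole path.
ANCHORED = r"^" + ORIGINAL + r"$"

paths = [
    "/membership/signed-up/",
    "/membership/campagin/random-campaign/signed-up/",
    "/membership/campagin/other-random-campaign/signed-up/",
    "/membership/campagin/not-this-campaign/signed-up/",
    "/membership/signed-up/step-2/",
]

def hits(pattern: str) -> list[str]:
    """Return the paths in which the pattern is found anywhere."""
    return [p for p in paths if re.search(pattern, p)]

print("unanchored:", hits(ORIGINAL))  # includes the step-2 path
print("anchored:  ", hits(ANCHORED))  # step-2 path is gone
```

Anchoring with ^ and $ avoids having to list every unwanted suffix such as step-2 by name.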

Use Regex to match beginning and end part of URL in Google Analytics

I'm looking for a regex function to implement in a goal for Google Analytics.
Consider this URL: /dagje-uit/....variable part..../contact/vpv/bedankt
The regex should match when the beginning of the URL is /dagje-uit/ and the end part contains /contact/vpv/bedankt. Everything in the middle can vary.
I've tried the following without success:
(?=^/dagje-uit/.*)(?=.*/bedankt$).*
(?=^dagje-uit.*)(?=.*bedankt$).*
Thanks in advance!
Regards,
Pim
Forgive me if Google Analytics has some regex standards which I am overlooking, but is it possible that your regex is failing because it does not account for the whole of the URL? Adding .* to either end of your regex may help.
It also looks like your regex is more complex than the conditions you have described require. Could a simpler match be:
.*/dagje-uit/.*/contact/vpv/bedankt.*
or
http(s)?://.*/dagje-uit/.*/contact/vpv/bedankt.*
if you want to be a little more confident that it is a valid URL.
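To see how the first suggestion behaves, here is a small Python sketch comparing it with a stricter, anchored variant. The anchored pattern and every sample path below are assumptions made up for illustration (the question elides the variable part), and Python's re module is not the engine Google Analytics uses.

```python
import re

# First suggestion from the answer: the fragments may appear anywhere.
LOOSE = r".*/dagje-uit/.*/contact/vpv/bedankt.*"
# Hypothetical stricter variant: must start with /dagje-uit/ and end with bedankt.
STRICT = r"^/dagje-uit/.*/contact/vpv/bedankt$"

# Hypothetical sample paths, not taken from the question.
paths = [
    "/dagje-uit/pretpark/contact/vpv/bedankt",
    "/dagje-uit/a/b/contact/vpv/bedankt",
    "/weekend/dagje-uit/a/contact/vpv/bedankt",
    "/dagje-uit/a/contact/vpv/bedankt/extra",
]

def hits(pattern: str) -> list[str]:
    """Return the paths in which the pattern is found."""
    return [p for p in paths if re.search(pattern, p)]

print("loose: ", hits(LOOSE))   # all four paths
print("strict:", hits(STRICT))  # only the first two
```

Which of the two is appropriate depends on whether paths with extra leading or trailing segments should count toward the goal.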

Filtering Google Analytics API with Regex - Stop Before a Character (query string)

I'm working with Google Analytics API add-on for Google Spreadsheets to pull in data.
I know basic regex and it turns out that negative lookbacks / not operators (I'm assuming they're the same?) aren't allowed in Google Analytics, therefore I'm having difficulty with this filter.
I want to filter out all URL page paths that have a query string in them. Here's a sample list:
/product/9779/this-is-a-product
/product/27193/this-is-a-product-with-a-query-string?productId=50334&ps=True
/product/281727/this-is-another-product-with-a-really-long-title
/product/979
/product/979/product-12-pump-septic
/product/9790/the-1983-ford-sedan
/product/9791/remington-870-3-express-410-pump-shotgun
/category/2738/this-is-a-category
I want my output to be:
/product/9779/this-is-a-product
/product/281727/this-is-another-product-with-a-really-long-title
/product/979/product-12-pump-septic
/product/9790/the-1983-ford-sedan
/product/9791/remington-870-3-express-410-pump-shotgun
This is the start of my Regex...
ga:pagePath=~^/product/(.*)/
...which ignores the fourth line, but I have no idea what to put after the second forward slash.
I've tried a few things (such as the approach in "Regular expression to stop at first match") and have been testing my code here (http://www.analyticsmarket.com/freetools/regex-tester).
Any insight would be greatly appreciated!
You can use the following regular expression to match the desired output.
^/product/.*/[\w-]+$
Live Demo
Try this as well. It will strictly capture what you need.
^\/product\/((?:(?!\/|[a-z]).)*)\/[\w-]+$
See demo: http://regex101.com/r/gS3lF8/2
^/product/\d+/[a-zA-Z0-9-]+$
You can try this. See demo.
http://regex101.com/r/oE6jJ1/16
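The lookahead-free patterns above can be checked against the sample list from the question with a short Python sketch. Python's re module is not the engine behind the Google Analytics API, so treat this as a sanity check only; the helper name keep is made up. In the API filter itself the pattern would go after ga:pagePath=~, as in the question.

```python
import re

page_paths = [
    "/product/9779/this-is-a-product",
    "/product/27193/this-is-a-product-with-a-query-string?productId=50334&ps=True",
    "/product/281727/this-is-another-product-with-a-really-long-title",
    "/product/979",
    "/product/979/product-12-pump-septic",
    "/product/9790/the-1983-ford-sedan",
    "/product/9791/remington-870-3-express-410-pump-shotgun",
    "/category/2738/this-is-a-category",
]

# Two of the suggested patterns; neither uses lookarounds.
FIRST = r"^/product/.*/[\w-]+$"
THIRD = r"^/product/\d+/[a-zA-Z0-9-]+$"

def keep(pattern: str, paths: list[str]) -> list[str]:
    """Return only the paths that match the pattern."""
    return [p for p in paths if re.search(pattern, p)]

for path in keep(THIRD, page_paths):
    print(path)
```

Both patterns reject the query-string URL because ?, = and & are not in the final character class, and both reject /product/979 because it has no title segment.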

negative match in google analytics funnel (for use in shopify store)

I am setting up a conversion funnel in Google Analytics and want to capture all product collection pages for my Shopify store. To do this, I want to match everything with this pattern: ^/collections/.* but I also need to exclude everything with this pattern:
^/collections/.*/products/.*
The reason being that collection(product category) pages follow this structure:
/collections/[collection-name]
E.g.,
/collections/shoes
/collections/tshirts
/collections/hats
etc
Product pages follow this structure: /collections/[collection-name]/products/[product-name]
E.g.,
/collections/shoes/products/pink-reeboks
/collections/tshirts/products/plain-white-tee
So I want to capture just the collection pages but not the product pages.
I have already identified a negative lookahead as the ideal way to do this. However, Google Analytics does not allow negative lookaheads, so I need another way to do this.
Any help would be greatly appreciated!
Thanks in advance.
If you use the range [a-zA-Z0-9], this won't match a /. See Rubular for a simple demo of this regex.
^/collections/[a-zA-Z0-9]*$
Of course you need to update your regex if you use collection tags within Shopify.
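A short Python sketch of the idea, run against the example paths from the question. The second pattern is a hypothetical variant, not part of the answer: it accepts any characters except / in the collection name (so hyphenated names would pass) and allows an optional trailing slash. The path /collections/summer-sale is also made up for illustration, and Python's re module is not the engine Google Analytics uses.

```python
import re

# Pattern from the answer: letters and digits only after /collections/.
ANSWER = r"^/collections/[a-zA-Z0-9]*$"
# Hypothetical variant: one path segment of any characters except "/".
VARIANT = r"^/collections/[^/]+/?$"

paths = [
    "/collections/shoes",
    "/collections/tshirts",
    "/collections/hats",
    "/collections/shoes/products/pink-reeboks",
    "/collections/tshirts/products/plain-white-tee",
    "/collections/summer-sale",  # hypothetical hyphenated collection name
]

def hits(pattern: str) -> list[str]:
    """Return the paths that match the pattern."""
    return [p for p in paths if re.search(pattern, p)]

print("answer: ", hits(ANSWER))   # the three simple collection pages
print("variant:", hits(VARIANT))  # those three plus /collections/summer-sale
```

Both patterns exclude product pages without a negative lookahead, because neither character class can consume the / that precedes products.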

Why won't this regexp work in google spreadsheets?

I'm trying to extract from a url using a regexp in google spreadsheets. However the spreadsheet returns #VALUE! with the following error: Invalid regular expression: invalid perl operator: (?<
Here is the regexp I'm using: (?<=raid_boss=)[a-zA-Z0-9_]+
A sample url will contain a variable in it that says raid_boss=name. This regexp should extract name. It works in my testing program, but not in google spreadsheet.
Here is the exact contents of the cell in google spreadsheets: =REGEXEXTRACT( B1 ; "/(?<=raid_boss=)[-a-zA-Z0-9_]+" )
Any insight or help would be much appreciated, thank you!
Sounds like whatever regular-expression engine Google Docs is using doesn't support lookbehind assertions. They are a relatively rare feature.
But if you use captures, REGEXEXTRACT will return the captured text, so you can do it that way:
=REGEXEXTRACT( B1 ; "raid_boss=([a-zA-Z0-9_]+)" )
JavaScript is not the issue: Google Sheets uses RE2, which lacks lookbehind along with other useful things.
You could use:
regexextract(B1, ".*raid_boss=(.*)")
or else native sheet functions like FIND or SUBSTITUTE if that isn't working.
Finding a good regex testing tool is tricky. For example, you can write something that works in http://rubular.com/ but fails in GSheets. You need to make sure your tool supports the RE2 flavour, e.g. https://regoio.herokuapp.com/
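The capture-group approach from the answers can be sketched in Python to show the difference between the two suggested patterns. Python's re module does support lookbehind, unlike RE2, so this only illustrates the capture-group idea rather than reproducing the Sheets behaviour. The sample URL and the helper name extract are assumptions made up for this example.

```python
import re

# Hypothetical URL; the question only says it contains raid_boss=name.
url = "http://example.com/raid?raid_boss=dragon_king&difficulty=3"

def extract(pattern: str, text: str):
    """Return the first capture group, or None if the pattern does not match."""
    match = re.search(pattern, text)
    return match.group(1) if match else None

# Restricted character class: stops at the first character outside the class.
print(extract(r"raid_boss=([a-zA-Z0-9_]+)", url))  # dragon_king

# Greedy .* variant: captures everything up to the end of the string.
print(extract(r".*raid_boss=(.*)", url))  # dragon_king&difficulty=3
```

When other parameters may follow raid_boss, the restricted character class is the safer of the two.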