Matching partial URL regex Google analytics - regex

I want to show only the URLs that contain "category-".
url/category-cats
url/category-elfs
url/category-dogs

The following can do the job:
^url/category-\w+/?$
Note that, depending on your regex engine, you may need to escape the forward slashes:
^url\/category-\w+\/?$
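As a quick sanity check, the pattern above can be tried outside Google Analytics. This is a minimal sketch using Python's re module, which may differ in details from the engine Google Analytics uses; the helper name is_category_url and the sample paths other than the three from the question are made up for illustration.

```python
import re

# Pattern from the answer above; Python does not need "/" escaped.
CATEGORY_RE = re.compile(r"^url/category-\w+/?$")


def is_category_url(path: str) -> bool:
    """Return True for url/category-<word>, with an optional trailing slash."""
    return CATEGORY_RE.search(path) is not None


samples = [
    "url/category-cats",
    "url/category-elfs/",
    "url/about",        # hypothetical non-category page
    "url/category-",    # nothing after the dash, so \w+ cannot match
]
matches = [s for s in samples if is_category_url(s)]
print(matches)
```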

Related

REGEX Match a String (Google Analytics)

I need to pull out only the links whose URL is a plain string, excluding URLs that contain numbers or query strings, in Google Analytics.
So, I need this URL:
www.site.com/en/rent/cairo/apartments-for-rent/
and exclude these
www.site.com/en/buy/apartment-for-sale-in-acacia-compound-new-cairo-947145/
www.site.com/en/buy/apartment-for-sale-in-acacia-compound-new-cairo-947145/?price=1000
Thank you
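If each URL is on its own line, and that's the only thing on the line (not even whitespace), this simple regex will do the trick: ^[^0-9? ]*$ (inside a character class, | is a literal character rather than alternation, so it is not needed as a separator).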
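The behaviour of a negated character class like this one can be checked with a short Python sketch. It assumes Python's re module behaves like the Google Analytics engine for this simple pattern, and it compares a class written with | separators against the minimal form to show that | inside [...] is treated as a literal character.

```python
import re

# With "|" inside the class, pipes are rejected too (they are literal there).
WITH_PIPES = re.compile(r"^[^0-9|?| ]*$")
# Minimal form: rejects only digits, "?" and spaces.
MINIMAL = re.compile(r"^[^0-9? ]*$")

urls = [
    "www.site.com/en/rent/cairo/apartments-for-rent/",
    "www.site.com/en/buy/apartment-for-sale-in-acacia-compound-new-cairo-947145/",
    "www.site.com/en/buy/apartment-for-sale-in-acacia-compound-new-cairo-947145/?price=1000",
]

# Keep only URLs that contain no digit, no "?" and no space.
kept = [u for u in urls if MINIMAL.search(u)]
print(kept)
```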

Regex - analytics filter

I'm trying to filter some URLs using gapi.client.analytics. What I want to achieve is to create a regex filter that covers a lot of options. The regex should keep only URLs that have this structure:
subdomain1.domain.com/some-post/
My problem is that I have some other urls that I don't know how to exclude, like:
subdomain1.domain.com/p/code/
subdomain1.domain.com/
subdomain1.domain.com/some-author/some-name/
subdomain2.domain.com/some-post/
subdomain2.domain.com/p/code/
I tried to use: ga:hostname=#subdomain1.domain.com to get links that contain only subdomain1.
I also tried: ga:hostname=~^[^/]+/?[^/]+/?$ to get only those that have two / characters in the URL.
Unfortunately I couldn't manage to do what I want.
The following regex should match URLs with exactly one trailing directory:
^[a-zA-Z0-9_-]+\.domain\.com\/[a-zA-Z0-9_-]+\/$
or
^[a-zA-Z0-9_\-\.]+\/[a-zA-Z0-9_-]+\/$
to match every domain.
You can test Google Analytics regexes on analyticsmarket.com
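The first pattern can be checked against the URLs from the question with a small Python sketch. It assumes the hostname and page path have been joined into one string, since the pattern spans both. The second pattern, SUBDOMAIN1_RE, is not from the answer: it is an illustrative variant that pins the host to subdomain1, because the question asks to drop subdomain2 as well.

```python
import re

# Pattern from the answer: any subdomain of domain.com, exactly one directory.
ONE_DIR_RE = re.compile(r"^[a-zA-Z0-9_-]+\.domain\.com/[a-zA-Z0-9_-]+/$")
# Illustrative variant (assumption): same shape, but only subdomain1.
SUBDOMAIN1_RE = re.compile(r"^subdomain1\.domain\.com/[a-zA-Z0-9_-]+/$")

urls = [
    "subdomain1.domain.com/some-post/",
    "subdomain1.domain.com/p/code/",
    "subdomain1.domain.com/",
    "subdomain1.domain.com/some-author/some-name/",
    "subdomain2.domain.com/some-post/",
    "subdomain2.domain.com/p/code/",
]

one_dir = [u for u in urls if ONE_DIR_RE.search(u)]
only_sub1 = [u for u in urls if SUBDOMAIN1_RE.search(u)]
print(one_dir)
print(only_sub1)
```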

Exclude Character in Google Analytics via Regex

I'm trying to exclude (in a Goal) a character in a regex in Google Analytics.
Basically, I have two pages with the following URL:
/signup/done/b
/signup/done/bp
Note that both might also have UTM parameters appended in some results.
I am trying to measure only /done/b
The Regex I had was the following, but it includes both strings:
(/signup/done/plan/b)
When I changed it (and verified it in an external regex tester) I got 0 results, so the /b/ was also not included.
(/signup/done/plan/b[^p])
This regex would handle the case where the URL ends with /b or if there are query parameters:
/signup/done/b($|\?.*)
So examples of converting URLs would be:
/signup/done/b
/signup/done/b?utm_campaign=test&utm_medium=display
/signup/done/b?query=value
Examples of non-converting URLs would be:
/signup/done/bd
/signup/done/b/something
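The converting and non-converting examples above can be verified with a short Python sketch. It uses re.search to imitate an unanchored "contains a match" check, which is an assumption about how the goal regex is applied; the helper name converts is made up for illustration.

```python
import re

# Pattern from the answer: /b must be followed by end of string or a query string.
GOAL_RE = re.compile(r"/signup/done/b($|\?.*)")


def converts(url: str) -> bool:
    """Return True when the URL would count towards the goal."""
    return GOAL_RE.search(url) is not None


converting = [
    "/signup/done/b",
    "/signup/done/b?utm_campaign=test&utm_medium=display",
    "/signup/done/b?query=value",
]
non_converting = [
    "/signup/done/bp",
    "/signup/done/bd",
    "/signup/done/b/something",
]

print([converts(u) for u in converting])
print([converts(u) for u in non_converting])
```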

How to set up regex in nutch for filtering URL of techcrunch?

I want to crawl the pages of Techcrunch uploaded after 1 Jan 2013. The website follows the pattern
http://www.techcrunch.com/YYYY/MM/DD
So my question is how to set up the regex in urlfilter in Nutch so that I crawl only the pages I want.
+^http://www.techcrunch.com/2013/dd/dd/([a-z0-9\-A-Z]*\/)*
I don't know Nutch, but did you try:
+^http://www.techcrunch.com/2013/[0-9]{2}/[0-9]{2}.*$
or
+^http://www.techcrunch.com/2013/[0-9]+/[0-9]+.*$
The following expressions will match the URLs you need:
Without groups
http:\/\/www.techcrunch.com\/\d{4}\/\d{2}\/\d{2}\/\w+
With groups
http:\/\/www.techcrunch.com\/(\d{4})\/(\d{2})\/(\d{2})\/(\w+)
I didn't put anchors (^$), but you can put them if you need them for the filtering.
Try them to see if any of them work.
I don't know how nutch works, but a couple of suggestions about your regex that may apply: the / in the regexp should be escaped; the dd parts should be \d\d so they match two digits.
About setting up the regex, check out this answer to see if it helps you.
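Before putting a pattern into the Nutch filter, it can help to try it in isolation. The sketch below is plain Python, not Nutch configuration, so the leading + of the filter rule is left out. The year alternation (2013 through 2099) and the sample URLs are assumptions made for illustration, and the dots in the hostname are escaped so they match only a literal dot.

```python
import re

# Dated post URLs from 2013 onwards (year range 2013-2099 is an assumption).
POST_RE = re.compile(
    r"^http://www\.techcrunch\.com/20(?:1[3-9]|[2-9]\d)/\d{2}/\d{2}/"
)


def should_crawl(url: str) -> bool:
    """Return True when the URL starts with a /YYYY/MM/DD/ path dated 2013 or later."""
    return POST_RE.search(url) is not None


# Hypothetical sample URLs.
samples = [
    "http://www.techcrunch.com/2013/01/15/some-post/",
    "http://www.techcrunch.com/2014/06/02/another-post/",
    "http://www.techcrunch.com/2012/12/31/old-post/",
    "http://www.techcrunch.com/about/",
]
print([should_crawl(u) for u in samples])
```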

Regular expression for URL for Google Analytics

I'm trying to set up a goal on Google Analytics and want to match a regular expression against the following url:
/tasks/[random characters of random length]/complete
An example input url would be:
/tasks/12444ab22aaa7/complete
Any ideas?
^/tasks/.*/complete$
Validate with a search in the Top Content report before using in a filter.
Perhaps this:
\/tasks\/[^/]*\/complete
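The two suggestions behave differently on edge cases, which a short Python sketch can show. It uses re.search as a stand-in for the Google Analytics matcher, which is an assumption. STRICT_RE is not from either answer: it is an illustrative combination that keeps the anchors and restricts the middle part to a single path segment.

```python
import re

GREEDY_RE = re.compile(r"^/tasks/.*/complete$")    # first answer: anchored, any middle
UNANCHORED_RE = re.compile(r"/tasks/[^/]*/complete")  # second answer: one segment, no anchors
STRICT_RE = re.compile(r"^/tasks/[^/]+/complete$")  # illustrative: anchored, one segment

# Hypothetical sample URLs, apart from the first one, which is from the question.
samples = [
    "/tasks/12444ab22aaa7/complete",
    "/tasks/a/b/complete",    # two segments between /tasks/ and /complete
    "/tasks/abc/completed",   # extra text after "complete"
]

for name, pattern in [("greedy", GREEDY_RE), ("unanchored", UNANCHORED_RE), ("strict", STRICT_RE)]:
    print(name, [pattern.search(s) is not None for s in samples])
```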