In Google Analytics, I need to pull out only the links whose URL is a plain string, excluding any URL that contains numbers or a query string.
So, I need this URL:
www.site.com/en/rent/cairo/apartments-for-rent/
and exclude these
www.site.com/en/buy/apartment-for-sale-in-acacia-compound-new-cairo-947145/
www.site.com/en/buy/apartment-for-sale-in-acacia-compound-new-cairo-947145/?price=1000
Thank you
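If each URL is on its own line, and that's the only thing on the line (not even whitespace), this simple regex will do the trick: ^[^0-9? ]*$ (inside a character class, | is a literal pipe rather than alternation, so no separators are needed between the excluded characters).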
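As a quick check of the idea above, here is a minimal Python sketch that runs a negated character class over the three URLs from the question. The variable names are made up for illustration, and Python's `re` module is an assumption; Google Analytics uses its own regex engine, so treat this only as a way to sanity-check the pattern.

```python
import re

# Reject any line that contains a digit, a question mark, or a space.
# Inside a character class "|" would be a literal pipe, so it is left out.
NO_DIGITS_OR_QUERY = re.compile(r"^[^0-9? ]*$")

urls = [
    "www.site.com/en/rent/cairo/apartments-for-rent/",
    "www.site.com/en/buy/apartment-for-sale-in-acacia-compound-new-cairo-947145/",
    "www.site.com/en/buy/apartment-for-sale-in-acacia-compound-new-cairo-947145/?price=1000",
]

# Keep only the URLs that match the pattern in full.
kept = [u for u in urls if NO_DIGITS_OR_QUERY.match(u)]
print(kept)
```

Only the first URL should survive, since the other two contain digits (and one of them a query string).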
In Google Analytics, I want to see data only for URLs that contain collection + category, i.e. URLs of the form /collections/category, for example: https://baileynelson.com.au/collections/glasses
However, I don't want to see data for products, for example: https://baileynelson.com.au/collections/glasses/products/adler
The regex I created is: ^/collections/(.*?)$ but it seems to include product URLs.
Any ideas on how to write a regex that matches collection pages like https://baileynelson.com.au/collections/glasses and https://baileynelson.com.au/collections/sunglasses but excludes product URLs?
Cheers!
Try this regex:
https:\/\/baileynelson\.com\.au\/collections\/\w+$
The first part: https:\/\/baileynelson\.com\.au\/collections\/ This matches the domain and the collections path. The /s and .s are escaped.
The second part: \w+ This matches word characters (letters, digits, and underscore), and the + means one or more of them. The trailing $ anchors the match at the end of the URL; without it, product URLs such as /collections/glasses/products/adler would still match.
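To see why the end of the URL matters, here is a small Python sketch that tests a collection-page pattern against the URLs from the question. The trailing `$` and the helper name `is_collection_page` are assumptions added for illustration; without an end anchor, a partial match would also accept product URLs.

```python
import re

# Domain and /collections/ path, then one path segment of word characters,
# then end of string. The "$" anchor is an assumption of this sketch.
COLLECTION_PAGE = re.compile(r"https://baileynelson\.com\.au/collections/\w+$")


def is_collection_page(url: str) -> bool:
    """Return True when the URL is a collection page and not a product page."""
    return COLLECTION_PAGE.search(url) is not None


for url in [
    "https://baileynelson.com.au/collections/glasses",
    "https://baileynelson.com.au/collections/sunglasses",
    "https://baileynelson.com.au/collections/glasses/products/adler",
]:
    print(url, is_collection_page(url))
```

Note that `\w` does not match hyphens, so a category slug containing a hyphen would need a wider character class such as `[\w-]+`.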
I am using an online tool to crawl my client's website and provide a list of pages / URLs that exist on it.
There is an option to exclude pages, and it gives a regex example of \?.*page=.*$
I would like to ignore everything in the news section (apart from the News page itself)
So would I go with the following?
\?.*news/.*$
If I understand you correctly, you're looking for a regex that matches news/foo or news/foo/bar, but not news/.
You can use this regex for that: .*news/.+
.* string starts with 0 or more character(s)
news/ string includes news/
.+ string ends with 1 or more character(s)
http://regexr.com/3ffj1
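The breakdown above can be checked with a short Python sketch. The `example.com` URLs and the function name `is_excluded` are hypothetical, and the crawler's own regex engine may differ in detail from Python's `re`.

```python
import re

# "news/" followed by at least one more character: matches pages inside
# the news section but not the news index page itself.
NEWS_SUBPAGE = re.compile(r".*news/.+")


def is_excluded(url: str) -> bool:
    """Return True when the URL would be excluded from the crawl."""
    return NEWS_SUBPAGE.search(url) is not None


for url in [
    "https://example.com/news/",
    "https://example.com/news/foo",
    "https://example.com/news/foo/bar",
]:
    print(url, is_excluded(url))
```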
I'm trying to exclude (in a Goal) a character in a regex in Google Analytics.
Basically, I have two pages with the following URL:
/signup/done/b
/signup/done/bp
Note that both might also have UTM parameters appended in some results.
I am trying to measure only /done/b
The Regex I had was the following, but it includes both strings:
(/signup/done/plan/b)
When I changed it (and verified it in an external regex tester) I got 0 results, so the /b/ was also not included.
(/signup/done/plan/b[^p])
This regex would handle the case where the URL ends with /b or if there are query parameters:
/signup/done/b($|\?.*)
So examples of converting URLs would be:
/signup/done/b
/signup/done/b?utm_campaign=test&utm_medium=display
/signup/done/b?query=value
Examples of non-converting URLs would be:
/signup/done/bd
/signup/done/b/something
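The converting and non-converting examples above can be verified with a minimal Python sketch. The function name `converts` is made up for illustration, and Python's `re` is only a stand-in for the Google Analytics regex engine.

```python
import re

# "/signup/done/b" followed either by the end of the string
# or by a query string starting with "?".
GOAL = re.compile(r"/signup/done/b($|\?.*)")


def converts(path: str) -> bool:
    """Return True when the path would count toward the goal."""
    return GOAL.search(path) is not None


for path in [
    "/signup/done/b",
    "/signup/done/b?utm_campaign=test&utm_medium=display",
    "/signup/done/bp",
    "/signup/done/b/something",
]:
    print(path, converts(path))
```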
I have a list of text lines. Each line contains a title and a URL as follows:
product-title-7134 http://domain.com/page-1
another-product-title-822 http://domain.com/page-218
etc.
Using only .NET regex, please help me extract the url from each line.
I understand it can be done by looking at the string from the end until the http is met and output that part but I don't know the exact regex formula for that. Any help is much appreciated.
I would do that with this regex:
http://(\S+)
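Then take the first group of each match. Note that the group holds only the part after http://, so use the whole match if you want the full URL.
This regex will match all https:// and http:// links: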
(http|https)(://\S+)
You can test this in the .NET regex tester: http://regexstorm.net/tester
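For illustration, here is the same extraction sketched in Python rather than .NET (an assumption, since the question asks for .NET; the pattern uses only syntax that .NET regexes also accept). The helper name `extract_url` is hypothetical, and `https?` is a compact form of the http/https alternation.

```python
import re
from typing import Optional

# "http" or "https", then "://", then everything up to the next whitespace.
URL = re.compile(r"https?://\S+")


def extract_url(line: str) -> Optional[str]:
    """Return the first URL found in the line, or None if there is none."""
    match = URL.search(line)
    return match.group(0) if match else None


lines = [
    "product-title-7134 http://domain.com/page-1",
    "another-product-title-822 http://domain.com/page-218",
]
for line in lines:
    print(extract_url(line))
```

Taking the whole match (`group(0)`) keeps the scheme, so no capture groups are needed.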
I want to show only the URLs that contain "category-".
url/category-cats
url/category-elfs
url/category-dogs
The following should do the job:
^url/category-\w+/?$
But note that, depending on your regex engine, you may need to escape the forward slashes:
^url\/category-\w+\/?$
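A minimal Python sketch of the pattern above, assuming the literal prefix `url/` from the question stands in for the real path. In Python the forward slashes need no escaping; the helper name `is_category_url` is made up for illustration.

```python
import re

# "url/category-" followed by one or more word characters,
# with an optional trailing slash and nothing after it.
CATEGORY = re.compile(r"^url/category-\w+/?$")


def is_category_url(url: str) -> bool:
    """Return True when the URL is a category page."""
    return CATEGORY.match(url) is not None


for url in ["url/category-cats", "url/category-dogs/", "url/category-cats/item"]:
    print(url, is_category_url(url))
```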