RegEx expression to find only exact match with Capitalization variants

I want to filter landing pages in Analytics to see traffic from “/ph” and “/pH” only, and not include other pages such as /ph-electrode-maintenance-calibration-guide. (ph|pH) didn't work. Any help would be greatly appreciated.

Specify Start of String ^ and End of String $ Anchors:
^\/p[hH]$
http://www.regular-expressions.info/anchors.html
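As a quick sanity check of the anchored pattern, here is a minimal Python sketch. The list of paths is made up for illustration, and Python's re module is only a stand-in for the Analytics filter engine, so treat it as a demonstration of the anchors rather than of Analytics itself.

```python
import re

# Pattern from the answer: start anchor, "/p", then "h" or "H", end anchor.
EXACT_PH = re.compile(r"^\/p[hH]$")

# Hypothetical landing-page paths, used only for illustration.
paths = [
    "/ph",
    "/pH",
    "/ph-electrode-maintenance-calibration-guide",
    "/ph/",
]

# Keep only the paths the anchored pattern accepts.
matched = [p for p in paths if EXACT_PH.search(p)]
print(matched)
```

Only the two exact paths survive; anything with extra characters after /ph is rejected by the $ anchor.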

Try (\/p(h|H)(\/?$|\/)). It will match all your paths, e.g.:
path/pH
path/ph
path/pH/anything
path/ph/anything
but not:
path/ph-electrode-maintenance-calibration-guide
Add a ^ to the beginning if your path always begins with /ph:
^(\/p(h|H)(\/?$|\/))
Try RegExr to test it out
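If you would rather check it in code, here is a small Python sketch that runs both variants over the sample paths above. The names PH_SECTION and ANCHORED and the extra test strings for the anchored variant are my own, and Python's re module is assumed to behave like the Analytics engine for these simple constructs.

```python
import re

# Unanchored variant: "/ph" or "/pH" followed by end of string or another "/".
PH_SECTION = re.compile(r"(\/p(h|H)(\/?$|\/))")
# Anchored variant: the path itself must begin with /ph or /pH.
ANCHORED = re.compile(r"^(\/p(h|H)(\/?$|\/))")

samples = [
    "path/pH",
    "path/ph",
    "path/pH/anything",
    "path/ph/anything",
    "path/ph-electrode-maintenance-calibration-guide",
]
unanchored_hits = [s for s in samples if PH_SECTION.search(s)]

# Hypothetical inputs showing the effect of the leading ^.
anchored_hits = [s for s in ["/ph", "/pH/anything", "path/ph"] if ANCHORED.search(s)]

print(unanchored_hits)
print(anchored_hits)
```

The guide page is the only sample the unanchored pattern rejects, and with ^ added, "path/ph" no longer matches because it does not start with /ph.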

Related

Regex for Affiliate URL

For Matomo outgoing link tracking I need a regex pattern that matches the following URLs:
https://www.example.com/product/?sku=12345&utm_source=123456789
and
https://www.example.com/product/?utm_source=123456789
"https://www.example.com/" and "utm_source=123456789" are always fixed in the URL; only the "product/" or "category/product/" part changes and must be covered by the regex pattern.
Thanks
Maybe this example can help you reach your goal:
(?<=https:\/\/www\.example\.com\/).+(?=utm_source=123456789)
It looks for any characters between these two groups:
https://www.example.com/
utm_source=123456789
Given the examples:
https://www.example.com/product/?sku=12345&utm_source=123456789
https://www.example.com/product/?utm_source=123456789
Your matches would be:
product/?sku=12345&
product/?
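To see the lookbehind and lookahead in action, here is a minimal Python sketch using the two example URLs. Python is used only for illustration; whether Matomo's own matcher supports lookbehind is not something this sketch verifies.

```python
import re

# Lookbehind fixes the prefix, lookahead fixes the suffix; only the middle is consumed.
BETWEEN = re.compile(
    r"(?<=https:\/\/www\.example\.com\/).+(?=utm_source=123456789)"
)

urls = [
    "https://www.example.com/product/?sku=12345&utm_source=123456789",
    "https://www.example.com/product/?utm_source=123456789",
]

middles = []
for url in urls:
    m = BETWEEN.search(url)
    if m:
        # group(0) is the text between the two fixed parts.
        middles.append(m.group(0))

print(middles)
```

Because both lookarounds are zero-width, the fixed prefix and suffix are required but never part of the returned match.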

Why is my Regex include filter not working (google analytics)?

In Google Analytics, I have created the following include filter:
^https:\/\/(my\..*|accounts\..*|maya\..*\/reports\/(mymessages|favorites)|maya\..*\/account\/notification|info\..*\/(heb|eng)\/management\/generalpages\/pages\/(personalfolder|registration|change_password|userssearchindex|security%20search)\.aspx).*
In order to include only URLs that contain the following addresses:
https://my.tase.co.il
https://accounts.tase.co.il
https://maya.tase.co.il/reports/mymessages
https://maya.tase.co.il/reports/favorites
https://maya.tase.co.il/account/notification
https://info.tase.co.il/Management/GeneralPages/Pages/PersonalFolder.aspx
https://info.tase.co.il/Management/GeneralPages/Pages/Registration.aspx
https://info.tase.co.il/Management/GeneralPages/Pages/Change_Password.aspx
https://info.tase.co.il/Management/GeneralPages/Pages/UsersSearchIndex.aspx
https://info.tase.co.il/Management/GeneralPages/Pages/Security%20Search.aspx
But for some reason I can't get it to work.
What am I doing wrong?
Thanks for your help!
The pattern does not match the links that start with info. because it requires info\..*\/(heb|eng), and there is no heb or eng segment in the example data.
You can either remove that part or use a pattern that exactly matches the start of those URLs:
https:\/\/(?:(?:accounts|my)\.tase\.co\.il|maya\.tase\.co\.il\/(?:reports\/(?:mymessages|favorites)|account\/notification)|info\.tase\.co\.il\/Management\/GeneralPages\/Pages\/(?:PersonalFolder|Registration|Change_Password|UsersSearchIndex|Security%20Search)\.aspx).*
See a Regex demo.
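Here is a rough Python sketch comparing the two patterns against the listed addresses. It assumes the info. URLs have a slash between the host and Management, and it uses case-insensitive matching so that the lowercase names in the original filter are not the reason for a mismatch. Python's re module only approximates how the Analytics filter evaluates the pattern.

```python
import re

# Original include filter from the question.
ORIGINAL = r"^https:\/\/(my\..*|accounts\..*|maya\..*\/reports\/(mymessages|favorites)|maya\..*\/account\/notification|info\..*\/(heb|eng)\/management\/generalpages\/pages\/(personalfolder|registration|change_password|userssearchindex|security%20search)\.aspx).*"
# Pattern suggested in the answer.
FIXED = r"https:\/\/(?:(?:accounts|my)\.tase\.co\.il|maya\.tase\.co\.il\/(?:reports\/(?:mymessages|favorites)|account\/notification)|info\.tase\.co\.il\/Management\/GeneralPages\/Pages\/(?:PersonalFolder|Registration|Change_Password|UsersSearchIndex|Security%20Search)\.aspx).*"

# Assumed form of the info. URLs: with "/" after the host.
INFO_BASE = "https://info.tase.co.il/Management/GeneralPages/Pages/"
urls = [
    "https://my.tase.co.il",
    "https://accounts.tase.co.il",
    "https://maya.tase.co.il/reports/mymessages",
    "https://maya.tase.co.il/reports/favorites",
    "https://maya.tase.co.il/account/notification",
    INFO_BASE + "PersonalFolder.aspx",
    INFO_BASE + "Registration.aspx",
    INFO_BASE + "Change_Password.aspx",
    INFO_BASE + "UsersSearchIndex.aspx",
    INFO_BASE + "Security%20Search.aspx",
]

original_hits = [u for u in urls if re.search(ORIGINAL, u, re.IGNORECASE)]
fixed_hits = [u for u in urls if re.search(FIXED, u, re.IGNORECASE)]

# URLs the original filter misses but the suggested pattern accepts.
missed = [u for u in fixed_hits if u not in original_hits]
print(len(original_hits), len(fixed_hits))
print(missed)
```

Under these assumptions, the original filter accepts the my., accounts. and maya. addresses but none of the info. ones, which is consistent with the explanation about the heb|eng segment.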

Regex: Get subtext from a string

I have a list of text lines. Each line contains a title and a URL as follows:
product-title-7134 http://domain.com/page-1
another-product-title-822 http://domain.com/page-218
etc.
Using only .NET regex, please help me extract the url from each line.
I understand it can be done by looking at the string from the end until the http is met and output that part but I don't know the exact regex formula for that. Any help is much appreciated.
I would do that with this regex:
http://(\S+)
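Then take the first group of every match. Note that the group holds the URL without the http:// prefix; the whole match is the full URL.
This regex will match all https:// and http:// links: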
(http|https)(://\S+)
You can test this in the .NET regex tester: http://regexstorm.net/tester
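The question asks for .NET, but for a quick illustration here is the second pattern run in Python. It uses only groups, alternation and \S, which behave the same way in both engines for this input; the variable names are my own.

```python
import re

# Scheme in group 1, the rest of the URL in group 2; \S+ stops at whitespace.
URL = re.compile(r"(http|https)(://\S+)")

lines = [
    "product-title-7134 http://domain.com/page-1",
    "another-product-title-822 http://domain.com/page-218",
]

# group(0) is the whole match, i.e. the complete URL including the scheme.
urls = [m.group(0) for line in lines for m in URL.finditer(line)]
print(urls)
```

Taking the whole match rather than a single group avoids having to glue the scheme back on.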

Match part of url with regex

I have a challenge with a regex match to a url I hope I can bug some of you clever heads with :-)
Please take a look at this testcase https://www.regex101.com/r/bH4hE1/2
I use the regex: (\w+)(.\w+)+(?!.*(\w+)(.\w+)+)
Problem is, it only finds reports.html but I also need to find reports in the first url
https://my.website.com/reports?ref_=kdp_BS
https://my.website.com/reports.html
To capture "reports" or "reports.html" in any path, begin your match after the last /, and capture word characters and .:
/.*\/([.\w+]+)/
See: https://www.regex101.com/r/iZ7dF3/8
Try:
/([^\/?]+)(?:\?.+)?$/gim
It will work and select:
reports
reports.html
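Both suggestions can be compared side by side with a short Python sketch. The slash delimiters and the g flag from the regex101 notation are dropped, and i and m are mapped to re.IGNORECASE and re.MULTILINE; the names AFTER_LAST_SLASH and LAST_SEGMENT are just labels for this example.

```python
import re

# First answer: greedy ".*" backtracks to the last "/" followed by word characters or dots.
AFTER_LAST_SLASH = re.compile(r".*\/([.\w+]+)")
# Second answer: last run of characters that are neither "/" nor "?",
# optionally followed by a query string, up to the end of the line.
LAST_SEGMENT = re.compile(r"([^\/?]+)(?:\?.+)?$", re.IGNORECASE | re.MULTILINE)

urls = [
    "https://my.website.com/reports?ref_=kdp_BS",
    "https://my.website.com/reports.html",
]

# Group 1 holds the captured last path segment in both patterns.
first = [AFTER_LAST_SLASH.search(u).group(1) for u in urls]
second = [LAST_SEGMENT.search(u).group(1) for u in urls]
print(first)
print(second)
```

For these two URLs both patterns capture the same segment; they differ in that the second one also consumes the query string as part of the overall match.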

How 'Exclude URLs With regex' In Live HTTP headers

I want to exclude some URLs from Live HTTP headers (Firefox add-on).
So in the Config area I checked "Exclude URLs With regex" and put the string below in it:
.gif$|.jpg$|.ico$|.css$|.js$|.png$|.bmp$|.jpeg$|google$|bing$|alexa$
I want to exclude all images from capture, as well as any URL that contains:
css - js - google - bing - alexa
What is the problem with my regex, and would you please fix it for me?
Thanks in advance
. means "any char"
$ means "the end of the string"
That said:
.gif$ will match "any string ending with gif that is at least 4-char long"
google$ will match "any string ending with google"
I guess you were looking for something like:
[.](gif|jpg|ico|css|js|png|bmp|jpeg)$|\b(google|bing|alexa)\b
Maybe your regexps get autoanchored with ^ and $ by the tool you're using. In this case, use .* additionally:
.*[.](gif|jpg|ico|css|js|png|bmp|jpeg)$|.*\b(google|bing|alexa)\b.*
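To make the difference concrete, here is a small Python sketch comparing the original string with the first suggested pattern. The URLs are invented, and it assumes the add-on performs an unanchored search, which is a guess about the tool rather than something confirmed here.

```python
import re

# Original exclude string: "." is any character and the keywords are tied to the end.
ORIGINAL = re.compile(
    r".gif$|.jpg$|.ico$|.css$|.js$|.png$|.bmp$|.jpeg$|google$|bing$|alexa$"
)
# Suggested fix: literal dot before the extension, keywords allowed anywhere.
FIXED = re.compile(
    r"[.](gif|jpg|ico|css|js|png|bmp|jpeg)$|\b(google|bing|alexa)\b"
)

# Hypothetical URLs for illustration only.
urls = [
    "http://example.com/logo.gif",       # image, should be excluded
    "http://example.com/app.js",         # script, should be excluded
    "http://www.google.com/search?q=x",  # contains "google", should be excluded
    "http://example.com/page.html",      # should be kept
    "http://example.com/nodejs",         # ends in "js" but has no ".js" extension
]

original_excluded = [u for u in urls if ORIGINAL.search(u)]
fixed_excluded = [u for u in urls if FIXED.search(u)]
print(original_excluded)
print(fixed_excluded)
```

The original string wrongly excludes the /nodejs URL and lets the google URL through, while the fixed pattern handles both as intended.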