Regular expression to match a domain - regex

I want a regular expression for Google Analytics that matches a domain, including all of its subdomains.
Say the domain name is xyz.com.
I want to match every URL that has xyz.com in it.
Example:
abcd.xyz.com, abc1232.xyz, www.xyz.com, www.xyz.com/abc
Can anyone help me with that?
My purpose is to exclude traffic coming from these sites from my Google Analytics reports.

In general, the regular expression to match those domains would be something like .*\.xyz\.com$. The backslashes escape the dots (which are normally wildcard characters), and the dollar sign represents the end of the string.
There are different regex implementations, so you might have to tweak this for your regex engine.

To exclude subdomains as described above, you can use a GA filter ([Exclude] [Hostname] [Matching RegEx]) along with the regular expression (xyz\.com)|(.*\.xyz\.com).
This regex matches both the main domain and its subdomains.

You could try this regex
(.*\.)?xyz\.com
This matches all your required formats for the URL.
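As a quick check, here is a minimal Python sketch that runs two of the patterns quoted above against a few sample hostnames. The hostname list (including jobs-xyz.com) is made up for illustration, and Python's `re` module only stands in for the Google Analytics regex engine, so whether a GA filter matches the whole field or any part of it is an assumption worth verifying in your own account.

```python
import re

# Patterns quoted from the answers above.
SUBDOMAINS_ONLY = re.compile(r".*\.xyz\.com$")
OPTIONAL_SUBDOMAIN = re.compile(r"(.*\.)?xyz\.com")

# Hypothetical sample hostnames; jobs-xyz.com is an unrelated look-alike.
HOSTNAMES = ["xyz.com", "www.xyz.com", "abcd.xyz.com", "jobs-xyz.com"]


def full_matches(pattern, hostnames):
    """Hostnames that the pattern matches from start to end."""
    return [h for h in hostnames if pattern.fullmatch(h)]


def partial_matches(pattern, hostnames):
    """Hostnames in which the pattern matches anywhere."""
    return [h for h in hostnames if pattern.search(h)]


# The first pattern needs a dot before "xyz", so the bare domain is skipped.
print(full_matches(SUBDOMAINS_ONLY, HOSTNAMES))
# The optional group lets the bare domain through as well.
print(full_matches(OPTIONAL_SUBDOMAIN, HOSTNAMES))
# Without anchoring, the look-alike domain also matches.
print(partial_matches(OPTIONAL_SUBDOMAIN, HOSTNAMES))
```

If the engine does partial matching, adding ^ and $ around the second pattern should keep look-alike domains such as jobs-xyz.com out of the filter.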

Related

Firebase Dynamic Link Url patterns

I need to add url patterns for a domain that has multiple subdomains in the Url.
For example:
https://demo.site1.mybrand.company/ Where .company is the top-level domain and mybrand is the domain.
The problem is that the demo subdomain can change based on the environment in the app, so it could be demo, or test, or anything, so I would like to make sure that any subdomain with site1.mybrand.company can access the Dynamic Links API and generate links for that Url.
What I have tried:
The Firebase docs state that patterns like the following are too permissive, and I am not sure whether Firebase supports multi-tier domains such as this:
^https://.*.company/.*$
^https://.*.mybrand.company/.*$
^https://.*.site1.mybrand.company/.*$
Has anyone experienced this situation before or know if this particular scenario is supported?
References:
https://support.google.com/firebase/answer/9021429
https://github.com/google/re2/wiki/Syntax
You could use a more specific pattern that matches either demo or test with an alternation, and extend it to cover all the allowed names.
^https://(?:demo|test)\.site1\.mybrand\.company/\S*$
The pattern matches:
^ Start of string
https:// Match literally
(?:demo|test) Match either demo or test
\.site1\.mybrand\.company/ Match .site1.mybrand.company/ (note the escaped dots)
\S* Match optional non whitespace chars
$ End of string
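The idea of extending the alternation to every allowed name can be sketched in Python as below. The helper names `build_pattern` and `is_allowed` are hypothetical, and Python's `re` is used only as a stand-in for RE2; the pattern sticks to syntax that both engines share, but this has not been checked against Firebase itself.

```python
import re


def build_pattern(subdomains):
    """Build the allowlist regex from a list of permitted subdomain names."""
    # re.escape guards against names that contain regex metacharacters.
    names = "|".join(re.escape(name) for name in subdomains)
    return re.compile(rf"^https://(?:{names})\.site1\.mybrand\.company/\S*$")


def is_allowed(pattern, url):
    """True if the URL is accepted by the allowlist pattern."""
    return pattern.match(url) is not None


PATTERN = build_pattern(["demo", "test"])

print(PATTERN.pattern)
print(is_allowed(PATTERN, "https://demo.site1.mybrand.company/"))
print(is_allowed(PATTERN, "https://prod.site1.mybrand.company/"))
```

Adding another environment then only means adding its name to the list passed to `build_pattern`.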

Regex to match domain name (but not TLD)

I know there are tons of domain matching regular expressions floating around, but couldn’t find one to answer my particular question. I’m looking for a regular expression that will match only a URL’s domain name, but nothing else (not even the TLD). It doesn’t need to validate the domain.
So given the sample below:
https://www.orchardsoft.com
https://www.horizon-lims.com/contact/us
https://www.quartzy.com
https://qbench.net
https://www.xifin.com
...the regular expression needs to match for the following:
orchardsoft
horizon-lims
quartzy
qbench
xifin
The regular expression I'm starting with is this: (.|//(\w+.)+
Is anyone able to point me in the right direction?
As long as you declare the possible TLDs (here: .com.tr, .com, .net), you can use this regex:
([\w-]+)(?=\.(?:com\.tr|com|net))
In fact, an FQDN has a hierarchical structure, which makes it impossible to always parse it correctly with a regex. This pattern would fail (match twice) for entries whose path contains a TLD, such as https://www.example.com/a.combination.
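A short Python sketch of this answer, using the sample URLs from the question, may make the behavior and the caveat easier to see. The function name `domain_names` is made up for illustration.

```python
import re

# Pattern from the answer: a name followed by one of the declared TLDs.
DOMAIN = re.compile(r"([\w-]+)(?=\.(?:com\.tr|com|net))")


def domain_names(url):
    """Return every name in the URL that is directly followed by a listed TLD."""
    return DOMAIN.findall(url)


for url in [
    "https://www.orchardsoft.com",
    "https://www.horizon-lims.com/contact/us",
    "https://qbench.net",
    # The caveat: ".com" inside the path produces a second match.
    "https://www.example.com/a.combination",
]:
    print(url, domain_names(url))
```

Taking only the first match, or extracting the host with a URL parser before applying the regex, would avoid the second match in the last case.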

Regular expression to find last dot in request URI

I am trying to write a regular expression for an Apache virtual host configuration that maps a request only if the URI doesn't have certain extensions. Here is the expression I have written:
^\/bookdata\/.+\.(?!jpg|mp3|mp4|zip|doc|pdf|xls|xlsx).*$
The URI below does not match this expression, which is exactly what I want:
/bookdata/rw0/media/Q2e_00_RW_U08_Q_Classroom.mp3?fd=1
My problem is with the URI below, which does match the expression because it contains two dots:
/bookdata/rw0/media/ELM2_U02_Track06_Chart2.8.mp3?fd=1
Any small help will be appreciated.
Put the negative lookahead right at the start, like so:
^(?!.*\.(?:jpg|mp3|mp4|zip|doc|pdf|xls|xlsx))\/bookdata\/.+$
See a demo on regex101.com.
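To see why moving the lookahead helps, here is a small Python sketch comparing both patterns on the two URIs from the question. The third URI (index.html) is a made-up example of a request that should still be mapped, and Python's `re` stands in for Apache's PCRE engine.

```python
import re

# Pattern from the question: the lookahead is only checked after one chosen dot,
# so the engine can backtrack to an earlier dot and slip past it.
ORIGINAL = re.compile(r"^\/bookdata\/.+\.(?!jpg|mp3|mp4|zip|doc|pdf|xls|xlsx).*$")

# Pattern from the answer: the lookahead scans the whole string up front.
FIXED = re.compile(r"^(?!.*\.(?:jpg|mp3|mp4|zip|doc|pdf|xls|xlsx))\/bookdata\/.+$")

ONE_DOT = "/bookdata/rw0/media/Q2e_00_RW_U08_Q_Classroom.mp3?fd=1"
TWO_DOTS = "/bookdata/rw0/media/ELM2_U02_Track06_Chart2.8.mp3?fd=1"
OTHER = "/bookdata/rw0/index.html"  # hypothetical URI that should be mapped


def matches(pattern, uri):
    """True if the pattern accepts the URI."""
    return pattern.match(uri) is not None


for uri in (ONE_DOT, TWO_DOTS, OTHER):
    print(uri, matches(ORIGINAL, uri), matches(FIXED, uri))
```

Note that the fixed pattern rejects any URI that contains one of the listed extensions after any dot, not only at the very end.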
As far as I know, the request URI doesn't contain the request parameters (the part after the question mark).
So you can omit the .* at the end of your regex, and then it will match the URIs you want.
This is because your intent is that a URI ending in one of those extensions must not match.

Regex for both website url versions with wildcard

I'm trying to add allowed URLs to a WatchGuard Firebox WebBlocker list using regular expressions. I'm trying to keep my list short by having one entry apply to both the www and non-www versions of a site, including subdomains. I'm currently using the following:
(www\.)?ups\.com/*
This works great for both versions plus subdomains, but it also lets through other sites whose domain ends with ups.com, such as jobs-ups.com.
How can I make the regular expression understand that, when there is no subdomain, the URL is only ups.com with no other letters before the u, so that it blocks sites like jobs-ups.com?
You can use the caret ^ to accomplish this
^(?:www\.)?ups\.com\/
The caret anchors the match to the start of the string. This means the pattern will not match in mid-string, which is what you want.
I'm not familiar with Firebox at all, but generally you should escape your periods and forward slashes, and you would normally use a non-capturing group as well. If this is a simple regex dialect, though, you can still preserve your original formatting:
^(www.)?ups.com/*
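The effect of the caret can be checked with a small Python sketch. It assumes the URL is tested without its scheme and that the filter does a partial (search-style) match, neither of which has been verified against WebBlocker. The sample URLs are made up. `WITH_SUBDOMAINS` is a hypothetical variant that does not come from the answers above; it is included only because the anchored pattern also rejects subdomains other than www.

```python
import re

ORIGINAL = re.compile(r"(www\.)?ups\.com/*")      # pattern from the question
ANCHORED = re.compile(r"^(?:www\.)?ups\.com\/")   # pattern from the answer
# Hypothetical variant: any number of dot-separated labels before ups.com.
WITH_SUBDOMAINS = re.compile(r"^(?:[\w-]+\.)*ups\.com/")

URLS = ["ups.com/", "www.ups.com/track", "wwwapps.ups.com/", "jobs-ups.com/"]


def allowed(pattern, urls):
    """URLs in which the pattern finds a match (partial matching assumed)."""
    return [u for u in urls if pattern.search(u)]


print(allowed(ORIGINAL, URLS))         # lets jobs-ups.com/ through
print(allowed(ANCHORED, URLS))         # only the bare and www forms
print(allowed(WITH_SUBDOMAINS, URLS))  # subdomains pass, jobs-ups.com/ does not
```

Both anchored patterns require a slash after ups.com, so a bare "ups.com" with no trailing slash would not match either of them.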

Can Regular Expressions be used for the URL patterns in a Fluid app?

URL patterns are very handy in the Fluid app (a site-specific browser for OS X) for applying scripts/styles to certain specified URLs.
As stated on the official website:
In the "Pattern" table below, you should add a pattern for any URL which you want to your Fluid App to visit. Star ("*") is a special character in this table. Star means "match anything here", and is a powerful way to easily include or exclude very large groups of URL patterns.
Can we use characters other than "*", such as "?", to match URLs?
Can regular expressions be used instead?
Developer of Fluid here.
Update: Yes! Now you can use either simple Wildcard Patterns or full Regular Expression Patterns.
In Wildcard Patterns, star ("*") is a special Wildcard character that means "match anything here", and is a powerful way to easily include or exclude very large groups of URL patterns.
Alternatively, you can use full Regular Expression Patterns instead of Wildcard Patterns by wrapping your URL pattern in forward slashes / like: /http://google.com/.+/.
Full details on the Whitelist feature in Fluid are here.
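The two pattern styles described above can be modeled with a short Python sketch. This is only an illustration of the semantics as described, not Fluid's actual implementation: the function name `compile_pattern` is hypothetical, and it assumes a pattern must match the whole URL.

```python
import re


def compile_pattern(pattern):
    """Turn a wildcard or /regex/ style pattern into a compiled regex."""
    if len(pattern) > 2 and pattern.startswith("/") and pattern.endswith("/"):
        # Wrapped in forward slashes: treat the inside as a full regex.
        return re.compile(pattern[1:-1])
    # Otherwise escape everything, then let each star match anything.
    return re.compile(re.escape(pattern).replace(r"\*", ".*"))


def url_matches(pattern, url):
    """True if the whole URL matches the pattern (an assumption of this sketch)."""
    return compile_pattern(pattern).fullmatch(url) is not None


print(url_matches("*google.com*", "http://google.com/mail"))
print(url_matches("/http://google.com/.+/", "http://google.com/mail"))
print(url_matches("/http://google.com/.+/", "http://google.com/"))
```

In this model a "?" in a wildcard pattern is matched literally, since only the star is described as special.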