Regular expression to find last dot in request URI - regex

I am trying to write a regular expression for an Apache virtual host configuration that maps a request only if the URI doesn't have certain extensions. Below is the expression I have written.
^\/bookdata\/.+\.(?!jpg|mp3|mp4|zip|doc|pdf|xls|xlsx).*$
The URI below does not match this expression, which is exactly what I want.
/bookdata/rw0/media/Q2e_00_RW_U08_Q_Classroom.mp3?fd=1
My problem is with the URI below, which does match the expression because it contains two dots.
/bookdata/rw0/media/ELM2_U02_Track06_Chart2.8.mp3?fd=1
Any small help will be appreciated.

Put the negative lookahead right at the start, like so:
^(?!.*\.(?:jpg|mp3|mp4|zip|doc|pdf|xls|xlsx))\/bookdata\/.+$
See a demo on regex101.com.
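To see why moving the lookahead helps, here is a minimal sketch in Python that runs both expressions against the two URIs from the question. Python's `re` module is only a stand-in for Apache's regex engine here, and the `/bookdata/rw0/index.html` URI is a made-up example of a request that should still be mapped.

```python
import re

# Expression from the question: the lookahead is only checked after *some* dot.
ORIGINAL = re.compile(r"^\/bookdata\/.+\.(?!jpg|mp3|mp4|zip|doc|pdf|xls|xlsx).*$")
# Expression from the answer: the lookahead scans the whole URI up front.
FIXED = re.compile(r"^(?!.*\.(?:jpg|mp3|mp4|zip|doc|pdf|xls|xlsx))\/bookdata\/.+$")

one_dot = "/bookdata/rw0/media/Q2e_00_RW_U08_Q_Classroom.mp3?fd=1"
two_dots = "/bookdata/rw0/media/ELM2_U02_Track06_Chart2.8.mp3?fd=1"
other = "/bookdata/rw0/index.html"  # hypothetical URI, not from the question


def matches(pattern, uri):
    """Return True if the pattern matches the URI from its start."""
    return pattern.match(uri) is not None


for uri in (one_dot, two_dots, other):
    print(uri, "original:", matches(ORIGINAL, uri), "fixed:", matches(FIXED, uri))
```

In the original expression the engine can backtrack to the dot in `2.8`, where the lookahead sees `8.mp3` and succeeds. With the lookahead anchored at the start, any occurrence of `.mp3` (or another listed extension) anywhere in the URI rejects the match. Note that this also rejects extensions that merely start with a listed one, such as `.docx`.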

As far as I know, the request URI doesn't contain the request parameters (the part after the question mark).
So you can omit the .* at the end of your regex, and then it will match the URIs you want.
This happens because your regex says that a URI ending in one of those extensions must not match.

Related

Regular expression for any domain followed by a folder

I have the following regular expression which basically returns .domain.com/
^[0-9a-zA-Z_\-.]{1,256}\.domain\.com/
I am looking to change the expression so that it returns any domain with a .com extension that is followed by /js. Can anyone tell me how I can do this?
Thanks
You can try the following:
(?:http:\/\/)(.*?\.com(?=\/js\/))
If you can tell your regex processor to return capture group #1, you'll get any domain preceded by http:// and followed by /js/.
Otherwise (i.e. using this expression as a stand-alone), you'll get the domain including the http://.
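The difference between the whole match and capture group #1 can be sketched in Python as follows. The URL `http://cdn.example.com/js/app.js` is a hypothetical input chosen for illustration, and other regex engines may expose groups differently.

```python
import re

# Pattern from the answer: non-capturing scheme, captured domain,
# and a lookahead that requires /js/ without consuming it.
PATTERN = re.compile(r"(?:http:\/\/)(.*?\.com(?=\/js\/))")

url = "http://cdn.example.com/js/app.js"  # hypothetical example URL
m = PATTERN.search(url)

print(m.group(0))  # whole match, includes the http:// prefix
print(m.group(1))  # capture group #1, the domain only
```

Because `/js/` sits inside a lookahead, it is required but never becomes part of either result.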
Try this one:
^([\w\d_\-]+[\.]?[\w\d\-_]+)+\.com\/js$

Regular expression to match a domain

I want a regular expression for Google Analytics that matches a domain including all of its subdomains.
Say we have to match a domain name called xyz.com.
So I want to match every URL that has xyz.com in it.
Example
abcd.xyz.com, abc1232.xyz, www.xyz.com, www.xyz.com/abc
Can anyone help me with that?
My purpose is to exclude the traffic coming from these sites from my Google Analytics reports.
In general, the regular expression to match those domains would be something like .*\.xyz\.com$. The backslashes escape the dots (which are normally wildcard characters), and the dollar sign represents the end of the string.
There are different regex implementations, so you might have to tweak this for your regex engine.
To exclude subdomains as described above, you can use a GA filter ([Exclude] [Hostname] [Matching RegEx]) along with the regular expression (xyz.com)|(.*.xyz.com).
This RegEx covers both the main domain and its subdomains.
You could try this regex
(.*\.)?xyz\.com
This matches all your required formats for the URL.
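The suggested patterns behave differently on the sample hosts, which a small Python sketch can show. It assumes partial-match ("search") semantics, which may or may not be how a given Google Analytics filter applies the expression. The hosts `xyz.com` and `notxyz.com` are extra test inputs added here for illustration.

```python
import re

# Pattern anchored at the end of the string, requiring a dot before "xyz".
ANCHORED = re.compile(r".*\.xyz\.com$")
# Pattern with an optional subdomain prefix and no end anchor.
OPTIONAL_SUBDOMAIN = re.compile(r"(.*\.)?xyz\.com")

hosts = [
    "abcd.xyz.com",
    "abc1232.xyz",
    "www.xyz.com",
    "www.xyz.com/abc",
    "xyz.com",      # bare domain, added for illustration
    "notxyz.com",   # unrelated domain, added for illustration
]


def hits(pattern):
    """Return the hosts in which the pattern is found anywhere."""
    return [h for h in hosts if pattern.search(h)]


print("anchored:", hits(ANCHORED))
print("optional subdomain:", hits(OPTIONAL_SUBDOMAIN))
```

Under these assumptions the anchored pattern skips the bare domain and anything with a path, while the unanchored one also accepts `notxyz.com`, because the optional group can match nothing and the search can start mid-string. A stricter variant would anchor the start, for example `^(.*\.)?xyz\.com`.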

JMeter Proxy exclusion patterns still being recorded

I am using JMeter to record traffic in my browser. In my URL Patterns to Exclude are:
.*\.jpg,
.*\.js,
.*\.png
These look like they should block the matching files (I've even tested them with a regex tester here).
Yet, I still see plenty of these files get pulled up. In a related forum someone had a similar issue, but his was caused by having additional url parameters afterwards (eg www.website.com/image.jpg?asdf=thisdoesntmatch). However this doesn't seem to be the case here. Can anyone point me in the right direction?
As already mentioned in the question comments, it is probably a problem with trailing characters. The pattern matcher is executed against the complete URL, including parameters.
So a URL such as http://example.com/layout.css?id=123 is not matched by the pattern .*\.css. The JMeter HTTP Request sampler shows the path and the parameters separately, so this might not be obvious when you look at the URL.
Solution: change the pattern to allow trailing characters: .*\.css.*
Explained
.* Any sequence of characters (zero or more)
\. A literal . (dot) character
css The character sequence css
.* Any sequence of trailing characters (zero or more)
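The effect of the trailing .* can be sketched in Python. This assumes, as the answer describes, that the exclusion pattern has to match the complete URL, which `re.fullmatch` imitates. It is not JMeter's own matcher, so treat it as an approximation.

```python
import re

url = "http://example.com/layout.css?id=123"

# Without the trailing .*, the query string prevents a whole-URL match.
strict = re.fullmatch(r".*\.css", url)
# With the trailing .*, anything after ".css" is allowed.
relaxed = re.fullmatch(r".*\.css.*", url)

print("strict matches:", strict is not None)
print("relaxed matches:", relaxed is not None)
```

The relaxed form also excludes URLs where `.css` is followed by arbitrary text rather than a query string. A tighter alternative, again only a suggestion, is `.*\.css(\?.*)?`.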
Maybe you can do the opposite: leave the URL Patterns to Exclude blank and negate those patterns in the URL Patterns to Include box:
(?!.*\.(bmp|css|js|gif|ico|jpe?g|png|swf|woff))(.*)

Regex Expression to Match URL and Exclude Other

I'm trying to write a regex to match anything (.*)/feed/ with the exception of (.*)/author/feed/
Currently, I have (.*)/feed/(.*), which works well to identify any string containing /feed/ to redirect. However, I want to exclude those that have /author/(.*)/feed/
For example - match http://www.site.com/ANYTHING/feed/ but exclude site.com/author/ANYTHING/feed/
I should clarify that I'm not terribly familiar with regular expressions, but this is actually for use within the Redirection plugin for WordPress, which states "Full regular expression support."
Any help would be greatly appreciated. Thank you in advance
Depending on the language, you may be able to use a negative look-behind assertion:
(.*)(?<!/author)/feed
The assertion, (?<!/author), ensures that /author does not match behind the text /feed, but does not count it as being matched.
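A short Python sketch shows what the look-behind does and does not exclude. Python's `re` is used only for illustration, and whether the Redirection plugin's engine supports look-behind is not confirmed here. The second pattern, which uses a negative lookahead instead, is an alternative suggested in this sketch and is not part of the answer above.

```python
import re

# Pattern from the answer: /feed must not be directly preceded by /author.
LOOKBEHIND = re.compile(r"(.*)(?<!/author)/feed")
# Alternative (an assumption of this sketch): reject any URL containing /author/.
LOOKAHEAD = re.compile(r"^(?!.*/author/).*/feed/")

urls = [
    "http://www.site.com/ANYTHING/feed/",
    "site.com/author/feed/",
    "site.com/author/ANYTHING/feed/",
]

for url in urls:
    print(
        url,
        "look-behind:", LOOKBEHIND.search(url) is not None,
        "lookahead:", LOOKAHEAD.search(url) is not None,
    )
```

The look-behind only inspects the text immediately before `/feed`, so it rejects `/author/feed/` but still accepts `/author/ANYTHING/feed/`. The lookahead variant rejects both, at the cost of excluding every URL that contains `/author/` anywhere.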

Regex with URLs - syntax

We're using a proprietary tracking system that requires the use of regular expressions to load third party scripts on the URLs we specify.
I wanted to check the syntax of the regex we're using to see if it looks right.
To match the following URL
/products/18/indoor-posters
We are using this rule:
.*\/products\/18\/indoor-posters.*
Does this look right? Also, if there was a query parameter on the URL, would it still work? e.g.
/products/18/indoor-posters?someParam=someValue
There's another URL to match:
/products
The rule for this is:
.*\/products
Would this match correctly?
Well, "right" is a relative term. Usually, .* is not a good idea because it matches anything, even nothing. So while these regexes will all match your example strings, they'll also match much more. The question is: What are you using the regexes for?
If you only want to check whether those substrings are present anywhere in the string, then they are fine (but then you don't need regex anyway, just check for substrings).
If you want to somehow check whether it's a valid URL, then no, the regexes are not fine because they'd also match foo-bar!$%(§$§$/products/18/indoor-postersssssss)(/$%/§($/.
If you can be sure that you'll always get a correct URL as your input and just want to check whether it matches your pattern, then I'd suggest
^.*\/products$
to match any URL that ends in /products, and
^.*\/products\/18\/indoor-posters(?:\?[\w-]+=[\w-]+)?$
to match a URL that ends in /products/18/indoor-posters with an optional ?name=value bit at the end, assuming only letters, digits, underscores, and hyphens are legal for name and value.
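The difference between the loose rule from the question and the anchored suggestions can be checked with a small Python sketch. Python's `re` stands in for the proprietary tracking system's engine, which may behave differently, and the two-parameter URL is an extra input added for illustration.

```python
import re

# Rule from the question: unanchored, so extra characters are tolerated.
LOOSE = re.compile(r".*\/products\/18\/indoor-posters.*")
# Anchored suggestions from the answer.
PRODUCTS = re.compile(r"^.*\/products$")
POSTERS = re.compile(r"^.*\/products\/18\/indoor-posters(?:\?[\w-]+=[\w-]+)?$")

urls = [
    "/products",
    "/products/18/indoor-posters",
    "/products/18/indoor-posters?someParam=someValue",
    "/products/18/indoor-postersssssss",
    "/products/18/indoor-posters?a=1&b=2",  # hypothetical two-parameter URL
]

for url in urls:
    print(
        url,
        "loose:", LOOSE.search(url) is not None,
        "products:", PRODUCTS.search(url) is not None,
        "posters:", POSTERS.search(url) is not None,
    )
```

The anchored posters pattern allows at most one name=value pair, so a URL with several parameters would need the optional group extended, for instance to accept `&`-separated pairs.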