I'm trying to write a regular expression for a URL that applies to everything except a certain folder. The regex is only applied to the part after the host's '/', so given the URL http://www.blah.com/folder/main/file.html, it is applied only to folder/main/file.html. The expression I want is one that matches whenever there is no 'folder/' in the URL.
You can use negative lookahead. For example this:
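^((?!folder/).)*$
will match anything that does not contain 'folder/'. The lookahead has to come before the dot so that the very first position is checked as well; otherwise a path that begins with 'folder/' would still match.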
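Where the lookahead sits relative to the dot matters. Below is a minimal Python sketch comparing the two placements; the sample paths and variable names are made up for illustration.

```python
import re

# Lookahead tested AFTER each consumed character: position 0 is never checked.
AFTER = re.compile(r"^(?:.(?!folder/))*$")
# Lookahead tested BEFORE each consumed character: every position is checked.
BEFORE = re.compile(r"^(?:(?!folder/).)*$")

# Hypothetical sample paths (the part of the URL after the host's '/').
paths = ["folder/main/file.html", "main/folder/file.html", "main/file.html"]

# Map each path to (matched by AFTER, matched by BEFORE).
results = {p: (bool(AFTER.match(p)), bool(BEFORE.match(p))) for p in paths}

for path, (after_ok, before_ok) in results.items():
    print(f"{path:24} after={after_ok} before={before_ok}")
```

Only the second form rejects a path that starts with 'folder/', which is the case in the question.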
I copied a regular expression from somewhere that looks like this:
^(\*|http|https|file|ftp|ws|wss|data):\/\/(\*|(?:\*\.)?(?:[^*]+))?\/(.*)$
It can match URL patterns like https://www.google.com/ or https://*.google.com/. I also want it to accept https://www.google.*/. How can I change the regex?
^(\*|http|https|file|ftp|ws|wss|data):\/\/(\*|(?:\*\.)?(?:[^*\/]+)(?:\.\*)?)?\/(.*)$
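One possible way to accept a trailing wildcard is to allow an optional `.*` at the end of the host part and to keep `/` out of the host character class. The Python sketch below is an assumption about the intended behaviour, not a definitive pattern; `MATCH_PATTERN` and `host_of` are hypothetical names.

```python
import re
from typing import Optional

# Scheme, optional host (with optional leading "*." and optional trailing ".*"), path.
MATCH_PATTERN = re.compile(
    r"^(\*|http|https|file|ftp|ws|wss|data)://"
    r"(\*|(?:\*\.)?[^*/]+(?:\.\*)?)?"
    r"/(.*)$"
)

def host_of(pattern: str) -> Optional[str]:
    """Return the host part of a URL pattern, or None if the pattern is rejected."""
    m = MATCH_PATTERN.match(pattern)
    return m.group(2) if m else None

for candidate in ("https://www.google.com/", "https://*.google.com/",
                  "https://www.google.*/", "https://www.goo*gle.com/"):
    print(candidate, "->", host_of(candidate))
```

A wildcard in the middle of the host, as in the last candidate, is still rejected.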
What would a regular expression look like that includes or excludes a specific URL? I posted two URLs below, and I need a regex that distinguishes between the two. The only difference between the two URLs is the ending: type vs. hcat.
https://post.craigslist.org/k/WDEDan6W4xGILKcEW036_A/w7TH4?s=type
https://post.craigslist.org/k/WDEDan6W4xGILKcEW036_A/w7TH4?s=hcat
I hope I understood your question correctly.
If you want to match exactly the given URLs, this should do:
"https://post\.craigslist\.org/k/WDEDan6W4xGILKcEW036_A/w7TH4\?s=(type|hcat)"
With this, capture group 1 will contain either type or hcat.
If you only want to check the domain and that the URL ends with the parameter s set to type or hcat, use this:
"https://post\.craigslist\.org/.*?s=(type|hcat)"
Note: here the ? makes the preceding * non-greedy. It is not the escaped \? from above, so this pattern does not require a literal ? before s=.
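As a quick check of the exact-URL pattern, here is a small Python sketch that reads capture group 1. The function name `s_value` is hypothetical, and `fullmatch` is used on the assumption that the whole URL must match.

```python
import re
from typing import Optional

# The exact-URL pattern from above; \? is a literal question mark.
EXACT = re.compile(
    r"https://post\.craigslist\.org/k/WDEDan6W4xGILKcEW036_A/w7TH4\?s=(type|hcat)"
)

def s_value(url: str) -> Optional[str]:
    """Return 'type' or 'hcat' when the whole URL matches, otherwise None."""
    m = EXACT.fullmatch(url)
    return m.group(1) if m else None

base = "https://post.craigslist.org/k/WDEDan6W4xGILKcEW036_A/w7TH4?s="
for suffix in ("type", "hcat", "other"):
    print(suffix, "->", s_value(base + suffix))
```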
I'm looking for some help with a regex pattern for rewriting a URL. My URL structure is:
http://domain.com/[username]/[token]/[userid]/
The data types are:
username = alphanumeric
token = alphanumeric
userid = numeric
An example with data:
http://domain.com/john1975/aBc123/123456789/
Using a regular expression, I'm trying to get a reference to each piece of data so I can rewrite to:
index.asp?username={R:1}&token={R:2}&userid={R:3}
Also keep in mind the regex shouldn't be too greedy, so I can still access files such as:
http://domain.com/about.asp
http://domain.com/images/logo.png
The regex I've tried is:
^[0-9a-z]+/[0-9a-z]+/[0-9]+$
This doesn't match my example URL.
You're missing the trailing forward slash. The regex should be:
^([0-9a-z]+)/([0-9a-z]+)/([0-9]+)/$
I'm assuming you're flagging it as case insensitive. If not then you need
^([0-9a-zA-Z]+)/([0-9a-zA-Z]+)/([0-9]+)/$
You also need the parentheses so you can use your back-references, which are also wrong: you want groups 1, 2 and 3, not 0, which is the match of the whole expression. They should read:
index.asp?username={R:1}&token={R:2}&userid={R:3}
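The capture groups can be checked outside the rewrite engine. The Python sketch below assumes the rule sees the path without the leading slash (as the pattern implies); `rewrite` is a hypothetical helper, and `\1`, `\2`, `\3` stand in for {R:1}, {R:2}, {R:3}.

```python
import re

# Case-insensitive version of the corrected pattern, with three capture groups.
RULE = re.compile(r"^([0-9a-z]+)/([0-9a-z]+)/([0-9]+)/$", re.IGNORECASE)

def rewrite(path: str) -> str:
    """Rewrite username/token/userid/ paths; leave every other path untouched."""
    m = RULE.match(path)
    if m is None:
        return path
    return m.expand(r"index.asp?username=\1&token=\2&userid=\3")

for path in ("john1975/aBc123/123456789/", "about.asp", "images/logo.png"):
    print(path, "->", rewrite(path))
```

Paths such as about.asp and images/logo.png do not have three slash-terminated segments, so they are left alone.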
I'm quite bad with regex, and I'm looking to match a set of criteria.
This regular expression will be embedded in a URL filter for a firewall, so that it blocks any URL that is not like the ones in the list at the end.
This is what I'm currently using, but it's not working:
http://www.youtube.com/(*.*)list=UUFwtOm4N5djdcuTAlNIWJaQ
This is the example url (to be blocked):
http://www.youtube.com/watch?NR=1&feature=fvwp&v=P1b5VY_Bp_o&list=UUFwtOm4N5djdcuTAlNIWJaQ
I'm trying to make a regex that matches successfully only when neither NR=1 nor feature=fvwp is present. I assume I can do it like this: (?!^feature=fvwp$), but v= and list=UUFwtOm4N5djdcuTAlNIWJaQ are allowed.
Also, the v= value should be limited to any character (uppercase and lowercase) and a length of 11. I assume that is: /^[a-z0-9]{11}$/
How can I put all of that together so that it allows and matches only URLs like these, while excluding the criteria I explained above:
http://www.youtube.com/watch?v=4eK_RWpTgcc&feature=BFa&list=UUFwtOm4N5djdcuTAlNIWJaQ
http://www.youtube.com/watch?v=TLRl85TJwZM&feature=BFa&list=UUFwtOm4N5djdcuTAlNIWJaQ
http://www.youtube.com/watch?v=QEV9yqrpxkc&feature=BFa&list=UUFwtOm4N5djdcuTAlNIWJaQ
Can you block based on matching by regex? If so, just use
(.*)www\.youtube\.com/watch\?NR=1&feature=fvwp and block whatever matches that.
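Both approaches can be tried out in Python before they go into the firewall. This is a sketch under assumptions: `BLOCK` is the pattern suggested above without the leading `(.*)`, while `ALLOW` is a hypothetical allow-list pattern that assumes v= comes first, that video IDs use letters, digits, `_` and `-`, and that the firewall's regex engine supports lookaheads.

```python
import re

# Block approach: anything that matches is rejected.
BLOCK = re.compile(r"www\.youtube\.com/watch\?NR=1&feature=fvwp")

# Hypothetical allow approach: 11-character v=, the fixed list=, and neither
# NR=1 nor feature=fvwp anywhere in the query string.
ALLOW = re.compile(
    r"^http://www\.youtube\.com/watch\?"
    r"(?!.*\bNR=1\b)(?!.*\bfeature=fvwp\b)"
    r"v=[A-Za-z0-9_-]{11}&.*list=UUFwtOm4N5djdcuTAlNIWJaQ$"
)

blocked_url = ("http://www.youtube.com/watch?NR=1&feature=fvwp"
               "&v=P1b5VY_Bp_o&list=UUFwtOm4N5djdcuTAlNIWJaQ")
allowed_urls = [
    "http://www.youtube.com/watch?v=4eK_RWpTgcc&feature=BFa&list=UUFwtOm4N5djdcuTAlNIWJaQ",
    "http://www.youtube.com/watch?v=TLRl85TJwZM&feature=BFa&list=UUFwtOm4N5djdcuTAlNIWJaQ",
    "http://www.youtube.com/watch?v=QEV9yqrpxkc&feature=BFa&list=UUFwtOm4N5djdcuTAlNIWJaQ",
]

print("blocked URL hits BLOCK:", bool(BLOCK.search(blocked_url)))
print("blocked URL passes ALLOW:", bool(ALLOW.match(blocked_url)))
for url in allowed_urls:
    print(url, "passes ALLOW:", bool(ALLOW.match(url)))
```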
I need to match all valid URLs except:
http://www.w3.org
http://w3.org/foo
http://www.tempuri.org/foo
Generally, all URLs except certain domains.
Here is what I have so far:
https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?
will match URLs that are close enough to my needs (but in no way all valid URLs!) (thanks, http://snipplr.com/view/2371/regex-regular-expression-to-match-a-url/!)
https?://www\.(?!tempuri|w3)\S*
will match all URLs with www., but not in the tempuri or w3 domain.
And I really want
https?://([-\w\.]+)(?!tempuri|w3)\S*
to work, but as far as I can tell it matches every http:// string.
Gah, I should just do this in something higher up the Chomsky hierarchy!
The following regular expression:
https?://(?!w3|tempuri)([-\w]*\.)(?!w3|tempuri)\S*
only matches the first four lines from the following excerpt:
https://ok1.url.com
http://ok2.url.com
https://not.ok.tempuri.com
http://not-ok.either.w3.com
http://no1.w3.org
http://no2.w3.org
http://tempuri.bla.com
http://no4.tempuri.bla
http://no3.tempuri.org
http://w3.org/foo
http://www.tempuri.org/foo
I know what you're thinking, and the answer is that in order to match the above list and only return the first two lines you'd have to use the following regular expression:
https?://(?!w3|tempuri)([-\w]*\.)(?!w3|tempuri)([-\w]*\.)(?!w3|tempuri)\S*
which, in truth, is nothing more than a slight modification of the first regular expression, where the
(?!w3|tempuri)([-\w]*\.)
part appears twice in a row.
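The two expressions can be run against the excerpt to confirm which lines they accept. This is a small Python sketch; `ONE`, `TWO` and `URLS` are names chosen here for illustration.

```python
import re

# One guarded label, then two guarded labels, as in the two expressions above.
ONE = re.compile(r"https?://(?!w3|tempuri)([-\w]*\.)(?!w3|tempuri)\S*")
TWO = re.compile(
    r"https?://(?!w3|tempuri)([-\w]*\.)(?!w3|tempuri)([-\w]*\.)(?!w3|tempuri)\S*"
)

URLS = [
    "https://ok1.url.com",
    "http://ok2.url.com",
    "https://not.ok.tempuri.com",
    "http://not-ok.either.w3.com",
    "http://no1.w3.org",
    "http://no2.w3.org",
    "http://tempuri.bla.com",
    "http://no4.tempuri.bla",
    "http://no3.tempuri.org",
    "http://w3.org/foo",
    "http://www.tempuri.org/foo",
]

matched_one = [u for u in URLS if ONE.fullmatch(u)]
matched_two = [u for u in URLS if TWO.fullmatch(u)]

print("one guarded label :", matched_one)
print("two guarded labels:", matched_two)
```

Each repetition of the guarded part checks one more host label, so the number of repetitions limits how deep into the host name an excluded name is detected.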
Your regular expression wasn't working because the . is included inside the repeated group. That means the group can match not only this. and this.this. but also this.this.th; in other words, it doesn't have to end on a dot, so the engine ends it wherever it needs to in order to make the overall expression match. Try it out in a regular expression tester and you'll see what I mean.
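A short Python sketch makes the effect visible: in the expression from the question, the character class swallows the excluded name before the lookahead is ever tested. `ASKER` and `swallowed_host` are hypothetical names.

```python
import re

# The expression from the question: the dot is inside the repeated class.
ASKER = re.compile(r"https?://([-\w\.]+)(?!tempuri|w3)\S*")

def swallowed_host(url: str) -> str:
    """Return what capture group 1 consumed before the lookahead was tested."""
    m = ASKER.match(url)
    return m.group(1) if m else ""

for url in ("http://www.tempuri.org/foo", "http://w3.org/foo"):
    print(url, "-> group 1 =", swallowed_host(url))
```

By the time the lookahead runs, the current position is already past the host name, so it never sees tempuri or w3.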