regexing url parameters IIS - regex

I have reviewed a couple of questions on regexing URL parameters, but none of them seems to address my specific issue. I have been trying to work out the correct regex pattern on www.regex101.com without success. I have a URL whose parameters are separated by /'s. I am able to regex one parameter at a time, but I would ideally like a single pattern that extracts all of the parameters. So far this is what I have:
\/([a-zA-z]+)\/([a-zA-z]+)\/([a-zA-z-]+)\/
The url that I am trying to modify is:
www.mydomain.com/firstparameter/secondparameter/hyphenated-url-parameter/
The above pattern works for this example, but I need it to also work for these two examples:
www.mydomain.com/firstparameter/secondparameter/
www.mydomain.com/firstparameter/
Is it even possible to write a single regex that can extract the parameters from each example above?

Try this regex: \/([a-zA-Z]+)\/(?:(?:([a-zA-Z]+)\/)?([a-zA-Z-]+)\/)?
Details:
? Quantifier: matches between zero and one times, as many times as possible
Note that the range is written A-Z here; the A-z range in the original pattern also matches the characters [ \ ] ^ _ and `.
This assumes there is at least one parameter and at most three.
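As a quick check, the optional-group pattern above can be exercised against the three example URLs. This is a minimal Python sketch, not part of the original answer; the function name extract_parameters is hypothetical, and the character ranges are written as A-Z. With only two parameters, the second one lands in the third capture group, so the sketch simply drops the empty groups.

```python
import re

# Hypothetical helper; the pattern follows the answer above, with A-Z ranges.
PARAMS = re.compile(r"/([a-zA-Z]+)/(?:(?:([a-zA-Z]+)/)?([a-zA-Z-]+)/)?")


def extract_parameters(url):
    """Return the one to three path parameters found in url, in order."""
    match = PARAMS.search(url)
    if match is None:
        return []
    # Optional groups that did not take part in the match come back as None.
    return [group for group in match.groups() if group is not None]


if __name__ == "__main__":
    for example in (
        "www.mydomain.com/firstparameter/secondparameter/hyphenated-url-parameter/",
        "www.mydomain.com/firstparameter/secondparameter/",
        "www.mydomain.com/firstparameter/",
    ):
        print(extract_parameters(example))
```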

This should work for any number of parameters:
\/([\w-]+)
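A minimal Python sketch of this repeated-match approach, assuming the character class is written as [\w-] and that the input has no scheme prefix such as http:// (otherwise part of the host name would also be returned as a match). The function name path_segments is hypothetical.

```python
import re

# One capture per "/segment"; findall returns every captured segment in order.
SEGMENT = re.compile(r"/([\w-]+)")


def path_segments(url):
    """Return all slash-separated parameters that follow the host name."""
    return SEGMENT.findall(url)


if __name__ == "__main__":
    print(path_segments("www.mydomain.com/firstparameter/secondparameter/hyphenated-url-parameter/"))
    print(path_segments("www.mydomain.com/firstparameter/"))
```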

Related

APIGEE - Regular Expression Not Working in Condition

I am trying to use a condition to catch the case in which the query string of a request contains two or more parameters from a specific list. In such a case I wish to raise an error.
Of course, I can use many "and" and "or" clauses, but that will get very messy very quickly as the size of the list of parameters increases. So instead, I opted to use a regex to test for this.
As an example, if the list of parameters is [Bird,Dog,Horse], then any request that has two or more of these parameters in its query string should be matched.
The regular expression I am using is:
/(.*(Bird|Dog|Horse).*){2}
I tested it in various regex testers and it works.
However, when I use the condition:
request.querystring Matches "/(.*(Bird|Dog|Horse).*){2}"
I never get a match.
Am I missing some specific APIGEE regex rules? Maybe the "{2}" is not supported in APIGEE? Thank you very much!!
Adam
The problem was that I used "Matches" instead of "JavaRegex".
I had tried "JavaRegex" before, but it also didn't work. The second problem was the "/" at the beginning of the pattern, which is not needed with "JavaRegex".
https://community.apigee.com/questions/65080/regular-expression-not-working-in-condition.html?childToView=65113#answer-65113
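Outside Apigee, the pattern itself can be sanity-checked in a few lines. The sketch below uses Python's re module rather than Apigee's JavaRegex operator, so it illustrates only the pattern and not Apigee's matching rules. It assumes the intended pattern has .* on both sides of the alternation and no leading slash; the function name is hypothetical.

```python
import re

# Assumed pattern: two repetitions of "anything, a listed name, anything".
TWO_OR_MORE = re.compile(r"(.*(Bird|Dog|Horse).*){2}")


def has_conflicting_params(query_string):
    """True when at least two of the listed names occur in the query string."""
    return TWO_OR_MORE.search(query_string) is not None


if __name__ == "__main__":
    print(has_conflicting_params("Bird=1&Dog=2"))
    print(has_conflicting_params("Bird=1&Cat=2"))
```

The names are matched as plain substrings, so the same name appearing twice, or a longer parameter name that merely contains one of them, also counts.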

Regular expression to check path of url as well as specific parameters

I have URLs like the following:
/home/lead/statusupdate.php?callback=jQuery211010657244874164462_1455536082020&ref=e13ec8e3-99a8-411c-be50-7e57991d7acb&status=5&_=1455536082021
I would like a regular expression for my Google Analytics goal that checks that the request URI is /home/lead/statusupdate.php and that both the ref and status parameters are present, regardless of the order in which they are passed and regardless of any extra parameters, because I only care about those two. I have looked at these examples:
How to say in RegExp "contain this too"? and Regular Expressions: Is there an AND operator? but I can't seem to adapt the examples given there to work.
I'm using this online tool to test: http://www.regexr.com/ (perhaps the tool is the buggy one? I'll try in JavaScript in the meantime).
You can try:
\/home\/lead\/statusupdate\.php\?(ref=|.*(&ref=)).*(&status=)
If the order does not matter, then add the opposite:
\/home\/lead\/statusupdate\.php\?(status=|.*(&status=)).*(&ref=)
All put together:
\/home\/lead\/statusupdate\.php\?(((ref=|.*(&ref=)).*(&status=))|((status=|.*(&status=)).*(&ref=)))
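To check the combined pattern before pasting it into a goal, something like the following Python sketch could be used. Google Analytics has its own regex engine, so this only confirms the logic of the pattern; the shortened test URLs and the function name is_goal_hit are made up for illustration.

```python
import re

# The combined pattern from above; forward slashes need no escaping in Python.
GOAL = re.compile(
    r"/home/lead/statusupdate\.php\?"
    r"(((ref=|.*(&ref=)).*(&status=))|((status=|.*(&status=)).*(&ref=)))"
)


def is_goal_hit(request_uri):
    """True when the URI is the status page and carries both ref and status."""
    return GOAL.search(request_uri) is not None


if __name__ == "__main__":
    print(is_goal_hit("/home/lead/statusupdate.php?status=5&ref=abc"))
    print(is_goal_hit("/home/lead/statusupdate.php?ref=abc"))
```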
Try:
(\/home\/lead\/statusupdate\.php\?A)|(\/home\/lead\/statusupdate\.php\?B)|(\/home\/lead\/statusupdate\.php\?C)|(\/home\/lead\/statusupdate\.php\?D)|(\/home\/lead\/statusupdate\.php\?E)|(\/home\/lead\/statusupdate\.php\?F)
Note that A, B, C, D, E and F here are placeholders for different orderings of the 'callback', 'ref', 'status' and '_' parameter strings.
Not really elegant, but this works:
\/home\/lead\/statusupdate\.php(.*(ref|status)){2}
It looks for /home/lead/statusupdate.php followed by two occurrences of any run of characters ending in ref or status. Admittedly, this would also match a URL that contains ref twice or status twice.

Define regular expression that matches urls that end with digits unless anything else comes after

I'm using Scrapy to scrape a web site. I'm stuck at defining properly the rule for extracting links.
Specifically, I need help to write a regular expression that allows urls like:
https://discuss.dwolla.com/t/the-dwolla-reflector-is-now-open-source/1352
https://discuss.dwolla.com/t/enhancement-dwolla-php-updated-to-2-1-3/1180
https://discuss.dwolla.com/t/updated-java-android-helper-library-for-dwollas-api/108
while forbidding urls like this one
https://discuss.dwolla.com/t/the-dwolla-reflector-is-now-open-source/1352/12
In other words, I want URLs that end with digits (i.e., /1352 in the example above), unless anything follows those digits (i.e., /12 in the example above).
I am by no means an expert in regular expressions, and I could only come up with something like \/(\d+)$, or even this one ^https:\/\/discuss.dwolla.com\/t\/\S*\/(\d+)$, but both fail at excluding the unwanted URLs, since they all capture the last digits in the address.
--- UPDATE ---
Sorry for not being clear in the first place. This addition is to clarify that the digits at the end of the URLs can change, so the /1352 is not fixed. As such, another example of a URL to be accepted is:
https://discuss.dwolla.com/t/updated-java-android-helper-library-for-dwollas-api/108
This is probably the simplest way:
[^\/\d][^\/]*\/\d+$
or to restrict to a particular domain:
^https?:\/\/discuss\.dwolla\.com\/.*[^\/\d][^\/]*\/\d+$
This regex requires the last part to be all digits, and the 2nd last part to have at least 1 non-digit.
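The behaviour described above can be confirmed with a short Python sketch that runs the simpler pattern against the URLs from the question. The function name is_topic_url is hypothetical, and the pattern is used here with plain re.search rather than inside a Scrapy rule.

```python
import re

# Last path part is all digits; the part before it has at least one non-digit.
TOPIC_URL = re.compile(r"[^/\d][^/]*/\d+$")


def is_topic_url(url):
    """True for .../some-title/1352, False for .../some-title/1352/12."""
    return TOPIC_URL.search(url) is not None


if __name__ == "__main__":
    print(is_topic_url("https://discuss.dwolla.com/t/the-dwolla-reflector-is-now-open-source/1352"))
    print(is_topic_url("https://discuss.dwolla.com/t/the-dwolla-reflector-is-now-open-source/1352/12"))
```

In a Scrapy spider, a pattern like this would typically be passed through the allow argument of a LinkExtractor.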
Here is a regex in Java string-literal style that may fit your requirements. If you expect a fixed number of digits N, replace the + after \\d with {N}:
^https://discuss\\.dwolla\\.com/t/[\\w-]+/\\d+$

need regular expression to match dynamic url to setup goal in Google Analytics

I need to match a complete dynamic URL to set up as a goal in Google Analytics. I don't know how to do that, and searching on Google has not helped.
Here is the case.
When the enter button is pressed, the goal URL differs depending on the product selected.
Example:
http://www.somesite.com/footwear/mens/hiking-boots/atmosphere-boot-p7023.aspx?cl=BLACK
http://www.somesite.com/womens/clothing/waterproof-jackets/canyon-womens-long-jacket-p7372.aspx?cl=KHAKI
http://www.somesite.com/travel/accessories/mosquito-nets/mosquito-net-double-p5549.aspx?cl=WHITE
http://www.somesite.com/ski/accessories/ski-socks-tubes/ski-socks-p2348.aspx?cl=BLACK
If you look closely at the URL, you can see that it has the following parts:
http://www.somesite.com/{ 1st part }/{ 2nd part }/{3rd part }/{ page URL }/{ querystring param}
So if I manually change the product code in the page URL part, e.g. p2348 to p1234, the website redirects to the proper page:
http://www.somesite.com/kids/clothing/padded-down-jackets/khuno-kids-padded-jacket-p1234.aspx?cl=BLUE
Please help with a regular expression that matches those 4 digits while the p remains, or one that matches any text/number in the three path parts followed by the 4-digit product code.
You should try this regex. It is the simplest one that works.
p\d{4}
This will match strings like p7634, p7351, p0872.
If you are not sure there will always be exactly 4 digits, use the following regex.
p\d+
This one will match strings like p43, p9165, p012, p456897689 and others.
Try
p[0-9][0-9][0-9][0-9]\.aspx
if there are always 4 digits after the p.
Your attempt
[^p]\d[0-9][0-9]
does not work because [^p] matches anything except for p, and \d[0-9][0-9] matches only three digits instead of four.
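A minimal Python sketch of the p-plus-four-digits idea, run against the example URLs from the question. Google Analytics evaluates the regex itself, so this is only a way to try the pattern locally; the capture group and the function name product_code are additions made for illustration.

```python
import re

# "p", exactly four digits (captured), then the literal ".aspx".
PRODUCT_PAGE = re.compile(r"p(\d{4})\.aspx")


def product_code(url):
    """Return the 4-digit product code from a product page URL, or None."""
    match = PRODUCT_PAGE.search(url)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(product_code("http://www.somesite.com/footwear/mens/hiking-boots/atmosphere-boot-p7023.aspx?cl=BLACK"))
    print(product_code("http://www.somesite.com/footwear/mens/"))
```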

Regular expressions matching date formats and URLs

Hi, I want to set up a regular expression that allows dates to be entered like this:
01/01/1900 or 01/01/70. I have the following, but I am not sure how to make it accept either 4 or 2 digits at the end.
^([1-9]|0[1-9]|1[012])[- /.]([1-9]|0[1-9]|[12][0-9]|3[01])[- /.][0-9]{4}$
The other one I would like to know about is for URLs.
I have no idea how to make it match correct URLs.
Thank you
This should match two- or four-digit numbers:
\d{2}(\d{2})?
Your full regex would be something like this:
^([1-9]|0[1-9]|1[012])[- /.]([1-9]|0[1-9]|[12][0-9]|3[01])[- /.]\d{2}(\d{2})?$
URLs are hard to test. http://localhost is a valid URL and so is https://test.example.co.uk:443/index.ece?foo=bar. I would look for something in your language to test this for you, or do a very simple test like this (you will have to escape some special characters depending on the regex engine you use):
^https?://
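As an illustration of letting the language do the work, here is a minimal sketch that assumes Python and its standard urllib.parse module. It only checks for an http or https scheme and a non-empty host, which is a loose test rather than full validation; the function name is hypothetical.

```python
from urllib.parse import urlparse


def looks_like_http_url(text):
    """Loose check: http(s) scheme plus a non-empty network location."""
    try:
        parts = urlparse(text)
    except ValueError:
        # urlparse rejects a few malformed inputs, e.g. a broken IPv6 host.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


if __name__ == "__main__":
    print(looks_like_http_url("http://localhost"))
    print(looks_like_http_url("not a url"))
```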
To modify your regex so that it takes either 2 or 4 digits at the end, you can try this:
^([1-9]|0[1-9]|1[012])[- /.]([1-9]|0[1-9]|[12][0-9]|3[01])[- /.]([0-9]{4}|[0-9]{2})$
For URLs, you can try (from here):
(http|https)://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?
or have a look at this S.O. question.
^([1-9]|0[1-9]|1[012])[- /.]([1-9]|0[1-9]|[12][0-9]|3[01])[- /.]([0-9]{4}|[0-9]{2})$
Well, is ([0-9]{4}|[0-9]{2}) not good enough for you? You could add a check that the first two digits of the four-digit group are 19 or 20, but that depends on your needs.
As for URL matching, look here. There are many of them with tests.
You can use another alternation at the end to accept 2 or 4 digits (the same way you do the "or" options for the other date parts). Alternatively, you can require 2 digits in the last position, followed by 2 optional digits.
Unless you need to capture the individual parts (day, month, year), you should use non-capturing parentheses, like this (?:) (that's the .NET syntax).
Finally, you should consider the type of validation that you are trying to achieve with this. It is probably better to enforce the format, and not worry about bad forms like 91/73/9004 because even with what you have you can still get invalid dates, like 02/31/2011. Since you probably have to perform further validation, why not simplify the regex to something like ^(?:\d{1,2}[-/.]){2}\d{2}(?:\d{2})?$
As for URLs, stackoverflow is littered with duplicate questions about this.
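The suggestion in the last answer (a loose shape check followed by real date validation) could be sketched in Python as follows. The month/day/year order and the function name is_valid_date are assumptions; two-digit years are interpreted by strptime's %y rules.

```python
import re
from datetime import datetime

# Shape only: two 1-2 digit parts, then a 2 or 4 digit year.
DATE_SHAPE = re.compile(r"^(?:\d{1,2}[-/.]){2}\d{2}(?:\d{2})?$")


def is_valid_date(text):
    """True when text has the expected shape and names a real calendar date."""
    if not DATE_SHAPE.match(text):
        return False
    month, day, year = re.split(r"[-/.]", text)
    # Pick the year directive that matches the number of digits entered.
    fmt = "%m/%d/%Y" if len(year) == 4 else "%m/%d/%y"
    try:
        datetime.strptime(f"{month}/{day}/{year}", fmt)
    except ValueError:
        # Right shape, but not a real date (for example 02/31/2011).
        return False
    return True


if __name__ == "__main__":
    for candidate in ("01/01/1900", "01/01/70", "02/31/2011", "91/73/9004"):
        print(candidate, is_valid_date(candidate))
```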