For all the regex experts out there! I'm trying to figure out how to group my url into parts using regular expressions.
Example:
site.com/user/account/info/settings
I want to capture the user/account/info part of the URL, NOT /settings.
Can anyone take this challenge and be kind enough to help me out? Thanks!
If you want to get the beginning of the URL try this:
(\/.*\/(?!.*\/.+))
Input:
site.com/foo/remove-me/
site.com/user/account/info/settings
site.com/foo/bar/remove-me
site.com/foo/remove-me?param1=true&param2=hello+world
Output:
/foo/remove-me/
/user/account/info/
/foo/bar/
/foo/
https://regex101.com/r/yI5rG4/2
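To try the pattern outside regex101, here is a minimal Python sketch. The helper name `leading_path` is made up for illustration, and the sample URLs are the ones from the input list that do not end in a slash.

```python
import re

# The pattern from the answer: from the first slash through the last slash,
# for inputs that do not end in a slash.
PREFIX = re.compile(r"(\/.*\/(?!.*\/.+))")

def leading_path(url):
    """Return the captured prefix, or None if the pattern does not match."""
    m = PREFIX.search(url)
    return m.group(1) if m else None

for url in [
    "site.com/user/account/info/settings",
    "site.com/foo/bar/remove-me",
    "site.com/foo/remove-me?param1=true&param2=hello+world",
]:
    print(url, "->", leading_path(url))
```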
After reading the comments under your post, I understand that you want the last segment, so that you can extract the controller name. In that case, try this:
(?:\/(?!.*\/.+))([^\?\n]*)
Used on these inputs:
site.com/foo/remove-me/
site.com/user/account/info/settings
site.com/foo/bar/remove-me
site.com/foo/remove-me?param1=true&param2=hello+world
Output for group 1:
remove-me/
settings
remove-me
remove-me
Test here: https://regex101.com/r/kR5tX6/2
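The same check can be run locally. This is a small Python sketch that takes only the first match per URL; `last_segment` is a hypothetical helper name, not something from the question.

```python
import re

# The pattern from the answer: find the last slash that still has a segment
# after it, then capture everything up to a "?" or a newline.
LAST_SEGMENT = re.compile(r"(?:\/(?!.*\/.+))([^\?\n]*)")

def last_segment(url):
    """Return group 1 of the first match, or None if nothing matches."""
    m = LAST_SEGMENT.search(url)
    return m.group(1) if m else None

for url in [
    "site.com/foo/remove-me/",
    "site.com/user/account/info/settings",
    "site.com/foo/bar/remove-me",
    "site.com/foo/remove-me?param1=true&param2=hello+world",
]:
    print(url, "->", last_segment(url))
```

Note that a trailing slash stays in the captured text (remove-me/), so it may need to be stripped before the value is used as a controller name.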
I am trying to create a PCRE regex that will sanitize URLs containing multiple consecutive slashes, like the following:
https://www.domin.com/test1/////test2/somemoretests_67142
https://www.domin.com/test1/test2/somemoretests_67142/////
https://www.domin.com/test1/test2///somemoretests_67142
I want to replace the match with https://\2\4 so that the resulting link looks like this: https://www.domin.com/test1/test2/somemoretests_67142
I have been struggling with it for the past couple of days, so any regex guru help is more than welcome :)
I have tried the following and more:
(http|https):\/\/(.*)(\/\/+)(.*)
(http|https):\/\/(.*)(\/\/){2,}(.*)
(http|https):\/\/(.*)(\/\/{2})(.*)
I am going to use these with Akamai to sanitize our URLs through a Cloudlet.
You can try:
(?<!https:\/)(?<!http:\/)(\/+$|(?<=\/)\/+)
Then replace each match (the first group) with an empty string.
Regex demo.
This will produce this output:
https://www.domin.com/test1/test2/somemoretests_67142
https://www.domin.com/test1/test2/somemoretests_67142
https://www.domin.com/test1/test2/somemoretests_67142
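The substitution can be checked with a short Python sketch. It only exercises the pattern in Python's `re` module; whether the Akamai Cloudlet engine accepts the same lookbehinds is not verified here, and `collapse_slashes` is a made-up helper name.

```python
import re

# The pattern from the answer: skip the "//" after the scheme, then match
# either a run of trailing slashes or any slashes that follow another slash.
EXTRA_SLASHES = re.compile(r"(?<!https:\/)(?<!http:\/)(\/+$|(?<=\/)\/+)")

def collapse_slashes(url):
    """Remove redundant and trailing slashes by replacing matches with ''."""
    return EXTRA_SLASHES.sub("", url)

for url in [
    "https://www.domin.com/test1/////test2/somemoretests_67142",
    "https://www.domin.com/test1/test2/somemoretests_67142/////",
    "https://www.domin.com/test1/test2///somemoretests_67142",
]:
    print(collapse_slashes(url))
```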
For Matomo outgoing link tracking I need a regex pattern that matches the following URLs:
https://www.example.com/product/?sku=12345&utm_source=123456789
and
https://www.example.com/product/?utm_source=123456789
"https://www.example.com/" and "utm_source=123456789" are always fixed in the URL; only the "product/" or "category/product/" part changes, and that part must be covered by the regex pattern.
Thanks
Maybe this example can help you reach your goal:
(?<=https:\/\/www\.example\.com\/).+(?=utm_source=123456789)
It matches any characters between these two fixed strings, using a lookbehind for the first and a lookahead for the second:
https://www.example.com/
utm_source=123456789
Given the examples:
https://www.example.com/product/?sku=12345&utm_source=123456789
https://www.example.com/product/?utm_source=123456789
Your matches would be:
product/?sku=12345&
product/?
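As a quick check, the lookaround pattern can be run in Python. This is only a sketch: `variable_part` is a hypothetical helper, and the third URL (with category/product/) is constructed from the question's description, not taken from real data.

```python
import re

# Lookbehind for the fixed host, lookahead for the fixed utm_source parameter;
# only the text between them is part of the match.
BETWEEN = re.compile(
    r"(?<=https:\/\/www\.example\.com\/).+(?=utm_source=123456789)"
)

def variable_part(url):
    """Return the text between the two fixed strings, or None."""
    m = BETWEEN.search(url)
    return m.group(0) if m else None

for url in [
    "https://www.example.com/product/?sku=12345&utm_source=123456789",
    "https://www.example.com/product/?utm_source=123456789",
    "https://www.example.com/category/product/?utm_source=123456789",
]:
    print(variable_part(url))
```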
I need to fix my url pattern:
/^((http(s)?(\:\/\/)){1}(www\.)?([\w\-\.\/])*(\.[a-zA-Z]{2,4}\/?)[^\\\/#?])[^\s\b\n|]*[^\.,;:\?\!\#\^\$ -]/
I thought this regex was OK, but it does not work for URLs like https://xx.xx (without www). The 'www' part should be optional because of (www\.)?. Where is the bug?
The problem is not in the (www\.)? part but in the parts after it.
Take a look at the [^\\\/#?] and the [^\.,;:\?\!\#\^\$ -] parts.
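Each of these two character classes has to consume one character. So, for this pattern, a valid URL is https://xx.xx followed by one character that is not \, /, # or ?, and then one character that is none of .,;:?!#^$ and not a space or a hyphen. Adding two such characters makes the URL match, for example https://xx.xx11.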
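The effect is easy to reproduce. Below is a minimal Python sketch with the question's pattern copied as is (without the surrounding / delimiters); the name `URL_PATTERN` and the www test URL are illustrative assumptions.

```python
import re

# The question's pattern, split over two lines for readability only.
URL_PATTERN = re.compile(
    r"^((http(s)?(\:\/\/)){1}(www\.)?([\w\-\.\/])*(\.[a-zA-Z]{2,4}\/?)[^\\\/#?])"
    r"[^\s\b\n|]*[^\.,;:\?\!\#\^\$ -]"
)

def is_match(url):
    """True if the pattern matches at the start of the string."""
    return URL_PATTERN.match(url) is not None

# No characters are left after ".xx" for the two trailing classes.
print(is_match("https://xx.xx"))      # False
# Two extra characters satisfy the trailing classes.
print(is_match("https://xx.xx11"))    # True
# With "www" the engine backtracks: "www" is treated as the host,
# ".xx" as the TLD, and the rest feeds the trailing classes.
print(is_match("https://www.xx.xx"))  # True
```

This also suggests why the www variant appeared to work: the extra label gives the pattern enough characters to backtrack into, not because (www\.)? behaves differently.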
I advise you not to write your own URL regex, because it is easy to miss a lot of cases!
For example, TLDs like .amsterdam are valid but longer than the four letters your pattern allows. And why are you capturing so many groups?
(Image: a diagram of your regex, made with https://www.debuggex.com/)
My URL is http://example.com/locate/ny/2
In functions, I use the code below:
$wp_rewrite->add_rule('locate/([^/]+)','index.php?page_id=294&cs=$matches[1]','top');
With this I get URLs like http://example.com/locate/ny, and that works. Now I want to add pagination after ny, so that ny?cpaged=3 is written as ny/3.
What is the regexp that maps the URL http://example.com/locate/ny/2 to index.php?page_id=294&cs=$matches[1]&cpaged=$matches[2]?
You need to add another capturing group to the regex that picks out just the digits from the URL. Assuming your URL structure isn't going to change, this regex should work.
$wp_rewrite->add_rule('locate\/([^\/]+)\/(\d*)','index.php?page_id=294&cs=$matches[1]&cpaged=$matches[2]','top');
See here for a demo and to play around with it further: https://regex101.com/r/BNkZBo/1/
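To see what the two capture groups pick up, the matching step can be imitated in Python. This is only an illustration of the regex, not WordPress code: the `rewrite` helper is hypothetical, and it assumes the rule is matched against the request path without a leading slash.

```python
import re

# Same regex as in the add_rule call, anchored at the start of the path.
RULE = re.compile(r"^locate\/([^\/]+)\/(\d*)")

def rewrite(path):
    """Build the target query string from the two capture groups."""
    m = RULE.match(path)
    if m is None:
        return None
    # group(1) plays the role of $matches[1], group(2) of $matches[2]
    return f"index.php?page_id=294&cs={m.group(1)}&cpaged={m.group(2)}"

print(rewrite("locate/ny/2"))  # index.php?page_id=294&cs=ny&cpaged=2
print(rewrite("locate/ny"))    # None
```

Since locate/ny without a page number does not match this pattern, the original one-segment rule would presumably still be needed for the first page.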
I want to match the shop name in a URL. It's for URL redirection in a WordPress application.
See the examples given below:
http://example.com/outlets/19-awok?page=2
http://example.com/outlets/19-awok
http://example.com/outlets/159-awok?page=3
In all cases I need to get only awok from the URL. It is the text that comes after the '-' and before the query string.
I tried the rule below, and it's not working:
/outlets/(\d+)-(.*)? => /shop/$2
Any help will be greatly appreciated.
You can use this regex:
/outlets/\d+-([^?]+)?
The trailing ? is used to strip the previous query string.
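A small Python sketch shows what the capture group returns for the three sample URLs. The `shop_path` helper is hypothetical, and the /shop/ target with the first group is an assumption: the answer's regex has only one capture group, so the shop name would be $1 in place of the question's $2.

```python
import re

# The regex from the answer: digits, a hyphen, then everything up to a "?".
SHOP = re.compile(r"/outlets/\d+-([^?]+)?")

def shop_path(url):
    """Return the redirect target for an outlet URL, or None if no name is found."""
    m = SHOP.search(url)
    if m is None or m.group(1) is None:
        return None
    return "/shop/" + m.group(1)

for url in [
    "http://example.com/outlets/19-awok?page=2",
    "http://example.com/outlets/19-awok",
    "http://example.com/outlets/159-awok?page=3",
]:
    print(shop_path(url))
```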