Regular expression to match string from url - regex

I want to match the shop name from a URL. It's for URL redirection in a WordPress application.
See the examples below:
http://example.com/outlets/19-awok?page=2
http://example.com/outlets/19-awok
http://example.com/outlets/159-awok?page=3
In all cases I need to get only awok from the URL. It is the text that comes after the '-' and before the query string.
I tried the rule below, but it's not working:
/outlets/(\d+)-(.*)? => /shop/$2
Any help will be greatly appreciated.

You can use this regex:
/outlets/\d+-([^?]+)?
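Here [^?]+ captures everything up to (but not including) the first ?, so the query string is left out of the capture, and the trailing ? makes the group optional. In your attempt, (.*) is greedy and swallows the query string as well. Note that the shop name is now in group 1 ($1), not $2.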
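As a quick check of the pattern above, here is a minimal Python sketch run against the three example URLs from the question. The helper names shop_name and redirect_target are made up for illustration, and the /shop/ target is taken from the rule in the question; the redirection plugin itself may treat the query string differently.

```python
import re

# Pattern from the answer: digits, a hyphen, then everything up to the first "?".
SHOP_RE = re.compile(r"/outlets/\d+-([^?]+)")


def shop_name(url):
    """Return the shop name captured in group 1, or None if the URL does not match."""
    match = SHOP_RE.search(url)
    return match.group(1) if match else None


def redirect_target(url):
    """Build the /shop/<name> target (hypothetical helper mirroring '/shop/$1')."""
    name = shop_name(url)
    return "/shop/" + name if name else None


if __name__ == "__main__":
    for url in (
        "http://example.com/outlets/19-awok?page=2",
        "http://example.com/outlets/19-awok",
        "http://example.com/outlets/159-awok?page=3",
    ):
        print(url, "->", redirect_target(url))
```

All three URLs map to the same target because the character class [^?] stops before the query string.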

Related

Regex ignore first 12 characters from string

I'm trying to create a custom filter in Google Analytics to remove the query parts of the URL which I don't want to see. The URL has the following structure:
[domain]/?p=899:2000:15018702722302::NO:::
I would like to create a regex which skips the first 12 characters (that is, up to /?p=899:2000) and replaces whatever comes after that with nothing.
So I made this one: https://regex101.com/r/Xgbfqz/1 (which could be simplified to .{0,12}), but I would actually like to skip those characters and have the regex match only whatever comes after them, so that I can tell Google Analytics to replace it with "".
The part in the url that is always the same is
?p=[3numbers]:[0-4numbers]
Thank you
Your regular expression:
\/\?p=\d{3}\:\d{0,4}(.*)
Tested in Golang RegEx 2 and RegEx101.
It searches for /?p=###: followed by up to four digits, and captures the rest of the string to the right in group 1.
(extra) JavaScript:
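var paragraf = '[domain]/?p=899:2000:15018702722302::NO:::';
// Group 1 captures everything after /?p=###:####
var regex = /\/\?p=\d{3}:\d{0,4}(.*)/;
var match = regex.exec(paragraf);
// exec() returns null when nothing matches, so guard before reading the group
if (match) {
  alert('The rest of the right side of the string: ' + match[1]);
}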
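Alternatively, if the prefix always has the same length, simply use "[domain]/?p=899:2000:15018702722302::NO:::".substr(12), which returns everything from index 12 onward.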
You can try this:
/\?p\=\d{3}:\d{0,4}
Which matches just this: ?p=[3numbers]:[0-4numbers]
Not sure about replacing though.
https://regex101.com/r/Xgbfqz/1
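On the replacing part, one possible approach is to capture the fixed prefix in a group and substitute the whole match with just that group, which drops everything after it. Below is a minimal Python sketch of that idea; the function name strip_tail is hypothetical, and how Google Analytics expresses the replacement in its own filter fields is not covered here.

```python
import re

# Group 1 holds the stable prefix "/?p=<3 digits>:<0-4 digits>";
# the trailing ".*" consumes the part that should be removed.
PREFIX_RE = re.compile(r"(/\?p=\d{3}:\d{0,4}).*")


def strip_tail(url):
    """Keep the URL up to and including the stable prefix, dropping the rest."""
    return PREFIX_RE.sub(r"\1", url)


if __name__ == "__main__":
    print(strip_tail("[domain]/?p=899:2000:15018702722302::NO:::"))
```

URLs that do not contain the prefix are returned unchanged, since sub() only rewrites matches.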

URL rewrite regex conversion

I'm having trouble trying to learn how to write this URL into a regex template to add in as a rewrite. I've tried various regex sandboxes to figure it out on my own but they won't allow a '/' for instance when I copy an expression from here for testing:
I've got a custom post type (publications) with 2 taxonomies (magazine, issue) which I'm trying to create a good looking URL for.
After many hours I've come here to find out how I can convert this.
index.php?post_type=publications&magazine=test-mag&issue=2016-aug
To a templated regex expression (publication, magazine and issue are constant) that can output.
http://example.com/publications/test-mag/2016-aug/
Hopefully with room to extend if an article is followed through from that page.
Thanks in advance.
EDIT 1:
I've got this for my rule:
^publications/([^/]*)/([^/]*)/?$
and this for my match:
^index.php?post_type=publications&magazine=$matches[1]&issue=$matches[2]$
and testing with this:
http://localhost/publications/test-mag/2016-aug/
but it's giving me a 404. What's the problem?
You can match the query-string form of the URL with this pattern:
^index\.php\?post_type=publications&magazine=([^&]+)&issue=([^&]+)$
^ start of string
index\.php\?post_type=publications&magazine= literal text
([^&]+) one or more non-ampersand characters (all text up to the next URL parameter), captured as group 1
&issue= literal text
([^&]+) one or more non-ampersand characters, captured as group 2
$ end of string
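$str = 'index.php?post_type=publications&magazine=test-mag&issue=2016-aug';
// Capture the magazine and issue slugs (word characters and hyphens)
preg_match('/magazine=([\w-]+?)&issue=([\w-]+)/', $str, $matches);
// Rebuild the pretty URL, including the constant publications segment
$res = 'http://example.com/publications/' . $matches[1] . '/' . $matches[2] . '/';
echo $res; // => http://example.com/publications/test-mag/2016-aug/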
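You can use the add_rewrite_rule function in the WP Rewrite API to accomplish this. The rule is matched against the request path without its leading slash, and the redirect target is a plain string rather than a regex, so it takes no ^ or $ anchors.
add_rewrite_rule('^publications/([^/]*)/([^/]*)/?$','index.php?post_type=publications&magazine=$matches[1]&issue=$matches[2]','top');
After adding the rule, flush the rewrite rules (for example by re-saving the Permalinks settings page); until then the new rule is not active and the URL can still return a 404.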
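To test the rule outside WordPress, here is a rough Python sketch of how such a rule maps a pretty URL onto the query-string form. This only imitates the matching step under the assumption that the leading slash is stripped first; the rewrite function and the constants are illustrative names, not part of the WordPress API.

```python
import re

# Rule from the question's EDIT 1, plus the query-string template it should map to.
RULE = re.compile(r"^publications/([^/]*)/([^/]*)/?$")
TARGET = "index.php?post_type=publications&magazine={0}&issue={1}"


def rewrite(path):
    """Map a pretty path to the query-string form, or return None if the rule does not apply."""
    # Assumption: the request path is matched without its leading slash.
    match = RULE.match(path.lstrip("/"))
    if match is None:
        return None
    return TARGET.format(match.group(1), match.group(2))


if __name__ == "__main__":
    print(rewrite("/publications/test-mag/2016-aug/"))
    print(rewrite("/about/"))
```

Because the groups use [^/]* rather than [^/]+, an empty segment also matches; switching to + would require both the magazine and the issue to be present.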

Regular expression groups

For all the regex experts out there! I'm trying to figure out how to group my url into parts using regular expressions.
Example:
site.com/user/account/info/settings
I want to be able to capture the user/account/info part of the URL, NOT /settings.
Can anyone take this challenge and be kind enough to help me out? Thanks!
If you want to get the beginning of the URL try this:
(\/.*\/(?!.*\/.+))
Input:
site.com/foo/remove-me/
site.com/user/account/info/settings
site.com/foo/bar/remove-me
site.com/foo/remove-me?param1=true&param2=hello+world
Output:
/foo/remove-me/ (with a trailing slash, the greedy .* runs through to that final slash)
/user/account/info/
/foo/bar/
/foo/
https://regex101.com/r/yI5rG4/2
After consideration of all your comments under your post, I understand that you want to get the last segment for controller name extraction. Hence try this:
(?:\/(?!.*\/.+))([^\?\n]*)
Used on these inputs:
site.com/foo/remove-me/
site.com/user/account/info/settings
site.com/foo/bar/remove-me
site.com/foo/remove-me?param1=true&param2=hello+world
Output for group 1:
remove-me/
settings
remove-me
remove-me
Test here: https://regex101.com/r/kR5tX6/2
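For anyone who wants to try both patterns outside regex101, here is a small Python sketch that applies them to a URL and returns the leading part and the last segment together. The helper name split_path is made up for this example, and the patterns are the two from the answers above with redundant escapes removed.

```python
import re

# Everything from the first slash up to the last slash that still has text after it.
PREFIX_RE = re.compile(r"(/.*/(?!.*/.+))")
# The text after that last slash, stopping at "?" or a newline.
LAST_RE = re.compile(r"(?:/(?!.*/.+))([^?\n]*)")


def split_path(url):
    """Return (leading path, last segment); either item is None when its pattern finds nothing."""
    prefix = PREFIX_RE.search(url)
    last = LAST_RE.search(url)
    return (
        prefix.group(1) if prefix else None,
        last.group(1) if last else None,
    )


if __name__ == "__main__":
    print(split_path("site.com/user/account/info/settings"))
    print(split_path("site.com/foo/remove-me?param1=true&param2=hello+world"))
```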

Regex: Get subtext from a string

I have a list of text lines. Each line contains a title and a URL as follows:
product-title-7134 http://domain.com/page-1
another-product-title-822 http://domain.com/page-218
etc.
Using only .NET regex, please help me extract the url from each line.
I understand it can be done by looking at the string from the end until the http is met and output that part but I don't know the exact regex formula for that. Any help is much appreciated.
I would do that with this regex:
http://(\S+)
Then take the first group of each match; it holds the URL without the http:// prefix (use the whole match if you need the full URL).
This regex will match both http:// and https:// links:
(http|https)(://\S+)
You can test this in the .NET regex tester: http://regexstorm.net/tester
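To illustrate the idea on the sample lines from the question, here is a short sketch in Python rather than .NET. It uses https?://\S+, a slightly more compact variant of the patterns above whose constructs are also valid in .NET regex; the function name extract_urls is an assumption made for this example.

```python
import re

# "https?" makes the "s" optional; "\S+" runs until the next whitespace character.
URL_RE = re.compile(r"https?://\S+")


def extract_urls(lines):
    """Return the first URL found on each line, skipping lines without one."""
    urls = []
    for line in lines:
        match = URL_RE.search(line)
        if match:
            urls.append(match.group(0))  # whole match, scheme included
    return urls


if __name__ == "__main__":
    sample = [
        "product-title-7134 http://domain.com/page-1",
        "another-product-title-822 http://domain.com/page-218",
    ]
    print(extract_urls(sample))
```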

How to rewrite url containing plus and special characters?

We've got some incoming URLs that need to be redirected, but we are having trouble with URLs that contain pluses (+).
For example, any incoming URL like this one:
/eng/news/2005+01+01.htm
should be redirected to the home page of the new site:
/en/
Using UrlRewriter.net we've set up a rule which works with 'normal' URLs but does not work for the above
<redirect url="~/eng/(.+)" to="/en/index.aspx" />
However, it works fine if I change the incoming URL to
/eng/news/2005-01-01.htm
What's the problem and can anyone help?
I don't know UrlRewriter.net, and I'm not sure which regex syntax it uses, so here are some hints based on Perl regex.
What is the ~ at the beginning? Perhaps you mean ^, i.e. the beginning of the string.
In (.+), the + is a quantifier: the group matches any character repeated one or more times. It is not a literal + sign; to match a literal plus, escape it as \+.
This is one way to write a (Perl) regex matching URLs that start with the string /eng/ and contain a + sign:
^\/eng\/.*\+.*
I hope this helps.
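To see how the two patterns behave on the URLs from the question, here is a small Python sketch (Python's syntax is close to Perl's for these constructs). It only checks plain regex matching and assumes nothing about how UrlRewriter.net or the web server pre-processes the URL; the function names are invented for the example.

```python
import re

# Escaped "\+" matches a literal plus sign; the unescaped "+" in "(.+)" is a quantifier.
PLUS_RE = re.compile(r"^/eng/.*\+.*")
ANY_RE = re.compile(r"^/eng/(.+)")


def has_plus_under_eng(path):
    """True when the path starts with /eng/ and contains a literal plus sign."""
    return PLUS_RE.match(path) is not None


def matches_any_under_eng(path):
    """True when the path starts with /eng/ followed by at least one character."""
    return ANY_RE.match(path) is not None


if __name__ == "__main__":
    for path in ("/eng/news/2005+01+01.htm", "/eng/news/2005-01-01.htm"):
        print(path, has_plus_under_eng(path), matches_any_under_eng(path))
```

In plain regex terms the dot also matches a plus character, so (.+) accepts both URLs here; if the rule still fails for the + URL, the cause may lie outside the pattern itself.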