Regex for matching a pattern at the end of a URL - regex

I have a really weird pattern that sometimes shows up at the end of URLs, and I'm trying to write an Apache rewrite rule to get rid of it.
In other words, I'm trying to write a regex that matches any URL ending with that pattern.
So I might have something like
mysite.com/blah#_=_
I want to match the pattern after blah, but I think I need to escape it or handle it some other way. These are the rules I have so far, very loosely laid out:
RewriteEngine on
RewriteCond %{HTTP_HOST} regex-goes-here [NC]
RewriteRule ^/(.*) replace-with-url-minus-pattern [R]

You can't use mod_rewrite, or anything else on the server side, to get rid of that. The part starting with the # is called a URL fragment. Browsers (and JavaScript and other client-side code) use it to decide how to display or render content, and scripts use it to pass bits of data to each other. The #_=_ portion of the URI is never even sent to the server.
You'll need to do something on the browser's end, maybe something like this: Remove fragment in URL with JavaScript w/out causing page reload
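To see why no server-side rule can match the fragment, here is a minimal Python sketch (not Apache itself) that splits the example URL the way a client does before sending a request. The helper name `request_target` is made up for this illustration.

```python
from urllib.parse import urlsplit

def request_target(url: str) -> str:
    """Return the path-plus-query part that a client puts on the HTTP request line."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # parts.fragment is deliberately left out: clients keep it to themselves.
    return target

url = "http://mysite.com/blah#_=_"
print("fragment kept by the browser:", urlsplit(url).fragment)
print("what the server is asked for:", request_target(url))
```

Since the server only ever sees the second value, a RewriteCond or RewriteRule has nothing to match against.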

Related

htaccess regex variable parameter

I'm not used to regex and figure I've lost too many hours trying to resolve this, so thought I'd ask for help. I am trying to prettify the html extension.
My site will use URLs that have variable parameters. For example:
mysite.com/article/this-is-an-entry
mysite.com/article/this-is-an-entirely-different-entry
All will use .html as the extension.
In the htaccess file, I have tried
RewriteRule ^(article\/[a-z].*)$ $1.html [NC,L]
as well as slight variations of this, but cannot get this right. Thanks in advance for any assistance.
Firstly, let's look at the regex you have:
^(article/[a-z].*)$
This matches exactly the string "article/", followed by a single letter (case insensitive due to the NC flag), followed by zero or more of anything. It's quite broad, but should match the examples you gave.
One way to test that it's matching is to add the R=temp flag to the rule, which tells Apache to redirect the browser to the new URL (I recommend using "=temp" to stop the browser caching the redirect and making later testing harder). You can then observe (e.g. in your browser's F12 debug console) the original request and the redirected one.
RewriteRule ^(article/[a-z].*)$ $1.html [NC,L,R=temp]
However, as CBroe points out, your rule will match again on the target URL, so you need to prevent that. A simple way would be to use the END flag instead of L:
Using the [END] flag terminates not only the current round of rewrite processing (like [L]) but also prevents any subsequent rewrite processing from occurring in per-directory (htaccess) context.
So:
RewriteRule ^(article/[a-z].*)$ $1.html [NC,END]
Alternatively, you can make your pattern stricter, such as changing the . ("anything") to [^.] ("anything other than a dot"):
^(article/[a-z][^.]*)$
To be even more specific, you can add a RewriteCond with an extra pattern to not apply the rule to, such as "anything ending .html".
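The difference between the broad and the stricter pattern can be checked outside Apache. The sketch below uses Python's `re` module as a stand-in for mod_rewrite's regex engine and re-applies each rule to its own output, which is roughly what happens when per-directory processing restarts. The function name `rewrite` and the cap of five rounds are assumptions made for this illustration.

```python
import re

# The two patterns from the answer; re.IGNORECASE stands in for the NC flag.
BROAD = re.compile(r"^(article/[a-z].*)$", re.IGNORECASE)
STRICT = re.compile(r"^(article/[a-z][^.]*)$", re.IGNORECASE)

def rewrite(pattern, path, max_rounds=5):
    """Append .html for as long as the pattern still matches, up to max_rounds."""
    rounds = 0
    while rounds < max_rounds and pattern.match(path):
        path = pattern.sub(r"\1.html", path)
        rounds += 1
    return path, rounds

print(rewrite(BROAD, "article/this-is-an-entry"))   # keeps matching its own output
print(rewrite(STRICT, "article/this-is-an-entry"))  # stops after one rewrite
```

The strict pattern stops by itself because the rewritten path now contains a dot, which `[^.]*` refuses to cross.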

Apache2: Mod_rewrite wildcard text.html

We have changed e-commerce vendors and I need to preserve some of the SEO we've done.
I want to do a 301 for a set of pages to a single new URL.
I have a set of pages that all end with the same tickets.htm. So, for example, I have pages like /blank_tickets.htm and /concert_tickets.htm, and the list goes on.
So I tried this:
RewriteRule ^/(.*)tickets.htm$ /t/tickets/types/standard
I tried variations of this no leading / no $, etc.
I'm sure I'm missing something simple but my Google-fu is not returning a relevant example.
Thanks!
You didn't escape the dot. Use \. in your regular expression at the .htm part.
Edit: also use the [R=301] flag to produce an HTTP 301 (permanent) redirect. A bare [R] sends a 302 (temporary) redirect instead.
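What the escaping changes can be checked with a quick Python sketch, again using `re` as a stand-in for Apache's regex engine and testing against bare file names (an assumption; the leading slash from the question is left out here).

```python
import re

unescaped = re.compile(r"^(.*)tickets.htm$")   # "." matches any single character
escaped = re.compile(r"^(.*)tickets\.htm$")    # "\." matches a literal dot only

for name in ["concert_tickets.htm", "blank_tickets.htm", "concert_ticketsXhtm"]:
    print(name, bool(unescaped.match(name)), bool(escaped.match(name)))
```

Both patterns accept the real page names; the unescaped one is simply looser and also accepts names such as `concert_ticketsXhtm`. So escaping the dot tightens the rule, and if the original rule matched nothing at all, something else (for example the leading slash in a .htaccess context) may also be involved.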

Problem with htaccess GET form variables in a rewritten url

Essentially my problem is thus; I have a MVC system that redirects all requests to index.php on my site. I have a rewrite rule in my htaccess file to handle those requests like so:
RewriteRule ^([a-zAZ\_\-]+)\/([a-zA-Z\_\-]+)\/([^\/?]*) /?module=$1&class=$2&event=$3
Which translates urls into these type of urls
http://example.com/users/login/
http://example.com/users/info/me
My problem is that I also want GET variables to be applied and used in the URL like so
http://example.com/users/login/?var1=val1&var2=val2
http://example.com/users/info/me?var1=val2...
I've written two different regexes that work perfectly well in my workbench (Expresso), and I've tested them in PHP, but they refuse to work in htaccess. They're not particularly complex. I have tried:
^([a-zAZ_\-]+)\/([a-zA-Z_\-]+)\/([^\/\?]*)[\?]*(.*) /?module=$1&class=$2&event=$3&$4
and
^([a-zAZ_\-]+)\/([a-zA-Z_\-]+)\/([^\/\?]*)(?(?=\?)\?(.+)) /?module=$1&class=$2&event=$3&$4
Neither of these works and I'm racking my brains as to why. Essentially it just doesn't recognise the fourth group and returns nothing. I thought it might be because the group sits next to an ampersand, but I tried &var=$4 as a test and it still fell over.
Any help with this would be greatly appreciated as this is driving me insane.
Thanks in advance,
Rupert S.
After all, this is what you need:
RewriteRule ^([a-z_-]+)/([a-z_-]+)/([^/?]*) /?module=$1&class=$2&event=$3 [QSA,NC,L]
[QSA] will append the additional GET parameters to the rewritten query string.
[NC] makes the match case insensitive, so the character classes don't need A-Z.
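The fourth group in the question stays empty because the RewriteRule pattern is only matched against the URL path; the query string is kept separately. The Python sketch below models that split and what [QSA] then does with the original query. It is an illustration of the behaviour, not Apache code, and `rewrite_with_qsa` is a made-up name.

```python
import re
from urllib.parse import urlsplit

# Pattern from the answer; re.IGNORECASE stands in for [NC].
RULE = re.compile(r"^([a-z_-]+)/([a-z_-]+)/([^/?]*)", re.IGNORECASE)

def rewrite_with_qsa(url):
    """Model the rule: match the path only, then append the original query (QSA)."""
    parts = urlsplit(url)
    path = parts.path.lstrip("/")      # .htaccess rules see the path without the leading slash
    m = RULE.match(path)               # the query string is never part of this input
    if m is None:
        return None
    new_query = "module={}&class={}&event={}".format(*m.groups())
    if parts.query:                    # QSA: original parameters go after the new ones
        new_query += "&" + parts.query
    return "/?" + new_query

print(rewrite_with_qsa("http://example.com/users/login/?var1=val1&var2=val2"))
print(rewrite_with_qsa("http://example.com/users/info/me?var1=val2"))
```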

What's wrong with this regular expression in a .htaccess file?

I'm trying to understand why this regular expression isn't working in my .htaccess file. I want it so whenever a user goes to the job_wanted.php?jid=ID, they will be taken to job/ID.
What's wrong with this?
RewriteEngine On
RewriteCond %{QUERY_STRING} jid=([0-9]+)
RewriteRule ^job_wanted\.php?$ job/%1? [R]
I want it so when a user clicks on http://localhost/jobwehave.co.za/jobs/ID they are shown the same results as what below would show http://localhost/jobwehave.co.za/jobs?id=ID.
Sorry for the mix-up. I'm still very confused about how this works.
The primary problem is that you can't match the query string as part of RewriteRule. You need to move that part into a RewriteCond statement.
RewriteEngine On
RewriteCond %{QUERY_STRING} jid=([0-9]+)
RewriteRule ^job_wanted\.php$ /job/%1?
Editing to reflect your updated question, which is the opposite of what I've shown here. For the reverse, to convert /job/123 into something your PHP script can consume, you'll want:
RewriteEngine On
RewriteRule ^/job/([0-9]+)$ /path/to/job_wanted.php?jid=$1
But you're probably going to have trouble putting this in an .htaccess file anywhere except the root, and maybe even there. If it works at the root, you'll likely need to strip the leading / from the RewriteRule I show here.
Second edit to reflect your comment: I think what you want is complicated, but this might work:
RewriteEngine On
RewriteRule ^/job/([0-9]+)$ /path/to/job_wanted.php?jid=$1 [L]
RewriteCond %{QUERY_STRING} jid=([0-9]+)
RewriteRule ^job_wanted\.php$ http://host.name/job/%1? [R]
Your fundamental problem is that you want to "fix" existing links, presumably out of your control. In order to change the URL in the browser address bar, you must redirect the browser. There is no other way to do it.
That's what the second cond+rule does: it matches incoming old URLs and redirects to your pretty URL format. This either needs to go in a VirtualHost configuration block or in the .htaccess file in the same directory as your PHP script.
The first rule does the opposite: it converts the pretty URL back into something that Apache can use, but it does so using an internal sub-request that hopefully will not trigger another round of rewriting. If it does, you have an infinite loop. If it works, this will invoke your PHP script with a query string parameter for the job ID and your page will work as it has all along. Note that because this rule assumes a different, probably non-existent file system path, it must go in a VirtualHost block or in the .htaccess file at your site root, i.e. a different location.
Spreading the configuration around different places sounds like a recipe for future problems to me and I don't recommend it. I think you'll be better off to change the links under your control to the pretty versions and not worry about other links.
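The two-rule setup, and the loop it can fall into, can be modelled in a few lines of Python. This is a rough sketch under simplifying assumptions: paths are given without a leading slash (as in .htaccess context), the target script path is shortened to `job_wanted.php`, and `handle` is a hypothetical name.

```python
import re

PRETTY = re.compile(r"^job/([0-9]+)$")        # pretty URL, rewritten internally
OLD_PATH = re.compile(r"^job_wanted\.php$")   # old URL path, matched by the RewriteRule
OLD_QUERY = re.compile(r"jid=([0-9]+)")       # query string, matched by the RewriteCond

def handle(path, query=""):
    """Return (action, target) for one pass over the two rules."""
    m = PRETTY.match(path)
    if m:
        return ("internal", "job_wanted.php?jid=" + m.group(1))
    if OLD_PATH.match(path):
        q = OLD_QUERY.search(query)            # the path regex never sees the query
        if q:
            return ("redirect", "/job/" + q.group(1))
    return ("pass", path)

print(handle("job_wanted.php", "jid=42"))   # old link: redirect the browser
print(handle("job/42"))                     # pretty link: internal rewrite
# If the internally rewritten request is run through the rules again,
# it looks exactly like an old link and is redirected once more:
print(handle("job_wanted.php", "jid=42"))
```

The last call shows the infinite-loop risk the answer warns about: the model cannot tell an internally rewritten request from an external one, so some extra guard is needed if Apache does re-run the rules.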
The ^ anchors the regex at the beginning of the string.
RewriteRule matches the URI beginning with a / (unless it's in some per-directory configuration area).
Either prefix the / or remove the anchor ^ (depending on what you want to achieve)
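The effect of the leading slash can be illustrated with Python's `re` module. The sketch assumes the documented mod_rewrite behaviour that, in per-directory (.htaccess) context, the directory prefix including its leading slash is stripped before matching; `per_directory_path` is a made-up helper that imitates this for a .htaccess file at the site root.

```python
import re

def per_directory_path(url_path, directory_prefix="/"):
    """Strip the directory prefix, roughly as mod_rewrite does in .htaccess context."""
    if url_path.startswith(directory_prefix):
        return url_path[len(directory_prefix):]
    return url_path

WITH_SLASH = re.compile(r"^/job/([0-9]+)$")
WITHOUT_SLASH = re.compile(r"^job/([0-9]+)$")
EITHER = re.compile(r"^/?job/([0-9]+)$")   # optional slash works in both contexts

server_view = "/job/123"                          # server or VirtualHost context
htaccess_view = per_directory_path("/job/123")    # becomes "job/123"

for pattern in (WITH_SLASH, WITHOUT_SLASH, EITHER):
    print(pattern.pattern, bool(pattern.match(server_view)), bool(pattern.match(htaccess_view)))
```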
You haven't captured the job ID in the regex, so you can't reference it in the rewritten URL. Something like this (not tested, caveat emptor, may cause gastric distress, etc.):
RewriteRule ^job/([0-9]+) job_wanted.php?jid=$1
See Start Rewriting for a tutorial on this.
You need to escape the ? and . marks if you want those to be literals.
^job_wanted\.php\?jid=9\?$
That explains why your pattern isn't matching, but it doesn't address the URL rewriting itself. I'm also not sure why the ^ and $ are there, since they will prevent the pattern from matching most URLs (e.g. http://www.yoursite.com/job_wanted.php?jid=9 won't work because it doesn't start with job_wanted.php).
I don't know htaccess well, so I can only address the regex portion of your question. In traditional regex syntax, you'd be looking for something like this:
s/job_wanted\.php\?jid=(\d*)/job\/$1/i
Hope that helps.
Did you try to escape special characters (like ?)?
The ? and . characters have a special meaning in regular expressions. You probably just need to escape them.
Also, you need to capture the jid value and use it in the rule.
Try to change your rules to this:
RewriteEngine On
RewriteRule ^job_wanted\.php\?jid=([0-9]+)$ /job/$1
Something like
ReWriteRule ^job\_wanted\.php\?jid\=([0-9-]+)$ /job/$1
should do the trick.

Mod-rewrites on apache: change all URLs

Right now I'm doing something like this:
RewriteRule ^/?logout(/)?$ logout.php
RewriteRule ^/?config(/)?$ config.php
I would much rather have one rule that does the same thing for every URL, so I don't have to keep adding rules every time I add a new file.
Also, I'd like to map things like '/config/new' to 'config_new.php' if that is possible. I am guessing some regexp would let me accomplish this?
Try:
RewriteRule ^/?(\w+)/?$ $1.php
the $1 is the content of the first captured string in brackets. The brackets around the 2nd slash are not needed.
edit: For the other match, try this:
RewriteRule ^/?(\w+)/(\w+)/?$ $1_$2.php
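A quick way to check which paths these two generic patterns pick up is to try them in Python, using `re` as a stand-in for Apache's regex engine. The function name `to_script` is an assumption for this sketch.

```python
import re

ONE_SEGMENT = re.compile(r"^/?(\w+)/?$")
TWO_SEGMENTS = re.compile(r"^/?(\w+)/(\w+)/?$")

def to_script(path):
    """Return the PHP file a path would map to, or None if neither rule matches."""
    m = ONE_SEGMENT.match(path)
    if m:
        return m.group(1) + ".php"
    m = TWO_SEGMENTS.match(path)
    if m:
        return "{}_{}.php".format(*m.groups())
    return None

for path in ["logout", "/config/", "/config/new", "style.css", "config.php"]:
    print(path, "->", to_script(path))
```

Because `\w` does not match a dot, paths such as `style.css` or an already rewritten `config.php` fall through untouched, so these rules do not re-match their own output.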
I would do something like this:
RewriteRule ^/?(logout|config|foo)/?$ $1.php
RewriteRule ^/?(logout|config|foo)/(new|edit|delete)$ $1_$2.php
I prefer to explicitly list the url's I want to match, so that I don't have to worry about static content or adding new things later that don't need to be rewritten to php files.
The above is ok if all sub url's are valid for all root url's (book/new, movie/new, user/new), but not so good if you want to have different sub url's depending on root action (logout/new doesn't make much sense). You can handle that either with a more complex regex, or by routing everything to a single php file which will determine what files to include and display based on the url.
Mod rewrite can't do (potentially) boundless replaces like you want to do in the second part of your question. But check out the External Rewriting Engine at the bottom of the Apache URL Rewriting Guide:
External Rewriting Engine
Description:
A FAQ: How can we solve the FOO/BAR/QUUX/etc. problem? There seems no solution by the use of mod_rewrite...
Solution:
Use an external RewriteMap, i.e. a program which acts like a RewriteMap. It is run once on startup of Apache, receives the requested URLs on STDIN, and has to put the resulting (usually rewritten) URL on STDOUT (same order!).
RewriteEngine on
RewriteMap quux-map prg:/path/to/map.quux.pl
RewriteRule ^/~quux/(.*)$ /~quux/${quux-map:$1}
#!/path/to/perl
# disable buffered I/O which would lead
# to deadloops for the Apache server
$| = 1;
# read URLs one per line from stdin and
# generate substitution URL on stdout
while (<>) {
    s|^foo/|bar/|;
    print $_;
}
This is a demonstration-only example and just rewrites all URLs /~quux/foo/... to /~quux/bar/.... Actually you can program whatever you like. But notice that while such maps can be used also by an average user, only the system administrator can define it.
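For comparison, the same line-in, line-out protocol can be sketched in Python. This is an illustrative equivalent of the quoted Perl script, not something taken from the Apache guide; the names `rewrite_line` and `serve` are made up for this sketch.

```python
import io

def rewrite_line(line):
    """Map one lookup key to its replacement: foo/... becomes bar/..."""
    key = line.rstrip("\n")
    if key.startswith("foo/"):
        key = "bar/" + key[len("foo/"):]
    return key + "\n"

def serve(in_stream, out_stream):
    """Answer one line per request, flushing each time so the caller never waits on a buffer."""
    for line in iter(in_stream.readline, ""):
        out_stream.write(rewrite_line(line))
        out_stream.flush()

# Dry run with in-memory streams instead of real stdin/stdout.
demo_out = io.StringIO()
serve(io.StringIO("foo/page.html\nbaz/other.html\n"), demo_out)
print(demo_out.getvalue(), end="")
```

When used as an actual map program, the script would end with `serve(sys.stdin, sys.stdout)` (after `import sys`) instead of the dry run. The per-line flush plays the same role as `$| = 1` in the Perl version.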