htaccess: redirect first folder to get param - regex

I am very new to regular expressions and to the .htaccess file of the Apache web server.
I want to rewrite a url so that the first subfolder gets converted to a GET-parameter and all the following subfolders should be left as they are...
A few examples:
http://www.example.com/thisisthevariable
should be rewritten to
index.php?p=thisisthevariable
and
http://www.example.com/thisisthevariable/test.jpg
should be rewritten to
test.jpg?p=thisisthevariable
and
http://www.example.com/thisisthevariable/subfolder/and/another/sub/folder/123.gif
should be rewritten to
subfolder/and/another/sub/folder/123.gif?p=thisisthevariable
It should work with an unlimited number of subfolders. So the whole path should be used except the first directory, which should become the GET parameter of the destination file.
I hope someone understands my task and can help me :-)
Thanks!!

RewriteRule ^([^/]+)$ index.php?p=$1
RewriteRule ^([^/]+)/(.+) $2?p=$1
[^/]+ matches one or more characters that are not a forward slash, i.e. the first path segment, up to the first /. The parentheses capture the matched text, which is then available as $1 (and $2 for the second group). For more details, see http://www.regular-expressions.info/refquick.html.
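To see what the two patterns capture, here is a minimal Python sketch that emulates a single pass of the rules above on the path as .htaccess sees it (no leading slash). The function name rewrite is made up for this illustration, and the sketch does not model how Apache may run the rules again on the rewritten path.

```python
import re

def rewrite(path: str) -> str:
    """Emulate one pass of the two RewriteRule lines (hypothetical helper)."""
    # Rule 1: a single segment with no slash goes to index.php.
    m = re.search(r"^([^/]+)$", path)
    if m:
        return f"index.php?p={m.group(1)}"
    # Rule 2: first segment becomes the parameter, the rest stays the path.
    m = re.search(r"^([^/]+)/(.+)", path)
    if m:
        return f"{m.group(2)}?p={m.group(1)}"
    return path

for example in (
    "thisisthevariable",
    "thisisthevariable/test.jpg",
    "thisisthevariable/subfolder/and/another/sub/folder/123.gif",
):
    print(example, "->", rewrite(example))
```

The three inputs are the examples from the question, so the printed targets can be compared directly with the desired ones.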

Related

How can I canonicalize URLs in my .htaccess?

I have a WordPress installation on a LAMP stack, and if I have a post at http://example.com/abc/ , I would like URLs like http://example.com/abc/def.html to be redirected to http://example.com/abc/ . (Note that the slot here occupied by "def" should be without any slashes; this means among other things that things under http://example.com/wp-content/ should be unhindered.)
The rewrite I tried is:
RewriteRule ^(/[^/]+/)[^/]+\.html$ $1 [R=301,L]
As far as I can tell, that says, "Take the first two slashes and everything between them, matching on no more slashes and ending in .html, and redirect to the first captured group." However, with that in place, I can access http://example.com/abc/ , but I get a 404 on attempted access to http://example.com/abc/def.html .
What should I be doing to put the desired redirect behavior in place?
Thanks,
Try this rule:
RewriteRule ^/?([^/]+/)[^/.]+\.html$ /$1 [NC,R=301,L]
This makes the leading slash optional, since the path that RewriteRule matches in .htaccess doesn't have one, and tightens the part after the first slash. Make sure this is your very first rule.
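The effect of the optional leading slash can be checked outside Apache. Below is a small Python sketch, assuming that .htaccess hands the rule the path abc/def.html without a leading slash (as the answer says); the variable names are made up for the illustration, and re.IGNORECASE stands in for the NC flag.

```python
import re

# Pattern from the question: requires a leading slash.
original = re.compile(r"^(/[^/]+/)[^/]+\.html$")
# Pattern from the answer: leading slash optional, NC emulated by IGNORECASE.
fixed = re.compile(r"^/?([^/]+/)[^/.]+\.html$", re.IGNORECASE)

htaccess_path = "abc/def.html"   # what the rule sees in .htaccess
print(original.search(htaccess_path))        # None, so the rule never fires
print(fixed.search(htaccess_path).group(1))  # abc/

# Deeper paths such as files under wp-content/ are left alone.
print(fixed.search("wp-content/themes/style.html"))  # None
```

The captured group is abc/, and the substitution /$1 puts the slash back in front of it.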

Redirect content of a folder to another with htaccess

I would like to restructure some folders on my website. Specifically, I want to move what's contained inside "images/" to "images/gallery/", but I don't want to break previous links, so I thought of using htaccess.
I looked up several tutorials and even several questions here on stackoverflow, tried several times, but I can't get the rewrite rule to work.
This is what I have:
RewriteRule ^images/(.*) /images/gallery/$1 [R=301,NC,L]
But when I try to access anything inside /images/ (for example images/test.jpg), it stays at images/test.jpg and doesn't go to images/gallery/test.jpg. So it doesn't seem to have any effect.
Any clue as to what I might be doing wrong?
Thank you!
Your rule as written will cause a redirect loop, since images/ is present in both the source and target URLs and the pattern has no $ anchor, so it also matches the redirected URL.
You can tweak your regex like this:
RewriteRule ^images/([^/]+)$ /images/gallery/$1 [R=301,NC,L]
Now the pattern will match /images/test.jpg but won't match the redirected URL /images/gallery/test.jpg, because [^/]+ cannot match across a slash.
Make sure this rule is the first one after RewriteEngine On and that there is no .htaccess in the /images/ folder.
EDIT: If your original path also has sub-directories, then use:
RewriteRule ^images/((?!gallery/).+)$ /images/gallery/$1 [R=301,NC,L]
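A quick way to compare the three patterns is to run them against both the original and the already redirected path. This is only a Python sketch of the matching step; redirect_target is a hypothetical helper, and the path is given without a leading slash, as it would be in .htaccess.

```python
import re

def redirect_target(pattern: str, path: str):
    """Return the redirect target if the pattern matches, else None (hypothetical helper)."""
    m = re.search(pattern, path, re.IGNORECASE)  # IGNORECASE stands in for NC
    return f"/images/gallery/{m.group(1)}" if m else None

loose = r"^images/(.*)"                # rule from the question
flat = r"^images/([^/]+)$"             # first fix: one segment only
nested = r"^images/((?!gallery/).+)$"  # second fix: anything not under gallery/

# The loose pattern matches the redirected URL again, hence the loop.
print(redirect_target(loose, "images/gallery/test.jpg"))

# The two fixed patterns leave the redirected URL alone.
print(redirect_target(flat, "images/test.jpg"))
print(redirect_target(flat, "images/gallery/test.jpg"))
print(redirect_target(nested, "images/sub/a.jpg"))
print(redirect_target(nested, "images/gallery/sub/a.jpg"))
```

The negative lookahead (?!gallery/) is what lets the last pattern accept sub-directories while still refusing paths that already start with gallery/.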

mod_rewrite where the search begins?

Recently, I've been experimenting with Apache's mod_rewrite engine. A bunch of tutorials I've read gave me a pretty good picture of how to use its most basic and useful features. But there is still one question I haven't found the answer to. I guess it should be the very first thing to be explained, but no tutorial has given me the answer yet.
I'm wondering exactly which part of the URL is considered when trying to match the regex.
Let's say I have a directory my_project on my server and a .htaccess file inside that directory. The browser should see the directory like this:
http://my_website.com/my_project
If I add a rule in .htaccess, then which part of the above URL will be considered when trying to match the regex of this rule? I'm pretty good at understanding regular expressions themselves, but I can't figure out which chunk of the URL mod_rewrite picks to match against.
If my question isn't clear enough let me also put it this way: which exact place of the above URL is matched by the following regex in .htaccess?
^
Yet another question, if I go to
http://my_website.com/my_project/subfolder
will the considered part of the URL be different, or will it always depend on where .htaccess is placed?
I figured it out.
To explain the problem and how I got to the answer I'll try to explain it step by step.
Let's assume the following:
.htaccess is placed in a folder my_project in the root path of www.my_website.com. .htaccess contains the following rule:
RewriteRule ^.*$ index.php?matched=$0
To avoid an endless loop, let's "fire" the rule only if we provide a test parameter in the query string, so the complete .htaccess should look like this:
RewriteEngine On
RewriteCond %{QUERY_STRING} test=1
RewriteRule ^.*$ index.php?matched=$0
Now, if everything goes as I thought we should end up in the index.php script placed in my_project folder. To see the whole match let's add the following line to the script:
var_dump($_GET["matched"]);
In the browser we go to http://my_website.com/my_project?test=1 and we expect the output to be:
string(32) "http://my_website.com/my_project"
But it is not! It is instead
string(0) ""
We're almost there. Now let's go to http://my_website.com/my_project/subfolder/?test=1. The output is
string(10) "subfolder/"
That proves one thing: when mod_rewrite compares the URL with the regular expression, it doesn't see the PROTOCOL part of the URL or the HTTP_HOST part. As my further research revealed, it also omits every folder above the .htaccess location, as well as the query string and the hash part of the URL. For mod_rewrite, the URL begins where the .htaccess location begins.
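The stripping described above can be sketched in a few lines of Python. This only models the observations from this experiment (scheme, host, query string, fragment and the .htaccess directory are dropped); it is not how Apache implements it, and per_directory_path is a name invented for the example.

```python
from urllib.parse import urlsplit

def per_directory_path(url: str, htaccess_dir: str) -> str:
    """Return the part of the URL a RewriteRule in htaccess_dir would match (sketch)."""
    path = urlsplit(url).path        # drops scheme, host, query and fragment
    base = htaccess_dir.rstrip("/")
    if path == base:                 # request for the directory itself
        return ""
    prefix = base + "/"
    if path.startswith(prefix):
        return path[len(prefix):]    # drop everything up to the .htaccess location
    raise ValueError("URL is outside the .htaccess directory")

print(repr(per_directory_path("http://my_website.com/my_project?test=1", "/my_project")))
print(repr(per_directory_path("http://my_website.com/my_project/subfolder/?test=1", "/my_project")))
```

The two calls reproduce the empty string and the subfolder/ value that var_dump showed.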
I hope this self-answered question will be helpful for someone in the future.
Enjoy!
Let me give you a practical example.
Suppose your website is www.example.com and it is located in a folder/directory named 'ex'.
You place the .htaccess file in your ex folder to make it work for your website www.example.com.
Now let's say you want to clean up this URL: www.example.com/ex/index.php?page=welcome
Open the .htaccess file that you placed in your ex folder and add the following code to it:
RewriteEngine On
RewriteRule ^([A-Za-z0-9-+_%*?]+)/?$ index.php?page=$1 [L]
It will let the clean URL www.example.com/ex/welcome serve the same content as www.example.com/ex/index.php?page=welcome.
Now let's say you moved your website to a subfolder, i.e. from www.example.com/ex to www.example.com/ex/subfolder.
Simply move the .htaccess file along with the rest of your site to that subfolder; there is no need to change the code, it will work the same.
([A-Za-z0-9-+_%*?]+) <-- the part within the parentheses is the regular expression whose match is captured.
It means you are looking for any character from A to Z or from a to z, any digit from 0 to 9, or one of the symbols -, +, _, %, *, ?; the + sign after the closing square bracket means one or more of them.
In short, you are asking for ([in here]+), one or more times. If you remove the + after the bracket, the group matches only a single character, so longer names such as welcome no longer match.
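The role of the + quantifier is easy to check with the same character class in Python. This is just a sketch of the matching step, with variable names made up for the example; the path is the part after ex/, as seen from the .htaccess in that folder.

```python
import re

with_plus = re.compile(r"^([A-Za-z0-9-+_%*?]+)/?$")    # pattern from the rule above
without_plus = re.compile(r"^([A-Za-z0-9-+_%*?])/?$")  # same class, + removed

print(with_plus.search("welcome").group(1))   # welcome
print(with_plus.search("welcome/").group(1))  # welcome (trailing slash is optional)
print(with_plus.search("a/b"))                # None: a slash is not in the class

print(without_plus.search("welcome"))         # None: only one character allowed
print(without_plus.search("w").group(1))      # w
```

Because the pattern is anchored with ^ and $, dropping the + does not return the first character of a longer name; the whole rule simply stops matching.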

Why does RewriteRule . work the same as ^(.*)$?

Consider the following:
RewriteRule ^(.*)$ index.php/$1
Based on the fledgling noob knowledge I have of mod_rewrite, this should match the entire URL part between example.com/ and ?query=string, and then prepend it with index.php/, and generally, that's exactly what happens.
http://example.com/some/stuff -> http://example.com/index.php/some/stuff
Now, consider:
RewriteRule . index.php
According to the developers of the current update to the Concrete5 CMS, this does the exact same thing. And indeed, it appears to do just that on my server as well.
My question is, why does the second RewriteRule yield the same result instead of something like
http://example.com/some/stuff -> http://example.com/index.phpome/stuff
Shouldn't the . match one character and then be replaced with the index.php string? This is on Apache 2.2.
RewriteRule ^(.*)$ index.php/$1 will match and use the captured text to create a new path with whatever was originally requested added to the end where the $1 is.
RewriteRule . index.php matches because it is an unanchored regex. The previous regex uses ^ and $ to anchor the match, which means the pattern must match the entire path. This one is not anchored, so it can match anywhere in the string, and any path with at least one character will match. mod_rewrite uses the pattern only as a test, so the rule is applied whenever it matches.
When the rule is matched the substitution takes place. The substitution is a complete substitution, so if you don't use backreferences like $1 then whatever was in the original pattern is lost. In this case the new path just becomes index.php.
There is therefore a slight difference between the two: the second goes directly to index.php without appending the originally requested path. Most likely Concrete5 CMS uses a front controller that dispatches according to information it pulls from the request directly. Since this is an internal rewrite rather than a redirect, the original request is preserved and can be used instead. This shifts some responsibility from Apache into the application code and reduces dependence on the hosting environment.
The match on the left is not replaced, it is merely searched for. You must use backreferences to replace/retain specific parts.
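The difference between "test, then replace the whole path" and an in-place regex substitution can be shown with a short Python sketch. apply_rule is a hypothetical helper that mimics the behaviour described above; it is not Apache code.

```python
import re

def apply_rule(pattern: str, substitution: str, path: str) -> str:
    """Test the pattern; on a match, the substitution replaces the WHOLE path."""
    m = re.search(pattern, path)
    if m is None:
        return path  # rule does not apply
    # Expand $0..$9 backreferences from the match.
    return re.sub(r"\$(\d)", lambda ref: m.group(int(ref.group(1))), substitution)

print(apply_rule(r"^(.*)$", "index.php/$1", "some/stuff"))  # index.php/some/stuff
print(apply_rule(r".", "index.php", "some/stuff"))          # index.php

# What the question expected: replacing only the matched character.
print(re.sub(r".", "index.php", "some/stuff", count=1))     # index.phpome/stuff
```

The last line is ordinary search-and-replace, which is what RewriteRule does not do.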

.htaccess change / in match to &

I want to have the following URLs on my page:
www.domain.com/<module>/<function>/<query>=<string>/<query>=<string>/<query>=<string>
I know how to map the part with the module and function to valid URLs like this:
www.domain.com/index.php?module=<module>&function=<function>
But I have no idea how I can append all those query=string-parameters to the query string.
I currently use RewriteRule ^([A-Za-z0-9_]+)/([A-Za-z0-9_]+)$ index.php?module=$1&function=$2 [NC] as my rule and would like to add those (optional and repeatable) query-string parts.
I hope someone knows more about htaccess and regexp than me xD
These rules need to be placed in .htaccess file in website root folder.
RewriteRule ^(.+)/([a-z0-9_]+)=([^/]+)/?$ $1/?$2=$3 [NC,N,DPI,QSA]
RewriteRule ^([a-z0-9_]+)/([a-z0-9_]+)/?$ /index.php?module=$1&function=$2 [NC,QSA,L]
They will rewrite URL (internally) from this form
http://www.example.com/main/job/p1=value/p2=something+else/PP=yes
into this form
http://www.example.com/index.php?module=main&function=job&p1=value&p2=something+else&PP=yes
These rules need to be placed near the top of .htaccess. The first rule uses the [N] flag, which tells Apache to start rewriting again from the first rule (in order to rewrite all <query>=<string> fragments). If you have a lot of rules before this one, Apache will have to "probe" each of them on every iteration, which may put unnecessary load on the web server.
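The iteration caused by [N] can be traced with a Python sketch. It assumes that QSA puts the newly generated query string in front of the existing one, and that [N] simply reruns the first rule until it stops matching; rewrite and qsa are names invented for this illustration.

```python
import re

# Rule 1: peel the last key=value segment off the path (NC emulated by IGNORECASE).
RULE_PARAM = re.compile(r"^(.+)/([a-z0-9_]+)=([^/]+)/?$", re.IGNORECASE)
# Rule 2: map module/function to index.php.
RULE_ROUTE = re.compile(r"^([a-z0-9_]+)/([a-z0-9_]+)/?$", re.IGNORECASE)

def qsa(new_query: str, old_query: str) -> str:
    """Emulate QSA: new query string first, then the existing one."""
    return f"{new_query}&{old_query}" if old_query else new_query

def rewrite(path: str, query: str = "") -> str:
    # [N]: keep restarting while the first rule still matches.
    while True:
        m = RULE_PARAM.search(path)
        if m is None:
            break
        path = m.group(1) + "/"
        query = qsa(f"{m.group(2)}={m.group(3)}", query)
    m = RULE_ROUTE.search(path)
    if m:
        path = "index.php"
        query = qsa(f"module={m.group(1)}&function={m.group(2)}", query)
    return f"{path}?{query}" if query else path

print(rewrite("main/job/p1=value/p2=something+else/PP=yes"))
```

Each pass removes the last key=value segment, which is why the parameters end up in their original order once the module and function are prepended.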