mod_rewrite page beautifying - regex

So I'm stuck. I'm not very good with mod_rewrite or regular expressions in general, and this is giving me problems.
I need to redirect URLs like
domain.com/view/Some_Article_Name.html
to
domain.com/index.php?p=view&id=Some_Article_Name
The rule that I have now works fine, but it also rewrites for all my stylesheets and images and stuff that shouldn't be rewritten.
RewriteRule ^view/([^/]*)\.html$ /index.php?p=view&id=$1 [L]
It should only rewrite pages that start with domain.com/view/*. I imagine that all I need is a RewriteCond, but I can't seem to make one that works. Any idea what I need to add to make this work without writing a RewriteRule for every individual file?
RewriteCond %{REQUEST_FILENAME} ^(.*)\.html [NC]
RewriteRule ^view/([^/]*)\.html$ /index.php?p=view&id=$1 [L]
is my rewrite statement for this.

I doubt that your stylesheets and images are being affected by this rule. That would mean your stylesheets’ and images’ URL paths start with view/ and end with .html, because otherwise the rule is not applied.
I rather suppose that you’re using relative URL paths to reference the stylesheets and images, like ./css/style.css or images/foo/bar.png. Such relative URL paths are resolved against a base URL path, and that is the URL path of the current document.
Your original URL path had just one path segment, and all relative URL paths worked when starting at that point. But now you have introduced another path segment (/view), and relative paths are resolved from that path. So ./css/style.css and images/foo/bar.png are now resolved to /view/css/style.css and /view/images/foo/bar.png instead of /css/style.css and /images/foo/bar.png, as they were with /index.php.
The solution: use absolute URL paths like /css/style.css or /images/foo/bar.png to reference your external resources. With such URL paths the base URL can have any path and the resources are always resolved correctly.
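The resolution described above can be checked outside the browser. The following is a minimal Python sketch, not part of the original answer, that uses the standard library's urljoin to resolve the same references against the old and the new page URL. The helper name resolve and the http:// scheme are assumptions made for illustration.

```python
from urllib.parse import urljoin

# Page URLs taken from the question; the http:// scheme is an assumption.
OLD_PAGE = "http://domain.com/index.php?p=view&id=Some_Article_Name"
NEW_PAGE = "http://domain.com/view/Some_Article_Name.html"


def resolve(page_url, reference):
    """Resolve a stylesheet or image reference the way a browser would."""
    return urljoin(page_url, reference)


if __name__ == "__main__":
    for ref in ("./css/style.css", "images/foo/bar.png", "/css/style.css"):
        print(ref)
        print("  from old page:", resolve(OLD_PAGE, ref))
        print("  from new page:", resolve(NEW_PAGE, ref))
```

The relative references pick up the extra /view segment when resolved from the new URL, while the reference that starts with a slash resolves to the same location from both pages.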

Are you saying that you want the CSS and images to live under the original file location (e.g. in htdocs/view/, where htdocs is your source root), but you want the URL for the HTML to be rerouted to the index.php file?
If that is what you're saying, then what you need is a reverse RewriteRule as well, since the browser will pick up any CSS/images relative to the redirected URL (assuming relative image/CSS references in the HTML).
Other than that, I agree with #David and can't see how anything without view and .html could be matching.

Related

.htaccess: Rewrite path internally without redirecting

I can access my web server as follows: https://www.example.com/my_old_folder/some_folder/
There's an .htaccess file in /my_old_folder/ with the following code:
RewriteEngine on
RewriteRule ^my_old_folder/(.*) my_new_folder/$1
I want to rewrite the folder my_old_folder internally to my_new_folder, without changing the URL in the browser. Just grab the files from /my_new_folder/ instead of /my_old_folder/. If there's another folder like /some_folder/ in this case, keep it. Only change the name /my_old_folder/ to /my_new_folder/.
Unfortunately, it's not rewriting the path, although I already tried many solutions from the internet, including the above one.
Who can help?
Inside /my_old_folder/.htaccess you can use this rule:
RewriteEngine on
RewriteRule .* /my_new_folder/$0 [L]
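This works because, inside /my_old_folder/.htaccess, the pattern is matched against the path relative to /my_old_folder/. The my_old_folder/ prefix is never part of the string being matched, so a pattern that starts with ^my_old_folder/ cannot match there.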
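To make the prefix stripping concrete, here is a rough Python sketch of how a pattern in a per-directory .htaccess sees the request. It only mimics the "remove the directory prefix, then match" step. The function names per_dir_path and apply_rule are hypothetical, and real mod_rewrite does considerably more than this.

```python
import re


def per_dir_path(url_path, htaccess_dir):
    """Return the part of url_path that a rule in htaccess_dir/.htaccess
    would be matched against, or None if the request lies outside it."""
    prefix = htaccess_dir.strip("/")
    prefix = prefix + "/" if prefix else ""
    path = url_path.lstrip("/")
    if not path.startswith(prefix):
        return None
    return path[len(prefix):]


def apply_rule(pattern, substitution, url_path, htaccess_dir):
    """Simplified stand-in for one RewriteRule: returns the rewritten path,
    or the original path when the pattern does not match."""
    relative = per_dir_path(url_path, htaccess_dir)
    if relative is None:
        return url_path
    match = re.search(pattern, relative)
    if match is None:
        return url_path
    # Expand $0..$9 back-references the way the substitution string uses them.
    return re.sub(r"\$(\d)",
                  lambda ref: match.group(int(ref.group(1))) or "",
                  substitution)


if __name__ == "__main__":
    request = "/my_old_folder/some_folder/"
    # Rule from the question: the prefix is already stripped, so no match.
    print(apply_rule(r"^my_old_folder/(.*)", "my_new_folder/$1",
                     request, "/my_old_folder/"))
    # Rule from the answer: matches the remaining relative path.
    print(apply_rule(r".*", "/my_new_folder/$0", request, "/my_old_folder/"))
```

In this sketch the rule from the question leaves the path unchanged, while the rule from the answer maps it to /my_new_folder/some_folder/.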

Language URL Redirects to original path when trailing slash is missing

I'm currently looking at setting up a few paths for multi language on our site.
I'm wanting to route it through the same php files and just load in a different content file depending on the url structure.
I've been using:
RewriteRule ^(es|fr|us)/(.*)$ /$2 [QSA,L]
This works well; however, if the URL doesn't have a trailing slash, it reverts back. These are virtual folders and just route to the correct index.php file.
For example:
www.example.com/es/signup/ - This works and keeps the url in place
www.example.com/es/signup - without the trailing slash redirects back to www.example.com/signup/ (missing the language path)
Intended outcome would be that www.example.com/es/signup redirects to www.example.com/es/signup/ as well
I've had a look at the redirect log and can't see anything obvious.
I could be wrong, but I think it might be due to Apache's DirectorySlash setting. When signup/ is passed over, it hits the directory; however, if signup (without the trailing slash) is passed over, Apache does the 301 redirect because of DirectorySlash?
Thanks in advance
It's a little unclear what you want. However, the rule
RewriteRule ^(es|fr|us)/(.*)$ /$2 [QSA,L]
takes
www.example.com/es/route
and turns it into
www.example.com/route
The redirect from
www.example.com/es/route
to
www.example.com/es/route/
is default behavior. To get it, simply delete the above rule.
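To see what the rule does to each of the URLs from the question, here is a small Python sketch of the regex substitution alone. It does not model Apache's later directory handling. The function name rewrite is hypothetical, and leaving non-matching paths untouched is an assumption of the sketch.

```python
import re

# Pattern from the rule in the question: a language prefix, then the rest.
RULE = re.compile(r"^(es|fr|us)/(.*)$")


def rewrite(path):
    """Apply the substitution /$2 to a path given without its leading slash."""
    match = RULE.search(path)
    if match is None:
        return "/" + path  # rule does not apply; path is left alone
    return "/" + match.group(2)


if __name__ == "__main__":
    for path in ("es/signup/", "es/signup", "de/signup"):
        print(path, "->", rewrite(path))
```

After the rewrite, es/signup has become /signup, with neither the language prefix nor a trailing slash. If signup is a real directory, a trailing-slash redirect issued at that point would go to /signup/, which is consistent with what the asker observes and with their DirectorySlash suspicion.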

Redirect 301 - remove .html extension from URLs

I would like to remove the .html extension from the URLs in a specific directory and 301-redirect the old URLs.
Here is what the structure looks like:
mysite.com/category/nameofcategory/pagenumber.html
The thing is that nameofcategory and pagenumber can be any combination of letters and numbers.
Could you please help me with this?
I wouldn't recommend having your content scattered in many html-files in different folders. This becomes very impractical if you for example want to change the layout of your pages.
Storing the content in a database is a much better solution. If that's not possible perhaps the html files could contain only the formatted text content and a back end script could embed that content to a layout when the page is requested.
In both cases all of the requests would be routed through the back end script. This requires that the mod_rewrite module is enabled in the Apache configuration. The .htaccess might look something like this:
RewriteEngine on
RewriteRule ^category/([^/.]+)/([^/.]+)/?$ index.php?category=$1&page=$2 [L]
This part of the regex, ([^/.]+), matches and captures a string that is at least one character long and doesn't contain the characters / or . (slash or dot). The captured strings can be referenced with $1, $2 and so on.
Now the pretty urls like mysite.com/category/foo/bar work. In addition we need to define a rule that redirects the old urls ending in ".html". The rule required might look something like this:
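# The dot is escaped so it only matches a literal "."; the leading slash makes the redirect target relative to the site root.
RewriteRule ^category/([^/.]+)/([^/.]+)\.html$ /category/$1/$2 [R=301,L]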
One thing to remember while testing and adjusting the redirects is that the redirect may get cached in the browser which may lead to confusing results when testing.
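The two patterns from this answer can be tried out with a short Python sketch before touching the server. This is only a regex check, not a model of Apache itself. The names PRETTY, LEGACY and route are hypothetical. The sketch escapes the dot in \.html and uses a root-relative redirect target, both of which are assumptions about what the answer intends.

```python
import re

# Pretty URL: category/<name>/<page>, optional trailing slash.
PRETTY = re.compile(r"^category/([^/.]+)/([^/.]+)/?$")
# Old URL: same shape, ending in a literal ".html".
LEGACY = re.compile(r"^category/([^/.]+)/([^/.]+)\.html$")


def route(path):
    """Return (action, target) for a request path without leading slash."""
    match = LEGACY.search(path)
    if match:
        return ("redirect 301", "/category/{}/{}".format(*match.groups()))
    match = PRETTY.search(path)
    if match:
        return ("rewrite", "index.php?category={}&page={}".format(*match.groups()))
    return ("no match", path)


if __name__ == "__main__":
    for path in ("category/foo/12.html", "category/foo/12", "category/foo"):
        print(path, "->", route(path))
```

Because [^/.]+ excludes dots, a path ending in .html can never match the pretty pattern, so the two rules do not compete for the same request.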
To remove the .html extension on the URL and 301 redirect to the extensionless URL you can try the following in the .htaccess in your "specific directory":
RewriteEngine On
RewriteBase /specific-directory
RewriteRule ^(.*)\.html$ $1 [R=301,L]

^ character not working on mod rewrite in htaccess

I'm having this very annoying problem with my rewrite rules in the .htaccess file.
The context
So what I want is to have these two types of URLs rewrite to different targets:
URL 1 -- http://example.com/rem/call/answer/{Hex String}/{Hex String}/
URL 2 -- http://example.com/answer/{Hex String}/{Hex String}/
This is an extract of my .htaccess file:
RewriteEngine On
RewriteRule rem/call/answer/([a-f0-9]+)/([a-f0-9]+)/?$ /TARGET1
RewriteRule answer/([a-f0-9]+)/([a-f0-9]+)/?$ /TARGET2
The problem
Now the problem is that URL 2 rewrites well (using rule #2) and goes to TARGET 2, but URL 1 rewrites with both rules instead of just rule #1.
I tried several solutions, including the obvious use of the character ^ for "start of string". At that point, my rewrite rules were:
RewriteEngine On
RewriteRule rem/call/answer/([a-f0-9]+)/([a-f0-9]+)/?$ /TARGET1
RewriteRule ^answer/([a-f0-9]+)/([a-f0-9]+)/?$ /TARGET2
However, another problem happened. This time it's URL 1 that rewrites well, with only rule #1 and goes to TARGET 1. But now URL 2 doesn't rewrite at all any more. I'm guessing it's because the second rewrite rule never matches any url and thus never applies.
The only solution I found so far is to remove the ^ and use the [L] flag at the end of rule #1 like so:
RewriteEngine On
RewriteRule rem/call/answer/([a-f0-9]+)/([a-f0-9]+)/?$ /TARGET1 [L]
RewriteRule answer/([a-f0-9]+)/([a-f0-9]+)/?$ /TARGET2
This way, it uses rule #1, matches, but never gets to rule #2. Both urls get rewritten properly with these rules, but it is not a good solution since I might not want to stop the rewriting of URL 1 after the first rule applies (what if I have a third rule I would want to apply to it as well...)
My questions to you
Now that I've stated the problem, my questions here are:
Is the [L] flag the only way to go? (which I highly doubt, and certainly hope not)
Would ^ be a candidate solution? (I think so)
If so, how to make it work and why is it not working at all in my case?
What I suspect
I suspect that it has something to do with the fact that the URL is actually http://example.com/answer/{Hex String}/{Hex String}/ and not just answer/{Hex String}/{Hex String}/, which means that answer/.. isn't really at the beginning of the string and thus prefixing it with ^ doesn't work.
But then it brings me to another question:
How to tell apache to strip the url of the scheme+domain part (i.e. http://example.com/) and match rules with the remainder of the url only (e.g. answer/{Hex String}/{Hex String}/) ?
EDIT
I should also add that I've tried the basic alice-bob example. I have a file named bob.html in my root and the following rule in my .htaccess file:
RewriteRule alice.html$ /bob.html
This works just fine and displays the bob.html page when alice.html is queried. However, if I change the rule to:
RewriteRule ^alice.html$ /bob.html
I then get a 404 error when querying the alice.html page...
As for #anubhava's comment, my full .htaccess file is composed as follows:
RewriteEngine On
[A bunch of RewriteRule that have nothing to do with the topic at hand
(don't contain any "answer" string in them and all work perfectly)]
RewriteRule rem/call/answer/([a-f0-9]+)/([a-f0-9]+)/?$ /TARGET1 [L]
RewriteRule answer/([a-f0-9]+)/([a-f0-9]+)/?$ /TARGET2
ErrorDocument 404 /404.html
Header set Access-Control-Allow-Origin "*"
SetEnv file_uploads On
Ok so, thanks to #anubhava's comments, I solved the problem easily by moving the .htaccess file down one level to the www directory.
I was still quite curious about why this solved my problem, so I went on investigating how apache's rewriting works. I'm not sure I've got all the details right, but here's what I found out.
Location location location
Of course, it goes without saying that the location of files is important, and especially configuration files like .htaccess. But it goes even beyond simple file path, and here is the reason why:
First, you need to keep in mind that the .htaccess file will affect the directory it's located in as well as all its subdirectories. So it would seem logical that a global .htaccess file should be placed at the root directory of your website, since it will affect all subdirectories (i.e. the whole website).
The second thing to keep in mind is that the public_html directory (which in my case was called www, simply a symbolic link to public_html) is the root folder of your website's content. You might have access to its parent directories, but whatever you put outside of your public_html directory is not part of your website's content per se. Any resource you put there won't be part of your website's hierarchy (i.e. not accessible via http://example.com/path/to/resource).
The regex anchor ^ matches the start of a string; in the context of URL rewriting, that is the start of the URL path being considered. And that's not all: it seems that Apache resolves matches relative to the location of your .htaccess file. This means that ^ does not refer to the start of the full URL path, but to the start of the path relative to the directory containing the .htaccess file, which acts as a "local root directory" for all the rewrite rules in that specific .htaccess file.
Example
Let's say you have a subdirectory (e.g http://example.com/sub/directory/) and inside it you have two files:
http://example.com/sub/directory/.htaccess
http://example.com/sub/directory/bob.html
inside this .htaccess file, you have a rewrite rule as follows:
RewriteRule ^tom.html$ /sub/directory/bob.html
This rule will not match http://example.com/tom.html, as you might expect from the ^, but will instead match http://example.com/sub/directory/tom.html, since this is where the .htaccess file is located.
Conclusion
Generally speaking, let's say you have a rewrite rule such as:
RewriteRule ^PATH$ /TARGET_PATH
This means that the rule does not match the full URL path against ^PATH$; in effect, it matches it against ^[location of the .htaccess file]/PATH$
In other words, the location of the .htaccess file acts as a sort of base URL for all rewrite rules in it (much like the base tag in HTML).
This is why my rewrite rule with ^ didn't work: my .htaccess file was located above the public_html directory, and that parent directory was acting as the base URL for my rules. Thus the rule would never match any URL, since it compared the URL with a path that is never accessed (because it lies above the website's content root).
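The effect of the anchor on the two rules from the question can be reproduced with a plain regex check. The Python sketch below only looks at whether each pattern matches the request path; it says nothing about how Apache chains rewritten results. The hex values ab12 and cd34 are made up, and the www/ prefix in the last case is an assumption about what the path looks like when the .htaccess file sits one level above the content root.

```python
import re

HEX_PAIR = r"([a-f0-9]+)/([a-f0-9]+)/?$"
RULE_1 = r"rem/call/answer/" + HEX_PAIR
RULE_2_UNANCHORED = r"answer/" + HEX_PAIR
RULE_2_ANCHORED = r"^answer/" + HEX_PAIR

# Paths as seen by a .htaccess in the content root (hex values are made up).
URL_1 = "rem/call/answer/ab12/cd34/"
URL_2 = "answer/ab12/cd34/"
# Assumed shape of URL 2 as seen from one directory above the content root.
URL_2_FROM_PARENT = "www/answer/ab12/cd34/"


def matches(pattern, path):
    """True if the pattern matches anywhere in the path (regex search)."""
    return re.search(pattern, path) is not None


if __name__ == "__main__":
    print("unanchored rule 2 vs URL 1:", matches(RULE_2_UNANCHORED, URL_1))
    print("anchored rule 2 vs URL 1:  ", matches(RULE_2_ANCHORED, URL_1))
    print("anchored rule 2 vs URL 2:  ", matches(RULE_2_ANCHORED, URL_2))
    print("anchored rule 2 from parent:", matches(RULE_2_ANCHORED, URL_2_FROM_PARENT))
```

Without the anchor, rule #2 also matches URL 1 because answer/... appears inside it. With the anchor, it matches only URL 2, but only as long as the path being matched really starts at answer/.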
I hope this was clear enough to help anyone who might encounter the same problem I had.
Cheers

htaccess redirection using regular expression

my .htaccess file contains the following
RewriteCond %{HTTP_HOST} ^www\.mydomain\.org\.in [NC]
RewriteRule ^(.*)$ http://mydomain.org.in/$1 [R=301,L]
I moved the whole site to a subfolder and now none of the css and js files in the webpage load. Can anybody tell me what this regex means or why this is happening?
Note: I inherited the site from my seniors :P
It just redirects any request to www.mydomain.org.in/... to mydomain.org.in/...; i.e. it strips the www from the front. However, this shouldn't cause the resource files to break if you simply move it to a subdirectory, assuming you've moved them as well (though you should probably leave the .htaccess file where it is).
It sounds like the links to your CSS/JS files in your HTML might be broken, perhaps because they use absolute URIs (relative to the domain root rather than the current URI). Try checking them first.
As Will explained, the .htaccess is not the issue. Your JS and CSS files are not referenced with relative paths, so when the source files were moved, browsers could no longer find them and the page does not render properly.
However, you can try the following .htaccess code in addition to what you already have and see if it links to the files.
RewriteRule ^(.+)\.css$ http://mydomain.org.in/folder/$1.css [R=302,NC]
RewriteRule ^(.+)\.js$ http://mydomain.org.in/folder/$1.js [R=302,NC]
The above code redirects calls to css and js files to a subfolder in your domain. Change folder to the folder you moved everything to.