Trouble getting a rule to match without any discerning identifiers - regex

I am looking to find a way to match a URL like www.domain.tld/about or www.domain.tld/contact within my .htaccess file. The rule has to be dynamic as the pages come from a CMS so the rule needs to be able to accept any newly created page.
Currently I have the following rule:
RewriteRule ([^/]+)$ ?cat=generic&page=$1 [L]
The issue is that without a trailing / or anything else to help identify the catch, it just triggers a 404 error page. I used to have the rewrite as:
RewriteRule ([^/]+)/$ ?cat=generic&page=$1 [L]
but decided not to have trailing slashes on the end of URLs, unless it's a folder path.
Thank you anyone who can help on the issue.

Put this rule on top of all other rules in your .htaccess:
Options +FollowSymLinks -MultiViews
# Turn mod_rewrite on
RewriteEngine On
RewriteBase /
RewriteRule ^broadleaf$ /products/desktops/broadleaf-one [R=301,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/?$ /?cat=generic&page=$1 [L,QSA]
/?$ makes the trailing slash optional, and the two RewriteCond lines skip requests for existing files and directories.
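To see which request paths the pattern ^([^/]+)/?$ accepts, here is a minimal sketch that uses Python's re module as a stand-in for Apache's regex engine. The helper name page_for and the sample paths are illustrative assumptions, and the file-existence conditions (!-f, !-d) are not modelled.

```python
import re

# Same pattern as the last RewriteRule above. In .htaccess the path is
# matched without its leading slash.
PAGE_RULE = re.compile(r"^([^/]+)/?$")

def page_for(path):
    """Return the captured page name ($1), or None if the rule would not match."""
    match = PAGE_RULE.match(path)
    return match.group(1) if match else None

# Hypothetical sample paths: one without a slash, one with, one nested.
for sample in ["about", "contact/", "about/team"]:
    print(sample, "->", page_for(sample))
```

Single-segment paths match with or without the trailing slash, while nested paths fall through to later rules.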

Related

Rewrite url in .htaccess, dummy paths don't lead to 404 page but expose PHP warnings

I have these custom .htaccess redirections
# Add a trailing slash to folders that don't have one
RewriteCond %{REQUEST_URI} !(/$|\.)
RewriteRule (.*) %{REQUEST_URI}/ [R=301,L]
# Exclude these folders from rewrite process
RewriteRule ^(admin|ajax|cache|classes|css|img|webassist|js)($|/) - [L]
# Redirect root requests to /home/ folder
RewriteRule ^(/home/)?$ /home/index.php?nLang=it [NC,L]
# Start rewriting rules
RewriteRule ^risultati.htm$ /home/results.php [NC,L,QSA]
RewriteRule ^sfogliabile/(.*).htm$ /flip/browser.php?iCat=$1 [NC,L]
RewriteRule ^depliant/(.*).htm$ /flip/flyer.php?iSpecial=$1 [NC,L]
RewriteRule ^(.*)/ricerca/$ /ricerca/index.php?nLang=$1 [NC,L,QSA]
RewriteRule ^(.*)/professional/$ /home/pro.php?nLang=$1 [NC,L]
RewriteRule ^(.*)/3/(.*)/$ /products/index.php?nLang=$1&iModule=3 [NC,L]
RewriteRule ^(.*)/3/(.*)/(.*)/(.*).htm$ /products/details.php?nLang=$1&iData=$3&iModule=3 [NC,L]
RewriteRule ^(.*)/4/(.*)/$ /foreground/index.php?nLang=$1&iModule=4 [NC,L]
RewriteRule ^(.*)/4/(.*)/(.*)/(.*).htm$ /foreground/details.php?nLang=$1&iData=$3&iModule=4 [NC,L]
RewriteRule ^(.*)/5/(.*)/$ /specials/index.php?nLang=$1&iModule=5 [NC,L]
RewriteRule ^(.*)/5/(.*)/(.*)/(.*).htm$ /specials/details.php?nLang=$1&iData=$3&iModule=5 [NC,L]
RewriteRule ^(.*)/6/(.*)/$ /gallery/index.php?nLang=$1&iModule=6 [NC,L]
RewriteRule ^(.*)/6/(.*)/(.*)/(.*).htm$ /gallery/details.php?nLang=$1&iData=$3&iModule=6 [NC,L]
RewriteRule ^(.*)/(.*)/(.*)/(.*).htm$ /home/page.php?nLang=$1&iData=$3 [NC,L,QSA]
RewriteRule ^(.*)/$ /home/index.php?nLang=$1 [NC,L]
It works pretty fine for all the pages, except when I type in some non existing paths like:
/it/dummy/
/it/dummy/dummy/
/it/dummy/dummy/dummy/
etc...
Instead of a 404 error page, I get a page exposing PHP warnings and notices about missing variables and include files, which could lead to security problems and malicious attacks.
I tried several things to get a RegExp that works with such paths (so I can redirect the user to the 404 page), but no luck. Can you please help me? Thanks in advance.
Change your last rule to this:
# If the request is not for a valid directory
RewriteCond %{REQUEST_FILENAME} !-d
# If the request is not for a valid file
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([a-z]+)/$ home/index.php?nLang=$1 [L,QSA,NC]
That way it will only handle language parameter e.g. /it/ or /en/ but will let other URLs e.g. /it/dummy/ go to 404 handler.
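The difference between the original catch-all pattern and the narrower one can be checked with a small sketch. Python's re module stands in for Apache's regex engine here, the [NC] flag is approximated with re.IGNORECASE, and the helper name matches is an assumption made for illustration.

```python
import re

# Original last rule versus the narrower replacement ([NC] -> IGNORECASE).
CATCH_ALL = re.compile(r"^(.*)/$", re.IGNORECASE)
LANG_ONLY = re.compile(r"^([a-z]+)/$", re.IGNORECASE)

def matches(rule, path):
    """True if the rule pattern matches the path (leading slash already stripped)."""
    return rule.match(path) is not None

# Print, for each sample path, whether each pattern would fire.
for sample in ["it/", "en/", "it/dummy/", "it/dummy/dummy/"]:
    print(sample, matches(CATCH_ALL, sample), matches(LANG_ONLY, sample))
```

The catch-all fires for every path ending in a slash, while the narrower pattern only fires for a single alphabetic segment such as it/.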
At least your last rule
RewriteRule ^(.*)/$ /home/index.php?nLang=$1
sends all requests to /home/index.php and I suppose this script is the source for the warnings you get.
Since you have such a rule, presumably you actually want non-existing files to go to this script. It wouldn't help then to prevent calling the script because Apache couldn't know which urls will work and which not.
So you need to check for missing parameters or include files in your php script. This is especially reasonable because you never know what parameters attackers might call, as you already mentioned. A general rule of thumb is to check all parameters for validity before using them.
After you have added all these checks, it is good practice on a production system to switch off error display (the display_errors entry in php.ini) and only log errors to a file (the log_errors entry).

Apache RewriteCond: how to match only top-level requests (no subdirectory)

After banging my head against this for the better part of a week, it turned out to be the same problem, and solution, as in this thread: RewriteCond in .htaccess with negated regex condition doesn't work?
TL;DR: I had deleted my 404 document at some point. This was causing Apache to run through the rules again when it tried to serve the new page and couldn't. On the second trip through, it would always match my special conditions.
I'm having endless trouble with this regex, and I don't know whether it's because I'm missing something about RewriteCond or what.
Simply, I want to match only top-level requests, meaning any request with no subdirectory. For example I want to match site.com/index.html, but not site.com/subdirectory/index.html.
I thought I would be able to accomplish it with this:
RewriteCond %{REQUEST_URI} !/[^/]+/.*
The interesting thing is, it doesn't work but the reverse does. For example:
RewriteCond %{REQUEST_URI} /[^/]+/.*
That will detect when there is a subdirectory. And it will omit top-level requests (site.com/toplevelurl). But when I put the exclamation point in front to reverse the rule (which RewriteCond is supposed to allow), it stops matching anything.
I've tried many different flavors of regex and different patterns that should work, but none seem to. Any help would be appreciated. This Stack Overflow answer seems like it should answer it but does not work for me.
I've also tested it with this .htaccess rule tester, and my patterns work in the tester, they just don't work on the actual server.
Edit: by request, here is my .htaccess. It allows URLs without file extensions and also does something similar to a custom 404 page (although its purpose is to allow filenames as arguments, not be a 404 replacement).
Options +FollowSymLinks
DirectoryIndex index.php index.html index.htm
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.*)$ $1.php
RewriteCond %{REQUEST_FILENAME} =/home/me/public_html/site/
RewriteRule ^(.*)$ index.php
RewriteCond %{REQUEST_FILENAME} !-f
# Below this is where I would like the new rule
RewriteRule ^(.*)$ newurl.php
</IfModule>
I want to match site.com/index.html, but not site.com/subdirectory/index.html
You can use this pattern in a RewriteRule (add your own substitution and flags):
RewriteRule ^[^/]+/?$
Or using RewriteCond:
RewriteCond %{REQUEST_URI} ^/[^/]+/?$
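As a quick check of the RewriteCond pattern, the sketch below runs it against a few sample URIs using Python's re module as a stand-in for Apache's regex engine. The helper name is_top_level and the sample URIs are assumptions for illustration.

```python
import re

# Pattern from the RewriteCond above. %{REQUEST_URI} includes the leading slash.
TOP_LEVEL = re.compile(r"^/[^/]+/?$")

def is_top_level(request_uri):
    """True if the URI has exactly one path segment (optionally slash-terminated)."""
    return TOP_LEVEL.search(request_uri) is not None

# Hypothetical sample URIs covering the top-level and nested cases.
for uri in ["/index.html", "/toplevelurl", "/subdirectory/index.html", "/subdirectory/"]:
    print(uri, is_top_level(uri))
```

Note that a bare directory request such as /subdirectory/ also counts as top-level under this pattern, because the trailing slash is optional.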

Rewrite URL's .htaccess

I believe this might be a duplicate, but I tried my best to search for something that suits my needs and found nothing.
So here's basically what I have so far, and I will explain what I need modified.
# Forbidden Access
ErrorDocument 403 /403.php
# Not Found
ErrorDocument 404 /404.php
<IfModule mod_rewrite.c>
Options +FollowSymlinks
RewriteEngine On
RewriteBase /
</IfModule>
<IfModule mod_rewrite.c>
# Strip off .php extension if it exists
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s([^.]+)\.php [NC]
RewriteRule ^ %1 [R,L,NC]
# Unless directory, remove trailing slash
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/$ /403.php$1 [R=301,L]
# Resolve .php file for extensionless php urls
RewriteRule ^([^/.]+)$ $1.php [L]
</IfModule>
Now this seems to be working flawlessly. But it has one error. Let me explain first.
1) It automatically strips off the .php extension if it exists. I'm not sure whether it strips off .php when the URL comes from an external request. I forgot to check, but maybe you already know and can tell me?
2) When I type "http://website.dev/img/" it gives me a "403 Forbidden Access". So that's all good.
3) When I try "http://website.dev/index" it loads the page, and if the .php extension is added manually it is stripped off. So all good here too.
4) When I try a random path like "http://website.dev/asdasd" it gives me a "404 Not Found". So we're good here as well.
But the main problem is here...
5) When I try "http://website.dev/dashboard/index" it gives me a 404 Not Found even though it should load without issues. This happens for all pages within the dashboard directory.
Can you help me modify the .htaccess above, please? I am really tired of searching and I don't know regex at all.
That is because of the faulty regex used in your very last rule to silently add .php extension. Change last rule to:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_URI}\.php -f [NC]
RewriteRule ^(.+?)/?$ /$1.php [L]
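The difference between the two patterns can be seen in a short sketch. Python's re module stands in for Apache's regex engine, the helper name captured is an assumption, and the -f file-existence check is not modelled.

```python
import re

OLD_RULE = re.compile(r"^([^/.]+)$")  # last rule from the question
NEW_RULE = re.compile(r"^(.+?)/?$")   # pattern from the replacement rule

def captured(rule, path):
    """Return what $1 would hold for the path, or None if the rule does not match."""
    match = rule.match(path)
    return match.group(1) if match else None

# The old class [^/.] rejects any path containing a slash.
for sample in ["index", "dashboard/index", "dashboard/index/"]:
    print(sample, captured(OLD_RULE, sample), captured(NEW_RULE, sample))
```

Because [^/.] excludes slashes, the old rule never matches dashboard/index, so no .php extension is added and the request ends in a 404.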
Here's my translation of your rules:
# Strip off .php extension if it exists
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s([^.]+)\.php [NC]
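Bad comment. %{THE_REQUEST} holds the raw request line (for example GET /index.php HTTP/1.1), so [A-Z]{3,}\s matches the HTTP method and ([^.]+)\.php captures the path up to the first .php. Nothing anchors the end of the pattern, so anything may follow .php.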
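For reference, %{THE_REQUEST} contains the full request line sent by the browser, so the condition can be tried against such a line. The sketch below uses Python's re module as a stand-in for Apache's regex engine, with re.IGNORECASE approximating [NC]. The helper name captured_path and the sample request lines are assumptions.

```python
import re

# Condition pattern from the question. [NC] corresponds to re.IGNORECASE.
STRIP_PHP = re.compile(r"^[A-Z]{3,}\s([^.]+)\.php", re.IGNORECASE)

def captured_path(request_line):
    """Return what %1 would hold for a raw request line, or None if no match."""
    match = STRIP_PHP.search(request_line)
    return match.group(1) if match else None

# Hypothetical request lines, with and without a .php extension.
for line in ["GET /dashboard/index.php HTTP/1.1", "GET /dashboard/index HTTP/1.1"]:
    print(line, "->", captured_path(line))
```

The condition only fires when the client itself asked for a .php URL, which is why the internal rewrite that adds .php back does not cause a redirect loop.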
# Unless directory, remove trailing slash
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/$ /403.php$1 [R=301,L]
Why is that? Just return the 403 status directly and Apache will handle it for you:
RewriteRule .* - [L,R=403]
And then a last question: why do you strip off the .php extension if you re-add it later on? (°_o)
So here's what you should do, with some examples that you can adapt to fit your needs:
First, test whether the file should be served as is. If so, stop immediately, like this:
RewriteRule ^/?(robots\.txt|404\.php|403\.php)$ - [L]
Then test if someone is trying to hack. If so, redirect to whatever you want:
RewriteRule (.*)test.php - [QSA,L]
RewriteRule (.*)setup.php http://noobs.land.com/ [NC,R,L]
RewriteRule (.*)admin(.*) http://noobs.land.com/ [NC,R,L]
RewriteRule (.*)trackback(.*) http://noobs.land.com/ [NC,R,L]
Then, only after this, forbid the php extension:
RewriteRule (.*)php$ - [L,R=404]
Then, accept all static "known" file extension, and stop if it matches:
RewriteRule (.*)(\.(css|js|htc|pdf|jpg|jpeg|gif|png|ico|mpg|mp3|ogg|wav|otf|eot|svg|ttf|woff)){1}$ $1$2 [QSA,L]
Now you can do some testing. If the URI ends with a 'aabb/', test if you have a file named aabb.php, and if so, go for it:
RewriteCond %{REQUEST_URI} (\/([^\/]+))\/$
RewriteCond %{DOCUMENT_ROOT}/%1.php -f
RewriteRule (.*) %{DOCUMENT_ROOT}/%1.php [QSA,L]
If nothing is handled, and you get here, it's a problem, so stop it:
RewriteRule .* - [L,R=404]
FYI, all those sample rules have been thoroughly tested on a production server.
With that, you have all you need to build something that works.
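The trailing-slash test above, (\/([^\/]+))\/$, can be tried on sample URIs to see what ends up in %1. This is a sketch using Python's re module as a stand-in for Apache's regex engine. The helper name candidate_script is an assumption, and the path it returns is relative to the document root.

```python
import re

# Pattern from the RewriteCond that tests %{REQUEST_URI}.
TRAILING_SEGMENT = re.compile(r"(\/([^\/]+))\/$")

def candidate_script(request_uri):
    """Return the '%1.php' path that the -f check would look for, or None."""
    match = TRAILING_SEGMENT.search(request_uri)
    return match.group(1) + ".php" if match else None

# Hypothetical URIs: slash-terminated, not slash-terminated, and nested.
for uri in ["/aabb/", "/aabb", "/foo/aabb/"]:
    print(uri, "->", candidate_script(uri))
```

Only the last path segment is captured, so under this pattern a nested URI such as /foo/aabb/ is checked against aabb.php in the document root, not against a file inside foo.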

Movable Type to Wordpress migration: htaccess redirection issue

I'm migrating a rather large site (5000+ posts) from Movable Type to WordPress. At this point, I'm stuck trying to ensure that old post URLs won't result in 404s once we go live with the new site.
The old url pattern looks like so:
http://domain.com/site/category/category/post_name.php
And I'd like to redirect those to
http://domain.com/category/category/post_name/
However, I have tried and tried with htaccess redirects, and no matter what I do, it either fails or generates a 500 error. I suspect I'm missing something silly, or that there are conflicting rules maybe, and I'm hoping that someone who knows htaccess better than I do can help me along the right path.
Here's what I've got right now. The rule redirecting /site/ to the root directory works just fine, but the other two have no effect, whether alone or together. I tried both to see if I could redirect a specific post and do it manually that way, but it still won't work.
RewriteEngine On
RewriteRule ^site/(.*) /$1 [NC]
RewriteRule ^site/resources/(.*).php$ /resources/$1 [NC]
RewriteRule ^site/resources/research/safe_urban_form_revisiting_the_relationship_b.php$ /resources/research/safe_urban_form_revisiting_the_relationship_b/ [NC]
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
Any help would be extremely useful!
It looks like you may want to use a redirect something like this:
# Redirect /site/any/path/file.php to /any/path/file/:
RewriteRule ^site/(.+)\.php$ $1/ [NC,R=301,L]
Also, I would place this as the first rule immediately after the RewriteBase / line in the WordPress section.
Since you'll keep the same domain, why don't you just forget about writing the redirection rules yourself and use the Redirection plugin instead? It will be much easier for you to define the redirection rules with the help of the plugin. This is the strategy I follow whenever I can.
The reason your redirects aren't working as expected is that . is a special character in Regular Expressions' syntax -- it means "any character". You need to escape any special characters like ., ^, etc. with a backslash like so: \..
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# Redirect old URLs with ".php" in them.
RewriteRule ^site/(.+)\.php$ $1/ [NC,R=301,L]
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
I'm not sure if you actually want the RewriteRule ^site/(.*) /$1 [NC] rule in there or if it was just testing. If you do, just add it in after the RewriteBase / statement.
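The effect of escaping the dot, and the mapping from old to new URLs, can be sketched as follows. Python's re module stands in for Apache's regex engine, re.IGNORECASE approximates [NC], and the helper name redirect_target is an assumption. The leading slash in the result reflects RewriteBase /.

```python
import re

UNESCAPED = re.compile(r"^site/resources/(.*).php$", re.IGNORECASE)  # from the question
ESCAPED = re.compile(r"^site/(.+)\.php$", re.IGNORECASE)             # from the answers

def redirect_target(path):
    """Return the new path for an old Movable Type URL, or None if no redirect applies."""
    match = ESCAPED.match(path)
    return "/" + match.group(1) + "/" if match else None

print(redirect_target("site/category/category/post_name.php"))

# An unescaped dot accepts any character before "php", not just a literal dot.
print(UNESCAPED.match("site/resources/xphp") is not None)
print(ESCAPED.match("site/resources/xphp") is not None)
```

The unescaped pattern still matches real .php URLs, so escaping the dot mainly prevents accidental matches on paths that merely end in the letters php.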

URL Regex - remove trailing slash from file name and end of URL?

So I have this regex problem and wondered if anyone could help me out?
If a user visits http://example.com/index.php/, how can I modify or add to my regex to prevent trailing slashes at the end?
Also, I currently have a page called post.php that can be accessed like so: http://example.com/reviews/reviewTitle/ and http://example.com/news/newsTitle/
Again, how could I prevent this trailing slash?
Below is the regex I have so far:
RewriteEngine on
RewriteRule ^reviews/([^/\.]+)/?$ reviews/post.php?title=$1 [L]
RewriteRule ^news/([^/\.]+)/?$ news/post.php?title=$1 [L]
RewriteRule ^page/(.*) index.php?page=$1
Note: I'm also rewriting http://example.com/index.php?page=1 to http://example.com/page/1 etc. Same question: how can I prevent a trailing slash?
Many thanks, I really appreciate any help :)
Try adding this rule after RewriteEngine on:
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ $1 [R=301,L]
RewriteRule ^reviews/([^/\.]+)$ reviews/post.php?title=$1 [L]
RewriteRule ^news/([^/\.]+)$ news/post.php?title=$1 [L]
RewriteRule ^page/(.*) index.php?page=$1
Edit: updated to show the full set of rules.
Edit 2: added
RewriteCond %{REQUEST_FILENAME} !-d
above the new rule so that directories can still be accessed without causing an infinite redirect loop.
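How the slash-stripping rule and the stricter reviews rule fit together can be sketched like this. Python's re module stands in for Apache's regex engine, the helper names are assumptions, and the !-d directory check and the redirect round trip are not modelled.

```python
import re

STRIP_SLASH = re.compile(r"^(.*)/$")              # redirect rule pattern
REVIEW_RULE = re.compile(r"^reviews/([^/\.]+)$")  # rewritten rule without /?

def without_trailing_slash(path):
    """Return the path the 301 redirect would point at, or the path unchanged."""
    match = STRIP_SLASH.match(path)
    return match.group(1) if match else path

def review_title(path):
    """Return the title passed to post.php, or None if the rule does not match."""
    match = REVIEW_RULE.match(path)
    return match.group(1) if match else None

redirected = without_trailing_slash("reviews/reviewTitle/")
print(redirected, "->", review_title(redirected))
```

With the optional /? removed from the reviews and news rules, the slash-terminated form only works via the redirect, so each page ends up with a single canonical URL.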