.htaccess how to remove file extensions and index files - regex

I'm trying to achieve a few things through .htaccess, but keep running into issues. Before you tell me I need to research better and there's already a solution on this or a different forum, please know I've already done that. I always try to figure things out on my own before coming here, but this one is truly stumping me. Everything I've tried has only partially worked. Any help or education here would be truly appreciated.
My site has the following simple structure:
(root)
|   index.html
|   .htaccess
|
|___portal-folder
    |   index.php
    |   home.php
    |
    |___admin-folder
        |   index.php
I'm looking to achieve the following:
When a user navigates to any base directory, for instance site.com or site.com/portal-folder/, they don't see the index file name index.html or index.php in their browser.
The same holds true if the user navigates to the full URL site.com/index.html or site.com/portal-folder/index.php: I would like the user to see site.com or site.com/portal-folder/ respectively in their browser.
Strip the file extension off all files in the browser. So for instance navigating to site.com/portal-folder/home.php would show as site.com/portal-folder/home in the browser.
The following code I'm using kind of works, but I'm getting strange behavior. For instance:
navigating to site.com/portal-folder/index doesn't remove the index file name and shows up as site.com/portal-folder/index instead of site.com/portal-folder/ in the browser
navigating to site.com/portal-folder/ doesn't remove the index file name and shows up as site.com/portal-folder/index.php in the browser.
navigating to site.com/portal-folder/index.php takes the user back to the root site.com
navigating to site.com/portal-folder/home works correctly, but navigating to site.com/portal-folder/home.php doesn't strip the .php extension off.
navigating to site.com works correctly, but navigating to site.com/index.html doesn't remove the index file name.
RewriteEngine On
DirectoryIndex index.html index.php
# remove trailing slash
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{THE_REQUEST} \s(.+?)/+[?\s]
RewriteRule ^(.+?)/$ /$1 [R=301,L]
# To internally forward /dir/file to /dir/file.php
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.+?)/?$ /$1.php [L]
Server Information: Apache Version 2.4.46

Have it this way:
DirectoryIndex index.html index.php
RewriteEngine On
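# To externally redirect /dir/file.php to /dir/file and remove index (with or without extension)
# The extension is required for non-index files; otherwise /dir/file would redirect to itself in a loop
RewriteCond %{THE_REQUEST} \s/+(.*?/)?(?:index(?:\.(?:php|html))?|(\S+?)\.(?:php|html))[/\s?] [NC]
RewriteRule ^ /%1%2 [R=301,L,NE]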
# remove trailing slash
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} ^(.+)/+$
RewriteRule ^ %1 [R=301,NE,L]
# To internally forward /dir/file to /dir/file.php
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.+?)/?$ $1.php [L]

Related

mod_rewrite redirects with absolute path in URL

I am trying to use Apache mod_rewrite. The first thing I did was to rewrite my url to an index.php file which was working fine. But I thought I should remove the trailing slash(es) too because I would prefer this to be handled by Apache instead of my PHP router.
Here's the whole content of my .htaccess file:
RewriteEngine on
# one of the attempts to remove trailing slash
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} (.*)/+$
RewriteRule ^(.*)/+$ $1 [R=301,L]
# This is the rewriting to my index.php (working)
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php?/$1 [L]
The issue:
I read several questions about trailing slash removal but I could not find a working answer for me:
For every answer I tried, I was able to reach my PHP router index.php (located in Phunder\public\) without trailing slash:
// Requested URL | No redirection
http://localhost/projects/Phunder/public/home | http://localhost/projects/Phunder/public/home
But when requesting the same page with a trailing slash I get redirected with the absolute path included:
// Requested URL | Wrong redirection
http://localhost/projects/Phunder/public/home/ | http://localhost/C:/xampp/htdocs/projects/Phunder/public/home
Other information:
I always clear my cache while testing
Changing my last RewriteRule to RewriteRule ^(.*)/?$ index.php?/$1 [L] results in a 404 Error with URL having a trailing slash.
The actual wrong redirection results in a 403 Error
I'm a beginner with mod_rewrite and I don't always understand what I'm trying (sadly). Is there something I missed or misused? What should I do to get the expected behaviour?
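Redirect rules need either an absolute URL or a RewriteBase. In a .htaccess file, a relative redirect target gets the directory's filesystem path prepended, which is where the C:/xampp/htdocs/... in your redirect comes from. You can also extract the full URI from %{REQUEST_URI}, like this: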
RewriteEngine on
# one of the attempts to remove trailing slash
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} ^(.+)/+$
RewriteRule ^ %1 [R=301,NE,L]
# This is the rewriting to my index.php (working)
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php?/$1 [L,QSA]
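The other option mentioned above, a RewriteBase, could look roughly like the sketch below. It assumes the .htaccess sits in the folder served at /projects/Phunder/public/ (taken from the URLs in the question); that base path is hard-coded, so it would have to change if the project moves.

```apache
RewriteEngine On
# Assumption: this .htaccess lives in the folder served at /projects/Phunder/public/
RewriteBase /projects/Phunder/public/

# remove trailing slash; the relative target "$1" is resolved against RewriteBase
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+?)/+$ $1 [R=301,L]

# front controller, same as in the question
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php?/$1 [L,QSA]
```

The %{REQUEST_URI} version has the advantage of not depending on where the folder is mounted.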

How to insert part of url via htaccess

I've got old URLs like this:
http://example.com/news/newscategory/77654
And it should be rewritten to this URL structure:
http://example.com/news/newscategory/id/77654
The news part is static here, as it's the folder. The newscategory part can be something else (there are multiple categories). I want to redirect the user (301) to the new URL with the id/ part.
I parse the URL in PHP as key/value pairs: news is a key with newscategory as its value, and id is a key with the number at the end as its value.
I'm using this htaccess for this:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.*$ - [NC,L]
RewriteRule ^.*$ index.php [NC,L]
</IfModule>
Then I use some php library to fetch the values.
Edit: What I'm asking is that I want to redirect user from first url to second one so that I can fetch the id.
The first link is already working, but I still have links to the old url structure that leads to category view. Live urls:
old structure: http://www.autonet.ee/uudised/paevauudised/77665
new structure: http://www.autonet.ee/uudised/paevauudised/id/77665
the old one should redirect user to the new link.
You can use:
RewriteEngine On
# add /id/ in /news/... URLS
RewriteRule ^(news/[\w-]+)/(\d+)$ /$1/id/$2 [R=301,L,NC]
# front controller rule
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ index.php [L]
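For the live URLs quoted in the question, the first path segment is uudised rather than news. A sketch of the same redirect for that case, assuming the .htaccess is in the document root and category names consist only of letters, digits, underscores and hyphens:

```apache
RewriteEngine On
# add /id/ to /uudised/<category>/<number> URLs
# (segment name taken from the live URLs in the question)
RewriteRule ^(uudised/[\w-]+)/(\d+)$ /$1/id/$2 [R=301,L,NC]
```

URLs that already contain /id/ have four segments, so they do not match this pattern and are not redirected again.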
Try this rule, if I understand correctly:
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/([^/]+)$ index.php?$1=$2&$3=$4
Please explain if I missed something.
EDIT
Okay, according to you, news and id are static but their values are dynamic. Please try the rule below:
RewriteRule ^(news)/([^/]+)/(id)/([^/]+)$ index.php?news=$2&id=$4

Error getting .htaccess to direct googlebot using _escaped_fragment_

I am trying to get my pages indexed on Google using a prerendering service for my Backbone app.
I know the setup works fine when I specifically add googlebot to the user agent list, but I've been advised against this in favor of using the _escaped_fragment_ method. The only problem is that the _escaped_fragment_ parameter isn't getting passed correctly. Can someone help, please?
thanks!!!
# html5 pushstate (history) support:
<ifModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [OR]
RewriteCond %{HTTPS} !on
RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]
# If requested resource exists as a file or directory
# (REQUEST_FILENAME is only relative in virtualhost context, so not usable)
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
# Go to it as is
RewriteRule ^ - [L]
# If non existent
# If path ends with / and is not just a single /, redirect to without the trailing /
RewriteCond %{REQUEST_URI} ^.*/$
RewriteCond %{REQUEST_URI} !^/$
RewriteRule ^(.*)/$ $1 [R,QSA,L]
# Handle Prerender.io
RequestHeader set X-Prerender-Token "xxxxxxxx"
RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_
# Proxy the request
RewriteRule ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent|\.ttf|\.woff))(.*) http://service.prerender.io/https://www.example.com/$2 [P,L]
# If non existent
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !index
RewriteRule (.*) index.html [L,QSA]
</ifModule>
All the apache modules are loaded and working.
So the .htaccess is actually correct. Here is Google's official answer, quoted from http://productforums.google.com/forum/#!category-topic/webmasters/crawling-indexing--ranking/bZgWCJTnl08%5B1-25%5D by John Mueller (Google employee):
Looking at your blog's homepage, one thing to keep in mind is that the Fetch
as Googlebot feature does not parse the content that it fetches. So when you
submit toddmoyer.net/blog/ , it fetches that URL. After fetching the URL, it
doesn't parse it to check for the "fragment" meta tag, it just returns it to
you. However, if you fetch toddmoyer.net/blog/#! , then it should rewrite the
URL and fetch the URL toddmoyer.net/blog/?_escaped_fragment_= .
When we crawl and index your pages, we'll notice the meta-tag and act
accordingly. It's just the Fetch as Googlebot feature that doesn't check for
meta-tags, and instead just returns the raw content.

mod_rewrite - How to keep rewritten addresses without changing html links

Let's say I have a website with the following pages:
index.php
info.php
In order to hide the "index.php" part of the URL and to change "info.php" to "info" I create the following .htaccess:
RewriteEngine On
RewriteRule ^info$ info.php
RewriteRule ^.$ index.php
It works just fine when I type the new URLs directly into address bar, but inside index.php I have the:
<a href="info.php">
and so when I click the link, it directs me to the info page and displays "info.php" in the URL. If I want to see just "info" in the URL after clicking the link, do I have to change the links inside HTML code? Or is there a way to make .htaccess file do it automatically?
You can put this code in your DOCUMENT_ROOT/.htaccess file for changing these links automatically:
RewriteEngine On
RewriteBase /
# To externally redirect /dir/file.php to /dir/file
RewriteCond %{THE_REQUEST} \s/+(?:index)?(.*?)\.php[\s?] [NC]
RewriteRule ^ /%1 [R=301,L,NE]
# To internally forward /dir/file to /dir/file.php
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.+?)/?$ /$1.php [L]
But it is highly recommended to change these links in your HTML page to avoid one external redirect.

Rewrite rule with 2 variables with "/" separation

I have an .htaccess file located in the folder "/mixtapes/". I am trying to get the URL mydomain.com/music/downloads/mixtapes/this-title/id to execute mydomain.com/music/downloads/mixtapes/item.php?title=variable1&id=variable2.
The code below somewhat works, but it only uses the id, and I need both variables (../mixtapes/title/id) separated by "/". Also, for some reason, with this code the index page inside "/mixtapes/" does not work. I am stumped! I am somewhat new to this and any help is greatly appreciated!
BTW, on my index page the link passed to the item.php page is written as <a href="title/id">. I just can't seem to get it to properly execute item.php?title=a&id=b with the format mixtapes/title/id.
Current htaccess file located in "/mixtapes/"
# turn mod_rewrite engine on
RewriteEngine On
# rewrite all physical existing file or folder
RewriteCond %{REQUEST_FILENAME} !-f [OR]
RewriteCond %{REQUEST_FILENAME} !-d
# allow things that are certainly necessary
RewriteCond %{REQUEST_URI} "/css/" [OR]
RewriteCond %{REQUEST_URI} "/images/" [OR]
RewriteCond %{REQUEST_URI} "/images/" [OR]
RewriteCond %{REQUEST_URI} "/javascript/"
# rewrite rules
RewriteRule ^mixtapes/item.php(.*) - [L]
RewriteRule ^(.*) item.php?id=$1 [QSA]
The comments in your .htaccess describe the conditions incorrectly. RewriteCond lines apply only to the RewriteRule that follows them, and because this .htaccess lives in "/mixtapes/", patterns are matched against the path relative to that folder:
# turn mod_rewrite engine on
RewriteEngine On
# if requested URL is NOT an existing file
RewriteCond %{REQUEST_FILENAME} !-f
# and if requested URL is NOT a directory
RewriteCond %{REQUEST_FILENAME} !-d
# CSS, images, JavaScript are files, so they never pass the conditions above and are served as-is
# rewrite rules
# rewrite /mixtapes/title/id to item.php?title=title&id=id
RewriteRule ^([^/]+)/([0-9]+)/?$ item.php?title=$1&id=$2 [L,QSA]
# catch all other requests and handle them ( optional, default would be 404 if not a physically existing file )
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) index.php [L,QSA]
I've assumed that your id is a numeric value.
Be careful with how you use the title in PHP. Don't output it directly; instead, use it to verify the URL and redirect wrong title/id combinations.