Codeigniter Routes for filename with extension - regex

I am using codeigniter and its routes system successfully with some lovely regexp, however I have come unstuck on what should be an easy peasy thing in the system.
I want to include a bunch of search engine related files (for Google webmaster etc.) plus the robots.txt file, all in a controller.
So, I have created the controller and updated the routes file, but I don't seem to be able to get it working with these files.
Here's a snip from my routes file:
$route['robots\.txt|LiveSearchSiteAuth\.xml'] = 'search_controller/files';
Within the function I use the URI helper to figure out which content to show.
Now I can't get this to match, which points to my regexp being wrong. I'm sure this is a really obvious one but it's late and my caffeine tank is empty :)

You should not need to escape the full stop; CodeIgniter does most of the escaping for you.
Here is a working example I use:
$route['news/rss/all.rss'] = "news/rss";
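To see why both the escaped and the unescaped form can match, here is a minimal Python sketch of the matching step. It assumes the router anchors each route key as ^key$ before testing it against the URI; the helper name route_matches is hypothetical and this is not CodeIgniter's own code.

```python
import re

def route_matches(route_key: str, uri: str) -> bool:
    # Assumption: the route key is wrapped as ^key$ and used as a regex.
    return re.search("^" + route_key + "$", uri) is not None

# The escaped key from the question matches both file names.
print(route_matches(r"robots\.txt|LiveSearchSiteAuth\.xml", "robots.txt"))
print(route_matches(r"robots\.txt|LiveSearchSiteAuth\.xml", "LiveSearchSiteAuth.xml"))

# An unescaped dot also matches, because "." stands for any character.
print(route_matches("news/rss/all.rss", "news/rss/all.rss"))
print(route_matches("news/rss/all.rss", "news/rss/allXrss"))
```

Under that assumption the unescaped dot is simply looser than the escaped one, which is why the working example above matches.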

The issue was actually in the .htaccess file, where I had created a rewrite exception to allow the search engine files to be accessed directly rather than routing them through CodeIgniter.
RewriteCond $1 !^(index\.php|google421b29fc254592e0.html|LiveSearchSiteAuth.xml|content|robots\.txt|favicon.ico)
Became
RewriteCond $1 !^(index\.php|content|favicon.ico)
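A rough Python sketch of what that change does. It assumes the usual setup where $1 is the requested path captured by the RewriteRule, and that a request is handed to CodeIgniter only when the negated condition passes; the function name is made up for illustration.

```python
import re

OLD = r"^(index\.php|google421b29fc254592e0.html|LiveSearchSiteAuth.xml|content|robots\.txt|favicon.ico)"
NEW = r"^(index\.php|content|favicon.ico)"

def routed_through_codeigniter(condition: str, path: str) -> bool:
    # "RewriteCond $1 !^(...)" passes only when the pattern does NOT match.
    return re.search(condition, path) is None

# With the old exception list, robots.txt never reaches the router.
print(routed_through_codeigniter(OLD, "robots.txt"))
# With the shortened list, it is rewritten to index.php and can be routed.
print(routed_through_codeigniter(NEW, "robots.txt"))
print(routed_through_codeigniter(NEW, "LiveSearchSiteAuth.xml"))
```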

Related

Alternative to <!--#include virtual="somefilename"-->

I have a website running on an old Apache server with SSI enabled. My host wants to move to a new server which has SSI disabled for security reasons.
I have a whole lot of pages with Google Friendly urls which just have one line
<!--#include virtual="Url_Including_Search_String"-->
What is the best alternative to the SSI to keep my google friendly search strings returning the specified search result?
I can achieve most of the results with rewrite rules in the .htaccess file, however some search strings have a space in the keyword but the url doesn't. I can't do this with a rewrite rule
ie www.somedomain.com.au/SYDNEY.htm would have
<!--#include virtual="/search.php?keyword=SYDNEY&Submit=SEARCH"-->
However, the issue is
www.somedomain.com.au/POTTSPOINT.htm would have
<!--#include virtual="/search.php?keyword=POTTS+POINT&Submit=SEARCH"-->
A rewrite rule cannot detect where a space should be in a Suburb name, so hoping there is an alternative for <!--#include virtual=
I have looked at RewriteMap but don't think I can access the file I would need to put this in.
I would use Mod Rewrite to redirect any calls to non-existent files to your Search page.
For example:
http://example.com/SYDNEY redirects to
http://example.com/search.php?q=SYDNEY
(assuming there is not actually a /SYDNEY/ file at your server root.)
Then get rid of all those individual redirect pages.
As for the spaces, I'd modify my actual Search page to recognize (for example) "POTTSPOINT" and figure out that the space should be inserted. Basically compare the search term against a database of substitutions.
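The substitution lookup could be sketched like this in Python (the search page itself is PHP, so treat this purely as an illustration of the idea). The SUBSTITUTIONS table and the search_url helper are hypothetical; the keyword and Submit parameters are taken from the include lines in the question.

```python
from urllib.parse import urlencode

# Hypothetical substitution table; in practice this might live in a database.
SUBSTITUTIONS = {
    "POTTSPOINT": "POTTS POINT",
}

def search_url(slug: str) -> str:
    # Fall back to the slug itself when no substitution is known.
    keyword = SUBSTITUTIONS.get(slug.upper(), slug)
    # urlencode turns the space into "+", as in the original include line.
    return "/search.php?" + urlencode({"keyword": keyword, "Submit": "SEARCH"})

print(search_url("SYDNEY"))
print(search_url("POTTSPOINT"))
```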

Rewriting RewriteRule to include back one particular php file in the path

I inherited a large WordPress site: I was looking at the Wordfence live logs when I found users not loading a particular PHP page. I investigated and through an FTP client I found the file was where it was supposed to be. I used some network tool (in Chrome, Opera and Firefox) and, again, I found that file was returning a 404.
So, I found in the root of the website a short htaccess file containing this line:
RewriteRule ^wp-content/(.*)\.php$ [R=404,L]
I commented this out and reloaded the website: no error anymore. I must say, the error apparently doesn't cause anything strange to the website. But I would like to eliminate it.
I suppose this rule is meant to prevent anyone from making a direct HTTP request to this or any other PHP file in that directory. In this case I suppose the file I'm talking about is loaded through an include rather than requested directly, because what I see in Wordfence is an error that appears after a user directly accesses other pages, not this one in particular.
Anyway, I would like to rewrite this rule so that it stays the same as now, except for that php page. The PHP page is in the path of the theme:
wp-content/themes/themeName/core/css/customized.css.php
Is this possible? Any help is appreciated
If you want to exclude that specific php file from the RewriteRule, you can add a negative lookahead to the regex, like so:
RewriteRule ^wp-content/(?!themes/.*/core/css/customized\.css\.php$)(.*)\.php$ [R=404,L]
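A quick way to check the lookahead before deploying it is to try the same pattern in Python, whose regex syntax handles this expression the same way. The sample paths other than the one from the question are invented, and is_blocked is a hypothetical helper name.

```python
import re

# Same pattern as the rule above; in .htaccess the path has no leading slash.
BLOCK_PHP = re.compile(
    r"^wp-content/(?!themes/.*/core/css/customized\.css\.php$)(.*)\.php$"
)

def is_blocked(path: str) -> bool:
    # True when the rule would fire and answer with a 404.
    return BLOCK_PHP.search(path) is not None

print(is_blocked("wp-content/plugins/example/file.php"))
print(is_blocked("wp-content/themes/themeName/core/css/customized.css.php"))
print(is_blocked("wp-content/themes/themeName/core/css/other.php"))
```

Only the customized.css.php file escapes the rule; every other PHP file under wp-content is still caught.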

301 redirect to correct url

I have a lot of incorrect backlinks and want to 301 redirect them to the correct URLs. The correct URLs look like this:
http://www.domain.com/string-video_string.html
The backlinks are pointing to:
http://www.domain.com/string_string.html
Is there any way to 301 redirect the wrong backlinks to the correct links?
Thank you in advance
You can use this rule in your site root .htaccess:
RedirectMatch 301 ^/([^_-]+)_(.+)$ /$1-video_$2
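To convince yourself the rule does not loop, here is a small Python sketch applying the same pattern. redirect_target is a hypothetical helper that returns None when the rule would not fire.

```python
import re

# Same pattern as the RedirectMatch rule above.
PATTERN = re.compile(r"^/([^_-]+)_(.+)$")

def redirect_target(path: str):
    match = PATTERN.match(path)
    if match is None:
        return None  # rule does not apply, no redirect
    return "/{}-video_{}".format(match.group(1), match.group(2))

print(redirect_target("/string_string.html"))
# The corrected URL contains "-" before the first "_", so it is left alone.
print(redirect_target("/string-video_string.html"))
```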
Depending on how you want to redirect (in which method; PHP, htaccess, etc.) you have some options.
I assume you're seeing 404 errors when users are trying to get to the links from an external source, like a search engine.
If that's the case, you can easily generate the code you need for whichever method you choose using this website:
http://www.rapidtables.com/web/tools/redirect-generator.htm
Make sure that you correctly format the URLs you want to redirect and it should work fine.
If you want to make sure your SEO issues get fixed, you should create a robots.txt file and place it in the root directory of your site (usually where the index file is) - and follow the instructions from this site: http://tools.seobook.com/robots-txt/ to de-index the bad links from the search engine. You may also want to create and submit (or resubmit) XML site maps to the search engines your users use most.

Rewrite first folder to GET param for all php files

I am desperately looking for a rule to achieve the following:
Input URL request would be:
http://myserver.com/param/other/folders/and/files.php
It should redirect to
http://myserver.com/other/folders/and/files.php?p=param
similarly the basic index request
http://myserver.com/param/
would redirect to
http://myserver.com/?p=param
All my php files need the parameter, wherever they are. It'd be nice if JS and CSS files would be excluded but I guess it doesn't really matter since the /file.css?p=param would just be ignored and not cause a problem. I have found rules to map a folder to the GET parameter but none of them are working for php files deeper than the index file on the root level. Thanks so much in advance
Replace
http:\/\/([^\/]+)\/(\w+)\/(.*)
with
http:\/\/\1\/\3\?p=\2
example regex page at https://regex101.com/r/sU6lR9/1
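As a sanity check, here is a minimal Python sketch of that search-and-replace. It assumes the goal is exactly the target URLs from the question, so the remaining path comes first and ?p=param is appended at the end; the function name is made up.

```python
import re

PATTERN = re.compile(r"http://([^/]+)/(\w+)/(.*)")

def move_first_folder_to_param(url: str) -> str:
    # \1 = host, \2 = first folder, \3 = rest of the path
    return PATTERN.sub(r"http://\1/\3?p=\2", url)

print(move_first_folder_to_param(
    "http://myserver.com/param/other/folders/and/files.php"))
print(move_first_folder_to_param("http://myserver.com/param/"))
```

Note that an already rewritten URL would match the pattern again, so a real rewrite rule would probably need a guard such as checking for an existing p parameter.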

With ColdFusion, how do you handle dynamically generated URLs?

(Update: I converted this question to a community wiki as the answer appears more subjective than I thought it would. There are multiple answers depending on one's needs.)
If I have a folder that only includes application.cfc and index.cfm, what is a fast, reliable method to handle dynamically generated URLs? i.e. URLs that do not have a corresponding physical .cfm file.
This example url generates a 404, but it should lookup a page in a db and return it via index.cfm:
http://www.myserver.com/cfdemo/mynewpage.cfm
Should I use onMissingTemplate() in the application.cfc to handle the missing file? Since this method doesn't process onRequestStart(), onRequest() and onRequestEnd(), I wonder if it should be avoided.
Alternatively, I could set up an ISAPI_Rewrite rule since I'm using IIS (or mod_rewrite on Apache):
# IF the request is not /index.cfm, doesn't exist and ends in cfm or html,
# rewrite it. Pass the requested filename $1.$2 as the 1st param: cgi.page
# append the remaining url params $4 ($3 is the ?)
RewriteCond %{SCRIPT_NAME} ^(?!/index.cfm)(.*)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^\/(.*)\.(cfm|html)(\??)(.*)$ /index.cfm?page=$1.$2&$4 [I,L]
Are these methods appropriate, or am I missing a better way of accomplishing this goal? It seems that Coldfusion should have this type of feature built into the application.cfc. Maybe I'm just missing it.
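For anyone wanting to check what the RewriteRule above produces, here is a rough Python sketch of the same regex. It is a simplification: it applies the /index.cfm exclusion to the whole URL, skips the file and directory existence checks, and assumes the [I] flag means case-insensitive matching. The rewrite function is hypothetical.

```python
import re

RULE = re.compile(r"^/(.*)\.(cfm|html)(\??)(.*)$", re.IGNORECASE)
NOT_INDEX = re.compile(r"^(?!/index.cfm)(.*)$")

def rewrite(url: str) -> str:
    # Requests for /index.cfm itself are left untouched.
    if NOT_INDEX.search(url) is None:
        return url
    # Pass the requested file name as "page" and append the original params.
    return RULE.sub(r"/index.cfm?page=\1.\2&\4", url)

print(rewrite("/cfdemo/mynewpage.cfm"))
print(rewrite("/cfdemo/mynewpage.cfm?id=5"))
print(rewrite("/index.cfm?page=x"))
```

One thing the sketch makes visible is the trailing "&" when the original request has no query string.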
There is nothing wrong with URL rewriting at the web server level. I'd vote for that.
Because CF by default handles only cfm/cfc requests, you can do in the beginning of Application.cfc something like this:
<cfif Right(cgi.SCRIPT_NAME, 9) NEQ "index.cfm">
<!--- analyze the SCRIPT_NAME and start processing --->
</cfif>
For other file types, web-server configuration is the only way I can see. But instead of creating rewrite rules you can try custom 404 handlers. At least with IIS you'll be able to get the context in cgi.QUERY_STRING if you set up a dummy page, say 404.cfm (it does not need to exist), and put the following check before the previous example:
<!--- trap 404 requests triggered by IIS --->
<cfif right(cgi.SCRIPT_NAME, 7) EQ "404.cfm">
<cflog file="mylogfile" text="404 error triggered by IIS. Context: #cgi.QUERY_STRING#">
</cfif>
For Apache it is possible to use the following handler, but I'm not sure if you can extract the context in this case:
ErrorDocument 404 /404.cfm
If you are doing this for SES URLs, I'd offer two pieces of advice.
The first is that they matter less and less as time goes on. Google, for example, recognizes that URLs need to include query data.
Second: CF can natively handle SES URLs in the form hostname/file.cfm/param1/param2. Ray Camden's BlogCFC, for example, works that way. It is on by default in CF8, but needs to be enabled in CF7. I don't have a lot of information handy on this, but it should be easy to Google (or Bing, or whatever).
If you can allow it, I'd try to convert URLs like:
http://www.myserver.com/cfdemo/mynewpage.cfm
to:
http://www.myserver.com/cfdemo/mynewpage OR
http://www.myserver.com/index.cfm/cfdemo/mynewpage
so that you don't lose the onRequest methods. The first one can be done only at the webserver level, so in Apache or IIS. The second one can be done in just ColdFusion. See this: http://www.cfcdeveloper.com/index.cfm/2007/4/7/Coldfusion-SES-URL.
Otherwise, if you must have the .cfm at the end, you can use a URL rewrite package in Apache or IIS to strip it out and then forward the request to a cfm page or do what you're doing with onMissingTemplate. I'd try to opt for a solution that doesn't involve losing the onRequest methods, but up to you.
I'd definitely go for URL rewriting. Not only will it be a more predictable, yet generalized approach, but it reduces a significant amount of string parsing load from the CF server. Further, it results in CF handling a request to a real file thereby getting you the benefit of onapplicationstart, onrequeststart, and other events.
As an aside, I've personally always found URLs like /index.cfm/foo/bar/ to look unpro and hackish. Additionally, URLs (like /foo/bar) that don't end in either a file extension or trailing slash are technically incorrect (per old-school static site conventions at the very least) and ought to probably be avoided as well. I'd also be curious where Ben Doom gets his assertion that "The first is that they matter less and less as time goes on. Google, for example, recognizes that URLs need to include query data." In my experience I've actually found the exact opposite to be true.