Is there a "clean URL" (mod_rewrite) equivalent for iPlanet? - coldfusion

I'm working with Coldfusion (because I have to) and we use iPlanet 7 (because we have to), and I would like to pass clean URL's instead of the query-param junk (for numerous reasons). My problem is I don't have access to the overall obj.conf file, and was wondering if there were .htaccess equivalents I could pass on the fly per directory. Currently I am using Application.cfc to force the server to look at index.cfm in root before loading the requested page, but this requires a .cfm file is passed, so it just 404's out if the user provides /path/to/file but no extension. Ultimately, I would like to allow the user to pass domain.com/path/to/file but serve domain.com/index.cfm?q1=path&q2=to&q3=file. Any ideas?

You can mod_dir with the DirectoryIndex directive to set which page is served on /directory/ requests.
http://httpd.apache.org/docs/2.2/mod/mod_dir.html

I'm not sure what exists for iPlanet, haven't had to work with it before. But it would be possible to use a url like index.cfm/path/to/file, and pull the extra path information via the cgi.path_info variable. Not exactly what you're looking for, but cleaner that query-params.

Related

Alternative to <!--#include virtual="somefilename"-->

I have a website running an an old apache server with SSI enabled. My host wants to move to a new server which has SSI disabled for security reasons.
I have a whole lot of pages with Google Friendly urls which just have one line
<!--#include virtual="Url_Including_Search_String"-->
What is the best alternative to the SSI to keep my google friendly search strings returning the specified search result?
I can achieve most of the results with rewrite rules in the .htaccess file, however some search strings have a space in the keyword but the url doesn't. I can't do this with a rewrite rule
ie www.somedomain.com.au/SYDNEY.htm would have
<!--#include virtual="/search.php?keyword=SYDNEY&Submit=SEARCH"-->
However,the issue is
www.somedomain.com.au/POTTSPOINT.htm would have
<!--#include virtual="/search.php?keyword=POTTS+POINT&Submit=SEARCH"-->
A rewrite rule cannot detect where a space should be in a Suburb name, so hoping there is an alternative for <!--#include virtual=
I have looked at RewriteMap but don't think I can access the file I would need to put this in.
I would use Mod Rewrite to redirect any calls to non-existent files to your Search page.
For example:
http://example.com/SYDNEY redirects to
http://example.com/search.php?q=SYDNEY
(assuming there is not actually a /SYDNEY/ file at your server root.)
Then get rid of all those individual redirect pages.
As for the spaces, I'd modify my actual Search page to recognize (for example) "POTTSPOINT" and figure out that the space should be inserted. Basically compare the search term against a database of substitutions.

301 redirect to correct url

I have a lot of incorrect bad links and want to 301 redirect them to the correct one, the correct url are as follows:
Blockquote http://www.domain.com/string-video_string.html
the back links are pointing to:
Blockquote http://www.domain.com/string_string.html
any possible way to 301 redirect the wrong back links to the correct links?
Thank you in adance
You can use this rule in your site root .htaccess:
RedirectMatch 301 ^/([^_-]+)_(.+)$ /$1-video_$2
Depending on how you want to redirect (in which method; PHP, htaccess, etc.) you have some options.
I assume you're seeing 404 errors when users are trying to get to the links from an external source, like a search engine.
If that's the case, you can easily generate the code you need for which ever method you choose using this website:
http://www.rapidtables.com/web/tools/redirect-generator.htm
Make sure that you correctly format the URL's you want to redirect and it should work fine.
If you want to make sure your SEO issues get fixed, you should create a robots.txt file and place it in the root directory of your site (usually where the index file is) - and follow the instructions from this site: http://tools.seobook.com/robots-txt/ to de-index the bad links from the search engine. You may also want to create and submit (or resubmit) XML site maps to the search engines your users use most.

What does this URL mean?

http://localhost/students/index.cfm/register?action=studentreg
I did not understand the use of 'register' after index.cfm. Can anyone please help me understand what it could mean? There is a index.cfm file in students folder. Could register be a folder name?
They might be using special commands within their .htaccess files to modify the URL to point to something else.
Things like pointing home.html -> index.php?p=home
ColdFusion will execute index.cfm. It is up to the script to decide what to do with the /register that comes after.
This trick is used to build SEO friendly URL's. For example http://www.ohnuts.com/buy.cfm/bulk-nuts-seeds/almonds/roasted-salted - buy.com uses the /bulk-nuts-seeds/almonds/roasted-salted to determine which page to show.
Whats nice about this is it avoids custom 404 error handlers and URL rewrites. This makes it easier for your application to directly manage the URL's used.
I don't know if it works on all platforms, as I've only used it on IIS.
You want to look into the cgi.PATH_INFO variable, it is populated automatically by CF server when such URL format used.
Better real-life example would look something like this.
I have an URL which I want to make prettier:
http://mybikesite/index.cfm?category=bicycles&manufacturer=cannondale&model=trail-sl-4
I can rewrite it this way:
http://mybikesite/index.cfm/category/bicycles/manufacturer/cannondale/model/trail-sl-4
Our cgi.PATH_INFO value is: /category/bicycles/manufacturer/cannondale/model/trail-sl-4
We can parse it using list functions to get the same data as original URL gives us automatically.
Second part of your URL is plain GET variable, it is pushed into URL scope as usually.
Both formats can be mixed, GET vars may be used for paging or any other secondary stuff.
index.cfm is using either a CFIF IsDefind("register") or a CFIF #cgi.Path_Info# CONTAINS statements to execute a function or perform a logic step.

Using onMissingTemplate in lieu of stub cfm files

In our ColdFusion application each request goes through index.cfm
Application.cfc decides form the query and form parameters which componetes the user is actually wanted. Those components are instantiated and the content is dropped through OnRequestStart.
Rather than always hit index.cfm with a query/form parameter, for simple cases, we would like to hit a "missing" cfm (i.e. MyApp.cfm) and allow the OnMissingTemplate function parse out the fact that we really want the content of a component (i.e. MyApp).
Another way to do this would be to actualy put cfm stub files in for "generic" calls to the components but it seems like with OnMissingTemplate we do not need to do that.
Is this a reasonable use for OnMissingTemplate?
That's a great use for onMissingTemplate. Just make sure that if you're using IIS, that you make sure that the files you're linking to are actually .cfm (MyApp.cfm) files, and not directories (/MyApp/). See these links for more information:
http://www.bennadel.com/blog/1625-ColdFusion-8-s-OnMissingTemplate-So-Close-To-Being-Good.htm
http://www.bennadel.com/blog/1694-ColdFusion-s-OnMissingTemplate-Event-Handler-Works-With-CFC-Requests.htm

With Coldfusion, how do you handle dynamicaly generated URLs?

(Update: I converted this question to a community wiki as the answer appears more subjective than I thought it would. There are multiple answers depending on one's needs.)
If I have a folder that only includes application.cfc and index.cfm, what is a fast, reliable method to handle dynamically generated URLs? i.e. URLs that do not have a corresponding physical .cfm file.
This example url generates a 404, but it should lookup a page in a db and return it via index.cfm:
http://www.myserver.com/cfdemo/mynewpage.cfm
Should I use onMissingTemplate() in the application.cfc to handle the missing file? Since this method doesn't process onRequestStart(), onRequest() and onRequestEnd(), I wonder if it should be avoided.
Alternately, I could setup an ISAPIRewrite rule since I'm using IIS (or mod_rewrite on Apache)
# IF the request is not /index.cfm, doesn't exist and ends in cfm or html,
# rewrite it. Pass the requested filename $1.$2 as the 1st param: cgi.page
# append the remaining url params $4 ($3 is the ?)
RewriteCond %{SCRIPT_NAME} ^(?!/index.cfm)(.*)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^\/(.*)\.(cfm|html)(\??)(.*)$ /index.cfm?page=$1.$2&$4 [I,L]
Are these methods appropriate, or am I missing a better way of accomplishing this goal? It seems that Coldfusion should have this type of feature built into the application.cfc. Maybe I'm just missing it.
nothing wrong with url rewrite on web server level. I'd vote for that.
Because CF by default handles only cfm/cfc requests, you can do in the beginning of Application.cfc something like this:
<cfif Right(cgi.SCRIPT_NAME, 9) NEQ "index.cfm">
<!--- analyze the SCRIPT_NAME and start processing --->
</cfif>
For other filetypes using web-server configuration is the only way I can see. But instead of creating rewriting rules you can try to use custom 404 handlers. At least when using IIS you'll be able to get the context in cgi.QUERY_STRING, if set up the dummy page, say 404.cfm (it does not need to exist) and putting following check before previous example:
<!--- trap 404 requests triggered by IIS --->
<cfif right(cgi.SCRIPT_NAME, 7) EQ "404.cfm">
<cflog file="mylogfile" text="404 error triggered by IIS. Context: #cgi.QUERY_STRING#">
</cfif>
For Apache it is possible to use following handler, but I'm not sure if you can extract the context in this case:
ErrorDocument 404 /404.cfm
If you are doing this for SES URLs, I'd offer two pieces of advice.
The first is that they matter less and less as time goes on. Google, for example, recognizes that URLs need to include query data.
Second: CF can natively handle SES URLs in the form hostname/file.cfm/param1/param2. Ray Camden's BlogCFC, for example, works that way. It is on by default in CF8, but needs to be enabled in CF7. I don't have a lot of information handy on this, but it should be easy to Google (or Bing, or whatever).
If you can allow it, I'd try to convert URLs like:
http://www.myserver.com/cfdemo/mynewpage.cfm
to:
http://www.myserver.com/cfdemo/mynewpage OR
http://www.myserver.com/index.cfm/cfdemo/mynewpage
so that you don't lose the onRequest methods. The first one can be done only at the webserver level, so in Apache or IIS. The second one can be done in just ColdFusion. See this: http://www.cfcdeveloper.com/index.cfm/2007/4/7/Coldfusion-SES-URL.
Otherwise, if you must have the .cfm at the end, you can use a URL rewrite package in Apache or IIS to strip it out and then forward the request to a cfm page or do what you're doing with onMissingTemplate. I'd try to opt for a solution that doesn't involve losing the onRequest methods, but up to you.
I'd definitely go for URL rewriting. Not only will it be a more predictable, yet generalized approach, but it reduces a significant amount of string parsing load from the CF server. Further, it results in CF handling a request to a real file thereby getting you the benefit of onapplicationstart, onrequeststart, and other events.
As an aside, I've personally always found URLs like /index.cfm/foo/bar/ to look unpro and hackish. Additionally, URLs (like /foo/bar) that don't end in either a file extension or trailing slash are technically incorrect (per old-school static site conventions at the very least) and ought to probably be avoided as well. I'd also be curious where Ben Doom gets his assertion that "The first is that they matter less and less as time goes on. Google, for example, recognizes that URLs need to include query data." In my experience I've actually found the exact opposite to be true.