What does this URL mean? - coldfusion

http://localhost/students/index.cfm/register?action=studentreg
I did not understand the use of 'register' after index.cfm. Can anyone please help me understand what it could mean? There is a index.cfm file in students folder. Could register be a folder name?

They might be using special commands within their .htaccess files to modify the URL to point to something else.
Things like pointing home.html -> index.php?p=home

ColdFusion will execute index.cfm. It is up to the script to decide what to do with the /register that comes after.
This trick is used to build SEO friendly URL's. For example http://www.ohnuts.com/buy.cfm/bulk-nuts-seeds/almonds/roasted-salted - buy.com uses the /bulk-nuts-seeds/almonds/roasted-salted to determine which page to show.
Whats nice about this is it avoids custom 404 error handlers and URL rewrites. This makes it easier for your application to directly manage the URL's used.
I don't know if it works on all platforms, as I've only used it on IIS.

You want to look into the cgi.PATH_INFO variable, it is populated automatically by CF server when such URL format used.
Better real-life example would look something like this.
I have an URL which I want to make prettier:
http://mybikesite/index.cfm?category=bicycles&manufacturer=cannondale&model=trail-sl-4
I can rewrite it this way:
http://mybikesite/index.cfm/category/bicycles/manufacturer/cannondale/model/trail-sl-4
Our cgi.PATH_INFO value is: /category/bicycles/manufacturer/cannondale/model/trail-sl-4
We can parse it using list functions to get the same data as original URL gives us automatically.
Second part of your URL is plain GET variable, it is pushed into URL scope as usually.
Both formats can be mixed, GET vars may be used for paging or any other secondary stuff.

index.cfm is using either a CFIF IsDefind("register") or a CFIF #cgi.Path_Info# CONTAINS statements to execute a function or perform a logic step.

Related

Use RegEx to redirect using data from files

Recently, we restructured a large site of one of our customers. This caused all the news-articels on that site to be on a different place. Problem is that the google cache is still showing them on the old location, leading to A LOT of 404 not founds ( its about 1400 news entries ).
Normally, a redirect using somewhat simple regex would be fine, but not only the path to the news did change, but also some parameters. Example:
Old Url:
http://www.customers-url.com/old/path/to/the/news/details/?tx_ttnews%5Btt_news%5D=67&cHash=a782f3027c4d00face6df122a76c38ed
How the new url should look like:
http://www.customers-url.com/new/path/to/news/?tx_news_pi1%5Bnews%5D=65
As you can see, the parameter D did change from 67 to 65 and the part of the URL before the ? did also change. Also, tx_ttnews has changed to tx_news and tt_news changed to news and the &cHash part did fall away completely.
I have the changed ids in a csv in the following format:
old_id, new_id
1,2
2,3
...etc...
Same goes the the changed url part before the ?. Since im not exactly an expert in using regex my question is:
Can this be done inside the .htaccess using RegEx ( not sure if it can even use a file as input)? How complicated is that? And how would such a regular expression look like?
Rather than trying to use .htaccess, it would be easier to manage and easier to code if you simply make a new page that responds on the old url (/old/path/to/the/news/details), then make that page build the new url and return a 301 to the browser with that new url.

Regex in cfif to check there's a parameter in URL?

I'm trying to display a block of code only if the parameter ?staff is appended to URL, e.g.:
display Link only if the current URL was loaded with www.blank.com/folder/?staff
Any thoughts?
I don't believe there is any kind of performance difference, but to me it's best practices to use StructKeyExists(url,'staff') rather than isDefined("url.staff"). Either one will definitely get the job done though.
You don't need to parse the url with a regex, It should be in the url variable already if defined
<cfif IsDefined("URL.staff")>

How do you acquire the rewritten CFWheels URL of an given page?

CFWheels has the URLFor() function for getting the internal URL based on supplied arguments. Is there a way to get the internal URL without supplying any arguments?
For example:
Given a user navigates to "http://somedomain.com" or "http://somedomain.com/about/" or "http://somedomain.com/contact/" is there a method like ReWrittenURL() that returns something like "/" or "/about/" or "/contact/"?
Using URLFor() with no arguments returns "/home/index" or "/about/index" or "/contact/index".
CGI.SCRIPT_NAME returns "/rewrite.cfm"
Obviously with Javascript using document.location.href I can get what I'm after.
Does CGI.path_info have the value you're looking for?
edit
At first, I deleted this post, being utterly confounded. Now I've done a little test - I downloaded the latest wheels core files (1.1.6), extracted to an IIS 7.5 (with URL Rewrite module installed) + CF9 webserver, and edited the "web.config" file in the core root, setting "enabled='true'" for the rewrite rule. Also, since I was running this example from a subfolder, I changed the path from "/rewrite.cfm" to just "rewrite.cfm". This got me to the point where I was able to successfully requests urls like this:
http://server/wheelstest/wheels/wheels
From here, I edited the layout.cfm under views/wheels, adding:
<cfdump var="#cgi#">
When I then request the above URL (/wheelstest/wheels/wheels), I see the dump for the cgi scope. Under path_info, this is the value: /wheels/wheels.
Next, I added a blank "index.cfm" file under views/wheels.
When I request /wheelstest/wheels, I get this for path_info: "/wheels".
When I request /wheelstest/wheels/, I get this for path_info: "/wheels/".
When I request /wheelstest/wheels/index, I get this for path_info: "/wheels/index".
When I request /wheelstest/wheels/index/, I get this for path_info: "/wheels/index/".
So basically - cgi.path_info is doing for me exactly what you describe you want. What is different about your setup than mine, such that it isn't returning that value for you?
there might be a better way to do this... but here I go anyway
every page gets sent the #params#
<cfdump var="#params#">
<cfoutput>#params.action#/#params.controller#/#params.key#</cfoutput>
<cfabort>
try putting that in a controller and see the results
the problem is that if the objects inside the params object don't exist you get an error. So the path that gets generated needs to check if the struct key exists and edit accordingly.
CGI.Path_Info will give you the desired results. I've been trying different options however they all failed and went into the redirect loop. As soon as I switched CGI.path_info it all started well.

Is there a "clean URL" (mod_rewrite) equivalent for iPlanet?

I'm working with Coldfusion (because I have to) and we use iPlanet 7 (because we have to), and I would like to pass clean URL's instead of the query-param junk (for numerous reasons). My problem is I don't have access to the overall obj.conf file, and was wondering if there were .htaccess equivalents I could pass on the fly per directory. Currently I am using Application.cfc to force the server to look at index.cfm in root before loading the requested page, but this requires a .cfm file is passed, so it just 404's out if the user provides /path/to/file but no extension. Ultimately, I would like to allow the user to pass domain.com/path/to/file but serve domain.com/index.cfm?q1=path&q2=to&q3=file. Any ideas?
You can mod_dir with the DirectoryIndex directive to set which page is served on /directory/ requests.
http://httpd.apache.org/docs/2.2/mod/mod_dir.html
I'm not sure what exists for iPlanet, haven't had to work with it before. But it would be possible to use a url like index.cfm/path/to/file, and pull the extra path information via the cgi.path_info variable. Not exactly what you're looking for, but cleaner that query-params.

With Coldfusion, how do you handle dynamicaly generated URLs?

(Update: I converted this question to a community wiki as the answer appears more subjective than I thought it would. There are multiple answers depending on one's needs.)
If I have a folder that only includes application.cfc and index.cfm, what is a fast, reliable method to handle dynamically generated URLs? i.e. URLs that do not have a corresponding physical .cfm file.
This example url generates a 404, but it should lookup a page in a db and return it via index.cfm:
http://www.myserver.com/cfdemo/mynewpage.cfm
Should I use onMissingTemplate() in the application.cfc to handle the missing file? Since this method doesn't process onRequestStart(), onRequest() and onRequestEnd(), I wonder if it should be avoided.
Alternately, I could setup an ISAPIRewrite rule since I'm using IIS (or mod_rewrite on Apache)
# IF the request is not /index.cfm, doesn't exist and ends in cfm or html,
# rewrite it. Pass the requested filename $1.$2 as the 1st param: cgi.page
# append the remaining url params $4 ($3 is the ?)
RewriteCond %{SCRIPT_NAME} ^(?!/index.cfm)(.*)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^\/(.*)\.(cfm|html)(\??)(.*)$ /index.cfm?page=$1.$2&$4 [I,L]
Are these methods appropriate, or am I missing a better way of accomplishing this goal? It seems that Coldfusion should have this type of feature built into the application.cfc. Maybe I'm just missing it.
nothing wrong with url rewrite on web server level. I'd vote for that.
Because CF by default handles only cfm/cfc requests, you can do in the beginning of Application.cfc something like this:
<cfif Right(cgi.SCRIPT_NAME, 9) NEQ "index.cfm">
<!--- analyze the SCRIPT_NAME and start processing --->
</cfif>
For other filetypes using web-server configuration is the only way I can see. But instead of creating rewriting rules you can try to use custom 404 handlers. At least when using IIS you'll be able to get the context in cgi.QUERY_STRING, if set up the dummy page, say 404.cfm (it does not need to exist) and putting following check before previous example:
<!--- trap 404 requests triggered by IIS --->
<cfif right(cgi.SCRIPT_NAME, 7) EQ "404.cfm">
<cflog file="mylogfile" text="404 error triggered by IIS. Context: #cgi.QUERY_STRING#">
</cfif>
For Apache it is possible to use following handler, but I'm not sure if you can extract the context in this case:
ErrorDocument 404 /404.cfm
If you are doing this for SES URLs, I'd offer two pieces of advice.
The first is that they matter less and less as time goes on. Google, for example, recognizes that URLs need to include query data.
Second: CF can natively handle SES URLs in the form hostname/file.cfm/param1/param2. Ray Camden's BlogCFC, for example, works that way. It is on by default in CF8, but needs to be enabled in CF7. I don't have a lot of information handy on this, but it should be easy to Google (or Bing, or whatever).
If you can allow it, I'd try to convert URLs like:
http://www.myserver.com/cfdemo/mynewpage.cfm
to:
http://www.myserver.com/cfdemo/mynewpage OR
http://www.myserver.com/index.cfm/cfdemo/mynewpage
so that you don't lose the onRequest methods. The first one can be done only at the webserver level, so in Apache or IIS. The second one can be done in just ColdFusion. See this: http://www.cfcdeveloper.com/index.cfm/2007/4/7/Coldfusion-SES-URL.
Otherwise, if you must have the .cfm at the end, you can use a URL rewrite package in Apache or IIS to strip it out and then forward the request to a cfm page or do what you're doing with onMissingTemplate. I'd try to opt for a solution that doesn't involve losing the onRequest methods, but up to you.
I'd definitely go for URL rewriting. Not only will it be a more predictable, yet generalized approach, but it reduces a significant amount of string parsing load from the CF server. Further, it results in CF handling a request to a real file thereby getting you the benefit of onapplicationstart, onrequeststart, and other events.
As an aside, I've personally always found URLs like /index.cfm/foo/bar/ to look unpro and hackish. Additionally, URLs (like /foo/bar) that don't end in either a file extension or trailing slash are technically incorrect (per old-school static site conventions at the very least) and ought to probably be avoided as well. I'd also be curious where Ben Doom gets his assertion that "The first is that they matter less and less as time goes on. Google, for example, recognizes that URLs need to include query data." In my experience I've actually found the exact opposite to be true.