Evaluating ColdFusion directory path for file existence

I've been working on a script for a couple of weeks, tweaking it as needed, to track CGI.SCRIPT_NAME and CGI.HTTP_REFERER on an older ColdFusion install that has over 500 .cfc and .cfm pages. I just hit a snag in my code: it doesn't capture a page name from CGI.HTTP_REFERER when the referer is a folder. I'm sure it has something to do with ColdFusion automatically looking for an index.cfm even when the path doesn't include an actual file name.
How can I write an addition to my script so that, if there is no .cfm in CGI.HTTP_REFERER, it searches the directory and captures the default file set to load, or at least searches for an occurrence of index.cfm or default.cfm?
Here is a block of code handling the referer element:
<!---Variable declared and set to empty--->
<cfset referer_path_and_file = "">
<cfset referer_path = "">
<cfset referer_file_name = "">
<cfset script_path_and_file = "">
<cfset script_path = "">
<cfset script_file_name = "">
<cfif cgi.HTTP_REFERER neq ''>
<!--- all of this will fail if there is no referer, for instance, if they bookmark the page --->
<!--- cgi.HTTP_REFERER may contain URL parameters, so let's strip those --->
<cfset referer_path_and_file = ListFirst(CGI.HTTP_REFERER, "?")>
<!--- now let's get just the path, stripping out the web server info --->
<cfset referer_path = ListDeleteAt(CGI.HTTP_REFERER, ListLen(CGI.HTTP_REFERER, "/"), "/")>
<cfset referer_path = ReplaceNoCase(referer_path, "https", "", "All")>
<cfset referer_path = ReplaceNoCase(referer_path, "http", "", "All")>
<cfset referer_path = ReplaceNoCase(referer_path, "://machine1.fss.com", "", "All")>
<cfset referer_path = ReplaceNoCase(referer_path, "://www_dev.fss.com", "", "All")>
<cfset referer_path = ReplaceNoCase(referer_path, "://www.fss.com", "", "All")>
<cfset referer_path = ReplaceNoCase(referer_path, "://10.11.2.60/", "", "All")>
<cfset referer_path = referer_path & "/">
<cfset referer_path = ReplaceNoCase(referer_path, "/", "\", "All")>
<!--- now let's remove everything but the file name --->
<cfset referer_file_name = ListLast(referer_path_and_file, "/")>
<!--- and that leaves us with these variables set --->
<!--- referer_path_and_file = "#referer_path_and_file#"<br />
referer_path = "#referer_path#"<br />
referer_file_name = "#referer_file_name#"<br />
<br />--->
</cfif>
<!---Directory Stripping And Modifier Block Goes Here--->
<!---Set CGI System Variables--->
<cfset currentHeader = CGI.HTTP_REFERER >
<cfset currentScriptPage = CGI.SCRIPT_NAME >
<!---Set currentScriptPage as command line directory string and declare new variable "reverseScriptPage"--->
<cfset reverseScriptPage = ReReplace(#currentScriptPage#, "/", "\","ALL")>
<!---Set reverseScriptPage value as newly formatted command line directory structure--->
<cfset newScriptPage = ListSetAt(#reverseScriptPage#, 1, "#reverseScriptPage#") >
The code strips the HTTP scheme and host from the CGI script and referer variables, separates the directory structure from the file name, and inserts the .cfm file name and the original directory structure into a DB table. Before the insert it reverses the / characters to \, because they want to set up a script that loops through the table, sees something like "\admin\controls\", auto-creates those directories, and then copies the example.cfm page into that directory. The aim is to determine which of the 500 cfc/cfm files are still used in the application, copy them and their directory structure to a new location, and then redesign those files in a new technology that isn't ColdFusion.
Update: I'm running into an issue with my code. In testing it works well and truncates the http domain portion. However, once it's operating live under the web server, it doesn't truncate the URL, despite there being a ReplaceNoCase call to do so:
Under the web root (wwwroot), it works well and gives this output:
refererPage: testFiles.cfm refererPath = testCodes\MVC
Under the live site I get this:
refererPage: client_display refererPath: **:\dev.fss.com\admin_area**
despite having this line in my code:
Any idea why?

If your cgi.http_referer variable does not contain .cfm, you can use the DirectoryExists function on your referer_path variable. If it returns true, you can use the DirectoryList function or the cfdirectory tag to search for an occurrence of index.cfm or default.cfm.
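A minimal sketch of that check, assuming the referer has already been reduced to a URL path. The names referer_url_path, default_docs and referer_dir are hypothetical, and the list of default documents is only a guess, since the real list is configured in the web server rather than in ColdFusion:

```cfml
<!--- Hypothetical input: the URL path part of the referer, with no file name --->
<cfset referer_url_path = "/admin/controls/">
<!--- Assumed default documents; the real list lives in the web server config --->
<cfset default_docs = "index.cfm,default.cfm">
<cfset referer_file_name = "">

<!--- ExpandPath maps the URL path to an absolute directory on disk --->
<cfset referer_dir = ExpandPath(referer_url_path)>
<cfif DirectoryExists(referer_dir)>
    <!--- take the first default document that actually exists in that folder --->
    <cfloop list="#default_docs#" index="doc">
        <cfif FileExists(referer_dir & "/" & doc)>
            <cfset referer_file_name = doc>
            <cfbreak>
        </cfif>
    </cfloop>
</cfif>
```

If referer_file_name is still empty afterwards, either the folder is not under the ColdFusion web root (a web server virtual directory, for instance) or it uses a different default document.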

They may have this going through a framework (like a model-view-controller setup), but it's hard to say without knowing more about the URL structures and the naming conventions.
Without knowing more, I would say you are dealing with dynamic content (especially if it is going through index.cfm). Even in an engine with 500 pages, there is a unique identifier, and that should be your target, not a file. So we may assume there are no files at all and pages are assembled from parts and pieces here and there, based on your URL query string, local variables and/or form variables.
So tables are your friends. Examine your URL structure, try to break down the parameters, and search the code base for those parameters. Once you have located the area that builds the pages, set your tracking tools somewhere there (a bit higher upstream in the page request).
Maybe with some code snippets we could give you a more precise answer but for now this should at least get you looking at your code base for clues.
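On the update in the question: the live output still contains dev.fss.com, which is not one of the host names the ReplaceNoCase calls remove (machine1.fss.com, www_dev.fss.com, www.fss.com and 10.11.2.60), so only the "http" part is stripped. One way to avoid listing every host is to remove the scheme and host with a single regular expression. This is a sketch only, and referer_url_path is a hypothetical variable name:

```cfml
<cfif cgi.HTTP_REFERER neq "">
    <!--- drop the query string, as in the question --->
    <cfset referer_path_and_file = ListFirst(cgi.HTTP_REFERER, "?")>
    <!--- remove scheme, host and optional port in one step, whatever the host name is --->
    <cfset referer_url_path = REReplaceNoCase(referer_path_and_file, "^https?://[^/]+", "")>
</cfif>
```

With a referer such as http://dev.fss.com/admin_area/client_display, this leaves /admin_area/client_display, which can then be split into path and file name as before.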

Related

ColdFusion cffeed/cfoutput

I'm currently using a combination of cffeed and cfoutput to generate an XML/RSS feed, but am getting some curious output, which manifests differently with different browser settings (I think).
The ColdFusion code that produces the XML is
<cfset RssDetails= StructNew()>
<cfset RssDetails.version = "rss_2.0">
<cfset RssDetails.title = #someTitle#>
<cfset RssDetails.link = "#someLink#">
<cfset RssDetails.description = #someDetails#>
<cfset RssDetails.pubDate = now()>
<cfset RssDetails.item = ArrayNew(1)>
<cfloop query="queryResults">
<cfset RssDetails.item[currentRow] = structNew()>
<cfset RssDetails.item[currentRow].title = #someResultTitle#>
<cfset RssDetails.item[currentRow].description = structNew()>
<cfset RssDetails.item[currentRow].description.value = #someResultData#>
<cfset RssDetails.item[currentRow].link = "#someResultLink#">
</cfloop>
<cffeed action="create" name="#RssDetails#" overwrite="true" xmlVar="someXML">
<cfoutput>#someXML#</cfoutput>
The basic output looks fine in a browser window, but if I then 'View Source' then there's several lines of 'whitespace' that are before and after the main body of XML. The format of the 'whitespace' when observed in 'View Source' is:
As mentioned above, the erroneous/additional output seems to vary with browser settings, although I've not worked out which ones yet, but ultimately, I'd like to remove the whitespace from the CF-generated XML, rather than rely on browser settings.
I've tried a couple of additional options in the cffeed command, but can't seem to hit a successful outcome...grateful for any thoughts or questions,
Phil
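One approach that is often used for this kind of stray whitespace, sketched here under the assumption that the extra lines come from CFML processed before and after the feed output (Application.cfm, the cfset lines, and so on), is to reset the output buffer with cfcontent immediately before writing the XML and to stop processing right after it:

```cfml
<cffeed action="create" name="#RssDetails#" overwrite="true" xmlVar="someXML">
<!--- reset="true" discards anything already written to the output buffer --->
<cfcontent type="text/xml" reset="true"><cfoutput>#someXML#</cfoutput><cfabort>
```

The cfabort is there only to keep later code from adding trailing whitespace. It also skips any end-of-request processing, so it may not suit every application.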

Excluding items from a list in coldfusion by type

Is there a way to exclude certain items by filetype in a list in Coldfusion?
Background: I just integrated a compression tool into an existing application and ran into a problem: the previous developer's code automatically grabs each file from the upload destination on the server and pushes it to the Network Attached Storage (NAS). The aim now is to stop their NAS migration code from moving all files to the NAS and have it move only those which are not PDFs. What I want to do is loop through their variable that stores the names of the uploaded files, exclude the PDFs from the list, and then pass the list on to the NAS code, so that all non-PDFs are moved and all uploaded PDFs remain on the server. Working with their code is a challenge, as no one commented or documented anything, and I've been trying several approaches.
<cffile action="upload" destination= "c:\uploads\" result="myfiles" nameconflict="makeunique" >
<cfset fileSys = CreateObject('component','cfc.FileManagement')>
<cfif Len(get.realec_transactionid)>
<cfset internalOnly=1 >
</cfif>
**This line below is what I want to loop through and exclude file names
with pdf extensions **
<cfset uploadedfilenames='#myfiles.clientFile#' >
<CFSET a_insert_time = #TimeFormat(Now(), "HH:mm:ss")#>
<CFSET a_insert_date = #DateFormat(Now(), "mm-dd-yyyy")#>
**This line calls their method from another cfc that has all the file
migration methods.**
<cfset new_file_name = #fileSys.MoveFromUploads(uploadedfilenames)#>
**Once it moves the file to the NAS, it inserts the file info into the
DB table here**
<cfquery name="addFile" datasource="#request.dsn#">
INSERT INTO upload_many (title_id, fileDate, filetime, fileupload)
VALUES('#get.title_id#', '#dateTimeStamp#', '#a_insert_time#', '#new_file_name#')
</cfquery>
<cfelse>
<cffile action="upload" destination= #ExpandPath("./uploaded_files/zip.txt")# nameconflict="overwrite" >
</cfif>
Update 6/18
Trying the recommended code helps with sorting out file types when tested outside of the application, but any time it's integrated into the application to operate on the variable uploadedfilenames, the rest of the application fails and the multi-file upload module just throws a status 500 error, with no errors reported in the CF logs. I've found that simply running a cfloop on another variable unrelated to anything in the code still causes it to error.
As per my understanding, you want to filter-out file names with a specific file type/extension (ex: pdf) from the main list uploadedfilenames. This is one of the easiest ways:
<cfset lFileNames = "C:\myfiles\proj\icon-img-12.png,C:\myfiles\proj\sample-file.ppt,C:\myfiles\proj\fin-doc1.docx,C:\myfiles\proj\fin-doc2.pdf,C:\myfiles\proj\invoice-temp.docx,C:\myfiles\proj\invoice-final.pdf" />
<cfset lResultList = "" />
<cfset fileExtToExclude = "pdf" />
<cfloop list="#lFileNames#" index="fileItem" delimiters=",">
<cfif ListLast(ListLast(fileItem,'\'),'.') NEQ fileExtToExclude>
<cfset lResultList = ListAppend(lResultList,"#fileItem#") />
</cfif>
</cfloop>
Using only the list functions provided by ColdFusion, this is easily done. I would recommend wrapping this code in a function for easy handling. Another way to do it would be to use a regular expression on the list (if you're looking for a more general solution, outside the context of ColdFusion).
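A function wrapper along those lines could look like the sketch below. The name excludeByExtension and its arguments are made up for illustration, and it assumes a comma-delimited list as in the example above:

```cfml
<cffunction name="excludeByExtension" returntype="string" output="false">
    <cfargument name="fileList" type="string" required="true">
    <cfargument name="extToExclude" type="string" required="true">
    <cfset var result = "">
    <cfset var fileItem = "">
    <cfset var fileName = "">
    <cfloop list="#arguments.fileList#" index="fileItem" delimiters=",">
        <!--- look only at the file name, not the directories in front of it --->
        <cfset fileName = ListLast(fileItem, "\/")>
        <!--- keep items with no extension, or with an extension other than the excluded one --->
        <cfif ListLen(fileName, ".") LT 2 OR ListLast(fileName, ".") NEQ arguments.extToExclude>
            <cfset result = ListAppend(result, fileItem)>
        </cfif>
    </cfloop>
    <cfreturn result>
</cffunction>
```

It would then be called as `<cfset uploadedfilenames = excludeByExtension(myfiles.clientFile, "pdf")>`. String comparison with NEQ is case-insensitive in ColdFusion, so PDF and pdf are treated alike.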
Now, applying the solution to your problem:
<cfset uploadedfilenames='#myfiles.clientFile#' >
<cfset lResultList = "" />
<cfset fileExtToExclude = "pdf" />
<cfloop list="#uploadedfilenames#" index="fileItem" delimiters=",">
<cfif ListLast(ListLast(fileItem,'\'),'.') NEQ fileExtToExclude>
<cfset lResultList = ListAppend(lResultList,fileItem) />
</cfif>
</cfloop>
<cfset uploadedfilenames = lResultList />
<!--- rest of your code continues --->
The result list lResultList is copied to the original variable uploadedfilenames.
I hope I'm not misunderstanding the question, but why don't you just wrap all of that in an if-statement that reads the full file name? Whether the files are coming one by one or through a delimited list, it should be easy to work around.
<cfif !listContains(ListName, '.pdf')>
OR
<cfif FileName does not contain '.pdf'>
then
all the code you posted

ColdFusion HELP Pages not Updating: My CF pages are still pointing to the original folder of my application

I have an existing ColdFusion application in my server. What I needed is a duplicate of that application. What I did was copy the entire folder of the original application and put it into another folder.
I already edited the Application.cfm and the links across all the pages in my copy. However, .../indexc.cfm?page=a_app_checklist - in this case, the a_app_checklist is not being updated even if I changed everything on my server /Copy/pages/app/a_app_checklist.cfm
I tried to upload the updated a_app_checklist.cfm on the original application and from there, it was updated. What should I do because I want my copy of the application to be a stand-alone from the original application.
Here is a part of my Application.cfm code:
<cfapplication name="Applicationv2" sessionmanagement="yes" setclientcookies="yes" sessiontimeout="#CreateTimeSpan(00,00,30,00)#"
applicationtimeout="#CreateTimeSpan(00,01,00,00)#" clientstorage="cookie" loginstorage="session">
<cfparam name="Url.page" default="a_main_index">
<cfparam name="Url.formpage" default="">
<cfparam name="Url.resetAppCache" default="">
<!--- Set the Application variables if they aren't defined. --->
<cfset app_is_initialized = False>
<cflock scope="application" type="readonly" timeout="5">
<cfset app_is_initialized = IsDefined("Application.initialized")>
</cflock>
<cfif not app_is_initialized>
<cflock scope="application" type="exclusive" timeout="10">
<cfif not IsDefined("Application.initialized")>
<!--- Do initializations --->
<cfset Application.StudentDB = "DB">
<cfset Application.Url = "/CopyOfApplication/">
<cfset Application.NSUrl = "/CopyOfApplication/">
<cfset Application.EmailLocation = "/CopyOfApplication/pages/email/">
<cfset Application.userfilespath = "/web_assets/UserFiles/">
<cfset Application.UnauthorizedFileExtentions = "ade,adp,asx,bas,chm,cmd,cpl,crt,dbx,exe,hlp,com,hta,inf,ins,isp,jse,lnk,mda,mde,mdz,mht,msc,msi,msp,mst,nch,pcd,prf,reg,scf,scr,sct,shb,shs,url,tmp,pif,dll,vb,vbs">
<cfset Application.ReportPage = "report.cfm">
<cfset Application.PrintPage = "print.cfm">
<!---Puts an instance of the user.cfc and system.cfc in the application scope. All pages can use it. --->
<cfobject type="component" name="Application.System" component="application.system">
<cfset Application.initialized = "yes">
</cfif>
</cflock>
<cfscript>
Application.System.LogActivity("Name","IP","Application scope variables initialized.");
</cfscript>
</cfif>
As I said, the pages loaded through the "page" part of the URL are the only ones that are not being updated. Does it mean I have a problem with my URL.page initialization?
Please let me know if you need additional code. I appreciate any input about this question. Thank you in advance! I am still learning ColdFusion and I will appreciate it if you will help me understand what I need to know!

ColdFusion searching robots.txt for specific page exception

We're adding some functionality to our CMS whereby when a user creates a page, they can select an option to allow/disallow search engine indexing of that page.
If they select yes, then something like the following would apply:
<cfif request.variables.indexable eq 0>
<cffile
action = "append"
file = "C:\websites\robots.txt"
output = "Disallow: /blocked-page.cfm"
addNewLine = "yes">
<cfelse>
<!--- check if page already disallowed in robots.txt and remove line if it is --->
</cfif>
It's the <cfelse> clause I need help with.
What would be the best way to parse robots.txt to see if this page had already been disallowed? Would it be a cffile action="read", then do a find() on the read variable?
Actually, the check on whether the page has already been disallowed would probably go further up, to avoid double-adding.
You keep the list of pages in the database and each page record has an indexable bit, right? If so, a simpler and more reliable approach would be to generate a new robots.txt each time a page is added, deleted, or has its indexable bit changed.
<!--- TODO: query for the non-indexable pages (the ones to disallow) --->
<!--- lock the code to prevent concurrent changes --->
<cflock name="robots.txt" type="exclusive" timeout="30">
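<!--- overwrite the file, starting with the sitemap line --->
<cffile
action = "write"
file = "C:\websites\robots.txt"
output = "Sitemap: http://www.mywebsite.tld/sitemap.xml"
addNewLine = "yes">
<!--- Disallow rules only apply inside a User-agent group, so open one for all robots --->
<cffile
action = "append"
file = "C:\websites\robots.txt"
output = "User-agent: *"
addNewLine = "yes">
<!--- append one Disallow entry per non-indexable page --->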
<cfloop query="getPages">
<!--- we assume that page names are not entered by user (= safe names) --->
<cffile
action = "append"
file = "C:\websites\robots.txt"
output = "Disallow: /#getPages.name#.cfm"
addNewLine = "yes">
</cfloop>
</cflock>
Sample code is not tested, be aware of typos/bugs.
Using the Robots.txt files for this purpose is a bad idea. Robots.txt is not a security measure and you're handing "evildoers" a list of pages that you don't want indexed.
You're much better off using the robots meta tag, which will not provide anyone with a list of pages that you don't want indexed, and gives you greater control of the individual actions a robot can perform.
Using the meta tags, you would simply output the tags when generating the page as usual.
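As a rough sketch of the meta tag approach, reusing the request.variables.indexable flag from the question and assuming the check runs in the template that renders the page:

```cfml
<cfif request.variables.indexable eq 0>
    <!--- cfhtmlhead writes its text into the <head> section of the generated page --->
    <cfhtmlhead text='<meta name="robots" content="noindex">'>
</cfif>
```

Nothing needs to be written to robots.txt in this case, so there is no shared file to lock or parse.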
<!--- dummy page to block --->
<cfset request.pageToBlock = "/blocked-page.cfm" />
<!--- read in current robots.txt (the same file that is appended to below) --->
<cffile action="read" file="C:\websites\robots.txt" variable="data" />
<!--- build a struct of all blocked pages --->
<cfset pages = {} />
<!--- split on both CR and LF so Windows line endings do not end up in the keys --->
<cfloop list="#data#" delimiters="#chr(13)##chr(10)#" index="i">
<cfset pages[listLast(i,' ')] = '' />
</cfloop>
<cfif request.variables.indexable eq 0>
<!--- If the page is not yet blocked add it --->
<cfif not structKeyExists(pages, request.pageToBlock)>
<cffile action="append" file="C:\websites\robots.txt"
output="Disallow: #request.pageToBlock#" addNewLine="yes" />
<!--- not sure if this is in a loop, but if it is, add it to the struct for the next iteration --->
<cfset pages[request.pageToBlock] = '' />
</cfif>
</cfif>
This should do it. Read in the file, loop over it and build a struct of the blocked pages. Only add a new page if it's not already blocked.

How to get the temporary path of a file pulled with CFHTTP in Coldfusion?

I'm using Coldfusion8 and need to fetch images from a remote server, which I'm doing like this:
<cfhttp timeout="45" throwonerror="no" url="#variables.testFilePath#" method="get" useragent="Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12" getasbinary="yes" result="variables.objGet">
<cfset variables.objImage = ImageNew(variables.objGet.FileContent)>
I now need to save the image to Amazon S3, but the function I want to use:
<cfset result = s3.putObject(url.b,file.serverFile,file.contentType,'300',form.cacheControl,'30',form.acl,form.storageClass,form.keyName,GetTempDirectory())>
requires the directory in which my generated image can be found.
Question:
Is there a way to get the directory of an image file pulled with cfhttp and converted to an image using imageNew? Or do I need to save the file to disk first? I also need to resize before storing, so I might not be able to get by without saving to disk first.
Thanks for pointers!
EDIT:
I got it working like this:
<!--- getAsBinary --->
<cfhttp timeout="45"
throwonerror="no"
url="#variables.testFilePath#"
method="get"
useragent="..."
getasbinary="yes"
result="objGet">
<!--- validate --->
<cfif len(variables.testFilePath) EQ 0>
<cfset variables.errorCount = variables.errorCount+1>
<cfset variables.failedLoads = "FILE NOT FOUND" >
<cfelse>
<cfif len(objGet.Filecontent) EQ 0>
<cfset variables.errorCount = variables.errorCount+1>
<cfset variables.failedLoads = "COULD NOT LOAD IMAGE">
<cfelseif NOT listfindnocase(variables.allow, variables.fileExt) >
<cfset variables.errorCount = variables.errorCount+1>
<cfset variables.failedLoads = "WRONG FILE TYPE">
<cfelse>
<cftry>
<cfscript>
objImage = ImageNew(objGet.FileContent);
ImageSetAntialiasing(objImage,"on");
// resize/crop
variables.keyName = Session.loginid & "_S_";
</cfscript>
<!--- convert modified image back to binary --->
<cfset variables.filekey = toBase64( objImage )>
<!--- pass to s3.cfc --->
<cfset result = s3.putObject(variables.bucketName, variables.filekey, variables.contentType, variables.httptimeout, variables.cacheControl, variables.cacheDays, variables.acl, variables.storageClass, variables.keyName, variables.imageSrc, "true" )>
<cfcatch>
<cfset variables.errorCount = variables.errorCount+1>
<cfset variables.failedLoads = "NO IMAGE">
</cfcatch>
</cftry>
</cfif>
</cfif>
I need to re-convert the cropped image to binary, because the s3.putobject will otherwise do another cffile action="readBinary" and breaks on trying to construct the image file path (the image is still in temp) right here:
<cffile action="readBinary" file="#arguments.uploadDir##arguments.fileKey#" variable="binaryFileData">
While I can get the temporary file path using this trick and set uploadDir it doesn't help, because CF docs say the path must be either an absolute path starting with drive letter or slash, otherwise the www-root temp directory will be taken.
In my case the temp www-root directory was on C:/ while the temp file CFFile-Servlet was on E:/ and a relative path did not work either (file not found). So as I found no way to re-read the image from the s3.cfc, I'm now converting back to binary before calling S3.cfc. I pass another parameter (1/0) telling s3.cfc, that I'm already sending the binary image and there is no need to re-read it.
Like so:
<!--- if encoded is true, filekey already is the encoded image --->
<cfif arguments.encoded EQ "true">
<!--- already sending binary image --->
<cfset binaryFileData = arguments.fileKey>
<cfelse>
<!--- Default --->
<cffile action="readBinary" file="#arguments.uploadDir##arguments.fileKey#" variable="binaryFileData">
</cfif>
I'm not sure if this is the smartest way performance-wise, but it seems to work pretty smoothly. Comments are welcome!
I guess you could use cfhttp's path and file attributes instead of result. Generate a temporary path using GetTempDirectory() + CreateUUID(), fetch the file, use it, and then delete it. It may also be a bit more memory-efficient than fetching the content into a variable and then writing it to an intermediate file.
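A sketch of that idea, reusing variables.testFilePath and variables.fileExt from the question. The names tempDir and tempName are hypothetical, and the code assumes GetTempDirectory() returns a path ending in a separator, as the uploadDir concatenation in the question also does:

```cfml
<cfset tempDir = GetTempDirectory()>
<!--- unique name so concurrent requests do not overwrite each other's files --->
<cfset tempName = CreateUUID() & "." & variables.fileExt>

<!--- path/file make cfhttp save the response body straight to disk --->
<cfhttp url="#variables.testFilePath#" method="get" timeout="45" throwonerror="no"
        getasbinary="yes" path="#tempDir#" file="#tempName#">

<cfif FileExists(tempDir & tempName)>
    <cfset objImage = ImageNew(tempDir & tempName)>
    <!--- resize/crop here, then write the result back so the S3 call can read the file --->
    <cfset ImageWrite(objImage, tempDir & tempName)>
    <!--- ... call s3.putObject with tempDir as the upload directory and tempName as the file key ... --->
    <cffile action="delete" file="#tempDir##tempName#">
</cfif>
```

Because the file sits at a known absolute path, the cffile action="readBinary" inside s3.cfc should be able to find it without the extra "encoded" parameter.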
Cfhttp's result stores the data in a memory variable.
ImageNew creates a ColdFusion image, which is also resident in memory only. You'd have to save it (with ImageWrite, for example) to make it a physical file before sending it on.
Without saving it to a physical file, you can use cfimage action="writeToBrowser" to send it to a browser, but that ends up saving it in a temp location for the browser to access, so I don't think it would do you much good here.
http://livedocs.adobe.com/coldfusion/8/htmldocs/functions_h-im_34.html
http://livedocs.adobe.com/coldfusion/8/htmldocs/Images_19.html
AmazonS3Client has a putObject method that takes an InputStream, so you can wrap the binary data in a ByteArrayInputStream and pass it in without worrying about the backing file.
Something like this:
<cfhttp method="get" url="#local.url#" result="local.httpResult" getAsBinary="true"/>
<cfif local.httpResult.statusCode eq "200 OK">
<cfset var fileInputStream = createObject("java", "java.io.ByteArrayInputStream" ).init(local.httpResult.fileContent) />
<cfset var objectMetadata = createObject("java", "com.amazonaws.services.s3.model.ObjectMetadata" ).init() />
<cfset application.S3UTIL.s3Client.putObject("some.bucket", "folder/file.ext", local.fileInputStream, local.objectMetadata)/>
</cfif>