Safe to call underlying Java methods on a String in ColdFusion?

Adobe ColdFusion is built on Java. Almost all simple variables in CFML/CFSCRIPT are java.lang.String until the operation needs it to be of a certain type.
I've always wanted to use startsWith() on a String instead of the bulkier CFML variant:
left(str,4) EQ "test"
However, what's the general consensus on using underlying Java methods in ColdFusion?
Would it be any safer to javacast() the variable first?
javacast("String",x).startsWith("test");
What if the CF engine is not built on top of Java?
Thanks

Yes, you can do this with Adobe ColdFusion and other CFML engines that are built on Java. It's actually simpler than you thought.
<cfset str = "hello what's up" />
#str.startsWith("hello")# <!--- returns "YES" --->
<cfif str.startsWith("h")>
This text will be output
</cfif>
#str.startsWith("goodbye")# <!--- returns "NO" --->
<cfif str.startsWith("g")>
This text will NOT be output
</cfif>
This is possible because CFML strings in ColdFusion are the same as Java strings. You can use any native string method (java.lang.String) on a CFML string.
If you haven't guessed, this also works with CFML arrays (some kind of list, probably a java.util.Vector) and structs (probably a java.util.Map). Experiment with data types and the cfdump tag and you will find a lot of secrets.
One word of warning: this is not standard CFML, so if your underlying engine changes, including just upgrading to a new version, there are no guarantees that it will still work.
That said, string.startsWith() is native to Java as well as .NET, so this will also work if your CFML engine is BlueDragon.NET. The only CFML engines it will not work on are ColdFusion 5 and previous.
Is it safe to use? I would say yes. As long as CFML engines run on Java or .NET, it's perfectly safe. It's undocumented, but easy to understand, so I would say use it freely.

I have found that using built-in CF functions is in most cases faster than calling their Java counterparts, mainly because of the overhead CF adds when it wraps the Java method calls.
If you are using .startsWith(), remember it's case sensitive, whereas cf's eq isn't.
Same goes for most of the other java String methods - .endsWith(), .contains() etc.
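To make the case-sensitivity point concrete, here is a minimal plain-Java sketch. The class and helper names (StartsWithDemo, startsWithIgnoreCase) are made up for illustration; the only real API used is java.lang.String. It shows that startsWith() distinguishes case, and one possible way to get a case-insensitive prefix check closer to what CFML's EQ gives you.

```java
class StartsWithDemo {
    // Hypothetical helper: case-insensitive prefix check, closer in spirit
    // to CFML's left(str, n) EQ "..." comparison.
    static boolean startsWithIgnoreCase(String value, String prefix) {
        // regionMatches(ignoreCase, thisOffset, other, otherOffset, length)
        return value.regionMatches(true, 0, prefix, 0, prefix.length());
    }

    public static void main(String[] args) {
        String str = "Testing 123";
        System.out.println(str.startsWith("test"));            // false: 'T' is not 't'
        System.out.println(startsWithIgnoreCase(str, "test")); // true
    }
}
```

Lower-casing both sides before calling startsWith() would also work, at the cost of allocating two extra strings per comparison.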
Unless you can bundle large sets of functionality into your own Java utility classes, mixing CF and Java calls seems slow. If you are in some Java code, you have a string, and you call its startsWith() method, it just executes. Done. In CF code, you have to javaCast() or blindly hope the variable is of the correct data type, which is risky with things like entirely numeric strings. And when you call .startsWith(), a bunch of CF code runs before it even gets down to the Java level, which is where the slowness lives. E.g. CF's dynamic arguments mean that it has to check whether there is a method on the supplied object with that many arguments, and of those data types (or compatible types). There is just a whole bunch of code that unavoidably runs, bridging the two languages.
But don't trust our experiences; benchmark for yourselves, e.g.
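<cfscript>
// "var" is only valid inside a function, so plain assignments are used here
sys = createObject( 'java', 'java.lang.System' );
timer = sys.nanoTime();
// run some code here
timer = sys.nanoTime() - timer; // elapsed time in nanoseconds
writeDump( var = timer );
</cfscript>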
If you are using the Adobe CF engine, watch out for entirely numeric strings: they bounce between Java Doubles and Strings, and don't get me started on serializeJSON()...
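The numeric-string pitfall can be illustrated in plain Java. This is only a sketch of the underlying type problem, not a description of how ColdFusion's javacast() converts values; the class and method names are hypothetical. A value that looks like "42" may actually be a java.lang.Double, which has no startsWith() method and has a different text form.

```java
class NumericStringDemo {
    // Hypothetical helper: convert to text first, then use String methods.
    static boolean startsWith(Object value, String prefix) {
        return String.valueOf(value).startsWith(prefix);
    }

    public static void main(String[] args) {
        Object asString = "42";
        Object asDouble = Double.valueOf(42); // same "value", different Java type

        System.out.println(asString instanceof String); // true
        System.out.println(asDouble instanceof String); // false: no startsWith() to call
        System.out.println(String.valueOf(asDouble));   // 42.0
    }
}
```

The takeaway is that the result of a String method depends on which type the engine happens to be holding, which is why an explicit cast before the call is the safer habit.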

Related

Is there a difference with the HTMLEditFormat function in ColdFusion CF9 versus CF10?

I'm seeing a difference in how HTMLEditFormat works in CF9 and CF10.
HTMLEditFormat("&gt;")
In CF9: showing up as "&gt;" (no difference)
In CF10: showing up as "&amp;gt;" (double-escaped, which seems correct to me)
I've looked through the CF10 notes and reviewed the HTMLEditFormat documentation, but cannot find any mention of there being a difference in how this function works. Does anyone know of a difference, or know of documentation that proves there is no difference? ...Or know of any other settings (ColdFusion or web server) that might cause this to work differently?
(This question is not a duplicate because I am not asking about encodeForHTML. I understand that is the ideal solution, but I am asking to understand why HTMLEditFormat might be different in CF9 vs. CF10.)
I can't imagine why that function would behave differently, especially when it was planned for deprecation going into CF 10. By chance, are you calling it from within a CFINPUT tag?
<cfinput id="foo" value="#htmlEditFormat(someValue)#" />
If so, in CF6 - CF9, that tag uses HTMLEditFormat() on values automatically. Calling a 2nd instance of HTMLEditFormat() doesn't affect the output. But CF 10+ updated the tag to use encodeForHTML() on values. If you also throw in an HTMLEditFormat(), then you're double-encoding the output.
For better security, you should stop using HTMLEditFormat() and start using encodeForHTML() if it's available (CF10+). As of ColdFusion 11, HTMLEditFormat() has been officially deprecated and by ColdFusion 12, the function should be removed completely.
HTMLEditFormat() only encodes 4 characters: <, >, &, ".
encodeForHTML() encodes almost every character, including UTF-8 characters. The updated "encodeFor" functions are contextual, so you have to pick the right one for the right context (html, htmlattribute, js, css, xml, etc.).
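To see why stacking two encoders double-encodes, here is a small plain-Java sketch of a four-character escaper. It is not the ColdFusion implementation of HTMLEditFormat() or encodeForHTML(); the class and method names are made up, and it only mimics the four characters listed above.

```java
class HtmlEscapeDemo {
    // Hypothetical escaper covering only the four characters: < > & "
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String once = escapeHtml(">");
        String twice = escapeHtml(once); // the & from the first pass is escaped again
        System.out.println(once);  // &gt;
        System.out.println(twice); // &amp;gt;
    }
}
```

The second pass sees the ampersand produced by the first pass and escapes it again, which is the same effect as a tag encoding a value that was already encoded by hand.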

Using docx4j with ColdFusion

I am attempting to create Word documents with ColdFusion, but there does not seem to be any way to do it with ColdFusion alone. The best solution seems to be docx4j. However, I can't seem to find any in-depth docx4j and ColdFusion examples (aside from this question). Where can I get some docx4j and ColdFusion examples?
pulling the data from a database.
https://stackoverflow.com/a/10845077/1031689 shows one approach to doing this. There are other ways, as to which see http://www.slideshare.net/plutext/document-generation-2012osdcsydney
The document needs page numbers and to
Typically you'd add these via a header or footer. You might find it easier to start with an almost empty docx structured appropriately, rather than creating the necessary structures via ColdFusion calling docx4j. You could still do it this way in conjunction with the final paragraph of this answer below.
create a table of contents.
Search the docx4j forums for how to do this.
In general, it looks like the easiest approach would be to create a Java class file which does everything you want (by invoking docx4j), and for your ColdFusion to just invoke that Java class. In other words, do a bit of Java programming first, get that working, then hook it up to your ColdFusion stuff.
I am not sure exactly what you mean by creating a Word document, which in my opinion is pretty simple. Manipulating one, yes, is a bit tricky with docx4j or similar.
<cfsavecontent variable="variables.mydoc">
Your content here
</cfsavecontent>
<cffile action="write" file="#yourFile.doc#" output="#variables.mydoc#">
Also see this post
Creating a Word document in Coldfusion - how to have pagenumbering?

Is there any difference between these ColdFusion components?

I know the result is the same but is there any real difference? Maybe speed or something?
component {
remote function getMath(){
math = 2 + 2;
return math;
}
}
or
<cfcomponent>
<cfscript>
remote function getMath(){
math = 2 + 2;
return math;
}
</cfscript>
</cfcomponent>
or
<cfcomponent>
<cffunction name="getMath" access="remote">
<cfscript>
math = 2 + 2;
return math;
</cfscript>
</cffunction>
</cfcomponent>
Not especially.
Version 3, full tags, will be backwards compatible with ColdFusion 8 and the open source CFML engines, e.g. Railo or OpenBD.
Version 2 is neither one thing nor the other.
Version 1 is the full ColdFusion 9 script version.
I would recommend that you choose between the first and last versions and stick to it. Version 2 is not backwards compatible with ColdFusion 8 and is neither tag nor script. Coding like this will get messy quickly.
If you plan on writing everything in script, then example 1 is the way to go.
You can do anything in script that you wish, and if something is missing you can write a cfc that will implement the missing functionality and then invoke it with the new syntax.
If you're starting fresh with a new codebase, I'd try to avoid tags altogether, thus option 1.
In terms of execution speed, they all compile to the same byte code, so should be identical.
In terms of number of characters typed (excluding line breaks/tabs):
eg 1: 64
eg 2: 100
eg 3: 129
If you are running Adobe CF9, go with option 1. It's much more succinct. You can pretty much do everything in <cfscript> these days.
If you want to check the compiled byte code for each, switch on saving .class files in your CF admin and view the files in the /Classes dir with a decompiler, e.g. JD-GUI.
The <cfscript> approach is probably a bit faster, and more consistent with other languages, while the tag approach is simpler (hides complexity more) and more like HTML.
CF started as a tag-based language and has evolved to include a complete scripting-style alternative to the tag approach.
Differences are a question of developer style.

Allowing code snippets in form input while preventing XSS and SQL injection attacks

How can one allow code snippets to be entered into an editor (as Stack Overflow does) like FCKeditor or any other editor while preventing XSS, SQL injection, and related attacks?
Part of the problem here is that you want to allow certain kinds of HTML, right? Links for example. But you need to sanitize out just those HTML tags that might contain XSS attacks like script tags or for that matter even event handler attributes or an href or other attribute starting with "javascript:". And so a complete answer to your question needs to be something more sophisticated than "replace special characters" because that won't allow links.
Preventing SQL injection may be somewhat dependent upon your platform choice. My preferred web platform has a built-in syntax for parameterizing queries that will mostly prevent SQL injection (called cfqueryparam). If you're using PHP and MySQL there is a similar native mysql_real_escape_string() function. (I'm not sure the PHP function technically creates a parameterized query, but it's worked well for me in preventing SQL injection attempts thus far, since I've seen a few that were safely stored in the db.)
On the XSS protection, I used to use regular expressions to sanitize input for this kind of reason, but have since moved away from that method because of the difficulty involved in both allowing things like links while also removing the dangerous code. What I've moved to as an alternative is XSLT. Again, how you execute an XSL transformation may vary dependent upon your platform. I wrote an article for the ColdFusion Developer's Journal a while ago about how to do this, which includes both a boilerplate XSL sheet you can use and shows how to make it work with CF using the native XmlTransform() function.
The reason why I've chosen to move to XSLT for this is twofold.
First, validating that the input is well-formed XML eliminates the possibility of an XSS attack using certain string-concatenation tricks.
Second, it's then easier to manipulate the XHTML packet using XSL and XPath selectors than it is with regular expressions, because they're designed specifically to work with a structured XML document, whereas regular expressions were designed for raw string manipulation. So it's a lot cleaner and easier, I'm less likely to make mistakes, and if I do find that I've made a mistake, it's easier to fix.
Also when I tested them I found that WYSIWYG editors like CKEditor (he removed the F) preserve well-formed XML, so you shouldn't have to worry about that as a potential issue.
The same rules apply for protection: filter input, escape output.
In the case of input containing code, filtering just means that the string must contain printable characters, and maybe you have a length limit.
When storing text into the database, either use query parameters, or else escape the string to ensure you don't have characters that create SQL injection vulnerabilities. Code may contain more symbols and non-alpha characters, but the ones you have to watch out for with respect to SQL injection are the same as for normal text.
Don't try to duplicate the correct escaping function. Most database libraries already contain a function that does correct escaping for all characters that need escaping (e.g. this may be database-specific). It should also handle special issues with character sets. Just use the function provided by your library.
I don't understand why people say "use stored procedures!" Stored procs give no special protection against SQL injection. If you interpolate unescaped values into SQL strings and execute the result, this is vulnerable to SQL injection. It doesn't matter if you are doing it in application code versus in a stored proc.
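The difference between interpolating a value and binding it can be sketched in plain Java with JDBC. The table and column names (users, name) and the class name are hypothetical, and userExists() needs a real database connection, so main() only prints what the concatenated variant produces.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class SqlParamDemo {
    // Vulnerable: the value becomes part of the SQL text itself.
    static String concatenated(String name) {
        return "SELECT id FROM users WHERE name = '" + name + "'";
    }

    // Safer: the SQL text is fixed and the value is bound separately.
    static boolean userExists(Connection conn, String name) throws SQLException {
        String sql = "SELECT id FROM users WHERE name = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, name);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next();
            }
        }
    }

    public static void main(String[] args) {
        // The attacker's quote closes the string literal and adds a condition.
        System.out.println(concatenated("x' OR '1'='1"));
        // SELECT id FROM users WHERE name = 'x' OR '1'='1'
    }
}
```

The same reasoning applies inside a stored procedure: if the procedure builds a SQL string from its arguments and executes it, it behaves like concatenated(), not like userExists().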
When outputting to the web presentation, escape HTML-special characters, just as you would with any text.
The best thing that you can do to prevent SQL injection attacks is to make sure that you use parameterized queries or stored procedures when making database calls. Normally, I would also recommend performing some basic input sanitization as well, but since you need to accept code from the user, that might not be an option.
On the other end (when rendering the user's input to the browser), HTML encoding the data will cause any malicious JavaScript or the like to be rendered as literal text rather than executed in the client's browser. Any decent web application server framework should have the capability.
I'd say one could replace all < by &lt;, etc. (using htmlentities on PHP, for example), and then pick the safe tags with some sort of whitelist. The problem is that the whitelist may be a little too strict.
Here is a PHP example
$code = getTheCodeSnippet();
$code = htmlentities($code);
$code = str_ireplace("&lt;br&gt;", "<br>", $code); //example to whitelist <br> tags
//One could also use Regular expressions for these tags
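The same escape-then-whitelist idea, sketched in Java so it can be compared with the other platforms mentioned in this thread. The class and method names are hypothetical and only the bare <br> tag is allowed; a real whitelist would need far more care (attributes, nesting, and so on).

```java
class WhitelistDemo {
    // Escape everything first; & must go first so later entities are not re-escaped.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Then restore only the exact escaped form of <br>, ignoring case.
    static String escapeAllowingBr(String s) {
        return escapeHtml(s).replaceAll("(?i)&lt;br&gt;", "<br>");
    }

    public static void main(String[] args) {
        System.out.println(escapeAllowingBr("a<BR>b<script>x</script>"));
        // a<br>b&lt;script&gt;x&lt;/script&gt;
    }
}
```

Because the restore step only matches text that the escaper itself produced, a user who types the literal characters &lt;br&gt; gets them shown as text rather than turned into a tag.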
To prevent SQL injections, you could replace all ' and \ chars by an "inoffensive" equivalent, like \' and \\, so that the following C line
#include <stdio.h>//'); Some SQL command--
wouldn't have any negative results in the database.

How do I programmatically sanitize ColdFusion cfquery parameters?

I have inherited a large legacy ColdFusion app. There are hundreds of <cfquery>some sql here #variable#</cfquery> statements that need to be parameterized along the lines of: <cfquery> some sql here <cfqueryparam value="#variable#"/> </cfquery>
How can I go about adding parameterization programmatically?
I have thought about writing some regular expression or sed/awk'y sort of solution, but it seems like somebody somewhere has tackled such a problem. Bonus points awarded for inferring the sql type automatically.
There's a queryparam scanner that will find them for you on RIAForge: http://qpscanner.riaforge.org/
There is a script referenced here: http://www.webapper.net/index.cfm/2008/7/22/ColdFusion-SQL-Injection that will do the majority of the heavy lifting for you. All you have to do is check the queries and make sure the syntax will parse properly.
There is no excuse for not using cfqueryparam. Apart from being much more secure, it gives a performance boost and is the best way to handle quoted values in character-based column types.
Keep in mind that you may not be able to solve everything with <cfqueryparam>.
I've seen a number of examples where the order by field name is being passed in the query string, which is a slightly trickier problem to solve as you need to validate that in a more "manual" way.
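One common way to do that "manual" validation is an allowlist of sortable columns. Here is a minimal Java sketch; the column names, the default, and the class name are all hypothetical. The request value is never placed into the SQL unless it matches a known column exactly.

```java
import java.util.Set;

class OrderByDemo {
    // Hypothetical list of columns the application is willing to sort by.
    private static final Set<String> SORTABLE = Set.of("name", "created_at", "price");
    private static final String DEFAULT_COLUMN = "name";

    static String orderByClause(String requested) {
        // Anything not on the list (including null) falls back to the default.
        boolean allowed = requested != null && SORTABLE.contains(requested);
        return " ORDER BY " + (allowed ? requested : DEFAULT_COLUMN);
    }

    public static void main(String[] args) {
        System.out.println(orderByClause("price"));                  //  ORDER BY price
        System.out.println(orderByClause("name; DROP TABLE users")); //  ORDER BY name
    }
}
```

Column and table names cannot be bound as query parameters, which is why this check has to happen in application code.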
<cf_inputFilter
scopes = "FORM,COOKIE,URL"
chars = "<,>,!,&,|,%,=,(,),',{,}"
tags="script,embed,applet,object,HTML">
We used this to counteract a recent SQL injection attack. We added it to the Application.cfm file for our site.
I doubt that there is a solution that will fit your needs exactly. The only option I see is to write your own recursive search that builds a report for you or use one of the apps/scripts that people have listed above. Basically, you are going to have to edit each page or approve all of the automated changes.
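For anyone who does write their own report, a rough starting point might look like the Java sketch below. This is not how qpscanner or the linked script works; it is a hypothetical regex pass that lists #...# expressions inside <cfquery> bodies that are not wrapped in a <cfqueryparam> tag. It will miss edge cases (for example a > inside an attribute value), so its output still needs a human review.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QueryParamScanDemo {
    private static final Pattern QUERY = Pattern.compile(
        "<cfquery\\b[^>]*>(.*?)</cfquery>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern PARAM_TAG = Pattern.compile(
        "<cfqueryparam\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    private static final Pattern HASH_EXPR = Pattern.compile("#[^#]+#");

    // Returns every #expression# found in a cfquery body outside a cfqueryparam tag.
    static List<String> findUnparameterized(String source) {
        List<String> hits = new ArrayList<>();
        Matcher query = QUERY.matcher(source);
        while (query.find()) {
            // Drop the cfqueryparam tags so their value="#...#" is not reported.
            String body = PARAM_TAG.matcher(query.group(1)).replaceAll("");
            Matcher expr = HASH_EXPR.matcher(body);
            while (expr.find()) {
                hits.add(expr.group());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String src = "<cfquery name=\"q\">SELECT * FROM t WHERE a = #url.a# "
                   + "AND b = <cfqueryparam value=\"#url.b#\"/></cfquery>";
        System.out.println(findUnparameterized(src)); // [#url.a#]
    }
}
```

Running something like this over each template and printing the file name alongside each hit gives a checklist to work through, which fits the point above that every change still has to be edited or approved by hand.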