I have a vague idea of the order in which unscoped variables are resolved, but I can't find it in the CFML Reference or ColdFusion Dev Guide. Can anyone help?
Scope Order
The canonical scope order for ColdFusion 9 is:
Local (only inside CFCs and UDFs)
Arguments (only inside CFCs and UDFs)
Thread local (only inside threads)
Query (only inside a query loop)
Thread (only inside threads and templates that call threads)
Variables
CGI
Cffile
URL
Form
Cookie
Client
You can see Adobe's documentation on this in Developing ColdFusion 9 Applications.
However, some scopes are only available in certain contexts, so the order that scopes are searched is different, depending upon the context of the code.
Inside CFML (no threads)
Variables
CGI
Cffile
URL
Form
Cookie
Client
Inside a CFC (no threads)
Local
Arguments
Query (only inside a query loop)
Variables
CGI
Cffile
URL
Form
Cookie
Client
Best Practice
As Al Everett notes in his answer, it is considered best practice to always scope variables. Explicit scoping produces less ambiguous code and is usually faster. Anytime you don't scope a variable, you risk getting a variable from a scope that you didn't intend to.
When the variable you are accessing is in the first scope in the search order, it is actually slightly faster to leave the variable un-scoped. This is because each dot in a variable name incurs a small cost as ColdFusion resolves it. For example, in a CFC method it is slightly faster to access myVar than local.myVar. This only applies to:
local scoped variables inside a CFC or UDF
Thread local scoped variables inside a thread
variables scoped variables inside CFML
In all other circumstances it is faster (and clearer) to explicitly declare the scope.
Use of this technique should be considered bad practice. You should only use this technique in performance-critical code, where you can guarantee that the variable always exists in the intended scope. Keep in mind that it comes at the cost of increased ambiguity.
It is a generally accepted best practice to always scope your variables for two main reasons:
Performance - CF doesn't need to find the variable by searching through the scopes in turn
Accuracy - if two variables have the same name in different scopes, you may not get the one you were expecting
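To illustrate the accuracy point, here is a minimal sketch. It assumes a hypothetical request that supplies a value named id both in the query string and as a posted form field; the variable names are illustrative only.

```cfml
<cfscript>
// Hypothetical request: page.cfm?id=1 together with a posted form field id=2.
// An unscoped read walks the scope search order, and URL is searched
// before Form, so this does not pick up the posted value.
unscopedId = id;

// Scoped reads say exactly which value is wanted, with no searching.
fromUrl  = url.id;
fromForm = form.id;
</cfscript>
```

If the form value was the one intended, the unscoped read gives the wrong result without raising any error, which is why this kind of bug is easy to miss.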
That said, here's the order variable scopes are searched:
Function local (VAR keyword)
Thread local (CFTHREAD)
Query results
Function ARGUMENTS
Local VARIABLES
CGI variables
FILE variables
URL parameters
FORM fields
COOKIE values
CLIENT variables
EDIT: It's also telling to note what scopes are not searched: SESSION, SERVER, APPLICATION
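A short sketch of what that means in practice, assuming a hypothetical application.siteName key (the name and value are made up for illustration):

```cfml
<cfscript>
// Hypothetical key; the value is illustrative.
application.siteName = "Example";

// An unscoped lookup follows the normal scope search, which does not
// include application, session or server. This is only true if one of
// the searched scopes (variables, url, form, ...) also holds a siteName.
foundUnscoped = isDefined( "siteName" );

// Shared scopes have to be named explicitly.
foundScoped = structKeyExists( application, "siteName" );
</cfscript>
```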
Related
I have read a lot of posts about storing CFCs in the Application scope, and I understand that if a CFC stores data then it should not be in the Application scope. All CFCs that do non-util stuff would store data - when you pass in parameters like a username or email address - so I don't get when and when not to use the Application scope for a non-util cfc.
My question is that I have a posthandler.cfc component of about 500 lines of code which handles posts from a user (just like SO would handle each question being posted on this site). The posthandler.cfc component:
'cleans' any images and text submitted by the user
places the images in the correct folder
writes all the text to a database
returns a URL where the post can be viewed
The returned URL is received by a simple Jquery ajax call which redirects the user to the URL.
This happens quite regularly on the site and at the moment a new CFC instance is being created for each post. Would it safe to put it in the Application scope instead and not cause race/locking conditions?
Just passing in parameters doesn't "save" anything. Conceptually, each thread has its own arguments and local scope, which are not visible to any other thread, and cease to exist when the function exits. So from that perspective, there's no conflict.
Also, storing data doesn't mean saving it to a database table. It refers to components that maintain state by storing data in a shared scope/object/etc.. With "shared" meaning the resource is accessible to other threads, and can potentially be modified by multiple threads at the same time, leading to race conditions.
For example, take this (contrived) component function that "saves" information in the variables scope. If you create a new instance of that component each time, the function is safe because each request gets its own instance and a separate copy of the variables scope to play with.
public numeric function doStuff( numeric num1, numeric num2 ) {
variables.firstNum = arguments.num1 * 12;
variables.secondNum = arguments.num2 * 10;
return variables.firstNum / variables.secondNum;
}
Now take that same component and put it in the application scope. It's no longer safe. As soon as you store it in the application scope, the instance - AND its variables - become application scoped as well. So when the function "saves" data to the variables scope it's essentially updating an application variable. Obviously those aren't thread safe because they are accessible to all requests. So multiple threads could easily read/modify the same variables at the same time, resulting in a race condition.
// "Essentially" becomes this ....
public numeric function doStuff( numeric num1, numeric num2 ) {
application.firstNum = arguments.num1 * 12;
application.secondNum = arguments.num2 * 10;
return application.firstNum / application.secondNum;
}
Also, as James A Mohler pointed out, the same issue occurs when you omit the scope. Declaring a function variable without a scope does NOT make it local to the function. It makes it part of the default scope: variables - (creating the same thread safety problem described above). This behavior has led to many a threading bug, when developers forget to scope a single query variable or even a loop index. So be sure to explicitly scope EVERY function variable.
// Implicitly creates "variables.firstNum" and "variables.secondNum"
public numeric function doStuff( numeric num1, numeric num2 ) {
firstNum = arguments.num1 * 12;
secondNum = arguments.num2 * 10;
return firstNum / secondNum;
}
Aside from adding locking, both examples could be made thread safe by explicitly using local scope instead. By storing data in the transient, local scope, it's not visible to other threads and ceases to exist once the function exits.
public numeric function doStuff( numeric num1, numeric num2 ) {
local.firstNum = arguments.num1 * 12;
local.secondNum = arguments.num2 * 10;
return local.firstNum / local.secondNum;
}
Obviously there are other cases to consider, such as complex objects or structures, which are passed by reference, and whether or not those objects are modified within the function. But hopefully that sheds some light on what's meant by "saving data" and how scoping can make the difference between a stateless component (safe for the application scope) and stateful components (which are not).
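For the pass-by-reference case, one possible approach is to copy the incoming struct before changing it. This is only a sketch: the function name withDefaults and the pageSize key are hypothetical, and duplicate() makes a deep copy, which may be too expensive for large structures.

```cfml
// Hypothetical helper inside a component. Structs are passed by reference,
// so changing arguments.settings directly would also change the caller's
// struct, which might live in a shared scope.
public struct function withDefaults( required struct settings ) {
    // Work on a private deep copy held in the local scope.
    local.copy = duplicate( arguments.settings );
    if ( !structKeyExists( local.copy, "pageSize" ) ) {
        local.copy.pageSize = 25;
    }
    return local.copy;
}
```

Because the function only writes to local.copy, it stays stateless and the struct that was passed in is never modified.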
TL;DR;
In your case, it sounds like most of the information is not shared, and is request level (user info, uploaded images, etc..), so it's probably safe to store in the application scope.
I have a job running in PDI that is transferring data from different sources to different targets and back for a specific System. This job has a lot of child jobs. Let's call that Job MasterJob1.
We have the same System running for another purpose. Therefore, I want to copy that job in PDI. Here I just have to change a few settings. Let's call that MasterJob2.
To make different variables available for the entire job (also in parent jobs, child jobs and so on of the masterjob), we are using "Set Variables". Here, we have a lot of different variables. Let's say, one variable is called TestVar. At the moment, the "Variable Scope type" of these Variables in MasterJob1 is always set on "Valid in Java Virtual Machine".
According to the PDI Documentation http://wiki.pentaho.com/display/EAI/Set+Variables, this means the variables are available everywhere in the Virtual Machine. As I understand it, this means that if I copy the job and leave the "Variable Scope type" as it is, the Variable TestVar can be written by MasterJob1 but can also be overwritten by MasterJob2.
I definitely want to avoid MasterJob1 overwriting Variables of MasterJob2 and vice versa. However, the Variables that are set in MasterJob1 must be available everywhere in MasterJob1, and the Variables in MasterJob2 must be available everywhere in MasterJob2. Therefore I continued reading the documentation. It says that there is a "Variable Scope Type" called "Valid in the root Job". Is my assumption right that this is the Variable Scope Type I need to use?
Unfortunately I do not have much experience with this, and I hope that you can tell me whether that is the right way. Creating a test environment will take me a few days. Therefore I hope that you can give me an easy "Yes go for it" or the right solution.
Your assumption is correct.
Avoid using Valid in the virtual machine for jobs on the server, although it is handy for debug on your dev PC.
Use Valid in the parent job when a transformation (or job) has to return a value to the caller.
Use Valid in the grand-parent job very rarely, although I remember some special moments where it was useful.
Use Valid in the root job almost all the time.
I know global variables are bad, however I have a checksettings function which is run every tick. http://pastebin.com/54yp4vuW The pastebin contains some of the check settings function. Before I added the GetPrivateProfileIntA calls everything worked fine. Now when I run it, it lags like hell. I can only assume this is because it is constantly loading the files. So my question is: are global variables constantly updated? (i.e. if I put this in a global var, will it stop the lag?)
Thanks :)
Assuming I'm interpreting your question correctly, then no, global variables are not constantly updated unless you explicitly do so in code. So yes, putting those calls in global variables will get rid of the lag.
You haven't provided any details about the design but globals are visible across the entire application and get updated when they are written into.
Multiple processes/threads reading that global variable would then read the same updated value.
But synchronizing reads/writes requires the use of synchronization mechanisms such as mutexes, condition variables etc etc.
In your case you need to decide when to call GetPrivateProfileIntA() for all those settings.
Are all those settings constantly updated or only a fraction of those? Identify the ones which need to be monitored periodically and only load those.
And if a setting is stateful, meaning all objects of the class refer to a single copy of the setting, then I would use static class variables instead of plain global variables.
Alternately you could make a JIT call to GetPrivateProfileIntA() where needed and not bother about storing the setting in a global variable.
I have previously asked a question regarding cf scopes on cfm pages (happy that I understand CFC scopes and potential issues), but am still not clear on the variables scope.
In the answers to my previous question (Coldfusion Scopes Clarification), it was suggested that there are no thread safety issues with cfm pages: you won't get the scenario where two different users access the same page and hit race conditions or thread safety problems, even if I just leave my variables in the default cfm variables scope, because the variables scope for each user is isolated and independent.
However, I have read this blog post http://blog.alexkyprianou.com/2010/09/20/variables-scope-in-coldfusion/ regarding the use of functions on a cfm page and using the variables scope and that seems to suggest a scenario whereby the variables scope is shared between multiple users (I understand this problem in the context of CFCs - them being more akin to java classes and the variables scope being instance variables, so has thread safety issues if the CFC is shared/application scope/singleton) but this seems counter to previous answers - if a variable put in the variables scope by a function on a cfm page can be accessed by other users, then surely variables placed in variables scope directly in cfm page code is the same?
I was hoping for some clear docs and guides but have not really been able to find definitive explanations of the different scopes and where they are available.
Thanks!
Dan is correct, and the blog article being referenced in the question is simply wrong. Dan's code demonstrates it, and I have written-up and tested this thoroughly on my blog (it was too big to go here).
The bottom line is the variables scope in a CFM is safe from this sort of race condition because the variables scope for each request is different memory. So one variables.foo is not the same as the other variables.foo, so neither ever intersect.
The same applies to objects in the variables scope: their internal variables scope is a distinct entity, so any number of requests can instantiate a CFC in the request's variables scope, and the CFC instances' variables scopes are all discrete entities too.
The only time the variables scope can participate in a race condition is the variables scope of an object stored in a shared scope. Because all references to that shared-scope object will be referencing the same object in memory, so the same object's variables scope in memory.
Functions outside of a CFC accessing the variables scope won't have thread safety issues when 2 requests run the code, but if you use cfthread or other parallel features, you could still have problems with the variables scope being changed and this can cause race conditions. Often this mistake can occur with a variable you use a lot like maybe in a for loop, the "i" variable.
for(i=1;i<10;i++){t=arr[i]; }
But then another function does this while the first is running:
for(i=1;i<20;i++){t=arr[i]; }
The "i" variable needs to become a local variable to help make it thread-safe. You don't want the first loop to be able to go above 10 by mistake and this is hard to debug many times. I had to fix a ton of "i" variables and others to make my functions thread-safe everywhere when I started caching objects and using cfthread more extensively.
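A minimal sketch of that fix, using a hypothetical function lastItem that takes the array as an argument. Declaring i and t with var keeps them in the function's local scope instead of the shared variables scope.

```cfml
// Hypothetical function; the array is passed in rather than read from
// the variables scope.
function lastItem( required array arr ) {
    // var-scoped: each call (and each thread) gets its own i and t.
    var i = 0;
    var t = "";
    for ( i = 1; i <= arrayLen( arguments.arr ); i++ ) {
        t = arguments.arr[ i ];
    }
    return t;
}
```

With the index local to each call, a second loop running at the same time can no longer push this loop's counter past its own limit.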
You can also avoid needing to lock by never changing existing objects. You can instead do the work on copies of them. This makes the data "immutable". CFML doesn't have official support for making immutable objects more efficiently, but you can make copies easily.
http://en.wikipedia.org/wiki/Immutable_object
Simple example of thread safe change to an application scope variable:
var temp=structnew();
// build complete object
temp.myValue=true;
// set complete object to application scope variable
application.myObject=temp;
Writing to any shared object is often dangerous since variables may be undefined or partially constructed. I always construct the complete object and set it to the shared variable at the end like the example above. This makes thread-safety easy if it isn't too expensive to re-create the data. The variables scope in a CFC is similar to private member variables in other languages. If you modify data in shared objects, you might need to use CFLOCK if you can't make copies instead.
Some of the confusion about ColdFusion scopes is related to shared scopes in ColdFusion 5 and earlier being less reliable. They had serious thread safety problems that could cause data corruption or crashes. Under certain conditions, two threads were able to write to the same memory at the same time if you didn't lock correctly. Current CFML engines are able to write to struct keys without the chance of corruption or crashes. You just can't be sure which data will actually end up as the value without some consideration of thread-safety, but it generally won't become corrupted unless you are dealing with non-CFML object types like CFX, Java and others. A thread-safety mistake could still lead to an infinite loop which could hang the request until it times out, but it shouldn't crash unless it runs out of memory.
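For the case where a copy is not practical, here is a hedged sketch of a read-modify-write guarded by a lock. The counter application.hitCount and the lock name hitCounterLock are hypothetical names chosen for the example.

```cfml
// Hypothetical counter kept in the application scope.
function recordHit() {
    // The exclusive named lock lets only one request at a time run the
    // read-modify-write below, so concurrent increments are not lost.
    lock name="hitCounterLock" type="exclusive" timeout="5" {
        if ( !structKeyExists( application, "hitCount" ) ) {
            application.hitCount = 0;
        }
        application.hitCount = application.hitCount + 1;
    }
}
```

Any other code that changes application.hitCount would need to use the same lock name for this to be effective.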
I think the blog is misleading. However, if you want to see for yourself, write a page with his function. Make it look something like this.
<cffunction name="test" returntype="void">
<cfscript>
foo = now();
sleep(3 * 60 * 1000); // should be 3 minutes
writedump(foo);
</cfscript>
</cffunction>
<cfdump var="#now()#">
<cfset test()>
Run the page. During the 3 minutes, open another browser or tab and run it again. Go back to where you first ran it and wait for the results. If there is no significant difference between the two outputs, then your second page request did not affect your first one.
Note that I have not tried it myself but my bet would be on the 2nd request not affecting the first one.
Question for the crowd. We are very strict on our team about scoping local variables inside functions in our CFC's. Recently though the question of scoping variables inside Application.cfc came up. Are unscoped variables in functions like onRequestStart() at the same risk for being accessed by other sessions running concurrently as we know that local variables in functions in other components are? Or are they somehow treated differently because of the nature of the functions in Application.cfc?
Your question borders on two entirely separate questions (both of which are important to clarify and address). These two questions are:
Should I scope my variables correctly when referring to them (ie. APPLICATION.settings vs. SESSION.settings).
The short answer to this is: Yes. It makes for cleaner, more readable / manageable code, and prevents variable scope clashes that you may encounter later when variable names are re-used.
If you create APPLICATION.settings and SESSION.settings, but attempt to refer to them without scope (ie. <cfset myvar = settings />), you're going to have variable clash issues, as they'll be poured into VARIABLES by default--since neither APPLICATION nor SESSION are examined to resolve scope ambiguity.
The second question is:
Should I be worried about variables that are accessed in Application.cfc that could potentially be shared by multiple users in a concurrent environment?
The short answer to this is: Yes. You should know & understand the ramifications of how your shared variables are accessed, and <CFLOCK> them where appropriate.
Unfortunately, exactly when and where you lock your shared variables is often never clarified to the CF community, so let me sum it up:
onApplicationStart() single-threads access to the APPLICATION scope. You do not need to lock APPLICATION vars that are read/written within this method.
onSessionStart() single-threads access to the SESSION scope. Same answer as before.
If you provide any kind of mechanism that accesses SESSION or APPLICATION from within the onRequestStart() method--or any other template afterwards (such as a URL reload parameter that directly calls onApplicationStart() )--all bets are off--you must now properly handle the locking of your shared variable reads and writes.
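The three cases above could be sketched in an Application.cfc like this. It is an illustration only: the application name exampleApp, the reload URL flag and the settings keys are all hypothetical.

```cfml
<cfcomponent>
    <cfset this.name = "exampleApp">

    <cffunction name="onApplicationStart" returntype="boolean" output="false">
        <!--- Build the complete struct locally, then publish it in one assignment. --->
        <cfset var settings = structNew()>
        <cfset settings.loadedAt = now()>
        <!--- No lock here: the server single-threads the first start itself. --->
        <cfset application.settings = settings>
        <cfreturn true>
    </cffunction>

    <cffunction name="onRequestStart" returntype="boolean" output="false">
        <cfargument name="targetPage" type="string" required="true">
        <!--- Hypothetical reload flag. A manual call to onApplicationStart()
              is not single-threaded by the server, so it is locked explicitly. --->
        <cfif structKeyExists(url, "reload")>
            <cflock scope="application" type="exclusive" timeout="10">
                <cfset onApplicationStart()>
            </cflock>
        </cfif>
        <cfreturn true>
    </cffunction>
</cfcomponent>
```

Code that reads application.settings in several steps elsewhere could take a read-only lock on the same scope if it needs a consistent view during a reload.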