Looking over some old code that times out occasionally, I came across this CFLOCK around a CFFILE COPY run inside a CFLOOP.
Does this use of CFLOCK make sense or seem necessary? The file is being copied from one location to a newly created folder that is itself then zipped up for a future download.
At first, I was just going to increase the timeout of the lock but then I started to stare at it and wonder if it was a mistake.
<cfloop query="LOCAL.qDocsZip">
<cflock name="copyFileLock" timeout="3600" type="readonly">
<cffile action="copy"
source="#ExpandPath(LOCAL.qDocsZip.file_location)#"
destination="#LOCAL.zip_new_path#/#LOCAL.qDocsZip.original_file_name#">
</cflock>
</cfloop>
Locking seems reasonable here, but it is in the wrong spot and you are not locking the file access exclusively. You should lock the whole transaction, i.e. acquire the lock right before you fetch LOCAL.qDocsZip. This way the files to copy are only touched by a single thread at a time and cannot collide with another thread. On that note: cflock is a JVM-specific semaphore, so it cannot guarantee transaction safety at the system level, e.g. if other programs access your files in parallel.
Here is what it should look like:
<!--- only one thread at a time can execute the code within this lock (exclusive named lock) --->
<cflock name="copyFileLock" timeout="3600" type="exclusive">
<!--- fetch files to copy in this transaction --->
<cfquery name="LOCAL.qDocsZip" ...>
...
</cfquery>
<!--- copy all the files --->
<cfloop query="LOCAL.qDocsZip">
<cffile action="copy" ...>
</cfloop>
</cflock>
(You should probably add some error handling as well, if that's not just left out in your snippet.)
Explanation
Every thread stops at cflock and asks the semaphore copyFileLock whether it is currently "running". If not, the thread continues, fetches the files and copies them. While the copying is in progress (the semaphore is "running"), every other thread that reaches the cflock is queued: it pauses execution and waits for the semaphore. In your case, each queued thread waits up to 3600 seconds for the semaphore to give the "go"; if the timeout expires first, the thread gives up (by default, cflock throws an error on timeout). After the copy operation has finished on the first thread, the semaphore stops "running" and checks the queue. If other threads were queued in the meantime, the next thread in the queue resumes execution, rinse and repeat.
The exclusive lock makes sure that a thread never "sees" an incomplete file state (= fetches a file that is about to be copied by another thread).
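As for the error handling mentioned above, here is a minimal sketch of what it could look like. It reuses the lock and copy from the snippet and assumes LOCAL.qDocsZip is fetched inside the lock as shown there. The 60-second timeout, the log file name zipCopy, and the choice to log a failed copy and carry on with the next file are illustrative assumptions, not something taken from the original code.

```cfml
<cftry>
    <!--- exclusive named lock: only one request at a time runs the body --->
    <cflock name="copyFileLock" type="exclusive" timeout="60" throwontimeout="yes">
        <!--- fetch LOCAL.qDocsZip here, inside the lock, as in the snippet above --->
        <cfloop query="LOCAL.qDocsZip">
            <cftry>
                <cffile action="copy"
                        source="#ExpandPath(LOCAL.qDocsZip.file_location)#"
                        destination="#LOCAL.zip_new_path#/#LOCAL.qDocsZip.original_file_name#">
                <cfcatch type="any">
                    <!--- assumption: log the failure and continue with the next file --->
                    <cflog file="zipCopy" type="error"
                           text="Copy failed for #LOCAL.qDocsZip.file_location#: #cfcatch.message#">
                </cfcatch>
            </cftry>
        </cfloop>
    </cflock>
    <cfcatch type="lock">
        <!--- the lock could not be obtained within the timeout --->
        <cflog file="zipCopy" type="warning" text="Timed out waiting for copyFileLock">
    </cfcatch>
</cftry>
```

With throwontimeout="yes" a request that waits too long ends up in the outer cfcatch instead of silently skipping the copy, which makes the occasional timeout visible in the logs.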
Related
I am using CFTHREAD in my ColdFusion application. From what I've read from Ben Nadel (https://www.bennadel.com/blog/2980-terminating-asynchronous-cfthreads-in-coldfusion.htm) ColdFusion only exposes and tracks threads in the current request. In my situation, I am spawning a thread via an ajax call and then providing the user with a cancel button. I was hoping the cancel button could call the terminate method on the thread, but no matter where I store it (application,server,session) ColdFusion always returns an error that it was unable to terminate thread "THREAD_NAME" because "THREAD_NAME" was not spawned.
I know that under the hood, ColdFusion is mostly Java. So I'm hoping that there may be a way. Could anyone either confirm or deny this possibility? Any example of how?
Thanks!
Sorry, I don't have 50 reputation to comment, so I'll post this as an answer. Recently, I was in the same situation with a CFThread spawned via ajax that I needed to terminate somehow but was unable to. I had a CFQuery inside a CFLoop that used a datasource stored in the application scope. So what I came up with was to sign into ColdFusion Administrator and temporarily rename the datasource, which caused the thread to throw a database error. While it was an inelegant termination, it served the purpose at the time.
So after seeing this question it got me thinking about a possible workaround if there isn't a known way to accomplish this. Suppose during your thread processing, it tests for the value of a variable in the application/server/session scope. Supposing the value is initially set to "true" and then subsequently set to "false" by another process, when the thread finds the false value, it can terminate gracefully.
Can you?
Yes, but only using internal classes. When the cfthread is created, use the local THREAD_NAME to retrieve a reference to the underlying thread object.
context = getPageContext().getFusionContext();
thread = context.getUserThreadTask( "theLocalTaskName" );
Since the local name can be used by multiple requests, the reference should be stored under a unique name, like a UUID. The reference is actually an instance of an internal class, coldfusion.thread.Task. To terminate it, call its cancel() method.
thread.cancel();
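Put together, the two steps might look roughly like the sketch below. It only uses the internal calls named above (getFusionContext(), getUserThreadTask(), cancel()), which are undocumented and may change between ColdFusion versions. The application.runningTasks struct and the form.taskKey field are hypothetical names introduced for illustration.

```cfml
<!--- Request 1: spawn the thread, then store a handle under a unique key --->
<cfparam name="application.runningTasks" default="#structNew()#">
<cfthread action="run" name="theLocalTaskName">
    <!--- long-running work goes here --->
</cfthread>
<cfset taskKey = createUUID()>
<cfset context = getPageContext().getFusionContext()>
<cfset application.runningTasks[taskKey] = context.getUserThreadTask("theLocalTaskName")>
<!--- return taskKey to the client so the cancel button can send it back --->

<!--- Request 2: the cancel button posts the key back (hypothetical form.taskKey) --->
<cfif structKeyExists(application, "runningTasks")
      and structKeyExists(application.runningTasks, form.taskKey)>
    <cfset application.runningTasks[form.taskKey].cancel()>
    <cfset structDelete(application.runningTasks, form.taskKey)>
</cfif>
```

The sketch does not remove the handle when the thread finishes on its own, so a real version would also need some cleanup of stale entries.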
Should you?
That's a big question and all depends on what the thread does - how it does it - and how the resources it uses would be affected if the process just stops dead, midstream, with no warning.
The reason is that calling <cfthread action="terminate"..> kills the thread - instantly. CF doesn't care if it's in the middle of a critical section. The server just whacks it with a mallet and stops it cold. The exception logs show that CF does this by invoking Thread.stop():
"Information","cfthread-47","09/07/19","17:10:44","","THREAD_V_2: Terminated"
java.lang.ThreadDeath
at java.base/java.lang.Thread.stop(Thread.java:942)
at coldfusion.thread.Task.cancel(Task.java:257)
at coldfusion.tagext.lang.ThreadTag.terminateThread(ThreadTag.java:345)
at coldfusion.tagext.lang.ThreadTag.doStartTag(ThreadTag.java:204)
The Java documentation says the stop() method is deprecated because it is inherently unsafe:
Stopping a thread causes it to unlock all the monitors that it has
locked. (The monitors are unlocked as the ThreadDeath exception
propagates up the stack.) If any of the objects previously protected
by these monitors were in an inconsistent state, other threads may now
view these objects in an inconsistent state. Such objects are said to
be damaged. When threads operate on damaged objects, arbitrary
behavior can result. This behavior may be subtle and difficult to
detect, or it may be pronounced. Unlike other unchecked exceptions,
ThreadDeath kills threads silently; thus, the user has no warning that
his program may be corrupted. The corruption can manifest itself at
any time after the actual damage occurs, even hours or days in the
future.
So it's important to consider what a thread actually does, and determine if it's even safe to terminate. For example, if a thread processes a file with FileOpen(), forcibly terminating it might prevent the thread from releasing the handle, leaving the underlying file in a locked state, which is undesirable.
The recommended way of stopping threads in Java is with interrupt(). That is essentially the concept user12031119 described. An interrupt doesn't forcibly kill a thread. It is just a flag that suggests the thread stop processing, leaving it up to the thread itself to determine when it is safe to exit. That allows threads to finish critical sections or perform any cleanup tasks before terminating. Yes, it requires a little more coding, but the results are much more stable and predictable than with "terminate".
What you will want to do is set up a data structure somewhere, such as in the application or session scope, that keeps track of the running threads you want to be able to cancel.
Application.cfc OnApplicationStart
<cfset application.cancelThread = {} />
Before entering the thread, create an ID and pass it into the thread
<cfset threadId = createUUID() />
<cfset application.cancelThread[threadId] = false />
Pass the threadId back to the client for the cancel button. On click of the cancel button, send the threadId back to the server
<cfset application.cancelThread[form.threadId] = true />
During thread execution
<cfif application.cancelThread[threadId]>
<cfabort />
<!--- or your chosen approach to ending the processing --->
</cfif>
When the thread reaches its end, remove its entry
<cfset structDelete(application.cancelThread, threadId) />
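The fragments above can be assembled into one flow, roughly as in this sketch. It assumes application.cancelThread was created in OnApplicationStart as shown. The 100-iteration loop with sleep() stands in for the real work, and the log file name threadCancel is a hypothetical name. Hyphens are stripped from the UUID for the thread name only.

```cfml
<!--- Request that starts the work --->
<cfset threadId = createUUID()>
<cfset application.cancelThread[threadId] = false>
<cfthread action="run"
          name="worker#replace(threadId, '-', '', 'all')#"
          threadId="#threadId#">
    <cftry>
        <cfloop index="i" from="1" to="100">
            <!--- check the flag between units of work, where stopping is safe --->
            <cfif application.cancelThread[attributes.threadId]>
                <cfbreak>
            </cfif>
            <!--- placeholder for one unit of real work --->
            <cfset sleep(100)>
        </cfloop>
        <cfcatch type="any">
            <cflog file="threadCancel" type="error" text="#cfcatch.message#">
        </cfcatch>
        <cffinally>
            <!--- always remove the entry, whether finished, cancelled or failed --->
            <cfset structDelete(application.cancelThread, attributes.threadId)>
        </cffinally>
    </cftry>
</cfthread>

<!--- Request sent by the cancel button --->
<cfif structKeyExists(form, "threadId")
      and structKeyExists(application.cancelThread, form.threadId)>
    <cfset application.cancelThread[form.threadId] = true>
</cfif>
```

The structKeyExists check in the cancel request avoids re-creating an entry for a thread that has already finished and cleaned up after itself.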
So here is my situation. I have a script which is hard-coded to always run two simultaneous threads (using cfthread). Basically the script takes an Excel or CSV file and splits the data into two sections, and my script creates webpages in parallel.
However, it is apparent from the pages created that thread two (the threads are being spawned in a loop) completes its section of work... however thread one only gets a small portion of its work complete... the script also dies part way through. So I'm confused why this is happening? The only concern I have is that the actual function I run, createPage(...), is itself spawning a thread to do its work which is causing the server to freak out. I can't tell if this is happening for two reasons... One I do not have access to server monitoring at that level, and two, the createpage(...) function is black-boxed because it is a proprietary API function.
Any advice? Also, is there any way to have that createPage(...), if it is spawning a thread in its implementation, wait to complete before calling it again?
Here is pseudocode of how my code works in general:
<cfloop index="i" from="1" to="2">
<!--- spawn one thread per half of the data --->
<cfthread ...>
<cfloop index="page" from="#pageWorkStart#" to="#pageWorkFinish#">
<!--- at this point two threads are calling createPage(), which may itself be spawning threads and not waiting for them to finish; the server is potentially choking on queued processes backing up --->
<cfset createPage(...)>
</cfloop>
</cfthread>
</cfloop>
Looking at some legacy code, I see that the programmer in question uses:
<cfthread action="run">
<cfexecute name="c:\myapp.exe" timeout="30">
</cfthread>
Can one safely replace the code above with this?
<cfexecute name="c:\myapp.exe" timeout="0">
Is CF going to spawn up a thread in the code above anyway? And is the thread going to be counted towards "Maximum number of threads available for CFTHREAD"?
If the intent is to have a non-blocking code flow, then you can safely replace the earlier code with yours.
In my understanding, CF does not create a thread when it gets timeout="0". It most likely just calls the exe (which creates a new process on the server) and never waits for the process to reply. So nothing is added to the thread limit count.
I'm getting a weird error from CFThread. I have it wrapped around a function that runs perfectly when outside CFThread. But, it takes about 20 seconds to complete so I shoot it off to CFThread then CFLocation the user to a new page and alert them when it's done.
It's also wrapped in CFTRY to email me should there be a problem.
I'm getting emails where the CFCATCH.Message is:
"CFThread failed to set header to response as request had already completed"
I can't find any reference to an error like this on Google. I'm assuming it's not liking the fact that I'm using CFLocation directly after invoking the Thread. So, for the hell of it, I tried using a META REFRESH to redirect the user instead. Same error result.
Any ideas?
UPDATED 7/8/13:
Code here:
<cfset admsID = replace(createUUID(),"-","","all")>
<cfthread action="run" name="runADMS#admsID#" admsID="#admsID#" formstruct="#form#">
<cftry>
<cfobject component="cfc.AutoDealerBrandMarketShare" name="adms">
<cfset rptPDF = adms.buildReport(dealer=formstruct.chosenDealer,mkt=formstruct.DMACode,make=formstruct.Make,rptID=admsID)>
<cfcatch type="any">
<cfmail to="pmascari@mysite.com" from="techsupport@mysite.com" subject="ADMS Error">
Error occurred running a Polk Auto Dealer Market Share report.
#cfcatch.Message#
#cfcatch.detail#
</cfmail>
</cfcatch>
</cftry>
</cfthread>
<cflocation url="http://www.usercanwaithere.com">
If you think about it, it makes sense, because cfthread can still be running after the response has been sent to the client. Setting something new in the header therefore no longer makes sense, because the "ship has sailed".
As you know, CFThread allows you to spawn a new thread do some
processing in parallel with the request. This thread can continue to
run even after the request has completed. Since this thread is not
connected to the HTTP request that spawned it, any operation done from
the thread which tries to change something in the HTTP
request/response - like setting header, cookie, response code etc
would not make sense and should not be done.
So one should not use cfcookie, cfheader, cfcontent etc inside
cfthread as it can cause unpredictable behavior.
-- Rupesh Kumar, Adobe ColdFusion engineer
Found it. Scoured through the code and found a random CFHEADER tag above one of the CFDocument tags.
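One way to keep response tags out of the thread is to let the thread only write a file and to send headers from a later, ordinary request. The sketch below illustrates that split. The thread name, the reports folder, the file name and the report body are all hypothetical, and it does not guard against the file being read while it is still being written.

```cfml
<!--- Inside the thread: only produce the file, touch nothing on the response --->
<cfset reportFile = expandPath("./reports/report.pdf")>
<cfthread action="run" name="buildReportThread" reportFile="#reportFile#">
    <cfdocument format="pdf" filename="#attributes.reportFile#" overwrite="yes">
        <p>Report body goes here</p>
    </cfdocument>
</cfthread>

<!--- Later, in a separate normal request: headers are safe here --->
<cfset reportFile = expandPath("./reports/report.pdf")>
<cfif fileExists(reportFile)>
    <cfheader name="Content-Disposition" value="attachment; filename=report.pdf">
    <cfcontent type="application/pdf" file="#reportFile#">
<cfelse>
    <p>The report is still being generated.</p>
</cfif>
```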
I have a link that will allow users to click it and it fetches a zip file of photos. If the zip file doesn't exist, it then starts a thread to create the zip and displays a message back to the user that the photos are currently being processed.
What I'm trying to avoid is a user clicking the link repeatedly and setting off a whole mass of threads that will try to create/update the zip file. The zip file processing is quite system resource intensive so I only want to allow the application to generate one zip at a time. If one is busy being compiled, it should just do nothing and queue the requests.
Currently how I am handling it is with a cflock around the thread:
<cflock name="createAlbumZip" type="exclusive" timeout="30" throwontimeout="no">
<cfthread action="run" albumId="#arguments.albumId#" name="#CreateUUID()#">
....
What I am hoping occurs here (it seems to be working if I test it) is that it will check if there is currently a thread running using a lock called 'createAlbumZip'. If there is, it will queue the request for 30 seconds after which it should timeout without any error. If it wasn't able to create it within the 30 seconds that is fine.
So - this seems to be working, but my question is: Is this the best way to handle a scenario like this? Is locking the right approach? Are there any shortcomings that could arise from this approach that I don't see?
There are a million ways to skin this cat. Locking is a good start, and as per your comment on @Pat Branley's answer, I think your locking outside the thread creation might be a little more efficient for just the reason you propose: the potential would exist to create dozens of threads whose whole lifespan will consist of waiting for a lock to open or time out.
The other thing you need to do is double up on the IF statement:
<cfif not (zip file exists)>
<cflock ...>
<cfif not (zip file exists)>
<cfthread>
...create zip...
</cfthread>
</cfif>
</cflock>
</cfif>
This will prevent the case where thread B is waiting while thread A creates the zip, and then thread A finishes, and thread B proceeds to recreate/overwrite it.
In addition, you could consider something like using JavaScript to prevent extra clicks by disabling the button/link after it's been clicked.
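For illustration, here is one concrete variant of the double check. Unlike the pseudocode above, it places the lock inside the thread so that the lock is held for as long as the zip is being written, and it builds the archive under a temporary name so that the existence check never sees a half-written file. The folder layout and the names zipPath, tmpPath and photoDir are assumptions, and arguments.albumId is assumed to be a plain numeric ID.

```cfml
<!--- hypothetical paths; adjust to the application's layout --->
<cfset zipPath  = expandPath("./zips/album_#arguments.albumId#.zip")>
<cfset tmpPath  = expandPath("./zips/album_#arguments.albumId#.partial.zip")>
<cfset photoDir = expandPath("./photos/#arguments.albumId#")>

<!--- first check: skip thread creation when the zip already exists --->
<cfif not fileExists(zipPath)>
    <cfthread action="run"
              name="zip#replace(createUUID(), '-', '', 'all')#"
              zipPath="#zipPath#" tmpPath="#tmpPath#" photoDir="#photoDir#">
        <cflock name="createAlbumZip" type="exclusive" timeout="30" throwontimeout="no">
            <!--- second check: another thread may have finished while this one waited --->
            <cfif not fileExists(attributes.zipPath)>
                <cfzip action="zip" file="#attributes.tmpPath#"
                       source="#attributes.photoDir#" overwrite="yes">
                <!--- publish under the final name only once the archive is complete --->
                <cfset fileMove(attributes.tmpPath, attributes.zipPath)>
            </cfif>
        </cflock>
    </cfthread>
</cfif>
```

The trade-off is the one discussed above: repeated clicks during zipping still spawn threads, but each of them only waits for the lock (up to 30 seconds) and then either finds the finished zip or gives up.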
I think you have that code the wrong way around. What you are saying is 'only one thread is allowed to spawn this new thread'. Now that may work in your case because you have the timeouts set such that nobody can create another thread, so there is no chance two threads are executing at once.
What you want to say is 'only one thread is allowed to make a zip'. So I would do this:
<cfthread .... >
<cflock>
...zip....