Immediately kill a running future thread - clojure

I'm using
(def f
  (future
    (while (not (Thread/interrupted))
      (function-to-run))))
(Thread/sleep 100)
(future-cancel f)
to cancel my code after a specified amount of time (100ms).
The problem is that I also need to cancel the already running call to function-to-run; it is important that it really stops executing that function after 100 ms.
Can I somehow propagate the interrupted signal to the function?
The function is not third-party, I wrote it myself.

The basic thing to note here is: you cannot safely kill a thread without its own cooperation. Since you are the owner of the function you wish to be able to kill prematurely, it makes sense to allow the function to cooperate and die gracefully and safely.
(defn function-to-run
  []
  (while work-not-done            ; work-not-done stands for your own loop condition
    (if-not (Thread/interrupted)
      ; ... do your work
      (throw (InterruptedException. "Function interrupted...")))))

(def t (Thread. (fn []
                  (try
                    (while true
                      (function-to-run))
                    (catch InterruptedException e
                      (println (.getMessage e)))))))
To start the thread:
(.start t)
To interrupt it:
(.interrupt t)
Your approach was not sufficient for your use case because the while condition was checked only after control flow returned from function-to-run, but you wanted to stop function-to-run during its execution. The approach here is only different in that the condition is checked more frequently, namely, every time through the loop in function-to-run. Note that instead of throwing an exception from function-to-run, you could also return some value indicating an error, and as long as your loop in the main thread checks for this value, you don't have to involve exceptions at all.
If your function-to-run doesn't feature a loop where you can perform the interrupted check, then it likely is performing some blocking I/O. You may not be able to interrupt this, though many APIs will allow you to specify a timeout on the operation. In the worst case, you can still perform intermittent checks for interrupted in the function around your calls. But the bottom line still applies: you cannot safely forcibly stop execution of code running in the function; it should yield control cooperatively.
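Applied to the original future-based version, a minimal sketch of the same cooperative pattern could look like the following (Thread/sleep stands in for a unit of real work; future-cancel interrupts the thread running the future, and the check inside function-to-run is what actually stops it):
(defn function-to-run []
  (doseq [chunk (range 1000)]
    (when (Thread/interrupted)                ; cooperative cancellation point
      (throw (InterruptedException. "Function interrupted...")))
    (Thread/sleep 1)))                        ; stand-in for one unit of real work

(def f (future
         (try
           (while true
             (function-to-run))
           (catch InterruptedException e
             (println (.getMessage e))))))

(Thread/sleep 100)
(future-cancel f)   ; interrupts the future's thread; the check above ends the work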
Note:
My original answer here presented an example that used Java's Thread.stop() (while strongly discouraging it). Based on feedback in the comments, I revised the answer to the one above.

Related

Libcurl - curl_multi_wakeup

Reading the description of curl_multi_wakeup:
Calling this function only guarantees to wake up the current (or the
next if there is no current) curl_multi_poll call, which means it is
possible that multiple calls to this function will wake up the same
waiting operation.
I am confused by the phrase - "the same waiting operation". How's that?
That is, suppose I have curl_multi_poll() waiting for events in thread "A".
Now, for example, I call the curl_multi_wakeup() function twice from thread "B" and thread "C".
And what happens judging by this phrase:
...function will wake up the same waiting operation.
Does it follow that curl_multi_poll() wakes up only once?
curl_multi_wakeup is meant to be used with a pool of threads waiting on curl_multi_poll.
What the document says is that if you call curl_multi_wakeup repeatedly, it will possibly wake up only a single thread, not necessarily one thread for each call to curl_multi_wakeup.
curl_multi_poll() is a relatively new call, designed to make it simple to "interrupt" threads that are waiting in it. Here's a good explanation:
https://daniel.haxx.se/blog/2019/12/09/this-is-your-wake-up-curl/
curl_multi_poll()
[is a] function which asks libcurl to wait for activity on any of the
involved transfers – or sleep and don’t return for the next N
milliseconds.
Calling this waiting function (or using the older curl_multi_wait() or
even doing a select() or poll() call “manually”) is crucial for a
well-behaving program. It is important to let the code go to sleep
like this when there’s nothing to do and have the system wake it up
again when it needs to do work. Failing to do this correctly, risk
having libcurl instead busy-loop somewhere and that can make your
application use 100% CPU during periods. That’s terribly unnecessary
and bad for multiple reasons.
When ... something happens and the application for example needs to
shut down immediately, users have been asking for a way to do a wake
up call.
curl_multi_wakeup() explicitly makes a curl_multi_poll() function
return immediately. It is designed to be possible to use from a
different thread.

Clojure dosync inside future vs future inside dosync

I have the following piece of code
(def number (ref 0))
(dosync (future (alter number inc))) ; A
(future (dosync (alter number inc))) ; B
The second one succeeds, but the first one fails with "No transaction running". But it is wrapped inside a dosync, right?
Does Clojure track open transactions based on the thread they were started in?
You are correct. The whole purpose of dosync is to begin a transaction in the current thread. The future runs its code in a new thread, so the alter in case A is not inside of a dosync for its thread.
For case B, the alter and dosync are both in the same (new) thread, so there is no problem.
There are multiple reasons this doesn't work. As Alan Thompson writes, transactions are homed to a single thread, and so when you create a new thread you lose your transaction.
Another problem is the dynamic scope of dosync. The same problem would arise if you wrote
((dosync #(alter number inc)))
Here we create a function inside of the dosync scope, and let that function be the result of the dosync. Then we call the function from outside of the dosync block, but of course the transaction is no longer running.
That's very similar to what you're doing with future: future creates a function and then executes it on a new thread, returning a handle you can use to inspect the progress of that thread. Even if cross-thread transactions were allowed, you would have a race condition here: does the dosync block close its transaction before or after the alter call in the future is executed?
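As a quick REPL check of the working form (case B), dereferencing the future makes the caller wait until the transaction on the other thread has finished:
(def number (ref 0))

@(future (dosync (alter number inc)))   ; the dosync and the alter share the future's thread

@number   ; => 1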

Queueing Method Calls So That They Are Performed By A Single Thread In Clojure

I'm building a wrapper around OrientDB in Clojure. One of the biggest limitations (IMHO) of OrientDB is that the ODatabaseDocumentTx is not thread-safe, and yet the lifetime of this thing from .open() to .close() is supposed to represent a single transaction, effectively forcing transactions to occur in a single thread. Indeed, thread-local refs to these hybrid database/transaction objects are provided by default. But what if I want to log in the same thread as I persist "real" state? If I hit an error, the log entries get rolled back too! That use case alone puts me off of virtually all DBMSs, since most do not allow named transaction scope management. /soapbox
Anyways, OrientDB is the way it is, and it's not going to change for me. I'm using Clojure and I want an elegant way to construct a with-tx macro such that all imperative database calls within the with-tx body are serialized.
Obviously, I can brute-force it by creating a sentinel at the top level of the with-tx generated body and deconstructing every form to the lowest level and wrapping them in a synchronized block. That's terrible, and I'm not sure how that would interact with something like pmap.
I can search the macro body for calls to the ODatabaseDocumentTx object and wrap those in synchronized blocks.
I can create some sort of dispatching system with an agent, I guess.
Or I can subclass ODatabaseDocumentTx with synchronized method calls.
I'm scratching my head trying to come up with other approaches. Thoughts? In general the agent approach seems more appealing simply because if a block of code has database method calls interspersed, I would rather do all the computation up front, queue the calls, and just fire a whole bunch of stuff to the DB at the end. That assumes, however, that the computation doesn't need to ensure consistency of reads. IDK.
Sounds like a job for Lamina.
One option would be to use an Executor with a single thread in its thread pool, something like shown below. You can create a nice macro around this concept.
(import 'java.util.concurrent.Executors)
(import 'java.util.concurrent.Callable)

(defmacro sync [executor & body]
  `(.get (.submit ~executor (proxy [Callable] []
                              (call []
                                (do ~@body))))))

; DatabaseTx., readfrom and writeto are placeholders for the real database calls.
(let [exe  (Executors/newFixedThreadPool (int 1))
      dbtx (sync exe (DatabaseTx.))]
  (do
    (sync exe (readfrom dbtx))
    (sync exe (writeto dbtx))))
The sync macro makes sure that the body expressions are executed on the executor (which has only one thread) and waits for the operation to complete, so that all operations execute one by one.
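Building on that idea, a rough sketch of the with-tx macro from the question might give each transaction its own single-thread executor, so that every call routed through it runs on the same thread. This is only an outline: *tx-executor* and the body placeholders are made up here, and the actual OrientDB open/commit/close calls would still need to be wired in.
(def ^:dynamic *tx-executor* nil)

(defmacro with-tx [& body]
  `(binding [*tx-executor* (Executors/newSingleThreadExecutor)]
     (try
       ~@body
       (finally
         (.shutdown *tx-executor*)))))

; Usage sketch: every database touch inside the body goes through the same
; thread via the sync macro above.
; (with-tx
;   (sync *tx-executor* (do-database-read))
;   (sync *tx-executor* (do-database-write)))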

How to control (i.e. abort) the current evaluation of a QScriptEngine

I evaluate JavaScript in my Qt application using QScriptEngine::evaluate(QString code). Let's say I evaluate a buggy piece of JavaScript which loops forever (or takes too long to wait for the result). How can I abort such an execution?
I want to control an evaluation via two buttons Run and Abort in a GUI. (But only one execution is allowed at a time.)
I thought of running the script via QtConcurrent::run, keeping the QFuture and calling cancel() when Abort is pressed. But the documentation says that I can't abort such executions. It seems like QFuture only cancels after the current item in the job has been processed, i.e. when reducing or filtering a collection. For QtConcurrent::run this means that I can't use the future to abort its execution.
The other possibility I came up with was using a QThread and calling quit(), but there I have a similar problem: it only stops the thread if / as soon as it is waiting in an event loop. Since my execution is a single function call, this is not an option either.
QThread also has terminate(), but the documentation makes me worry a bit. Although my code itself doesn't involve mutexes, maybe QScriptEngine::evaluate does behind the scenes?
Warning: This function is dangerous and its use is discouraged. The thread can be terminated at any point in its code path. Threads can be terminated while modifying data. There is no chance for the thread to clean up after itself, unlock any held mutexes, etc. In short, use this function only if absolutely necessary.
Is there another option I am missing, maybe some asynchronous evaluation feature?
http://doc.qt.io/qt-4.8/qscriptengine.html#details
It has a few sections that address your concerns:
http://doc.qt.io/qt-4.8/qscriptengine.html#long-running-scripts
http://doc.qt.io/qt-4.8/qscriptengine.html#script-exceptions
http://doc.qt.io/qt-4.8/qscriptengine.html#abortEvaluation
http://doc.qt.io/qt-4.8/qscriptengine.html#setProcessEventsInterval
Hope that helps.
While the concurrent task itself can't be aborted "from outside", the QScriptEngine can be told (of course from another thread, like your GUI thread) to abort the execution:
QScriptEngine::abortEvaluation(const QScriptValue & result = QScriptValue())
The optional parameter is used as the "pseudo result" which is passed to the caller of evaluate().
You should either set a flag somewhere or use a special result value in abortEvaluation() to make it possible for the caller routine to detect that the execution was aborted.
Note: Using isEvaluating() you can see if an evaluation is currently running.

What is the difference between Clojure's "send" and "send-off" functions with respect to dispatching an action to an agent?

The Clojure API describes these two functions as:
(send a f & args) - Dispatch an action to an agent. Returns the agent immediately. Subsequently, in a thread from a thread pool, the state of the agent will be set to the value of: (apply action-fn state-of-agent args)
and
(send-off a f & args) - Dispatch a potentially blocking action to an agent. Returns the agent immediately. Subsequently, in a separate thread, the state of the agent will be set to the value of: (apply action-fn state-of-agent args)
The only obvious difference is send-off should be used when an action may block. Can somebody explain this difference in functionality in greater detail?
All the actions that get sent to any agent using send run in a fixed thread pool with a couple more threads than the number of available processors. This keeps them running close to the CPU's full capacity: if you make 1000 calls using send, you don't incur much switching overhead, and the calls that can't be processed immediately just wait until a thread becomes available. But if those actions block, the small pool can run dry.
When you use send-off, the action is dispatched to an expandable pool that creates new threads as needed. If you send-off 1000 functions, the ones that can't be processed immediately still wait for the next available processor, and they may incur the extra overhead of starting a thread if the send-off thread pool happens to be running low, but it's fine for them to block because each task (potentially) gets a dedicated thread.
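A minimal sketch of the difference, with Thread/sleep standing in for a blocking call:
(def counter (agent 0))

(send counter inc)                        ; CPU-bound action: fine for the fixed send pool

(send-off counter (fn [n]                 ; blocking action: dispatch it to the expandable
                    (Thread/sleep 1000)   ; pool so it doesn't tie up a send-pool thread
                    (inc n)))

(await counter)   ; blocks until both actions dispatched from this thread have run
@counter          ; => 2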