clojure (with-timeout ... macro) - clojure

I'm looking for a macro that will throw an exception if an expression takes longer than X seconds to complete.

This question has better answers here:
Executing a function with a timeout
Futures to the rescue!
user=> (let [f (future (reduce * (range 1 1001)))]
(.get f 1 java.util.concurrent.TimeUnit/MILLISECONDS))
java.util.concurrent.TimeoutException (NO_SOURCE_FILE:0)
And to make a macro of it:
(defmacro time-limited [ms & body]
`(let [f# (future ~#body)]
(.get f# ~ms java.util.concurrent.TimeUnit/MILLISECONDS)))
So you can do this:
user=> (time-limited 1 (reduce * (range 1 1001)))
java.util.concurrent.TimeoutException (NO_SOURCE_FILE:0)
user=> (time-limited 1 (reduce * (range 1 101)))
93326215443944152681699238856266700490715968264381621468592963895217599993229915
608941463976156518286253697920827223758251185210916864000000000000000000000000

I'm not sure this is possible without running the expression in a separate thread. The reason being, if the thread is busy processing the expression, you can't inject code to throw an exception.
A version with a monitor thread that throws an exception if the expression takes too long is definitely possible, however, the exception thrown would be from the monitor thread, not the thread in which the expression is running. Then, there'd be no way of stopping it short of sending that thread an interrupt, which it might ignore if you haven't coded for it in the expression.
If it's acceptable to have a version which runs the expression in a separate thread, let me know and I can post some sample code. Otherwise, your best bet sound like it would be to write the main loop/recursion of the expression in such a way that it checks to see how long it has taken at every iteration and throws an exception if it's exceeded the bound. Sorry if that's not quite what you need...

I came across this thread recently while asking the same question. I wasn't completely satisfied with the answers given so I cobbled together an alternative solution. This solution will run your code in the current thread and spin of a future to interrupt it after a set timeout in ms.
(defn invoke-timeout [f timeout-ms]
(let [thr (Thread/currentThread)
fut (future (Thread/sleep timeout-ms)
(.interrupt thr))]
(try (f)
(catch InterruptedException e
(throw (TimeoutException. "Execution timed out!")))
(finally
(future-cancel fut)))))
(defmacro timeout [ms & body] `(invoke-timeout (fn [] ~#body) ~ms))
You would use it in your code like this:
(timeout 1000 your-code)
OR
(invoke-timeout #(your-code) 1000)
One caveat to keep in mind is that your-code must not catch the InterruptedException used to trigger the TimeoutException. I use this for testing and it works well.
See the Thread.interrupt() javadoc for additional caveats.
You can see this code in use here.

Related

What happens if a thread changes the value of an atom during `swap!`?

According to the official Clojure docs:
Since another thread may have changed the value in the intervening time, it [swap!] may have to
retry, and does so in a spin loop.
Does this mean that the thread performing the swap might possibly get stuck forever inside swap! if the atom in question never returns to the value read by swap!?
Yes. If you have one very slow mutation competing with a large number of fast operations, the slow operation will have to retry every time, and will not finish until all the fast operations finish. If the fast operations go on indefinitely, then the slow one will never finish.
For example, try:
(time
(let [a (atom 0)]
(future (dotimes [_ 1e9] (swap! a inc)))
(swap! a (fn [x] (Thread/sleep 1000) (* 2 x)))))
You'll see, first of all, that it takes a long time to finish, much longer than a second. This is because the swap! outside of the loop can't make any progress until the smaller tasks have all finished. You'll also see that the answer you get is exactly 2000000000, meaning that the doubling operation definitely happened last, after every single increment. If there were more increments, they would all get "priority".
I've additionally thought of a few cute ways to deadlock an atom forever without tying up any more threads at all!
One approach is to get the thread to race with itself:
(let [a (atom 0)]
(swap! a (fn [x]
(swap! a inc')
(inc' x))))
I use inc' so that it really is forever: it won't break after Long/MAX_VALUE.
And a way that doesn't even involve another swap! operation, much less another thread!
(swap! (atom (repeat 1)) rest)
Here the problem is that the .equals comparison in compare-and-swap never terminates, because (repeat 1) goes on forever.
No*. When the atom re-trys the operation, it uses the "new" value for the comparison.
You can find many examples online (like here and also here) where people have used from dozens to hundreds of threads to pound on an atom, and it always returns the correct result.
* Unless you have an infinite number of interruptions from competing, "faster" threads.

Call a side effecting function only when atom value changes

What is the simplest way to trigger a side-effecting function to be called only when an atom's value changes?
If I were using a ref, I think I could just do this:
(defn transform-item [x] ...)
(defn do-side-effect-on-change [] nil)
(def my-ref (ref ...))
(when (dosync (let [old-value #my-ref
_ (alter! my-ref transform-item)
new-value #my-ref]
(not= old-value new-value)))
(do-side-effect-on-change))
But this seems seems a bit roundabout, since I'm using a ref even though I am not trying to coordinate changes across multiple refs. Essentially I am using it just to conveniently access the old and new value within a successful transaction.
I feel like I should be able to use an atom instead. Is there a solution simpler than this?
(def my-atom (atom ...))
(let [watch-key ::side-effect-watch
watch-fn (fn [_ _ old-value new-value]
(when (not= old-value new-value)
(do-side-effect-on-change)))]
(add-watch my-atom watch-key watch-fn)
(swap! my-atom transform-item)
(remove-watch watch-key))
This also seems roundabout, because I am adding and removing the watch around every call to swap!. But I need this, because I don't want a watch hanging around that causes the side-effecting function to be triggered when other code modifies the atom.
It is important that the side-effecting function be called exactly once per mutation to the atom, and only when the transform function transform-item actually returns a new value. Sometimes it will return the old value, yielding new change.
(when (not= #a (swap! a transform))
(do-side-effect))
But you should be very clear about what concurrency semantics you need. For example another thread may modify the atom between reading it and swapping it:
a = 1
Thread 1 reads a as 1
Thread 2 modifies a to 2
Thread 1 swaps a from 2 to 2
Thread 1 determines 1 != 2 and calls do-side-effect
It is not clear to me from the question whether this is desirable or not desirable. If you do not want this behavior, then an atom just will not do the job unless you introduce concurrency control with a lock.
Seeing as you started with a ref and asked about an atom, I think you have probably given some thought to concurrency already. It seems like from your description the ref approach is better:
(when (dosync (not= #r (alter r transform))
(do-side-effect))
Is there a reason you don't like your ref solution?
If the answer is "because I don't have concurrency" Then I would encourage you to use a ref anyway. There isn't really a downside to it, and it makes your semantics explicit. IMO programs tend to grow and to a point where concurrency exists, and Clojure is really great at being explicit about what should happen when it exists. (For example oh I'm just calculating stuff, oh I'm just exposing this stuff as a web service now, oh now I'm concurrent).
In any case, bear in mind that functions like alter and swap! return the value, so you can make use of this for concise expressions.
I'm running into the same situation and just come up 2 solutions.
state field :changed?
Keeping a meanless :changed mark in atom to track swap function. And take the return value of swap! to see if things changed. For example:
(defn data (atom {:value 0 :changed? false}))
(let [{changed? :changed?} (swap! data (fn [data] (if (change?)
{:value 1 :changed? true}
{:value 0 :change? false})))]
(when changed? (do-your-task)))
exception based
You can throw an Exception in swap function, and catch it outside:
(try
(swap! data (fn [d] (if (changed?) d2 (ex-info "unchanged" {})))
(do-your-task)
(catch Exception _
))

Monitor goroutine and channel leaking / starvation?

I've got a number of workers running, pulling work items off a queue. Something like:
(def num-workers 100)
(def num-tied-up (atom 0))
(def num-started (atom 0))
(def input-queue (chan (dropping-buffer 200))
(dotimes [x num-workers]
(go
(swap num-started inc)
(loop []
(let [item (<! input-queue)]
(swap! num-tied-up inc)
(try
(process-f worker-id item)
(catch Exception e _))
(swap! num-tied-up dec))
(recur)))
(swap! num-started dec))))
Hopefully num-tied-up represents the number of workers performing work at a given point in time. The value of num-tied-up hovers around a fairly consistent 50, sometimes 60. As num-workers is 100 and the value of num-started is, as expected 100 (i.e. all the go routines are running), this feels like there's a comfortable margin.
My problem is that input-queue is growing. I would expect it to hover around the zero mark because there are enough workers to take items off it. But in practice, eventually it maxes out and drops events.
It looks like tied-up has plenty of head-room in num-workers so workers should be available to take work off the queue.
My questions:
Is there anything I can do to make this more robust?
Are there any other diagnostics I can use to work out what's going wrong? Is there a way to monitor the number of goroutines currently working in case they die?
Can you make the observations fit with the data?
When I fix the indentation of your code, this is what I see:
(def num-workers 100)
(def num-tied-up (atom 0))
(def num-started (atom 0))
(def input-queue (chan (dropping-buffer 200))
(dotimes [x num-workers]
(go
(swap num-started inc)
(loop []
(let [item (<! input-queue)]
(swap! num-tied-up inc)
(try
(process-f worker-id item)
(catch Exception e _))
(swap! num-tied-up dec))
(recur)))
(swap! num-started dec))
)) ;; These parens don't balance.
Ignoring the extra parens, which I assume are some copy/paste error, here are some observations:
You increment num-started inside of a go thread, but you decrement it outside of the go thread immediately after creating that thread. There's a good likelihood that the decrement is always happening before the increment.
The 100 loops you create (one per go thread) never terminate. This in and of itself isn't a problem, as long as this is intentional and by design.
Remember, spawning a go thread doesn't mean that the current thread (the one doing the spawning, in which the dotimes executes) will block. I may be mistaken, but it looks as though your code is making the assumption that (swap! num-started dec) will only run when the go thread spawned immediately above it is finished. But this is not true, even if your go threads did eventually finish (which, as mentioned above, they don't).
Work done by go routines shouldn't do any IO or blocking operation (like Thread/sleep) as all go routines share the same thread pool, which right now has a fixed size of cpus * 2 + 42. For IO bounded work, use core.async/thread.
Note that the thread pool limits the number of go routines that will be executing concurrently, but you can have plenty waiting to be executed.
As an analogy, if you start Chrome, Vim and iTunes (equivalent to 3 go routines) but you have only one CPU in your laptop (equivalent to a thread pool of size 1), then only one of them will be executing in the CPU and the other will be waiting to be executed. It is the OS the one that takes care of pausing/resuming the programs so it looks like they are all running at the same time. Core.async just does the same, but with the difference that core.async can just pause the go-routines when they hit a !.
Now to answer your questions:
No. Try/catch all exceptions is your best option. Also, monitor the size of the queue. I would reevaluate the need of core.async. Zach Tellman has a very nice thread pool implementation with plenty of metrics.
A thread dump will show you where all the core.async threads are blocking, but as I said, you shouldn't really be doing any IO work in the go-routine threads. Have a look at core.async/pipeline-blocking
If you have 4 cores, you will get a core.async thread pool of 50, which matches your observations of 50 concurrent go blocks. The other 50 go blocks are running but either waiting for work to appear in the queue or for a time-slot to be executed. Note that they all have had a chance to be executed at least once as num-started is 100.
Hope it helps.

couldn't use for loop in go block of core.async?

I'm new to clojure core.async library, and I'm trying to understand it through experiment.
But when I tried:
(let [i (async/chan)] (async/go (doall (for [r [1 2 3]] (async/>! i r)))))
it gives me a very strange exception:
CompilerException java.lang.IllegalArgumentException: No method in multimethod '-item-to-ssa' for dispatch value: :fn
and I tried another code:
(let [i (async/chan)] (async/go (doseq [r [1 2 3]] (async/>! i r))))
it have no compiler exception at all.
I'm totally confused. What happend?
So the Clojure go-block stops translation at function boundaries, for many reasons, but the biggest is simplicity. This is most commonly seen when constructing a lazy seq:
(go (lazy-seq (<! c)))
Gets compiled into something like this:
(go (clojure.lang.LazySeq. (fn [] (<! c))))
Now let's think about this real quick...what should this return? Assuming what you probably wanted was a lazy seq containing the value taken from c, but the <! needs to translate the remaining code of the function into a callback, but LazySeq is expecting the function to be synchronous. There really isn't a way around this limitation.
So back to your question if, you macroexpand for you'll see that it doesn't actually loop, instead it expands into a bunch of code that eventually calls lazy-seq and so parking ops don't work inside the body. doseq (and dotimes) however are backed by loop/recur and so those will work perfectly fine.
There are a few other places where this might trip you up with-bindings being one example. Basically if a macro sticks your core.async parking operations into a nested function, you'll get this error.
My suggestion then is to keep the body of your go blocks as simple as possible. Write pure functions, and then treat the body of go blocks as the places to do IO.
------------ EDIT -------------
By stops translation at function boundaries, I mean this: the go block takes its body and translates it into a state-machine. Each call to <! >! or alts! (and a few others) are considered state machine transitions where the execution of the block can pause. At each of those points the machine is turned into a callback and attached to the channel. When this macro reaches a fn form it stops translating. So you can only make calls to <! from inside a go block, not inside a function inside a code block.
This is part of the magic of core.async. Without the go macro, core.async code would look a lot like callback-hell in other langauges.

understanding Clojure futures

I am trying to understand Clojure futures, and I've seen examples from the common Clojure books out there, and there are examples where futures are used for parallel computations (which seems to makes sense).
However, I am hoping someone can explain the behavior of a simple example adapted from O'Reilly's Programming Clojure book.
(def long-calculation (future (apply + (range 1e8))))
When I try to dereference this, by doing
(time #long-calculation)
It returns the correct result (4999999950000000), but almost instantly (in 0.045 msecs) on my machine.
But when I call the actual function, like so
(time (apply + (range 1e8)))
I get the correct result as well, but the time taken is much larger (~ 5000 msecs).
When I dereference the future, my understanding is that a new thread is created on which the expression is evaluated - in which case I would expect it to take around 5000 msec as well.
How come the dereferenced future returns the correct result so quickly?
The calculation in a future starts as soon as you create the future (in a separate thread). In your case, the calculation starts as soon as you execute (def long-calculation ....)
Dereferencing will do one of two things:
If the future has not completed, block until it completes and then return the value (this could take an arbitrary amount of time, or even never complete if the future fails to terminate)
If the future has completed, return the result. This is almost instantaneous (which is why you are seeing very fast dereference returns)
You can see the effect by comparing the following:
;; dereference before future completes
(let [f (future (Thread/sleep 1000))]
(time #f))
=> "Elapsed time: 999.46176 msecs"
;; dereference after future completes
(let [f (future (Thread/sleep 1000))]
(Thread/sleep 2000)
(time #f))
=> "Elapsed time: 0.039598 msecs"