Time for control flow in Clojure - pitfalls of ScheduledThreadPoolExecutor? - clojure

I'm learning about concurrency in Clojure.
I ran into a claim (by Stuart Sierra?) at http://dev.clojure.org/display/design/Scheduled+Events, stating:
Clojure functions cannot use time for control flow without blocking or Java interop
Java interop (ScheduledThreadPoolExecutor) is not aware of thread-local bindings
I don't understand these claims and kindly ask for clarification, perhaps an example. Specifically:
What's wrong with ScheduledThreadPoolExecutor as is? Since I'm starting a new (green) thread, I don't expect per-thread bindings to carry over anyway.
I can schedule a normal Clojure function, so what's stopping me from passing the desired bindings along as a lexically closed-over context?
Many thanks!

Ok, I think I got it.
Suppose you try this:
(import '(java.util.concurrent ScheduledThreadPoolExecutor TimeUnit))
(def pool (atom nil))
;; Lazily create a single-threaded scheduler and cache it in the atom.
(defn- thread-pool []
  (or @pool
      (reset! pool (ScheduledThreadPoolExecutor. 1))))
(def ^:dynamic *t* 0)
(binding [*t* 1]
  (future (println "first example:" *t*)))
(binding [*t* 1]
  (.schedule (thread-pool) (fn [] (println "second example:" *t*)) 0
             TimeUnit/SECONDS))
(binding [*t* 1]
  (.schedule (thread-pool) (bound-fn [] (println "third example:" *t*)) 0
             TimeUnit/SECONDS))
The output will be:
first example: 1
second example: 0
third example: 1
In the first case, the future macro wraps the body with the private function binding-conveyor-fn, which preserves the calling thread's bindings frame in the lexical scope, and restores it before calling the wrapped function.
In the third case, bound-fn pushes the calling thread's bindings onto the frame, executes the function body, and pops the bindings.
In the second case, no one saves the per-thread bindings. A Java class certainly doesn't know about them, so we fall back to the root value of the *t* Var.
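To make the "save and restore" step concrete, here is a minimal sketch of conveying the bindings by hand with get-thread-bindings and with-bindings. The helper names schedule-with-bindings and scheduled-t-value are made up for illustration; this is roughly the idea behind bound-fn, not its actual implementation.

```clojure
(import '(java.util.concurrent Callable Executors ScheduledExecutorService TimeUnit))

(def ^:dynamic *t* 0)

;; Hypothetical helper: capture the caller's dynamic bindings now and
;; re-establish them on the pool thread when the task eventually runs.
(defn schedule-with-bindings
  [^ScheduledExecutorService pool delay-ms f]
  (let [frame (get-thread-bindings)            ; map of Var -> value on the calling thread
        ^Callable task (fn [] (with-bindings frame (f)))]
    (.schedule pool task (long delay-ms) TimeUnit/MILLISECONDS)))

;; Hypothetical demo: returns the value of *t* as seen by the scheduled task.
(defn scheduled-t-value []
  (let [pool (Executors/newScheduledThreadPool 1)]
    (try
      (binding [*t* 1]
        ;; deref works on any java.util.concurrent.Future
        @(schedule-with-bindings pool 10 (fn [] *t*)))
      (finally
        (.shutdown pool)))))
```

Calling (scheduled-t-value) should give the bound value rather than the root value, because the task re-installs the captured frame before reading *t*.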
I hope someone out there finds this interesting.

Related

Call a side effecting function only when atom value changes

What is the simplest way to trigger a side-effecting function to be called only when an atom's value changes?
If I were using a ref, I think I could just do this:
(defn transform-item [x] ...)
(defn do-side-effect-on-change [] nil)
(def my-ref (ref ...))
(when (dosync (let [old-value @my-ref
                    _ (alter my-ref transform-item)
                    new-value @my-ref]
                (not= old-value new-value)))
  (do-side-effect-on-change))
But this seems a bit roundabout, since I'm using a ref even though I am not trying to coordinate changes across multiple refs. Essentially I am using it just to conveniently access the old and new value within a successful transaction.
I feel like I should be able to use an atom instead. Is there a solution simpler than this?
(def my-atom (atom ...))
(let [watch-key ::side-effect-watch
      watch-fn (fn [_ _ old-value new-value]
                 (when (not= old-value new-value)
                   (do-side-effect-on-change)))]
  (add-watch my-atom watch-key watch-fn)
  (swap! my-atom transform-item)
  (remove-watch my-atom watch-key))
This also seems roundabout, because I am adding and removing the watch around every call to swap!. But I need this, because I don't want a watch hanging around that causes the side-effecting function to be triggered when other code modifies the atom.
It is important that the side-effecting function be called exactly once per mutation to the atom, and only when the transform function transform-item actually returns a new value. Sometimes it will return the old value, yielding no change.
(when (not= @a (swap! a transform))
  (do-side-effect))
But you should be very clear about what concurrency semantics you need. For example another thread may modify the atom between reading it and swapping it:
a = 1
Thread 1 reads a as 1
Thread 2 modifies a to 2
Thread 1 swaps a from 2 to 2
Thread 1 determines 1 != 2 and calls do-side-effect
It is not clear to me from the question whether this is desirable or not desirable. If you do not want this behavior, then an atom just will not do the job unless you introduce concurrency control with a lock.
Seeing as you started with a ref and asked about an atom, I think you have probably given some thought to concurrency already. It seems like from your description the ref approach is better:
(when (dosync (not= @r (alter r transform)))
  (do-side-effect))
Is there a reason you don't like your ref solution?
If the answer is "because I don't have concurrency", then I would encourage you to use a ref anyway. There isn't really a downside to it, and it makes your semantics explicit. IMO programs tend to grow to a point where concurrency exists, and Clojure is really great at being explicit about what should happen when it does. (For example: oh, I'm just calculating stuff; oh, I'm just exposing this stuff as a web service now; oh, now I'm concurrent.)
In any case, bear in mind that functions like alter and swap! return the value, so you can make use of this for concise expressions.
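If you are on Clojure 1.9 or later, swap-vals! returns both the old and the new value from the same successful swap, which sidesteps the read-then-swap race described above. A minimal sketch, where swap-and-notify! is a made-up helper name:

```clojure
;; Hypothetical helper: apply f to the atom and call on-change only when
;; the swap actually produced a different value.
;; Assumes Clojure 1.9+, where swap-vals! is available.
(defn swap-and-notify!
  [a f on-change]
  (let [[old-value new-value] (swap-vals! a f)]
    (when (not= old-value new-value)
      (on-change old-value new-value))
    new-value))
```

For example, (swap-and-notify! my-atom transform-item (fn [_ _] (do-side-effect-on-change))) would run the side effect at most once per call, and only if transform-item changed the value.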
I ran into the same situation and came up with two solutions.
state field :changed?
Keep an otherwise meaningless :changed? flag in the atom so the swap function can record whether it changed anything, and inspect the return value of swap! to see if things changed. For example:
(def data (atom {:value 0 :changed? false}))
(let [{changed? :changed?} (swap! data (fn [data]
                                         (if (change?)
                                           {:value 1 :changed? true}
                                           {:value 0 :changed? false})))]
  (when changed? (do-your-task)))
exception based
You can throw an exception inside the swap function and catch it outside:
(try
  ;; throwing aborts the swap, so the atom keeps its old value
  (swap! data (fn [d] (if (changed?) d2 (throw (ex-info "unchanged" {})))))
  (do-your-task)
  (catch Exception _
    ))

Why does dynamic binding affect the body of future?

I'm trying to reproduce pitfalls of dynamic vars described by Chas - http://cemerick.com/2009/11/03/be-mindful-of-clojures-binding/.
Consider the following snippet:
(def ^:dynamic *dynamic-x* 10)
(defn adder [arg]
(+ *dynamic-x* arg))
(adder 5) ;; returns 15
(binding [*dynamic-x* 20]
(adder 5)) ;; returns 25
(binding [*dynamic-x* 20]
@(future (adder 5))) ;; returns 25 (!)
Actually, I was expecting the third case to return 15, since the addition is performed on a separate thread and that thread's own value of *dynamic-x* (which I assumed to be 10) should be used. But, to my surprise, it returns 25.
Where am I wrong?
It is by design that future passes the dynamic bindings on to the thread in which the future body is executed. (Look at the source of future: it defers to future-call, which uses a function called binding-conveyor-fn that explicitly copies the thread-local bindings.)
The motivation for this IMO is that when using future you want to run your computation in the same "logical thread", only you're actually doing it in another Java Thread for performance reasons.
I agree it would deserve to be explicitly documented :)
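For contrast, a plain java.lang.Thread does not get the bindings conveyed, so the pitfall from the article still shows up there. A small sketch (adder-on-raw-thread and compare-threads are made-up names for illustration):

```clojure
(def ^:dynamic *dynamic-x* 10)

(defn adder [arg]
  (+ *dynamic-x* arg))

;; Hypothetical helper: run adder on a bare Thread, which knows nothing
;; about Clojure's dynamic bindings, and hand the result back via a promise.
(defn adder-on-raw-thread [arg]
  (let [result (promise)
        t (Thread. (fn [] (deliver result (adder arg))))]
    (.start t)
    (.join t)
    @result))

;; Hypothetical demo: same binding, two ways of running adder off-thread.
(defn compare-threads []
  (binding [*dynamic-x* 20]
    [(adder-on-raw-thread 5)   ; sees the root value 10
     @(future (adder 5))]))    ; sees the conveyed binding 20
```

(compare-threads) should return [15 25]: only the future sees the caller's binding.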

couldn't use for loop in go block of core.async?

I'm new to clojure core.async library, and I'm trying to understand it through experiment.
But when I tried:
(let [i (async/chan)] (async/go (doall (for [r [1 2 3]] (async/>! i r)))))
it gives me a very strange exception:
CompilerException java.lang.IllegalArgumentException: No method in multimethod '-item-to-ssa' for dispatch value: :fn
and I tried another code:
(let [i (async/chan)] (async/go (doseq [r [1 2 3]] (async/>! i r))))
it has no compiler exception at all.
I'm totally confused. What happened?
So the Clojure go-block stops translation at function boundaries, for many reasons, but the biggest is simplicity. This is most commonly seen when constructing a lazy seq:
(go (lazy-seq (<! c)))
Gets compiled into something like this:
(go (clojure.lang.LazySeq. (fn [] (<! c))))
Now let's think about this real quick... what should this return? Presumably you wanted a lazy seq containing the value taken from c. But <! needs to turn the remaining code of the function into a callback, while LazySeq expects the function to be synchronous. There really isn't a way around this limitation.
So, back to your question: if you macroexpand for, you'll see that it doesn't actually loop. Instead it expands into a bunch of code that eventually calls lazy-seq, and so parking ops don't work inside the body. doseq (and dotimes), however, are backed by loop/recur, so those will work perfectly fine.
There are a few other places where this might trip you up, with-bindings being one example. Basically, if a macro sticks your core.async parking operations into a nested function, you'll get this error.
My suggestion then is to keep the body of your go blocks as simple as possible. Write pure functions, and then treat the body of go blocks as the places to do IO.
------------ EDIT -------------
By "stops translation at function boundaries", I mean this: the go block takes its body and translates it into a state machine. Each call to <!, >! or alts! (and a few others) is considered a state machine transition where the execution of the block can pause. At each of those points the machine is turned into a callback and attached to the channel. When this macro reaches a fn form, it stops translating. So you can only make calls to <! directly inside a go block, not inside a function nested inside a go block.
This is part of the magic of core.async. Without the go macro, core.async code would look a lot like callback hell in other languages.
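As a rough sketch of the doseq variant that does work, assuming org.clojure/core.async is on the classpath (put-all! and collect-demo are made-up names for illustration):

```clojure
(require '[clojure.core.async :as async])

;; Hypothetical helper: put every element of coll onto ch from inside a go
;; block, then close the channel. doseq expands to loop/recur, so the >!
;; stays directly inside the go body and can be translated.
(defn put-all! [ch coll]
  (async/go
    (doseq [x coll]
      (async/>! ch x))
    (async/close! ch)))

;; Hypothetical demo: drain the channel into a vector on the calling thread.
(defn collect-demo []
  (let [ch (async/chan)]
    (put-all! ch [1 2 3])
    (async/<!! (async/into [] ch))))
```

(collect-demo) should give back [1 2 3]. Swapping the doseq for (doall (for ...)) would reintroduce the nested fn and the compile error from the question.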

Editing running program with infinite loop

In this video (http://vimeo.com/14709925), someone edits a running program that renders OpenGL stuff in a loop.
When I run this:
(def a 10)
(defn myloop
[]
(while (= 1 1)
(println a)
(Thread/sleep 1000)))
(myloop)
and then change the value of a, re-evaluating does nothing; the printed value doesn't seem to change. I'm using the LightTable IDE. Should I switch to Emacs?
One possibility is that the re-evaluation isn't taking place because it is done on the same thread as the running program. Try running myloop in another thread with (future (myloop)) instead of (myloop), then re-def your a after a few prints and see if it changes.
Note that (in current Clojure versions) all vars are dereferenced each time they are encountered, which allows for this dynamic behavior, but re-def-ing outside of interactive testing, experimentation, or demonstration is frowned upon. See atoms and refs.
Another consequence of this behavior of vars is that dereferencing can hurt performance in critical tight loops. Where the dynamic behavior is not needed, you might see the following idiom to capture the value first (note that you generally shouldn't attempt this kind of optimization until bottlenecks have been identified).
(def foo 42)
(let [foo foo] ; capture value of foo within scope of let
(loop ...
; do something with value of foo captured before entering loop
... ))
I know this isn't a direct answer to your question - but if you want to mutate state in this way in Clojure, I think it is probably more idiomatic to use one of the constructs for state manipulation (for example, an atom) rather than re-evaluating a def.
This is especially true if you're likely to need multiple threads, which might well be the case if you're working with graphics.
(def a (atom 10))
(defn myloop []
  (while (= 1 1)
    (println @a)
    (Thread/sleep 1000)))
;; run the loop on another thread so the REPL stays free for reset!
(future (myloop))
(reset! a 9)

Threadlocal counter in Clojure

I have a web app where i want to be able to track the number of times a given function is called in a request (i.e. thread).
I know that it is possible to do in a non-thread local way with a ref, but how would I go about doing it thread locally?
There's a tool for this in the useful library called thread-local. You can write, for example, (def counter (thread-local (atom 0))). This will create a global variable which, when derefed, will yield a fresh atom per thread. So you could read the current value with @@counter, or increment it with (swap! @counter inc). Of course, you could also get hold of the atom itself with @counter and just treat it like a normal atom from then on.
You can use a dynamic global var, bound to a value with binding, in combination with the special form set! to change its value. Vars bound with binding are thread-local. The following increments *counter* every time my-fn is called from any form evaluated inside a with-counter call:
(def ^{:dynamic true} *counter*)
(defmacro with-counter [& body]
  `(binding [*counter* 0]
     ~@body
     *counter*))
(defn my-fn []
  (set! *counter* (inc *counter*)))
To demonstrate, try:
(with-counter (doall (repeatedly 5 my-fn)))
;; ==> 5
For more information, see http://clojure.org/vars#set
You can keep an instance of ThreadLocal in a ref. Every time you need to increase it, just read the value, increase it, and set it back. At the beginning of each request you should initialize the thread local to 0, because threads may be reused for different requests.
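A minimal sketch of that ThreadLocal approach. For simplicity it holds the ThreadLocal in a plain var rather than a ref, and the names request-counter, reset-counter!, inc-counter! and current-count are made up for illustration:

```clojure
;; Assumption: one ThreadLocal shared by the whole app, held in a var.
(def ^ThreadLocal request-counter (ThreadLocal.))

;; Call at the start of each request, since pooled threads are reused.
(defn reset-counter! []
  (.set request-counter 0))

;; Read, increment, and write back the current thread's count.
(defn inc-counter! []
  (let [n (inc (or (.get request-counter) 0))]
    (.set request-counter n)
    n))

;; Current thread's count, or 0 if nothing has been counted yet.
(defn current-count []
  (or (.get request-counter) 0))
```

Each thread sees only its own count, so a function calling (inc-counter!) three times during one request leaves (current-count) at 3 on that thread and untouched everywhere else.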