Clojure's ref vs atom in concurrency

(ns learnclojure.core)

;; Version 1: atoms
(def acct1 (atom 1000 :validator #(>= % 0)))
(def acct2 (atom 1000 :validator #(>= % 0)))
(defn transfer [from-ac to-ac amt]
  (swap! to-ac + amt)
  (swap! from-ac - amt))
(dotimes [_ 10]
  (future (transfer acct2 acct1 100)))
(deref acct1)
(deref acct2)

;; Version 2: refs
(def acct1 (ref 1000 :validator #(>= % 0)))
(def acct2 (ref 1000 :validator #(>= % 0)))
(defn transfer [from-ac to-ac amt]
  (dosync
    (alter to-ac + amt)
    (alter from-ac - amt)))
(dotimes [_ 10]
  (future (transfer acct2 acct1 100)))
(deref acct1)
(deref acct2)
I have two pieces of Clojure code that change state concurrently.
The first one, which uses atoms, seems to work fine, whereas the second one, which uses refs, shows random results. What might be wrong?

The last (deref acct1) (deref acct2) forms are evaluated before the futures are done executing.
What's more, the result is inconsistent because the reads are not coordinated; if you had written something like (dosync [(deref acct1) (deref acct2)]) the sum would always be 2000.
By the way, I strongly recommend you do not re-define the #'transfer, #'acct1 and #'acct2 vars for this kind of concurrency experiment; choose different names :)
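A minimal sketch of both points, assuming fresh names (`acct-a`, `acct-b`, `transfer!`, `transfers`, `snapshot` are all made up here) so that the earlier vars are not redefined. It keeps a handle on every future, waits for all of them, and only then reads both refs inside one transaction:

```clojure
;; Hypothetical names, chosen so the vars from the question are not redefined.
(def acct-a (ref 1000 :validator #(>= % 0)))
(def acct-b (ref 1000 :validator #(>= % 0)))

(defn transfer! [from to amt]
  (dosync
    (alter to + amt)
    (alter from - amt)))

;; doall forces the lazy seq so that all ten futures are actually started.
(def transfers
  (doall (for [_ (range 10)]
           (future (transfer! acct-b acct-a 100)))))

;; deref blocks until the future has finished.
(run! deref transfers)

;; Coordinated read: both values come from the same transaction snapshot.
(def snapshot (dosync [@acct-a @acct-b]))
(println snapshot)
```

Once every future has been dereferenced, the two balances no longer depend on timing. Reading inside `dosync` additionally guarantees a consistent pair even if transfers were still running.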

Related

Implementing Clojure conditional/branching transducer

I'm trying to make a conditional transducer in Clojure as follows:
(defn if-xf
"Takes a predicate and two transducers.
Returns a new transducer that routes the input to one of the transducers
depending on the result of the predicate."
[pred a b]
(fn [rf]
(let [arf (a rf)
brf (b rf)]
(fn
([] (rf))
([result]
(rf result))
([result input]
(if (pred input)
(arf result input)
(brf result input)))))))
It is pretty useful in that it lets you do stuff like this:
;; multiply odd numbers by 100, square the evens.
(= [0 100 4 300 16 500 36 700 64 900]
(sequence
(if-xf odd? (map #(* % 100)) (map (fn [x] (* x x))))
(range 10)))
However, this conditional transducer does not work very well with transducers that perform cleanup in their 1-arity branch:
;; negs are multiplied by 100, non-negs are partitioned by 2
;; BUT! where did 6 go?
;; expected: [-600 -500 -400 -300 -200 -100 [0 1] [2 3] [4 5] [6]]
;;
(= [-600 -500 -400 -300 -200 -100 [0 1] [2 3] [4 5]]
(sequence
(if-xf neg? (map #(* % 100)) (partition-all 2))
(range -6 7)))
Is it possible to tweak the definition of if-xf to handle the case of transducers with cleanup?
I'm trying this, but with weird behavior:
(defn if-xf
"Takes a predicate and two transducers.
Returns a new transducer that routes the input to one of the transducers
depending on the result of the predicate."
[pred a b]
(fn [rf]
(let [arf (a rf)
brf (b rf)]
(fn
([] (rf))
([result]
(arf result) ;; new!
(brf result) ;; new!
(rf result))
([result input]
(if (pred input)
(arf result input)
(brf result input)))))))
Specifically, the flushing happens at the end:
;; the [0] at the end should appear just before the 100.
(= [[-6 -5] [-4 -3] [-2 -1] 100 200 300 400 500 600 [0]]
(sequence
(if-xf pos? (map #(* % 100)) (partition-all 2))
(range -6 7)))
Is there a way to make this branching/conditional transducer without storing the entire input sequence in local state within this transducer (i.e. doing all the processing in the 1-arity branch upon cleanup)?
The idea is to complete every time the transducer switches over. IMO this is the only way to do it without buffering:
(defn if-xf
  "Takes a predicate and two transducers.
  Returns a new transducer that routes the input to one of the transducers
  depending on the result of the predicate."
  [pred a b]
  (fn [rf]
    (let [arf (volatile! (a rf))
          brf (volatile! (b rf))
          a? (volatile! nil)]
      (fn
        ([] (rf))
        ([result]
         (let [crf (if @a? @arf @brf)]
           (-> result crf rf)))
        ([result input]
         (let [p? (pred input)
               [xrf crf] (if p? [@arf @brf] [@brf @arf])
               switched? (some-> @a? (not= p?))
               ;; keep the reduction result so it can be returned after updating a?
               ret (if switched?
                     (-> result crf (xrf input))
                     (xrf result input))]
           (vreset! a? p?)
           ret))))))
(sequence (if-xf pos? (map #(* % 100)) (partition-all 2)) [0 1 0 1 0 0 0 1])
; => ([0] 100 [0] 100 [0 0] [0] 100)
I think your question is ill-defined. What exactly do you want to happen when the transducers have state? For example, what do you expect this to do:
(sequence
(if-xf even? (partition-all 3) (partition-all 2))
(range 14))
Furthermore, sometimes reducing functions have work to do at the beginning and the end and can't be restarted arbitrarily. For example, here is a reducer that computes the mean:
(defn mean
([] {:count 0, :sum 0})
([result] (double (/ (:sum result) (:count result))))
([result x]
(update-in
(update-in result [:count] inc)
[:sum] (partial + x))))
(transduce identity mean [10 20 40 40]) ;27.5
Now let's take the average, where anything below 20 counts for 20, but everything else is decreased by 1:
(transduce
(if-xf
(fn [x] (< x 20))
(map (constantly 20))
(map dec))
mean [10 20 40 40]) ;29.25
My answer is the following: I think your original solution is best. It works well using map, which is how you stated the usefulness of the conditional transducer in the first place.
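To make the question about stateful branches concrete, here is a sketch of the flush-at-completion variant from the question, rewritten so that it threads the result through both branches instead of discarding it. The names `if-xf-flush` and `inner` are made up for this illustration; this is one possible answer to "what should happen", not the only one:

```clojure
(defn if-xf-flush
  "Hypothetical variant of if-xf: routes each input to one of two transducers
  and flushes both branches once, when the whole reduction completes."
  [pred a b]
  (fn [rf]
    (let [;; Downstream seen by the branches: its completion arity is a no-op,
          ;; so rf itself is completed exactly once, at the very end.
          inner (fn
                  ([result] result)
                  ([result input] (rf result input)))
          arf (a inner)
          brf (b inner)]
      (fn
        ([] (rf))
        ([result] (-> result arf brf rf)) ; flush a, then b, then complete rf
        ([result input]
         (if (pred input)
           (arf result input)
           (brf result input)))))))

(into [] (if-xf-flush neg? (map #(* % 100)) (partition-all 2)) (range -6 7))
;; => [-600 -500 -400 -300 -200 -100 [0 1] [2 3] [4 5] [6]]

(into [] (if-xf-flush even? (partition-all 3) (partition-all 2)) (range 14))
;; => [[1 3] [0 2 4] [5 7] [6 8 10] [9 11] [12] [13]]
```

The second result shows why the semantics need pinning down: each partition is emitted when its own buffer fills, so the branches interleave, and the leftovers only appear at the very end.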

Efficient side-effect-only analogue of Clojure's map function

What if map and doseq had a baby? I'm trying to write a function or macro like Common Lisp's mapc, but in Clojure. This does essentially what map does, but only for side-effects, so it doesn't need to generate a sequence of results, and wouldn't be lazy. I know that one can iterate over a single sequence using doseq, but map can iterate over multiple sequences, applying a function to each element in turn of all of the sequences. I also know that one can wrap map in dorun. (Note: This question has been extensively edited after many comments and a very thorough answer. The original question focused on macros, but those macro issues turned out to be peripheral.)
This is fast (according to criterium):
(defn domap2
[f coll]
(dotimes [i (count coll)]
(f (nth coll i))))
but it only accepts one collection. This accepts arbitrary collections:
(defn domap3
[f & colls]
(dotimes [i (apply min (map count colls))]
(apply f (map #(nth % i) colls))))
but it's very slow by comparison. I could also write a version like the first, but with different parameter cases [f c1 c2], [f c1 c2 c3], etc., but in the end, I'll need a case that handles arbitrary numbers of collections, like the last example, which is simpler anyway. I've tried many other solutions as well.
Since the second example is very much like the first except for the use of apply and the map inside the loop, I suspect that getting rid of them would speed things up a lot. I have tried to do this by writing domap2 as a macro, but the way that the catch-all variable after & is handled keeps tripping me up, as illustrated above.
Other examples (out of 15 or 20 different versions), benchmark code, and times on a Macbook Pro that's a few years old (full source here):
(defn domap1
[f coll]
(doseq [e coll]
(f e)))
(defn domap7
[f coll]
(dorun (map f coll)))
(defn domap18
[f & colls]
(dorun (apply map f colls)))
(defn domap15
[f coll]
(when (seq coll)
(f (first coll))
(recur f (rest coll))))
(defn domap17
[f & colls]
(let [argvecs (apply (partial map vector) colls)] ; seq of ntuples of interleaved vals
(doseq [args argvecs]
(apply f args))))
I'm working on an application that uses core.matrix matrices and vectors, but feel free to substitute your own side-effecting functions below.
(ns tst
(:use criterium.core
[clojure.core.matrix :as mx]))
(def howmany 1000)
(def a-coll (vec (range howmany)))
(def maskvec (zero-vector :vectorz howmany))
(defn unmaskit!
[idx]
(mx/mset! maskvec idx 1.0)) ; sets element idx of maskvec to 1.0
(defn runbench
[domapfn label]
(print (str "\n" label ":\n"))
(bench (def _ (domapfn unmaskit! a-coll))))
Mean execution times according to Criterium, in microseconds:
domap1: 12.317551 [doseq]
domap2: 19.065317 [dotimes]
domap3: 265.983779 [dotimes with apply, map]
domap7: 53.263230 [map with dorun]
domap18: 54.456801 [map with dorun, multiple collections]
domap15: 32.034993 [recur]
domap17: 95.259984 [doseq, multiple collections interleaved using map]
EDIT: It may be that dorun+map is the best way to implement domap for multiple large lazy sequence arguments, but doseq is still king when it comes to single lazy sequences. Performing the same operation as unmaskit! above, but running the index through (mod idx 1000), and iterating over (range 100000000), doseq is about twice as fast as dorun+map in my tests (i.e. (def domap25 (comp dorun map))).
You don't need a macro, and I don't see why a macro would be helpful here.
user> (defn do-map [f & lists] (apply mapv f lists) nil)
#'user/do-map
user> (do-map (comp println +) (range 2 6) (range 8 11) (range 22 40))
32
35
38
nil
Note that do-map here is eager (thanks to mapv) and is only executed for its side effects.
Macros can use varargs lists, as the (useless!) macro version of do-map demonstrates:
user> (defmacro do-map-macro [f & lists] `(do (mapv ~f ~@lists) nil))
#'user/do-map-macro
user> (do-map-macro (comp println +) (range 2 6) (range 8 11) (range 22 40))
32
35
38
nil
user> (macroexpand-1 '(do-map-macro (comp println +) (range 2 6) (range 8 11) (range 22 40)))
(do (clojure.core/mapv (comp println +) (range 2 6) (range 8 11) (range 22 40)) nil)
Addendum:
addressing the efficiency / garbage-creation concerns:
note that below I truncate the output of the criterium bench function, for conciseness reasons:
(defn do-map-loop
[f & lists]
(loop [heads lists]
(when (every? seq heads)
(apply f (map first heads))
(recur (map rest heads)))))
user> (crit/bench (with-out-str (do-map-loop (comp println +) (range 2 6) (range 8 11) (range 22 40))))
...
Execution time mean : 11.367804 µs
...
This looks promising because it doesn't create a data structure that we aren't using anyway (unlike mapv above). But it turns out it is slower than the previous (maybe because of the two map calls?).
user> (crit/bench (with-out-str (do-map-macro (comp println +) (range 2 6) (range 8 11) (range 22 40))))
...
Execution time mean : 7.427182 µs
...
user> (crit/bench (with-out-str (do-map (comp println +) (range 2 6) (range 8 11) (range 22 40))))
...
Execution time mean : 8.355587 µs
...
Since the loop still wasn't faster, let's try a version which specializes on arity, so that we don't need to call map twice on every iteration:
(defn do-map-loop-3
[f a b c]
(loop [[a & as] a
[b & bs] b
[c & cs] c]
(when (and a b c)
(f a b c)
(recur as bs cs))))
Remarkably, though this is faster, it is still slower than the version that just used mapv:
user> (crit/bench (with-out-str (do-map-loop-3 (comp println +) (range 2 6) (range 8 11) (range 22 40))))
...
Execution time mean : 9.450108 µs
...
Next I wondered if the size of the input was a factor. With larger inputs...
user> (def test-input (repeatedly 3 #(range (rand-int 100) (rand-int 1000))))
#'user/test-input
user> (map count test-input)
(475 531 511)
user> (crit/bench (with-out-str (apply do-map-loop-3 (comp println +) test-input)))
...
Execution time mean : 1.005073 ms
...
user> (crit/bench (with-out-str (apply do-map (comp println +) test-input)))
...
Execution time mean : 756.955238 µs
...
Finally, for completeness, the timing of do-map-loop (which as expected is slightly slower than do-map-loop-3)
user> (crit/bench (with-out-str (apply do-map-loop (comp println +) test-input)))
...
Execution time mean : 1.553932 ms
As we see, even with larger input sizes, mapv is faster.
(I should note for completeness here that map is slightly faster than mapv, but not by a large degree).
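Putting the pieces of this thread together, a side-effect-only map could be sketched as a multi-arity function that avoids `apply` for the common one- and two-collection cases and falls back to `dorun` plus `map` otherwise. The name `domap` and the choice of arities are assumptions made for this illustration, and no timings are claimed for it:

```clojure
(defn domap
  "Hypothetical eager, side-effect-only analogue of map. Always returns nil."
  ([f coll]
   ;; run! (Clojure 1.7+) reduces over coll purely for side effects.
   (run! f coll))
  ([f c1 c2]
   ;; Walk both seqs in step and stop at the end of the shorter one.
   (loop [s1 (seq c1), s2 (seq c2)]
     (when (and s1 s2)
       (f (first s1) (first s2))
       (recur (next s1) (next s2)))))
  ([f c1 c2 & colls]
   ;; General case: let map do the interleaving and discard its results.
   (dorun (apply map f c1 c2 colls))))

(domap (comp println +) (range 2 6) (range 8 11) (range 22 40))
;; prints 32, 35 and 38, one per line
```

Whether the specialized arities pay off would have to be measured with Criterium on the actual workload.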

How to write concurrent program in clojure using ref, agent, and future?

I am writing a program that allows two futures to transfer money from account A to account B. Each future will try to transfer its amount and then sleep for a while. My program compiles without errors, so I don't know where to start debugging it. It is supposed to print out text, but it does not. Can someone tell me what is happening with my program?
;here are initial amounts of balance A and B
(def balanceA {ref 1000})
(def balanceB {ref 2000})
;agent will count a number of complete transfer
(def agentCount {agent 1})
; this func will do the transfer with the waitingTime/sleep
(defn transfer [balanceA balanceB amount futureNum waitingTime]
)
; Two futures will repeat 10 times doing the transactions and print out the balances
(dotimes [n 10](def futureA (future transfer(balanceA balanceB 20 1 (rand-int 100)) (prn "result" @balanceA @balanceB))))
(dotimes [n 10](def futureB (future transfer(balanceA balanceB 15 2 (rand-int 40))(prn "result" @balanceA @balanceB))))
(shutdown-agents)
UPDATED: thanks to @noisesmith for the comment/correction!
You just have a misplaced parenthesis in your transfer call and an error in your ref and agent definitions.
The erroneous definitions use {ref 1000} and {agent 1} instead of (ref 1000) and (agent 1):
(def balanceA (ref 1000))
(def balanceB (ref 2000))
(def agentCount (agent 1))
and the parenthesis:
(future transfer (balanceA ... => (future (transfer balanceA ...
You have to change this:
(dotimes [n 10](def futureA (future transfer(balanceA balanceB 20 1 (rand-int 100)) (prn "result" @balanceA @balanceB))))
(dotimes [n 10](def futureB (future transfer(balanceA balanceB 15 2 (rand-int 40))(prn "result" @balanceA @balanceB))))
to this:
(dotimes [n 10](def futureA (future (transfer balanceA balanceB 20 1 (rand-int 100)) (prn "result" @balanceA @balanceB))))
(dotimes [n 10](def futureB (future (transfer balanceA balanceB 15 2 (rand-int 40)) (prn "result" @balanceA @balanceB))))
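For reference, a complete version could look roughly like the sketch below. The body of the transfer function is left empty in the question, so everything inside `transfer!` is an assumption, as are the names `balance-a`, `balance-b`, `transfer-count` and `workers`. Each future loops ten times itself instead of re-defining a var on every iteration:

```clojure
(def balance-a (ref 1000))
(def balance-b (ref 2000))
(def transfer-count (agent 0)) ; counts completed transfers

(defn transfer!
  "Hypothetical transfer: moves amount from one ref to the other,
  records the transfer in the agent, then sleeps for wait-ms."
  [from to amount wait-ms]
  (dosync
    (alter from - amount)
    (alter to + amount))
  (send transfer-count inc)
  (Thread/sleep (long wait-ms)))

(def workers
  [(future (dotimes [_ 10] (transfer! balance-a balance-b 20 (rand-int 100))))
   (future (dotimes [_ 10] (transfer! balance-a balance-b 15 (rand-int 40))))])

(run! deref workers)   ; wait for both futures to finish
(await transfer-count) ; wait for all queued agent actions
(println "result" @balance-a @balance-b @transfer-count)
;; In a standalone script, (shutdown-agents) at the very end lets the JVM exit promptly.
```

Because both futures are dereferenced and the agent is awaited before printing, the printed values are the final ones rather than a snapshot taken mid-run.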

Can one monitor STM's contention level?

Is there any way to poll whether Clojure's STM transactions are being retried, and at what rate?
You can observe the history count of a ref which will indicate that there is contention on it:
user=> (def my-ref (ref 0 :min-history 1))
#'user/my-ref
user=> (ref-history-count my-ref)
0
user=> (dosync (alter my-ref inc))
1
user=> (ref-history-count my-ref)
1
The history count does not directly represent contention. Instead it represents the number of past values that have been maintained in order to service concurrent reads.
The size of the history is limited by min and max values. By default those are 0 and 10, respectively, but you can change them when creating the ref (see above). Since min-history is 0 by default, you won't usually see ref-history-count return non-zero values, unless there is contention on the ref.
See more discussion on history count here: https://groups.google.com/forum/?fromgroups#!topic/clojure/n_MKCoa870o
I don't think there is any way, provided by clojure.core, to observe the rate of STM transactions at the moment. You can of course do something similar to what @Chouser did in his history stress test:
(dosync
(swap! try-count inc)
...)
i.e. increment a counter inside the transaction. The increment will happen every time the transaction is tried. If try-count is larger than 1, the transaction was retried.
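A small runnable sketch of that counting idea, with made-up names (`counter`, `try-count`, `bump!`): four futures hammer one ref, and the difference between attempts and commits gives the number of retries.

```clojure
(def counter (ref 0))
(def try-count (atom 0))

(defn bump! []
  (dosync
    ;; swap! is a side effect, so it runs on every attempt, retries included.
    (swap! try-count inc)
    (alter counter inc)))

(def workers
  (doall (repeatedly 4 #(future (dotimes [_ 1000] (bump!))))))

(run! deref workers) ; wait for all four futures

(let [commits @counter
      tries   @try-count]
  (println "commits:" commits "tries:" tries "retries:" (- tries commits)))
```

The number of commits is fixed by the workload (4 x 1000 here), while the number of tries varies from run to run with the amount of contention.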
By introducing named dosync blocks and commit counts (the times a named dosync has succeeded), one can quite easily keep track of the times threads have retried a given transaction.
(def ^{:doc "ThreadLocal<Map<TxName, Map<CommitNumber, TriesCount>>>"}
  local-tries (let [l (ThreadLocal.)]
                (.set l {})
                l))
(def ^{:doc "Map<TxName, Int>"}
  commit-number (ref {}))
(def ^{:doc "Map<ThreadId, Map<TxName, Map<CommitNumber, TriesCount>>>"}
  history (atom {}))
(defn report [_ thread-id tries]
  (swap! history assoc thread-id tries))
(def reporter (agent nil))
;; Note: this shadows clojure.core/dosync in the current namespace.
(defmacro dosync [tx-name & body]
  `(clojure.core/dosync
     (let [cno# (@commit-number ~tx-name 0)
           tries# (update-in (.get local-tries) [~tx-name] update-in [cno#] (fnil inc 0))]
       (.set local-tries tries#)
       (send reporter report (.getId (Thread/currentThread)) tries#))
     ~@body
     (alter commit-number update-in [~tx-name] (fnil inc 0))))
Given the following example...
(def foo (ref {}))
(def bar (ref {}))
(defn x []
  (dosync :x ;; `:x`: the tx-name.
    (let [r (rand-int 2)]
      (alter foo assoc r (rand))
      (Thread/sleep (rand-int 400))
      (alter bar assoc (rand-int 2) (@foo r)))))
(dotimes [i 4]
(future
(dotimes [i 10]
(x))))
...@history evaluates to:
;; {thread-id {tx-name {commit-number tries-count}}}
{40 {:x {3 1, 2 4, 1 3, 0 1}}, 39 {:x {2 1, 1 3, 0 1}}, ...}
This additional implementation is substantially simpler.
;; {thread-id retries-of-latest-tx}
(def tries (atom {}))
;; The max amount of tries any thread has performed
(def max-tries (atom 0))
(def ninc (fnil inc 0))
(def reporter (agent nil))
(defn report [_ tid]
  (swap! max-tries #(max % (get @tries tid 0)))
  (swap! tries update-in [tid] (constantly 0)))
(defmacro dosync [& body]
  `(clojure.core/dosync
     (swap! tries update-in [(.getId (Thread/currentThread))] ninc)
     (commute commit-id inc) ; commit-id is not defined in this snippet; presumably a ref holding a counter
     (send reporter report (.getId (Thread/currentThread)))
     ~@body))

How to launch two threads and wait for them

I can launch two threads and they work, but synchronously. What am I missing to get these threads independently launched?
main, thread, and output
(defn -main
[& args]
(do
(let [grid-dim-in [0 5]
mr1-pos [\N 2 4]
mr2-pos [\N 1 5]
mr1-movs "LMLMMRMM"
mr2-movs "RMRMMMLM"]
(reset! grid-dim grid-dim-in)
(reset! mr1-id {:mr1 mr1-pos})
(reset! mr2-id {:mr2 mr2-pos})
(.start (Thread. (rover-thread mr1-id mr1-movs update-work-block)))
(.start (Thread. (rover-thread mr2-id mr2-movs update-work-block))))))
(defn rover-thread [id movs update-ref]
  (let [id-key (keys @id)
        id-vals (vals @id)]
    (doseq [mov movs]
      (println "Rover " id-key " is moving ")
      (let [new-mov (determine-rover-move (first id-vals) mov)]
        (move-rover id new-mov update-ref)
        (print "Rover ")
        (print (first id-key))
        (print " is at ")
        (println new-mov)
        (Thread/sleep (rand 1000))))))
Rover :mr1 is at [E 2 4]
Rover (:mr1) is moving
Rover :mr1 is at [N 2 5]
Rover (:mr1) is moving
Rover :mr1 is at [N 2 5]
Finished on Thread[main,5,main]
Rover (:mr2) is moving
Rover :mr2 is at [E 1 5]
Rover (:mr2) is moving
Rover :mr2 is at [N 1 6]
Take a close look at these two lines:
(.start (Thread. (rover-thread mr1-id mr1-movs update-work-block)))
(.start (Thread. (rover-thread mr2-id mr2-movs update-work-block))))))
This code evaluates the (rover-thread mr1-id mr1-movs update-work-block) first, and passes the result of that to the constructor of Thread, which is not what you want.
Here's a simple function to illustrate the principle. This doesn't work, because the (f ...) is evaluated before its result is passed to the Thread constructor:
(defn run-thread-thing-wrong []
(let [f (fn [n s]
(doseq [i (range n)]
(prn s i)
(Thread/sleep (rand 1000))))]
(.start (Thread. (f 10 "A")))
(.start (Thread. (f 10 "B"))))
nil)
Here's a version that does work. A function is passed to the Thread constructor instead:
(defn run-thread-thing []
(let [f (fn [n s]
(doseq [i (range n)]
(prn s i)
(Thread/sleep (rand 1000))))]
(.start (Thread. (fn [] (f 10 "A"))))
(.start (Thread. (fn [] (f 10 "B")))))
nil)
Note: instead of (fn [] ....) you can use the short form #(....) for anonymous functions.
Here's another version that does the same, but with a future instead of manually creating threads:
(defn run-thread-thing []
(let [f (fn [n s]
(doseq [i (range n)]
(prn s i)
(Thread/sleep (rand 1000))))]
(future (f 10 "A"))
(future (f 10 "B")))
nil)
Note that in this case, you pass a form to future instead of a function.
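Neither version above waits for the work to finish. A sketch of how the waiting part could be done, once with futures (deref blocks until the future is done) and once with plain threads (via `.join`). The helper `work` and the function names are made up for this illustration, and the workers record their steps in an atom instead of printing:

```clojure
(defn work
  "Records n steps tagged with label s in the log atom, pausing briefly between steps."
  [log n s]
  (doseq [i (range n)]
    (swap! log conj [s i])
    (Thread/sleep (long (rand-int 10)))))

(defn run-futures-and-wait []
  (let [log (atom [])
        workers [(future (work log 5 "A"))
                 (future (work log 5 "B"))]]
    (run! deref workers) ; blocks until both futures have finished
    @log))

(defn run-threads-and-wait []
  (let [log (atom [])
        threads [(Thread. #(work log 5 "A"))
                 (Thread. #(work log 5 "B"))]]
    (run! #(.start ^Thread %) threads) ; start both before waiting on either
    (run! #(.join ^Thread %) threads)  ; blocks until both threads have finished
    @log))

(println (count (run-futures-and-wait)) (count (run-threads-and-wait)))
```

In both functions the two workers are started before anything blocks, so they run concurrently and the entries for "A" and "B" may interleave in the log.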
This seems like a really good place to use Clojure's agent feature. I am not qualified to fully explain how to use them, but a really good example of their usage can be found here. Starting threads using agents is dead-easy, and I think it is more idiomatic.
The code would look something like,
(def rover1 (agent [mr1-posn mr1-movs mr1-id]))
(def rover2 (agent [mr2-posn mr2-movs mr2-id]))
(defn rover-behave [[posn movs id]]
(send-off *agent* #'rover-behave)
(. Thread (sleep 1000))
(let [new-mov (determine-rover-move posn movs id)
new-posn (posn-after-move posn new-mov)]
;return value updates state of agent
[new-posn movs id]
)
)
(send-off rover1 rover-behave)
(send-off rover2 rover-behave)