Race condition with Clojure atom

Race condition with Clojure atom - clojure

I apparently do not understand clojure's atom correctly. I thought that it's atomicity guarantee could be demonstrated as follows:
(def users (atom #{}))
(defn add-user! [name]
(swap! users
(fn [users]
(conj users name))))
(do
(map deref
[(future (add-user! "bob"))
(future (add-user! "clair"))
(future (add-user! "ralph"))
(future (add-user! "mark"))
(future (add-user! "bill"))
(future (add-user! "george"))]))
(println #users)
(println
(if (= 5 (count #users))
"SUCCESS!"
"FAIL"))
Unfortunately this is not the case. The code seems to exhibit a race condition on the set contained in the users atom.
Which data structure do I need to use to make sure that all users are successfully added to the users set?
SOLUTION
As pointed out in the comments, there were several bugs in the code. The main bug was not using dorun to force the evaluation of all of the futures. After making this change, the code runs as expected:
(def users (atom #{}))
(defn add-user! [name]
(swap! users
(fn [users]
(conj users name))))
(dorun
(map deref
[(future (add-user! "bob"))
(future (add-user! "clair"))
(future (add-user! "ralph"))
(future (add-user! "mark"))
(future (add-user! "bill"))
(future (add-user! "george"))]))
(println #users)
(println
(if (= 6 (count #users))
"SUCCESS!"
"FAIL"))

See Clojure Atom documentation.
Also from Joy of Clojure:
Atoms are like Refs in that they're synchronous but are like Agents in that they're independent (uncoordinated).

Related

Improving complex data structure replacement

I'm attempting to modify a specific field in a data structure, described below (a filled example can be found here:
[{:fields "There are a few other fields here"
:incidents [{:fields "There are a few other fields here"
:updates [{:fields "There are a few other fields here"
:content "THIS is the field I want to replace"
:translations [{:based_on "Based on the VALUE of this"
:content "Replace with this value"}]}]}]}]
I already have this implemented it in a number of functions, as below:
(defn- translation-content
[arr]
(:content (nth arr (.indexOf (map :locale arr) (env/get-locale)))))
(defn- translate
[k coll fn & [k2]]
(let [k2 (if (nil? k2) k k2)
c ((keyword k2) coll)]
(assoc-in coll [(keyword k)] (fn c))))
(defn- format-update-translation
[update]
(dissoc update :translations))
(defn translate-update
[update]
(format-update-translation (translate :content update translation-content :translations)))
(defn translate-updates
[updates]
(vec (map translate-update updates)))
(defn translate-incident
[incident]
(translate :updates incident translate-updates))
(defn translate-incidents
[incidents]
(vec (map translate-incident incidents)))
(defn translate-service
[service]
(assoc-in service [:incidents] (translate-incidents (:incidents service))))
(defn translate-services
[services]
(vec (map translate-service services)))
Each array could have any number of entries (though the number is likely less than 10).
The basic premise is to replace the :content in each :update with the relevant :translation based on a provided value.
My Clojure knowledge is limited, so I'm curious if there is a more optimal way to achieve this?
EDIT
Solution so far:
(defn- translation-content
[arr]
(:content (nth arr (.indexOf (map :locale arr) (env/get-locale)))))
(defn- translate
[k coll fn & [k2]]
(let [k2 (if (nil? k2) k k2)
c ((keyword k2) coll)]
(assoc-in coll [(keyword k)] (fn c))))
(defn- format-update-translation
[update]
(dissoc update :translations))
(defn translate-update
[update]
(format-update-translation (translate :content update translation-content :translations)))
(defn translate-updates
[updates]
(mapv translate-update updates))
(defn translate-incident
[incident]
(translate :updates incident translate-updates))
(defn translate-incidents
[incidents]
(mapv translate-incident incidents))
(defn translate-service
[service]
(assoc-in service [:incidents] (translate-incidents (:incidents service))))
(defn translate-services
[services]
(mapv translate-service services))

I would start more or less as you do, bottom-up, by defining some functions that look like they will be useful: how to choose a translation from among a list of translations, and how to apply that choice to an update. But I wouldn't make the functions so tiny as yours: the logic is all spread out into a lot of places, and it's not easy to get an overall idea of what is going on. Here are the two functions I'd start with:
(letfn [(choose-translation [translations]
(let [applicable (filter #(= (:locale %) (get-locale))
translations)]
(when (= 1 (count applicable))
(:content (first applicable)))))
(translate-update [update]
(-> update
(assoc :content (or (choose-translation (:translations update))
(:content update)))
(dissoc :translations)))]
...)
Of course you can defn them instead if you'd like, and I suspect many people would, but they're only going to be used in one place, and they're intimately involved with the context in which they're used, so I like a letfn. These two functions are really all the interesting logic; the rest is just some boring tree-traversal code to apply this logic in the right places.
Now to build out the body of the letfn is straightforward, and easy to read if you make your code be the same shape as the data it manipulates. We want to walk through a series of nested lists, updating objects on the way, and so we just write a series of nested for comprehensions, calling update to descend into the right keyspace:
(for [user users]
(update user :incidents
(fn [incidents]
(for [incident incidents]
(update incident :updates
(fn [updates]
(for [update updates]
(translate-update update))))))))
I think using for here is miles better than using map, although of course they are equivalent as always. The important difference is that as you read through the code you see the new context first ("okay, now we're doing something to each user"), and then what is happening inside that context; with map you see them in the other order and it is difficult to keep tack of what is happening where.
Combining these, and putting them into a defn, we get a function that you can call with your example input and which produces your desired output (assuming a suitable definition of get-locale):
(defn translate [users]
(letfn [(choose-translation [translations]
(let [applicable (filter #(= (:locale %) (get-locale))
translations)]
(when (= 1 (count applicable))
(:content (first applicable)))))
(translate-update [update]
(-> update
(assoc :content (or (choose-translation (:translations update))
(:content update)))
(dissoc :translations)))]
(for [user users]
(update user :incidents
(fn [incidents]
(for [incident incidents]
(update incident :updates
(fn [updates]
(for [update updates]
(translate-update update))))))))))

we can try to find some patterns in this task (based on the contents of the snippet from github gist, you've posted):
simply speaking, you need to
1) update every item (A) in vector of data
2) updating every item (B) in vector of A's :incidents
3) updating every item (C) in vector of B's :updates
4) translating C
The translate function could look like this:
(defn translate [{translations :translations :as item} locale]
(assoc item :content
(or (some #(when (= (:locale %) locale) (:content %)) translations)
:no-translation-found)))
it's usage (some fields are omitted for brevity):
user> (translate {:id 1
:content "abc"
:severity "101"
:translations [{:locale "fr_FR"
:content "abc"}
{:locale "ru_RU"
:content "абв"}]}
"ru_RU")
;;=> {:id 1,
;; :content "абв",
;; :severity "101",
;; :translations [{:locale "fr_FR", :content "abc"} {:locale "ru_RU", :content "абв"}]}
then we can see that 1 and 2 are totally similar, so we can generalize that:
(defn update-vec-of-maps [data k f]
(mapv (fn [item] (update item k f)) data))
using it as a building block you can make up the whole data transformation:
(defn transform [data locale]
(update-vec-of-maps
data :incidents
(fn [incidents]
(update-vec-of-maps
incidents :updates
(fn [updates] (mapv #(translate % locale) updates))))))
(transform data "it_IT")
returns what you need.
then you can generalize it further, making the utility function for arbitrary depth transformations:
(defn deep-update-vec-of-maps [data ks terminal-fn]
(if (seq ks)
((reduce (fn [f k] #(update-vec-of-maps % k f))
terminal-fn (reverse ks))
data)
data))
and use it like this:
(deep-update-vec-of-maps data [:incidents :updates]
(fn [updates]
(mapv #(translate % "it_IT") updates)))

I recommend you look at https://github.com/nathanmarz/specter
It makes it really easy to read and update clojure data structures. Same performance as hand-written code, but much shorter.

Agent/actor like constructs in clojure that operate on all messages received since last update

What's best way in clojure to implement something like an actor or agent (asynchronously updated, uncoordinated reference) that does the following?
gets sent messages/data
executes some function on that data to obtain new state; something like (fn [state new-msgs] ...)
continues to receive messages/data during that update
once done with that update, runs the same update function against all messages that have been sent in the interim
An agent doesn't seem quite right here. One must simultaneously send function and data to agents, which doesn't leave room for a function which operates on all data that has come in during the last update. The goal implicitly requires a decoupling of function and data.
The actor model seems generally better suited in that there is a decoupling of function and data. However, all actor frameworks I'm aware of seem to assume each message sent will be processed separately. It's not clear how one would turn this on it's head without adding extra machinery. I know Pulsar's actors accept a :lifecycle-handle function which can be used to make actors do "special tricks" but there isn't a lot of documentation around this so it's unclear whether the functionality would be helpful.
I do have a solution to this problem using agents, core.async channels, and watch functions, but it's a bit messy, and I'm hoping there is a better solution. I'll post it as a solution in case others find it helpful, but I'd like to see what other's come up with.

Here's the solution I came up with using agents, core.async channels, and watch functions. Again, it's a bit messy, but it does what I need it to for now. Here it is, in broad strokes:
(require '[clojure.core.async :as async :refer [>!! <!! >! <! chan go]])
; We'll call this thing a queued-agent
(defprotocol IQueuedAgent
(enqueue [this message])
(ping [this]))
(defrecord QueuedAgent [agent queue]
IQueuedAgent
(enqueue [_ message]
(go (>! queue message)))
(ping [_]
(send agent identity)))
; Need a function for draining a core async channel of all messages
(defn drain! [c]
(let [cc (chan 1)]
(go (>! cc ::queue-empty))
(letfn
; This fn does all the hard work, but closes over cc to avoid reconstruction
[(drainer! [c]
(let [[v _] (<!! (go (async/alts! [c cc] :priority true)))]
(if (= v ::queue-empty)
(lazy-seq [])
(lazy-seq (cons v (drainer! c))))))]
(drainer! c))))
; Constructor function
(defn queued-agent [& {:keys [buffer update-fn init-fn error-handler-builder] :or {:buffer 100}}]
(let [q (chan buffer)
a (agent (if init-fn (init-fn) {}))
error-handler-fn (error-handler-builder q a)]
; Set up the queue, and watcher which runs the update function when there is new data
(add-watch
a
:update-conv
(fn [k r o n]
(let [queued (drain! q)]
(when-not (empty? queued)
(send a update-fn queued error-handler-fn)))))
(QueuedAgent. a q)))
; Now we can use these like this
(def a (queued-agent
:init-fn (fn [] {:some "initial value"})
:update-fn (fn [a queued-data error-handler-fn]
(println "Receiving data" queued-data)
; Simulate some work/load on data
(Thread/sleep 2000)
(println "Done with work; ready to queue more up!"))
; This is a little warty at the moment, but closing over the queue and agent lets you requeue work on
; failure so you can try again.
:error-handler-builder
(fn [q a] (println "do something with errors"))))
(defn -main []
(doseq [i (range 10)]
(enqueue a (str "data" i))
(Thread/sleep 500) ; simulate things happening
; This part stinks... have to manually let the queued agent know that we've queued some things up for it
(ping a)))
As you'll notice, having to ping the queued-agent here every time new data is added is pretty warty. It definitely feels like things are being twisted out of typical usage.

Agents are the inverse of what you want here - they are a value that gets sent updating functions. This easiest with a queue and a Thread. For convenience I am using future to construct the thread.
user> (def q (java.util.concurrent.LinkedBlockingDeque.))
#'user/q
user> (defn accumulate
[summary input]
(let [{vowels true consonents false}
(group-by #(contains? (set "aeiouAEIOU") %) input)]
(-> summary
(update-in [:vowels] + (count vowels))
(update-in [:consonents] + (count consonents)))))
#'user/accumulate
user> (def worker
(future (loop [summary {:vowels 0 :consonents 0} in-string (.take q)]
(if (not in-string)
summary
(recur (accumulate summary in-string)
(.take q))))))
#'user/worker
user> (.add q "hello")
true
user> (.add q "goodbye")
true
user> (.add q false)
true
user> #worker
{:vowels 5, :consonents 7}

I came up with something closer to an actor, inspired by Tim Baldridge's cast on actors (Episode 16). I think this addresses the problem much more cleanly.
(defmacro take-all! [c]
`(loop [acc# []]
(let [[v# ~c] (alts! [~c] :default nil)]
(if (not= ~c :default)
(recur (conj acc# v#))
acc#))))
(defn eager-actor [f]
(let [msgbox (chan 1024)]
(go (loop [f f]
(let [first-msg (<! msgbox) ; do this so we park efficiently, and only
; run when there are actually messages
msgs (take-all! msgbox)
msgs (concat [first-msg] msgs)]
(recur (f msgs)))))
msgbox))
(let [a (eager-actor (fn f [ms]
(Thread/sleep 1000) ; simulate work
(println "doing something with" ms)
f))]
(doseq [i (range 20)]
(Thread/sleep 300)
(put! a i)))
;; =>
;; doing something with (0)
;; doing something with (1 2 3)
;; doing something with (4 5 6)
;; doing something with (7 8 9 10)
;; doing something with (11 12 13)

How to memoize a function that uses core.async and non-blocking channel read?

I'd like to use memoize for a function that uses core.async and <! e.g
(defn foo [x]
(go
(<! (timeout 2000))
(* 2 x)))
(In the real-life, it could be useful in order to cache the results of server calls)
I was able to achieve that by writing a core.async version of memoize (almost the same code as memoize):
(defn memoize-async [f]
(let [mem (atom {})]
(fn [& args]
(go
(if-let [e (find #mem args)]
(val e)
(let [ret (<! (apply f args))]; this line differs from memoize [ret (apply f args)]
(swap! mem assoc args ret)
ret))))))
Example of usage:
(def foo-memo (memoize-async foo))
(go (println (<! (foo-memo 3)))); delay because of (<! (timeout 2000))
(go (println (<! (foo-memo 3)))); subsequent calls are memoized => no delay
I am wondering if there are simpler ways to achieve the same result.
**Remark: I need a solution that works with <!. For <!!, see this question: How to memoize a function that uses core.async and blocking channel read? **

You can use the built in memoize function for this. Start by defining a method that reads from a channel and returns the value:
(defn wait-for [ch]
(<!! ch))
Note that we'll use <!! and not <! because we want this function block until there is data on the channel in all cases. <! only exhibits this behavior when used in a form inside of a go block.
You can then construct your memoized function by composing this function with foo, like such:
(def foo-memo (memoize (comp wait-for foo)))
foo returns a channel, so wait-for will block until that channel has a value (i.e. until the operation inside foo finished).
foo-memo can be used similar to your example above, except you do not need the call to <! because wait-for will block for you:
(go (println (foo-memo 3))
You can also call this outside of a go block, and it will behave like you expect (i.e. block the calling thread until foo returns).

This was a little trickier than I expected. Your solution isn't correct, because when you call your memoized function again with the same arguments, sooner than the first run finishes running its go block, you will trigger it again and get a miss. This is often the case when you process lists with core.async.
The one below uses core.async's pub/sub to solve this (tested in CLJS only):
(def lookup-sentinel #?(:clj ::not-found :cljs (js-obj))
(def pending-sentinel #?(:clj ::pending :cljs (js-obj))
(defn memoize-async
[f]
(let [>in (chan)
pending (pub >in :args)
mem (atom {})]
(letfn
[(memoized [& args]
(go
(let [v (get #mem args lookup-sentinel)]
(condp identical? v
lookup-sentinel
(do
(swap! mem assoc args pending-sentinel)
(go
(let [ret (<! (apply f args))]
(swap! mem assoc args ret)
(put! >in {:args args :ret ret})))
(<! (apply memoized args)))
pending-sentinel
(let [<out (chan 1)]
(sub pending args <out)
(:ret (<! <out)))
v))))]
memoized)))
NOTE: it probably leaks memory, subscriptions and <out channels are not closed

I have used this function in one of my projects to cache HTTP calls. The function caches results for a given amount of time and uses a barrier to prevent executing the function multiple times when the cache is "cold" (due to the context switch inside the go block).
(defn memoize-af-until
[af ms clock]
(let [barrier (async/chan 1)
last-return (volatile! nil)
last-return-ms (volatile! nil)]
(fn [& args]
(async/go
(>! barrier :token)
(let [now-ms (.now clock)]
(when (or (not #last-return-ms) (< #last-return-ms (- now-ms ms)))
(vreset! last-return (<! (apply af args)))
(vreset! last-return-ms now-ms))
(<! barrier)
#last-return)))))
You can test that it works properly by setting the cache time to 0 and observe that the two function calls take approximately 10 seconds. Without the barrier the two calls would finish at the same time:
(def memo (memoize-af-until #(async/timeout 5000) 0 js/Date))
(async/take! (memo) #(println "[:a] Finished"))
(async/take! (memo) #(println "[:b] Finished"))

how to separate concerns in the below fn

The function below does 2 things -
Checks if the atom is nil or fetch-agin is true, and then fetches the data.
It processes the data by calling (add-date-strings).
What is a better pattern to separate out the above two concerns ?
(def retrieved-data (atom nil))
(defn fetch-it!
[fetch-again?]
(if (or fetch-again?
(nil? #retrieved-data))
(->> (exec-services)
(map #(add-date-strings (:time %)))
(reset! retrieved-data))
#retrieved-data))

One possible refactoring would be:
(def retrieved-data (atom nil))
(defn fetch []
(->> (exec-services)
(map #(add-date-strings (:time %)))))
(defn fetch-it!
([]
(fetch-it! false))
([force]
(if (or force (nil? #retrieved-data))
(reset! retrieved-data (fetch))
#retrieved-data)))
By the way, the pattern to seperate out concerns is called "functions" :)

To really separate the concerns I think it might be better to define a separate fetch and process function. So that in no way they are complected.
(def retrieved-data (atom nil))
(defn fetcher []
(->> (exec-services)
(map #(add-date-strings (:time %)))))
(defn fetch-again? [force]
(fn [data] (or force (nil? data))))
(defn fetch-it! [fetch-fn data fetch-again?]
(when (fetch-again? #data))
(reset! data (fetch-fn))))
;;Usage
(fetch-it! fetcher retrieved-data (fetch-again? true))
Notice that I also gave the data atom as an argument.

How to do hooks in Clojure

I have a situation where I am creating and destroying objects in one clojure namespace, and want another namespace to co-ordinate. However I do not want the first namespace to have to call the second explicitly on object destruction.
In Java, I could use a listener. Unfortunately the underlying java libraries do not signal events on object destruction. If I were in Emacs-Lisp, then I'd use hooks which do the trick.
Now, in clojure I am not so sure. I have found the Robert Hooke library https://github.com/technomancy/robert-hooke. But this is more like defadvice in elisp terms -- I am composing functions. More over the documentation says:
"Hooks are meant to extend functions you don't control; if you own the target function there are obviously better ways to change its behaviour."
Sadly, I am not finding it so obvious.
Another possibility would be to use add-watch, but this is marked as alpha.
Am I missing another obvious solution?
Example Added:
So First namespace....
(ns scratch-clj.first
(:require [scratch-clj.another]))
(def listf (ref ()))
(defn add-object []
(dosync
(ref-set listf (conj
#listf (Object.))))
(println listf))
(defn remove-object []
(scratch-clj.another/do-something-useful (first #listf))
(dosync
(ref-set listf (rest #listf)))
(println listf))
(add-object)
(remove-object)
Second namespace
(ns scratch-clj.another)
(defn do-something-useful [object]
(println "object removed is:" object))
The problem here is that scratch-clj.first has to require another and explicitly push removal events across. This is a bit clunky, but also doesn't work if I had "yet-another" namespace, which also wanted to listen.
Hence I thought of hooking the first function.

Is this solution suitable to your requirements?
scratch-clj.first:
(ns scratch-clj.first)
(def listf (atom []))
(def destroy-listeners (atom []))
(def add-listeners (atom []))
(defn add-destroy-listener [f]
(swap! destroy-listeners conj f))
(defn add-add-listener [f]
(swap! add-listeners conj f))
(defn add-object []
(let [o (Object.)]
(doseq [f #add-listeners] (f o))
(swap! listf conj o)
(println #listf)))
(defn remove-object []
(doseq [f #destroy-listeners] (f (first #listf)))
(swap! listf rest)
(println #listf))
Some listeners:
(ns scratch-clj.another
(:require [scratch-clj.first :as fst]))
(defn do-something-useful-on-remove [object]
(println "object removed is:" object))
(defn do-something-useful-on-add [object]
(println "object added is:" object))
Init binds:
(ns scratch-clj.testit
(require [scratch-clj.another :as another]
[scratch-clj.first :as fst]))
(defn add-listeners []
(fst/add-destroy-listener another/do-something-useful-on-remove)
(fst/add-add-listener another/do-something-useful-on-add))
(defn test-it []
(add-listeners)
(fst/add-object)
(fst/remove-object))
test:
(test-it)
=> object added is: #<Object java.lang.Object#c7aaef>
[#<Object java.lang.Object#c7aaef>]
object removed is: #<Object java.lang.Object#c7aaef>
()

It sounds a lot like what you're describing is callbacks.
Something like:
(defn make-object
[destructor-fn]
{:destructor destructor-fn :other-data "data"})
(defn destroy-object
[obj]
((:destructor obj) obj))
; somewhere at the calling code...
user> (defn my-callback [o] (pr [:destroying o]))
#'user/my-callback
user> (destroy-object (make-object my-callback))
[:destroying {:destructor #<user$my_callback user$my_callback#73b8cdd5>, :other-data "data"}]
nil
user>

So, here is my final solution following mobytes suggestion. A bit more work, but
I suspect that I will want this in future.
Thanks for all the help
;; hook system
(defn make-hook []
(atom []))
(defn add-hook [hook func]
(do
(when-not
(some #{func} #hook)
(swap! hook conj func))
#hook))
(defn remove-hook [hook func]
(swap! hook
(partial
remove #{func})))
(defn clear-hook [hook]
(reset! hook []))
(defn run-hook
([hook]
(doseq [func #hook] (func)))
([hook & rest]
(doseq [func #hook] (apply func rest))))
(defn phils-hook []
(println "Phils hook"))
(defn phils-hook2 []
(println "Phils hook2"))
(def test-hook (make-hook))
(add-hook test-hook phils-hook)
(add-hook test-hook phils-hook2)
(run-hook test-hook)
(remove-hook test-hook phils-hook)
(run-hook test-hook)

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Race condition with Clojure atom - clojure

See Clojure Atom documentation. Also from Joy of Clojure: Atoms are like Refs in that they're synchronous but are like Agents in that they're independent (uncoordinated).

Related

Improving complex data structure replacement

Agent/actor like constructs in clojure that operate on all messages received since last update

How to memoize a function that uses core.async and non-blocking channel read?

how to separate concerns in the below fn

How to do hooks in Clojure

Categories

Resources