(def alice-height
  (ref 3))
(def right-hand-bites
  (ref 10))
(defn eat-from-right-hand []
  (dosync
    (when (pos? @right-hand-bites)
      (alter right-hand-bites dec)
      (alter alice-height #(+ % 24)))))
This code is from the book Living Clojure. In the book, the author also gave an example with alter replaced by commute. I'm wondering with the pos? test at the beginning, can we really do this replacement?
No, replacing alter with commute when decrementing right-hand-bites is not correct.
The intention of the conditional is apparently to prevent right-hand-bites from becoming negative. The decrement is only valid under the assumption that right-hand-bites won’t change until the end of the transaction. While, like alter, commute has its own snapshot view of the ref world, it will re-read and re-apply the commute function to the ref at commit time, and that would be a mistake in this program.
So, with commute it is possible to commit a negative value to right-hand-bites.
Either stick with alter, or use ensure instead of @ (though that makes the whole commute exercise rather pointless).
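A minimal sketch of the "stick with alter" option, assuming only the height update is treated as commutative. The guarded decrement keeps alter, while the unconditional addition uses commute. The thread counts are made up for illustration.

```clojure
(def alice-height (ref 3))
(def right-hand-bites (ref 10))

(defn eat-from-right-hand []
  (dosync
    ;; alter forces a retry if right-hand-bites was changed by another
    ;; transaction, so the pos? check still holds at commit time.
    (when (pos? @right-hand-bites)
      (alter right-hand-bites dec)
      ;; Adding 24 gives the same total in any order, so commute is safe here.
      (commute alice-height #(+ % 24)))))

;; Three threads try to eat 8 times each: 24 attempts for 10 bites.
(let [threads (vec (repeatedly 3 #(Thread. ^Runnable (fn [] (dotimes [_ 8] (eat-from-right-hand))))))]
  (doseq [^Thread t threads] (.start t))
  (doseq [^Thread t threads] (.join t)))
```

With this arrangement right-hand-bites should stop at 0 rather than going negative, and alice-height grows by 24 once per bite actually taken.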
Related
I'm new to Clojure and have been trying to understand its transaction model.
When playing with alter and commute, I noticed that if I alter a ref after commuting it, the transaction does not commit anything (it makes no change to any ref).
For example:
(def counter (ref 0))
(def i (ref 0))
(future (dosync
          (ref-set counter 1)
          (ref-set i 1)
          (commute counter inc)
          (alter counter inc)))
Both @counter and @i will be 0, but if I swap commute and alter, or use two commutes or two alters, it produces the desired result (3 and 1, respectively).
I've read some posts explaining that the behavior of commute and alter is a bit different in that commute is actually executed twice in a transaction (one where it stands, the other at the "commit" stage) and ignores inconsistent snapshots of the ref. I'm just confused by the strange behavior of the combination of these two.
Could anyone help explain how it works? Thanks in advance!
The commute function is useful only in very narrow (i.e. rare) circumstances, where it may reduce lock contention at the cost of additional re-tries of the update function. It also makes the mental model of the transaction much more complicated, as your example shows (I had never seen this particular problem before, for example).
IMHO it is almost always better to use alter instead of commute since alter is simpler and more bulletproof. Indeed, I would usually consider the use of commute to be a case of premature optimization.
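As far as I can tell from current Clojure versions, the reason nothing is committed in the question's example is that setting a ref (with alter or ref-set) after commuting it in the same transaction throws an IllegalStateException, which aborts the whole transaction. Inside a future the exception stays hidden until the future is dereferenced. A minimal sketch that runs the same sequence without the future so the exception is visible:

```clojure
(def counter (ref 0))
(def i (ref 0))

;; Same body as the question, but run on the calling thread so the
;; exception is not swallowed by a future.
(def outcome
  (try
    (dosync
      (ref-set counter 1)
      (ref-set i 1)
      (commute counter inc)
      (alter counter inc))   ; setting after commute is rejected
    (catch IllegalStateException e
      {:error (.getMessage e)})))

;; The transaction aborted, so neither ref-set was committed.
(println outcome @counter @i)
```

Dereferencing the future in the original example should surface the same exception, wrapped in an ExecutionException.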
I'm going through the book Seven Concurrency Models in Seven Weeks. In it, philosophers are represented as a number of refs:
(def philosophers (into [] (repeatedly 5 #(ref :thinking))))
The state of each philosopher is flipped between :thinking and :eating using dosync transactions to ensure consistency.
Now I want to have a thread that outputs current status, so that I can be sure that the state is valid at all times:
(defn status-thread []
  (Thread.
    #(while true
       (dosync
         (println (map (fn [p] @p) philosophers))
         (Thread/sleep 100)))))
We use multiple @ derefs to read the value of each philosopher. It can happen that some refs are changed while we map over philosophers. Could that cause us to print an inconsistent state even though no inconsistent state ever existed?
I'm aware that Clojure uses MVCC to implement STM, but I'm not sure that I apply it correctly.
My transaction contains side effects, which generally should not appear inside a transaction. But in this case the transaction will always succeed, so the side effect should take place only once. Is that acceptable?
Your transaction doesn't really need a side effect, and if you scale the problem up enough I believe the transaction could fail for lack of history and retry the side effect if there's a lot of writing going on. I think the more appropriate way here would be to pull the dosync closer in. The transaction should be a pure, side-effect free fact finding mission. Once that has resulted in a value, you are then free to perform side effects with it without affecting the STM.
(defn status-thread []
  (-> #(while true
         (println (dosync (mapv deref philosophers)))
         (Thread/sleep 100))
      Thread.
      .start)) ;; Threw in starting of the thread for my own testing
A few things I want to mention here:
@ is a reader macro for the deref fn, so (fn [p] @p) is equivalent to just deref.
You should avoid laziness within transactions as some of the lazy values may be evaluated outside the context of the dosync or not at all. For mappings that means you can use e.g. doall, or like here just the eagerly evaluated mapv variant that makes a vector rather than a sequence.
This contingency was included in the STM design.
This problem is explicitly solved by combining agents with refs. The STM guarantees that all messages sent to agents in a transaction are sent exactly once, and only when the transaction commits. If the transaction is retried, the pending sends are dropped and not delivered. When the transaction does eventually get through, they are sent at the moment it commits.
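(def watcher (agent nil))
(defn status-thread []
  (future
    (while true
      (dosync
        ;; Read the refs inside the transaction so the snapshot is consistent;
        ;; the send is held until the transaction commits.
        (let [snapshot (mapv deref philosophers)]
          (send watcher (fn [_] (println snapshot)))))
      (Thread/sleep 100))))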
The STM guarantees that your transaction will not be committed if the refs you deref during the transaction were changed in an incompatible way while it was running. You don't need to worry explicitly about derefing multiple refs in a transaction (that's what the STM was made for).
What is the simplest way to trigger a side-effecting function to be called only when an atom's value changes?
If I were using a ref, I think I could just do this:
(defn transform-item [x] ...)
(defn do-side-effect-on-change [] nil)
(def my-ref (ref ...))
(when (dosync (let [old-value @my-ref
                    _ (alter my-ref transform-item)
                    new-value @my-ref]
                (not= old-value new-value)))
  (do-side-effect-on-change))
But this seems a bit roundabout, since I'm using a ref even though I am not trying to coordinate changes across multiple refs. Essentially I am using it just to conveniently access the old and new value within a successful transaction.
I feel like I should be able to use an atom instead. Is there a solution simpler than this?
(def my-atom (atom ...))
(let [watch-key ::side-effect-watch
      watch-fn (fn [_ _ old-value new-value]
                 (when (not= old-value new-value)
                   (do-side-effect-on-change)))]
  (add-watch my-atom watch-key watch-fn)
  (swap! my-atom transform-item)
  (remove-watch my-atom watch-key))
This also seems roundabout, because I am adding and removing the watch around every call to swap!. But I need this, because I don't want a watch hanging around that causes the side-effecting function to be triggered when other code modifies the atom.
It is important that the side-effecting function be called exactly once per mutation to the atom, and only when the transform function transform-item actually returns a new value. Sometimes it will return the old value, yielding no change.
(when (not= @a (swap! a transform))
  (do-side-effect))
But you should be very clear about what concurrency semantics you need. For example another thread may modify the atom between reading it and swapping it:
a = 1
Thread 1 reads a as 1
Thread 2 modifies a to 2
Thread 1 swaps a from 2 to 2
Thread 1 determines 1 != 2 and calls do-side-effect
It is not clear to me from the question whether this is desirable or not desirable. If you do not want this behavior, then an atom just will not do the job unless you introduce concurrency control with a lock.
Seeing as you started with a ref and asked about an atom, I think you have probably given some thought to concurrency already. It seems like from your description the ref approach is better:
(when (dosync (not= @r (alter r transform)))
  (do-side-effect))
Is there a reason you don't like your ref solution?
If the answer is "because I don't have concurrency", then I would encourage you to use a ref anyway. There isn't really a downside to it, and it makes your semantics explicit. IMO programs tend to grow to a point where concurrency exists, and Clojure is really great at being explicit about what should happen when it does. (For example: oh, I'm just calculating stuff; oh, I'm just exposing this stuff as a web service now; oh, now I'm concurrent.)
In any case, bear in mind that functions like alter and swap! return the value, so you can make use of this for concise expressions.
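If Clojure 1.9 or later is available, swap-vals! returns both the old and the new value from a single atomic swap, which avoids the separate read that opens the race in the trace above. A minimal sketch; the transform-item body and the effects counter are made up for illustration:

```clojure
(def my-atom (atom 1))

;; Hypothetical transform: increments up to a cap of 3, then returns
;; its input unchanged.
(defn transform-item [x] (min 3 (inc x)))

;; Stand-in for the real side effect: counts how often it ran.
(def effects (atom 0))
(defn do-side-effect-on-change [] (swap! effects inc))

(defn update-item! []
  ;; old-value and new-value come from the same successful swap.
  (let [[old-value new-value] (swap-vals! my-atom transform-item)]
    (when (not= old-value new-value)
      (do-side-effect-on-change))))

(dotimes [_ 5] (update-item!))
```

Here the atom goes 1, 2, 3 and then stays at 3, so the side effect runs twice across the five calls.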
I ran into the same situation and came up with two solutions.
state field :changed?
Keep an otherwise meaningless :changed? flag in the atom to track what the swap function did, and check the return value of swap! to see whether anything changed. For example:
(def data (atom {:value 0 :changed? false}))
(let [{changed? :changed?} (swap! data (fn [data] (if (change?)
                                                    {:value 1 :changed? true}
                                                    {:value 0 :changed? false})))]
  (when changed? (do-your-task)))
exception based
You can throw an Exception in swap function, and catch it outside:
(try
  (swap! data (fn [d] (if (changed?) d2 (throw (ex-info "unchanged" {})))))
  (do-your-task)
  (catch Exception _
    ))
I'm wondering what is considered to be a side-effect in predicates for fns like remove or filter. There seems to be a range of possibilities. Clearly, if the predicate writes to a file, this is a side-effect. But consider a situation like this:
(def *big-var-that-might-be-garbage-collected* ...)
(let [my-ref *big-var-that-might-be-garbage-collected*]
  (defn my-pred
    [x]
    (some-operation-on my-ref x)))
Even if some-operation-on is merely a query that does not change state, the fact that my-pred retains a reference to *big... changes the state of the system in that the big var cannot be garbage collected. Is this also considered to be side-effect?
In my case, I'd like to write to a logging system in a predicate. Is this a side effect?
And why are side-effects in predicates discouraged exactly? Is it because filter and remove and their friends work lazily so that you cannot determine when the predicates are called (and - hence - when the side-effects happen)?
GC is not typically considered when evaluating if a function is pure or not, although many actions that make a function impure can have a GC effect.
Logging is a side effect, as is changing any state in the program or the world. A pure function takes data and returns data, without modifying anything else.
https://softwareengineering.stackexchange.com/questions/15269/why-are-side-effects-considered-evil-in-functional-programming covers why side effects are avoided in functional languages.
I found this link helpful
The problem is determining when, or even whether, the side-effects will occur on any given call to the function.
If you only care that the same inputs return the same answer, you are fine. Side-effects are dependent on how the function is executed.
For example,
(first (filter odd? (range 20)))
; 1
But if we arrange for odd? to print its argument as it goes:
(first (filter #(do (print %) (odd? %)) (range 20)))
It will print 012345678910111213141516171819 before returning 1!
The reason is that filter, where it can, deals with its sequence argument in chunks of 32 elements.
If we take the limit off the range:
(first (filter #(do (print %) (odd? %)) (range)))
... we get a full-size chunk printed: 012345678910111213141516171819202122232425262728293031
Just printing the argument is confusing. If the side effects are significant, things could go seriously awry.
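To make the effect measurable rather than printed, here is a minimal sketch that counts predicate calls with an atom. The names calls and counting-odd? are made up for illustration, and a vector is used as input because its seq is chunked. For comparison, some walks the collection one element at a time and stops at the first match.

```clojure
(def calls (atom 0))

;; Predicate with a side effect: records every invocation.
(defn counting-odd? [n]
  (swap! calls inc)
  (odd? n))

(def input (vec (range 100)))

;; Lazy filter over a chunked seq: the whole first chunk is tested.
(def lazy-result (first (filter counting-odd? input)))
(def lazy-calls @calls)

(reset! calls 0)

;; some tests elements one by one and stops at the first odd number.
(def eager-result (some #(when (counting-odd? %) %) input))
(def eager-calls @calls)

(println lazy-result lazy-calls eager-result eager-calls)
```

Both approaches return 1, but the filter version calls the predicate 32 times while the some version calls it twice.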
Is it possible to create a ref with a transducer in Clojure, in a way analogous to creating a chan with a transducer?
i.e., when you create a chan with a transducer, it filters/maps all the inputs into the outputs.
I'd expect there's also a way to create a ref such that whatever you set, it can either ignore or modify the input. Is this possible to do?
Adding a transducer to a channel modifies the contents as they pass through, which is roughly analogous to adding a watch to a ref that applies its own change each time the value changes. That change itself then triggers the watch again, so be careful not to blow the stack if the watches are recursive.
user> (def r (ref 0))
#'user/r
user> (add-watch r :label
        (fn [label the-ref old-state new-state]
          (println "adding that little something extra")
          (if (< old-state 10) (dosync (commute the-ref inc)))))
#<Ref@1af618c2: 0>
user> (dosync (alter r inc))
adding that little something extra
adding that little something extra
adding that little something extra
adding that little something extra
adding that little something extra
adding that little something extra
adding that little something extra
adding that little something extra
adding that little something extra
adding that little something extra
adding that little something extra
1
user> @r
11
You could even apply a transducer to the state of the atom if you wanted.
This is an interesting idea, but the wrong way to go about it for at least a couple reasons. You'd lose some relationships you'd expect to hold:
(alter r identity) =/= r
(alter r f)(alter r f) =/= (alter r (comp f f))
(alter r f) =/= (ref-set r (f @r))
Also some transducers are side-effecting volatiles, and have no business in a dosync block. i.e. if you use (take n) as your transducer then if your dosync fails, then it'll retry as though invoked with (take (dec n)), which violates dosync body requirements.
The problem is a ref lets you read and write as separate steps. If instead there was something foundational that let you "apply" an input to a hidden "state" and collect the output all in one step, consistently with the STM, then that'd be something to work with.