Suppose foo and bar are atoms.
; consistent.
(reset! foo x)
; inconsistent x and y combination.
(reset! bar y)
; consistent.
Is it possible to reset them at once so that no other thread can see this inconsistency? Or is it necessary to bundle x and y into an atom instead of having x and y themselves being atoms?
For coordinated updates to multiple references, you want to use refs, not atoms. Refs can only be modified inside a transaction, and Clojure's STM (software transactional memory) ensures that all operations on refs succeed before committing the transaction, so all threads outside the transaction see a consistent view of the refs.
(def foo (ref :x))
(def bar (ref :y))
(dosync
(ref-set foo :x1)
(ref-set bar :y1))
In this example, the transaction (delineated by dosync) will be retried if either ref was modified by a transaction in another thread, ensuring that other threads see a consistent view of foo and bar.
There is overhead in using STM, so the choice of using coordinated refs vs. a single atom that encapsulates all your mutable state will depend on your exact usage.
From the Clojure website's page on atoms (emphasis added):
Atoms provide a way to manage shared, synchronous, independent state.
This means that each atom is independent of each other, and therefore a set of atoms cannot be updated atomically.
You can combine both items into a single atom. But you also may want to consider refs, which provide for transactional updates of multiple items:
transactional references (Refs) ensure safe shared use of mutable storage locations via a software transactional memory (STM)
Related
The docs says about pmap:
Like map, except f is applied in parallel. Semi-lazy in that the
parallel computation stays ahead of the consumption, but doesn't
realize the entire result unless required.
Can you kindly dis-obfuscate these two statements in some simple context?
Also is there for the pmap function, a doseq equivalent, having a memory footprint constant relative to the size of the iterated collection?
Semi-lazy in that the parallel computation stays ahead of the consumption
This means that pmap will do slightly more work than is strictly required by the sequence's consumer. This "working ahead" minimizes the wait for more items to be computed when the sequence is consumed. For example, if you're computing some infinite sequence in parallel and you only consume the first 50 results, pmap may have gone ahead and computed 50+N.
but doesn't realize the entire result unless required.
This means it's only going to work ahead up to a certain threshold. The entire sequence won't be produced unless it's completely consumed (or almost completely consumed).
Also is there for the pmap function, a doseq equivalent
You can use doall or dorun with pmap to produce side effects in parallel.
Here's an example of all three together, using an infinite sequence as input to pmap:
(def calls (atom 0))
(dorun (take 50 (pmap (fn [_] (swap! calls inc)) (range))))
;; #calls => 60
When this completes the value of calls will be over 50, even though we only consumed 50 items from the sequence.
Also read up on reducers and core.async for another way to do the same thing.
While Taylor's answer is correct, I also gave a presentation on what happens inside of pmap, and how it's lazy, at Clojure West a few years ago. I know not everyone likes videos for learning, but if you do, it might be helpful: https://youtu.be/BzKjIk0vgzE?t=11m48s
(If you want non-lazy pmap, I second the endorsement for Claypoole.)
Let's say I have the following code :
(defn multiple-writes []
(doseq [[x y] (map list [1 2] [3 4])] ;; let's imagine those are paths to files
(when-not (exists? x y) ;; could be left off, I feel it is faster to check before overwriting
(write-to-disk! (do-something x y)))))
That I call like this (parameters omitted) :
(go (multiple-writes))
I use go to execute some code "in the background", but I do not know if I am using the right tool here. Some more information about those functions :
this is not high-priority code at all. It could even fail - multiple-writes could be seen as a cache-filling function.
I consequently do not care about the return value.
do-something takes a between 100 and 500 milliseconds depending of the input
do-something consumes some memory (uses image buffers, some images can be 2000px * 2000px)
there are 10 to 40 elements/images to be processed every time multiple-writes is called.
every call to write-to-disk will create a new file (or overwrite it if any, though that should not happen)
write-to-disk writes always in the same directory
So I would like to speed up things by executing (write-to-disk! (do-something x y)) in parallel to go as fast as possible. But I don't want to overload the system at all, since this is not a high-priority task.
How should I go about this ?
Note : despite the title, this is not a duplicate of this question since I don't want to restrict to 3 threads (not saying that the answer can't be the same, but I feel this question differs).
Take a look at the claypoole library, which gives some good and simple abstractions filling the void between pmap and fork/join reducers, which otherwise would need to be coded by hand with futures and promises.
With pmap all results of a parallel batch need to have returned before the next batch is executed, because return order is preserved. This can be a problem with widely varying processing times (be they calculation, http requests, or work items of different "size"). This is what usually slows down pmap to single threaded map + unneeded overhead performance.
With claypoole's unordered pmap and unordered for (upmap and upfor), slower function calls in one thread (core) can be overtaken by faster ones on another thread because ordering doesn't need to be preserved, as long as not all cores are clogged by slow calls.
This might not help much in case of IO to one disk being the only bottleneck, but since claypoole has configurable thread pool sizes and functions to detect the number of available cores, it will help with restricting the amount of cores.
And where fork/join reducers would optimize CPU usage by work stealing, it might greatly increase memory use, since there is no option to restrict the amount of parallel processes without altering the reducer library.
Consider basing your design on streams or fork/join.
I would a single component that does IO. Every processing node can then send their results to be saved there. This is easy to model with streams. With fork/join, it can be achieved by not returning the result up in the hierarchy but sending it to eg. an agent.
If memory consumption is an issue, perhaps you can divide work even more. Like 100x100 patches.
I have 3 long running tasks that I need to synchronize on. They are independent, but the calling thread must wait until all three are finished before continuing.
I can create an agent for each task, and await on them, but agents aren't really the right semantic construct, since each agent will only be be called once.
What I really want is to await on 3 futures, or some approach that more closely resembles what I'm trying to achieve.
Can I await on futures instead of agents?
Edit:
I guess the answer is just simply to deref each future in the calling thread in a loop, which will block until they've all returned. If I wanted to do "prep" work during this time, I could put the "defrefing" code itself in yet another future.
It looks like you mostly answered your own question. I'll add my 2 cents about how to do this though.
(defn many-futures
[tasks]
(let [futures (for [task tasks]
(future (task)))]
(do-prep tasks)
(doseq [completion futures]
#completion)))
This will do your prep in parallel with all the futures, and then return after all the futures have completed. You could replace the doseq with (doall (for ...)) if you actually want to use the results somewhere. Or, indeed, you could skip the doall, and then only block once the results are actually accessed. Even further, you could return the lazy-seq of futures itself, and then you can access any one of them via deref independently of the completion status of the others.
I'm attempting to explore the behavior of a CPU-bound algorithm as it scales to multiple CPUs using Clojure. The algorithm takes a large sequence of consecutive integers as input, partitions the sequence into a given number of sub-sequences, then uses map to apply a function to each sub-sequence. Once the map function has completed, reduce is used to collect the results.
The full code is available on Github, but here is a sample:
(map computation-function (partitioning-function number-of-partitions input))
When I execute this code on a machine with twelve cores, I see most of the the cores in use, when I expect to see only one core in use.
Ideally, I would like to use pmap to use a given number of threads, but I am unable to cause the code to execute using only one thread.
So is Clojure spreading the computation across multiple CPUs? If so, is there anything that I can do to control this behavior?
My understanding is that pmap uses multiple cores and map uses the current thread only. (There would be no point in having both functions in the library if both used all available cores.)
The following simple experiment shows that pmap uses separate threads and map does not:
(defn something-slow [x]
(Thread/sleep 1000))
(map something-slow (range 5))
;; Takes 5 seconds
(pmap something-slow (range 5))
;; Takes 1 second
I do note that your GitHub code uses pmap in the example which runs in main-; if you change back to map does the parallelism persist?
What happens when you create nested dosync calls? Will sub-transactions be completed in the parent scope? Are these sub-transactions reversible if the parent transaction fails?
If you mean syntactic nesting, then the answer is it depends on whether the inner dosync will run on the same thread as the outer one.
In Clojure, whenever a dosync block is entered, a new transaction is started if one hasn't been running already on this thread. This means that while execution stays on a single thread, inner transactions can be said to be subsumed by outer transactions; however if a dosync occupies a position syntactically nested within another dosync, but happens to be launched on a new thread, it will have a new transaction to itself.
An example which (hopefully) illustrates what happens:
user> (def r (ref 0))
#'user/r
user> (dosync (future (dosync (Thread/sleep 50) (println :foo) (alter r inc)))
(println :bar)
(alter r inc))
:bar
:foo
:foo
1
user> #r
2
The "inner" transaction retries after printing :foo; the "outer" transaction never needs to restart. (Note that after this happens, r's history chain is grown, so if the "large" dosync form were evaluated for a second time, the inner dosync would not retry. It still wouldn't be merged into the outer one, of course.)
Incidentally, Mark Volkmann has written a fantastic article on Clojure's Software Transactional Memory; it's highly recommended reading for anyone interested in gaining solid insight into details of this sort.