Clojure Synchronize Futures with Await

I have 3 long running tasks that I need to synchronize on. They are independent, but the calling thread must wait until all three are finished before continuing.
I can create an agent for each task and await on them, but agents aren't really the right semantic construct, since each agent will only be called once.
What I really want is to await on 3 futures, or some approach that more closely resembles what I'm trying to achieve.
Can I await on futures instead of agents?
Edit:
I guess the answer is simply to deref each future in the calling thread in a loop, which will block until they have all returned. If I wanted to do "prep" work during this time, I could put the dereferencing code itself in yet another future.

It looks like you mostly answered your own question. I'll add my 2 cents about how to do this though.
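(defn many-futures
  [tasks]
  ;; for is lazy, so doall forces every future to start
  ;; before the prep work begins
  (let [futures (doall (for [task tasks]
                         (future (task))))]
    (do-prep tasks)
    ;; deref each future in turn; blocks until all have completed
    (doseq [completion futures]
      @completion)))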
This will do your prep in parallel with all the futures, and then return after all the futures have completed. You could replace the doseq with (doall (for ...)) if you actually want to use the results somewhere. Or, indeed, you could skip the doall, and then only block once the results are actually accessed. Even further, you could return the lazy-seq of futures itself, and then you can access any one of them via deref independently of the completion status of the others.
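If the results are needed, the "start everything, then deref everything" idea can be sketched as below. This is only a minimal illustration: run-all and the three sample tasks are made-up names, and mapv is used because it is eager, so every future is started before the first deref blocks.

```clojure
(defn run-all
  "Starts every task (a zero-argument function) in its own future,
  then blocks until all of them are done.
  Returns the results in the same order as tasks."
  [tasks]
  (let [futures (mapv #(future (%)) tasks)] ; eager: all futures start here
    (mapv deref futures)))                  ; blocks on each future in turn

;; Hypothetical tasks standing in for the three long-running jobs
(def results
  (run-all [#(do (Thread/sleep 100) :a)
            #(do (Thread/sleep 50) :b)
            #(+ 1 2)]))
;; results => [:a :b 3]
```

The total wait is roughly that of the slowest task, since the futures run concurrently while the calling thread blocks on them one by one.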

Related

Can you explain pmap laziness and memory footprint?

The docs say this about pmap:
Like map, except f is applied in parallel. Semi-lazy in that the
parallel computation stays ahead of the consumption, but doesn't
realize the entire result unless required.
Can you kindly dis-obfuscate these two statements in some simple context?
Also, is there a doseq equivalent for pmap whose memory footprint is constant relative to the size of the iterated collection?
Semi-lazy in that the parallel computation stays ahead of the consumption
This means that pmap will do slightly more work than is strictly required by the sequence's consumer. This "working ahead" minimizes the wait for more items to be computed when the sequence is consumed. For example, if you're computing some infinite sequence in parallel and you only consume the first 50 results, pmap may have gone ahead and computed 50+N.
but doesn't realize the entire result unless required.
This means it's only going to work ahead up to a certain threshold. The entire sequence won't be produced unless it's completely consumed (or almost completely consumed).
Also, is there a doseq equivalent for pmap
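You can use doall or dorun with pmap to produce side effects in parallel. dorun discards each result as it walks the sequence and returns nil, so it is the one to use when memory should stay constant; doall retains the whole realized sequence and returns it.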
Here's an example of all three together, using an infinite sequence as input to pmap:
(def calls (atom 0))
(dorun (take 50 (pmap (fn [_] (swap! calls inc)) (range))))
;; @calls => 60 (the exact number can vary)
When this completes the value of calls will be over 50, even though we only consumed 50 items from the sequence.
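To make the dorun/doall distinction concrete, here is a small sketch on finite input. The names processed and process! are invented for illustration; the point is only what each form returns and that every element is still processed exactly once.

```clojure
(def processed (atom 0))

(defn process!
  "Hypothetical side-effecting step: counts the call and returns x squared."
  [x]
  (swap! processed inc)
  (* x x))

;; dorun walks the sequence without holding on to it and returns nil
(def dorun-result (dorun (pmap process! (range 1000))))

;; doall walks the sequence, keeps it in memory, and returns it
(def doall-result (doall (pmap process! (range 5))))

;; dorun-result => nil
;; doall-result => (0 1 4 9 16)
;; @processed   => 1005
```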
Also read up on reducers and core.async for another way to do the same thing.
While Taylor's answer is correct, I also gave a presentation on what happens inside of pmap, and how it's lazy, at Clojure West a few years ago. I know not everyone likes videos for learning, but if you do, it might be helpful: https://youtu.be/BzKjIk0vgzE?t=11m48s
(If you want non-lazy pmap, I second the endorsement for Claypoole.)

Clojure structure multiple calculation/writes to work in parallel

Let's say I have the following code:
(defn multiple-writes []
  (doseq [[x y] (map list [1 2] [3 4])] ;; let's imagine those are paths to files
    (when-not (exists? x y) ;; could be left off, I feel it is faster to check before overwriting
      (write-to-disk! (do-something x y)))))
That I call like this (parameters omitted):
(go (multiple-writes))
I use go to execute some code "in the background", but I do not know if I am using the right tool here. Some more information about those functions :
this is not high-priority code at all. It could even fail - multiple-writes could be seen as a cache-filling function.
I consequently do not care about the return value.
do-something takes between 100 and 500 milliseconds, depending on the input
do-something consumes some memory (uses image buffers, some images can be 2000px * 2000px)
there are 10 to 40 elements/images to be processed every time multiple-writes is called.
every call to write-to-disk! will create a new file (or overwrite an existing one, though that should not happen)
write-to-disk! always writes to the same directory
So I would like to speed up things by executing (write-to-disk! (do-something x y)) in parallel to go as fast as possible. But I don't want to overload the system at all, since this is not a high-priority task.
How should I go about this?
Note: despite the title, this is not a duplicate of this question since I don't want to restrict to 3 threads (not saying that the answer can't be the same, but I feel this question differs).
Take a look at the claypoole library, which gives some good and simple abstractions filling the void between pmap and fork/join reducers, which otherwise would need to be coded by hand with futures and promises.
With pmap, all results of a parallel batch need to have returned before the next batch is executed, because return order is preserved. This can be a problem with widely varying processing times (be they calculations, HTTP requests, or work items of different "size"). This is what usually drags pmap down to the performance of a single-threaded map plus unneeded overhead.
With claypoole's unordered pmap and unordered for (upmap and upfor), slower function calls in one thread (core) can be overtaken by faster ones on another thread because ordering doesn't need to be preserved, as long as not all cores are clogged by slow calls.
This might not help much if IO to one disk is the only bottleneck, but since claypoole has configurable thread pool sizes and functions to detect the number of available cores, it will help with restricting the number of cores used.
And where fork/join reducers would optimize CPU usage by work stealing, they might greatly increase memory use, since there is no option to restrict the number of parallel tasks without altering the reducer library.
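The idea of a size-limited thread pool can also be sketched without any extra dependency, using a plain java.util.concurrent executor. To be clear, this is not claypoole's API, just an illustration of bounded parallelism; run-bounded! is a made-up name, and the function passed in below is a stand-in for (write-to-disk! (do-something x y)).

```clojure
(import '(java.util.concurrent Executors ExecutorService))

(defn run-bounded!
  "Calls (f item) for every item, using at most n-threads threads.
  Return values are ignored. Blocks until every item has been processed."
  [n-threads f items]
  (let [^ExecutorService pool (Executors/newFixedThreadPool n-threads)
        ;; Clojure functions implement Callable, so they can be submitted as-is
        jobs (mapv (fn [item] (fn [] (f item))) items)]
    (try
      ;; invokeAll blocks until all jobs have finished; an exception thrown
      ;; by one job is captured in its Future and does not stop the others
      (.invokeAll pool jobs)
      (finally
        (.shutdown pool)))
    nil))

;; Usage with a stand-in for the real work
(def written (atom #{}))

(run-bounded! 2
              (fn [[x y]] (swap! written conj [x y]))
              (map list [1 2] [3 4]))
;; @written => #{[1 3] [2 4]}
```

Choosing n-threads well below the number of available cores is one way to keep a low-priority job from saturating the machine; the whole call can still be wrapped in a future if the caller should not block.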
Consider basing your design on streams or fork/join.
I would have a single component that does IO. Every processing node can then send its results there to be saved. This is easy to model with streams. With fork/join, it can be achieved by not returning the result up the hierarchy but sending it to, e.g., an agent.
If memory consumption is an issue, perhaps you can divide the work even further, for example into 100x100 patches.

Does Clojure use multiple threads in a map call?

I'm attempting to explore the behavior of a CPU-bound algorithm as it scales to multiple CPUs using Clojure. The algorithm takes a large sequence of consecutive integers as input, partitions the sequence into a given number of sub-sequences, then uses map to apply a function to each sub-sequence. Once the map function has completed, reduce is used to collect the results.
The full code is available on Github, but here is a sample:
(map computation-function (partitioning-function number-of-partitions input))
When I execute this code on a machine with twelve cores, I see most of the cores in use, when I expect to see only one core in use.
Ideally, I would like to use pmap to use a given number of threads, but I am unable to cause the code to execute using only one thread.
So is Clojure spreading the computation across multiple CPUs? If so, is there anything that I can do to control this behavior?
My understanding is that pmap uses multiple cores and map uses the current thread only. (There would be no point in having both functions in the library if both used all available cores.)
The following simple experiment shows that pmap uses separate threads and map does not:
(defn something-slow [x]
  (Thread/sleep 1000))

(map something-slow (range 5))
;; Takes 5 seconds

(pmap something-slow (range 5))
;; Takes 1 second
I do note that your GitHub code uses pmap in the example that runs in -main; if you change it back to map, does the parallelism persist?
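The experiment above can be turned into a self-checking sketch. The helper elapsed-ms is a made-up name, the sleep is shortened to 200 ms, and dorun is used to force the lazy results outside the REPL; exact timings will vary by machine.

```clojure
(defn something-slow [x]
  (Thread/sleep 200)
  x)

(defn elapsed-ms
  "Runs the zero-argument function f and returns the elapsed time in ms."
  [f]
  (let [start (System/nanoTime)]
    (f)
    (/ (- (System/nanoTime) start) 1e6)))

;; map runs each call on the current thread, one after another
(def map-ms (elapsed-ms #(dorun (map something-slow (range 5)))))

;; pmap runs the calls on separate threads
(def pmap-ms (elapsed-ms #(dorun (pmap something-slow (range 5)))))

;; map-ms is about 1000 (5 x 200 ms); pmap-ms is noticeably smaller
```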

ref-set vs commute vs alter

What is the difference in the 3 ways to set the value of a ref in Clojure? I've read the docs several times about ref-set, commute, and alter. I'm rather confused which ones to use at what times. Can someone provide me a short description of what the differences are and why each is needed?
As a super simple explanation of how the Software Transactional Memory system works in Clojure: it retries transactions until every one of them gets through without having its values changed out from under it. You can help it make this decision by using ref-changing functions that give it hints about what interactions are safe between transactions.
ref-set is for when you don't care about the current value. Just set it to this! ref-set saves you the angst of writing something like (alter my-ref (fn [_] 4)) just to set the value of my-ref to 4. (ref-set my-ref 4) sure does look a lot better :).
Use ref-set to simply set the value.
alter is the standard one. Use this function to alter the value. This is the meat of the STM. It uses the function you pass to change the value and retries if it cannot guarantee that the value was unchanged from the start of the transaction. This is very safe, even in some cases where you don't need it to be that safe, like incrementing a counter.
You probably want to use alter most of the time.
commute is an optimized version of alter for those times when the order of things really does not matter. For example, it makes no difference who added which +1 to the counter; the result is the same. If the STM is deciding whether your transaction is safe to commit and it only has conflicts on commute operations and none on alter operations, then it can go ahead and commit the new values without having to restart anyone. This can save the occasional transaction retry, though you're not going to see huge gains from this in normal code.
Use commute when you can.
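A minimal side-by-side sketch of the three functions follows. The ref names (status, total, counter, hits) are invented for illustration.

```clojure
(def status  (ref :idle))
(def total   (ref 10))
(def counter (ref 0))

(dosync
  (ref-set status :running) ; overwrite, ignoring the old value
  (alter total + 5)         ; new value is (+ old-value 5); a conflict forces a retry
  (commute counter inc))    ; order-insensitive update

;; @status => :running, @total => 15, @counter => 1

;; Many threads bumping the same counter: the order of the increments
;; does not matter, so commute is a natural fit
(def hits (ref 0))
(dorun (pmap (fn [_] (dosync (commute hits inc))) (range 100)))
;; @hits => 100
```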

How do nested dosync calls behave?

What happens when you create nested dosync calls? Will sub-transactions be completed in the parent scope? Are these sub-transactions reversible if the parent transaction fails?
If you mean syntactic nesting, then the answer is it depends on whether the inner dosync will run on the same thread as the outer one.
In Clojure, whenever a dosync block is entered, a new transaction is started if one hasn't been running already on this thread. This means that while execution stays on a single thread, inner transactions can be said to be subsumed by outer transactions; however if a dosync occupies a position syntactically nested within another dosync, but happens to be launched on a new thread, it will have a new transaction to itself.
An example which (hopefully) illustrates what happens:
user> (def r (ref 0))
#'user/r
user> (dosync (future (dosync (Thread/sleep 50) (println :foo) (alter r inc)))
              (println :bar)
              (alter r inc))
:bar
:foo
:foo
1
user> @r
2
The "inner" transaction retries after printing :foo; the "outer" transaction never needs to restart. (Note that after this happens, r's history chain is grown, so if the "large" dosync form were evaluated for a second time, the inner dosync would not retry. It still wouldn't be merged into the outer one, of course.)
Incidentally, Mark Volkmann has written a fantastic article on Clojure's Software Transactional Memory; it's highly recommended reading for anyone interested in gaining solid insight into details of this sort.
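For the same-thread case described above, a small sketch may help. The refs a and b and the function inc-a! are invented names. It shows that an inner dosync joins the outer transaction, so its change is discarded if the outer transaction is aborted by an exception.

```clojure
(def a (ref 0))
(def b (ref 0))

(defn inc-a!
  "Has its own dosync; when called inside another transaction on the
  same thread, it simply joins that transaction."
  []
  (dosync (alter a inc)))

;; Nested on the same thread: both changes commit together
(dosync
  (inc-a!)
  (alter b inc))

;; An exception in the outer body aborts the whole transaction,
;; including the change made by the inner dosync
(def outcome
  (try
    (dosync
      (inc-a!)
      (throw (ex-info "abort" {})))
    (catch clojure.lang.ExceptionInfo _ :aborted)))

;; outcome => :aborted, @a => 1, @b => 1
```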