Below is a simplified version of an application I am working on. Specifically, I am interested in benchmarking the execution time of process-list. In process-list, I partition the input list into a number of partitions equal to the number of threads I would like to run in parallel, and pass each partition to a thread through a call to future. Finally, in main I call process-list wrapped in time. time should return the elapsed time of the processing done by process-list, but apparently it only returns the time it takes to create the future threads and does not wait for the futures to run to completion. How can I dereference the futures inside process-list so that the elapsed time accounts for the future threads running to completion?
(ns listProcessing
  (:require [clojure.string]
            [clojure.pprint]
            [input-random :as input]))

(def N-THREADS 4)

(def element_processing_retries (atom 0))

(def list-collection
  "Each element is made into a ref"
  (map ref input/myList))

(defn partition-list [threads list]
  "Partition list into the required number of partitions, which is equal
  to the number of threads."
  (partition-all (Math/ceil (/ (count list) threads)) list))

(defn increase-element [element]
  (alter element inc))

(defn process-list [list]
  "Process members of list one by one."
  (let [sub-lists (partition-list N-THREADS list)]
    (doseq [sub-list sub-lists]
      (let [futures '()
            myFuture (future (dosync (swap! element_processing_retries inc)
                                     (map increase-element sub-list)))]
        (cons myFuture futures)
        (map deref futures)))))

(defn main []
  (let [f1 (future (time (process-list input/myList)))]
    @f1))

(main)
(shutdown-agents)
Below is an example of the input. Note that both the input and the list processing are simplified to keep the question short:
(ns input-random)
(def myList (list 1 2 4 7 89 12 34 45 56))
The busy-wait workaround below will have some overhead. If you're trying to time millisecond differences, it will skew things a bit (although timings that fine-grained shouldn't be using time anyway).
I think your example was a little convoluted, so I reduced it down to what I think represents the problem a little better:
(time (doseq [n (range 5)]
        (future
          (Thread/sleep 2000))))
"Elapsed time: 1.687702 msecs"
The problem here is the same as the problem with your code: all this really does is time how long it takes for doseq to dispatch all the jobs.
The idea with my hack is to put each finished job into an atom, then check for an end condition in a busy wait:
(defn do-stuff [n-things]
  (let [ret-atom (atom 0)]
    (doseq [n (range n-things)]
      (future
        (Thread/sleep 2000)
        (swap! ret-atom inc)))
    ret-atom))

; Time how long it takes the entire `let` to run
(time
 (let [n 5
       ret-atom (do-stuff n)]
   ; Will block until the condition is met
   (while (< @ret-atom n))))
"Elapsed time: 2002.813288 msecs"
The reason this is so hard to time is that all you're doing is spinning up some side effects in a doseq. Nothing defines what "done" means, so there's nothing to block on. I'm not great with core.async, but I suspect it may have something that helps here. It may be possible to have a call to <!! block until a channel has a certain number of elements; in that case, you would just need to put results onto the channel as they're produced.
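An alternative to the busy-wait that avoids the atom entirely, assuming all you need is to block until every job finishes: force the future calls with mapv (which is eager, unlike map), then deref each future. This is a sketch with a hypothetical run-jobs function, not code from the question:

```clojure
(defn run-jobs [n-jobs]
  ;; mapv is eager, so every future is dispatched before mapv returns
  (let [futures (mapv (fn [_] (future (Thread/sleep 2000)))
                      (range n-jobs))]
    ;; deref blocks until each future completes, so run! returns
    ;; only after all jobs are done
    (run! deref futures)))

(time (run-jobs 5))
;; prints an elapsed time of roughly 2000 msecs, not ~2 msecs
```

The same pattern applied to process-list would mean collecting the futures eagerly and derefing them before doseq moves on, so time measures the full computation.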
I'm doing problem 7 of Project Euler (calculate the 10001st prime). I have coded a solution in the form of a lazy sequence, but it is super slow, whereas another solution I found on the web (link below) and which does essentially the same thing takes less than a second.
I'm new to Clojure and lazy sequences, so my usage of take-while, lazy-cat, rest, or map may be the culprit. Could you please look at my code and tell me if you see anything?
The solution that runs under a second is here:
https://zach.se/project-euler-solutions/7/
It doesn't use lazy sequences. I'd like to know why it's so fast while mine is so slow (the process they follow is similar).
My solution which is super slow:
(def primes
  (letfn [(getnextprime [largestprimesofar]
            (let [primessofar (concat (take-while #(not= largestprimesofar %) primes)
                                      [largestprimesofar])]
              (loop [n (+ (last primessofar) 2)]
                (if (loop [primessofarnottriedyet (rest primessofar)]
                      (if (= 0 (count primessofarnottriedyet))
                        true
                        (if (= 0 (rem n (first primessofarnottriedyet)))
                          false
                          (recur (rest primessofarnottriedyet)))))
                  n
                  (recur (+ n 2))))))]
    (lazy-cat '(2 3) (map getnextprime (rest primes)))))
To try it, just load it and run something like (take 10000 primes), but use Ctrl+C to kill the process, because it is too slow. However, if you try (take 100 primes), you should get an instant answer.
Let me re-write your code just a bit to break it down into pieces that will be easier to discuss. I'm using your same algorithm, I'm just splitting out some of the inner forms into separate functions.
(declare primes) ;; declare this up front so we can refer to it below

(defn is-relatively-prime? [n candidates]
  (if (= 0 (count candidates))
    true
    (if (zero? (rem n (first candidates)))
      false
      (is-relatively-prime? n (rest candidates)))))

(defn get-next-prime [largest-prime-so-far]
  (let [primes-so-far (concat (take-while #(not= largest-prime-so-far %) primes)
                              [largest-prime-so-far])]
    (loop [n (+ (last primes-so-far) 2)]
      (if (is-relatively-prime? n (rest primes-so-far))
        n
        (recur (+ n 2))))))

(def primes
  (lazy-cat '(2 3) (map get-next-prime (rest primes))))

(time (let [p (doall (take 200 primes))]))
That last line is just to make it easier to get some really rough benchmarks in the REPL. By making the timing statement part of the source file, I can keep re-loading the source, and get a fresh benchmark each time. If I just load the file once, and keep trying to do (take 500 primes) the benchmark will be skewed because primes will hold on to the primes it has already calculated. I also need the doall because I'm pulling my prime numbers inside a let statement, and if I don't use doall, it will just store the lazy sequence in p, instead of actually calculating the primes.
Now, let's get some base values. On my PC, I get this:
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 274.492597 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 293.673962 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 322.035034 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 285.29596 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 224.311828 msecs"
So about 275 milliseconds, give or take 50. My first suspicion is how we're getting primes-so-far in the let statement inside get-next-prime. We're walking through the complete list of primes (as far as we have it) until we get to one that's equal to the largest prime so far. The way we've structured our code, however, all the primes are already in order, so we're effectively walking thru all the primes except the last, and then concatenating the last value. We end up with exactly the same values as have been realized so far in the primes sequence, so we can skip that whole step and just use primes. That should save us something.
My next suspicion is the call to (last primes-so-far) in the loop. When we use the last function on a sequence, it also walks the list from the head down to the tail (or at least, that's my understanding -- I wouldn't put it past the Clojure compiler writers to have snuck in some special-case code to speed things up). But again, we don't need it. We're calling get-next-prime with largest-prime-so-far, and since our primes are in order, that's already the last of the primes as far as we've realized them, so we can just use largest-prime-so-far instead of (last primes-so-far). That gives us this:
(defn get-next-prime [largest-prime-so-far]
  ;; deleted the let statement since we don't need it
  (loop [n (+ largest-prime-so-far 2)]
    (if (is-relatively-prime? n (rest primes))
      n
      (recur (+ n 2)))))
That seems like it should speed things up, since we've eliminated two complete walks through the primes sequence. Let's try it.
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 242.130691 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 223.200787 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 287.63579 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 244.927825 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 274.146199 msecs"
Hmm, maybe slightly better (?), but not nearly the improvement I expected. Let's look at the code for is-relatively-prime? (as I've re-written it). And the first thing that jumps out at me is the count function. The primes sequence is a sequence, not a vector, which means the count function also has to walk the complete list to add up how many elements are in it. What's worse, if we start with a list of, say, 10 candidates, it walks all ten the first time through the loop, then walks the nine remaining candidates on the next loop, then the 8 remaining, and so on. As the number of primes gets larger, we're going to spend more and more time in the count function, so maybe that's our bottleneck.
We want to get rid of that count, and that suggests a more idiomatic way we could do the loop, using if-let. Like this:
(defn is-relatively-prime? [n candidates]
  (if-let [current (first candidates)]
    (if (zero? (rem n current))
      false
      (recur n (rest candidates)))
    true))
(first candidates) returns nil if the candidates list is empty; if that happens, the if-let form notices and automatically jumps to the else clause, which in this case is our return value of true. Otherwise, we execute the "then" clause and test whether n is evenly divisible by the current candidate. If it is, we return false; otherwise we recur with the rest of the candidates. I also took advantage of the zero? function just because I could. Let's see what this gets us.
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 9.981985 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 8.011646 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 8.154197 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 9.905292 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 8.215208 msecs"
Pretty dramatic, eh? I'm an intermediate-level Clojure coder with a pretty sketchy understanding of the internals, so take my analysis with a grain of salt, but based on those numbers, I'd guess you were getting bitten by the count.
There's one other optimization the "fast" code is using that yours isn't: bailing out of the is-relatively-prime? test whenever the current candidate squared is greater than n. You might speed up your code some more if you can throw that in, but I think count is the main thing you're looking for.
I will continue speeding it up, based on #manutter's solution.
(declare primes)

(defn is-relatively-prime? [n candidates]
  (if-let [current (first candidates)]
    (if (zero? (rem n current))
      false
      (recur n (rest candidates)))
    true))

(defn get-next-prime [largest-prime-so-far]
  (let [primes-so-far (concat (take-while #(not= largest-prime-so-far %) primes)
                              [largest-prime-so-far])]
    (loop [n (+ (last primes-so-far) 2)]
      (if (is-relatively-prime? n (rest primes-so-far))
        n
        (recur (+ n 2))))))

(def primes
  (lazy-cat '(2 3) (map get-next-prime (rest primes))))
(time (first (drop 10000 primes)))
"Elapsed time: 14092.414513 msecs"
OK. First of all, let's add the current^2 > n optimization:
(defn get-next-prime [largest-prime-so-far]
  (let [primes-so-far (concat (take-while #(not= largest-prime-so-far %) primes)
                              [largest-prime-so-far])]
    (loop [n (+ (last primes-so-far) 2)]
      (if (is-relatively-prime? n
                                (take-while #(<= (* % %) n)
                                            (rest primes-so-far)))
        n
        (recur (+ n 2))))))
user> (time (first (drop 10000 primes)))
"Elapsed time: 10564.470626 msecs"
104743
Nice. Now let's look closer at the get-next-prime:
If you check the algorithm carefully, you will notice that
(concat (take-while #(not= largest-prime-so-far %) primes) [largest-prime-so-far])
is really just all the primes we've found so far, and (last primes-so-far) is really largest-prime-so-far. So let's rewrite it a little:
(defn get-next-prime [largest-prime-so-far]
  (loop [n (+ largest-prime-so-far 2)]
    (if (is-relatively-prime? n
                              (take-while #(<= (* % %) n) (rest primes)))
      n
      (recur (+ n 2)))))
user> (time (first (drop 10000 primes)))
"Elapsed time: 142.676634 msecs"
104743
let's add one more order of magnitude:
user> (time (first (drop 100000 primes)))
"Elapsed time: 2615.910723 msecs"
1299721
Wow! it's just mind blowing!
But that's not all. Let's take a look at the is-relatively-prime? function: it just checks that none of the candidates evenly divides the number, which is exactly what the not-any? library function does. So let's just replace it in get-next-prime.
(declare primes)

(defn get-next-prime [largest-prime-so-far]
  (loop [n (+ largest-prime-so-far 2)]
    (if (not-any? #(zero? (rem n %))
                  (take-while #(<= (* % %) n)
                              (rest primes)))
      n
      (recur (+ n 2)))))

(def primes
  (lazy-cat '(2 3) (map get-next-prime (rest primes))))
It is a bit faster:
user> (time (first (drop 100000 primes)))
"Elapsed time: 2493.291323 msecs"
1299721
and obviously much cleaner and shorter.
One week ago I asked a similar question (Link) where I learned that the lazy nature of map makes the following code run sequentially.
(defn future-range
  [coll-size num-futures f]
  (let [step (/ coll-size num-futures)
        parts (partition step (range coll-size))
        futures (map #(future (f %)) parts)] ; Yeah I tried doall around here...
    (mapcat deref futures)))
That made sense. But how do I fix it? I tried doall around pretty much everything (:D), a different approach with promises and many other things. It just doesn't want to work. Why? It seems to me that the futures don't start until mapcat derefs them (I made some tests with Thread/sleep). But when I fully realize the sequence with doall shouldn't the futures start immediately in another thread?
It seems you are already there. It works if you wrap (map #(future (f %)) parts) in (doall ...). Just restart your REPL and start from a clean slate to ensure you are calling the right version of your function.
(defn future-range
  [coll-size num-futures f]
  (let [step (/ coll-size num-futures)
        parts (partition step (range coll-size))
        futures (doall (map #(future (f %)) parts))]
    (mapcat deref futures)))
You can use the following to test it out.
(defn test-fn [x]
  (let [start-time (System/currentTimeMillis)]
    (Thread/sleep 300)
    [{:result x
      :start start-time
      :end-time (System/currentTimeMillis)}]))
(future-range 10 5 test-fn)
You could also use time to verify that running (Thread/sleep 300) five times in parallel takes only about 300 ms:
(time (future-range 10 5 (fn [_] (Thread/sleep 300))))
I'm currently reading the O'Reilly Clojure Programming book, which says the following in its section about lazy sequences:
It is possible (though very rare) for a lazy sequence to know its length, and therefore return it as the result of count without realizing its contents.
My question is: how is this done, and why is it so rare?
Unfortunately, the book does not elaborate on this in that section. I personally think it would be very useful to know the length of a lazy sequence prior to its realization. For instance, the same page has an example of a lazy sequence of files that are processed with a function using map; it would be nice to know how many files will be processed before realizing the sequence.
As inspired by soulcheck's answer, here is a lazy but counted map of an expensive function over a fixed size collection.
(defn foo [s f]
  (let [c (count s), res (map f s)]
    (reify
      clojure.lang.ISeq
      (seq [_] res)
      clojure.lang.Counted
      (count [_] c)
      clojure.lang.IPending
      (isRealized [_] (realized? res)))))
(def bar (foo (range 5) (fn [x] (Thread/sleep 1000) (inc x))))
(time (count bar))
;=> "Elapsed time: 0.016848 msecs"
; 5
(realized? bar)
;=> false
(time (into [] bar))
;=> "Elapsed time: 4996.398302 msecs"
; [1 2 3 4 5]
(realized? bar)
;=> true
(time (into [] bar))
;=> "Elapsed time: 0.042735 msecs"
; [1 2 3 4 5]
I suppose it's due to the fact that usually there are other ways to find out the size.
The only sequence implementation I can think of now that could potentially do that, is some kind of map of an expensive function/procedure over a known size collection.
A simple implementation would return the size of the underlying collection, while postponing realization of the elements of the lazy-sequence (and therefore execution of the expensive part) until necessary.
In that case one knows the size of the collection that is being mapped over beforehand and can use that instead of the lazy-seq size.
It might be handy sometimes and that's why it's not impossible to implement, but I guess rarely necessary.
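For what it's worth, clojure.core/range is one concrete case of this: since Clojure 1.7 it returns a specialized counted range object, so count answers without realizing the elements, while a generic lazy seq built over it loses that ability. A quick sketch:

```clojure
;; range returns a counted, lazily-realized range object (Clojure 1.7+),
;; so count is O(1) and does not realize the elements
(count (range 1000000)) ; => 1000000, effectively instant

;; wrapping it in map yields a generic lazy seq: counting it
;; must walk (and realize) every element
(count (map inc (range 1000000))) ; => 1000000, but walks the whole seq
```

This matches the answer above: the size is knowable because the underlying collection's size is known up front.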
Let's say I have a side-effect-free function that I use repeatedly with the same parameters, without storing the results in a variable.
Does Clojure notice this and uses the pre-calculated value of the function or is the value recalculated all the time?
Example:
(defn rank-selection [population fitness]
  (map
   #(select-with-probability (sort-by fitness population) %)
   (repeatedly (count population) #(rand))))

(defn rank-selection [population fitness]
  (let [sorted-population (sort-by fitness population)]
    (map
     #(select-with-probability sorted-population %)
     (repeatedly (count population) #(rand)))))
In the first version sort-by is executed n-times (where n is the size of the population).
In the second version sort-by is executed once and the result is used n-times
Does Clojure stores the result nonetheless?
Are these methods comparably fast?
Clojure doesn't store the results unless you specify that in your code, either by using memoize, as mentioned in the comments, or by saving the calculation's result in a local binding like you did.
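For completeness, memoize wraps a function in an argument-keyed cache: each distinct argument list is computed once and then looked up. A minimal sketch with a stand-in expensive function (slow-sort and fast-sort are illustrative names, not from the question):

```clojure
(defn slow-sort [coll]
  (Thread/sleep 100) ; stand-in for an expensive, side-effect-free computation
  (sort coll))

(def fast-sort (memoize slow-sort))

(fast-sort [3 1 2]) ; first call pays the 100 ms cost
(fast-sort [3 1 2]) ; same argument: returns the cached result immediately
```

Since the cache never evicts, memoize suits pure functions called repeatedly with a recurring set of argument values.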
Regarding the questions about how fast is one function regarding the other, here's some code that returns the time for the execution of each (I had to mock the select-with-probability function). The doalls are necessary to force the evaluation of the result of map.
(defn select-with-probability [x p]
  (when (< p 0.5)
    x))

(defn rank-selection [population fitness]
  (map
   #(select-with-probability (sort-by fitness population) %)
   (repeatedly (count population) rand)))

(defn rank-selection-let [population fitness]
  (let [sorted-population (sort-by fitness population)]
    (map
     #(select-with-probability sorted-population %)
     (repeatedly (count population) rand))))

(let [population (take 1000 (repeatedly #(rand-int 10)))]
  (time (doall (rank-selection population <)))
  (time (doall (rank-selection-let population <)))
  ;; So that we don't get the result seq
  nil)
This returns the following in my local environment:
"Elapsed time: 95.700138 msecs"
"Elapsed time: 1.477563 msecs"
nil
EDIT
In order to avoid the use of the let form you could also use partial which receives a function and any number of arguments, and returns a partial application of that function with the values of the arguments supplied. The performance of the resulting code is in the same order as the one with the let form but is more succinct and readable.
(defn rank-selection-partial [population fitness]
  (map
   (partial select-with-probability (sort-by fitness population))
   (repeatedly (count population) rand)))

(let [population (take 1000 (repeatedly #(rand-int 10)))]
  (time (doall (rank-selection-partial population <)))
  ;; So that we don't get the result seq
  nil)
;= "Elapsed time: 0.964413 msecs"
In Clojure sequences are lazy, but the rest of the language, including function evaluation, is eager. Clojure will invoke the function every time for you. Use the second version of your rank-selection function.
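A quick way to see the eager re-evaluation, using a call counter as a stand-in for the expensive sort (the names here are illustrative):

```clojure
(def calls (atom 0))

(defn tracked-sort [coll]
  (swap! calls inc) ; count every invocation
  (sort coll))

;; the call sits inside the mapped fn, so it runs once per element
(doall (map (fn [_] (tracked-sort [3 1 2])) (range 5)))

@calls ; => 5: Clojure re-ran the function every time
```

Hoisting the call into a let, as in the second version of rank-selection, makes it run exactly once, which is why that version is so much faster.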