wheel sieve - slow reconstruction from bitsets - clojure

I'm trying to get my feet wet with parallel processing by implementing a wheel sieve for primes. The order of operations is roughly as follows:
1. Given some upper bound N, construct 8 different spokes of the form [p, p + 30, p + 60, ..., p + 30n] for p in {1, 7, 11, 13, 17, 19, 23, 29}.
2. Combine these all together into one list.
3. For each p, sieve all the primes in the list created in step 2 (using pmap).
4. Take these 8 bitsets, parse them, and rebuild a sorted list of all primes less than N.
My code is listed below (primes and primes2 are single-threaded implementations of a simple sieve and a wheel sieve respectively, for comparison).
The problem I'm having is that my current attempts to implement step 4 (in functions primes4, primes5, and primes6) all dominate the running time of steps 1-3.
Can anyone give me pointers on how I should complete primes3 (which implements only steps 1-3)? Otherwise, if this is hopeless, can someone explain some other strategy for splitting up the work so that separate threads can do it in isolation?
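For reference, the eight spoke offsets are exactly the residues below 30 that share no factor with 30 = 2 * 3 * 5, which is why no prime above 5 can fall outside these spokes. A minimal sketch of that fact (gcd and wheel-30 are illustrative names, not part of the code in this question):

```clojure
;; Hypothetical helper: Euclid's algorithm.
(defn gcd [a b]
  (if (zero? b) a (recur b (mod a b))))

;; Residues in [0, 30) that are coprime to 30.
(def wheel-30 (filter #(= 1 (gcd % 30)) (range 30)))
;; wheel-30 => (1 7 11 13 17 19 23 29)
```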
(defn spoke-30 [p N]
(map #(+ p (* %1 30)) (range 0 (/ N 30))))
(defn interleaved-spokes
"returns all the spoke30 elements less than or equal to N (but not the first which is 1)"
[N]
(rest (filter #(< % N) (apply interleave (map #(spoke-30 % N) '(1 7 11 13 17 19 23 29))))))
(defn get-orbit
"for a spacing diff, generates the orbit of diff under + in Z_{30}"
[diff]
(map #(mod (* diff %) 30) (range 0 30)))
(defn _all-orbits
"returns a map of maps where each key is an element of (1 7 11 13 17 19 23 29),
and each value is the orbit of that element under + in Z_30)"
[]
(let [primes '(1 7 11 13 17 19 23 29)]
(zipmap primes (map #(zipmap (get-orbit %) (range 30 0 -1)) primes))))
(def all-orbits (memoize _all-orbits))
(defn start
"for a prime N and a spoke on-spoke, determine the optimal starting point"
[N on-spoke]
(let [dist (mod (- N) 30)
lowest (* (((all-orbits) dist) on-spoke) N)
sqrN (* N N)]
; this might be one away from where I need to be, but it is a cheaper
; calculation than the absolute best start.
(cond (>= lowest sqrN) lowest
(= lowest N) (* 31 N)
true (+ lowest (* N 30 (int (/ N 30)))))))
(defn primes2 [bound]
(let [bset (new java.util.BitSet bound)
sqrtBound (Math/sqrt bound)
pList (interleaved-spokes sqrtBound)]
(.flip bset 2 bound)
(doseq [i (range 9 bound 6)] (.clear bset i)) ;clear out the special case 3s
(doseq [i (range 25 bound 10)] (.clear bset i)) ;clear out the special case 5s
(doseq [x '(1 7 11 13 17 19 23 29)
y pList
z (range (start y x) bound (* 30 y))]
(.clear bset z))
(conj (filter (fn [x] (.get bset x)) (range 1 bound 2)) 2)))
(defn scaled-start [N on-spoke]
(let [dist (mod (- N) 30)
k (((all-orbits) dist) on-spoke)
lowest (int (/ (* k N) 30))
remaining (* (int (/ (- N k) 30)) N)
start (+ lowest remaining)]
(if (> start 0) start N)))
;TODO do I even *need* this bitset!?? ...
(defn mark-composites [bound spoke pList]
(let [scaledbound (int (/ bound 30))
bset (new java.util.BitSet scaledbound)]
(if (= spoke 1) (.set bset 0)) ; this won't be marked as composite - but it isn't prime!
(doseq [x pList
y (range (scaled-start x spoke) scaledbound x)]
(.set bset y))
[spoke bset]))
;TODO now need to find a quick way of reconstructing the required list
; need the raw bitsets ... will then loop over [0 scaledbound] and the bitsets, adding to a list in correct order if the element is true and not present already (it shouldn't!)
(defn primes3 [bound]
(let [pList (interleaved-spokes (Math/sqrt bound))]
(pmap #(mark-composites bound % pList) '(1 7 11 13 17 19 23 29))))
(defn primes4 [bound]
(let [pList (interleaved-spokes (Math/sqrt bound))
bits (pmap #(mark-composites bound % pList) '(1 7 11 13 17 19 23 29))
L (new java.util.ArrayList)]
(.addAll L '(2 3 5))
(doseq [z (range 0 (int (/ bound 30))) [x y] bits ]
(if (not (.get y z)) (.add L (+ x (* 30 z))))
(println x y z L)
)))
(defn primes5 [bound]
(let [pList (interleaved-spokes (Math/sqrt bound))
bits (pmap #(mark-composites bound % pList) '(1 7 11 13 17 19 23 29))]
(for [z (range 0 (int (+ 1 (/ bound 30)))) [x y] bits ]
(if (not (.get y z)) (+ x (* 30 z))))
))
(defn primes6 [bound]
(let [pList (interleaved-spokes (Math/sqrt bound))]
(concat '(2 3 5) (filter #(not= % 0) (apply interleave (pmap #(mark-composites2 bound % pList) '(1 7 11 13 17 19 23 29)))))))
(defn primes
"returns a list of prime numbers less than n"
[n]
(let [bs (new java.util.BitSet n)]
(.flip bs 2 n)
;(doseq [i (range 4 n 2)] (.clear bs i)) ;clear out the special case 2s
(doseq [i (range 3 (Math/sqrt n))]
(if (.get bs i) ; it seems faster to check if odd than to range in steps of 2
(doseq [j (range (* i i) n (* 2 i))] (.clear bs j))))
(conj (filter (fn [x] (.get bs x)) (range 1 n 2)) 2)))
some timings are:
user=> (time (count (primes 1000000)))
"Elapsed time: 117.023543 msecs"
78498
user=> (time (count (primes2 1000000)))
"Elapsed time: 77.10944 msecs"
78498
user=> (time (count (primes3 1000000)))
"Elapsed time: 22.447898 msecs"
8
user=> (time (count (primes4 1000000)))
"Elapsed time: 647.586234 msecs"
78506
user=> (time (count (primes5 1000000)))
"Elapsed time: 721.62017 msecs"
266672
user=> (time (count (primes6 1000000)))
"Elapsed time: 306.280182 msecs"

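A note on step 4, offered as a sketch rather than a definitive fix. Two things in the attempts above look costly independent of the merge itself: primes4 calls println on every iteration, and primes5 yields nil for every composite slot (which is why its count is 266672 = 8 * 33334). One possible merge walks the rows in order, checks the eight bitsets for each row, and accumulates into a transient vector. The names residues, merge-spokes and naive-spoke are assumptions made for this example; it also assumes that a set bit z in the bitset for spoke r marks r + 30z as composite, and that bound is greater than 5.

```clojure
(import 'java.util.BitSet)

(def residues [1 7 11 13 17 19 23 29])

(defn merge-spokes
  "bitsets is a vector of 8 BitSets in the same order as residues.
  A set bit z in the bitset for residue r marks r + 30z as composite.
  Returns a sorted vector of the unmarked numbers below bound."
  [bitsets bound]
  (let [rows (quot (+ bound 29) 30)]
    (persistent!
     (reduce
      (fn [acc z]
        (reduce
         (fn [acc i]
           (let [n (+ (residues i) (* 30 z))]
             (if (and (< n bound)
                      (not (.get ^BitSet (bitsets i) (int z))))
               (conj! acc n)
               acc)))
         acc
         (range 8)))
      (transient [2 3 5])
      (range rows)))))

;; Slow stand-in for mark-composites, used only to exercise merge-spokes.
(defn naive-spoke [r bound]
  (let [bs (BitSet.)]
    (doseq [z (range (quot (+ bound 29) 30))
            :let [n (+ r (* 30 z))]
            :when (or (= n 1) (some #(zero? (rem n %)) (range 2 n)))]
      (.set bs (int z)))
    bs))

(merge-spokes (mapv #(naive-spoke % 100) residues) 100)
```

Because rows are visited in ascending order and the residues are already sorted, the output needs no extra sort. With the code in the question, the bitsets would presumably come from the second element of each [spoke bset] pair returned by primes3.
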
Related

clojure split collection in chunks of increasing size

Hi I am a clojure newbie,
I am trying to create a function that splits a collection into chunks of increasing size something along the lines of
(apply #(#(take %) (range 1 n)) col)
where n is the number of chunks
example of expected output:
with n = 4 and col = (range 1 4)
(1) (2 3) (4)
with n = 7 and col = (range 1 7)
(1) (2 3) (4 5 6) (7)

You can use something like this:
(defn partition-inc
"Partition xs at increasing steps of n"
[n xs]
(lazy-seq
(when (seq xs)
(cons (take n xs)
(partition-inc (inc n) (drop n xs))))))
; (println (take 5 (partition-inc 1 (range))))
; → ((0) (1 2) (3 4 5) (6 7 8 9) (10 11 12 13 14))
Or if you want to have more influence, you could alternatively provide
a sequence for the sizes (behaves the same as above if passed (iterate inc 1) for sizes):
(defn partition-sizes
"Partition xs into chunks given by sizes"
[sizes xs]
(lazy-seq
(when (and (seq sizes) (seq xs))
(let [n (first sizes)]
(cons (take n xs) (partition-sizes (rest sizes) (drop n xs)))))))
; (println (take 5 (partition-sizes (range 1 10 2) (range))))
; → ((0) (1 2 3) (4 5 6 7 8) (9 10 11 12 13 14 15) (16 17 18 19 20 21 22 23 24))

An eager solution would look like
(defn partition-inc [coll]
(loop [rt [], c (seq coll), n 1]
(if (seq c)
(recur (conj rt (take n c)) (drop n c) (inc n))
rt)))

another way would be to employ some clojure sequences functions:
(->> (reductions (fn [[_ x] n] (split-at n x))
[[] (range 1 8)]
(iterate inc 1))
(map first)
rest
(take-while seq))
;;=> ((1) (2 3) (4 5 6) (7))

Yet another approach...
(defn growing-chunks [src]
(->> (range)
(reductions #(drop %2 %1) src)
(take-while seq)
(map-indexed take)
rest))
(growing-chunks [:a :b :c :d :e :f :g :h :i])
;; => ((:a) (:b :c) (:d :e :f) (:g :h :i))

How to take n random items from a collection in Clojure?

I have a function that takes n random items from a collection such that the same item is never picked twice. I could accomplish this quite easily:
(defn take-rand [n coll]
(take n (shuffle coll)))
But I have a pesky requirement that I need to return the same random subset when the same seed is provided, i.e.
(defn take-rand [n coll & [seed]] )
(take-rand 5 (range 10) 42L) ;=> (2 5 8 6 7)
(take-rand 5 (range 10) 42L) ;=> (2 5 8 6 7)
(take-rand 5 (range 10) 27L) ;=> (7 6 9 1 3)
(take-rand 5 (range 10)) ;=> (9 7 8 5 0)
I have a solution for this, but it feels a bit clunky and not very idiomatic. Can any Clojure veterans out there propose improvements (or a completely different approach)?
Here's what I did:
(defn take-rand
"Returns n randomly selected items from coll, or all items if there are fewer than n.
No item will be picked more than once."
[n coll & [seed]]
(let [n (min n (count coll))
rng (if seed (java.util.Random. seed) (java.util.Random.))]
(loop [out [], in coll, n n]
(if (or (empty? in) (= n 0))
out
(let [i (.nextInt rng n)]
(recur (conj out (nth in i))
(concat (take i in) (nthrest in (inc i)))
(dec n)))))))

Well, I'm no clojure veteran, but how about:
(defn shuffle-with-seed
"Return a random permutation of coll with a seed"
[coll seed]
(let [al (java.util.ArrayList. coll)
rnd (java.util.Random. seed)]
(java.util.Collections/shuffle al rnd)
(clojure.lang.RT/vector (.toArray al))))
(defn take-rand [n coll & [seed]]
(take n (if seed
(shuffle-with-seed coll seed)
(shuffle coll))))
shuffle-with-seed is similar to Clojure's shuffle; it just passes an instance of Random to Java's java.util.Collections.shuffle().

Replace the shuffle function in your first solution with the reproducible one. I will show you my solution below:
(defn- shuffle'
[seed coll]
(let [rng (java.util.Random. seed)
rnd #(do % (.nextInt rng))]
(sort-by rnd coll)))
(defn take-rand'
([n coll]
(->> coll shuffle (take n)))
([n coll seed]
(->> coll (shuffle' seed) (take n))))
I hope this solution gives the results you expected:
user> (take-rand' 5 (range 10))
(5 4 7 2 6)
user> (take-rand' 5 (range 10))
(1 9 0 8 5)
user> (take-rand' 5 (range 10))
(5 2 3 1 8)
user> (take-rand' 5 (range 10) 42)
(2 6 4 8 1)
user> (take-rand' 5 (range 10) 42)
(2 6 4 8 1)
user> (take-rand' 5 (range 10) 42)
(2 6 4 8 1)
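One caveat with sorting by a random key: sort-by calls its key function again on every comparison, so the key of a given element changes during the sort and the comparator is not consistent. On larger collections the underlying Java sort may reject that. A possible variant, sketched under the assumption that drawing one key per element up front is acceptable (seeded-shuffle and take-rand-seeded are illustrative names, and the resulting order will differ from the outputs above):

```clojure
(defn seeded-shuffle
  "Pairs each element with one random key, then sorts by that key."
  [seed coll]
  (let [rng (java.util.Random. seed)]
    (->> coll
         ;; mapv is eager, so keys are drawn in input order exactly once
         (mapv (fn [x] [(.nextInt rng) x]))
         (sort-by first)
         (map second))))

(defn take-rand-seeded [n coll seed]
  (take n (seeded-shuffle seed coll)))
```

The same seed always yields the same subset, since the sequence of keys depends only on the seed and the input order.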

Building tables in clojure

If I wanted to build a table in Clojure of vector duplicates, I'd write:
(take 2 (repeat [1 2 3]))
But how would I expand this notion of a table function to build something like:
Input 1: [a^2 2 6 2] where a^2 is some input function, 2 is min value, 6 is max value, and 2 is step size.
Output 1: [4,16,36]
Input 2: [b^2 10 -5 -2]
Output 2: [100 64 36 16 4 0 4 16]
This outputs a 4x3 matrix
Input 3: [(+ (* 10 i) j) [1 4] [1 3]]
where (+ (* 10 i) j) is 10i+j (some given input function), [1 4] is the min and max of i, and [1 3] is the min and max of j.
Output 3: [[11 12 13] [21 22 23] [31 32 33] [41 42 43]]

You want to use for in a nested fashion:
(for [i (range 1 (inc 4))]
(for [j (range 1 (inc 3))]
(+ (* 10 i) j)))
;; '((11 12 13) (21 22 23) (31 32 33) (41 42 43))
EDIT: expanded with an example implementation
For example:
(defn build-seq [f lower upper step]
(for [i (range lower (+ upper step) step)]
(f i)))
(build-seq #(* % %) 2 6 2)
;; '(4 16 36)
(defn build-table [f [ilower iupper] [jlower jupper]]
(for [i (range ilower (inc iupper))]
(for [j (range jlower (inc jupper))]
(f i j))))
(build-table #(+ (* 10 %) %2) [1 4] [1 3])
;; '((11 12 13) (21 22 23) (31 32 33) (41 42 43))
Your three input/output samples do not share a consistent signature between the one-variable and two-variable cases; furthermore, the step argument seems to be optional. I'm skeptical that a nice API exists that would retain the samples' syntax, but I can try something different (even though I believe the simple nested for forms are a better solution):
(defn flexible-range [{:keys [lower upper step] :or {lower 0}}]
(let [[upper step] (cond
(and upper step) [(+ upper step) step]
step (if (pos? step)
[Double/POSITIVE_INFINITY step]
[Double/NEGATIVE_INFINITY step])
upper (if (< lower upper)
[(inc upper) 1]
[(dec upper) -1])
:else [Double/POSITIVE_INFINITY 1])]
(range lower upper step)))
(defn build-table
([f [& params]]
(for [i (flexible-range params)]
(f i)))
([f [& iparams] [& jparams]]
(for [i (flexible-range iparams)]
(for [j (flexible-range jparams)]
(f i j)))))
(build-table #(* % %) [:lower 2 :upper 6 :step 2])
;; '(4 16 36)
(build-table #(+ (* 10 %) %2) [:lower 1 :upper 4]
[:lower 1 :upper 3])
;; '((11 12 13) (21 22 23) (31 32 33) (41 42 43))

I think you can easily solve it with map and range
(defn applier
[f ini max step]
(map f (range ini (+ max step) step)))
(applier #(* % %) 2 6 2)
=> (4 16 36)

This fn can resolve your third example
(defn your-fn [[ra1 ra2] [rb1 rb2] the-fn]
(vec (map (fn [i] (vec (map (fn [j] (the-fn i j)) (range rb1 (inc rb2))))) (range ra1 (inc ra2))))
)
(your-fn [1 4] [1 3] (fn [i j] (+ (* 10 i) j)))
=> [[11 12 13] [21 22 23] [31 32 33] [41 42 43]]
But I'd need a few more specification details (or more use cases) to make this behavior generic; maybe you can explain your problem a little more. I think the first two examples and the third don't take the same type of parameters or carry the same meaning (step vs. seq). So @Guillermo-Winkler's answer solves part of the problem, and your-fn covers the last example.

find all ordered triples of distinct positive integers i, j, and k less than or equal to a given integer n that sum to a given integer s

This is exercise 2.41 in SICP.
I wrote this naive version myself:
(defn sum-three [n s]
(for [i (range n)
j (range n)
k (range n)
:when (and (= s (+ i j k))
(< 1 k j i n))]
[i j k]))
The question is: is this considered idiomatic in Clojure? And how can I optimize this piece of code, since it takes forever to compute (sum-three 500 500)?
Also, how can I have this function take an extra argument to specify the number of integers to sum? So instead of a sum of three, it should handle the more general case: a sum of two, four, five, etc.
I suppose this cannot be achieved with a for loop? I'm not sure how to add the i j k bindings dynamically.

(Update: The fully optimized version is sum-c-opt at the bottom.)
I'd say it is idiomatic, if not the fastest way to do it while staying idiomatic. Well, perhaps using == in place of = when the inputs are known to be numbers would be more idiomatic (NB. these are not entirely equivalent on numbers; it doesn't matter here though.)
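To make the difference between the two concrete, here is a small illustration: = also takes the numeric category into account, while == compares numeric value only.

```clojure
(= 1 1.0)   ;=> false, a long and a double are in different categories
(== 1 1.0)  ;=> true, same numeric value
(= 2 2)     ;=> true, so for integer inputs like these the two agree
```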
As a first optimization pass, you could start the ranges higher up and replace = with the number-specific ==:
(defn sum-three [n s]
(for [k (range n)
j (range (inc k) n)
i (range (inc j) n)
:when (== s (+ i j k))]
[i j k]))
(Changed ordering of the bindings since you want the smallest number last.)
As for making the number of integers a parameter, here's one approach:
(defn sum-c [c n s]
(letfn [(go [c n s b]
(if (zero? c)
[[]]
(for [i (range b n)
is (go (dec c) n (- s i) (inc i))
:when (== s (apply + i is))]
(conj is i))))]
(go c n s 0)))
;; from the REPL:
user=> (sum-c 3 6 10)
([5 4 1] [5 3 2])
user=> (sum-c 3 7 10)
([6 4 0] [6 3 1] [5 4 1] [5 3 2])
Update: Rather spoils the exercise to use it, but math.combinatorics provides a combinations function which is tailor-made to solve this problem:
(require '[clojure.math.combinatorics :as c])
(c/combinations (range 10) 3)
;=> all combinations of 3 distinct numbers less than 10;
; will be returned as lists, but in fact will also be distinct
; as sets, so no (0 1 2) / (2 1 0) "duplicates modulo ordering";
; it also so happens that the individual lists will maintain the
; relative ordering of elements from the input, although the docs
; don't guarantee this
Filter the output appropriately.
A further update: Thinking through the way sum-c above works gives one a further optimization idea. The point of the inner go function inside sum-c was to produce a seq of tuples summing up to a certain target value (its initial target minus the value of i at the current iteration in the for comprehension); yet we still validate the sums of the tuples returned from the recursive calls to go as if we were unsure whether they actually do their job.
Instead, we can make sure that the tuples produced are the correct ones by construction:
(defn sum-c-opt [c n s]
(let [m (max 0 (- s (* (dec c) (dec n))))]
(if (>= m n)
()
(letfn [(go [c s t]
(if (zero? c)
(list t)
(mapcat #(go (dec c) (- s %) (conj t %))
(range (max (inc (peek t))
(- s (* (dec c) (dec n))))
(min n (inc s))))))]
(mapcat #(go (dec c) (- s %) (list %)) (range m n))))))
This version returns the tuples as lists so as to preserve the expected ordering of results while maintaining code structure which is natural given this approach. You can convert them to vectors with a map vec pass.
For small values of the arguments, this will actually be slower than sum-c, but for larger values, it is much faster:
user> (time (last (sum-c-opt 3 500 500)))
"Elapsed time: 88.110716 msecs"
(168 167 165)
user> (time (last (sum-c 3 500 500)))
"Elapsed time: 13792.312323 msecs"
[168 167 165]
And just for added assurance that it does the same thing (beyond inductively proving correctness in both cases):
; NB. this illustrates Clojure's notion of equality as applied
; to vectors and lists
user> (= (sum-c 3 100 100) (sum-c-opt 3 100 100))
true
user> (= (sum-c 4 50 50) (sum-c-opt 4 50 50))
true

for is a macro so it's hard to extend your nice idiomatic answer to cover the general case. Fortunately clojure.math.combinatorics provides the cartesian-product function that will produce all the combinations of the sets of numbers. Which reduces the problem to filter the combinations:
(ns hello.core
(:require [clojure.math.combinatorics :as combo]))
(defn sum-three [n s i]
(filter #(= s (reduce + %))
(apply combo/cartesian-product (repeat i (range 1 (inc n))))))
hello.core> (sum-three 7 10 3)
((1 2 7) (1 3 6) (1 4 5) (1 5 4) (1 6 3) (1 7 2) (2 1 7)
(2 2 6) (2 3 5) (2 4 4) (2 5 3) (2 6 2) (2 7 1) (3 1 6)
(3 2 5) (3 3 4) (3 4 3) (3 5 2) (3 6 1) (4 1 5) (4 2 4)
(4 3 3) (4 4 2) (4 5 1) (5 1 4) (5 2 3) (5 3 2) (5 4 1)
(6 1 3) (6 2 2) (6 3 1) (7 1 2) (7 2 1))
assuming that order matters in the answers that is

To parameterize your existing code you can use reduce. The code below shows a pattern you can use whenever you want to parameterize the number of bindings in a for macro usage.
Your code without using for macro (using only functions) would be:
(defn sum-three [n s]
(mapcat (fn [i]
(mapcat (fn [j]
(filter (fn [[i j k]]
(and (= s (+ i j k))
(< 1 k j i n)))
(map (fn [k] [i j k]) (range n))))
(range n)))
(range n)))
The pattern is visible: there is an innermost map wrapped by an outer mapcat, and so on. You want to parameterize the nesting level, hence:
(defn sum-c [c n s]
((reduce (fn [s _]
(fn [& i] (mapcat #(apply s (concat i [%])) (range n))))
(fn [& i] (filter #(and (= s (apply + %))
(apply < 1 (reverse %)))
(map #(concat i [%]) (range n))))
(range (dec c)))))

Fast Prime Number Generation in Clojure

I've been working on solving Project Euler problems in Clojure to get better, and I've already run into prime number generation a couple of times. My problem is that it is just taking way too long. I was hoping someone could help me find an efficient way to do this in a Clojure-y way.
When I first did this, I brute-forced it. That was easy to do. But calculating 10001 prime numbers took 2 minutes this way on a Xeon 2.33GHz, too long for the rules, and too long in general. Here was the algorithm:
(defn next-prime-slow
"Find the next prime number, checking against our already existing list"
([sofar guess]
(if (not-any? #(zero? (mod guess %)) sofar)
guess ; Then we have a prime
(recur sofar (+ guess 2))))) ; Try again
(defn find-primes-slow
"Finds prime numbers, slowly"
([]
(find-primes-slow 10001 [2 3])) ; How many we need, initial prime seeds
([needed sofar]
(if (<= needed (count sofar))
sofar ; Found enough, we're done
(recur needed (concat sofar [(next-prime-slow sofar (last sofar))])))))
By replacing next-prime-slow with a newer routine that took some additional rules into account (like the 6n +/- 1 property) I was able to speed things up to about 70 seconds.
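The 6n +/- 1 rule follows from the fact that every prime greater than 3 leaves remainder 1 or 5 modulo 6, so trial divisors can step by 6 and only need to go up to the square root of the candidate. The routine that reached 70 seconds is not shown here, so the following is only a sketch of the idea, with prime-6k? as an illustrative name:

```clojure
(defn prime-6k?
  "Trial division that only tries 2, 3 and divisors of the form 6k - 1 and 6k + 1."
  [n]
  (cond
    (< n 2) false
    (< n 4) true                                  ; 2 and 3
    (or (even? n) (zero? (rem n 3))) false
    :else (loop [d 5]                             ; d = 6k - 1, d + 2 = 6k + 1
            (cond
              (> (* d d) n) true                  ; no divisor up to sqrt(n)
              (or (zero? (rem n d))
                  (zero? (rem n (+ d 2)))) false
              :else (recur (+ d 6))))))

(take 10 (filter prime-6k? (range)))
;; => (2 3 5 7 11 13 17 19 23 29)
```

Unlike next-prime-slow, this does not scan the whole list of primes found so far for each candidate; stopping at the square root is where most of the saving comes from.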
Next I tried making a sieve of Eratosthenes in pure Clojure. I don't think I got all the bugs out, but I gave up because it was simply way too slow (even worse than the above, I think).
(defn clean-sieve
"Clean the sieve of what we know isn't prime based"
[seeds-left sieve]
(if (zero? (count seeds-left))
sieve ; Nothing left to filter the list against
(recur
(rest seeds-left) ; The numbers we haven't checked against
(filter #(> (mod % (first seeds-left)) 0) sieve)))) ; Filter out multiples
(defn self-clean-sieve ; This seems to be REALLY slow
"Remove the stuff in the sieve that isn't prime based on it's self"
([sieve]
(self-clean-sieve (rest sieve) (take 1 sieve)))
([sieve clean]
(if (zero? (count sieve))
clean
(let [cleaned (filter #(> (mod % (last clean)) 0) sieve)]
(recur (rest cleaned) (into clean [(first cleaned)]))))))
(defn find-primes
"Finds prime numbers, hopefully faster"
([]
(find-primes 10001 [2]))
([needed seeds]
(if (>= (count seeds) needed)
seeds ; We have enough
(recur ; Recalculate
needed
(into
seeds ; Stuff we've already found
(let [start (last seeds)
end-range (+ start 150000)] ; NOTE HERE
(reverse
(self-clean-sieve
(clean-sieve seeds (range (inc start) end-range))))))))))
This is bad. It also causes stack overflows if the number 150000 is smaller. This despite the fact I'm using recur. That may be my fault.
Next I tried a sieve, using Java methods on a Java ArrayList. That took quite a bit of time, and memory.
My latest attempt is a sieve using a Clojure hash-map, inserting all the numbers in the sieve then dissoc'ing numbers that aren't prime. At the end, it takes the key list, which are the prime numbers it found. It takes about 10-12 seconds to find 10000 prime numbers. I'm not sure it's fully debugged yet. It's recursive too (using recur and loop), since I'm trying to be Lispy.
So with these kind of problems, problem 10 (sum up all primes under 2000000) is killing me. My fastest code came up with the right answer, but it took 105 seconds to do it, and needed quite a bit of memory (I gave it 512 MB just so I wouldn't have to fuss with it). My other algorithms take so long I always ended up stopping them first.
I could use a sieve to calculate that many primes in Java or C quite fast and without using so much memory. I know I must be missing something in my Clojure/Lisp style that's causing the problem.
Is there something I'm doing really wrong? Is Clojure just kinda slow with large sequences? Reading some of the Project Euler discussions, people have calculated the first 10000 primes in other Lisps in under 100 milliseconds. I realize the JVM may slow things down and Clojure is relatively young, but I wouldn't expect a 100x difference.
Can someone enlighten me on a fast way to calculate prime numbers in Clojure?

Here's another approach that celebrates Clojure's Java interop. This takes 374ms on a 2.4 Ghz Core 2 Duo (running single-threaded). I let the efficient Miller-Rabin implementation in Java's BigInteger#isProbablePrime deal with the primality check.
(def certainty 5)
(defn prime? [n]
(.isProbablePrime (BigInteger/valueOf n) certainty))
(concat [2] (take 10001
(filter prime?
(take-nth 2
(range 1 Integer/MAX_VALUE)))))
The Miller-Rabin certainty of 5 is probably not very good for numbers much larger than this. That certainty is equal to 96.875% certain it's prime (1 - .5^certainty)
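The arithmetic behind that figure can be checked directly. The helper name certainty->probability is made up for this illustration; it simply evaluates 1 - 0.5^certainty, the lower bound on the probability that a number reported as prime really is prime:

```clojure
(defn certainty->probability
  "Lower bound on the probability of primality for a given certainty."
  [certainty]
  (- 1 (Math/pow 0.5 (double certainty))))

(certainty->probability 5)   ;=> 0.96875
(certainty->probability 10)  ;=> 0.9990234375
```

Raising the certainty tightens the bound quickly, at the price of more Miller-Rabin rounds per candidate.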

I realize this is a very old question, but I recently ended up looking for the same and the links here weren't what I'm looking for (restricted to functional types as much as possible, lazily generating ~every~ prime I want).
I stumbled upon a nice F# implementation, so all credits are his. I merely ported it to Clojure:
(defn gen-primes "Generates an infinite, lazy sequence of prime numbers"
[]
(letfn [(reinsert [table x prime]
(update-in table [(+ prime x)] conj prime))
(primes-step [table d]
(if-let [factors (get table d)]
(recur (reduce #(reinsert %1 d %2) (dissoc table d) factors)
(inc d))
(lazy-seq (cons d (primes-step (assoc table (* d d) (list d))
(inc d))))))]
(primes-step {} 2)))
Usage is simply
(take 5 (gen-primes))

Very late to the party, but I'll throw in an example, using Java BitSets:
(defn sieve
"Returns a BitSet with bits set for each prime less than n"
[n]
(let [bs (new java.util.BitSet n)]
(.flip bs 2 n)
(doseq [i (range 4 n 2)] (.clear bs i))
(doseq [p (range 3 (Math/sqrt n))]
(if (.get bs p)
(doseq [q (range (* p p) n (* 2 p))] (.clear bs q))))
bs))
Running this on a 2014 Macbook Pro (2.3GHz Core i7), I get:
user=> (time (do (sieve 1e6) nil))
"Elapsed time: 64.936 msecs"

See the last example here:
http://clojuredocs.org/clojure_core/clojure.core/lazy-seq
;; An example combining lazy sequences with higher order functions
;; Generate prime numbers using Eratosthenes Sieve
;; See http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
;; Note that the starting set of sieved numbers should be
;; the set of integers starting with 2 i.e., (iterate inc 2)
(defn sieve [s]
(cons (first s)
(lazy-seq (sieve (filter #(not= 0 (mod % (first s)))
(rest s))))))
user=> (take 20 (sieve (iterate inc 2)))
(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71)

Here's a nice and simple implementation:
http://clj-me.blogspot.com/2008/06/primes.html
... but it is written for some pre-1.0 version of Clojure. See lazy_seqs in Clojure Contrib for one that works with the current version of the language.

(defn sieve
[[p & rst]]
;; make sure the stack size is sufficiently large!
(lazy-seq (cons p (sieve (remove #(= 0 (mod % p)) rst)))))
(def primes (sieve (iterate inc 2)))
With a 10M stack size, I get the 1001th prime in ~33 seconds on a 2.1GHz MacBook.

So I've just started with Clojure, and yeah, this comes up a lot on Project Euler doesn't it? I wrote a pretty fast trial division prime algorithm, but it doesn't really scale too far before each run of divisions becomes prohibitively slow.
So I started again, this time using the sieve method:
(defn clense
"Walks through the sieve and nils out multiples of step"
[primes step i]
(if (<= i (count primes))
(recur
(assoc! primes i nil)
step
(+ i step))
primes))
(defn sieve-step
"Only works if i is >= 3"
[primes i]
(if (< i (count primes))
(recur
(if (nil? (primes i)) primes (clense primes (* 2 i) (* i i)))
(+ 2 i))
primes))
(defn prime-sieve
"Returns a lazy list of all primes smaller than x"
[x]
(drop 2
(filter (complement nil?)
(persistent! (sieve-step
(clense (transient (vec (range x))) 2 4) 3)))))
Usage and speed:
user=> (time (do (prime-sieve 1E6) nil))
"Elapsed time: 930.881 msecs"
I'm pretty happy with the speed: it's running out of a REPL running on a 2009 MBP. It's mostly fast because I completely eschew idiomatic Clojure and instead loop around like a monkey. It's also 4X faster because I'm using a transient vector to work on the sieve instead of staying completely immutable.
Edit: After a couple of suggestions / bug fixes from Will Ness it now runs a whole lot faster.

Here's a simple sieve in Scheme:
http://telegraphics.com.au/svn/puzzles/trunk/programming-in-scheme/primes-up-to.scm
Here's a run for primes up to 10,000:
#;1> (include "primes-up-to.scm")
; including primes-up-to.scm ...
#;2> ,t (primes-up-to 10000)
0.238s CPU time, 0.062s GC time (major), 180013 mutations, 130/4758 GCs (major/minor)
(2 3 5 7 11 13...

Here is a Clojure solution. i is the current number being considered and p is a list of all prime numbers found so far. If division by some prime numbers has a remainder of zero, the number i is not a prime number and recursion occurs with the next number. Otherwise the prime number is added to p in the next recursion (as well as continuing with the next number).
(defn primes [i p]
(if (some #(zero? (mod i %)) p)
(recur (inc i) p)
(cons i (lazy-seq (primes (inc i) (conj p i))))))
(time (do (doall (take 5001 (primes 2 []))) nil))
; Elapsed time: 2004.75587 msecs
(time (do (doall (take 10001 (primes 2 []))) nil))
; Elapsed time: 7700.675118 msecs
Update:
Here is a much slicker solution based on this answer above.
Basically the list of integers starting with two is filtered lazily. Filtering is performed by only accepting a number i if there is no prime number dividing the number with remainder of zero. All prime numbers are tried where the square of the prime number is less or equal to i.
Note that primes is used recursively but Clojure manages to prevent endless recursion. Also note that the lazy sequence primes caches results (that's why the performance results are a bit counter intuitive at first sight).
(def primes
(lazy-seq
(filter (fn [i] (not-any? #(zero? (rem i %))
(take-while #(<= (* % %) i) primes)))
(drop 2 (range)))))
(time (first (drop 10000 primes)))
; Elapsed time: 542.204211 msecs
(time (first (drop 20000 primes)))
; Elapsed time: 786.667644 msecs
(time (first (drop 40000 primes)))
; Elapsed time: 1780.15807 msecs
(time (first (drop 40000 primes)))
; Elapsed time: 8.415643 msecs

Based on Will's comment, here is my take on postponed-primes:
(defn postponed-primes-recursive
([]
(concat (list 2 3 5 7)
(lazy-seq (postponed-primes-recursive
{}
3
9
(rest (rest (postponed-primes-recursive)))
9))))
([D p q ps c]
(letfn [(add-composites
[D x s]
(loop [a x]
(if (contains? D a)
(recur (+ a s))
(persistent! (assoc! (transient D) a s)))))]
(loop [D D
p p
q q
ps ps
c c]
(if (not (contains? D c))
(if (< c q)
(cons c (lazy-seq (postponed-primes-recursive D p q ps (+ 2 c))))
(recur (add-composites D
(+ c (* 2 p))
(* 2 p))
(first ps)
(* (first ps) (first ps))
(rest ps)
(+ c 2)))
(let [s (get D c)]
(recur (add-composites
(persistent! (dissoc! (transient D) c))
(+ c s)
s)
p
q
ps
(+ c 2))))))))
Initial submission for comparison:
Here is my attempt to port this prime number generator from Python to Clojure. The below returns an infinite lazy sequence.
(defn primes
[]
(letfn [(prime-help
[foo bar]
(loop [D foo
q bar]
(if (nil? (get D q))
(cons q (lazy-seq
(prime-help
(persistent! (assoc! (transient D) (* q q) (list q)))
(inc q))))
(let [factors-of-q (get D q)
key-val (interleave
(map #(+ % q) factors-of-q)
(map #(cons % (get D (+ % q) (list)))
factors-of-q))]
(recur (persistent!
(dissoc!
(apply assoc! (transient D) key-val)
q))
(inc q))))))]
(prime-help {} 2)))
Usage:
user=> (first (primes))
2
user=> (second (primes))
3
user=> (nth (primes) 100)
547
user=> (take 5 (primes))
(2 3 5 7 11)
user=> (time (nth (primes) 10000))
"Elapsed time: 409.052221 msecs"
104743
edit:
Performance comparison, where postponed-primes uses a queue of primes seen so far rather than a recursive call to postponed-primes:
user=> (def counts (list 200000 400000 600000 800000))
#'user/counts
user=> (map #(time (nth (postponed-primes) %)) counts)
("Elapsed time: 1822.882 msecs"
"Elapsed time: 3985.299 msecs"
"Elapsed time: 6916.98 msecs"
"Elapsed time: 8710.791 msecs"
2750161 5800139 8960467 12195263)
user=> (map #(time (nth (postponed-primes-recursive) %)) counts)
("Elapsed time: 1776.843 msecs"
"Elapsed time: 3874.125 msecs"
"Elapsed time: 6092.79 msecs"
"Elapsed time: 8453.017 msecs"
2750161 5800139 8960467 12195263)

Idiomatic, and not too bad
(def primes
(cons 1 (lazy-seq
(filter (fn [i]
(not-any? (fn [p] (zero? (rem i p)))
(take-while #(<= % (Math/sqrt i))
(rest primes))))
(drop 2 (range))))))
=> #'user/primes
(first (time (drop 10000 primes)))
"Elapsed time: 0.023135 msecs"
=> 104729
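Note that in the timing above, time wraps only (drop 10000 primes), which returns an unrealized lazy sequence. The primes are actually computed by first, outside the timed form, so 0.023 msecs is not the cost of generating them; (time (first (drop 10000 primes))) would measure the real work. A small sketch of the effect, using an illustrative counter instead of the primes seq:

```clojure
(def realized (atom 0))

;; Each realized element bumps the counter.
(def xs (map (fn [x] (swap! realized inc) x) (range 1000)))

(def dropped (drop 100 xs)) ; lazy: no element has been computed yet
(def before @realized)      ; 0

(def head (first dropped))  ; forces realization up to the first kept element
(def after @realized)       ; at least 101 (more when the seq is chunked)
```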

From: http://steloflute.tistory.com/entry/Clojure-%ED%94%84%EB%A1%9C%EA%B7%B8%EB%9E%A8-%EC%B5%9C%EC%A0%81%ED%99%94
Using Java array
(defmacro loopwhile [init-symbol init whilep step & body]
`(loop [~init-symbol ~init]
(when ~whilep ~@body (recur (+ ~init-symbol ~step)))))
(defn primesUnderb [limit]
(let [p (boolean-array limit true)]
(loopwhile i 2 (< i (Math/sqrt limit)) 1
(when (aget p i)
(loopwhile j (* i 2) (< j limit) i (aset p j false))))
(filter #(aget p %) (range 2 limit))))
Usage and speed:
user=> (time (def p (primesUnderb 1e6)))
"Elapsed time: 104.065891 msecs"

After coming to this thread and searching for a faster alternative to those already here, I am surprised nobody linked to the following article by Christophe Grand:
(defn primes3 [max]
(let [enqueue (fn [sieve n factor]
(let [m (+ n (+ factor factor))]
(if (sieve m)
(recur sieve m factor)
(assoc sieve m factor))))
next-sieve (fn [sieve candidate]
(if-let [factor (sieve candidate)]
(-> sieve
(dissoc candidate)
(enqueue candidate factor))
(enqueue sieve candidate candidate)))]
(cons 2 (vals (reduce next-sieve {} (range 3 max 2))))))
As well as a lazy version :
(defn lazy-primes3 []
(letfn [(enqueue [sieve n step]
(let [m (+ n step)]
(if (sieve m)
(recur sieve m step)
(assoc sieve m step))))
(next-sieve [sieve candidate]
(if-let [step (sieve candidate)]
(-> sieve
(dissoc candidate)
(enqueue candidate step))
(enqueue sieve candidate (+ candidate candidate))))
(next-primes [sieve candidate]
(if (sieve candidate)
(recur (next-sieve sieve candidate) (+ candidate 2))
(cons candidate
(lazy-seq (next-primes (next-sieve sieve candidate)
(+ candidate 2))))))]
(cons 2 (lazy-seq (next-primes {} 3)))))

Plenty of answers already, but I have an alternative solution which generates an infinite sequence of primes. I was also interested in benchmarking a few solutions.
First, some Java interop for reference:
(defn primes-fn-1 [accuracy]
(cons 2
(for [i (range)
:let [prime-candidate (-> i (* 2) (+ 3))]
:when (.isProbablePrime (BigInteger/valueOf prime-candidate) accuracy)]
prime-candidate)))
Benjamin @ https://stackoverflow.com/a/7625207/3731823 is primes-fn-2
nha @ https://stackoverflow.com/a/36432061/3731823 is primes-fn-3
My implementation is primes-fn-4:
(defn primes-fn-4 []
(let [primes-with-duplicates
(->> (for [i (range)] (-> i (* 2) (+ 5))) ; 5, 7, 9, 11, ...
(reductions
(fn [known-primes candidate]
(if (->> known-primes
(take-while #(<= (* % %) candidate))
(not-any? #(-> candidate (mod %) zero?)))
(conj known-primes candidate)
known-primes))
[3]) ; Our initial list of known odd primes
(cons [2]) ; Put in the non-odd one
(map (comp first rseq)))] ; O(1) lookup of the last element of the vec "known-primes"
; Ugh, ugly de-duplication :(
(->> (map #(when (not= % %2) %) primes-with-duplicates (rest primes-with-duplicates))
(remove nil?))))
Reported numbers (time in milliseconds to count the first N primes) are the fastest from a run of 5, with no JVM restarts between experiments, so your mileage may vary:
                    1e6     3e6
(primes-fn-1 5)     808    2664
(primes-fn-1 10)    952    3198
(primes-fn-1 20)   1440    4742
(primes-fn-1 30)   1881    6030
(primes-fn-2)      1868    5922
(primes-fn-3)       489    1755  <-- WOW!
(primes-fn-4)      2024    8185
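The harness used for these numbers is not shown. As an assumption about how a "fastest of 5" figure could be collected, here is a minimal sketch; best-of is a hypothetical helper, not the code that produced the table:

```clojure
(defn best-of
  "Runs the zero-argument function f `runs` times and returns the
  fastest wall-clock time in milliseconds."
  [runs f]
  (apply min
         (repeatedly runs
                     #(let [t0 (System/nanoTime)]
                        (f)
                        (/ (- (System/nanoTime) t0) 1e6)))))

;; Example: time realizing the first 1000 elements of a lazy seq.
(best-of 5 #(count (take 1000 (iterate inc 2))))
```

Taking the minimum rather than the mean reduces the influence of JIT warm-up and garbage collection pauses, though it does not remove them.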

If you don't need a lazy solution and you just want a sequence of primes below a certain limit, the straightforward implementation of the Sieve of Eratosthenes is pretty fast. Here's my version using transients:
(defn classic-sieve
"Returns sequence of primes less than N"
[n]
(loop [nums (transient (vec (range n))) i 2]
(cond
(> (* i i) n) (remove nil? (nnext (persistent! nums)))
(nums i) (recur (loop [nums nums j (* i i)]
(if (< j n)
(recur (assoc! nums j nil) (+ j i))
nums))
(inc i))
:else (recur nums (inc i)))))

I just started using Clojure so I don't know if it's good but here is my solution:
(defn divides? [x i]
(zero? (mod x i)))
(defn factors [x]
(flatten (map #(list % (/ x %))
(filter #(divides? x %)
(range 1 (inc (Math/floor (Math/sqrt x))))))))
(defn prime? [x]
(empty? (filter #(and (divides? x %) (not= x %) (not= 1 %))
(factors x))))
(def primes
(filter prime? (range 2 java.lang.Integer/MAX_VALUE)))
(defn sum-of-primes-below [n]
(reduce + (take-while #(< % n) primes)))
