Maximum subarray algorithm in clojure - clojure

Kadane's algorithm solves the maximum subarray problem. I'm trying to learn Clojure, so I came up with this implementation:
(defn max-subarray [xs]
  (last
    (reduce
      (fn [[here sofar] x]
        (let [new-here (max 0 (+ here x))]
          [new-here (max new-here sofar)]))
      [0 0]
      xs)))
This seems really verbose. Is there a cleaner way to implement this algorithm in Clojure?

As I said in a comment on the question, I believe the OP's approach is optimal. That's given the fully general problem in which the input is a seqable of arbitrary numbers.
However, if the requirement were added that the input should be a collection of longs (or doubles; other primitives are fine too, as long as we're not mixing integers with floating-point numbers), a loop/recur-based solution could be made significantly faster by taking advantage of primitive arithmetic:
(defn max-subarray-prim [xs]
  (loop [xs (seq xs) here 0 so-far 0]
    (if xs
      (let [x (long (first xs))
            new-here (max 0 (+ here x))]
        (recur (next xs) new-here (max new-here so-far)))
      so-far)))
This is actually quite readable to my eye, though I do prefer reduce where there is no particular reason to use loop / recur. The hope now is that loop's ability to keep here and so-far unboxed throughout the loop's execution will make enough of a difference performance-wise.
To benchmark this, I generated a vector of 100000 random integers from the range -50000, ..., 49999:
(def xs (vec (repeatedly 100000 #(- (rand-int 100000) 50000))))
Sanity check (max-subarray-orig refers to the OP's implementation):
(= (max-subarray-orig xs) (max-subarray-prim xs))
;= true
Criterium benchmarks:
(do (c/bench (max-subarray-orig xs))
    (flush)
    (c/bench (max-subarray-prim xs)))
WARNING: Final GC required 3.8238570080506156 % of runtime
Evaluation count : 11460 in 60 samples of 191 calls.
Execution time mean : 5.295551 ms
Execution time std-deviation : 97.329399 µs
Execution time lower quantile : 5.106146 ms ( 2.5%)
Execution time upper quantile : 5.456003 ms (97.5%)
Overhead used : 2.038603 ns

Evaluation count : 28560 in 60 samples of 476 calls.
Execution time mean : 2.121256 ms
Execution time std-deviation : 42.014943 µs
Execution time lower quantile : 2.045558 ms ( 2.5%)
Execution time upper quantile : 2.206587 ms (97.5%)
Overhead used : 2.038603 ns
Found 5 outliers in 60 samples (8.3333 %)
low-severe 1 (1.6667 %)
low-mild 4 (6.6667 %)
Variance from outliers : 7.8724 % Variance is slightly inflated by outliers
So that's a jump from ~5.29 ms to ~2.12 ms per call.
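Going one step further (an extra assumption beyond the original question, and not benchmarked here): if the input can be required to be a long-array rather than a general seqable, indexed access with aget avoids seq traversal entirely. A sketch, with max-subarray-arr as an illustrative name:

```clojure
;; Sketch: Kadane's algorithm over a primitive long-array. Assumes the
;; caller supplies a long-array; a hypothetical further step beyond the
;; loop/recur version above, not benchmarked here.
(defn max-subarray-arr [^longs xs]
  (let [n (alength xs)]
    (loop [i 0 here 0 so-far 0]
      (if (< i n)
        (let [new-here (max 0 (+ here (aget xs i)))]
          (recur (inc i) new-here (max new-here so-far)))
        so-far))))

(max-subarray-arr (long-array [0 -1 1 2 -4 3])) ;=> 3
```

The ^longs hint lets alength and aget compile to direct array operations, so the whole loop runs on unboxed longs.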

Here it is using loop and recur, to mimic the example on the Wikipedia page more closely.
user> (defn max-subarray [xs]
        (loop [here 0 sofar 0 ar xs]
          (if (not (empty? ar))
            (let [x (first ar) new-here (max 0 (+ here x))]
              (recur new-here (max new-here sofar) (rest ar)))
            sofar)))
#'user/max-subarray
user> (max-subarray [0 -1 1 2 -4 3])
3
Some people may find this easier to follow, others prefer reduce or map.
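One more compact variant, sketched along the same lines as the OP's reduce: the running "best sum ending here" values of Kadane's recurrence form a scan, which reductions expresses directly, and the answer is the maximum of that scan:

```clojure
;; Sketch: the running "max ending here" values are a scan over the input;
;; the answer is the maximum of that scan (the initial 0 covers the empty case).
(defn max-subarray-scan [xs]
  (reduce max 0 (reductions (fn [here x] (max 0 (+ here x))) 0 xs)))

(max-subarray-scan [0 -1 1 2 -4 3]) ;=> 3
```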

Related

Why is my transducer function slower than using ->> operator?

While solving a problem from HackerRank (https://www.hackerrank.com/challenges/string-compression/problem) I've written 2 implementations, with and without transducers.
I was expecting the transducer implementation to be faster than the function-chaining operator ->>. Unfortunately, according to my mini-benchmark the chaining operator was outperforming the transducer by 2.5 times.
I was thinking that I should use transducers wherever possible. Or have I misunderstood the concept of transducers?
Time:
"Elapsed time: 0.844459 msecs"
"Elapsed time: 2.697836 msecs"
Code:
(defn string-compression-2
  [s]
  (->> s
       (partition-by identity)
       (mapcat #(if (> (count %) 1)
                  (list (first %) (count %))
                  (list (first %))))
       (apply str)))

(def xform-str-compr
  (comp (partition-by identity)
        (mapcat #(if (> (count %) 1)
                   (list (first %) (count %))
                   (list (first %))))))

(defn string-compression-3
  [s]
  (transduce xform-str-compr str s))

(time (string-compression-2 "aaabccdddd"))
(time (string-compression-3 "aaabccdddd"))
The transducer version does seem to be faster, according to Criterium:
(crit/quick-bench (string-compression-2 "aaabccdddd"))
Execution time mean : 6.150477 µs
Execution time std-deviation : 246.740784 ns
Execution time lower quantile : 5.769961 µs ( 2.5%)
Execution time upper quantile : 6.398563 µs (97.5%)
Overhead used : 1.620718 ns
(crit/quick-bench (string-compression-3 "aaabccdddd"))
Execution time mean : 2.533919 µs
Execution time std-deviation : 157.594154 ns
Execution time lower quantile : 2.341610 µs ( 2.5%)
Execution time upper quantile : 2.704182 µs (97.5%)
Overhead used : 1.620718 ns
As coredump commented, a sample size of one is not enough to say whether one approach is generally faster than the other.
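When Criterium isn't handy, even a crude average over many runs after a JIT warm-up is far more trustworthy than a single (time ...) call. A rough sketch (avg-time-ms is an illustrative name):

```clojure
;; Rough sketch: warm the JIT, then average many runs. Far cruder than
;; Criterium, but much better than timing a single call.
(defmacro avg-time-ms [n expr]
  `(do (dotimes [_# 1000] ~expr)                  ; warm-up pass for the JIT
       (let [t0# (System/nanoTime)]
         (dotimes [_# ~n] ~expr)
         (/ (- (System/nanoTime) t0#) 1e6 ~n))))  ; mean milliseconds per run
```

Usage: (avg-time-ms 1000 (string-compression-3 "aaabccdddd")).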

Why is my Clojure prime number lazy sequence so slow?

I'm doing Problem 7 of Project Euler (calculate the 10001st prime). I have coded a solution in the form of a lazy sequence, but it is super slow, whereas another solution I found on the web (link below), which does essentially the same thing, takes less than a second.
I'm new to Clojure and lazy sequences, so my usage of take-while, lazy-cat, rest, or map may be the culprit. Could you please look at my code and tell me if you see anything?
The solution that runs under a second is here:
https://zach.se/project-euler-solutions/7/
It doesn't use lazy sequences. I'd like to know why it's so fast while mine is so slow (the process they follow is similar).
My solution which is super slow:
(def primes
  (letfn [(getnextprime [largestprimesofar]
            (let [primessofar (concat (take-while #(not= largestprimesofar %) primes) [largestprimesofar])]
              (loop [n (+ (last primessofar) 2)]
                (if
                  (loop [primessofarnottriedyet (rest primessofar)]
                    (if (= 0 (count primessofarnottriedyet))
                      true
                      (if (= 0 (rem n (first primessofarnottriedyet)))
                        false
                        (recur (rest primessofarnottriedyet)))))
                  n
                  (recur (+ n 2))))))]
    (lazy-cat '(2 3) (map getnextprime (rest primes)))))
To try it, just load it and run something like (take 10000 primes), but use Ctrl+C to kill the process, because it is too slow. However, if you try (take 100 primes), you should get an instant answer.
Let me re-write your code just a bit to break it down into pieces that will be easier to discuss. I'm using your same algorithm, I'm just splitting out some of the inner forms into separate functions.
(declare primes) ;; declare this up front so we can refer to it below

(defn is-relatively-prime? [n candidates]
  (if (= 0 (count candidates))
    true
    (if (zero? (rem n (first candidates)))
      false
      (is-relatively-prime? n (rest candidates)))))

(defn get-next-prime [largest-prime-so-far]
  (let [primes-so-far (concat (take-while #(not= largest-prime-so-far %) primes) [largest-prime-so-far])]
    (loop [n (+ (last primes-so-far) 2)]
      (if (is-relatively-prime? n (rest primes-so-far))
        n
        (recur (+ n 2))))))

(def primes
  (lazy-cat '(2 3) (map get-next-prime (rest primes))))

(time (let [p (doall (take 200 primes))]))
That last line is just to make it easier to get some really rough benchmarks in the REPL. By making the timing statement part of the source file, I can keep re-loading the source, and get a fresh benchmark each time. If I just load the file once, and keep trying to do (take 500 primes) the benchmark will be skewed because primes will hold on to the primes it has already calculated. I also need the doall because I'm pulling my prime numbers inside a let statement, and if I don't use doall, it will just store the lazy sequence in p, instead of actually calculating the primes.
Now, let's get some base values. On my PC, I get this:
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 274.492597 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 293.673962 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 322.035034 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 285.29596 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 224.311828 msecs"
So about 275 milliseconds, give or take 50. My first suspicion is how we're getting primes-so-far in the let statement inside get-next-prime. We're walking through the complete list of primes (as far as we have it) until we get to one that's equal to the largest prime so far. The way we've structured our code, however, all the primes are already in order, so we're effectively walking through all the primes except the last, and then concatenating the last value. We end up with exactly the same values as have been realized so far in the primes sequence, so we can skip that whole step and just use primes. That should save us something.
My next suspicion is the call to (last primes-so-far) in the loop. When we use the last function on a sequence, it also walks the list from the head down to the tail (or at least, that's my understanding -- I wouldn't put it past the Clojure compiler writers to have snuck in some special-case code to speed things up.) But again, we don't need it. We're calling get-next-prime with largest-prime-so-far, and since our primes are in order, that's already the last of the primes as far as we've realized them, so we can just use largest-prime-so-far instead of (last primes-so-far). That will give us this:
(defn get-next-prime [largest-prime-so-far]
  ;; deleted the let statement since we don't need it
  (loop [n (+ largest-prime-so-far 2)]
    (if (is-relatively-prime? n (rest primes))
      n
      (recur (+ n 2)))))
That seems like it should speed things up, since we've eliminated two complete walks through the primes sequence. Let's try it.
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 242.130691 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 223.200787 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 287.63579 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 244.927825 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 274.146199 msecs"
Hmm, maybe slightly better (?), but not nearly the improvement I expected. Let's look at the code for is-relatively-prime? (as I've re-written it). And the first thing that jumps out at me is the count function. The primes sequence is a sequence, not a vector, which means the count function also has to walk the complete list to add up how many elements are in it. What's worse, if we start with a list of, say, 10 candidates, it walks all ten the first time through the loop, then walks the nine remaining candidates on the next loop, then the 8 remaining, and so on. As the number of primes gets larger, we're going to spend more and more time in the count function, so maybe that's our bottleneck.
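That suspicion is easy to confirm at the REPL: counted? reports whether a collection knows its size in O(1). Vectors do; the lazy seqs produced by concat, filter, and friends do not, so count has to walk them every time it is called:

```clojure
;; Vectors carry their count; lazy seqs (e.g. from concat) do not,
;; so (count a-lazy-seq) walks every element on every call.
(counted? [2 3 5 7])            ;=> true
(counted? (concat [2 3] [5 7])) ;=> false
```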
We want to get rid of that count, and that suggests a more idiomatic way we could do the loop, using if-let. Like this:
(defn is-relatively-prime? [n candidates]
  (if-let [current (first candidates)]
    (if (zero? (rem n current))
      false
      (recur n (rest candidates)))
    true))
The (first candidates) function will return nil if the candidates list is empty, and if that happens, the if-let function will notice, and automatically jump to the else clause, which in this case is our return result of "true." Otherwise, we'll execute the "then" clause, and can test for whether n is evenly divisible by the current candidate. If it is, we return false, otherwise we recur back with the rest of the candidates. I also took advantage of the zero? function just because I could. Let's see what this gets us.
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 9.981985 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 8.011646 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 8.154197 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 9.905292 msecs"
Loading src/scratch_clojure/core.clj... done
"Elapsed time: 8.215208 msecs"
Pretty dramatic, eh? I'm an intermediate-level Clojure coder with a pretty sketchy understanding of the internals, so take my analysis with a grain of salt, but based on those numbers, I'd guess you were getting bitten by the count.
There's one other optimization the "fast" code is using that yours isn't, and that's bailing out of the is-relatively-prime? test whenever the current candidate squared is greater than n; you might speed up your code some more if you can throw that in. But I think count is the main thing you're looking for.
I will continue speeding it up, based on #manutter's solution.
(declare primes)

(defn is-relatively-prime? [n candidates]
  (if-let [current (first candidates)]
    (if (zero? (rem n current))
      false
      (recur n (rest candidates)))
    true))

(defn get-next-prime [largest-prime-so-far]
  (let [primes-so-far (concat (take-while #(not= largest-prime-so-far %) primes) [largest-prime-so-far])]
    (loop [n (+ (last primes-so-far) 2)]
      (if (is-relatively-prime? n (rest primes-so-far))
        n
        (recur (+ n 2))))))

(def primes
  (lazy-cat '(2 3) (map get-next-prime (rest primes))))
(time (first (drop 10000 primes)))
"Elapsed time: 14092.414513 msecs"
Ok. First of all let's add this current^2 > n optimization:
(defn get-next-prime [largest-prime-so-far]
  (let [primes-so-far (concat (take-while #(not= largest-prime-so-far %) primes) [largest-prime-so-far])]
    (loop [n (+ (last primes-so-far) 2)]
      (if (is-relatively-prime? n
                                (take-while #(<= (* % %) n)
                                            (rest primes-so-far)))
        n
        (recur (+ n 2))))))
user> (time (first (drop 10000 primes)))
"Elapsed time: 10564.470626 msecs"
104743
Nice. Now let's look closer at the get-next-prime:
If you check the algorithm carefully, you will notice that this:
(concat (take-while #(not= largest-prime-so-far %) primes) [largest-prime-so-far])
is really equal to all the primes we've found so far, and (last primes-so-far) is really largest-prime-so-far. So let's rewrite it a little:
(defn get-next-prime [largest-prime-so-far]
  (loop [n (+ largest-prime-so-far 2)]
    (if (is-relatively-prime? n
                              (take-while #(<= (* % %) n) (rest primes)))
      n
      (recur (+ n 2)))))
user> (time (first (drop 10000 primes)))
"Elapsed time: 142.676634 msecs"
104743
Let's add one more order of magnitude:
user> (time (first (drop 100000 primes)))
"Elapsed time: 2615.910723 msecs"
1299721
Wow! It's just mind-blowing!
But that's not all. Let's take a look at the is-relatively-prime? function: it just checks that none of the candidates evenly divides the number. That is exactly what the not-any? library function does. So let's just replace it in get-next-prime.
(declare primes)

(defn get-next-prime [largest-prime-so-far]
  (loop [n (+ largest-prime-so-far 2)]
    (if (not-any? #(zero? (rem n %))
                  (take-while #(<= (* % %) n)
                              (rest primes)))
      n
      (recur (+ n 2)))))

(def primes
  (lazy-cat '(2 3) (map get-next-prime (rest primes))))
It is a bit faster:
user> (time (first (drop 100000 primes)))
"Elapsed time: 2493.291323 msecs"
1299721
and obviously much cleaner and shorter.

Parallel Sieve of Eratosthenes using Clojure Reducers

I have implemented the Sieve of Eratosthenes using Clojure's standard library.
(defn primes [below]
  (remove (set (mapcat #(range (* % %) below %)
                       (range 3 (Math/sqrt below) 2)))
          (cons 2 (range 3 below 2))))
I think this should be amenable to parallelism as there is no recursion and the reducer versions of remove and mapcat can be dropped in. Here is what I came up with:
(defn pprimes [below]
  (r/foldcat
    (r/remove
      (into #{} (r/mapcat #(range (* % %) below %)
                          (into [] (range 3 (Math/sqrt below) 2))))
      (into [] (cons 2 (range 3 below 2))))))
I've poured the initial set and the generated multiples into vectors as I understand that LazySeqs can't be folded. Also, r/foldcat is used to finally realize the collection.
My problem is that this is a little slower than the first version.
(time (first (primes 1000000)))  ;=> approx 26000 msecs
(time (first (pprimes 1000000))) ;=> approx 28500 msecs
Is there too much overhead from the coordinating processes or am I using reducers wrong?
Thanks to leetwinski this seems to work:
(defn pprimes2 [below]
  (r/foldcat
    (r/remove
      (into #{} (r/foldcat (r/map #(range (* % %) below %)
                                  (into [] (range 3 (Math/sqrt below) 2)))))
      (into [] (cons 2 (range 3 below 2))))))
Apparently I needed to add another fold operation in order to map #(range (* % %) below %) in parallel.
(time (first (pprimes 1000000)))  ;=> approx 28500 msecs
(time (first (pprimes2 1000000))) ;=> approx 7500 msecs
Edit: The above code doesn't work. r/foldcat isn't concatenating the composite numbers it is just returning a vector of the multiples for each prime number. The final result is a vector of 2 and all the odd numbers. Replacing r/map with r/mapcat gives the correct answer but it is again slower than the original primes.
As far as I remember, r/mapcat and r/remove are not parallel themselves; they just produce foldable collections, which in turn can be parallelized by r/fold. In your case the only parallel operation is r/foldcat, which is, according to the documentation, "Equivalent to (fold cat append! coll)", meaning that you only potentially do append! in parallel, which isn't what you want at all.
To make it parallel you should probably use r/fold with remove as the reduce function and concat as the combine function, but I guess it won't really make your code faster, due to the nature of your algorithm (you will be trying to remove a big set of items from every chunk of a collection).
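For concreteness, an explicit r/fold along those lines might look roughly like this (an illustrative sketch, not benchmarked against the sieve; fold-remove is a hypothetical name, and it combines chunk results with into rather than concat):

```clojure
(require '[clojure.core.reducers :as r])

;; Sketch of an explicit r/fold: the reduce function filters within a
;; chunk, and the combine function merges per-chunk results.
(defn fold-remove [pred coll]
  (r/fold (fn combinef
            ([] [])                          ; identity for each chunk
            ([left right] (into left right)))
          (fn reducef [acc x]
            (if (pred x) acc (conj acc x)))  ; keep x unless pred matches
          (into [] coll)))

(fold-remove even? (range 1 10)) ;=> [1 3 5 7 9]
```

Note that the collection must be a vector (or another foldable) for r/fold to actually split it into parallel chunks; on small inputs it falls back to a plain sequential reduce.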

clojure refactor code from recursion

I have the following bit of code that produces the correct results:
(ns scratch.core
  (require [clojure.string :as str :only (split-lines join split)]))

(defn numberify [str]
  (vec (map read-string (str/split str #" "))))

(defn process [acc sticks]
  (let [smallest (apply min sticks)
        cuts (filter #(> % 0) (map #(- % smallest) sticks))]
    (if (empty? cuts)
      acc
      (process (conj acc (count cuts)) cuts))))

(defn print-result [[x & xs]]
  (prn x)
  (if (seq xs)
    (recur xs)))

(let [input "8\n1 2 3 4 3 3 2 1"
      lines (str/split-lines input)
      length (read-string (first lines))
      inputs (first (rest lines))]
  (print-result (process [length] (numberify inputs))))
The process function above recursively calls itself until the sequence sticks is empty?.
I am curious to know if I could have used something like take-while or some other technique to make the code more succinct?
If ever I need to do some work on a sequence until it is empty then I use recursion but I can't help thinking there is a better way.
Your core problem can be described as
stop if count of sticks is zero
accumulate count of sticks
subtract the smallest stick from each of sticks
filter positive sticks
go back to 1.
Identify the smallest sub-problem as steps 3 and 4 and put a box around it
(defn cuts [sticks]
  (let [smallest (apply min sticks)]
    (filter pos? (map #(- % smallest) sticks))))
Notice that nothing happens to sticks between step 5 and the next round of step 3: cuts is a function sticks -> sticks, so use iterate to put a box around that:
(defn process [sticks]
  (->> (iterate cuts sticks)
       ;; ----- 8< -------------------
This gives an infinite seq of sticks, (cuts sticks), (cuts (cuts sticks)), and so on.
Incorporate steps 1 and 2:
(defn process [sticks]
  (->> (iterate cuts sticks)
       (map count)         ;; count each sticks
       (take-while pos?))) ;; accumulate while counts are positive

(process [1 2 3 4 3 3 2 1])
;-> (8 6 4 1)
Behind the scenes this algorithm hardly differs from the one you posted, since lazy seqs are a delayed implementation of recursion. It is more idiomatic, though: more modular, and it uses take-while for cancellation, which adds to its expressiveness. It also doesn't require one to pass the initial count, and it does the right thing if sticks is empty. I hope it is what you were looking for.
I think the way your code is written is a very Lispy way of doing it. Certainly there are many, many examples in The Little Schemer that follow this format of reduction/recursion.
To replace recursion, I usually look for a solution that involves using higher order functions, in this case reduce. It replaces the min calls each iteration with a single sort at the start.
(defn process [sticks]
  (drop-last (reduce (fn [a i]
                       (let [n (- (last a) (count i))]
                         (conj a n)))
                     [(count sticks)]
                     (partition-by identity (sort sticks)))))

(process [1 2 3 4 3 3 2 1])
;=> (8 6 4 1)
I've changed the algorithm to fit reduce by grouping the same numbers after sorting, and then counting each group and reducing the count size.

Why is my Clojure code running so slowly?

Below is my answer for 4clojure Problem 108
I'm able to pass the first three tests but the last test times out. The code runs really, really slowly on this last test. What exactly is causing this?
((fn [& coll]
   (loop [coll coll m {}]
     (do
       (let [ct (count coll)
             ns (mapv first coll)
             m' (reduce #(update-in %1 [%2] (fnil inc 0)) m ns)]
         (println m')
         (if (some #(<= ct %) (mapv m' ns))
           (apply min (map first (filter #(>= (val %) ct) m')))
           (recur (mapv rest coll) m'))))))
 (map #(* % % %) (range))                      ;; perfect cubes
 (filter #(zero? (bit-and % (dec %))) (range)) ;; powers of 2
 (iterate inc 20))
You are gathering the next value from every input on every iteration: (recur (mapv rest coll) m').
One of your inputs generates values extremely slowly, and accelerates to very high values very quickly: (filter #(zero? (bit-and % (dec %))) (range)).
Your code is spending most of its time discovering powers of two by incrementing by one and testing the bits.
You don't need a map of all inputs with counts of occurrences. You don't need to find the next value for items that are not the lowest found so far. I won't post a solution since it is an exercise, but eliminating the lowest non-matched value on each iteration should be a start.
In addition to the other good answers here, you're doing a bunch of math, but all numbers are boxed as objects rather than being used as primitives. Many tips for doing this better here.
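As a sketch of what unboxed math looks like in practice (a hypothetical example, unrelated to the 4clojure inputs): type hints on the arguments and return value let the compiler emit primitive long arithmetic in a hot loop instead of boxing every intermediate value:

```clojure
;; Sketch: ^long hints keep n, i, and acc unboxed throughout the loop.
(defn sum-squares ^long [^long n]
  (loop [i 0 acc 0]
    (if (< i n)
      (recur (inc i) (+ acc (* i i)))
      acc)))

(sum-squares 10) ;=> 285
```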
This is a really inefficient way of counting powers of 2:
(filter #(zero? (bit-and % (dec %))) (range))
This is essentially counting from 0 to infinity, testing each number along the way to see if it's a power of two. The further you get into the sequence, the more expensive each call to rest is.
Given that it's the test input, and you can't change it, I think you need to re-think your approach. Rather than calling (mapv rest coll), you probably only want to call rest on the sequence with the smallest first value.