I'm starting to learn Clojure and have decided that doing some projects on HackerRank is a good way to do that. What I'm finding is that my Clojure solutions are horribly slow. I'm assuming that's because I'm still thinking imperatively, or just don't know enough about how Clojure operates. The latest problem I wrote solutions for was Down To Zero II. Here's my Java code:
import java.io.BufferedReader;
import java.io.InputStreamReader;
public class Solution {
private static final int MAX_NUMBER = 1000000;
private static final BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
public static int[] precompute() {
int[] values = new int[MAX_NUMBER];
values[0] = 0;
values[1] = 1;
for (int i = 1; i < MAX_NUMBER; i += 1) {
if ((values[i] == 0) || (values[i] > (values[i - 1] + 1))) {
values[i] = (values[i - 1] + 1);
}
for (int j = 1; j <= i && (i * j) < MAX_NUMBER; j += 1) {
int mult = i * j;
if ((values[mult] == 0) || (values[mult] > (values[i] + 1))) {
values[mult] = values[i] + 1;
}
}
}
return values;
}
public static void main(String[] args) throws Exception {
int numQueries = Integer.parseInt(reader.readLine());
int[] values = Solution.precompute();
for (int loop = 0; loop < numQueries; loop += 1) {
int query = Integer.parseInt(reader.readLine());
System.out.println(values[query]);
}
}
}
My Clojure implementation is:
(def MAX-NUMBER 1000000)
(defn set-i [out i]
(cond
(= 0 i) (assoc out i 0)
(= 1 i) (assoc out i 1)
(or (= 0 (out i))
(> (out i) (inc (out (dec i)))))
(assoc out i (inc (out (dec i))))
:else out))
(defn set-j [out i j]
(let [mult (* i j)]
(if (or (= 0 (out mult)) (> (out mult) (inc (out i))))
(assoc out mult (inc (out i)))
out)))
;--------------------------------------------------
; Precompute the values for all possible inputs
;--------------------------------------------------
(defn precompute []
(loop [i 0 out (vec (repeat MAX-NUMBER 0))]
(if (< i MAX-NUMBER)
(recur (inc i) (loop [j 1 new-out (set-i out i)]
(if (and (<= j i) (< (* i j) MAX-NUMBER))
(recur (inc j) (set-j new-out i j))
new-out)))
out)))
;--------------------------------------------------
; Read the number of queries
;--------------------------------------------------
(def num-queries (Integer/parseInt (read-line)))
;--------------------------------------------------
; Precompute the solutions
;--------------------------------------------------
(def values (precompute))
;--------------------------------------------------
; Read and process each query
;--------------------------------------------------
(loop [iter 0]
(if (< iter num-queries)
(do
(println (values (Integer/parseInt (read-line))))
(recur (inc iter)))))
The Java code runs in about 1/10 of a second on my machine, while the Clojure code takes close to 2 seconds. Since it's the same machine, with the same JVM, it means I'm doing something wrong in Clojure.
How do people go about trying to translate this type of code? What are the gotchas that are causing it to be so much slower?
I know it's almost two years later, but after running across your question and spending way too much time fighting with HackerRank and its time limits, I thought I would post an answer. Does achieving a solution within HR's environment and time limits make us better Clojure programmers? I didn't learn the answer to that. But I'll share what I did learn.
I'm going to do some transformations to your code (which might be slightly outside of what you were originally asking) and then address your more specific questions.
I found a slightly slimmer version of your same algorithm. It still has two loops, but the update only happens once in
the inner loop, and many of the conditions are handled in a min function. Here is my adaptation of it:
(defn compute
"Returns a vector of down-to-zero counts for all numbers from 0 to m."
[m]
(loop [i 2 out (vec (range (inc m)))]
(if (<= i m)
(recur (inc i)
(loop [j 1 out out]
(let [ij (* i j)]
(if (and (<= j i) (<= ij m))
(recur (inc j)
(assoc out ij (min (out ij) ;; current value
(inc (out (dec ij))) ;; steps from value just below
(inc (out i))))) ;; steps from a factor
out))))
out)))
Notice we're still using loop/recur (twice), and we're still using a vector to hold the output. But there are some differences:
We initialize out to consecutive integers, so each index starts out holding its own value. This is the worst-case number of steps for every value, and once initialized, we don't have to test whether a value equals 0, and we can skip indices 0 and 1 and start the outer loop at index 2. (We also fix a bug in your original and make sure out contains MAX-NUMBER+1 values.)
All three tests happen inside a min function that encapsulates the original logic: a value will be updated only if it's a shorter number of steps from the number just below it, or from one of its factors.
The tests are now simple enough that we don't need to break them out into separate functions.
This code (along with your original) is fast enough to pass some of the test cases in HR, but not all. Here are some
things to speed this up:
Use int-array instead of vec. This means we'll use aset instead of assoc and aget instead of calling out with an index. It also means that loop/recur isn't the best structure anymore (because we are no longer passing around new versions of an immutable vector, but actually mutating a Java int array); instead we'll use doseq.
Type hints. This alone makes a huge speed difference. When testing your code, include the form (set! *warn-on-reflection* true) at the top and you'll see where Clojure is having to do extra work to figure out what types it is dealing with (a small illustration follows this list of tips).
Use custom I/O functions to read the input. HR's boilerplate I/O code is supposed to let you focus on solving the
challenge and not worry about I/O, but it is basically garbage, and often the culprit behind your program timing out.
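To make the reflection point concrete, here is a tiny, self-contained sketch (the functions are invented for illustration and have nothing to do with the challenge): with the warning flag set, the unhinted version triggers a reflection warning at compile time, and a single hint on the argument removes both the warning and the runtime reflection cost.
(set! *warn-on-reflection* true)
;; Unhinted: the compiler cannot tell what type `s` is, so the interop call
;; is resolved by reflection at runtime (and a warning is printed).
(defn shout [s]
  (.toUpperCase s))
;; Hinted: ^String lets the compiler emit a direct method call instead.
(defn shout-hinted [^String s]
  (.toUpperCase s))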
Below is a version that incorporates the tips above and runs fast enough to pass all test cases. I've included my custom
I/O approach that I've been using for all my HR challenges. One nice benefit of using doseq is that we can include a
:let and a :while clause within the binding form, removing some of the indentation within the body of doseq. Also
notice a few strategically placed type hints that really speed up the program.
(ns down-to-zero-int-array)
(set! *warn-on-reflection* true)
(defn compute
"Returns a vector of down-to-zero counts for all numbers from 0 to m."
^ints [m]
(let [out ^ints (int-array (inc m) (range (inc m)))]
(doseq [i (range 2 (inc m)) j (range 1 (inc i)) :let [ij (* i j)] :while (<= ij m)]
(aset out ij (min (aget out ij)
(inc (aget out (dec ij)))
(inc (aget out i)))))
out))
(let [tokens ^java.io.StreamTokenizer
(doto (java.io.StreamTokenizer. (java.io.BufferedReader. *in*))
(.parseNumbers))]
(defn next-int
  "Read next integer from input. As fast as `read-line` for a single value,
   and _much_ faster than `read-line`+`split` for multiple values on same line."
  []
(.nextToken tokens)
(int (.-nval tokens))))
(def MAX 1000000)
(let [q (next-int)
down-to-zero (compute MAX)]
(doseq [n (repeatedly q next-int)]
(println (aget down-to-zero n))))
Related
I am trying to implement a solution for the minimum swaps required to sort an array in Clojure.
The code works, but takes about a second to solve for a 7-element vector, which is very poor compared to a similar solution in Java.
I already tried providing explicit types, but it doesn't seem to make a difference.
I tried using transients, but they have an open bug with subvec, which I am using in my solution: https://dev.clojure.org/jira/browse/CLJ-787
Any pointers on how I can optimize the solution?
;; Find minimumSwaps required to sort the array. The algorithm starts by iterating from 0 to n-1. In each iteration, it places the least element in the ith position.
(declare swap-arr) ;; swap-arr is defined below, so declare it first
(defn minimumSwaps [input]
(loop [mv input, i (long 0), swap-count (long 0)]
(if (< i (count input))
(let [min-elem (apply min (drop i mv))]
(if (not= min-elem (mv i))
(recur (swap-arr mv i min-elem),
(unchecked-inc i),
(unchecked-inc swap-count))
(recur mv,
(unchecked-inc i),
swap-count)))
swap-count)))
(defn swap-arr [vec x min-elem]
(let [y (long (.indexOf vec min-elem))]
(assoc vec x (vec y) y (vec x))))
(time (println (minimumSwaps [7 6 5 4 3 2 1])))
There are a few things that can be improved in your solution, both algorithmically and efficiency-wise. The main improvement is to remember both the minimal element in the vector and its position when you search for it. That way you don't have to search for the minimal element again with .indexOf.
Here's my revised solution that is ~4 times faster:
(defn swap-arr [v x y]
(assoc v x (v y) y (v x)))
(defn find-min-and-position-in-vector [v, ^long start-from]
(let [size (count v)]
(loop [i start-from, min-so-far (long (nth v start-from)), min-pos start-from]
(if (< i size)
(let [x (long (nth v i))]
(if (< x min-so-far)
(recur (inc i) x i)
(recur (inc i) min-so-far min-pos)))
[min-so-far min-pos]))))
(defn minimumSwaps [input]
(loop [mv input, i (long 0), swap-count (long 0)]
(if (< i (count input))
(let [[min-elem min-pos] (find-min-and-position-in-vector mv i)]
(if (not= min-elem (mv i))
(recur (swap-arr mv i min-pos),
(inc i),
(inc swap-count))
(recur mv,
(inc i),
swap-count)))
swap-count)))
To understand where the performance bottlenecks in your program are, it is better to use https://github.com/clojure-goes-fast/clj-async-profiler than to guess.
Notice how I dropped unchecked-* stuff from your code. It is not as important here, and it is easy to get it wrong. If you want to use them for performance, make sure to check the resulting bytecode with a decompiler: https://github.com/clojure-goes-fast/clj-java-decompiler
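If you want a concrete starting point, here is a minimal sketch of profiling the function above with clj-async-profiler. The dependency coordinates, namespace, and profile macro are my assumptions about the library's API, so verify them against its documentation.
;; Assumed dependency: com.clojure-goes-fast/clj-async-profiler (pick a current version).
;; On recent JDKs you may also need the -Djdk.attach.allowAttachSelf JVM option.
(require '[clj-async-profiler.core :as prof])
;; Wrap the code you want to measure; by default the profiler writes a
;; flamegraph you can open in a browser under /tmp/clj-async-profiler/.
(prof/profile
  (dotimes [_ 100]
    (minimumSwaps (vec (shuffle (range 1000))))))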
A similar implementation in Java runs in almost half the time.
That's actually fairly good for Clojure, given that you use immutable vectors where in Java you probably use arrays. After rewriting the Clojure solution to arrays, the performance would be almost the same.
Hey, I'm doing a Project Euler question, and I'm looking to sum up all the numbers under 1000 that are multiples of 3 or 5.
But being a Clojure noob, my code just keeps returning zero, and I'm not sure why.
(defn sum-of-multiples [max]
(let [result (atom 0)]
(for [i (range max)]
(if (or (= (rem i 3) 0) (= (rem i 5) 0))
(swap! result (+ @result i)))
)
@result))
(sum-of-multiples 1000)
Also, the line (swap! result (+ @result i))) bugs me. In C# I could do result += i, but I'm guessing there must be a better way to do this in Clojure?
In Clojure - and in functional programming at large - we avoid assignment as it destroys state history and makes writing concurrent programs a whole lot harder. In fact, Clojure doesn't even support assignment. An atom is a reference type that is thread-safe.
Another common trait of functional programming is that we try to solve problems as a series of data transformations. In your case you have some data, a list of numbers from 0 to 1000 exclusive, and you need to obtain the sum of all numbers that match a predicate. This can certainly be done by applying data transformations, completely removing the need for assignment. One such implementation is this:
(->> (range 1000)
(filter #(or (= (rem % 3) 0) (= (rem % 5) 0)))
(reduce +))
Please understand that a function such as the one you wrote isn't considered idiomatic code. Having said that, in the interest of learning, it can be made to work like so:
(defn sum-of-multiples [max]
(let [result (atom 0)]
(doseq [i (range max)]
(if (or (= (rem i 3) 0) (= (rem i 5) 0))
(swap! result #(+ % i)))
)
@result))
(sum-of-multiples 1000)
for returns a lazy sequence, but since you're simply interested in the side-effects caused by swap!, you need to use doseq to force the sequence. The other problem is that the second argument to swap! is a function, so you don't need to deref result again.
for is a list comprehension that returns a lazy sequence; you have to traverse it for your code to work:
(defn sum-of-multiples [max]
(let [result (atom 0)]
(dorun
(for [i (range max)]
(if (or (= (rem i 3) 0) (= (rem i 5) 0))
(swap! result + i))))
@result))
An equivalent, more idiomatic implementation using for:
(defn sum-of-multiples [max]
(reduce +
(for [i (range max)
:when (or (zero? (rem i 3))
(zero? (rem i 5)))]
i)))
The other answers are good examples of what I alluded to in my comment. For the sake of completeness, here's a solution that uses loop/recur, so it may be easier to understand for someone who's still not comfortable with concepts like filter, map or reduce. It also happens to be about 30-40% faster, not that it really matters in this case.
(defn sum-of-multiples [max]
(loop [i 0
sum 0]
(if (> max i)
(recur (inc i)
(if (or (zero? (rem i 3)) (zero? (rem i 5)))
(+ sum i)
sum))
sum)))
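As a quick sanity check of any of the working versions above (this is the well-known Project Euler #1 result):
(sum-of-multiples 1000) ;; => 233168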
I am trying to prove that Clojure performance can be on an equal footing with Java. An important use case I've found is quicksort. I have written an implementation as follows:
(set! *unchecked-math* true)
(defn qsort [^longs a]
(let [qs (fn qs [^long low, ^long high]
(when (< low high)
(let [pivot (aget a low)
[i j]
(loop [i low, j high]
(let [i (loop [i i] (if (< (aget a i) pivot)
(recur (inc i)) i))
j (loop [j j] (if (> (aget a j) pivot)
(recur (dec j)) j))
[i j] (if (<= i j)
(let [tmp (aget a i)]
(aset a i (aget a j)) (aset a j tmp)
[(inc i) (dec j)])
[i j])]
(if (< i j) (recur i j) [i j])))]
(when (< low j) (qs low j))
(when (< i high) (qs i high)))))]
(qs 0 (dec (alength a))))
a)
Also, this helps call the Java quicksort:
(defn jqsort [^longs a] (java.util.Arrays/sort a) a)
Now, for the benchmark.
user> (def xs (let [rnd (java.util.Random.)]
(long-array (repeatedly 100000 #(.nextLong rnd)))))
#'user/xs
user> (def ys (long-array xs))
#'user/ys
user> (time (qsort ys))
"Elapsed time: 163.33 msecs"
#<long[] [J#3ae34094>
user> (def ys (long-array xs))
user> (time (jqsort ys))
"Elapsed time: 13.895 msecs"
#<long[] [J#1b2b2f7f>
Performance is worlds apart (an order of magnitude, and then some).
Is there anything I'm missing, any Clojure feature I could have used? I think the main source of performance degradation is when I need to return several values from a loop and must allocate a vector for that. Can this be avoided?
BTW running Clojure 1.4. Also note that I have run the benchmark multiple times in order to warm up the HotSpot. These are the times when they settle down.
Update
The most terrible weakness in my code is not just the allocation of vectors, but the fact that they force boxing and break the primitive chain. Another weakness is using the results of loop forms, because they also break the chain. Yep, performance in Clojure is still a minefield.
This version is based on @mikera's, is just as fast, and doesn't require the use of ugly macros. On my machine this takes ~12ms vs ~9ms for java.util.Arrays/sort:
(set! *unchecked-math* true)
(set! *warn-on-reflection* true)
(defn swap [^longs a ^long i ^long j]
(let [t (aget a i)]
(aset a i (aget a j))
(aset a j t)))
(defn ^long apartition [^longs a ^long pivot ^long i ^long j]
(loop [i i j j]
(if (<= i j)
(let [v (aget a i)]
(if (< v pivot)
(recur (inc i) j)
(do
(when (< i j)
(aset a i (aget a j))
(aset a j v))
(recur i (dec j)))))
i)))
(defn qsort
([^longs a]
(qsort a 0 (long (alength a))))
([^longs a ^long lo ^long hi]
(when
(< (inc lo) hi)
(let [pivot (aget a lo)
split (dec (apartition a pivot (inc lo) (dec hi)))]
(when (> split lo)
(swap a lo split))
(qsort a lo split)
(qsort a (inc split) hi)))
a))
(defn ^longs rand-long-array []
(let [rnd (java.util.Random.)]
(long-array (repeatedly 100000 #(.nextLong rnd)))))
(comment
(dotimes [_ 10]
(let [as (rand-long-array)]
(time
(dotimes [_ 1]
(qsort as)))))
)
Manual inlining is mostly unnecessary starting with Clojure 1.3. With a few type hints, only on the function arguments, the JVM will do the inlining for you. There is no need to cast index arguments to int for the array operations - Clojure does this for you.
One thing to watch out for is that nested loop/recur does present problems for JVM inlining since loop/recur doesn't (at this time) support returning primitives. So you have to break apart your code into separate fns. This is for the best as nested loop/recurs get very ugly in Clojure anyhow.
For a more detailed look on how to consistently achieve Java performance (when you actually need it) please examine and understand test.benchmark.
This is slightly horrific because of the macros, but with this code I think you can match the Java speed (I get around 11ms for the benchmark):
(set! *unchecked-math* true)
(defmacro swap [a i j]
`(let [a# ~a
i# ~i
j# ~j
t# (aget a# i#)]
(aset a# i# (aget a# j#))
(aset a# j# t#)))
(defmacro apartition [a pivot i j]
`(let [pivot# ~pivot]
(loop [i# ~i
j# ~j]
(if (<= i# j#)
(let [v# (aget ~a i#)]
(if (< v# pivot#)
(recur (inc i#) j#)
(do
(when (< i# j#)
(aset ~a i# (aget ~a j#))
(aset ~a j# v#))
(recur i# (dec j#)))))
i#))))
(defn qsort
([^longs a]
(qsort a 0 (alength a)))
([^longs a ^long lo ^long hi]
(let [lo (int lo)
hi (int hi)]
(when
(< (inc lo) hi)
(let [pivot (aget a lo)
split (dec (apartition a pivot (inc lo) (dec hi)))]
(when (> split lo) (swap a lo split))
(qsort a lo split)
(qsort a (inc split) hi)))
a)))
The main tricks are:
Do everything with primitive arithmetic
Use ints for the array indexes (this avoids some unnecessary casts, not a big deal but every little helps....)
Use macros rather than functions to break up the code (avoids function call overhead and parameter boxing)
Use loop/recur for maximum speed in the inner loop (i.e. partitioning the subarray)
Avoid constructing any new objects on the heap (so avoid vectors, sequences, maps etc.)
The Joy of Clojure, Chapter 6.4, describes a lazy quicksort algorithm. The beauty of lazy sorting is that it will only do as much work as necessary to find the first x values. So if x << n, this algorithm is O(n).
(ns joy.q)
(defn sort-parts
"Lazy, tail-recursive, incremental quicksort. Works against
and creates partitions based on the pivot, defined as 'work'."
[work]
(lazy-seq
(loop [[part & parts] work]
(if-let [[pivot & xs] (seq part)]
(let [smaller? #(< % pivot)]
(recur (list*
(filter smaller? xs)
pivot
(remove smaller? xs)
parts)))
(when-let [[x & parts] parts]
(cons x (sort-parts parts)))))))
(defn qsort [xs]
(sort-parts (list xs)))
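A quick REPL check, with an arbitrary input size, shows the laziness paying off: taking a small prefix only forces as many partitions as needed, while counting the result forces the full sort.
(def xs (shuffle (range 100000)))
;; Realizes only enough of the sort to produce the 10 smallest elements.
(take 10 (qsort xs))
;; Forces the entire lazy sequence, i.e. a full sort.
(count (qsort xs))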
By examining the main points from mikera's answer, you can see that they are mostly focused on eliminating the overhead introduced by using idiomatic (as opposed to tweaked) Clojure, which would probably not exist in an idiomatic Java implementation:
primitive arithmetic - slightly easier and more idiomatic in Java, where you are more likely to use ints than Integers
ints for the array indexes - the same
Use macros rather than functions to break up the code (avoids function call overhead and boxing) - fixes a problem introduced by using the language. Clojure encourages functional style, hence the function call overhead (and boxing).
Use loop/recur for maximum speed in the inner loop - in Java you'd idiomatically use an ordinary loop (which is what loop/recur compiles to anyway, as far as I know)
That being said, there actually is another trivial solution. Write (or find) an efficient Java implementation of Quick Sort, say something with a signature like this:
Sort.quickSort(long[] elems)
And then call it from Clojure:
(Sort/quickSort elems)
Checklist:
as efficient as in Java - yes
idiomatic in Clojure - arguably yes, I'd say that Java-interop is one of Clojure's core features.
reusable - yes, there's a good chance that you can easily find a very efficient Java implementation already written.
I'm not trying to troll; I understand what you are trying to find out with these experiments. I'm just adding this answer for the sake of completeness. Let's not overlook the obvious one! :)
I found this Clojure code to sieve out the prime numbers up to n:
(defn sieve [n]
(let [n (int n)]
"Returns a list of all primes from 2 to n"
(let [root (int (Math/round (Math/floor (Math/sqrt n))))]
(loop [i (int 3)
a (int-array n)
result (list 2)]
(if (>= i n)
(reverse result)
(recur (+ i (int 2))
(if (< i root)
(loop [arr a
inc (+ i i)
j (* i i)]
(if (>= j n)
arr
(recur (do (aset arr j (int 1)) arr)
inc
(+ j inc))))
a)
(if (zero? (aget a i))
(conj result i)
result)))))))
Then I wrote the equivalent (I think) code in Scheme (I use mit-scheme)
(define (sieve n)
(let ((root (round (sqrt n)))
(a (make-vector n)))
(define (cross-out t to dt)
(cond ((> t to) 0)
(else
(vector-set! a t #t)
(cross-out (+ t dt) to dt)
)))
(define (iter i result)
(cond ((>= i n) (reverse result))
(else
(if (< i root)
(cross-out (* i i) (- n 1) (+ i i)))
(iter (+ i 2) (if (vector-ref a i)
result
(cons i result))))))
(iter 3 (list 2))))
The timing results are:
For Clojure:
(time (reduce + 0 (sieve 5000000)))
"Elapsed time: 168.01169 msecs"
For mit-scheme:
(time (fold + 0 (sieve 5000000)))
"Elapsed time: 3990 msecs"
Can anyone tell me why mit-scheme is more than 20 times slower?
update: "the difference was in iterpreted/compiled mode. After I compiled the mit-scheme code, it was running comparably fast. – abo-abo Apr 30 '12 at 15:43"
Modern incarnations of the Java Virtual Machine have extremely good performance when compared to interpreted languages. A significant amount of engineering resource has gone into the JVM, in particular the hotspot JIT compiler, highly tuned garbage collection and so on.
I suspect the difference you are seeing is primarily down to that. For example, if you look at the "Are the Java programs faster?" benchmarks, you can see a comparison of Java vs Ruby which shows that Java outperforms by a factor of 220 on one of the benchmarks.
You don't say what JVM options you are running your Clojure benchmark with. Try running the JVM with the -Xint flag, which runs it in pure interpreted mode, and see what the difference is.
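If you are launching the benchmark through Leiningen rather than plain java, one way to try this is to put the flag in :jvm-opts. This is only a sketch; the project and profile names are arbitrary, and ^:replace keeps Leiningen from merging in its own default JVM options.
;; project.clj -- start an interpreted-mode REPL with:
;;   lein with-profile +interpreted repl
(defproject sieve-bench "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.4.0"]]
  :profiles {:interpreted {:jvm-opts ^:replace ["-Xint"]}})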
Also, it's possible that your example is too small to really warm-up the JIT compiler. Using a larger example may yield an even larger performance difference.
To give you an idea of how much HotSpot is helping you: I ran your code on my MBP 2011 (quad core 2.2GHz), using Java 1.6.0_31 with default opts (-server HotSpot) and in interpreted mode (-Xint), and saw a large difference
; with -server hotspot (best of 10 runs)
>(time (reduce + 0 (sieve 5000000)))
"Elapsed time: 282.322 msecs"
838596693108
; in interpreted mode using -Xint cmdline arg
> (time (reduce + 0 (sieve 5000000)))
"Elapsed time: 3268.823 msecs"
838596693108
As to comparing Scheme and Clojure code, there were a few things to simplify at the Clojure end:
don't rebind the mutable array in loops;
remove many of those explicit primitive coercions; there is no change in performance. As of Clojure 1.3, literals in function calls compile to primitives if such a function signature is available, and generally the difference in performance is so small that it gets quickly drowned out by any other operations happening in a loop;
add a primitive long annotation to the fn signature, thus removing the rebinding of n;
the call to Math/floor is not needed -- the int coercion has the same semantics.
Code:
(defn sieve [^long n]
(let [root (int (Math/sqrt n))
a (int-array n)]
(loop [i 3, result (list 2)]
(if (>= i n)
(reverse result)
(do
(when (< i root)
(loop [inc (+ i i), j (* i i)]
(when (< j n) (aset a j 1) (recur inc (+ j inc)))))
(recur (+ i 2) (if (zero? (aget a i))
(conj result i)
result)))))))
I have created a very simple nested loop example and am struggling to write the equivalent Clojure code. I've been trying to do it with list comprehensions but cannot get the same answer. Any help appreciated.
public class Toy {
public static void main(String[] args) {
int maxMod = 0;
for (int i=0;i<1000;i++) {
for (int j=i;j<1000;j++) {
if ((i * j) % 13 == 0 && i % 7 == 0) maxMod = i * j;
}
}
System.out.println(maxMod);
}
}
Here's a list comprehension solution:
(last
(for [i (range 1000)
j (range 1000)
:let [n (* i j)]
:when (and (= (mod n 13) 0)
(= (mod i 7) 0))]
n))
In general, you want to use some sort of sequence operation (like dnolen's answer). However, if you need to do something that is not expressible in some combination of sequence functions, using the loop macro works as well. For this precise problem, dnolen's answer is better than anything using loop, but for illustrative purposes, here is how you would write it with loop.
(loop [i 0
max-mod 0]
(if (>= i 1000)
(println max-mod)
(recur (inc i)
(loop [j 0
max-mod max-mod]
(if (>= j 1000)
max-mod
(recur (inc j)
(if (and (= (mod (* i j) 13) 0)
(= (mod i 7) 0))
(* i j)
max-mod)))))))
This is pretty much an exact translation of your given code. That said, this is obviously ugly, which is why a solution using for (or other similar functions) is preferred whenever possible.
List comprehensions create lists from other lists, but you want just a single value as the result. You can create the input values (i and j) with a list comprehension, and then use reduce to get a single value from the list:
(reduce (fn [max-mod [i j]]
(if (and (zero? (mod (* i j) 13))
(zero? (mod i 7)))
(* i j)
max-mod))
0
(for [i (range 1000) j (range 1000)]
[i j]))