I defined a function is-prime? in Clojure that returns if a number is prime or not, and I am trying to define a function prime-seq that returns all the prime numbers between two numbersn and m.
I have created the function in Common Lisp since I am more comfortable with it and I am trying to translate the code to Clojure. However, I cannot find how to replace the collect function in Lisp to Clojure.
This is my prime-seq function in Lisp:
(defun prime-seq (i j)
(loop for x from i to j
when (is-prime x)
collect x
)
)
And this is the try I did in Clojure but it is not working:
(defn prime-seq? [n m]
(def list ())
(loop [k n]
(cond
(< k m) (if (prime? k) (def list (conj list k)))
)
)
(println list)
)
Any ideas?
loop in Clojure is not the same as CL loop. You probably want for:
(defn prime-seq [i j]
(for [x (range i j)
:when (is-prime x)]
x))
Which is basically the same as saying:
(defn prime-seq [i j]
(filter is-prime (range i j)))
which may be written using the ->> macro for readability:
(defn prime-seq [i j]
(->> (range i j)
(filter is-prime)))
However, you might actually want a lazy-sequence of all prime numbers which you could write with something like this:
(defonce prime-seq
(let [c (fn [m numbers] (filter #(-> % (mod m) (not= 0)) numbers))
f (fn g [s]
(when (seq s)
(cons (first s)
(lazy-seq (g (c (first s) (next s)))))))]
(f (iterate inc 2))))
The lazy sequence will cache the results of the previous calculation, and you can use things like take-while and drop-while to filter the sequence.
Also, you probably shouldn't be using def inside a function call like that. def is for defining a var, which is essentially global. Then using def to change that value completely destroys the var and replaces it with another var pointing to the new state. It's something that's allowed to enable iterative REPL based development and shouldn't really be used in that way. Var's are designed to isolate changes locally to a thread, and are used as containers for global things like functions and singletons in your system. If the algorithm you're writing needs a local mutable state you could use a transient or an atom, and define a reference to that using let, but it would be more idiomatic to use the sequence processing lib or maybe a transducer.
Loop works more like a tail recursive function:
(defn prime-seq [i j]
(let [l (transient [])]
(loop [k i]
(when (< k j)
(when (is-prime k)
(conj! l k))
(recur (inc k))))
(persistent! l)))
But that should be considered strictly a performance optimisation. The decision to use transients shouldn't be taken lightly, and it's often best to start with a more functional algorithm, benchmark and optimise accordingly. Here is a way to write the same thing without the mutable state:
(defn prime-seq [i j]
(loop [k i
l []]
(if (< k j)
(recur (inc k)
(if (is-prime k)
(conj l k)
l))
l)))
I'd try to use for:
(for [x (range n m) :when (is-prime? x)] x)
Related
I am coming from a Java background trying to learn Clojure. As the best way of learning is by actually writing some code, I took a very simple example of finding even numbers in a vector. Below is the piece of code I wrote:
`
(defn even-vector-2 [input]
(def output [])
(loop [x input]
(if (not= (count x) 0)
(do
(if (= (mod (first x) 2) 0)
(do
(def output (conj output (first x)))))
(recur (rest x)))))
output)
`
This code works, but it is lame that I had to use a global symbol to make it work. The reason I had to use the global symbol is because I wanted to change the state of the symbol every time I find an even number in the vector. let doesn't allow me to change the value of the symbol. Is there a way this can be achieved without using global symbols / atoms.
The idiomatic solution is straightfoward:
(filter even? [1 2 3])
; -> (2)
For your educational purposes an implementation with loop/recur
(defn filter-even [v]
(loop [r []
[x & xs :as v] v]
(if (seq v) ;; if current v is not empty
(if (even? x)
(recur (conj r x) xs) ;; bind r to r with x, bind v to rest
(recur r xs)) ;; leave r as is
r))) ;; terminate by not calling recur, return r
The main problem with your code is you're polluting the namespace by using def. You should never really use def inside a function. If you absolutely need mutability, use an atom or similar object.
Now, for your question. If you want to do this the "hard way", just make output a part of the loop:
(defn even-vector-3 [input]
(loop [[n & rest-input] input ; Deconstruct the head from the tail
output []] ; Output is just looped with the input
(if n ; n will be nil if the list is empty
(recur rest-input
(if (= (mod n 2) 0)
(conj output n)
output)) ; Adding nothing since the number is odd
output)))
Rarely is explicit looping necessary though. This is a typical case for a fold: you want to accumulate a list that's a variable-length version of another list. This is a quick version:
(defn even-vector-4 [input]
(reduce ; Reducing the input into another list
(fn [acc n]
(if (= (rem n 2) 0)
(conj acc n)
acc))
[] ; This is the initial accumulator.
input))
Really though, you're just filtering a list. Just use the core's filter:
(filter #(= (rem % 2) 0) [1 2 3 4])
Note, filter is lazy.
Try
#(filterv even? %)
if you want to return a vector or
#(filter even? %)
if you want a lazy sequence.
If you want to combine this with more transformations, you might want to go for a transducer:
(filter even?)
If you wanted to write it using loop/recur, I'd do it like this:
(defn keep-even
"Accepts a vector of numbers, returning a vector of the even ones."
[input]
(loop [result []
unused input]
(if (empty? unused)
result
(let [curr-value (first unused)
next-result (if (is-even? curr-value)
(conj result curr-value)
result)
next-unused (rest unused) ]
(recur next-result next-unused)))))
This gets the same result as the built-in filter function.
Take a look at filter, even? and vec
check out http://cljs.info/cheatsheet/
(defn even-vector-2 [input](vec(filter even? input)))
If you want a lazy solution, filter is your friend.
Here is a non-lazy simple solution (loop/recur can be avoided if you apply always the same function without precise work) :
(defn keep-even-numbers
[coll]
(reduce
(fn [agg nb]
(if (zero? (rem nb 2)) (conj agg nb) agg))
[] coll))
If you like mutability for "fun", here is a solution with temporary mutable collection :
(defn mkeep-even-numbers
[coll]
(persistent!
(reduce
(fn [agg nb]
(if (zero? (rem nb 2)) (conj! agg nb) agg))
(transient []) coll)))
...which is slightly faster !
mod would be better than rem if you extend the odd/even definition to negative integers
You can also replace [] by the collection you want, here a vector !
In Clojure, you generally don't need to write a low-level loop with loop/recur. Here is a quick demo.
(ns tst.clj.core
(:require
[tupelo.core :as t] ))
(t/refer-tupelo)
(defn is-even?
"Returns true if x is even, otherwise false."
[x]
(zero? (mod x 2)))
; quick sanity checks
(spyx (is-even? 2))
(spyx (is-even? 3))
(defn keep-even
"Accepts a vector of numbers, returning a vector of the even ones."
[input]
(into [] ; forces result into vector, eagerly
(filter is-even? input)))
; demonstrate on [0 1 2...9]
(spyx (keep-even (range 10)))
with result:
(is-even? 2) => true
(is-even? 3) => false
(keep-even (range 10)) => [0 2 4 6 8]
Your project.clj needs the following for spyx to work:
:dependencies [
[tupelo "0.9.11"]
Let's say you have a recursive function defined in a let block:
(let [fib (fn fib [n]
(if (< n 2)
n
(+ (fib (- n 1))
(fib (- n 2)))))]
(fib 42))
This can be mechanically transformed to utilize memoize:
Wrap the fn form in a call to memoize.
Move the function name in as the 1st argument.
Pass the function into itself wherever it is called.
Rebind the function symbol to do the same using partial.
Transforming the above code leads to:
(let [fib (memoize
(fn [fib n]
(if (< n 2)
n
(+ (fib fib (- n 1))
(fib fib (- n 2))))))
fib (partial fib fib)]
(fib 42))
This works, but feels overly complicated. The question is: Is there a simpler way?
I take risks in answering since I am not a scholar but I don't think so. You pretty much did the standard thing which in fine is a partial application of memoization through a fixed point combinator.
You could try to fiddle with macros though (for simple cases it could be easy, syntax-quote would do name resolution for you and you could operate on that). I'll try once I get home.
edit: went back home and tried out stuff, this seems to be ok-ish for simple cases
(defmacro memoize-rec [form]
(let [[fn* fname params & body] form
params-with-fname (vec (cons fname params))]
`(let [f# (memoize (fn ~params-with-fname
(let [~fname (partial ~fname ~fname)] ~#body)))]
(partial f# f#))))
;; (clojure.pprint/pprint (macroexpand '(memoize-rec (fn f [x] (str (f x))))))
((memoize-rec (fn fib [n]
(if (< n 2)
n
(+ (fib (- n 1))
(fib (- n 2)))))) 75) ;; instant
((fn fib [n]
(if (< n 2)
n
(+ (fib (- n 1))
(fib (- n 2))))) 75) ;; slooooooow
simpler than what i thought!
I'm not sure this is "simpler" per se, but I thought I'd share an approach I took to re-implement letfn for a CPS transformer I wrote.
The key is to introduce the variables, but delay assigning them values until they are all in scope. Basically, what you would like to write is:
(let [f nil]
(set! f (memoize (fn []
<body-of-f>)))
(f))
Of course this doesn't work as is, because let bindings are immutable in Clojure. We can get around that, though, by using a reference type — for example, a promise:
(let [f (promise)]
(deliver! f (memoize (fn []
<body-of-f>)))
(#f))
But this still falls short, because we must replace every instance of f in <body-of-f> with (deref f). But we can solve this by introducing another function that invokes the function stored in the promise. So the entire solution might look like this:
(let [f* (promise)]
(letfn [(f []
(#f*))]
(deliver f* (memoize (fn []
<body-of-f>)))
(f)))
If you have a set of mutually-recursive functions:
(let [f* (promise)
g* (promise)]
(letfn [(f []
(#f*))
(g []
(#g*))]
(deliver f* (memoize (fn []
(g))))
(deliver g* (memoize (fn []
(f))))
(f)))
Obviously that's a lot of boiler-plate. But it's pretty trivial to construct a macro that gives you letfn-style syntax.
Yes, there is a simpler way.
It is not a functional transformation, but builds on the impurity allowed in clojure.
(defn fib [n]
(if (< n 2)
n
(+ (#'fib (- n 1))
(#'fib (- n 2)))))
(def fib (memoize fib))
First step defines fib in almost the normal way, but recursive calls are made using whatever the var fib contains. Then fib is redefined, becoming the memoized version of its old self.
I would suppose that clojure idiomatic way will be using recur
(def factorial
(fn [n]
(loop [cnt n acc 1]
(if (zero? cnt)
acc
(recur (dec cnt) (* acc cnt))
;; Memoized recursive function, a mash-up of memoize and fn
(defmacro mrfn
"Returns an anonymous function like `fn` but recursive calls to the given `name` within
`body` use a memoized version of the function, potentially improving performance (see
`memoize`). Only simple argument symbols are supported, not varargs or destructing or
multiple arities. Memoized recursion requires explicit calls to `name` so the `body`
should not use recur to the top level."
[name args & body]
{:pre [(simple-symbol? name) (vector? args) (seq args) (every? simple-symbol? args)]}
(let [akey (if (= (count args) 1) (first args) args)]
;; name becomes extra arg to support recursive memoized calls
`(let [f# (fn [~name ~#args] ~#body)
mem# (atom {})]
(fn mr# [~#args]
(if-let [e# (find #mem# ~akey)]
(val e#)
(let [ret# (f# mr# ~#args)]
(swap! mem# assoc ~akey ret#)
ret#))))))
;; only change is fn to mrfn
(let [fib (mrfn fib [n]
(if (< n 2)
n
(+ (fib (- n 1))
(fib (- n 2)))))]
(fib 42))
Timings on my oldish Mac:
original, Elapsed time: 14089.417441 msecs
mrfn version, Elapsed time: 0.220748 msecs
In clojure, this is valid:
(loop [a 5]
(if (= a 0)
"done"
(recur (dec a))))
However, this is not:
(let [a 5]
(if (= a 0)
"done"
(recur (dec a))))
So I'm wondering: why are loop and let separated, given the fact they both (at least conceptually) introduce lexical bindings? That is, why is loop a recur target while let is not?
EDIT: originally wrote "loop target" which I noticed is incorrect.
Consider the following example:
(defn pascal-step [v n]
(if (pos? n)
(let [l (concat v [0])
r (cons 0 v)]
(recur (map + l r) (dec n)))
v))
This function calculates n+mth line of pascal triangle by given mth line.
Now, imagine, that let is a recur target. In this case I won't be able to recursively call the pascal-step function itself from let binding using recur operator.
Now let's make this example a little bit more complex:
(defn pascal-line [n]
(loop [v [1]
i n]
(if (pos? i)
(let [l (concat v [0])
r (cons 0 v)]
(recur (map + l r) (dec i)))
v)))
Now we're calculating nth line of a pascal triangle. As you can see, I need both loop and let here.
This example is quite simple, so you may suggest removing let binding by using (concat v [0]) and (cons 0 v) directly, but I'm just showing you the concept. There may be a more complex examples where let inside a loop is unavoidable.
I implemented a naive solution for printing a Pascal's Triangle of N depth which I'll include below. My question is, in what ways could this be improved to make it more idiomatic? I feel like there are a number of things that seem overly verbose or awkward, for example, this if block feels unnatural: (if (zero? (+ a b)) 1 (+ a b)). Any feedback is appreciated, thank you!
(defn add-row [cnt acc]
(let [prev (last acc)]
(loop [n 0 row []]
(if (= n cnt)
row
(let [a (nth prev (- n 1) 0)
b (nth prev n 0)]
(recur (inc n) (conj row (if (zero? (+ a b)) 1 (+ a b)))))))))
(defn pascals-triangle [n]
(loop [cnt 1 acc []]
(if (> cnt n)
acc
(recur (inc cnt) (conj acc (add-row cnt acc))))))
(defn pascal []
(iterate (fn [row]
(map +' `(0 ~#row) `(~#row 0)))
[1]))
Or if you're going for maximum concision:
(defn pascal []
(->> [1] (iterate #(map +' `(0 ~#%) `(~#% 0)))))
To expand on this: the higher-order-function perspective is to look at your original definition and realize something like: "I'm actually just computing a function f on an initial value, and then calling f again, and then f again...". That's a common pattern, and so there's a function defined to cover the boring details for you, letting you just specify f and the initial value. And because it returns a lazy sequence, you don't have to specify n now: you can defer that, and work with the full infinite sequence, with whatever terminating condition you want.
For example, perhaps I don't want the first n rows, I just want to find the first row whose sum is a perfect square. Then I can just (first (filter (comp perfect-square? sum) (pascal))), without having to worry about how large an n I'll need to choose up front (assuming the obvious definitions of perfect-square? and sum).
Thanks to fogus for an improvement: I need to use +' rather than just + so that this doesn't overflow when it gets past Long/MAX_VALUE.
(defn next-row [row]
(concat [1] (map +' row (drop 1 row)) [1]))
(defn pascals-triangle [n]
(take n (iterate next-row '(1))))
Not as terse as the others, but here's mine:)
(defn A []
(iterate
(comp (partial map (partial reduce +))
(partial partition-all 2 1) (partial cons 0))
[1]))
I am trying to prove Clojure performance can be on equal footing with Java. An important use case I've found is the Quicksort. I have written an implementation as follows:
(set! *unchecked-math* true)
(defn qsort [^longs a]
(let [qs (fn qs [^long low, ^long high]
(when (< low high)
(let [pivot (aget a low)
[i j]
(loop [i low, j high]
(let [i (loop [i i] (if (< (aget a i) pivot)
(recur (inc i)) i))
j (loop [j j] (if (> (aget a j) pivot)
(recur (dec j)) j))
[i j] (if (<= i j)
(let [tmp (aget a i)]
(aset a i (aget a j)) (aset a j tmp)
[(inc i) (dec j)])
[i j])]
(if (< i j) (recur i j) [i j])))]
(when (< low j) (qs low j))
(when (< i high) (qs i high)))))]
(qs 0 (dec (alength a))))
a)
Also, this helps call the Java quicksort:
(defn jqsort [^longs a] (java.util.Arrays/sort a) a))
Now, for the benchmark.
user> (def xs (let [rnd (java.util.Random.)]
(long-array (repeatedly 100000 #(.nextLong rnd)))))
#'user/xs
user> (def ys (long-array xs))
#'user/ys
user> (time (qsort ys))
"Elapsed time: 163.33 msecs"
#<long[] [J#3ae34094>
user> (def ys (long-array xs))
user> (time (jqsort ys))
"Elapsed time: 13.895 msecs"
#<long[] [J#1b2b2f7f>
Performance is worlds apart (an order of magnitude, and then some).
Is there anything I'm missing, any Clojure feature I may have used? I think the main source of performance degradation is when I need to return several values from a loop and must allocate a vector for that. Can this be avoided?
BTW running Clojure 1.4. Also note that I have run the benchmark multiple times in order to warm up the HotSpot. These are the times when they settle down.
Update
The most terrible weakness in my code is not just the allocation of vectors, but the fact that they force boxing and break the primitive chain. Another weakness is using results of loop because they also break the chain. Yep, performance in Clojure is still a minefield.
This version is based on #mikera's, is just as fast and doesn't require the use of ugly macros. On my machine this takes ~12ms vs ~9ms for java.util.Arrays/sort:
(set! *unchecked-math* true)
(set! *warn-on-reflection* true)
(defn swap [^longs a ^long i ^long j]
(let [t (aget a i)]
(aset a i (aget a j))
(aset a j t)))
(defn ^long apartition [^longs a ^long pivot ^long i ^long j]
(loop [i i j j]
(if (<= i j)
(let [v (aget a i)]
(if (< v pivot)
(recur (inc i) j)
(do
(when (< i j)
(aset a i (aget a j))
(aset a j v))
(recur i (dec j)))))
i)))
(defn qsort
([^longs a]
(qsort a 0 (long (alength a))))
([^longs a ^long lo ^long hi]
(when
(< (inc lo) hi)
(let [pivot (aget a lo)
split (dec (apartition a pivot (inc lo) (dec hi)))]
(when (> split lo)
(swap a lo split))
(qsort a lo split)
(qsort a (inc split) hi)))
a))
(defn ^longs rand-long-array []
(let [rnd (java.util.Random.)]
(long-array (repeatedly 100000 #(.nextLong rnd)))))
(comment
(dotimes [_ 10]
(let [as (rand-long-array)]
(time
(dotimes [_ 1]
(qsort as)))))
)
The need for manual inlining is mostly unnecessary starting with Clojure 1.3. With a few type hints only on the function arguments the JVM will do the inlining for you. There is no need to cast index arguments to int for the the array operations - Clojure does this for you.
One thing to watch out for is that nested loop/recur does present problems for JVM inlining since loop/recur doesn't (at this time) support returning primitives. So you have to break apart your code into separate fns. This is for the best as nested loop/recurs get very ugly in Clojure anyhow.
For a more detailed look on how to consistently achieve Java performance (when you actually need it) please examine and understand test.benchmark.
This is slightly horrific because of the macros, but with this code I think you can match the Java speed (I get around 11ms for the benchmark):
(set! *unchecked-math* true)
(defmacro swap [a i j]
`(let [a# ~a
i# ~i
j# ~j
t# (aget a# i#)]
(aset a# i# (aget a# j#))
(aset a# j# t#)))
(defmacro apartition [a pivot i j]
`(let [pivot# ~pivot]
(loop [i# ~i
j# ~j]
(if (<= i# j#)
(let [v# (aget ~a i#)]
(if (< v# pivot#)
(recur (inc i#) j#)
(do
(when (< i# j#)
(aset ~a i# (aget ~a j#))
(aset ~a j# v#))
(recur i# (dec j#)))))
i#))))
(defn qsort
([^longs a]
(qsort a 0 (alength a)))
([^longs a ^long lo ^long hi]
(let [lo (int lo)
hi (int hi)]
(when
(< (inc lo) hi)
(let [pivot (aget a lo)
split (dec (apartition a pivot (inc lo) (dec hi)))]
(when (> split lo) (swap a lo split))
(qsort a lo split)
(qsort a (inc split) hi)))
a)))
The main tricks are:
Do everything with primitive arithmetic
Use ints for the array indexes (this avoids some unnecessary casts, not a big deal but every little helps....)
Use macros rather than functions to break up the code (avoids function call overhead and parameter boxing)
Use loop/recur for maximum speed in the inner loop (i.e. partitioning the subarray)
Avoid constructing any new objects on the heap (so avoid vectors, sequences, maps etc.)
The Joy of Clojure, Chapter 6.4 describes a lazy quicksort algorithm.The beauty of lazy sorting is that it will only do as much work as necessary to find the first x values. So if x << n this algorithm is O(n).
(ns joy.q)
(defn sort-parts
"Lazy, tail-recursive, incremental quicksort. Works against
and creates partitions based on the pivot, defined as 'work'."
[work]
(lazy-seq
(loop [[part & parts] work]
(if-let [[pivot & xs] (seq part)]
(let [smaller? #(< % pivot)]
(recur (list*
(filter smaller? xs)
pivot
(remove smaller? xs)
parts)))
(when-let [[x & parts] parts]
(cons x (sort-parts parts)))))))
(defn qsort [xs]
(sort-parts (list xs)))
By examining the main points from mikera's answer, you can see that they are mostly focused on eliminating the overhead introduced by using idiomatic (as opposed to tweaked) Clojure, which would probably not exist in an idiomatic Java implementation:
primitive arithmetic - slightly easier and more idiomatic in Java, you are more likely to use ints than Integers
ints for the array indexes - the same
Use macros rather than functions to break up the code (avoids functional call overhead and boxing) - fixes a problem introduced by using the language. Clojure encourages functional style, hence a function call overhead (and boxing).
Use loop/recur for maximum speed in the inner loop - in Java you'd idiomatically use an ordinary loop (which is what loop/recur compiles to anyway, as far as I know)
That being said, there actually is another trivial solution. Write (or find) an efficient Java implementation of Quick Sort, say something with a signature like this:
Sort.quickSort(long[] elems)
And then call it from Clojure:
(Sort/quickSort elems)
Checklist:
as efficient as in Java - yes
idiomatic in Clojure - arguably yes, I'd say that Java-interop is one of Clojure's core features.
reusable - yes, there's a good chance that you can easily find a very efficient Java implementation already written.
I'm not trying to troll, I understand what you are trying to find out with these experiments I'm just adding this answer for the sake of completeness. Let's not overlook the obvious one! :)