Clojure recur vs imperative loop

I'm learning Clojure and trying to understand the implementation.
What's the difference between:
(def factorial
  (fn [n]
    (loop [cnt n acc 1]
      (if (zero? cnt)
        acc
        (recur (dec cnt) (* acc cnt))
        ; in loop cnt will take the value (dec cnt)
        ; and acc will take the value (* acc cnt)
        ))))
and the following C-like pseudocode
function factorial (n) {
  for (cnt = n, acc = 1; ; ) {
    if (cnt == 0) return acc;
    acc = acc * cnt;  // acc takes the value (* acc cnt), using the old cnt
    cnt = cnt - 1;    // cnt takes the value (dec cnt)
  }
}
Are Clojure's "loop" and "recur" forms specifically designed to code a simple imperative loop?
(assuming the pseudocode's "for" creates its own scope, so cnt and acc exist only inside the loop)

Are Clojure's loop and recur forms specifically designed to code a simple imperative loop?
Yes.
In functional terms:
A loop is a degenerate form of recursion called tail-recursion.
The 'variables' are not modified in the body of the loop. Instead,
they are re-incarnated whenever the loop is re-entered.
Clojure's recur makes a tail-recursive call to the surrounding recursion point.
It re-uses a single stack frame, so it runs faster and avoids stack overflow.
It can only happen as the last thing to do in any call - in so-called tail position.
Instead of being stacked up, each successive recur call overwrites the last.
A recursion point is
a fn form, possibly disguised in defn or letfn, OR
a loop form, which also binds (sets up, initialises) the locals/variables.
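To make the two kinds of recursion point concrete, here is a minimal sketch. The names count-down and sum-to are made up for illustration. In the first function recur targets the fn that defn creates; in the second it targets the loop.

```clojure
;; recur targets the function itself: defn wraps a fn form
(defn count-down [n]
  (if (zero? n)
    :done
    (recur (dec n))))

;; recur targets the loop, which also sets up the locals i and acc
(defn sum-to [n]
  (loop [i n, acc 0]
    (if (zero? i)
      acc
      (recur (dec i) (+ acc i)))))

(count-down 1000000) ; => :done
(sum-to 1000000)     ; => 500000500000
```

Both run a million iterations without growing the stack. A direct call by name in place of recur would need one stack frame per iteration, which is where the overflow risk comes from.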
So your factorial function could be re-written
(def factorial
  (fn [n]
    ((fn fact [cnt acc]
       (if (zero? cnt)
         acc
         (fact (dec cnt) (* acc cnt))))
     n 1)))
... which is slower, and risks stack overflow.
Not every C/C++ loop translates smoothly. You can get trouble from nested loops where the inner loop modifies a variable in the outer one.
By the way, your factorial function
will cause integer overflow quite quickly. If you want to avoid this, use 1.0 instead of 1 to get floating point (double) arithmetic, or use *' instead of * to get Clojure's BigInt arithmetic.
will loop endlessly on a negative argument.
A quick fix for the latter is
(def factorial
  (fn [n]
    (loop [cnt n acc 1]
      (if (pos? cnt)
        (recur (dec cnt) (* acc cnt))
        acc))))
(factorial -1)
; 1
... though it would be better to return nil or Double.NEGATIVE_INFINITY.
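For the overflow point, a minimal sketch of the *' variant follows. The name factorial-big is mine; the only change from the quick fix above is the multiplication operator.

```clojure
;; * checks for long overflow and throws; *' promotes to BigInt instead
(defn factorial-big [n]
  (loop [cnt n, acc 1]
    (if (pos? cnt)
      (recur (dec cnt) (*' acc cnt))
      acc)))

(factorial-big 20) ; => 2432902008176640000
(factorial-big 25) ; => 15511210043330985984000000N

;; the same product with plain * overflows a long
(try (reduce * (range 1 26))
     (catch ArithmeticException _ :overflow)) ; => :overflow
```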

One way to look at loop/recur is that it lets you write code that is functional, but where the underlying implementation ends up essentially being an imperative loop.
To see that it is functional, take your example
(def factorial
  (fn [n]
    (loop [cnt n acc 1]
      (if (zero? cnt)
        acc
        (recur (dec cnt) (* acc cnt))))))
and rewrite it so that the loop form is broken out to a separate helper function:
(def factorial-helper
  (fn [cnt acc]
    (if (zero? cnt)
      acc
      (recur (dec cnt) (* acc cnt)))))
(def factorial'
  (fn [n]
    (factorial-helper n 1)))
Now you can see that the helper function is simply calling itself; you can replace recur with the function name:
(def factorial-helper
  (fn [cnt acc]
    (if (zero? cnt)
      acc
      (factorial-helper (dec cnt) (* acc cnt)))))
You can look at recur, when used in factorial-helper, as simply making a recursive call that the underlying implementation optimizes.
I think an important idea is that it allows the underlying implementation to be an imperative loop, but your Clojure code still remains functional. In other words, it is not a construct that allows you to write imperative loops that involve arbitrary assignment. But, if you structure your functional code in this way, you can gain the performance benefit associated with an imperative loop.
One way to successfully transform an imperative loop to this form is to change the imperative assignments into expressions that are "assigned to" the argument parameters of the recursive call. But, of course, if you encounter an imperative loop that makes arbitrary assignments, you may not be able to translate it into this form. In this view, loop/recur is a much more constrained construct.
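As a sketch of that transformation, take an imperative loop that updates two variables per step, such as an iterative Fibonacci (my example, not from the question). Each assignment becomes an argument to recur. All the arguments are computed from the old bindings before any local is rebound, so the temporary variable disappears.

```clojure
;; imperative version, for comparison:
;;   a = 0; b = 1;
;;   while (i > 0) { tmp = a + b; a = b; b = tmp; i = i - 1; }
(defn fib [n]
  (loop [a 0, b 1, i n]
    (if (zero? i)
      a
      (recur b (+ a b) (dec i)))))

(fib 10) ; => 55
```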

Related

Combining thread-last with loop in clojure

I am writing a small card game and I want my code to be very explicit, so that it is clear at a high level that there are rounds to play. My first implementation was:
(defn play-game []
  (->
    (myio/initialize-cards-and-players)
    (shuffle-and-share-cards myio/myshuffle)
    ;(announce)
    (play-rounds)))
I want playing the rounds to be more explicit, showing that it is a loop at the top level:
(defn play-game []
  (->>
    (myio/initialize-cards-and-players)
    (shuffle-and-share-cards myio/myshuffle)
    (announce)
    (loop [round 1]
      (play-round-with-game round)
      (if (<= round 4)
        (recur (+ round 1))))
    ; (play-round-with-game 2)
    ; (play-round-with-game 3)
    ; (play-round-with-game 4)
    ))
But I somehow cannot get the "game" object, which is returned by every function, into the loop. Is there a good way to handle this with thread-last?
Or is there a way to use reduce instead of the loop?
As Eugene suggested, you want to keep it simple. The following is the structure I normally use:
(defn play-game
  []
  (let [game-init (->>
                    (myio/initialize-cards-and-players)
                    (shuffle-and-share-cards myio/myshuffle)
                    (announce))]
    (loop [round 1
           game game-init]
      (if (< 4 round)
        game ; return result
        (let [game-next (play-round-with-game game round)]
          (recur (inc round) game-next))))))
I like to use let forms to give each value an explicit name. It helps to avoid confusion, especially for new readers of the code. If the (inc round) was not so simple, I would also make an explicit round-next variable and use that in the recur.
Also, I like the return condition to be the first thing checked in each iteration of the loop. This normally leaves the recur as the last statement in the loop.
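On the reduce part of the question: since the game is the accumulated value and the round numbers are the input, the loop above can also be sketched as a reduce. The play-round-with-game below is a hypothetical stand-in that only records the round, and play-rounds is a name I made up.

```clojure
;; hypothetical stand-in for the real play-round-with-game: records the round
(defn play-round-with-game [game round]
  (update game :rounds-played conj round))

(defn play-rounds [game-init]
  ;; the game is the accumulator; the round numbers 1..4 are the input sequence
  (reduce play-round-with-game game-init (range 1 5)))

(play-rounds {:rounds-played []}) ; => {:rounds-played [1 2 3 4]}
```

This works only because play-round-with-game takes the game first and the round second, which is the argument order reduce expects from its reducing function.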

Where is the tail position in my Clojure loop?

Clojure is saying that I can't call recur from a non-tail-position.
Is this not the tail position?
What is the tail position in my loop then?
(loop [i 20]
  (for [x (range 1 21)]
    (if (zero? (rem i x))
      i
      (recur (+ i 1)))))
for does not do what you think it does; it is not an imperative loop. It is a list comprehension, or sequence-generator. Therefore, there is not a return or iterate call at its end, so you cannot place recur there.
It would seem you probably do not need either loop or recur in this expression at all; the for is all you need to build a sequence, but it's not clear to me exactly what sequence you wish to build.
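Further to @JohnBaker's answer, any recur refers to the (nearest) enclosing loop or fn (which may be dressed as a letfn or a defn). In your snippet the recur sits inside the body of the for, which is not in tail position with respect to the enclosing loop: for uses the value of its body to build the sequence, so something is still left to do after it.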
But just replace the for with loop, and you get
(loop [i 20]
  (loop [x 1]
    (if (zero? (rem i x))
      i
      (recur (+ i 1)))))
... which evaluates to 20, no doubt what you intended.
However, the outer loop is never recurred to, so it might as well be a let:
(let [i 20]
  (loop [x 1]
    (if (zero? (rem i x))
      i
      (recur (+ i 1)))))
The recur is in tail position because there is nothing left to do in that control path through the loop form: the recur form is the returned value. Both arms of an if have their own tail position.
There are some other disguised ifs that have their own tail position: ors and ands are such.
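If the goal was to find the first i, counting up from the limit, that every x in the range divides, one way to keep recur in tail position is to do the inner check with every? instead of for. This is a guess at the intent, since the question does not state it, and smallest-multiple is a made-up name.

```clojure
(defn smallest-multiple [limit]
  (loop [i limit]
    ;; the inner iteration is an ordinary expression, not a loop to recur into
    (if (every? #(zero? (rem i %)) (range 1 (inc limit)))
      i                  ; tail position of the if: returned as the loop's value
      (recur (inc i))))) ; also tail position: nothing left to do afterwards

(smallest-multiple 10) ; => 2520
```

With a limit of 20 this brute-force search has to step through a very large number of candidates, so treat it as an illustration of tail position rather than an efficient solution.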

Tracking down a StackOverflow in a Clojure program, contains SSCCE

I am having a hard time tracking down a stackoverflow error produced by my clojure implementation of the Bron-Kerbosch algorithm.
Below is my implementation of the algorithm itself, and here is a link to a SSCCE outlining the issue http://pastebin.com/gtpTT4B4.
(defn Bron-Kerbosch [r p x graph cliques]
  ;(println (str "R:" r " P:" p " X:" x " Clq:" cliques))
  (cond (and (empty? p) (empty? x)) (concat cliques r)
        :else
        (let [neigh (neighV graph (dec (count p)))]
          (loop [loop-clq '(cliques)
                 loop-cnt (dec (count p))
                 loop-p '(p)
                 loop-x '(x)]
            (cond (= -1 loop-cnt) loop-clq
                  :else
                  (recur (concat loop-clq
                                 (Bron-Kerbosch (concat r loop-cnt)
                                                (concat p neigh)
                                                (filter (set x) neigh)
                                                graph cliques))
                         (dec loop-cnt)
                         (filter (set p) loop-cnt)
                         (concat x loop-cnt)))))))
I would have to assume that the issue obviously lies within one of my two bootstrap conditions (cond (and (empty? p) (empty? x)) and (cond (= -1 loop-cnt) because the function algorithm is recursive.
Though this assumes that I am building the lists x r p correctly. Judging by the output of the commented out print statement (cliques is always printed as an EmptyList) I assume that my list comprehension might also be the issue.
Along the same lines, the other issue I can somewhat see is that I am not actually calling the algorithm properly in the BK-Call function (in the SSCCE).
My overall question is what is causing this? Though this is somewhat too open, another question that might lead me to my answer is how I might go about using the print statement on the first line.
When the print statement is uncommented it produces the output
R:clojure.lang.LazySeq@2044e0b9 P:clojure.lang.LazySeq@7f9700a5 X:clojure.lang.LazySeq@1 Clq:clojure.lang.PersistentList$EmptyList@1
I assume that if I could see that x r p are at each call I might be able to see where the algorithm is going wrong.
Can anyone point me in the right direction?
EDIT: The neighV function from the SSCCE
(defn neighV [graph nodenum]
  (let [ret-list (for [i (range (count graph)) :when (contains? (graph i) nodenum)] i)]
    ret-list))
EDIT2: Noisesmith's answers had gotten me closer to the solution and made sense to me. I wrapped all of my concat in doall. After trying to call the function again I was getting "Cannot cast Long to Seq" errors, I figured that these stemmed from trying to concat loop-cnt onto lists in the function
fptests.core> (BK-Call (sanity1))
IllegalArgumentException Don't know how to create ISeq from: java.lang.Long clojure.lang.RT.seqFrom (RT.java:505)
fptests.core> (concat 1 '(2 3))
IllegalArgumentException Don't know how to create ISeq from: java.lang.Long clojure.lang.RT.seqFrom (RT.java:505)
So I then wrapped each loop-cnt in a '() to turn it into a list before it is passed to concat
fptests.core> (concat '(1) '(2 3))
(1 2 3)
After I made all of these changes, I ended up back at my stack overflow. Here is the new Bron-Kerbosch function with all of the edits. I guess I now have the same questions as I did before.
The new ones are: did I correctly implement the changes that I should have, and does the usage of '() make sense as a fix for the issue that came up after implementing noisesmith's changes?
(defn Bron-Kerbosch1 [r p x graph cliques]
  (cond (and (empty? p) (empty? x)) (doall (concat cliques r))
        :else
        (let [neigh (neighV graph (dec (count p)))]
          (loop [loop-clq '(cliques)
                 loop-cnt (dec (count p))
                 loop-p '(p)
                 loop-x '(x)]
            (cond (= -1 loop-cnt) loop-clq
                  :else
                  (recur (doall (concat loop-clq
                                        (Bron-Kerbosch1 (doall (concat r '(loop-cnt)))
                                                        (doall (concat p neigh))
                                                        (filter (set x) neigh)
                                                        graph cliques)))
                         (dec loop-cnt)
                         (filter (set p) '(loop-cnt))
                         (doall (concat x '(loop-cnt)))))))))
EDIT3: After patiently waiting for my prn statements to stop (not sure why I put them in when I knew it was in a SO) I found that most if not all statements printed were along the lines of
"R:" (loop-cnt loop-cnt loop-cnt loop-cnt loop-cnt loop-cnt loop-cnt ...)
"P:" (range (count graph) 0 2 3) " X:" () " Clq:" ()
After inspecting this some I realized that I have not been recursively calling the function properly. I have been union'ing items to P instead of removing them. This causes P to continuously grow. This is most likely the cause of my stack overflow. Though there are still some issues. I still am creating a stackoverflow, yet again.
Once I fixed my issue of continuing to union to P my issue is that when I concat loop-cnt it is not, I guess to say, evaluated to a value but it stays as a variable name loop-cnt. I suspect that my stack overflow now lies with my bootstrap condition not being met because it cannot be met if loop-cnt is not evaluated to a number.
So I think my issue now lies with concat loop-cnt to a list as a number and not a variable.
concat is lazy. Recursive calls that build concat on top of concat, without realizing any of the prior layers of laziness, each add to the size of the stack of calls that will be needed to realize the lazy-seq.
Does this concatenation need to be lazy at all? If not, wrap the calls to concat in calls to doall. This makes the concatenation eager, which reduces the size of the call stack needed to realize the final result and thus eliminates the StackOverflowError.
Also, the correct way to print a lazy-seq and see its contents is prn. If needed, you can use pr-str to get, as a string, the form of the value that pr or prn would print.
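Both points can be tried at the REPL. The sketch below uses made-up names (xs, append-all) and a trivial accumulation in place of the real algorithm.

```clojure
(def xs (concat [1 2] [3]))

(str xs)    ; not the elements; typically something like "clojure.lang.LazySeq@..."
(pr-str xs) ; => "(1 2 3)"

;; realize each concat as it is built, so no tower of unrealized layers forms
(defn append-all [n]
  (loop [i 0, acc ()]
    (if (= i n)
      acc
      (recur (inc i) (doall (concat acc [i]))))))

(append-all 5) ; => (0 1 2 3 4)
```

Note that doall walks the whole sequence on every iteration, so this trades stack depth for extra traversal. Accumulating into a vector with conj or into is usually the cheaper eager alternative.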
You are misusing quoted lists, I think.
For example, in (defn Bron-Kerbosch1 [ ... ] ... ), '(cliques) evaluates to a list containing the symbol cliques, not to a list containing the argument cliques as its one element. You want (list cliques) for that.
Similar cases abound.
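A small sketch of the difference, with sample values standing in for the function's arguments:

```clojure
(def cliques [[1 2 3]]) ; sample value, standing in for the argument
(def loop-cnt 7)        ; sample value, standing in for the loop local

'(cliques)                     ; => (cliques)      a list holding the symbol
(list cliques)                 ; => ([[1 2 3]])    a list holding the value
(concat [1 2] '(loop-cnt))     ; => (1 2 loop-cnt) the symbol again
(concat [1 2] (list loop-cnt)) ; => (1 2 7)
```

This would also explain the printed R: (loop-cnt loop-cnt loop-cnt ...) output from EDIT3: the quoted list never evaluates loop-cnt.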

Clojure using let variable declaration within its own instantiation?

In the language of Clojure I am trying to pass a variable that I am defining within a let as a parameter to a function within the same let. The variable itself represents a list of vectors representing edges in a graph. The function I want to pass it to uses the list to make sure that it does not generate the same value within the list.
The function in whole
(defn random-WS
  ([n k] (random-WS (- n 1) k (reg-ring-lattice n k)))
  ([n k graph]
   (cond (= n -1) graph
         :else
         (let [rem-list (for [j (range (count (graph n))) :when (< (rand) 0.5)]
                          [n (nth (seq (graph n)) j)])
               add-list (for [j (range (count rem-list))]
                          (random-WSE graph n add-list))
               new-graph (reduce add-edge (reduce rem-edge graph rem-list) add-list)]
           (random-WS (- n 1) k new-graph)))))
The actual problem statement is seen here
add-list (for [j (range (count rem-list))]
           (random-WSE graph n add-list))
Again for clarity, the function random-WSE generates a random edge for my graph based on some rules. Given the current graph, current node n, and current list of edges to add add-list it will generate one more edge to add to the list based on some rules.
The only real idea I have is to first let add-list () to first define it before then redefining it. Though this still has somewhat the same issue, though add-list is now defined, it will be () through out the for statement. Thus the function random-WSE will not take into account the edges already in the list.
Is there a way to "evaluate" add-list at some defined point within its own definition so that it can be used, within its definition? So I would first "evaluate" it to () before the for and then "evaluate" after each iteration of the for.
If you're interested, the function is used to create a random Watts-Strogatz graph.
From what I get of your description of this algorithm, add-list grows (accumulates) during the problematic for loop. Accumulation (for a very broad acceptation of accumulation) is a strong sign you should use reduce:
(reduce (fn [add-list j] (conj add-list (random-WSE graph n add-list))) [] (range (count rem-list)))
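To see the accumulation at work without the graph code, here is a runnable sketch. pick-new-edge is a hypothetical, deterministic stand-in for random-WSE: it returns the first edge [n m] that is not yet in the list it is given.

```clojure
;; hypothetical stand-in for random-WSE: first edge from n not already taken
(defn pick-new-edge [n taken]
  (first (remove (set taken) (map #(vector n %) (range)))))

;; each step sees the edges accumulated so far, which the for version could not
(reduce (fn [add-list _] (conj add-list (pick-new-edge 0 add-list)))
        []
        (range 3))
;; => [[0 0] [0 1] [0 2]]
```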
Basically you're chaining results in your let, in the sense that the result of the first computation (resulting in rem-list) is the sole input for your second computation (your trouble point) which again is the sole input for your third computation, which is finally the sole input to your final computation step (your recursion step). If this chaining sounds familiar, that's because it is: think about reformulating your let construction in terms of the threading macro ->.
I.e. something along the lines of
(defn- rem-list [graph n]
  ...)
(defn- add-list [remlist n]
  ...)
(defn- new-graph [addlist]
  ...)
(defn random-ws
  ([n k] ...)
  ([graph n k] ;; <- note the moved parameter
   (cond (= n -1) graph
         :else
         (-> graph
             (rem-list n)
             (add-list n)
             (new-graph)
             (random-ws (dec n) k)))))
You can then formulate add-list as a simple recursive function (maybe introduce an accumulator variable) or use the reduce variant that cgrand explained.

Iteratively apply function to its result without generating a seq

This is one of those "Is there a built-in/better/idiomatic/clever way to do this?" questions.
I want a function--call it fn-pow--that will apply a function f to the result of applying f to an argument, then apply it to the result of applying it to its result, etc., n times. For example,
(fn-pow inc 0 3)
would be equivalent to
(inc (inc (inc 0)))
It's easy to do this with iterate:
(defn fn-pow-0
  [f x n]
  (nth (iterate f x) n))
but that creates and throws away an unnecessary lazy sequence.
It's not hard to write the function from scratch. Here is one version:
(defn fn-pow-1
  [f x n]
  (if (> n 0)
    (recur f (f x) (dec n))
    x))
I found this to be almost twice as fast as fn-pow-0, using Criterium on (fn-pow inc 0 10000000).
I don't consider the definition of fn-pow-1 to be unidiomatic, but fn-pow seems like something that might be a standard built-in function, or there may be some simple way to define it with a couple of higher-order functions in a clever arrangement. I haven't succeeded in discovering either. Am I missing something?
The built-in you are looking for is probably dotimes. I'll tell you why in a round-about fashion.
Time
What you are testing in your benchmark is mainly the overhead of a level of indirection. That (nth (iterate ...) n) is only twice as slow as what compiles to a loop when the body is a very fast function is rather surprising/encouraging. If f is a more costly function, the importance of that overhead diminishes. (Of course if your f is low-level and fast, then you should use a low-level loop construct.)
Say your function takes ~ 1 ms instead
(defn my-inc [x] (Thread/sleep 1) (inc x))
Then both of these will take about 1 second -- the difference is around 2% rather than 100%.
(bench (fn-pow-0 my-inc 0 1000))
(bench (fn-pow-1 my-inc 0 1000))
Space
The other concern is that iterate is creating an unnecessary sequence. But, if you are not holding onto the head, just doing an nth, then you aren't really creating a sequence per se but sequentially creating, using, and discarding LazySeq objects. In other words, you are using a constant amount of space, though generating garbage in proportion to n. However, unless your f is primitive or mutating its argument, then it is already producing garbage in proportion to n in producing its own intermediate results.
Reducing Garbage
An interesting compromise between fn-pow-0 and fn-pow-1 would be
(defn fn-pow-2 [f x n] (reduce (fn [x _] (f x)) x (range n)))
Since range objects know how to intelligently reduce themselves, this does not create additional garbage in proportion to n. It boils down to a loop as well. This is the reduce method of range:
public Object reduce(IFn f, Object start) {
    Object ret = f.invoke(start,n);
    for(int x = n+1;x < end;x++)
        ret = f.invoke(ret, x);
    return ret;
}
This was actually the fastest of the three (before adding primitive type-hints on n in the recur version, that is) with the slowed down my-inc.
Mutation
If you are iterating a function potentially expensive in time or space, such as matrix operations, then you may very well be wanting to use (in a contained manner) an f that mutates its argument to eliminate the garbage overhead. Since mutation is a side effect, and you want that side effect n times, dotimes is the natural choice.
For the sake of example, I'll use an atom as a stand-in, but imagine bashing on a mutable matrix instead.
(def my-x (atom 0))
(defn my-inc! [x] (Thread/sleep 1) (swap! x inc))
(defn fn-pow-3! [f! x n] (dotimes [i n] (f! x)))
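A usage sketch, with the sleep left out so it runs instantly. counter and bump! are made-up names playing the roles of my-x and my-inc!.

```clojure
(def counter (atom 0))         ; stand-in for a mutable structure
(defn bump! [x] (swap! x inc)) ; like my-inc!, without the sleep
(defn fn-pow-3! [f! x n] (dotimes [_ n] (f! x)))

(fn-pow-3! bump! counter 5) ; returns nil; the result lives in the atom
@counter                    ; => 5
```

Unlike the other variants, the return value is not the result. The caller reads it back from the mutated argument.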
That sounds just like composing functions n times.
(defn fn-pow [f p t]
((apply comp (repeat t f)) p))
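As a quick sanity check, the iterate, recur, reduce, and comp variants from this thread agree on a small input. The definitions are repeated so the sketch runs on its own, and fn-pow-comp is my name for the comp version, with its parameters renamed to match the others.

```clojure
(defn fn-pow-0 [f x n] (nth (iterate f x) n))
(defn fn-pow-1 [f x n] (if (> n 0) (recur f (f x) (dec n)) x))
(defn fn-pow-2 [f x n] (reduce (fn [x _] (f x)) x (range n)))
(defn fn-pow-comp [f x n] ((apply comp (repeat n f)) x))

(map #(% inc 0 3) [fn-pow-0 fn-pow-1 fn-pow-2 fn-pow-comp]) ; => (3 3 3 3)
```

The comp version builds a composition of n functions before calling it, so I would expect it to be the least suited to very large n; I have not measured this.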
Hmmm. I note that Ankur's version is around 10x slower than your original - possibly not the intent, no matter how idiomatic? :-)
Type hinting fn-pow-1 simply for the counter yields substantially faster results for me - around 3x faster.
(defn fn-pow-3 [f x ^long n]
  (if (> n 0)
    (recur f (f x) (dec n))
    x))
This is around twice as slow as a version which uses inc directly, losing the variability (not hinting x to keep to the spirit of the test)...
(defn inc-pow [x ^long n]
  (if (> n 0)
    (recur (inc x) (dec n))
    x))
I think that for any nontrivial f, fn-pow-3 is probably the best solution.
I haven't found a particularly "idiomatic" way of doing this, as it does not feel like a common use case outside of micro benchmarks (although I would love to be contradicted).
I would be intrigued to hear of a real world example, if you have one.
To us benighted imperative programmers, a more general pattern is known as a while statement. We can capture it in a macro:
(defmacro while [bv   ; binding vector
                 tf   ; test form
                 recf ; recur form
                 retf ; return form
                 ]
  `(loop ~bv (if ~tf (recur ~@recf) ~retf)))
... in your case
(while [x 0, n 3] (pos? n)
  [(inc x) (dec n)]
  x)
; 3
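To see what the macro generates, macroexpand-1 is handy. The sketch below uses the hypothetical name loop-while for the same macro, so that it does not replace clojure.core/while in the current namespace.

```clojure
;; same macro as above under a made-up name, to leave clojure.core/while alone
(defmacro loop-while [bv tf recf retf]
  `(loop ~bv (if ~tf (recur ~@recf) ~retf)))

(macroexpand-1 '(loop-while [x 0, n 3] (pos? n) [(inc x) (dec n)] x))
;; => (clojure.core/loop [x 0 n 3] (if (pos? n) (recur (inc x) (dec n)) x))

(loop-while [x 0, n 3] (pos? n) [(inc x) (dec n)] x) ; => 3
```

The ~@ splices the elements of the recur vector in as the arguments of recur, which is why the recur form is written as a vector at the call site.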
I was hoping to type-hint the n, but it's illegal. Maybe it's inferred.
Forgive me (re)using while.
This isn't quite right: it doesn't allow for computation prior to the recur-form.
We can adapt the macro to do things prior to the recur:
(defmacro while [bv    ; binding vector
                 tf    ; test form
                 bodyf ; body form
                 retf  ; return form
                 ]
  (let [bodyf (vec bodyf)
        recf (peek bodyf)
        bodyf (seq (conj (pop bodyf) (cons `recur recf)))]
    `(loop ~bv (if ~tf ~bodyf ~retf))))
For example
(while [x 0, n 3] (pos? n)
  (let [x- (inc x) n- (dec n)] [x- n-])
  x)
; 3
I find this quite expressive. YMMV.