(list) in lazy-seq causes infinite recursion but (cons) does not - clojure

In attempting to understand lazy-seq, I came up with this example:
(defn zeroes []
  (list 0 (lazy-seq (zeroes))))
(take 5 (zeroes)) ; too much recursion error
This however triggers a too much recursion error. Replacing (list) with (cons) fixes the problem, but I don't understand why:
(defn zeroes []
  (cons 0 (lazy-seq (zeroes))))
(take 5 (zeroes)) ; returns (0 0 0 0 0)
My understanding of lazy-seq is that it immediately returns a lazy-seq instance but that its body is not evaluated until a call to first or rest on that instance. So I would think (zeroes) would just return a simple Cons of 0 and a LazySeq with a yet unevaluated body.
As an additional curiosity, I'm puzzled why this hangs the repl (because the repl attempts to print an infinite sequence) but doesn't trigger a 'too much recursion' error.
(defn zeroes []
  (cons 0 (lazy-seq (zeroes))))
(zeroes) ; hangs the repl
(In case it's relevant, I'm trying these examples in the ClojureScript repl at http://himera.herokuapp.com/index.html.)

You asked two questions...
1) Why does using list instead of cons in the following code result in infinite recursion?
(defn zeroes []
  (list 0 (lazy-seq (zeroes))))
(take 5 (zeroes)) ; too much recursion error
The version using cons produces an infinite sequence of zeros, like this:
(0 0 0 0 ...)
If you use list instead, you produce a totally different result. You get an infinite nesting of lists of two elements each (with head=0 and tail=another list):
'(0 (0 (0 (0 (...)))))
Since the top-level list only has two elements, you end up with the whole thing when you call (take 5). You get the "too much recursion" error when the REPL tries to print out these infinitely-nested lists.
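A small sketch of the difference. The names zeroes-list and zeroes-cons are hypothetical, chosen only so both variants can sit side by side. The code inspects the shape of each result without printing the infinite structure, and assumes an ordinary Clojure REPL:

```clojure
;; Hypothetical names, not from the question.
(defn zeroes-list []
  (list 0 (lazy-seq (zeroes-list))))   ; two elements: 0 and a nested lazy seq

(defn zeroes-cons []
  (cons 0 (lazy-seq (zeroes-cons))))   ; 0 prepended to the rest of the sequence

;; The list version has exactly two top-level elements, so take 5 yields both.
(count (take 5 (zeroes-list)))   ;=> 2
;; Its second element is itself a two-element list starting with 0, and so on.
(first (second (zeroes-list)))   ;=> 0
;; The cons version is flat, so take 5 really yields five zeros.
(take 5 (zeroes-cons))           ;=> (0 0 0 0 0)
```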
Note that you could safely substitute list* for cons. The list* function takes a variable number of arguments (as does list), but unlike list it assumes the last argument is a seq. This means (list* a b c d) is essentially just shorthand for (cons a (cons b (cons c d))). Since (list* a b) is basically the same as (cons a b), it follows that you can make the substitution in your code. In this case it probably wouldn't make much sense, but it's nice if you're cons-ing several items at the same time.
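To illustrate the list* point, here is a short sketch; zeroes* is a hypothetical name used only for this example:

```clojure
;; list* treats its last argument as a seq and prepends the others to it.
(list* 1 2 [3 4])                    ;=> (1 2 3 4)
(= (list* 0 [1 2]) (cons 0 [1 2]))   ;=> true

;; zeroes written with list* (hypothetical name) stays flat and lazy.
(defn zeroes* []
  (list* 0 (lazy-seq (zeroes*))))

(take 5 (zeroes*))                   ;=> (0 0 0 0 0)
```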
2) Why does the following diverge (hang) rather than throw a "too much recursion" error like we saw above?
(defn zeroes []
  (cons 0 (lazy-seq (zeroes))))
(zeroes) ; hangs the repl
The zeroes function produces a "flat" sequence of zeros (unlike the nested lists above). The REPL probably uses a tail-recursive function to evaluate each successive lazy element of the sequence. Tail-call optimization allows recursive functions to recur indefinitely without blowing the call stack, so that's what happens. Recursive tail calls in Clojure are denoted by the recur special form.
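This sketch does not show how the REPL printer is actually implemented. It only suggests, assuming JVM Clojure, that walking far into the flat sequence is an iterative process that realizes one cell at a time, so the stack does not grow with the number of elements consumed:

```clojure
(defn zeroes []
  (cons 0 (lazy-seq (zeroes))))

;; Both calls step through the sequence one cell at a time.
(nth (zeroes) 100000)            ;=> 0
(count (take 100000 (zeroes)))   ;=> 100000
```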

Related

What scope should calls to lazy-seq have?

I'm writing a lazy implementation of Recamán's sequence, and ran into some confusion regarding where calls to lazy-seq should happen.
The first version I came up with this morning was:
(defn lazy-recamans-sequence []
  (let [f (fn rec [n seen last-s]
            (let [back (- last-s n)
                  new-s (if (and (pos? back) (not (seen back)))
                          back
                          (+ last-s n))]
              (lazy-seq ; Here
                (cons new-s (rec (inc n) (conj seen new-s) new-s)))))]
    (f 0 #{} 0)))
Then I realized that my placement of lazy-seq was kind of arbitrary, and that it could be placed higher to wrap more of the computations:
(defn lazy-recamans-sequence2 []
  (let [f (fn rec [n seen last-s]
            (lazy-seq ; Here
              (let [back (- last-s n)
                    new-s (if (and (pos? back) (not (seen back)))
                            back
                            (+ last-s n))]
                (cons new-s (rec (inc n) (conj seen new-s) new-s)))))]
    (f 0 #{} 0)))
Then I looked back on a review that someone gave me last night:
(defn recaman []
  (letfn [(tail [previous n seen]
            (let [nx (if (and (> previous n) (not (seen (- previous n))))
                       (- previous n)
                       (+ previous n))]
              ; Here, inside "cons"
              (cons nx (lazy-seq (tail nx (inc n) (conj seen nx))))))]
    (tail 0 0 #{})))
And they have theirs inside of the call to cons!
Thinking this over, it seems like it wouldn't make a difference. With a broader scope (like the second version), more code is inside the explicit function that's passed to LazySeq. With a narrower scope however, the function itself may be smaller, but since the passed function involves a recursive call, it will be executing the same code anyways.
They seem to perform nearly identically and give the same answers. Is there any reason to prefer placing lazy-seq in one place over another? Is this simply a stylistic choice, or can this have actual repercussions?
In the first two examples the lazy-seq wraps the cons call. This means that when you call the function, it returns a lazy sequence immediately, before any cons cell has been built.
In the first example the let expression is still outside of lazy-seq so the value of the first item is calculated immediately but the returned sequence is still lazy and not realized.
The second example is similar to the first. The lazy-seq wraps the cons cell and also the let block. This means that the function will return immediately and the value of the first item is calculated only when the caller starts to consume the lazy sequence.
In the third example the value of the first item in the list is calculated immediately and only the tail of the returned sequence is lazy.
Is there any reason to prefer placing lazy-seq in one place over another?
It depends on what you want to achieve. Do you want to return a sequence immediately without calculating any values? In this case make the scope of lazy-seq as broad as possible. Otherwise try to restrict the scope of lazy-seq to calculate only the tail part of the sequence.
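The difference in timing can be made visible with a side effect. This is a minimal sketch, not the Recamán code itself: broad, narrow, and the computed atom are hypothetical names, and the "work" is just recording the item.

```clojure
;; Records which items have been computed, to make evaluation order visible.
(def computed (atom []))

;; Broad scope: the work sits inside lazy-seq along with the cons.
(defn broad [n]
  (lazy-seq
    (swap! computed conj n)
    (cons n (broad (inc n)))))

;; Narrow scope: only the tail is lazy, so the head is computed on each call.
(defn narrow [n]
  (swap! computed conj n)
  (cons n (lazy-seq (narrow (inc n)))))

(reset! computed [])
(def b (broad 0))     ; returns an unrealized LazySeq
@computed             ;=> []
(first b)             ;=> 0
@computed             ;=> [0]

(reset! computed [])
(def a (narrow 0))    ; the head is computed right away
@computed             ;=> [0]
```

Using def here (rather than evaluating the call directly at the REPL) avoids printing the infinite sequence.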
When I was first learning Clojure, I was a bit confused by the many possible choices of lazy-seq constructs, the lack of clarity in terms of which construct to choose, and the somewhat vague explanation for how lazy-seq creates laziness in the first place (it is implemented as a Java class of ~240 lines).
To reduce repetition and keep things as simple as possible, I created the lazy-cons macro. It is used like so:
(defn lazy-countdown [n]
  (when (<= 0 n)
    (lazy-cons n (lazy-countdown (dec n)))))
(deftest t-all
  (is= (lazy-countdown 5) [5 4 3 2 1 0])
  (is= (lazy-countdown 1) [1 0])
  (is= (lazy-countdown 0) [0])
  (is= (lazy-countdown -1) nil))
This version does realize the initial value n immediately.
I never worry about chunking (typically batches of 32) or trying to precisely control the number of elements realized in a lazy sequence. IMHO, if you need fine-grained control such as this, it is better to use an explicit loop than to make assumptions on the timing of realizations in a lazy sequence.

The usage of lazy-sequences in clojure

I am wondering whether lazy-seq returns a finite or an infinite list. Here is an example:
(defn integers [n]
  (cons n (lazy-seq (integers (inc n)))))
when I run like
(first (integers 10))
or
(take 5 (integers 10))
the results are 10 and (10 11 12 13 14). However, when I run
(integers 10)
the process prints nothing and never returns. Can anyone tell me why, and explain how lazy-seq is meant to be used? Thank you so much!
When you say that you are running
(integers 10)
what you're really doing is something like this:
user> (integers 10)
In other words, you're evaluating that form in a REPL (read-eval-print-loop).
The "read" step will convert from the string "(integers 10)" to the list (integers 10). Pretty straightforward.
The "eval" step will look up integers in the surrounding context, see that it is bound to a function, and evaluate that function with the parameter 10:
(cons 10 (lazy-seq (integers (inc 10))))
Since a lazy-seq isn't realized until it needs to be, simply evaluating this form will result in a clojure.lang.Cons object whose first element is 10 and whose rest element is a clojure.lang.LazySeq that hasn't been realized yet.
You can verify this with a simple def (no infinite hang):
user> (def my-integers (integers 10))
;=> #'user/my-integers
In the final "print" step, Clojure basically tries to convert the result of the form it just evaluated to a string, then print that string to the console. For a finite sequence, this is easy. It just keeps taking items from the sequence until there aren't any left, converts each item to a string, separates them by spaces, sticks some parentheses on the ends, and voilà:
user> (take 5 (integers 10))
;=> (10 11 12 13 14)
But as you've defined integers, there won't be a point at which there are no items left (well, at least until you get an integer overflow, but that could be remedied by using inc' instead of just inc). So Clojure is able to read and evaluate your input just fine, but it simply cannot print all the items of an infinite result.
When you try to print an unbounded lazy sequence, it will be completely realized, unless you limit *print-length*.
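As a minimal sketch of that limit, the following binds *print-length* around a call to pr-str, so the printer stops after five items and marks the truncation:

```clojure
(defn integers [n]
  (cons n (lazy-seq (integers (inc n)))))

;; With *print-length* bound, the printer stops after five items.
(binding [*print-length* 5]
  (pr-str (integers 10)))
;=> "(10 11 12 13 14 ...)"
```

At an interactive REPL, (set! *print-length* 5) should have the same effect for the rest of the session.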
The lazy-seq macro never constructs a list, finite or infinite. It constructs a clojure.lang.LazySeq object. This is a nominal sequence that wraps a function of no arguments (commonly called a thunk) that evaluates to the actual sequence when called; but it isn't called until it has to be, and that's the purpose of the mechanism: to delay evaluating the actual sequence.
So you can pass endless sequences around as LazySeq objects, provided you never fully realise them. Your evaluation at the REPL triggers realisation, an endless process.
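The delayed evaluation can be observed with realized?. This sketch assumes JVM Clojure (the class names are JVM-specific); ints and tail are hypothetical names:

```clojure
(defn integers [n]
  (cons n (lazy-seq (integers (inc n)))))

(def ints (integers 10))   ; def does not print, so nothing is realized
(class ints)               ;=> clojure.lang.Cons
(def tail (rest ints))     ; the LazySeq wrapping the thunk
(class tail)               ;=> clojure.lang.LazySeq
(realized? tail)           ;=> false
(first tail)               ;=> 11, which forces the thunk
(realized? tail)           ;=> true
```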
Nothing comes back because the REPL keeps realizing the infinite sequence that your integers function describes, which amounts to an infinite loop.
(defn integers [n]
  (do (prn n)
      (cons n (lazy-seq (integers (inc n))))))
Call it with (integers 10) and you'll see it counting forever.

Building a list with loops

We just covered loops today in class and I've got a few things I need to do. Put simply, I have to build a list using loops instead of recursion. I seem to be at a stumbling block here. For this example, we need to do a simple countdown. The function takes an argument and then returns a list of all the positive integers less than or equal to the initial argument. (countdown 5) => (5 4 3 2 1)
I'm having a hard time getting loops for whatever reason. The ones we talked about were Loop, Do, Dotimes, and Dolist. I've tried this with a couple of loops and always end up with a similar result.
(defun countdown (num)
  (cond ((= num 0) nil)
        (T (let* ((list nil))
             (loop
               (if (= num 0) (return list)
                   (setf list (cons list num)))
               (setf num (- num 1)))))))
My output shows up like this:
(((((NIL . 5) . 4) . 3) . 2) . 1)
update: I've solved the issue. Apparently I needed to reverse the order in the cons, so num comes before list. Does anyone care to explain this? I thought you put the list first and then what you put second would be added to the end of it. At least, that's how I've been using it so far without issue.
Reverse the arguments to cons (and why)
You wrote in an answer (that, since it asks for more information, perhaps should have been a comment):
I've solved the issue. Apparently I needed to reverse the order in the
cons, so num comes before list. Does anyone care to explain this? I
thought you put the list first and then what you put second would be
added to the end of it. At least, that's how I've been using it so far
without issue.
The function is documented in the HyperSpec clearly: Function CONS. The examples in the documentation show, e.g.,
(cons 1 (cons 2 (cons 3 (cons 4 nil)))) => (1 2 3 4)
(cons 'a (cons 'b (cons 'c '()))) => (A B C)
(cons 'a '(b c d)) => (A B C D)
and even the note
If object-2 is a list, cons can be thought of as producing a new list which is like it but has object-1 prepended.
It may help to read through 14.1.2 Conses as Lists, as well, which includes:
A list is a chain of conses in which the car of each cons is an element of the list, and the cdr of each cons is either the next link in the chain or a terminating atom.
Concerning loop
Many of the answers here are pointing out to you that the loop form includes a special iteration language. That's true, but it can also be used in the way that you're using it. That way is called a simple loop:
6.1.1.1.1 Simple Loop
A simple loop form is one that has a body containing only compound
forms. Each form is evaluated in turn from left to right. When the
last form has been evaluated, then the first form is evaluated again,
and so on, in a never-ending cycle. A simple loop form establishes an
implicit block named nil. The execution of a simple loop can be
terminated by explicitly transfering control to the implicit block
(using return or return-from) or to some exit point outside of the
block (e.g., using throw, go, or return-from).
Simple loops probably aren't as common as loops using the nicer features that loop provides, but if you just covered this in class, you might not be there yet. The other answers do provide some good examples, though.
If you are speaking about the Common Lisp loop macro, your countdown may look like this:
(defun countdown (from-number)
  (loop :for x :from from-number :downto 1 :collect x))
CL-USER> (countdown 10)
(10 9 8 7 6 5 4 3 2 1)
Using loop, which has its own "special-purpose language" that does not really look like Lisp:
(defun countdown (n)
  (loop
    for i from n downto 1
    collect i))
Or using do:
(defun countdown (n)
  (do ((i 1 (1+ i))
       (res nil (cons i res)))
      ((> i n) res)))
See here, especially chapters 7 and 22.

Overflow while using recur in clojure

I have a simple prime number calculator in clojure (an inefficient algorithm, but I'm just trying to understand the behavior of recur for now). The code is:
(defn divisible [x,y] (= 0 (mod x y)))
(defn naive-primes [primes candidates]
  (if (seq candidates)
    (recur (conj primes (first candidates))
           (remove (fn [x] (divisible x (first candidates))) candidates))
    primes))
This works as long as I am not trying to find too many numbers. For example
(print (sort (naive-primes [] (range 2 2000))))
works. For anything requiring more recursion, I get an overflow error.
(print (sort (naive-primes [] (range 2 20000))))
will not work. In general, whether I use recur or call naive-primes again without the attempt at TCO doesn't appear to make any difference. Why am I getting errors for large recursions while using recur?
recur always uses tail recursion, regardless of whether you are recurring to a loop or a function head. The issue is the calls to remove. remove calls first to get the element from the underlying seq and checks to see if that element is valid. If the underlying seq was created by a call to remove, you get another call to first. If you call remove 20000 times on the same seq, calling first requires calling first 20000 times, and none of the calls can be tail recursive. Hence, the stack overflow error.
Changing (remove ...) to (doall (remove ...)) fixes the problem, since it prevents the unbounded stacking of remove calls (each one gets fully applied immediately and returns a realized seq rather than a pending lazy one). I think this method only ever keeps one candidates list in memory at one time, though I am not positive about this. If so, it isn't too space inefficient, and a bit of testing shows that it isn't actually much slower.
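A sketch of the doall variant, keeping the question's function names. The let binding for p is an addition for readability, not part of the original code:

```clojure
(defn divisible [x y]
  (= 0 (mod x y)))

(defn naive-primes [primes candidates]
  (if (seq candidates)
    (let [p (first candidates)]
      (recur (conj primes p)
             ;; doall realizes this filter now, so the next remove wraps a
             ;; realized seq instead of another pending lazy layer.
             (doall (remove (fn [x] (divisible x p)) candidates))))
    primes))

(naive-primes [] (range 2 30))
;=> [2 3 5 7 11 13 17 19 23 29]

;; The larger range from the question should now complete without overflowing.
(count (naive-primes [] (range 2 20000)))
```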

How can I call recur in an if conditional in Clojure?

I'm trying to solve the count a sequence exercise at 4clojure.com. The exercise is to count the number of elements in a collection without using the count function.
I thought I could do this via recursion, using rest. If what I get isn't empty, I return 1 + recur on the sequence that rest returned. The problem is that I end up getting
java.security.PrivilegedActionException: java.lang.UnsupportedOperationException:
Can only recur from tail position
even though I'm calling recur as the very last statement.
(fn [coll] (let [tail (rest coll)]
             (if (empty tail)
               1
               (+ 1 (recur tail)))))
Am I missing something?
The last statement is the addition, not the call to recur, which is why it doesn't work. The fact that it's inside an if has nothing to do with it. (fn [coll] (let [tail (rest coll)] (+ 1 (recur tail)))) wouldn't work either.
The usual way to turn a function like this into a tail-recursive one is to make the function take a second argument, which holds the accumulator for the value you're adding up and then recurse like this: (recur tail (+ acc 1)) instead of trying to add 1 to the result of recur.
As a general rule: If you're doing anything to the result of recur (like for example adding 1 to it), it can't be in a tail position, so it won't work.
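One way the accumulator idea might look while keeping a single-argument interface is loop/recur. This is a sketch; my-count, remaining, and acc are hypothetical names:

```clojure
;; Hypothetical name my-count; acc is the running total.
(defn my-count [coll]
  (loop [remaining (seq coll)
         acc 0]
    (if remaining
      ;; recur is the entire branch, so it is in tail position.
      (recur (next remaining) (inc acc))
      acc)))

(my-count [1 2 3])   ;=> 3
(my-count [])        ;=> 0
```

The addition now happens in the arguments to recur, before the jump, rather than on its result.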
The error that you are getting is pointing out that your final expression of (+ 1 (recur tail)) is not tail-call-optimization optimizable (is that a word?). The problem is that it needs to keep a bunch of (+ 1 ...) expressions on the stack in order to evaluate the result of the function. Tail call optimization can only occur if the value of the called function is the only thing needed to know the return value of the function making the call.
What you are trying to write is pretty much a fold. In this case the function should pass along the rest of the collection as well as the count so far.
; call with an initial count of 0
(fn [count coll]
  (if (empty? coll)
    count
    (recur (+ 1 count) (rest coll))))
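Since this is described as a fold, the same count can also be sketched directly with reduce. The name count-with-reduce is hypothetical, and whether reduce is permitted by the exercise is an assumption:

```clojure
;; Hypothetical name; the accumulator n plays the role of the running count.
(defn count-with-reduce [coll]
  (reduce (fn [n _] (inc n)) 0 coll))

(count-with-reduce [1 2 3])   ;=> 3
(count-with-reduce [])        ;=> 0
```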