Thinking in Lazy Sequences - clojure

Taking an example of the Fibonacci series from the Clojure Wiki, the Clojure code is:
(def fib-seq
  (lazy-cat [0 1] (map + (rest fib-seq) fib-seq)))
If you were to think about this starting from the [0 1], how does it work? It would be great to get suggestions on the thought process that goes into thinking in these terms.

As you noted, the [0 1] establishes the base cases: The first two values in the sequence are zero, then one. After that, each value is to be the sum of the previous value and the value before that one. Hence, we can't even compute the third value in the sequence without having at least two that come before it. That's why we need two values with which to start off.
Now look at the map form. It says to take the head items from two different sequences, combine them with the + function (adding multiple values to produce one sum), and expose the result as the next value in a sequence. The map form is zipping together two sequences — presumably of equal length — into one sequence of the same length.
The two sequences fed to map are different views of the same basic sequence, shifted by one element. The first sequence is "all but the first value of the base sequence". The second sequence is the base sequence itself, which, of course, includes the first value. But what should the base sequence be?
The definition above said that each new element is the sum of the previous element (Z - 1) and the one before that (Z - 2). That means that extending the sequence of values requires access to the previously computed values in the same sequence. We could maintain a two-element shift register, but we can instead just ask for access to our previous results. That's what the recursive reference to the sequence called fib-seq does here. The symbol fib-seq refers to a sequence that's a concatenation of zero, one, and then the sum of its own Z - 2 and Z - 1 values.
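Before walking through the recursive version item by item, it can help to see the same zip on a finite prefix of the series (a small sketch, not part of the original explanation):
;; Zip a finite prefix of the series against itself, shifted by one element.
(map + (rest [0 1 1 2 3]) [0 1 1 2 3])
;=> (1 2 3 5)
Each output element is the sum of a value and its predecessor; the recursive definition just arranges for the output to also be the input.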
Taking the sequence called fib-seq, drawing the first item yields the first element of the [0 1] vector — zero. Drawing the second item yields the second element of the vector — one. Upon drawing the third item, we consult the map to generate a sequence and use that as the remaining values. The sequence generated by map here starts out with the sum of the first item of "the rest of" [0 1], which is one, and the first item of [0 1], which is zero. That sum is one.
Drawing the fourth item consults map again, which now must compute the sum of the second item of "the rest of" the base sequence, which is the one generated by map, and the second item of the base sequence, which is the one from the vector [0 1]. That sum is two.
Drawing the fifth item consults map, summing the third item of "the rest of" the base sequence — the two we just found — and the third item of the base sequence — the one resulting from summing one and zero. That sum is three.
You can see how this is building up to match the intended definition for the series. What's harder to see is whether drawing each item is recomputing all the preceding values twice — once for each sequence examined by map. It turns out there's no such repetition here.
To confirm this, augment the definition of fib-seq like this to instrument the use of function +:
(def fib-seq
  (lazy-cat [0 1]
            (map
             (fn [a b]
               (println (format "Adding %d and %d." a b))
               (+ a b))
             (rest fib-seq) fib-seq)))
Now ask for the first ten items:
> (doall (take 10 fib-seq))
Adding 1 and 0.
Adding 1 and 1.
Adding 2 and 1.
Adding 3 and 2.
Adding 5 and 3.
Adding 8 and 5.
Adding 13 and 8.
Adding 21 and 13.
(0 1 1 2 3 5 8 13 21 34)
Notice that there are eight calls to + to generate the first ten values.
Since writing the preceding discussion, I've spent some time studying the implementation of lazy sequences in Clojure — in particular, the file LazySeq.java — and thought this would be a good place to share a few observations.
First, note that many of the lazy sequence processing functions in Clojure eventually use lazy-seq over some other collection. lazy-seq creates an instance of the Java type LazySeq, which models a small state machine. It has several constructors that allow it to start in different states, but the most interesting case is the one that starts with just a reference to a nullary function. Constructed that way, the LazySeq has neither evaluated the function nor found a delegate sequence (type ISeq in Java). The first time one asks the LazySeq for its first element — via first — or any successors — via next or rest — it evaluates the function, digs down through the resulting object to peel away any wrapping layers of other LazySeq instances, and finally feeds the innermost object through the java function RT#seq(), which results in an ISeq instance.
At this point, the LazySeq has an ISeq to which to delegate calls on behalf of first, next, and rest. Usually the "head" ISeq will be of type Cons, which stores a constant value in its "first" (or "car") slot and another ISeq in its "rest" (or "cdr") slot. That ISeq in its "rest" slot can in turn be a LazySeq, in which case accessing it will again require this same evaluation of a function, peeling away any lazy wrappers on the return value, and passing that value through RT#seq() to yield another ISeq to which to delegate.
The LazySeq instances remain linked together, but having forced one (through first, next, or rest) causes it to delegate straight through to some non-lazy ISeq thereafter. Usually that forcing evaluates a function that yields a Cons bound to first value and its tail bound to another LazySeq; it's a chain of generator functions that each yield one value (the Cons's "first" slot) linked to another opportunity to yield more values (a LazySeq in the Cons's "rest" slot).
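To make that concrete, here is a sketch of my own (not from the original discussion) that builds the same series with an explicit lazy-seq generator; each forced step yields one Cons whose rest slot holds another unforced LazySeq, exactly the chain described above:
(defn fib-step
  "Lazy sequence of Fibonacci numbers starting from a and b. Forcing one
  step yields a single Cons cell whose rest is a fresh, unforced LazySeq."
  [a b]
  (lazy-seq (cons a (fib-step b (+ a b)))))

(take 10 (fib-step 0 1))
;=> (0 1 1 2 3 5 8 13 21 34)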
Tying this back, in the Fibonacci Sequence example above, map will take each of the nested references to fib-seq and walk them separately via repeated calls to rest. Each such call will transform at most one LazySeq holding an unevaluated function into a LazySeq pointing to something like a Cons. Once transformed, any subsequent accesses will quickly resolve to the Conses — where the actual values are stored. When one branch of the map zipping walks fib-seq one element behind the other, the values have already been resolved and are available for constant-time access, with no further evaluation of the generator function required.
Here are some diagrams to help visualize this interpretation of the code:
        +---------+
        | LazySeq |
fib-seq | fn -------> (fn ...)
        | sv      |
        | s       |
        +---------+

        +---------+
        | LazySeq |
fib-seq | fn -------> (fn ...) -+
        | sv <------------------+
        | s       |
        +---------+

        +---------+
        | LazySeq |
fib-seq | fn      |
        | sv -------> RT#seq() -+
        | s  <------------------+
        +---------+

        +---------+   +------+
        | LazySeq |   | ISeq |
fib-seq | fn      |   |      |
        | sv      |   |      |
        | s --------->|      |
        +---------+   +------+

        +---------+   +----------------+   +------+
        | LazySeq |   | Cons           |   | ISeq |
fib-seq | fn      |   | first ---> 1   |   |      |
        | sv      |   | more ------------->|      |
        | s --------->|                |   |      |
        +---------+   +----------------+   +------+

        +---------+   +----------------+   +---------+
        | LazySeq |   | Cons           |   | LazySeq |
fib-seq | fn      |   | first ---> 1   |   | fn -------> (fn ...)
        | sv      |   | more ------------->| sv      |
        | s --------->|                |   | s       |
        +---------+   +----------------+   +---------+
As map progresses, it hops from LazySeq to LazySeq (and hence Cons to Cons), and the rightmost edge only expands the first time one calls first, next, or rest on a given LazySeq.

My Clojure is a bit rusty, but this seems to be a literal translation of the famous Haskell one-liner:
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
[I'm going to be using pseudo-Haskell, because it's a little bit more succinct.]
The first thing you need to do is simply let laziness sink in. When you look at a definition like this:
zeroes = 0 : zeroes
Your immediate gut reaction as a strict programmer would be "ZOMG infinite loop! Must fix bug!" But it isn't an infinite loop. This is a lazy infinite loop. If you do something stupid like
print zeroes
Then, yes, there will be an infinite loop. But as long as you simply use a finite number of elements, you will never notice that the recursion doesn't actually terminate. This is really hard to get. I still don't.
Laziness is like the monetary system: it's based on the assumption that the vast majority of people never use the vast majority of their money. So, when you put $1000 in the bank, they don't keep it in their safe. They lend it to someone else. Actually, they leverage the money, which means that they lend $5000 to someone else. They only ever need enough money so that they can quickly reshuffle it so that it's there when you are looking at it, giving you the appearance that they actually keep your money.
As long as they can manage to always give out money when you walk up to an ATM, it doesn't actually matter that the vast majority of your money isn't there: they only need the small amount you are withdrawing at the point in time when you are making your withdrawal.
Laziness works the same: whenever you look at it, the value is there. If you look at the first, tenth, hundredth, or quadrillionth element of zeroes, it will be there. But it will only be there if and when you look at it, not before.
That's why this infinitely recursive definition of zeroes works: as long as you don't try to look at the last element (or every element) of an infinite list, you are safe.
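Since the question is about Clojure, the self-referential zeroes can be written there too; this is a rough sketch of mine, not part of the original answer:
;; An "infinite" list of zeroes; only the elements you look at are produced.
(def zeroes (lazy-seq (cons 0 zeroes)))

(take 5 zeroes) ;=> (0 0 0 0 0)
;; (println zeroes) would never terminate, just like `print zeroes` above.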
Next step is zipWith. (Clojure's map is just a generalization of what in other programming languages are usually three different functions: map, zip and zipWith. In this example, it is used as zipWith.)
The reason the zip family of functions is named that way is that it actually works like a zipper, and that is also how to best visualize it. Say we have some sporting event where every contestant gets two tries, and the scores from both tries are added up to give the end result. If we have two sequences, run1 and run2, with the scores from each run, we can calculate the end result like so:
res = zipWith (+) run1 run2
Assuming our two lists are (3 1 6 8 6) and (4 6 7 1 3), we line both of those lists up side by side, like the two halves of a zipper, and then we zip them together using our given function (+ in this case) to yield a new sequence:
3   1   6   8   6
+   +   +   +   +
4   6   7   1   3
=   =   =   =   =
7   7  13   9   9
Contestant #3 wins.
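As a small aside (not from the original answer): in Clojure this zip is exactly map in its zipWith role, given two collections:
;; Pairwise sums of the two runs, computed positionally.
(map + [3 1 6 8 6] [4 6 7 1 3])
;=> (7 7 13 9 9)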
So, what does our fib look like?
Well, it starts out with a 0, then we append a 1, then we append the sum of the infinite list with the infinite list shifted by one element. It's easiest to just draw that out:
the first element is zero:
0
the second element is one:
0 1
the third element is the first element plus the first element of the rest (i.e. the second element). We visualize this again like a zipper, by putting the two lists on top of each other.
0 1
+
1
=
1
Now, the element that we just computed is not just the output of the zipWith function, it is at the same time also the input, as it gets appended to both lists (which are actually the same list, just shifted by one):
0 1 1
+ +
1 1
= =
1 2
and so forth:
0 1 1 2
+ + +
1 1 2
= = =
1 2 3
0 1 1 2 3 ^
+ + + + |
1 1 2 3 ^ |
= = = = | |
1 2 3 5---+---+
Or if you draw it a little bit differently and merge the result list and the second input list (which really are the same, anyway) into one:
0 1 1 2 3 ^
+ + + + + |
1 = 1 = 2 = 3 = 5---+
That's how I visualize it, anyway.

As for how this works:
Each term of the Fibonacci series is obviously the result of adding the previous two terms.
That's what map is doing here: map applies + to each element in each sequence until one of the sequences runs out (which they won't in this case, of course). So the result is a sequence of numbers that result from adding one term in the sequence to the next term in the sequence. Then you need lazy-cat to give it a starting point and make sure the function only returns what it's asked for.
The problem with this implementation is that fib-seq holds onto the head of the whole sequence for as long as fib-seq is defined, so it will eventually run you out of memory.
Stuart Halloway's book spends some time dissecting different implementations of this function; I think the most interesting one is below (it's Christophe Grande's):
(defn fibo []
  (map first (iterate (fn [[a b]] [b (+ a b)]) [0 1])))
Unlike the posted implementation, previously read elements of the sequence have nothing holding onto them, so this one can keep running without generating an OutOfMemoryError.
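For example, a rough sketch (the element count is arbitrary, and on current Clojure the iterated fn would need +' rather than + to avoid long overflow at this size):
;; Each call to (fibo) returns a fresh lazy sequence and no Var holds its
;; head, so earlier elements can be garbage-collected as the walk proceeds.
(last (take 100000 (fibo)))

;; Walking equally far into the top-level fib-seq keeps every realized
;; element reachable from the Var, which is the memory problem noted above:
;; (last (take 100000 fib-seq))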
How to get thinking in these terms is a harder question. So far for me it's a matter of getting acquainted with a lot of different ways of doing things and trying them out, while in general looking for ways to apply the existing function library in preference to using recursion and lazy-cat. But in some cases the recursive solution is really great, so it depends on the problem. I'm looking forward to getting the Joy of Clojure book, because I think it will help me a lot with this issue.

Related

How to implement a recursive function to simplify polynomial terms with a sorted tuple list?

I'm trying to implement a function to add like terms of a sorted list of tuples (first number represents polynomial's constant, the second represents the power). I'm an ocaml noob and don't really know what I'm doing wrong or how to do this correctly.
I tried to write it, but it doesn't work
https://gyazo.com/d37bb66d0e6813537c34225b6d4048d0
let rec simp list =
  match list with
  | (a,b)::(c,d)::remainder where b == d -> (a+c,b)::simp(remainder)
  | (a,b)::(c,d)::remainder where b != d -> (a,b)::(c,d)::simp(remainder)
  | _ -> list;;
This should combine all the terms with the same second value and just return one tuple with their first values added to the new list. ie: [(3,2);(4,2)] -> [(7,2)].
I am not familiar with the where keyword - there is ocaml-where which provides it, but it seems to be doing something different than what you are expecting. As such, the syntax is just wrong, and where is unexpected.
You probably meant when instead of where.

How recursion meets the base case in Haskell

I am trying to understand this piece of code, which returns all the possible combinations of the [a] passed to it:
-- Infinite list of all combinations for a given value domain
allCombinations :: [a] -> [[a]]
allCombinations [] = [[]]
allCombinations values = [] : concatMap (\w -> map (:w) values)
                                        (allCombinations values)
Here I tried this sample input:
ghci> take 7 (allCombinations [True,False])
[[],[True],[False],[True,True],[False,True],[True,False],[False,False]]
What I don't understand is how the recursion will eventually stop and return [[]], because the allCombinations function certainly doesn't have any pointer that moves through the list on each call and returns [[]] when it meets the base case []. As I see it, it will call allCombinations infinitely and never stop on its own. Or maybe I am missing something?
On the other hand, take only returns the first 7 elements from the final list after all the calculation is carried out, by going back up after completing the recursive calls. So how does the recursion actually meet the base case here?
Secondly, what is the purpose of concatMap here? We could also use map here, just to apply the function to the list, and arrange the list inside the function. What is concatMap actually doing here? From its definition, concatMap first maps the function and then concatenates the lists, whereas it seems we are already doing that inside the function.
Any valuable input would be appreciated.
Short answer: it will never meet the base case.
However, it does not need to. The base case is most often needed to stop a recursion, however here you want to return an infinite list, so no need to stop it.
On the other hand, this function would break if you try to take more than 1 element of allCombinations [] -- have a look at #robin's answer to understand better why. That is the only reason you see a base case here.
The way the main function works is that it starts with an empty list, and then prepends each element of the argument list. (:w) does that. However, this lambda alone would return an infinitely nested list, i.e. [], [[True],[False]], [[[True,True],[True,False]]], etc. concatMap removes the outer list at each step, and as it is called recursively this only returns one list of lists at the end. This can be a complicated concept to grasp, so look for other examples of the use of concatMap and try to understand how they work and why map alone wouldn't be enough.
This obviously only works because of Haskell's lazy evaluation. Similarly, you know that in a foldr you need to pass it the base case; however, when your function is supposed to take only infinite lists, you can use undefined as the base case to make it clearer that finite lists should not be used. For example, foldr f undefined could be used instead of foldr f [].
#Lorenzo has already explained the key point - that the recursion in fact never ends, and therefore this generates an infinite list, which you can still take any finite number of elements from because of Haskell's laziness. But I think it will be helpful to give a bit more detail about this particular function and how it works.
Firstly, the [] : at the start of the definition tells you that the first element will always be []. That of course is the one and only way to make a 0-element list from elements of values. The rest of the list is concatMap (\w -> map (:w) values) (allCombinations values).
concatMap f is as you observe simply the composition concat . (map f): it applies the given function to every element of the list, and concatenates the results together. Here the function (\w -> map (:w) values) takes a list, and produces the list of lists given by prepending each element of values to that list. For example, if values == [1,2], then:
(\w -> map (:w) values) [1,2] == [[1,1,2], [2,1,2]]
if we map that function over a list of lists, such as
[[], [1], [2]]
then we get (still with values as [1,2]):
[[[1], [2]], [[1,1], [2,1]], [[1,2], [2,2]]]
That is of course a list of lists of lists - but then the concat part of concatMap comes to our rescue, flattening the outermost layer, and resulting in a list of lists as follows:
[[1], [2], [1,1], [2,1], [1,2], [2,2]]
One thing that I hope you might have noticed about this is that the list of lists I started with was not arbitrary. [[], [1], [2]] is the list of all combinations of size 0 or 1 from the starting list [1,2]. This is in fact the first three elements of allCombinations [1,2].
Recall that all we know "for sure" when looking at the definition is that the first element of this list will be []. And the rest of the list is concatMap (\w -> map (:w) [1,2]) (allCombinations [1,2]). The next step is to expand the recursive part of this as [] : concatMap (\w -> map (:w) [1,2]) (allCombinations [1,2]). The outer concatMap
then can see that the head of the list it's mapping over is [] - producing a list starting [1], [2] and continuing with the results of appending 1 and then 2 to the other elements - whatever they are. But we've just seen that the next 2 elements are in fact [1] and [2]. We end up with
allCombinations [1,2] == [] : [1] : [2] : concatMap (\w -> map (:w) [1,2]) (tail (allCombinations [1,2]))
(tail isn't strictly called in the evaluation process, it's done by pattern-matching instead - I'm trying to explain more by words than explicit plodding through equalities).
And looking at that we know the tail is [1] : [2] : concatMap .... The key point is that, at each stage of the process, we know for sure what the first few elements of the list are - and they happen to be all 0-element lists with values taken from values, followed by all 1-element lists with these values, then all 2-element lists, and so on. Once you've got started, the process must continue, because the function passed to concatMap ensures that we just get the lists obtained from taking every list generated so far, and appending each element of values to the front of them.
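To relate this back to the Clojure question above, the same map-then-flatten step can be sketched with mapcat (Clojure's concatMap); this is illustrative only, not from the answer:
;; Prepend each of 1 and 2 to every known combination, then flatten one level.
(mapcat (fn [w] (map #(cons % w) [1 2]))
        [[] [1] [2]])
;=> ((1) (2) (1 1) (2 1) (1 2) (2 2))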
If you're still confused by this, look up how to compute the Fibonacci numbers in Haskell. The classic way to get an infinite list of all Fibonacci numbers is:
fib = 1 : 1 : zipWith (+) fib (tail fib)
This is a bit easier to understand than the allCombinations example, but relies on essentially the same thing - defining a list purely in terms of itself, but using lazy evaluation to progressively generate as much of the list as you want, according to a simple rule.
It is not a base case but a special case, and this is not recursion but corecursion,(*) which never stops.
Maybe the following re-formulation will be easier to follow:
allCombs :: [t] -> [[t]]
-- [1,2] -> [[]] ++ [1:[],2:[]] ++ [1:[1],2:[1],1:[2],2:[2]] ++ ...
allCombs vals = concat . iterate (cons vals) $ [[]]
  where
  cons :: [t] -> [[t]] -> [[t]]
  cons vals combs = concat [ [v : comb | v <- vals]
                           | comb <- combs ]

-- iterate             :: (a -> a) -> a -> [a]
-- cons vals           :: [[t]] -> [[t]]
-- iterate (cons vals) :: [[t]] -> [[[t]]]
-- concat              :: [[a]] -> [a]
-- concat . iterate (cons vals) :: [[t]]
Looks different, does the same thing. Not just produces the same results, but actually is doing the same thing to produce them.(*) The concat is the same concat, you just need to tilt your head a little to see it.
This also shows why the concat is needed here. Each step = cons vals is producing a new batch of combinations, with length increasing by 1 on each step application, and concat glues them all together into one list of results.
The length of each batch is the previous batch's length multiplied by n, where n is the length of vals. This also shows the need to special-case the vals == [] case, i.e. the n == 0 case: 0*x == 0, so the length of each new batch is 0, and an attempt to get one more value from the results would never produce a result, i.e. it would enter an infinite loop. The function is said to become non-productive at that point.
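For readers coming from the Clojure question above, a rough Clojure rendering of this iterate-based formulation might look like the sketch below (illustrative, not from the answer):
;; Each iterate step turns the previous batch of combinations into the next,
;; larger batch; apply concat splices the batches into one lazy sequence.
;; As with the Haskell version, this is non-productive for empty vals.
(defn all-combs [vals]
  (apply concat
         (iterate (fn [combs]
                    (for [comb combs, v vals]
                      (cons v comb)))
                  [[]])))

(take 7 (all-combs [true false]))
;=> ([] (true) (false) (true true) (false true) (true false) (false false))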
Incidentally, cons is almost the same as
     == concat [ [v : comb | comb <- combs]
                           | v <- vals ]
     == liftA2 (:) vals combs

liftA2 :: Applicative f => (a -> b -> r) -> f a -> f b -> f r
So if the internal order of each step's results is unimportant to you (but see an important caveat at the bottom of this post), this can just be coded as
allCombsA :: [t] -> [[t]]
-- [1,2] -> [[]] ++ [1:[],2:[]] ++ [1:[1],1:[2],2:[1],2:[2]] ++ ...
allCombsA [] = [[]]
allCombsA vals = concat . iterate (liftA2 (:) vals) $ [[]]
(*) well actually, this refers to a bit modified version of it,
allCombsRes vals = res
  where res = [] : concatMap (\w -> map (: w) vals)
                             res
-- or:
allCombsRes vals = fix $ ([] :) . concatMap (\w -> map (: w) vals)
-- where
--   fix g = x where x = g x    -- in Data.Function
Or in pseudocode:
Produce a sequence of values `res` by
FIRST producing `[]`, AND THEN
from each produced value `w` in `res`,
produce a batch of new values `[v : w | v <- vals]`
and splice them into the output sequence
(by using `concat`)
So the res list is produced corecursively, starting from its starting point, [], producing the next elements of it based on the previous one(s) -- either in batches, as in the iterate-based version, or one-by-one as here, taking the input via a back pointer into the results previously produced (taking its output as its input, as the saying goes -- which is a bit deceptive of course, as we take it at a slower pace than we're producing it, or otherwise the process would stop being productive, as was already mentioned above).
But. Sometimes it can be advantageous to produce the input via recursive calls, creating at run time a sequence of functions, each passing its output up the chain, to its caller. Still, the dataflow is upwards, unlike regular recursion which first goes downward towards the base case.
The advantage just mentioned has to do with memory retention. The corecursive allCombsRes as if keeps a back-pointer into the sequence that it itself is producing, and so the sequence can not be garbage-collected on the fly.
But the chain of the stream-producers implicitly created by your original version at run time means each of them can be garbage-collected on the fly as n = length vals new elements are produced from each downstream element, so the overall process becomes equivalent to just k = ceiling $ logBase n i nested loops each with O(1) space state, to produce the ith element of the sequence.
This is much much better than the O(n) memory requirement of the corecursive/value-recursive allCombsRes which in effect keeps a back pointer into its output at the i/n position. And in practice a logarithmic space requirement is most likely to be seen as a more or less O(1) space requirement.
This advantage only happens with the order of generation as in your version, i.e. as in cons vals, not liftA2 (:) vals which has to go back to the start of its input sequence combs (for each new v in vals) which thus must be preserved, so we can safely say that the formulation in your question is rather ingenious.
And if we're after a pointfree re-formulation -- as pointfree can at times be illuminating -- it is
allCombsY values = _Y $ ([] :) . concatMap (\w -> map (: w) values)
  where
  _Y g = g (_Y g)   -- no-sharing fixpoint combinator
So the code is much easier understood in a fix-using formulation, and then we just switch fix with the semantically equivalent _Y, for efficiency, getting the (equivalent of the) original code from the question.
The above claims about space requirements behavior are easily tested. I haven't done so, yet.
See also:
Why does GHC make fix so confounding?
Sharing vs. non-sharing fixed-point combinator

Is this the definition for a list using cons?

In a paper I see a definition for a list (T is any type you want):
listof T ::= Nil | Cons T (listof T)
I think this says:
List of type T is defined as either Nil or the result of the function cons applied to a list of type T, where cons links the list with another list (the remainder - which could be nil).
Is this an accurate description?
Yes. This is how Lisp lists were constructed.
This is the linked list. Since a Nil or Cons is an object in memory, we thus have an object for every element in the list. That object has - given it is a Cons - two references: one to the element the list holds at that position, and one to the next node in the linked list.
So if you store a list (1,4,2,5), then internally, it is stored as:
+---+---+   +---+---+   +---+---+   +---+---+
| o | o---->| o | o---->| o | o---->| o | o----> Nil
+-|-+---+   +-|-+---+   +-|-+---+   +-|-+---+
  v           v           v           v
  1           4           2           5
Or you can construct it like Cons 1 (Cons 4 (Cons 2 (Cons 5 Nil))).
The concept of a Lisp list is quite popular in both functional and logic programming languages.
Working with linked lists usually requires writing different algorithms than working with arrays and array lists. Obtaining the k-th element will require O(k) time, so usually one aims to prevent that. Therefore one usually iterates through the list and, for instance, emits certain elements (say, those that satisfy a given predicate).
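As a small aside in Clojure, the language of the main question (a sketch of mine, not from this answer), the same structure and its O(k) access cost look like this:
;; The list (1 4 2 5) built from cons cells, one cell per element.
(def xs (cons 1 (cons 4 (cons 2 (cons 5 nil)))))

(first xs)  ;=> 1
(rest xs)   ;=> (4 2 5)
(nth xs 3)  ;=> 5, reached by walking three links, hence O(k) access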

Why is the list changed in place in elisp?

I have a question about elisp. For example:
(setq trees '(maple oak pine birch))
-> (maple oak pine birch)
(setcdr (nthcdr 2 trees) nil)
-> nil
trees
-> (maple oak pine)
I thought (nthcdr 2 trees) returns a new list - (pine birch) and put the list into the setcdr expression, which should not change the value of trees. Could anyone explain it to me?
If you read the documentation string for nthcdr, you'll see that it just returns a pointer to the "nth" "cdr" - which is a pointer into the original list. So you're modifying the original list.
Doc string:
Take cdr N times on LIST, return the result.
Edit Wow, "pointer" seems to stir up confusion. Yes, Lisp has pointers.
Just look at the box diagrams used to explain list structure in lisp (here's Emacs's documentation on that very thing):
    --- ---      --- ---      --- ---
   |   |   |--> |   |   |--> |   |   |--> nil
    --- ---      --- ---      --- ---
     |            |            |
     |            |            |
      --> rose     --> violet   --> buttercup
Look at all those arrows, they almost look like they're .... pointing to things. When you take the cdr of a list, you get what the 2nd box refers to (aka "points" to), be it an atom, a string, or another cons cell. Heck, check it out on Wikipedia's entry for CAR and CDR.
If it feels better to call it a reference, use that terminology.
cdr certainly does NOT return a copy of what it refers to, which is what was confusing RNAer.
R: "Look, in the sky! The Lambda Signal! A citizen is in trouble Lambda Man!"
LM: "I see it! And I've got just the box art they need."
In Lisps, lists are singly-linked data structures, comprised of elements called cons cells. Each of these cells is a structure that consists of
a pointer to a value
a pointer to the next cell
These are called the car and cdr respectively for historical reasons. Here's the traditional box art representing a 3-element list:
Structure:  (car . cdr -)--->(car . cdr -)--->(car . cdr)
              |                |                |     |
              v                v                v     v
Values:       1                2                3    nil
The car and cdr functions allow you to work with lists from this low level of abstraction, and return the values of the respective cells. Thus, car returns the 'value' of the cell, and cdr dereferences to the remainder of the list. nthcdr is a generalisation on top of cdr for convenience.
The value returned by cdr is a reference to a raw data structure, which is mutable at this level. If you change the value of a cons-cell's cdr, you are changing the underlying structure of the list.
Given:
let A = '(1 2) ~= (1 . -)-->(2 . nil)
let B = '(3 4) ~= (3 . -)-->(4 . nil)
Setting the cdr of (cdr A) to B will destructively concatenate A and B such that A is now the structure below:
A                   B
(1 . -)-->(2 . -)-->(3 . -)-->(4 . nil)
As we've shown, a nil value in a cell's cdr represents the end of the list - there's nothing more that can be traversed. If we set the cdr of A to nil, we lobotomise the list, such that A is now
A
(1 . nil) <- [Not pointing to anything - the rest of the list shall go wanting]
This is pretty much what you've done - you've mutated the underlying data structures using low-level functions. :) By setting one of the cell's cdrs to nil, you've trimmed the end off your list.
This is called 'mutation', and it is present everywhere apart from Haskell.
There are functions that mutate the data structures and functions that duplicate them.

Can clojure be made fully dynamic?

In clojure 1.1, all calls were dynamic, meaning that you could redefine a function at the REPL and it would be included in the running program automatically. This was also nice for things like dotrace.
In clojure 1.2, many calls seem to be statically linked, and if I want to replace a function, sometimes I have to find all the places where it's called and put #' in front of them.
Worse, I can't predict where I'll need to do this.
Is it possible to go back to the old default of dynamic linking? Maybe if you needed the extra iota of speed you could switch it back on for the production app, but for development I much prefer the 1.1 behaviour.
I'm hoping for some sort of compiler option like *warn-on-reflection*.
Edit:
I'm confused about what's going on. More specifically, here are two functions.
I prefer the behaviour of the second. How can I make the first one behave like the second, as I believe it used to do in 1.1?
user> (clojure-version)
"1.2.0"
user> (defn factorial[n] (if (< n 2) n (* n (factorial (dec n)))))
#'user/factorial
user> (require 'clojure.contrib.trace)
user> (clojure.contrib.trace/dotrace (factorial) (factorial 10))
TRACE t1670: (factorial 10)
TRACE t1670: => 3628800
user> (defn factorial[n] (if (< n 2) n (* n (#'factorial (dec n)))))
#'user/factorial
user> (clojure.contrib.trace/dotrace (factorial) (factorial 10))
TRACE t1681: (factorial 10)
TRACE t1682: | (factorial 9)
TRACE t1683: | | (factorial 8)
TRACE t1684: | | | (factorial 7)
TRACE t1685: | | | | (factorial 6)
TRACE t1686: | | | | | (factorial 5)
TRACE t1687: | | | | | | (factorial 4)
TRACE t1688: | | | | | | | (factorial 3)
TRACE t1689: | | | | | | | | (factorial 2)
TRACE t1690: | | | | | | | | | (factorial 1)
TRACE t1690: | | | | | | | | | => 1
TRACE t1689: | | | | | | | | => 2
TRACE t1688: | | | | | | | => 6
TRACE t1687: | | | | | | => 24
TRACE t1686: | | | | | => 120
TRACE t1685: | | | | => 720
TRACE t1684: | | | => 5040
TRACE t1683: | | => 40320
TRACE t1682: | => 362880
TRACE t1681: => 3628800
3628800
Edit (to the whole question, and a change of title):
Joost points out below that what's actually going on here is that the self call in factorial is being optimized away. I can't see why that would be done, since you can't do that many recursive self calls without blowing the stack, but it explains the observed behaviour. Perhaps it's something to do with anonymous self-calls.
The original reason for my question was that I was trying to write http://www.learningclojure.com/2011/03/hello-web-dynamic-compojure-web.html, and I got irritated with the number of places I had to type #' to get the behaviour I expected. That and the dotrace made me think that the general dynamic behaviour had gone and that on-the-fly redefining, which works in some places, must be done with some clever hack.
In retrospect that seems a strange conclusion to jump to, but now I'm just confused (which is better!). Are there any references for all this? I'd love to have a general theory of when this will work and when it won't.
Everything in Clojure is fully dynamic, but you have to take note of when you're working with a Var and when you're working with the Function which is the current value of that Var.
In your first example:
(defn factorial [n] (if (< n 2) n (* n (factorial (dec n)))))
The factorial symbol is resolved to the Var #'user/factorial, which is then evaluated by the compiler to get its current value, a compiled function. This evaluation happens only once, when the function is compiled. The factorial in this first example is the value of the Var #'user/factorial at the moment the function was defined.
In the second example:
(defn factorial [n] (if (< n 2) n (* n (#'factorial (dec n)))))
You have explicitly asked for the Var #'user/factorial. Invoking a Var has the same effect as dereferencing the Var and invoking its (function) value. This example could be written more explicitly as:
(defn factorial [n] (if (< n 2) n (* n ((deref (var factorial)) (dec n)))))
The clojure.contrib.trace/dotrace macro (which I wrote, ages ago) uses binding to temporarily rebind a Var to a different value. It does not change the definition of any functions. Instead, it creates a new function which calls the original function and prints the trace lines, then it binds that function to the Var.
In your first example, since the original function was compiled with the value of the factorial function, dotrace has no effect. In the second example, each invocation of the factorial function looks up the current value of the #'user/factorial Var, so each invocation sees the alternate binding created by dotrace.
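A small sketch of that difference (mine, not from the answer; the names are illustrative):
(defn greet [] "hello")

(def greet-value greet)    ; captures the function #'greet holds right now
(def greet-var   #'greet)  ; holds the Var itself; invoking it derefs each time

(defn greet [] "bonjour")  ; rebind the Var #'greet to a new function

(greet-value) ;=> "hello"   (still the old function value)
(greet-var)   ;=> "bonjour" (sees the Var's current value)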
So that people don't get confused about the issues, I'm explaining the "problem" related to web development.
This is a limitation of Ring not Clojure (and really it's a limitation of the Java Jetty library). You can always redefine functions normally. However the handler given to the Jetty server process cannot be redefined. Your functions are being updated, but the Jetty server cannot see these updates. Providing a var as the handler is the work around in this case.
But note the var is not the real handler. An AbstractHandler must be given to the Jetty server, so Ring uses proxy to create one which closes over your handler. This is why, in order for the handler to be updated dynamically, it needs to be a var and not a fn.
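A rough sketch of the workaround (assuming ring.adapter.jetty is on the classpath; not from the answer):
(require '[ring.adapter.jetty :as jetty])

(defn handler [request]
  {:status 200, :headers {"Content-Type" "text/plain"}, :body "hello"})

;; Passing the function value freezes the handler Jetty sees:
;; (jetty/run-jetty handler {:port 3000 :join? false})

;; Passing the Var makes every request go through #'handler, so re-evaluating
;; (defn handler ...) at the REPL takes effect on the next request:
;; (jetty/run-jetty #'handler {:port 3000 :join? false})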
I think you're mistaken. In clojure 1.2 you certainly can redefine functions and calling code will call the new definitions. In 1.3 it looks like this might change somewhat, but 1.3 is not fixed at all yet.
In Clojure 1.3 you will also be able to redefine functions at runtime (thus changing the root binding); this will still work the same as in 1.2 and 1.1. You will, however, need to mark variables that will be dynamically rebound with binding as dynamic (see the sketch after the list below). This breaking change offers:
significant speed improvements
allows bindings to work through pmap
totally worth it because 99% of vars are never rebound anyway
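A minimal sketch of the 1.3-style change (illustrative):
;; In 1.3+ only Vars marked ^:dynamic may be rebound with binding.
(def ^:dynamic *trace-level* 0)

(binding [*trace-level* 1]
  *trace-level*)   ;=> 1

*trace-level*      ;=> 0 (the root binding is untouched)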