I'm working through SICP and translating its problems into Clojure, to learn Clojure and read SICP at the same time. Currently, I am stuck on the Count Leaves procedure from Section 2.2.2.
The goal is to write a function that takes a list representation of a tree, e.g. '(1 2 '(3 4)) and counts the number of leaves, in this case 4.
So far, the closest I have come up with is
(defn count-leaves
[coll]
(cond
(nil? coll) 0
(not (seq? coll)) 1
:else (let [[left & right] coll] (+ (count-leaves left) (count-leaves right)))
))
However, this does not handle subtrees correctly. In particular, it evaluates
(count-leaves '('(1)))
to 2 instead of 1.
Note the Scheme implementation from the book is:
(define (count-leaves x)
(cond ((null? x) 0)
((not (pair? x)) 1)
(else (+ (count-leaves (car x))
(count-leaves (cdr x))))))
Comment
As @jkiski's comment suggests, your code works. So there is no problem.
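To see where the 2 comes from, it may help to look at what the reader does with a quote nested inside an already quoted form. The following is only a sketch; it reuses the question's count-leaves unchanged, and the name nested is made up for illustration:

```clojure
;; The question's definition, reproduced so the snippet is self-contained.
(defn count-leaves [coll]
  (cond
    (nil? coll) 0
    (not (seq? coll)) 1
    :else (let [[left & right] coll]
            (+ (count-leaves left) (count-leaves right)))))

;; Inside a quoted form, '(1) is read as the two-element list (quote (1)).
(def nested '('(1)))          ; `nested` is a hypothetical name
(first nested)                ; the list (quote (1))
(first (first nested))        ; the symbol quote, which counts as a leaf

(count-leaves nested)         ; => 2 (the symbol quote and the number 1)
(count-leaves '((1)))         ; => 1 (quote the tree once, at the outermost level)
(count-leaves '(1 2 (3 4)))   ; => 4
```

So the function is fine; the extra leaf is the symbol quote that the inner ' introduces.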
But I'd prefer to test whether the argument is a sequence first. Try working out how (count-leaves '()) evaluates to 0!
Switch the first two clauses of the cond and we get ...
(defn count-leaves [coll]
(cond
(not (seq? coll)) 1
(empty? coll) 0
:else (+ (count-leaves (first coll)) (count-leaves (rest coll)))))
... where I've used rest instead of the next implicit in the destructuring, so empty? instead of nil? to test it. This deals properly with nil values, which your code doesn't. But it is still properly recursive, so remains subject to stack overflow.
I prefer ...
(defn count-leaves [coll]
(if (seq? coll)
(apply + (map count-leaves coll))
1))
... which is still recursive, but cleaner.
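If stack depth is a real concern, one possible way around it is to keep the pending subtrees on an explicit stack and drive the traversal with loop/recur. This is only a sketch, not something from the book or the answers here; the name count-leaves-iter is made up, and like the versions above it treats anything that is not a seq (including nil) as a leaf:

```clojure
(defn count-leaves-iter
  "Counts leaves without recursion on the call stack: subtrees that still
  need visiting are kept on an explicit list used as a stack."
  [tree]
  (loop [stack (list tree)   ; nodes still to visit
         n     0]            ; leaves counted so far
    (if (empty? stack)
      n
      (let [x    (peek stack)
            more (pop stack)]
        (if (seq? x)
          (recur (into more x) n)    ; push the children of a branch
          (recur more (inc n)))))))  ; anything else is a leaf

(count-leaves-iter '(1 2 (3 4)))  ; => 4
(count-leaves-iter '())           ; => 0
```

The children are visited in a different order than in the recursive versions, which does not matter for counting.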
Edit
I've had to retract my good opinion of @glts's solution: postwalk is recursive, so offers no real advantage.
Translating examples from one language into another is a good exercise, but do keep in mind that a language also comes with its own idiom, and its own core library.
In Clojure, walking data structures is especially easy with clojure.walk.
After requiring clojure.walk you can run postwalk-demo to see how your data structure is traversed:
(require '[clojure.walk :refer [postwalk postwalk-demo]])
(postwalk-demo '(1 2 (3 4)))
Walked: 1
Walked: 2
Walked: 3
Walked: 4
Walked: (3 4)
Walked: (1 2 (3 4))
Then you can devise a function to count the leaf nodes and pass it to postwalk.
(postwalk (fn [e]
(if (seq? e) (apply + e) 1))
'(1 2 (3 4)))
During the postwalk traversal leaf nodes get replaced with 1, and seqs get replaced with the sum of their constituent leaf counts.
I realise that this is a tangential answer but perhaps you still find it useful!
Related
At the end of CLOJURE for the BRAVE and TRUE's Chapter 4 there's an exercise: make an append function that appends a new entry to a list.
What's the most efficient way to do so?
From what I understand of datatypes in general, if conj prepends elements to a list, that simply means that consistently appending to a list is either silly or the choice of using a list type was silly.
Anyway, the solution I've written is this
(defn append
[lst item]
(into '() (conj (into '() lst) item)))
Well, that's actually the same as
(defn append
[lst item]
(reverse (conj (reverse lst) item)))
I believe, so it is probably costly because I reverse the list twice?
Another solution I could think of is
(defn append
[lst item]
(apply list (conj (apply vector lst) item)))
But they all seem to traverse the sequence of values twice, so I don't see why any one should be better than another.
Is there the proper way to accomplish the task?
To avoid the double traversal, you could use the classic recursive approach. Something like this:
(defn append [lst item]
(loop [res [] lst lst]
(if (seq lst)
(recur (conj res (first lst))
(rest lst))
(seq (conj res item)))))
So it just rebuilds the list from scratch.
To improve its performance in Clojure, you can use a transient collection:
(defn append' [lst item]
(loop [res (transient []) lst lst]
(if (seq lst)
(recur (conj! res (first lst))
(rest lst))
(seq (persistent! (conj! res item))))))
But yes, as another answer proposes, you should carefully pick the proper data structure for every specific task, so in practice you would want to use a vector.
As for the concat variant, the append itself costs nothing up front (since concat is lazy), but it has a known pitfall: stacking a lot of nested lazy calls leads to a stack overflow once the result is realized:
user> (defn append-concat [lst item]
(concat lst [item]))
user> (reduce append-concat () (range 1000000))
;; Error printing return value (StackOverflowError) at clojure.lang.LazySeq/sval (LazySeq.java:42).
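For comparison, here is a sketch of the same million appends done on a vector instead. Since conj on a vector is eager, no chain of pending lazy calls builds up. The name big is made up for illustration:

```clojure
;; Each conj appends to the end of the vector immediately.
(def big (reduce conj [] (range 1000000)))

(count big)  ; => 1000000
(peek big)   ; => 999999 (the last element appended)
```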
There is a reason why conj adds a new item at the start of a list and not at its end: you have to traverse the whole list to add something at its end, which is not very efficient. That follows from the nature of a linked list. That's why in Lisp you cons new items onto a list and reverse it at the very end. This way, you traverse the list just once while building it.
In Clojure, if you want to add to a list at the end, it is more idiomatic not to use a list but instead the vector type. On a vector, conj adds right to the end.
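A quick way to see the difference at the REPL (a small sketch using only core functions):

```clojure
;; conj adds wherever it is cheap for the given collection type.
(conj '(1 2 3) 4)  ; => (4 1 2 3), lists grow at the front
(conj [1 2 3] 4)   ; => [1 2 3 4], vectors grow at the end
```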
But let's say, you want to traverse a list once and add to the end.
That is actually:
(defn my-append [lst item] (concat lst (list item)))
(my-append '(1 2 3) 4)
;; (1 2 3 4)
but as I said, if you want to add repeatedly at the end, don't use a list, but a vector and conj to the end.
;; more idiomatic Clojure in this case
(def v [1 2 3])
(conj v 4) ;; [1 2 3 4] ;; unbeatable in efficiency, because no traversal!
;; and convert to a list
;; e.g.
(def l (conj v 4))
(seq l) ;; (1 2 3 4)
In lisp, I can pass an argument to a function and have it altered within the function. (AKA destructive functions). However, in Clojure, I've read somewhere that it is not permissible to alter the given arguments within that same function. For example:
(defn add-two-lists [list1 list2]
(for [n (range (count list1))]
(+ (nth list1 n) (nth list2 n))))
This is a normal function and its output is the addition of the two identical lists. However, I want something like this:
(defn add-two-lists [list1 list2 added_list]
(set! added_list
(for [n (range (count list1))]
(+ (nth list1 n) (nth list2 n)))))
Perhaps my use of set! is wrong or misused, and I still get errors. Is there an elegant way to destructively modify arguments in Clojure?
Destructive modification is discouraged in Clojure - I would encourage you to find ways to write your code without resorting to destructive updates.
In the spirit of giving a Clojurey solution, I would write your add-two-lists function as follows:
(defn add-two-lists [list1 list2]
(map + list1 list2))
This has a few advantages:
It's purely functional
It's lazy, so you can even add lists of infinite length (try doing that with a destructively updated argument!)
Its performance is O(n), which is optimal - the versions in the question are actually O(n^2) since nth is itself an O(n) operation on lists.
It's nice and concise :-)
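As a small illustration of the laziness point, the map version can be applied to two infinite sequences as long as only a finite prefix is consumed. This is just a sketch; the input sequences are arbitrary examples:

```clojure
(defn add-two-lists [list1 list2]
  (map + list1 list2))

;; Two infinite inputs: 0, 1, 2, ... and 10, 11, 12, ...
(take 5 (add-two-lists (range) (iterate inc 10)))  ; => (10 12 14 16 18)

;; Finite inputs work as before.
(add-two-lists '(1 2 3) '(3 4 5))                  ; => (4 6 8)
```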
Clojure provides several mutable types that would work well in this situation. For instance, you could pass an atom to the function and have it set the value in that atom.
(defn add-two-lists [list1 list2 added_list]
(reset! added_list
(for [n (range (count list1))]
(+ (nth list1 n) (nth list2 n)))))
Then, after you call this, you get the value out of the atom with @/deref.
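A possible end-to-end usage, as a sketch: the names add-two-lists! and result are made up for illustration, and mapv is used here (instead of the for above) so that a fully realized vector rather than a lazy sequence is stored in the atom:

```clojure
(defn add-two-lists!
  "Stores the element-wise sum of list1 and list2 in result-atom."
  [list1 list2 result-atom]
  (reset! result-atom (mapv + list1 list2)))

(def result (atom nil))                    ; hypothetical output holder
(add-two-lists! '(1 2 3) '(3 4 5) result)

@result          ; => [4 6 8]
(deref result)   ; the same thing, spelled out
```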
edit: if efficiency is the goal then using a transient collection may help
The with-local-vars macro lets you create thread-locally bound vars that you can modify with var-set. You also have to access the var's value with var-get, which can be shortened to just @.
(defn add-two-lists [list1 list2 added-list]
(var-set added-list
(for [n (range (count list1))]
(+ (nth list1 n) (nth list2 n)))))
(with-local-vars [my-list nil]
  (add-two-lists '(1 2 3) '(3 4 5) my-list)
  @my-list)
EDIT:
On a stylistic note, you could use map to add the two lists without using the nth function to random-access each index in each list:
(defn add-two-lists [list1 list2 added-list]
(var-set added-list (map + list1 list2)))
From the clojure documentation on set!
Note - you cannot assign to function params or local bindings. Only Java fields, Vars, Refs and Agents are mutable in Clojure.
Typically in courses where functional languages are chosen, you are encouraged not to use for-loops and assignments. Instead you should favor recursion and composition of functions.
So if I wanted to add 2 to each element of a list, in an imperative language, I would just do a for loop, but in a functional language, I would use recursion
user=> (def add2
(fn [mylist]
(if
(empty? mylist)
nil
(cons (+ (first mylist) 2) (add2 (rest mylist))))))
user=> (add2 (list 1 2 3))
(3 4 5)
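The same result can also be obtained by composing existing functions rather than writing the recursion by hand. A minimal sketch, where the name add2-map is made up to avoid clashing with add2 above:

```clojure
(defn add2-map
  "Adds 2 to every element by mapping a small anonymous function."
  [mylist]
  (map #(+ % 2) mylist))

(add2-map (list 1 2 3))  ; => (3 4 5)
```

One difference to keep in mind: for an empty input this returns an empty sequence, whereas the hand-written add2 returns nil.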
(take 2 (for [x (range 10)
:let [_ (println x)]
:when (even? x)] x))
>> (* 0
* 1
* 2
* 3
* 4
* 5
* 6
* 7
* 8
* 9
0 2)
I assumed I was just being remarkably dense. But no, it turns out that Clojure actually evaluates the first 32 elements of any lazy sequence (if available). Ouch.
I had a for with a recursive call in the :let. I was very curious as to why computation seemed to be proceeding in a breadth first rather than depth first fashion. It seems that computation (although, to be fair, not memory) was exploding as I kept going down all the upper branches of the recursive tree. Clojure's 32-chunking was forcing breadth first evaluation, even though the logical intent of the code was depth first.
Anyway, is there any simple way to force 1-chunking rather than 32-chunking of lazy sequences?
Michael Fogus has written a blog entry on disabling this behavior by providing a custom ISeq implementation.
To steal shamelessly from the modified version by Colin Jones:
(defn seq1 [#^clojure.lang.ISeq s]
(reify clojure.lang.ISeq
(first [_] (.first s))
(more [_] (seq1 (.more s)))
(next [_] (let [sn (.next s)] (and sn (seq1 sn))))
(seq [_] (let [ss (.seq s)] (and ss (seq1 ss))))
(count [_] (.count s))
(cons [_ o] (.cons s o))
(empty [_] (.empty s))
(equiv [_ o] (.equiv s o))))
A simpler approach is given in The Joy of Clojure:
(defn seq1 [s]
(lazy-seq
(when-let [[x] (seq s)]
(cons x (seq1 (rest s))))))
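To check the effect, one can count how many elements actually get computed when only the first one is requested. The following is a sketch; realized-count is a hypothetical helper written for this illustration, and seq1 is repeated so the snippet stands alone:

```clojure
(defn seq1 [s]
  (lazy-seq
    (when-let [[x] (seq s)]
      (cons x (seq1 (rest s))))))

(defn realized-count
  "Hypothetical helper: maps a counting function over coll, takes only the
  first element, and reports how many elements were computed."
  [coll]
  (let [calls (atom 0)]
    (first (map (fn [x] (swap! calls inc) x) coll))
    @calls))

(realized-count (range 100))         ; a whole chunk, not just one element
(realized-count (seq1 (range 100)))  ; => 1
```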
To answer the question in your title, no, for is not lazy. However, it:
Takes a vector of one or more
binding-form/collection-expr pairs, each followed by zero or more
modifiers, and yields a lazy sequence of evaluations of expr.
(emphasis mine)
So what's going on?
basically Clojure always evaluates strictly. Lazy seqs
basically use the same tricks as python with their generators etc.
Strict evals in lazy clothes.
In other words, for eagerly returns a lazy sequence. Which won't be evaluated until you ask for it, and will be chunked.
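A small sketch of that distinction, using a counter to observe when the body actually runs (the names calls and squares are made up for illustration):

```clojure
(def calls (atom 0))

;; Evaluating the for form returns immediately with an unrealized lazy seq.
(def squares
  (for [x [1 2 3]]
    (do (swap! calls inc)
        (* x x))))

@calls            ; => 0, nothing has been computed yet
(doall squares)   ; forces the whole sequence
@calls            ; => 3
```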
I know I can do the following in Common Lisp:
CL-USER> (let ((my-list nil))
(dotimes (i 5)
(setf my-list (cons i my-list)))
my-list)
(4 3 2 1 0)
How do I do this in Clojure? In particular, how do I do this without having a setf in Clojure?
My personal translation of what you are doing in Common Lisp would Clojurewise be:
(into (list) (range 5))
which results in:
(4 3 2 1 0)
A little explanation:
The function into conjoins all elements to a collection, here a new list, created with (list), from some other collection, here the range 0 .. 4. The behavior of conj differs per data structure. For a list, conj behaves as cons: it puts an element at the head of a list and returns that as a new list. So what it does is this:
(cons 4 (cons 3 (cons 2 (cons 1 (cons 0 (list))))))
which is similar to what you are doing in Common Lisp. The difference in Clojure is that we are returning new lists all the time, instead of altering one list. Mutation is only used when really needed in Clojure.
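Put differently, into behaves like a reduce with conj, so the target collection decides where elements end up. A short sketch:

```clojure
;; Same result as (into (list) (range 5)): each conj prepends to the list.
(reduce conj (list) (range 5))  ; => (4 3 2 1 0)

;; With a vector as the target, conj appends, so the order is preserved.
(into [] (range 5))             ; => [0 1 2 3 4]
```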
Of course you can also get this list right away, but this is probably not what you wanted to know:
(range 4 -1 -1)
or
(reverse (range 5))
or... the shortest version I can come up with:
'(4 3 2 1 0)
;-).
Augh the way to do this in Clojure is to not do it: Clojure hates mutable state (it's available, but using it for every little thing is discouraged). Instead, notice the pattern: you're really computing (cons 4 (cons 3 (cons 2 (cons 1 (cons 0 nil))))). That looks an awful lot like a reduce (or a fold, if you prefer). So, (reduce (fn [acc x] (cons x acc)) nil (range 5)), which yields the answer you were looking for.
Clojure bans mutation of local variables for the sake of thread safety, but it is still possible to write loops even without mutation. In each run of the loop you want my-list to have a different value, but this can be achieved with recursion as well:
(let [step (fn [i my-list]
             (if (< i 5)
               (recur (inc i) (cons i my-list)) ; keep going while i < 5
               my-list))]                       ; done: return the accumulated list
  (step 0 nil))
Clojure also has a way to "just do the looping" without making a new function, namely loop. It looks like a let, but you can also jump back to the beginning of its body, update the bindings, and run the body again with recur.
(loop [i 0
       my-list nil]
  (if (< i 5)
    (recur (inc i) (cons i my-list)) ; keep going while i < 5
    my-list))                        ; done: return the accumulated list
"Updating" parameters with a recursive tail call can look very similar to mutating a variable but there is one important difference: when you type my-list in your Clojure code, its meaning will always be the value of my-list. If a nested function closes over my-list and the loop continues to the next iteration, the nested function will always see the value that my-list had when the nested function was created. A local variable can always be replaced with its value, and the variable you have after making a recursive call is in a sense a different variable.
(The Clojure compiler performs an optimization so that no extra space is needed for this "new variable": When a variable needs to be remembered its value is copied and when recur is called the old variable is reused.)
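A sketch of the closure behavior described above: each function created inside the loop keeps the value i had in its own iteration. The name fns is made up for illustration:

```clojure
;; Collect one zero-argument function per iteration, each closing over i.
(def fns
  (loop [i 0
         acc []]
    (if (< i 3)
      (recur (inc i) (conj acc (fn [] i)))
      acc)))

;; Call each collected function: every one still sees "its" i.
(map #(%) fns)  ; => (0 1 2)
```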
For this I would use range with the manually set step:
(range 4 (dec 0) -1) ; => (4 3 2 1 0)
dec decreases the end value by 1 (range excludes its end), so that 0 is included in the output.
user=> (range 5)
(0 1 2 3 4)
user=> (take 5 (iterate inc 0))
(0 1 2 3 4)
user=> (for [x [-1 0 1 2 3]]
(inc x)) ; just to make it clear what's going on
(0 1 2 3 4)
setf is state mutation. Clojure has very specific opinions about that, and provides the tools for it if you need it. You don't in the above case.
(let [my-list (atom ())]
  (dotimes [i 5]
    (reset! my-list (cons i @my-list)))
  @my-list)
(def ^:dynamic my-list nil);need ^:dynamic in clojure 1.3
(binding [my-list ()]
(dotimes [i 5]
(set! my-list (cons i my-list)))
my-list)
This is the pattern I was looking for:
(loop [result [] x 5]
(if (zero? x)
result
(recur (conj result x) (dec x))))
I found the answer in Programming Clojure (Second Edition) by Stuart Halloway and Aaron Bedra.
Say I have a list (a b c d e), I'm trying to figure out a "lazy" and Clojure-idiomatic way of producing a list or seq of each item with each other item, such as ((a b) (a c) (a d) (a e) (b c) (b d) (b e) (c d) (c e) (d e)).
Clojure's for doesn't seem to allow this, it just produces one item as it goes through a list and doesn't allow access to a sub-list. The closest I've come so far is to turn the original list into a vector, and have a for statement that iterates over the count of the vector and grab indexed items,
(for [i (range vector-count) j (range i vector-count)]
...
but I hope that there's a better way.
You want combinations. There's a function to give you a lazy sequence of combinations right here in clojure-contrib.
user> (combinations [:a :b :c :d :e] 2)
((:a :b) (:a :c) (:a :d) (:a :e) (:b :c) (:b :d) (:b :e) (:c :d) (:c :e) (:d :e))
(Unfortunately, the monolithic clojure-contrib repo containing that file is deprecated in favor of splitting contrib up into smaller separate repos, and clojure.contrib.combinatorics doesn't seem to have made the transition yet, so there's no easy way currently to install that library, but you can snag the code from github if nothing else.)
FWIW, I tried writing this without looking at the code in contrib. I think my code is much easier to understand, and in my simple-minded benchmark it's more than twice as fast. It's available at https://gist.github.com/1042047, and reproduced below for convenience:
(defn combinations [n coll]
(if (= 1 n)
(map list coll)
(lazy-seq
(when-let [[head & tail] (seq coll)]
(concat (for [x (combinations (dec n) tail)]
(cons head x))
(combinations n tail))))))
user> (require '[clojure.contrib.combinatorics :as combine])
nil
user> (time (last (user/combinations 4 (range 100))))
"Elapsed time: 4379.959957 msecs"
(96 97 98 99)
user> (time (last (combine/combinations (range 100) 4)))
"Elapsed time: 10913.170605 msecs"
(96 97 98 99)
I strongly prefer the [n coll] argument order, rather than [coll n] - clojure likes the "important" argument to come last, especially for functions dealing with seqs: mostly this is for ease of combination with (->>) in scenarios like (->> (my-list) (filter even?) (take 10) (combinations 8)).
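A shorter pipeline in the same spirit, as a sketch. The combinations function is the one from this answer, repeated so the snippet is self-contained; the input range is an arbitrary example:

```clojure
(defn combinations [n coll]
  (if (= 1 n)
    (map list coll)
    (lazy-seq
      (when-let [[head & tail] (seq coll)]
        (concat (for [x (combinations (dec n) tail)]
                  (cons head x))
                (combinations n tail))))))

;; With the collection as the last argument, the call slots into ->>.
(->> (range 6)
     (filter even?)
     (combinations 2))  ; => ((0 2) (0 4) (2 4))
```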
Why use range and index grabbing in the for loop?
(let [myseq (list :a :b :c :d)]
(for [a myseq b myseq] (list a b)))
works.
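Note that this form yields every ordered pair, including (:a :a) and both (:a :b) and (:b :a). If only the pairs of each item with the items after it are wanted, one possible variation is to iterate over the successive tails of the sequence. This is a sketch; the name unique-pairs is made up for illustration:

```clojure
(defn unique-pairs
  "Lazily pairs each item with every item that comes after it."
  [coll]
  (for [[x & more] (take-while seq (iterate rest coll)) ; successive tails
        y more]
    (list x y)))

(unique-pairs '(:a :b :c :d))
;; => ((:a :b) (:a :c) (:a :d) (:b :c) (:b :d) (:c :d))
```

No indexing is involved, so it works on lists as well as vectors.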