Why does the second evaluation with butlast loop endlessly in Clojure?
user=> (->> (range) butlast lazy-seq (take 0))
()
user=> (->> (range) butlast lazy-seq first) ; ...
The equivalent in Haskell can be reasonably lazy-eval'ed.
ghci> take 0 . init $ [0..]
[]
ghci> head . init $ [0..]
0
Edit
take 0 is no-op as being pointed out below.
Why does the second evaluation loop endlessly in Clojure?
butlast is non-lazy as it enumerates the entire input sequence and builds up a new collection until it exhausts the input.
In both versions butlast is eliminating upstream laziness, so I think another relevant question is "why does the first version ever return when butlast is involved?": take 0 is a no-op — it doesn't consume the input sequence when n isn't a positive number, so it doesn't matter what you pass it.
Also, I think the equivalent Clojure code to head . tail $ [0..] would be:
(->> (range) rest first) ;; => 1
butlast is kinda the opposite of what you'd want.
If you look at the implementation of butlast, you can see that it's eager (using loop instead of lazy-seq). An alternate lazy implementation would do what you want:
(defn butlast'
"Like clojure.core/butlast but lazy."
[xs]
(when (seq xs)
((fn f [[x & xs]]
(if (seq xs)
(lazy-seq (cons x (f xs)))
()))
xs)))
Related
I need to write a Clojure function which takes an unevaluated arbitrarily deep nesting of lists as input, and then determines if any item in the list (not in function position) is non-numeric. This is my first time writing anything in Clojure so I am a bit confused. Here is my first attempt at making the function:
(defn list-eval
[x]
(for [lst x]
(for [item lst]
(if(integer? item)
(println "")
(println "This list contains a non-numeric value")))))
I tried to use a nested for-loop to iterate through each item in every nested list. Trying to test the function like so:
=> (list-eval (1(2 3("a" 5(3)))))
results in this exception:
ClassCastException java.lang.Long cannot be cast to clojure.lang.IFn listeval.core/eval7976 (form-init4504441070457356195.clj:1)
Does the problem here lie in the code, or in how I call the function and pass an argument? In either case, how can I make this work as intended?
This happens because (1 ..) is treated as calling a function, and 1 is a Long, and not a function. First you should change the nested list to '(1(2 3("a" 5(3)))). Next you can change your function to run recursively:
(defn list-eval
[x]
(if (list? x)
(for [lst x] (list-eval lst))
(if (integer? x)
(println "")
(println "This list contains a non-numeric value"))))
=> (list-eval '(1(2 3("a" 5(3)))))
There is a cool function called tree-seq that does all the hard work for you in traversing the structure. Use it then remove any collections, remove all numbers, and check if there is anything left.
(defn any-non-numbers?
[x]
(->> x
(tree-seq coll? #(if (map? %) (vals %) %))
(remove (some-fn coll? number?))
not-empty
boolean))
Examples:
user=> (any-non-numbers? 1)
false
user=> (any-non-numbers? [1 2])
false
user=> (any-non-numbers? [1 2 "sd"])
true
user=> (any-non-numbers? [1 2 "sd" {:x 1}])
true
user=> (any-non-numbers? [1 2 {:x 1}])
false
user=> (any-non-numbers? [1 2 {:x 1 :y "hello"}])
true
If you want to consider map keys as well, just change (vals %) to (interleave (keys %) (vals %)).
quoting
As others have mentioned, you need to quote a list to keep it from being evaluated as
code. That's the cause of the exception you're seeing.
for and nesting
for will only descend to the nesting depth you tell it to. It is not a for loop,
as you might expect, but a sequence comprehension, like the the python list comprehension.
(for [x xs, y ys] y) will presume that xs is a list of lists and flatten it.
(for [x xs, y ys, z zs] z) Is the same but with an extra level of nesting.
To walk down to any depth, you'd usually use recursion.
(There are ways to do this iteratively, but they're more difficult to wrap your head around.)
side effects
You're doing side effects (printing) inside a lazy sequence. This will work at the repl,
but if you're not using the result anywhere, it won't run and cause great confusion.
It's something every new clojurian bumps into at some point.
(doseq is like for, but for side effects.)
The clojure way is to separate functions that work with values from functions that
"do stuff", like printing to the console of launching missiles, and to keep the
side effecting functions as simple as possible.
putting it all together
Let's make a clear problem statement: Is there a non number anywhere inside an
arbitrarily nested list? If there is, print a message saying that to the console.
In a lot of cases, when you'd use a for loop in other langs reduce is what you want in clojure.
(defn collect-nested-non-numbers
;; If called with one argument, call itself with empty accumulator
;; and that argument.
([form] (collect-nested-non-numbers [] form))
([acc x]
(if (coll? x)
;; If x is a collection, use reduce to call itself on every element.
(reduce collect-nested-non-numbers acc x)
;; Put x into the accumulator if it's a non-number
(if (number? x)
acc
(conj acc x)))))
;; A function that ends in a question mark is (by convention) one that
;; returns a boolean.
(defn only-numbers? [form]
(empty? (collect-nested-non-numbers form)))
;; Our function that does stuff becomes very simple.
;; Which is a good thing, cause it's difficult to test.
(defn warn-on-non-numbers [form]
(when-not (only-numbers? form)
(println "This list contains a non-numeric value")))
And that'll work. There already exists a bunch of things that'll help you walk a nested structure, though, so you don't need to do it manually.
There's the clojure.walk namespace that comes with clojure. It's for when you have
a nested thing and want to transform some parts of it. There's tree-seq which is explained
in another answer. Specter is a library which is
a very powerful mini language for expressing transformations of nested structures.
Then there's my utils library comfy which contains reduce versions of the
functions in clojure.walk, for when you've got a nested thing and want to "reduce" it to a single value.
The nice thing about that is that you can use reduced which is like the imperative break statement, but for reduce. If it finds a non-number it doesn't need to keep going through the whole thing.
(ns foo.core
(:require
[madstap.comfy :as comfy]))
(defn only-numbers? [form]
(comfy/prewalk-reduce
(fn [ret x]
(if (or (coll? x) (number? x))
ret
(reduced false)))
true
form))
Maybe by "any item in the list (not in function position)" you meant this?
(defn only-numbers-in-arg-position? [form]
(comfy/prewalk-reduce
(fn [ret x]
(if (and (list? x) (not (every? (some-fn number? list?) (rest x))))
(reduced false)
ret))
true
form))
I've got a function like this:
(defn magic
[a b c]
(flatten (conj [] a b c)))
So on these inputs I get the following:
(magic 1 2 3) => (1 2 3)
(magic 1 [2 3] 4) => (1 2 3 4)
My question is, is there a better way of doing this?
The problem can be summarised as:
I don't know whether I will get numbers or vectors as input, but I need to return a single flat list
This could be slightly simplified (and generalized) as:
(defn magic [& args]
(flatten (apply list args)))
Or, as pointed out in the comments, it can be simplified even further (since args above is already a seq):
(defn magic [& args]
(flatten args))
Other than that, I don't see much else that can be improved about this. Is there anything in particular that's bothering you about your implementation?
If you can get seqs of seqs then you need to be more careful. And will have to recursively go into the list. There is a clojure native function for this tree-seq see the examples here:
http://clojuredocs.org/clojure_core/clojure.core/tree-seq
You'd want something like this (untested):
(defn nonempty-seq [x]
"returns x as a seq if it's a non-empty seq otherwise nil/false"
(and (coll? x) (seq x)))
(tree-seq nonempty-seq seq expr)
I see some examples that show we can get a nice head/tail destructuring of a sequence in clojure as follows:
(if-let [[x & xs] (seq coll)]
However I assume this won't work as desired for lazy sequences because this puts the values into a vector, which aren't lazy. I tried changing the vector form to a list form, and it gave me binding errors, quoted or not.
Without having binding like this, it seems that if I've got a lazy sequence where each element was a computationally-intensive equation of the previous element, I'd have to do that computation twice to get the head and tail as separate statements, right?
(let [head (first my-lazy-seq) ;; has to calculate the value of head.
tail (rest my-lazy-seq)] ;; also has to calculate the value of head to prepare the rest of the sequence.
Is there any way around this, or am I making an incorrect assumption somewhere?
user=> (let [[x & xs] (range)] [x (take 10 xs)])
[0 (1 2 3 4 5 6 7 8 9 10)]
xs is still a lazy seq, so you can use the destructuring with no problems. This will force the first element of xs, though. (Destructuring uses vector notation, but it doesn't necessarily use vectors under the covers.)
With respect to your second question: lazy seqs cache their results, so your second option would also work without extra recalculation. The head will only be calculated once.
The binding vector [x & xs] isn't actually constructing a vector at runtime. It's just the notation used for destructuring into head & tail.
So it works fine on infinite sequences:
(if-let [[x & xs] (range)]
(apply str x (take 9 xs)))
=> "0123456789"
The destructuring form is actually producing a lazy sequence in this case, which you can observe as follows:
(if-let [[x & xs :as my-seq] (range)]
(class my-seq))
=> clojure.lang.LazySeq
Reduce works fine but it is more like fold-left.
Is there any other form of reduce that lets me fold to right ?
The reason that the clojure standard library only has fold-left (reduce) is actually very subtle and it is because clojure isn't lazy enough to get the main benefit of fold-right.
The main benefit of fold-right in languages like haskell is that it can actually short-circuit.
If we do foldr (&&) True [False, True, True, True, True] the way that it actually gets evaluated is very enlightening. The only thing it needs to evaluate is the function and with 1 argument (the first False). Once it gets there it knows the answer and does not need to evaluate ANY of the Trues.
If you look very closely at the picture:
you will see that although conceptually fold-right starts and the end of the list and moves towards the front, in actuality, it starts evaluating at the FRONT of the list.
This is an example of where lazy/curried functions and tail recursion can give benefits that clojure can't.
Bonus Section (for those interested)
Based on a recommendation from vemv, I would like to mention that Clojure added a new function to the core namespace to get around this limitation that Clojure can't have the lazy right fold. There is a function called reduced in the core namespace which allows you to make Clojure's reduce lazier. It can be used to short-circuit reduce by telling it not to look at the rest of the list. For instance, if you wanted to multiply lists of numbers but had reason to suspect that the list would occasionally contain zero and wanted to handle that as a special case by not looking at the remainder of the list once you encountered a zero, you could write the following multiply-all function (note the use of reduced to indicate that the final answer is 0 regardless of what the rest of the list is).
(defn multiply-all [coll]
(reduce
(fn [accumulator next-value]
(if (zero? next-value)
(reduced 0)
(* accumulator next-value)))
1
coll))
And then to prove that it short-circuits you could multiply an infinite list of numbers which happens to contain a zero and see that it does indeed terminate with the answer of 0
(multiply-all
(cycle [1 2 3 4 0]))
Let's look at a possible definition of each:
(defn foldl [f val coll]
(if (empty? coll) val
(foldl f (f val (first coll)) (rest coll))))
(defn foldr [f val coll]
(if (empty? coll) val
(f (foldr f val (rest coll)) (first coll))))
Notice that only foldl is in tail position, and the recursive call can be replaced by recur. So with recur, foldl will not take up stack space, while foldr will. That's why reduce is like foldl. Now let's try them out:
(foldl + 0 [1 2 3]) ;6
(foldl - 0 [1 2 3]) ;-6
(foldl conj [] [1 2 3]) ;[1 2 3]
(foldl conj '() [1 2 3]) ;(3 2 1)
(foldr + 0 [1 2 3]) ;6
(foldr - 0 [1 2 3]) ;-6
(foldr conj [] [1 2 3]) ;[3 2 1]
(foldr conj '() [1 2 3]) ;(1 2 3)
Is there some reason you want to fold right? I think the most common usage of foldr is to put together a list from front to back. In Clojure we don't need that because we can just use a vector instead. Another choice to avoid stack overflow is to use a lazy sequence:
(defn make-list [coll]
(lazy-seq
(cons (first coll) (rest coll))))
So, if you want to fold right, some efficient alternatives are
Use a vector instead.
Use a lazy sequence.
Use reduced to short-circuit reduce.
If you really want to dive down a rabbit hole, use a transducer.
(take 2 (for [x (range 10)
:let [_ (println x)]
:when (even? x)] x))
>> (* 0
* 1
* 2
* 3
* 4
* 5
* 6
* 7
* 8
* 9
0 2)
I assumed I was just being remarkably dense. But no, it turns out that Clojure actually evaluates the first 32 elements of any lazy sequence (if available). Ouch.
I had a for with a recursive call in the :let. I was very curious as to why computation seemed to be proceeding in a breadth first rather than depth first fashion. It seems that computation (although, to be fair, not memory) was exploding as I kept going down all the upper branches of the recursive tree. Clojure's 32-chunking was forcing breadth first evaluation, even though the logical intent of the code was depth first.
Anyway, is there any simple way to force 1-chunking rather than 32-chunking of lazy sequences?
Michaes Fogus has written a blog entry on disabling this behavior by providing a custom ISeq implementation.
To steal shamelessly from the modified version by Colin Jones:
(defn seq1 [#^clojure.lang.ISeq s]
(reify clojure.lang.ISeq
(first [_] (.first s))
(more [_] (seq1 (.more s)))
(next [_] (let [sn (.next s)] (and sn (seq1 sn))))
(seq [_] (let [ss (.seq s)] (and ss (seq1 ss))))
(count [_] (.count s))
(cons [_ o] (.cons s o))
(empty [_] (.empty s))
(equiv [_ o] (.equiv s o))))
A simpler approach is given in The Joy of Clojure:
(defn seq1 [s]
(lazy-seq
(when-let [[x] (seq s)]
(cons x (seq1 (rest s))))))
To answer the question in your title, no, for is not lazy. However, it:
Takes a vector of one or more
binding-form/collection-expr pairs, each followed by zero or more
modifiers, and yields a lazy sequence of evaluations of expr.
(emphasis mine)
So what's going on?
basically Clojure always evaluates strictly. Lazy seqs
basically use the same tricks as python with their generators etc.
Strict evals in lazy clothes.
In other words, for eagerly returns a lazy sequence. Which won't be evaluated until you ask for it, and will be chunked.