I have a function that is supposed to take a lazy seq and return an unrealized lazy seq. Now I want to write a unit test (in test-is btw) to make sure that the result is an unrealized lazy sequence.
user=> (instance? clojure.lang.LazySeq (map + [1 2 3 4] [1 2 3 4]))
true
If you have a lot of things to test, maybe this would simplify it:
(defmacro is-lazy? [x] `(is (instance? clojure.lang.LazySeq ~x)))
user=> (is-lazy? 1)
FAIL in clojure.lang.PersistentList$EmptyList@1 (NO_SOURCE_FILE:7)
expected: (clojure.core/instance? clojure.lang.LazySeq 1)
actual: (not (clojure.core/instance? clojure.lang.LazySeq 1))
false
user=> (is-lazy? (map + [1 2 3 4] [1 2 3 4]))
true
As of Clojure 1.3 there is also the realized? function: "Returns true if a value has been produced for a promise, delay, future or lazy sequence."
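A minimal sketch of how realized? could be combined with the instance? check. The helper name unrealized-lazy-seq? is made up for illustration. realized? only accepts values that implement clojure.lang.IPending, so the instance? check comes first to avoid an exception on other types.

```clojure
;; Hypothetical helper, not part of clojure.core or clojure.test.
(defn unrealized-lazy-seq?
  "True when x is a LazySeq whose head has not been computed yet."
  [x]
  (and (instance? clojure.lang.LazySeq x)
       (not (realized? x))))

(def s (map inc [1 2 3]))

(unrealized-lazy-seq? s)        ; true, nothing has been forced yet
(first s)                       ; forces the head of the sequence
(unrealized-lazy-seq? s)        ; false from now on
(unrealized-lazy-seq? [1 2 3])  ; false, a vector is not a LazySeq
```

Inside a test this could be wrapped as (is (unrealized-lazy-seq? (my-fn input))), where my-fn stands for the function under test.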
Use a function with a side effect (say, writing to a ref) as the sequence generator function in your test case. If the side effect never happens, it means the sequence remains unrealized... as soon as the sequence is realized, the function will be called.
First, set it up like this:
(def effect-count (ref 0))
(defn test-fn [x]
  (dosync (alter effect-count inc))
  x)
Then, run your function. I'll just use map, here:
(def result (map test-fn (range 1 10)))
Test if test-fn ever ran:
(if (= 0 @effect-count)
  (println "Test passed!")
  (println "Test failed!"))
Since we know map is lazy, it should always work at this point. Now, force evaluation of the sequence:
(dorun result)
And check the value of effect-count again. This time, we DO expect the side effect to have triggered. And, it is so...
user=> @effect-count
9
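The same idea can be packaged as an actual unit test. This is only a sketch: it assumes clojure.test (the current name of test-is), uses an atom instead of a ref for brevity, and the helper side-effect-counts is a made-up name.

```clojure
(require '[clojure.test :refer [deftest is]])

;; Hypothetical helper: applies seq-fn to a counting function and coll,
;; and returns [calls-before-forcing calls-after-forcing].
(defn side-effect-counts [seq-fn coll]
  (let [calls   (atom 0)
        counted (fn [x] (swap! calls inc) x)
        result  (seq-fn counted coll)
        before  @calls]
    (dorun result)
    [before @calls]))

(deftest map-is-lazy
  ;; no calls before forcing, nine afterwards
  (is (= [0 9] (side-effect-counts map (range 1 10)))))

(deftest mapv-is-eager
  ;; mapv does all of its work up front
  (is (= [9 9] (side-effect-counts mapv (range 1 10)))))
```

Replacing map with the function under test gives a check that fails as soon as that function starts realizing its input.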
It appears that apply forces the realization of four elements given a lazy sequence.
(take 1
      (apply concat
             (repeatedly #(do
                            (println "called")
                            (range 1 10)))))
=> "called"
=> "called"
=> "called"
=> "called"
Is there a way to do an apply which does not behave this way?
Thank You
Is there a way to do an apply which does not behave this way?
I think the short answer is: not without reimplementing some of Clojure's basic functionality. apply's implementation relies directly on Clojure's implementation of callable functions, and tries to discover the proper arity of the given function to .invoke by enumerating the input sequence of arguments.
It may be easier to factor your solution using functions over lazy, un-chunked sequences / reducers / transducers, rather than using variadic functions with apply. For example, here's your sample reimplemented with transducers and it only invokes the body function once (per length of range):
(sequence
  (comp
    (mapcat identity)
    (take 1))
  (repeatedly #(do
                 (println "called")
                 (range 1 10))))
;; called
;; => (1)
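If transducers are not an option, another possibility is to avoid apply altogether with a small hand-written lazy concatenation. The sketch below is an assumption of how that could look; lazy-concat-all is a made-up name, not a core function. It uses first and rest (rather than destructuring with &, which calls next) so that only one inner collection is produced at a time.

```clojure
;; Hypothetical helper: concatenates a (possibly infinite) seq of collections
;; without looking ahead at collections that are not needed yet.
(defn lazy-concat-all [colls]
  (lazy-seq
    (when-let [s (seq colls)]
      (concat (first s)
              (lazy-concat-all (rest s))))))

;; Count how often the generator function runs.
(def generator-calls (atom 0))

(def first-item
  (doall
    (take 1
          (lazy-concat-all
            (repeatedly #(do
                           (swap! generator-calls inc)
                           (range 1 10)))))))

;; first-item holds (1), and @generator-calls is 1 at this point.
```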
Digging into what's happening in your example with apply, concat, seq, LazySeq, etc.:
1) repeatedly returns a new LazySeq instance: (lazy-seq (cons (f) (repeatedly f))).
2) For the given 2-arity (apply concat <args>), apply calls RT.seq on its argument list, which for a LazySeq invokes LazySeq.seq, which in turn invokes your function.
3) apply then calls a Java implementation method, applyToHelper, which uses RT.boundedLength to determine the length of the argument list so that it can find the proper overload of IFn.invoke to call. RT.boundedLength internally calls next, and in turn seq, on the argument sequence.
4) concat itself adds another layer of lazy-seq behavior.
You can see the stack traces of these invocations like this:
(take 1
      (repeatedly #(do
                     (clojure.stacktrace/print-stack-trace (Exception.))
                     (range 1 10))))
The first trace descends from the apply's initial call to seq, and the subsequent traces from RT.boundedLength.
In fact, your code doesn't realize any of the items from the concatenated collections (the ranges, in your case), so the resulting collection is truly lazy as far as elements are concerned. The prints you see come from the function calls that generate the unrealized lazy seqs. This is easy to check:
(defn range-logged [a b]
  (lazy-seq
    (when (< a b)
      (println "realizing item" a)
      (cons a (range-logged (inc a) b)))))
user> (take 1
            (apply concat
                   (repeatedly #(do
                                  (println "called")
                                  (range-logged 1 10)))))
;;=> called
;; called
;; called
;; called
;; realizing item 1
(1)
user> (take 10
            (apply concat
                   (repeatedly #(do
                                  (println "called")
                                  (range-logged 1 10)))))
;; called
;; called
;; called
;; called
;; realizing item 1
;; realizing item 2
;; realizing item 3
;; realizing item 4
;; realizing item 5
;; realizing item 6
;; realizing item 7
;; realizing item 8
;; realizing item 9
;; realizing item 1
(1 2 3 4 5 6 7 8 9 1)
So my guess is that you have nothing to worry about, as long as the collection returned from the repeatedly closure is lazy.
I want to know if this is the right way to loop through a collection:
(def citrus-list ["lemon" "orange" "grapefruit"])
(defn display-citrus [citruses]
  (loop [[citrus & citruses] citruses]
    (println citrus)
    (if citrus (recur citruses))))
(display-citrus citrus-list)
I have three questions:
1) The final print displays nil. Is that OK, or how can I avoid it?
2) I understand what & is doing in this example, but I don't see it in other cases. Maybe you could provide a few examples?
3) Is there any other way to get the same result?
Thanks,
R.
First of all your implementation is wrong. It would fail if your list contains nil:
user> (display-citrus [nil "asd" "fgh"])
;;=> nil
nil
It also prints an unneeded nil if the list is empty:
user> (display-citrus [])
;;=> nil
nil
You can fix it this way:
(defn display-citrus [citruses]
  (when (seq citruses)
    (loop [[citrus & citruses] citruses]
      (println citrus)
      (if (seq citruses) (recur citruses)))))
1) It is totally OK: for a non-empty collection the last call inside the function is println, which returns nil, and for an empty collection nothing is called, so nil is returned (a Clojure function always returns a value). To avoid the nil in your case, explicitly return some value, for example:
(defn display-citrus [citruses]
  (when (seq citruses)
    (loop [[citrus & citruses] citruses]
      (println citrus)
      (if (seq citruses) (recur citruses))))
  citruses)
user> (display-citrus citrus-list)
;;=> lemon
;;=> orange
;;=> grapefruit
["lemon" "orange" "grapefruit"]
2) Some articles about destructuring should help you.
3) Yes, there are several ways to do this. The simplest would be:
(run! println citrus-list)
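Regarding question 2, here are a few small illustrations of & in other places. The function names head-and-tail and sum-all are invented for this sketch. In a destructuring vector, & binds "everything after the named elements" as a seq (or nil if nothing is left); in a parameter vector it collects any extra arguments.

```clojure
;; & inside let destructuring: x is the first element, more is the remainder.
(defn head-and-tail [coll]
  (let [[x & more] coll]
    [x more]))

(head-and-tail [1 2 3]) ;=> [1 (2 3)]
(head-and-tail [])      ;=> [nil nil]

;; & in a parameter vector: a variadic function.
(defn sum-all [& nums]
  (apply + nums))

(sum-all 1 2 3) ;=> 6
(sum-all)       ;=> 0
```

The [nil nil] result for an empty collection is why the original loop printed a trailing nil.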
To answer your last question: you should generally avoid loop in Clojure. It is a form for experienced users who really know what they are doing. In your case, you can use a more user-friendly form such as doseq. For example:
(doseq [item collection]
  (println item))
You may also use map, but keep in mind that it returns a new sequence (of nils in your case), which is not always desirable, for example when you are interested only in the printing and not in the result.
In addition, map is lazy and won't be evaluated until its result is printed or forced with doall.
For most purposes, you can use either map, for or loop.
=> (map count citrus-list)
(5 6 10)
=> (for [c citrus-list] (count c))
(5 6 10)
=> (loop [[c & citrus] citrus-list
          counts []]
     (if-not c
       counts
       (recur citrus (conj counts (count c)))))
[5 6 10]
I tend to use map as much as possible. The syntax is more concise, and it clearly separates the control flow (sequential loop) from the transformation logic (count the values).
For instance, you can run the same operation (count) in parallel by simply replacing map with pmap:
=> (pmap count citrus-list)
(5 6 10)
In Clojure, most sequence operations are lazy: they do no work until your program actually needs the resulting values. To force the work to happen immediately, wrap the operation in doall:
=> (doall (map count citrus-list))
(5 6 10)
You can also use doseq if you don't care about return values. For instance, it fits println well, since println always returns nil:
=> (doseq [c citrus-list] (println c))
lemon
orange
grapefruit
I am constructing a list of hash maps which is then passed to another function. When I try to print each hash maps from the list using map it is not working. I am able to print the full list or get the first element etc.
(defn m [a]
  (println a)
  (map #(println %) a))
The following works from the repl only.
(m (map #(hash-map :a %) [1 2 3]))
But from the program that I load using load-file it is not working. I am seeing the a but not its individual elements. What's wrong?
In Clojure, transformation functions such as map return a lazy sequence. So (map #(println %) a) returns a lazy sequence; only when it is consumed is the function applied and the printing side effect visible. At the REPL the result gets printed, which consumes it; in a file loaded with load-file nothing consumes it.
If the purpose of the function is to have a side effect, like printing, you need to evaluate the transformation eagerly. The functions dorun and doall do this:
(def a [1 2 3])
(dorun (map #(println %) a))
; returns nil
(doall (map #(println %) a))
; returns the realized sequence, here (nil nil nil)
If you actually don't want to map, but only have a side effect, you can use doseq. It is intended to 'iterate' to do side effects:
(def a [1 2 3])
(doseq [i a]
  (println i))
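To see the difference without relying on what the REPL happens to print, the output can be captured with with-out-str. This is just a sketch; the three function names are invented for the example.

```clojure
;; The lazy seq is created but never consumed, so nothing is printed.
(defn output-of-lazy-map [coll]
  (with-out-str (map println coll)))

;; dorun walks the seq, so every println runs.
(defn output-of-forced-map [coll]
  (with-out-str (dorun (map println coll))))

;; doseq is eager by design.
(defn output-of-doseq [coll]
  (with-out-str (doseq [x coll] (println x))))

(output-of-lazy-map [1 2 3])   ;=> ""
(output-of-forced-map [1 2 3]) ;=> "1\n2\n3\n"
(output-of-doseq [1 2 3])      ;=> "1\n2\n3\n"
```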
If your goal is simply to call an existing function on every item in a collection in order, ignoring the returned values, then you should use run!:
(run! println [1 2 3])
;; 1
;; 2
;; 3
;;=> nil
In some more complicated cases it may be preferable to use doseq as @Gamlor suggests, but in this case, doseq only adds boilerplate.
I recommend using tail recursion (with recur, so the stack does not grow, and with a seq check, so nil elements do not stop the printing early):
(defn printList [a]
  (when (seq a)
    (println (first a))
    (recur (rest a))))
I'm trying to understand when Clojure's lazy sequences are lazy, when the work happens, and how I can influence those things.
user=> (def lz-seq (map #(do (println "fn call!") (identity %)) (range 4)))
#'user/lz-seq
user=> (let [[a b] lz-seq])
fn call!
fn call!
fn call!
fn call!
nil
I was hoping to see only two "fn call!"s here. Is there a way to manage that?
Anyway, moving on to something which indisputably only requires one evaluation:
user=> (def lz-seq (map #(do (println "fn call!") (identity %)) (range 4)))
#'user/lz-seq
user=> (first lz-seq)
fn call!
fn call!
fn call!
fn call!
0
Is first not suitable for lazy sequences?
user=> (def lz-seq (map #(do (println "fn call!") (identity %)) (range 4)))
#'user/lz-seq
user=> (take 1 lz-seq)
(fn call!
fn call!
fn call!
fn call!
0)
At this point, I'm completely at a loss as to how to access the beginning of my toy lz-seq without having to realize the entire thing. What's going on?
Clojure's sequences are lazy, but for efficiency are also chunked, realizing blocks of 32 results at a time.
=>(def lz-seq (map #(do (println (str "fn call " %)) (identity %)) (range 100)))
=>(first lz-seq)
fn call 0
fn call 1
...
fn call 31
0
The same thing happens once you cross the 32-element boundary:
=>(nth lz-seq 33)
fn call 0
fn call 1
...
fn call 63
33
For code where considerable work needs to be done per realisation, Fogus gives a way to work around chunking, and hints that an official way to control chunking might be underway.
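One commonly seen way to work around chunking is to rebuild the sequence one element at a time, which hides the chunks from map. The sketch below is written under that assumption; whether it matches Fogus's version exactly is not something the text above confirms, and the name unchunk is only a convention.

```clojure
;; Wraps a (possibly chunked) seq so that consumers see one element at a time.
(defn unchunk [s]
  (lazy-seq
    (when-let [s (seq s)]
      (cons (first s) (unchunk (rest s))))))

;; Count calls instead of printing, to make the effect easy to inspect.
(def chunked-calls (atom 0))
(def unchunked-calls (atom 0))

(first (map #(do (swap! chunked-calls inc) %) (range 100)))
(first (map #(do (swap! unchunked-calls inc) %) (unchunk (range 100))))

;; @chunked-calls is now larger than 1 (a whole chunk was realized),
;; while @unchunked-calls is exactly 1.
```

The trade-off is speed: chunking exists to reduce per-element overhead, so unchunking only pays off when each element is expensive to compute.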
I believe that the expression produces a chunked sequence. Try replacing 4 with 10000 in the range expression - you'll see something like 32 calls on first eval, which is the size of the chunk.
A lazy sequence is one where we evaluate the elements as and when needed (hence "lazy"). Once a result is evaluated, it is cached so that it can be re-used (and we don't have to do the work again). If you try to realize an item of the sequence that hasn't been evaluated yet, Clojure evaluates it and returns the value to you. However, it also does some extra work. It anticipates that you might want to evaluate the next element(s) in the sequence and does that for you too. This is done to avoid some performance overheads, the exact nature of which is beyond my skill level. Thus, when you say (first lz-seq), it actually calculates the first as well as the next few elements in the seq. Since your println statement is a side effect, you can see the evaluation happening. Now if you were to say (second lz-seq), you will not see the println again, since the result has already been evaluated and cached.
A better way to see that your sequence is lazy is:
user=> (def lz-seq (map #(do (println "fn call!") (identity %)) (range 400)))
#'user/lz-seq
user=> (first lz-seq)
This will print a few "fn call!" statements, but not all 400 of them. That's because the first call will actually end up evaluating more than one element of the sequence.
Hope this explanation is clear enough.
I think it's some sort of optimization made by the REPL.
My repl is caching 32 at a time.
user=> (def lz-seq (map #(do (println "fn call!") (identity %)) (range 100)))
#'user/lz-seq
user=> (first lz-seq)
prints 32 times
user=> (take 20 lz-seq)
does not print any "fn call!"
user=> (take 33 lz-seq)
prints 0 to 30, then prints 32 more "fn call!"s followed by 31,32
What's the difference between doseq and for in Clojure? What are some examples of when you would choose to use one over the other?
The difference is that for builds a lazy sequence and returns it while doseq is for executing side-effects and returns nil.
user=> (for [x [1 2 3]] (+ x 5))
(6 7 8)
user=> (doseq [x [1 2 3]] (+ x 5))
nil
user=> (doseq [x [1 2 3]] (println x))
1
2
3
nil
If you want to build a new sequence based on other sequences, use for. If you want to do side-effects (printing, writing to a database, launching a nuclear warhead, etc) based on elements from some sequences, use doseq.
Note also that doseq is eager while for is lazy. The example missing in Rayne's answer is
(for [x [1 2 3]] (println x))
At the REPL, this will generally do what you want, but that's basically a coincidence: the REPL forces the lazy sequence produced by for, causing the printlns to happen. In a non-interactive environment, nothing will ever be printed. You can see this in action by comparing the results of
user> (def lazy (for [x [1 2 3]] (println 'lazy x)))
#'user/lazy
user> (def eager (doseq [x [1 2 3]] (println 'eager x)))
eager 1
eager 2
eager 3
#'user/eager
Because the def form returns the new var created, and not the value which is bound to it, there's nothing for the REPL to print, and lazy will refer to an unrealized lazy-seq: none of its elements have been computed at all. eager will refer to nil, and all of its printing will have been done.
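The same comparison can be made observable without println by recording the side effects in an atom. This is a sketch only; seen, record!, lazy-result and eager-result are names made up for the example.

```clojure
;; Records each value it is called with, then returns it.
(def seen (atom []))
(defn record! [x]
  (swap! seen conj x)
  x)

;; for only builds a lazy seq: record! has not run for any element yet.
(def lazy-result (for [x [1 2 3]] (record! x)))

;; doseq runs the body immediately and evaluates to nil.
(def eager-result (doseq [x [10 20 30]] (record! x)))

;; At this point @seen is [10 20 30]: only the doseq body has executed.
;; (realized? lazy-result) is still false, and eager-result is nil.
```

Calling (doall lazy-result) afterwards would append 1, 2 and 3 to seen.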