Suppose I wrote:
(def stuff
(lazy-seq stuff))
When I ask for the value of stuff in REPL, I would expect it to be stuck in an infinite loop, since I'm defining stuff as itself(which pretty much says nothing about this sequence at all).
However, I got an empty sequence instead.
> stuff
()
Why?
Edit: By "recursive" I meant recursive data, not recursive functions.
I'm still confused about why the sequence terminated. As a comparison, the following code is stuck in infinite loop(and blows the stack).
(def stuff
(lazy-seq (cons (first stuff) [])))
Some background: This question arises from me trying to implement a prime number generator using the sieve of Eratosthenes. My first attempt was:
(def primes
(lazy-seq (cons 2
(remove (fn [x]
(let [ps (take-while #(< % x) primes)]
(some #(zero? (mod x %)) ps)))
(range 3 inf))))) ;; My customized range function that returns an infinite sequence
I figured that it would never work, since take-while would keep asking for more primes even if they could not be calculated yet. So it surprised me when it worked pretty well.
> (take 20 primes)
(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71)
First, each lazy seq can only be realized once. Second, your definition of stuff doesn't use recursion — stuff isn't a function. If you look at the definition of lazy-seq, you can see that your definition of stuff expands to
(def stuff (new clojure.lang.LazySeq (fn* [] stuff)))
When the fn arg to the clojure.lang.LazySeq constructor is invoked, it returns the same lazy seq that has already been realized. So, when you attempt to print the lazy seq to the REPL, iteration immediately terminates and returns nil.
You can verify that the type of stuff is clojure.lang.LazySeq
user=> (type stuff)
clojure.lang.LazySeq
and that after printing stuff to the REPL, stuff has been realized
user=> (realized? stuff)
false
user=> stuff
()
user=> (realized? stuff)
true
You can use recursion to get the effect that you expected
user=> (defn stuff
[]
(lazy-seq (stuff)))
#'user/stuff
user=> (stuff) ;; Hangs forever.
Related
I am working through the Armstrong Numbers exercise on Exercism's Clojure track. An armstrong number is a number equal to the sum of its digits raised to the power of the number of digits. 153 is an Armstrong number, because: 153 = 1^3 + 5^3 + 3^3 = 1 + 125 + 27 = 153. 154 is not an Armstrong number, because: 154 != 1^3 + 5^3 + 4^3 = 1 + 125 + 64 = 190.
The test file for this exercise will call the armstrong? function, pass in a number, and expects true if the number is an Armstrong number. I have already solved the problem with this code:
(ns armstrong-numbers)
(defn pow [a b]
(reduce * 1 (repeat b a)))
(defn armstrong? [num]
(->> num
(iterate #(quot % 10))
(take-while pos?)
(map #(mod % 10))
((fn [sq]
(map #(pow % (count sq))
sq)))
(apply +)
(= num)))
but now I am trying to refactor the code. This is what I would like the code to look like:
(ns armstrong-numbers
(:require [swiss.arrows :refer :all]))
(defn pow [a b]
(reduce * 1 (repeat b a)))
(defn armstrong? [num]
(-<>> num
(iterate #(quot % 10))
(take-while pos?)
(map #(mod % 10))
(map #(pow % (count <>))
<>)
(apply +)
(= num)))
A link to the package required above: https://github.com/rplevy/swiss-arrows.
In the first code section, I create an implicit function within the thread-last macro because the sequence returned from the map form is needed in two different places in the second map form. That implicit function works just fine, but I just wanted to make the code sleeker. But when I test the second code block, I get the following error: java.lang.RuntimeException: Unable to resolve symbol: <> in this context.
I get this error whether I use #(), partial , or fn inside the second map form. I have figured out that because all of the preceding are macros (or a special form in fns case), they cannot resolve <> because it's only meaningful to the -<>> macro, which is called at a later point in macroexpansion. But why do #(), partial, and fn attempt to resolve that character at all? As far as I can see, they have no reason to know what the symbol is, or what it's purpose is. All they need to do is return that symbol rearranged into the proper s-expressions. So why does clojure attempt to resolve this (<>) symbol?
The <> symbol is only valid in the topmost level of a clause (plus literals for set, map, vector directly therein). -<> and -<>> do not establish bindings (as in let) for <>, but do code insertion at macro expansion time.
This code insertion is done only at toplevel, because making it work deeper is not only much more complex (you need a so-called code walker), but also raises interesting questions regarding the nesting of arrow forms. This complexity is likely not worth it, for such a simple effect.
If you want a real binding, you can use as-> (from clojure.core).
The documentation for -<>> is quite clear that it doesn't behave the way you wish it did:
"the 'diamond spear': top-level insertion of x in place of single
positional '<>' symbol within the threaded form if present, otherwise
mostly behave as the thread-last macro. Also works with hash literals
and vectors."
It performs replacement:
Of a single symbol
At the top level of the threaded form
Your example wishing to use it for multiple symbols, nested within subforms, will thus not work.
You have made a mistake leaving off the <> symbol in most of your forms in the failing case. Here is a working version (using the similar it-> macro in place of the Swiss Arrows). I also cleaned up the pow a bit.
(defn pow-int [a b] (Math/round (Math/pow a b)))
(defn armstrong? [num]
(it-> num
(iterate #(quot % 10) it)
(take-while pos? it)
(map #(mod % 10) it)
(map #(pow-int % (count it)) it)
(apply + it)
(= it num)))
(armstrong? 153) => true
(armstrong? 154) => false
You can find the docs on it-> here.
If you leave off the (collection) arg to a function like map, it returns a transducer; from the docs:
(map f)(map f coll)(map f c1 c2)(map f c1 c2 c3)(map f c1 c2 c3 & colls)
Returns a lazy sequence consisting of the result of applying f to
the set of first items of each coll, followed by applying f to the
set of second items in each coll, until any one of the colls is
exhausted. Any remaining items in other colls are ignored. Function
f should accept number-of-colls arguments. Returns a transducer when
no collection is provided.
And, always refer to the Clojure CheatSheet!
It appears that apply forces the realization of four elements given a lazy sequence.
(take 1
(apply concat
(repeatedly #(do
(println "called")
(range 1 10)))))
=> "called"
=> "called"
=> "called"
=> "called"
Is there a way to do an apply which does not behave this way?
Thank You
Is there a way to do an apply which does not behave this way?
I think the short answer is: not without reimplementing some of Clojure's basic functionality. apply's implementation relies directly on Clojure's implementation of callable functions, and tries to discover the proper arity of the given function to .invoke by enumerating the input sequence of arguments.
It may be easier to factor your solution using functions over lazy, un-chunked sequences / reducers / transducers, rather than using variadic functions with apply. For example, here's your sample reimplemented with transducers and it only invokes the body function once (per length of range):
(sequence
(comp
(mapcat identity)
(take 1))
(repeatedly #(do
(println "called")
(range 1 10))))
;; called
;; => (1)
Digging into what's happening in your example with apply, concat, seq, LazySeq, etc.:
repeatedly returns a new LazySeq instance: (lazy-seq (cons (f) (repeatedly f))).
For the given 2-arity (apply concat <args>), apply calls RT.seq on its argument list, which for a LazySeq then invokes LazySeq.seq, which will invoke your function
apply then calls a Java impl. method applyToHelper which tries to get the length of the argument sequence. applyToHelper tries to determine the length of the argument list using RT.boundedLength, which internally calls next and in turn seq, so it can find the proper overload of IFn.invoke to call
concat itself adds another layer of lazy-seq behavior.
You can see the stack traces of these invocations like this:
(take 1
(repeatedly #(do
(clojure.stacktrace/print-stack-trace (Exception.))
(range 1 10))))
The first trace descends from the apply's initial call to seq, and the subsequent traces from RT.boundedLength.
in fact, your code doesn't realize any of the items from the concatenated collections (ranges in your case). So the resulting collection is truly lazy as far as elements are concerned. The prints you get are from the function calls, generating unrealized lazy seqs. This one could easily be checked this way:
(defn range-logged [a b]
(lazy-seq
(when (< a b)
(println "realizing item" a)
(cons a (range-logged (inc a) b)))))
user> (take 1
(apply concat
(repeatedly #(do
(println "called")
(range-logged 1 10)))))
;;=> called
;; called
;; called
;; called
;; realizing item 1
(1)
user> (take 10
(apply concat
(repeatedly #(do
(println "called")
(range-logged 1 10)))))
;; called
;; called
;; called
;; called
;; realizing item 1
;; realizing item 2
;; realizing item 3
;; realizing item 4
;; realizing item 5
;; realizing item 6
;; realizing item 7
;; realizing item 8
;; realizing item 9
;; realizing item 1
(1 2 3 4 5 6 7 8 9 1)
So my guess is that you have nothing to worry about, as long as the collection returned from repeatedly closure is lazy
I am new to Clojure and I am currently stuck with below code which throws a NullPointerException when I run it like this:
(mapset inc [1 1 2 2])
(defn mapset
[fn v]
(loop [[head & tail] v result-set (hash-set)]
(if (nil? head)
result-set)
(conj result-set (fn head))
(recur tail result-set)))
When I print the result-set in the if block it prints an empty set whereas I expect a set like this: #{2 3}
After trying to interpret the stacktrace I guess that the NullPointerException has something to do with the following line:
(conj result-set (fn head)).
The stacktrace and the fact that the resulting set is empty leads me to believe that the inc operation somehow gets called with nil as input.
I am glad about any explanation of why this error happens
The mentioned (shortend) stacktrace looks like this:
java.lang.NullPointerException
Numbers.java: 1013 clojure.lang.Numbers/ops
Numbers.java: 112 clojure.lang.Numbers/inc
core.clj: 908 clojure.core/inc
core.clj: 903 clojure.core/inc
REPL: 6 user/mapset
REPL: 1 user/mapset
I made some small changes:
(defn mapset [f v]
(loop [[head & tail] v
result-set (hash-set)]
(if (nil? head)
result-set
(let [res (f head)]
(recur tail (conj result-set res))))))
(defn x-1 []
(mapset inc [1 1 2 2]))
(x-1)
;;=> #{3 2}
So now mapset will call the function f on each of your inputs from v, then put the result of that call into the hash-set that was created at the outset.
The problem was in the control flow logic of your if statement. The flow of execution was continuing on after the if. Thus the function fn (which I renamed to f, rather than leave it as the name of a macro) was being called even when head was nil, which was not what you intended.
As originally coded the if (which would more clearly have been a when) was doing nothing useful. But once I realised you meant to have an if, but closed the paren off too early, then the answer slotted into place. Hence minor changes fixed the problem - your underlying logic was sound - and the amended function just worked.
Similar questions: One, Two, Three.
I am thoroughly flummoxed here. I'm using the loop-recur form, I'm using doall, and still I get a stack overflow for large loops. My Clojure version is 1.5.1.
Context: I'm training a neural net to mimic XOR. The function xor is the feed-forward function, taking weights and input and returning the result; the function b-xor is the back-propagation function that returns updated weights given the results of the last call to xor.
The following loop runs just fine, runs very fast, and returns a result, and based off of the results it returns, it is training the weights perfectly:
(loop [res 1 ; <- initial value doesn't matter
weights xorw ; <- initial pseudo-random weights
k 0] ; <- count
(if (= k 1000000)
res
(let [n (rand-int 4)
r (doall (xor weights (first (nth xorset n))))]
(recur (doall r)
(doall (b-xor weights r (second (nth xorset n))))
(inc k)))))
But of course, that only gives me the result of the very last run. Obviously I want to know what weights have been trained to get that result! The following loop, with nothing but the return value changed, overflows:
(loop [res 1
weights xorw
k 0]
(if (= k 1000000)
weights ; <- new return value
(let [n (rand-int 4)
r (doall (xor weights (first (nth xorset n))))]
(recur (doall r)
(doall (b-xor weights r (second (nth xorset n))))
(inc k)))))
This doesn't make sense to me. The entirety of weights gets used in each call to xor. So why could I use weights internally but not print it to the REPL?
And as you can see, I've stuck doall in all manner of places, more than I think I should need. XOR is a toy example, so weights and xorset are both very small. I believe the overflow occurs not from the execution of xor and b-xor, but when the REPL tries to print weights, for these two reasons:
(1) this loop can go up to 1500 without overflowing the stack.
(2) the time the loop runs is consistent with the length of the loop; that is, if I loop to 5000, it runs for half a second and then prints a stack overflow; if I loop to 1000000, it runs for ten seconds and then prints a stack overflow -- again, only if I print weights and not res at the end.
(3) EDIT: Also, if I just wrap the loop in (def w ... ), then there is no stack overflow. Attempting to peek at the resulting variable does, though.
user=> (clojure.stacktrace/e)
java.lang.StackOverflowError: null
at clojure.core$seq.invoke (core.clj:133)
clojure.core$map$fn__4211.invoke (core.clj:2490)
clojure.lang.LazySeq.sval (LazySeq.java:42)
clojure.lang.LazySeq.seq (LazySeq.java:60)
clojure.lang.RT.seq (RT.java:484)
clojure.core$seq.invoke (core.clj:133)
clojure.core$map$fn__4211.invoke (core.clj:2490)
clojure.lang.LazySeq.sval (LazySeq.java:42)
nil
Where is the lazy sequence?
If you have suggestions for better ways to do this (this is just my on-the-fly REPL code), that'd be great, but I'm really looking for an explanation as to what is happening in this case.
EDIT 2: Definitely (?) a problem with the REPL.
This is bizarre. weights is a list containing six lists, four of which are empty. So far, so good. But trying to print one of these empty lists to the screen results in a stack overflow, but only the first time. The second time it prints without throwing any errors. Printing the non-empty lists produces no stack overflow. Now I can move on with my project, but...what on earth is going on here? Any ideas? (Please pardon the following ugliness, but I thought it might be helpful)
user=> (def ww (loop etc. etc. ))
#'user/ww
user=> (def x (first ww))
#'user/x
user=> x
StackOverflowError clojure.lang.RT.seq (RT.java:484)
user=> x
()
user=> (def x (nth ww 3))
#'user/x
user=> x
(8.47089879874061 -8.742792338501289 -4.661609290853221)
user=> (def ww (loop etc. etc. ))
#'user/ww
user=> ww
StackOverflowError clojure.core/seq (core.clj:133)
user=> ww
StackOverflowError clojure.core/seq (core.clj:133)
user=> ww
StackOverflowError clojure.core/seq (core.clj:133)
user=> ww
StackOverflowError clojure.core/seq (core.clj:133)
user=> ww
(() () () (8.471553034351501 -8.741870954507117 -4.661171802683782) () (-8.861958958234174 8.828933147027938 18.43649480263751 -4.532462509591159))
If you call doall on a sequence that contains more lazy sequences, doall does not recursively iterate through the subsequences. In this particular case, the return value of b-xor contained empty lists that were defined lazily from previous empty lists defined lazily from previous empty lists, and so on. All I had to do was add a single doall to the map that produced the empty lists (in b-xor), and the problem disappeared. This loop (with all of the doall's removed) never overflows:
(loop [res 1
weights xorw
k 0]
(if (= k 1000000)
weights
(let [n (rand-int 4)
r (xor weights (first (nth xorset n)))]
(recur r
(b-xor weights r (second (nth xorset n)))
(inc k)))))
Okay. So I have an answer. I hope this is helpful to some other poor soul who thought he'd solved his lazy sequencing issues with a badly-placed doall.
This still leaves me with a question about the REPL, but it should probably go under a different question so it won't have all of the baggage of this problem with it. You can see in my question above that the empty lists were evaluated correctly. Why did printing them the first time throw an exception? I'm going to experiment a bit with this, and if I can't figure it out...new question!
I'm trying to understand when clojure's lazy sequences are lazy, and when the work happens, and how I can influence those things.
user=> (def lz-seq (map #(do (println "fn call!") (identity %)) (range 4)))
#'user/lz-seq
user=> (let [[a b] lz-seq])
fn call!
fn call!
fn call!
fn call!
nil
I was hoping to see only two "fn call!"s here. Is there a way to manage that?
Anyway, moving on to something which indisputably only requires one evaluation:
user=> (def lz-seq (map #(do (println "fn call!") (identity %)) (range 4)))
#'user/lz-seq
user=> (first lz-seq)
fn call!
fn call!
fn call!
fn call!
0
Is first not suitable for lazy sequences?
user=> (def lz-seq (map #(do (println "fn call!") (identity %)) (range 4)))
#'user/lz-seq
user=> (take 1 lz-seq)
(fn call!
fn call!
fn call!
fn call!
0)
At this point, I'm completely at a loss as to how to access the beginning of my toy lz-seq without having to realize the entire thing. What's going on?
Clojure's sequences are lazy, but for efficiency are also chunked, realizing blocks of 32 results at a time.
=>(def lz-seq (map #(do (println (str "fn call " %)) (identity %)) (range 100)))
=>(first lz-seq)
fn call 0
fn call 1
...
fn call 31
0
The same thing happens once you cross the 32 boundary first
=>(nth lz-seq 33)
fn call 0
fn call 1
...
fn call 63
33
For code where considerable work needs to be done per realisation, Fogus gives a way to work around chunking, and gives a hint an official way to control chunking might be underway.
I believe that the expression produces a chunked sequence. Try replacing 4 with 10000 in the range expression - you'll see something like 32 calls on first eval, which is the size of the chunk.
A lazy sequence is one where we evaluate the sequence as and when needed. (hence lazy). Once a result is evaluated, it is cached so that it can be re-used (and we don't have to do the work again). If you try to realize an item of the sequence that hasn't been evaluated yet, clojure evaluates it and returns the value to you. However, it also does some extra work. It anticipates that you might want to evaluate the next element(s) in the sequence and does that for you too. This is done to avoid some performance overheads, the exact nature of which is beyond my skill-level. Thus, when you say (first lz-seq), it actually calculates the first as well as the next few elements in the seq. Since your println statement is a side effect, you can see the evaluation happening. Now if you were to say (second lz-seq), you will not see the println again since the result has already been evaluated and cached.
A better way to see that your sequence is lazy is :
user=> def lz-seq (map #(do (println "fn call!") (identity %)) (range 400))
#'user/lz-seq
user=> (first lz-seq)
This will print a few "fn call!" statements, but not all 400 of them. That's because the first call will actually end up evaluating more than one element of the sequence.
Hope this explanation is clear enough.
I think its some sort of optimization made by repl.
My repl is caching 32 at a time.
user=> (def lz-seq (map #(do (println "fn call!") (identity %)) (range 100))
#'user/lz-seq
user=> (first lz-seq)
prints 32 times
user=> (take 20 lz-seq)
does not print any "fn call!"
user=> (take 33 lz-seq)
prints 0 to 30, then prints 32 more "fn call!"s followed by 31,32