Either I'm exhausted for the day and not able to think properly, or this is not possible, but I wanted to swap! an atom that refers to an infinite lazy seq with the rest of the seq that is currently in it
My program hangs for obvious reasons because compare-and-set! tries to check for previous and new seq equality before swapping. The equality check basically never terminates
Any clue on how to achieve this?
(def beyond-infinity (atom (repeat 1)))
(defn keep-pulling [] (swap! beyond-infinity #(rest %)))
EDIT
Previously my lazy seq was range in which it worked, but on REPL due to realization of the returned seq, my REPL was hung
Anyway, with repeat it still fails
The issue was because of my REPL trying to realize an infinite sequence.
Also the thing I mentioned about compare-and-set! doing a equality check of the seq is wrong. The compare-and-set! does a reference equality test and not of the value!
Related
I've got a function that looks at two of these objects, does some mystery logic, and returns either one of them, or both (as a sequence).
I've got a sequence of these objects [o1 o2 o3 o4 ...], and I want to return a result of processing it like this:
call the mystery function on o1 and o2
keep the butlast of what you've got so far
take the last of the result of the previous mystery function, and call the mystery function on it, and o3
keep the butlast of what you've got so far
take the last of the result of the previous mystery function, and call the mystery function on it, and o4
keep the butlast of what you've got so far
take the last of the result of the previous mystery function, and call the mystery function on it, and oN
....
Here's what I've got so far:
; the % here is the input sequence
#(reduce update-algorithm [(first %)] (rest %))
(defn update-algorithm
[input-vector o2]
(apply conj (pop input-vector)
(mystery-function (peek input-vector) o2)))
What's an idiomatic way of writing this? I don't like the way that this looks. I think the apply conj is a little hard to read and so is the [(first %)] (rest %) on the first line.
into would be a better choice than apply conj.
I think [(first %)] (rest %) is just fine though. Probably the shortest way to write this and it makes it completely clear what the seed of the reduction and the sequence being reduced are.
Also, reduce is a perfect match to the task at hand, not only in the sense that it works, but also in the sense that the task is a reduction / fold. Similarly pop and peek do exactly the thing specified in the sense that it is their purpose to "keep the butlast" and "take the last" of what's been accumulated (in a vector). With the into change, the code basically tells the same story the spec does, and in fewer words to boot.
So, nope, no way to improve this, sorry. ;-)
I'm trying to use create a Clojure seq from some iterative Java library code that I inherited. Basically what the Java code does is read records from a file using a parser, sends those records to a processor and returns an ArrayList of result. In Java this is done by calling parser.readData(), then parser.getRecord() to get a record then passing that record into processor.processRecord(). Each call to parser.readData() returns a single record or null if there are no more records. Pretty common pattern in Java.
So I created this next-record function in Clojure that will get the next record from a parser.
(defn next-record
"Get the next record from the parser and process it."
[parser processor]
(let [datamap (.readData parser)
row (.getRecord parser datamap)]
(if (nil? row)
nil
(.processRecord processor row 100))))
The idea then is to call this function and accumulate the records into a Clojure seq (preferably a lazy seq). So here is my first attempt which works great as long as there aren't too many records:
(defn datamap-seq
"Returns a lazy seq of the records using the given parser and processor"
[parser processor]
(lazy-seq
(when-let [records (next-record parser processor)]
(cons records (datamap-seq parser processor)))))
I can create a parser and processor, and do something like (take 5 (datamap-seq parser processor)) which gives me a lazy seq. And as expected getting the (first) of that seq only realizes one element, doing count realizes all of them, etc. Just the behavior I would expect from a lazy seq.
Of course when there are a lot of records I end up with a StackOverflowException. So my next attempt was to use loop-recur to do the same thing.
(defn datamap-seq
"Returns a lazy seq of the records using the given parser and processor"
[parser processor]
(lazy-seq
(loop [records (seq '())]
(if-let [record (next-record parser processor)]
(recur (cons record records))
records))))
Now using this the same way and defing it using (def results (datamap-seq parser processor)) gives me a lazy seq and doesn't realize any elements. However, as soon as I do anything else like (first results) it forces the realization of the entire seq.
Can anyone help me understand where I'm going wrong in the second function using loop-recur that causes it to realize the entire thing?
UPDATE:
I've looked a little closer at the stack trace from the exception and the stack overflow exception is being thrown from one of the Java classes. BUT it only happens when I have the datamap-seq function like this (the one I posted above actually does work):
(defn datamap-seq
"Returns a lazy seq of the records using the given parser and processor"
[parser processor]
(lazy-seq
(when-let [records (next-record parser processor)]
(cons records (remove empty? (datamap-seq parser processor))))))
I don't really understand why that remove causes problems, but when I take it out of this funciton it all works right (I'm doing the removal of empty lists somewhere else now).
loop/recur loops within the loop expression until the recursion runs out. adding a lazy-seq around it won't prevent that.
Your first attempt with lazy-seq / cons should already work as you want, without stack overflows. I can't spot right now what the problem with it is, though it might be in the java part of the code.
I'll post here addition to Joost's answer. This code:
(defn integers [start]
(lazy-seq
(cons
start
(integers (inc start)))))
will not throw StackOverflowExceptoin if I do something like this:
(take 5 (drop 1000000 (integers)))
EDIT:
Of course better way to do it would be to (iterate inc 0). :)
EDIT2:
I'll try to explain a little how lazy-seq works. lazy-seq is a macro that returns seq-like object. Combined with cons that doesn't realize its second argument until it is requested you get laziness.
Now take a look at how LazySeq class is implemented. LazySeq.sval triggers computation of the next value which returns another instance of "frozen" lazy sequence. Method LazySeq.seq even better shows mechanics behind the concept. Notice that to fully realize sequence it uses while loop. It in itself means that stack trace use is limited to short function calls that return another instances of LazySeq.
I hope this makes any sense. I described what I could deduce from the source code. Please let me know if I made any mistakes.
I have a simple prime number calculator in clojure (an inefficient algorithm, but I'm just trying to understand the behavior of recur for now). The code is:
(defn divisible [x,y] (= 0 (mod x y)))
(defn naive-primes [primes candidates]
(if (seq candidates)
(recur (conj primes (first candidates))
(remove (fn [x] (divisible x (first candidates))) candidates))
primes)
)
This works as long as I am not trying to find too many numbers. For example
(print (sort (naive-primes [] (range 2 2000))))
works. For anything requiring more recursion, I get an overflow error.
(print (sort (naive-primes [] (range 2 20000))))
will not work. In general, whether I use recur or call naive-primes again without the attempt at TCO doesn't appear to make any difference. Why am I getting errors for large recursions while using recur?
recur always uses tail recursion, regardless of whether you are recurring to a loop or a function head. The issue is the calls to remove. remove calls first to get the element from the underlying seq and checks to see if that element is valid. If the underlying seq was created by a call to remove, you get another call to first. If you call remove 20000 times on the same seq, calling first requires calling first 20000 times, and none of the calls can be tail recursive. Hence, the stack overflow error.
Changing (remove ...) to (doall (remove ...)) fixes the problem, since it prevents the infinite stacking of remove calls (each one gets fully applied immediately and returns a concrete seq, not a lazy seq). I think this method only ever keeps one candidates list in memory at one time, though I am not positive about this. If so, it isn't too space inefficient, and a bit of testing shows that it isn't actually much slower.
I tried the following in Clojure, expecting to have the class of a non-lazy sequence returned:
(.getClass (doall (take 3 (repeatedly rand))))
However, this still returns clojure.lang.LazySeq. My guess is that doall does evaluate the entire sequence, but returns the original sequence as it's still useful for memoization.
So what is the idiomatic means of creating a non-lazy sequence from a lazy one?
doall is all you need. Just because the seq has type LazySeq doesn't mean it has pending evaluation. Lazy seqs cache their results, so all you need to do is walk the lazy seq once (as doall does) in order to force it all, and thus render it non-lazy. seq does not force the entire collection to be evaluated.
This is to some degree a question of taxonomy. a lazy sequence is just one type of sequence as is a list, vector or map. So the answer is of course "it depends on what type of non lazy sequence you want to get:
Take your pick from:
an ex-lazy (fully evaluated) lazy sequence (doall ... )
a list for sequential access (apply list (my-lazy-seq)) OR (into () ...)
a vector for later random access (vec (my-lazy-seq))
a map or a set if you have some special purpose.
You can have whatever type of sequence most suites your needs.
This Rich guy seems to know his clojure and is absolutely right.
Buth I think this code-snippet, using your example, might be a useful complement to this question :
=> (realized? (take 3 (repeatedly rand)))
false
=> (realized? (doall (take 3 (repeatedly rand))))
true
Indeed type has not changed but realization has
I stumbled on this this blog post about doall not being recursive. For that I found the first comment in the post did the trick. Something along the lines of:
(use 'clojure.walk)
(postwalk identity nested-lazy-thing)
I found this useful in a unit test where I wanted to force evaluation of some nested applications of map to force an error condition.
(.getClass (into '() (take 3 (repeatedly rand))))
Almost 2 identical programs to generate infinite lazy seqs of randoms.
The first doesn't crash. The second crash with OutOfMemoryError exception. Why?
;Return infinite lazy sequence of random numbers
(defn inf-rand[] (lazy-seq (cons (rand) (inf-rand))))
;Never returns. Burns the CPU but won't crash and lives forever.
(last (inf-rand))
But the following crash pretty quickly:
;Return infinite lazy sequence of random numbers
(defn inf-rand[] (lazy-seq (cons (rand) (inf-rand))))
(def r1 (inf-rand))
;Crash with "OutOfMemoryError"
(last r1)
I believe this is an example of "holding onto the head".
By making the reference r1 in the second example you open up the possibility of later saying something like (first r1) so you will end up storing the members of your lazy-seq as they are reified.
In the first case Clojure can determine that nothing will ever be done with earlier members of the infinite sequence so they can be disposed of and not consume memory.
I am still very much a Clojure beginner myself, any comments or corrections to my understanding or terminology greatly appreciated.