Clojure. Why the wrapper lazy-seq is required? - clojure

Why the wrapper lazy-cons is required? There are two functions with the same result.
(defn seq1 [s]
(lazy-seq
(when-let [x (seq s)]
(cons (first x) (seq1 (rest x))))))
(defn seq2 [s]
(when-let [x (seq s)]
(cons (first x) (seq2 (rest x)))))
Both case I got the same result none chunked sequences.
repl.core=> (first (map println (seq1 (range 1000))))
0
nil
repl.core=> (first (map println (seq2 (range 1000))))
0
nil
repl.core=> (chunked-seq? (seq2 (range 1000)))
false
repl.core=> (chunked-seq? (seq1 (range 1000)))
false

The first is lazy. It only evaluates elements of the sequence as needed. The second however is strict and goes through the entire sequence immediately. This can be seen if you add some println calls in each:
(defn seq1 [s]
(lazy-seq
(when-let [x (seq s)]
(println "Seq1" (first x))
(cons (first x) (seq1 (rest x))))))
(defn seq2 [s]
(when-let [x (seq s)]
(println "Seq2" (first x))
(cons (first x) (seq2 (rest x)))))
(->> (range 10)
(seq1)
(take 5))
Seq1 0
Seq1 1
Seq1 2
Seq1 3
Seq1 4 ; Only iterated over what was asked for
=> (0 1 2 3 4)
(->> (range 10)
(seq2)
(take 5))
Seq2 0
Seq2 1
Seq2 2
Seq2 3
Seq2 4
Seq2 5
Seq2 6
Seq2 7
Seq2 8
Seq2 9 ; Iterated over everything immediately
=> (0 1 2 3 4)
So, to answer the question, lazy-seq is only required if you intend the iteration of the sequence to only happen as needed. Prefer lazy solutions if you think you may not need the entire sequence, or the sequence is infinite, or you want to sequence several transformations using map or filter. Use a strict solution if you're likely going to need the entire sequence at some point and/or you need fast random access.

Related

Why the program runs endlessly?

Why the program runs endlessly?
(defn lazycombine
([s] (lazycombine s []))
([s v] (let [a (take 1 s)
b (drop 1 s)]
(if (= a :start)
(lazy-seq (lazycombine b v))
(if (= a :end)
(lazy-seq (cons v (lazycombine b [])))
(lazy-seq (lazycombine b (conj v a))))))))
(def w '(:start 1 2 3 :end :start 7 7 :end))
(lazycombine w)
I need a function that returns a lazy sequence of elements by taking elements from another sequence of the form [: start 1 2: end: start: 5: end] and combining all the elements between: start and: end into a vector
You need to handle the termination condition - i.e. what should return when input s is empty?
And also the detection of :start and :end should use first instead of (take 1 s). And you can simplify that with destructuring.
(defn lazycombine
([s] (lazycombine s []))
([[a & b :as s] v]
(if (empty? s)
v
(if (= a :start)
(lazy-seq (lazycombine b v))
(if (= a :end)
(lazy-seq (cons v (lazycombine b [])))
(lazy-seq (lazycombine b (conj v a))))))))
(def w '(:start 1 2 3 :end :start 7 7 :end))
(lazycombine w)
;; => ([1 2 3] [7 7])
To reduce cyclomatic complexity a bit, you can use condp to replace couple if:
(defn lazycombine
([s] (lazycombine s []))
([[a & b :as s] v]
(if (empty? s)
v
(lazy-seq
(condp = a
:start (lazycombine b v)
:end (cons v (lazycombine b []))
(lazycombine b (conj v a)))))))
I would do it like so, using take-while:
(ns tst.demo.core
(:use tupelo.core tupelo.test))
(def data
[:start 1 2 3 :end :start 7 7 :end])
(defn end-tag? [it] (= it :end))
(defn start-tag? [it] (= it :start))
(defn lazy-segments
[data]
(when-not (empty? data)
(let [next-segment (take-while #(not (end-tag? %)) data)
data-next (drop (inc (count next-segment)) data)
segment-result (vec (remove #(start-tag? %) next-segment))]
(cons segment-result
(lazy-seq (lazy-segments data-next))))))
(dotest
(println "result: " (lazy-segments data)))
Running we get:
result: ([1 2 3] [7 7])
Note the contract when constructing a sequence recursively using cons (lazy or not). You must return either the next value in the sequence, or nil. Supplying nil to cons is the same as supplying an empty sequence:
(cons 5 nil) => (5)
(cons 5 []) => (5)
So it is convenient to use a when form to test the termination condition (instead of using if and returning an empty vector when the sequence must end).
Suppose we wrote the cons as a simple recursion:
(cons segment-result
(lazy-segments data-next))
This works great and produces the same result. The only thing the lazy-seq part does is to delay when the recursive call takes place. Because lazy-seq is a Clojure built-in (special form), it it is similar to loop/recur and does not consume the stack like ordinary recursion does . Thus, we can generate millions (or more) values in the lazy sequence without creating a StackOverflowError (on my computer, the default maximum stack size is ~4000). Consider the infinite lazy-sequence of integers beginning at 0:
(defn intrange
[n]
(cons n (lazy-seq (intrange (inc n)))))
(dotest
(time
(spyx (first (drop 1e6 (intrange 0))))))
Dropping the first million integers and taking the next one succeeds and requires only a few milliseconds:
(first (drop 1000000.0 (intrange 0))) => 1000000
"Elapsed time: 49.5 msecs"

How to get tails of sequence clojure

I have sequence in clojure of theem
(1 2 3 4)
how can I get all the tails of sequence like
((1 2 3 4) (2 3 4) (3 4) (4) ())
Another way to get all tails is by using the reductions function.
user=> (def x '(1 2 3 4))
#'user/x
user=> (reductions (fn [s _] (rest s)) x x)
((1 2 3 4) (2 3 4) (3 4) (4) ())
user=>
If you want to do this with higher-level functions, I think iterate would work well here:
(defn tails [xs]
(concat (take-while seq (iterate rest xs)) '(()))
However, I think in this case it would be cleaner to just write it with lazy-seq:
(defn tails [xs]
(if-not (seq xs) '(())
(cons xs (lazy-seq (tails (rest xs))))))
Here is one way.
user=> (def x [1 2 3 4])
#'user/x
user=> (map #(drop % x) (range (inc (count x))))
((1 2 3 4) (2 3 4) (3 4) (4) ())
One way you can do that is by
(defn tails [coll]
(take (inc (count coll)) (iterate rest coll)))
(defn tails
[s]
(cons s (if-some [r (next s)]
(lazy-seq (tails r))
'(()))))
Yippee! Another one:
(defn tails [coll]
(if-let [s (seq coll)]
(cons coll (lazy-seq (tails (rest coll))))
'(())))
This is really just what reductions does under the hood. The best answer, by the way, is ez121sl's.

Function for replacing subsequences

Is there a function that could replace subsequences? For example:
user> (good-fnc [1 2 3 4 5] [1 2] [3 4 5])
;; => [3 4 5 3 4 5]
I know that there is clojure.string/replace for strings:
user> (clojure.string/replace "fat cat caught a rat" "a" "AA")
;; => "fAAt cAAt cAAught AA rAAt"
Is there something similar for vectors and lists?
Does this work for you?
(defn good-fnc [s sub r]
(loop [acc []
s s]
(cond
(empty? s) (seq acc)
(= (take (count sub) s) sub) (recur (apply conj acc r)
(drop (count sub) s))
:else (recur (conj acc (first s)) (rest s)))))
Here is a version that plays nicely with lazy seq inputs. Note that it can take an infinite lazy sequence (range) without looping infinitely as a loop based version would.
(defn sq-replace
[match replacement sq]
(let [matching (count match)]
((fn replace-in-sequence [[elt & elts :as sq]]
(lazy-seq
(cond (empty? sq)
()
(= match (take matching sq))
(concat replacement (replace-in-sequence (drop matching sq)))
:default
(cons elt (replace-in-sequence elts)))))
sq)))
#'user/sq-replace
user> (take 10 (sq-replace [3 4 5] ["hello, world"] (range)))
(0 1 2 "hello, world" 6 7 8 9 10 11)
I took the liberty of making the sequence argument the final argument, since this is the convention in Clojure for functions that walk a sequence.
My previous (now deleted) answer was incorrect because this was not as trivial as I first thought, here is my second attempt:
(defn seq-replace
[coll sub rep]
(letfn [(seq-replace' [coll]
(when-let [s (seq coll)]
(let [start (take (count sub) s)
end (drop (count sub) s)]
(if (= start sub)
(lazy-cat rep (seq-replace' end))
(cons (first s) (lazy-seq (seq-replace' (rest s))))))))]
(seq-replace' coll)))

Clojure: find repetition

Let's say we have a list of integers: 1, 2, 5, 13, 6, 5, 7 and I want to find the first number that repeats and return a vector of the two indices. In my sample, it's 5 at [2, 5]. What I did so far is loop, but can I do it more elegant, short way?
(defn get-cycle
[xs]
(loop [[x & xs_rest] xs, indices {}, i 0]
(if (nil? x)
[0 i] ; Sequence is over before we found a duplicate.
(if-let [x_index (indices x)]
[x_index i]
(recur xs_rest (assoc indices x i) (inc i))))))
No need to return number itself, because I can get it by index and, second, it may be not always there.
An option using list processing, but not significantly more concise:
(defn get-cycle [xs]
(first (filter #(number? (first %))
(reductions
(fn [[m i] x] (if-let [xat (m x)] [xat i] [(assoc m x i) (inc i)]))
[(hash-map) 0] xs))))
Here is a version using reduced to stop consuming the sequence when you find the first duplicate:
(defn first-duplicate [coll]
(reduce (fn [acc [idx x]]
(if-let [v (get acc x)]
(reduced (conj v idx))
(assoc acc x [idx])))
{} (map-indexed #(vector % %2) coll)))
I know that you have only asked for the first. Here is a fully lazy implementation with little per-step allocation overhead
(defn dups
[coll]
(letfn [(loop-fn [idx [elem & rest] cached]
(if elem
(if-let [last-idx (cached elem)]
(cons [last-idx idx]
(lazy-seq (loop-fn (inc idx) rest (dissoc cached elem))))
(lazy-seq (loop-fn (inc idx) rest (assoc cached elem idx))))))]
(loop-fn 0 coll {})))
(first (dups v))
=> [2 5]
Edit: Here are some criterium benchmarks:
The answer that got accepted: 7.819269 µs
This answer (first (dups [12 5 13 6 5 7])): 6.176290 µs
Beschastnys: 5.841101 µs
first-duplicate: 5.025445 µs
Actually, loop is a pretty good choice unless you want to find all duplicates. Things like reduce will cause the full scan of an input sequence even when it's not necessary.
Here is my version of get-cycle:
(defn get-cycle [coll]
(loop [i 0 seen {} coll coll]
(when-let [[x & xs] (seq coll)]
(if-let [j (seen x)]
[j i]
(recur (inc i) (assoc seen x i) xs)))))
The only difference from your get-cycle is that my version returns nil when there is no duplicates.
The intent of your code seems different from your description in the comments so I'm not totally confident I understand. That said, loop/recur is definitely a valid way to approach the problem.
Here's what I came up with:
(defn get-cycle [xs]
(loop [xs xs index 0]
(when-let [[x & more] (seq xs)]
(when-let [[y] (seq more)]
(if (= x y)
{x [index (inc index)]}
(recur more (inc index)))))))
This will return a map of the repeated item to a vector of the two indices the item was found at.
(get-cycle [1 1 2 1 2 4 2 1 4 5 6 7])
;=> {1 [0 1]}
(get-cycle [1 2 1 2 4 2 1 4 5 6 7 7])
;=> {7 [10 11]}
(get-cycle [1 2 1 2 4 2 1 4 5 6 7 8])
;=> nil
Here's an alternative solution using sequence functions. I like this way better but whether it's shorter or more elegant is probably subjective.
(defn pairwise [coll]
(map vector coll (rest coll)))
(defn find-first [pred xs]
(first (filter pred xs)))
(defn get-cycle [xs]
(find-first #(apply = (val (first %)))
(map-indexed hash-map (pairwise xs))))
Edited based on clarification from #demi
Ah, got it. Is this what you have in mind?
(defn get-cycle [xs]
(loop [xs (map-indexed vector xs)]
(when-let [[[i n] & more] (seq xs)]
(if-let [[j _] (find-first #(= n (second %)) more)]
{n [i j]}
(recur more)))))
I re-used find-first from my earlier sequence-based solution.

Partition by a seq of integers

What would be a more idiomatic way to partition a seq based on a seq of integers instead of just one integer?
Here's my implementation:
(defn partition-by-seq
"Return a lazy sequence of lists with a variable number of items each
determined by the n in ncoll. Extra values in coll are dropped."
[ncoll coll]
(let [partition-coll (mapcat #(repeat % %) ncoll)]
(->> coll
(map vector partition-coll)
(partition-by first)
(map (partial map last)))))
Then (partition-by-seq [2 3 6] (range)) yields ((0 1) (2 3 4) (5 6 7 8 9 10)).
Your implementation looks fine, but there could be a more simple solution which uses simple recursion wrapped in lazy-seq(and turns out to be more efficient) than using map and existing partition-by as in your case.
(defn partition-by-seq [ncoll coll]
(if (empty? ncoll)
'()
(let [n (first ncoll)]
(cons (take n coll)
(lazy-seq (partition-by-seq (rest ncoll) (drop n coll)))))))
A variation on Ankur's answer, with a minor addition of laziness and when-let instead of an explicit test for empty?.
(defn partition-by-seq [parts coll]
(lazy-seq
(when-let [s (seq parts)]
(cons
(take (first s) coll)
(partition-by-seq (rest s) (nthrest coll (first s)))))))
(first (reduce (fn [[r l] n]
[(conj r (take n l)) (drop n l)])
[[] (range)]
[2 3 6]))
=> [(0 1) (2 3 4) (5 6 7 8 9 10)]