When I should make sequence from lazy sequence - clojure

I'm reading Programming Clojure now and I find out next example
(defn by-pairs [coll]
(let
[take-pair (fn [c] (when (next c) (take 2 c)))]
(lazy-seq
(when-let [pair (seq (take-pair coll))] ;seq calls here
(cons pair (by-pairs (rest coll)))))))
it breaks list into pairs, like
(println (by-pairs [1 2 1])) ((1 2) (2 1))
(println (by-pairs [1 2 1 3])) ((1 2) (2 1) (1 3))
(println (by-pairs [])) ()
(println (by-pairs [1])) ()
What I can not get is why we should invoke seq on take-pair result? So why we can not just write
(defn by-pairs [coll]
(let
[take-pair (fn [c] (when (next c) (take 2 c)))]
(lazy-seq
(when-let [pair (take-pair coll)]
(cons pair (by-pairs (rest coll)))))))
In witch cases there are will be different results or are there are any performance reasons?

Both the code are same and there will be no difference because next and take functions that are being applied to coll in take-pair function, do call seq on the passed parameter i.e next c will first call seq on c or try to check if it is an object which implements ISeq and same in being doing by the take function. So basically in this case if you don't call seq yourself, the next and take will call seq on it.

Related

Clojure function to Replace Count

I need help with an assignment that uses Clojure. It is very small but the language is a bit confusing to understand. I need to create a function that behaves like count without actually using the count funtion. I know a loop can be involved with it somehow but I am at a lost because nothing I have tried even gets my code to work. I expect it to output the number of elements in list. For example:
(defn functionname []
...
...)
(println(functionname '(1 4 8)))
Output:3
Here is what I have so far:
(defn functionname [n]
(def n 0)
(def x 0)
(while (< x n)
do
()
)
)
(println(functionname '(1 4 8)))
It's not much but I think it goes something like this.
This implementation takes the first element of the list and runs a sum until it can't anymore and then returns the sum.
(defn recount [list-to-count]
(loop [xs list-to-count sum 0]
(if (first xs)
(recur (rest xs) (inc sum))
sum
)))
user=> (recount '(3 4 5 9))
4
A couple more example implementations:
(defn not-count [coll]
(reduce + (map (constantly 1) coll)))
or:
(defn not-count [coll]
(reduce (fn [a _] (inc a)) 0 coll))
or:
(defn not-count [coll]
(apply + (map (fn [_] 1) coll)))
result:
(not-count '(5 7 8 1))
=> 4
I personally like the first one with reduce and constantly.

Writing a lazy-as-possible unfoldr-like function to generate arbitrary factorizations

problem formulation
Informally speaking, I want to write a function which, taking as input a function that generates binary factorizations and an element (usually neutral), creates an arbitrary length factorization generator. To be more specific, let us first define the function nfoldr in Clojure.
(defn nfoldr [f e]
(fn rec [n]
(fn [s]
(if (zero? n)
(if (empty? s) e)
(if (seq s)
(if-some [x ((rec (dec n)) (rest s))]
(f (list (first s) x))))))))
Here nil is used with the meaning "undefined output, input not in function's domain". Additionally, let us view the inverse relation of a function f as a set-valued function defining inv(f)(y) = {x | f(x) = y}.
I want to define a function nunfoldr such that inv(nfoldr(f , e)(n)) = nunfoldr(inv(f) , e)(n) when for every element y inv(f)(y) is finite, for each binary function f, element e and natural number n.
Moreover, I want the factorizations to be generated as lazily as possible, in a 2-dimensional sense of laziness. My goal is that, when getting some part of a factorization for the first time, there does not happen (much) computation needed for next parts or next factorizations. Similarly, when getting one factorization for the first time, there does not happen computation needed for next ones, whereas all the previous ones get in effect fully realized.
In an alternative formulation we can use the following longer version of nfoldr, which is equivalent to the shorter one when e is a neutral element.
(defn nfoldr [f e]
(fn [n]
(fn [s]
(if (zero? n)
(if (empty? s) e)
((fn rec [n]
(fn [s]
(if (= 1 n)
(if (and (seq s) (empty? (rest s))) (first s))
(if (seq s)
(if-some [x ((rec (dec n)) (rest s))]
(f (list (first s) x)))))))
n)))))
a special case
This problem is a generalization of the problem of generating partitions described in that question. Let us see how the old problem can be reduced to the current one. We have for every natural number n:
npt(n) = inv(nconcat(n)) = inv(nfoldr(concat2 , ())(n)) = nunfoldr(inv(concat2) , ())(n) = nunfoldr(pt2 , ())(n)
where:
npt(n) generates n-ary partitions
nconcat(n) computes n-ary concatenation
concat2 computes binary concatenation
pt2 generates binary partitions
So the following definitions give a solution to that problem.
(defn generate [step start]
(fn [x] (take-while some? (iterate step (start x)))))
(defn pt2-step [[x y]]
(if (seq y) (list (concat x (list (first y))) (rest y))))
(def pt2-start (partial list ()))
(def pt2 (generate pt2-step pt2-start))
(def npt (nunfoldr pt2 ()))
I will summarize my story of solving this problem, using the old one to create example runs, and conclude with some observations and proposals for extension.
solution 0
At first, I refined/generalized the approach I took for solving the old problem. Here I write my own versions of concat and map mainly for a better presentation and, in the case of concat, for some added laziness. Of course we can use Clojure's versions or mapcat instead.
(defn fproduct [f]
(fn [s]
(lazy-seq
(if (and (seq f) (seq s))
(cons
((first f) (first s))
((fproduct (rest f)) (rest s)))))))
(defn concat' [s]
(lazy-seq
(if (seq s)
(if-let [x (seq (first s))]
(cons (first x) (concat' (cons (rest x) (rest s))))
(concat' (rest s))))))
(defn map' [f]
(fn rec [s]
(lazy-seq
(if (seq s)
(cons (f (first s)) (rec (rest s)))))))
(defn nunfoldr [f e]
(fn rec [n]
(fn [x]
(if (zero? n)
(if (= e x) (list ()) ())
((comp
concat'
(map' (comp
(partial apply map)
(fproduct (list
(partial partial cons)
(rec (dec n))))))
f)
x)))))
In an attempt to get inner laziness we could replace (partial partial cons) with something like (comp (partial partial concat) list). Although this way we get inner LazySeqs, we do not gain any effective laziness because, before consing, most of the computation required for fully realizing the rest part takes place, something that seems unavoidable within this general approach. Based on the longer version of nfoldr, we can also define the following faster version.
(defn nunfoldr [f e]
(fn [n]
(fn [x]
(if (zero? n)
(if (= e x) (list ()) ())
(((fn rec [n]
(fn [x] (println \< x \>)
(if (= 1 n)
(list (list x))
((comp
concat'
(map' (comp
(partial apply map)
(fproduct (list
(partial partial cons)
(rec (dec n))))))
f)
x))))
n)
x)))))
Here I added a println call inside the main recursive function to get some visualization of eagerness. So let us demonstrate the outer laziness and inner eagerness.
user=> (first ((npt 5) (range 3)))
< (0 1 2) >
< (0 1 2) >
< (0 1 2) >
< (0 1 2) >
< (0 1 2) >
(() () () () (0 1 2))
user=> (ffirst ((npt 5) (range 3)))
< (0 1 2) >
< (0 1 2) >
< (0 1 2) >
< (0 1 2) >
< (0 1 2) >
()
solution 1
Then I thought of a more promising approach, using the function:
(defn transpose [s]
(lazy-seq
(if (every? seq s)
(cons
(map first s)
(transpose (map rest s))))))
To get the new solution we replace the previous argument in the map' call with:
(comp
(partial map (partial apply cons))
transpose
(fproduct (list
repeat
(rec (dec n)))))
Trying to get inner laziness we could replace (partial apply cons) with #(cons (first %) (lazy-seq (second %))) but this is not enough. The problem lies in the (every? seq s) test inside transpose, where checking a lazy sequence of factorizations for emptiness (as a stopping condition) results in realizing it.
solution 2
A first way to tackle the previous problem that came to my mind was to use some additional knowledge about the number of n-ary factorizations of an element. This way we can repeat a certain number of times and use only this sequence for the stopping condition of transpose. So we will replace the test inside transpose with (seq (first s)), add an input count to nunfoldr and replace the argument in the map' call with:
(comp
(partial map #(cons (first %) (lazy-seq (second %))))
transpose
(fproduct (list
(partial apply repeat)
(rec (dec n))))
(fn [[x y]] (list (list ((count (dec n)) y) x) y)))
Let us turn to the problem of partitions and define:
(defn npt-count [n]
(comp
(partial apply *)
#(map % (range 1 n))
(partial comp inc)
(partial partial /)
count))
(def npt (nunfoldr pt2 () npt-count))
Now we can demonstrate outer and inner laziness.
user=> (first ((npt 5) (range 3)))
< (0 1 2) >
(< (0 1 2) >
() < (0 1 2) >
() < (0 1 2) >
() < (0 1 2) >
() (0 1 2))
user=> (ffirst ((npt 5) (range 3)))
< (0 1 2) >
()
However, the dependence on additional knowledge and the extra computational cost make this solution unacceptable.
solution 3
Finally, I thought that in some crucial places I should use a kind of lazy sequences "with a non-lazy end", in order to be able to check for emptiness without realizing. An empty such sequence is just a non-lazy empty list and overall they behave somewhat like the lazy-conss of the early days of Clojure. Using the definitions given below we can reach an acceptable solution, which works under the assumption that always at least one of the concat'ed sequences (when there is one) is non-empty, something that holds in particular when every element has at least one binary factorization and we are using the longer version of nunfoldr.
(def lazy? (partial instance? clojure.lang.IPending))
(defn empty-eager? [x] (and (not (lazy? x)) (empty? x)))
(defn transpose [s]
(lazy-seq
(if-not (some empty-eager? s)
(cons
(map first s)
(transpose (map rest s))))))
(defn concat' [s]
(if-not (empty-eager? s)
(lazy-seq
(if-let [x (seq (first s))]
(cons (first x) (concat' (cons (rest x) (rest s))))
(concat' (rest s))))
()))
(defn map' [f]
(fn rec [s]
(if-not (empty-eager? s)
(lazy-seq (cons (f (first s)) (rec (rest s))))
())))
Note that in this approach the input function f should produce lazy sequences of the new kind and the resulting n-ary factorizer will also produce such sequences. To take care of the new input requirement, for the problem of partitions we define:
(defn pt2 [s]
(lazy-seq
(let [start (list () s)]
(cons
start
((fn rec [[x y]]
(if (seq y)
(lazy-seq
(let [step (list (concat x (list (first y))) (rest y))]
(cons step (rec step))))
()))
start)))))
Once again, let us demonstrate outer and inner laziness.
user=> (first ((npt 5) (range 3)))
< (0 1 2) >
< (0 1 2) >
(< (0 1 2) >
() < (0 1 2) >
() < (0 1 2) >
() () (0 1 2))
user=> (ffirst ((npt 5) (range 3)))
< (0 1 2) >
< (0 1 2) >
()
To make the input and output use standard lazy sequences (sacrificing a bit of laziness), we can add:
(defn lazy-end->eager-end [s]
(if (seq s)
(lazy-seq (cons (first s) (lazy-end->eager-end (rest s))))
()))
(defn eager-end->lazy-end [s]
(lazy-seq
(if-not (empty-eager? s)
(cons (first s) (eager-end->lazy-end (rest s))))))
(def nunfoldr
(comp
(partial comp (partial comp eager-end->lazy-end))
(partial apply nunfoldr)
(fproduct (list
(partial comp lazy-end->eager-end)
identity))
list))
observations and extensions
While creating solution 3, I observed that the old mechanism for lazy sequences in Clojure might not be necessarily inferior to the current one. With the transition, we gained the ability to create lazy sequences without any substantial computation taking place but lost the ability to check for emptiness without doing the computation needed to get one more element. Because both of these abilities can be important in some cases, it would be nice if a new mechanism was introduced, which would combine the advantages of the previous ones. Such a mechanism could use again an outer LazySeq thunk, which when forced would return an empty list or a Cons or another LazySeq or a new LazyCons thunk. This new thunk when forced would return a Cons or perhaps another LazyCons. Now empty? would force only LazySeq thunks while first and rest would also force LazyCons. In this setting map could look like this:
(defn map [f s]
(lazy-seq
(if (empty? s) ()
(lazy-cons
(cons (f (first s)) (map f (rest s)))))))
I have also noticed that the approach taken from solution 1 onwards lends itself to further generalization. If inside the argument in the map' call in the longer nunfoldr we replace cons with concat, transpose with some implementation of Cartesian product and repeat with another recursive call, we can then create versions that "split at different places". For example, using the following as the argument we can define a nunfoldm function that "splits in the middle" and corresponds to an easy-to-imagine nfoldm. Note that all "splitting strategies" are equivalent when f is associative.
(comp
(partial map (partial apply concat))
cproduct
(fproduct (let [n-half (quot n 2)]
(list (rec n-half) (rec (- n n-half))))))
Another natural modification would allow for infinite factorizations. To achieve this, if f generated infinite factorizations, nunfoldr(f , e)(n) should generate the factorizations in an order of type ω, so that each one of them could be produced in finite time.
Other possible extensions include dropping the n parameter, creating relational folds (in correspondence with the relational unfolds we consider here) and generically handling algebraic data structures other than sequences as input/output. This book, which I have just discovered, seems to contain valuable relevant information, given in a categorical/relational language.
Finally, to be able to do this kind of programming more conveniently, we could transfer it into a point-free, algebraic setting. This would require constructing considerable "extra machinery", in fact almost making a new language. This paper demonstrates such a language.

How to define the partitions (factorizations w.r.t. concatenation) of a sequence as a lazy sequence of lazy sequences in Clojure

I am new to Clojure and I want to define a function pt taking as arguments a number n and a sequence s and returning all the partitions of s in n parts, i.e. its factorizations with respect to n-concatenation. for example (pt 3 [0 1 2]) should produce:
(([] [] [0 1 2]) ([] [0] [1 2]) ([] [0 1] [2]) ([] [0 1 2] []) ([0] [] [1 2]) ([0] [1] [2]) ([0] [1 2] []) ([0 1] [] [2]) ([0 1] [2] []) ([0 1 2] [] []))
with the order being unimportant.
Specifically, I want the result to be a lazy sequence of lazy sequences of vectors.
My first attempt for such a function was the following:
(defn pt [n s]
(lazy-seq
(if (zero? n)
(when (empty? s) [nil])
((fn split [a b]
(concat
(map (partial cons a) (pt (dec n) b))
(when-let [[bf & br] (seq b)] (split (conj a bf) br))))
[] s))))
After that, I wrote a somewhat less concise version which reduces the time complexity by avoiding useless comparisons for 1-part partitions, given below:
(defn pt [n s]
(lazy-seq
(if (zero? n)
(when (empty? s) [nil])
((fn pt>0 [n s]
(lazy-seq
(if (= 1 n)
[(cons (vec s) nil)]
((fn split [a b]
(concat
(map (partial cons a) (pt>0 (dec n) b))
(when-let [[bf & br] (seq b)] (split (conj a bf) br))))
[] s))))
n s))))
The problem with these solutions is that, although they work, they produce a lazy sequence of (non-lazy) cons's and I suspect that quite a different approach must be taken to achieve the "inner laziness". So any corrections, suggestions, explanations are welcome!
EDIT: After reading l0st3d's answer I thought I should make clear that I do not want a partition just to be a LazySeq but to be "really lazy", in the sense that a part is computed and held in memory only when it is requested.
For example, both of the functions given below produce LazySeq's but only the first one produces a "really lazy" sequence.
(defn f [n]
(if (neg? n)
(lazy-seq nil)
(lazy-seq (cons n (f (dec n))))))
(defn f [n]
(if (neg? n)
(lazy-seq nil)
(#(lazy-seq (cons n %)) (f (dec n)))))
So mapping (partial concat [a]) or #(lazy-seq (cons a %)) instead of (partial cons a) does not solve the problem.
The cons call in your split inline fn is the only place where eagerness is being introduced. You could replace that with something that lazily constructs a list, like concat:
(defn pt [n s]
(lazy-seq
(if (zero? n)
(when (empty? s) [nil])
((fn split [a b]
(concat
(map (partial concat [a]) (pt (dec n) b))
(when-let [[bf & br] (seq b)] (split (conj a bf) br))))
[] s))))
(every? #(= clojure.lang.LazySeq (class %)) (pt 3 [0 1 2 3])) ;; => true
But, reading the code I feel like it's fairly unClojurey, and I think that's to do with the use of recursion. Often you'd use things like reductions, partition-by, split-at and so to do this sort of thing. I feel like there should also be a way to make this a transducer and separate out the lazyness from the processing (so you can use sequence to say you want it lazily), but I haven't got time to work that out right now. I'll try and come back with a more complete answer soon.

clojure performance on badly performing code

I have completed this problem on hackerrank and my solution passes most test cases but it is not fast enough for 4 out of the 11 test cases.
My solution looks like this:
(ns scratch.core
(require [clojure.string :as str :only (split-lines join split)]))
(defn ascii [char]
(int (.charAt (str char) 0)))
(defn process [text]
(let [parts (split-at (int (Math/floor (/ (count text) 2))) text)
left (first parts)
right (if (> (count (last parts)) (count (first parts)))
(rest (last parts))
(last parts))]
(reduce (fn [acc i]
(let [a (ascii (nth left i))
b (ascii (nth (reverse right) i))]
(if (> a b)
(+ acc (- a b))
(+ acc (- b a))))
) 0 (range (count left)))))
(defn print-result [[x & xs]]
(prn x)
(if (seq xs)
(recur xs)))
(let [input (slurp "/Users/paulcowan/Downloads/input10.txt")
inputs (str/split-lines input)
length (read-string (first inputs))
texts (rest inputs)]
(time (print-result (map process texts))))
Can anyone give me any advice about what I should look at to make this faster?
Would using recursion instead of reduce be faster or maybe this line is expensive:
right (if (> (count (last parts)) (count (first parts)))
(rest (last parts))
(last parts))
Because I am getting a count twice.
You are redundantly calling reverse on every iteration of the reduce:
user=> (let [c [1 2 3]
noisey-reverse #(doto (reverse %) println)]
(reduce (fn [acc e] (conj acc (noisey-reverse c) e))
[]
[:a :b :c]))
(3 2 1)
(3 2 1)
(3 2 1)
[(3 2 1) :a (3 2 1) :b (3 2 1) :c]
The reversed value could be calculated inside the containing let, and would then only need to be calculated once.
Also, due to the way your parts is defined, you are doing linear time lookups with each call to nth. It would be better to put parts in a vector and do indexed lookup. In fact you wouldn't need a reversed parts, and could do arithmetic based on the count of the vector to find the item to look up.

clojure - using loop and recur with a lazy sequence

If I am returning a lazy-seq from a function like this:
(letfn [(permutations [s]
(lazy-seq
(if (seq (rest s))
(apply concat (for [x s]
(map #(cons x %) (permutations (remove #{x} s)))))
[s])))])
If I use loop recur like below, will the list be eagerly evaluated?
(loop [perms (permutations chain)]
(if (empty? perms)
(prn "finised")
(recur (rest perms))))
If it is eagerly evaluated, can I use loop..recur to lazily loop over what is returned from the permutations function?
The list is lazily evaluated by your loop-recur code.
You can try it out yourself. Let's make permutations print something every time it returns a value by adding a println call.
(defn permutations [s]
(lazy-seq
(if (seq (rest s))
(apply concat (for [x s]
(map #(cons x %) (permutations (remove #{x} s)))))
(do
(println "returning a value")
[s]))))
When using loop, let's also print the values with we're looping over with (prn (first perms).
(loop [perms (permutations [1 2 3])]
(if (empty? perms)
(prn "finised")
(do
(prn (first perms))
(recur (rest perms)))))
Here's what it prints:
returning a value
(1 2 3)
returning a value
(1 3 2)
returning a value
(2 1 3)
returning a value
(2 3 1)
returning a value
(3 1 2)
returning a value
(3 2 1)
"finished"
As you can see, "returning a value" and the value lines are interlaced. The evaluation of the lazy seq can be forced with doall. If you loop over (doall (permutations [1 2 3])), first it prints all the "returning a value" lines and only then the values.