creating a finite lazy sequence - clojure

I'm using the function iterate to create a lazy sequence. The sequence keeps producing a new value for each item. At some point, however, the produced values don't "make sense" anymore, so they are useless. This should be the end of the lazy sequence. That is the intended behavior in an abstract form.
My approach was to let the sequence produce the values and, once it detects that they are no longer useful, emit only nil values. The sequence is then wrapped in a take-while to make it finite.
simplified:
(take-while (comp not nil?)
            (iterate #(let [v (myfunction1 %)]
                        (if (mypred? (myfunction2 v)) v nil))
                     start-value))
This works, but two questions arise here:
Is it generally a good idea to model a finite lazy sequence with a nil as a "stopper", or are there better ways?
The second question would be related to the way I implemented the mechanism above, especially inside the iterate.
The problem is: I need one function to get a value, then a predicate to test whether it's valid. If it is, it needs to pass through a second function; otherwise, return nil.
I'm looking for a less imperative way to achieve this, more concretely one that omits the let statement. Rather something like this:
(defn pass-if-true [pred v f]
  (when (pred v) (f v)))
#(pass-if-true mypred? (myfunction1 %) myfunction2)
For now, I'll go with this:
(comp #(when (mypred? %) (myfunction2 %)) myfunction1)

Is it generally a good idea to model a finite lazy sequence with a nil as a "stopper", or are there better ways?
nil is the idiomatic way to end a finite lazy sequence.
Regarding the second question, try writing it this way:
(def predicate (partial > 10))
(take-while predicate (iterate inc 0))
;; => (0 1 2 3 4 5 6 7 8 9)
Here inc takes the previous value and produces the next value, and predicate tests whether a value is good. The first time predicate returns false, the sequence is terminated.
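As a minimal sketch of both options (the step function and its cutoff of 5 are made up for illustration, not taken from the question), the nil "stopper" can be paired with some?, or the validity test can move into the take-while predicate so that no sentinel is needed:

```clojure
;; Hypothetical step function: returns the next value, or nil once the
;; values stop being useful (here: once n has reached 5).
(defn step [n]
  (when (< n 5)
    (inc n)))

;; Option 1: nil as the stopper; some? is true for every non-nil value.
(def with-sentinel
  (take-while some? (iterate step 0)))
;; => (0 1 2 3 4 5)

;; Option 2: no sentinel; the predicate itself decides where to stop.
(def with-predicate
  (take-while pos? (iterate #(quot % 2) 100)))
;; => (100 50 25 12 6 3 1)
```

The sentinel variant cannot represent a sequence in which nil is a legitimate element, which is one reason to prefer a plain predicate when the validity test can be expressed that way.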

Using a return value of nil can make a lazy sequence terminate.
For example, this code calculates the greatest common divisor of two integers:
(defn remainder-sequence [n d]
(let [[q r] ((juxt quot rem) n d)]
(if (= r 0) nil
(lazy-seq (cons r (remainder-sequence d r))))))
(defn gcd [n d]
(cond (< (Math/abs n) (Math/abs d)) (gcd d n)
(= 0 (rem n d)) d
:default (last (remainder-sequence n d))))
(gcd 100 32) ; returns 4
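The same termination mechanism can be seen in isolation in a smaller sketch (countdown is a made-up example, not part of the gcd code): when the body of lazy-seq evaluates to nil, the sequence simply ends there.

```clojure
;; A hand-rolled finite lazy sequence: when returns nil once n reaches 0,
;; and a lazy-seq whose body is nil is the empty sequence.
(defn countdown [n]
  (lazy-seq
    (when (pos? n)
      (cons n (countdown (dec n))))))

(countdown 3) ;; => (3 2 1)
(countdown 0) ;; => ()
```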

Related

How to make reduce more readable in Clojure?

A reduce call has its f argument first. Visually speaking, this is often the biggest part of the form.
e.g.
(reduce
  (fn [[longest current] x]
    (let [tail (last current)
          next-seq (if (or (not tail) (> x tail))
                     (conj current x)
                     [x])
          new-longest (if (> (count next-seq) (count longest))
                        next-seq
                        longest)]
      [new-longest next-seq]))
  [[] []]
  col)
The problem is, the val argument (in this case [[][]]) and col argument come afterward, below, and it's a long way for your eyes to travel to match those with the parameters of f.
It would look more readable to me if it were in this order instead:
(reduceb val col
(fn [x y]
...))
Should I implement this macro, or am I approaching this entirely wrong in the first place?
You certainly shouldn't write that macro, since it is easily written as a function instead. I'm not super keen on writing it as a function, either, though; if you really want to pair the reduce with its last two args, you could write:
(-> (fn [x y]
...)
(reduce init coll))
Personally when I need a large function like this, I find that a comma actually serves as a good visual anchor, and makes it easier to tell that two forms are on that last line:
(reduce (fn [x y]
...)
init, coll)
Better still is usually to not write such a large reduce in the first place. Here you're combining at least two steps into one rather large and difficult step, by trying to find all at once the longest decreasing subsequence. Instead, try splitting the collection up into decreasing subsequences, and then take the largest one.
(defn decreasing-subsequences [xs]
(lazy-seq
(cond (empty? xs) []
(not (next xs)) (list xs)
:else (let [[x & [y :as more]] xs
remainder (decreasing-subsequences more)]
(if (> y x)
(cons [x] remainder)
(cons (cons x (first remainder)) (rest remainder)))))))
Then you can replace your reduce with:
(apply max-key count (decreasing-subsequences xs))
Now, the lazy function is not particularly shorter than your reduce, but it is doing one single thing, which means it can be understood more easily; also, it has a name (giving you a hint as to what it's supposed to do), and it can be reused in contexts where you're looking for some other property based on decreasing subsequences, not just the longest. You can even reuse it more often than that, if you replace the > in (> y x) with a function parameter, allowing you to split up into subsequences based on any predicate. Plus, as mentioned it is lazy, so you can use it in situations where a reduce of any sort would be impossible.
Speaking of ease of understanding, as you can see I misunderstood what your function is supposed to do when reading it. I'll leave as an exercise for you the task of converting this to strictly-increasing subsequences, where it looked to me like you were computing decreasing subsequences.
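The generalisation mentioned above, where the comparison becomes a function parameter, might look like the following sketch. The name split-runs and the parameter break? are hypothetical; break? receives two neighbouring elements and returns true where a new run should start.

```clojure
;; Hypothetical generalisation of decreasing-subsequences: a new run starts
;; wherever (break? x y) is true for neighbouring elements x and y.
(defn split-runs [break? xs]
  (lazy-seq
    (cond (empty? xs) []
          (not (next xs)) (list xs)
          :else (let [[x & [y :as more]] xs
                      remainder (split-runs break? more)]
                  (if (break? x y)
                    (cons [x] remainder)
                    (cons (cons x (first remainder)) (rest remainder)))))))

;; Break where the next element is larger: decreasing runs.
(split-runs < [3 2 1 5 4])
;; => ((3 2 1) (5 4))

;; Break where the next element is not larger: strictly increasing runs.
(apply max-key count (split-runs >= [1 2 3 1 2]))
;; => (1 2 3)
```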
You don't have to use reduce or recursion to get the descending (or ascending) sequences. Here we are returning all the descending sequences in order from longest to shortest:
(def in [3 2 1 0 -1 2 7 6 7 6 5 4 3 2])
(defn descending-sequences [xs]
(->> xs
(partition 2 1)
(map (juxt (fn [[x y]] (> x y)) identity))
(partition-by first)
(filter ffirst)
(map #(let [xs' (mapcat second %)]
(take-nth 2 (cons (first xs') xs'))))
(sort-by (comp - count))))
(descending-sequences in)
;;=> ((7 6 5 4 3 2) (3 2 1 0 -1) (7 6))
(partition 2 1) gives every possible comparison and partition-by allows you to mark out the runs of continuous decreases. At this point you can already see the answer and the rest of the code is removing the baggage that is no longer needed.
If you want the ascending sequences instead, you only need to change the > to a <:
;;=> ((-1 2 7) (6 7))
If, as in the question, you only want the longest sequence then put a first as the last function call in the thread last macro. Alternatively replace the sort-by with:
(apply max-key count)
For maximum readability you can name the operations:
(defn greatest-continuous [op xs]
(let [op-pair? (fn [[x y]] (op x y))
take-every-second #(take-nth 2 (cons (first %) %))
make-canonical #(take-every-second (apply concat %))]
(->> xs
(partition 2 1)
(partition-by op-pair?)
(filter (comp op-pair? first))
(map make-canonical)
(apply max-key count))))
I feel your pain...they can be hard to read.
I see 2 possible improvements. The simplest is to write a wrapper similar to the Plumatic Plumbing defnk style:
(fnk-reduce { :fn (fn [state val] ... <new state value>)
:init []
:coll some-collection } )
so the function call has a single map arg, where each of the 3 pieces is labelled & can come in any order in the map literal.
Another possibility is to just extract the reducing fn and give it a name. This can be either internal or external to the code expression containing the reduce:
(let [glommer (fn [state value] (into state value)) ]
(reduce glommer #{} some-coll))
or possibly
(defn glommer [state value] (into state value))
(reduce glommer #{} some-coll)
As always, anything that increases clarity is preferred. If you haven't noticed already, I'm a big fan of Martin Fowler's idea of Introduce Explaining Variable refactoring. :)
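Applied to the reduce from the question, extracting and naming the reducing function could look roughly like this. The names extend-run and longest-increasing-run are my own, and peek stands in for last because the state is a vector; treat it as a sketch rather than a drop-in replacement.

```clojure
;; Hypothetical name for the reducing step; the state is [longest current].
(defn extend-run [[longest current] x]
  (let [tail     (peek current)
        next-run (if (or (nil? tail) (> x tail))
                   (conj current x)
                   [x])]
    [(if (> (count next-run) (count longest)) next-run longest)
     next-run]))

;; The reduce call itself now fits on one line.
(defn longest-increasing-run [coll]
  (first (reduce extend-run [[] []] coll)))

(longest-increasing-run [1 2 0 1 2 3]) ;; => [0 1 2 3]
```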
I will apologize in advance for posting a longer solution to something where you wanted more brevity/clarity.
We are in the new age of clojure transducers and it appears a bit that your solution was passing the "longest" and "current" forward for record-keeping. Rather than passing that state forward, a stateful transducer would do the trick.
(def longest-decreasing
  (fn [rf]
    (let [longest (volatile! [])
          current (volatile! [])
          tail (volatile! nil)]
      (fn
        ([] (rf))
        ([result] (transduce identity rf result))
        ([result x] (do (if (or (nil? @tail) (< x @tail))
                          (if (> (count (vswap! current conj (vreset! tail x)))
                                 (count @longest))
                            (vreset! longest @current))
                          (vreset! current [(vreset! tail x)]))
                        @longest))))))
Before you dismiss this approach, realize that it just gives you the right answer and you can do some different things with it:
(def coll [2 1 10 9 8 40])
(transduce longest-decreasing conj coll) ;; => [10 9 8]
(transduce longest-decreasing + coll) ;; => 27
(reductions (longest-decreasing conj) [] coll) ;; => ([] [2] [2 1] [2 1] [2 1] [10 9 8] [10 9 8])
Again, I know that this may appear longer, but the potential to compose this with other transducers might be worth the effort (I'm not sure whether my arity-1 implementation breaks that).
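On the composition worry: one conventional shape for a stateful transducer (a sketch under my own assumptions, not the code above) leaves result untouched in the step arity and emits the remembered run in the completion arity, so that downstream transducers see its elements. The name longest-run and the op parameter are hypothetical.

```clojure
;; Hypothetical helper: remembers the longest run in which (op previous x)
;; holds and emits that run's elements when the transduction completes.
(defn longest-run [op]
  (fn [rf]
    (let [longest (volatile! [])
          current (volatile! [])]
      (fn
        ([] (rf))
        ;; completion: feed the remembered run downstream, then complete rf
        ([result]
         (rf (reduce rf result @longest)))
        ;; step: update the bookkeeping only; nothing is emitted yet
        ([result x]
         (let [run (if (and (seq @current) (op (peek @current) x))
                     (vswap! current conj x)
                     (vreset! current [x]))]
           (when (> (count run) (count @longest))
             (vreset! longest run))
           result))))))

(def coll [2 1 10 9 8 40])
(transduce (longest-run >) conj coll)            ;; => [10 9 8]
(transduce (longest-run >) + coll)               ;; => 27
(into [] (comp (longest-run >) (map inc)) coll)  ;; => [11 10 9]
```

The trade-off is that nothing is emitted until the input is exhausted, so this variant is of no use with reductions or with an infinite input.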
I believe that iterate can be a more readable substitute for reduce. For example here is the iteratee function that iterate will use to solve this problem:
(defn step-state-hof [op]
(fn [{:keys [unprocessed current answer]}]
(let [[x y & more] unprocessed]
(let [next-current (if (op x y)
(conj current y)
[y])
next-answer (if (> (count next-current) (count answer))
next-current
answer)]
{:unprocessed (cons y more)
:current next-current
:answer next-answer}))))
current is built up until it becomes longer than answer, in which case a new answer is created. Whenever the condition op is not satisfied we start again building up a new current.
iterate itself returns an infinite sequence, so needs to be stopped when the iteratee has been called the right number of times:
(def in [3 2 1 0 -1 2 7 6 7 6 5 4 3 2])
(->> (iterate (step-state-hof >) {:unprocessed in
                                  :current (vec (take 1 in))})
     (drop (- (count in) 1))
     first
     :answer)
;;=> [7 6 5 4 3 2]
Often you would use a drop-while or take-while to short-circuit just when the answer has been obtained. We could do that here; however, no short-circuiting is required, as we know in advance that the inner function of step-state-hof needs to be called (- (count in) 1) times. That is one less than the count because it is processing two elements at a time. Note that first is forcing the final call.
I wanted this order for the form:
reduce
val, col
f
I was able to figure out that this technically satisfies my requirements:
> (apply reduce
(->>
[0 [1 2 3 4]]
(cons
(fn [acc x]
(+ acc x)))))
10
But it's not the easiest thing to read.
This looks much simpler:
> (defn reduce< [val col f]
(reduce f val col))
nil
> (reduce< 0 [1 2 3 4]
(fn [acc x]
(+ acc x)))
10
(< is shorthand for "parameters are rotated left"). Using reduce<, I can see what's being passed to f by the time my eyes get to the f argument, so I can just focus on reading the f implementation (which may get pretty long). Additionally, if f does get long, I no longer have to visually check the indentation of the val and col arguments to determine that they belong to the reduce symbol way farther up. I personally think this is more readable than binding f to a symbol before calling reduce, especially since fn can still accept a name for clarity.
This is a general solution, but the other answers here provide many good alternative ways to solve the specific problem I gave as an example.

clojure laziness: prevent unneeded mapcat results from being realized

Consider a query function q that returns, with a delay, some (let's say ten) results.
Delay function:
(defn dlay [x]
(do
(Thread/sleep 1500)
x))
Query function:
(defn q [pg]
(lazy-seq
(let [a [0 1 2 3 4 5 6 7 8 9 ]]
(println "q")
(map #(+ (* pg 10) %) (dlay a)))))
Wanted behaviour:
I would like to produce an infinite lazy sequence such that, when I take a value, only the needed computations are evaluated.
Wrong but explicative example:
(drop 29 (take 30 (mapcat q (range))))
If I'm not wrong, it needs to evaluate every sequence because it really doesn't know how long the sequences will be.
How would you obtain the correct behaviour?
My attempt to correct this behaviour:
(defn getq [coll n]
(nth
(nth coll (quot n 10))
(mod n 10)))
(defn results-seq []
(let [a (map q (range))]
(map (partial getq a)
(iterate inc 0)))) ; using iterate instead of range, this way i don't have a chunked sequence
But
(drop 43 (take 44 (results-seq)))
still realizes the "unneeded" q sequences.
Now, I verified that a is lazy, and iterate and map should produce lazy sequences, so the problem must be with getq. But I can't really understand how it breaks my laziness. Does nth perhaps realize things while walking through a sequence? If so, is there a viable alternative in this case, or does my solution suffer from bad design?
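One commonly suggested workaround is a hand-written lazy concatenation that only asks for the next inner sequence once the current one is exhausted. As far as I know, mapcat is defined in terms of apply concat, and apply looks at the first few arguments up front, which is where the extra calls come from. The sketch below uses hypothetical names (lazy-mapcat, page, calls); page is a stand-in for q without the sleep that counts how often it is called.

```clojure
;; Lazily concatenates (f x) for each x, realizing one inner sequence at a time.
(defn lazy-mapcat [f coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (concat (f (first s))
              (lazy-mapcat f (rest s))))))

;; Stand-in for q: ten results per page, plus a call counter.
(def calls (atom 0))
(defn page [pg]
  (swap! calls inc)
  (map #(+ (* pg 10) %) (range 10)))

;; The 30th element lives on page 2, so pages 0, 1 and 2 are requested.
(def thirtieth (first (drop 29 (lazy-mapcat page (range)))))
;; thirtieth => 29
```

Inside a page, chunking can still realize more than one element at a time, but no page beyond the one containing the requested element should be fetched.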

Apply a list of functions to a value

I'm looking for something that is probably very well defined in Clojure (in the Lisp world at large in fact) but I don't have enough experience or culture to get on the right track and Google hasn't been very helpful so far.
Let's say I have three simple forms:
(defn add-one [v] (+ v 1))
(defn add-two [v] (+ v 2))
(defn add-three [v] (+ v 3))
Out of convenience, they are stored in a vector. In the real world, that vector would vary depending on the context:
(def operations
[add-one
add-two
add-three])
And I also have an initial value:
(def value 42)
Now, I would like to apply all the functions in that vector to that value and get the result of the combined operations:
(loop [ops operations
val value]
(if (empty? ops)
val
(recur (rest ops)
((first ops) val))))
While this does work, I'm surprised there isn't a higher level form just for that. I've looked all over the place but couldn't find anything.
The functional phrase you are searching for is (apply comp operations):
((apply comp operations) 42)
;48
Your loop does work if you feed it 42 for value:
(loop [ops operations
val 42]
(if (empty? ops)
val
(recur (rest ops)
((first ops) val))))
;48
This applies the operations in the opposite order from comp.
... As does using reduce:
(reduce (fn [v f] (f v)) 42 operations)
;48
If you look at the source code for comp, you'll find that the general case essentially executes a loop similar to yours upon a reversed list of the supplied functions.
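With add-one, add-two and add-three the order is invisible, because addition commutes. A small sketch with operations that do not commute (the vector ops is made up for illustration) shows the difference:

```clojure
(def ops [inc #(* 2 %)])

;; comp applies right to left: (inc (* 2 5))
((apply comp ops) 5)             ;; => 11

;; reduce (like the loop) applies left to right: (* 2 (inc 5))
(reduce (fn [v f] (f v)) 5 ops)  ;; => 12

;; reversing the vector makes comp agree with the loop and the reduce
((apply comp (reverse ops)) 5)   ;; => 12
```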
'In Lisp world at large' you can use reduce:
user> (reduce (fn [x y] (y x)) 5 [inc inc inc inc])
;; => 9
This may not look so sexy, but it works everywhere with minor variations (this is Common Lisp, for example):
CL-USER> (reduce (lambda (x y) (funcall y x))
'(1+ 1+ 1+ 1+)
:initial-value 5)
9

Iteratively apply function to its result without generating a seq

This is one of those "Is there a built-in/better/idiomatic/clever way to do this?" questions.
I want a function--call it fn-pow--that will apply a function f to the result of applying f to an argument, then apply it to the result of applying it to its result, etc., n times. For example,
(fn-pow inc 0 3)
would be equivalent to
(inc (inc (inc 0)))
It's easy to do this with iterate:
(defn fn-pow-0
[f x n]
(nth (iterate f x) n))
but that creates and throws away an unnecessary lazy sequence.
It's not hard to write the function from scratch. Here is one version:
(defn fn-pow-1
[f x n]
(if (> n 0)
(recur f (f x) (dec n))
x))
I found this to be almost twice as fast as fn-pow-0, using Criterium on (fn-pow inc 0 10000000).
I don't consider the definition of fn-pow-1 to be unidiomatic, but fn-pow seems like something that might be a standard built-in function, or there may be some simple way to define it with a couple of higher-order functions in a clever arrangement. I haven't succeeded in discovering either. Am I missing something?
The built-in you are looking for is probably dotimes. I'll tell you why in a round-about fashion.
Time
What you are testing in your benchmark is mainly the overhead of a level of indirection. That (nth (iterate ...) n) is only twice as slow as what compiles to a loop when the body is a very fast function is rather surprising/encouraging. If f is a more costly function, the importance of that overhead diminishes. (Of course if your f is low-level and fast, then you should use a low-level loop construct.)
Say your function takes ~1 ms instead:
(defn my-inc [x] (Thread/sleep 1) (inc x))
Then both of these will take about 1 second -- the difference is around 2% rather than 100%.
(bench (fn-pow-0 my-inc 0 1000))
(bench (fn-pow-1 my-inc 0 1000))
Space
The other concern is that iterate is creating an unnecessary sequence. But, if you are not holding onto the head, just doing an nth, then you aren't really creating a sequence per se but sequentially creating, using, and discarding LazySeq objects. In other words, you are using a constant amount of space, though generating garbage in proportion to n. However, unless your f is primitive or mutating its argument, then it is already producing garbage in proportion to n in producing its own intermediate results.
Reducing Garbage
An interesting compromise between fn-pow-0 and fn-pow-1 would be
(defn fn-pow-2 [f x n] (reduce (fn [x _] (f x)) x (range n)))
Since range objects know how to intelligently reduce themselves, this does not create additional garbage in proportion to n. It boils down to a loop as well. This is the reduce method of range:
public Object reduce(IFn f, Object start) {
Object ret = f.invoke(start,n);
for(int x = n+1;x < end;x++)
ret = f.invoke(ret, x);
return ret;
}
This was actually the fastest of the three (before adding primitive type-hints on n in the recur version, that is) with the slowed down my-inc.
Mutation
If you are iterating a function potentially expensive in time or space, such as matrix operations, then you may very well be wanting to use (in a contained manner) an f that mutates its argument to eliminate the garbage overhead. Since mutation is a side effect, and you want that side effect n times, dotimes is the natural choice.
For the sake of example, I'll use an atom as a stand-in, but imagine bashing on a mutable matrix instead.
(def my-x (atom 0))
(defn my-inc! [x] (Thread/sleep 1) (swap! x inc))
(defn fn-pow-3! [f! x n] (dotimes [i n] (f! x)))
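A usage sketch of the dotimes approach, with the sleep left out and hypothetical names (counter, bump!). Note that dotimes returns nil, so the result has to be read back from the mutated argument:

```clojure
(def counter (atom 0))
(defn bump! [a] (swap! a inc))

;; Same shape as fn-pow-3! above: call f! on x, n times, for its side effect.
(defn fn-pow-3! [f! x n]
  (dotimes [_ n] (f! x)))

(def result (fn-pow-3! bump! counter 3))
;; result   => nil
;; @counter => 3
```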
That sounds just like composing functions n times.
(defn fn-pow [f p t]
((apply comp (repeat t f)) p))
Hmmm. I note that Ankur's version is around 10x slower than your original - possibly not the intent, no matter how idiomatic? :-)
Type hinting fn-pow-1 simply for the counter yields substantially faster results for me - around 3x faster.
(defn fn-pow-3 [f x ^long n]
(if (> n 0)
(recur f (f x) (dec n))
x))
This is around twice as slow as a version which uses inc directly, losing the variability (not hinting x to keep to the spirit of the test)...
(defn inc-pow [x ^long n]
(if (> n 0)
(recur (inc x) (dec n))
x))
I think that for any nontrivial f, fn-pow-3 is probably the best solution.
I haven't found a particularly "idiomatic" way of doing this, as it does not feel like a common use case outside of micro-benchmarks (although I would love to be contradicted).
I would be intrigued to hear of a real-world example, if you have one.
To us benighted imperative programmers, a more general pattern is known as a while statement. We can capture it in a macro:
(defmacro while [bv   ; binding vector
                 tf   ; test form
                 recf ; recur form
                 retf ; return form
                 ]
  `(loop ~bv (if ~tf (recur ~@recf) ~retf)))
... in your case
(while [x 0, n 3] (pos? n)
[(inc x) (dec n)]
x)
; 3
I was hoping to type-hint the n, but it's illegal. Maybe it's inferred.
Forgive me (re)using while.
This isn't quite right: it doesn't allow for computation prior to the recur-form.
We can adapt the macro to do things prior to the recur:
(defmacro while [bv ; binding vector
tf ; test form
bodyf ; body form
retf ; return form
]
(let [bodyf (vec bodyf)
recf (peek bodyf)
bodyf (seq (conj (pop bodyf) (cons `recur recf)))]
`(loop ~bv (if ~tf ~bodyf ~retf))))
For example
(while [x 0, n 3] (pos? n)
(let [x- (inc x) n- (dec n)] [x- n-])
x)
; 3
I find this quite expressive. YMMV.

Cleaning up Clojure function

Coming from imperative programming languages, I am trying to wrap my head around Clojure in hopes of using it for its multi-threading capability.
One of the problems from 4Clojure is to write a function that generates a list of Fibonacci numbers of length N, for N > 1. I wrote a function, but given my limited background, I would like some input on whether or not this is the best Clojure way of doing things. The code is as follows:
(fn fib [x] (cond
(= x 2) '(1 1)
:else (reverse (conj (reverse (fib (dec x))) (+ (last (fib (dec x))) (-> (fib (dec x)) reverse rest first))))
))
The most idiomatic "functional" way would probably be to create an infinite lazy sequence of Fibonacci numbers and then extract the first n values, i.e.:
(take n some-infinite-fibonacci-sequence)
The following link has some very interesting ways of generating Fibonacci sequences along those lines:
http://en.wikibooks.org/wiki/Clojure_Programming/Examples/Lazy_Fibonacci
Finally here is another fun implementation to consider:
(defn fib [n]
(let [next-fib-pair (fn [[a b]] [b (+ a b)])
fib-pairs (iterate next-fib-pair [1 1])
all-fibs (map first fib-pairs)]
(take n all-fibs)))
(fib 6)
=> (1 1 2 3 5 8)
It's not as concise as it could be, but demonstrates quite nicely the use of Clojure's destructuring, lazy sequences and higher order functions to solve the problem.
Here is a version of Fibonacci that I like very much (I took the implementation from the clojure wikibook: http://en.wikibooks.org/wiki/Clojure_Programming)
(def fib-seq (lazy-cat [0 1] (map + (rest fib-seq) fib-seq)))
It works like this: imagine you already have the infinite sequence of Fibonacci numbers. If you take the tail of the sequence and add it element-wise to the original sequence, you get the (tail of the tail of the) Fibonacci sequence:
0 1 1 2 3 5 8 ...
1 1 2 3 5 8 ...
-----------------
1 2 3 5 8 13 ...
thus you can use this to calculate the sequence. You need two initial elements [0 1] (or [1 1] depending on where you start the sequence) and then you just map over the two sequences adding the elements. Note that you need lazy sequences here.
I think this is the most elegant and (at least for me) mind stretching implementation.
Edit: The fib function is
(defn fib [n] (nth fib-seq n))
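Since the question asks for a list of the first n numbers rather than the nth number, a short usage sketch of the same fib-seq with take may be closer to what is wanted:

```clojure
(def fib-seq (lazy-cat [0 1] (map + (rest fib-seq) fib-seq)))

(take 10 fib-seq)        ;; => (0 1 1 2 3 5 8 13 21 34)

;; Skipping the leading 0 gives the 1 1 2 3 ... form used in the question.
(take 6 (rest fib-seq))  ;; => (1 1 2 3 5 8)
```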
Here's one way of doing it that gives you a bit of exposure to lazy sequences, although it's certainly not really an optimal way of computing the Fibonacci sequence.
Given the definition of the Fibonacci sequence, we can see that it's built up by repeatedly applying the same rule to the base case of '(1 1). The Clojure function iterate sounds like it would be good for this:
user> (doc iterate)
-------------------------
clojure.core/iterate
([f x])
Returns a lazy sequence of x, (f x), (f (f x)) etc. f must be free of side-effects
So for our function we'd want something that takes the values we've computed so far, sums the two most recent, and returns a list of the new value and all the old values.
(fn [[x y & _ :as all]] (cons (+ x y) all))
The argument list here just means that x and y will be bound to the first two values from the list passed as the function's argument, a list containing all arguments after the first two will be bound to _, and the original list passed as an argument to the function can be referred to via all.
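To make the destructuring concrete, here is a small sketch (next-fibs is just a name I gave the anonymous function, and more is used in place of _ so that the rest can be shown):

```clojure
;; What each name is bound to for a sample list.
(let [[x y & more :as all] '(3 2 1 1)]
  [x y more all])
;; => [3 2 (1 1) (3 2 1 1)]

;; The step function from above, applied by hand twice.
(def next-fibs (fn [[x y & _ :as all]] (cons (+ x y) all)))
(next-fibs '(1 1))    ;; => (2 1 1)
(next-fibs '(2 1 1))  ;; => (3 2 1 1)
```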
Now, iterate will return an infinite sequence of intermediate values, so for our case we'll want to wrap it in something that'll just return the value we're interested in; lazy evaluation will stop the entire infinite sequence being evaluated.
(defn fib [n]
(nth (iterate (fn [[x y & _ :as all]] (cons (+ x y) all)) '(1 1)) (- n 2)))
Note also that this returns the result in the opposite order to your implementation; it's a simple matter to fix this with reverse of course.
Edit: or indeed, as amalloy says, by using vectors:
(defn fib [n]
(nth (iterate (fn [all]
(conj all (->> all (take-last 2) (apply +)))) [1 1])
(- n 2)))
See Christophe Grand's Fibonacci solution in Programming Clojure by Stu Halloway. It is the most elegant solution I have seen.
(defn fibo [] (map first (iterate (fn [[a b]] [b (+ a b)]) [0 1])))
(take 10 (fibo))
Also see
How can I generate the Fibonacci sequence using Clojure?