Reduce works fine but it is more like fold-left.
Is there any other form of reduce that lets me fold to right ?
The reason that the clojure standard library only has fold-left (reduce) is actually very subtle and it is because clojure isn't lazy enough to get the main benefit of fold-right.
The main benefit of fold-right in languages like haskell is that it can actually short-circuit.
If we do foldr (&&) True [False, True, True, True, True] the way that it actually gets evaluated is very enlightening. The only thing it needs to evaluate is the function and with 1 argument (the first False). Once it gets there it knows the answer and does not need to evaluate ANY of the Trues.
If you look very closely at the picture:
you will see that although conceptually fold-right starts and the end of the list and moves towards the front, in actuality, it starts evaluating at the FRONT of the list.
This is an example of where lazy/curried functions and tail recursion can give benefits that clojure can't.
Bonus Section (for those interested)
Based on a recommendation from vemv, I would like to mention that Clojure added a new function to the core namespace to get around this limitation that Clojure can't have the lazy right fold. There is a function called reduced in the core namespace which allows you to make Clojure's reduce lazier. It can be used to short-circuit reduce by telling it not to look at the rest of the list. For instance, if you wanted to multiply lists of numbers but had reason to suspect that the list would occasionally contain zero and wanted to handle that as a special case by not looking at the remainder of the list once you encountered a zero, you could write the following multiply-all function (note the use of reduced to indicate that the final answer is 0 regardless of what the rest of the list is).
(defn multiply-all [coll]
(reduce
(fn [accumulator next-value]
(if (zero? next-value)
(reduced 0)
(* accumulator next-value)))
1
coll))
And then to prove that it short-circuits you could multiply an infinite list of numbers which happens to contain a zero and see that it does indeed terminate with the answer of 0
(multiply-all
(cycle [1 2 3 4 0]))
Let's look at a possible definition of each:
(defn foldl [f val coll]
(if (empty? coll) val
(foldl f (f val (first coll)) (rest coll))))
(defn foldr [f val coll]
(if (empty? coll) val
(f (foldr f val (rest coll)) (first coll))))
Notice that only foldl is in tail position, and the recursive call can be replaced by recur. So with recur, foldl will not take up stack space, while foldr will. That's why reduce is like foldl. Now let's try them out:
(foldl + 0 [1 2 3]) ;6
(foldl - 0 [1 2 3]) ;-6
(foldl conj [] [1 2 3]) ;[1 2 3]
(foldl conj '() [1 2 3]) ;(3 2 1)
(foldr + 0 [1 2 3]) ;6
(foldr - 0 [1 2 3]) ;-6
(foldr conj [] [1 2 3]) ;[3 2 1]
(foldr conj '() [1 2 3]) ;(1 2 3)
Is there some reason you want to fold right? I think the most common usage of foldr is to put together a list from front to back. In Clojure we don't need that because we can just use a vector instead. Another choice to avoid stack overflow is to use a lazy sequence:
(defn make-list [coll]
(lazy-seq
(cons (first coll) (rest coll))))
So, if you want to fold right, some efficient alternatives are
Use a vector instead.
Use a lazy sequence.
Use reduced to short-circuit reduce.
If you really want to dive down a rabbit hole, use a transducer.
Related
I'm currently learning Clojure, and I'm trying to learn how to do things the best way. Today I'm looking at the basic concept of doing things on a sequence, I know the basics of map, filter and reduce. Now I want to try to do a thing to pairs of elements in a sequence, and I found two ways of doing it. The function I apply is println. The output is simply 12 34 56 7
(def xs [1 2 3 4 5 6 7])
(defn work_on_pairs [xs]
(loop [data xs]
(if (empty? data)
data
(do
(println (str (first data) (second data)))
(recur (drop 2 data))))))
(work_on_pairs xs)
I mean, I could do like this
(map println (zipmap (take-nth 2 xs) (take-nth 2 (drop 1 xs))))
;; prints [1 2] [3 4] [5 6], and we loose the last element because zip.
But it is not really nice.. My background is in Python, where I could just say zip(xs[::2], xs[1::2]) But I guess this is not the Clojure way to do it.
So I'm looking for suggestions on how to do this same thing, in the best Clojure way.
I realize I'm so new to Clojure I don't even know what this kind of operation is called.
Thanks for any input
This can be done with partition-all:
(def xs [1 2 3 4 5 6 7])
(->> xs
(partition-all 2) ; Gives ((1 2) (3 4) (5 6) (7))
(map (partial apply str)) ; or use (map #(apply str %))
(apply println))
12 34 56 7
The map line is just to join the pairs so the "()" don't end up in the output.
If you want each pair printed on its own line, change (apply println) to (run! println). Your expected output seems to disagree with your code, so that's unclear.
If you want to dip into transducers, you can do something similar to the threading (->>) form of the accepted answer, but in a single pass over the data.
Assuming
(def xs [1 2 3 4 5 6 7])
has been evaluated already,
(transduce
(comp
(partition-all 2)
(map #(apply str %)))
conj
[]
xs)
should give you the same output if you wrap it in
(apply println ...)
We supply conj (reducing fn) and [] (initial data structure) to specify how the reduce process inside transduce should build up the result.
I wouldn't use a transducer for a list that small, or a process that simple, but it's good to know what's possible!
Why does the second evaluation with butlast loop endlessly in Clojure?
user=> (->> (range) butlast lazy-seq (take 0))
()
user=> (->> (range) butlast lazy-seq first) ; ...
The equivalent in Haskell can be reasonably lazy-eval'ed.
ghci> take 0 . init $ [0..]
[]
ghci> head . init $ [0..]
0
Edit
take 0 is no-op as being pointed out below.
Why does the second evaluation loop endlessly in Clojure?
butlast is non-lazy as it enumerates the entire input sequence and builds up a new collection until it exhausts the input.
In both versions butlast is eliminating upstream laziness, so I think another relevant question is "why does the first version ever return when butlast is involved?": take 0 is a no-op — it doesn't consume the input sequence when n isn't a positive number, so it doesn't matter what you pass it.
Also, I think the equivalent Clojure code to head . tail $ [0..] would be:
(->> (range) rest first) ;; => 1
butlast is kinda the opposite of what you'd want.
If you look at the implementation of butlast, you can see that it's eager (using loop instead of lazy-seq). An alternate lazy implementation would do what you want:
(defn butlast'
"Like clojure.core/butlast but lazy."
[xs]
(when (seq xs)
((fn f [[x & xs]]
(if (seq xs)
(lazy-seq (cons x (f xs)))
()))
xs)))
From what I gather a transformer is the use of functions that change , alter , a collection of elements . Like if I did added 1 to each element in a collection of
[1 2 3 4 5]
and it became
[2 3 4 5 6]
but writing the code for this looks like
(map inc)
but I keep getting this sort of code confused with a reducer. Because it produces a new accumulated result .
The question I ask is , what is the difference between a transformer and a reducer ?
You are likely just confusing various nomenclature (as the comments above suggest), but I'll answer what I think is your question by taking some liberties in interpreting what you mean to be reducer and transformer.
Reducing:
A reducing function (what you probably think is a reducer), is a function that takes an accumulated value and a current value, and returns a new accumulated value.
(accumulated, current) => accumulated
These functions are passed to reduce, and they successively step through a sequence performing whatever the body of the reducing function says with it's two arguments (accumulated and current), and then returning a new accumulated value which will be used as the accumulated value (first argument) to the next call of the reducing function.
For example, plus can be viewed as a reducing function.
(reduce + [0 1 2]) => 3
First, the reducing function (plus in this example) is called with 0 and 1, which returns 1. On the next call, 1 is now the accumulated value, and 2 is the current value, so plus is called with 1 and 2, returning 3, which completes the reduction as there are no further elements in the collection to process.
It may help to look at a simplified version of a reduce implementation:
(defn reduce1
([f coll] ;; f is a reducing function
(let [[x y & xs] coll]
;; called with the accumulated value so far "x"
;; and cur value in input sequence "y"
(if y (reduce1 f (cons (f x y) xs))
x)))
([f start coll]
(reduce1 f (cons start coll))))
You can see that the function "f" , or the "reducing function" is called on each iteration with two arguments, the accumulated value so far, and the next value in the input sequence. The return value of this function is used as the first argument in the next call, etc. and thus has the type:
(x, y) => x
Transforming:
A transformation, the way I think you mean it, suggests the shape of the input does not change, but is simply modified according to an arbitrary function. This would be functions you pass to map, as they are applied to each element and build up a new collection of the same shape, but with that function applied to each element.
(map inc [0 1 2]) => '(1 2 3)
Notice the shape is the same, it's still a 3 element sequence, whereas in the reduction above, you input a 3 element sequence and get back an integer. Reductions can change the shape of the final result, map does not.
Note that I say the "shape" doesn't change, but the type of each element may change depending on what your "transforming" function does:
(map #(list (inc %)) [0 1 2]) => '((1) (2) (3))
It's still a 3 element sequence, but now each element is a list, not an integer.
Addendum:
There are two related concepts in Clojure, Reducers and Transducers, which I just wanted to mention since you asked about reducers (which have as specific meaning in Clojure) and transformers (which are the names Clojurists typically assign to a transducing function via the shorthand "xf"). It would turn this already long answer into a short-story if I tried to explain the details of both here, and it's been done better than I can do by others:
Transducers:
http://elbenshira.com/blog/understanding-transducers/
https://www.youtube.com/watch?v=6mTbuzafcII
Reducers and Transducers:
https://eli.thegreenplace.net/2017/reducers-transducers-and-coreasync-in-clojure/
It turns out that many transformations of collections can be expressed in terms of reduce. For instance map could be implemented as
(defn map [f coll] (reduce (fn [x y] (conj x (f y))) [] [0 1 2 3 4]))
and then you would call
(map inc [1 2 3 4 5])
to obtain
[2 3 4 5 6]
In our homemade implementation of map, the function that we pass to reduce is
(fn [x y] (conj x (f y))))
where f is the function that we would like to apply to every element. So we can write a function that produces such a function for us, passing the function that we would like to map.
(defn mapping-with-conj [f] (fn [x y] (conj x (f y))))
But we still see the presence of conj in the above function assuming we want to add elements to a collection. We can get even more flexibility by extra indirection:
(defn mapping [f] (fn [step] (fn [x y] (step x (f y)))))
Then we can use it like this:
(def increase-by-1 (mapping inc))
(reduce (increase-by-1 conj) [] [1 2 3])
The (map inc) you are referring does what our call to (mapping inc) does. Why would you want to do things this way? The answer is that it gives us a lot of flexibility to build things. For instance, instead of building up a collection, we can do
(reduce ((map inc) +) 0 [1 2 3 4 5])
Which will give us the sum of the mapped collection [2 3 4 5 6]. Or we can add extra processing steps just by simple function composition.
(reduce ((comp (filter odd?) (map inc)) conj) [] [1 2 3 4 5])
which will first remove even elements from the collection before we map. The transduce function in Clojure does essentially what the above line does, but takes care of another few extra details, too. So you would actually write
(transduce (comp (filter odd?) (map inc)) conj [] [1 2 3 4 5])
To sum up, the map function in Clojure has two arities. Calling it like (map inc [1 2 3 4 5]) will map every element of a collection so that you obtain [2 3 4 5 6]. Calling it just like (map inc) gives us a function that behaves pretty much like our mapping function in the above explanation.
Updating a vector works fine:
(update [{:idx :a} {:idx :b}] 1 (fn [_] {:idx "Hi"}))
;; => [{:idx :a} {:idx "Hi"}]
However trying to do the same thing with a list does not work:
(update '({:idx :a} {:idx :b}) 1 (fn [_] {:idx "Hi"}))
;; => ClassCastException clojure.lang.PersistentList cannot be cast to clojure.lang.Associative clojure.lang.RT.assoc (RT.java:807)
Exactly the same problem exists for assoc.
I would like to do update and overwrite operations on lazy types rather than vectors. What is the underlying issue here, and is there a way I can get around it?
The underlying issue is that the update function works on associative structures, i.e. vectors and maps. Lists can't take a key as a function to look up a value.
user=> (associative? [])
true
user=> (associative? {})
true
user=> (associative? `())
false
update uses get behind the scenes to do its random access work.
I would like to do update and overwrite operations on lazy types
rather than vectors
It's not clear what want to achieve here. You're correct that vectors aren't lazy, but if you wish to do random access operations on a collection then vectors are ideal for this scenario and lists aren't.
and is there a way I can get around it?
Yes, but you still wouldn't be able to use the update function, and it doesn't look like there would be any benefit in doing so, in your case.
With a list you'd have to walk the list in order to access an index somewhere in the list - so in many cases you'd have to realise a great deal of the sequence even if it was lazy.
You can define your own function, using take and drop:
(defn lupdate [list n function]
(let [[head & tail] (drop n list)]
(concat (take n list)
(cons (function head) tail))))
user=> (lupdate '(a b c d e f g h) 4 str)
(a b c d "e" f g h)
With lazy sequences, that means that you will compute the n first values (but not the remaining ones, which after all is an important part of why we use lazy sequences). You have also to take into account space and time complexity (concat, etc.). But if you truly need to operate on lazy sequences, that's the way to go.
Looking behind your question to the problem you are trying to solve:
You can use Clojure's sequence functions to construct a simple solution:
(defn elf [n]
(loop [population (range 1 (inc n))]
(if (<= (count population) 1)
(first population)
(let [survivors (->> population
(take-nth 2)
((if (-> population count odd?) rest identity)))]
(recur survivors)))))
For example,
(map (juxt identity elf) (range 1 8))
;([1 1] [2 1] [3 3] [4 1] [5 3] [6 5] [7 7])
This has complexity O(n). You can speed up count by passing the population count as a redundant argument in the loop, or by dumping the population and survivors into vectors. The sequence functions - take-nth and rest - are quite capable of doing the weeding.
I hope I got it right!
I see some examples that show we can get a nice head/tail destructuring of a sequence in clojure as follows:
(if-let [[x & xs] (seq coll)]
However I assume this won't work as desired for lazy sequences because this puts the values into a vector, which aren't lazy. I tried changing the vector form to a list form, and it gave me binding errors, quoted or not.
Without having binding like this, it seems that if I've got a lazy sequence where each element was a computationally-intensive equation of the previous element, I'd have to do that computation twice to get the head and tail as separate statements, right?
(let [head (first my-lazy-seq) ;; has to calculate the value of head.
tail (rest my-lazy-seq)] ;; also has to calculate the value of head to prepare the rest of the sequence.
Is there any way around this, or am I making an incorrect assumption somewhere?
user=> (let [[x & xs] (range)] [x (take 10 xs)])
[0 (1 2 3 4 5 6 7 8 9 10)]
xs is still a lazy seq, so you can use the destructuring with no problems. This will force the first element of xs, though. (Destructuring uses vector notation, but it doesn't necessarily use vectors under the covers.)
With respect to your second question: lazy seqs cache their results, so your second option would also work without extra recalculation. The head will only be calculated once.
The binding vector [x & xs] isn't actually constructing a vector at runtime. It's just the notation used for destructuring into head & tail.
So it works fine on infinite sequences:
(if-let [[x & xs] (range)]
(apply str x (take 9 xs)))
=> "0123456789"
The destructuring form is actually producing a lazy sequence in this case, which you can observe as follows:
(if-let [[x & xs :as my-seq] (range)]
(class my-seq))
=> clojure.lang.LazySeq