Difference between arrow and double arrow macros in Clojure - clojure

What is the difference between the -> and ->> macros in Clojure?

The docs A. Webb linked to explain the "what", but don't do a good job of the "why".
As a rule, when a function works on a singular subject, that subject is the first argument (e.g., conj, assoc). When a function works on a sequence subject, that subject is the last argument (e.g., map, filter).
So, -> and ->> are documented as threading the first and last arguments respectively, but it is also useful to think of them as applying to singular or sequential arguments respectively.
For example, we can consider a vector as a singular object:
(-> [1 2 3]
    (conj 4)     ; (conj [1 2 3] 4)
    (conj 5)     ; (conj [1 2 3 4] 5)
    (assoc 0 0)) ; (assoc [1 2 3 4 5] 0 0)
=> [0 2 3 4 5]
Or we can consider it as a sequence:
(->> [1 2 3]
     (map inc)        ; (map inc [1 2 3])
     (map inc)        ; (map inc (2 3 4))
     (concat [0 2]))  ; (concat [0 2] (3 4 5))
=> (0 2 3 4 5)
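The difference in argument position is easiest to see with a non-commutative function. A minimal sketch (the values and the placeholder symbols x, f and a are illustrative, not taken from the answer above):

```clojure
;; -> inserts the threaded value as the FIRST argument
(-> 10 (- 3))    ;=> 7, expands to (- 10 3)

;; ->> inserts the threaded value as the LAST argument
(->> 10 (- 3))   ;=> -7, expands to (- 3 10)

;; macroexpand-1 shows the rewritten form without evaluating it
(macroexpand-1 '(-> x (f a)))    ;=> (f x a)
(macroexpand-1 '(->> x (f a)))   ;=> (f a x)
```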

Related

Why does clojure.core/rest output a list when input is a vector?

This creates an unexpected effect:
(conj [1 2 3] 4)
; => [1 2 3 4]
(conj (rest [1 2 3]) 4)
; => (4 2 3)
I know that "it calls seq on its argument" from the docs which creates this effect. I don't understand why this is the desired effect. As a naïve user, I would expect (rest [1 2 3]) to behave like (subvec [1 2 3] 1). I know I could just use subvec for my use case. For the sake of learning, I would like to understand the rationale of rest, and use cases where outputting a list is desirable (even when the input is a vector).
The output of rest is NOT a list, but a seq, which is an even lower level abstraction. From the official documentation for rest:
Returns a possibly empty seq of the items after the first. Calls seq on its
argument.
The confusion arises from the fact that both are printed between parens, but if you look closely, they are different:
user=> (list? (rest [1 2 3]))
false
user=> (seq? (rest [1 2 3]))
true
How is a seq different from a list? Seqs are implemented with an interface that requires implementing first, rest and cons, but the details are up to the collection implementation. For instance, vectors use their own implementation:
user=> (class (rest [1 2 3]))
clojure.lang.PersistentVector$ChunkedSeq
user=> (class (rest '(1 2 3)))
clojure.lang.PersistentList
Lists are an implementation that at least extends the basic seq interface and builds on top of it. For instance, clojure.lang.PersistentList implements the Counted interface, which requires a constant-time version of count.
For a detailed description of the differences between Seqs and Lists, check these links:
Differences between a seq and a list
https://clojure.org/reference/sequences
You make a good case for rest on a vector returning a vector. The trouble is that rest is one of the fundamental operations on sequences, and a vector is not a sequence:
=> (seq? [1 2 3 4])
false
However, if rest can accept a seqable thing such as a vector, you could say that it ought to be able to return such.
What does it return?
=> (type (rest [1 2 3 4]))
clojure.lang.PersistentVector$ChunkedSeq
This gives every appearance of being a subvec wrapped in a seq call.
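That impression can be checked at the REPL. The sketch below (with illustrative values) compares rest with subvec: the two are equal as sequences, but only subvec keeps vector behaviour for conj.

```clojure
(def v [1 2 3 4])

;; Same elements in the same order, so sequential equality holds
(= (rest v) (subvec v 1))   ;=> true

;; But the types differ, and conj follows the type
(vector? (subvec v 1))      ;=> true
(vector? (rest v))          ;=> false
(conj (subvec v 1) 5)       ;=> [2 3 4 5]
(conj (rest v) 5)           ;=> (5 2 3 4)

;; vec turns the seq back into a vector when that is what you need
(conj (vec (rest v)) 5)     ;=> [2 3 4 5]
```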
I know that "it calls seq on its argument"
That is correct. Seqs are implemented with an interface (ISeq) that requires implementing first, rest and cons.
rest takes any seqable (for Clojure collections, anything that implements clojure.lang.Seqable). The reason for this design is efficiency and simplicity.
Different collections work differently, so the most efficient way of getting the first element and the rest differs between them.
That is why, when you convert a collection into a seq, the seq comes with the most efficient implementation of rest and the other operations for that collection.
I hope this was clear.
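For illustration, rest accepts anything that seq can handle, not only vectors, and always hands back a seq. A short sketch with values of my own choosing:

```clojure
(rest "abc")       ;=> (\b \c)
(rest '(1 2 3))    ;=> (2 3)
(rest (range 5))   ;=> (1 2 3 4)
(rest [1])         ;=> ()
(rest nil)         ;=> ()

;; every result is a seq, whatever the input type was
(every? seq? [(rest "abc") (rest [1 2 3]) (rest '(1 2 3)) (rest nil)])  ;=> true
```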
I agree that this behavior is unexpected and counterintuitive. As a workaround, I created the append and prepend functions in the Tupelo library.
From the docs, we see examples:
Clojure has the cons, conj, and concat functions, but it is not obvious how they should be used to add a new value to the beginning of a vector or list:
; Add to the end
> (concat [1 2] 3) ;=> IllegalArgumentException
> (cons [1 2] 3) ;=> IllegalArgumentException
> (conj [1 2] 3) ;=> [1 2 3]
> (conj [1 2] 3 4) ;=> [1 2 3 4]
> (conj '(1 2) 3) ;=> (3 1 2) ; oops
> (conj '(1 2) 3 4) ;=> (4 3 1 2) ; oops
; Add to the beginning
> (conj 1 [2 3] ) ;=> ClassCastException
> (concat 1 [2 3] ) ;=> IllegalArgumentException
> (cons 1 [2 3] ) ;=> (1 2 3)
> (cons 1 2 [3 4] ) ;=> ArityException
> (cons 1 '(2 3) ) ;=> (1 2 3)
> (cons 1 2 '(3 4) ) ;=> ArityException
Do you know what conj does when you pass it nil instead of a sequence? It silently replaces it with an empty list: (conj nil 5) ⇒ (5) This can cause you to accumulate items in reverse order if you aren’t aware of the default behavior:
(-> nil
    (conj 1)
    (conj 2)
    (conj 3))
;=> (3 2 1)
These failures are irritating and unproductive, and the error messages don’t make it obvious what went wrong. Instead, use the simple prepend and append functions to add new elements to the beginning or end of a sequence, respectively:
(append [1 2] 3 ) ;=> [1 2 3 ]
(append [1 2] 3 4) ;=> [1 2 3 4]
(prepend 3 [2 1]) ;=> [ 3 2 1]
(prepend 4 3 [2 1]) ;=> [4 3 2 1]
Both prepend and append always return a vector result.
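For readers who do not want to pull in a library, similar behaviour can be approximated with into. This is only a sketch: the names append-sketch and prepend-sketch are hypothetical, and Tupelo's actual implementation may differ (for example, in how it validates its arguments).

```clojure
(defn append-sketch
  "Returns a vector of coll's items followed by xs. Hypothetical helper."
  [coll & xs]
  (into (vec coll) xs))

(defn prepend-sketch
  "The last argument is the collection; the preceding arguments go in front.
  Hypothetical helper."
  [& args]
  (into (vec (butlast args)) (last args)))

(append-sketch [1 2] 3 4)    ;=> [1 2 3 4]
(append-sketch '(1 2) 3)     ;=> [1 2 3]
(prepend-sketch 4 3 [2 1])   ;=> [4 3 2 1]
(prepend-sketch 1 '(2 3))    ;=> [1 2 3]
```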

Split vector after each occurrence of an element

This should be easy but I'm finding it more difficult than expected.
Given [0 1 2 0 1 2 0 1], split the sequence after each occurrence of 2.
Result should be similar to [[0 1 2] [0 1 2] [0 1]].
The split functions only split at the first instance. My imagination is also limited on how to use the partition functions to achieve this.
The previous solutions are OK (although @magos's solution is flawed in some cases), but if this function is to be used as a utility (it is rather general, I guess), I would use the classic iterative approach:
(defn group-loop [delim coll]
  (loop [res [] curr [] coll (seq coll)]
    (if coll
      (let [group (conj curr (first coll))]
        (if (= delim (first coll))
          (recur (conj res group) [] (next coll))
          (recur res group (next coll))))
      (if (seq curr)
        (conj res curr)
        res))))
in repl:
user> (map (partial group-loop 2)
[[]
nil
[1 2 3 1 2 3]
[1 2 3 1 2 3 2]
[2 1 2 3 1 2 3]
[1 3 4 1 3 4]])
;;([] []
;; [[1 2] [3 1 2] [3]]
;; [[1 2] [3 1 2] [3 2]]
;; [[2] [1 2] [3 1 2] [3]]
;; [[1 3 4 1 3 4]])
Though it looks a bit too verbose, it still has some rather important advantages. First, it is kind of classic (which I find a pro rather than a con). Second, it is fast (according to my benchmark, about 3 times faster than the reduce variant and 6 to 10 times faster than the partition variant).
You can also make it more Clojure-like with some minor tweaks, returning a lazy collection as Clojure's sequence functions do:
(defn group-lazy [delim coll]
  (loop [curr [] coll coll]
    (if (seq coll)
      (let [curr (conj curr (first coll))]
        (if (= delim (first coll))
          (cons curr (lazy-seq (group-lazy delim (rest coll))))
          (recur curr (next coll))))
      (when (seq curr) [curr]))))
user> (map (partial group-lazy 2)
[[]
nil
[1 2 3 1 2 3]
[1 2 3 1 2 3 2]
[2 1 2 3 1 2 3]
[1 3 4 1 3 4]])
;;(nil nil
;; ([1 2] [3 1 2] [3])
;; ([1 2] [3 1 2] [3 2])
;; ([2] [1 2] [3 1 2] [3])
;; [[1 3 4 1 3 4]])
Here's one way, combining two partition variants. First use partition-by to divide at instances of 2, then take those partitions two at a time with partition-all and join each pair together using concat.
(->> [0 1 2 0 1 2 0 1]
     (partition-by (partial = 2))                ;;((0 1) (2) (0 1) (2) (0 1))
     (partition-all 2)                           ;;(((0 1) (2)) ((0 1) (2)) ((0 1)))
     (mapv (comp vec (partial reduce concat))))  ;;[[0 1 2] [0 1 2] [0 1]]
Although note that if the input starts on a 2 the returned partitions will also start with 2s, not end on them as here.
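To see that caveat concretely, here is the same pipeline run on an input that starts with the delimiter (the input is my own example, wrapped in a var named result for convenience):

```clojure
(def result
  (->> [2 0 1 2]
       (partition-by (partial = 2))                 ;; ((2) (0 1) (2))
       (partition-all 2)                            ;; (((2) (0 1)) ((2)))
       (mapv (comp vec (partial reduce concat)))))  ;; [[2 0 1] [2]]

result  ;=> [[2 0 1] [2]]
;; splitting after each 2 should have produced [[2] [0 1 2]] instead
```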
Here you go, works as requested for all inputs:
(reduce #(let [last-v (peek %1)]
           (if (= 2 (last last-v))
             (conj %1 [%2])
             (conj (pop %1) (conj last-v %2))))
        [[]]
        [2 2 0 1 2 3 4 2 2 0 1 2 2])
=> [[2] [2] [0 1 2] [3 4 2] [2] [0 1 2] [2]]
While Magos has an elegant solution, it is unfortunately not complete, as he mentions. So, the above should do the job using reduce.
We look at the most recently added element. If it was a 2, we create a new sub-vector ((conj %1 [%2])). Otherwise, we add it to the last sub-vector. Pretty simple really. Existing functions like the partitions and splits are great for reusing when possible, but sometimes the best solution is a custom function, and in this case it's actually pretty clean.
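The same reduce can be wrapped in a function that takes the delimiter as a parameter. This is a sketch under the hypothetical name split-after. It uses peek instead of last because peek is constant-time on vectors, and it assumes the delimiter is not nil. Like the reduce it is based on, it returns [[]] for empty input.

```clojure
(defn split-after
  "Splits coll into vectors, ending a group after each occurrence of delim.
  Hypothetical helper; assumes delim is not nil."
  [delim coll]
  (reduce (fn [acc x]
            (let [last-v (peek acc)]
              (if (= delim (peek last-v))
                (conj acc [x])                       ; previous group is closed
                (conj (pop acc) (conj last-v x)))))  ; extend the open group
          [[]]
          coll))

(split-after 2 [0 1 2 0 1 2 0 1])   ;=> [[0 1 2] [0 1 2] [0 1]]
(split-after 2 [2 2 0 1 2])         ;=> [[2] [2] [0 1 2]]
```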

clojure: partition a seq based on a seq of values

I would like to partition a seq, based on a seq of values
(partition-by-seq [3 5] [1 2 3 4 5 6])
((1 2 3)(4 5)(6))
The first input is a seq of split points.
The second input is the seq I would like to partition.
So the first partition ends at the value 3, giving (1 2 3), and the second partition is (4 5), where 5 is the next split point.
Another example:
(partition-by-seq [3] [2 3 4 5])
result: ((2 3)(4 5))
(partition-by-seq [2 5] [2 3 5 6])
result: ((2)(3 5)(6))
Given: the first seq (split points) is always a subset of the second input seq.
I came up with this solution which is lazy and quite (IMO) straightforward.
(defn part-seq [splitters coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (if-let [split-point (first splitters)]
        ; build seq until first splitter
        (let [run (cons (first s) (take-while #(<= % split-point) (next s)))]
          ; build the lazy seq of partitions recursively
          (cons run
                (part-seq (rest splitters) (drop (count run) s))))
        ; just return one partition if there is no splitter
        (list coll)))))
If the split points are all in the sequence:
(part-seq [3 5 8] [0 1 2 3 4 5 6 7 8 9])
;;=> ((0 1 2 3) (4 5) (6 7 8) (9))
If some split points are not in the sequence
(part-seq [3 5 8] [0 1 2 4 5 6 8 9])
;;=> ((0 1 2) (4 5) (6 8) (9))
Example with some infinite sequences for the splitters and the sequence to split.
(take 5 (part-seq (iterate (partial + 3) 5) (range)))
;;=> ((0 1 2 3 4 5) (6 7 8) (9 10 11) (12 13 14) (15 16 17))
The sequence to be partitioned is the splittee, and each element of the split points (the splitter) marks the last element of a partition.
From your example:
splittee: [1 2 3 4 5 6]
splitter: [3 5]
result: ((1 2 3)(4 5)(6))
Because each resulting partition is always an increasing integer sequence, and an increasing integer sequence of x can be defined as start <= x < end, the splitter elements can be transformed into the (exclusive) end of each partition according to that definition.
So, from [3 5], we want to find the subsequences that end before 4 and 6.
Then, by adding the start, the splitter can be transformed into a sequence of [start end] pairs. The start and end of the splittee are also used.
So the splitter [3 5] becomes:
[[1 4] [4 6] [6 7]]
The splitter transformation can be done like this:
(->> (concat [(first splittee)]
             (mapcat (juxt inc inc) splitter)
             [(inc (last splittee))])
     (partition 2))
There is a nice symmetry between the transformed splitter and the desired result:
[[1 4] [4 6] [6 7]]
((1 2 3) (4 5) (6))
The problem then becomes how to extract the subsequences of the splittee bounded by each [start end] pair in the transformed splitter.
Clojure has a subseq function that finds a subsequence of a sorted collection by start and end criteria. I can just map subseq over the splittee for each element of the transformed splitter:
(map (fn [[x y]]
       ;; lower bound is written >= x (element >= x), upper bound < y
       (subseq (apply sorted-set splittee) >= x < y))
     transformed-splitter)
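As a standalone illustration of subseq (the set below is my own example): each test is read as (test element key), so a lower bound is written with > or >= and an upper bound with < or <=.

```clojure
(def s (sorted-set 1 2 3 5 6))

(subseq s >= 2 < 5)   ;=> (2 3)
(subseq s >= 4 < 7)   ;=> (5 6), the start key does not have to be in the set
(subseq s > 3)        ;=> (5 6)
(subseq s <= 2)       ;=> (1 2)
```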
By combining the steps above, my answer is:
(defn partition-by-seq
  [splitter splittee]
  (->> (concat [(first splittee)]
               (mapcat (juxt inc inc) splitter)
               [(inc (last splittee))])
       (partition 2)
       (map (fn [[x y]]
              ;; >= x keeps this correct even when x itself is missing from splittee
              (subseq (apply sorted-set splittee) >= x < y)))))
This is the solution I came up with.
(def a [1 2 3 4 5 6])
(def p [2 4 5])
(defn partition-by-seq [s input]
  (loop [i 0
         t input
         v (transient [])]
    (if (< i (count s))
      (let [x (split-with #(<= % (nth s i)) t)]
        (recur (inc i) (first (rest x)) (conj! v (first x))))
      ;; use the value returned by conj!; a transient must not be mutated in place
      (filter #(not= (count %) 0) (persistent! (conj! v t))))))
(partition-by-seq p a)

Map with an accumulator in Clojure?

I want to map over a sequence in order but want to carry an accumulator value forward, like in a reduce.
Example use case: Take a vector and return a running total, each value multiplied by two.
(defn map-with-accumulator
  "Map over input but with an accumulator. func accepts [value accumulator] and returns [new-value new-accumulator]."
  [func accumulator collection]
  (if (empty? collection)
    nil
    (let [[this-value new-accumulator] (func (first collection) accumulator)]
      (cons this-value (map-with-accumulator func new-accumulator (rest collection))))))
(defn double-running-sum
  [value accumulator]
  [(* 2 (+ value accumulator)) (+ value accumulator)])
Which gives
(prn (pr-str (map-with-accumulator double-running-sum 0 [1 2 3 4 5])))
>>> (2 6 12 20 30)
Another example to illustrate the generality, print running sum as stars and the original number. A slightly convoluted example, but demonstrates that I need to keep the running accumulator in the map function:
(defn stars [n] (apply str (take n (repeat \*))))
(defn stars-sum [value accumulator]
[[(stars (+ value accumulator)) value] (+ value accumulator)])
(prn (pr-str (map-with-accumulator stars-sum 0 [1 2 3 4 5])))
>>> (["*" 1] ["***" 2] ["******" 3] ["**********" 4] ["***************" 5])
This works fine, but I would expect this to be a common pattern, and for some kind of map-with-accumulator to exist in core. Does it?
You should look into reductions. For this specific case:
(reductions #(+ % (* 2 %2)) 2 (range 2 6))
produces
(2 6 12 20 30)
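For readers who have not met reductions before: it is reduce that returns every intermediate result instead of only the last one, and it is lazy. A few illustrative calls (values are my own):

```clojure
(reductions + [1 2 3 4 5])        ;=> (1 3 6 10 15)
(reductions + 100 [1 2 3])        ;=> (100 101 103 106), the initial value is included
(reduce + [1 2 3 4 5])            ;=> 15, the last element of the reductions
(take 5 (reductions + (range)))   ;=> (0 1 3 6 10), works on infinite input
```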
The general solution
The common pattern of a mapping that can depend on both an item and the accumulating sum of a sequence is captured by the function
(defn map-sigma [f s] (map f s (sigma s)))
where
(def sigma (partial reductions +))
... returns the sequence of accumulating sums of a sequence:
(sigma (repeat 12 1))
; (1 2 3 4 5 6 7 8 9 10 11 12)
(sigma [1 2 3 4 5])
; (1 3 6 10 15)
In the definition of map-sigma, f is a function of two arguments, the item followed by the accumulator.
The examples
In these terms, the first example can be expressed
(map-sigma (fn [_ x] (* 2 x)) [1 2 3 4 5])
; (2 6 12 20 30)
In this case, the mapping function ignores the item and depends only on the accumulator.
The second can be expressed
(map-sigma #(vector (stars %2) %1) [1 2 3 4 5])
; (["*" 1] ["***" 2] ["******" 3] ["**********" 4] ["***************" 5])
... where the mapping function depends on both the item and the accumulator.
There is no standard function like map-sigma.
General conclusions
Just because a pattern of computation is common does not imply that it merits or requires its own standard function.
Lazy sequences and the sequence library are powerful enough to tease apart many problems into clear function compositions.
Rewritten to be specific to the question posed.
Edited to accommodate the changed second example.
reductions is the way to go, as Diego mentioned; however, for your specific problem the following works:
(map #(* % (inc %)) [1 2 3 4 5])
As mentioned you could use reductions:
(defn map-with-accumulator [f init-value collection]
  (map first (reductions (fn [[_ accumulator] next-elem]
                           (f next-elem accumulator))
                         (f (first collection) init-value)
                         (rest collection))))
=> (map-with-accumulator double-running-sum 0 [1 2 3 4 5])
(2 6 12 20 30)
=> (map-with-accumulator stars-sum 0 [1 2 3 4 5])
(["*" 1] ["***" 2] ["******" 3] ["**********" 4] ["***************" 5])
It's only in case you want to keep the original requirements. Otherwise I'd prefer to decompose f into two separate functions and use Thumbnail's approach.
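One detail of the reductions-based version is that it calls f on (first collection) even when the collection is empty. A variation, sketched here under the hypothetical name map-with-acc, seeds reductions with a placeholder pair and drops it afterwards, so an empty collection simply yields an empty result. double-running-sum is repeated from the question to keep the snippet self-contained.

```clojure
(defn map-with-acc
  "f takes [value accumulator] and returns [new-value new-accumulator].
  Hypothetical variant of map-with-accumulator."
  [f init coll]
  (->> coll
       (reductions (fn [[_ acc] x] (f x acc)) [nil init])
       rest            ; drop the seed pair [nil init]
       (map first)))   ; keep only the mapped values

(defn double-running-sum [value acc]
  [(* 2 (+ value acc)) (+ value acc)])

(map-with-acc double-running-sum 0 [1 2 3 4 5])   ;=> (2 6 12 20 30)
(map-with-acc double-running-sum 0 [])            ;=> ()
```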

Compare two vectors in clojure no matter the order of the items

I want to compare two vectors and find out if the items they have are the same no matter the order the items are in.
So..
right now in clojure:
(= [1 2 3] [3 2 1]) ;=> false
I want:
(other_fun [1 2 3] [3 2 1]) ;=> true
(other_fun [1 2 3 4] [3 2 1]) ;=> false
I could not find a containsAll like in java
If you do care about duplicates, you can compare their frequency maps. These are maps with each collection element as a key and number of occurrences as a value. You create them using standard function frequencies, like in given examples.
Different order, same number of duplicates:
(= (frequencies [1 1 2 3 4])(frequencies [4 1 1 2 3]))
evaluates true.
Different order, different number of duplicates:
(= (frequencies [1 1 2 3 4])(frequencies [4 1 2 3]))
evaluates false.
So, you can write a function:
(defn other_fun [& colls]
(apply = (map frequencies colls)))
If you don't care about duplicates, you could create sets from both vectors and compare these:
(= (set [1 2 3]) (set [3 2 1])) ;=> true
As a function:
(defn set= [& vectors] (apply = (map set vectors)))
If you don't care about duplicates, the other answers are perfectly applicable and efficient.
But if you do care about duplicates, probably the easiest way to compare two vectors is sorting and comparing:
user=> (= (sort [3 5 2 2]) (sort [2 2 5 3]))
true
user=> (= (sort [3 5 2 2]) (sort [2 5 3]))
false
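The set, frequencies and sort approaches only disagree when duplicates are involved, as this small comparison shows (the inputs are my own example). Note that sort additionally requires the elements to be mutually comparable.

```clojure
(def a [1 1 2])
(def b [2 1 2])

(= (set a) (set b))                   ;=> true, duplicates are ignored
(= (frequencies a) (frequencies b))   ;=> false, {1 2, 2 1} vs {2 2, 1 1}
(= (sort a) (sort b))                 ;=> false, (1 1 2) vs (1 2 2)
```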
Create sets from them:
user=> (= (set [1 2 3]) (set [3 2 1]))
true
user=> (defn other_func [col1 col2]
(= (set col1) (set col2)))
#'user/other_func
user=> (other_func [1 2 3] [3 2 1])
true
You're on the JVM already, so if you want containsAll, then just use containsAll, right?
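Clojure vectors implement java.util.Collection, so the interop call works directly. A sketch (the function name same-elements? is hypothetical); like the set-based answers, it ignores duplicates:

```clojure
(defn same-elements?
  "True when every element of a is in b and vice versa. Hypothetical helper."
  [^java.util.Collection a ^java.util.Collection b]
  (and (.containsAll a b)
       (.containsAll b a)))

(same-elements? [1 2 3] [3 2 1])     ;=> true
(same-elements? [1 2 3 4] [3 2 1])   ;=> false
(same-elements? [1 1 2] [1 2 2])     ;=> true, duplicates are not counted
```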
(defn other_fun
  "Checks the presence of the elements of vec1 in vec2 and vice versa."
  [vec1 vec2]
  (if (or (some nil?
                (for [a vec1 b [vec2]] (some #(= % a) b)))
          (some nil?
                (for [a vec2 b [vec1]] (some #(= % a) b))))
    false
    true))
(other_fun [1 2 3] [3 2 1]) ;=> true
(other_fun [1 2 3 4] [3 2 1]) ;=> false