Compare two vectors in clojure no matter the order of the items - clojure

I want to compare two vectors and find out if the items they have are the same no matter the order the items are in.
So..
right now in clojure:
(= [1 2 3] [3 2 1]) ;=> false
I want:
(other_fun [1 2 3] [3 2 1]) ;=> true
(other_fun [1 2 3 4] [3 2 1]) ;=> false
I could not find a containsAll like in java

If you do care about duplicates, you can compare their frequency maps. These are maps with each collection element as a key and number of occurrences as a value. You create them using standard function frequencies, like in given examples.
Different order, same number of duplicates:
(= (frequencies [1 1 2 3 4])(frequencies [4 1 1 2 3]))
evaluates true.
Different order, different number of duplicates:
(= (frequencies [1 1 2 3 4])(frequencies [4 1 2 3]))
evaluates false.
So, you can write a function:
(defn other_fun [& colls]
(apply = (map frequencies colls)))

If you don't care about duplicates, you could create sets from both vectors and compare these:
(= (set [1 2 3]) (set [3 2 1])) ;=> true
As a function:
(defn set= [& vectors] (apply = (map set vectors)))

If you don't care about duplicates, other answers a perfectly applicable and efficient.
But if you do care about duplicates, probably the easiest way to compare two vectors is sorting and comparing:
user=> (= (sort [3 5 2 2]) (sort [2 2 5 3]))
true
user=> (= (sort [3 5 2 2]) (sort [2 5 3]))
false

Create sets from them:
user=> (= (set [1 2 3]) (set [3 2 1]))
true
user=> (defn other_func [col1 col2]
(= (set col1) (set col2)))
#'user/other_func
user=> (other_func [1 2 3] [3 2 1])
true

You're on the JVM already, so if you want containsAll, then just use containsAll, right?

(defn other_fun
"checkes the presence of the elements of vec1 in vec2 and vice versa"
[vec1 vec2]
(if (or (some nil?
(for [a vec1 b [vec2]] (some #(= % a) b)))
(some nil?
(for [a vec2 b [vec1]] (some #(= % a) b))))
false
true))
(other_fun [1 2 3] [3 2 1]) ;=> true
(other_fun [1 2 3 4] [3 2 1]) ;=> false

Related

Why does clojure.core/rest output a list when input is a vector?

Why does clojure.core/rest output a list when input is a vector?
This creates an unexpected effect:
(conj [1 2 3] 4)
; => [1 2 3 4]
(conj (rest [1 2 3]) 4)
; => (4 2 3)
I know that "it calls seq on its argument" from the docs which creates this effect. I don't understand why this is the desired effect. As a naïve user, I would expect (rest [1 2 3]) to behave like (subvec [1 2 3] 1). I know I could just use subvec for my use case. For the sake of learning, I would like to understand the rationale of rest, and use cases where outputting a list is desirable (even when the input is a vector).
The output of rest is NOT a list, but a seq, which is an even lower level abstraction. From the official documentation for rest:
Returns a possibly empty seq of the items after the first. Calls seq on its
argument.
The confusion arises from the fact that both are printed between parens, but if you look closely, they are different:
user=> (list? (rest [1 2 3]))
false
user=> (seq? (rest [1 2 3]))
true
How it's a seq different from a list? seqs are implemented with an Interface that requires implementing first, rest and cons, but details are up to the collection implementation. For instance, vectors use their own implementation:
user=> (class (rest [1 2 3]))
clojure.lang.PersistentVector$ChunkedSeq
user=> (class (rest '(1 2 3)))
clojure.lang.PersistentList
List are an implementation that at least extends a basic Seq interface, and builds on top. For instance, clojure.lang.PersistentList implements the Counted interface which requires a constant-time version of count.
For a detailed description of the differences between Seqs and Lists, check these links:
Differences between a seq and a list
https://clojure.org/reference/sequences
You make a good case for rest on a vector returning a vector. The trouble is that rest is one of the fundamental operations on sequences, and a vector is not a sequence:
=> (seq? [1 2 3 4])
false
However, if rest can accept a seqable thing such as a vector, you could say that it ought to be able to return such.
What does it return?
=> (type (rest [1 2 3 4]))
clojure.lang.PersistentVector$ChunkedSeq
This gives every appearance of being a subvec wrapped in a seq call.
I know that "it calls seq on its argument"
That is correct. Seqs are implemented with an Interface (ISeq) that requires implementing first, rest and cons.
rest takes any Seq'able (any collection that implements ISequable). The reason for using this is efficiency and simplicity.
The way different collection works, the most efficient way of getting the first and rest is different.
Which is why when you convert one collection into a seq, it will come with the most efficient implementation on rest and the others.
I hope this was clear
I agree that this behavior is unexpected and counterintuitive. As a workaround, I created the append and prepend functions in the Tupelo library.
From the docs, we see examples:
Clojure has the cons, conj, and concat functions, but it is not obvious how they should be used to add a new value to the beginning of a vector or list:
; Add to the end
> (concat [1 2] 3) ;=> IllegalArgumentException
> (cons [1 2] 3) ;=> IllegalArgumentException
> (conj [1 2] 3) ;=> [1 2 3]
> (conj [1 2] 3 4) ;=> [1 2 3 4]
> (conj '(1 2) 3) ;=> (3 1 2) ; oops
> (conj '(1 2) 3 4) ;=> (4 3 1 2) ; oops
; Add to the beginning
> (conj 1 [2 3] ) ;=> ClassCastException
> (concat 1 [2 3] ) ;=> IllegalArgumentException
> (cons 1 [2 3] ) ;=> (1 2 3)
> (cons 1 2 [3 4] ) ;=> ArityException
> (cons 1 '(2 3) ) ;=> (1 2 3)
> (cons 1 2 '(3 4) ) ;=> ArityException
Do you know what conj does when you pass it nil instead of a sequence? It silently replaces it with an empty list: (conj nil 5) ⇒ (5) This can cause you to accumulate items in reverse order if you aren’t aware of the default behavior:
(-> nil
(conj 1)
(conj 2)
(conj 3))
;=> (3 2 1)
These failures are irritating and unproductive, and the error messages don’t make it obvious what went wrong. Instead, use the simple prepend and append functions to add new elements to the beginning or end of a sequence, respectively:
(append [1 2] 3 ) ;=> [1 2 3 ]
(append [1 2] 3 4) ;=> [1 2 3 4]
(prepend 3 [2 1]) ;=> [ 3 2 1]
(prepend 4 3 [2 1]) ;=> [4 3 2 1]
Both prepend and append always return a vector result.

Split vector after each occurance of an element

This should be easy but I'm finding it more difficult than expected.
Given [0 1 2 0 1 2 0 1], split the sequence after each occurance of 2.
Result should be similar to [[0 1 2] [0 1 2] [0 1]].
split functions only split at the first instance. My imagination is also limited on how to use the partition functions to achieve this.
previous solutions are ok (although #magos solution if flawed in some cases), but if this function is to be used as an utility (it is rather general i guess), i would use the classic iterative approach:
(defn group-loop [delim coll]
(loop [res [] curr [] coll (seq coll)]
(if coll
(let [group (conj curr (first coll))]
(if (= delim (first coll))
(recur (conj res group) [] (next coll))
(recur res group (next coll))))
(if (seq curr)
(conj res curr)
res))))
in repl:
user> (map (partial group-loop 2)
[[]
nil
[1 2 3 1 2 3]
[1 2 3 1 2 3 2]
[2 1 2 3 1 2 3]
[1 3 4 1 3 4]])
;;([] []
;; [[1 2] [3 1 2] [3]]
;; [[1 2] [3 1 2] [3 2]]
;; [[2] [1 2] [3 1 2] [3]]
;; [[1 3 4 1 3 4]])
Though it looks a bit too verbose, it still has some rather important advantages: first of all it is kind of classic (which i find a pro rather than con), second: it is fast (according to my benchmark about 3 times faster than reduce variant, and 6 to 10 times faster than partition variant)
also you can make it more clojurish with some minor tweaks, returning lazy collection as clojure's sequence operating functions do:
(defn group-lazy [delim coll]
(loop [curr [] coll coll]
(if (seq coll)
(let [curr (conj curr (first coll))]
(if (= delim (first coll))
(cons curr (lazy-seq (group-lazy delim (rest coll))))
(recur curr (next coll))))
(when (seq curr) [curr]))))
user> (map (partial group-lazy 2)
[[]
nil
[1 2 3 1 2 3]
[1 2 3 1 2 3 2]
[2 1 2 3 1 2 3]
[1 3 4 1 3 4]])
;;(nil nil
;; ([1 2] [3 1 2] [3])
;; ([1 2] [3 1 2] [3 2])
;; ([2] [1 2] [3 1 2] [3])
;; [[1 3 4 1 3 4]])
Here's one way by combining two partition variants. First use partition-by to divide at instances of 2, then take two and two of those partitions with partition-all and join them together using concat.
(->> [0 1 2 0 1 2 0 1]
(partition-by (partial = 2)) ;;((0 1) (2) (0 1) (2) (0 1))
(partition-all 2) ;;(((0 1) (2)) ((0 1) (2)) ((0 1)))
(mapv (comp vec (partial reduce concat)))) ;;[[0 1 2] [0 1 2] [0 1]]
Although note that if the input starts on a 2 the returned partitions will also start with 2s, not end on them as here.
Here you go, works as requested for all inputs:
(reduce #(let [last-v (peek %1)]
(if (= 2 (last last-v))
(conj %1 [%2])
(conj (pop %1) (conj last-v %2))))
[[]]
[2 2 0 1 2 3 4 2 2 0 1 2 2])
=> [[2] [2] [0 1 2] [3 4 2] [2] [0 1 2] [2]]
While Magos has an elegant solution, it is unfortunately not complete, as he mentions. So, the above should do the job using reduce.
We look at the most recently added element. If it was a 2, we create a new sub-vector ((conj %1 [%2])). Otherwise, we add it to the last sub-vector. Pretty simple really. Existing functions like the partitions and splits are great for reusing when possible, but sometimes the best solution is a custom function, and in this case it's actually pretty clean.

Clojure: how to test if a seq is a "subseq" of another seq

Is there an easy / idiomatic way in Clojure to test whether a given sequence is included within another sequence? Something like:
(subseq? [4 5 6] (range 10)) ;=> true
(subseq? [4 6 5] (range 10)) ;=> false
(subseq? "hound" "greyhound") ;=> true
(where subseq? is a theoretical function that would do what I'm describing)
It seems that there is no such function in the core or other Clojure libraries... assuming that's true, is there a relatively simple way to implement such a function?
(defn subseq? [a b]
(some #{a} (partition (count a) 1 b)))
(defn subseq? [target source]
(pos? (java.util.Collections/indexOfSubList (seq source) (seq target))))
***
DISCLAIMER EDIT
This proposal is not reliable for reasons discussed in comments section.
***
#amalloy 's solution has one flaw - it won't work with infinite lazy sequences. So it will loop forever when you run this:
(subseq? [1 2 3] (cycle [2 3 1]))
I propose this implementation to fix this:
(defn- safe-b
"In case b is a cycle, take only two full cycles -1 of a-count
to prevent infinite loops yet still guarantee finding potential solution."
[b a-count]
(take
(* 2 a-count)
b))
(defn subseq? [a b]
(let [a-count (count a)]
(some #{a} (partition a-count 1 (safe-b b a-count)))))
=> #'user/safe-b
=> #'user/subseq?
(subseq? [1 2 3] (cycle [2 3 1]))
=> [1 2 3]
(subseq? [1 2 3] (cycle [3 2 1]))
=> nil
(subseq? [1 2 3] [2 3])
=> nil
(subseq? [2 3] [1 2 3])
=> [2 3]

Difference between arrow and double arrow macros in Clojure

What is the difference between the -> and ->> macros in Clojure?
The docs A. Webb linked to explain the "what", but don't do a good job of the "why".
As a rule, when a function works on a singular subject, that subject is the first argument (e.g., conj, assoc). When a function works on a sequence subject, that subject is the last argument (e.g., map, filter).
So, -> and ->> are documented as threading the first and last arguments respectively, but it is also useful to think of them as applying to singular or sequential arguments respectively.
For example, we can consider a vector as a singular object:
(-> [1 2 3]
(conj 4) ; (conj [1 2 3] 4)
(conj 5) ; (conj [1 2 3 4] 5)
(assoc 0 0)) ; (assoc [1 2 3 4 5] 0 0)
=> [0 2 3 4 5]
Or we can consider it as a sequence:
(->> [1 2 3]
(map inc) ; (map inc [1 2 3])
(map inc) ; (map inc (2 3 4))
(concat [0 2])) ; (concat [0 2] (3 4 5))
=> (0 2 3 4 5)

How to remove multiple items from a list?

I have a list [2 3 5] which I want to use to remove items from another list like [1 2 3 4 5], so that I get [1 4].
thanks
Try this:
(let [a [1 2 3 4 5]
b [2 3 5]]
(remove (set b) a))
which returns (1 4).
The remove function, by the way, takes a predicate and a collection, and returns a sequence of the elements that don't satisfy the predicate (a set, in this example).
user=> (use 'clojure.set)
nil
user=> (difference (set [1 2 3 4 5]) (set [2 3 5]))
#{1 4}
Reference:
http://clojure.org/data_structures#toc22
http://clojure.org/api#difference
You can do this yourself with something like:
(def a [2 3 5])
(def b [1 2 3 4 5])
(defn seq-contains?
[coll target] (some #(= target %) coll))
(filter #(not (seq-contains? a %)) b)
; (3 4 5)
A version based on the reducers library could be:
(require '[clojure.core.reducers :as r])
(defn seq-contains?
[coll target]
(some #(= target %) coll))
(defn my-remove
"remove values from seq b that are present in seq a"
[a b]
(into [] (r/filter #(not (seq-contains? b %)) a)))
(my-remove [1 2 3 4 5] [2 3 5] )
; [1 4]
EDIT Added seq-contains? code
Here is my take without using sets;
(defn my-diff-func [X Y]
(reduce #(remove (fn [x] (= x %2)) %1) X Y ))