Clojure equality of collections with sequences

I noticed that Clojure (1.4) seems to be happy to consider vectors equal to the seq of the same vector, but that the same does not apply for maps:
(= [1 2] (seq [1 2]))
=> true
(= {1 2} (seq {1 2}))
=> false
Why should the behaviour of = be different in this way?

Clojure's = can be thought of as performing its comparisons in two steps:
Check if the types of the things being compared belong to the same "equality partition", that is a class of types whose members might potentially be equal (depending on things like the exact members of a given data structure, but not the particular type in the partition);
If so, check if the things being compared actually are equal.
One such equality partition is that of "sequential" things. Vectors are considered sequential:
(instance? clojure.lang.Sequential [])
;= true
As are seqs of various types:
(instance? clojure.lang.Sequential (seq {1 2}))
;= true
Therefore a vector is considered equal to a seq if (and only if) they have the same number of elements and their corresponding elements are equal.
(Note that (seq {}) produces nil, which is not sequential and compares "not equal" to (), [] etc.)
On the other hand, maps constitute an equality partition of their own, so while a hash map might be considered equal to a sorted map, it will never be considered equal to a seq. In particular, it is not equal to the seq of its entries, which is what (seq some-map) produces.
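To make the partition idea concrete, here is a short REPL-style sketch that uses only clojure.core. The results in the comments are what I would expect on a recent Clojure release:

```clojure
;; Sequential things compare element by element, whatever their concrete type.
(= [1 2] '(1 2))              ;=> true
(= [1 2] (seq [1 2]))         ;=> true

;; Maps are only ever equal to other maps.
(= {1 2} (sorted-map 1 2))    ;=> true
(= {1 2} (seq {1 2}))         ;=> false
(= {1 2} [[1 2]])             ;=> false

;; nil is not sequential, so the seq of an empty collection is not equal to ().
(= (seq {}) ())               ;=> false
```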

I guess this is because in sequences the order, as well as the value at each position, matters, whereas in a map the order of the key/value pairs doesn't matter. This difference in semantics causes = to behave as shown by your sample code.
For more details have a look at mapEquals in the file https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/APersistentMap.java
If the other object is not a map, it returns false.

user=> (seq {1 2})
([1 2])
user=> (type {1 2})
clojure.lang.PersistentArrayMap

It seems to me that this example points out a slight inconsistency in Clojure's notion of value equality, for the case where the two values have different types but one is derived from the other (by the seq function). It could well be argued that this is not inconsistent, because it compares a derived type to the type it is derived from, and I could understand that if the same logic were applied to the same example using vectors (see the note at the bottom).
The contents are the same type:
user> (type (first (seq {1 2})))
clojure.lang.MapEntry
user> (type (first {1 2}))
clojure.lang.MapEntry
user> (= (type (first {1 2})) (type (first (seq {1 2}))))
true
user> (= (first {1 2}) (first (seq {1 2})))
true
the sequences have the same values
user> (map = (seq {1 2}) {1 2})
(true)
but they are not considered equal
user> (= {1 2} (seq {1 2}))
false
this is true for longer maps as well:
user> (map = (seq {1 2 3 4}) {1 2 3 4})
(true true)
user> (map = (seq {1 2 3 4 5 6}) {1 2 3 4 5 6})
(true true true)
user> (map = (seq {9 10 1 2 3 4 5 6}) {9 10 1 2 3 4 5 6})
(true true true true)
even if they are not in the same order:
user> (map = (seq {9 10 1 2 3 4 5 6}) {1 2 3 4 5 6 9 10})
(true true true true)
but again not if the containing types differ :-(
user> (= {1 2 3 4} (seq {1 2 3 4}))
false
EDIT: this is not always true, see below.
To work around this you can convert everything to a seq before comparison, which is (I presume) safe because the seq function always iterates the whole data structure the same way, the structures are immutable values, and a seq of a seq is a seq:
user> (= (seq {9 10 1 2 3 4 5 6}) {1 2 3 4 5 6 9 10})
false
user> (= (seq {9 10 1 2 3 4 5 6}) (seq {1 2 3 4 5 6 9 10}))
true
vectors are treated differently:
user> (= [1 2 3 4] (seq [1 2 3 4]))
true
Perhaps understanding the minor inconsistencies is part of learning a language or someday this could be changed (though I would not hold my breath)
EDIT:
I found two maps that produce different sequences for the same value so just calling seq on the maps will not give you proper map equality:
user> (seq (zipmap [3 1 5 9][4 2 6 10]))
([9 10] [5 6] [1 2] [3 4])
user> (seq {9 10 5 6 1 2 3 4})
([1 2] [3 4] [5 6] [9 10])
user>
here is an example of what I'm calling proper map equality:
user> (def a (zipmap [3 1 5 9][4 2 6 10]))
#'user/a
user> (def b {9 10 5 6 1 2 3 4})
#'user/b
user> (every? true? (map #(= (a %) (b %)) (keys a)))
true
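For what it's worth, = applied directly to two maps already ignores entry order, so the key-by-key check should not be needed. Note also that the every? check above only looks at the keys of a, so it would still return true if b had extra keys. A minimal sketch with the same a and b:

```clojure
(def a (zipmap [3 1 5 9] [4 2 6 10]))
(def b {9 10 5 6 1 2 3 4})

;; Maps are equal when they hold the same key/value pairs,
;; regardless of the order in which seq happens to walk them.
(= a b)                       ;=> true
(= a (assoc b 3 :changed))    ;=> false

;; Pouring a seq of entries back into a map restores map equality.
(= a (into {} (seq b)))       ;=> true
```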

(seq some-hash-map) gives you a sequence of entries (key/value pairs).
For example:
foo.core=> (seq {:a 1 :b 2 :c 3})
([:a 1] [:c 3] [:b 2])
which is not the same as [:a 1 :b 2 :c 3].
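A small sketch of what that means for equality, assuming nothing beyond clojure.core (the name m is just for illustration):

```clojure
(def m {:a 1 :b 2 :c 3})

;; Each element of the seq is a key/value pair that compares equal to a vector.
(= (first (seq {:a 1})) [:a 1])              ;=> true
(every? #(= 2 (count %)) (seq m))            ;=> true

;; The seq of a one-entry map is equal to a vector holding one pair...
(= (seq {:a 1}) [[:a 1]])                    ;=> true
;; ...but not to the flat vector of keys and values.
(= (seq {:a 1}) [:a 1])                      ;=> false

;; Comparing as sets sidesteps the unspecified entry order.
(= (set (seq m)) #{[:a 1] [:b 2] [:c 3]})    ;=> true
```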

Related

Why does Clojure's contains? behave so strangely? [duplicate]

Why does Clojure output true for the first call and false for the second one?
(def myset [3 5 7 11 13 17 19])
(defn check-n
[n]
(contains? myset n))
(check-n 1)
(check-n 20)
contains? is a function for checking whether a key is present in a collection. It can be used with a map:
(contains? {:a 1 :b 2} :a)
=> true
(contains? {:a 1 :b 2} :c)
=> false
or with a vector, in which case it checks whether the vector has the given index:
(contains? [1 2 3] 0)
=> true
(contains? [1 2 3] 3)
=> false
If you want to check for the occurrence of a number, use .contains from Java:
(.contains [1 2 3] 3)
=> true
or some with set used as predicate:
(some #{3} [1 2 3])
=> 3
(some #{4} [1 2 3])
=> nil
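Applied to the question, this is roughly what happens. In the sketch below myvec is the vector from the question under a name I made up, and myset is an actual set built from it:

```clojure
(def myvec [3 5 7 11 13 17 19])   ; the vector from the question (hypothetical name)
(def myset (set myvec))           ; a real set with the same elements

;; On a vector, contains? asks "is this a valid index?"
(contains? myvec 1)    ;=> true  (index 1 exists and holds 5)
(contains? myvec 20)   ;=> false (there is no index 20)

;; On a set, contains? asks "is this value a member?"
(contains? myset 1)    ;=> false
(contains? myset 11)   ;=> true

;; A set is also a function of its elements.
(myset 11)             ;=> 11
(myset 1)              ;=> nil
```

So if membership is what you need, storing the numbers in a set in the first place is probably the simplest fix.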

Single duplicate in a vector

Given a list of 5 integers from 1 to 10, how do I check whether exactly one value appears twice and every other value appears once?
For example
(check '(2 2 4 5 7))
yields yes, while
(check '(2 1 4 4 4))
or
(check '(1 2 3 4 5))
yields no
Here is a solution using frequencies to count occurrences and filter to keep the values that occur exactly twice:
(defn only-one-pair? [coll]
  (->> coll
       frequencies                 ; map with counts of each value in coll
       (filter #(= (second %) 2))  ; keep values that have exactly 2 occurrences
       count                       ; number of unique values with exactly 2 occurrences
       (= 1)))                     ; true if only one unique val in coll has 2 occurrences
Which gives:
user=> (only-one-pair? '(2 1 4 4 4))
false
user=> (only-one-pair? '(2 2 4 5 7))
true
user=> (only-one-pair? '(1 2 3 4 5))
false
Intermediate steps in the function to get a sense of how it works:
user=> (->> '(2 2 4 5 7) frequencies)
{2 2, 4 1, 5 1, 7 1}
user=> (->> '(2 2 4 5 7) frequencies (filter #(= (second %) 2)))
([2 2])
user=> (->> '(2 2 4 5 7) frequencies (filter #(= (second %) 2)) count)
1
Per a suggestion, the function could use a more descriptive name, and it is also best practice in Clojure to end the names of predicate functions with a ?. So something like only-one-pair? is better than just check.
Christian Gonzalez's answer is elegant, and great if you are sure you are operating on a small input. However, it is eager: it forces the entire input list even when it could in principle tell sooner that the result will be false. This is a problem if the list is very large, or if it is a lazy list whose elements are expensive to compute. Try it on (list* 1 1 1 (range 1e9))! I therefore present below an alternative that short-circuits as soon as it finds a second duplicate:
(defn exactly-one-duplicate? [coll]
  (loop [seen #{}
         xs (seq coll)
         seen-dupe false]
    (if-not xs
      seen-dupe
      (let [x (first xs)]
        (if (contains? seen x)
          (and (not seen-dupe)
               (recur seen (next xs) true))
          (recur (conj seen x) (next xs) seen-dupe))))))
Naturally it is rather more cumbersome than the carefree approach, but I couldn't see a way to get this short-circuiting behavior without doing everything by hand. I would love to see an improvement that achieves the same result by combining higher-level functions.
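One possible middle ground, offered only as a sketch: reduce together with reduced (available since Clojure 1.5) can stop early without a hand-written loop. The name one-dupe-reduce? is made up for this example.

```clojure
(defn one-dupe-reduce?
  "True when exactly one element of coll repeats an earlier element.
  Stops consuming coll as soon as a second repeat is seen."
  [coll]
  (second
   (reduce (fn [[seen dupe?] x]
             (cond
               (not (contains? seen x)) [(conj seen x) dupe?] ; first sighting of x
               dupe?                    (reduced [seen false]) ; second repeat: give up
               :else                    [seen true]))          ; first repeat
           [#{} false]
           coll)))

(one-dupe-reduce? '(2 2 4 5 7))   ;=> true
(one-dupe-reduce? '(2 1 4 4 4))   ;=> false
(one-dupe-reduce? '(1 2 3 4 5))   ;=> false
```

The state is a pair of the set of values seen so far and a flag recording whether one repeat has already been found, which mirrors the seen and seen-dupe bindings of the loop version.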
(letfn [(check [xs] (->> xs distinct count (= (dec (count xs)))))]
(clojure.test/are [input output]
(= (check input) output)
[1 2 3 4 5] false
[1 2 1 4 5] true
[1 2 1 2 1] false))
I prefer a shorter version, though it is limited to lists of exactly 5 items:
(defn check [xs] (->> xs distinct count (= 4)))
In answer to Alan Malloy's plea, here is a somewhat combinatory solution:
(defn check [coll]
  (let [accums (reductions conj #{} coll)]
    (->> (map contains? accums coll)
         (filter identity)
         (= (list true)))))
This
creates a lazy sequence of the accumulating set;
tests it against each corresponding new element;
filters for the true cases - those where the element is already present;
tests whether there is exactly one of them.
It is lazy, but does duplicate the business of scanning the given collection. I tried it on Alan Malloy's example:
=> (check (list* 1 1 1 (range 1e9)))
false
This returns instantly. Extending the range makes no difference:
=> (check (list* 1 1 1 (range 1e20)))
false
... also returns instantly.
Edited to accept Alan Malloy's suggested simplification, which I have had to modify to avoid what appears to be a bug in Clojure 1.10.0.
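You can do something like this (note that it only checks that at least one value occurs exactly twice, so a list with two different pairs also passes):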
(defn check [my-list]
(not (empty? (filter (fn[[k v]] (= v 2)) (frequencies my-list)))))
(check '(2 4 5 7))
(check '(2 2 4 5 7))
Similar to others using frequencies - just apply twice
(-> coll
frequencies
vals
frequencies
(get 2)
(= 1))
Positive case:
(def coll '(2 2 4 5 7))
frequencies=> {2 2, 4 1, 5 1, 7 1}
vals=> (2 1 1 1)
frequencies=> {2 1, 1 3}
(get (frequencies #) 2)=> 1
Negative case:
(def coll '(2 1 4 4 4))
frequencies=> {2 1, 1 1, 4 3}
vals=> (1 1 3)
frequencies=> {1 2, 3 1}
(get (frequencies #) 2)=> nil
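Wrapped up as a function, the pipeline above might look like the sketch below. The name exactly-one-pair? is my own; the body is the same threading form with coll as a parameter.

```clojure
(defn exactly-one-pair? [coll]
  (-> coll
      frequencies   ; value -> number of occurrences
      vals          ; just the occurrence counts
      frequencies   ; occurrence count -> how many values have that count
      (get 2)       ; how many values occur exactly twice (nil if none)
      (= 1)))

(exactly-one-pair? '(2 2 4 5 7))   ;=> true
(exactly-one-pair? '(2 1 4 4 4))   ;=> false
(exactly-one-pair? '(2 2 4 4 5))   ;=> false, two different pairs
```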

map into a hashmap of input to output? Does this function exist?

I have a higher-order map-like function that returns a hashmap representing the application (input to output) of the function.
(defn map-application [f coll] (into {} (map #(vector % (f %)) coll)))
To be used thus:
(map-application str [1 2 3 4 5])
{1 "1", 2 "2", 3 "3", 4 "4", 5 "5"}
(map-application (partial * 10) [1 2 3 4 5])
{1 10, 2 20, 3 30, 4 40, 5 50}
Does this function already exist, or does this pattern have a recognised name?
I know it's only a one-liner, but looking at the constellation of related functions in clojure.core, this looks like the kind of thing that already exists.
I guess the term you are looking for is transducer.
https://clojure.org/reference/transducers
In fact the transducing variant would look almost like yours (the key difference is that the coll argument is passed to the into function, not to map), but it does its job without any intermediate collections:
user> (defn into-map [f coll]
(into {} (map (juxt identity f)) coll))
#'user/into-map
user> (into-map inc [1 2 3])
;;=> {1 2, 2 3, 3 4}
This can also be done with a simple reduction, though it requires a bit more manual work:
user> (defn map-into-2 [f coll]
(reduce #(assoc %1 %2 (f %2)) {} coll))
#'user/map-into-2
user> (map-into-2 inc [1 2 3])
;;=> {1 2, 2 3, 3 4}
What you're describing is easily handled by the built-in zipmap function:
(defn map-application
[f coll]
(zipmap coll (map f coll)))
(map-application (partial * 10) [1 2 3 4 5])
=> {1 10, 2 20, 3 30, 4 40, 5 50}
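As a quick sanity check, the three variants suggested in this thread should agree with each other. The sketch below renames them via-into, via-reduce and via-zipmap (names invented here so they can coexist), and assumes Clojure 1.7 or later for the transducer arity of map:

```clojure
(defn via-into   [f coll] (into {} (map (juxt identity f)) coll))
(defn via-reduce [f coll] (reduce #(assoc %1 %2 (f %2)) {} coll))
(defn via-zipmap [f coll] (zipmap coll (map f coll)))

(def times-ten (partial * 10))

(via-into times-ten [1 2 3])          ;=> {1 10, 2 20, 3 30}

(= (via-into   times-ten [1 2 3])
   (via-reduce times-ten [1 2 3])
   (via-zipmap times-ten [1 2 3]))    ;=> true

;; Repeated inputs collapse into a single key in every variant.
(via-zipmap str [1 1 2])              ;=> {1 "1", 2 "2"}
```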

How to search and replace in a Clojure script data structure?

I would like to have a search and replace on the values only inside data structures:
(def str [1 2 3
{:a 1
:b 2
1 3}])
and
(subst str 1 2)
to return
[2 2 3 {:a 2, :b 2, 1 3}]
Another example:
(def str2 {[1 2 3] x, {a 1 b 2} y} )
and
(subst str2 1 2)
to return
{[1 2 3] x, {a 1 b 2} y}
Since the 1's are keys in a map they are not replaced
One option is to use postwalk-replace from clojure.walk:
user> (require '[clojure.walk :refer [postwalk-replace]])
;; => nil
user> (def foo [1 2 3
                {:a 1
                 :b 2
                 1 3}])
;; => #'user/foo
user> (postwalk-replace {1 2} foo)
;; => [2 2 3 {2 3, :b 2, :a 2}]
However, this method has a downside: it replaces all matching elements in the structure, including map keys, not only values. This may not be what you want.
Maybe this will do the trick...
(defn my-replace [smap s]
  (letfn [(trns [s]
            (map (fn [x]
                   (if (coll? x)
                     (my-replace smap x)
                     (or (smap x) x)))
                 s))]
    (if (map? s)
      (zipmap (keys s) (trns (vals s)))
      (trns s))))
Works with lists, vectors and maps:
user> (my-replace {1 2} foo)
;; => (2 2 3 {:a 2, :b 2, 1 3})
...Seems to work on arbitrary nested structures too:
user> (my-replace {1 2} [1 2 3 {:a [1 1 1] :b [3 2 1] 1 1}])
;; => (2 2 3 {:a (2 2 2), :b (3 2 2) 1 2})
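If it matters that vectors stay vectors, as in the expected output in the question, a type-preserving variant could be sketched as below. It keeps the subst name and argument order from the question; the local helper step is invented here, and the sketch assumes plain maps (records do not support empty) and treats set elements as values.

```clojure
(defn subst
  "Replaces old with new in value positions only; map keys are left alone."
  [data old new]
  (let [step #(subst % old new)]
    (cond
      (map? data)    (into (empty data) (for [[k v] data] [k (step v)]))
      (vector? data) (mapv step data)                      ; vectors stay vectors
      (set? data)    (into (empty data) (map step data))
      (seq? data)    (map step data)                       ; lists and seqs become lazy seqs
      :else          (if (= data old) new data))))

(subst [1 2 3 {:a 1 :b 2 1 3}] 1 2)
;=> [2 2 3 {:a 2, :b 2, 1 3}]
```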

Compare two vectors in clojure no matter the order of the items

I want to compare two vectors and find out if the items they have are the same no matter the order the items are in.
So..
right now in clojure:
(= [1 2 3] [3 2 1]) ;=> false
I want:
(other_fun [1 2 3] [3 2 1]) ;=> true
(other_fun [1 2 3 4] [3 2 1]) ;=> false
I could not find a containsAll like in java
If you do care about duplicates, you can compare their frequency maps. These are maps with each collection element as a key and its number of occurrences as the value. You can create them with the standard function frequencies, as in the examples below.
Different order, same number of duplicates:
(= (frequencies [1 1 2 3 4])(frequencies [4 1 1 2 3]))
evaluates true.
Different order, different number of duplicates:
(= (frequencies [1 1 2 3 4])(frequencies [4 1 2 3]))
evaluates false.
So, you can write a function:
(defn other_fun [& colls]
(apply = (map frequencies colls)))
If you don't care about duplicates, you could create sets from both vectors and compare these:
(= (set [1 2 3]) (set [3 2 1])) ;=> true
As a function:
(defn set= [& vectors] (apply = (map set vectors)))
If you don't care about duplicates, the other answers are perfectly applicable and efficient.
But if you do care about duplicates, probably the easiest way to compare two vectors is sorting and comparing:
user=> (= (sort [3 5 2 2]) (sort [2 2 5 3]))
true
user=> (= (sort [3 5 2 2]) (sort [2 5 3]))
false
Create sets from them:
user=> (= (set [1 2 3]) (set [3 2 1]))
true
user=> (defn other_func [col1 col2]
(= (set col1) (set col2)))
#'user/other_func
user=> (other_func [1 2 3] [3 2 1])
true
You're on the JVM already, so if you want containsAll, then just use containsAll, right?
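A minimal sketch of that idea, relying on the fact that Clojure vectors implement java.util.Collection. The function name same-elements? is made up for this example, and the type hints are only there to avoid reflection:

```clojure
(defn same-elements?
  "True when every element of a is in b and vice versa. Duplicates are ignored."
  [^java.util.Collection a ^java.util.Collection b]
  (and (.containsAll a b)
       (.containsAll b a)))

(same-elements? [1 2 3] [3 2 1])     ;=> true
(same-elements? [1 2 3 4] [3 2 1])   ;=> false
(same-elements? [1 1 2] [2 1])       ;=> true, duplicates do not count
```

On vectors each containsAll call scans linearly for every element, so for large inputs the set or frequencies approaches are likely to be faster.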
(defn other_fun
  "checks the presence of the elements of vec1 in vec2 and vice versa"
  [vec1 vec2]
  (if (or (some nil?
                (for [a vec1 b [vec2]] (some #(= % a) b)))
          (some nil?
                (for [a vec2 b [vec1]] (some #(= % a) b))))
    false
    true))
(other_fun [1 2 3] [3 2 1]) ;=> true
(other_fun [1 2 3 4] [3 2 1]) ;=> false