Why are the two sets not equal? - clojure

The question is about not=:
Clojure> (doc not=)
---------------------
clojure.core/not=
([x] [x y] [x y & more])
Same as (not (= obj1 obj2))
Clojure> (not= [1 2 3] [1 2 3])
false
Clojure> (not= '(1 2 3) '(1 2 3))
false
Clojure> (not= #(1 2 3) #(1 2 3))
true
Any suggestion is appreciated!

Sets use braces, #{...}; #(...) is an anonymous function:
user=> (not= #(1 2 3) #(1 2 3))
true
user=> (not= #{1 2 3} #{1 2 3})
false

Just for reference, the # character is the "dispatch macro" in the Clojure reader.
It tells the reader to treat the expression following it specially. It is not the only reader macro in Clojure
(quote ', deref @, metadata ^, syntax-quote ` and unquote ~ are others), but it is the one that dispatches on the character that follows it.
#( ) defines an anonymous function; short for (fn [< optional-args >] ...)
#" " defines a regular expression
#' refers to a var itself instead of the value in the var
#{ } defines a set
#_ discards the next form. This is like a super comment: the reader drops the form entirely, whereas (comment ...) is still read and evaluates to nil.
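As a quick illustration of the forms listed above, here is a small sketch that exercises each one at the REPL. The name add-one is made up for this example; everything else comes from clojure.core.

```clojure
;; #( ) : anonymous function literal (add-one is a hypothetical name)
(def add-one #(+ % 1))
(add-one 41)                  ;=> 42

;; #" " : regular expression literal
(re-find #"\d+" "abc123")     ;=> "123"

;; #' : the var itself, not the value it holds
(var? #'clojure.core/map)     ;=> true

;; #{ } : set literal
(contains? #{1 2 3} 2)        ;=> true

;; #_ : the next form is discarded by the reader
[1 #_2 3]                     ;=> [1 3]
```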

Others have commented that #(1 2 3) is not a set, but rather a function (that raises an error when invoked). The reason that #(1 2 3) is not equal to #(1 2 3) is that each occurrence of #(...) is compiled into a new anonymous function, and each such function is an instance of its own new Java class:
user=> (class #(1 2 3))
user$eval60$fn__61
user=> (class #(1 2 3))
user$eval64$fn__65
These classes have an equals method that doesn't consider objects of the other classes equal, even though they happen to have been defined in the same way. The method is in fact inherited from java.lang.Object:
user=> (for [m (.getMethods (class #(1 2 3)))
             :when (= (.getName m) "equals")]
         (.getDeclaringClass m))
(java.lang.Object)
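Because equality falls back to java.lang.Object's identity-based equals, a function is only ever equal to itself. A minimal sketch of that consequence (same-fn is a hypothetical name used only here):

```clojure
;; same-fn is a made-up name for this illustration.
(def same-fn #(+ % 1))

;; The very same function object compares equal to itself...
(= same-fn same-fn)            ;=> true
(identical? same-fn same-fn)   ;=> true

;; ...but two textually identical literals are two different objects.
(= #(+ % 1) #(+ % 1))          ;=> false
```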

Related

Clojure operators in higher-order functions

I've put together a higher order function that in certain cases calls a function parameter, but it seems to have different effects depending on the function. I was able to reproduce the same behaviour just with a simple function:
(defn foo [f a b] (f a b))
For "normal" functions it works fine:
user=> (foo list 2 3)
(2 3)
user=> (foo cons 1 '(2 3))
(1 2 3)
user=> (foo println 2 3)
2 3
nil
But for operators, it does not, it just seems to return the last element:
user=> (foo '+ 2 3)
3
user=> (foo '* 2 3)
3
user=> (foo '- 2 3)
3
Why is this the case?
' (or quote) is creating a symbol of + when you want the + function value itself: https://clojure.org/guides/weird_characters#_quote
(quote foo) => foo ;; symbol
'foo => foo ;; symbol
So the behavior of always returning the second argument comes from the fact that symbols (like keywords) also act as functions, typically used as a shorthand for get on associative structures (like maps), so these are functionally equivalent:
('foo 1 2) => 2
(get 1 'foo 2) => 2
The 2 happens to be in the position used for default values when the key isn't found in the associative structure.
This would be useful if you had a map with symbol keys, just like keywords:
('foo {'foo 1}) => 1
({'foo 1} 'foo) => 1
('foo {'bar 1} 2) => 2
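Putting this together with the foo function from the question, the following sketch shows the lookup behaviour next to the intended arithmetic. Using resolve to turn a symbol back into a var is an addition for illustration, not something the question asked for.

```clojure
(defn foo [f a b] (f a b))

;; A quoted + is a symbol, so calling it is a lookup with a default value.
(foo '+ 2 3)             ;=> 3
(get 2 '+ 3)             ;=> 3, the equivalent lookup: 2 is not a map, so the default 3 is returned

;; The unquoted + is the function itself.
(foo + 2 3)              ;=> 5

;; If all you have is the symbol, resolve looks up the var it names.
(foo (resolve '+) 2 3)   ;=> 5
```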
In clojure, "operators" like + are just normal functions. Don't use the single-quote and it'll work fine.
(ns tst.demo.core
  (:use tupelo.core tupelo.test))
(defn foo [f a b] (f a b))
(dotest
  (spyx (foo list 2 3))
  (spyx (foo println 2 3))
  (spyx (foo + 2 3))
  (spyx (foo * 2 3))
  (spyx (foo - 2 3)))
with results:
(foo list 2 3) => (2 3)
2 3 ; result of (println 2 3)
(foo println 2 3) => nil ; println always returns `nil`
(foo + 2 3) => 5
(foo * 2 3) => 6
(foo - 2 3) => -1
The helper function spyx just prints an expression, then its value.

Why does clojure.core/rest output a list when input is a vector?

Why does clojure.core/rest output a list when input is a vector?
This creates an unexpected effect:
(conj [1 2 3] 4)
; => [1 2 3 4]
(conj (rest [1 2 3]) 4)
; => (4 2 3)
I know that "it calls seq on its argument" from the docs which creates this effect. I don't understand why this is the desired effect. As a naïve user, I would expect (rest [1 2 3]) to behave like (subvec [1 2 3] 1). I know I could just use subvec for my use case. For the sake of learning, I would like to understand the rationale of rest, and use cases where outputting a list is desirable (even when the input is a vector).
The output of rest is NOT a list, but a seq, which is an even lower level abstraction. From the official documentation for rest:
Returns a possibly empty seq of the items after the first. Calls seq on its
argument.
The confusion arises from the fact that both are printed between parens, but if you look closely, they are different:
user=> (list? (rest [1 2 3]))
false
user=> (seq? (rest [1 2 3]))
true
How is a seq different from a list? Seqs are implemented through an interface (ISeq) that requires implementing first, rest and cons, but the details are up to the collection implementation. For instance, vectors use their own implementation:
user=> (class (rest [1 2 3]))
clojure.lang.PersistentVector$ChunkedSeq
user=> (class (rest '(1 2 3)))
clojure.lang.PersistentList
Lists are an implementation that extends the basic seq interface and builds on top of it. For instance, clojure.lang.PersistentList implements the Counted interface, which requires a constant-time version of count.
For a detailed description of the differences between Seqs and Lists, check these links:
Differences between a seq and a list
https://clojure.org/reference/sequences
You make a good case for rest on a vector returning a vector. The trouble is that rest is one of the fundamental operations on sequences, and a vector is not a sequence:
=> (seq? [1 2 3 4])
false
However, if rest can accept a seqable thing such as a vector, you could say that it ought to be able to return such.
What does it return?
=> (type (rest [1 2 3 4]))
clojure.lang.PersistentVector$ChunkedSeq
This gives every appearance of being a subvec wrapped in a seq call.
I know that "it calls seq on its argument"
That is correct. Seqs are implemented with an interface (ISeq) that requires implementing first, rest and cons.
rest takes anything seqable (for Clojure collections, anything that implements clojure.lang.Seqable). The reason for this is efficiency and simplicity.
Because different collections are built differently, the most efficient way of getting the first element and the rest differs from one collection to another.
That is why, when you convert a collection into a seq, the seq comes with the most efficient implementation of rest and the other operations for that collection.
I hope this was clear
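For the use case in the question (dropping the first element but keeping vector behaviour for conj), a short sketch of the options discussed above; the name v is only for illustration:

```clojure
(def v [1 2 3])

;; rest returns a seq, so conj adds at the front.
(conj (rest v) 4)          ;=> (4 2 3)

;; subvec returns a vector, so conj adds at the end.
(conj (subvec v 1) 4)      ;=> [2 3 4]

;; Converting the seq back into a vector also works, at the cost of a copy.
(conj (vec (rest v)) 4)    ;=> [2 3 4]

(vector? (subvec v 1))     ;=> true
(vector? (rest v))         ;=> false
```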
I agree that this behavior is unexpected and counterintuitive. As a workaround, I created the append and prepend functions in the Tupelo library.
From the docs, we see examples:
Clojure has the cons, conj, and concat functions, but it is not obvious how they should be used to add a new value to the beginning of a vector or list:
; Add to the end
> (concat [1 2] 3) ;=> IllegalArgumentException
> (cons [1 2] 3) ;=> IllegalArgumentException
> (conj [1 2] 3) ;=> [1 2 3]
> (conj [1 2] 3 4) ;=> [1 2 3 4]
> (conj '(1 2) 3) ;=> (3 1 2) ; oops
> (conj '(1 2) 3 4) ;=> (4 3 1 2) ; oops
; Add to the beginning
> (conj 1 [2 3] ) ;=> ClassCastException
> (concat 1 [2 3] ) ;=> IllegalArgumentException
> (cons 1 [2 3] ) ;=> (1 2 3)
> (cons 1 2 [3 4] ) ;=> ArityException
> (cons 1 '(2 3) ) ;=> (1 2 3)
> (cons 1 2 '(3 4) ) ;=> ArityException
Do you know what conj does when you pass it nil instead of a sequence? It silently replaces it with an empty list: (conj nil 5) ⇒ (5) This can cause you to accumulate items in reverse order if you aren’t aware of the default behavior:
(-> nil
(conj 1)
(conj 2)
(conj 3))
;=> (3 2 1)
These failures are irritating and unproductive, and the error messages don’t make it obvious what went wrong. Instead, use the simple prepend and append functions to add new elements to the beginning or end of a sequence, respectively:
(append [1 2] 3)     ;=> [1 2 3]
(append [1 2] 3 4)   ;=> [1 2 3 4]
(prepend 3 [2 1])    ;=> [3 2 1]
(prepend 4 3 [2 1])  ;=> [4 3 2 1]
Both prepend and append always return a vector result.
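To make the behaviour concrete without depending on the library, here is a rough sketch of how such helpers could be written with clojure.core alone. The names append* and prepend* are hypothetical, and this is not Tupelo's implementation (which may validate its arguments differently); it only reproduces the results shown above.

```clojure
;; Hypothetical stand-ins, NOT the Tupelo functions.
(defn append*
  "Returns a vector of coll's items followed by xs."
  [coll & xs]
  (into (vec coll) xs))

(defn prepend*
  "Takes one or more items followed by a collection (the last argument);
  returns a vector of the items followed by the collection's contents."
  [& args]
  (let [coll (last args)
        xs   (butlast args)]
    (into (vec xs) coll)))

(append* [1 2] 3 4)     ;=> [1 2 3 4]
(prepend* 4 3 [2 1])    ;=> [4 3 2 1]
(append* '(1 2) 3)      ;=> [1 2 3], a vector even for list input
```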

Single duplicate in a vector

Given a list of 5 integers, each from 1 to 10, how do I check whether exactly one value appears twice (and no value appears more often)?
For example
(check '(2 2 4 5 7))
yields yes, while
(check '(2 1 4 4 4))
or
(check '(1 2 3 4 5))
yields no
Here is a solution using frequencies to count occurrences and filter to keep the values that occur exactly twice:
(defn only-one-pair? [coll]
  (->> coll
       frequencies                 ; map with counts of each value in coll
       (filter #(= (second %) 2))  ; keep values that have exactly 2 occurrences
       count                       ; number of unique values with exactly 2 occurrences
       (= 1)))                     ; true if only one unique val in coll has 2 occurrences
Which gives:
user=> (only-one-pair? '(2 1 4 4 4))
false
user=> (only-one-pair? '(2 2 4 5 7))
true
user=> (only-one-pair? '(1 2 3 4 5))
false
Intermediate steps in the function to get a sense of how it works:
user=> (->> '(2 2 4 5 7) frequencies)
{2 2, 4 1, 5 1, 7 1}
user=> (->> '(2 2 4 5 7) frequencies (filter #(= (second %) 2)))
([2 2])
user=> (->> '(2 2 4 5 7) frequencies (filter #(= (second %) 2)) count)
1
Per a suggestion, the function could use a more descriptive name and it's also best practice to give predicate functions a ? at the end of it in Clojure. So maybe something like only-one-pair? is better than just check.
Christian Gonzalez's answer is elegant, and great if you are sure you are operating on a small input. However, it is eager: it forces the entire input list even when it could in principle tell sooner that the result will be false. This is a problem if the list is very large, or if it is a lazy list whose elements are expensive to compute - try it on (list* 1 1 1 (range 1e9))! I therefore present below an alternative that short-circuits as soon as it finds a second duplicate:
(defn exactly-one-duplicate? [coll]
  (loop [seen #{}
         xs (seq coll)
         seen-dupe false]
    (if-not xs
      seen-dupe
      (let [x (first xs)]
        (if (contains? seen x)
          (and (not seen-dupe)
               (recur seen (next xs) true))
          (recur (conj seen x) (next xs) seen-dupe))))))
Naturally it is rather more cumbersome than the carefree approach, but I couldn't see a way to get this short-circuiting behavior without doing everything by hand. I would love to see an improvement that achieves the same result by combining higher-level functions.
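One possible middle ground, offered only as a sketch: reduce together with reduced also stops early, while keeping the bookkeeping in a single accumulator. The name one-dupe? is made up here, and the function is meant to give the same answers as the loop above.

```clojure
;; one-dupe? is a hypothetical name; the accumulator is [seen-set dupe-count].
(defn one-dupe? [coll]
  (let [[_ dupes]
        (reduce (fn [[seen dupes] x]
                  (if (contains? seen x)
                    (if (= dupes 1)
                      (reduced [seen 2])      ; second duplicate: stop scanning
                      [seen (inc dupes)])
                    [(conj seen x) dupes]))
                [#{} 0]
                coll)]
    (= dupes 1)))

(one-dupe? '(2 2 4 5 7))   ;=> true
(one-dupe? '(2 1 4 4 4))   ;=> false
(one-dupe? '(1 2 3 4 5))   ;=> false
;; Stops at the third 1, so the huge tail is never traversed.
(one-dupe? (list* 1 1 1 (range 1e9)))   ;=> false
```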
(letfn [(check [xs] (->> xs distinct count (= (dec (count xs)))))]
  (clojure.test/are [input output]
    (= (check input) output)
    [1 2 3 4 5] false
    [1 2 1 4 5] true
    [1 2 1 2 1] false))
but I prefer a shorter version (limited to lists of exactly 5 items):
(defn check [xs] (->> xs distinct count (= 4)))
In answer to Alan Malloy's plea, here is a somewhat combinatory solution:
(defn check [coll]
  (let [accums (reductions conj #{} coll)]
    (->> (map contains? accums coll)
         (filter identity)
         (= (list true)))))
This creates a lazy sequence of the accumulating set; tests it against each corresponding new element; filters for the true cases (those where the element is already present); and tests whether there is exactly one of them.
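The steps can be watched one at a time on a small input. This is only an illustrative trace; the name sample is made up:

```clojure
(def sample [1 2 1 4 5])

;; One accumulated set per prefix of the input, starting with the empty set.
(count (reductions conj #{} sample))                  ;=> 6

;; Is each element already in the set built from the elements before it?
(map contains? (reductions conj #{} sample) sample)   ;=> (false false true false false)

;; Keep only the hits and compare against a single true.
(= (list true)
   (filter identity
           (map contains? (reductions conj #{} sample) sample)))   ;=> true
```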
It is lazy, but does duplicate the business of scanning the given collection. I tried it on Alan Malloy's example:
=> (check (list* 1 1 1 (range 1e9)))
false
This returns instantly. Extending the range makes no difference:
=> (check (list* 1 1 1 (range 1e20)))
false
... also returns instantly.
Edited to accept Alan Malloy's suggested simplification, which I have had to modify to avoid what appears to be a bug in Clojure 1.10.0.
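You can do something like this (note that it returns true whenever at least one value occurs exactly twice, so a list with two pairs also passes):
(defn check [my-list]
  (not (empty? (filter (fn [[k v]] (= v 2)) (frequencies my-list)))))
(check '(2 4 5 7)) ;=> false
(check '(2 2 4 5 7)) ;=> true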
Similar to others using frequencies - just apply twice
(-> coll
frequencies
vals
frequencies
(get 2)
(= 1))
Positive case:
(def coll '(2 2 4 5 7))
frequencies=> {2 2, 4 1, 5 1, 7 1}
vals=> (2 1 1 1)
frequencies=> {2 1, 1 3}
(get (frequencies #) 2)=> 1
Negative case:
(def coll '(2 1 4 4 4))
frequencies=> {2 1, 1 1, 4 3}
vals=> (1 1 3)
frequencies=> {1 2, 3 1}
(get (frequencies #) 2)=> nil
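Wrapped up as a predicate, the same pipeline might look like the sketch below; the name one-pair? is hypothetical.

```clojure
;; one-pair? is a made-up name for the double-frequencies idea above.
(defn one-pair? [coll]
  (-> coll
      frequencies   ; value -> number of occurrences
      vals          ; just the occurrence counts
      frequencies   ; occurrence count -> how many values have that count
      (get 2)       ; how many values occur exactly twice (nil if none)
      (= 1)))

(one-pair? '(2 2 4 5 7))   ;=> true
(one-pair? '(2 1 4 4 4))   ;=> false
(one-pair? '(1 2 3 4 5))   ;=> false
```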

clojure: pop and push

I'm looking for a sequential data structure that is well suited to the following operation. The length of the list remains constant; it will never be longer or shorter than a fixed length.
Omit the first item and add x to the end.
(0 1 2 3 4 5 6 7 8 9)
(pop-and-push "10")
(1 2 3 4 5 6 7 8 9 10)
There is only one other reading-operation that has to be done equally often:
(last coll)
pop-and-push could be implemented like this:
(defn pop-and-push [coll x]
(concat (pop coll) ["x"]))
(unfortunately this does not work with sequences produced by e.g. range, it just pops when the sequence declared by the literals '(..) is passed.)
but is this optimal?
The main issue here (once we change "x" to x) is that concat returns a lazy-seq, and lazy-seqs are invalid args to pop.
user=> (defn pop-and-push [coll x] (concat (pop coll) [x]))
#'user/pop-and-push
user=> (pop-and-push [1 2 3] 4)
(1 2 4)
user=> (pop-and-push *1 5)
ClassCastException clojure.lang.LazySeq cannot be cast to clojure.lang.IPersistentStack clojure.lang.RT.pop (RT.java:730)
My naive preference would be to use a vector. This function is easy to implement with subvec.
user=> (defn pop-and-push [v x] (conj (subvec (vec v) 1) x))
#'user/pop-and-push
user=> (pop-and-push [1 2 3] 4)
[2 3 4]
user=> (pop-and-push *1 5)
[3 4 5]
As you can see, this version can actually operate on its own return value.
As suggested in the comments, PersistentQueue is made for this situation:
user=> (defn pop-and-push [v x] (conj (pop v) x))
#'user/pop-and-push
user=> (pop-and-push (into clojure.lang.PersistentQueue/EMPTY [1 2 3]) 4)
#object[clojure.lang.PersistentQueue 0x50313382 "clojure.lang.PersistentQueue@7c42"]
user=> (into [] *1)
[2 3 4]
user=> (pop-and-push *2 5)
#object[clojure.lang.PersistentQueue 0x4bd31064 "clojure.lang.PersistentQueue@8023"]
user=> (into [] *1)
[3 4 5]
The PersistentQueue data structure, though less convenient to use in some ways, is actually optimized for this usage.
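Since the queue does not print its contents, a short sketch of how to inspect it may help, along with how the question's (last coll) read behaves on each structure. The names q and q2 are made up for the example.

```clojure
(def q  (into clojure.lang.PersistentQueue/EMPTY [1 2 3]))
(def q2 (conj (pop q) 4))   ; drop from the front, add at the back

(seq q2)     ;=> (2 3 4)
(count q2)   ;=> 3
(peek q2)    ;=> 2, peek on a queue returns the FRONT element
(last q2)    ;=> 4, works, but walks the whole sequence
(vec q)      ;=> [1 2 3], the original queue is unchanged

;; On a vector, peek returns the LAST element without a linear scan.
(peek [2 3 4])   ;=> 4
```

If reading the newest element is as frequent as the push, the vector version's peek may be the more convenient fit; the queue is the better match when the front element is what gets read.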

Anonymous function in Clojure

Maybe this sounds like a ridiculous question, but it is still not exactly clear to me where the # of an anonymous function should go. In this example I filter the divisors of a positive number:
(filter #(zero? (mod 6 %)) (range 1 (inc 6))) ;;=> (1 2 3 6)
but putting the # right before the (mod 6 %) will cause an error. Is there a rule where in such a context my anonymous function begins, and why should the # come before (zero? ...?
This shows how the #(...) syntax is just a shorthand for (fn [x] ...):
(defn divides-6 [arg]
(zero? (mod 6 arg)))
(println (filter divides-6 (range 1 10))) ; normal function
(println (filter (fn [x] (zero? (mod 6 x))) (range 1 10))) ; anonymous function
(println (filter #(zero? (mod 6 %)) (range 1 10))) ; shorthand version
;=> (1 2 3 6)
;=> (1 2 3 6)
;=> (1 2 3 6)
Using defn is just shorthand for (def divides-6 (fn [x] ...)) (i.e. the def and fn parts are combined into defn to save a little typing). We don't need to define a global name divides-6 if we are only going to use the function once. We can just define the function inline right where it will be used. The #(...) syntax is just a shorthand version as the example shows.
Note that the full name of the form #(...) is the "anonymous function literal". You may also see it called the "function reader macro" or just the "function macro". The syntax (fn [x] ...) is called the "function special form".
Clojure's filter function takes one or two arguments; either way, the first argument must be a function. So there's no "rule" where the anonymous function is defined, as long as ultimately, the first argument to filter is a function.
However, in this case, zero? does not return a function, so (zero? #(mod 6 %)) would not give filter the predicate it needs. In fact, (zero? #(mod 6 %)) doesn't make sense on its own either, because zero? does not take a function as an argument: it throws an error before filter is ever called.
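A small sketch of the two placements side by side. The try/catch wrapper and the :not-a-number keyword are only there to make the failure visible without aborting the REPL session.

```clojure
;; The whole predicate is the function: filter calls it once per element.
(filter #(zero? (mod 6 %)) (range 1 7))   ;=> (1 2 3 6)

;; With # moved inward, zero? receives a function object instead of a number.
(try
  (zero? #(mod 6 %))
  (catch ClassCastException _
    :not-a-number))                       ;=> :not-a-number
```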
filter takes two parameters:
a predicate (a filter, which is a function), and
a collection
So, in a simple way:
(defn my-predicate [x]
  (zero? (mod 6 x)))
(def my-collection
  (range 1 (inc 6)))
(filter
  my-predicate
  my-collection)
#( is a Clojure reader macro: the reader rewrites the code for you before it is evaluated. We can see the result by quoting the form (macroexpand-1 returns it unchanged here, since the rewrite already happened at read time):
(macroexpand-1 '#(zero? (mod 6 %)))
; (fn* [p1__4777#] (zero? (mod 6 p1__4777#)))
or, in more readable code:
(fn* [x]
  (zero?
    (mod 6 x)))
On a single value of a collection, say 3, we can apply the above function:
( (fn* [x]
(zero?
(mod 6 x)))
3)
; true
And then back to the # version of our code, the input parameter of a function is implicitly %, so:
(
#(zero? (mod 6 %))
3)
; true
And finally, back to your original code, you can see why the # has to wrap the whole (zero? ...) expression: that expression is the predicate function passed to filter:
(filter
#(zero? (mod 6 %))
(range 1 (inc 6)))
; (1 2 3 6)
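To see that the rewriting really happens in the reader, one can read the text of the literal as data with read-string. This is a sketch for illustration; the generated parameter name (p1__...#) differs from run to run, so only the shape of the form is checked here. The name form is made up.

```clojure
;; Reading the literal as data yields an ordinary list that starts with fn*.
(def form (read-string "#(zero? (mod 6 %))"))

(first form)        ;=> fn*
(count form)        ;=> 3: fn*, the parameter vector, and the body

;; Evaluating that list gives a function that behaves like the literal.
((eval form) 3)     ;=> true
((eval form) 4)     ;=> false
```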