What is the difference between set and hash-set in Clojure?

I can't find an explanation in the documentation or on the web for why there are two different functions that seem to do practically the same thing, apart from one accepting a collection and the other a list of arguments (and that difference could easily be bridged with (apply hash-set coll)).

Just checked the source code for set and hash-set. You are right that there is practically no difference, aside from one accepting multiple arguments and the other accepting a collection.
Here is the source, by the way:
For set
For hash-set

It is just for convenience. Same with vector vs vec. However it is not completely parallel for maps and lists:
(vector 0 1 2) => [0 1 2]
(apply vector (range 3)) => [0 1 2]
(vec (range 3)) => [0 1 2]
(hash-set 0 1 2) => #{0 1 2}
(apply hash-set (range 3)) => #{0 1 2}
(set (range 3)) => #{0 1 2}
(hash-map :a 1 :b 2) => {:b 2, :a 1}
(apply hash-map [:a 1 :b 2]) => {:b 2, :a 1}
(into {} [[:a 1] [:b 2]]) => {:a 1, :b 2}
(list 0 1 2) => (0 1 2)
(apply list (range 3)) => (0 1 2)
(into (list) (range 3)) => (2 1 0) ; *** reversed order ***

Since we can define each in terms of the other:
(defn hash-set [& args]
  (clojure.core/set args))
or
(defn set [coll]
  (apply clojure.core/hash-set coll))
... it is likely that both are defined separately for speed.
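A quick check of that equivalence, as a minimal sketch. The sample collection `coll` and the var names are made up for illustration:

```clojure
;; Hypothetical sample input, with a duplicate on purpose.
(def coll [3 1 2 3])

;; set takes one collection; hash-set takes the elements as separate arguments.
(def from-set      (set coll))
(def from-hash-set (apply hash-set coll))

;; Both calls produce equal sets, and the duplicate 3 collapses in both.
(println (= from-set from-hash-set #{1 2 3})) ; prints true
```

This only shows that the results are equal; it says nothing about which of the two is faster.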

Related

Juxtaposed transducers

Let's imagine we want to compute two different functions on some given input. How can we do that with transducers?
For example, let's say we have these two transducers:
(def xf-dupl (map #(* 2 %)))
(def xf-inc (map inc))
Now, I would like some function f that takes a collection of transducers and returns a new transducer that combines them, as follows:
(into [] (f [xf-dupl xf-inc]) (range 5))
; => [[0 2 4 6 8] [1 2 3 4 5]]
There should probably be a very simple solution to this, but I cannot find it.
Note: I have tried with cgrand/xforms library's transjuxt, but there I get the following
(into [] (x/transjuxt {:a xf-dupl :b xf-inc}) (range 5))
; => [{:a 0 :b 1}]
Thanks for your help!

Using cgrand/xforms you can define f as
;; assumes (require '[net.cgrand.xforms :as x])
(defn f
  [xfs]
  (comp
    (x/multiplex (zipmap (range) xfs))
    (x/by-key (x/into []))
    (map second)))
Calling f as you outlined in your question yields
user> (into [] (f [xf-dupl xf-inc]) (range 5))
[[0 2 4 6 8] [1 2 3 4 5]]
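For comparison, the same result can probably be had without any library by writing the combining transducer by hand. This is only a sketch under stated assumptions: `juxt-xf` is a made-up name, and it buffers every output in memory until the input is exhausted, so it is not suitable for unbounded input.

```clojure
(def xf-dupl (map #(* 2 %)))
(def xf-inc  (map inc))

(defn juxt-xf
  "Hypothetical helper: feeds every input to each transducer in xfs,
  collects each one's output in its own vector, and emits those vectors
  downstream once the input is exhausted."
  [xfs]
  (fn [rf]
    (let [steps (mapv #(% conj) xfs)                      ; one reducing fn per transducer
          accs  (volatile! (mapv (constantly []) xfs))]   ; one accumulator per transducer
      (fn
        ([] (rf))
        ([result]
         ;; Complete each inner transducer, then pass the vectors downstream.
         (let [finals (mapv (fn [step acc] (step (unreduced acc))) steps @accs)]
           (rf (reduce rf result finals))))
        ([result input]
         ;; Step every inner transducer that has not terminated early.
         (vswap! accs
                 (fn [as]
                   (mapv (fn [step acc]
                           (if (reduced? acc) acc (step acc input)))
                         steps as)))
         result)))))

(into [] (juxt-xf [xf-dupl xf-inc]) (range 5))
;; => [[0 2 4 6 8] [1 2 3 4 5]]
```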

into vs. partition

This makes sense:
user=> (into {} [[:a 1] [:b 2]])
{:a 1, :b 2}
But why does this generate an error?
user=> (into {} (partition 2 [:a 1 :b 2]))
ClassCastException clojure.lang.Keyword cannot be cast to java.util.Map$Entry clojure.lang.ATransientMap.conj (ATransientMap.java:44)
Just to be sure:
user=> (partition 2 [:a 1 :b 2])
((:a 1) (:b 2))
Does into have a problem with lazy sequences? If so, why?
Beyond an explanation of why this doesn't work, what is the recommended way to conj a sequence of key-value pairs like [:a 1 :b 2] into a map? (apply conj doesn't seem to work, either.)

You can apply the sequence to assoc:
(apply assoc {:foo 1} [:a 1 :b 2])
=> {:foo 1, :a 1, :b 2}
Does into have a problem with lazy sequences? If so, why?
No, into is commonly used with lazily evaluated sequences. This is lazy, but each key/value tuple is a vector, which is why it works when into is reducing the pairs into the map:
(into {} (map vector (range 3) (repeat :x)))
=> {0 :x, 1 :x, 2 :x}
This doesn't work because the key/value pairs are lists:
(into {} (map list (range 3) (repeat :x)))
;; throws ClassCastException
So the difference isn't laziness. into reduces the pairs into the map with conj, and for key/value pairs conj on a map accepts only vectors (or MapEntrys):
(conj {} [:a 1]) ;; ok
(conj {} (clojure.lang.MapEntry. :a 1)) ;; ok
(conj {} '(:a 1)) ;; not ok
Update: assoc wrapper for applying empty/nil sequences as suggested in comments:
(defn assoc*
  ([m] m)
  ([m k v & kvs]
   (apply assoc m k v kvs)))

The recommended way (assuming the seq argument is non-empty, as pointed out by the OP) would be
Clojure 1.9.0
user=> (apply assoc {} [:a 1 :b 2])
{:a 1, :b 2}
The version with partition doesn't work because the blocks that partition returns are seqs and those are not treated as map entries when conj'd on to a map the way vectors and actual map entries are.
E.g. (into {} (map vec) (partition 2 [:a 1 :b 2])) would work because here the pairs get converted to vectors before conjing.
Still the approach with assoc is preferable unless there's some particular circumstance that makes into convenient (like, say, if you have a bunch of transducers that you want to use for preprocessing your partition-generated pairs etc.).
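For the case where into is the more convenient choice, the conversion step could be wrapped up like this. A minimal sketch; `kvs->map` is a made-up name, not a core function:

```clojure
(defn kvs->map
  "Hypothetical helper: builds a map from a flat key/value sequence."
  [kvs]
  ;; partition yields seqs such as (:a 1); the (map vec) transducer turns
  ;; each one into a vector such as [:a 1], which conj accepts as an entry.
  (into {} (map vec) (partition 2 kvs)))

(kvs->map [:a 1 :b 2]) ;; => {:a 1, :b 2}
(kvs->map [])          ;; => {}
```

Unlike a bare (apply assoc {} kvs), this also accepts an empty sequence. Note that partition silently drops a trailing key that has no value.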

Clojure treats a 2-vec such as [:a 1] as equivalent to a MapEntry, doing what amounts to "automatic type conversion". I try to avoid this and always be explicit.
(first {:a 1}) => <#clojure.lang.MapEntry [:a 1]>
(conj {:a 1} [:b 2]) => <#clojure.lang.PersistentArrayMap {:a 1, :b 2}>
So we see that a MapEntry prints like a vector but has a different type (just like a Clojure seq prints like a list but has a different type). seq converts a Clojure map into a sequence of MapEntry objects, and first gets us the first one (most Clojure functions call (seq ...) on any input collections before any other processing).
Notice that conj does the inverse type conversion, treating the vector [:b 2] as if it were a MapEntry. However, conj won't perform automatic type conversion for a list or a seq:
(throws? (conj {:a 1} '(:b 2)))
(throws? (into {:a 1} '(:b 2)))
into has the same problem since it is basically just (reduce conj <1st-arg> <2nd-seq>).
The other answers already have 3 ways that work:
(assoc {} :b 2) => {:b 2}
(conj {} [:b 2]) => {:b 2}
(into {} [[:a 1] [:b 2]]) => {:a 1, :b 2}
However, I would avoid those and stick to either hash-map or sorted-map, both of which avoid the problem of empty input seqs:
(apply hash-map []) => {} ; works for empty input seq
(apply hash-map [:a 1 :b 2]) => {:b 2, :a 1}
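If your input sequence is a list of pairs, flatten is sometimes helpful (but only when no key or value is itself a sequential collection, because flatten removes every level of nesting):
(apply sorted-map (flatten [[:a 1] [:b 2]])) => {:a 1, :b 2}
(apply hash-map (flatten '((:a 1) (:b 2)))) => {:b 2, :a 1}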
P.S.
Please note that these are not the same:
java.util.Map$Entry (listed in jdk docs as "Map.Entry")
clojure.lang.MapEntry
P.P.S
If you already have a map and want to merge in a (possibly empty) sequence of key-value pairs, just use a combination of into and hash-map:
(into {:a 1} (apply hash-map [])) => {:a 1}
(into {:a 1} (apply hash-map [:b 2])) => {:a 1, :b 2}

clojure programmatically namespace map keys

I recently learned about namespaced maps in Clojure.
They are very convenient, and I was wondering what the idiomatic way of programmatically namespacing a map would be. Is there another syntax that I am not aware of?
;; works fine
(def m #:prefix{:a 1 :b 2 :c 3})
(:prefix/a m) ;; 1
;; how to programmatically prefix the map?
(def m {:a 1 :b 2 :c 3})
(prn #:prefix(:foo m)) ;; java.lang.RuntimeException: Unmatched delimiter: )

This function will do what you want:
(defn map->nsmap
  [m n]
  (reduce-kv (fn [acc k v]
               (let [new-kw (if (and (keyword? k)
                                     (not (qualified-keyword? k)))
                              (keyword (str n) (name k))
                              k)]
                 (assoc acc new-kw v)))
             {} m))
You can give it an actual namespace object:
(map->nsmap {:a 1 :b 2} *ns*)
=> #:user{:a 1, :b 2}
(map->nsmap {:a 1 :b 2} (create-ns 'my.new.ns))
=> #:my.new.ns{:a 1, :b 2}
Or give it a string for the namespace name:
(map->nsmap {:a 1 :b 2} "namespaces.are.great")
=> #:namespaces.are.great{:a 1, :b 2}
And it only alters keys that are non-qualified keywords, which matches how the #: reader syntax treats keyword keys:
(map->nsmap {:a 1, :foo/b 2, "dontalterme" 3, 4 42} "new-ns")
=> {:new-ns/a 1, :foo/b 2, "dontalterme" 3, 4 42}

Here is another example inspired by https://clojuredocs.org/clojure.walk/postwalk#example-542692d7c026201cdc327122
(require 'clojure.walk)
(defn map->nsmap
  "Apply the string n to the supplied structure m as a namespace."
  [m n]
  (clojure.walk/postwalk
    (fn [x]
      (if (keyword? x)
        (keyword n (name x))
        x))
    m))
Example:
(map->nsmap {:my-ns/a 1 :my-ns/b 2 :my-ns/c 3} "your-ns")
=> #:your-ns{:a 1, :b 2, :c 3}
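One thing to be aware of with the postwalk version: it visits every keyword in the structure, so keyword values and nested keys are re-namespaced too, and already-qualified keys are overwritten (as the example shows). The sketch below contrasts it with a keys-only variant; `qualify-all` and `qualify-keys` are made-up names used only for this comparison.

```clojure
(require '[clojure.walk :as walk])

(defn qualify-all
  "The postwalk approach: rewrites every keyword it finds."
  [m n]
  (walk/postwalk #(if (keyword? %) (keyword n (name %)) %) m))

(defn qualify-keys
  "Hypothetical keys-only variant: rewrites top-level keyword keys only."
  [m n]
  (into {}
        (map (fn [[k v]] [(if (keyword? k) (keyword n (name k)) k) v]))
        m))

;; The keyword value :x is changed by the first, left alone by the second.
(= (qualify-all  {:a :x} "ns") {:ns/a :ns/x}) ;; => true
(= (qualify-keys {:a :x} "ns") {:ns/a :x})    ;; => true
```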

Clojure: how to move vector elements in a map elegantly

In clojure, I am trying to accomplish the following logic:
Input:
{:a [11 22 33] :b [10 20 30]}, 2
Output:
{:a [11] :b [10 20 30 22 33]}
i.e. Move the last 2 elements from :a to :b
Is there a clojurish way for this operation?

Since you're effectively modifying both mappings in the map, it's probably easiest to explicitly deconstruct the map and just return the new map via a literal, using subvec and into for the vector manipulation:
(defn move [m n]
  (let [{:keys [a b]} m
        i (- (count a) n)
        left (subvec a 0 i)
        right (subvec a i)]
    {:a left :b (into b right)}))
(move {:a [11 22 33] :b [10 20 30]} 2)
;;=> {:a [11], :b [10 20 30 22 33]}
As a bonus, this implementation is idiomatic and fast: subvec runs in constant time, and into appends only the moved elements.
Alternatively, using the split-at' function from here, you could write it like this:
(defn split-at' [n v]
  [(subvec v 0 n) (subvec v n)])
(defn move [m n]
  (let [{:keys [a b]} m
        [left right] (split-at' (- (count a) n) a)]
    {:a left :b (into b right)}))

First, using subvec as in the other answers will throw an IndexOutOfBoundsException when the number of elements to be moved is greater than the size of the collection.
Secondly, the destructuring most answers use here couples the function to one specific data structure: a map with the keys :a and :b whose values are vectors. If you change one of the keys in the input, you also need to change it in the move function.
My solution follows:
(defn move [colla collb n]
  (let [newb (into (into [] collb) (take-last n colla))
        newa (into [] (drop-last n colla))]
    [newa newb]))
This should work for any sequential collection and returns a vector of two vectors, which makes it more reusable. Try:
(move (range 100000) (range 200000) 10000)
Edit:
Now you can use first and second to access the vector you need in the return.
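Both objections above (the out-of-bounds subvec and the hard-coded :a and :b keys) can also be addressed while keeping the map-in, map-out shape of the question. A minimal sketch, assuming the values are vectors; `move-last` and its parameter names are made up here:

```clojure
(defn move-last
  "Hypothetical helper: moves the last n elements of the vector under
  key `from` to the end of the vector under key `to`."
  [m from to n]
  (let [v (get m from [])
        ;; Clamp the split index to [0, (count v)] so subvec cannot throw.
        i (-> (- (count v) n) (max 0) (min (count v)))]
    (-> m
        (assoc from (subvec v 0 i))
        ;; fnil supplies an empty vector when `to` is missing from the map.
        (update to (fnil into []) (subvec v i)))))

(move-last {:a [11 22 33] :b [10 20 30]} :a :b 2)
;; => {:a [11], :b [10 20 30 22 33]}
(move-last {:a [11 22 33] :b [10 20 30]} :a :b 10)
;; => {:a [], :b [10 20 30 11 22 33]}
```

Asking for more elements than exist simply moves everything, which is one possible policy; throwing an error would be another.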

I would do it just a little differently than Josh:
(defn tx-vals [{:keys [a b]} num-to-move]
  {:a (drop-last num-to-move a)
   :b (concat b (take-last num-to-move a))})
(tx-vals {:a [11 22 33], :b [10 20 30]} 2)
=> {:a (11), :b (10 20 30 22 33)}
Update
Sometimes it may be more convenient to use the clojure.core/split-at function as follows:
(defn tx-vals-2 [{:keys [a b]} num-to-move]
  (let [num-to-keep (- (count a) num-to-move)
        [a-head a-tail] (split-at num-to-keep a)]
    {:a a-head
     :b (concat b a-tail)}))
If vectors are preferred on output (my favorite!), just do:
(defn tx-vals-3 [{:keys [a b]} num-to-move]
  (let [num-to-keep (- (count a) num-to-move)
        [a-head a-tail] (split-at num-to-keep a)]
    {:a (vec a-head)
     :b (vec (concat b a-tail))}))
to get the results (with data bound to {:a [11 22 33], :b [10 20 30]}):
(tx-vals-2 data 2) => {:a (11), :b (10 20 30 22 33)}
(tx-vals-3 data 2) => {:a [11], :b [10 20 30 22 33]}

(defn f [{:keys [a b]} n]
  (let [last-n (take-last n a)]
    {:a (into [] (take (- (count a) n) a))
     :b (into b last-n)}))
(f {:a [11 22 33] :b [10 20 30]} 2)
=> {:a [11], :b [10 20 30 22 33]}

In case the order of those items does not matter, here is my attempt:
(def m {:a [11 22 33] :b [10 20 30]})
(defn so-42476918 [{:keys [a b]} n]
  (zipmap [:a :b] (map vec (split-at (- (count a) n) (concat a b)))))
(so-42476918 m 2)
gives:
{:a [11], :b [22 33 10 20 30]}

I would go with an approach that differs a bit from the previous answers (technically it is the same, but it differs at the application level).
First of all, transferring data between two collections is quite a frequent task, so it at least deserves some special utility function for that in your library:
(defn transfer [from to n & {:keys [get-from put-to]
                             ;; :or takes the bound symbols as keys, not keywords
                             :or {get-from :start put-to :end}}]
  (let [f (if (= get-from :end)
            (partial split-at (- (count from) n))
            (comp reverse (partial split-at n)))
        [from swap] (f from)]
    [from (if (= put-to :start)
            (concat swap to)
            (concat to swap))]))
OK, it looks verbose, but it lets you transfer data from the start or end of one collection to the start or end of the other:
user> (transfer [1 2 3] [4 5 6] 2)
[(3) (4 5 6 1 2)]
user> (transfer [1 2 3] [4 5 6] 2 :get-from :end)
[(1) (4 5 6 2 3)]
user> (transfer [1 2 3] [4 5 6] 2 :put-to :start)
[(3) (1 2 4 5 6)]
user> (transfer [1 2 3] [4 5 6] 2 :get-from :end :put-to :start)
[(1) (2 3 4 5 6)]
What's left is to build your domain-specific function on top of it:
(defn move [data n]
  (let [[from to] (transfer (:a data) (:b data) n
                            :get-from :end
                            :put-to :end)]
    (assoc data
           :a (vec from)
           :b (vec to))))
user> (move {:a [1 2 3 4 5] :b [10 20 30 40] :c [:x :y]} 3)
{:a [1 2], :b [10 20 30 40 3 4 5], :c [:x :y]}

How to search and replace in a Clojure script data structure?

I would like to have a search and replace on the values only inside data structures:
(def str [1 2 3
          {:a 1
           :b 2
           1 3}])
and
(subst str 1 2)
to return
[2 2 3 {:a 2, :b 2, 1 3}]
Another example:
(def str2 {[1 2 3] x, {a 1 b 2} y} )
and
(subst str2 1 2)
to return
{[1 2 3] x, {a 1 b 2} y}
Since the 1's are keys in a map they are not replaced

One option is to use postwalk-replace from clojure.walk:
user> (require '[clojure.walk :refer [postwalk-replace]])
;; => nil
user> (def foo [1 2 3
                {:a 1
                 :b 2
                 1 3}])
;; => #'user/foo
user> (postwalk-replace {1 2} foo)
;; => [2 2 3 {2 3, :b 2, :a 2}]
However, this method has a downside: it replaces every matching element in the structure, including map keys, not only values. This may not be what you want.
Maybe this will do the trick...
(defn my-replace [smap s]
  (letfn [(trns [s]
            (map (fn [x]
                   (if (coll? x)
                     (my-replace smap x)
                     (or (smap x) x)))
                 s))]
    (if (map? s)
      (zipmap (keys s) (trns (vals s)))
      (trns s))))
Works with lists, vectors and maps:
user> (my-replace {1 2} foo)
;; => (2 2 3 {:a 2, :b 2, 1 3})
...Seems to work on arbitrary nested structures too:
user> (my-replace {1 2} [1 2 3 {:a [1 1 1] :b [3 2 1] 1 1}])
;; => (2 2 3 {:a (2 2 2), :b (3 2 2), 1 2})
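If vectors should stay vectors and only values (never map keys) should change, a type-preserving variant is possible. This is a sketch, assuming the data consists of plain Clojure maps, vectors, sets and lists; the name `subst` and its argument order follow the question, but the implementation is my own assumption:

```clojure
(defn subst
  "Replaces every value equal to `old` with `new` throughout `form`,
  leaving map keys untouched and keeping vectors, maps and sets as such."
  [form old new]
  (cond
    ;; Recurse into map values only; keys are copied over unchanged.
    (map? form)    (into (empty form)
                         (map (fn [[k v]] [k (subst v old new)]))
                         form)
    (vector? form) (mapv #(subst % old new) form)
    (set? form)    (into (empty form) (map #(subst % old new)) form)
    (seq? form)    (map #(subst % old new) form)
    (= form old)   new
    :else          form))

(subst [1 2 3 {:a 1 :b 2 1 3}] 1 2)
;; => [2 2 3 {:a 2, :b 2, 1 3}]
```

Because it compares with = instead of looking values up in a replacement map, it also behaves sensibly when the replacement is nil or false.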
