Novice question, but I don't really understand why there are so many operations for constructing maps in Clojure.
You have conj, assoc and merge, but they seem to more or less do the same thing?
(assoc {:a 1 :b 2} :c 3)
(conj {:a 1 :b 2} {:c 3})
(merge {:a 1 :b 2} {:c 3})
What's really the difference, and why are all these functions required when they do more or less the same thing?
assoc and conj behave very differently for other data structures:
user=> (assoc [1 2 3 4] 1 5)
[1 5 3 4]
user=> (conj [1 2 3 4] 1 5)
[1 2 3 4 1 5]
If you are writing a function that can handle multiple kinds of collections, then your choice will make a big difference.
Treat merge as a maps-only function (it's similar to conj for other collections).
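To make the difference concrete, here is a small sketch of how conj and assoc treat a few collection types. The literal values are only illustrative.

```clojure
;; conj adds an element wherever is natural for the collection type
(conj [1 2] 3)          ; vector: appended at the end      => [1 2 3]
(conj '(1 2) 3)         ; list: added at the front         => (3 1 2)
(conj #{1 2} 3)         ; set: a set containing 1, 2 and 3
(conj {:a 1} [:b 2])    ; map: takes a [key value] pair    => {:a 1, :b 2}

;; assoc always takes a key (or a vector index) followed by a value
(assoc [1 2] 0 :x)      ; => [:x 2]
(assoc {:a 1} :a :x)    ; => {:a :x}
```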
My opinion:
assoc - use when you are 'changing' existing key/value pairs
conj - use when you are 'adding' new key/value pairs
merge - use when you are combining two or more maps
Actually these functions behave quite differently when used with maps.
conj:
Firstly, (conj {:a 1 :b 2} :c 3) does not work at all (neither with 1.1 nor with 1.2; an IllegalArgumentException is thrown). Only a handful of types can be conjed onto maps, namely two-element vectors, clojure.lang.MapEntrys (which are basically equivalent to two-element vectors) and maps.
Note that seq of a map comprises a bunch of MapEntrys. Thus you can do e.g.
(into a-map (filter a-predicate another-map))
(note that into uses conj -- or conj!, when possible -- internally). Neither merge nor assoc allows you to do that.
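A minimal sketch of the three kinds of argument conj accepts on a map, plus the into/filter pattern. a-map and another-map stand in for the placeholders above; the even? predicate is just an assumption for illustration.

```clojure
(def a-map {:a 1})
(def another-map {:b 2 :c 3 :d 4})

;; conj accepts a two-element vector, a map entry, or a whole map
(conj a-map [:b 2])                ; => {:a 1, :b 2}
(conj a-map (first another-map))   ; a MapEntry taken from another map
(conj a-map {:b 2 :c 3})           ; => {:a 1, :b 2, :c 3}

;; filtering a map yields a seq of map entries, which into conjes onto a-map
(into a-map (filter (fn [[_ v]] (even? v)) another-map))
; => {:a 1, :b 2, :d 4}
```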
merge:
This is almost exactly equivalent to conj, but it replaces its nil arguments with {} -- empty hash maps -- and thus will return a map when the first "map" in the chain happens to be nil.
(apply conj [nil {:a 1} {:b 2}])
; => ({:b 2} {:a 1}) ; clojure.lang.PersistentList
(apply merge [nil {:a 1} {:b 2}])
; => {:a 1 :b 2} ; clojure.lang.PersistentArrayMap
Note there's nothing (except the docstring...) to stop the programmer from using merge with other collection types. If one does that, weirdness ensues; not recommended.
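A few more merge calls, as a quick sketch of how key collisions and nil arguments behave (values are illustrative):

```clojure
(merge {:a 1 :b 2} {:b 3} {:c 4})   ; rightmost value for :b wins => {:a 1, :b 3, :c 4}
(merge nil {:a 1})                  ; => {:a 1}
(merge {:a 1} nil)                  ; => {:a 1}
(merge nil nil)                     ; all arguments nil => nil
```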
assoc:
Likewise, (assoc {:a 1 :b 2} {:c 3}) won't work; it throws an IllegalArgumentException. assoc takes a map argument followed by an even number of arguments -- those in odd positions (let's say the map is at position 0) are keys, those at even positions are values. I find that I assoc things onto maps more often than I conj, though when I conj, assoc would feel cumbersome. ;-)
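For reference, a short sketch of the argument shapes assoc does accept:

```clojure
(assoc {:a 1 :b 2} :c 3)        ; one key/value pair      => {:a 1, :b 2, :c 3}
(assoc {:a 1 :b 2} :c 3 :d 4)   ; several pairs at once   => {:a 1, :b 2, :c 3, :d 4}
(assoc {:a 1 :b 2} :a 10)       ; existing key is replaced => {:a 10, :b 2}
(assoc nil :a 1)                ; nil is treated as an empty map => {:a 1}
```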
merge-with:
For the sake of completeness, this is the final basic function dealing with maps. I find it extremely useful. It works as the docstring indicates; here's an example:
(merge-with + {:a 1} {:a 3} {:a 5})
; => {:a 9}
Note that if a map contains a "new" key, which hasn't occurred in any of the maps to the left of it, the merging function will not be called. This is occasionally frustrating, but in 1.2 a clever reify can provide a map with non-nil "default values".
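The "new key" behaviour is easiest to see with a collecting function. The sketch below also shows one possible workaround (not the reify approach mentioned above): wrap every value in a vector first. collect-vals is a hypothetical helper name.

```clojure
(merge-with + {:a 1} {:a 3 :b 10})        ; => {:a 4, :b 10}

;; :b appears only once, so conj is never called for it and it stays a bare 3
(merge-with conj {:a [1]} {:a 2 :b 3})    ; => {:a [1 2], :b 3}

;; hypothetical helper: wrap each value in a vector, then concatenate on collision
(defn collect-vals [& maps]
  (apply merge-with into
         (map (fn [m] (into {} (for [[k v] m] [k [v]]))) maps)))

(collect-vals {:a 1} {:a 2 :b 3})         ; => {:a [1 2], :b [3]}
```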
Since maps are such a ubiquitous data structure in Clojure, it makes sense to have multiple tools for manipulating them. The various different functions are all syntactically convenient in slightly different circumstances.
My personal take on the specific functions you mention :
I use assoc to add a single value to a map given a key and value
I use merge to combine two maps or add multiple new entries at once
I don't generally use conj with maps at all as I associate it mentally with lists
I have a sequence that looks like this: ({:a 1 :b "lorem"} {:a 2 :b "ipsum"}) and I want to convert that to a map of maps using the value of :a as the value of the key in the new map. The result I am expecting is {:1 {:a 1 :b "lorem"} :2 {:a 2 :b "ipsum"}}.
And maybe this is not idiomatic Clojure; I'm still learning. I basically receive a large sequence, and I will be looking up values in this sequence by a certain value in each map, so I want to use a map to make the lookup O(1).
In C#, on an IEnumerable, you can call .ToDictionary(x => x.SomeProperty) which would return a Dictionary of key value pairs, using the value of SomeProperty as the key. And this has lookup performance of O(1) instead of the typical O(N) for a list/sequence.
This should do the transform you are after:
(def latin '({:a 1 :b "lorem"} {:a 2 :b "ipsum"}))
(defn transform [kw in]
  (zipmap (map kw in) in))
=> (transform :a latin)
{1 {:a 1, :b "lorem"}, 2 {:a 2, :b "ipsum"}}
I hesitated to change the number to a keyword, as I'm not sure why you would want to do that...
... edit - because you can always do an indexed (O(1)) retrieve like so:
(get (transform :a latin) 1)
1 doesn't work at the front because it can't be used as a function (unlike :1), however get is a function whose second argument is the search key when the first argument is a map.
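For comparison, here is a sketch of the same indexing idea written with into and juxt, including a variant that produces the :1-style keyword keys the question asked for. index-by, by-a and by-kw are hypothetical names; if two items share a key, the later one wins, as with zipmap.

```clojure
(def latin '({:a 1 :b "lorem"} {:a 2 :b "ipsum"}))

;; hypothetical helper: build a map from (f item) to item
(defn index-by [f coll]
  (into {} (map (juxt f identity) coll)))

(def by-a (index-by :a latin))
(get by-a 1)     ; => {:a 1, :b "lorem"}
(by-a 2)         ; a map is also a function of its keys => {:a 2, :b "ipsum"}

;; keyword keys such as :1, if they are really wanted
(def by-kw (index-by (comp keyword str :a) latin))
(:1 by-kw)       ; => {:a 1, :b "lorem"}
```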
I'm trying to retrieve an entire hash from a vector of hashes based on whether or not it has a specific value in a field.
(def foo {:a 1, :b 2})
(def bar {:a 3, :b 4})
(def baz [foo bar])
In baz, I want to return the entire hash where :a is 3, so the result will be {:a 3, :b 4}. I have tried get, get-in and find, but those rely on keys and do not return the entire hash. I've also tried some suggestions from this question, but they don't return the hash either.
filter to the rescue!
hello.core> (def foo {:a 1, :b 2})
#'hello.core/foo
hello.core> (def bar {:a 3, :b 4})
#'hello.core/bar
hello.core> (def baz [foo bar])
#'hello.core/baz
hello.core> (filter #(= (:a %) 3) baz)
({:a 3, :b 4})
#(= (:a %) 3) is a short form for creating an anonymous function that takes one argument, named %, in which it looks up the key :a and returns true if that matches the value 3. Any entry in the vector baz which passes this test will make it into the output.
PS: a note on terminology: that data structure is typically called a "map" because it maps one key to one value. This is terribly confusing because there is also a function named map, which transforms every member of a sequence with a function.
filter definitely does the job as Arthur mentioned. Just for the sake of completeness these are 2 other solutions which differ in 2 aspects from filter:
(some #(when (= 3 (:a %)) %) baz)
(first (drop-while #(not= 3 (:a %)) baz))
These stop searching through the collection as soon as they find the first element that meets your requirements (hence they use fewer resources), and
because of that, in contrast to filter, they give you only the first matching element rather than every element in the collection that passes your requirements (in case several elements match).
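A small sketch of the difference when more than one element matches. The third map in baz is added here only for illustration.

```clojure
(def baz [{:a 1 :b 2} {:a 3 :b 4} {:a 3 :b 5}])

(filter #(= 3 (:a %)) baz)                 ; every match => ({:a 3, :b 4} {:a 3, :b 5})
(some #(when (= 3 (:a %)) %) baz)          ; first match => {:a 3, :b 4}
(first (drop-while #(not= 3 (:a %)) baz))  ; first match => {:a 3, :b 4}
(some #(when (= 9 (:a %)) %) baz)          ; no match    => nil
```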
I am working over a map using keys and vals. If I run the same code multiple times, will it always return the same collection in the same order? I tried (keys a), and every time I run it, it returns (:c :b :a). But I want to confirm it ALWAYS returns the same.
(def a {:a 1 :b 2 :c 3})
(keys a)
(vals a)
Not all Clojure maps retain the order of entries. If you want to retain insertion order you would need to use clojure.lang.PersistentArrayMap (produced by array-map or by a small map literal). Keep in mind that an array map is intended for a small number of entries, that certain operations will perform poorly with a larger number of entries, and that assoc-ing onto an array map past a small size threshold returns a hash map.
If you want to maintain sorted order (but not insertion order) then you would need to use a sorted map (produced by sorted-map).
A hash map (produced by hash-map) gives no guarantees in respect of order.
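Clojure's map literal produces an array map when the map is small (up to eight entries in current implementations); larger literals produce a hash map.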
(class {:a 1 :b 2 :c 3})
; => clojure.lang.PersistentArrayMap
; zipmap's returned map type will vary depending on the number of entries in the map
(class (zipmap (range 0 1000) (range 1000 2000)))
; => clojure.lang.PersistentHashMap
(class (zipmap (range 1 3) (range 3 5)))
; => clojure.lang.PersistentArrayMap
(class (sorted-map :a 1 :b 2 :c 3))
; => clojure.lang.PersistentTreeMap
(class (hash-map :a 1 :b 2 :c 3))
; => clojure.lang.PersistentHashMap
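The switch from array map to hash map also happens as a map grows through assoc. A sketch follows; the exact threshold is an implementation detail (about eight entries in the versions I know of), so the sizes 3 and 20 are chosen to sit well on either side of it. small and large are illustrative names.

```clojure
(def small (zipmap (range 3) (range 3)))

;; keep assoc-ing entries onto the small map until it holds 20
(def large (reduce (fn [m i] (assoc m i i)) small (range 3 20)))

(class small)   ; => clojure.lang.PersistentArrayMap
(class large)   ; => clojure.lang.PersistentHashMap
(keys small)    ; insertion order is still visible here => (0 1 2)
```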
You would also need to be careful not to inadvertently change the map type, e.g.:
(class (into {} (map #(vector (key %) (inc (val %))) (sorted-map :a 1))))
; => clojure.lang.PersistentArrayMap
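One way to avoid that accidental change, sketched under the assumption that you want the result to have the same map type as the input, is to pour into (empty m) instead of {}. map-vals-keeping-type is a hypothetical helper name.

```clojure
;; hypothetical helper: apply f to every value, keeping the input's map type
(defn map-vals-keeping-type [f m]
  (into (empty m) (map (fn [[k v]] [k (f v)]) m)))

(def result (map-vals-keeping-type inc (sorted-map :b 1 :a 2)))

(class result)   ; => clojure.lang.PersistentTreeMap
(keys result)    ; still sorted => (:a :b)
```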
It is best not to rely on the order of entries in a map, so if you can think of a way to achieve what you want without relying on the order of entries in a map then you should strongly consider it.
Maps are unordered and no order is guaranteed. So, don't write code that depends on it, even with array-map.
Any particular instance of a map is guaranteed to return you entries in the same order (via seq, keys, vals, etc) such that (keys m) and (vals m) "match up".
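The "match up" guarantee can be checked directly. A small sketch, with m as an arbitrary example map:

```clojure
(def m {:a 1 :b 2 :c 3})

;; keys and vals walk the same underlying seq, so pairing them rebuilds the map
(= m (zipmap (keys m) (vals m)))             ; => true
(= (seq m) (map vector (keys m) (vals m)))   ; => true

;; a sorted map, by contrast, does define its order
(keys (sorted-map :c 3 :a 1 :b 2))           ; => (:a :b :c)
```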
If you need an ordered map, try https://github.com/amalloy/ordered.
How can I serialize and deserialize a sorted map in Clojure?
For example:
(sorted-map :a 1 :b 2 :c 3 :d 4 :e 5)
{:a 1, :b 2, :c 3, :d 4, :e 5}
What I've noticed:
A sorted map is displayed in the same way as an unsorted map in the REPL. This seems convenient at times but inconvenient at others.
EDN does not have support for sorted maps.
Clojure does support custom tagged literals for the reader.
Additional resources:
Correct usage of data-readers
Clojure reader literals
Same question with two usable answers: Saving+reading sorted maps to a file in Clojure.
A third answer would be to set up custom reader literals. You'd print sorted maps as something like
;; non-namespaced tags are meant to be reserved
#my.ns/sorted-map {:foo 1 :bar 2}
and then use an appropriate data reader function when reading (converting from a hash map to a sorted map). There's a choice to be made as to whether you wish to deal with custom comparators (which is a problem impossible to solve in general, but one can of course choose to deal with special cases).
clojure.edn/read accepts an optional opts map which may contain a :readers key; the value at that key is then taken to be a map specifying which data readers to use for which tags. See (doc clojure.edn/read) for details.
As for printing, you could install a custom method for print-method or use a custom function for printing your sorted maps. I'd probably go with the latter solution -- implementing built-in protocols / multimethods for built-in types is not a great idea in general, so even when it seems reasonable in a particular case it requires extra care etc.; simpler to use one's own function.
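Here is a rough round-trip sketch of that approach, using a plain printing function and clojure.edn/read-string (which takes the same opts map as clojure.edn/read). The tag my.ns/sorted-map and the helper names sorted-map->str and str->data are assumptions for illustration; only top-level sorted maps are tagged, and custom comparators are ignored.

```clojure
(require '[clojure.edn :as edn])

;; hypothetical printer: prefix the usual map text with a namespaced tag
(defn sorted-map->str [m]
  (str "#my.ns/sorted-map " (pr-str m)))

;; hypothetical reader: rebuild a sorted map from the plain map the reader produces
(defn str->data [s]
  (edn/read-string {:readers {'my.ns/sorted-map #(into (sorted-map) %)}} s))

(def text (sorted-map->str (sorted-map :foo 1 :bar 2)))
(def restored (str->data text))

(sorted? restored)   ; => true
(keys restored)      ; => (:bar :foo)
```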
Update:
Demonstrating how to reuse IPersistentMap's print-method impl cleanly, as promised in a comment on David's answer:
(def ^:private ipm-print-method
  (get (methods print-method) clojure.lang.IPersistentMap))
(defmethod print-method clojure.lang.PersistentTreeMap
  [o ^java.io.Writer w]
  (.write w "#sorted/map ")
  (ipm-print-method o w))
With this in place:
user=> (sorted-map :foo 1 :bar 2)
#sorted/map {:bar 2, :foo 1}
In data_readers.clj:
{sorted/map my-app.core/create-sorted-map}
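Note: I hoped the following would work, but it did not, most likely because the values in data_readers.clj must be symbols naming Clojure vars, and a Java static method is not a var: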
{sorted/map clojure.lang.PersistentTreeMap/create}
Now, in my-app.core:
(defn create-sorted-map
  [x]
  (clojure.lang.PersistentTreeMap/create x))
(defmethod print-method clojure.lang.PersistentTreeMap
  [o ^java.io.Writer w]
  (.write w "#sorted/map ")
  (print-method (into {} o) w))
As an alternative -- less low-level, you can use:
(defn create-sorted-map [x] (into (sorted-map) x))
The tests:
(deftest reader-literal-test
  (testing "#sorted/map"
    (is (= (sorted-map :v 4 :w 5 :x 6 :y 7 :z 8)
           #sorted/map {:v 4 :w 5 :x 6 :y 7 :z 8}))))
(deftest str-test
  (testing "str"
    (is (= "#sorted/map {:v 4, :w 5, :x 6, :y 7, :z 8}"
           (str (sorted-map :v 4 :w 5 :x 6 :y 7 :z 8))))))
Much of this was adapted from the resources I found above.
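Note: I was surprised that print-method works above. I expected (into {} o) to lose the ordering and garble the output, but it worked in my testing. The likely reason is that into {} builds an array map for small inputs, which keeps insertion order; with more entries (roughly more than eight) the result becomes a hash map, and the printed order is no longer guaranteed to be sorted.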
I'm saving a nested map of data to disk via spit. I want some of the maps inside my map to be sorted, and to stay sorted when I slurp the map back into my program. Sorted maps don't have a unique literal representation, so when I spit the map-of-maps onto disk, the sorted maps and the unsorted maps are represented the same, and #(read-string (slurp %))ing the data makes every map the usual unsorted type. Here's a toy example illustrating the problem:
(def sorted-thing (sorted-map :c 3 :e 5 :a 1))
;= #'user/sorted-thing
(spit "disk" sorted-thing)
;= nil
(def read-thing (read-string (slurp "disk")))
;= #'user/read-thing
(assoc sorted-thing :b 2)
;= {:a 1, :b 2, :c 3, :e 5}
(assoc read-thing :b 2)
;= {:b 2, :a 1, :c 3, :e 5}
Is there some way to read the maps in as sorted in the first place, rather than converting them to sorted maps after reading? Or is this a sign that I should be using some kind of real database?
The *print-dup* dynamically rebindable Var is meant to support this use case:
(binding [*print-dup* true]
  (prn (sorted-map :foo 1)))
; #=(clojure.lang.PersistentTreeMap/create {:foo 1})
The commented out line is what gets printed.
It so happens that it also affects str when applied to Clojure data structures, and therefore also spit, so if you do
(binding [*print-dup* true]
  (spit "foo.txt" (sorted-map :foo 1)))
the map representation written to foo.txt will be the one displayed above.
Admittedly, I'm not 100% sure whether this is documented somewhere; if you feel uneasy about this, you could always spit the result of using pr-str with *print-dup* bound to true:
(binding [*print-dup* true]
  (pr-str (sorted-map :foo 1)))
;= "#=(clojure.lang.PersistentTreeMap/create {:foo 1})"
(This time the last line is the value returned rather than printed output.)
Clearly you'll have to have *read-eval* bound to true to be able to read back these literals. That's fine though, it's exactly the purpose it's meant to serve (reading code from trusted sources).
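Putting the two halves together, a round trip might look like the sketch below. It goes through a string rather than a file to stay self-contained; dumped and restored are illustrative names, and *read-eval* is bound explicitly only to make the requirement visible.

```clojure
;; write: *print-dup* makes the printed form carry the concrete map type
(def dumped
  (binding [*print-dup* true]
    (pr-str (sorted-map :c 3 :e 5 :a 1))))

;; read: the #=(...) form is evaluated by the reader, so *read-eval* must be true
(def restored
  (binding [*read-eval* true]
    (read-string dumped)))

(sorted? restored)               ; => true
(keys (assoc restored :b 2))     ; => (:a :b :c :e)
```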
I don't think it's necessarily a sign that you should be using a database, but I do think it's a sign that you shouldn't be using spit on the map directly. When you write your sorted maps to disk, don't use the map literal syntax. If you write it out in the following format, read-string will work:
(def sorted-thing (eval (read-string "(sorted-map :c 3 :e 5 :a 1)")))
(assoc sorted-thing :b 2)
;= {:a 1, :b 2, :c 3, :e 5}
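To produce that format from an existing sorted map, something like the following sketch could work. sorted-map-form is a hypothetical helper; it assumes keys and values are literals that evaluate to themselves (keywords, numbers, strings), and, since it relies on eval, that the text comes from a trusted source.

```clojure
;; hypothetical helper: render a sorted map as the text "(sorted-map k1 v1 k2 v2 ...)"
(defn sorted-map-form [m]
  (pr-str (cons 'sorted-map (apply concat m))))

(def sorted-thing (sorted-map :c 3 :e 5 :a 1))
(def text (sorted-map-form sorted-thing))
;; text => "(sorted-map :a 1 :c 3 :e 5)"

(def restored (eval (read-string text)))
(sorted? restored)   ; => true
```

The resulting string can then be passed to spit and read back with slurp as in the question.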