Clojure: apply a function to leaf nodes of a map - clojure

How can you apply a function to leaf nodes of a (nested) map? For example, let's have this map:
{:a 0
:b {:c 1}
:d [{:e 2} {:f 3}]}
Let's say we want to increment all leaf nodes in this map and produce the following result:
{:a 1
:b {:c 2}
:d [{:e 3} {:f 4}]}
Currently, I'm thinking of using the map-zipper from this answer and editing the map via the clojure.zip functions. However, I'm not sure how to iterate through the zipper and how to identify leaf nodes. What functions should I look at? Is there a simpler solution without zippers?
Could something like the following work (provided there is a leaf-node? predicate testing if a location in zipper is a leaf node)?
(require '[clojure.zip :as zip])
(defn inc-leaf-nodes
[loc]
(if (zip/end? loc)
(zip/node loc)
(recur (zip/next (if (leaf-node? loc)
(zip/edit loc inc)
loc)))))

A leaf-node? function could be implemented given an appropriate definition of a leaf, but there are at least two issues with the proposed inc-leaf-nodes:
the zipper implementation you linked to uses map entries as nodes, so you cannot simply pass inc to zip/edit – you would have to wrap it in a helper function that would apply it in the value position;
that zipper also doesn't treat non-map values as branches, so it would not descend into the vector [{:e 3} {:f 4}] – you would need a different zipper to overcome this problem.
Assuming that a leaf is something that is not a map, set or a sequential collection and the mapping function is to be applied only in the value position in maps, you could define a map-leaves function like so:
(defn map-leaves [f x]
(cond
(map? x) (persistent!
(reduce-kv (fn [out k v]
(assoc! out k (map-leaves f v)))
(transient {})
x))
(set? x) (into #{} (map #(map-leaves f %)) x)
(sequential? x) (into [] (map #(map-leaves f %)) x)
:else (f x)))
At the REPL:
(map-leaves inc {:a 0 :b {:c 1 :d [{:e 2} {:f 3}]}})
;= {:a 1, :b {:c 2, :d [{:e 3} {:f 4}]}}
(map-leaves inc {:a 0 :b {:c 1 :d [{:e 2} {:f 3} {:g #{1 2 3}}]}})
;= {:a 1, :b {:c 2, :d [{:e 3} {:f 4} {:g #{4 3 2}}]}}

Related

Clojure sum maps with strings and numbers

Hi l have 2 maps like these (with the possibility of having multiple maps)
(def map1 {:a {:b 1} :c ["dog"]})
(def map2 {:a {:b 2} :c ["cat"]})
l need to have this as a return {:a {:b 3} :c ["dog" "cat"]}
How can l do this?
Solution using generic functions
You could use generic functions - let's call it here o+ (operator +) - by making use of Clojure's multiple dispatch ability. By multiple dispatch you avoid if - else or cond clauses - for distinguishing the argument types/classes - thereby the code stays extensible (further cases can be added without having to change existing code). The function at the end of the defmulti definition is the dispatch function. In the defmethod form - right before the actual function arguments list (vector), you list the dispatch case for which then Clojure looks when deciding which method to use.
(defmulti o+ (fn [& args] (mapv class args)))
(defmethod o+ [Number Number] [x y] (+ x y)) ;; Numbers get added
(defmethod o+ [clojure.lang.PersistentVector clojure.lang.PersistentVector]
[x y] (into (empty x) (concat x y))) ;; Vectors concatentated
(defmethod o+ [clojure.lang.PersistentArrayMap clojure.lang.PersistentArrayMap]
[x y] (merge-with o+ x y)) ;; Maps merged-with #'o+ (recursive definition!)
Test
You can fuse then two maps with the (now recursively defined) o+ operator:
(def map1 {:a {:b 1} :c ["dog"]})
(def map2 {:a {:b 2} :c ["cat"]})
(o+ map1 map2) ;; or originally: (merge-with o+ map1 map2)
;; => {:a {:b 3}, :c ["dog" "cat"]}
Even deeper nested maps are covered
Due to the recursive character of this definition - this works also with more deeply nested maps as long as they have the same structure - and use only the defined class cases (otherwise one could add further cases):
(def mapA {:a {:b 2 :c {:d 1 :e ["a"] :f 3}} :g ["b"]})
(def mapB {:a {:b 3 :c {:d 4 :e ["b" "e"] :f 5}} :g ["c"]})
(o+ mapA mapB)
;;=> {:a {:b 5, :c {:d 5, :e ["a" "b" "e"], :f 8}}, :g ["b" "c"]}
Use reduce to add more maps at once
And as long as you can apply o+ on two objects, you can process an arbitrary number of maps using reduce:
(def mapA {:a {:b 2 :c {:d 1 :e ["a"] :f 3}} :g ["b"]})
(def mapB {:a {:b 3 :c {:d 4 :e ["b" "e"] :f 5}} :g ["c"]})
(def mapC {:a {:b 1 :c {:d 3 :e ["c"] :f 1}} :g ["a"]})
(reduce o+ [mapA mapB mapC]) ;; this vector could contain much more maps!
;; => {:a {:b 6, :c {:d 8, :e ["a" "b" "e" "c"], :f 9}}, :g ["b" "c" "a"]}
;; or we could define in addition:
(defmethod o+ :default [& args] (reduce o+ args))
;; which makes `o+` to a variadic function (function which can be
;; called with as many arguments you want)
;; then, whenever we add more than two arguments, it will be activated:
(o+ mapA mapB mapC)
;; => {:a {:b 6, :c {:d 8, :e ["a" "b" "e" "c"], :f 9}}, :g ["b" "c" "a"]}
;; and this also works:
(o+ 1 2 3 4 5 6)
;; => 21
(o+ ["a"] ["b" "c"] ["d"])
;; => ["a" "b" "c" "d"]
You can use merge-with with a custom function to resolve conflicts when merging maps.
(defn my-merge [a b]
(merge-with (fn [a b]
(if (map? a)
(merge-with + a b)
(into a b)))
a b))
Or defined recursively:
(defn my-merge [a b]
(cond (vector? a) (into a b)
(number? a) (+ a b)
(map? a) (merge-with my-merge a b)
:else (throw (ex-info "Unsupported values" {:values [a b]}))))
Here's an idea. I didn't test this, and I'm quite sure a more clever solution will come... especially now that I said that this is an answer (the someone-said-something-wrong-on-the-internet effect) :)
(reduce
(fn [accum item]
(let [b-accum (get-in accum [:a :b])
b-item (get-in item [:a :b])
new-b (+ b-accum b-item)
new-c (vec (concat (:c accum) (:c item))]
{:a {:b new-b} :c new-c}))
[map1 map2])
(require '[net.cgrand.xforms :as x])
(let [map1 {:a {:b 1} :c ["dog"]}
map2 {:a {:b 2} :c ["cat"]}]
(->> [map1 map2]
(into {}
(x/multiplex
{:a (comp (map :a) (map :b) (x/reduce +) (x/into [:b]) (x/into {}))
:c (comp (mapcat :c) (x/reduce conj))}))))

How to merge list of maps in clojure

I am given a list maps:
({:a 1 :b ["red" "blue"]} {:a 2 :b ["green"]} {:a 1 :b ["yellow"]} {:a 2 :b ["orange"]})
and I need to combine them to ultimately look like this:
({:a 1 :b ["red" "blue" "yellow"]} {:a 2 :b ["green" "orange"]})
Where the maps are combined based off the value of the key "a".
So far, I have this
(->> (sort-by :a)
(partition-by :a)
(map (partial apply merge)))
But the merge will overwrite the vector in "b" with the last one giving me
({:a 1 :b ["yellow"]} {:a 2 :b ["orange"]})
Instead of merge, use merge-with, i.e.
(->> a (sort-by :a)
(partition-by :a)
(map (partial
apply
merge-with (fn [x y] (if (= x y) x (into x y))))))
Outputs
({:a 1, :b ["red" "blue" "yellow"]} {:a 2, :b ["green" "orange"]})
You don't really want to sort or partition anything: that is just CPU time spent on nothing, and at the end you have an awkward data structure (a list of maps) instead of something that would be more convenient, a map keyed by the "important" :a value.
Rather, I would write this as a reduce, where each step uses merge-with on the appropriate subsection of the eventual map:
(defn combine-by [k ms]
(reduce (fn [acc m]
(update acc (get m k)
(partial merge-with into)
(dissoc m k)))
{}, ms))
after which we have
user=> (combine-by :a '({:a 1 :b ["red" "blue"]}
{:a 2 :b ["green"]}
{:a 1 :b ["yellow"]}
{:a 2 :b ["orange"]}))
{1 {:b ["red" "blue" "yellow"]}, 2 {:b ["green" "orange"]}}
which is usefully keyed for easily looking up a map by a specific :a, or if you prefer to get the results back out as a list, you can easily unroll it.

into vs. partition

This makes sense:
user=> (into {} [[:a 1] [:b 2]])
{:a 1, :b 2}
But why does this generate an error?
user=> (into {} (partition 2 [:a 1 :b 2]))
ClassCastException clojure.lang.Keyword cannot be cast to java.util.Map$Entry clojure.lang.ATransientMap.conj (ATransientMap.java:44)
Just to be sure:
user=> (partition 2 [:a 1 :b 2])
((:a 1) (:b 2))
Does into have a problem with lazy sequences? If so, why?
Beyond an explanation of why this doesn't work, what is the recommended way to conj a sequence of key-value pairs like [:a 1 :b 2] into a map? (apply conj doesn't seem to work, either.)
You can apply the sequence to assoc:
(apply assoc {:foo 1} [:a 1 :b 2])
=> {:foo 1, :a 1, :b 2}
Does into have a problem with lazy sequences? If so, why?
No, into is commonly used with lazily evaluated sequences. This is lazy, but each key/value tuple is a vector, which is why it works when into is reducing the pairs into the map:
(into {} (map vector (range 3) (repeat :x)))
=> {0 :x, 1 :x, 2 :x}
This doesn't work because the key/value pairs are lists:
(into {} (map list (range 3) (repeat :x)))
So the difference isn't laziness; it's due to into using reduce using conj on the map, which only works with vector key/value pairs (or MapEntrys):
(conj {} [:a 1]) ;; ok
(conj {} (MapEntry. :a 1)) ;; ok
(conj {} '(:a 1)) ;; not ok
Update: assoc wrapper for applying empty/nil sequences as suggested in comments:
(defn assoc*
([m] m)
([m k v & kvs]
(apply assoc m k v kvs)))
The recommended way – (assuming the seq arg is non-empty, as pointed out by the OP) – would be
Clojure 1.9.0
user=> (apply assoc {} [:a 1 :b 2])
{:a 1, :b 2}
The version with partition doesn't work because the blocks that partition returns are seqs and those are not treated as map entries when conj'd on to a map the way vectors and actual map entries are.
E.g. (into {} (map vec) (partition 2 [:a 1 :b 2])) would work because here the pairs get converted to vectors before conjing.
Still the approach with assoc is preferable unless there's some particular circumstance that makes into convenient (like, say, if you have a bunch of transducers that you want to use for preprocessing your partition-generated pairs etc.).
Clojure treats a 2-vec such as [:a 1] as equivalent to a MapEntry, doing what amounts to "automatic type conversion". I try to avoid this and always be explicit.
(first {:a 1}) => <#clojure.lang.MapEntry [:a 1]>
(conj {:a 1} [:b 2]) => <#clojure.lang.PersistentArrayMap {:a 1, :b 2}>
So we see that a MapEntry prints like a vector but has a different type (just like a Clojure seq prints like a list but has a different type). seq converts a Clojure map into a sequence of MapEntry's, and first gets us the first one (most Clojure functions call (seq ...) on any input collections before any other processing).
Notice that conj does the inverse type conversion, treating the vector [:b 2] as if it were a MapEntry. However, conj won't perform automatic type conversion for a list or a seq:
(throws? (conj {:a 1} '(:b 2)))
(throws? (into {:a 1} '(:b 2)))
into has the same problem since it is basically just (reduce conj <1st-arg> <2nd-seq>).
The other answers already have 3 ways that work:
(assoc {} :b 2) => {:b 2}
(conj {} [:b 2]) => {:b 2}
(into {} [[:a 1] [:b 2]]) => {:a 1, :b 2}
However, I would avoid those and stick to either hash-map or sorted-map, both of which avoid the problem of empty input seqs:
(apply hash-map []) => {} ; works for empty input seq
(apply hash-map [:a 1 :b 2]) => {:b 2, :a 1}
If your input sequence is a list of pairs, flatten is sometimes helpful:
(apply sorted-map (flatten [[:a 1] [:b 2]])) => {:a 1, :b 2}
(apply hash-map (flatten '((:a 1) (:b 2)))) => {:a 1, :b 2}
P.S.
Please be note that these are not the same:
java.util.Map$Entry (listed in jdk docs as "Map.Entry")
clojure.lang.MapEntry
P.P.S
If you already have a map and want to merge in a (possibly empty) sequence of key-value pairs, just use a combination of into and hash-map:
(into {:a 1} (apply hash-map [])) => {:a 1}
(into {:a 1} (apply hash-map [:b 2])) => {:a 1, :b 2}

How to select keys in nested maps in Clojure?

Let's say I have a map (m) like this:
(def m {:a 1 :b 2 :c {:d 3 :e 4} :e { ... } ....})
I'd like to create a new map only containing :a, :b and :d from m, i.e. the result should be:
{:a 1 :b 2 :d 3}
I know that I can use select-keys to easily get :a and :b:
(select-keys m [:a :b])
But what's a good way to also get :d? I'm looking for something like this:
(select-keys* m [:a :b [:c :d]])
Does such a function exists in Clojure or what's the recommended approach?
In pure Clojure I would do it like this:
(defn select-keys* [m paths]
(into {} (map (fn [p]
[(last p) (get-in m p)]))
paths))
(select-keys* m [[:a] [:b] [:c :d]]) ;;=> {:a 1, :b 2, :d 3}
I prefer keeping the type of a path regular, so a sequence of keys for all paths. In clojure.spec this would read as
(s/def ::nested-map (s/map-of keyword?
(s/or :num number? :map ::nested-map)))
(s/def ::path (s/coll-of keyword?))
(s/fdef select-keys*
:args (s/cat :m ::nested-map
:paths (s/coll-of ::path)))
As an alternative you can use destructing on a function, for example:
(def m {:a 1 :b 2 :c {:d 3 :e 4}})
(defn get-m
[{a :a b :b {d :d} :c}]
{:a 1 :b b :d d})
(get-m m) ; => {:a 1, :b 2, :d 3}
You can use clojure.walk.
(require '[clojure.walk :as w])
(defn nested-select-keys
[map keyseq]
(w/postwalk (fn [x]
(if (map? x)
(select-keys x keyseq)
(identity x))) map))
(nested-select-keys {:a 1 :b {:c 2 :d 3}} [:a :b :c])
; => {:a 1, :b {:c 2}}
I'm not aware of such a function being part of Clojure. You'll probably have to write it yourself. I've came up with this :
(defn select-keys* [m v]
(reduce
(fn [aggregate next]
(let [key-value (if (vector? next)
[(last next)
(get-in m next)]
[next
(get m next)])]
(apply assoc aggregate key-value)))
{}
v))
Require paths to be vectors so you can use peek (much faster than last). Reduce over the paths like this:
(defn select-keys* [m paths]
(reduce (fn [r p] (assoc r (peek p) (get-in m p))) {} paths))
(select-keys* m [[:a] [:b] [:c :d]]) ;;=> {:a 1, :b 2, :d 3}
Of course, this assumes that all your terminal keys are unique.

How to Increment Values in a Map

I am wrapping my head around state in Clojure. I come from languages where state can be mutated. For example, in Python, I can create a dictionary, put some string => integer pairs inside, and then walk over the dictionary and increment the values.
How would I do this in idiomatic Clojure?
(def my-map {:a 1 :b 2})
(zipmap (keys my-map) (map inc (vals my-map)))
;;=> {:b 3, :a 2}
To update only one value by key:
(update-in my-map [:b] inc) ;;=> {:a 1, :b 3}
Since Clojure 1.7 it's also possible to use update:
(update my-map :b inc)
Just produce a new map and use it:
(def m {:a 3 :b 4})
(apply merge
(map (fn [[k v]] {k (inc v) }) m))
; {:b 5, :a 4}
To update multiple values, you could also take advantage of reduce taking an already filled accumulator, and applying a function on that and every member of the following collection.
=> (reduce (fn [a k] (update-in a k inc)) {:a 1 :b 2 :c 3 :d 4} [[:a] [:c]])
{:a 2, :c 4, :b 2, :d 4}
Be aware of the keys needing to be enclosed in vectors, but you can still do multiple update-ins in nested structures like the original update in.
If you made it a generalized function, you could automatically wrap a vector over a key by testing it with coll?:
(defn multi-update-in
[m v f & args]
(reduce
(fn [acc p] (apply
(partial update-in acc (if (coll? p) p (vector p)) f)
args)) m v))
which would allow for single-level/key updates without the need for wrapping the keys in vectors
=> (multi-update-in {:a 1 :b 2 :c 3 :d 4} [:a :c] inc)
{:a 2, :c 4, :b 2, :d 4}
but still be able to do nested updates
(def people
{"keith" {:age 27 :hobby "needlefelting"}
"penelope" {:age 39 :hobby "thaiboxing"}
"brian" {:age 12 :hobby "rocket science"}})
=> (multi-update-in people [["keith" :age] ["brian" :age]] inc)
{"keith" {:age 28, :hobby "needlefelting"},
"penelope" {:age 39, :hobby "thaiboxing"},
"brian" {:age 13, :hobby "rocket science"}}
To slightly improve #Michiel Brokent's answer. This will work if the key already doesn't present.
(update my-map :a #(if (nil? %) 1 (inc %)))
I've been toying with the same idea, so I came up with:
(defn remap
"returns a function which takes a map as argument
and applies f to each value in the map"
[f]
#(into {} (map (fn [[k v]] [k (f v)]) %)))
((remap inc) {:foo 1})
;=> {:foo 2}
or
(def inc-vals (remap inc))
(inc-vals {:foo 1})
;=> {:foo 2}