I am struggling with the following problem...
Given a collection of maps
[
{:a 1 :b 1 :c 1 :d 1}
{:a 1 :b 2 :c 1 :d 2}
{:a 1 :b 2 :c 2 :d 3}
{:a 2 :b 1 :c 1 :d 5}
{:a 2 :b 1 :c 1 :d 6}
{:a 2 :b 1 :c 1 :d 7}
{:a 2 :b 2 :c 1 :d 7}
{:a 2 :b 3 :c 1 :d 7}
]
I want to reduce/transform it to...
{
1 {:b [1 2] :c [1 2] :d [1 2 3]}
2 {:b [1 2 3] :c 1 :d [5 6 7]}
}
That is, group by :a (the primary key) and accumulate the distinct values for the other keys.
I can do this in a brute-force/imperative way, but I am struggling to figure out how to solve it the Clojure way.
Thanks
Here is an admittedly inelegant, first-draft solution:
(defn reducing-fn [list-of-maps grouping-key]
  (reduce (fn [m [k lst]]
            (assoc m k (dissoc (reduce (fn [m1 m2]
                                         (apply hash-map
                                                (apply concat
                                                       (for [[k v] m2]
                                                         [k (conj (get m1 k #{}) v)]))))
                                       {}
                                       lst)
                               grouping-key)))
          {}
          (group-by #(grouping-key %) list-of-maps)))
user> (reducing-fn [{:a 1 :b 1 :c 1 :d 1}
{:a 1 :b 2 :c 1 :d 2}
{:a 1 :b 2 :c 2 :d 3}
{:a 2 :b 1 :c 1 :d 5}
{:a 2 :b 1 :c 1 :d 6}
{:a 2 :b 1 :c 1 :d 7}
{:a 2 :b 2 :c 1 :d 7}
{:a 2 :b 3 :c 1 :d 7}]
:a)
=> {2 {:c #{1}, :b #{1 2 3}, :d #{5 6 7}}, 1 {:c #{1 2}, :b #{1 2}, :d #{1 2 3}}}
Will try and figure out a more polished approach tomorrow, heading off to bed right now :)
(use 'clojure.set)
(def data
[
{:a 1 :b 1 :c 1 :d 1}
{:a 1 :b 2 :c 1 :d 2}
{:a 1 :b 2 :c 2 :d 3}
{:a 2 :b 1 :c 1 :d 5}
{:a 2 :b 1 :c 1 :d 6}
{:a 2 :b 1 :c 1 :d 7}
{:a 2 :b 2 :c 1 :d 7}
{:a 2 :b 3 :c 1 :d 7}
]
)
(defn key-join
  "Merges a list of maps into one map from each key to a vector of its distinct values."
  [map-list]
  ;; ks avoids shadowing clojure.core/keys
  (let [ks (keys (first map-list))]
    (into {} (for [k ks] [k (vec (set (map #(% k) map-list)))]))))
(defn group-reduce [key map-list]
  (let [sdata (set map-list)
        group-value (project sdata [key])]
    (into {}
          (for [m group-value]
            [(key m) (key-join (map #(dissoc % key)
                                    (select #(= (key %) (key m)) sdata)))]))))
;; another version, faster than group-reduce
(defn gr [key map-list]
  (let [gdata (group-by key map-list)]
    (into {} (for [[k m] gdata] [k (dissoc (key-join m) key)]))))
user=> (group-reduce :a data)
{1 {:c [1 2], :b [1 2], :d [1 2 3]}, 2 {:c [1], :b [1 2 3], :d [5 6 7]}}
user=> (gr :a data)
{1 {:c [1 2], :b [1 2], :d [1 2 3]}, 2 {:c [1], :b [1 2 3], :d [5 6 7]}}
Another solution:
(defn pivot [new-key m]
  (apply merge
         (for [[a v] (group-by new-key m)]
           {a (let [ks (set (flatten (map keys (map #(dissoc % new-key) v))))]
                (zipmap ks (for [k ks] (set (map k v)))))})))
ETA: new-key would be the :a key here, and m is your input collection of maps.
The first "for" destructures the result of group-by, which is where the data is partitioned by the input "new-key". "for" generates a lazy sequence; it is like Python's list comprehension. Here we generate a sequence of maps, each with a single key whose value is a map.
Inside each group, we first need to extract the relevant keys, which are held in the "ks" binding. We then want to accumulate distinct values. While we could do this with reduce, keywords are also functions, so we can map them across the collection to extract the values and then use "set" to reduce them to distinct values. "zipmap" ties together our keys and their associated values.
Finally, outside the main "for", "apply merge" converts this sequence of maps into a single map whose keys are the distinct values of :a.
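To make those steps concrete, here is a small sketch that runs each stage separately on a shortened copy of the question's data. The names grouped, bs, row and merged are only for illustration and are not part of the answer above.

```clojure
(def data
  [{:a 1 :b 1 :c 1 :d 1}
   {:a 1 :b 2 :c 1 :d 2}
   {:a 1 :b 2 :c 2 :d 3}
   {:a 2 :b 1 :c 1 :d 5}])

;; Step 1: partition the maps by the value of :a.
(def grouped (group-by :a data))

;; Step 2: a keyword used as a function pulls one field out of each map,
;; and set collapses the duplicates.
(def bs (set (map :b (grouped 1))))          ; #{1 2}

;; Step 3: zipmap pairs each key with its accumulated values.
(def row (zipmap [:b :c] [bs (set (map :c (grouped 1)))]))
;; row is {:b #{1 2}, :c #{1 2}}

;; Step 4: apply merge folds a sequence of single-entry maps into one map.
(def merged (apply merge [{1 row} {2 {:b #{1}}}]))
```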
Another solution:
(defn transform
  [key coll]
  (letfn [(merge-maps
            [coll]
            (apply merge-with (fnil conj #{}) {} coll))
          (process-key
            [[k v]]
            [k (dissoc (merge-maps v) key)])]
    (->> coll
         (group-by #(get % key))
         (map process-key)
         (into (empty coll)))))
Code untested, though.
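EDIT: Of course it doesn't work, because merge-with tries to be too clever: it only calls the function when a key is already present, so the first value for each key is stored as-is rather than conj'ed into a set.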
(defn transform
  [key coll]
  (letfn [(local-merge-with
            [f m & ms]
            (reduce (fn [m [k v]] (update-in m [k] f v))
                    m
                    (for [m ms e m] e)))
          (merge-maps
            [coll]
            (apply local-merge-with (fnil conj #{}) {} coll))
          (process-key
            [[k v]]
            [k (dissoc (merge-maps v) key)])]
    (->> coll
         (group-by #(get % key))
         (map process-key)
         (into {}))))
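An alternative sketch keeps the built-in merge-with by wrapping every value in a one-element set first, so that the function passed to merge-with only ever combines sets. The helper names wrap-vals and group-distinct are hypothetical, and the result holds sets rather than the vectors shown in the question.

```clojure
(def data
  [{:a 1 :b 1 :c 1 :d 1}
   {:a 1 :b 2 :c 1 :d 2}
   {:a 1 :b 2 :c 2 :d 3}
   {:a 2 :b 1 :c 1 :d 5}
   {:a 2 :b 1 :c 1 :d 6}
   {:a 2 :b 1 :c 1 :d 7}
   {:a 2 :b 2 :c 1 :d 7}
   {:a 2 :b 3 :c 1 :d 7}])

;; Turn {:b 1 :c 2} into {:b #{1} :c #{2}}.
(defn wrap-vals [m]
  (into {} (for [[k v] m] [k #{v}])))

;; Group by k, drop k from each map, then union the wrapped values per key.
(defn group-distinct [k maps]
  (into {}
        (for [[g ms] (group-by k maps)]
          [g (apply merge-with into (map #(wrap-vals (dissoc % k)) ms))])))

(def result (group-distinct :a data))
;; result is {1 {:b #{1 2}, :c #{1 2}, :d #{1 2 3}}
;;            2 {:b #{1 2 3}, :c #{1}, :d #{5 6 7}}}
```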
Related
I recently learned about namespaced maps in clojure.
Very convenient, I was wondering what would be the idiomatic way of programmatically namespacing a map? Is there another syntax that I am not aware of?
;; works fine
(def m #:prefix{:a 1 :b 2 :c 3})
(:prefix/a m) ;; 1
;; how to programmatically prefix the map?
(def m {:a 1 :b 2 :c 3})
(prn #:prefix(:foo m)) ;; java.lang.RuntimeException: Unmatched delimiter: )
This function will do what you want:
(defn map->nsmap
  [m n]
  (reduce-kv (fn [acc k v]
               (let [new-kw (if (and (keyword? k)
                                     (not (qualified-keyword? k)))
                              (keyword (str n) (name k))
                              k)]
                 (assoc acc new-kw v)))
             {} m))
You can give it an actual namespace object:
(map->nsmap {:a 1 :b 2} *ns*)
=> #:user{:a 1, :b 2}
(map->nsmap {:a 1 :b 2} (create-ns 'my.new.ns))
=> #:my.new.ns{:a 1, :b 2}
Or give it a string for the namespace name:
(map->nsmap {:a 1 :b 2} "namespaces.are.great")
=> #:namespaces.are.great{:a 1, :b 2}
And it only alters keys that are non-qualified keywords, which mirrors how the #: reader syntax treats keyword keys:
(map->nsmap {:a 1, :foo/b 2, "dontalterme" 3, 4 42} "new-ns")
=> {:new-ns/a 1, :foo/b 2, "dontalterme" 3, 4 42}
Here is another example inspired by https://clojuredocs.org/clojure.walk/postwalk#example-542692d7c026201cdc327122
(require 'clojure.walk)
(defn map->nsmap
  "Apply the string n to the supplied structure m as a namespace."
  [m n]
  (clojure.walk/postwalk
   (fn [x]
     (if (keyword? x)
       (keyword n (name x))
       x))
   m))
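Example (note that postwalk visits every element, so this version also re-namespaces already-qualified keywords and any keywords that appear as values):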
(map->nsmap {:my-ns/a 1 :my-ns/b 2 :my-ns/c 3} "your-ns")
=> #:your-ns{:a 1, :b 2, :c 3}
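If you are on Clojure 1.11 or later (an assumption, since neither answer states a version), clojure.core/update-keys can shorten the key-rewriting approach. The following is only a sketch; qualify-keys is a hypothetical name, and it keeps the rule of touching only unqualified keyword keys.

```clojure
;; Assumes Clojure 1.11+, where clojure.core/update-keys is available.
(defn qualify-keys
  "Returns m with every unqualified keyword key moved into namespace prefix."
  [m prefix]
  (update-keys m (fn [k]
                   (if (and (keyword? k) (not (qualified-keyword? k)))
                     (keyword (str prefix) (name k))
                     k))))

(def qualified (qualify-keys {:a 1 :foo/b 2 "s" 3} "prefix"))
;; qualified is {:prefix/a 1, :foo/b 2, "s" 3}
```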
What would be a quick way to keep only certain keys from a hash-map?
(def m {:a 1 :b 2 :c 3 :d 4})
explicit version:
((fn [{:keys [b c]}] {:b b :c c})
m)
;= {:b 2, :c 3}
select-keys:
(select-keys m [:b :c])
;= {:b 2, :c 3}
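As a small illustration of how select-keys behaves at the edges: keys that are missing from the map are skipped rather than added with nil. The inverse operation, dropping a known list of keys, can be sketched with dissoc. The names kept and dropped are just for the example.

```clojure
(def m {:a 1 :b 2 :c 3 :d 4})

;; :z is not in m, so it is simply left out of the result.
(def kept (select-keys m [:b :c :z]))      ; {:b 2, :c 3}

;; The complementary approach: name the keys to remove instead.
(def dropped (apply dissoc m [:a :d]))     ; {:b 2, :c 3}
```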
If you have a map or a collection of maps and you'd like to be able to update the values of several keys with one function, what is the most idiomatic way of doing this?
=> (def m [{:a 2 :b 3} {:a 2 :b 5}])
#'user/m
=> (map #(update-in % [:a] inc) m)
({:a 3, :b 3} {:a 3, :b 5})
Rather than mapping update-in for each key, I'd ideally like some function that operates like this:
=> (map #(update-vals % [:a :b] inc) m)
({:a 3, :b 4} {:a 3, :b 6})
Any advice would be much appreciated! I'm trying to reduce the number of lines in an unnecessarily long script.
Whenever you need to iteratively apply a fn to some data, reduce is your friend:
(defn update-vals [m ks f]
  (reduce #(update-in % [%2] f) m ks))
Here it is in action:
user> (def m1 {:a 2 :b 3})
#'user/m1
user> (update-vals m1 [:a :b] inc)
{:a 3, :b 4}
user> (def m [{:a 2 :b 3} {:a 2 :b 5}])
#'user/m
user> (map #(update-vals % [:a :b] inc) m)
({:a 3, :b 4} {:a 3, :b 6})
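One caveat: since Clojure 1.11, clojure.core ships its own update-vals, which takes (m f) and applies f to every value, so a function with that name in your own namespace replaces the core one there. A sketch of the same reduce idea under a different, hypothetical name (update-each), which also passes extra arguments through to f:

```clojure
;; update-each is a hypothetical name chosen to avoid clojure.core/update-vals.
(defn update-each
  "Applies f (with any extra args) to the value at each key in ks."
  [m ks f & args]
  (reduce (fn [acc k] (apply update acc k f args)) m ks))

(def incremented (update-each {:a 2 :b 3} [:a :b] inc))     ; {:a 3, :b 4}
(def shifted     (update-each {:a 2 :b 3} [:a :b] + 10))    ; {:a 12, :b 13}
(def over-coll   (map #(update-each % [:a :b] inc)
                      [{:a 2 :b 3} {:a 2 :b 5}]))
```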
Let's say I have
(defn test [& {:keys [a b c]}]
  (println a)
  (println b)
  (println c))
What I want is to call test with a map {:a 1 :b 2 :c 3}.
This works:
(apply test [:a 1 :b 2 :c 3])
These do not:
(apply test {:a 1 :b 2 :c 3})
(apply test (seq {:a 1 :b 2 :c 3}))
EDIT
So you can of course define the function like this also:
(defn test [{:keys [a b c]}] ; No &
  (println a)
  (println b)
  (println c))
And then you can pass a map to it:
(test {:a 1 :b 2 :c 3})
1
2
3
When learning Clojure I missed that this was possible. Nevertheless, if you ever come across a function defined by me or somebody like me, knowing how to pass a map to it could still be useful ;)
user> (apply list (mapcat seq {:a 1 :b [2 3 4]}))
(:a 1 :b [2 3 4])
Any good reason not to define it like this in the first place?
(defn my-test [{:keys [a b c]}] ;; so without the &
  (println a)
  (println b)
  (println c))
and then call it like this?
(my-test {:a 10 :b 20 :c 30})
which outputs:
10
20
30
nil
This works, but is inelegant:
(apply test (flatten (seq {:a 1 :b 2 :c 3})))
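The reason (apply test (seq {:a 1 :b 2 :c 3})) doesn't work is that (seq {:a 1 :b 2 :c 3}) returns a sequence of key/value pairs, ([:a 1] [:b 2] [:c 3]); flatten takes care of this. Note that flatten also flattens any value that is itself a vector or list, so this breaks for maps with sequential values.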
Better solutions?
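One option that avoids flatten is to concatenate the map entries one level deep, which leaves nested values untouched. This is a sketch only; show-abc is a hypothetical stand-in for the keyword-argument function, returning its arguments instead of printing them so the result is easy to inspect.

```clojure
;; Hypothetical stand-in for a function taking keyword arguments.
(defn show-abc [& {:keys [a b c]}]
  [a b c])

(def opts {:a 1 :b [2 3 4] :c 3})

;; Each map entry is a [key value] pair; concatenating the pairs gives a
;; flat key/value argument list without descending into the values.
(def flat-args (apply concat opts))

(def result (apply show-abc flat-args))    ; [1 [2 3 4] 3]
```

(mapcat seq opts) produces the same argument list as (apply concat opts).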
If I have a set of maps like this
(def a #{
{:a 1 :b 2}
{:a 3 :b 4}
{:b 1 :c 2}
{:d 1 :e 2}
{:d 1 :y 2}
})
how can I find all the keys? For example, calling
(find-all-keys a)
should return
(:a :b :c :d :e :y)
Another way:
(distinct (mapcat keys a))
Almost the same way:
(set (mapcat keys a))
Something like:
user=> (into #{} (flatten (map keys a)))
#{:y :a :c :b :d :e}
Another way:
(reduce #(into %1 (keys %2)) #{} a)
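Putting one of these behind the name used in the question might look like the sketch below. Sets and distinct do not promise any particular order, so sort is added here on the assumption that all keys are mutually comparable (as plain keywords are).

```clojure
(def a #{{:a 1 :b 2}
         {:a 3 :b 4}
         {:b 1 :c 2}
         {:d 1 :e 2}
         {:d 1 :y 2}})

(defn find-all-keys
  "Returns the distinct keys used across maps, in sorted order."
  [maps]
  (sort (distinct (mapcat keys maps))))

(def all-keys (find-all-keys a))           ; (:a :b :c :d :e :y)

;; A sorted set gives the same ordering while staying a set.
(def key-set (into (sorted-set) (mapcat keys a)))
```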