Clojure: merging two arrays of maps

I have two arrays of maps.
The 1st is [{:a 1 :b 2 :d 6} {:a 2 :b 2} {:a 7 :b 7}]
The 2nd is [{:a 3 :c 3 :e 9 :y 7} {:a 2 :b 6 :c 8}]
The maps should be matched on the value of :a: if a map in the 2nd array has the same :a as a map in the 1st array, the 2nd map should be merged into the 1st map. Maps without a match are kept as they are. The resulting array of maps should be
[{:a 1 :b 2 :d 6} {:a 2 :b 6 :c 8} {:a 7 :b 7} {:a 3 :c 3 :e 9 :y 7}]
Can anyone help me with this? Thanks in advance.

Here you go:
user> (def xs [{:a 1 :b 2 :d 6} {:a 2 :b 2} {:a 7 :b 7}])
#'user/xs
user> (def ys [{:a 3 :c 3 :e 9 :y 7} {:a 2 :b 6 :c 8}])
#'user/ys
user> (for [[a ms] (group-by :a (concat xs ys))] (apply merge ms))
({:a 1, :b 2, :d 6} {:a 2, :c 8, :b 6} {:a 7, :b 7} {:y 7, :a 3, :c 3, :e 9})

This data structure looks very unwieldy to me; nevertheless, here's my take:
(defn key-by-a
  "Convert a list of maps to a map of maps keyed by their vals at :a"
  [coll]
  (apply hash-map (mapcat (juxt :a identity) coll)))
(defn merge-map-lists [l1 l2]
(->> [l1 l2]
(map key-by-a)
(apply merge-with merge)
(vals)))
One thing it doesn't do is maintain the order of the input lists, but since it is not clear which list decides the order (both might have the same keys in different orders), I left that out.
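If first-seen order does matter, one possible sketch keeps the order in which each :a value first appears across the inputs. The name merge-by-key-ordered is hypothetical (it is not from the answers above), and update assumes Clojure 1.7 or later.

```clojure
;; Hypothetical helper, not part of the original answers.
(defn merge-by-key-ordered
  "Merge maps from colls that share the same (k m); later maps win.
  Results follow the order in which each key value first appears."
  [k & colls]
  (let [all    (apply concat colls)
        ;; key value -> merged map; (merge nil m) is just m
        merged (reduce (fn [acc m] (update acc (k m) merge m)) {} all)]
    (map merged (distinct (map k all)))))

(def xs [{:a 1 :b 2 :d 6} {:a 2 :b 2} {:a 7 :b 7}])
(def ys [{:a 3 :c 3 :e 9 :y 7} {:a 2 :b 6 :c 8}])

(merge-by-key-ordered :a xs ys)
;; => ({:a 1, :b 2, :d 6} {:a 2, :b 6, :c 8} {:a 7, :b 7} {:a 3, :c 3, :e 9, :y 7})
```

Here the first list decides the order, and values from later lists override earlier ones for the same key.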

Maybe clojure.set/join is what you want.
Here is the example from the docs of clojure.set/join:
user=> (def animals #{{:name "betsy" :owner "brian" :kind "cow"}
{:name "jake" :owner "brian" :kind "horse"}
{:name "josie" :owner "dawn" :kind "cow"}})
user=> (def personalities #{{:kind "cow" :personality "stoic"}
{:kind "horse" :personality "skittish"}})
#'user/personalities
user=> (join animals personalities)
#{{:owner "dawn", :name "josie", :kind "cow", :personality "stoic"}
{:owner "brian", :name "betsy", :kind "cow", :personality "stoic"}
{:owner "brian", :name "jake", :kind "horse", :personality "skittish"}}
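;; Now suppose personalities is keyed by :species instead of :kind:
user=> (def personalities #{{:species "cow" :personality "stoic"}
{:species "horse" :personality "skittish"}})
#'user/personalities
user=> (join animals personalities)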
#{{:kind "horse", :owner "brian", :name "jake", :species "cow", :personality "stoic"}
{:kind "cow", :owner "dawn", :name "josie", :species "cow", :personality "stoic"}
{:kind "horse", :owner "brian", :name "jake", :species "horse", :personality "skittish"}
{:kind "cow", :owner "brian", :name "betsy", :species "cow", :personality "stoic"}
{:kind "cow", :owner "dawn", :name "josie", :species "horse", :personality "skittish"}
{:kind "cow", :owner "brian", :name "betsy", :species "horse", :personality "skittish"}}
;; Notice that "Jake" is both a horse and a cow in the first line. That's
;; likely not what you want. You can tell `join` to only produce output
;; where the `:kind` value is the same as the `:species` value like this:
user=> (join animals personalities {:kind :species})
#{{:kind "cow", :owner "dawn", :name "josie", :species "cow", :personality "stoic"}
{:kind "horse", :owner "brian", :name "jake", :species "horse", :personality "skittish"}
{:kind "cow", :owner "brian", :name "betsy", :species "cow", :personality "stoic"}}
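Note that join behaves like a relational inner join, so on the data from the question it would keep only the maps whose :a value appears in both collections. A small sketch, assuming clojure.set is required under the alias set and that an explicit key mapping is passed; xs and ys repeat the question's data:

```clojure
(require '[clojure.set :as set])

(def xs [{:a 1 :b 2 :d 6} {:a 2 :b 2} {:a 7 :b 7}])
(def ys [{:a 3 :c 3 :e 9 :y 7} {:a 2 :b 6 :c 8}])

;; Join on :a only; maps without a partner in the other relation are dropped.
(def joined (set/join (set xs) (set ys) {:a :a}))

(count joined)   ;; => 1
(map :a joined)  ;; => (2)
```

Only the pair with :a 2 survives, so join on its own does not produce the result asked for in the question.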

Related

Clojure: Merge 2 vectors of maps

I have 2 vectors of maps: employ-base and employ1. I want to merge the 2 vectors so that employ1 has higher priority than employ-base: if employ1 has a record, use it; otherwise use the record from employ-base. What is the best way to do this in Clojure?
from:
(def employ-base
[{:id 1 :name "Aaron" :income 0}
{:id 2 :name "Ben" :income 0}
{:id 3 :name "Carry" :income 0}])
(def employ1
[{:id 1 :name "Aaron" :income 1000}
{:id 3 :name "Carry" :income 2000}])
to:
(def employ1
[{:id 1 :name "Aaron" :income 1000}
{:id 2 :name "Ben" :income 0}
{:id 3 :name "Carry" :income 2000}])
Assuming :id is unique per employee, you could group the maps by :id then merge each grouping of maps per :id:
(map
#(apply merge (val %))
(merge-with concat
(group-by :id employ-base)
(group-by :id employ1)))
=> ({:id 1, :name "Aaron", :income 1000}
{:id 2, :name "Ben", :income 0}
{:id 3, :name "Carry", :income 2000})
The precedence of merging is maintained by merging employ1 after employ-base, since merge and merge-with prefer the rightmost values.
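The rightmost-wins rule can be checked in isolation. This is a minimal sketch on cut-down records; base-groups and new-groups are hypothetical stand-ins for what group-by :id returns:

```clojure
;; merge: for duplicate keys the rightmost map wins.
(merge {:id 1 :name "Aaron" :income 0}
       {:id 1 :name "Aaron" :income 1000})
;; => {:id 1, :name "Aaron", :income 1000}

;; Hypothetical per-id groups, shaped like the output of (group-by :id ...).
(def base-groups {1 [{:id 1 :income 0}]    2 [{:id 2 :income 0}]})
(def new-groups  {1 [{:id 1 :income 1000}]})

;; merge-with concat keeps the base record first and the override last,
;; so a later (apply merge ...) lets the override win.
(merge-with concat base-groups new-groups)
;; => {1 ({:id 1, :income 0} {:id 1, :income 1000}), 2 [{:id 2, :income 0}]}
```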

Cleaner Way to Sort and Order a Vector of Maps in Clojure?

I have a vector of maps wherein I need to remove the maps where the value of the name key is a duplicate, keeping the one that has the highest value of age. I have a solution but I don't think it looks clean. Is there a better way to do it without breaking it up into multiple functions?
Here is my data:
(def my-maps
[{:name "jess", :age 32}
{:name "ruxpin", :age 4}
{:name "jess", :age 35}
{:name "aero", :age 33}
{:name "banner", :age 4}])
Here is my solution:
(map first (vals (group-by :name (reverse (sort-by :name my-maps)))))
Result:
({:name "ruxpin", :age 4} {:name "jess", :age 35} {:name "banner", :age 4} {:name "aero", :age 33})
Another way is to combine group-by and max-key. The advantage of this method is that you don't need to sort your collection; sorting costs performance and should be avoided when it isn't needed.
(for [[_ vs] (group-by :name my-maps)]
(apply max-key :age vs))
;;=> ({:name "jess", :age 35}
;; {:name "ruxpin", :age 4}
;; {:name "aero", :age 33}
;; {:name "banner", :age 4})
Short version:
(->> my-maps
     (sort-by (juxt :name :age) #(compare %2 %1)) ; sort by [:name :age], descending
     (partition-by :name)
     (map first))
A transducer version:
(def xf (comp (partition-by :name) (map first)))
(->> my-maps
     (sort-by (juxt :name :age) #(compare %2 %1))
     (into [] xf))
For a large dataset the transducer version may be faster, since it avoids the intermediate lazy sequences, although the sort is still likely to dominate the cost.
Unfortunately, your original solution is broken. It only seemed to work because of the order the data happened to be in within my-maps. Note that you never actually sort by age, so you can't guarantee what order the ages are in.
I solved this with another call to map:
(->> my-maps
     (group-by :name)
     (vals)
     ; Sort by age each list that group-by returns
     (map #(sort-by :age %))
     (map last)) ; This could also happen in the above map
Note how I'm sorting each :name group by :age, then taking the last of each grouping.
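To see the problem with the question's approach, it is enough to put the older "jess" first. The sketch below wraps both approaches in functions; the names reordered-maps, original-approach and max-age-per-name are made up for this illustration:

```clojure
;; Same kind of data as in the question, but the older "jess" comes first.
(def reordered-maps
  [{:name "jess", :age 35}
   {:name "jess", :age 32}])

;; The approach from the question: sorts by :name only, never by :age.
(defn original-approach [maps]
  (map first (vals (group-by :name (reverse (sort-by :name maps))))))

;; Sort each :name group by :age and keep the last (oldest) entry.
(defn max-age-per-name [maps]
  (->> maps
       (group-by :name)
       vals
       (map #(last (sort-by :age %)))))

(original-approach reordered-maps) ;; => ({:name "jess", :age 32})
(max-age-per-name reordered-maps)  ;; => ({:name "jess", :age 35})
```

Because sort-by is stable, reversing the name-sorted list simply flips the input order within each name, which is why the first approach picks the wrong entry here.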
I would do it a little differently, using the max function instead of sorting:
(def my-maps
[{:name "jess", :age 32}
{:name "ruxpin", :age 4}
{:name "jess", :age 35}
{:name "aero", :age 33}
{:name "banner", :age 4}])
(dotest
(let [grouped-data (group-by :name my-maps)
name-age-maps (for [[name map-list] grouped-data]
(let [max-age (apply max
(map :age map-list))
name-age-map {name max-age}]
name-age-map))
final-result (reduce into {} name-age-maps)]
final-result))
with results:
grouped-data =>
{"jess" [{:name "jess", :age 32} {:name "jess", :age 35}],
"ruxpin" [{:name "ruxpin", :age 4}],
"aero" [{:name "aero", :age 33}],
"banner" [{:name "banner", :age 4}]}
name-age-maps =>
({"jess" 35} {"ruxpin" 4} {"aero" 33} {"banner" 4})
final-result =>
{"jess" 35, "ruxpin" 4, "aero" 33, "banner" 4}
Compare by multiple fields with different weights and data types (size has more weight): size is descending, name is ascending:
(def some-vector [{:name "head" :size 3}
                  {:name "mouth" :size 1}
                  {:name "nose" :size 1}
                  {:name "neck" :size 2}
                  {:name "chest" :size 10}
                  {:name "back" :size 10}
                  {:name "abdomen" :size 6}])
(->> some-vector
     (sort #(compare (str (format "%3d" (:size %2)) (:name %1))
                     (str (format "%3d" (:size %1)) (:name %2)))))
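An alternative sketch of the same ordering (size descending, then name ascending) uses sort-by with a juxt key instead of padded strings. It assumes :size is always a number; sorted-parts is a name introduced here for illustration:

```clojure
(def some-vector [{:name "head" :size 3}
                  {:name "mouth" :size 1}
                  {:name "nose" :size 1}
                  {:name "neck" :size 2}
                  {:name "chest" :size 10}
                  {:name "back" :size 10}
                  {:name "abdomen" :size 6}])

;; juxt builds a [negated-size name] key per map; vectors of equal length
;; compare element by element, so negating :size gives descending order.
(def sorted-parts
  (sort-by (juxt (comp - :size) :name) some-vector))

(map :name sorted-parts)
;; => ("back" "chest" "abdomen" "head" "neck" "mouth" "nose")
```

Unlike the "%3d" string key, this does not depend on sizes having at most three digits.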

Duplicate map in collection based on value in specific key

So I have this:
[{:a ["x" "y"], :b "foo"}
{:a ["x" "y" "z"], :b "bar"}]
And want this:
[{:a "x", :b "foo"}
{:a "y", :b "foo"}
{:a "x", :b "bar"}
{:a "y", :b "bar"}
{:a "z", :b "bar"}]
How can I do this?
for is really nice for known levels of nesting:
(for [x [{:a ["x" "y"], :b "foo"}
{:a ["x" "y" "z"], :b "bar"}]
a (:a x)]
(assoc x :a a))
You can use mapcat:
(def c [{:a ["x" "y"], :b "foo"}
{:a ["x" "y" "z"], :b "bar"}])
(mapcat (fn [{:keys [a] :as m}] (map #(assoc m :a %) a)) c)

Clojure - in a vector of hashmaps return the lowest hashmap where two keys match

I have a vector of hashmaps with a format similar to the following:
[{:a 1 :b 2} {:a 3 :b 4} {:a 1 :b 6} {:a 3 :b 9} {:a 5 :b 1} {:a 6 :b 1}]
For maps with matching :a values, I would like to keep only the one with the lowest :b value. So if two :a values are the same, e.g. {:a 1 :b 2} and {:a 1 :b 6}, it should return {:a 1 :b 2}, as 2 is lower than 6.
So for the vector above I would like to get:
[{:a 1 :b 2} {:a 3 :b 4} {:a 5 :b 1} {:a 6 :b 1}]
I have tried a few things but I am a bit stuck, any help would be appreciated, thanks.
Your original direction was correct. You only missed the grouping part:
(def test [{:a 1 :b 2} {:a 3 :b 4} {:a 1 :b 6} {:a 3 :b 9} {:a 5 :b 1} {:a 6 :b 1}])
(defn min-map [m]
(map (partial apply min-key :b) (vals (group-by :a m))))
(min-map test)
=> ({:a 1, :b 2} {:a 3, :b 4} {:a 5, :b 1} {:a 6, :b 1})
First we group the list of maps by the key :a and extract the values of the resulting map. We then examine each group and find the map with the smallest :b value using min-key.
So I thought about it for a bit and I now have an answer (albeit not a very good one):
(defn contains [a vect]
  (apply min-key :b (filter #(= (:a %) (:a a)) vect)))
(defn starter []
  (let [tester [{:a 1 :b 2} {:a 3 :b 4} {:a 1 :b 6} {:a 3 :b 9} {:a 5 :b 1} {:a 6 :b 1}]]
    (vec (distinct (map #(contains % tester) tester)))))
Thanks for everyone's help. If you have any critiques or a better solution, please post it.
With the dependency
[tupelo "0.1.68"]
we can write the following code. I left in lots of spy expressions.
(ns clj.core
(:use tupelo.core)
(:require [clojure.core :as clj]
[schema.core :as s]
[tupelo.types :as tt]
[tupelo.schema :as ts]
))
; Prismatic Schema type definitions
(s/set-fn-validation! true) ; #todo add to Schema docs
(def data [ {:a 1 :b 2} {:a 3 :b 4} {:a 1 :b 6} {:a 3 :b 9} {:a 5 :b 1} {:a 6 :b 1} ] )
(defn accum-smallest-b-entries
[cum-map-a2b
; A map where both keys and vals are simple 1-entry maps
; like: {:a 1} -> {:b 2}
; {:a 2} -> {:b 9}
new-a-b-map
; next entry, like {:a 1 :b 2}
]
(newline)
(println "---------------------------------")
(let [new-a-map (select-keys new-a-b-map [:a] ) ; like {:a 1}
_ (spyx new-a-map)
new-b-map (select-keys new-a-b-map [:b] ) ; like {:b 2}
_ (spyx new-b-map)
curr-b-map (get cum-map-a2b new-a-map)
_ (spyx curr-b-map)
next-b-map (if (or (nil? curr-b-map)
(< (grab :b new-b-map) (grab :b curr-b-map)))
new-b-map
curr-b-map)
_ (spyx next-b-map)
]
(spyx (assoc cum-map-a2b new-a-map next-b-map))))
(def a2b-map (reduce accum-smallest-b-entries {} data))
(spyx a2b-map)
(defn combine-keyvals-from-a2b-map
[cum-result
; final result like: [ {:a 1 :b 2}
; {:a 2 :b 9} ]
a2b-entry
; next entry from a2b-map like [ {:a 5} {:b 1} ]
]
(newline)
(println "combine-keyvals-from-a2b-map")
(println "---------------------------------")
(spyx cum-result)
(spyx a2b-entry)
(let [combined-ab-map (glue (first a2b-entry) (second a2b-entry))
_ (spyx combined-ab-map)
new-result (append cum-result combined-ab-map)
_ (spyx new-result)
]
new-result))
(def a-and-b-map (reduce combine-keyvals-from-a2b-map [] a2b-map))
(spyx a-and-b-map)
(defn -main [] )
Running the code we get:
---------------------------------
new-a-map => {:a 1}
new-b-map => {:b 2}
curr-b-map => nil
next-b-map => {:b 2}
(assoc cum-map-a2b new-a-map next-b-map) => {{:a 1} {:b 2}}
---------------------------------
new-a-map => {:a 3}
new-b-map => {:b 4}
curr-b-map => nil
next-b-map => {:b 4}
(assoc cum-map-a2b new-a-map next-b-map) => {{:a 1} {:b 2}, {:a 3} {:b 4}}
---------------------------------
new-a-map => {:a 1}
new-b-map => {:b 6}
curr-b-map => {:b 2}
next-b-map => {:b 2}
(assoc cum-map-a2b new-a-map next-b-map) => {{:a 1} {:b 2}, {:a 3} {:b 4}}
---------------------------------
new-a-map => {:a 3}
new-b-map => {:b 9}
curr-b-map => {:b 4}
next-b-map => {:b 4}
(assoc cum-map-a2b new-a-map next-b-map) => {{:a 1} {:b 2}, {:a 3} {:b 4}}
---------------------------------
new-a-map => {:a 5}
new-b-map => {:b 1}
curr-b-map => nil
next-b-map => {:b 1}
(assoc cum-map-a2b new-a-map next-b-map) => {{:a 1} {:b 2}, {:a 3} {:b 4}, {:a 5} {:b 1}}
---------------------------------
new-a-map => {:a 6}
new-b-map => {:b 1}
curr-b-map => nil
next-b-map => {:b 1}
(assoc cum-map-a2b new-a-map next-b-map) => {{:a 1} {:b 2}, {:a 3} {:b 4}, {:a 5} {:b 1}, {:a 6} {:b 1}}
a2b-map => {{:a 1} {:b 2}, {:a 3} {:b 4}, {:a 5} {:b 1}, {:a 6} {:b 1}}
combine-keyvals-from-a2b-map
---------------------------------
cum-result => []
a2b-entry => [{:a 1} {:b 2}]
combined-ab-map => {:a 1, :b 2}
new-result => [{:a 1, :b 2}]
combine-keyvals-from-a2b-map
---------------------------------
cum-result => [{:a 1, :b 2}]
a2b-entry => [{:a 3} {:b 4}]
combined-ab-map => {:a 3, :b 4}
new-result => [{:a 1, :b 2} {:a 3, :b 4}]
combine-keyvals-from-a2b-map
---------------------------------
cum-result => [{:a 1, :b 2} {:a 3, :b 4}]
a2b-entry => [{:a 5} {:b 1}]
combined-ab-map => {:a 5, :b 1}
new-result => [{:a 1, :b 2} {:a 3, :b 4} {:a 5, :b 1}]
combine-keyvals-from-a2b-map
---------------------------------
cum-result => [{:a 1, :b 2} {:a 3, :b 4} {:a 5, :b 1}]
a2b-entry => [{:a 6} {:b 1}]
combined-ab-map => {:a 6, :b 1}
new-result => [{:a 1, :b 2} {:a 3, :b 4} {:a 5, :b 1} {:a 6, :b 1}]
a-and-b-map => [{:a 1, :b 2} {:a 3, :b 4} {:a 5, :b 1} {:a 6, :b 1}]
In hindsight, it could be simplified further if we were guaranteed that each input map was like {:a :b }, since we could then reduce it to a series of 2-d points like [n m]; the keywords :a and :b would be redundant.
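As a rough sketch of that simplification, assuming the input really is a sequence of [a b] pairs (the names points and min-b-by-a are invented here and do not come from tupelo):

```clojure
;; The same data as above, written as [a b] pairs.
(def points [[1 2] [3 4] [1 6] [3 9] [5 1] [6 1]])

(defn min-b-by-a
  "Return a map from each a to the smallest b seen for it."
  [pts]
  ;; Turn every pair into a one-entry map, then merge them,
  ;; resolving clashes on the same a with min.
  (apply merge-with min (map (fn [[a b]] {a b}) pts)))

(min-b-by-a points)
;; => {1 2, 3 4, 5 1, 6 1}
```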
Here is a better answer using the group-by function:
(ns clj.core
(:use tupelo.core)
(:require [clojure.core :as clj]
[schema.core :as s]
[tupelo.types :as tt]
[tupelo.schema :as ts]
))
; Prismatic Schema type definitions
(s/set-fn-validation! true) ; #todo add to Schema docs
(def data [ {:a 1 :b 2} {:a 3 :b 4} {:a 1 :b 6} {:a 3 :b 9} {:a 5 :b 1} {:a 6 :b 1} ] )
(def data-by-a (group-by :a data))
; like { 1 [{:a 1, :b 2} {:a 1, :b 6}],
; 3 [{:a 3, :b 4} {:a 3, :b 9}],
; 5 [{:a 5, :b 1}],
; 6 [{:a 6, :b 1}] }
(spyx data-by-a)
(defn smallest-map-by-b
[curr-result ; like {:a 1, :b 2}
next-value] ; like {:a 1, :b 6}
(if (< (grab :b curr-result)
(grab :b next-value))
curr-result
next-value))
(defn find-min-b
"Return the map with the smallest b value"
[ab-maps] ; like [ {:a 1, :b 2} {:a 1, :b 6} ]
(reduce smallest-map-by-b
(first ab-maps) ; choose 1st as init guess at result
ab-maps))
(def result
(glue
(for [entry data-by-a] ; entry is MapEntry like: [ 1 [{:a 1, :b 2} {:a 1, :b 6}] ]
(spyx (find-min-b (val entry)))
)))
(spyx result)
(defn -main [] )
which produces the result
data-by-a => {1 [{:a 1, :b 2} {:a 1, :b 6}], 3 [{:a 3, :b 4} {:a 3, :b 9}], 5 [{:a 5, :b 1}], 6 [{:a 6, :b 1}]}
(find-min-b (val entry)) => {:a 1, :b 2}
(find-min-b (val entry)) => {:a 3, :b 4}
(find-min-b (val entry)) => {:a 5, :b 1}
(find-min-b (val entry)) => {:a 6, :b 1}
result => [{:a 1, :b 2} {:a 3, :b 4} {:a 5, :b 1} {:a 6, :b 1}]

Merging two lists of maps where entries are identified by id in Clojure

What's the idiomatic way of merging two lists of maps in Clojure where each map entry is identified by an id key?
What's an implementation for foo so that
(foo '({:id 1 :bar true :value 1}
{:id 2 :bar false :value 2}
{:id 3 :value 3})
'({:id 5 :value 5}
{:id 2 :value 2}
{:id 3 :value 3}
{:id 1 :value 1}
{:id 4 :value 4})) => '({:id 1 :bar true :value 1}
{:id 2 :bar false :value 2}
{:id 3 :value 3}
{:id 4 :value 4}
{:id 5 :value 5})
is true?
(defn merge-by
"Merges elems in seqs by joining them on return value of key-fn k.
Example: (merge-by :id [{:id 0 :name \"George\"}{:id 1 :name \"Bernie\"}]
[{:id 2 :name \"Lara\"}{:id 0 :name \"Ben\"}])
=> [{:id 0 :name \"Ben\"}{:id 1 :name \"Bernie\"}{:id 2 :name \"Lara\"}]"
[k & seqs]
(->> seqs
(map (partial group-by k))
(apply merge-with (comp vector
(partial apply merge)
concat))
vals
(map first)))
How about this:
(defn foo [& colls]
(map (fn [[_ equivalent-maps]] (apply merge equivalent-maps))
(group-by :id (sort-by :id (apply concat colls)))))
This is generalized so that you can have an arbitrary number of input sequences, and an arbitrary grouping selector:
(def a [{:id 5 :value 5}
{:id 2 :value 2}
{:id 3 :value 3}
{:id 1 :value 1}
{:id 4 :value 4}])
(def b [{:id 1 :bar true :value 1}
{:id 2 :bar false :value 2}
{:id 3 :value 3}])
(def c [{:id 1 :bar true :value 1}
{:id 2 :bar false :value 2}
{:id 3 :value 3}
{:id 4 :value 4}
{:id 5 :value 5}])
(defn merge-vectors
[selector & sequences]
(let [unpack-grouped (fn [group]
(into {} (map (fn [[k [v & _]]] [k v]) group)))
grouped (map (comp unpack-grouped (partial group-by selector))
sequences)
merged (apply merge-with merge grouped)]
(sort-by selector (vals merged))))
(defn tst
[]
(= c
(merge-vectors :id a b)))