Clojure convert vector of maps into map of vector values - clojure

There are a few SO posts related to this topic, however I could not find anything that works for what I am looking to accomplish.
I have a vector of maps. I am going to use the example from another related SO post:
(def data
[{:id 1 :first-name "John1" :last-name "Dow1" :age "14"}
{:id 2 :first-name "John2" :last-name "Dow2" :age "54"}
{:id 3 :first-name "John3" :last-name "Dow3" :age "34"}
{:id 4 :first-name "John4" :last-name "Dow4" :age "12"}
{:id 5 :first-name "John5" :last-name "Dow5" :age "24"}]))
I would like to convert this into a map with the values of each entry be a vector of the associated values (maintaining the order of data).
Here is what I would like to have as the output:
{:id [1 2 3 4 5]
:first-name ["John1" "John2" "John3" "John4" "John5"]
:last-name ["Dow1" "Dow2" "Dow3" "Dow4" "Dow5"]
:age ["14" "54" "34" "12" "24"]}
Is there an elegant and efficient way to do this in Clojure?

Can be made more efficient, but this is a nice start:
(def ks (keys (first data)))
(zipmap ks (apply map vector (map (apply juxt ks) data))) ;;=>
{:id [1 2 3 4 5]
:first-name ["John1" "John2" "John3" "John4" "John5"]
:last-name ["Dow1" "Dow2" "Dow3" "Dow4" "Dow5"]
:age ["14" "54" "34" "12" "24"]}
Another one that comes close:
(group-by key (into [] cat data))
;;=>
{:id [[:id 1] [:id 2] [:id 3] [:id 4] [:id 5]],
:first-name [[:first-name "John1"] [:first-name "John2"] [:first-name "John3"] [:first-name "John4"] [:first-name "John5"]],
:last-name [[:last-name "Dow1"] [:last-name "Dow2"] [:last-name "Dow3"] [:last-name "Dow4"] [:last-name "Dow5"]],
:age [[:age "14"] [:age "54"] [:age "34"] [:age "12"] [:age "24"]]}

Well, I worked out a solution and then before I could post, Michiel posted a more concise solution, but I'll go ahead and post it anyway =).
(defn map-coll->key-vector-map
[coll]
(reduce (fn [new-map key]
(assoc new-map key (vec (map key coll))))
{}
(keys (first coll))))

For me, the clearest approach here is the following:
(defn ->coll [x]
(cond-> x (not (coll? x)) vector))
(apply merge-with #(conj (->coll %1) %2) data)
Basically, the task here is to merge all maps (merge-with), and gather up all the values at the same key by conjoining (conj) onto the vector at key – ensuring that values are actually conjoined onto a vector (->coll).

If we concatenate the maps into a single sequence of pairs, we have an edge-list representation of a graph. What we have to do is convert it into an adjacency-list (here a vector, not a list) representation.
(defn collect [maps]
(reduce
(fn [acc [k v]] (assoc acc k (conj (get acc k []) v)))
{}
(apply concat maps)))
For example,
=> (collect data)
{:id [1 2 3 4 5]
:first-name ["John1" "John2" "John3" "John4" "John5"]
:last-name ["Dow1" "Dow2" "Dow3" "Dow4" "Dow5"]
:age ["14" "54" "34" "12" "24"]}
The advantage of this method over some of the others is that the maps in the given sequence can be different shapes.

Please consider the reader when writing code! There are no prizes for playing "code golf". There are, however, considerable costs to others when you force them to decipher code that is overly condensed.
I always try to be explicit about what code is doing. This is easiest if you break down a problem into simple steps and use good names. In particular, it is almost impossible to accomplish this using juxt or any other cryptic function.
Here is how I would implement the solution:
(def data
[{:id 1 :first-name "John1" :last-name "Dow1" :age "14"}
{:id 2 :first-name "John2" :last-name "Dow2" :age "54"}
{:id 3 :first-name "John3" :last-name "Dow3" :age "34"}
{:id 4 :first-name "John4" :last-name "Dow4" :age "12"}
{:id 5 :first-name "John5" :last-name "Dow5" :age "24"}])
(def data-keys (keys (first data)))
(defn create-empty-result
"init result map with an empty vec for each key"
[data]
(zipmap data-keys (repeat [])))
(defn append-map-to-result
[cum-map item-map]
(reduce (fn [result map-entry]
(let [[curr-key curr-val] map-entry]
(update-in result [curr-key] conj curr-val)))
cum-map
item-map))
(defn transform-data
[data]
(reduce
(fn [cum-result curr-map]
(append-map-to-result cum-result curr-map))
(create-empty-result data)
data))
with results:
(dotest
(is= (create-empty-result data)
{:id [], :first-name [], :last-name [], :age []})
(is= (append-map-to-result (create-empty-result data)
{:id 1 :first-name "John1" :last-name "Dow1" :age "14"})
{:id [1], :first-name ["John1"], :last-name ["Dow1"], :age ["14"]})
(is= (transform-data data)
{:id [1 2 3 4 5],
:first-name ["John1" "John2" "John3" "John4" "John5"],
:last-name ["Dow1" "Dow2" "Dow3" "Dow4" "Dow5"],
:age ["14" "54" "34" "12" "24"]}))
Note that I included unit tests for the helper functions as a way of both documenting what they are intended to do as well as demonstrating to the reader that they actually work as advertised.
This template project can be used to run the above code.

Related

Map nested vector of array maps

I've been trying to map the nested values of a map within a vector into a vector of vectors without success.
The data I have is like this:
[{:country {:name "chile", :id 1},
:subcountries [{:name "talca", :id 2}
{:name "concepcion", :id 3}
{:name "puerto montt", :id 4}]}
{:country {:name "united states", :id 5},
:subcountries [{:name "boston", :id 6}
{:name "texas", :id 7}]}]
While the code I've been playing with vaguely returns an approximation of what I'm trying to get as a result:
(map
(fn [x]
(let [{{id :id name :name} :country
subcountries :subcountries} x]
[id
name
(map (fn [y] (let [{yid :id yname :yname} y] [yid yname])))]))
data)
The result I'm receiving with that is something pretty odd, since the vector I'd want to have is just a function:
([1 "chile" #function[clojure.core/map/fn--5862]]
[5 "united states" #function[clojure.core/map/fn--5862]])
What am I doing wrong?
Expected output should be something like:
[[["chile" 1] ["talca" 2] ["concepcion" 3] ["puerto montt" 4]]
[["united states" 5] ["boston" 6] ["texas" 7]]]
The reason you're seeing the function in your vector output is that your inner map wasn't applying the function to any data structure, so it was returning a transducer.
Here I've updated the inner map to map the function to the subcountries, which I assume was your intent. (There was also a tiny typo, you had yname :yname instead of yname :name)
(defn f [data]
(mapv
(fn [x]
(let [{{id :id name :name} :country
subcountries :subcountries} x]
[id
name
(mapv (fn [y] (let [{yid :id yname :name} y] [yid yname])) subcountries)]))
data))
Not sure if this is exactly your desired output, since you said "something like...". If not, let us know if you need more help getting it the rest of the way there.
> (f data)
[[1 "chile" [[2 "talca"] [3 "concepcion"] [4 "puerto montt"]]]
[5 "united states" [[6 "boston"] [7 "texas"]]]]
This might be a more "Clojurey" way to do it:
(defn countries->vecs [data]
(let [->pair (juxt :name :id)
map->pairs (fn [{:keys [country subcountries]}]
(apply vector (->pair country)
(map ->pair subcountries)))]
(mapv map->pairs data)))

How to group-by a collection that is already grouped by in Clojure?

I have a collection of maps
(def a '({:id 9345 :value 3 :type "orange"}
{:id 2945 :value 2 :type "orange"}
{:id 145 :value 3 :type "orange"}
{:id 2745 :value 6 :type "apple"}
{:id 2345 :value 6 :type "apple"}))
I want to group this first by value, followed by type.
My output should look like:
{
:orange [{
:value 3,
:id [9345, 145]
}, {
:value 2,
:id [2935]
}],
:apple [{
:value 6,
:id [2745, 2345]
}]
}
How would I do this in Clojure? Appreciate your answers.
Thanks!
Edit:
Here is what I had so far:
(defn by-type-key [data]
(group-by #(get % "type") data))
(reduce-kv
(fn [m k v] (assoc m k (reduce-kv
(fn [sm sk sv] (assoc sm sk (into [] (map #(:id %) sv))))
{}
(group-by :value (map #(dissoc % :type) v)))))
{}
(by-type-key a))
Output:
=> {"orange" {3 [9345 145], 2 [2945]}, "apple" {6 [2745 2345], 3 [125]}}
I just couldnt figure out how to proceed next...
Your requirements are a bit inconsistent (or rather irregular) - you use :type values as keywords in the result, but the rest of the keywords are carried through. Maybe that's what you must do to satisfy some external formats - otherwise you need to either use the same approach as with :type through, or add a new keyword to the result, like :group or :rows and keep the original keywords intact. I will assume the former approach for the moment (but see below, I will get to the shape as you want it,) so the final shape of data is like
{:orange
{:3 [9345 145],
:2 [2945]},
:apple
{:6 [2745 2345]}
}
There is more than one way of getting there, here's the gist of one:
(group-by (juxt :type :value) a)
The result:
{["orange" 3] [{:id 9345, :value 3, :type "orange"} {:id 145, :value 3, :type "orange"}],
["orange" 2] [{:id 2945, :value 2, :type "orange"}],
["apple" 6] [{:id 2745, :value 6, :type "apple"} {:id 2345, :value 6, :type "apple"}]}
Now all rows in your collection are grouped by the keys you need. From this, you can go and get the shape you want, say to get to the shape above you can do
(reduce
(fn [m [k v]]
(let [ks (map (comp keyword str) k)]
(assoc-in m ks
(map :id v))))
{}
(group-by (juxt :type :value) a))
The basic idea is to get the rows grouped by the key sequence (and that's what group-by and juxt do,) and then combine reduce and assoc-in or update-in to beat the result into place.
To get exactly the shape you described:
(reduce
(fn [m [k v]]
(let [type (keyword (first k))
value (second k)
ids (map :id v)]
(update-in m [type]
#(conj % {:value value :id ids}))))
{}
(group-by (juxt :type :value) a))
It's a bit noisy, and it might be harder to see the forest for the trees - that's why I simplified the shape, to highlight the main idea. The more regular your shapes are, the shorter and more regular your functions become - so if you have control over it, try to make it simpler for you.
I would do the transform in two stages (using reduce):
the first to collect the values
the second for formating
The following code solves your problem:
(def a '({:id 9345 :value 3 :type "orange"}
{:id 2945 :value 2 :type "orange"}
{:id 145 :value 3 :type "orange"}
{:id 2745 :value 6 :type "apple"}
{:id 2345 :value 6 :type "apple"}))
(defn standardise [m]
(->> m
;; first stage
(reduce (fn [out {:keys [type value id]}]
(update-in out [type value] (fnil #(conj % id) [])))
{})
;; second stage
(reduce-kv (fn [out k v]
(assoc out (keyword k)
(reduce-kv (fn [out value id]
(conj out {:value value
:id id}))
[]
v)))
{})))
(standardise a)
;; => {:orange [{:value 3, :id [9345 145]}
;; {:value 2, :id [2945]}],
;; :apple [{:value 6, :id [2745 2345]}]}
the output of the first stage is:
(reduce (fn [out {:keys [type value id]}]
(update-in out [type value] (fnil #(conj % id) [])))
{}
a)
;;=> {"orange" {3 [9345 145], 2 [2945]}, "apple" {6 [2745 2345]}}
You may wish to use the built-in function group-by. See http://clojuredocs.org/clojure.core/group-by

Clojure - Recursively Semi-Flatten Nested Map

In clojure, how can I turn a nested map like this:
(def parent {:id "parent-1"
:value "Hi dude!"
:children [{:id "child-11"
:value "How is life?"
:children [{:id "child-111"
:value "Some value"
:children []}]}
{:id "child-12"
:value "Does it work?"
:children []}]})
Into this:
[
[{:id "parent-1", :value "Hi dude!"}]
[{:id "parent-1", :value "Hi dude!"} {:id "child-11", :value "How is life?"}]
[{:id "parent-1", :value "Hi dude!"} {:id "child-11", :value "How is life?"} {:id "child-111", :value "Some value"}]
[{:id "parent-1", :value "Hi dude!"} {:id "child-12", :value "Does it work?"}]
]
I'm stumbling through very hacky recursive attempts and now my brain is burnt out.
What I've got so far is below. It does get the data right, however it puts the data in some extra undesired nested vectors.
How can this be fixed?
Is there a nice idiomatic way to do this in Clojure?
Thanks.
(defn do-flatten [node parent-tree]
(let [node-res (conj parent-tree (dissoc node :children))
child-res (mapv #(do-flatten % node-res) (:children node))
end-res (if (empty? child-res) [node-res] [node-res child-res])]
end-res))
(do-flatten parent [])
Which produces:
[
[{:id "parent-1", :value "Hi dude!"}]
[[
[{:id "parent-1", :value "Hi dude!"} {:id "child-11", :value "How is life?"}]
[[
[{:id "parent-1", :value "Hi dude!"} {:id "child-11", :value "How is life?"} {:id "child-111", :value "Some value"}]
]]]
[
[{:id "parent-1", :value "Hi dude!"} {:id "child-12", :value "Does it work?"}]
]]
]
I don't know if this is idiomatic, but it seems to work.
(defn do-flatten
([node]
(do-flatten node []))
([node parents]
(let [path (conj parents (dissoc node :children))]
(vec (concat [path] (mapcat #(do-flatten % path)
(:children node)))))))
You can leave off the [] when you call it.
another option is to use zippers:
(require '[clojure.zip :as z])
(defn paths [p]
(loop [curr (z/zipper map? :children nil p)
res []]
(cond (z/end? curr) res
(z/branch? curr) (recur (z/next curr)
(conj res
(mapv #(select-keys % [:id :value])
(conj (z/path curr) (z/node curr)))))
:else (recur (z/next curr) res))))
I'd be inclined to use a bit of local state to simplify the logic:
(defn do-flatten
([node]
(let [acc (atom [])]
(do-flatten node [] acc)
#acc))
([node base acc]
(let [new-base (into base (self node))]
(swap! acc conj new-base)
(doall
(map #(do-flatten % new-base acc) (:children node))))))
Maybe some functional purists would dislike it, and of course you can do the whole thing in a purely functional way. My feeling is that it's a temporary and entirely local piece of state (and hence isn't going to cause the kinds of problems that state is notorious for), so if it makes for greater readability (which I think it does), I'm happy to use it.

Filtering a map based on expected keys

In my Clojure webapp I have various model namespaces with functions that take a map as an agrument and somehow insert that map into a database. I would like to be able take out only the desired keys from the map before I do the insert.
A basic example of this is:
(let [msg-keys [:title :body]
msg {:title "Hello" :body "This is an example" :somekey "asdf" :someotherkey "asdf"}]
(select-keys msg msg-keys))
;; => {:title "Hello" :body "This is an example"}
select-keys is not an option when the map is somewhat complex and I would like to select a specific set of nested keys:
(let [person {:name {:first "john" :john "smith"} :age 40 :weight 155 :something {:a "a" :b "b" :c "c" :d "d"}}]
(some-select-key-fn person [:name [:first] :something [:a :b]]))
;; => {:name {:first "john"} :something {:a "a" :b "b"}}
Is there a way to do this with the core functions? Is there a way do this purely with destructuring?
This topic was discussed in the Clojure Google Group along with a few solutions.
Destructuring is probably the closest to a "core" capability, and may be a fine solution if your problem is rather static and the map has all of the expected keys (thus avoiding nil). It could look like:
(let [person {:name {:first "john" :john "smith"} :age 40 :weight 155 :something {:a "a" :b "b" :c "c" :d "d"}}
{{:keys [first]} :name {:keys [a b]} :something} person]
{:name {:first first} :something {:a a :b b}})
;; => {:name {:first "john"}, :something {:a "a", :b "b"}}
Below is a survey of the solutions in the Clojure Google Group thread, applied to your sample map. They each have a different take on how to specify the nested keys to be selected.
Here is Christophe Grand's solution:
(defprotocol Selector
(-select [s m]))
(defn select [m selectors-coll]
(reduce conj {} (map #(-select % m) selectors-coll)))
(extend-protocol Selector
clojure.lang.Keyword
(-select [k m]
(find m k))
clojure.lang.APersistentMap
(-select [sm m]
(into {}
(for [[k s] sm]
[k (select (get m k) s)]))))
Using it requires a slightly modified syntax:
(let [person {:name {:first "john" :john "smith"} :age 40 :weight 155 :something {:a "a" :b "b" :c "c" :d "d"}}]
(select person [{:name [:first] :something [:a :b]}]))
;; => {:something {:b "b", :a "a"}, :name {:first "john"}}
Here is Moritz Ulrich's solution (he cautions that it doesn't work on maps with seqs as keys):
(defn select-in [m keyseq]
(loop [acc {} [k & ks] (seq keyseq)]
(if k
(recur
(if (sequential? k)
(let [[k ks] k]
(assoc acc k
(select-in (get m k) ks)))
(assoc acc k (get m k)))
ks)
acc)))
Using it requires another slightly modified syntax:
(let [person {:name {:first "john" :john "smith"} :age 40 :weight 155 :something {:a "a" :b "b" :c "c" :d "d"}}]
(select-in person [[:name [:first]] [:something [:a :b]]]))
;; => {:something {:b "b", :a "a"}, :name {:first "john"}}
Here is Jay Fields's solution:
(defn select-nested-keys [m top-level-keys & {:as pairs}]
(reduce #(update-in %1 (first %2) select-keys (last %2)) (select-keys m top-level-keys) pairs))
It uses a different syntax:
(let [person {:name {:first "john" :john "smith"} :age 40 :weight 155 :something {:a "a" :b "b" :c "c" :d "d"}}]
(select-nested-keys person [:name :something] [:name] [:first] [:something] [:a :b]))
;; => {:something {:b "b", :a "a"}, :name {:first "john"}}
Here is Baishampayan Ghose's solution:
(defprotocol ^:private IExpandable
(^:private expand [this]))
(extend-protocol IExpandable
clojure.lang.Keyword
(expand [k] {k ::all})
clojure.lang.IPersistentVector
(expand [v] (if (empty? v)
{}
(apply merge (map expand v))))
clojure.lang.IPersistentMap
(expand [m]
(assert (= (count (keys m)) 1) "Number of keys in a selector map can't be more than 1.")
(let [[k v] (-> m first ((juxt key val)))]
{k (expand v)}))
nil
(expand [_] {}))
(defn ^:private extract* [m selectors expand?]
(let [sels (if expand? (expand selectors) selectors)]
(reduce-kv (fn [res k v]
(if (= v ::all)
(assoc res k (m k))
(assoc res k (extract* (m k) v false))))
{} sels)))
(defn extract
"Like select-keys, but can select nested keys.
Examples -
(extract [{:b {:c [:d]}} :g] {:a 1 :b {:c {:d 1 :e 2}} :g 42 :xxx 11})
;=> {:g 42, :b {:c {:d 1}}}
(extract [:g] {:a 1 :b {:c {:d 1 :e 2}} :g 42 :xxx 11})
;=> {:g 42}
(extract [{:b [:c]} :xxx] {:a 1 :b {:c {:d 1 :e 2}} :g 42 :xxx 11})
;=> {:b {:c {:d 1, :e 2}}, :xxx 11}
Also see - exclude"
[selectors m]
(extract* m selectors true))
It uses another syntax (and the parameters are reversed):
(let [person {:name {:first "john" :john "smith"} :age 40 :weight 155 :something {:a "a" :b "b" :c "c" :d "d"}}]
(extract [{:name [:first]} {:something [:a :b]}] person))
;; => {:name {:first "john"}, :something {:a "a", :b "b"}}
Your best bet is probably to use select keys on each nested portion of the structure.
(-> person
(select-keys [:name :something])
(update-in [:name] select-keys [:first])
(update-in [:something] select-keys [:a :b]))
You could of course use a generalized version of the above to implement the syntax you suggest in a function (with a reduce rather than a -> form, most likely, and recursive calls for each nested selection of keys). Destructuring would not help much, it makes binding nested data convenient, but isn't really that useful for constructing values.
Here is how I would do it with reduce and recursion:
(defn simplify
[m skel]
(if-let [kvs (not-empty (partition 2 skel))]
(reduce (fn [m [k nested]]
(if nested
(update-in m [k] simplify nested)
m))
(select-keys m (map first kvs))
kvs)
m))
note that your proposed argument format is inconvenient, so I changed it slightly
user=> (simplify {:name {:first "john" :john "smith"}
:age 40
:weight 155
:something {:a "a" :b "b" :c "c" :d "d"}}
[:name [:first nil] :something [:a nil :b nil]])
{:something {:b "b", :a "a"}, :name {:first "john"}}
the syntax you propose will require a more complex implementation

How to read a file with test data in with Clojure?

I am writing a piece of code that needs to read in a text file that has data. The text file is in the format:
name 1 4
name 2 4 5
name 3 1 9
I am trying to create a vector of a map in the form [:name Sarah :weight 1 cost :4].
When I try reading the file in with the line-seq reader, it reads each line as an item so the partition is not correct. See repl below:
(let [file-text (line-seq (reader "C://Drugs/myproject/src/myproject/data.txt"))
new-test-items (vec (map #(apply struct item %) (partition 3 file-text)))]
(println file-text)
(println new-test-items))
(sarah 1 1 jason 4 5 nila 3 2 jonas 5 6 judy 8 15 denny 9 14 lis 2 2 )
[{:name sarah 1 1, :weight jason 4 5, :value nila 3 2 } {:name jonas 5 6, :weight judy 8 15, :value denny 9 14}]
I then tried to just take 1 partition, but still the structure is not right.
=> (let [file-text (line-seq (reader "C://Drugs/myproject/src/myproject/data.txt"))
new-test-items (vec (map #(apply struct item %) (partition 1 file-text)))]
(println file-text)
(println new-test-items))
(sarah 1 1 jason 4 5 nila 3 2 jonas 5 6 judy 8 15 denny 9 14 lis 2 2 )
[{:name sarah 1 1, :weight nil, :value nil} {:name jason 4 5, :weight nil, :value nil} {:name nila 3 2 , :weight nil, :value nil} {:name jonas 5 6, :weight nil, :value nil} {:name judy 8 15, :weight nil, :value nil} {:name denny 9 14, :weight nil, :value nil} {:name lis 2 2, :weight nil, :value nil} {:name , :weight nil, :value nil}]
nil
Next I tried to slurp the file, but that is worse:
=> (let [slurp-input (slurp "C://Drugs/myproject/src/myproject/data.txt")
part-items (partition 3 slurp-input)
mapping (vec (map #(apply struct item %) part-items))]
(println slurp-input)
(println part-items)
(println mapping))
sarah 1 1
jason 4 5
nila 3 2
jonas 5 6
judy 8 15
denny 9 14
lis 2 2
((s a r) (a h ) (1 1) (
Please help! This seems like such an easy thing to do in Java, but is killing me in Clojure.
split it into a sequence of lines:
(line-seq (reader "/tmp/data"))
split each of them into a sequence of words
(map #(split % #" ") data)
make a function that takes a vector of one data and turns it into a map with the correct keys
(fn [[name weight cost]]
(hash-map :name name
:weight (Integer/parseInt weight)
:cost (Integer/parseInt cost)))
then nest them back together
(map (fn [[name weight cost]]
(hash-map :name name
:weight (Integer/parseInt weight)
:cost (Integer/parseInt cost)))
(map #(split % #" ") (line-seq (reader "/tmp/data"))))
({:weight 1, :name "name", :cost 4}
{:weight 2, :name "name", :cost 4}
{:weight 3, :name "name", :cost 1})
you can also make this more compact by using zip-map
You are trying to do everything in one place without testing intermediate results. Instead Clojure recommends to decompose task into a number of subtasks - this makes code much more flexible and testable. Here's the code for your task (I assume records in file describe people):
(defn read-lines [filename]
(with-open [rdr (clojure.java.io/reader filename)]
(doall (line-seq rdr))))
(defn make-person [s]
(reduce conj (map hash-map [:name :weight :value] (.split s " "))))
(map make-person (read-lines "/path/to/file"))