Translating vector into map - clojure

I've got this list of fields (that's Facebook's graph API fields list).
["a" "b" ["c" ["t"] "d"] "e" ["f"] "g"]
I want to generate a map out of it. The convention is following, if after a key vector follows, then its an inner object for the key. Example vector could be represented as a map as:
{"a" "value"
"b" {"c" {"t" "value"} "d" "value"}
"e" {"f" "value"}
"g" "value"}
So I have this solution so far
(defn traverse
[data]
(mapcat (fn [[left right]]
(if (vector? right)
(let [traversed (traverse right)]
(mapv (partial into [left]) traversed))
[[right]]))
(partition 2 1 (into [nil] data))))
(defn facebook-fields->map
[fields default-value]
(->> fields
(traverse)
(reduce #(assoc-in %1 %2 nil) {})
(clojure.walk/postwalk #(or % default-value))))
(let [data ["a" "b" ["c" ["t"] "d"] "e" ["f"] "g"]]
(facebook-fields->map data "value"))
#=> {"a" "value", "b" {"c" {"t" "value"}, "d" "value"}, "e" {"f" "value"}, "g" "value"}
But it is fat and difficult to follow. I am wondering if there is a more elegant solution.

Here's another way to do it using postwalk for the whole traversal, rather than using it only for default-value replacement:
(defn facebook-fields->map
[fields default-value]
(clojure.walk/postwalk
(fn [v] (if (coll? v)
(->> (partition-all 2 1 v)
(remove (comp coll? first))
(map (fn [[l r]] [l (if (coll? r) r default-value)]))
(into {}))
v))
fields))
(facebook-fields->map ["a" "b" ["c" ["t"] "d"] "e" ["f"] "g"] "value")
=> {"a" "value",
"b" {"c" {"t" "value"}, "d" "value"},
"e" {"f" "value"},
"g" "value"}

Trying to read heavily nested code makes my head hurt. It is worse when the answer is something of a "force-fit" with postwalk, which does things in a sort of "inside out" manner. Also, using partition-all is a bit of a waste, since we need to discard any pairs with two non-vectors.
To me, the most natural solution is a simple top-down recursion. The only problem is that we don't know in advance if we need to remove one or two items from the head of the input sequence. Thus, we can't use a simple for loop or map.
So, just write it as a straightforward recursion, and use an if to determine whether we consume 1 or 2 items from the head of the list.
If the 2nd item is a value, we consume one item and add in
:dummy-value to make a map entry.
If the 2nd item is a vector, we recurse and use that
as the value in the map entry.
The code:
(ns tst.demo.core
(:require [clojure.walk :as walk] ))
(def data ["a" "b" ["c" ["t"] "d"] "e" ["f"] "g"])
(defn parse [data]
(loop [result {}
data data]
(if (empty? data)
(walk/keywordize-keys result)
(let [a (first data)
b (second data)]
(if (sequential? b)
(recur
(into result {a (parse b)})
(drop 2 data))
(recur
(into result {a :dummy-value})
(drop 1 data)))))))
with result:
(parse data) =>
{:a :dummy-value,
:b {:c {:t :dummy-value}, :d :dummy-value},
:e {:f :dummy-value},
:g :dummy-value}
I added keywordize-keys at then end just to make the result a little more "Clojurey".

Since you're asking for a cleaner solution as opposed to a solution, and because I thought it was a neat little problem, here's another one.
(defn facebook-fields->map [coll]
(into {}
(keep (fn [[x y]]
(when-not (vector? x)
(if (vector? y)
[x (facebook-fields->map y)]
[x "value"]))))
(partition-all 2 1 coll)))

Related

How to transform a list of maps to a nested map of maps?

Getting data from the database as a list of maps (LazySeq) leaves me in need of transforming it into a map of maps.
I tried to 'assoc' and 'merge', but that didn't bring the desired result because of the nesting.
This is the form of my data:
(def data (list {:structure 1 :cat "A" :item "item1" :val 0.1}
{:structure 1 :cat "A" :item "item2" :val 0.2}
{:structure 1 :cat "B" :item "item3" :val 0.4}
{:structure 2 :cat "A" :item "item1" :val 0.3}
{:structure 2 :cat "B" :item "item3" :val 0.5}))
I would like to get it in the form
=> {1 {"A" {"item1" 0.1}
"item2" 0.2}}
{"B" {"item3" 0.4}}
2 {"A" {"item1" 0.3}}
{"B" {"item3" 0.5}}}
I tried
(->> data
(map #(assoc {} (:structure %) {(:cat %) {(:item %) (:val %)}}))
(apply merge-with into))
This gives
{1 {"A" {"item2" 0.2}, "B" {"item3" 0.4}},
2 {"A" {"item1" 0.3}, "B" {"item3" 0.5}}}
By merging I lose some entries, but I can't think of any other way. Is there a simple way? I was even about to try to use specter.
Any thoughts would be appreciated.
If I'm dealing with nested maps, first stop is usually to think about update-in or assoc-in - these take a sequence of the nested keys. For a problem like this where the data is very regular, it's straightforward.
(assoc-in {} [1 "A" "item1"] 0.1)
;; =>
{1 {"A" {"item1" 0.1}}}
To consume a sequence into something else, reduce is the idiomatic choice. The reducing function is right on the edge of the complexity level I'd consider an anonymous fn for, so I'll pull it out instead for clarity.
(defn- add-val [acc line]
(assoc-in acc [(:structure line) (:cat line) (:item line)] (:val line)))
(reduce add-val {} data)
;; =>
{1 {"A" {"item1" 0.1, "item2" 0.2}, "B" {"item3" 0.4}},
2 {"A" {"item1" 0.3}, "B" {"item3" 0.5}}}
Which I think was the effect you were looking for.
Roads less travelled:
As your sequence is coming from a database, I wouldn't worry about using a transient collection to speed the aggregation up. Also, now I think about it, dealing with nested transient maps is a pain anyway.
update-in would be handy if you wanted to add up any values with the same key, for example, but the implication of your question is that structure/cat/item tuples are unique and so you just need the grouping.
juxt could be used to generate the key structure - i.e.
((juxt :structure :cat :item) (first data))
[1 "A" "item1"]
but it's not clear to me that there's any way to use this to make the add-val fn more readable.
You may continue to use your existing code. Only the final merge has to change:
(defn deep-merge [& xs]
(if (every? map? xs)
(apply merge-with deep-merge xs)
(apply merge xs)))
(->> data
(map #(assoc {} (:structure %) {(:cat %) {(:item %) (:val %)}}))
(apply deep-merge))
;; =>
{1
{"A" {"item1" 0.1, "item2" 0.2},
"B" {"item3" 0.4}},
2
{"A" {"item1" 0.3},
"B" {"item3" 0.5}}}
Explanation: your original (apply merge-with into) only merge one level down. deep-merge from above will recurse into all nested maps to do the merge.
Addendum: #pete23 - one use of juxt I can think of is to make the function reusable. For example, we can extract arbitrary fields with juxt, then convert them to nested maps (with yet another function ->nested) and finally do a deep-merge:
(->> data
(map (juxt :structure :cat :item :val))
(map ->nested)
(apply deep-merge))
where ->nested can be implemented like:
(defn ->nested [[k & [v & r :as t]]]
{k (if (seq r) (->nested t) v)})
(->nested [1 "A" "item1" 0.1])
;; => {1 {"A" {"item1" 0.1}}}
One sample application (sum val by category):
(let [ks [:cat :val]]
(->> data
(map (apply juxt ks))
(map ->nested)
(apply (partial deep-merge-with +))))
;; => {"A" 0.6000000000000001, "B" 0.9}
Note deep-merge-with is left as an exercise for our readers :)
(defn map-values [f m]
(into {} (map (fn [[k v]] [k (f v)])) m))
(defn- transform-structures [ss]
(map-values (fn [cs]
(into {} (map (juxt :item :val) cs))) (group-by :cat ss)))
(defn transform [data]
(map-values transform-structures (group-by :structure data)))
then
(transform data)

Cleansing a Map of Its Channels

Suppose we have a map m with the following structure:
{:a (go "a")
:b "b"
:c "c"
:d (go "d")}
As shown, m has four keys, two of which contain channels.
Question: How could one write a general function (or macro?) cleanse-map which takes a map like m and outputs its channeless version (which, in this case, would be {:a "a" :b "b" :c "c" :d "d"})?
A good helper function for this question might be as follows:
(defn chan? [c]
(= (type (chan)) (type c)))
It also doesn't matter if the return value of cleanse-map (or whatever it's called) is itself a channel. i.e.:
`(cleanse-map m) ;=> (go {:a "a" :b "b" :c "c" :d "d"})
Limitations of core.async make implementation of cleanse-map not that straightforward. But the following one should work:
(defn cleanse-map [m]
(let [entry-chs (map
(fn [[k v]]
(a/go
(if (chan? v)
[k (a/<! v)]
[k v])))
m)]
(a/into {} (a/merge entry-chs))))
Basically, what is done here:
Each map entry is transformed to a channel which will contain this map entry. If value of map entry is a channel, it is extracted inside go-block within mapping function.
Channels with map-entries are merge-d into single one. After this step you have a channel with collection of map entries.
Channel with map entries is transformed into channel that will contain needed map (a/into step).
(ns foo.bar
(:require
[clojure.core.async :refer [go go-loop <!]]
[clojure.core.async.impl.protocols :as p]))
(def m
{:a (go "a")
:b "b"
:c "c"
:d (go "d")
:e "e"
:f "f"
:g "g"
:h "h"
:i "i"
:j "j"
:k "k"
:l "l"
:m "m"})
(defn readable? [x]
(satisfies? p/ReadPort x))
(defn cleanse-map
"Takes from each channel value in m,
returns a single channel which will supply the fully realized m."
[m]
(go-loop [acc {}
[[k v :as kv] & remaining] (seq m)]
(if kv
(recur (assoc acc k (if (readable? v) (<! v) v)) remaining)
acc)))
(go (prn "***" (<! (cleanse-map m))))
=> "***" {:m "m", :e "e", :l "l", :k "k", :g "g", :c "c", :j "j", :h "h", :b "b", :d "d", :f "f", :i "i", :a "a"}

changing nested map value without knowing keys

I need to change a value in a nested map where I don't know the values of keys in advance. I have come up with the following to do that.
;; input {String {String [String]}}
;; output {String {String String}}
(defn join-z
[x-to-y-to-z]
(zipmap (keys x-to-y-to-z)
(map (fn [y-to-z] (into {} (map (fn [[y z]] {y (clojure.string/join z)})
(seq y-to-z))))
(seq (vals x-to-y-to-z)))))
(def example
{"a" {"b" ["c" "d" "e"]}
"m" {"n" ["o" "p"]}})
;; (join-z example) => {"m" {"n" "op"}, "a" {"b" "cde"}}
This seems to be a hack. What is idiomatic clojure to do this? Or, is there something like Haskell's lens library to use?
UPDATE: based on user5187212 answer
(defn update-vals [f m0]
(reduce-kv (fn [m k v] (assoc m k (f v)))
{}
m0))
;; (update-vals clojure.string/join {"b" ["c" "d" "e"]}) => {"b" "cde"}
(defn join-z [x-to-y-to-z]
(update-vals (partial update-vals clojure.string/join) x-to-y-to-z))
;; (join-z example) => {"m" {"n" "op"}, "a" {"b" "cde"}}
This seems much more elegant. Thanks!
I would suggest reduce-kv.
For the last layer you can use something like:
(defn foo [x]
(reduce-kv
(fn [m k v]
(assoc m k (clojure.string/join v)))
{}
x))
then call it as many times as you need...
(reduce-kv
(fn [m k v]
(assoc m k (foo v)))
{}
example)
An other approach could be over all nested keys and then
(reduce
(fn [m ks]
(update-in m ks clojure.string/join))
example
all-nested-keys)
The short answer is yes, that is how you do it :)
I would go for something more like this:
(into {} (for [[k v] example]
[k (into {} (for [[k2 v2] v]
[k2 (string/join v2)]))]))
Which is pretty much the same thing.
There is a library called Specter
https://github.com/nathanmarz/specter
for queries and transformations:
(ns specter.core
(:require
[clojure.string :as string]
[com.rpl.specter :as s]))
(def example
{"a" {"b" ["c" "d" "e"]}
"m" {"n" ["o" "p"]}})
(s/transform
[s/ALL s/LAST s/ALL s/LAST]
string/join
example)
Which I think is a pretty neat way to express it.

Find Value of Specific Key in Nested Map

In Clojure, how can I find the value of a key that may be deep in a nested map structure? For example:
(def m {:a {:b "b"
:c "c"
:d {:e "e"
:f "f"}}})
(find-nested m :f)
=> "f"
Clojure offers tree-seq to do a depth-first traversal of any value. This will simplify the logic needed to find your nested key:
(defn find-nested
[m k]
(->> (tree-seq map? vals m)
(filter map?)
(some k)))
(find-nested {:a {:b {:c 1}, :d 2}} :c)
;; => 1
Also, finding all matches becomes a matter of replacing some with keep:
(defn find-all-nested
[m k]
(->> (tree-seq map? vals m)
(filter map?)
(keep k)))
(find-all-nested {:a {:b {:c 1}, :c 2}} :c)
;; => [2 1]
Note that maps with nil values might require some special treatment.
Update: If you look at the code above, you can see that k can actually be a function which offers a lot more possibilities:
to find a string key:
(find-nested m #(get % "k"))
to find multiple keys:
(find-nested m #(some % [:a :b]))
to find only positive values in maps of integers:
(find-nested m #(when (some-> % :k pos?) (:k %)))
If you know the nested path then use get-in.
=> (get-in m [:a :d :f])
=> "f"
See here for details: https://clojuredocs.org/clojure.core/get-in
If you don't know the path in your nested structure you could write a function that recurses through the nested map looking for the particular key in question and either returns its value when it finds the first one or returns all the values for :f in a seq.
If you know the "path", consider using get-in:
(get-in m [:a :d :f]) ; => "f"
If the "path" is unknown you can use something like next function:
(defn find-in [m k]
(if (map? m)
(let [v (m k)]
(->> m
vals
(map #(find-in % k)) ; Search in "child" maps
(cons v) ; Add result from current level
(filter (complement nil?))
first))))
(find-in m :f) ; "f"
(find-in m :d) ; {:e "e", :f "f"}
Note: given function will find only the first occurrence.
Here is a version that will find the key without knowing the path to it. If there are multiple matching keys, only one will be returned:
(defn find-key [m k]
(loop [m' m]
(when (seq m')
(if-let [v (get m' k)]
v
(recur (reduce merge
(map (fn [[_ v]]
(when (map? v) v))
m')))))))
If you require all values you can use:
(defn merge-map-vals [m]
(reduce (partial merge-with vector)
(map (fn [[_ v]]
(when (map? v) v))
m)))
(defn find-key [m k]
(flatten
(nfirst
(drop-while first
(iterate (fn [[m' acc]]
(if (seq m')
(if-let [v (get m' k)]
[(merge-map-vals m') (conj acc v)]
[(merge-map-vals m') acc])
[nil acc]))
[m []])))))

Clojure: idiomatic code for frequency map

Let's make frequency map:
(reduce #(update-in %1 [%2] (fnil inc 0)) {} ["a" "b" "a" "c" "c" "a"])
My concern is expression inside lambda #(...) - is it the canonical way to do it? Can I do it better/shorter?
EDIT: Another way I found:
(reduce #(assoc %1 %2 (inc %1 %2 0)) {} ["a" "b" "a" "c" "c" "a"])
Seems like very similar, what are pros/cons? Performance?
Since Clojure 1.2, there is a frequencies function in clojure.core:
user=> (doc frequencies)
-------------------------
clojure.core/frequencies
([coll])
Returns a map from distinct items in coll to the number of times
they appear.
Example:
user=> (frequencies ["a" "b" "a" "c" "c" "a"])
{"a" 3, "b" 1, "c" 2}
It happens to use transients and ternary get; see (source frequencies) for the code, which is as idiomatic as it gets while being highly performance-aware.
There is no need to use update-in. My way would be:
(defn frequencies [coll]
(reduce (fn [m e]
(assoc m e (inc (m e 0))))
{} coll))
Update: I assumed you knew frequencies was in core also and this was just an exercise.
I did a guest lecture a while ago in which I explained how you can get to this solution step by step. Won't be much new for you, since you were already close to the core solution, but maybe it is of value to someone else reading this question. The slides are in Dutch. If you change .html to .org it's easier to get the source code:
http://michielborkent.nl/gastcollege-han-20-06-2013/gastcollege.html
http://michielborkent.nl/gastcollege-han-20-06-2013/gastcollege.org
Another approach using only 'assoc' and recursion:
(defn my-frequencies-helper [freqs a-seq]
(if (empty? a-seq)
freqs
(let [fst (first a-seq)
new_set (if (contains? freqs fst)
(assoc freqs fst (inc (get freqs fst)))
(assoc freqs fst 1))]
(my-frequencies-helper new_set (rest a-seq)))))
(defn my-frequencies [a-seq]
(my-frequencies-helper {} a-seq))
(my-frequencies [1 1 2 2 :D :D :D])
=> {1 2, 2 2, :D 3}
(my-frequencies [:a "moi" :a "moi" "moi" :a 1])
=> {:a 3, "moi" 3, 1 1}