How can I search and dissoc multiple descendent keys.
Example:
(def d {:foo 123
:bar {
:baz 456
:bam {
:whiz 789}}})
(dissoc-descendents d [:foo :bam])
;->> {:bar {:baz 456}}
clojure.walk is useful in this kind of situations:
(use 'clojure.walk)
(postwalk #(if (map? %) (dissoc % :foo :bam) %) d)
If you wanted to implement it directly then I'd suggest something like this:
(defn dissoc-descendents [coll descendents]
(let [descendents (if (set? descendents) descendents (set descendents))]
(if (associative? coll)
(reduce
(fn [m [k v]] (if (descendents k)
(dissoc m k)
(let [new-val (dissoc-descendents v descendents)]
(if (identical? new-val v) m (assoc m k new-val)))))
coll
coll)
coll)))
Key things to note about the implementation:
It makes sense to convert descendents into a set: this will allow quick membership tests if the set of keys to remove is large
There is some logic to ensure that if a value doesn't change, you don't need to alter that part of the map. This is quite a big performance win if large areas of the map are unchanged.
Related
This is similar to Clojure get map key by value
However, there is one difference. How would you do the same thing if hm is like
{1 ["bar" "choco"]}
The idea being to get 1 (the key) where the first element if the value list is "bar"? Please feel free to close/merge this question if some other question answers it.
I tried something like this, but it doesn't work.
(def hm {:foo ["bar", "choco"]})
(keep #(when (= ((nth val 0) %) "bar")
(key %))
hm)
You can filter the map and return the first element of the first item in the resulting sequence:
(ffirst (filter (fn [[k [v & _]]] (= "bar" v)) hm))
you can destructure the vector value to access the second and/or third elements e.g.
(ffirst (filter (fn [[k [f s t & _]]] (= "choco" s))
{:foo ["bar", "choco"]}))
past the first few elements you will probably find nth more readable.
Another way to do it using some:
(some (fn [[k [v & _]]] (when (= "bar" v) k)) hm)
Your example was pretty close to working, with some minor changes:
(keep #(when (= (nth (val %) 0) "bar")
(key %))
hm)
keep and some are similar, but some only returns one result.
in addition to all the above (correct) answers, you could also want to reindex your map to desired form, especially if the search operation is called quite frequently and the the initial map is rather big, this would allow you to decrease the search complexity from linear to constant:
(defn map-invert+ [kfn vfn data]
(reduce (fn [acc entry] (assoc acc (kfn entry) (vfn entry)))
{} data))
user> (def data
{1 ["bar" "choco"]
2 ["some" "thing"]})
#'user/data
user> (def inverted (map-invert+ (comp first val) key data))
#'user/inverted
user> inverted
;;=> {"bar" 1, "some" 2}
user> (inverted "bar")
;;=> 1
I have an edn in which I have nested maps. I found one very good example for this Clojure: a function that search for a val in a nested hashmap and returns the sequence of keys in which the val is contained
(def coll
{:a "aa"
:b {:d "dd"
:e {:f {:h "hh"
:i "ii"}
:g "hh"}}
:c "cc"})
With this answer
(defn find-in [coll x]
(some
(fn [[k v]]
(cond (= v x) [k]
(map? v) (if-let [r (find-in v x)]
(into [k] r))))
coll))
My problem is that because of some I can't get a path for every result, only for the first logical truth. I tried map an keep but they break the recursion. How could I make this code to give back path to all of its results, not only the first one? Any help is appreciated.
You can use a helper function to turn a nested map into a flat map with fully qualified keys. Then find-in can just filter on the value and returns the matched keys.
(defn flatten-map [path m]
(if (map? m)
(mapcat (fn [[k v]] (flatten-map (conj path k) v)) m)
[[path m]]))
(defn find-in [coll x]
(->> (flatten-map [] coll)
(filter (fn [[_ v]] (= v x)))
(map first)))
With your sample:
(find-in coll "hh")
=>
([:b :e :f :h] [:b :e :g])
filter gives all of the results, where some will only give you the first result, or nil if there aren't any. Oftentimes the same problem can be solved by filtering then taking the first, rather than using some.
Today I tried to implement a "R-like" melt function. I use it for Big Data coming from Big Query.
I do not have big constraints about time to compute and this function takes less than 5-10 seconds to work on millions of rows.
I start with this kind of data :
(def sample
'({:list "123,250" :group "a"} {:list "234,260" :group "b"}))
Then I defined a function to put the list into a vector :
(defn split-data-rank [datatab value]
(let [splitted (map (fn[x] (assoc x value (str/split (x value) #","))) datatab)]
(map (fn[y] (let [index (map inc (range (count (y value))))]
(assoc y value (zipmap index (y value)))))
splitted)))
Launch :
(split-data-rank sample :list)
As you can see, it returns the same sequence but it replaces :list by a map giving the position in the list of each item in quoted list.
Then, I want to melt the "dataframe" by creating for each item in a group its own row with its rank in the group.
So that I created this function :
(defn split-melt [datatab value]
(let [splitted (split-data-rank datatab value)]
(map (fn [y] (dissoc y value))
(apply concat
(map
(fn[x]
(map
(fn[[k v]]
(assoc x :item v :Rank k))
(x value)))
splitted)))))
Launch :
(split-melt sample :list)
The problem is that it is heavily indented and use a lot of map. I apply dissoc to drop :list (which is useless now) and I have also to use concat because without that I have a sequence of sequences.
Do you think there is a more efficient/shorter way to design this function ?
I am heavily confused with reduce, does not know whether it can be applied here since there are two arguments in a way.
Thanks a lot !
If you don't need the split-data-rank function, I will go for:
(defn melt [datatab value]
(mapcat (fn [x]
(let [items (str/split (get x value) #",")]
(map-indexed (fn [idx item]
(-> x
(assoc :Rank (inc idx) :item item)
(dissoc value)))
items)))
datatab))
Very simple + silly question:
Does clojure provide multi maps? I currently have something like this:
(defn wrap [func]
(fn [mp x]
(let [k (func x)]
(assoc mp k
(match (get mp k)
nil [x]
v (cons v x))))))
(defn create-mm [func lst]
(reduce (wrap func) {} lst))
Which ends up creating a map, where for each key, we have a vector of all elements with that key. However, it seems like multi maps is a very basic data structure, and I wonder if clojure has it built-in.
Thanks
I don't think this is really necessary as a distinct type, as Clojure's flexibility allow you to quickly make your own by just using maps and sets. See here:
http://paste.lisp.org/display/89840
Edit (I should have just pasted this in since it's so small)
Example Code (Courtesy Stuart Sierra)
(ns #^{:doc "A multimap is a map that permits multiple values for each
key. In Clojure we can represent a multimap as a map with sets as
values."}
multimap
(:use [clojure.set :only (union)]))
(defn add
"Adds key-value pairs the multimap."
([mm k v]
(assoc mm k (conj (get mm k #{}) v)))
([mm k v & kvs]
(apply add (add mm k v) kvs)))
(defn del
"Removes key-value pairs from the multimap."
([mm k v]
(let [mmv (disj (get mm k) v)]
(if (seq mmv)
(assoc mm k mmv)
(dissoc mm k))))
([mm k v & kvs]
(apply del (del mm k v) kvs)))
(defn mm-merge
"Merges the multimaps, taking the union of values."
[& mms]
(apply (partial merge-with union) mms))
(comment
(def mm (add {} :foo 1 :foo 2 :foo 3))
;; mm == {:foo #{1 2 3}}
(mm-merge mm (add {} :foo 4 :bar 2))
;;=> {:bar #{2}, :foo #{1 2 3 4}}
(del mm :foo 2)
;;=> {:foo #{1 3}}
)
Extra test for the case pointed out in the comments:
(comment
(-> {} (add :a 1) (del :a 1) (contains? :a))
;;=> false
)
In clojure, I want to aggregate this data:
(def data [[:morning :pear][:morning :mango][:evening :mango][:evening :pear]])
(group-by first data)
;{:morning [[:morning :pear][:morning :mango]],:evening [[:evening :mango][:evening :pear]]}
My problem is that :evening and :morning are redundant.
Instead, I would like to create the following collection:
([:morning (:pear :mango)] [:evening (:mango :pear)])
I came up with:
(for [[moment moment-fruit-vec] (group-by first data)] [moment (map second moment-fruit-vec)])
Is there a more idiomatic solution?
I've come across similar grouping problems. Usually I end up plugging merge-with or update-in into some seq processing step:
(apply merge-with list (map (partial apply hash-map) data))
You get a map, but this is just a seq of key-value pairs:
user> (apply merge-with list (map (partial apply hash-map) data))
{:morning (:pear :mango), :evening (:mango :pear)}
user> (seq *1)
([:morning (:pear :mango)] [:evening (:mango :pear)])
This solution only gets what you want if each key appears twice, however. This might be better:
(reduce (fn [map [x y]] (update-in map [x] #(cons y %))) {} data)
Both of these feel "more functional" but also feel a little convoluted. Don't be too quick to dismiss your solution, it's easy-to-understand and functional enough.
Don't be too quick to dismiss group-by, it has aggregated your data by the desired key and it hasn't changed the data. Any other function expecting a sequence of moment-fruit pairs will accept any value looked up in the map returned by group-by.
In terms of computing the summary my inclination was to reach for merge-with but for that I had to transform the input data into a sequence of maps and construct a "base-map" with the required keys and empty-vectors as values.
(let [i-maps (for [[moment fruit] data] {moment fruit})
base-map (into {}
(for [key (into #{} (map first data))]
[key []]))]
(apply merge-with conj base-map i-maps))
{:morning [:pear :mango], :evening [:mango :pear]}
Meditating on #mike t's answer, I've come up with:
(defn agg[x y] (if (coll? x) (cons y x) (list y x)))
(apply merge-with agg (map (partial apply hash-map) data))
This solution works also when the keys appear more than twice on data:
(apply merge-with agg (map (partial apply hash-map)
[[:morning :pear][:morning :mango][:evening :mango] [:evening :pear] [:evening :kiwi]]))
;{:morning (:mango :pear), :evening (:kiwi :pear :mango)}
maybe just modify the standard group-by a little bit:
(defn my-group-by
[fk fv coll]
(persistent!
(reduce
(fn [ret x]
(let [k (fk x)]
(assoc! ret k (conj (get ret k []) (fv x)))))
(transient {}) coll)))
then use it as:
(my-group-by first second data)