Intersecting a list of lists of maps overriding equality - clojure

I have a list of lists of maps:
(( {:id 1 :temp 1} {:id 2} )
( {:id 1 :temp 2} )
( {:id 1 :temp 3} {:id 2} ))
I want the ids that appear in all three lists, comparing the maps by the :id key only. So my result here would be 1.
I came up with this solution but it's hurting my eyes:
(def coll '(( {:id 1 :temp 1} {:id 2} )
( {:id 1 :temp 2} )
( {:id 1 :temp 3} {:id 2} )))
(apply clojure.set/intersection
(map set (map (fn [m]
(map #(select-keys % '(:id)) m)) coll)))
returns
#{{:id 1}}
which is Ok, but any other suggestions?

If you are fine with getting #{1} (as you mention initially) instead of #{{:id 1}}, then it can be slightly improved:
(require '[clojure.set :as set])
(apply set/intersection (map (fn [c] (into #{} (map :id c))) coll))

(require '[clojure.set :refer [intersection]])
You probably don't need select-keys, since you are only interested in the id; (map :id m) does the job for the innermost map. That also frees up the function shorthand, so you can use it in the next map:
(map #(map :id %) coll)
;; ((1 2) (1) (1 2))
The third map you introduce is not necessary; it can be merged into the code above:
(map (comp set #(map :id %)) coll)
or:
(map #(set (map :id %)) coll)
both evaluating to: (#{1 2} #{1} #{1 2})
This is still pretty nested. Threading macros don't help here. But you can use a very powerful list comprehension macro called for:
(for [row coll]
(set (map :id row)))
This gives you the advantage of naming list items (rows) but keeping it concise at the same time.
So finally:
(apply intersection (for [row coll]
(set (map :id row))))
;; #{1}
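For reuse, the final expression can be wrapped in a small function. This is only a sketch: the name common-ids is hypothetical, and the empty-input guard is an added assumption, since calling intersection with no arguments throws an arity error.

```clojure
(require '[clojure.set :as set])

(defn common-ids
  "Hypothetical helper: returns the set of :id values that occur
  in every inner collection of colls."
  [colls]
  (if (seq colls)
    (apply set/intersection (map #(set (map :id %)) colls))
    #{})) ; assumption: no input lists means no common ids

(common-ids '(({:id 1 :temp 1} {:id 2})
              ({:id 1 :temp 2})
              ({:id 1 :temp 3} {:id 2})))
;; => #{1}
```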

Related

How to transform a list of maps to a nested map of maps?

Getting data from the database as a list of maps (LazySeq) leaves me in need of transforming it into a map of maps.
I tried to 'assoc' and 'merge', but that didn't bring the desired result because of the nesting.
This is the form of my data:
(def data (list {:structure 1 :cat "A" :item "item1" :val 0.1}
{:structure 1 :cat "A" :item "item2" :val 0.2}
{:structure 1 :cat "B" :item "item3" :val 0.4}
{:structure 2 :cat "A" :item "item1" :val 0.3}
{:structure 2 :cat "B" :item "item3" :val 0.5}))
I would like to get it in the form
=> {1 {"A" {"item1" 0.1
"item2" 0.2}
"B" {"item3" 0.4}}
2 {"A" {"item1" 0.3}
"B" {"item3" 0.5}}}
I tried
(->> data
(map #(assoc {} (:structure %) {(:cat %) {(:item %) (:val %)}}))
(apply merge-with into))
This gives
{1 {"A" {"item2" 0.2}, "B" {"item3" 0.4}},
2 {"A" {"item1" 0.3}, "B" {"item3" 0.5}}}
By merging I lose some entries, but I can't think of any other way. Is there a simple way? I was even about to try to use specter.
Any thoughts would be appreciated.
If I'm dealing with nested maps, first stop is usually to think about update-in or assoc-in - these take a sequence of the nested keys. For a problem like this where the data is very regular, it's straightforward.
(assoc-in {} [1 "A" "item1"] 0.1)
;; =>
{1 {"A" {"item1" 0.1}}}
To consume a sequence into something else, reduce is the idiomatic choice. The reducing function is right on the edge of the complexity level I'd consider an anonymous fn for, so I'll pull it out instead for clarity.
(defn- add-val [acc line]
(assoc-in acc [(:structure line) (:cat line) (:item line)] (:val line)))
(reduce add-val {} data)
;; =>
{1 {"A" {"item1" 0.1, "item2" 0.2}, "B" {"item3" 0.4}},
2 {"A" {"item1" 0.3}, "B" {"item3" 0.5}}}
Which I think was the effect you were looking for.
Roads less travelled:
As your sequence is coming from a database, I wouldn't worry about using a transient collection to speed the aggregation up. Also, now I think about it, dealing with nested transient maps is a pain anyway.
update-in would be handy if you wanted to add up any values with the same key, for example, but the implication of your question is that structure/cat/item tuples are unique and so you just need the grouping.
juxt could be used to generate the key structure - i.e.
((juxt :structure :cat :item) (first data))
[1 "A" "item1"]
but it's not clear to me that there's any way to use this to make the add-val fn more readable.
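To illustrate the update-in remark above, here is a minimal sketch of a reducing function that sums values sharing the same structure/cat/item path instead of overwriting them. The name add-val-summing and the sample rows are hypothetical and not part of the original question.

```clojure
;; Hypothetical rows: the first two share the same structure/cat/item path.
(def rows-with-duplicates
  [{:structure 1 :cat "A" :item "item1" :val 1}
   {:structure 1 :cat "A" :item "item1" :val 2}
   {:structure 1 :cat "B" :item "item3" :val 4}])

(defn add-val-summing
  "Adds the row's :val to whatever is already stored at its nested path."
  [acc {:keys [structure cat item val]}]
  ;; (fnil + 0) treats a missing entry as 0 before adding
  (update-in acc [structure cat item] (fnil + 0) val))

(reduce add-val-summing {} rows-with-duplicates)
;; => {1 {"A" {"item1" 3}, "B" {"item3" 4}}}
```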
You may continue to use your existing code. Only the final merge has to change:
(defn deep-merge [& xs]
(if (every? map? xs)
(apply merge-with deep-merge xs)
(apply merge xs)))
(->> data
(map #(assoc {} (:structure %) {(:cat %) {(:item %) (:val %)}}))
(apply deep-merge))
;; =>
{1
{"A" {"item1" 0.1, "item2" 0.2},
"B" {"item3" 0.4}},
2
{"A" {"item1" 0.3},
"B" {"item3" 0.5}}}
Explanation: your original (apply merge-with into) only merges one level down. deep-merge from above recurses into all nested maps to do the merge.
Addendum: @pete23 - one use of juxt I can think of is to make the function reusable. For example, we can extract arbitrary fields with juxt, then convert them to nested maps (with yet another function, ->nested) and finally do a deep-merge:
(->> data
(map (juxt :structure :cat :item :val))
(map ->nested)
(apply deep-merge))
where ->nested can be implemented like:
(defn ->nested [[k & [v & r :as t]]]
{k (if (seq r) (->nested t) v)})
(->nested [1 "A" "item1" 0.1])
;; => {1 {"A" {"item1" 0.1}}}
One sample application (sum val by category):
(let [ks [:cat :val]]
(->> data
(map (apply juxt ks))
(map ->nested)
(apply (partial deep-merge-with +))))
;; => {"A" 0.6000000000000001, "B" 0.9}
Note deep-merge-with is left as an exercise for our readers :)
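For readers who want a starting point, one possible deep-merge-with is sketched below. It follows the same pattern as deep-merge above and is an assumption about what the author had in mind, not their actual implementation.

```clojure
(defn deep-merge-with
  "Sketch: like merge-with, but recurses as long as all values at a
  key are maps; f combines the values once a non-map is reached."
  [f & xs]
  (if (every? map? xs)
    (apply merge-with (partial deep-merge-with f) xs)
    (apply f xs)))

(deep-merge-with + {:a {:b 1}} {:a {:b 2 :c 3}} {:d 4})
;; => {:a {:b 3, :c 3}, :d 4}
```

With a definition like this, the (apply (partial deep-merge-with +)) step in the sample application sums the :val entries per category.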
(defn map-values [f m]
(into {} (map (fn [[k v]] [k (f v)])) m))
(defn- transform-structures [ss]
(map-values (fn [cs]
(into {} (map (juxt :item :val) cs))) (group-by :cat ss)))
(defn transform [data]
(map-values transform-structures (group-by :structure data)))
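then
(transform data)
;; =>
{1 {"A" {"item1" 0.1, "item2" 0.2}, "B" {"item3" 0.4}},
2 {"A" {"item1" 0.3}, "B" {"item3" 0.5}}}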

How to group-by a collection that is already grouped by in Clojure?

I have a collection of maps
(def a '({:id 9345 :value 3 :type "orange"}
{:id 2945 :value 2 :type "orange"}
{:id 145 :value 3 :type "orange"}
{:id 2745 :value 6 :type "apple"}
{:id 2345 :value 6 :type "apple"}))
I want to group this first by value, followed by type.
My output should look like:
{
:orange [{
:value 3,
:id [9345, 145]
}, {
:value 2,
:id [2945]
}],
:apple [{
:value 6,
:id [2745, 2345]
}]
}
How would I do this in Clojure? Appreciate your answers.
Thanks!
Edit:
Here is what I had so far:
(defn by-type-key [data]
(group-by :type data))
(reduce-kv
(fn [m k v] (assoc m k (reduce-kv
(fn [sm sk sv] (assoc sm sk (into [] (map #(:id %) sv))))
{}
(group-by :value (map #(dissoc % :type) v)))))
{}
(by-type-key a))
Output:
=> {"orange" {3 [9345 145], 2 [2945]}, "apple" {6 [2745 2345]}}
I just couldn't figure out how to proceed next...
Your requirements are a bit inconsistent (or rather irregular): you turn the :type values into keywords in the result, but the other keywords are carried through unchanged. Maybe that is what you must do to satisfy some external format. Otherwise, you need to either use the same approach as with :type throughout, or add a new keyword to the result, like :group or :rows, and keep the original keywords intact. I will assume the former approach for the moment (but see below, I will get to the shape you want), so the final shape of the data is like
{:orange
{:3 [9345 145],
:2 [2945]},
:apple
{:6 [2745 2345]}
}
There is more than one way of getting there, here's the gist of one:
(group-by (juxt :type :value) a)
The result:
{["orange" 3] [{:id 9345, :value 3, :type "orange"} {:id 145, :value 3, :type "orange"}],
["orange" 2] [{:id 2945, :value 2, :type "orange"}],
["apple" 6] [{:id 2745, :value 6, :type "apple"} {:id 2345, :value 6, :type "apple"}]}
Now all rows in your collection are grouped by the keys you need. From this, you can get whatever shape you want. For example, to get to the shape above you can do
(reduce
(fn [m [k v]]
(let [ks (map (comp keyword str) k)]
(assoc-in m ks
(map :id v))))
{}
(group-by (juxt :type :value) a))
The basic idea is to get the rows grouped by the key sequence (and that's what group-by and juxt do,) and then combine reduce and assoc-in or update-in to beat the result into place.
To get exactly the shape you described:
(reduce
(fn [m [k v]]
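(let [type (keyword (first k))
value (second k)
ids (mapv :id v)]
;; (fnil conj []) starts each type with a vector, so the result holds vectors rather than lists
(update-in m [type]
(fnil conj []) {:value value :id ids})))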
{}
(group-by (juxt :type :value) a))
It's a bit noisy, and it might be harder to see the forest for the trees - that's why I simplified the shape, to highlight the main idea. The more regular your shapes are, the shorter and more regular your functions become - so if you have control over it, try to make it simpler for you.
I would do the transform in two stages (using reduce):
the first to collect the values
the second for formatting
The following code solves your problem:
(def a '({:id 9345 :value 3 :type "orange"}
{:id 2945 :value 2 :type "orange"}
{:id 145 :value 3 :type "orange"}
{:id 2745 :value 6 :type "apple"}
{:id 2345 :value 6 :type "apple"}))
(defn standardise [m]
(->> m
;; first stage
(reduce (fn [out {:keys [type value id]}]
(update-in out [type value] (fnil #(conj % id) [])))
{})
;; second stage
(reduce-kv (fn [out k v]
(assoc out (keyword k)
(reduce-kv (fn [out value id]
(conj out {:value value
:id id}))
[]
v)))
{})))
(standardise a)
;; => {:orange [{:value 3, :id [9345 145]}
;; {:value 2, :id [2945]}],
;; :apple [{:value 6, :id [2745 2345]}]}
the output of the first stage is:
(reduce (fn [out {:keys [type value id]}]
(update-in out [type value] (fnil #(conj % id) [])))
{}
a)
;;=> {"orange" {3 [9345 145], 2 [2945]}, "apple" {6 [2745 2345]}}
You may wish to use the built-in function group-by. See http://clojuredocs.org/clojure.core/group-by
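As a rough sketch of that suggestion, group-by can be applied twice: once by :type, then by :value within each type. The function name by-type-then-value is hypothetical, and the rows are copied from the question.

```clojure
(def rows
  [{:id 9345 :value 3 :type "orange"}
   {:id 2945 :value 2 :type "orange"}
   {:id 145  :value 3 :type "orange"}
   {:id 2745 :value 6 :type "apple"}
   {:id 2345 :value 6 :type "apple"}])

(defn by-type-then-value
  "Groups rows by :type (as a keyword), then by :value,
  collecting the :id values of each group into a vector."
  [rows]
  (into {}
        (map (fn [[type type-rows]]
               [(keyword type)
                (mapv (fn [[value value-rows]]
                        {:value value :id (mapv :id value-rows)})
                      (group-by :value type-rows))]))
        (group-by :type rows)))

(by-type-then-value rows)
;; => {:orange [{:value 3, :id [9345 145]} {:value 2, :id [2945]}],
;;     :apple  [{:value 6, :id [2745 2345]}]}
```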

All subsets of a set in clojure

I wish to generate all subsets of a set except the empty set,
i.e.
(all-subsets #{1 2 3}) => #{#{1},#{2},#{3},#{1,2},#{2,3},#{3,1},#{1,2,3}}
How can this be done in clojure?
In your :dependencies in project.clj:
[org.clojure/math.combinatorics "0.0.7"]
At the REPL:
(require '[clojure.math.combinatorics :as combinatorics])
(->> #{1 2 3}
(combinatorics/subsets)
(remove empty?)
(map set)
(set))
;= #{#{1} #{2} #{3} #{1 2} #{1 3} #{2 3} #{1 2 3}}
clojure.math.combinatorics/subsets sensibly returns a seq of seqs, hence the extra transformations to match your desired output.
Here's a concise, tail-recursive version with dependencies only on clojure.core.
(defn power [s]
(loop [[f & r] (seq s) p '(())]
(if f (recur r (concat p (map (partial cons f) p)))
p)))
If you want the results in a set of sets, use the following.
(defn power-set [s] (set (map set (power s))))
@zcaudate: For completeness, here is a recursive implementation:
(defn subsets
[s]
(if (empty? s)
#{#{}}
(let [ts (subsets (rest s))]
(->> ts
(map #(conj % (first s)))
(clojure.set/union ts)))))
;; (subsets #{1 2 3})
;; => #{#{} #{1} #{2} #{3} #{1 2} #{1 3} #{2 3} #{1 2 3}} (which is correct).
This is a slight variation of @Brent M. Spell's solution, written to seek enlightenment on performance considerations in idiomatic Clojure.
I wonder whether constructing each subset as a set inside the loop, instead of making another pass with (map set ...), would save some overhead, especially when the set is very large.
(defn power [s]
(set (loop [[f & r] (seq s) p '(#{})]
(if f (recur r (concat p (map #(conj % f) p)))
p))))
(power [1 2 3])
;; => #{#{} #{3} #{2} #{1} #{1 3 2} #{1 3} #{1 2} #{3 2}}
It seems to me that loop and recur are not lazy.
It would be nice to have a lazily evaluated version like Brent's, keeping the expression elegant while using laziness for efficiency.
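One way such a lazy version might look is sketched below. It swaps loop/recur for plain recursion wrapped in lazy-seq, so it is a different technique from the loop above. The name lazy-subsets is hypothetical, and the sketch assumes the input elements are distinct.

```clojure
(defn lazy-subsets
  "Sketch: lazy seq of all subsets of coll, including the empty set.
  Subsets are only built as the seq is consumed."
  [coll]
  (lazy-seq
    (if-let [[x & more] (seq coll)]
      (let [smaller (lazy-subsets more)]
        ;; every subset without x, followed by every subset with x
        (concat smaller (map #(conj % x) smaller)))
      (list #{}))))

(count (lazy-subsets [1 2 3]))
;; => 8

;; Only the requested subsets are realized, even for a large input:
(take 3 (lazy-subsets (range 50)))
;; => (#{} #{49} #{48})
```

Wrapping the call in (remove empty? ...) drops the empty set if it is not wanted.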
The loop-based version has another advantage as a framework: it makes it easy to prune candidate subsets when there are too many subsets to compute. The pruning logic goes at the position of conj. I used it to implement the Apriori algorithm for frequent itemset mining.
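A minimal sketch of that pruning idea follows. It uses reduce rather than loop/recur, and the names power-pruned and keep? are hypothetical. It assumes the predicate is anti-monotone (if a set fails, all of its supersets fail too), because a rejected candidate is never extended.

```clojure
(defn power-pruned
  "Sketch: builds subsets of s incrementally and keeps only the
  candidates for which keep? is truthy. The empty set is always kept."
  [keep? s]
  (set (reduce (fn [subsets x]
                 ;; pruning happens right where the candidate is conj'ed
                 (into subsets (filter keep? (map #(conj % x) subsets))))
               [#{}]
               s)))

;; Keep only subsets with at most two elements:
(power-pruned #(<= (count %) 2) [1 2 3])
;; => #{#{} #{1} #{2} #{3} #{1 2} #{1 3} #{2 3}}  (printed order may differ)
```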
refer to: Algorithm to return all combinations of k elements from n
(defn comb [k l]
(if (= 1 k) (map vector l)
(apply concat
(map-indexed
#(map (fn [x] (conj x %2))
(comb (dec k) (drop (inc %1) l)))
l))))
(defn all-subsets [s]
(apply concat
(for [x (range 1 (inc (count s)))]
(map #(into #{} %) (comb x s)))))
; (all-subsets #{1 2 3})
; (#{1} #{2} #{3} #{1 2} #{1 3} #{2 3} #{1 2 3})
This version is loosely modeled after the ES5 version on Rosetta Code. I know this question seems reasonably solved already... but here you go, anyways.
(fn [s]
(reduce
(fn [a b] (clojure.set/union a
(set (map (fn [y] (clojure.set/union #{b} y)) a))))
#{#{}} s))

How to Increment Values in a Map

I am wrapping my head around state in Clojure. I come from languages where state can be mutated. For example, in Python, I can create a dictionary, put some string => integer pairs inside, and then walk over the dictionary and increment the values.
How would I do this in idiomatic Clojure?
(def my-map {:a 1 :b 2})
(zipmap (keys my-map) (map inc (vals my-map)))
;;=> {:b 3, :a 2}
To update only one value by key:
(update-in my-map [:b] inc) ;;=> {:a 1, :b 3}
Since Clojure 1.7 it's also possible to use update:
(update my-map :b inc)
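Since the question is about state: if the map really has to change over time, the same pure update can be applied to an atom with swap!. This is a small sketch and the name counters is hypothetical.

```clojure
;; Hypothetical state holder; the map inside is still immutable.
(def counters (atom {:a 1 :b 2}))

;; swap! applies a pure function to the current value and stores the result.
(swap! counters update :b inc)

@counters
;; => {:a 1, :b 3}
```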
Just produce a new map and use it:
(def m {:a 3 :b 4})
(apply merge
(map (fn [[k v]] {k (inc v) }) m))
; {:b 5, :a 4}
To update multiple values, you could also take advantage of reduce taking an already filled accumulator, and applying a function on that and every member of the following collection.
=> (reduce (fn [a k] (update-in a k inc)) {:a 1 :b 2 :c 3 :d 4} [[:a] [:c]])
{:a 2, :c 4, :b 2, :d 4}
Be aware that the keys need to be enclosed in vectors, but you can still do multiple update-ins in nested structures, as with the original update-in.
If you made it a generalized function, you could automatically wrap a vector over a key by testing it with coll?:
(defn multi-update-in
[m v f & args]
(reduce
(fn [acc p] (apply
(partial update-in acc (if (coll? p) p (vector p)) f)
args)) m v))
which would allow for single-level/key updates without the need for wrapping the keys in vectors
=> (multi-update-in {:a 1 :b 2 :c 3 :d 4} [:a :c] inc)
{:a 2, :c 4, :b 2, :d 4}
but still be able to do nested updates
(def people
{"keith" {:age 27 :hobby "needlefelting"}
"penelope" {:age 39 :hobby "thaiboxing"}
"brian" {:age 12 :hobby "rocket science"}})
=> (multi-update-in people [["keith" :age] ["brian" :age]] inc)
{"keith" {:age 28, :hobby "needlefelting"},
"penelope" {:age 39, :hobby "thaiboxing"},
"brian" {:age 13, :hobby "rocket science"}}
To slightly improve @Michiel Brokent's answer: this also works when the key is not yet present.
(update my-map :a #(if (nil? %) 1 (inc %)))
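The nil check can also be expressed with fnil, which supplies a default when the current value is nil. A short sketch, assuming Clojure 1.7 or later for update; the sample data is hypothetical:

```clojure
;; (fnil inc 0) behaves like inc, but treats nil as 0.
(update {:b 2} :a (fnil inc 0))
;; => {:b 2, :a 1}

;; The same idiom counts occurrences without initializing any keys:
(reduce (fn [counts word] (update counts word (fnil inc 0)))
        {}
        ["a" "b" "a"])
;; => {"a" 2, "b" 1}
```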
I've been toying with the same idea, so I came up with:
(defn remap
"returns a function which takes a map as argument
and applies f to each value in the map"
[f]
#(into {} (map (fn [[k v]] [k (f v)]) %)))
((remap inc) {:foo 1})
;=> {:foo 2}
or
(def inc-vals (remap inc))
(inc-vals {:foo 1})
;=> {:foo 2}

swap keys and values in a map

Is there a function to swap the keys and values of a given map? Given a map, I want the keys to become the values, and the values to become the keys.
(swap {:a 2 :b 4}) => {2 :a 4 :b}
One way to do it is
(zipmap (vals my-map) (keys my-map))
However, I am wondering whether Clojure provides a utility fn for this.
This is the purpose of map-invert in clojure.set:
user=> (clojure.set/map-invert {:a 2 :b 4})
{4 :b, 2 :a}
For anyone reading this at a later date I think the following should be helpful.
A small library is available here https://clojars.org/beoliver/map-inversions
Inverting a map may return a relation. If the map is injective (one-to-one) then the inverse will also be one-to-one. If the map (as is often the case) is many-to-one then you should use a set or vector.
#Values treated as atomic
##one-to-one
the values of the map are unique
(defn invert-one-to-one
"returns a one-to-one mapping"
[m]
(persistent! (reduce (fn [m [k v]] (assoc! m v k)) (transient {}) m)))
(def one-to-one {:a 1 :b 2 :c 3})
> (invert-one-to-one one-to-one)
{1 :a 2 :b 3 :c}
##many-to-one
The values of the map are not unique. This is very common - and it is safest to assume that your maps are of this form... so (def invert invert-many-to-one)
(defn invert-many-to-one
"returns a one-to-many mapping"
([m] (invert-many-to-one #{} m))
([to m]
(persistent!
(reduce (fn [m [k v]]
(assoc! m v (conj (get m v to) k)))
(transient {}) m))))
(def many-to-one {:a 1 :b 1 :c 2})
> (invert-many-to-one many-to-one)
{1 #{:b :a}, 2 #{:c}} ; as expected
> (invert-many-to-one [] many-to-one)
{1 [:a :b], 2 [:c]} ; we can also use vectors
> (invert-one-to-one many-to-one) ; what happens when we use the 'wrong' function?
{1 :b, 2 :c} ; we have lost information
#Values treated as collections
##one-to-many
values are sets/collections but their intersections are always empty.
(No element occurs in two different sets)
(defn invert-one-to-many
"returns a many-to-one mapping"
[m]
(persistent!
(reduce (fn [m [k vs]] (reduce (fn [m v] (assoc! m v k)) m vs))
(transient {}) m)))
(def one-to-many (invert-many-to-one many-to-one))
> one-to-many
{1 #{:b :a}, 2 #{:c}}
> (invert-one-to-many one-to-many)
{:b 1, :a 1, :c 2} ; notice that we don't need to return sets as vals
##many-to-many
values are sets/collections and there exists at least two values whose intersection is not empty. If your values are collections then it is best to assume that they fall into this category.
(defn invert-many-to-many
"returns a many-to-many mapping"
([m] (invert-many-to-many #{} m))
([to m]
(persistent!
(reduce (fn [m [k vs]]
(reduce (fn [m v] (assoc! m v (conj (get m v to) k))) m vs))
(transient {}) m))))
(def many-to-many {:a #{1 2} :b #{1 3} :c #{3 4}})
> (invert-many-to-many many-to-many)
{1 #{:b :a}, 2 #{:a}, 3 #{:c :b}, 4 #{:c}}
;; notice that there are no duplicates when we use a vector
;; this is because each key appears only once
> (invert-many-to-many [] many-to-many)
{1 [:a :b], 2 [:a], 3 [:b :c], 4 [:c]}
> (invert-many-to-one many-to-many)
{#{1 2} #{:a}, #{1 3} #{:b}, #{4 3} #{:c}}
> (invert-one-to-many many-to-many)
{1 :b, 2 :a, 3 :c, 4 :c}
> (invert-one-to-one many-to-many)
{#{1 2} :a, #{1 3} :b, #{4 3} :c} ; this would be missing information if we had another key :d mapping to say #{1 2}
You could also use invert-many-to-many on the one-to-many example.
There's a function reverse-map in clojure.contrib.datalog.util, it's implemented as:
(defn reverse-map
"Reverse the keys/values of a map"
[m]
(into {} (map (fn [[k v]] [v k]) m)))
Here is an option that may fit the problem using reduce:
(reduce #(assoc %1 (second %2) (first %2)) {} {:a 2 :b 4})
Here in a function
(defn invert [m] ; named m to avoid shadowing clojure.core/map
(reduce #(assoc %1 (second %2) (first %2)) {} m))
Calling
(invert {:a 2 :b 4})
Then there is reduce-kv (cleaner in my opinion):
(reduce-kv #(assoc %1 %3 %2) {} {:a 2 :b 4})