So I've just played with Clojure today.
Using this data,
(def test-data
[{:id 35462, :status "COMPLETED", :p 2640000, :i 261600}
{:id 35462, :status "CREATED", :p 240000, :i 3200}
{:id 57217, :status "COMPLETED", :p 470001, :i 48043}
{:id 57217, :status "CREATED", :p 1409999, :i 120105}])
Then transform the above data with,
(as-> test-data input
(group-by :id input)
(map (fn [x] {:id (key x)
:p {:a (as-> (filter #(= (:status %) "COMPLETED") (val x)) tmp
(into {} tmp)
(get tmp :p))
:b (as-> (filter #(= (:status %) "CREATED") (val x)) tmp
(into {} tmp)
(get tmp :p))}
:i {:a (as-> (filter #(= (:status %) "COMPLETED") (val x)) tmp
(into {} tmp)
(get tmp :i))
:b (as-> (filter #(= (:status %) "CREATED") (val x)) tmp
(into {} tmp)
(get tmp :i))}})
input)
(into [] input))
To produce,
[{:id 35462, :p {:a 2640000, :b 240000}, :i {:a 261600, :b 3200}}
{:id 57217, :p {:a 470001, :b 1409999}, :i {:a 48043, :b 120105}}]
But I have a feeling that my code is not the "Clojure way". So my question is, what is the "Clojure way" to achieve what I've produced?
The only things that stand out to me are using as-> when ->> would work just as well, and some work being done redundantly, and some destructuring opportunities:
(defn aggregate [[id values]]
(let [completed (->> (filter #(= (:status %) "COMPLETED") values)
(into {}))
created (->> (filter #(= (:status %) "CREATED") values)
(into {}))]
{:id id
:p {:a (:p completed)
:b (:p created)}
:i {:a (:i completed)
:b (:i created)}}))
(->> test-data
(group-by :id)
(map aggregate))
=>
({:id 35462, :p {:a 2640000, :b 240000}, :i {:a 261600, :b 3200}}
{:id 57217, :p {:a 470001, :b 1409999}, :i {:a 48043, :b 120105}})
However, pouring those filtered values (which are maps themselves) into a map seems suspect to me. This is creating a last-one-wins scenario where the order of your test data affects the output. Try this to see how different orders of test-data affect output:
(into {} (filter #(= (:status %) "COMPLETED") (shuffle test-data)))
It's a pretty odd transformation, keys seem a little arbitrary and it's hard to generalise from n=2 (or indeed to know whether n ever > 2).
I'd use functional decomposition to factor out some of the commonality and get some traction. First of all let us transform the statuses into our keys...
(def status->ab {"COMPLETED" :a "CREATED" :b})
Then, with that in hand, I'd like an easy way of getting the "meat" outof the substructure. Here, for a given key into the data, I'm providing the content of the enclosing map for that key and a given group result.
(defn subgroup->subresult [k subgroup]
(apply array-map (mapcat #(vector (status->ab (:status %)) (k %)) subgroup)))
With this, the main transformer becomes much more tractable:
(defn group->result [group]
{
:id (key group)
:p (subgroup->subresult :p (val group))
:i (subgroup->subresult :i (val group))})
I wouldn't consider generalising across :p and :i for this - if you had more than two keys, then maybe I would generate a map of k -> the subgroup result and do some sort of reducing merge. Anyway, we have an answer:
(map group->result (group-by :id test-data))
;; =>
({:id 35462, :p {:b 240000, :a 2640000}, :i {:b 3200, :a 261600}}
{:id 57217, :p {:b 1409999, :a 470001}, :i {:b 120105, :a 48043}})
There are no one "Clojure way" (I guess you mean functional way) as it depends on how you decompose a problem.
Here is the way I will do:
(->> test-data
(map (juxt :id :status identity))
(map ->nested)
(apply deep-merge)
(map (fn [[id m]]
{:id id
:p (->ab-map m :p)
:i (->ab-map m :i)})))
;; ({:id 35462, :p {:a 2640000, :b 240000}, :i {:a 261600, :b 3200}}
;; {:id 57217, :p {:a 470001, :b 1409999}, :i {:a 48043, :b 120105}})
As you can see, I used a few functions and here is the step-by-step explanation:
Extract index keys (id + status) and the map itself into vector
(map (juxt :id :status identity) test-data)
;; ([35462 "COMPLETED" {:id 35462, :status "COMPLETED", :p 2640000, :i 261600}]
;; [35462 "CREATED" {:id 35462, :status "CREATED", :p 240000, :i 3200}]
;; [57217 "COMPLETED" {:id 57217, :status "COMPLETED", :p 470001, :i 48043}]
;; [57217 "CREATED" {:id 57217, :status "CREATED", :p 1409999, :i 120105}])
Transform into nested map (id, then status)
(map ->nested *1)
;; ({35462 {"COMPLETED" {:id 35462, :status "COMPLETED", :p 2640000, :i 261600}}}
;; {35462 {"CREATED" {:id 35462, :status "CREATED", :p 240000, :i 3200}}}
;; {57217 {"COMPLETED" {:id 57217, :status "COMPLETED", :p 470001, :i 48043}}}
;; {57217 {"CREATED" {:id 57217, :status "CREATED", :p 1409999, :i 120105}}})
Merge nested map by id
(apply deep-merge *1)
;; {35462
;; {"COMPLETED" {:id 35462, :status "COMPLETED", :p 2640000, :i 261600},
;; "CREATED" {:id 35462, :status "CREATED", :p 240000, :i 3200}},
;; 57217
;; {"COMPLETED" {:id 57217, :status "COMPLETED", :p 470001, :i 48043},
;; "CREATED" {:id 57217, :status "CREATED", :p 1409999, :i 120105}}}
For attribute :p and :i, map to :a and :b according to status
(->ab-map {"COMPLETED" {:id 35462, :status "COMPLETED", :p 2640000, :i 261600},
"CREATED" {:id 35462, :status "CREATED", :p 240000, :i 3200}}
:p)
;; => {:a 2640000, :b 240000}
And below are the few helper functions I used:
(defn ->ab-map [m k]
(zipmap [:a :b]
(map #(get-in m [% k]) ["COMPLETED" "CREATED"])))
(defn ->nested [[k & [v & r :as t]]]
{k (if (seq r) (->nested t) v)})
(defn deep-merge [& xs]
(if (every? map? xs)
(apply merge-with deep-merge xs)
(apply merge xs)))
I would approach it more like the following, so it can handle any number of entries for each :id value. Of course, many variations are possible.
(ns tst.demo.core
(:use demo.core tupelo.core tupelo.test)
(:require
[tupelo.core :as t] ))
(dotest
(let [test-data [{:id 35462, :status "COMPLETED", :p 2640000, :i 261600}
{:id 35462, :status "CREATED", :p 240000, :i 3200}
{:id 57217, :status "COMPLETED", :p 470001, :i 48043}
{:id 57217, :status "CREATED", :p 1409999, :i 120105}]
d1 (group-by :id test-data)
d2 (t/forv [[id entries] d1]
{:id id
:status-all (mapv :status entries)
:p-all (mapv :p entries)
:i-all (mapv :i entries)})]
(is= d1
{35462
[{:id 35462, :status "COMPLETED", :p 2640000, :i 261600}
{:id 35462, :status "CREATED", :p 240000, :i 3200}],
57217
[{:id 57217, :status "COMPLETED", :p 470001, :i 48043}
{:id 57217, :status "CREATED", :p 1409999, :i 120105}]})
(is= d2 [{:id 35462,
:status-all ["COMPLETED" "CREATED"],
:p-all [2640000 240000],
:i-all [261600 3200]}
{:id 57217,
:status-all ["COMPLETED" "CREATED"],
:p-all [470001 1409999],
:i-all [48043 120105]}])
))
I have the following map:
(def gigs {:gig-01 {:id :gig-01
:title "Macaron"
:artist "Baher Khairy"
:desc "Sweet meringue-based rhythms with smooth and sweet injections of soul"
:img "https://res.cloudinary.com/schae/image/upload/f_auto,q_auto/v1519552695/giggin/baher-khairy-97645.jpg"
:price 1000
:sold-out false}
:gig-02 {:id :gig-02
:title "Stairs"
:artist "Brentr De Ranter"
:desc "Stairs to the highets peaks of music."
:img "https://res.cloudinary.com/schae/image/upload/f_auto,q_auto/v1519552695/giggin/brent-de-ranter-426248.jpg"
:price 2000
:sold-out false}})
I'd like to create a spec for it, but I'm not sure how to define the key e.g. ":gig-01" any pointers?
You could try:
(s/def ::gig-id
(s/and keyword?
(fn [x] (->> x name (re-matches #"gig-\d+")))))
I recognize that clojure.spec isn't intended for arbitrary data transformation, and as I understand it, it is intended for flexibly encoding domain knowledge via arbitrary predicates. It's an insanely powerful tool, and I love using it.
So much, perhaps, that I've run into a scenario where I am mergeing maps, component-a and component-b, each of which can take one of many forms, into a composite, and then later wanting to "unmix" the composite into its component parts.
This is modeled as two multi-specs for the components and an s/merge of those components for the composite:
;; component-a
(defmulti component-a :protocol)
(defmethod component-a :p1 [_]
(s/keys :req-un [::x ::y ::z]))
(defmethod component-a :p2 [_]
(s/keys :req-un [::p ::q ::r]))
(s/def ::component-a
(s/multi-spec component-a :protocol))
;; component-b
(defmulti component-b :protocol)
(defmethod component-b :p1 [_]
(s/keys :req-un [::i ::j ::k]))
(defmethod component-b :p2 [_]
(s/keys :req-un [::s ::t]))
(s/def ::component-b
(s/multi-spec component-b :protocol))
;; composite
(s/def ::composite
(s/merge ::component-a ::component-b)
What I'd like to be able to do is the following:
(def p1a {:protocol :p1 :x ... :y ... :z ...})
(def p1b (make-b p1a)) ; => {:protocol :p1 :i ... :j ... :k ...}
(def a (s/conform ::component-a p1a))
(def b (s/conform ::component-b p1b))
(def ab1 (s/conform ::composite (merge a b))
(?Fn ::component-a ab1) ; => {:protocol :p1 :x ... :y ... :z ...}
(?Fn ::component-b ab1) ; => {:protocol :p1 :i ... :j ... :k ...}
(def ab2 {:protocol :p2 :p ... :q ... :r ... :s ... :t ...})
(?Fn ::component-a ab2) ; => {:protocol :p2 :p ... :q ... :r ...}
(?Fn ::component-b ab2) ; => {:protocol :p2 :s ... :t ...}
In other words, I'd like to reuse the domain knowledge encoded for component-a and component-b, to decompose a composite.
My first thought was to isolate the keys themselves from the call to s/keys:
(defmulti component-a :protocol)
(defmethod component-a :p1 [_]
(s/keys :req-un <form>)) ; <form> must look like [::x ::y ::z]
However, approaches where the keys of s/keys are computed from "something else" fail because <form> must be an ISeq. That is, <form> can neither be a fn that computes an ISeq, nor a symbol that represents an ISeq.
I also experimented with using s/describe to read the keys dynamically at run-time, but this doesn't work generally with multi-specs as it would with a simple s/def. I won't say I exhausted this approach, but it seemed like a rabbit hole of recursive s/describes and accessing multifns underlying multi-specs directly, which felt dirty.
I also thought about adding a separate multifn based on :protocol:
(defmulti decompose-composite :protocol)
(defmethod decompose-composite :p1
[composite]
{:component-a (select-keys composite [x y z])
:component-b (select-keys composite [i j k]))
But this obviously doesn't reuse domain knowledge, it just duplicates it and exposes another avenue of applying it. It's also specific to the one composite; we'd need a decompose-other-composite for a different composite.
So at this point this is just a fun puzzle. We could always nest the components in the composite, making them trivial to isolate again:
(s/def ::composite
(s/keys :req-un [::component-a ::component-b]))
(def ab {:component-a a :component-b b})
(do-composite-stuff (apply merge (vals ab)))
But is there a better way to achieve ?Fn? Could a custom s/conformer do something like this? Or are merged maps more like physical mixtures, i.e. disproportionately harder to separate?
I also experimented with using s/describe to read the keys dynamically at run-time, but this doesn't work generally with multi-specs as it would with a simple s/def
A workaround that comes to mind is defining the s/keys specs separate from/outside of the defmethods, then getting the s/keys form back and pulling the keywords out.
;; component-a
(s/def ::component-a-p1-map
(s/keys :req-un [::protocol ::x ::y ::z])) ;; NOTE explicit ::protocol key added
(defmulti component-a :protocol)
(defmethod component-a :p1 [_] ::component-a-p1-map)
(s/def ::component-a
(s/multi-spec component-a :protocol))
;; component-b
(defmulti component-b :protocol)
(s/def ::component-b-p1-map
(s/keys :req-un [::protocol ::i ::j ::k]))
(defmethod component-b :p1 [_] ::component-b-p1-map)
(s/def ::component-b
(s/multi-spec component-b :protocol))
;; composite
(s/def ::composite (s/merge ::component-a ::component-b))
(def p1a {:protocol :p1 :x 1 :y 2 :z 3})
(def p1b {:protocol :p1 :i 4 :j 5 :k 6})
(def a (s/conform ::component-a p1a))
(def b (s/conform ::component-b p1b))
(def ab1 (s/conform ::composite (merge a b)))
With standalone specs for the s/keys specs, you can get the individual keys back using s/form:
(defn get-spec-keys [keys-spec]
(let [unqualify (comp keyword name)
{:keys [req req-un opt opt-un]}
(->> (s/form keys-spec)
(rest)
(apply hash-map))]
(concat req (map unqualify req-un) opt (map unqualify opt-un))))
(get-spec-keys ::component-a-p1-map)
=> (:protocol :x :y :z)
And with that you can use select-keys on the composite map:
(defn ?Fn [spec m]
(select-keys m (get-spec-keys spec)))
(?Fn ::component-a-p1-map ab1)
=> {:protocol :p1, :x 1, :y 2, :z 3}
(?Fn ::component-b-p1-map ab1)
=> {:protocol :p1, :i 4, :j 5, :k 6}
And using your decompose-composite idea:
(defmulti decompose-composite :protocol)
(defmethod decompose-composite :p1
[composite]
{:component-a (?Fn ::component-a-p1-map composite)
:component-b (?Fn ::component-b-p1-map composite)})
(decompose-composite ab1)
=> {:component-a {:protocol :p1, :x 1, :y 2, :z 3},
:component-b {:protocol :p1, :i 4, :j 5, :k 6}}
However, approaches where the keys of s/keys are computed from "something else" fail because must be an ISeq. That is, can neither be a fn that computes an ISeq, nor a symbol that represents an ISeq.
Alternatively, you could eval a programmatically constructed s/keys form:
(def some-keys [::protocol ::x ::y ::z])
(s/form (eval `(s/keys :req-un ~some-keys)))
=> (clojure.spec.alpha/keys :req-un [:sandbox.core/protocol
:sandbox.core/x
:sandbox.core/y
:sandbox.core/z])
And then use some-keys directly later.
I've got the following tree:
{:start_date "2014-12-07"
:data {
:people [
{:id 1
:projects [{:id 1} {:id 2}]}
{:id 2
:projects [{:id 1} {:id 3}]}
]
}
}
I want to update the people and projects subtrees by adding a :name key-value pair.
Assuming I have these maps to perform the lookup:
(def people {1 "Susan" 2 "John")
(def projects {1 "Foo" 2 "Bar" 3 "Qux")
How could I update the original tree so that I end up with the following?
{:start_date "2014-12-07"
:data {
:people [
{:id 1
:name "Susan"
:projects [{:id 1 :name "Foo"} {:id 2 :name "Bar"}]}
{:id 2
:name "John"
:projects [{:id 1 :name "Foo"} {:id 3 :name "Qux"}]}
]
}
}
I've tried multiple combinations of assoc-in, update-in, get-in and map calls, but haven't been able to figure this out.
I have used letfn to break down the update into easier to understand units.
user> (def tree {:start_date "2014-12-07"
:data {:people [{:id 1
:projects [{:id 1} {:id 2}]}
{:id 2
:projects [{:id 1} {:id 3}]}]}})
#'user/tree
user> (def people {1 "Susan" 2 "John"})
#'user/people
user> (def projects {1 "Foo" 2 "Bar" 3 "Qux"})
#'user/projects
user>
(defn integrate-tree
[tree people projects]
;; letfn is like let, but it creates fn, and allows forward references
(letfn [(update-person [person]
;; -> is the "thread first" macro, the result of each expression
;; becomes the first arg to the next
(-> person
(assoc :name (people (:id person)))
(update-in [:projects] update-projects)))
(update-projects [all-projects]
(mapv
#(assoc % :name (projects (:id %)))
all-projects))]
(update-in tree [:data :people] #(mapv update-person %))))
#'user/integrate-tree
user> (pprint (integrate-tree tree people projects))
{:start_date "2014-12-07",
:data
{:people
[{:projects [{:name "Foo", :id 1} {:name "Bar", :id 2}],
:name "Susan",
:id 1}
{:projects [{:name "Foo", :id 1} {:name "Qux", :id 3}],
:name "John",
:id 2}]}}
nil
Not sure if entirely the best approach:
(defn update-names
[tree people projects]
(reduce
(fn [t [id name]]
(let [person-idx (ffirst (filter #(= (:id (second %)) id)
(map-indexed vector (:people (:data t)))))
temp (assoc-in t [:data :people person-idx :name] name)]
(reduce
(fn [t [id name]]
(let [project-idx (ffirst (filter #(= (:id (second %)) id)
(map-indexed vector (get-in t [:data :people person-idx :projects]))))]
(if project-idx
(assoc-in t [:data :people person-idx :projects project-idx :name] name)
t)))
temp
projects)))
tree
people))
Just call it with your parameters:
(clojure.pprint/pprint (update-names tree people projects))
{:start_date "2014-12-07",
:data
{:people
[{:projects [{:name "Foo", :id 1} {:name "Bar", :id 2}],
:name "Susan",
:id 1}
{:projects [{:name "Foo", :id 1} {:name "Qux", :id 3}],
:name "John",
:id 2}]}}
With nested reduces
Reduce over the people to update corresponding names
For each people, reduce over projects to update corresponding names
The noisesmith solution looks better since doesn't need to find person index or project index for each step.
Naturally you tried to assoc-in or update-in but the problem lies in your tree structure, since the key path to update John name is [:data :people 1 :name], so your assoc-in code would look like:
(assoc-in tree [:data :people 1 :name] "John")
But you need to find John's index in the people vector before you can update it, same things happens with projects inside.
I've got two maps:
(def people {:1 "John" :2 "Paul" :3 "Ringo" :4 "George"})
(def band
{:data
{:members
{:1 {:id 1 :name "John"}
:2 {:id 2 :name "Paul"}}}})
I want to loop over people and add any members that don't exist in [:data :members] to band, resulting in:
(def band
{:data
{:members
{:1 {:id 1 :name "John"}
:2 {:id 2 :name "Paul"}
:3 {:id 3 :name "Ringo"}
:4 {:id 4 :name "George"}}}})
Here's what I've tried:
(for [[id name] people]
(when-not
(contains? (get-in band [:data :members]) id)
(assoc-in band [:data :members id] {:id id :name name})))
Which yields:
({:data
{:members
{:4 {:id :4, :name "George"},
:1 {:name "John", :id 1},
:2 {:name "Paul", :id 2}}}}
nil
nil
{:data
{:members
{:1 {:name "John", :id 1},
:2 {:name "Paul", :id 2},
:3 {:id :3, :name "Ringo"}}}})
I'm not sure why I'm getting back what looks to be a list of each mutation of band. What am I doing wrong here? How can I add the missing members of people to band [:data :members]?
To be pedantic you aren't getting back any mutation of band. In fact, one of the most important features of Clojure is that the standard types are immutible, and the primary collection operations return a modified copy without changing the original.
Also, for in Clojure is not a loop, it is a list comprehension. This is why it always returns a sequence of each step. So instead of altering an input one step at a time, you made a new variation on the input for each step, each derived from the immutable original.
The standard construct for making a series of updated copies of an input based on a sequence of values is reduce, which passes a new version of the accumulator and each element of the list to your function.
Finally, you are misunderstanding the role of :keyword syntax - prefixing an item with a : is not needed in order to construct map keys - just about any clojure value is a valid key for a map, and keywords are just a convenient idiom.
user=> (def band
{:data
{:members
{1 {:id 1 :name "John"}
2 {:id 2 :name "Paul"}}}})
#'user/band
user=> (def people {1 "John" 2 "Paul" 3 "Ringo" 4 "George"})
#'user/people
user=> (pprint
(reduce (fn [band [id name :as person]]
(if-not (contains? (get-in band [:data :members]) id)
(assoc-in band [:data :members id] {:id id :name name})
band))
band
people))
{:data
{:members
{3 {:id 3, :name "Ringo"},
4 {:id 4, :name "George"},
1 {:name "John", :id 1},
2 {:name "Paul", :id 2}}}}
nil
You may notice the body of the fn passed to reduce is essentially the same as the body of your for comprehension. The difference is that instead of when-not which returns nil on the alternate case, I use if-not, which allows us to propagate the accumulator (here called band, same as the input) regardless of whether any new version of it is made.