How best to update this tree? - clojure

I've got the following tree:
{:start_date "2014-12-07"
:data {
:people [
{:id 1
:projects [{:id 1} {:id 2}]}
{:id 2
:projects [{:id 1} {:id 3}]}
]
}
}
I want to update the people and projects subtrees by adding a :name key-value pair.
Assuming I have these maps to perform the lookup:
(def people {1 "Susan" 2 "John"})
(def projects {1 "Foo" 2 "Bar" 3 "Qux"})
How could I update the original tree so that I end up with the following?
{:start_date "2014-12-07"
:data {
:people [
{:id 1
:name "Susan"
:projects [{:id 1 :name "Foo"} {:id 2 :name "Bar"}]}
{:id 2
:name "John"
:projects [{:id 1 :name "Foo"} {:id 3 :name "Qux"}]}
]
}
}
I've tried multiple combinations of assoc-in, update-in, get-in and map calls, but haven't been able to figure this out.

I have used letfn to break down the update into easier to understand units.
user> (def tree {:start_date "2014-12-07"
:data {:people [{:id 1
:projects [{:id 1} {:id 2}]}
{:id 2
:projects [{:id 1} {:id 3}]}]}})
#'user/tree
user> (def people {1 "Susan" 2 "John"})
#'user/people
user> (def projects {1 "Foo" 2 "Bar" 3 "Qux"})
#'user/projects
user>
(defn integrate-tree
[tree people projects]
;; letfn is like let, but it creates fn, and allows forward references
(letfn [(update-person [person]
;; -> is the "thread first" macro, the result of each expression
;; becomes the first arg to the next
(-> person
(assoc :name (people (:id person)))
(update-in [:projects] update-projects)))
(update-projects [all-projects]
(mapv
#(assoc % :name (projects (:id %)))
all-projects))]
(update-in tree [:data :people] #(mapv update-person %))))
#'user/integrate-tree
user> (pprint (integrate-tree tree people projects))
{:start_date "2014-12-07",
:data
{:people
[{:projects [{:name "Foo", :id 1} {:name "Bar", :id 2}],
:name "Susan",
:id 1}
{:projects [{:name "Foo", :id 1} {:name "Qux", :id 3}],
:name "John",
:id 2}]}}
nil

Not sure if entirely the best approach:
(defn update-names
[tree people projects]
(reduce
(fn [t [id name]]
(let [person-idx (ffirst (filter #(= (:id (second %)) id)
(map-indexed vector (:people (:data t)))))
temp (assoc-in t [:data :people person-idx :name] name)]
(reduce
(fn [t [id name]]
(let [project-idx (ffirst (filter #(= (:id (second %)) id)
(map-indexed vector (get-in t [:data :people person-idx :projects]))))]
(if project-idx
(assoc-in t [:data :people person-idx :projects project-idx :name] name)
t)))
temp
projects)))
tree
people))
Just call it with your parameters:
(clojure.pprint/pprint (update-names tree people projects))
{:start_date "2014-12-07",
:data
{:people
[{:projects [{:name "Foo", :id 1} {:name "Bar", :id 2}],
:name "Susan",
:id 1}
{:projects [{:name "Foo", :id 1} {:name "Qux", :id 3}],
:name "John",
:id 2}]}}
This uses nested reduces:
Reduce over the people to update the corresponding names.
For each person, reduce over the projects to update the corresponding names.
The noisesmith solution looks better, since it doesn't need to find the person index or project index at each step.
Naturally you tried assoc-in and update-in, but the difficulty lies in your tree structure: the key path to update John's name is [:data :people 1 :name], so your assoc-in code would look like:
(assoc-in tree [:data :people 1 :name] "John")
But you need to find John's index in the people vector before you can update it, and the same thing happens with the projects inside.
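The index lookup described above can be sketched roughly as follows. The names index-of-id, set-person-name and small-tree are hypothetical and introduced only for illustration, and the sketch assumes ids are unique within the vector.

```clojure
;; Hypothetical helper: index of the first map in `coll` whose :id equals `id`,
;; or nil when no such map exists.
(defn index-of-id [coll id]
  (first (keep-indexed (fn [idx m] (when (= (:id m) id) idx)) coll)))

;; A cut-down version of the tree from the question (assumed for illustration).
(def small-tree {:data {:people [{:id 1} {:id 2}]}})

;; Find the person's index first, then build the assoc-in path from it.
;; The tree is returned unchanged when the id is not present.
(defn set-person-name [tree id person-name]
  (if-let [idx (index-of-id (get-in tree [:data :people]) id)]
    (assoc-in tree [:data :people idx :name] person-name)
    tree))

(set-person-name small-tree 2 "John")
;; => {:data {:people [{:id 1} {:id 2, :name "John"}]}}
```

Each lookup scans the vector, which is why mapping over the people once, as in the letfn answer, tends to be simpler.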

Related

Get list of values from vector of hashmap using filter or for

I'm quite new to clojure and have been struggling to understand how things work exactly. I have a vector of hashmaps as such, titled authors:
------ Authors -----------
[{:id 100, :name "Albert Einstein", :interest "Physics"}
{:id 200, :name "Alan Turing", :interest "Computer Science"}
{:id 300, :name "Jeff Dean", :interest "Programming"}]
I want to write a function that takes the id and returns a list of the corresponding author names. I have two options for doing so: using filter or using a for loop.
When using filter, I have a predicate function already that returns true if the author has matching id:
(defn check-by-id [author id]
(if (= id (:id author)) true false))
But I'm not sure how to use this in order to get the list of author names when passing the id.
Three other ways via keep, for and reduce:
(keep (fn [{:keys [id name]}] (when (= id 100) name)) authors)
;; => ("Albert Einstein")
(for [{:keys [id name]} authors
:when (= id 100)]
name)
;; => ("Albert Einstein")
(reduce (fn [v {:keys [id name]}]
(if (= id 100) (conj v name) v))
[]
authors)
;; => ["Albert Einstein"]
I prefer for (with :when) since it's the shortest and, in my eyes, the clearest. I find reduce best when you want to build a specific type of collection, in this case a vector.
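If the goal is specifically a vector, another option is into with a transducer (available since Clojure 1.7). This is only a sketch; names-by-id is a hypothetical name, not something from the question.

```clojure
(def authors
  [{:id 100, :name "Albert Einstein", :interest "Physics"}
   {:id 200, :name "Alan Turing", :interest "Computer Science"}
   {:id 300, :name "Jeff Dean", :interest "Programming"}])

;; Hypothetical helper: keep the authors with a matching :id, extract :name,
;; and collect the results into a vector in a single pass.
(defn names-by-id [id authors]
  (into []
        (comp (filter #(= id (:id %)))
              (map :name))
        authors))

(names-by-id 100 authors)
;; => ["Albert Einstein"]
```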
Filter will filter the list of maps. But the result is still a sequence of maps. You can map or reduce the result to get the list of authors.
(def authors [{:id 100, :name "Albert Einstein", :interest "Physics"}
{:id 100, :name "Richard Feynman", :interest "Physics"}
{:id 200, :name "Alan Turing", :interest "Computer Science"}
{:id 300, :name "Jeff Dean", :interest "Programming"}])
(defn check-by-id [author id] (= id (:id author)))
(defn filter-ids [id col] (filter #(check-by-id % id) col))
(filter-ids 100 authors)
;; ↪ ({:id 100, :name "Albert Einstein", :interest "Physics"}
;; {:id 100, :name "Richard Feynman", :interest "Physics"})
(map :name (filter-ids 100 authors))
;; ↪ ("Albert Einstein" "Richard Feynman")
You can also use group-by for this task:
(def list-of-maps
[{:id 100, :name "Albert Einstein", :interest "Physics"}
{:id 200, :name "Alan Turing", :interest "Computer Science"}
{:id 300, :name "Jeff Dean", :interest "Programming"}])
(map :name (get (group-by :id list-of-maps) 100))
;; => ("Albert Einstein")

What is the Clojure way to transform following data?

So I've just played with Clojure today.
Using this data,
(def test-data
[{:id 35462, :status "COMPLETED", :p 2640000, :i 261600}
{:id 35462, :status "CREATED", :p 240000, :i 3200}
{:id 57217, :status "COMPLETED", :p 470001, :i 48043}
{:id 57217, :status "CREATED", :p 1409999, :i 120105}])
Then transform the above data with,
(as-> test-data input
(group-by :id input)
(map (fn [x] {:id (key x)
:p {:a (as-> (filter #(= (:status %) "COMPLETED") (val x)) tmp
(into {} tmp)
(get tmp :p))
:b (as-> (filter #(= (:status %) "CREATED") (val x)) tmp
(into {} tmp)
(get tmp :p))}
:i {:a (as-> (filter #(= (:status %) "COMPLETED") (val x)) tmp
(into {} tmp)
(get tmp :i))
:b (as-> (filter #(= (:status %) "CREATED") (val x)) tmp
(into {} tmp)
(get tmp :i))}})
input)
(into [] input))
To produce,
[{:id 35462, :p {:a 2640000, :b 240000}, :i {:a 261600, :b 3200}}
{:id 57217, :p {:a 470001, :b 1409999}, :i {:a 48043, :b 120105}}]
But I have a feeling that my code is not the "Clojure way". So my question is, what is the "Clojure way" to achieve what I've produced?
The only things that stand out to me are using as-> when ->> would work just as well, and some work being done redundantly, and some destructuring opportunities:
(defn aggregate [[id values]]
(let [completed (->> (filter #(= (:status %) "COMPLETED") values)
(into {}))
created (->> (filter #(= (:status %) "CREATED") values)
(into {}))]
{:id id
:p {:a (:p completed)
:b (:p created)}
:i {:a (:i completed)
:b (:i created)}}))
(->> test-data
(group-by :id)
(map aggregate))
=>
({:id 35462, :p {:a 2640000, :b 240000}, :i {:a 261600, :b 3200}}
{:id 57217, :p {:a 470001, :b 1409999}, :i {:a 48043, :b 120105}})
However, pouring those filtered values (which are maps themselves) into a map seems suspect to me. This is creating a last-one-wins scenario where the order of your test data affects the output. Try this to see how different orders of test-data affect output:
(into {} (filter #(= (:status %) "COMPLETED") (shuffle test-data)))
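A smaller illustration of the last-one-wins behaviour, using made-up data: pouring several maps into {} merges them, so for a repeated key only the value from the last map survives.

```clojure
;; Two entries with the same status but different :p values (made-up data).
(def forward  [{:status "COMPLETED" :p 1} {:status "COMPLETED" :p 2}])
(def backward (vec (reverse forward)))

;; into conj-es each map onto the accumulator, which behaves like merge.
(def merged-forward  (into {} forward))
;; => {:status "COMPLETED", :p 2}
(def merged-backward (into {} backward))
;; => {:status "COMPLETED", :p 1}
```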
It's a pretty odd transformation: the keys seem a little arbitrary, and it's hard to generalise from n=2 (or indeed to know whether n is ever greater than 2).
I'd use functional decomposition to factor out some of the commonality and get some traction. First of all let us transform the statuses into our keys...
(def status->ab {"COMPLETED" :a "CREATED" :b})
Then, with that in hand, I'd like an easy way of getting the "meat" out of the substructure. Here, for a given key into the data, I'm providing the content of the enclosing map for that key and a given group result.
(defn subgroup->subresult [k subgroup]
(apply array-map (mapcat #(vector (status->ab (:status %)) (k %)) subgroup)))
With this, the main transformer becomes much more tractable:
(defn group->result [group]
{
:id (key group)
:p (subgroup->subresult :p (val group))
:i (subgroup->subresult :i (val group))})
I wouldn't consider generalising across :p and :i for this - if you had more than two keys, then maybe I would generate a map of k -> the subgroup result and do some sort of reducing merge. Anyway, we have an answer:
(map group->result (group-by :id test-data))
;; =>
({:id 35462, :p {:b 240000, :a 2640000}, :i {:b 3200, :a 261600}}
{:id 57217, :p {:b 1409999, :a 470001}, :i {:b 120105, :a 48043}})
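The generalisation mentioned above (a map of k -> subgroup result for an arbitrary list of keys) might look roughly like this. The names status-values and group->result-for are hypothetical, chosen so they don't clash with the functions in this answer, and the sketch assumes each status appears at most once per id.

```clojure
(def test-data
  [{:id 35462, :status "COMPLETED", :p 2640000, :i 261600}
   {:id 35462, :status "CREATED", :p 240000, :i 3200}
   {:id 57217, :status "COMPLETED", :p 470001, :i 48043}
   {:id 57217, :status "CREATED", :p 1409999, :i 120105}])

(def status->ab {"COMPLETED" :a "CREATED" :b})

;; Hypothetical helper: for key k, build {:a <COMPLETED value> :b <CREATED value>}.
(defn status-values [k entries]
  (into {} (map (juxt (comp status->ab :status) k)) entries))

;; Hypothetical generalisation: one sub-map per key in ks, added to {:id id}.
(defn group->result-for [ks [id entries]]
  (into {:id id}
        (map (fn [k] [k (status-values k entries)]))
        ks))

(map #(group->result-for [:p :i] %) (group-by :id test-data))
;; => ({:id 35462, :p {:a 2640000, :b 240000}, :i {:a 261600, :b 3200}}
;;     {:id 57217, :p {:a 470001, :b 1409999}, :i {:a 48043, :b 120105}})
```

Adding another attribute then only means extending the vector of keys passed in.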
There is no single "Clojure way" (I guess you mean functional way), as it depends on how you decompose the problem.
Here is how I would do it:
(->> test-data
(map (juxt :id :status identity))
(map ->nested)
(apply deep-merge)
(map (fn [[id m]]
{:id id
:p (->ab-map m :p)
:i (->ab-map m :i)})))
;; ({:id 35462, :p {:a 2640000, :b 240000}, :i {:a 261600, :b 3200}}
;; {:id 57217, :p {:a 470001, :b 1409999}, :i {:a 48043, :b 120105}})
As you can see, I used a few functions and here is the step-by-step explanation:
Extract index keys (id + status) and the map itself into vector
(map (juxt :id :status identity) test-data)
;; ([35462 "COMPLETED" {:id 35462, :status "COMPLETED", :p 2640000, :i 261600}]
;; [35462 "CREATED" {:id 35462, :status "CREATED", :p 240000, :i 3200}]
;; [57217 "COMPLETED" {:id 57217, :status "COMPLETED", :p 470001, :i 48043}]
;; [57217 "CREATED" {:id 57217, :status "CREATED", :p 1409999, :i 120105}])
Transform into nested map (id, then status)
(map ->nested *1)
;; ({35462 {"COMPLETED" {:id 35462, :status "COMPLETED", :p 2640000, :i 261600}}}
;; {35462 {"CREATED" {:id 35462, :status "CREATED", :p 240000, :i 3200}}}
;; {57217 {"COMPLETED" {:id 57217, :status "COMPLETED", :p 470001, :i 48043}}}
;; {57217 {"CREATED" {:id 57217, :status "CREATED", :p 1409999, :i 120105}}})
Merge nested map by id
(apply deep-merge *1)
;; {35462
;; {"COMPLETED" {:id 35462, :status "COMPLETED", :p 2640000, :i 261600},
;; "CREATED" {:id 35462, :status "CREATED", :p 240000, :i 3200}},
;; 57217
;; {"COMPLETED" {:id 57217, :status "COMPLETED", :p 470001, :i 48043},
;; "CREATED" {:id 57217, :status "CREATED", :p 1409999, :i 120105}}}
For attribute :p and :i, map to :a and :b according to status
(->ab-map {"COMPLETED" {:id 35462, :status "COMPLETED", :p 2640000, :i 261600},
"CREATED" {:id 35462, :status "CREATED", :p 240000, :i 3200}}
:p)
;; => {:a 2640000, :b 240000}
And below are the few helper functions I used:
(defn ->ab-map [m k]
(zipmap [:a :b]
(map #(get-in m [% k]) ["COMPLETED" "CREATED"])))
(defn ->nested [[k & [v & r :as t]]]
{k (if (seq r) (->nested t) v)})
(defn deep-merge [& xs]
(if (every? map? xs)
(apply merge-with deep-merge xs)
(apply merge xs)))
I would approach it more like the following, so it can handle any number of entries for each :id value. Of course, many variations are possible.
(ns tst.demo.core
(:use demo.core tupelo.core tupelo.test)
(:require
[tupelo.core :as t] ))
(dotest
(let [test-data [{:id 35462, :status "COMPLETED", :p 2640000, :i 261600}
{:id 35462, :status "CREATED", :p 240000, :i 3200}
{:id 57217, :status "COMPLETED", :p 470001, :i 48043}
{:id 57217, :status "CREATED", :p 1409999, :i 120105}]
d1 (group-by :id test-data)
d2 (t/forv [[id entries] d1]
{:id id
:status-all (mapv :status entries)
:p-all (mapv :p entries)
:i-all (mapv :i entries)})]
(is= d1
{35462
[{:id 35462, :status "COMPLETED", :p 2640000, :i 261600}
{:id 35462, :status "CREATED", :p 240000, :i 3200}],
57217
[{:id 57217, :status "COMPLETED", :p 470001, :i 48043}
{:id 57217, :status "CREATED", :p 1409999, :i 120105}]})
(is= d2 [{:id 35462,
:status-all ["COMPLETED" "CREATED"],
:p-all [2640000 240000],
:i-all [261600 3200]}
{:id 57217,
:status-all ["COMPLETED" "CREATED"],
:p-all [470001 1409999],
:i-all [48043 120105]}])
))

Find previous item by key value in vector of maps

I have a vector of maps like this:
[{:id 2 :val "v1"} {:id 5 :val "v2"} {:id 10 :val "v3"}]
and now I want to find an element previous to chosen id.
For example: when provided with id = 10 I want to receive:
{:id 5 :val "v2"}
and when selected id = 2 then return nil.
I'm new to ClojureScript programming and cannot think of a simple solution for this problem... Help please :)
You can use partition to pair adjacent maps and then search for a match on the second by id:
(def ms [{:id 2 :val "v1"} {:id 5 :val "v2"} {:id 10 :val "v3"}])
(ffirst (filter #(= 10 (:id (second %))) (partition 2 1 ms)))
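To make the intermediate step visible, here is the same idea wrapped in a function. previous-by-id is a hypothetical name used only for this sketch.

```clojure
(def ms [{:id 2 :val "v1"} {:id 5 :val "v2"} {:id 10 :val "v3"}])

;; (partition 2 1 ms) slides a window of size 2 along ms, one step at a time:
(partition 2 1 ms)
;; => (({:id 2, :val "v1"} {:id 5, :val "v2"})
;;     ({:id 5, :val "v2"} {:id 10, :val "v3"}))

;; Hypothetical wrapper: first element of the first pair whose second
;; element has the given id, or nil when there is no such pair.
(defn previous-by-id [id ms]
  (ffirst (filter #(= id (:id (second %))) (partition 2 1 ms))))

(previous-by-id 10 ms)
;; => {:id 5, :val "v2"}
(previous-by-id 2 ms)
;; => nil
```

The first element never appears as the second item of a pair, which is why asking for id 2 yields nil.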
(partition 2 1 data) in the accepted answer is an option, but here are two alternatives based on a "lagged" sequence.
This one first constructs a look-up table (mapping each next id to the item before it), which should be more performant if many lookups need to be done. You could even easily map over it. But this approach requires ids to be unique.
(let [data [{:id 2 :val "v1"} {:id 5 :val "v2"} {:id 10 :val "v3"}]
ids (zipmap (map :id (rest data)) data)]
[(ids 10)
ids])
; [{:id 5, :val "v2"}
; {5 {:id 2, :val "v1"}, 10 {:id 5, :val "v2"}}]
This second one generates a sequence of matching documents, which is necessary if there might be more than one:
(let [data [{:id 2 :val "v1"} {:id 5 :val "v2"} {:id 10 :val "v3"}]
next-ids (->> data rest (map :id))]
(->>
(map (fn [item next-id] (if (= 10 next-id) item))
data next-ids)
(filter some?)
first))
; {:id 5, :val "v2"}
You'd get similar code by using partition, but instead of #(...) you'd use destructuring: (fn [[first-item second-item]] (= 10 (:id second-item))). Indeed, ffirst comes in very handy in this approach.

Add items from collection 1 to collection 2, if collection 2 doesn't contain item from collection 1

I've got two maps:
(def people {:1 "John" :2 "Paul" :3 "Ringo" :4 "George"})
(def band
{:data
{:members
{:1 {:id 1 :name "John"}
:2 {:id 2 :name "Paul"}}}})
I want to loop over people and add any members that don't exist in [:data :members] to band, resulting in:
(def band
{:data
{:members
{:1 {:id 1 :name "John"}
:2 {:id 2 :name "Paul"}
:3 {:id 3 :name "Ringo"}
:4 {:id 4 :name "George"}}}})
Here's what I've tried:
(for [[id name] people]
(when-not
(contains? (get-in band [:data :members]) id)
(assoc-in band [:data :members id] {:id id :name name})))
Which yields:
({:data
{:members
{:4 {:id :4, :name "George"},
:1 {:name "John", :id 1},
:2 {:name "Paul", :id 2}}}}
nil
nil
{:data
{:members
{:1 {:name "John", :id 1},
:2 {:name "Paul", :id 2},
:3 {:id :3, :name "Ringo"}}}})
I'm not sure why I'm getting back what looks to be a list of each mutation of band. What am I doing wrong here? How can I add the missing members of people to band [:data :members]?
To be pedantic, you aren't getting back any mutation of band. In fact, one of the most important features of Clojure is that the standard types are immutable, and the primary collection operations return a modified copy without changing the original.
Also, for in Clojure is not a loop, it is a list comprehension. This is why it always returns a sequence of each step. So instead of altering an input one step at a time, you made a new variation on the input for each step, each derived from the immutable original.
The standard construct for making a series of updated copies of an input based on a sequence of values is reduce, which passes a new version of the accumulator and each element of the list to your function.
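A tiny illustration of that difference, using made-up values: for produces one independent result per element, while reduce threads a single accumulator through every step.

```clojure
;; Each step of `for` starts again from the same empty vector, so the
;; result is a sequence of three separate one-element vectors.
(def with-for
  (for [x [1 2 3]]
    (conj [] x)))
;; => ([1] [2] [3])

;; `reduce` passes the result of each step on to the next step, so the
;; updates accumulate in one value.
(def with-reduce
  (reduce (fn [acc x] (conj acc x))
          []
          [1 2 3]))
;; => [1 2 3]
```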
Finally, you are misunderstanding the role of :keyword syntax. Prefixing an item with a : is not needed in order to construct map keys: just about any Clojure value is a valid key for a map, and keywords are just a convenient idiom.
user=> (def band
{:data
{:members
{1 {:id 1 :name "John"}
2 {:id 2 :name "Paul"}}}})
#'user/band
user=> (def people {1 "John" 2 "Paul" 3 "Ringo" 4 "George"})
#'user/people
user=> (pprint
(reduce (fn [band [id name :as person]]
(if-not (contains? (get-in band [:data :members]) id)
(assoc-in band [:data :members id] {:id id :name name})
band))
band
people))
{:data
{:members
{3 {:id 3, :name "Ringo"},
4 {:id 4, :name "George"},
1 {:name "John", :id 1},
2 {:name "Paul", :id 2}}}}
nil
You may notice the body of the fn passed to reduce is essentially the same as the body of your for comprehension. The difference is that instead of when-not which returns nil on the alternate case, I use if-not, which allows us to propagate the accumulator (here called band, same as the input) regardless of whether any new version of it is made.
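If the keyword keys from the question have to stay as they are, the same reduce still works as long as the numeric :id is derived from the keyword. The following is a sketch under that assumption; add-missing-members, kw-people and kw-band are hypothetical names, and Long/parseLong assumes Clojure on the JVM.

```clojure
;; The original keyword-keyed data from the question, under new names.
(def kw-people {:1 "John" :2 "Paul" :3 "Ringo" :4 "George"})
(def kw-band
  {:data
   {:members
    {:1 {:id 1 :name "John"}
     :2 {:id 2 :name "Paul"}}}})

;; Hypothetical helper: add every person whose key is not yet a member.
;; (name :3) gives "3", which is parsed into the number 3 for :id.
(defn add-missing-members [band people]
  (reduce (fn [acc [k member-name]]
            (if (contains? (get-in acc [:data :members]) k)
              acc
              (assoc-in acc [:data :members k]
                        {:id (Long/parseLong (name k)) :name member-name})))
          band
          people))

(add-missing-members kw-band kw-people)
```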

What is the idiomatic way to assoc several keys/values in a nested map in Clojure?

Imagine you have a map like this:
(def person {
:name {
:first-name "John"
:middle-name "Michael"
:last-name "Smith" }})
What is the idiomatic way to change values associated with both :first-name and :last-name in one expression?
(Clarification: Let's say you want to set :first-name to "Bob" and :last-name to "Doe". Let's also say that this map has some other values in it that we want to preserve, so constructing it from scratch is not an option)
Here are a couple of ways.
user> (update-in person [:name] assoc :first-name "Bob" :last-name "Doe")
{:name {:middle-name "Michael", :last-name "Doe", :first-name "Bob"}}
user> (update-in person [:name] merge {:first-name "Bob" :last-name "Doe"})
{:name {:middle-name "Michael", :last-name "Doe", :first-name "Bob"}}
user> (update-in person [:name] into {:first-name "Bob" :last-name "Doe"})
{:name {:middle-name "Michael", :last-name "Doe", :first-name "Bob"}}
user> (-> person
(assoc-in [:name :first-name] "Bob")
(assoc-in [:name :last-name] "Doe"))
{:name {:middle-name "Michael", :last-name "Doe", :first-name "Bob"}}
Edit
update-in does recursive assocs on your map. In this case it's roughly equivalent to:
user> (assoc person :name
(assoc (:name person)
:first-name "Bob"
:last-name "Doe"))
The repetition of keys becomes more and more tedious as you go deeper into a series of nested maps. update-in's recursion lets you avoid repeating keys (e.g. :name) over and over; intermediary results are stored on the stack between recursive calls. Take a look at the source for update-in to see how it's done.
user> (def foo {:bar {:baz {:quux 123}}})
#'user/foo
user> (assoc foo :bar
(assoc (:bar foo) :baz
(assoc (:baz (:bar foo)) :quux
(inc (:quux (:baz (:bar foo)))))))
{:bar {:baz {:quux 124}}}
user> (update-in foo [:bar :baz :quux] inc)
{:bar {:baz {:quux 124}}}
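To give a feel for the recursion, here is a simplified re-implementation. It is named my-update-in to make clear that it is an illustration and not the actual clojure.core source, which may differ in detail.

```clojure
;; Simplified sketch of the recursive idea behind update-in.
;; [k & ks] splits the path into its first key and the remaining keys.
(defn my-update-in [m [k & ks] f & args]
  (if ks
    ;; More keys to go: recurse into the nested value, then assoc the
    ;; updated nested value back under k.
    (assoc m k (apply my-update-in (get m k) ks f args))
    ;; Last key: apply f to the current value (plus any extra args).
    (assoc m k (apply f (get m k) args))))

(my-update-in {:bar {:baz {:quux 123}}} [:bar :baz :quux] inc)
;; => {:bar {:baz {:quux 124}}}

(my-update-in {:name {:first-name "John"}} [:name] assoc :first-name "Bob")
;; => {:name {:first-name "Bob"}}
```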
assoc is dynamic (as are update-in, assoc-in, and most other Clojure functions that operate on Clojure data structures). If you assoc onto a map, it returns a map. If you assoc onto a vector, it returns a vector. Look at the source for assoc and take a look in RT.java in the Clojure source for details.
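A short illustration of that behaviour with made-up values; the def names are only there for the example.

```clojure
;; assoc on a map takes a key and returns a map.
(def assoc-on-map (assoc {:a 1} :b 2))
;; => {:a 1, :b 2}

;; assoc on a vector takes an index and returns a vector.
(def assoc-on-vector (assoc [10 20 30] 1 :x))
;; => [10 :x 30]

;; assoc-in can therefore mix map keys and vector indices in one path.
(def mixed-path (assoc-in {:people [{:id 1}]} [:people 0 :name] "Susan"))
;; => {:people [{:id 1, :name "Susan"}]}
```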
I'm not sure if my case is quite the same, but I had a list of changes to apply:
(def updates [[[:name :first-name] "Bob"]
[[:name :last-name] "Doe"]])
In that case you can reduce over that list with assoc-in like this:
(reduce #(assoc-in %1 (first %2) (second %2)) person updates)
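The same reduce can also be written with destructuring, which some find easier to read than first/second. apply-updates is a hypothetical name for this sketch; the data is repeated so the example stands on its own.

```clojure
(def person
  {:name {:first-name "John"
          :middle-name "Michael"
          :last-name "Smith"}})

(def updates [[[:name :first-name] "Bob"]
              [[:name :last-name] "Doe"]])

;; Hypothetical helper: each update is a [path value] pair, applied in order.
(defn apply-updates [m updates]
  (reduce (fn [acc [path v]] (assoc-in acc path v))
          m
          updates))

(apply-updates person updates)
;; => {:name {:first-name "Bob", :middle-name "Michael", :last-name "Doe"}}
```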