Merge list of maps by UUID - clojure

I have two lists of maps
(def map1 '({:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c0", :book/name "AAA"}
            {:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c3", :book/name "CCC"}))
and
(def map2 '({:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c3", :book/author "John"}
            {:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c0", :book/author "Alan"}))
and I want to merge these maps by UUID to get the following:
({:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c0", :book/name "AAA", :book/author "Alan"}
{:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c3", :book/name "CCC", :book/author "John"})
How can I do this?

(defn group-by-id [m]
  (->> m
       (map (juxt :book/public-id identity))
       (into {})))
(vals (merge-with merge (group-by-id map1) (group-by-id map2)))

If you have vectors instead of lists, you can use clojure.set/join to join the two collections on their matching values:
user=> (def map1 [{:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c0", :book/name "AAA"}
#_=> {:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c3", :book/name "CCC"}])
#'user/map1
user=> (def map2 [{:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c3", :book/author "John"}
#_=> {:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c0", :book/author "Alan"}])
#'user/map2
user=> (clojure.set/join map1 map2)
#{{:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c3", :book/name "CCC", :book/author "John"} {:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c0", :book/name "AAA", :book/author "Alan"}}
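One thing to keep in mind is that clojure.set/join is a natural (inner) join: it matches on the keys the maps share, returns a set, and drops any map that has no partner. A minimal sketch with made-up :id keys (hypothetical data, not the question's):

```clojure
(require '[clojure.set :as set])

;; Hypothetical data: the second title has no matching author.
(def titles  [{:id 1 :name "AAA"} {:id 2 :name "CCC"}])
(def authors [{:id 1 :author "Alan"}])

;; join matches on the keys both sides have in common (here :id).
;; The result is a set, so the original ordering is not preserved,
;; and the unmatched {:id 2 ...} map is left out.
(def joined (set/join titles authors))
;; => #{{:id 1, :name "AAA", :author "Alan"}}
```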

You can use merge to do that. I renamed the vars to make it a little more clear. Also, I used vectors to represent the collection as that is more idiomatic:
(def titles
[{:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c0"
:book/name "AAA"}
{:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c3"
:book/name "CCC"}])
(def authors
[{:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c3"
:book/author "John"}
{:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c0"
:book/author "Alan"}])
(def prices
[{:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c3"
:book/price 25}])
I created a prices var in order to show that the vectors don't need to be the same size. The first step is to group each book's info into one structure, and then we can use merge to get one map per book. To do that, we can use group-by:
(def book-info-by-uuid
(group-by :book/public-id (concat titles authors prices)))
This gives us a map with UUIDs as keys and, as values, a vector with all the info for each book:
{#uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c0"
[{:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c0"
:book/name "AAA"}
{:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c0"
:book/author "Alan"}]
#uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c3"
[{:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c3"
:book/name "CCC"}
{:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c3"
:book/author "John"}
{:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c3"
:book/price 25}]}
Finally, we use merge to obtain the result:
(map #(apply merge %)
     (vals book-info-by-uuid))
({:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c0"
:book/name "AAA"
:book/author "Alan"}
{:book/public-id #uuid "555b6f35-4e8c-42c5-bb80-b4d9147394c3"
:book/name "CCC"
:book/author "John"
:book/price 25})
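The group-by and merge steps above can be folded into one helper. This is only a sketch: merge-by is a name made up here (not a core function), and the sample data uses a hypothetical :id key.

```clojure
(defn merge-by
  "Hypothetical helper: merges the maps from any number of collections
  that share the same value under key k. Later maps win on key clashes."
  [k & colls]
  (->> (apply concat colls)   ; one flat seq of all maps
       (group-by k)           ; {key-value [maps ...]}
       vals
       (map #(apply merge %))))

(def merged
  (merge-by :id
            [{:id 1 :name "AAA"} {:id 2 :name "CCC"}]
            [{:id 2 :author "John"} {:id 1 :author "Alan"}]
            [{:id 2 :price 25}]))
```

The order of the resulting maps follows the map returned by group-by, so compare results as a set if ordering matters to your tests.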


Spec: partially overriding generators in a map spec

Assuming I have already defined a spec from which I'd like to generate test data:
(s/def :customer/id uuid?)
(s/def :customer/given-name string?)
(s/def :customer/surname string?)
(s/def :customer/age pos?)
(s/def ::customer
(s/keys
:req-un [:customer/id
:customer/given-name
:customer/surname
:customer/age]))
In generating test data, I'd like to override how ids are generated in order to ensure they're from a smaller pool to encourage collisions:
(defn customer-generator
[id-count]
(gen/let [id-pool (gen/not-empty (gen/vector (s/gen :customer/id) id-count))]
(assoc (s/gen ::customer) :id (gen/element id-pool))))
Is there a way I can simplify this by overriding the :customer/id generator in my test code and then just using (s/gen ::customer)? So, something like the following:
(with-generators [:customer/id (gen/not-empty (gen/vector (s/gen :customer/id) id-count))]
(s/gen ::customer))
Officially, you can override generators for specs by passing an overrides map to s/gen (See the docstring for more details):
(s/def :customer/id uuid?)
(s/def :customer/given-name string?)
(s/def :customer/surname string?)
(s/def :customer/age nat-int?)
(s/def ::customer
(s/keys
:req-un [:customer/id
:customer/given-name
:customer/surname
:customer/age]))
(def fixed-customer-id (java.util.UUID/randomUUID))
fixed-customer-id
;=> #uuid "c73ff5ea-8702-4066-a31d-bc4cc7015811"
(gen/generate (s/gen ::customer {:customer/id #(s/gen #{fixed-customer-id})}))
;=> {:id #uuid "c73ff5ea-8702-4066-a31d-bc4cc7015811",
; :given-name "1042IKQhd",
; :surname "Uw0AzJzj",
; :age 104}
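To get the collisions the question asks for, the same overrides map could hand out ids from a small pool instead of a single fixed id. A sketch, assuming test.check is on the classpath; customer-generator and id-pool are illustrative names, and only two of the customer keys are included to keep it short:

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as gen])

(s/def :customer/id uuid?)
(s/def :customer/age nat-int?)
(s/def ::customer (s/keys :req-un [:customer/id :customer/age]))

(defn customer-generator
  "Hypothetical helper: generates customers whose :id is drawn from id-pool."
  [id-pool]
  ;; The overrides map goes from spec name to a no-arg fn returning a generator.
  (s/gen ::customer {:customer/id #(gen/elements id-pool)}))

(def id-pool (vec (repeatedly 3 #(java.util.UUID/randomUUID))))

(def customers (gen/sample (customer-generator id-pool) 20))
```

Because the override is passed to s/gen, the registered :customer/id spec itself is left untouched.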
Alternatively, there is a library for such stuff named genman, which I developed before :)
Using it, you can also write:
(require '[genman.core :as genman :refer [defgenerator]])
(def fixed-customer-id (java.util.UUID/randomUUID))
(genman/with-gen-group :test
(defgenerator :customer/id
(s/gen #{fixed-customer-id})))
(genman/with-gen-group :test
(gen/generate (genman/gen ::customer)))
Clojure spec uses test.check internally to generate sample values. Here is how test.check can be overridden. Whenever trying to write unit tests with a "fake" function, with-redefs is your friend:
(ns tst.demo.core
(:use tupelo.core tupelo.test)
(:require
[clojure.test.check.generators :as gen]
))
(def id-gen gen/uuid)
(dotest
(newline)
(spyx-pretty (take 3 (gen/sample-seq id-gen)))
(newline)
(with-redefs [id-gen (gen/choose 1 5)]
(spyx-pretty (take 33 (gen/sample-seq id-gen))))
(newline)
)
with result:
-----------------------------------
Clojure 1.10.3 Java 15.0.2
-----------------------------------
Testing tst.demo.core
(take 3 (gen/sample-seq id-gen)) =>
[#uuid "cbfea340-1346-429f-ba68-181e657acba5"
#uuid "7c119cf7-0842-4dd0-a23d-f95b6a68f808"
#uuid "ca35cb86-1385-46ad-8fc2-e05cf7a1220a"]
(take 33 (gen/sample-seq id-gen)) =>
[5 4 3 3 2 2 3 1 2 1 4 1 2 2 4 3 5 2 3 5 3 2 3 2 3 5 5 5 5 1 3 2 2]
Example created using my favorite template project.
Update
Unfortunately, the above technique does not work for Clojure Spec, since (s/def ...) uses a global registry of spec definitions and is therefore immune to with-redefs. However, we can overcome this limitation by simply redefining the desired spec in the unit test namespace like:
(ns tst.demo.core
(:use tupelo.core tupelo.test)
(:require
[clojure.spec.alpha :as s]
[clojure.spec.gen.alpha :as gen]
))
(s/def :app/id (s/int-in 9 99))
(s/def :app/name string?)
(s/def :app/cust (s/keys :req-un [:app/id :app/name]))
(dotest
(newline)
(spyx-pretty (gen/sample (s/gen :app/cust)))
(newline)
(s/def :app/id (s/int-in 2 5)) ; overwrite the definition of :app/id for testing
(spyx-pretty (gen/sample (s/gen :app/cust)))
(newline))
with result
-----------------------------------
Clojure 1.10.3 Java 15.0.2
-----------------------------------
Testing tst.demo.core
(gen/sample (s/gen :app/cust)) =>
[{:id 10, :name ""}
{:id 9, :name "n"}
{:id 10, :name "fh"}
{:id 9, :name "aI"}
{:id 11, :name "8v5F"}
{:id 10, :name ""}
{:id 10, :name "7"}
{:id 10, :name "3m6Wi"}
{:id 13, :name "OG2Qzfqe"}
{:id 10, :name ""}]
(gen/sample (s/gen :app/cust)) =>
[{:id 3, :name ""}
{:id 3, :name ""}
{:id 2, :name "5e"}
{:id 3, :name ""}
{:id 2, :name "y01C"}
{:id 3, :name "l2"}
{:id 3, :name "c"}
{:id 3, :name "pF"}
{:id 4, :name "0yrxyJ7l"}
{:id 4, :name "40"}]
So, it's a little ugly, but the redefinition of :app/id does the trick, and it only takes effect during unit test runs, leaving the main application unaffected.
user> (def ^:dynamic *idgen* (s/gen uuid?))
#'user/*idgen*
user> (s/def :customer/id (s/with-gen uuid? (fn [] @#'*idgen*)))
:customer/id
user> (s/def :customer/age pos-int?)
:customer/age
user> (s/def ::customer (s/keys :req-un [:customer/id :customer/age]))
:user/customer
user> (gen/sample (s/gen ::customer))
({:id #uuid "d18896f1-6199-42bf-9be3-3d0652583902", :age 1}
{:id #uuid "b6209798-4ffa-4e20-9a76-b3a799a31ec6", :age 2}
{:id #uuid "6f9c6400-8d79-417c-bc62-6b4557f7d162", :age 1}
{:id #uuid "47b71396-1b5f-4cf4-bd80-edf4792300c8", :age 2}
{:id #uuid "808692b9-0698-4fb8-a0c5-3918e42e8f37", :age 2}
{:id #uuid "ba663f0a-7c99-4967-a2df-3ec6cb04f514", :age 1}
{:id #uuid "8521b611-c38c-4ea9-ae84-35c8a2d2ff2f", :age 4}
{:id #uuid "c559d48d-4c50-438f-846c-780cdcdf39d5", :age 3}
{:id #uuid "03c2c114-03a0-4709-b9dc-6d326a17b69d", :age 40}
{:id #uuid "14715a50-81c5-48e4-bffe-e194631bb64b", :age 4})
user> (binding [*idgen* (let [idpool (gen/sample (s/gen :customer/id) 5)] (gen/elements idpool))] (gen/sample (s/gen ::customer)))
({:id #uuid "3e64131d-e7ad-4450-993d-fa651339df1c", :age 2}
{:id #uuid "575b2bef-956d-4c42-bdfa-982c7756a33c", :age 1}
{:id #uuid "575b2bef-956d-4c42-bdfa-982c7756a33c", :age 1}
{:id #uuid "3e64131d-e7ad-4450-993d-fa651339df1c", :age 1}
{:id #uuid "1a2eafed-8242-4229-b432-99edb361569d", :age 3}
{:id #uuid "1a2eafed-8242-4229-b432-99edb361569d", :age 1}
{:id #uuid "05bd521a-26f9-46e0-8b26-f798e0bf0452", :age 3}
{:id #uuid "575b2bef-956d-4c42-bdfa-982c7756a33c", :age 19}
{:id #uuid "31b80714-7ae0-40a0-b932-f7b5f078f2ad", :age 2}
{:id #uuid "05bd521a-26f9-46e0-8b26-f798e0bf0452", :age 5})
user>
A little clumsier than what you wanted, but maybe this is adequate.
You are probably better off using binding rather than with-redefs since binding modifies thread-local bindings, whereas with-redefs changes the root binding.
Since this is for generating bad test data, I'd consider avoiding the use of dynamic vars and binding altogether and just use a different spec that is only local to the test env.
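A test-only spec along those lines might look like the sketch below. It relies on the fact that a set is a valid spec whose generator picks from its members, and that :req-un only uses the name part of the key, so :test.customer/id still produces an :id entry. The names :test.customer/id, :test/customer and id-pool are made up for this example, and test.check is assumed to be on the classpath.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as gen])

(s/def :customer/age nat-int?)

;; Hypothetical small pool of ids, so generated customers collide often.
(def id-pool (set (repeatedly 3 #(java.util.UUID/randomUUID))))

;; Test-only id spec: valid values are exactly the pool members.
(s/def :test.customer/id id-pool)

;; Test-only customer spec; the production ::customer spec is not touched.
(s/def :test/customer (s/keys :req-un [:test.customer/id :customer/age]))

(def test-customers (gen/sample (s/gen :test/customer) 20))
```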

What is the Clojure way to transform following data?

So I've just played with Clojure today.
Using this data,
(def test-data
[{:id 35462, :status "COMPLETED", :p 2640000, :i 261600}
{:id 35462, :status "CREATED", :p 240000, :i 3200}
{:id 57217, :status "COMPLETED", :p 470001, :i 48043}
{:id 57217, :status "CREATED", :p 1409999, :i 120105}])
Then transform the above data with,
(as-> test-data input
(group-by :id input)
(map (fn [x] {:id (key x)
:p {:a (as-> (filter #(= (:status %) "COMPLETED") (val x)) tmp
(into {} tmp)
(get tmp :p))
:b (as-> (filter #(= (:status %) "CREATED") (val x)) tmp
(into {} tmp)
(get tmp :p))}
:i {:a (as-> (filter #(= (:status %) "COMPLETED") (val x)) tmp
(into {} tmp)
(get tmp :i))
:b (as-> (filter #(= (:status %) "CREATED") (val x)) tmp
(into {} tmp)
(get tmp :i))}})
input)
(into [] input))
To produce,
[{:id 35462, :p {:a 2640000, :b 240000}, :i {:a 261600, :b 3200}}
{:id 57217, :p {:a 470001, :b 1409999}, :i {:a 48043, :b 120105}}]
But I have a feeling that my code is not the "Clojure way". So my question is, what is the "Clojure way" to achieve what I've produced?
The only things that stand out to me are using as-> when ->> would work just as well, and some work being done redundantly, and some destructuring opportunities:
(defn aggregate [[id values]]
(let [completed (->> (filter #(= (:status %) "COMPLETED") values)
(into {}))
created (->> (filter #(= (:status %) "CREATED") values)
(into {}))]
{:id id
:p {:a (:p completed)
:b (:p created)}
:i {:a (:i completed)
:b (:i created)}}))
(->> test-data
(group-by :id)
(map aggregate))
=>
({:id 35462, :p {:a 2640000, :b 240000}, :i {:a 261600, :b 3200}}
{:id 57217, :p {:a 470001, :b 1409999}, :i {:a 48043, :b 120105}})
However, pouring those filtered values (which are maps themselves) into a map seems suspect to me. This is creating a last-one-wins scenario where the order of your test data affects the output. Try this to see how different orders of test-data affect output:
(into {} (filter #(= (:status %) "COMPLETED") (shuffle test-data)))
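A smaller illustration of the same last-one-wins effect, using two hypothetical rows that share a status: pouring maps into a map merges them, so the later :p silently replaces the earlier one.

```clojure
;; Hypothetical rows with the same status but different :p values.
(def row-1 {:status "COMPLETED" :p 1})
(def row-2 {:status "COMPLETED" :p 2})

;; (into {} maps) conj-es each map onto the result, which merges them.
(def forwards  (into {} [row-1 row-2]))
;; => {:status "COMPLETED", :p 2}
(def backwards (into {} [row-2 row-1]))
;; => {:status "COMPLETED", :p 1}
```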
It's a pretty odd transformation, keys seem a little arbitrary and it's hard to generalise from n=2 (or indeed to know whether n ever > 2).
I'd use functional decomposition to factor out some of the commonality and get some traction. First of all let us transform the statuses into our keys...
(def status->ab {"COMPLETED" :a "CREATED" :b})
Then, with that in hand, I'd like an easy way of getting the "meat" out of the substructure. Here, for a given key into the data, I'm providing the content of the enclosing map for that key and a given group result.
(defn subgroup->subresult [k subgroup]
(apply array-map (mapcat #(vector (status->ab (:status %)) (k %)) subgroup)))
With this, the main transformer becomes much more tractable:
(defn group->result [group]
  {:id (key group)
   :p (subgroup->subresult :p (val group))
   :i (subgroup->subresult :i (val group))})
I wouldn't consider generalising across :p and :i for this - if you had more than two keys, then maybe I would generate a map of k -> the subgroup result and do some sort of reducing merge. Anyway, we have an answer:
(map group->result (group-by :id test-data))
;; =>
({:id 35462, :p {:b 240000, :a 2640000}, :i {:b 3200, :a 261600}}
{:id 57217, :p {:b 1409999, :a 470001}, :i {:b 120105, :a 48043}})
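If there were more value keys than :p and :i, the "map of k -> subgroup result" idea mentioned above could be sketched roughly as follows. This is one possible shape, not the answer's own code: it swaps array-map for into {}, and the ks parameter is an assumption introduced here.

```clojure
(def test-data
  [{:id 35462, :status "COMPLETED", :p 2640000, :i 261600}
   {:id 35462, :status "CREATED",   :p 240000,  :i 3200}
   {:id 57217, :status "COMPLETED", :p 470001,  :i 48043}
   {:id 57217, :status "CREATED",   :p 1409999, :i 120105}])

(def status->ab {"COMPLETED" :a "CREATED" :b})

(defn subgroup->subresult [k subgroup]
  ;; Build [:a-or-:b value-of-k] pairs and pour them into a map.
  (into {} (map (juxt (comp status->ab :status) k)) subgroup))

(defn group->result
  "ks is the (assumed) list of value keys to summarise, e.g. [:p :i]."
  [ks [id subgroup]]
  (into {:id id}
        (map (fn [k] [k (subgroup->subresult k subgroup)]))
        ks))

(def results
  (map (partial group->result [:p :i]) (group-by :id test-data)))
```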
There is no single "Clojure way" (I guess you mean functional way), as it depends on how you decompose the problem.
Here is how I would do it:
(->> test-data
(map (juxt :id :status identity))
(map ->nested)
(apply deep-merge)
(map (fn [[id m]]
{:id id
:p (->ab-map m :p)
:i (->ab-map m :i)})))
;; ({:id 35462, :p {:a 2640000, :b 240000}, :i {:a 261600, :b 3200}}
;; {:id 57217, :p {:a 470001, :b 1409999}, :i {:a 48043, :b 120105}})
As you can see, I used a few functions and here is the step-by-step explanation:
Extract index keys (id + status) and the map itself into vector
(map (juxt :id :status identity) test-data)
;; ([35462 "COMPLETED" {:id 35462, :status "COMPLETED", :p 2640000, :i 261600}]
;; [35462 "CREATED" {:id 35462, :status "CREATED", :p 240000, :i 3200}]
;; [57217 "COMPLETED" {:id 57217, :status "COMPLETED", :p 470001, :i 48043}]
;; [57217 "CREATED" {:id 57217, :status "CREATED", :p 1409999, :i 120105}])
Transform into nested map (id, then status)
(map ->nested *1)
;; ({35462 {"COMPLETED" {:id 35462, :status "COMPLETED", :p 2640000, :i 261600}}}
;; {35462 {"CREATED" {:id 35462, :status "CREATED", :p 240000, :i 3200}}}
;; {57217 {"COMPLETED" {:id 57217, :status "COMPLETED", :p 470001, :i 48043}}}
;; {57217 {"CREATED" {:id 57217, :status "CREATED", :p 1409999, :i 120105}}})
Merge nested map by id
(apply deep-merge *1)
;; {35462
;; {"COMPLETED" {:id 35462, :status "COMPLETED", :p 2640000, :i 261600},
;; "CREATED" {:id 35462, :status "CREATED", :p 240000, :i 3200}},
;; 57217
;; {"COMPLETED" {:id 57217, :status "COMPLETED", :p 470001, :i 48043},
;; "CREATED" {:id 57217, :status "CREATED", :p 1409999, :i 120105}}}
For attribute :p and :i, map to :a and :b according to status
(->ab-map {"COMPLETED" {:id 35462, :status "COMPLETED", :p 2640000, :i 261600},
"CREATED" {:id 35462, :status "CREATED", :p 240000, :i 3200}}
:p)
;; => {:a 2640000, :b 240000}
And below are the few helper functions I used:
(defn ->ab-map [m k]
(zipmap [:a :b]
(map #(get-in m [% k]) ["COMPLETED" "CREATED"])))
(defn ->nested [[k & [v & r :as t]]]
{k (if (seq r) (->nested t) v)})
(defn deep-merge [& xs]
(if (every? map? xs)
(apply merge-with deep-merge xs)
(apply merge xs)))
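As an aside, the nested map that ->nested and deep-merge build together (id, then status) could also be produced in one pass with reduce and assoc-in. This is an alternative sketch, not part of the answer above; by-id-and-status is a name chosen here.

```clojure
(def test-data
  [{:id 35462, :status "COMPLETED", :p 2640000, :i 261600}
   {:id 35462, :status "CREATED",   :p 240000,  :i 3200}
   {:id 57217, :status "COMPLETED", :p 470001,  :i 48043}
   {:id 57217, :status "CREATED",   :p 1409999, :i 120105}])

;; Index every row under [id status]; a later row with the same
;; id and status would replace the earlier one.
(def by-id-and-status
  (reduce (fn [acc {:keys [id status] :as row}]
            (assoc-in acc [id status] row))
          {}
          test-data))

(get-in by-id-and-status [57217 "CREATED" :p])
;; => 1409999
```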
I would approach it more like the following, so it can handle any number of entries for each :id value. Of course, many variations are possible.
(ns tst.demo.core
(:use demo.core tupelo.core tupelo.test)
(:require
[tupelo.core :as t] ))
(dotest
(let [test-data [{:id 35462, :status "COMPLETED", :p 2640000, :i 261600}
{:id 35462, :status "CREATED", :p 240000, :i 3200}
{:id 57217, :status "COMPLETED", :p 470001, :i 48043}
{:id 57217, :status "CREATED", :p 1409999, :i 120105}]
d1 (group-by :id test-data)
d2 (t/forv [[id entries] d1]
{:id id
:status-all (mapv :status entries)
:p-all (mapv :p entries)
:i-all (mapv :i entries)})]
(is= d1
{35462
[{:id 35462, :status "COMPLETED", :p 2640000, :i 261600}
{:id 35462, :status "CREATED", :p 240000, :i 3200}],
57217
[{:id 57217, :status "COMPLETED", :p 470001, :i 48043}
{:id 57217, :status "CREATED", :p 1409999, :i 120105}]})
(is= d2 [{:id 35462,
:status-all ["COMPLETED" "CREATED"],
:p-all [2640000 240000],
:i-all [261600 3200]}
{:id 57217,
:status-all ["COMPLETED" "CREATED"],
:p-all [470001 1409999],
:i-all [48043 120105]}])
))
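For readers who don't want the tupelo dependency, the same collect-everything-per-id shape can be sketched with plain clojure.core, using for wrapped in vec where the answer uses t/forv. The var name summary is made up for this example.

```clojure
(def test-data
  [{:id 35462, :status "COMPLETED", :p 2640000, :i 261600}
   {:id 35462, :status "CREATED",   :p 240000,  :i 3200}
   {:id 57217, :status "COMPLETED", :p 470001,  :i 48043}
   {:id 57217, :status "CREATED",   :p 1409999, :i 120105}])

;; One summary map per :id, keeping every entry's values in input order.
(def summary
  (vec (for [[id entries] (group-by :id test-data)]
         {:id         id
          :status-all (mapv :status entries)
          :p-all      (mapv :p entries)
          :i-all      (mapv :i entries)})))
```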

How to successfully install ubergraph

First I have to say, I'm completely new to clojure, so forgive me if I'm missing something obvious.
I recently installed the clojure package on the atom text editor in order to create some graphs and tried to add ubergraph, an extension that makes weighted graphs possible, since these are not supported in the standard clojure package.
I followed the quickstart guide on ubergraph's GitHub page https://github.com/Engelberg/ubergraph and managed to complete the first step (adding ubergraph to the leiningen dependencies). I downloaded the git repository and don't know how to carry on from here. Running the example code
(ns example.core
(:require [ubergraph.core :as uber]))
(def graph1
(uber/graph [:a :b] [:a :c] [:b :d]))
on the repl as described on github ends up with the following error:
CompilerException java.lang.NullPointerException, compiling:(ubergraph/core.clj:11:1)
The line that seems to cause the error in core.clj is:
(import-vars
[...])
I skipped over the vars since I don't think they're causing the problem.
Clojure runs on the correct version (1.9.0) and java 8 is installed. Help is appreciated, thanks in advance.
Based on your comment "Also, do I have to put the lib somewhere specific?", this seems to be caused by a misunderstanding of how to install a library. You shouldn't be manually dealing with stuff like that; leiningen handles library installation for you.
Here's a quick guide that assumes you haven't created a project yet. If you have, skip to step 2.
1. Run lein new app your-project-name-here. This will create a barebones project with a project.clj and basic file structure. If you use an IDE like IntelliJ+Cursive, creating a new project will do this step automatically.
2. Go into your project.clj, and add [ubergraph "0.5.2"] to the :dependencies entry. As a minimal, reduced example, it should look something like:
(defproject example "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.10.0"]
                 [ubergraph "0.5.2"]]
  :main example.core) ; The path to your core
3. Have your core as something like:
(ns example.core
  (:require [ubergraph.core :as uber])
  (:gen-class))
(def graph1
  (uber/graph [:a :b] [:a :c] [:b :d]))
(defn -main
  "I don't do a whole lot ... yet."
  [& args]
  (println "The graph:" graph1))
4. Now run lein run. You should see it download the dependencies, then print something like this mess:
The graph: {:node-map {:a #ubergraph.core.NodeInfo{:out-edges {:b #{#ubergraph.core.UndirectedEdge{:id #uuid "0768ef5b-1507-4bb0-b3da-fc14a84d013d", :src :a, :dest :b, :mirror? false}}, :c #{#ubergraph.core.UndirectedEdge{:id #uuid "acddd770-52cc-4b1f-aec1-762861e70ee2", :src :a, :dest :c, :mirror? false}}}, :in-edges {:b #{#ubergraph.core.UndirectedEdge{:id #uuid "0768ef5b-1507-4bb0-b3da-fc14a84d013d", :src :b, :dest :a, :mirror? true}}, :c #{#ubergraph.core.UndirectedEdge{:id #uuid "acddd770-52cc-4b1f-aec1-762861e70ee2", :src :c, :dest :a, :mirror? true}}}, :out-degree 2, :in-degree 2}, :b #ubergraph.core.NodeInfo{:out-edges {:a #{#ubergraph.core.UndirectedEdge{:id #uuid "0768ef5b-1507-4bb0-b3da-fc14a84d013d", :src :b, :dest :a, :mirror? true}}, :d #{#ubergraph.core.UndirectedEdge{:id #uuid "ef931d4e-8143-4cd1-8a10-c3692c47072f", :src :b, :dest :d, :mirror? false}}}, :in-edges {:a #{#ubergraph.core.UndirectedEdge{:id #uuid "0768ef5b-1507-4bb0-b3da-fc14a84d013d", :src :a, :dest :b, :mirror? false}}, :d #{#ubergraph.core.UndirectedEdge{:id #uuid "ef931d4e-8143-4cd1-8a10-c3692c47072f", :src :d, :dest :b, :mirror? true}}}, :out-degree 2, :in-degree 2}, :c #ubergraph.core.NodeInfo{:out-edges {:a #{#ubergraph.core.UndirectedEdge{:id #uuid "acddd770-52cc-4b1f-aec1-762861e70ee2", :src :c, :dest :a, :mirror? true}}}, :in-edges {:a #{#ubergraph.core.UndirectedEdge{:id #uuid "acddd770-52cc-4b1f-aec1-762861e70ee2", :src :a, :dest :c, :mirror? false}}}, :out-degree 1, :in-degree 1}, :d #ubergraph.core.NodeInfo{:out-edges {:b #{#ubergraph.core.UndirectedEdge{:id #uuid "ef931d4e-8143-4cd1-8a10-c3692c47072f", :src :d, :dest :b, :mirror? true}}}, :in-edges {:b #{#ubergraph.core.UndirectedEdge{:id #uuid "ef931d4e-8143-4cd1-8a10-c3692c47072f", :src :b, :dest :d, :mirror? false}}}, :out-degree 1, :in-degree 1}}, :allow-parallel? false, :undirected? true, :attrs {}, :cached-hash #object[clojure.lang.Atom 0x16da1abc {:status :ready, :val -1}]}
I suspect the NPE was because you had installed ubergraph somehow, but didn't allow it to automatically resolve its dependencies. When it tried to run import-vars, one of the libraries it depends on wasn't found, and it threw a fit.

Clojure parse nested vectors

I am looking to transform a clojure tree structure into a map with its dependencies
For example, an input like:
[{:value "A"}
[{:value "B"}
[{:value "C"} {:value "D"}]
[{:value "E"} [{:value "F"}]]]]
equivalent to:
:A
  :B
    :C
    :D
  :E
    :F
output:
{:A [:B :E] :B [:C :D] :C [] :D [] :E [:F] :F []}
I have taken a look at tree-seq and zippers but can't figure it out!
Here's a way to build up the desired map while using a zipper to traverse the tree. First let's simplify the input tree to match your output format (maps of :value strings → keywords):
(def tree
[{:value "A"}
[{:value "B"} [{:value "C"} {:value "D"}]
{:value "E"} [{:value "F"}]]])
(def simpler-tree
(clojure.walk/postwalk
#(if (map? %) (keyword (:value %)) %)
tree))
;; [:A [:B [:C :D] :E [:F]]]
Then you can traverse the tree with loop/recur and clojure.zip/next, using two loop bindings: the current position in tree, and the map being built.
(require '[clojure.zip :as z])
(loop [loc (z/vector-zip simpler-tree)
deps {}]
(if (z/end? loc)
deps ;; return map when end is reached
(recur
(z/next loc) ;; advance through tree
(if (z/branch? loc)
;; for (non-root) branches, add top-level key with direct descendants
(if-let [parent (some-> (z/prev loc) z/node)]
(assoc deps parent (filterv keyword? (z/children loc)))
deps)
;; otherwise add top-level key with no direct descendants
(assoc deps (z/node loc) [])))))
=> {:A [:B :E], :B [:C :D], :C [], :D [], :E [:F], :F []}
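For comparison, the same map can be built without a zipper by walking the simplified tree recursively. This is an alternative sketch, not the answer's method; it assumes the simplified shape above, where a keyword's children (if any) are the vector that immediately follows it. The function name deps-map is made up here.

```clojure
(def simpler-tree [:A [:B [:C :D] :E [:F]]])

(defn deps-map
  "Hypothetical helper: maps each keyword in tree to a vector of its
  direct children, assuming children are the vector right after it."
  [tree]
  (loop [[x & more] tree
         acc {}]
    (cond
      ;; nothing left at this level
      (nil? x) acc

      ;; x is followed by a vector: those are its children, so record
      ;; them and recurse into that vector for the deeper levels
      (vector? (first more))
      (recur (rest more)
             (merge (assoc acc x (filterv keyword? (first more)))
                    (deps-map (first more))))

      ;; x is a leaf
      :else (recur more (assoc acc x [])))))

(deps-map simpler-tree)
;; => {:A [:B :E], :B [:C :D], :C [], :D [], :E [:F], :F []}
```

The recursion depth equals the tree depth, which is fine for small trees like this one; the zipper version avoids that by iterating.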
This is easy to do using the tupelo.forest library. I reformatted your source data to make it fit into the Hiccup syntax:
(dotest
(let [relationhip-data-hiccup [:A
[:B
[:C]
[:D]]
[:E
[:F]]]
expected-result {:A [:B :E]
:B [:C :D]
:C []
:D []
:E [:F]
:F []} ]
(with-debug-hid
(with-forest (new-forest)
(let [root-hid (tf/add-tree-hiccup relationhip-data-hiccup)
result (apply glue (sorted-map)
(forv [hid (all-hids)]
(let [parent-tag (grab :tag (hid->node hid))
kid-tags (forv [kid-hid (hid->kids hid)]
(let [kid-tag (grab :tag (hid->node kid-hid))]
kid-tag))]
{parent-tag kid-tags})))]
(is= (format-paths (find-paths root-hid [:A]))
[[{:tag :A}
[{:tag :B} [{:tag :C}] [{:tag :D}]]
[{:tag :E} [{:tag :F}]]]])
(is= result expected-result ))))))
API docs are here. The project README (in progress) is here. A video from the 2017 Clojure Conj is here.
You can see the above live code in the project repo.

How best to update this tree?

I've got the following tree:
{:start_date "2014-12-07"
:data {
:people [
{:id 1
:projects [{:id 1} {:id 2}]}
{:id 2
:projects [{:id 1} {:id 3}]}
]
}
}
I want to update the people and projects subtrees by adding a :name key-value pair.
Assuming I have these maps to perform the lookup:
(def people {1 "Susan" 2 "John"})
(def projects {1 "Foo" 2 "Bar" 3 "Qux"})
How could I update the original tree so that I end up with the following?
{:start_date "2014-12-07"
:data {
:people [
{:id 1
:name "Susan"
:projects [{:id 1 :name "Foo"} {:id 2 :name "Bar"}]}
{:id 2
:name "John"
:projects [{:id 1 :name "Foo"} {:id 3 :name "Qux"}]}
]
}
}
I've tried multiple combinations of assoc-in, update-in, get-in and map calls, but haven't been able to figure this out.
I have used letfn to break down the update into easier to understand units.
user> (def tree {:start_date "2014-12-07"
:data {:people [{:id 1
:projects [{:id 1} {:id 2}]}
{:id 2
:projects [{:id 1} {:id 3}]}]}})
#'user/tree
user> (def people {1 "Susan" 2 "John"})
#'user/people
user> (def projects {1 "Foo" 2 "Bar" 3 "Qux"})
#'user/projects
user>
(defn integrate-tree
[tree people projects]
;; letfn is like let, but it creates fn, and allows forward references
(letfn [(update-person [person]
;; -> is the "thread first" macro, the result of each expression
;; becomes the first arg to the next
(-> person
(assoc :name (people (:id person)))
(update-in [:projects] update-projects)))
(update-projects [all-projects]
(mapv
#(assoc % :name (projects (:id %)))
all-projects))]
(update-in tree [:data :people] #(mapv update-person %))))
#'user/integrate-tree
user> (pprint (integrate-tree tree people projects))
{:start_date "2014-12-07",
:data
{:people
[{:projects [{:name "Foo", :id 1} {:name "Bar", :id 2}],
:name "Susan",
:id 1}
{:projects [{:name "Foo", :id 1} {:name "Qux", :id 3}],
:name "John",
:id 2}]}}
nil
Not sure if entirely the best approach:
(defn update-names
[tree people projects]
(reduce
(fn [t [id name]]
(let [person-idx (ffirst (filter #(= (:id (second %)) id)
(map-indexed vector (:people (:data t)))))
temp (assoc-in t [:data :people person-idx :name] name)]
(reduce
(fn [t [id name]]
(let [project-idx (ffirst (filter #(= (:id (second %)) id)
(map-indexed vector (get-in t [:data :people person-idx :projects]))))]
(if project-idx
(assoc-in t [:data :people person-idx :projects project-idx :name] name)
t)))
temp
projects)))
tree
people))
Just call it with your parameters:
(clojure.pprint/pprint (update-names tree people projects))
{:start_date "2014-12-07",
:data
{:people
[{:projects [{:name "Foo", :id 1} {:name "Bar", :id 2}],
:name "Susan",
:id 1}
{:projects [{:name "Foo", :id 1} {:name "Qux", :id 3}],
:name "John",
:id 2}]}}
This uses nested reduces:
Reduce over the people to update the corresponding names.
For each person, reduce over the projects to update the corresponding names.
The noisesmith solution looks better, since it doesn't need to find the person index or project index at each step.
Naturally you tried assoc-in or update-in, but the problem lies in your tree structure: people are stored in a vector, so the key path to update John's name is [:data :people 1 :name], and your assoc-in code would look like:
(assoc-in tree [:data :people 1 :name] "John")
But you need to find John's index in the people vector before you can update it, and the same thing happens with the projects inside.
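The index lookup described here can be sketched with keep-indexed. index-of-id is a hypothetical helper name, and the tree is the one from the question:

```clojure
(def tree {:start_date "2014-12-07"
           :data {:people [{:id 1 :projects [{:id 1} {:id 2}]}
                           {:id 2 :projects [{:id 1} {:id 3}]}]}})

(defn index-of-id
  "Hypothetical helper: index of the first map in coll whose :id equals id,
  or nil if there is none."
  [coll id]
  (first (keep-indexed (fn [i m] (when (= (:id m) id) i)) coll)))

;; Find John's position (id 2) first, then use it in the assoc-in path.
(def with-john
  (let [idx (index-of-id (get-in tree [:data :people]) 2)]
    (assoc-in tree [:data :people idx :name] "John")))
```

This scans the vector once per lookup, which is why mapping over the people with mapv, as in the letfn answer, tends to be simpler when every element needs updating.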