Clojure rename-keys in nested structure

Suppose I have a nested structure, something like this:
{:data1
{:categories [
{:name "abc" :id 234 :desc "whatever"}
{:name "def" :id 456 :desc "nothing"}]
}
:data2 {...}
:data3 {...}
}
And I need to transform the key names in the maps. I can transform the top level keys like this:
(rename-keys mymap {:data1 :d1})
But I'm not sure how to rename keys nested more deeply in the data structure (say I want to rename the :desc field to :description).
I'm pretty sure that zippers are the answer but can't quite figure out how to do it, or if there's a more straightforward way.

Same as Brian Carper's solution, except the walk namespace already has a specific function for this purpose. All keys at all levels are changed, be they nested inside any sort of collection or seq.
(use 'clojure.walk)
(def x
{:data1
{:categories
[{:desc "whatever", :name "abc", :id 234}
{:desc "nothing", :name "def", :id 456}]},
:data2
{:categories
[{:desc "whatever", :name "abc", :id 234}
{:desc "nothing", :name "def", :id 456}]}})
(postwalk-replace {:desc :something} x)
{:data1
{:categories
[{:something "whatever", :name "abc", :id 234}
{:something "nothing", :name "def", :id 456}]},
:data2
{:categories
[{:something "whatever", :name "abc", :id 234}
{:something "nothing", :name "def", :id 456}]}}
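One caveat worth keeping in mind: postwalk-replace substitutes every occurrence of the given keys anywhere in the structure, not only where they appear as map keys. A minimal sketch (the sample map is made up for illustration):

```clojure
(require '[clojure.walk :as walk])

;; :desc appears once as a map key and once as a value
(def sample {:desc "whatever" :kind :desc})

(def replaced (walk/postwalk-replace {:desc :something} sample))
;; both occurrences are replaced, not only the key:
;; {:something "whatever", :kind :something}
```

If :desc can also show up as a value in your data, a rename-keys based approach is probably the safer option.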

postwalk is a pretty heavy sledgehammer in general, although it looks from your original question like you might need it. In many cases, you can perform updates in a nested structure with update-in:
user> (let [m {:foo {:deep {:bar 1 :baz 2}}}]
(update-in m [:foo :deep] clojure.set/rename-keys {:baz :periwinkle}))
{:foo {:deep {:periwinkle 2, :bar 1}}}
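When the maps to rename sit inside a vector at a known path, as in the original question, update-in can be combined with mapv. This is only a sketch against the question's data; the path [:data1 :categories] is taken from there, and the name renamed is arbitrary:

```clojure
(require '[clojure.set :as set])

(def x {:data1 {:categories [{:name "abc" :id 234 :desc "whatever"}
                             {:name "def" :id 456 :desc "nothing"}]}})

;; rename :desc only in the maps under [:data1 :categories]
(def renamed
  (update-in x [:data1 :categories]
             (fn [cats]
               (mapv #(set/rename-keys % {:desc :description}) cats))))
```

Nothing outside that path is touched, which avoids walking the whole structure.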

If you want to rename all :desc keys regardless of at which level of nesting they're located, this might work. If you only want to rename :desc keys at a certain level of nesting, you'll need something slightly more sophisticated.
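This relies on clojure.set/rename-keys returning its first argument untouched when that argument isn't a map, which is an implementation detail rather than documented behavior. In newer Clojure versions the call may throw on non-map values such as strings, so checking map? before renaming is the safer choice.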
user> (require '[clojure [set :as set] [walk :as walk]])
nil
user> (def x {:data1
{:categories
[{:desc "whatever", :name "abc", :id 234}
{:desc "nothing", :name "def", :id 456}]},
:data2
{:categories
[{:desc "whatever", :name "abc", :id 234}
{:desc "nothing", :name "def", :id 456}]}})
#'user/x
user> (walk/postwalk #(set/rename-keys % {:desc :description :id :ID}) x)
{:data1
{:categories
[{:name "abc", :ID 234, :description "whatever"}
{:name "def", :ID 456, :description "nothing"}]},
:data2
{:categories
[{:name "abc", :ID 234, :description "whatever"}
{:name "def", :ID 456, :description "nothing"}]}}
nil
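A variant that does not depend on how rename-keys treats non-map arguments is to guard the call with map?. The helper name rename-keys-deep is hypothetical, not something from clojure.walk or clojure.set:

```clojure
(require '[clojure.set :as set]
         '[clojure.walk :as walk])

(defn rename-keys-deep
  "Hypothetical helper: renames keys per kmap in every map nested in form."
  [form kmap]
  ;; only maps are passed to rename-keys; everything else is returned as-is
  (walk/postwalk #(if (map? %) (set/rename-keys % kmap) %) form))

(def renamed
  (rename-keys-deep
   {:data1 {:categories [{:desc "whatever" :name "abc" :id 234}]}}
   {:desc :description :id :ID}))
```

Strings, numbers, vectors and map entries pass through the walk unchanged, so only real maps get their keys renamed.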

Related

Spec: partially overriding generators in a map spec

Assuming I have already defined a spec from which I'd like to generate test data:
(s/def :customer/id uuid?)
(s/def :customer/given-name string?)
(s/def :customer/surname string?)
(s/def :customer/age pos?)
(s/def ::customer
(s/keys
:req-un [:customer/id
:customer/given-name
:customer/surname
:customer/age]))
In generating test data, I'd like to override how ids are generated in order to ensure they're from a smaller pool to encourage collisions:
(defn customer-generator
[id-count]
(gen/let [id-pool (gen/not-empty (gen/vector (s/gen :customer/id) id-count))
          customer (s/gen ::customer)
          id (gen/elements id-pool)]
  (assoc customer :id id)))
Is there a way I can simplify this by overriding the :customer/id generator in my test code and then just using (s/gen ::customer)? So, something like the following:
(with-generators [:customer/id (gen/not-empty (gen/vector (s/gen :customer/id) id-count))]
(s/gen ::customer))
Officially, you can override generators for specs by passing an overrides map to s/gen (See the docstring for more details):
(s/def :customer/id uuid?)
(s/def :customer/given-name string?)
(s/def :customer/surname string?)
(s/def :customer/age nat-int?)
(s/def ::customer
(s/keys
:req-un [:customer/id
:customer/given-name
:customer/surname
:customer/age]))
(def fixed-customer-id (java.util.UUID/randomUUID))
fixed-customer-id
;=> #uuid "c73ff5ea-8702-4066-a31d-bc4cc7015811"
(gen/generate (s/gen ::customer {:customer/id #(s/gen #{fixed-customer-id})}))
;=> {:id #uuid "c73ff5ea-8702-4066-a31d-bc4cc7015811",
; :given-name "1042IKQhd",
; :surname "Uw0AzJzj",
; :age 104}
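The same overrides map can be used to draw ids from a small pool, which is closer to what the question asks for. This is a sketch, assuming test.check is on the classpath; the pool size of 3 and the names id-pool and customers are arbitrary:

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as gen])

(s/def :customer/id uuid?)
(s/def :customer/age nat-int?)
(s/def ::customer (s/keys :req-un [:customer/id :customer/age]))

;; a small, fixed pool of ids (the size 3 is an arbitrary choice)
(def id-pool (gen/sample (s/gen :customer/id) 3))

;; every generated customer draws its :id from the pool
(def customers
  (gen/sample (s/gen ::customer {:customer/id #(gen/elements id-pool)}) 20))
```

With 20 customers and only 3 possible ids, collisions are practically guaranteed, and the registered specs stay untouched.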
Alternatively, there is a library for such stuff named genman, which I developed before :)
Using it, you can also write as:
(require '[genman.core :as genman :refer [defgenerator]])
(def fixed-customer-id (java.util.UUID/randomUUID))
(genman/with-gen-group :test
(defgenerator :customer/id
(s/gen #{fixed-customer-id})))
(genman/with-gen-group :test
(gen/generate (genman/gen ::customer)))
Clojure spec uses test.check internally to generate sample values. Here is how test.check can be overridden. Whenever trying to write unit tests with a "fake" function, with-redefs is your friend:
(ns tst.demo.core
(:use tupelo.core tupelo.test)
(:require
[clojure.test.check.generators :as gen]
))
(def id-gen gen/uuid)
(dotest
(newline)
(spyx-pretty (take 3 (gen/sample-seq id-gen)))
(newline)
(with-redefs [id-gen (gen/choose 1 5)]
(spyx-pretty (take 33 (gen/sample-seq id-gen))))
(newline)
)
with result:
-----------------------------------
Clojure 1.10.3 Java 15.0.2
-----------------------------------
Testing tst.demo.core
(take 3 (gen/sample-seq id-gen)) =>
[#uuid "cbfea340-1346-429f-ba68-181e657acba5"
#uuid "7c119cf7-0842-4dd0-a23d-f95b6a68f808"
#uuid "ca35cb86-1385-46ad-8fc2-e05cf7a1220a"]
(take 33 (gen/sample-seq id-gen)) =>
[5 4 3 3 2 2 3 1 2 1 4 1 2 2 4 3 5 2 3 5 3 2 3 2 3 5 5 5 5 1 3 2 2]
Example created using my favorite template project.
Update
Unfortunately, the above technique does not work for Clojure Spec, since (s/def ...) uses a global registry of spec definitions and is therefore immune to with-redefs. However, we can overcome this limitation by simply redefining the desired spec in the unit test namespace like:
(ns tst.demo.core
(:use tupelo.core tupelo.test)
(:require
[clojure.spec.alpha :as s]
[clojure.spec.gen.alpha :as gen]
))
(s/def :app/id (s/int-in 9 99))
(s/def :app/name string?)
(s/def :app/cust (s/keys :req-un [:app/id :app/name]))
(dotest
(newline)
(spyx-pretty (gen/sample (s/gen :app/cust)))
(newline)
(s/def :app/id (s/int-in 2 5)) ; overwrite the definition of :app/id for testing
(spyx-pretty (gen/sample (s/gen :app/cust)))
(newline))
with result
-----------------------------------
Clojure 1.10.3 Java 15.0.2
-----------------------------------
Testing tst.demo.core
(gen/sample (s/gen :app/cust)) =>
[{:id 10, :name ""}
{:id 9, :name "n"}
{:id 10, :name "fh"}
{:id 9, :name "aI"}
{:id 11, :name "8v5F"}
{:id 10, :name ""}
{:id 10, :name "7"}
{:id 10, :name "3m6Wi"}
{:id 13, :name "OG2Qzfqe"}
{:id 10, :name ""}]
(gen/sample (s/gen :app/cust)) =>
[{:id 3, :name ""}
{:id 3, :name ""}
{:id 2, :name "5e"}
{:id 3, :name ""}
{:id 2, :name "y01C"}
{:id 3, :name "l2"}
{:id 3, :name "c"}
{:id 3, :name "pF"}
{:id 4, :name "0yrxyJ7l"}
{:id 4, :name "40"}]
So, it's a little ugly, but the redefinition of :app/id does the trick, and it only takes effect during unit test runs, leaving the main application unaffected.
user> (def ^:dynamic *idgen* (s/gen uuid?))
#'user/*idgen*
user> (s/def :customer/id (s/with-gen uuid? (fn [] @#'*idgen*)))
:customer/id
user> (s/def :customer/age pos-int?)
:customer/age
user> (s/def ::customer (s/keys :req-un [:customer/id :customer/age]))
:user/customer
user> (gen/sample (s/gen ::customer))
({:id #uuid "d18896f1-6199-42bf-9be3-3d0652583902", :age 1}
{:id #uuid "b6209798-4ffa-4e20-9a76-b3a799a31ec6", :age 2}
{:id #uuid "6f9c6400-8d79-417c-bc62-6b4557f7d162", :age 1}
{:id #uuid "47b71396-1b5f-4cf4-bd80-edf4792300c8", :age 2}
{:id #uuid "808692b9-0698-4fb8-a0c5-3918e42e8f37", :age 2}
{:id #uuid "ba663f0a-7c99-4967-a2df-3ec6cb04f514", :age 1}
{:id #uuid "8521b611-c38c-4ea9-ae84-35c8a2d2ff2f", :age 4}
{:id #uuid "c559d48d-4c50-438f-846c-780cdcdf39d5", :age 3}
{:id #uuid "03c2c114-03a0-4709-b9dc-6d326a17b69d", :age 40}
{:id #uuid "14715a50-81c5-48e4-bffe-e194631bb64b", :age 4})
user> (binding [*idgen* (let [idpool (gen/sample (s/gen :customer/id) 5)] (gen/elements idpool))] (gen/sample (s/gen ::customer)))
({:id #uuid "3e64131d-e7ad-4450-993d-fa651339df1c", :age 2}
{:id #uuid "575b2bef-956d-4c42-bdfa-982c7756a33c", :age 1}
{:id #uuid "575b2bef-956d-4c42-bdfa-982c7756a33c", :age 1}
{:id #uuid "3e64131d-e7ad-4450-993d-fa651339df1c", :age 1}
{:id #uuid "1a2eafed-8242-4229-b432-99edb361569d", :age 3}
{:id #uuid "1a2eafed-8242-4229-b432-99edb361569d", :age 1}
{:id #uuid "05bd521a-26f9-46e0-8b26-f798e0bf0452", :age 3}
{:id #uuid "575b2bef-956d-4c42-bdfa-982c7756a33c", :age 19}
{:id #uuid "31b80714-7ae0-40a0-b932-f7b5f078f2ad", :age 2}
{:id #uuid "05bd521a-26f9-46e0-8b26-f798e0bf0452", :age 5})
user>
A little clumsier than what you wanted, but maybe this is adequate.
You are probably better off using binding rather than with-redefs since binding modifies thread-local bindings, whereas with-redefs changes the root binding.
Since this is for generating bad test data, I'd consider avoiding the use of dynamic vars and binding altogether and just use a different spec that is only local to the test env.
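A test-local spec along those lines might look like the sketch below. It assumes test.check is on the classpath; the spec names :test.customer/id and :test/customer, and the pool size of 3, are made up for illustration:

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as gen])

(s/def :customer/age pos-int?)

;; test-only id spec: still validates as a uuid, but generates from a small pool
(def id-pool (vec (repeatedly 3 #(java.util.UUID/randomUUID))))
(s/def :test.customer/id (s/with-gen uuid? #(gen/elements id-pool)))

;; :req-un uses the unqualified key, so generated maps still have :id
(s/def :test/customer (s/keys :req-un [:test.customer/id :customer/age]))

(def customers (gen/sample (s/gen :test/customer) 20))
```

The production :customer/id spec is never redefined, so nothing leaks out of the test namespace.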

Finding a regular-expression-like sequence of values in a Clojure vector

I am using the libpostal library to find a full address (street, city, state, and postal code) within a news article. libpostal when given input text:
There was an accident at 5 Main Street Boulder, CO 10566 -- which is at the corner of Wilson.
returns a vector:
[{:label "house", :value "there was an accident at 5"}
{:label "road", :value "main street"}
{:label "city", :value "boulder"}
{:label "state", :value "co"}
{:label "postcode", :value "10566"}
{:label "road", :value "which is at the corner of wilson."}]
I am wondering if there is a clever way in Clojure to extract a sequence where the :label values occur in a sequence:
[road unit? level? po_box? city state postcode? country?]
where ? represents an optional value in the match.
You could do this with clojure.spec. First define some specs that match your maps' :label values:
(defn has-label? [m label] (= label (:label m)))
(s/def ::city #(has-label? % "city"))
(s/def ::postcode #(has-label? % "postcode"))
(s/def ::state #(has-label? % "state"))
(s/def ::house #(has-label? % "house"))
(s/def ::road #(has-label? % "road"))
Then define a regex spec e.g. s/cat + s/?:
(s/def ::valid-seq
(s/cat :road ::road
:city (s/? ::city) ;; ? = zero or once
:state ::state
:zip (s/? ::postcode)))
Now you can conform or valid?-ate your sequences:
(s/conform ::valid-seq [{:label "road" :value "Damen"}
{:label "city" :value "Chicago"}
{:label "state" :value "IL"}])
=>
{:road {:label "road", :value "Damen"},
:city {:label "city", :value "Chicago"},
:state {:label "state", :value "IL"}}
;; this is also valid, missing an optional value in the middle
(s/conform ::valid-seq [{:label "road" :value "Damen"}
{:label "state" :value "IL"}
{:label "postcode" :value "60622"}])
=>
{:road {:label "road", :value "Damen"},
:state {:label "state", :value "IL"},
:zip {:label "postcode", :value "60622"}}
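The spec above matches a whole sequence, whereas the question asks for an address embedded in a longer vector. One possible way to handle that, offered only as a sketch, is to wrap the address regex with (s/* any?) on both sides. The spec name ::embedded-address and the keys :before and :after are assumptions for this example:

```clojure
(require '[clojure.spec.alpha :as s])

(defn has-label? [m label] (= label (:label m)))
(s/def ::road #(has-label? % "road"))
(s/def ::city #(has-label? % "city"))
(s/def ::state #(has-label? % "state"))
(s/def ::postcode #(has-label? % "postcode"))

(s/def ::valid-seq
  (s/cat :road ::road
         :city (s/? ::city)
         :state ::state
         :zip (s/? ::postcode)))

;; any? on both sides lets the address match anywhere in the vector
(s/def ::embedded-address
  (s/cat :before (s/* any?)
         :address ::valid-seq
         :after (s/* any?)))

(def parsed
  (s/conform ::embedded-address
             [{:label "house" :value "there was an accident at 5"}
              {:label "road" :value "main street"}
              {:label "city" :value "boulder"}
              {:label "state" :value "co"}
              {:label "postcode" :value "10566"}
              {:label "road" :value "which is at the corner of wilson."}]))

;; the matched address parts are under the :address key
(def address (:address parsed))
```

If no address is present, s/conform returns :clojure.spec.alpha/invalid, so the result should be checked before use.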

Clojure: how to use compare with set/union [duplicate]

For the sake of example, let's assume I have two sets:
(def set-a #{{:id 1 :name "ABC" :zip 78759} {:id 2 :name "DEF" :zip 78759}})
(def set-b #{{:id 1 :name "ABC" :zip 78753} {:id 3 :name "XYZ" :zip 78704}})
I would like to find a union between the sets, using only the :id and :name fields. However, without using a custom comparator I get four elements in the set, because the :zip field is different.
(clojure.set/union set-a set-b)
#{{:id 3, :name "XYZ", :zip 78704} {:id 1, :name "ABC", :zip 78753}
{:id 1, :name "ABC", :zip 78759} {:id 2, :name "DEF", :zip 78759}}
What is the idiomatic way of finding a union between two sets using a custom comparator or compare?
You could use group-by to do this:
(map first (vals (group-by (juxt :id :name) (concat set-a set-b))))
Or threaded:
(->> (concat set-a set-b)
(group-by (juxt :id :name))
(vals)
(map first))
This is grouping your elements by a combination of their key/values i.e. (juxt :id :name). Then it grabs the values of the produced map, then maps first over that to get the first item in each grouping.
Or use some code specifically built for this like distinct-by.
Note these approaches apply to any collection, not just sets.
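distinct-by is not part of clojure.core, so here is a minimal sketch of what such a function could look like. It is a hypothetical helper written for this example, not the implementation from any particular library:

```clojure
(def set-a #{{:id 1 :name "ABC" :zip 78759} {:id 2 :name "DEF" :zip 78759}})
(def set-b #{{:id 1 :name "ABC" :zip 78753} {:id 3 :name "XYZ" :zip 78704}})

(defn distinct-by
  "Hypothetical helper: keeps the first element seen for each value of (f x)."
  [f coll]
  (first
   (reduce (fn [[out seen] x]
             (let [k (f x)]
               (if (contains? seen k)
                 [out seen]                       ; key already seen, skip x
                 [(conj out x) (conj seen k)])))  ; first time, keep x
           [[] #{}]
           coll)))

(def merged (distinct-by (juxt :id :name) (concat set-a set-b)))
```

Because sets have no defined iteration order, which of the two :id 1 entries survives depends on the order in which the elements happen to be visited.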
If you don't mind throwing :zip away entirely, consider using clojure.set/project.
(clojure.set/union
(clojure.set/project set-a [:id :name])
(clojure.set/project set-b [:id :name]))
#{{:id 3, :name "XYZ"} {:id 2, :name "DEF"} {:id 1, :name "ABC"}}

How best to update this tree?

I've got the following tree:
{:start_date "2014-12-07"
:data {
:people [
{:id 1
:projects [{:id 1} {:id 2}]}
{:id 2
:projects [{:id 1} {:id 3}]}
]
}
}
I want to update the people and projects subtrees by adding a :name key-value pair.
Assuming I have these maps to perform the lookup:
(def people {1 "Susan" 2 "John"})
(def projects {1 "Foo" 2 "Bar" 3 "Qux"})
How could I update the original tree so that I end up with the following?
{:start_date "2014-12-07"
:data {
:people [
{:id 1
:name "Susan"
:projects [{:id 1 :name "Foo"} {:id 2 :name "Bar"}]}
{:id 2
:name "John"
:projects [{:id 1 :name "Foo"} {:id 3 :name "Qux"}]}
]
}
}
I've tried multiple combinations of assoc-in, update-in, get-in and map calls, but haven't been able to figure this out.
I have used letfn to break down the update into easier to understand units.
user> (def tree {:start_date "2014-12-07"
:data {:people [{:id 1
:projects [{:id 1} {:id 2}]}
{:id 2
:projects [{:id 1} {:id 3}]}]}})
#'user/tree
user> (def people {1 "Susan" 2 "John"})
#'user/people
user> (def projects {1 "Foo" 2 "Bar" 3 "Qux"})
#'user/projects
user>
(defn integrate-tree
[tree people projects]
;; letfn is like let, but it creates fn, and allows forward references
(letfn [(update-person [person]
;; -> is the "thread first" macro, the result of each expression
;; becomes the first arg to the next
(-> person
(assoc :name (people (:id person)))
(update-in [:projects] update-projects)))
(update-projects [all-projects]
(mapv
#(assoc % :name (projects (:id %)))
all-projects))]
(update-in tree [:data :people] #(mapv update-person %))))
#'user/integrate-tree
user> (pprint (integrate-tree tree people projects))
{:start_date "2014-12-07",
:data
{:people
[{:projects [{:name "Foo", :id 1} {:name "Bar", :id 2}],
:name "Susan",
:id 1}
{:projects [{:name "Foo", :id 1} {:name "Qux", :id 3}],
:name "John",
:id 2}]}}
nil
Not sure if entirely the best approach:
(defn update-names
[tree people projects]
(reduce
(fn [t [id name]]
(let [person-idx (ffirst (filter #(= (:id (second %)) id)
(map-indexed vector (:people (:data t)))))
temp (assoc-in t [:data :people person-idx :name] name)]
(reduce
(fn [t [id name]]
(let [project-idx (ffirst (filter #(= (:id (second %)) id)
(map-indexed vector (get-in t [:data :people person-idx :projects]))))]
(if project-idx
(assoc-in t [:data :people person-idx :projects project-idx :name] name)
t)))
temp
projects)))
tree
people))
Just call it with your parameters:
(clojure.pprint/pprint (update-names tree people projects))
{:start_date "2014-12-07",
:data
{:people
[{:projects [{:name "Foo", :id 1} {:name "Bar", :id 2}],
:name "Susan",
:id 1}
{:projects [{:name "Foo", :id 1} {:name "Qux", :id 3}],
:name "John",
:id 2}]}}
This uses nested reduces: the outer reduce goes over people to update the corresponding names, and for each person an inner reduce goes over projects to update the corresponding project names.
The noisesmith solution looks better, since it doesn't need to find the person index or project index at each step.
Naturally you tried assoc-in or update-in, but the problem lies in your tree structure: the key path to update John's name is [:data :people 1 :name], so your assoc-in code would look like:
(assoc-in tree [:data :people 1 :name] "John")
But you need to find John's index in the people vector before you can update it, and the same thing happens with the projects inside.
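Looking up the index first and then using it in the path can be sketched like this. The helper index-of-id is hypothetical and introduced only for this example:

```clojure
(def tree {:start_date "2014-12-07"
           :data {:people [{:id 1 :projects [{:id 1} {:id 2}]}
                           {:id 2 :projects [{:id 1} {:id 3}]}]}})

(defn index-of-id
  "Hypothetical helper: index of the first map in coll whose :id equals id, or nil."
  [coll id]
  (first (keep-indexed (fn [i m] (when (= id (:id m)) i)) coll)))

;; find John's position (id 2) in the vector, then use it in the key path
(def named
  (let [idx (index-of-id (get-in tree [:data :people]) 2)]
    (assoc-in tree [:data :people idx :name] "John")))
```

The sketch does not handle a missing id: index-of-id returns nil in that case, and a nil index is not a valid position in a vector, so real code would need to check for it.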

Add items from collection 1 to collection 2, if collection 2 doesn't contain item from collection 1

I've got two maps:
(def people {:1 "John" :2 "Paul" :3 "Ringo" :4 "George"})
(def band
{:data
{:members
{:1 {:id 1 :name "John"}
:2 {:id 2 :name "Paul"}}}})
I want to loop over people and add any members that don't exist in [:data :members] to band, resulting in:
(def band
{:data
{:members
{:1 {:id 1 :name "John"}
:2 {:id 2 :name "Paul"}
:3 {:id 3 :name "Ringo"}
:4 {:id 4 :name "George"}}}})
Here's what I've tried:
(for [[id name] people]
(when-not
(contains? (get-in band [:data :members]) id)
(assoc-in band [:data :members id] {:id id :name name})))
Which yields:
({:data
{:members
{:4 {:id :4, :name "George"},
:1 {:name "John", :id 1},
:2 {:name "Paul", :id 2}}}}
nil
nil
{:data
{:members
{:1 {:name "John", :id 1},
:2 {:name "Paul", :id 2},
:3 {:id :3, :name "Ringo"}}}})
I'm not sure why I'm getting back what looks to be a list of each mutation of band. What am I doing wrong here? How can I add the missing members of people to band [:data :members]?
To be pedantic, you aren't getting back any mutation of band. In fact, one of the most important features of Clojure is that the standard types are immutable, and the primary collection operations return a modified copy without changing the original.
Also, for in Clojure is not a loop, it is a list comprehension. This is why it always returns a sequence of each step. So instead of altering an input one step at a time, you made a new variation on the input for each step, each derived from the immutable original.
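A tiny illustration of that behavior, using a made-up map m rather than the band data:

```clojure
(def m {:a 1})

;; one new map per element of the input seq, each built from the original m
(def results (for [k [:b :c]] (assoc m k 0)))
;; results => ({:a 1, :b 0} {:a 1, :c 0})
;; m       => {:a 1}
```

Each step starts again from the unchanged m, which is why no single result contains both :b and :c.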
The standard construct for making a series of updated copies of an input based on a sequence of values is reduce, which passes a new version of the accumulator and each element of the list to your function.
Finally, you are misunderstanding the role of :keyword syntax: prefixing an item with a : is not needed in order to construct map keys. Just about any Clojure value is a valid key for a map, and keywords are just a convenient idiom.
user=> (def band
{:data
{:members
{1 {:id 1 :name "John"}
2 {:id 2 :name "Paul"}}}})
#'user/band
user=> (def people {1 "John" 2 "Paul" 3 "Ringo" 4 "George"})
#'user/people
user=> (pprint
(reduce (fn [band [id name :as person]]
(if-not (contains? (get-in band [:data :members]) id)
(assoc-in band [:data :members id] {:id id :name name})
band))
band
people))
{:data
{:members
{3 {:id 3, :name "Ringo"},
4 {:id 4, :name "George"},
1 {:name "John", :id 1},
2 {:name "Paul", :id 2}}}}
nil
You may notice the body of the fn passed to reduce is essentially the same as the body of your for comprehension. The difference is that instead of when-not which returns nil on the alternate case, I use if-not, which allows us to propagate the accumulator (here called band, same as the input) regardless of whether any new version of it is made.
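For this particular case, where existing members should simply be left alone, merge offers a more compact alternative. This is a sketch of a different technique than the reduce above, and the names candidates and updated are arbitrary:

```clojure
(def people {1 "John" 2 "Paul" 3 "Ringo" 4 "George"})
(def band {:data {:members {1 {:id 1 :name "John"}
                            2 {:id 2 :name "Paul"}}}})

;; build a candidate member map for every person
(def candidates
  (into {} (map (fn [[id name]] [id {:id id :name name}])) people))

;; existing members come last in merge, so they win over candidates
(def updated
  (update-in band [:data :members] #(merge candidates %)))
```

The reduce version remains the more general tool when each step needs logic beyond "keep what is already there".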