Using clojure.spec to decompose a map

I recognize that clojure.spec isn't intended for arbitrary data transformation, and as I understand it, it is intended for flexibly encoding domain knowledge via arbitrary predicates. It's an insanely powerful tool, and I love using it.
So much, perhaps, that I've run into a scenario where I am merging two maps, component-a and component-b, each of which can take one of many forms, into a composite, and then later want to "unmix" the composite into its component parts.
This is modeled as two multi-specs for the components and an s/merge of those components for the composite:
;; component-a
(defmulti component-a :protocol)
(defmethod component-a :p1 [_]
(s/keys :req-un [::x ::y ::z]))
(defmethod component-a :p2 [_]
(s/keys :req-un [::p ::q ::r]))
(s/def ::component-a
(s/multi-spec component-a :protocol))
;; component-b
(defmulti component-b :protocol)
(defmethod component-b :p1 [_]
(s/keys :req-un [::i ::j ::k]))
(defmethod component-b :p2 [_]
(s/keys :req-un [::s ::t]))
(s/def ::component-b
(s/multi-spec component-b :protocol))
;; composite
(s/def ::composite
(s/merge ::component-a ::component-b))
What I'd like to be able to do is the following:
(def p1a {:protocol :p1 :x ... :y ... :z ...})
(def p1b (make-b p1a)) ; => {:protocol :p1 :i ... :j ... :k ...}
(def a (s/conform ::component-a p1a))
(def b (s/conform ::component-b p1b))
(def ab1 (s/conform ::composite (merge a b)))
(?Fn ::component-a ab1) ; => {:protocol :p1 :x ... :y ... :z ...}
(?Fn ::component-b ab1) ; => {:protocol :p1 :i ... :j ... :k ...}
(def ab2 {:protocol :p2 :p ... :q ... :r ... :s ... :t ...})
(?Fn ::component-a ab2) ; => {:protocol :p2 :p ... :q ... :r ...}
(?Fn ::component-b ab2) ; => {:protocol :p2 :s ... :t ...}
In other words, I'd like to reuse the domain knowledge encoded for component-a and component-b, to decompose a composite.
My first thought was to isolate the keys themselves from the call to s/keys:
(defmulti component-a :protocol)
(defmethod component-a :p1 [_]
(s/keys :req-un <form>)) ; <form> must look like [::x ::y ::z]
However, approaches where the keys of s/keys are computed from "something else" fail because <form> must be an ISeq. That is, <form> can neither be a fn that computes an ISeq, nor a symbol that represents an ISeq.
I also experimented with using s/describe to read the keys dynamically at run-time, but this doesn't work generally with multi-specs as it would with a simple s/def. I won't say I exhausted this approach, but it seemed like a rabbit hole of recursive s/describes and accessing multifns underlying multi-specs directly, which felt dirty.
I also thought about adding a separate multifn based on :protocol:
(defmulti decompose-composite :protocol)
(defmethod decompose-composite :p1
[composite]
{:component-a (select-keys composite [:x :y :z])
:component-b (select-keys composite [:i :j :k])})
But this obviously doesn't reuse domain knowledge, it just duplicates it and exposes another avenue of applying it. It's also specific to the one composite; we'd need a decompose-other-composite for a different composite.
So at this point this is just a fun puzzle. We could always nest the components in the composite, making them trivial to isolate again:
(s/def ::composite
(s/keys :req-un [::component-a ::component-b]))
(def ab {:component-a a :component-b b})
(do-composite-stuff (apply merge (vals ab)))
But is there a better way to achieve ?Fn? Could a custom s/conformer do something like this? Or are merged maps more like physical mixtures, i.e. disproportionately harder to separate?

I also experimented with using s/describe to read the keys dynamically at run-time, but this doesn't work generally with multi-specs as it would with a simple s/def
A workaround that comes to mind is defining the s/keys specs separate from/outside of the defmethods, then getting the s/keys form back and pulling the keywords out.
;; component-a
(s/def ::component-a-p1-map
(s/keys :req-un [::protocol ::x ::y ::z])) ;; NOTE explicit ::protocol key added
(defmulti component-a :protocol)
(defmethod component-a :p1 [_] ::component-a-p1-map)
(s/def ::component-a
(s/multi-spec component-a :protocol))
;; component-b
(defmulti component-b :protocol)
(s/def ::component-b-p1-map
(s/keys :req-un [::protocol ::i ::j ::k]))
(defmethod component-b :p1 [_] ::component-b-p1-map)
(s/def ::component-b
(s/multi-spec component-b :protocol))
;; composite
(s/def ::composite (s/merge ::component-a ::component-b))
(def p1a {:protocol :p1 :x 1 :y 2 :z 3})
(def p1b {:protocol :p1 :i 4 :j 5 :k 6})
(def a (s/conform ::component-a p1a))
(def b (s/conform ::component-b p1b))
(def ab1 (s/conform ::composite (merge a b)))
With standalone specs for the s/keys specs, you can get the individual keys back using s/form:
(defn get-spec-keys [keys-spec]
(let [unqualify (comp keyword name)
{:keys [req req-un opt opt-un]}
(->> (s/form keys-spec)
(rest)
(apply hash-map))]
(concat req (map unqualify req-un) opt (map unqualify opt-un))))
(get-spec-keys ::component-a-p1-map)
=> (:protocol :x :y :z)
And with that you can use select-keys on the composite map:
(defn ?Fn [spec m]
(select-keys m (get-spec-keys spec)))
(?Fn ::component-a-p1-map ab1)
=> {:protocol :p1, :x 1, :y 2, :z 3}
(?Fn ::component-b-p1-map ab1)
=> {:protocol :p1, :i 4, :j 5, :k 6}
And using your decompose-composite idea:
(defmulti decompose-composite :protocol)
(defmethod decompose-composite :p1
[composite]
{:component-a (?Fn ::component-a-p1-map composite)
:component-b (?Fn ::component-b-p1-map composite)})
(decompose-composite ab1)
=> {:component-a {:protocol :p1, :x 1, :y 2, :z 3},
:component-b {:protocol :p1, :i 4, :j 5, :k 6}}
However, approaches where the keys of s/keys are computed from "something else" fail because <form> must be an ISeq. That is, <form> can neither be a fn that computes an ISeq, nor a symbol that represents an ISeq.
Alternatively, you could eval a programmatically constructed s/keys form:
(def some-keys [::protocol ::x ::y ::z])
(s/form (eval `(s/keys :req-un ~some-keys)))
=> (clojure.spec.alpha/keys :req-un [:sandbox.core/protocol
:sandbox.core/x
:sandbox.core/y
:sandbox.core/z])
And then use some-keys directly later.
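A possible generalization, sketched under assumptions rather than taken from the answer above: the multimethod behind a multi-spec can be called directly with the composite map, and it returns the s/keys spec selected by :protocol, whose keys can then be read back with s/form. This removes the need for one decompose method per protocol. The names keys-of and extract are hypothetical, and the sketch only handles flat key vectors (no or/and groups inside :req or :req-un).

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::protocol keyword?)

;; Each method returns an s/keys spec, as in the question,
;; with ::protocol listed explicitly so it survives extraction.
(defmulti component-a :protocol)
(defmethod component-a :p1 [_] (s/keys :req-un [::protocol ::x ::y ::z]))
(defmethod component-a :p2 [_] (s/keys :req-un [::protocol ::p ::q ::r]))

(defmulti component-b :protocol)
(defmethod component-b :p1 [_] (s/keys :req-un [::protocol ::i ::j ::k]))
(defmethod component-b :p2 [_] (s/keys :req-un [::protocol ::s ::t]))

(defn keys-of
  "Hypothetical helper: reads the key names out of an s/keys spec form."
  [keys-spec]
  (let [unqualify (comp keyword name)
        {:keys [req req-un opt opt-un]} (apply hash-map (rest (s/form keys-spec)))]
    (concat req (map unqualify req-un) opt (map unqualify opt-un))))

(defn extract
  "Hypothetical helper: dispatches multifn on m to find the matching
  s/keys spec, then keeps only that spec's keys."
  [multifn m]
  (select-keys m (keys-of (multifn m))))

(def ab2 {:protocol :p2 :p 1 :q 2 :r 3 :s 4 :t 5})

(extract component-a ab2)
;; => {:protocol :p2, :p 1, :q 2, :r 3}
(extract component-b ab2)
;; => {:protocol :p2, :s 4, :t 5}
```

The trade-off is that extract takes the multimethod rather than the spec keyword, so it depends on how the multi-spec is implemented rather than on its registered name.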

Related

Clojure spec maps

Given the following specs:
(s/def ::x keyword?)
(s/def ::y keyword?)
(s/def ::z keyword?)
(s/def ::a
(s/keys :req-un [::x
::y]
:opt-un [::z]))
(s/def ::b
(s/map-of string? string?))
how do I combine ::a and ::b into ::m so the following data is valid:
(s/valid? ::m
{:x :foo
:y :bar
:z :any})
(s/valid? ::m
{:x :foo
:y :bar})
(s/valid? ::m
{:x :foo
:y :bar
:z :baz})
(s/valid? ::m
{:x :foo
:y :bar
:z "baz"})
(s/valid? ::m
{:x :foo
:y :bar
:t "tic"})
additionally, how do I combine ::a and ::b into ::m so the following data is invalid:
(s/valid? ::m
{"r" "foo"
"t" "bar"})
(s/valid? ::m
{:x :foo
"r" "bar"})
(s/valid? ::m
{:x :foo
:y :bar
:r :any})
Neither of:
(s/def ::m (s/merge ::a ::b))
(s/def ::m (s/or :a ::a :b ::b))
works (as expected), but is there a way to match map entries in priority of the spec order?
The way it should work is the following:
take all the map entries of the value (which is a map)
partition the map entries into two sets, one conforming to the ::a spec and the other conforming to the ::b spec.
The two sub-maps should conform each the relevant spec as a whole. E.g the first partition should have all the required keys.
You can do this by treating the map not as a map but as a collection of map entries, and then validate the map entries. Handling the "required" keys part has to be done by s/and'ing an additional predicate.
(s/def ::x keyword?)
(s/def ::y keyword?)
(s/def ::z keyword?)
;; The key sets hold the unqualified keys that actually appear in the data.
(s/def ::entry (s/or :x (s/tuple #{:x} ::x)
:y (s/tuple #{:y} ::y)
:z (s/tuple #{:z} ::z)
:str (s/tuple string? string?)))
(defn req-keys? [m] (and (contains? m :x) (contains? m :y)))
(s/def ::m (s/and map? (s/coll-of ::entry :into {}) req-keys?))
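As a quick check of this approach, here is a self-contained sketch with a few sample maps. It assumes the key sets in ::entry hold the unqualified keys :x, :y and :z, since those are the keys that appear in the data.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::x keyword?)
(s/def ::y keyword?)
(s/def ::z keyword?)

;; Each map entry is validated as a [key value] tuple.
(s/def ::entry (s/or :x   (s/tuple #{:x} ::x)
                     :y   (s/tuple #{:y} ::y)
                     :z   (s/tuple #{:z} ::z)
                     :str (s/tuple string? string?)))

;; The "required keys" rule is a separate predicate.
(defn req-keys? [m] (and (contains? m :x) (contains? m :y)))

(s/def ::m (s/and map? (s/coll-of ::entry :into {}) req-keys?))

(s/valid? ::m {:x :foo :y :bar})            ;; => true
(s/valid? ::m {:x :foo :y :bar "r" "bar"})  ;; => true
(s/valid? ::m {"r" "foo" "t" "bar"})        ;; => false (required keys missing)
(s/valid? ::m {:x :foo "r" "bar"})          ;; => false (:y missing)
(s/valid? ::m {:x :foo :y :bar :r :any})    ;; => false (:r matches no entry spec)
```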

How to create a spec where all keys are optional but at least one of the specified keys should be present?

How am I supposed to create a spec where all keys are optional but at least one of the specified keys should be present?
(s/def ::my-spec (s/and (help-plz??)(s/keys :opt-un [::a ::b])))
(s/valid? ::my-spec {}) => false
(s/valid? ::my-spec {:a 1}) => true
(s/valid? ::my-spec {:b 1}) => true
(s/valid? ::my-spec {:a 1 :b 1}) => true
(s/valid? ::my-spec {:A 1 :B 1}) => true
With the current spec alpha, in order to use the same key collection for both the keys spec and the at-least-one-exists check, you'll need to use a macro. (The upcoming spec 2 alpha addresses this by exposing more data-driven APIs for creating specs.)
Here's a quick sketch for your particular example:
(defmacro one-or-more-keys [ks]
(let [keyset (set (map (comp keyword name) ks))]
`(s/and (s/keys :opt-un ~ks)
#(some ~keyset (keys %)))))
(s/def ::my-spec (one-or-more-keys [::foo ::bar]))
(s/conform ::my-spec {:bar nil})
=> {:bar nil}
(s/conform ::my-spec {:baz nil})
=> :clojure.spec.alpha/invalid
Alternatively, you could just define the key collection twice, and use a similar predicate with s/and.
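A minimal sketch of that second option, using the ::a and ::b keys from the question. The key collection is written out twice, once for s/keys and once as the set used by the predicate.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::my-spec
  (s/and (s/keys :opt-un [::a ::b])
         ;; at least one of the listed keys must be present
         #(some #{:a :b} (keys %))))

(s/valid? ::my-spec {})          ;; => false
(s/valid? ::my-spec {:a 1})      ;; => true
(s/valid? ::my-spec {:b 1})      ;; => true
(s/valid? ::my-spec {:a 1 :b 1}) ;; => true
(s/valid? ::my-spec {:c 1})      ;; => false
```

The obvious drawback is that the two collections can drift apart, which is what the macro avoids.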
Per the docs for keys:
The :req key vector supports 'and' and 'or' for key groups:
(s/keys :req [::x ::y (or ::secret (and ::user ::pwd))] :opt [::z])
Your code should be:
(s/def ::my-spec (s/keys :req-un [(or ::a ::b)]))
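For illustration, a short self-contained check of the or form against the cases from the question (::a and ::b need no registered specs for the key-presence check to work):

```clojure
(require '[clojure.spec.alpha :as s])

;; Requires :a or :b (or both) to be present.
(s/def ::my-spec (s/keys :req-un [(or ::a ::b)]))

(s/valid? ::my-spec {})          ;; => false
(s/valid? ::my-spec {:a 1})      ;; => true
(s/valid? ::my-spec {:b 1})      ;; => true
(s/valid? ::my-spec {:a 1 :b 1}) ;; => true
(s/valid? ::my-spec {:c 1})      ;; => false
```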

How to define Clojure spec for `'(foo (:x 1 :y 2))`

These specs:
(s/def ::a (s/cat :k keyword? :i int?))
(s/def ::b (s/cat :symbol any?
:a (s/coll-of ::a)))
conform this input:
(s/conform ::b '(foo ((:x 1) (:y 2))))
And these specs:
(s/def ::a (s/cat :k keyword? :i int?))
(s/def ::b (s/cat :symbol any?
:a (s/* ::a)))
conform this input:
(s/conform ::b '(foo :x 1 :y 2))
But how do I write ::b so that (s/conform ::b '(foo (:x 1 :y 2))) succeeds?
To nest that into a list, you need to wrap it in s/spec. E.g.:
(s/def ::b (s/cat :symbol any? :a (s/spec (s/* ::a))))
This is mentioned in the Spec Guide:
When regex ops are combined, they describe a single sequence. If you need to spec a nested sequential collection, you must use an explicit call to spec to start a new nested regex context. For example to describe a sequence like [:names ["a" "b"] :nums [1 2 3]], you need nested regular expressions to describe the inner sequential data:
(s/def ::nested
(s/cat :names-kw #{:names}
:names (s/spec (s/* string?))
:nums-kw #{:nums}
:nums (s/spec (s/* number?))))
(s/conform ::nested [:names ["a" "b"] :nums [1 2 3]])
;;=> {:names-kw :names, :names ["a" "b"], :nums-kw :nums, :nums [1 2 3]}
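Applied to the specs from the question, the s/spec wrapper gives the following. This is a small self-contained sketch; the flat input from the earlier variant no longer matches, because :a now expects a single nested sequence.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::a (s/cat :k keyword? :i int?))

;; s/spec starts a new regex context, so (s/* ::a) describes
;; the nested list rather than the rest of the outer sequence.
(s/def ::b (s/cat :symbol any?
                  :a (s/spec (s/* ::a))))

(s/conform ::b '(foo (:x 1 :y 2)))
;; => {:symbol foo, :a [{:k :x, :i 1} {:k :y, :i 2}]}

(s/valid? ::b '(foo :x 1 :y 2))
;; => false
```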

Specify content of a submap based on a field

Maybe my question has already been answered but I am stuck with a submap specification.
Imagine I have two possibilities like that
{:type :a
:spec {:name "a"}}
{:type :b
:spec {:id "b"}}
In short: the keys of :spec depend on the type. For type :a, :spec must contain the field :name, and for type :b, :spec must contain the field :id.
I tried this:
(s/def ::type keyword?)
(defmulti input-type ::type)
(defmethod input-type :a
[_]
(s/keys :req-un [::name]))
(defmethod input-type :b
[_]
(s/keys :req-un [::id]))
(s/def ::spec (s/multi-spec input-type ::type))
(s/def ::input (s/keys :req-un [::type ::spec]))
This tells me: no method ([:spec nil]).
I think I see why: maybe type is not accessible.
So I thought to make a multi-spec of a higher level (based on the whole map).
Problem: I do not know how to define :spec based on :type because they have the same name. Do you know how to perform this?
Thanks
(s/def ::type keyword?)
(s/def ::id string?)
(s/def ::name string?)
(s/def :id/spec (s/keys :req-un [::id]))
(s/def :name/spec (s/keys :req-un [::name]))
To accommodate the two different meanings of your :spec map, we can define them in different namespaces: :id/spec and :name/spec. Note that the name part of both keywords is spec, and our keys specs use :req-un, so both match the un-namespaced :spec key. These are "fake" namespaces here, but you could also define these in other, "real" namespaces in your project.
(defmulti input-type :type)
(defmethod input-type :a [_]
(s/keys :req-un [::type :name/spec]))
(defmethod input-type :b [_]
(s/keys :req-un [::type :id/spec]))
(s/def ::input (s/multi-spec input-type :type))
(s/valid? ::input {:type :a, :spec {:name "a"}})
=> true
You can also get samples of this spec:
(gen/sample (s/gen ::input))
=>
({:type :a, :spec {:name ""}}
{:type :b, :spec {:id "aI"}} ...

clojure.spec coll-of alternative types

I'm using clojure.spec to validate a vector of map-entries. The vector looks like:
[{:point {:x 30 :y 30}}
{:point {:x 34 :y 33}}
{:user "joe"}]
I'd like to structure the spec to require 1..N ::point entries and only a single ::user entry.
Here is my (unsuccessful) attempt at structuring this spec:
(s/def ::coord (s/and number? #(>= % 0)))
(s/def ::x ::coord)
(s/def ::y ::coord)
(s/def ::point (s/keys :req-un [::x ::y]))
(s/def ::user (s/and string? seq))
(s/def ::vector-entry (s/or ::pt ::user))
(s/def ::my-vector (s/coll-of ::vector-entry :kind vector))
When I run just the validation of one ::point entry, it works:
spec> (s/valid? ::point {:point {:x 0 :y 0}})
true
spec> (s/valid? ::my-vector [{:point {:x 0 :y 0}}])
false
Any ideas on how to structure the s/or part so the vector entries can be of either ::user or ::point types?
Also, any ideas on how to require one and only one ::user entry and 1..N ::point entries in the vector?
Here is a possible spec for the data in your question:
(require '[clojure.spec.alpha :as s])
(s/def ::coord nat-int?)
(s/def ::x ::coord)
(s/def ::y ::coord)
(s/def ::xy (s/keys :req-un [::x ::y]))
(s/def ::point (s/map-of #{:point} ::xy))
(s/def ::username (s/and string? seq))
(s/def ::user (s/map-of #{:user} ::username))
(s/def ::vector-entry (s/or :point ::point :user ::user))
(s/def ::my-vector (s/coll-of ::vector-entry :kind vector?))
(s/valid? ::point {:point {:x 0 :y 0}})
(s/valid? ::my-vector [{:point {:x 0 :y 0}}])
(s/valid? ::my-vector [{:point {:x 0 :y 0}} {:user "joe"}])
A few observations:
An or spec requires that specs be given names.
Labelling the different items by type (:point or :user) necessitates a level of indirection. I used map-of at the top level and keys for the nested level, but there are many choices.
The small errors in your specs could be caught early by trying each subform at the REPL.
In this case the relative difficulty of specing the data is a hint that this data shape will be inconvenient for programs, too. Why force a program to do an O(N) search when you know :user is required?
Hope this helps!
While Stuart's answer is very instructive and solves many of your problems, I don't think it covers your criterion of ensuring "one and only one ::user entry."
Riffing off his answer:
(s/def ::coord nat-int?)
(s/def ::x ::coord)
(s/def ::y ::coord)
(s/def ::xy (s/keys :req-un [::x ::y]))
(s/def ::point (s/map-of #{:point} ::xy))
(s/def ::username (s/and string? seq))
(s/def ::user (s/map-of #{:user} ::username))
(s/def ::vector-entry (s/or :point ::point
:user ::user))
(s/def ::my-vector (s/and (s/coll-of ::vector-entry
                                     :kind vector?)
                          ;; entries are the conformed [tag value] pairs from s/or
                          (fn [entries]
                            (= 1
                               (count (filter (comp #{:user}
                                                    key)
                                              entries))))))
(s/valid? ::point {:point {:x 0 :y 0}})
;; => true
(s/valid? ::my-vector [{:point {:x 0 :y 0}}])
;; => false
(s/valid? ::my-vector [{:point {:x 0 :y 0}}
{:user "joe"}])
;; => true
(s/valid? ::my-vector [{:point {:x 0 :y 0}}
{:point {:x 1 :y 1}}
{:user "joe"}])
;; => true
(s/valid? ::my-vector [{:point {:x 0 :y 0}}
{:user "joe"}
{:user "frank"}])
;; => false
The important addition is in the spec for ::my-vector. Note that the conformed output of s/or is a map entry, and that is what is passed to the new custom predicate.
I should note that, while this works, it adds yet another linear scan to your validation. Unfortunately, I don't know if spec provides a good way to do it in a single pass.
The answers by Tim and Stuart solved the issue and were very informative. I would like to point out that there is also a feature of Clojure spec which can be used to specify structure for the vector of points and a user.
Namely, spec lets you use regular-expression operators to specify sequences. For more information, see the Spec guide.
Below is a solution using specs for sequences. This builds on the previous solutions.
(require '[clojure.spec.alpha :as s])
(s/def ::coord nat-int?)
(s/def ::x ::coord)
(s/def ::y ::coord)
(s/def ::xy (s/keys :req-un [::x ::y]))
(s/def ::point (s/map-of #{:point} ::xy))
(s/def ::username (s/and string? seq))
(s/def ::user (s/map-of #{:user} ::username))
(s/def ::my-vector (s/cat :points-before (s/* ::point)
:user ::user
:points-after (s/* ::point)))
(s/valid? ::point {:point {:x 0 :y 0}})
;; => true
(s/valid? ::my-vector [{:point {:x 0 :y 0}}])
;; => false
(s/valid? ::my-vector [{:point {:x 0 :y 0}}
{:user "joe"}])
;; => true
(s/valid? ::my-vector [{:point {:x 0 :y 0}}
{:point {:x 1 :y 1}}
{:user "joe"}])
;; => true
(s/valid? ::my-vector [{:point {:x 0 :y 0}}
{:user "joe"}
{:user "frank"}])
;; => false
(s/valid? ::my-vector [{:point {:x 0 :y 0}}
{:user "joe"}
{:point {:x 1 :y 1}}])
;; => true
This can be easily adapted if the ::user entry is required at the end.
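One way that adaptation might look, as a sketch rather than the only option: s/+ requires at least one ::point, which also covers the "1..N points" part of the question, and the single ::user must come last. Note that s/cat accepts any sequential collection, so wrap it in (s/and vector? ...) if the input must be a vector.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::coord nat-int?)
(s/def ::x ::coord)
(s/def ::y ::coord)
(s/def ::xy (s/keys :req-un [::x ::y]))
(s/def ::point (s/map-of #{:point} ::xy))
(s/def ::username (s/and string? seq))
(s/def ::user (s/map-of #{:user} ::username))

;; One or more points, followed by exactly one user.
(s/def ::my-vector (s/cat :points (s/+ ::point)
                          :user ::user))

(s/valid? ::my-vector [{:point {:x 0 :y 0}}
                       {:user "joe"}])
;; => true
(s/valid? ::my-vector [{:user "joe"}])
;; => false
(s/valid? ::my-vector [{:point {:x 0 :y 0}}
                       {:user "joe"}
                       {:point {:x 1 :y 1}}])
;; => false
```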