defrecord argument filtrering - clojure

Let's say I've defined a record like this:
(defrecord MyRecord [x y z])
And I construct it like this:
(def test (map->MyRecord {:x "1" :y "2" :z "3" :w "ikk"}))
I can do like this:
(:w test) ; Returns "ikk"
Why is :w retained? I'm thinking that since I've created a record that takes x, y and z these are the only "keys" that should be present.
Is there a good way to exclude keys that are not present as arguments in the record declaration without using select-keys?
For example:
(defrecord MyRecord1 [x y z])
(defrecord MyRecord2 [x y w])
(defprotocol MyProtocol
(do-stuff [data]))
(extend-protocol MyProtocol
MyRecord1
(do-stuff [data]
(let [data (select-keys data [:x :y :z])] ; (S1)
...))
MyRecord2
(do-stuff [data]
(let [data (select-keys data [:x :y :w])] ; (S2)
...)))
I want to avoid doing select-keys (S1, S2) manually for each record when I use MyProtocol when records are constructed using map-> with additional data (that I don't care about).

Clojure records implement IPersistentMap (as well as java.util.Map for Java interop), and behaves like a normal map - this means that you can use them wherever and in the same way you would use maps. You can look at a record as a typed map - it's a map, but you can easily do dispatch using multimethods and protocols.
This makes it very easy to start representing your data using plain maps, and then advance to a record when you need the type.
As maps, records support additional keys, but they are treated differently. With your example, (.x test) works, but (.w test) does not, since only the predefined keys become fields in the implementing Java class.
To avoid extra keys, just make your own constructor:
(defn limiting-map->MyRecord
[m]
(map->MyRecord
(select-keys m [:x :y :z])))

It is a feature of defrecord. See: http://clojure.org/datatypes Section deftype and defrecord:
defrecord provides a complete implementation of a persistent map, including:
...
extensible fields (you can assoc keys not supplied with the defrecord definition)
Via Reflection you can see, that your x,y,z params are regular attributes of the object:
user=> (>pprint (.? (map->MyRecord {:w 4})))
(#[__extmap :: (user.MyRecord) | java.lang.Object]
;...
#[x :: (user.MyRecord) | java.lang.Object]
#[y :: (user.MyRecord) | java.lang.Object]
#[z :: (user.MyRecord) | java.lang.Object])
And the additional values are stored in that __extmap:
user=> (.-__extmap (map->MyRecord {:w 4}))
{:w 4}
This means, that there is nothing left for you other than watch out on the places you want to deal with your record as a Map, since new keys can be added at any time:
user=> (let [r (->MyRecord 1 2 3) r (assoc r :w 4)] (keys r))
(:x :y :z :w)
So if you find yourself repeating code like (select-keys myr [:x :y :z]) then extract that as a helper fn.
Adding things like your own c'tors is always a good idea (e.g. if you want to have 0 for missing keys instead of nil e.g.), but this only protects you from yourself and the users following your example.

Related

Inverse process of :keys destructuring: construct map from sequence

The more I write in Clojure, the more I come across the following sort of pattern:
(defn mapkeys [foo bar baz]
{:foo foo, :bar bar, :baz baz})
In a certain sense, this looks like the inverse process that a destructuring like
(let [{:keys [foo bar baz]}] ... )
would achieve.
Is there a "built-in" way in Clojure to achieve something similar to the above mapkeys (mapping name to keyword=>value) - perhaps for an arbitrary length list of names?
No such thing is built in, because it doesn't need to be. Unlike destructuring, which is fairly involved, constructing maps is very simple in Clojure, and so fancy ways of doing it are left for ordinary libraries. For example, I long ago wrote flatland.useful.map/keyed, which mirrors the three modes of map destructuring:
(let [transforms {:keys keyword
:strs str
:syms identity}]
(defmacro keyed
"Create a map in which, for each symbol S in vars, (keyword S) is a
key mapping to the value of S in the current scope. If passed an optional
:strs or :syms first argument, use strings or symbols as the keys instead."
([vars] `(keyed :keys ~vars))
([key-type vars]
(let [transform (comp (partial list `quote)
(transforms key-type))]
(into {} (map (juxt transform identity) vars))))))
But if you only care about keywords, and don't demand a docstring, it could be much shorter:
(defmacro keyed [names]
(into {}
(for [n names]
[(keyword n) n])))
I find that I quite frequently want to either construct a map from individual values or destructure a map to retrieve individual values. In the Tupelo Library I have a handy pair of functions for this purpose that I use all the time:
(ns tst.demo.core
(:use demo.core tupelo.core tupelo.test))
(dotest
(let [m {:a 1 :b 2 :c 3}]
(with-map-vals m [a b c]
(spyx a)
(spyx b)
(spyx c)
(spyx (vals->map a b c)))))
with result
; destructure a map into values
a => 1
b => 2
c => 3
; construct a map
(vals->map a b c) => {:a 1, :b 2, :c 3}
P.S. Of course I know you can destructure with the :keys syntax, but it always seemed a bit non-intuitive to me.

why records are not functions?

Unlike maps, records are not functions. Why?
;; maps are functions of the keys
({:a 1} :a) ;; 1
({:a 1}) ;; error
({:a 1} 1) ;; nil
;; records? no, records are not functions
(defrecord T [t])
((->T 1) :t) ;; error: T cannot be cast to clojure.lang.IFn
(:t (->T 1)) ;; 1
Others have answered this, but here's how you make one of your defrecord types implement the IFn interface:
user> (defrecord Blah
[x y]
clojure.lang.IFn
(invoke [o arg] (arg o)))
user> (let [obj (->Blah 1 2)]
[(obj :x) (obj :y)])
[1 2]
In deftype and defrecord, Rich writes "defrecord provides a complete implementation of a persistent map ...". So records ought to work as functions?
NO. IPersistentMap, the interface for persistent maps, does not implement IFn, the interface for Clojure functions.
However, consistency would be nice: converting maps to records ought to produce as few surprises as possible. Having said that ...
We use records only where the keys are known keywords, hence the
keyword-as-function syntax has always been available.
Creating a record is nothing like creating a map. This is a bigger
inconsistency, I think.
I hope this meandering maundering helps.

clojure same ":or" value for all keys

I've defined a record with a bunch of fields--some of which are computed, some of which don't map directly to keys in the JSON data I'm ingesting. I'm writing a factory function for it, but I want to have sensible default/not-found values. Is there a better way that tacking on :or [field1 "" field2 "" field3 "" field4 ""...]? I could write a macro but I'd rather not if I don't have to.
There are three common idioms for implementing defaults in constructor functions.
:or destructoring
Example:
(defn make-creature [{:keys [type name], :or {type :human
name (str "unnamed-" (name type))}}]
;; ...
)
This is useful when you want to specify the defaults inline. As a bonus, it allows let style bindings in the :or map where the kvs are ordered according to the :keys vector.
Merging
Example:
(def default-creature-spec {:type :human})
(defn make-creature [spec]
(let [spec (merge default-creature-spec
spec)]
;; ....
))
This is useful when you want to define the defaults externally, generate them at runtime and/or reuse them elsewhere.
Simple or
Example:
(defn make-creature [{:keys [type name]}]
(let [type (or type :human)
name (or name (str "unnamed-" (name type)))]
;; ...
))
This is as useful as :or destructoring but only those defaults are evaluated that are actually needed, i. e. it should be used in cases where computing the default adds unwanted overhead. (I don't know why :or evaluates all defaults (as of Clojure 1.7), so this is a workaround).
If you really want the same default value for all the fields, and they really have to be different than nil, and you don't want to write them down again, then you can get the record fields by calling keys on an empty instance, and then construct a map with the default values merged with the actual values:
(defrecord MyFancyRecord [a b c d])
(def my-fancy-record-fields (keys (map->MyFancyRecord {})))
;=> (:a :b :c :d)
(def default-fancy-fields (zipmap my-fancy-record-fields (repeat "")))
(defn make-fancy-record [fields]
(map->MyFancyRecord (merge default-fancy-fields
fields)))
(make-fancy-record {})
;=> {:a "", :b "", :c "", :d ""}
(make-fancy-record {:a 1})
;=> {:a 1, :b "", :c "", :d ""}
To get the list of record fields you could also use the static method getBasis on your record class:
(def my-fancy-record-fields (map keyword (MyFancyRecord/getBasis)))
(getBasis is not part of the public records api so there are no guarantees it won't be removed in future clojure versions. Right now it's available in both clojure and clojurescript, it's usage is explained in "Clojure programming by Chas Emerick, Brian Carper, Christophe Grand" and it's also mentioned in this thread during a discussion about how to get the keys from a record. So, it's up to you to decide if it's a good idea to use it)

Reusing destructuring for multimethods

Is there any way to reuse a destructuring between multiple methods in a multimethod?
(defmulti foo (fn [x] (:a x)))
(defmethod foo :1 [{:keys [a b c d e]}] (str a b c d e))
(defmethod foo :2 [a] "")
(defmethod foo :3 [a] "")
Now this is a trivial example, but imagine we have a much more complicated destructuring with nested maps and I want to use it on all my defmethods for foo. How would I do that?
A practical solution would be to only use the keys that you need for each individual method. An important thing to note about destructuring is that you don't have to bind every value in the collection you're destructuring. Let's say every map passed to this multimethod contains the keys :a through :e, but you only need a couple of those keys per method. You could do something like this:
; note: a keyword can act as a function; :a here is equivalent to (fn [x] (:a x))
(defmulti foo :a)
(defmethod foo :1 [{:keys [a b c d e]}] (str a b c d e))
(defmethod foo :2 [{:keys [b d]}] (str b d))
(defmethod foo :3 [{:keys [c e a]}] (str a c e))
If you have a complicated nested structure and you want to grab specific values, you can just leave out the keys you don't need, or alternatively, depending on your use case, a let binding within the function definition might end up being easier to read. Steve Losh's Caves of Clojure comes to mind -- in writing a roguelike text adventure game from scratch in Clojure, he used nested maps to represent the state of a game. Initially he wrote some of the functions using destructuring to access the inner bits of the "game state" map, e.g.:
(defmethod draw-ui :play [ui {{:keys [tiles]} :world :as game} screen]
...
But then later, he decided to make this code more readable by pulling the destructuring out into a let binding:
(defmethod draw-ui :play [ui game screen]
(let [world (:world game)
tiles (:tiles world)
...
The point is, if you're working with a deeply nested structure and you want to keep your code simple (especially if you're writing a multimethod with several methods taking that same structure as an argument), you may want to avoid using destructuring and just use let bindings to grab the pieces you want. get-in is a good tool for concisely getting values from nested collections. Going back to the Caves of Clojure example, if Steve just needed the tiles, he could have done something like this:
(defmethod draw-ui :play [ui game screen]
(let [tiles (get-in game [:world :tiles])
...
Personally, I find that much easier to read than mucking up the function arguments with {{:keys [tiles]} :world :as game}.
EDIT:
If you really want to avoid having to repeat the destructuring for each multimethod, and you want each method to have the same bindings available, you could write a macro:
(defmulti foo :a)
(defmacro deffoomethod [dispatch-val & body]
`(defmethod foo ~dispatch-val [{:keys [~'a ~'b ~'c ~'d ~'e]}]
~#body))
(deffoomethod 1 (str a b c d e))
(deffoomethod 2 (str b d))
(deffoomethod 3 (str a c e))
(foo {:a 1 :b 2 :c 3 :d 4 :e 5})
;=> "12345"
(foo {:a 2 :b \h :d \i})
;=> "hi"
(foo {:a 3 :b \x :c 0 :d \x :e 0})
;=> "300"
I wouldn't recommend this approach, though, as it breaks macro hygiene. Anyone using this macro has to remember that it binds the symbols a through e to the corresponding keys in the argument, and that could be problematic.

what advantage is there to use 'get' instead to access a map

Following up from this question: Idiomatic clojure map lookup by keyword
Map access using clojure can be done in many ways.
(def m {:a 1}
(get m :a) ;; => 1
(:a m) ;; => 1
(m :a) ;; => 1
I know I use mainly the second form, and sometimes the third, rarely the first. what are the advantages (speed/composability) of using each?
get is useful when the map could be nil or not-a-map, and the key could be something non-callable (i.e. not a keyword)
(def m nil)
(def k "some-key")
(m k) => NullPointerException
(k m) => ClassCastException java.lang.String cannot be cast to clojure.lang.IFn
(get m k) => nil
(get m :foo :default) => :default
From the clojure web page we see that
Maps implement IFn, for invoke() of one argument (a key) with an
optional second argument (a default value), i.e. maps are functions of
their keys. nil keys and values are ok.
Sometimes it is rewarding to take a look under the hoods of Clojure. If you look up what invoke looks like in a map, you see this:
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/APersistentMap.java#L196
It apparently calls the valAt method of a map.
If you look at what the get function does when called with a map, this is a call to clojure.lang.RT.get, and this really boils down to the same call to valAt for a map (maps implement ILookUp because they are Associatives):
https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/RT.java#L634.
The same is true for a map called with a key and a not-found-value. So, what is the advantage? Since both ways boil down to pretty much the same, performance wise I would say nothing. It's just syntactic convenience.
You can pass get to partial etc. to build up HOFs for messing with your data, though it doesn't come up often.
user=> (def data {"a" 1 :b 2})
#'user/data
user=> (map (partial get data) (keys data))
(1 2)
I use the third form a lot when the data has strings as keys
I don't think there is a speed difference, and even if that would be the case, that would be an implementation detail.
Personally I prefer the second option (:a m) because it sometimes makes code a bit easier on the eye. For example, I often have to iterate through a sequence of maps:
(def foo '({:a 1} {:a 2} {:a 3}))
If I want to filter all values of :a I can now use:
(map :a foo)
Instead of
(map #(get % :a) foo)
or
(map #(% :a) foo)
Of course this is a matter of personal taste.
To add to the list, get is also useful when using the threading macro -> and you need to access via a key that is not a keyword
(let [m {"a" :a}]
(-> m
(get "a")))
One advantage of using the keyword first approach is it is the most concise way of accessing the value with a forgiving behavior in the case the map is nil.