I would like to know the idiomatic way to achieve data encapsulation
in Clojure. Below I describe my particular use case to motivate the example
code I provide.
I have code in a dbaccess module that performs a complex
database query to fetch some
data from a database. I also have a number of functions that operate
on the returned, raw, data. These functions then provide more
processed, refined views of the data and are invoked numerous
times with different arguments from
other modules in the system. Let's call them "API functions".
The query is heavy and should execute only once, at the beginning;
the API functions will then operate on the raw-data
in memory without having to perform another DB query.
Here's my approach using closures:
dbaccess module
(ns dbaccess)
(let [raw-data (complex-database-query)]
  (defn create-client-names []
    (fn [some-args] raw-data))
  (defn create-client-portfolio []
    (fn [some-args] raw-data))
  (defn create-client-stocks []
    (fn [some-args] raw-data)))
some other client module
(def client-names (create-client-names))
(doall (map println (client-names "Baltimore")))
I dislike having to name the created functions that have
captured the raw-data.
More importantly, the code above doesn't allow
the client modules to configure aspects of the query before it executes
(e.g. the database connection information).
If, on the other hand, closures are not used I will have to
explicitly pass the raw-data
back and forth between the dbaccess module and the other modules that need to invoke API functions.
Is there a better way? Should I perhaps use mutable state in the dbaccess module?
I will have to explicitly pass the raw-data back and forth between the
dbaccess module and the other modules that need to invoke API
functions
You should do exactly that and pass each function the data it needs explicitly, because:
It leads to loose coupling between how the data is created and how it is processed.
The functions are easier to understand when reading them.
Each function is easy to test, because you can supply mock data directly.
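As a minimal sketch of the explicit-passing style described above: load-raw-data, its db-config argument, and the sample rows are hypothetical stand-ins (the original query and data shape are not shown), so treat this as an illustration rather than the asker's actual module.

```clojure
;; Hypothetical stand-in for the expensive query. A real version would use
;; db-config to open a connection and run the query exactly once.
(defn load-raw-data [db-config]
  [{:name "Alice" :city "Baltimore" :stocks ["ACME"]}
   {:name "Bob"   :city "Boston"    :stocks ["INIT"]}])

;; API functions are pure: the raw data is always the first argument.
(defn client-names [raw-data city]
  (->> raw-data
       (filter #(= city (:city %)))
       (map :name)))

(defn client-stocks [raw-data client-name]
  (->> raw-data
       (filter #(= client-name (:name %)))
       (mapcat :stocks)))

;; The caller decides when the query runs and with which configuration.
(def raw-data (load-raw-data {:host "localhost"}))

(client-names raw-data "Baltimore")
(client-stocks raw-data "Bob")
```

Because the API functions never touch the database, a test can call them with a hand-written vector of maps, and the client module stays in control of the connection settings.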
I guess you don't need let in this case:
(def ^:private raw-data (promise))
(future (deliver raw-data (complex-database-query))) ;; A. Webb mentioned this
(defn create-client-names []
  (fn [some-args] @raw-data))
...
Why aren't create-client-names and other functions just
(defn create-client-names [some-args]
  @raw-data)
...
?
Also, IMO it's better to use doseq instead of map when the body is imperative:
(doseq [name (client-names "Baltimore")]
  (println name))
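Putting the promise, the future, and doseq together gives something like the following runnable sketch. The body of complex-database-query and the :city filter are assumptions made up for illustration; only the promise/deliver/deref pattern comes from the answer.

```clojure
;; Hypothetical stand-in for the heavy query.
(defn complex-database-query []
  [{:name "Alice" :city "Baltimore"}
   {:name "Bob"   :city "Boston"}])

(def ^:private raw-data (promise))

;; Run the query on another thread and fulfil the promise once.
(future (deliver raw-data (complex-database-query)))

;; @raw-data blocks until the value has been delivered, then returns it
;; immediately on every later call.
(defn client-names [city]
  (for [row @raw-data
        :when (= city (:city row))]
    (:name row)))

(doseq [name (client-names "Baltimore")]
  (println name))
```

Note that future uses a non-daemon thread pool, so a standalone script may want to call (shutdown-agents) before exiting.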
Related
I have this function:
(defn get-validator []
  (UrlValidator. (into-array ["https"])))
I want it to be evaluated only once, on the first call, then just return the result. Which is the better way to write it:
(def get-validator (UrlValidator. (into-array ["https"])))
(def ^:const get-validator (UrlValidator. (into-array ["https"])))
(defonce get-validator (UrlValidator. (into-array ["https"])))
Or is there another way that is better? Documentation suggests that defonce is the correct one, but it's not clearly stated.
First of all, def creates a var with the given name and, optionally, sets
the given "init" value in that var when the namespace is loaded/compiled
(note that there is not much of a difference between loading
and compiling, which is why you don't want to have
side effects in your def).
Roughly all three versions are the same, with the following differences:
^:const allows inlining this value; this affects the forms that are
compiled afterwards and might improve the performance of that
code
defonce prevents re-def-ining the var again once the namespace is
reloaded; this is most useful if you def mutable state, that you
want to survive over reloading your code
In any case, requiring the ns for the first time will execute the code
that initializes the var. After that it is basically left alone (imagine a static
property in a Java class with a static initializer).
All that said: if UrlValidator has internal state you might still be
better off using a function to create a fresh one, whenever you need it.
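The difference between def and defonce when a form is evaluated again (as happens on a namespace reload) can be seen in a small sketch. init-count and make-thing are hypothetical names introduced only to make the initializer's side effect countable.

```clojure
(def init-count (atom 0))

;; Hypothetical initializer whose side effect lets us count how often it runs.
(defn make-thing []
  {:created-as (swap! init-count inc)})

(def plain (make-thing))     ; runs the initializer
(def plain (make-thing))     ; evaluating the form again runs it again

(defonce kept (make-thing))  ; runs the initializer
(defonce kept (make-thing))  ; var already has a root value, init is skipped

@init-count ;; => 3
```

Both forms still run their initializer when the namespace is first loaded; neither waits for a "first call".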
Another approach, assuming UrlValidator. is referentially transparent, is to use clojure.core/memoize.
So
(defn get-validator []
  (UrlValidator. (into-array ["https"])))
(def uval (memoize get-validator))
And then use (uval) whenever you need the validator. Because of the memoization, get-validator will be called only once.
This approach will make only one call to UrlValidator. the first time (uval) is executed. All the other suggestions will call UrlValidator., once, when the namespace is loaded.
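To check the "called only once, on first use" claim without depending on the UrlValidator class, here is a sketch in which get-validator returns a plain map instead. The construct-count atom and the map are hypothetical stand-ins for illustration.

```clojure
(def construct-count (atom 0))

;; Hypothetical stand-in for (UrlValidator. (into-array ["https"])).
(defn get-validator []
  (swap! construct-count inc)
  {:schemes ["https"]})

(def uval (memoize get-validator))

;; Nothing has been constructed yet at this point.
@construct-count ;; => 0

(uval)
(uval)

;; The second call was served from the cache.
@construct-count ;; => 1
(identical? (uval) (uval)) ;; => true
```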
I'd like to hide the details of my persistence layer behind some sort of interface. In Java I would just create an interface and choose the correct implementation in some sort of bootup function. I'm still struggling on how to do that in Clojure. I don't necessarily need any type-safety here, I trust in my unit tests to find any issues there. The best thing I could come up with was to create a map containing anonymous functions with specific keys, like so:
(def crux-db
  {:get-by-id (fn [id] (get-obj...))
   :save (fn [id obj] (store-obj...))})
(def fs-db
  {:get-by-id (fn [id] (get-obj...))
   :save (fn [id obj] (store-obj...))})
If I'm not missing something, this would allow me to replace the database implementation by def-ing (def db crux-db) or (def db fs-db), as long as all the functions exist in all implementation maps. Somehow I feel like this is not the clojure way but I can't put my finger on it. Is there another way to do this?
Protocols are a way to do that. They let you define what functions should be there. And
you can later implement them for different things with e.g.
a defrecord.
A protocol is a named set of named methods and their signatures, defined using defprotocol:
(defprotocol AProtocol
  "A doc string for AProtocol abstraction"
  (bar [a b] "bar docs")
  (baz [a] [a b] [a b c] "baz docs"))
No implementations are provided
Docs can be specified for the protocol and the functions
The above yields a set of polymorphic functions and a protocol object
all are namespace-qualified by the namespace enclosing the definition
The resulting functions dispatch on the type of their first argument, and thus must have at least one argument
defprotocol is dynamic, and does not require AOT compilation
defprotocol will automatically generate a corresponding interface, with the same name as the protocol, e.g. given a protocol my.ns/Protocol, an interface my.ns.Protocol. The interface will have methods corresponding to the protocol functions, and the protocol will automatically work with instances of the interface.
Since you mentioned crux in your code, you can have a peek at how they
use protocols, and at how they then use defrecords to implement some of
them.
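Applied to the persistence layer from the question, a protocol-based version might look like the sketch below. Store, MemoryStore, NullStore, and rename! are hypothetical names; a real crux or file-system backend would be another defrecord implementing the same protocol.

```clojure
(defprotocol Store
  (get-by-id [store id] "Returns the object stored under id, or nil.")
  (save [store id obj] "Stores obj under id and returns obj."))

;; In-memory implementation backed by an atom, handy for tests.
(defrecord MemoryStore [state]
  Store
  (get-by-id [_ id] (get @state id))
  (save [_ id obj] (swap! state assoc id obj) obj))

;; Implementation that stores nothing, to show that backends are swappable.
(defrecord NullStore []
  Store
  (get-by-id [_ _id] nil)
  (save [_ _id obj] obj))

(defn memory-store []
  (->MemoryStore (atom {})))

;; Calling code depends only on the protocol functions, not on a backend.
(defn rename! [store id new-name]
  (save store id (assoc (get-by-id store id) :name new-name)))

(def db (memory-store))
(save db 1 {:name "old"})
(rename! db 1 "new")
(get-by-id db 1) ;; => {:name "new"}
```

Switching backends is then a matter of constructing a different record at startup and passing it to the same functions.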
There are several ways to achieve this. One way would be to use protocols. The other way would be to just use higher-order functions, where you would "inject" the specific function and expose it like so:
(defn get-by-id-wrapper [implementation]
  (fn [id]
    (implementation id)
    ...))
(defn cruxdb-get-by-id [id]
  ...)
(def get-by-id (get-by-id-wrapper cruxdb-get-by-id))
Also worth mentioning here are libraries like component or integrant, which are used to manage the lifecycle of state.
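A runnable variant of the higher-order-function approach, assuming a hypothetical in-memory backend (fake-rows, memory-get-by-id) in place of the elided crux implementation, and an invented :fetched flag as the "shared behaviour" added by the wrapper:

```clojure
;; Hypothetical in-memory backend used only for illustration.
(def fake-rows {1 {:id 1 :name "Alice"}})

(defn memory-get-by-id [id]
  (get fake-rows id))

;; The wrapper receives the backend function and adds behaviour that is
;; shared by every backend (here: tagging found objects).
(defn get-by-id-wrapper [implementation]
  (fn [id]
    (when-let [obj (implementation id)]
      (assoc obj :fetched true))))

;; The backend is chosen once, where the public function is defined.
(def get-by-id (get-by-id-wrapper memory-get-by-id))

(get-by-id 1) ;; => {:id 1, :name "Alice", :fetched true}
(get-by-id 2) ;; => nil
```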
I have a record:
(defrecord Foo [a b])
and an instance method for it:
(defn inc-a-field [this] (into this {:a (inc (:a this))}))
Is it best practice to define a protocol for that (since it is Foo-specific)?
Yes, it's best to define a protocol with all the desired methods first if you want to attach them to your record type. An alternative is to use ordinary functions with no attachment to your record.
Protocols are very handy for stateful operations. For example, look at carmine connection record implementation.
But if your record is just a map with predefined structure, then it may be better to use ordinary clojure functions instead.
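For the record in the question, the ordinary-function route could be as small as the following sketch, which uses clojure.core/update in place of the into call; the result is still a Foo because updating an existing field preserves the record type.

```clojure
(defrecord Foo [a b])

;; An ordinary function: records behave like maps, so update works on them.
(defn inc-a-field [foo]
  (update foo :a inc))

(inc-a-field (->Foo 1 2)) ;; a Foo with :a 2 and :b 2
```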
You should also look at this question, it's very similar to yours.
I'm new to Clojure and I'm trying to make sense of the different design choices available in different situations. In this particular case I would like to group tightly coupled functionality and make it possible to pass the functions around as a collection.
When to use function maps to group tightly related functionality and when to use protocols (+ implementations)?
What are the advantages and drawbacks?
Is either more idiomatic?
For reference, here are two examples of what I mean. With fn maps:
(defn do-this [x] ...)
(defn do-that [] ...)
(def some-do-context {:do-this (fn [x] (do-this x))
                      :do-that (fn [] (do-that))})
and in the second case,
(defprotocol SomeDoContext
  (do-this [this x] "does this")
  (do-that [this] "does that"))
(deftype ParticularDoContext []
  SomeDoContext
  (do-this [this x] (do-this x))
  (do-that [this] (do-that)))
It all depends on what you meant by "tightly related functionality". There can be 2 interpretations:
This set of functions implements a particular component/sub-system of the system. Example: logging, authentication etc. In this case you will probably use a Clojure namespace (AKA module) to group the related functions rather than a hash map.
This set of functions works together on some data structure, type or object. In this case you will use the protocol-based approach, which allows ad-hoc polymorphism so that new types can also provide this set of functionality. Example: any interface-like thing such as Sortable or Printable.
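The ad-hoc polymorphism mentioned in the second interpretation can be sketched as follows. Describable, describe, and Point are hypothetical names; the point is only that a protocol can be extended to types that already exist as well as implemented by new ones.

```clojure
(defprotocol Describable
  (describe [x] "Returns a short human-readable description."))

;; Existing types can be taught the protocol after the fact.
(extend-protocol Describable
  String
  (describe [s] (str "string of length " (count s)))
  Long
  (describe [n] (str "number " n)))

;; New types can implement it inline.
(defrecord Point [x y]
  Describable
  (describe [p] (str "point at " (:x p) "," (:y p))))

(map describe ["abc" 42 (->Point 1 2)])
;; => ("string of length 3" "number 42" "point at 1,2")
```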
Protocols are like interfaces, so if an interface is what you're trying to create, use a protocol.
If you're just trying to group related functions somewhere, use a namespace; it's ok to have your functions floating there, attached to no particular object.
It seems to me you're thinking in terms of objects and using the map just to simulate an object or a struct to which you attach your functions. That feels unnatural to me unless it really is a type or protocol, in which case you should use defrecord, deftype and defprotocol.
An example taken from here about using defprotocol and defrecord:
(defprotocol IAnimal
  "the animal protocol"
  (inner-report [o] "a report"))
(defrecord Dog []
  IAnimal
  (inner-report [o]
    (println "Woof Woof.\n")))
(defrecord Cat []
  IAnimal
  (inner-report [o]
    (println "Meow Meow.\n")))
My first impression would be that the first way is data oriented and the second way is type oriented. So far, I prefer data oriented.
Perhaps the decision is related to Alan Perlis's quote: "It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures."
defn = public
defn- = private
Perhaps I have bad Clojure coding style -- but I find that most functions I write in Clojure are small helper functions that I do not want to expose.
Is there some configuration option, where:
defn = private by default,
and to make something public, I have to do defn+?
Thanks!
No. There is not.
An alternative approach which might or might not work for you is to declare a foo.bar.internal namespace containing all the private helpers which is used by your foo.bar namespace. This has advantages over private function declarations when you want to use private functions in macro expansions.
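A sketch of the internal-namespace layout, shown here in a single listing for brevity (in a project each ns would normally live in its own file). The helper normalize and the function same-name? are hypothetical examples.

```clojure
(ns foo.bar.internal
  (:require [clojure.string :as str]))

;; Public var, but kept out of the advertised API by living in .internal.
(defn normalize [s]
  (str/lower-case (str/trim s)))

(ns foo.bar
  (:require [foo.bar.internal :as internal]))

;; The public API delegates to the helpers.
(defn same-name? [a b]
  (= (internal/normalize a) (internal/normalize b)))

(same-name? " Alice" "alice ") ;; => true
```

Since normalize is an ordinary public var, a macro in foo.bar can expand into a call to it and the expansion still compiles in other namespaces, which is not the case for a defn- helper.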
If the "helper functions" are very likely to be used once, you could choose to make them locals of your bigger functions, or write them as anonymous functions. See letfn: http://clojuredocs.org/clojure_core/clojure.core/letfn and http://clojuredocs.org/clojure_core/clojure.core/fn.
I hardly ever use letfn myself.
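For completeness, a small sketch of helpers kept local with letfn; summarize, square, and mean are hypothetical names chosen for illustration.

```clojure
(defn summarize [numbers]
  ;; square and mean exist only inside this function body.
  (letfn [(square [x] (* x x))
          (mean [xs] (/ (reduce + xs) (count xs)))]
    {:mean (mean numbers)
     :mean-square (mean (map square numbers))}))

(summarize [1 3]) ;; => {:mean 2, :mean-square 5}
```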
The beauty of Clojure being a Lisp is that you can build and adapt the language to suit your needs. I highly recommend you read On Lisp, by Paul Graham. He now gives his book away for free.
Regarding your suggestion of defn+ vs defn vs defn-: this sounds to me like a good case for writing your own macro. Both defn and defn- are macros. You can simply redefine them as you wish, or wrap them in your own.
Here follows a suggested implementation, based mainly on Clojure's own implementation, including a simple utility and a test.
(defmacro defn+
  "same as Clojure's defn, yielding public def"
  [name & decls]
  (list* `defn (with-meta name (assoc (meta name) :public true)) decls))
(defmacro defn
  "same as Clojure's defn-, yielding non-public def"
  [name & decls]
  (list* `defn (with-meta name (assoc (meta name) :private true)) decls))
(defn mac1
  "same as `macroexpand-1`"
  [form]
  (. clojure.lang.Compiler (macroexpand1 form)))
(let [ex1 (mac1 '(defn f1 [] (println "Hello me.")))
      ex2 (mac1 '(defn+ f2 [] (println "Hello World!")))]
  (defn f1 [] (println "Hello me."))
  (defn+ f2 [] (println "Hello World!"))
  (prn ex1) (prn (meta #'f1)) (f1)
  (prn ex2) (prn (meta #'f2)) (f2))
As stated by @kotarak, there is no way (as far as I know) to do that, nor is it desirable.
Here is why I dislike defn- :
I found out that when using various Clojure libraries I sometimes need to slightly modify one function to better suit my particular needs.
It is often something quite small, and that makes sense only in my particular case. Often this is just a char or two.
But when this function reuses internal private functions, it makes it harder to modify. I have to copy-paste all those private functions.
I understand that this is a way for the programmer to say that "this might change without notice".
Regardless, I would like the opposite convention :
always use defn, which makes everything public
use defn+ (that doesn't exist yet) to specify to the programmer which functions are part of the public API that he is supposed to use. defn+ should be no different from defn otherwise.
Also please note that it is possible to access private functions anyway :
;; in namespace user
user> (defn- secret []
        "TOP SECRET")
;; from another namespace
(#'user/secret) ;;=> "TOP SECRET"