Clojure: keeping protocol definitions in a separate namespace from the implementation

I've been trying to build a discipline of separating my protocol definitions into their own namespace, primarily as a stylistic choice. One thing I don't like about this approach is that since anything a namespace requires is effectively "private" to that namespace, users wanting to call protocol functions from another namespace would have to add a require statement to their code for the protocols.
For example:
Protocol definition namespace:
(ns project.protocols)
(defprotocol Greet
  (greet [this greeting]))
Implementation namespace:
(ns project.entities
  (:require [project.protocols :as protocols]))
(defrecord TheDude
  [name drink]
  protocols/Greet
  (greet [this greeting]
    (println "The Dude sips a" drink)
    (println greeting)))
Core namespace:
(ns project.core
  (:require [project.protocols :as protocols]
            [project.entities :refer [TheDude]]))
(let [dude (TheDude. "Jeff" "white russian")]
  (protocols/greet dude "not on the rug, man..."))
This works just fine, but I don't particularly like that users need to be aware of the need to require project.protocols, which is really an implementation detail internal to project.entities. In other languages I would just refer to project.entities/greet within project.core, but namespaces don't "export" their required vars in Clojure; they are internal to the requiring namespace only. I see two obvious alternatives (a third could be using something like Potemkin):
1. Don't put the protocol definitions in a separate namespace, just define them in the same file as the implementation (e.g. project.entities here).
2. Inside the implementing file, create vars pointing to each and every protocol function (this is very ugly and just feels wrong, but works).
As an example of number 2:
(ns project.entities
  (:require [project.protocols :as protocols]))
(defrecord TheDude
  [name drink]
  protocols/Greet
  (greet [this greeting]
    (println "not on the rug, man...")
    (println "guess i'll have another " drink)))
(def greet protocols/greet) ; ¯\_(ツ)_/¯
My question, which I suppose is primarily one of preference, is what is the "best practices" (if any) way to handle this sort of separation of concerns? I realize that adding the require in project.core is only one more line, but my concern is less about line count and more about minimizing what a user would need to be aware of.
EDIT:
I think the obvious way to accomplish this is to not expect users to require both namespaces, but to create a core namespace which does that for them:
(ns project.core
  (:require [project.protocols :as protocols]
            [project.entities :refer [TheDude]]))
;; create wrapper 'constructor' functions like this for each record in `project.entities`
(defn new-dude
  [{:keys [name drink] :as dude}]
  (map->TheDude dude))
;; similarly, wrap each protocol method
(defn greet [person phrase]
  (protocols/greet person phrase))
Now any user can just require core, and if they want to extend the protocol to their own record in a different namespace, they can do so and calls to core/greet will pick up the new implementations. Additionally, if there is any pre/post processing to be done, that can be handled in the "higher level" API function core/greet.

In a program that needs protocols, usually objects are instantiated in some places and consumed (by protocol) in other places. Perhaps in some cases where that's not so, you didn't really need a protocol anyway. In effect, it is not too common to "require" the namespaces of both a protocol and an implementation of it. If it happens a lot, it is a code smell.
pete23's answer mentions using the dot syntax to call a record's method without involving the protocol's namespace. But using the protocol function has some modest advantages.
Absent implementation inheritance, a protocol contains essential ("primitive") functions only. Such functions are convenient to implement, but not necessarily super-friendly to callers. The protocol namespace is a great place to add non-primitive accessors, of the sort that you might object-orientedly have declared as a default method on an interface or as an inherited non-abstract method on an abstract base class. Consumers that use the protocol's namespace can call the primitives and non-primitives alike.
Sometimes a primitive turns out to need pre- or post-processing common to all implementations. No need to repeat the common stuff in every implementation! Just lightly refactor: rename the protocol function from f to -f, update the implementations, and add a function f in the protocol's namespace that wraps -f with the necessary pre and post. The callers do not need any change.
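The refactoring described above can be sketched roughly as follows. It reuses the Greet protocol and TheDude record from the question; the precondition, the trimming step, and the shout helper are assumptions made up for illustration, standing in for whatever common pre/post-processing and non-primitive accessors a real project would need.

```clojure
(ns project.protocols
  (:require [clojure.string :as str]))

;; Primitive: the only function implementers have to provide.
(defprotocol Greet
  (-greet [this greeting]))

;; Public wrapper keeps the original name, so callers need no change.
(defn greet [this greeting]
  {:pre [(string? greeting)]}           ; assumed shared pre-check
  (-greet this (str/trim greeting)))    ; assumed shared pre-processing

;; Hypothetical non-primitive helper, built on top of the wrapper.
(defn shout [this greeting]
  (greet this (str/upper-case greeting)))

;; Would normally live in project.entities; kept here so the sketch
;; runs as a single file.
(defrecord TheDude [name drink]
  Greet
  (-greet [this greeting]
    (str name " sips a " drink ": " greeting)))
```

Consumers that require the protocol namespace see greet and shout as ordinary functions and never need to know that -greet exists.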

Related

How do I abstract away implementation details in Clojure?

I'd like to hide the details of my persistence layer behind some sort of interface. In Java I would just create an interface and choose the correct implementation in some sort of bootup function. I'm still struggling on how to do that in Clojure. I don't necessarily need any type-safety here, I trust in my unit tests to find any issues there. The best thing I could come up with was to create a map containing anonymous functions with specific keys, like so:
(def crux-db
  {:get-by-id (fn [id] (get-obj...))
   :save (fn [id obj] (store-obj...))})
(def fs-db
  {:get-by-id (fn [id] (get-obj...))
   :save (fn [id obj] (store-obj...))})
If I'm not missing something, this would allow me to replace the database implementation by def-ing (def db crux-db) or (def db fs-db), as long as all the functions exist in all implementation maps. Somehow I feel like this is not the clojure way but I can't put my finger on it. Is there another way to do this?
Protocols are a way to do that. They let you define what functions should be there, and you can later implement them for different things with e.g. a defrecord.
A protocol is a named set of named methods and their signatures, defined using defprotocol:
(defprotocol AProtocol
  "A doc string for AProtocol abstraction"
  (bar [a b] "bar docs")
  (baz [a] [a b] [a b c] "baz docs"))
No implementations are provided
Docs can be specified for the protocol and the functions
The above yields a set of polymorphic functions and a protocol object
all are namespace-qualified by the namespace enclosing the definition
The resulting functions dispatch on the type of their first argument, and thus must have at least one argument
defprotocol is dynamic, and does not require AOT compilation
defprotocol will automatically generate a corresponding interface, with the same name as the protocol, e.g. given a protocol my.ns/Protocol, an interface my.ns.Protocol. The interface will have methods corresponding to the protocol functions, and the protocol will automatically work with instances of the interface.
Since you mentioned Crux in your code, you can have a peek at how they use protocols, and how they then use defrecords to implement some of them.
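Applied to the persistence question, the protocol approach might look like the sketch below. The Store protocol, the MemoryStore record, and the memory-store constructor are hypothetical names invented for this example; an in-memory atom stands in for a real backend such as Crux or the file system.

```clojure
(ns example.store)

;; The abstraction: what every persistence backend must support.
(defprotocol Store
  (get-by-id [store id])
  (save [store id obj]))

;; One implementation, backed by an atom holding a map (assumption:
;; a real backend would talk to a database or the file system instead).
(defrecord MemoryStore [state]
  Store
  (get-by-id [_ id] (get @state id))
  (save [_ id obj]
    (swap! state assoc id obj)
    obj))

(defn memory-store []
  (->MemoryStore (atom {})))

;; Choosing the implementation happens in one place, e.g. at startup.
(def db (memory-store))

(save db 1 {:name "first"})
(get-by-id db 1)
```

Swapping the backend then means constructing a different record that satisfies Store; callers of get-by-id and save stay the same.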
There are several ways to achieve this. One way would be to use protocols. The other way would be to just use higher-order functions, where you would "inject" the specific function and expose it like so:
(defn get-by-id-wrapper [implementation]
  (fn [id]
    (implementation id)
    ...))
(defn cruxdb-get-by-id [id]
  ...)
(def get-by-id (get-by-id-wrapper cruxdb-get-by-id))
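A runnable variant of the same idea is sketched here. The in-memory map fake-rows, the function memory-get-by-id, and the "missing" marker added by the wrapper are assumptions for illustration only; they stand in for the elided database code above.

```clojure
(ns example.injection)

;; Wrapper that receives the concrete lookup function and returns the
;; function the rest of the program will call.
(defn get-by-id-wrapper [implementation]
  (fn [id]
    (or (implementation id)
        {:id id :missing true})))   ; hypothetical shared post-processing

;; Stand-in backend: a plain map instead of a real database.
(def fake-rows {1 {:id 1 :name "first"}})

(defn memory-get-by-id [id]
  (get fake-rows id))

;; The injection point: swap memory-get-by-id for another backend here.
(def get-by-id (get-by-id-wrapper memory-get-by-id))
```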
Also worth mentioning here are libraries like component or integrant, which are used to manage the lifecycle of state.

Is gathering namespace functions into a map via a macros idiomatic Clojure?

I'm learning Clojure via a pet project. The project would consist of several workers that would be called from other functions.
Each worker is defined in their own namespace as a set of functions (currently two: get-data for gathering data and write-data for writing the gathered data into a file).
In order to make the code a bit DRYer, I decided to write a macro that would gather functions from namespace into a map that can be passed around:
(ns clojure-bgproc.workers)
(defmacro gen-worker-info []
  (let [get-data (ns-resolve *ns* 'get-data)
        write-data (ns-resolve *ns* 'write-data)]
    `(def ~'worker-info
       {:get-data ~get-data
        :write-data ~write-data})))
In my worker code, I use my macro (code abridged for clarity):
(ns clojure-bgproc.workers.summary
  (:require [clojure-bgproc.workers :refer [gen-worker-info]]))
(defn get-data [params]
  ;; <...>
  )
(defn write-data [data file]
  ;; <...>
  )
(gen-worker-info)
While it does work (I get my get-data and write-data functions in clojure-bgproc.workers.summary/worker-info), I find it a bit icky, especially since, if I move the macro call to the top of the file, it doesn't work.
My question is, is there a more idiomatic way to do so? Is this idiomatic Clojure at all?
Thank you.
I think you're in a weird spot because you've structured your program wrong:
Each worker is defined in their own namespace as a set of functions
This is the real problem. Namespaces are a good place to put functions and values that you will refer to in hand-written code. For stuff you want to access programmatically, they are not a good storage space. Instead, make the data you want to access first-class by putting it into an ordinary proper data structure, and then it's easy to manipulate.
For example, this worker-info map you're thinking of deriving from the namespace is great! In fact, that should be the only way workers are represented: as a map with keys for the worker's functions. Then you just define somewhere a list (or vector, or map) of such worker maps, and that's your list of workers. No messing about with namespaces needed.
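As a rough sketch of that suggestion, each worker can be written directly as a map and collected in a vector, with no macro involved. The summary worker's behaviour (summing numbers and writing EDN) and the run-worker! helper are made up for illustration; only the get-data and write-data keys come from the question.

```clojure
(ns example.workers)

;; A worker is just data: a map of its functions.
(def summary-worker
  {:name       :summary
   :get-data   (fn [params] {:total (reduce + (:numbers params))})
   :write-data (fn [data file] (spit file (pr-str data)))})

;; The list of workers is an ordinary vector that can be passed around.
(def workers [summary-worker])

;; Hypothetical helper that runs one worker end to end.
(defn run-worker! [{:keys [get-data write-data]} params file]
  (write-data (get-data params) file))
```

Because the worker map is written literally, its position in the file no longer matters the way the macro call's position did.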
My go-to solution for defining the workers would be Protocols. I would also apply some of the well-tried frameworks for system lifecycle management.
Protocols provide a way of defining a set of methods and their signatures. You may think of them as similar to, but more flexible than, interfaces in object-oriented programming.
Your workers will probably have some state and life-cycle, e.g., the workers may be running or stopped, acquiring and releasing a resource, and so on. I suggest you take a look at Integrant for managing a system with stateful components (i.e., workers).
I would argue for avoiding macros in this case. The adage data over functions over macros seems to apply here. Macros are not available at runtime, make debugging harder, and force all other programmers who look at your code to learn a new Domain-Specific Language, i.e., the one you defined with your macros.
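A minimal sketch of the protocol-based alternative is given below, leaving lifecycle management aside. The Worker protocol, the SummaryWorker record, and run-worker! are hypothetical names; the method signatures mirror the get-data and write-data functions from the question.

```clojure
(ns example.worker-protocol)

;; The contract every worker has to fulfil.
(defprotocol Worker
  (get-data [worker params])
  (write-data [worker data file]))

;; One concrete worker; its behaviour is invented for illustration.
(defrecord SummaryWorker []
  Worker
  (get-data [_ params] {:total (reduce + (:numbers params))})
  (write-data [_ data file] (spit file (pr-str data))))

;; Code that consumes workers depends only on the protocol.
(defn run-worker! [worker params file]
  (write-data worker (get-data worker params) file))
```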

Advising protocol methods in Clojure

I'm trying to advise a number of methods in one library with utility functions from another library, where some of the methods to be advised are defined with (defn) and some are defined with (defprotocol).
Right now I'm using this library, which uses (alter-var-root). I don't care which library I use (or whether I hand-roll my own).
The problem I'm running into right now is that protocol methods sometimes can be advised, and sometimes cannot, depending on factors that are not perfectly clear to me.
If I define a protocol, then define a type and implement that protocol in-line, then advising never seems to work. I am assuming this is because the type extends the JVM interface directly and skips the vars.
If, in a single namespace, I define a protocol, then advise its methods, and then extend the protocol to a type, the advising will not work.
If, in a single namespace, I define a protocol, then extend the protocol to a type, then advise the protocol's methods, the advising will work.
What I would like to do is find a method of advising that works reliably and does not rely on undefined implementation details. Is this possible?
Clojure itself doesn't provide a reliable way to advise functions, even those defined via def/defn. Consider the following example:
(require '[richelieu.core :as advice])
(advice/defadvice add-one [f x] (inc (f x)))
(defn func-1 [x] x)
(def func-2 func-1)
(advice/advise-var #'func-1 add-one)
> (func-1 0)
1
> (func-2 0)
0
After the form (def func-2 func-1) is evaluated, the var func-2 holds the value that func-1 was bound to at that moment (in other words, the function itself rather than the var), so advise-var on func-1 won't affect it.
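The same effect can be reproduced with clojure.core alone. In this sketch alter-var-root is a hand-rolled stand-in for the advice library, and func-3 is an extra definition added for contrast: it stores the var rather than its value. The sketch assumes direct linking is not enabled, which is the default for user code.

```clojure
(ns example.advice)

(defn func-1 [x] x)
(def func-2 func-1)      ; copies the function value held by the var right now
(def func-3 #'func-1)    ; holds the var itself, dereferenced on every call

;; Stand-in for advising: wrap the root value of the var.
(alter-var-root #'func-1 (fn [f] (fn [x] (inc (f x)))))

(func-1 0) ;=> 1, sees the new root value
(func-2 0) ;=> 0, still the original function
(func-3 0) ;=> 1, goes through the var
```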
Even though definitions like func-2 are rare, you may have noticed or used the following:
(defn generic-function [generic-parameter x y z]
  ...)
(def specific-function-1 (partial generic-function <specific-arg-1>))
(def specific-function-2 (partial generic-function <specific-arg-2>))
...
If you advise generic-function, none of the specific functions will pick up the advice, because of the peculiarity described above.
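The partial case can be illustrated the same way. Here the two-argument generic-function and the bracket-adding wrapper are simplified assumptions, and alter-var-root again stands in for an advice library. Passing the var (#'generic-function) to partial instead of the function value is one way to keep later changes visible.

```clojure
(ns example.partial-advice)

(defn generic-function [generic-parameter x]
  (str generic-parameter ":" x))

;; Captures the function value at definition time.
(def by-value (partial generic-function "a"))
;; Captures the var, so each call looks up the current root value.
(def by-var (partial #'generic-function "a"))

;; Stand-in for advising generic-function.
(alter-var-root #'generic-function
                (fn [f] (fn [p x] (str "[" (f p x) "]"))))

(by-value 1) ;=> "a:1"
(by-var 1)   ;=> "[a:1]"
```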
If advising is critical for you, one solution that may work is the following: since Clojure functions are compiled to Java classes, you could try to replace the Java method invoke with another method that has the desired behaviour. However, things become more complicated for protocol/interface methods: it seems you would have to replace the method in every class that implements the particular protocol or interface.
Otherwise, you'll need an explicit wrapper for every function that you want to advise. Macros may help to reduce boilerplate in this case.

Unit Testing Local Functions (letfn) in Clojure?

I spent a couple of years doing Scheme "back in the day" and am now learning Clojure. One of the "best practices" in Scheme was to define helper functions within the parent function thus limiting their visibility from "outside." Of course back then TDD wasn't done (or known!) so testing such functions wasn't an issue.
I'm still tempted to structure Clojure functions this way; i.e., using letfn to bind helper functions within the main function. Of course testing such "local" functions is problematic. I realize I can define "private" functions, but this scopes the visibility to the namespace which helps, but is not as fine grained. If you come upon a letfn within another function it's pretty clear that the function is not available for general use.
So, my question is, can one test such local functions and if so, how? If not, then is there some convention to aid in code reading so that it's clear that a function has only one caller?
TIA,
Bill
The usual approach is to just put the functions in the namespace.
One option is using metadata:
user=> (defn ^{::square #(* % %)} cube [x]
#_=> (* x ((::square (meta #'cube)) x)))
#'user/cube
user=> (meta #'cube)
{…, :user/square #<user$fn__780 user$fn__780@2e62c3f9>}
user=> (cube 3)
27
It is of course possible to write a macro to make this prettier.
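For the usual approach of keeping helpers at namespace level, a private helper can still be reached from tests through its var. The sketch below puts both namespaces in one file so it runs as a script; the namespace names are hypothetical, and in a real project the test namespace would live in its own file and require example.geometry.

```clojure
(ns example.geometry)

;; defn- marks the helper private: other namespaces cannot refer to it
;; by symbol, which signals that it is not for general use.
(defn- square [x] (* x x))

(defn cube [x] (* x (square x)))

(ns example.geometry-test
  (:require [clojure.test :refer [deftest is]]))

;; The var-quote form #'ns/name bypasses the privacy check, so the
;; helper can be tested directly.
(deftest square-test
  (is (= 9 (#'example.geometry/square 3))))

(deftest cube-test
  (is (= 27 (example.geometry/cube 3))))
```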

How does clojure's defrecord method name resolution work?

After defining a record and the interfaces it implements, I can call its methods either by its name or using the java interop way using the dot operator.
user=> (defprotocol Eat (eat [this]))
Eat
user=> (defrecord animal [name] Eat (eat [this] "eating"))
user.animal
user=> (eat (animal. "bob"))
"eating"
user=> (.eat (animal. "bob"))
"eating"
user=>
Under the surface, what is going on there? Are there new clojure functions being defined? What happens when there are functions you defined that share the same name (is this possible?), how are these ambiguities resolved?
Also, is it possible to "import" java methods for other java objects so that you do not need the . operator so that behavior is like above? (For the purpose, for example, of unifying the user interface)
When you define a protocol, each of its methods is created as a function in your current namespace. It follows that you can't have two protocols defining the same function in the same namespace. It also means that you can have them in separate namespaces and that a given type can extend both[1] of them without any name clash because they are namespaced (as opposed to Java, where a single class can't implement two interfaces with homonymous methods).
From a user perspective, protocol methods are no different from plain old non-polymorphic functions.
The fact that you can call a protocol method using interop is an implementation detail. The reason for that is that for each protocol, the Clojure compiler creates a corresponding backing interface. Later on when you define a new type with inline protocol extensions, then this type will implement these protocols' backing interfaces.
Consequently you can't use the interop form on an object for which the extension hasn't been provided inline:
(defrecord VacuumCleaner [brand model])
(extend-protocol Eat
  VacuumCleaner
  (eat [this] "eating legos and socks"))
(.eat (VacuumCleaner. "Dyson" "DC-20"))
; method not found exception
The compiler has special support for protocol functions so they are compiled as an instance check followed by a virtual method call, so when applicable (eat ...) will be as fast as (.eat ...).
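The difference between inline and external extension can be checked at the REPL with a sketch like the one below. It reads the backing interface from the protocol map via the :on-interface key, which is an implementation detail rather than documented API, so treat that lookup as an assumption; the namespace and the Animal record are adapted from the question.

```clojure
(ns example.eating)

(defprotocol Eat
  (eat [this]))

;; Inline implementation: the record class implements the backing interface.
(defrecord Animal [name]
  Eat
  (eat [this] "eating"))

;; External extension: the class itself is unchanged; dispatch goes
;; through the protocol function only.
(defrecord VacuumCleaner [brand model])

(extend-protocol Eat
  VacuumCleaner
  (eat [this] "eating legos and socks"))

;; Implementation detail (assumption): the generated interface is stored
;; in the protocol map under :on-interface.
(def backing-interface (:on-interface Eat))

(instance? backing-interface (->Animal "bob"))                 ; true
(instance? backing-interface (->VacuumCleaner "Dyson" "DC-20")) ; false
(satisfies? Eat (->VacuumCleaner "Dyson" "DC-20"))              ; true
```

This is why (.eat ...) only works on Animal here, while (eat ...) works on both.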
To reply to "can one import java methods", you can wrap them in regular fns:
(def callme #(.callme %1 %2 %3))
(obviously you may need to add other arities to account for overloads and type hints to remove reflection)
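A slightly fuller sketch of such wrappers, using java.lang.String methods as the stand-in Java API: upper-case and index-of are names chosen for this example, the type hints avoid reflection, and the second arity of index-of covers one overload of String.indexOf.

```clojure
(ns example.interop)

;; Type hint on the argument lets the compiler resolve the method
;; at compile time instead of using reflection.
(defn upper-case [^String s]
  (.toUpperCase s))

;; Separate arities account for overloaded Java methods.
(defn index-of
  ([^String s ^String sub]
   (.indexOf s sub))
  ([^String s ^String sub from]
   (.indexOf s sub (int from))))
```

Once wrapped, the methods behave like any other Clojure function: they can be passed to map, composed with comp, and so on.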
[1] However, you can't extend both inline (at least one of them must be in an extend-* form), because of an implementation limitation.