I have the option of directly implementing a protocol in the body of a defrecord instead of using extend-protocol/extend-type:
(defprotocol Fly
(fly [this]))
(defrecord Bird [name]
Fly
(fly [this] (format "%s flies" name)))
=>(fly (Bird. "crow"))
"crow flies"
If I now try to override the Fly protocol, I get an error:
(extend-type Bird
Fly
(fly [this] (format "%s flies away" (:name this))))
class user.Bird already directly implements interface user.Fly for protocol:#'user/Fly
On the other hand, if I use extend-type initially instead:
(defrecord Dragon [color])
(extend-type Dragon
Fly
(fly [this] (format "%s dragon flies" (:color this))))
=>(fly (Dragon. "Red"))
"Red dragon flies"
I can then "override" the fly function:
(extend-type Dragon
Fly
(fly [this] (format "%s dragon flies away" (:color this))))
=>(fly (Dragon. "Blue"))
"Blue dragon flies away"
My question is: why not allow extension in both cases? Is this a JVM limitation caused by the Record <-> Class relation, or is there a use case for a non-overridable protocol?
On one level, it is an issue of the JVM not allowing method implementations to be swapped in and out of classes, as implementing a protocol inline amounts to having the class created by defrecord implement an interface corresponding to the protocol. Note that while opting to do so does sacrifice some flexibility, it also buys speed -- and indeed speed is the major design consideration here.
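The difference between the two styles can be observed at the REPL. This is only a minimal sketch: the names Swim, Fish and Frog are made up for illustration, and the :on-interface key of the protocol map is an implementation detail of Clojure on the JVM (assumed here, not a documented API) that may change between versions.

```clojure
;; Illustrative names (Swim, Fish, Frog); not part of the original question.
(defprotocol Swim
  (swim [this]))

;; Inline implementation: the generated class implements the protocol's interface.
(defrecord Fish [name]
  Swim
  (swim [this] (format "%s swims" name)))

;; Implementation via extend-type: kept in the protocol's own dispatch data instead.
(defrecord Frog [name])
(extend-type Frog
  Swim
  (swim [this] (format "%s paddles" (:name this))))

;; ASSUMPTION: :on-interface is an implementation detail of the protocol map;
;; it holds the backing Java interface generated by defprotocol.
(def swim-interface (:on-interface Swim))

(instance? swim-interface (->Fish "trout"))  ; => true, inline implementation
(instance? swim-interface (->Frog "kermit")) ; => false, extended after the fact
(satisfies? Swim (->Frog "kermit"))          ; => true, the protocol is still satisfied
```

Since Fish's class has the method compiled in, there is nothing for a later extend-type to swap out, whereas Frog's implementation is just data attached to the protocol.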
On another level, it is generally a very bad idea for protocol implementations on types to be provided by code which does not own either the type or the protocol in question. See for example this thread on the Clojure Google group for a related discussion (including a statement by Rich Hickey); there's also a relevant entry in the Clojure Library Coding Standards:
Protocols:
One should only extend a protocol to a type if he owns either the type or the protocol.
If one breaks the previous rule, he should be prepared to withdraw, should the implementor of either provide a definition
If a protocol comes with Clojure itself, avoid extending it to types you don't own, especially e.g. java.lang.String and other core Java interfaces. Rest assured if a protocol should extend to it, it will, else lobby for it.
The motive is, as stated by Rich Hickey, [to prevent] "people extend protocols to types for which they don't make sense, e.g. for which the protocol authors considered but rejected an implementation due to a semantic mismatch.". "No extension will be there (by design), and people without sufficient understanding/skills might fill the void with broken ideas."
This has also been discussed a lot in the Haskell community in connection with type classes (google "orphan instances"; there are some good posts on the topic here on SO too).
Now clearly an inline implementation is always provided by the owner of the type and so should not be replaced by client code. So, I can see two valid use cases remain for protocol method replacement:
1. testing things out at the REPL and applying quick tweaks;
2. modifying protocol method implementations in a running Clojure image.
(1) might be less of a problem if one uses extend & Co. at development time and only switches to inline implementations at some late performance-tuning stage; (2) is just something one may have to sacrifice if top speed is required.
I've been trying to build a discipline of separating my protocol definitions into their own namespace, primarily as a stylistic choice. One thing I don't like about this approach is that since anything a namespace requires is effectively "private" to that namespace, users wanting to call protocol functions from another namespace would have to add a require statement to their code for the protocols.
For example:
Protocol definition namespace:
(ns project.protocols)
(defprotocol Greet
(greet [this greeting]))
Implementation namespace:
(ns project.entities
(:require [project.protocols :as protocols]))
(defrecord TheDude
[name drink]
protocols/Greet
(greet [this greeting]
(println "The Dude sips a" drink)
(println greeting)))
Core namespace:
(ns project.core
(:require [project.protocols :as protocols]
[project.entities :refer [TheDude]]))
(let [dude (TheDude. "Jeff" "white russian")]
(protocols/greet dude "not on the rug, man..."))
This works just fine, but I don't particularly like that users need to be aware of the need to require project.protocols which is really an implementation detail internal to project.entities. In other languages I would just refer to project.entities/greet within project.core but namespaces don't "export" their required vars in Clojure, they are internal to the requiring namespace only. I see two obvious alternatives, a third could be using something like Potemkin:
1. Don't put the protocol definitions in a separate namespace, just define them in the same file as the implementation (e.g. project.entities here).
2. Inside the implementing file, create vars pointing to each and every protocol function (this is very ugly and just feels wrong, but works).
As an example of number 2:
(ns project.entities
(:require [project.protocols :as protocols]))
(defrecord TheDude
[name drink]
protocols/Greet
(greet [this greeting]
(println "not on the rug, man...")
(println "guess i'll have another " drink)))
(def greet protocols/greet) ; ¯\_(ツ)_/¯
My question, which I suppose is primarily one of preference, is what is the "best practices" (if any) way to handle this sort of separation of concerns? I realize that adding the require in project.core is only one more line, but my concern is less about line count and more about minimizing what a user would need to be aware of.
EDIT:
I think the obvious way to accomplish this is to not expect users to require both namespaces, but to create a core namespace which does that for them:
(ns project.core
(:require [project.protocols :as protocols]
[project.entities :refer [TheDude]]))
;; create wrapper 'constructor' functions like this for each record in `project.entities`
(defn new-dude
[{:keys [name drink] :as dude}]
(map->TheDude dude))
;; similarly, wrap each protocol method
(defn greet [person phrase]
(protocols/greet person phrase))
Now any user can just require core, and if they want to extend the protocol to their own record in a different namespace, they can do so and calls to core/greet will pick up the new implementations. Additionally, if there is any pre/post processing to be done, that can be handled in the "higher level" API function core/greet.
In a program that needs protocols, objects are usually instantiated in some places and consumed (by protocol) in other places. Perhaps in the cases where that's not so, you didn't really need a protocol anyway. In effect, it is not too common to "require" the namespaces of both a protocol and an implementation of it. If it happens a lot, it is a code smell.
pete23's answer mentions using the dot syntax to call a record's method without involving the protocol's namespace. But using the protocol function has some modest advantages.
Absent implementation inheritance, a protocol contains essential ("primitive") functions only. Such functions are convenient to implement, but not necessarily super-friendly to callers. The protocol namespace is a great place to add non-primitive accessors, of the sort that you might object-orientedly have declared as a default method on an interface or as an inherited non-abstract method on an abstract base class. Consumers that use the protocol's namespace can call the primitives and non-primitives alike.
Sometimes a primitive turns out to need pre- or post-processing common to all implementations. No need to repeat the common stuff in every implementation! Just lightly refactor: rename the protocol function from f to -f, update the implementations, and add a function f in the protocol's namespace that wraps -f with the necessary pre and post. The callers do not need any change.
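Here is a rough sketch of that refactoring, together with a non-primitive helper living next to the protocol. The names Greeter, Dude and greet-all, and the particular pre/post steps (trimming and upper-casing), are invented for the example.

```clojure
(require '[clojure.string :as str])

;; Hypothetical protocol: the primitive has been renamed with a leading dash.
(defprotocol Greeter
  (-greet [this greeting]))

(defrecord Dude [name]
  Greeter
  (-greet [this greeting] (str name " says: " greeting)))

;; Public wrapper: the shared pre- and post-processing lives here, once.
(defn greet [greeter greeting]
  {:pre [(string? greeting)]}
  (str/upper-case (-greet greeter (str/trim greeting))))

;; A non-primitive helper, built only on the public function.
(defn greet-all [greeters greeting]
  (mapv #(greet % greeting) greeters))

(greet (->Dude "Jeff") "  hello ") ; => "JEFF SAYS: HELLO"
```

Callers keep calling greet as before; only the implementations had to switch to -greet.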
I'm a Clojure novice and was looking for some concrete examples of when to use protocols and when to use multimethods. I know that protocols are generally geared towards creating a type hierarchy and typical OOP things, that they were added to the language after multimethods, and that protocols generally have better performance so my question is this:
Are protocols meant to replace multimethods? If not, could you give me an example where I would use multimethods instead of protocols?
Protocols and multimethods are complementary and intended for slightly different use cases.
Protocols provide efficient polymorphic dispatch based on the type of the first argument. Because this is able to exploit some very efficient JVM features, protocols give you the best performance.
Multimethods enable very flexible polymorphism which can dispatch based on any function of the method's arguments. Multimethods are slower but very flexible.
In general, my advice is to use protocols unless you have a specific case that requires multimethods.
A case where you could require multimethods is something like the following:
(defn balance-available? [amount balance] (> balance amount))
(defmulti withdraw balance-available?)
(defmethod withdraw true [amount balance]
(- balance amount))
(defmethod withdraw false [amount balance]
(throw (Error. "Insufficient balance available!")))
Note that you can't use protocols here for both of the following reasons:
The dispatch function needs to use both arguments to determine which method implementation to use (i.e. it is a multiple dispatch case).
You also can't distinguish on the type of the first argument (which is presumably always a numeric value).
Multimethods are more powerful and more expensive. Use protocols when they are sufficient, but if you need to dispatch based on the phase of the moon as seen from Mars, then multimethods are your best choice.
Protocols exist to allow the simple stuff to stay simple and to provide a way for Clojure to generate very much the same bytecode that the equivalent Java would. It seems that most people use protocols most of the time. I use multimethods when I need to dispatch on more than one argument, though I have to admit that this has only come up once, and full isa hierarchies are used even less often (by me). So in short: use multimethods when you need them.
The best example, in my experience, is right at the start, in core.clj.
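For what it's worth, dispatching on more than one argument could look something like the sketch below. The collide function and the :kind key are made up for illustration; entities are plain maps.

```clojure
;; Hypothetical game entities, represented as plain maps with a :kind key.
(defmulti collide
  ;; The dispatch function looks at both arguments and returns a vector.
  (fn [a b] [(:kind a) (:kind b)]))

(defmethod collide [:ship :asteroid] [_ _] :ship-destroyed)
(defmethod collide [:asteroid :ship] [_ _] :ship-destroyed)
(defmethod collide [:ship :ship] [_ _] :both-damaged)
;; Any other combination falls back to :default.
(defmethod collide :default [_ _] :nothing-happens)

(collide {:kind :ship} {:kind :asteroid})     ; => :ship-destroyed
(collide {:kind :asteroid} {:kind :asteroid}) ; => :nothing-happens
```

A protocol could only look at the type of the first argument here, so it would not be able to tell these cases apart.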
As mentioned by Arthur, multimethods are more powerful and more expensive. Indeed, protocols can be thought of as a special case of multimethods where the dispatch function is class. Of course, this is not really the case, as protocols are more than that.
If you need to dispatch on something other than the class of the first argument, you'll need to use a multimethod, or redesign. Dispatching on type is a good use case for protocols.
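To make the "dispatch function is class" comparison concrete, here is a small side-by-side sketch. The names describe-m, Describe and describe are invented for the example, and the two are only roughly equivalent (they differ in performance and in how they are extended).

```clojure
;; A multimethod whose dispatch function is `class`.
(defmulti describe-m class)
(defmethod describe-m String [s] (str "a string of length " (count s)))
(defmethod describe-m Long [n] (str "the long " n))

;; A protocol covering the same two types, dispatching on the first argument's type.
(defprotocol Describe
  (describe [x]))

(extend-protocol Describe
  String
  (describe [s] (str "a string of length " (count s)))
  Long
  (describe [n] (str "the long " n)))

(describe-m "abc") ; => "a string of length 3"
(describe "abc")   ; => "a string of length 3"
(describe 42)      ; => "the long 42"
```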
I like multimethods when you don't otherwise need a class hierarchy. For example if you have a media database and your records are like {:media-type :video, :bytes ...} then you can have a multimethod
(defmulti make-grayscale :media-type)
Then you can define the various methods, each in its own file:
; in video.clj
(defmethod make-grayscale :video [record]
  (ffmpeg ... (:bytes record)))
; in photo.clj
(defmethod make-grayscale :photo [record]
  (imagemagick ... (:bytes record)))
That way you can avoid having a central cond expression, so you get the modularity of classes. But you don't have to go through all that "wrapper class hierarchy" boilerplate, which to me is a bane that should be left for the Java world. Multimethods are just functions and feel more clojuresque to me.
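A runnable variant of the same idea is sketched below. The two fake-*-filter functions are stand-ins I made up for the elided ffmpeg/imagemagick calls, and the :default method is an addition for the example rather than something from the original code.

```clojure
;; Stand-ins for the real conversion calls, which are elided above (hypothetical).
(defn fake-video-filter [data] (str "gray-video:" (count data)))
(defn fake-photo-filter [data] (str "gray-photo:" (count data)))

;; A keyword works as the dispatch function: it looks up :media-type in the record.
(defmulti make-grayscale :media-type)

(defmethod make-grayscale :video [record]
  (fake-video-filter (:bytes record)))

(defmethod make-grayscale :photo [record]
  (fake-photo-filter (:bytes record)))

;; Unknown media types land here instead of in a central cond.
(defmethod make-grayscale :default [record]
  (throw (ex-info "Unsupported media type"
                  {:media-type (:media-type record)})))

(make-grayscale {:media-type :video, :bytes [1 2 3]}) ; => "gray-video:3"
```

Adding a new media type later means adding one more defmethod, possibly in a different file, without touching the existing ones.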
I could not understand the usage of reify function in Clojure.
What is it used for in clojure?
Could you provide examples?
reify is to defrecord what fn is to defn.
"Ah right...... so what's reify"
Put simply, protocols are lists of functions a datatype should support, records are datatypes and reifications are anonymous datatypes.
Maybe this is long-winded, but reify can't be understood concretely without understanding protocols and types/records: Protocols are a way to use the same name for a function such as conj that actually acts differently when given different arguments ((conj [:b :c] :a) => [:b :c :a] but (conj '(:b :c) :a) => (:a :b :c)). Records are like objects or types (but they act like maps making them awesomer).
More fundamentally, the goal is to solve "The Expression Problem" that is to have the ability to seamlessly add new types of data that work with existing functions, and new functions that work seamlessly with existing data.
So one day you say to yourself, "Self, you should learn what it means to be a duck!" so you write a protocol:
(defprotocol Quacks
(quack [_] "should say something in ducky fashion"))
But it's all too abstract so you 'realify' it:
(def donald (reify Quacks
(quack [_] "Quacks and says I'm Donald")))
Now at last you can experience your creation:
(quack donald) => "Quacks and says I'm Donald"
Then you remember about Daffy:
(def daffy (reify Quacks
(quack [_] (str "Quacks and says I'm Daffy"))))
(quack daffy) => "Quacks and says I'm Daffy"
But by the time you remember about Huey, you realize your mistake and define what a duck is in a reusable way:
(defrecord Duck [name]
Quacks
(quack [_] (str "Quacks and says I'm " name)))
And make new ducks (there are several ways to do it):
(def huey (->Duck "Huey"))
(def duey (Duck. "Duey"))
(def louie (new Duck "Louie"))
(quack huey) => "Quacks and says I'm Huey"
Remember that records act like maps (thanks to protocols!):
(:name huey) => "Huey"
But then you remember that ducks must quack and walk so you write another protocol:
(defprotocol Walks
(walk [_] "should walk like a duck"))
And extend the definition of duck
(extend-type Duck
Walks
(walk [_] "waddle waddle"))
(walk louie) => "waddle waddle"
Now we can extend other types to implement the same protocol (teach the same function how to work with other things):
So let's say we want programmers to quack too :-)
(defrecord Programmer [] Quacks
(quack [_] "Monads are simply monoids in a category of endofunctors..."))
(quack (Programmer.)) => "Monads are simply monoids in a category of endofunctors..."
I recommend this great explanation of protocols, an explanation of reify and a chapter on protocols in "Clojure for the Brave and True".
Disclaimer: This is only meant to give an initial understanding of what protocols are, not best practices on how to use them. Psst! I answered this question largely to teach myself, because until yesterday I had never actually written my own protocol/interface!
So while I hope that it enhances someone else's learning, I'll heartily welcome criticism or edit suggestions!
The reify macro lets you create an instance of an anonymous class that extends java.lang.Object and/or implements the specified interfaces/protocols. The API docs don't describe its purpose clearly; they mostly give the technical details of what the macro does. The Java interop documentation provides a brief description of the purpose:
As of Clojure 1.2, reify is also available for implementing
interfaces.
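As a small illustration of implementing a plain Java interface, the sketch below reifies java.util.Comparator. The name by-length is made up for the example.

```clojure
;; A one-off java.util.Comparator that orders strings by their length.
(def by-length
  (reify java.util.Comparator
    (compare [_ a b]
      (Integer/compare (count a) (count b)))))

;; sort accepts any java.util.Comparator as its first argument.
(sort by-length ["ccc" "a" "bb"]) ; => ("a" "bb" "ccc")
```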
Even more information can be found in the datatypes documentation, which gives a very detailed description of what it does and how it compares to proxy:
While deftype and defrecord define named types, reify defines both an
anonymous type and creates an instance of that type. The use case is
where you need a one-off implementation of one or more protocols or
interfaces and would like to take advantage of the local context. In
this respect it is use case similar to proxy, or anonymous inner
classes in Java.
The method bodies of reify are lexical closures, and can refer to the surrounding local scope. reify differs from proxy in that:
Only protocols or interfaces are supported, no concrete superclass.
The method bodies are true methods of the resulting class, not external fns.
Invocation of methods on the instance is direct, not using map lookup.
No support for dynamic swapping of methods in the method map.
The result is better performance than proxy, both in construction and invocation. reify is preferable to proxy in all cases where its constraints are not prohibitive.
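The "lexical closures" point is probably easiest to see in a tiny example. This is just a sketch with invented names (Counter, make-counter): the reified object closes over a local atom that nothing else can reach.

```clojure
(defprotocol Counter
  (increment! [this])
  (current [this]))

(defn make-counter [start]
  ;; `state` is a local; the reify method bodies close over it.
  (let [state (atom start)]
    (reify Counter
      (increment! [_] (swap! state inc))
      (current [_] @state))))

(def c (make-counter 10))
(increment! c) ; => 11
(current c)    ; => 11
```

Each call to make-counter produces a new instance with its own private state, without having to name a type for it.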
I only understood a couple of sections in the reducers talk, one which was that a data-structure could implement the IReducible interface and be able to transform natively, without being turned into a LazySeq first.
I'm hoping to exploit this in clojurescript with native javascript arrays and objects but am not too sure where to start. Can anyone provide an example of how this may be done?
In ClojureScript, the relevant protocol is called IReduce and is already implemented for arrays in the standard library. The relevant extend-type form is here (link to the latest commit on master as of right now).
There's also IKVReduce used by reduce-kv, as well as clojure.core.reducers/reduce in the case of map arguments.
You could provide a wrapper for native objects which you'd like to transform in this way:
(defn wrap-as-reducible [obj]
(reify
IReduce
(-reduce [this f]
...)
(-reduce [this f init]
...)
IKVReduce
(-kv-reduce [this f init]
...)))
Implement either or both of IReduce and IKVReduce according to your needs.
Directly implementing either protocol for "native objects" in general is probably not a good idea, as that would amount to providing a default case which would render checks for reducibility meaningless etc.
After defining a record and the interfaces it implements, I can call its methods either by its name or using the java interop way using the dot operator.
user=> (defprotocol Eat (eat [this]))
Eat
user=> (defrecord animal [name] Eat (eat [this] "eating"))
user.animal
user=> (eat (animal. "bob"))
"eating"
user=> (.eat (animal. "bob"))
"eating"
user=>
Under the surface, what is going on there? Are there new clojure functions being defined? What happens when there are functions you defined that share the same name (is this possible?), how are these ambiguities resolved?
Also, is it possible to "import" java methods for other java objects so that you do not need the . operator so that behavior is like above? (For the purpose, for example, of unifying the user interface)
When you define a protocol, each of its methods is created as a function in your current namespace. It follows that you can't have two protocols defining the same function in the same namespace. It also means that you can have them in separate namespaces and that a given type can extend both[1] of them without any name clash because they are namespaced (as opposed to Java, where a single class can't implement two interfaces with homonymous methods).
From a user perspective, protocol methods are no different from plain old non-polymorphic functions.
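A quick sketch of both points follows: two protocols with a same-named function living in different namespaces, and protocol functions being passed around like any other function. The namespaces demo.kitchen, demo.workshop and demo.main and the Goat record are hypothetical, and everything is crammed into one snippet only to keep it self-contained.

```clojure
;; Two hypothetical namespaces, each defining its own `eat`.
(ns demo.kitchen)
(defprotocol Eat
  (eat [this]))

(ns demo.workshop)
(defprotocol Eat
  (eat [this]))

(ns demo.main
  (:require [demo.kitchen :as kitchen]
            [demo.workshop :as workshop]))

(defrecord Goat [name])

;; Both protocols are extended to the same type; the namespaces keep them apart.
(extend-type Goat
  kitchen/Eat
  (eat [this] (str (:name this) " eats hay"))
  workshop/Eat
  (eat [this] (str (:name this) " eats the blueprints")))

;; Protocol functions are ordinary functions and can be put in a vector and called.
(mapv #(% (->Goat "Billy")) [kitchen/eat workshop/eat])
;; => ["Billy eats hay" "Billy eats the blueprints"]
```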
The fact that you can call a protocol method using interop is an implementation detail. The reason for that is that for each protocol, the Clojure compiler creates a corresponding backing interface. Later on when you define a new type with inline protocol extensions, then this type will implement these protocols' backing interfaces.
Consequently you can't use the interop form on an object for which the extension hasn't been provided inline:
(defrecord VacuumCleaner [brand model])
(extend-protocol Eat
VacuumCleaner
(eat [this] "eating legos and socks"))
(.eat (VacuumCleaner. "Dyson" "DC-20"))
; throws an exception: there is no eat method on the VacuumCleaner class
The compiler has special support for protocol functions so they are compiled as an instance check followed by a virtual method call, so when applicable (eat ...) will be as fast as (.eat ...).
To reply to "can one import java methods", you can wrap them in regular fns:
(def callme #(.callme %1 %2 %3))
(obviously you may need to add other arities to account for overloads and type hints to remove reflection)
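For instance, wrappers with type hints and one arity per overload might look like the sketch below. The wrapper names upper and substring are invented; the wrapped methods are the standard java.lang.String ones.

```clojure
;; Hypothetical wrappers around java.lang.String methods.
;; The ^String hint lets the compiler resolve the call without reflection.
(defn upper [^String s]
  (.toUpperCase s))

;; One arity per Java overload of String.substring.
(defn substring
  ([^String s start]
   (.substring s (int start)))
  ([^String s start end]
   (.substring s (int start) (int end))))

;; The wrappers are ordinary functions, so they compose with map and friends.
(map upper ["bob" "alice"]) ; => ("BOB" "ALICE")
(substring "eating" 3)      ; => "ing"
(substring "eating" 0 3)    ; => "eat"
```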
[1] However, you can't extend both inline (at least one of them must be in an extend-* form), because of an implementation limitation.