Clojure Class Hierarchy [duplicate] - clojure

I'm a Clojure novice and was looking for some concrete examples of when to use protocols and when to use multimethods. I know that protocols are generally geared towards creating a type hierarchy and typical OOP things, that they were added to the language after multimethods, and that protocols generally have better performance. So my question is this:
Are protocols meant to replace multimethods? If not, could you give me an example where I would use multimethods instead of protocols?

Protocols and multimethods are complementary and intended for slightly different use cases.
Protocols provide efficient polymorphic dispatch based on the type of the first argument. Because this kind of dispatch is able to exploit some very efficient JVM features, protocols give you the best performance.
Multimethods enable very flexible polymorphism which can dispatch based on any function of the method's arguments. That flexibility comes at a cost: multimethods are slower than protocols.
In general, my advice is to use protocols unless you have a specific case that requires multimethods.
A case where you could require multimethods is something like the following:
(defn balance-available? [amount balance] (> balance amount))
(defmulti withdraw balance-available?)
(defmethod withdraw true [amount balance]
(- balance amount))
(defmethod withdraw false [amount balance]
(throw (Error. "Insufficient balance available!")))
Note that you can't use protocols here for both of the following reasons:
The dispatch function needs to use both arguments to determine which method implementation to use (i.e. it is a multiple dispatch case).
You also can't distinguish on the type of the first argument (which is presumably always a numeric value).
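A different flavour of the multiple-dispatch case can be sketched as follows. This is only an illustration and not part of the original answer: the `encounter` multimethod and the `:species` key are hypothetical names. The dispatch function builds a vector from both arguments, which is something a protocol cannot express.

```clojure
;; Hypothetical example: dispatch on a value derived from BOTH arguments.
(defmulti encounter
  (fn [a b] [(:species a) (:species b)]))

(defmethod encounter [:rabbit :lion] [_ _] :run-away)
(defmethod encounter [:lion :rabbit] [_ _] :eat)
;; Fallback for any pair that has no dedicated method.
(defmethod encounter :default [_ _] :ignore)

(encounter {:species :rabbit} {:species :lion})  ;=> :run-away
(encounter {:species :lion} {:species :rabbit})  ;=> :eat
(encounter {:species :lion} {:species :lion})    ;=> :ignore
```

The `:default` method plays the role that an `else` branch would play in a central `cond`.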

Multimethods are more powerful and more expensive. Use protocols when they are sufficient, but if you need to dispatch based on the phase of the moon as seen from Mars, then multimethods are your best choice.
Protocols exist to allow the simple stuff to stay simple and provide a way for Clojure to generate very much the same bytecode that the equivalent Java would. It seems that most people use protocols most of the time. I use multimethods when I need to dispatch on more than one argument, though I have to admit that this has only come up once, and full isa? hierarchies are used even less often (by me). So, in short, use multimethods when you need them.
The best example, in my experience, is right at the start, in core.clj.
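For readers who have not met the isa? hierarchies mentioned in this answer, here is a minimal sketch. The `:shapes/...` keywords and the `describe` multimethod are made-up names for illustration. Multimethod dispatch values are matched with `isa?`, so a method registered for a parent keyword also covers its descendants, and the most specific match wins.

```clojure
;; Hypothetical hierarchy: a square is a rectangle, which is a shape.
(derive :shapes/rectangle :shapes/shape)
(derive :shapes/square :shapes/rectangle)
(derive :shapes/circle :shapes/shape)

(defmulti describe :kind)
(defmethod describe :shapes/shape [_] "some shape")
(defmethod describe :shapes/rectangle [_] "a rectangle")

;; No method exists for :shapes/square, so the closest ancestor is used.
(describe {:kind :shapes/square})  ;=> "a rectangle"
(describe {:kind :shapes/circle})  ;=> "some shape"
```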

As mentioned by Arthur, multimethods are more powerful and more expensive. Indeed, protocols can be thought of as a special case of multimethods where the dispatch function is class. Of course, this is not really the case as protocols are more than that.
If you need to dispatch on something other than the class of the first argument, you'll need to use a multimethod, or redesign. Dispatching on type is a good use case for protocols.
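To make the "special case" analogy concrete, here is a small sketch that implements the same type-based dispatch twice. The names `Summarizable`, `summarize` and `summarize-mm` are hypothetical. The two versions behave alike for these inputs, though as noted above a protocol is more than a multimethod dispatching on `class`.

```clojure
;; Protocol version: dispatch on the type of the first argument.
(defprotocol Summarizable
  (summarize [x]))

(extend-protocol Summarizable
  String
  (summarize [s] (str "string of length " (count s)))
  Long
  (summarize [n] (str "long " n)))

;; Multimethod version: the dispatch function is simply `class`.
(defmulti summarize-mm class)
(defmethod summarize-mm String [s] (str "string of length " (count s)))
(defmethod summarize-mm Long [n] (str "long " n))

(summarize "abc")     ;=> "string of length 3"
(summarize-mm "abc")  ;=> "string of length 3"
(summarize 42)        ;=> "long 42"
(summarize-mm 42)     ;=> "long 42"
```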

I like multimethods when you don't otherwise need a class hierarchy. For example if you have a media database and your records are like {:media-type :video, :bytes ...} then you can have a multimethod
(defmulti make-grayscale :media-type)
Then you can define the various implementations in separate files:
; in video.clj
(defmethod make-grayscale :video [record]
(ffmpeg ... (:bytes record)))
; in photo.clj
(defmethod make-grayscale :photo [record]
(imagemagick ... (:bytes record)))
That way you can avoid having a central cond expression, so you get the modularity of classes. But you don't have to go through all that "wrapper class hierarchy" boilerplate, which to me is a bane that should be left for the Java world. Multimethods are just functions and feel more clojuresque to me.

Related

Advising protocol methods in Clojure

I'm trying to advise a number of methods in one library with utility functions from another library, where some of the methods to be advised are defined with (defn) and some are defined with (defprotocol).
Right now I'm using this library, which uses (alter-var-root). I don't care which library I use (or whether I hand-roll my own).
The problem I'm running into right now is that protocol methods sometimes can be advised, and sometimes cannot, depending on factors that are not perfectly clear to me.
If I define a protocol, then define a type and implement that protocol in-line, then advising never seems to work. I am assuming this is because the type extends the JVM interface directly and skips the vars.
If, in a single namespace, I define a protocol, then advise its methods, and then extend the protocol to a type, the advising will not work.
If, in a single namespace, I define a protocol, then extend the protocol to a type, then advise the protocol's methods, the advising will work.
What I would like to do is find a method of advising that works reliably and does not rely on undefined implementation details. Is this possible?
Clojure itself doesn't provide any way to advise functions reliably, even those defined via def/defn. Consider the following example:
(require '[richelieu.core :as advice])
(advice/defadvice add-one [f x] (inc (f x)))
(defn func-1 [x] x)
(def func-2 func-1)
(advice/advise-var #'func-1 add-one)
> (func-1 0)
1
> (func-2 0)
0
After evaluation of the form (def func-2 func-1), the var func-2 holds the current value of the var func-1 (the function object itself) rather than a reference to the var, so advise-var won't affect it.
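The same effect can be reproduced without any advice library. The sketch below uses plain `alter-var-root` in place of the library call, and `func-3` is a hypothetical addition showing that holding the var itself (via `#'`) does pick up the change, because a var looks up its current root value each time it is invoked.

```clojure
(defn func-1 [x] x)

(def func-2 func-1)    ; captures the function object func-1 holds right now
(def func-3 #'func-1)  ; hypothetical: captures the var, not its value

;; Hand-rolled "advice": wrap the current root value of func-1.
(alter-var-root #'func-1 (fn [f] (fn [x] (inc (f x)))))

(func-1 0)  ;=> 1
(func-2 0)  ;=> 0, still the original function
(func-3 0)  ;=> 1, the var is dereferenced on each call
```

This assumes default compilation settings; with direct linking enabled, compiled call sites may bypass the var as well.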
Even though definitions like func-2 are rare, you may have noticed or used the following:
(defn generic-function [generic-parameter x y z]
...)
(def specific-function-1 (partial generic-function <specific-arg-1>))
(def specific-function-2 (partial generic-function <specific-arg-2>))
...
If you advise generic-function, none of the specific functions will pick up the advice, due to the peculiarity described above.
If advising is critical for you, one approach that might work is the following: since Clojure functions are compiled to Java classes, you could try replacing the invoke method with another method that has the desired behaviour. (Things become more complicated for protocol/interface methods: it seems you would have to replace the method in every class that implements the particular protocol or interface.)
Otherwise, you'll need an explicit wrapper for every function that you want to advise. Macros may help to reduce boilerplate in this case.
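As a rough idea of what such a boilerplate-reducing macro could look like, here is a minimal sketch. `defadvised` is a hypothetical name, not something provided by Clojure or by the advice library mentioned in the question. It defines a new wrapper function and leaves the original untouched, so callers must opt in by calling the wrapper.

```clojure
;; Advice in the same shape as the example above: receives the wrapped
;; function followed by its arguments.
(defn add-one [f x] (inc (f x)))

(defn func-1 [x] x)

(defmacro defadvised
  "Hypothetical helper: defines fn-name as a function that calls advice
  with target followed by the original arguments."
  [fn-name advice target]
  `(defn ~fn-name [& args#]
     (apply ~advice ~target args#)))

(defadvised func-1-advised add-one func-1)

(func-1-advised 0)  ;=> 1
(func-1 0)          ;=> 0, the original is not modified
```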

What is reify in Clojure?

I could not understand the usage of reify function in Clojure.
What is it used for in clojure?
Could you provide examples?
reify is to defrecord what fn is to defn.
"Ah right...... so what's reify"
Put simply, protocols are lists of functions a datatype should support, records are datatypes and reifications are anonymous datatypes.
Maybe this is long-winded, but reify can't be understood concretely without understanding protocols and types/records: Protocols are a way to use the same name for a function such as conj that actually acts differently when given different arguments ((conj [:b :c] :a) => [:b :c :a] but (conj '(:b :c) :a) => (:a :b :c)). Records are like objects or types (but they act like maps making them awesomer).
More fundamentally, the goal is to solve "The Expression Problem" that is to have the ability to seamlessly add new types of data that work with existing functions, and new functions that work seamlessly with existing data.
So one day you say to yourself, "Self, you should learn what it means to be a duck!" so you write a protocol:
(defprotocol Quacks
(quack [_] "should say something in ducky fashion"))
But it's all too abstract so you 'realify' it:
(def donald (reify Quacks
(quack [_] "Quacks and says I'm Donald")))
Now at last you can experience your creation:
(quack donald) => "Quacks and says I'm Donald"
Then you remember about Daffy:
(def daffy (reify Quacks
(quack [_] (str "Quacks and says I'm Daffy"))))
(quack daffy) => "Quacks and says I'm Daffy"
But by the time you remember about Huey, you realize your mistake and define what a duck is in a reusable way:
(defrecord Duck [name]
Quacks
(quack [_] (str "Quacks and says I'm " name)))
And make new ducks (there are several ways to do it):
(def huey (->Duck "Huey"))
(def duey (Duck. "Duey"))
(def louie (new Duck "Louie"))
(quack huey) => "Quacks and says I'm Huey"
Remember that records act like maps (thanks to protocols!):
(:name huey) => "Huey"
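A few more map-like operations, as a small self-contained sketch. It redefines a bare `Duck` record without the `Quacks` protocol so that it runs on its own, and the `:age` key is made up for the example.

```clojure
(defrecord Duck [name])

(def huey (->Duck "Huey"))

(map? huey)        ;=> true
(get huey :name)   ;=> "Huey"
(keys huey)        ;=> (:name)

;; Adding an extra key keeps the record type...
(instance? Duck (assoc huey :age 3))      ;=> true
;; ...but removing a declared field yields a plain map.
(instance? Duck (dissoc huey :name))      ;=> false
```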
But then you remember that ducks must quack and walk so you write another protocol:
(defprotocol Walks
(walk [_] "should walk like a duck"))
And extend the definition of duck
(extend-type Duck
Walks
(walk [_] "waddle waddle"))
(walk louie) => "waddle waddle"
Now we can extend other types to implement the same protocol (teach the same function how to work with other things):
So let's say we want programmers to quack too :-)
(defrecord Programmer [] Quacks
(quack [_] "Monads are simply monoids in a category of endofunctors..."))
(quack (Programmer.)) => "Monads are simply monoids in a category of endofunctors..."
I recommend this great explanation of protocols, an explanation of reify and a chapter on protocols in "Clojure for the Brave and True".
Disclaimer: This is only meant to give an initial understanding of what protocols are, not best practices on how to use them. Psst! I answered this question largely to teach myself, because until yesterday I had never actually written my own protocol/interface!
So while I hope that it enhances someone else's learning, I'll heartily welcome criticism or edit suggestions!
The reify macro allows you to create an anonymous class extending java.lang.Object and/or implementing the specified interfaces/protocols. The API docs don't describe the purpose clearly, but rather provide the technical details of what the macro does. The Java interop documentation provides a brief description of the purpose:
As of Clojure 1.2, reify is also available for implementing
interfaces.
Even more information can be found in datatypes documentation where you can find a very detailed description what it does and how it compares to proxy:
While deftype and defrecord define named types, reify defines both an
anonymous type and creates an instance of that type. The use case is
where you need a one-off implementation of one or more protocols or
interfaces and would like to take advantage of the local context. In
this respect it is use case similar to proxy, or anonymous inner
classes in Java.
The method bodies of reify are lexical closures, and can refer to the
surrounding local scope. reify differs from proxy in that:
Only protocols or interfaces are supported, no concrete superclass.
The method bodies are true methods of the resulting class, not external fns.
Invocation of methods on the instance is direct, not using map lookup.
No support for dynamic swapping of methods in the method map.
The result is better performance than proxy, both in construction and invocation. reify is preferable to proxy in all cases where its constraints are not prohibitive.
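The point about method bodies being lexical closures can be illustrated with a short sketch. `make-counter` is a hypothetical function; the interfaces used, `clojure.lang.IDeref` and `java.lang.Runnable`, are real. The reified object closes over the local atom `n`, much as an anonymous inner class in Java would close over a local variable.

```clojure
(defn make-counter
  "Returns a one-off object that can be run (to increment) and dereferenced."
  [start]
  (let [n (atom start)]            ; local state captured by the method bodies
    (reify
      clojure.lang.IDeref
      (deref [_] @n)
      Runnable
      (run [_] (swap! n inc)))))

(def counter (make-counter 10))

(.run counter)
(.run counter)
@counter  ;=> 12
```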

how would clojure's reducers work for native javascript objects/arrays?

I only understood a couple of sections in the reducers talk, one which was that a data-structure could implement the IReducible interface and be able to transform natively, without being turned into a LazySeq first.
I'm hoping to exploit this in clojurescript with native javascript arrays and objects but am not too sure where to start. Can anyone provide an example of how this may be done?
In ClojureScript, the relevant protocol is called IReduce and is already implemented for arrays in the standard library. The relevant extend-type form is here (link to the latest commit on master as of right now).
There's also IKVReduce used by reduce-kv, as well as clojure.core.reducers/reduce in the case of map arguments.
You could provide a wrapper for native objects which you'd like to transform in this way:
(defn wrap-as-reducible [obj]
(reify
IReduce
(-reduce [this f]
...)
(-reduce [this f init]
...)
IKVReduce
(-kv-reduce [this f init]
...)))
Implement either or both of IReduce and IKVReduce according to your needs.
Directly implementing either protocol for "native objects" in general is probably not a good idea, as that would amount to providing a default case which would render checks for reducibility meaningless etc.

Extending a library-provided protocol without impacting other users

I'm using a 3rd-party library (clj-msgpack), and wish to extend a protocol for a type which the library also provides a handler for.
On its own, this is simple enough -- but is there any way to do this which wouldn't impact other users of this library running inside the same JVM? Something similar to a dynamic var binding (only taking effect under a given point on the stack) would be ideal.
At present, I'm doing an unconditional override but using a dynamic var to enable my modified behavior; however, this feels far too much like monkey-patching for my comfort.
For the curious, the (admitted abomination) I'm putting into place follows:
(in-ns 'clj-msgpack.core)
(def ^:dynamic *keywordize-strings*
"Assume that any string starting with a colon should be unpacked to a keyword"
false)
(extend-protocol Unwrapable
RawValue
(unwrap [o]
(let [v (.getString o)]
(if (and *keywordize-strings* (.startsWith v ":"))
(keyword (.substring v 1))
v))))
After some thought I see two basic approaches (one of which I get from you):
Dynamic binding (as you are doing it now):
Some complain that dynamic binding follows the principle of most surprise: "what? It behaves this way only when called from there?". While I don't personally hold this to be a bad-thing(tm), some people do. In this case it exactly matches your desire, and so long as you have one point where you decide if you want keywordized strings this should work. If you add a second point that changes them back and a code path that crosses the two, well... you're on your own. But hey, working code has its merits.
Inheritance:
Good ol' Java style, or using Clojure's ad-hoc hierarchies: you could extend the type of object you are passing around to be keywordized-string-widgewhatzit that extends widgewhatzit and add a new handler for your specific subclass. This only works in some cases and forces a different object style on the rest of the design. Some smart people will also argue that it still follows the principle of most surprise because the type of the objects will be different when called via another code path.
Personally I would go with your existing solution unless you can change your whole program to use keywords instead of strings (which would of course be my first (potentially controversial) choice)
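The scoping behaviour of the dynamic-binding approach can be seen in isolation with the sketch below. It deliberately avoids clj-msgpack: `unwrap-string` is a hypothetical stand-in for the protocol method, operating on plain strings. The modified behaviour is visible only to code running inside the `binding` form on the current thread.

```clojure
(def ^:dynamic *keywordize-strings*
  "Assume that any string starting with a colon should be unpacked to a keyword"
  false)

;; Hypothetical stand-in for the unwrap method in the question.
(defn unwrap-string [^String v]
  (if (and *keywordize-strings* (.startsWith v ":"))
    (keyword (subs v 1))
    v))

(unwrap-string ":status")                 ;=> ":status"

(binding [*keywordize-strings* true]
  (unwrap-string ":status"))              ;=> :status

(unwrap-string ":status")                 ;=> ":status" again outside the binding
```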

What's the rationale behind closed records in Clojure?

I have the option of directly implementing a Protocol in the body of a defrecord instead of using extend-protocol/extend-type
(defprotocol Fly
(fly [this]))
(defrecord Bird [name]
Fly
(fly [this] (format "%s flies" name)))
=>(fly (Bird. "crow"))
"crow flies"
If I now try to override the Fly protocol, I get an error
(extend-type Bird
Fly
(fly [this] (format "%s flies away" (:name this))))
class user.Bird already directly implements interface user.Fly for protocol:#'user/Fly
On the other hand if instead I use extend-type initially
(defrecord Dragon [color])
(extend-type Dragon
Fly
(fly [this] (format "%s dragon flies" (:color this))))
=>(fly (Dragon. "Red"))
"Red dragon flies"
I can then "override" the fly function
(extend-type Dragon
Fly
(fly [this] (format "%s dragon flies away" (:color this))))
=>(fly (Dragon. "Blue"))
"Blue dragon flies away"
My question is, why not allow extension in both cases? Is this a JVM limitation because of the Record <-> Class relation or is there a use case for a non overridable protocol?
On one level, it is an issue of the JVM not allowing method implementations to be swapped in and out of classes, as implementing a protocol inline amounts to having the class created by defrecord implement an interface corresponding to the protocol. Note that while opting to do so does sacrifice some flexibility, it also buys speed -- and indeed speed is the major design consideration here.
On another level, it is generally a very bad idea for protocol implementations on types to be provided by code which does not own either the type or the protocol in question. See for example this thread on the Clojure Google group for a related discussion (including a statement by Rich Hickey); there's also a relevant entry in the Clojure Library Coding Standards:
Protocols:
One should only extend a protocol to a type if he owns either the type or the protocol.
If one breaks the previous rule, he should be prepared to withdraw, should the implementor of either provide a definition
If a protocol comes with Clojure itself, avoid extending it to types you don't own, especially e.g. java.lang.String and other core Java interfaces. Rest assured if a protocol should extend to it, it will, else lobby for it.
The motive is, as stated by Rich Hickey, [to prevent] "people extend protocols to types for which they don't make sense, e.g. for which the protocol authors considered but rejected an implementation due to a semantic mismatch.". "No extension will be there (by design), and people without sufficient understanding/skills might fill the void with broken ideas."
This has also been discussed a lot in the Haskell community in connection with type classes (google "orphan instances"; there are some good posts on the topic here on SO too).
Now clearly an inline implementation is always provided by the owner of the type and so should not be replaced by client code. So, I can see two valid use cases remain for protocol method replacement:
(1) testing things out at the REPL and applying quick tweaks;
(2) modifying protocol method implementations in a running Clojure image.
(1) might be less of a problem if one uses extend & Co. at development time and only switches to inline implementations at some late performance-tuning stage; (2) is just something one may have to sacrifice if top speed is required.
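A minimal sketch of the development-time workflow described here, using `extend` with a method map instead of an inline implementation. It reuses the `Fly` and `Dragon` names from the question; the `before` and `after` vars are only there to record the results.

```clojure
(defprotocol Fly
  (fly [this]))

(defrecord Dragon [color])

;; External extension: the implementation lives in a map of functions.
(extend Dragon
  Fly
  {:fly (fn [this] (str (:color this) " dragon flies"))})

(def before (fly (->Dragon "Red")))   ;=> "Red dragon flies"

;; Because nothing is baked into the Dragon class, extending again
;; simply replaces the earlier implementation.
(extend Dragon
  Fly
  {:fly (fn [this] (str (:color this) " dragon flies away"))})

(def after (fly (->Dragon "Red")))    ;=> "Red dragon flies away"
```

Once the behaviour has settled, the same method body could be moved inline into the defrecord to gain speed, at the cost of no longer being replaceable.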