Making Clojure's defprotocol play nice (polymorphically) with existing functions

How can I write a defprotocol (and defrecord to implement it) that declares a method with the same name as an existing function, and dispatch dynamically to the protocol/record's method iff I call it with an instance of the protocol/record, but otherwise dispatch to the existing function?
For example, I want to create a geometry helper that supports basic arithmetic (just multiplication in this example, to keep it short):
(defprotocol SizeOps
(* [this factor] "Multiply each dimension by factor and return a new Size"))
At this point I'm already getting some foreboding pushback from the compiler:
Warning: protocol #'user/SizeOps is overwriting function *
WARNING: * already refers to: #'clojure.core/* in namespace: user, being replaced by: #'user/*
Then the implementation:
(defrecord Size [width height]
SizeOps
(* [this factor] (Size. (* width factor) (* height factor))))
That compiles okay, but when I try to use it, the only * it knows is the one in my protocol:
(* (Size. 1 2) 10)
IllegalArgumentException No implementation of method: :* of protocol: #'user/SizeOps found for class: java.lang.Long
I can hack around this by fully specifying the core * function in my implementation:
(defrecord Size [width height]
SizeOps
(* [this factor] (Size. (clojure.core/* width factor) (clojure.core/* height factor))))
(* (Size. 1 2) 10)
#user.Size{:width 10, :height 20}
But I get the same IllegalArgumentException if I try to call (* 3 4) later on. I can stomach using the namespaced clojure.core/* in my defrecord implementation, but I want my users to be able to call * on my Size records as well as on Long, Double, etc., as usual.
Similar Q&A:
5438379: extending String with a * operator that works like Python: (* "!" 3) ⇒ "!!!", but obscures core's * just as in my example
6492458: excluding core functions like (ns user (:refer-clojure :exclude [*])) avoids the "overwriting" warning, but also avoids having that function around :(
1535235: same, with a gesture toward using a multimethod but no details
I suspect the right solution lies somewhere in lower-level dispatch functionality like defmulti and defmethod or deftype / derive but I'm not super familiar with the nuances of Clojure's runtime polymorphism. And I'm gonna have a whole host of Size, Point, Rectangle, Circle, etc., types that each support some subset of +, -, *, / operations, so I'd love to know if there's a way to tell defprotocol to participate in / build on the polymorphism of any existing functions rather than simply overwrite them.

In cases like this, when you run into limitations of protocols in and of themselves, it can help to create a separate function that simply calls the protocol method for some of its functionality, and does the rest of what needs to be done using the additional capabilities that are given to regular defns:
(ns example.size
  (:refer-clojure :exclude [*])
  (:require [clojure.core :as clj]))
(defprotocol SizeOps
  (times [this factor]))
(extend-protocol SizeOps
  Object
  (times [this factor] (clj/* this factor)))
(defrecord Size [width height]
  SizeOps
  (times [this factor] (->Size (clj/* width factor) (clj/* height factor))))
(defn *
  ([] (clj/*))
  ([x] (clj/* x))
  ([x y] (times x y))
  ([x y & more] (apply clj/* x y more)))
There are a couple advantages to the specific approach I've taken here:
All paths except the two-argument path just use arity dispatch (which is fast), and the two-argument path only additionally uses protocol dispatch (which I think is as fast as you're generally going to get for what you're trying to do)
You keep all the arities, so the behavior should be identical to clojure.core/* for regular old numbers
Feel free to optimize any of this as needed.
Finally, to demonstrate:
(ns example.core
(:refer-clojure :exclude [*])
(:require [example.size :refer [* ->Size]]))
(* (->Size 1 2) 10) ;=> #example.size.Size{:width 10, :height 20}
(* 3 4) ;=> 12
Hopefully that is ergonomic enough for your users.
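The question also wondered whether defmulti and defmethod could do the job. As a rough sketch (not part of the answer above), the same wrapper idea seems to work with a multimethod in place of the protocol. The namespace name example.size-multi and the choice to dispatch on the class of the first argument are assumptions made for illustration. Unlike the protocol version, the variadic arity here folds through times, so a Size can lead a longer argument list.

```clojure
(ns example.size-multi
  (:refer-clojure :exclude [*]))

(defrecord Size [width height])

;; Hypothetical multimethod: dispatches on the class of the first argument only.
(defmulti times (fn [x _factor] (class x)))

(defmethod times Size [size factor]
  (->Size (clojure.core/* (:width size) factor)
          (clojure.core/* (:height size) factor)))

;; Anything without its own method falls back to ordinary multiplication.
(defmethod times :default [x y]
  (clojure.core/* x y))

(defn *
  ([] (clojure.core/*))
  ([x] (clojure.core/* x))
  ([x y] (times x y))
  ([x y & more] (reduce times (times x y) more)))

(println (* (->Size 1 2) 10)) ; a Size with width 10 and height 20
(println (* 3 4))             ; 12
```

Because dispatch looks only at the first argument, (* 10 (->Size 1 2)) still ends up in clojure.core/* and fails; supporting that order would need a dispatch function that inspects both arguments.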

Related

Functional alternative to "let"

I find myself writing a lot of clojure in this manner:
(defn my-fun [input]
(let [result1 (some-complicated-procedure input)
result2 (some-other-procedure result1)]
(do-something-with-results result1 result2)))
This let statement seems very... imperative. Which I don't like. In principle, I could be writing the same function like this:
(defn my-fun [input]
  (do-something-with-results (some-complicated-procedure input)
                             (some-other-procedure (some-complicated-procedure input))))
The problem with this is that it involves recomputation of some-complicated-procedure, which may be arbitrarily expensive. Also you can imagine that some-complicated-procedure is actually a series of nested function calls, and then I either have to write a whole new function, or risk that changes in the first invocation don't get applied to the second:
E.g. this works, but I have to have an extra shallow, top-level function that makes it hard to do a mental stack trace:
(defn some-complicated-procedure [input] (lots (of (nested (operations input)))))
(defn my-fun [input]
  (do-something-with-results (some-complicated-procedure input)
                             (some-other-procedure (some-complicated-procedure input))))
E.g. this is dangerous because refactoring is hard:
(defn my-fun [input]
  (do-something-with-results (lots (of (nested (operations (mistake input))))) ; oops made a change here that wasn't applied to the other nested calls
                             (some-other-procedure (lots (of (nested (operations input)))))))
Given these tradeoffs, I feel like I don't have any alternatives to writing long, imperative let statements, but when I do, I can't shake the feeling that I'm not writing idiomatic clojure. Is there a way I can address the computation and code cleanliness problems raised above and write idiomatic clojure? Are imperative-ish let statements idiomatic?

The kind of let statements you describe might remind you of imperative code, but there is nothing imperative about them. Haskell has similar statements for binding names to values within bodies, too.

If your situation really needs a bigger hammer, there are some bigger hammers that you can either use or take for inspiration. The following two libraries offer some kind of binding form (akin to let) with a localized memoization of results, so as to perform only the necessary steps and reuse their results if needed again: Plumatic Plumbing, specifically the Graph part; and Zach Tellman's Manifold, whose let-flow form furthermore orchestrates asynchronous steps to wait for the necessary inputs to become available, and to run in parallel when possible. Even if you decide to maintain your present course, their docs make good reading, and the code of Manifold itself is educational.

I recently had this same question when I looked at this code I wrote
(let [user-symbols (map :symbol states)
duplicates (for [[id freq] (frequencies user-symbols) :when (> freq 1)] id)]
(do-something-with duplicates))
You'll note that map and for are lazy, so neither does any work until do-something-with consumes the result. It's also possible that not all (or even not any) of the states will be mapped or the frequencies calculated. It depends on what do-something-with actually requests of the sequence returned by for. This is very much idiomatic functional programming.
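To make the laziness point concrete, here is a small sketch that counts how often the mapping function actually runs. The names realized and tracked-inc are made up for this illustration and do not come from the code above.

```clojure
;; Counts how many elements the mapping function has processed so far.
(def realized (atom 0))

(defn tracked-inc [x]
  (swap! realized inc)
  (inc x))

(let [incremented (map tracked-inc [1 2 3])]
  ;; Binding the lazy sequence does no work yet.
  (println "after binding:" @realized)      ; after binding: 0
  ;; doall forces the whole sequence.
  (println "result:" (doall incremented))   ; result: (2 3 4)
  (println "after consuming:" @realized))   ; after consuming: 3
```

The let only names the sequence; the work happens when something downstream asks for its elements.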

I guess the simplest approach to keep it functional would be to have a pass-through state to accumulate the intermediate results. Something like this:
(defn with-state [res-key f state]
(assoc state res-key (f state)))
user> (with-state :res (comp inc :init) {:init 10})
;;=> {:init 10, :res 11}
so you can move on to something like this:
(->> {:init 100}
(with-state :inc'd (comp inc :init))
(with-state :inc-doubled (comp (partial * 2) :inc'd))
(with-state :inc-doubled-squared (comp #(* % %) :inc-doubled))
(with-state :summarized (fn [st] (apply + (vals st)))))
;;=> {:init 100,
;; :inc'd 101,
;; :inc-doubled 202,
;; :inc-doubled-squared 40804,
;; :summarized 41207}

The let form is a perfectly functional construct and can be seen as syntactic sugar for calls to anonymous functions. We can easily write a recursive macro to implement our own version of let:
(defmacro my-let [bindings body]
(if (empty? bindings)
body
`((fn [~(first bindings)]
(my-let ~(rest (rest bindings)) ~body))
~(second bindings))))
Here is an example of calling it:
(my-let [a 3
b (+ a 1)]
(* a b))
;; => 12
And here is macroexpand-all called on the above expression, which reveals how my-let is implemented using anonymous functions:
(clojure.walk/macroexpand-all '(my-let [a 3
b (+ a 1)]
(* a b)))
;; => ((fn* ([a] ((fn* ([b] (* a b))) (+ a 1)))) 3)
Note that the expansion doesn't rely on let and that the bound symbols become parameter names in the anonymous functions.

As others write, let is actually perfectly functional, but at times it can feel imperative. It's better to become fully comfortable with it.
You might, however, want to kick the tires of my little library tl;dr that lets you write code like for example
(compute
(+ a b c)
where
a (f b)
c (+ 100 b))

Clojure koans solution - But they are often better written using the names of functions

What has to be filled in the blank to make this pass?
(= 25 (__ square))
This question is from clojure koans

We require a function that, applied to the function square, yields 25. The answer given by Alan Thompson ...
(fn [f] (f 5))
... is the expected one. But there are innumerable others. The simplest is the function that returns 25, regardless of its argument:
(fn [_] 25)
This is a common enough construction that there is a core function constantly to do it. So we can abbreviate the above to
(constantly 25)
We can convert any function capable of taking a single argument into a solution by overriding its response to the argument square:
(defn convert [g] (fn [x] (if (= x square) 25 (g x))))
For example,
=> ((convert +) square)
25
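The candidate answers above can be checked side by side. This is only a quick sketch; it assumes square is defined as in the koan, and it repeats convert from above so that the block is self-contained.

```clojure
(defn square [n] (* n n))

;; Same definition as above, repeated so this block runs on its own.
(defn convert [g]
  (fn [x] (if (= x square) 25 (g x))))

(println (= 25 ((fn [f] (f 5)) square)))   ; true
(println (= 25 ((constantly 25) square)))  ; true
(println (= 25 ((convert +) square)))      ; true
```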

How about using the following higher-order function:
#(% 5)
Which gives the final solution below?
(= 25 (#(% 5) square))

IMHO this is a stupid question to be present on Clojure Koans. I don't recall anything this weird when I went through the Koans in 2014.
Here is the answer:
(fn [f] (f 5)) ; missing piece
You also need to know that the function square is (fn [x] (* x x)) from the previous question. Then you have to define a (very dumb format) function as above and invoke it in place. I have never seen anything like this horrible format in real life.
For the record, the entire answer will then look like:
(= 25 ( (fn [f] (f 5)) square))
where the previous problem #9 defines:
(defn square [n] (* n n))
P.S. If you haven't seen it yet, please check out these sites for Clojure documentation, examples, and reference:
Brave Clojure
The Clojure CheatSheet
Clojuredocs.org
Clojure-Doc.org
CrossClj.info
CljDoc.org

Is there a good way to check return types when refactoring?

Currently when I'm refactoring in Clojure I tend to use the pattern:
(defn my-func [arg1 arg2]
  (assert (= demo.core.Record1 (class arg1)) (str "Incorrect class for arg1: " (class arg1)))
  (assert (= demo.core.Record1 (class arg2)) (str "Incorrect class for arg2: " (class arg2)))
...
That is, I find myself manually checking the return types in case a downstream part of the system modifies them to something I don't expect. (As in, if I refactor and get a stack-trace I don't expect, then I express my assumptions as invariants, and step forward from there).
In one sense, this is exactly the kind of invariant checking that Bertrand Meyer anticipated. (Author of Object Oriented Software Construction, and proponent of the idea of Design by Contract).
The challenge is that I don't find these out until run-time. It would be nice to find these out at compile-time - by simply stating what the function expects.
Now I know Clojure is essentially a dynamic language. (Whilst Clojure has a 'compiler' of sorts, we should expect the application of values to a function to only come to realisation at run-time.)
I just want a good pattern to make refactoring easier. (ie see all the flow-on effects of changing an argument to a function, without seeing it break on the first call, then moving onto the next, then moving onto the next breakage.)
My question is: Is there a good way to check return types when refactoring?

If I understand you right, prismatic/schema should be your choice.
https://github.com/plumatic/schema
(s/defn ^:always-validate my-func :- SomeResultClass
[arg1 :- demo.core.Record1
arg2 :- demo.core.Record1]
...)
You can turn validation off before release, so it won't affect performance.
core.typed is nice, but as far as I remember, it requires you to annotate all your code, while schema lets you annotate only the critical parts.

You have a couple of options, only one of which is "compile" time:
Tests
As Clojure is a dynamic language, tests are absolutely essential. They are your safety net when refactoring. Even in statically typed languages tests are still of use.
Pre and Post Conditions
They allow you to verify your invariants by adding metadata to your functions such as in this example from Michael Fogus' blog:
(defn constrained-fn [f x]
{:pre [(pos? x)]
:post [(= % (* 2 x))]}
(f x))
(constrained-fn #(* 2 %) 2)
;=> 4
(constrained-fn #(float (* 2 %)) 2)
;=> 4.0
(constrained-fn #(* 3 %) 2)
;=> java.lang.Exception: Assert failed: (= % (* 2 x))
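Applied to the question's my-func, pre and post conditions might look like the sketch below. The Record1 field value and the function body are hypothetical stand-ins, since the question elides them. In current Clojure versions a failed condition surfaces as an AssertionError.

```clojure
;; Hypothetical record standing in for demo.core.Record1.
(defrecord Record1 [value])

(defn my-func [arg1 arg2]
  {:pre  [(instance? Record1 arg1) (instance? Record1 arg2)]
   :post [(instance? Record1 %)]}
  ;; Placeholder body: combine the two records into a new one.
  (->Record1 (+ (:value arg1) (:value arg2))))

(println (my-func (->Record1 1) (->Record1 2))) ; a Record1 whose :value is 3

(try
  (my-func 1 2)
  (catch AssertionError e
    (println "rejected:" (.getMessage e))))
```

Like assert, these checks run at call time, so they state assumptions at the function boundary rather than catching mistakes at compile time.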
core.typed
core.typed is the only option in this list that will give you compile time checking. Your example would then be expressed like so:
(ann my-func (Fn [Record1 Record1 -> ResultType]))
(defn my-func [arg1 arg2]
...)
This comes at the expense of running core.typed as a separate action, possibly as part of your test suite.
And still in the realm of runtime validation/checking, there are even more options such as bouncer and schema.

What is the standard way to write nested define statements (like in scheme) for clojure?

All examples are taken from the SICP Book: http://sicpinclojure.com/?q=sicp/1-3-3-procedures-general-methods
This was motivated from the MIT video series on LISP - http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-001-structure-and-interpretation-of-computer-programs-spring-2005/video-lectures/2a-higher-order-procedures/
In scheme, you can put 'define' inside another 'define':
(define (close-enough? v1 v2)
(define tolerance 0.00001)
(< (abs (- v1 v2)) tolerance ) )
In clojure, there is the 'let' statement with the only difference that it is nested:
(defn close-enough? [v1 v2]
(let [tolerance 0.00001]
(< (Math/abs (- v1 v2) )
tolerance) ) )
But what about rewriting in clojure something bigger like this?:
(define (sqrt x)
(define (fixed-point f first-guess)
(define (close-enough? v1 v2)
(define tolerance 0.00001)
(< (abs (- v1 v2)) tolerance))
(define (try guess)
(let ((next (f guess)))
(if (close-enough? guess next)
next
(try next))))
(try first-guess))
(fixed-point (lambda (y) (average y (/ x y)))
1.0))
This does in fact work but looks very unconventional...
(defn sqrt [n]
(let [precision 10e-6
abs #(if (< % 0) (- %) %)
close-enough? #(-> (- %1 %2) abs (< precision))
averaged-func #(/ (+ (/ n %) %) 2)
fixed-point (fn [f start]
(loop [old start
new (f start)]
(if (close-enough? old new)
new
(recur new (f new) ) ) ) )]
(fixed-point averaged-func 1) ) )
(sqrt 10)
UPDATED Mar/8/2012
Thanks for the answer!
Essentially 'letfn' is not too different from 'let' - the functions being called have to be nested in the 'letfn' definition (as opposed to Scheme where the functions are used in the next sexp after its definitions and only existing within the scope of the top-level function in which it is defined).
So another question... Why doesn't clojure give the capability of doing what scheme does? Is it some sort of language design decision? What I like about the scheme organization is:
1) The encapsulation of ideas so that I as the programmer have an idea as to what little blocks are being utilized by the bigger block - especially if I am only using the little blocks once within the big block (for whatever reason, even if the little blocks are useful in their own right).
2) This also stops polluting the namespace with little procedures that are not useful to the end user (I've written clojure programs, came back to them a week later and had to re-learn my code because it was in a flat structure and I felt that I was looking at the code inside out as opposed to in a top down manner).
3) A common method definition interface so I can pull out a particular sub-method, de-indent it, test it, and paste the changed version back without too much fiddling around.
Why isn't this implemented in clojure?

The standard way to write nested, named procedures in clojure is to use letfn.
As an aside, your example use of nested functions is pretty suspicious. All of the functions in the example could be top-level non-local functions, since they're more or less useful on their own and don't close over anything but each other.
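For the question's sqrt, a letfn version might look roughly like this. It is a sketch rather than a line-by-line port of the SICP code: the recursive try is written with loop/recur, the tolerance is inlined, and average is defined locally as an assumption because the Scheme snippet uses it without defining it.

```clojure
(defn sqrt [x]
  (letfn [(average [a b] (/ (+ a b) 2))
          (close-enough? [v1 v2]
            (< (Math/abs (- v1 v2)) 0.00001))
          (fixed-point [f first-guess]
            ;; Keep applying f until two successive guesses are close.
            (loop [guess first-guess]
              (let [next-guess (f guess)]
                (if (close-enough? guess next-guess)
                  next-guess
                  (recur next-guess)))))]
    (fixed-point (fn [y] (average y (/ x y))) 1.0)))

(println (sqrt 10)) ; roughly 3.1623
```

The helpers can call one another in any order inside letfn, and none of them leaks into the namespace.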

The people criticizing his placement of this all in one function don't understand the context of why it's been done in such a manner. SICP, which is where the example is from, was trying to illustrate the concept of a module, but without adding any other constructs to the base language. So "sqrt" is a module, with one function in its interface, and the rest are local or private functions within that module. This was based on R5RS scheme I believe, and later schemes have since added a standard module construct I think(?). But regardless, it's more demonstrating the principle of hiding implementation.
The Seasoned Schemer also goes through similar examples of nested local functions, but usually to both hide implementation and to close over values as well.
But even if this wasn't a pedagogical example, you can see how this is a very lightweight module and I probably would write it this way within a larger "real" module. Reuse is ok, if it's planned for. Otherwise, you are just exposing functions that probably won't be a perfect fit for what you need later on and, at the same time, burdening those functions with unexpected use cases that could break them later on.

letfn is the standard way.
But since Clojure is a Lisp, you can create (almost) any semantics you want. Here's a proof of concept that defines define in terms of letfn.
(defmacro define [& form]
  (letfn [(define? [exp]
            (and (list? exp) (= (first exp) 'define)))
          (transform-define [[_ name args & exps]]
            `(~name ~args
               (letfn [~@(map transform-define (filter define? exps))]
                 ~@(filter #(not (define? %)) exps))))]
    `(defn ~@(transform-define `(define ~@form)))))
(define sqrt [x]
  (define average [a b] (/ (+ a b) 2))
  (define fixed-point [f first-guess]
    (define close-enough? [v1 v2]
      (let [tolerance 0.00001]
        (< (Math/abs (- v1 v2)) tolerance)))
    (define tryy [guess]
      (let [next (f guess)]
        (if (close-enough? guess next)
          next
          (tryy next))))
    (tryy first-guess))
  (fixed-point (fn [y] (average y (/ x y)))
               1.0))
(sqrt 10) ; => 3.162277660168379
For real code, you'd want to change define to behave more like R5RS: allow non-fn values, be available in defn, defmacro, let, letfn, and fn, and verify that the inner definitions are at the beginning of the enclosing body.
Note: I had to rename try to tryy. Apparently try is a special non-function, non-macro construct for which redefinition silently fails.
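The behaviour noted for try fits with it being one of Clojure's special forms, which the compiler handles itself instead of looking up a function of that name. A quick way to check whether a name is reserved like this is special-symbol? from clojure.core:

```clojure
(println (special-symbol? 'try))   ; true
(println (special-symbol? 'tryy))  ; false
(println (special-symbol? 'let))   ; false (let is a macro over the special form let*)
```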

Which Vars affect a Clojure function?

How do I programmatically figure out which Vars may affect the results of a function defined in Clojure?
Consider this definition of a Clojure function:
(def ^:dynamic *increment* 3)
(defn f [x]
(+ x *increment*))
This is a function of x, but also of *increment* (and also of clojure.core/+(1); but I'm less concerned with that). When writing tests for this function, I want to make sure that I control all relevant inputs, so I do something like this:
(assert (= (binding [*increment* 3] (f 1)) 4))
(assert (= (binding [*increment* -1] (f 1)) 0))
(Imagine that *increment* is a configuration value that someone might reasonably change; I don't want this function's tests to need changing when this happens.)
My question is: how do I write an assertion that the value of (f 1) can depend on *increment* but not on any other Var? Because I expect that one day someone will refactor some code and cause the function to be
(defn f [x]
(+ x *increment* *additional-increment*))
and neglect to update the test, and I would like to have the test fail even if *additional-increment* is zero.
This is of course a simplified example – in a large system, there can be lots of dynamic Vars, and they can get referenced through a long chain of function calls. The solution needs to work even if f calls g which calls h which references a Var. It would be great if it didn't claim that (with-out-str (prn "foo")) depends on *out*, but this is less important. If the code being analyzed calls eval or uses Java interop, of course all bets are off.
I can think of three categories of solutions:
Get the information from the compiler
I imagine the compiler does scan function definitions for the necessary information, because if I try to refer to a nonexistent Var, it throws:
user=> (defn g [x] (if true x (+ *foobar* x)))
CompilerException java.lang.RuntimeException: Unable to resolve symbol: *foobar* in this context, compiling:(NO_SOURCE_PATH:24)
Note that this happens at compile time, and regardless of whether the offending code will ever be executed. Thus the compiler should know what Vars are potentially referenced by the function, and I would like to have access to that information.
Parse the source code and walk the syntax tree, and record when a Var is referenced
Because code is data and all that. I suppose this means calling macroexpand and handling each Clojure primitive and every kind of syntax they take. This looks so much like a compilation phase that it would be great to be able to call parts of the compiler, or somehow add my own hooks to the compiler.
Instrument the Var mechanism, execute the test and see which Vars get accessed
Not as complete as the other methods (what if a Var is used in a branch of the code that my test fails to exercise?) but this would suffice. I imagine I would need to redefine def to produce something that acts like a Var but records its accesses somehow.
(1) Actually that particular function doesn't change if you rebind +; but in Clojure 1.2 you can bypass that optimization by making it (defn f [x] (+ x 0 *increment*)) and then you can have fun with (binding [+ -] (f 3)). In Clojure 1.3 attempting to rebind + throws an error.

Regarding your first point you could consider using the analyze library. With it you can quite easily figure out which dynamic vars are used in an expression:
user> (def ^:dynamic *increment* 3)
user> (def src '(defn f [x]
(+ x *increment*)))
user> (def env {:ns {:name 'user} :context :eval})
user> (->> (analyze-one env src)
expr-seq
(filter (op= :var))
(map :var)
(filter (comp :dynamic meta))
set)
#{#'user/*increment*}
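Without the analyze library, a much cruder approximation of the second category in the question is to walk the quoted source and resolve every symbol. The sketch below makes some loud assumptions: it does not macroexpand, it ignores local bindings that shadow a Var, and it does not follow calls into other functions. The names home-ns and dynamic-vars-in are invented here. It asks the Var itself whether it is dynamic instead of reading metadata.

```clojure
(def ^:dynamic *increment* 3)

;; Remember the namespace this code was loaded in, so that symbols can be
;; resolved against it later.
(def home-ns *ns*)

(defn dynamic-vars-in
  "Returns the set of dynamic Vars named by symbols anywhere in form,
  resolving symbols in target-ns. Purely syntactic; see the caveats above."
  [target-ns form]
  (->> (tree-seq coll? seq form)                   ; every nested sub-form
       (filter symbol?)
       (keep #(ns-resolve target-ns %))            ; drops unresolvable symbols
       (filter var?)                               ; drops classes
       (filter #(.isDynamic ^clojure.lang.Var %))
       set))

(println (dynamic-vars-in home-ns '(defn f [x] (+ x *increment*))))
```

For the example this prints a set containing only the *increment* Var; defn and + resolve to Vars too, but they are not dynamic and are filtered out.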

I know that this doesn't answer your question, but wouldn't it be a lot less work to just provide two versions of a function where one version has no free variables, and the other version calls the first one with the appropriate top-level defines?
For example:
(def ^:dynamic *increment* 3)
(defn f
([x]
(f x *increment*))
([x y]
(+ x y)))
This way you can write all your tests against (f x y), which doesn't rely on any global state.
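With that split, the tests could look something like the following sketch. The definitions are repeated so the block runs on its own, and the specific assertions are illustrative only.

```clojure
(def ^:dynamic *increment* 3)

(defn f
  ([x] (f x *increment*))
  ([x y] (+ x y)))

;; The two-argument arity takes every input explicitly: no binding needed.
(assert (= 4 (f 1 3)))
(assert (= 0 (f 1 -1)))

;; One small check that the one-argument arity really reads *increment*.
(assert (= 0 (binding [*increment* -1] (f 1))))
```

This does not detect a newly added Var, but it shrinks the code that can depend on global state to a one-line wrapper.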
