Unit Testing Local Functions (letfn) in Clojure?

I spent a couple of years doing Scheme "back in the day" and am now learning Clojure. One of the "best practices" in Scheme was to define helper functions within the parent function thus limiting their visibility from "outside." Of course back then TDD wasn't done (or known!) so testing such functions wasn't an issue.
I'm still tempted to structure Clojure functions this way; i.e., using letfn to bind helper functions within the main function. Of course testing such "local" functions is problematic. I realize I can define "private" functions, but that scopes visibility to the namespace, which helps but is not as fine-grained. If you come upon a letfn within another function it's pretty clear that the function is not available for general use.
So, my question is, can one test such local functions and if so, how? If not, then is there some convention to aid in code reading so that it's clear that a function has only one caller?
TIA,
Bill

The usual approach is to just put the functions in the namespace.
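A minimal sketch of that approach, assuming hypothetical namespaces example.core and example.core-test (neither comes from the question): the helper becomes a private top-level function, and the test reaches it through its var.

```clojure
(ns example.core)

;; Private helper: by default only code inside example.core can refer to it.
(defn- square [x] (* x x))

(defn cube [x] (* x (square x)))

(ns example.core-test
  (:require [clojure.test :refer [deftest is]]))

;; #'example.core/square names the var directly, which sidesteps the
;; privacy check; invoking a var calls the function it currently holds.
(deftest square-test
  (is (= 9 (#'example.core/square 3))))

(deftest cube-test
  (is (= 27 (example.core/cube 3))))
```

Evaluating (clojure.test/run-tests 'example.core-test) at the REPL should exercise both tests.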
One option is using metadata:
user=> (defn ^{::square #(* % %)} cube [x]
#_=> (* x ((::square (meta #'cube)) x)))
#'user/cube
user=> (meta #'cube)
{…, :user/square #<user$fn__780 user$fn__780#2e62c3f9>}
user=> (cube 3)
27
It is of course possible to write a macro to make this prettier.
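For the testing part of the question, the helper can be pulled back out of the var's metadata and checked on its own. This is only a sketch: the namespace example.meta-helper and the test name are assumptions, and the cube definition is the one from the REPL session above.

```clojure
(ns example.meta-helper
  (:require [clojure.test :refer [deftest is]]))

;; Same definition as in the REPL session: the helper lives in cube's metadata.
(defn ^{::square #(* % %)} cube [x]
  (* x ((::square (meta #'cube)) x)))

;; Fetch the helper from the metadata and test it in isolation.
(deftest square-helper-test
  (let [square (::square (meta #'cube))]
    (is (= 16 (square 4)))))
```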

Related

Is gathering namespace functions into a map via a macro idiomatic Clojure?

I'm learning Clojure via a pet project. The project would consist of several workers that would be called from other functions.
Each worker is defined in their own namespace as a set of functions (currently two: get-data for gathering data and write-data for writing the gathered data into a file).
In order to make the code a bit DRYer, I decided to write a macro that would gather functions from namespace into a map that can be passed around:
(ns clojure-bgproc.workers)
(defmacro gen-worker-info []
  (let [get-data   (ns-resolve *ns* 'get-data)
        write-data (ns-resolve *ns* 'write-data)]
    `(def ~(quote worker-info)
       {:get-data   ~get-data
        :write-data ~write-data})))
In my worker code, I use my macro (code abridged for clarity):
(ns clojure-bgproc.workers.summary
  (:require [clojure-bgproc.workers :refer [gen-worker-info]]))
(defn get-data [params]
  ;; <...>
  )
(defn write-data [data file]
  ;; <...>
  )
(gen-worker-info)
While it does work (I get my get-data and write-data functions in clojure-bgproc.workers.summary/worker-info), I find it a bit icky, especially since it doesn't work if I move the macro call to the top of the file.
My question is, is there a more idiomatic way to do so? Is this idiomatic Clojure at all?
Thank you.
I think you're in a weird spot because you've structured your program wrong:
Each worker is defined in their own namespace as a set of functions
This is the real problem. Namespaces are a good place to put functions and values that you will refer to in hand-written code. For stuff you want to access programmatically, they are not a good storage space. Instead, make the data you want to access first-class by putting it into an ordinary proper data structure, and then it's easy to manipulate.
For example, this worker-info map you're thinking of deriving from the namespace is great! In fact, that should be the only way workers are represented: as a map with keys for the worker's functions. Then you just define somewhere a list (or vector, or map) of such worker maps, and that's your list of workers. No messing about with namespaces needed.
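A rough sketch of what that could look like. Everything here is hypothetical: the function bodies are stand-ins (write-data only returns a description of what it would write), and run-worker is an invented helper, not part of the question's code.

```clojure
;; Hypothetical worker: both functions are placeholders for the real ones.
(def summary-worker
  {:get-data   (fn [params] {:rows (range (:n params))})
   :write-data (fn [data file] {:file file :count (count (:rows data))})})

;; The registry is ordinary data: adding a worker means adding a map entry.
(def workers
  {:summary summary-worker})

(defn run-worker
  "Looks up a worker by key, gathers its data, and hands it to the writer."
  [worker-key params file]
  (let [{:keys [get-data write-data]} (get workers worker-key)]
    (write-data (get-data params) file)))

(run-worker :summary {:n 3} "summary.txt")
;; => {:file "summary.txt", :count 3}
```

Because workers is a plain map, it can be filtered, merged, or passed around without touching namespaces.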
My go-to solution for defining the workers would be Protocols. I would also apply some of the well-tried frameworks for system lifecycle management.
Protocols provide a way of defining a set of methods and their signatures. You may think of them as similar to, but more flexible than, interfaces in object-oriented programming.
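As a sketch of how the two worker functions might be expressed as a protocol: the names Worker and SummaryWorker and the method bodies are assumptions made for illustration, and write-data does not touch the file system here.

```clojure
(defprotocol Worker
  (get-data [this params] "Gathers the data for this worker.")
  (write-data [this data file] "Writes the gathered data to file."))

;; A hypothetical worker implemented as a record.
(defrecord SummaryWorker [label]
  Worker
  (get-data [_ params]
    {:label label :rows (range (:n params))})
  (write-data [_ data file]
    ;; Placeholder: reports what would be written instead of writing it.
    {:file file :count (count (:rows data))}))

(let [worker (->SummaryWorker "summary")]
  (write-data worker (get-data worker {:n 2}) "out.txt"))
;; => {:file "out.txt", :count 2}
```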
Your workers will probably have some state and life-cycle, e.g., the workers may be running or stopped, acquiring and releasing a resource, and so on. I suggest you take a look at Integrant for managing a system with stateful components (i.e., workers).
I would argue for avoiding macros in this case. The adage "data over functions over macros" seems to apply here. Macros are not available at runtime, make debugging harder, and force all other programmers who look at your code to learn a new Domain-Specific Language, i.e., the one you defined with your macros.

Advising protocol methods in Clojure

I'm trying to advise a number of methods in one library with utility functions from another library, where some of the methods to be advised are defined with (defn) and some are defined with (defprotocol).
Right now I'm using this library, which uses (alter-var-root). I don't care which library I use (or whether I hand-roll my own).
The problem I'm running into right now is that protocol methods sometimes can be advised, and sometimes cannot, depending on factors that are not perfectly clear to me.
If I define a protocol, then define a type and implement that protocol in-line, then advising never seems to work. I am assuming this is because the type extends the JVM interface directly and skips the vars.
If, in a single namespace, I define a protocol, then advise its methods, and then extend the protocol to a type, the advising will not work.
If, in a single namespace, I define a protocol, then extend the protocol to a type, then advise the protocol's methods, the advising will work.
What I would like to do is find a method of advising that works reliably and does not rely on undefined implementation details. Is this possible?
Clojure itself doesn't provide a reliable way to advise functions, even those defined via def/defn. Consider the following example:
(require '[richelieu.core :as advice])
(advice/defadvice add-one [f x] (inc (f x)))
(defn func-1 [x] x)
(def func-2 func-1)
(advice/advise-var #'func-1 add-one)
> (func-1 0)
1
> (func-2 0)
0
After the form (def func-2 func-1) is evaluated, the var func-2 holds the value that func-1 had at that moment (the function object itself), not a reference to the var func-1, so advise-var on func-1 won't affect it.
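The same effect can be reproduced with nothing but clojure.core. In this sketch alter-var-root stands in for the advice library, and func-3 is a hypothetical addition showing that holding the var itself keeps the indirection.

```clojure
(defn func-1 [x] x)

(def func-2 func-1)    ; copies the function object func-1 holds right now
(def func-3 #'func-1)  ; holds the var itself; a var dereferences on each call

;; Replace the root binding of func-1 with a wrapper that increments the result.
(alter-var-root #'func-1 (fn [f] (fn [x] (inc (f x)))))

(func-1 0) ;; => 1, sees the new root binding
(func-2 0) ;; => 0, still the original function
(func-3 0) ;; => 1, goes through the var
```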
Even though definitions like func-2 are rare, you may have seen or used the following:
(defn generic-function [generic-parameter x y z]
...)
(def specific-function-1 (partial generic-function <specific-arg-1>))
(def specific-function-2 (partial generic-function <specific-arg-2>))
...
If you advise generic-function, none of the specific functions will pick up the advice, because of the behavior described above.
If advising is critical for you, one approach that may work is this: since Clojure functions are compiled to Java classes, you could try to replace the invoke method with another method that has the desired behavior. Things get more complicated for protocol or interface methods, though: it seems you would have to replace the method in every class that implements the protocol or interface.
Otherwise, you'll need an explicit wrapper for every function that you want to advise. Macros may help to reduce the boilerplate in this case.
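One possible shape for such a wrapper macro, as a sketch only: defwrapped and func-1-advised are invented names, and add-one is written as a plain function rather than with any advice library.

```clojure
(defn add-one
  "Advice function: calls f on x and increments the result."
  [f x]
  (inc (f x)))

(defn func-1 [x] x)

(defmacro defwrapped
  "Defines wrapper-name as a function that routes every call to target
  through advice."
  [wrapper-name advice target]
  `(defn ~wrapper-name [& args#]
     (apply ~advice ~target args#)))

(defwrapped func-1-advised add-one func-1)

(func-1-advised 0) ;; => 1
(func-1 0)         ;; => 0, the original is untouched
```

Callers have to use the wrapper explicitly, which is the price of not depending on how vars are resolved.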

Clojure: Should I prepend new- or make- to constructor functions?

When creating a new Java object via a wrapper function, what is the "constructor" naming standard? Should I prepend make- or new- to the function name? Or just call it the type of thing it returns? At one point I got accustomed to using make- because it's the way it was done in the Scheme book SICP.
For example, I'm passing some stuff to a function which eventually returns an instance of a Java class. Here are a few examples:
(def myimage1 (make-image "img.svg" 100 200)) ; These all return
(def myimage2 (new-image "img.svg" 100 200)) ; an instance of
(def myimage3 (image "img.svg" 100 200)) ; javafx.scene.image.Image
Is it the same for creating Clojure-only structures, such as maps?
(def mystruct1 (make-custom-struct args))
(def mystruct2 (new-custom-struct args))
(def mystruct3 (custom-struct args))
I prefer the last version without make- or new-, but often times the binding (for example inside a let) would have the same name, which would suggest prepending the constructor name, or using a different binding name, such as:
(let [image (make-image "img.svg" 100 200)] ...)
(let [imuj (image "img.svg" 100 200)] ...)
However, other times I'd just like to use the function in-line without the clutter of new- or make-:
(do-something-with-an-image (image "img.svg" 100 200))
Stuart Sierra suggests not using a prefix. His idea is that a pure function can be replaced with its implementation, and so a simple noun makes a better name in that sense than a verb.
I agree that (image ...) is the version that looks the best. However, you are right that you can easily end up shadowing the function with local variables. What can you do?
Namespaces are here to counter the issue introduced by the Lisp-1 nature of Clojure:
(let [image (png/image "file.png")] ...)
Alternatively, when you are in the same namespace, you should give better variable names: image is a little bit generic; try something more precise, like avatar or thumbnail. Note also that sometimes image is the best name to give to the variable.
You can also prefix or suffix variable names, like user-image, old-image or new-image.
Regarding functions, the make- prefix is not bad: it is readable, unambiguous and it fixes the problem of shadowing existing bindings. So don't discard it too quickly. This prefix tends to be found more often than new-, which is a little unclear: sometimes you have old and new data, where new is an adjective.
I think -> is a fun prefix, and it's already used by the factory functions generated by defrecord.
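For reference, this is what defrecord generates. The Image record below is a hypothetical Clojure record used only for illustration, not the javafx.scene.image.Image class from the question.

```clojure
(defrecord Image [path width height])

;; defrecord defines a positional factory (->Image) and a map factory (map->Image).
(def avatar (->Image "img.svg" 100 200))
(def thumbnail (map->Image {:path "img.svg" :width 10 :height 20}))

(:width avatar)     ;; => 100
(:height thumbnail) ;; => 20
```

Neither factory name collides with a local called image, which addresses the shadowing concern.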

Clojure best practice for nested let

Is it good practice to use a nested let in Clojure in the following way, or is it confusing?
(defn a-fun [config]
  (let [config (-> config (parse) (supply-defaults))]
    ;; do something with config
    ))
I noticed I have this pattern of parsing/checking/validating things quite often in my input functions that talk to the external world (in this case a ClojureScript library that exposes public functions, but I also had Compojure routes with this same feeling).
Is it confusing, because one has to understand the rules for binding visibility (not sure what the exact wording is)?
What would be the idiomatic way to do it? Change the config name to parsed-config, put it in another function, something else completely?
I would reach for this idiom when
the rebinding is the same kind of thing and
you want to make clear that the local binding supersedes the global one.
For example
(defn fact [n]
  (loop [n n, answer 1]
    (if (pos? n)
      (recur (dec n) (* answer n))
      answer)))
This also stops you using the global binding by accident, as I was prone to do.
@Thumbnail's answer is good, but I personally would almost never shadow an outer binding with an inner one in this way. Even if you understand binding rules, and want to shadow an outer variable for a good reason, it's confusing for someone reading the code--which could very well be you, later, after you've forgotten how the code works.
Suppose I have a complex function, and I see the variable foo used somewhere in the middle of it. I look up and see a binding for it--perhaps as a function parameter, which would be obvious and easy to notice. If I don't notice that somewhere below that, the name was rebound, then I will misunderstand what's in the variable.
So I usually make up new, related names that correspond to the role of the different variables in the code. Sometimes the name differences are somewhat arbitrary.
I think these are good reasons not to shadow variables, and I think @Thumbnail gives good reasons to go ahead and shadow them. There are tradeoffs, and you have to decide what's best for your situation.
Short functions are probably better contexts for shadowing. Personally, I'd add a very noticeable comment if I did this sort of thing, or if I was doing it over and over again, maybe a very noticeable comment near the top of the file.
EDIT: As nha's comment made me realize, it can be more reasonable to shadow variables when the new binding occurs immediately after the previous binding; that makes it hard to miss the fact that the name is being redefined.
Another option would be to slightly rename the argument, keeping the general name for the "final" version of the data:
(defn a-fun [config-in]
  (let [config (-> config-in (parse) (supply-defaults))]
    ;; do something with config
    ))
I also sometimes use the suffixes -arg, -orig, etc to differentiate various stages of processing.

Clojure style: defn- vs. letfn

Clojure style (and good software engineering in general) puts emphasis on lots of small functions, a subset of which are publicly visible to provide an external interface.
In Clojure there seem to be a couple of ways to do this:
(letfn [(private-a ...)
        (private-b ...)]
  (defn public-a ...)
  (defn public-b ...))

(defn- private-a ...)
(defn- private-b ...)
(defn public-a ...)
(defn public-b ...)
The letfn form seems more verbose and perhaps less flexible, but it reduces the scope of the functions.
My guess is that letfn is intended only for use inside other forms, when little helper functions are used in only a small area. Is this the consensus? Should letfn ever be used at a top level (as I've seen recommended before)? When should it be used?
letfn is intended for use in cases of mutual recursion:
(letfn [(is-even? [n]
          (if (zero? n)
            true
            (is-odd? (dec n))))
        (is-odd? [n]
          (if (zero? n)
            false
            (is-even? (dec n))))]
  (is-even? 42))
;; => true
Don't use it at the top level.
Also, don't use the defn macro anywhere other than at the top level unless you have very specific reasons. It expands to the def special form, which creates and interns a global var.
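A small sketch of why a nested defn is surprising. The namespace example.nested-defn and the function names are hypothetical.

```clojure
(ns example.nested-defn)

(defn outer []
  ;; Discouraged: this def runs on every call to outer and (re)binds a
  ;; namespace-level var; it does not create a local function.
  (defn inner [] :inner)
  :outer)

(outer) ;; => :outer
(inner) ;; => :inner, callable from the top level of the namespace

(some? (ns-resolve 'example.nested-defn 'inner)) ;; => true
```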
The purpose of letfn is entirely different from that of defn. Using letfn at the top level does not give you the properties of defn, because functions bound inside let or letfn are not visible outside their lexical scope. Also, functions bound inside letfn can refer to one another regardless of the order in which they are bound. That is not the case with let.
My rules are these:
If a subsidiary function is used in one public one, define it locally with let or letfn.
If it is used in several, define it at top level with defn-.
And
Don't use let or letfn at top level.
Don't use def or defn or defn- anywhere other than top level.
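Applied to a small hypothetical namespace (example.rules and all function names are invented for illustration), these rules might look like this:

```clojure
(ns example.rules)

;; Used by several public functions, so it lives at top level as a private fn.
(defn- clamp [lo hi x]
  (max lo (min hi x)))

(defn percent [x] (clamp 0 100 x))
(defn unit-interval [x] (clamp 0.0 1.0 x))

;; Used by one public function only, so it is bound locally with letfn.
(defn mean [xs]
  (letfn [(sum [ys] (reduce + 0 ys))]
    (/ (sum xs) (count xs))))

(percent 150)  ;; => 100
(mean [1 2 3]) ;; => 2
```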