Refactoring Clojure function out of file

Given that each Clojure namespace corresponds to a file, isn't it the case that a public function, macro, etc. can never be moved out of that file without breaking backward compatibility?
This seems like a surprisingly rigid system--essentially, refactoring of public-facing code can only be done within a single file.
Is there a technical reason for this limitation? Something to do with Java interop, maybe?

You can split a single namespace into multiple files (see "Splitting a Clojure namespace over multiple files"), but it is quite rare to do so. You can also use import-vars from https://github.com/ztellman/potemkin, but again this is rarely done in practice. Clojure libraries tend to have relatively small public interfaces, perhaps because they usually operate on common data structures. As a result, files rarely contain much code.
If you want to preserve backwards compatibility, you can def a var into a namespace (or even within a namespace under a different name) to ensure that any callers will still resolve to the right function.
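A minimal sketch of that re-export, assuming two hypothetical namespaces: example.text (the original public home) and example.text.impl (the new home). Both are shown in one snippet for brevity; in a real project each would live in its own file and the old namespace would :require the new one.

```clojure
;; New home of the implementation (hypothetical namespace).
(ns example.text.impl
  (:require [clojure.string :as str]))

(defn shout
  "Upper-cases s and appends an exclamation mark."
  [s]
  (str (str/upper-case s) "!"))

;; Original public namespace, kept so that existing callers still resolve.
(ns example.text)

;; Re-export: the old var now holds the function that was moved.
(def shout example.text.impl/shout)

;; Existing call sites keep working unchanged.
(example.text/shout "hello") ;=> "HELLO!"
```

Note that def copies the function value at load time, so docstrings and arglists are not carried over, and later redefinitions of example.text.impl/shout are not seen through the old var.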

Functions that are not considered part of the public API can be marked private, which leaves room for later refactoring without breaking calling code. Any change to a public API will, of course, risk breaking backwards compatibility, and there will be a trade-off between making that breaking change and introducing a new API with redundant functionality.
(ns foo)
;; only visible in the foo ns
(defn- a-private-fn [] ...)
;; only visible in the foo ns
(def ^:private a-private-var 1)

Related

Is gathering namespace functions into a map via a macro idiomatic Clojure?

I'm learning Clojure via a pet project. The project would consist of several workers that would be called from other functions.
Each worker is defined in their own namespace as a set of functions (currently two: get-data for gathering data and write-data for writing the gathered data into a file).
In order to make the code a bit DRYer, I decided to write a macro that would gather functions from namespace into a map that can be passed around:
(ns clojure-bgproc.workers)
(defmacro gen-worker-info []
  (let [get-data   (ns-resolve *ns* 'get-data)
        write-data (ns-resolve *ns* 'write-data)]
    `(def ~'worker-info
       {:get-data   ~get-data
        :write-data ~write-data})))
In my worker code, I use my macro (code abridged for clarity):
(ns clojure-bgproc.workers.summary
  (:require [clojure-bgproc.workers :refer [gen-worker-info]]))
(defn get-data [params]
  ;; <...>
  )
(defn write-data [data file]
  ;; <...>
  )
(gen-worker-info)
While it does work (I get my get-data and write-data functions in clojure-bgproc.workers.summary/worker-info), I find it a bit icky, especially since it doesn't work if I move the macro call to the top of the file.
My question is, is there a more idiomatic way to do so? Is this idiomatic Clojure at all?
Thank you.
I think you're in a weird spot because you've structured your program wrong:
Each worker is defined in their own namespace as a set of functions
This is the real problem. Namespaces are a good place to put functions and values that you will refer to in hand-written code. For stuff you want to access programmatically, they are not a good storage space. Instead, make the data you want to access first-class by putting it into an ordinary proper data structure, and then it's easy to manipulate.
For example, this worker-info map you're thinking of deriving from the namespace is great! In fact, that should be the only way workers are represented: as a map with keys for the worker's functions. Then you just define somewhere a list (or vector, or map) of such worker maps, and that's your list of workers. No messing about with namespaces needed.
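As a rough sketch of that idea: the worker functions, the :values key and the run-worker! helper below are hypothetical, and only the map shape (:get-data and :write-data) comes from the question.

```clojure
;; Hypothetical worker functions for a "summary" worker.
(defn summary-get-data
  "Gathers data for the summary worker."
  [params]
  {:total (reduce + (:values params))})

(defn summary-write-data
  "Writes the gathered data to file as EDN."
  [data file]
  (spit file (pr-str data)))

;; Each worker is an ordinary map; the registry is an ordinary map of workers.
(def workers
  {:summary {:get-data   summary-get-data
             :write-data summary-write-data}})

(defn run-worker!
  "Looks up a worker by key, gathers its data and writes it to file."
  [worker-key params file]
  (let [{:keys [get-data write-data]} (get workers worker-key)]
    (write-data (get-data params) file)))
```

Adding a worker then means adding one entry to the map, and code that needs all workers can simply iterate over (vals workers).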
My go-to solution for defining the workers would be Protocols. I would also apply some of the well-tried frameworks for system lifecycle management.
Protocols provide a way of defining a set of methods and their signatures. You may think of them as similar to, but more flexible than, interfaces in object-oriented programming.
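A small sketch of what a worker protocol might look like. The method names come from the question; the SummaryWorker record, the :values key and run-all! are assumptions made for illustration.

```clojure
(defprotocol Worker
  "Operations every worker supports."
  (get-data [this params] "Gathers the data for this worker.")
  (write-data [this data file] "Writes the gathered data to file."))

;; Hypothetical worker implementation.
(defrecord SummaryWorker []
  Worker
  (get-data [_ params]
    {:total (reduce + (:values params))})
  (write-data [_ data file]
    (spit file (pr-str data))))

;; The list of workers is plain data.
(def workers [(->SummaryWorker)])

(defn run-all!
  "Runs every worker against params, writing each result to file."
  [params file]
  (doseq [w workers]
    (write-data w (get-data w params) file)))
```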
Your workers will probably have some state and life-cycle, e.g., the workers may be running or stopped, acquiring and releasing a resource, and so on. I suggest you take a look at Integrant for managing a system with stateful components (i.e., workers).
I would argue for avoiding macros in this case. The adage data over functions over macros seems to apply here. Macros are not available at runtime, make debugging harder, and force all other programmers who look at your code to learn a new Domain-Specific Language, i.e., the one you defined with your macros.

Advising protocol methods in Clojure

I'm trying to advise a number of methods in one library with utility functions from another library, where some of the methods to be advised are defined with (defn) and some are defined with (defprotocol).
Right now I'm using this library, which uses (alter-var-root). I don't care which library I use (or whether I hand-roll my own).
The problem I'm running into right now is that protocol methods sometimes can be advised, and sometimes cannot, depending on factors that are not perfectly clear to me.
If I define a protocol, then define a type and implement that protocol in-line, then advising never seems to work. I am assuming this is because the type extends the JVM interface directly and skips the vars.
If, in a single namespace, I define a protocol, then advise its methods, and then extend the protocol to a type, the advising will not work.
If, in a single namespace, I define a protocol, then extend the protocol to a type, then advise the protocol's methods, the advising will work.
What I would like to do is find a method of advising that works reliably and does not rely on undefined implementation details. Is this possible?
Clojure itself doesn't provide a reliable way to advise functions, even those defined via def/defn. Consider the following example:
(require '[richelieu.core :as advice])
(advice/defadvice add-one [f x] (inc (f x)))
(defn func-1 [x] x)
(def func-2 func-1)
(advice/advise-var #'func-1 add-one)
> (func-1 0)
1
> (func-2 0)
0
After the form (def func-2 func-1) is evaluated, the var func-2 holds the current value of the var func-1 (the function itself, not the var), so advise-var on func-1 won't affect it.
Even though definitions like func-2 are rare, you may have seen or used the following:
(defn generic-function [generic-parameter x y z]
...)
(def specific-function-1 (partial generic-function <specific-arg-1>))
(def specific-function-2 (partial generic-function <specific-arg-2>))
...
If you advise generic-function, none of the specific functions will pick up the advice, because of the peculiarity described above.
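The capture behaviour can be reproduced without any advice library. The sketch below uses a hand-rolled alter-var-root wrapper in place of richelieu, and the greeting arguments are made up for illustration. It also shows one workaround when you control the definitions: capture the var (#'generic-function) rather than its value, since a var looks up its current root on every call.

```clojure
(defn generic-function [greeting name]
  (str greeting ", " name))

;; Captures the function value that generic-function holds right now.
(def by-value (partial generic-function "Hello"))

;; Captures the var itself; calling a var invokes whatever it currently holds.
(def by-var (partial #'generic-function "Hello"))

;; Hand-rolled "advice": replace the var's root with a wrapper around the old fn.
(alter-var-root #'generic-function
                (fn [f] (fn [& args] (str (apply f args) "!"))))

(by-value "Ann") ;=> "Hello, Ann"   (advice not seen)
(by-var "Ann")   ;=> "Hello, Ann!"  (advice seen)
```

This does not help with functions defined in code you cannot change, and it does not address the protocol-method cases from the question.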
If advising is critical for you, one approach that may work is the following: since Clojure functions are compiled to Java classes, you could try to replace the Java invoke method with another method that has the desired behaviour. However, things become more complicated for protocol/interface methods: it seems you would have to replace the method in every class that implements the particular protocol/interface.
Otherwise, you'll need an explicit wrapper for every function that you want to advise. Macros may help to reduce the boilerplate in this case.

How to deal with a variable in a library that needs to be set outside of it?

I'm using Datomic in several projects and it's time to move all the common code to a small utilities library.
One challenge is to deal with a shared database uri, on which most operations depend, but must be set by the project using the library. I wonder if there is a well-established way to do this. Here are some alternatives I've thought about:
Dropping the uri symbol in the library and adding the uri as an argument to every function that accesses the database
Altering it via alter-var-root, or similar mechanism, in an init function
Keeping it in the library as a dynamic var *uri* and overriding the value in a hopefully small adapter layer like
(def my-uri ...bla...)
(defn my-fun [args]
  (with-datomic-uri my-uri
    (apply library/my-fun args)))
Keeping uri as an atom in the library
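For the dynamic-var alternative, the question uses with-datomic-uri without showing its definition. One way it could be written is sketched below; the macro body, the current-uri helper and the placeholder URI are assumptions, and no Datomic calls are made.

```clojure
;; Inside the library: nil by default, so a missing URI can be detected.
(def ^:dynamic *uri* nil)

(defmacro with-datomic-uri
  "Evaluates body with *uri* bound to uri on the current thread."
  [uri & body]
  `(binding [*uri* ~uri]
     ~@body))

(defn current-uri
  "Returns the bound URI, or throws if the caller never supplied one."
  []
  (or *uri*
      (throw (ex-info "No Datomic URI bound; wrap the call in with-datomic-uri" {}))))

;; In the project using the library (placeholder URI):
(with-datomic-uri "datomic:mem://example"
  (current-uri))
;=> "datomic:mem://example"
```

The binding is only visible on the thread that established it (and on threads that Clojure conveys bindings to, such as futures), which is worth keeping in mind before choosing this option.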
There was a presentation from Stuart Sierra last Clojure/West, called Clojure in the Large, dealing with design patterns for larger Clojure applications.
One of those was the problem you describe.
To summarize tips regarding the problem at hand:
1 Clear constructor
So you have a well defined initial state.
(defn make-connection [uri]
  {:uri uri
   ...})
2 Make dependencies clear
(defn update-db [connection]
  ...)
3 It's easier to test
(deftest t-update
  ;; test-uri is a placeholder for a test database URI
  (let [conn (make-connection test-uri)]
    (is (= ... (update-db conn)))))
4 Safer to reload
(require ... :reload)
Keeping the uri in a var to be bound later is pretty common, but it introduces hidden dependencies and assumes that the body starts and ends on a single thread.
Watch the talk; it has many more tips on design.
My feeling is to keep most datomic code as free of implicit state as possible.
Have query functions take a database value. Have write functions (transact) take a database connection. That maximizes potential reuse and avoids implicit assumptions like only ever talking to one database connection or inadvertently implicitly hardcoding query functions to only work on the current database value - as opposed to past (as-of) or "future" (with) database values.
Coordinating a single common connection for the standard use case of the library then becomes the job of a small additional namespace. Using an atom makes sense here to hold the uri or connection. A few convenience macros, perhaps called with-connection and with-current-db, could then wrap the main library functions if manually passing connection and database values is a nuisance.
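A sketch of what such a convenience layer could look like. All names here (default-connection, set-connection!, connection, describe) are hypothetical apart from with-connection, and a plain map stands in for a real Datomic connection so the snippet runs on its own.

```clojure
;; Holds the shared connection for the common single-database case.
(defonce default-connection (atom nil))

(defn set-connection!
  "Called once by the host project, e.g. at startup."
  [conn]
  (reset! default-connection conn))

(defn connection
  "Returns the shared connection, or throws if none has been configured."
  []
  (or @default-connection
      (throw (ex-info "No connection configured" {}))))

(defmacro with-connection
  "Binds sym to the shared connection and evaluates body."
  [[sym] & body]
  `(let [~sym (connection)]
     ~@body))

;; A core library function still takes its connection explicitly.
(defn describe [conn]
  (str "connected to " (:uri conn)))

;; In the host project; the map and URI are placeholders.
(set-connection! {:uri "datomic:mem://example"})
(with-connection [conn]
  (describe conn))
;=> "connected to datomic:mem://example"
```

The core functions stay free of implicit state, so callers that need a second connection or a specific database value can bypass the convenience layer entirely.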

What's the point of defining something as dynamic when you don't need to define something as dynamic to with-redefs it?

It seems to me that with-redefs can do everything that binding to a dynamic symbol can do, only it doesn't have the limitation of needing the ^:dynamic metadata. So when should I use one over the other?
Aside from requiring the ^:dynamic metadata, binding also creates bindings that are only visible in the current thread, whereas the bindings made by with-redefs are visible in all threads. So, with-redefs is a very blunt tool and has the potential to affect other code running in the same VM. I've never seen with-redefs used outside of test code, nor should it be (at least in my opinion).
I would summarize the difference between the two as follows:
binding with ^:dynamic allows you to introduce a little bit of dynamic behavior in a controlled fashion. It's a good way of defining extension points in an API that let callers far up the call chain change the behavior of your code without having to explicitly pass parameters all the way through the call stack (some of which might not even be their code).
with-redefs is a free-for-all. It's useful in testing, e.g. for mocking out entire sub-systems when the function under test has lots of dependencies.
Declaring a var as ^:dynamic, together with the convention of using earmuffs to name dynamic vars (e.g. *my-dynamic-var*), has the added bonus that it's a self-documenting way of advertising to callers that that part of your code can be modified dynamically.
In summary: prefer ^:dynamic and binding when writing APIs and production code. Use with-redefs in testing, and as a last resort to dynamically alter the behavior of vars beyond your control that weren't declared ^:dynamic (and then, use with caution).
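The thread-visibility difference can be seen in a small experiment. The names below (*level*, level, greeting, in-new-thread) are made up for the demonstration. A raw java.lang.Thread is used on purpose, because future and agents convey the caller's dynamic bindings and would hide the effect.

```clojure
(def ^:dynamic *level* :info)
(defn level [] *level*)
(defn greeting [] :original)

(defn in-new-thread
  "Runs f on a fresh Thread and returns its result."
  [f]
  (let [result (promise)]
    (doto (Thread. ^Runnable (fn [] (deliver result (f))))
      (.start)
      (.join))
    @result))

;; binding: the new value is visible only on the thread that bound it.
(def binding-result
  (binding [*level* :debug]
    [(level) (in-new-thread #(level))]))
;; binding-result => [:debug :info]

;; with-redefs: the var root is swapped, so every thread sees the change.
(def redefs-result
  (with-redefs [greeting (fn [] :redefined)]
    [(greeting) (in-new-thread #(greeting))]))
;; redefs-result => [:redefined :redefined]
```

After both forms finish, the original values are back: (level) returns :info and (greeting) returns :original.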

Should MACROs be avoided when making a library

The library offers a class to be derived from with the derived class as a template argument.
Example:
class userclass : public lib::superclass<userclass>
{};
As you can see, it's quite a lot to type. And the "userclass" should always derive publicly for it to work correctly. So I came up with two macros that look like this:
#define SUPER(x) public lib::superclass<x>
#define SUPERCLASS(x) class x : public lib::superclass<x>
The user can now type either of the following:
class userclass : SUPER(userclass)
{};
SUPERCLASS(userclass)
{};
But the main problem is that the macros SUPER and SUPERCLASS exist in the user's global namespace as soon as the header is included.
Can/should I:
Have a way of preserving the namespace requirement but still defaulting to public inheritance?
Use the macros as they are.
Simply require the user to write out the full "public lib::superclass".
I'm using VS 11, and the library is targeted at Windows developers.
The first rule of using macros is "Don't, if there's any other solution". In this case, there is another solution, so get rid of them.
Secondly, your macros do way more harm than good, because people have no idea what they expand to just by reading them, whereas the full definition shows exactly that. Seriously, you're saving a truly minute number of characters for a truly hideous cost in readability. It's far superior to simply write out the inheritance.
That really isn't a lot to type. I've seen a lot longer lines that shouldn't be shortened. Hiding it with a macro just obfuscates your code. If I take a quick look at your SUPERCLASS(userclass) {}, I can pretty much guess that it's a class (I don't like to use a library based on guesses) but I don't know if it or its parent is called userclass (or neither) or what kind of inheritance it's using. It means you have to document it and you force people to look it up when they need it.
So the correct answer is option 3 - don't use macros.
If you really really really need to use a macro in your library, give it a library specific prefix. That's as close as you'll get to a namespaced macro.
I vote for 3. Simply require the user to write out the full "public lib::superclass".
Macros in libraries may be useful if:
there is really a lot to write,
something has to be written several times
or you want to hide an ugly implementation detail and the language doesn't allow you to do otherwise.
But in your case:
not so much to write,
yes, you have to put the class name twice,
you do not want to hide the fact that you are subclassing, or even that you are writing a class!
I don't think that avoiding the duplication of the class name - the one positive point - is worth it, particularly because you will hide the class keyword and cause quite some confusion for the reader.
Anyway, if a library uses macros it is customary to put the library name in front of all macros:
#define MY_FANCY_LIBRARY_NAME_SUPER(x) public lib::superclass<x>
But now you are not saving so much typing...
PS: Remember the golden rule of programming:
Code is written once but read forever, thus it should be easy to read, more than easy to write.