Is clojure.lang really just implementation details? - clojure

In Clojure, some tasks (such as instantiating a PersistentQueue or using deftype to implement a custom data type that is compatible with the clojure.core functions) require knowledge of the classes and/or interfaces in clojure.lang.
However, according to clojure.lang/package.html:
The only class considered part of the public API is clojure.lang.IFn. All other classes should be considered implementation details.
Are these statements incorrect or outdated? If so, are there plans to correct them in the future? If not, is there a more preferred way to perform the tasks mentioned above, or should they simply not be done at all in idiomatic Clojure code?
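For concreteness, here is a minimal sketch of the two tasks the question mentions; the Pair type and its fields are made-up examples:

(import 'clojure.lang.PersistentQueue)

;; Instantiating a persistent queue requires naming the class itself:
(def q (conj PersistentQueue/EMPTY 1 2 3))
(peek q) ;=> 1

;; Making a deftype play well with clojure.core functions such as
;; count and get means implementing clojure.lang interfaces:
(deftype Pair [a b]
  clojure.lang.Counted
  (count [_] 2)
  clojure.lang.ILookup
  (valAt [this k] (.valAt this k nil))
  (valAt [_ k not-found]
    (case k 0 a 1 b not-found)))

(count (->Pair :x :y)) ;=> 2
(get (->Pair :x :y) 1) ;=> :y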

Alex Miller has commented on this in the past (the whole thread is worth reading though):
I'd say there is a range of "public"-ness to the internals of Clojure.
The new Clojure API (clojure.java.api.Clojure) is an official public API for external callers of Clojure. This API basically consists of ways to resolve vars and invoke functions.
For Clojure users in Clojure, pretty much any var that's public and has a docstring, and shows up in the api docs can be considered public API.
Clojure vars that are private or have no docstring (such that the var is omitted from public api docs) are likely places to tread very carefully.
The Clojure internal Java interfaces [clojure.lang] are certainly intended to allow library builders to create useful stuff that plays in the Clojure world. I do not know that anyone has ever said that they are "public", but I certainly think that any change to a core interface likely to break external users would be considered very carefully.
The Clojure internal Java classes [clojure.lang] should in most cases be considered private and subject to change without notice. There are grey areas even there.
In general, we do not place a high value on encapsulation or hiding internals. In most cases, the internals are left available if they might be useful to an advanced user doing interesting things, with the caveat that the weirder things you do, the more likely you are to be accidentally broken in a future release.
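For reference, the clojure.java.api.Clojure API mentioned in the first bullet is aimed at Java callers, but its shape (resolve a var, invoke it as a function) is easy to see from a Clojure REPL via interop; a minimal sketch:

(import 'clojure.java.api.Clojure)

;; Resolve a var, then invoke it as a function:
(def plus (Clojure/var "clojure.core" "+"))
(.invoke plus 1 2) ;=> 3

;; Read edn data:
(Clojure/read "{:a 1}") ;=> {:a 1}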

Related

Unit tests, private methods and hidden abstraction

I have been reading about unit tests, to learn a bit more about them.
It seems that testing private methods shouldn't be the general rule, only an exception. I found this article, which explains the reasoning: https://enterprisecraftsmanship.com/posts/unit-testing-private-methods/
It says that if a private method is complex, one option is to create a public class and implement the method there, so it can be tested.
Here is my doubt.
The reason not to test private methods is that it is recommended to test only what the client can use from the library.
The reason to use private methods is to keep the client from knowing details about the internal implementation, so it is a good idea to keep the library as simple as possible for the client.
But if, to test the private method, I create a new public class and put the method there, now public, am I not giving the client details about the implementation? In practice, instead of declaring the private method public, a public class is created to hold the now-public method.
So I guess that I am misunderstanding something, but I don't know what.
One of the answers to the question How do you unit test private methods? also says that an option is to move the private method to a public class, but it doesn't explain more (I guess the answer could be much longer).
Thanks.
But if, to test the private method, I create a new public class and put the method there, now public, am I not giving the client details about the implementation?
The trick here is to not actually expose this. Exactly how to do this is language/ecosystem dependent, but generally speaking you’ll try to ship your code in a way that implementation details will not be (easily) accessible by end users.
For example, in C++ you could have private headers exposing the functionality that you don't ship with your library (not a problem as long as they aren't included in your interface headers). Java has its "jigsaw" module system. But even when this can't be mechanically enforced, you can still enforce it socially by making it very clear, through package and class names, when end users aren't supposed to use things. For example, if you're not using Java's module system, you can still put the implementation details of your package lib.foo in a package called lib.foo.implementation_details. Similarly, in languages like Smalltalk, which have no private methods at all, you can still give your methods names like private_foo that make it quite clear they're not meant for external users.
Of course mechanical enforcement is better, but as you note it’s not always an option. Even if it’s not available, the principles of private vs. public and interface vs. implementation still apply, you just have to be a bit more creative in how you make sure people actually adhere to these things.
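Clojure, the language most of this page revolves around, illustrates the same split nicely: defn- makes a function mechanically private, yet a test can still reach it through its var, so the function stays out of the documented API without becoming untestable. The names below are made up:

(ns mylib.core
  (:require [clojure.string :as str]))

(defn- normalize ; private: hidden from normal callers
  [s]
  (str/trim (str/lower-case s)))

;; In a test namespace, the var itself is still reachable:
(ns mylib.core-test)

(assert (= "abc" (#'mylib.core/normalize " ABC ")))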
The reason not to test private methods is that it is recommended to test only what the client can use from the library.
There are a lot of people explaining "the" goals of unit-testing, but in fact they are describing their goals when doing unit-testing. Unit-testing is applied in many different domains and contexts, ranging from toy projects to safety-critical software for nuclear plants, airplanes, etc.
In other words, a lot of software is developed where the above recommendation may be fine. But you can apply unit-testing way beyond that. If you don't want to start with a restricted view of what unit-testing can be about, you might instead look at it in the following way: one of the primary goals of testing in general, and of unit-testing in particular, is to find bugs (see Myers, Badgett, Sandler: The Art of Software Testing, or Beizer: Software Testing Techniques, but also many others).
Taking it for granted that unit-testing is about finding bugs, you may also want to test the implementation details: bugs are in the implementation, and different implementations of the same functionality have different bugs. Take a simple Fibonacci function: it can be implemented as a recursive function, as an iterative function, as a closed-form expression (de Moivre/Binet), using a hand-written lookup table, using an automatically generated lookup table, etc. For each of these implementations, the set of likely bugs will differ dramatically.
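To make that concrete, here is a sketch (not from the original post) of two functionally identical Fibonacci implementations whose likely bugs differ:

;; Naive recursion: correct for small n, but exponential time and
;; deep stacks for large n -- a failure mode the iterative version
;; simply cannot have. Tests aimed at it should probe large inputs.
(defn fib-rec [n]
  (if (< n 2)
    n
    (+ (fib-rec (- n 1)) (fib-rec (- n 2)))))

;; Iterative: constant stack, but off-by-one errors in the loop
;; bounds are the typical bug here, so tests should probe the
;; boundaries 0 and 1.
(defn fib-iter [n]
  (loop [a 0, b 1, i n]
    (if (zero? i)
      a
      (recur b (+ a b) (dec i)))))

(map fib-iter (range 7)) ;=> (0 1 1 2 3 5 8)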
Another example is sorting: there is a plethora of sort functions which, from a functionality perspective, do the same thing, and many even have the same user interface. The IntroSort algorithm is quite interesting with respect to testing because it implements a quicksort, but when it realizes that it is running into a degenerate case, it falls back to another algorithm (typically heap-sort). Testing an IntroSort therefore also means crafting a degenerate data set that forces the algorithm to enter the heap-sort path, just to be sure that the potential bugs in the heap-sort part can be found. Looking at the public interface alone, you would never come up with such a test case (at least, that would be quite a coincidence).
Summarized so far: Testing implementation details is by no means bad practice. It comes at a cost: Tests that go into implementation details are certainly more likely to break or become useless when the implementation changes. Therefore, it depends on your project whether finding more bugs is more important than saving test maintenance effort.
Regarding the possibilities for making private functions accessible to tests while still not making them part of the "official" public interface: @Cubic has explained nicely the difference between a) being public in the technical sense of the visibility rules of the given programming language, and b) belonging to the "official" and documented public API.

Equivalent of "Package-Protected" for TypeScript Testing?

When doing testing in Java, I could use the package-protected access level to ensure proper encapsulation without making methods public. However, when testing TypeScript, I have encountered several situations where I wish I could do a similar thing, as the method I am looking at seems sufficiently complex and is currently set as private or protected.
What is the best practice for testing these types of functions? Should they just be made public for the purposes of testing, or is there an easier way?
Leave them public but name them as _foo (instead of foo).
More
This follows the convention in JavaScript land. TypeScript hasn't created any new concepts here 🌹

Clojure module dependencies

I'm trying to create a modular application in Clojure.
Let's suppose we have a blog engine that consists of two modules, for example a database module and an article module (something that stores articles for the blog), each with some configuration parameters.
So the article module depends on the storage module, and having two instances of the article module and the database module (with different parameters) allows us to host two different blogs in two different databases.
I tried to implement this by creating new namespaces for each initialized module on the fly, and defining functions in these namespaces with partially applied parameters. But this approach feels like a hack, I think.
What is right way to do this?
A 'module' is a noun, as in the 'Kingdom of Nouns' by Steve Yegge.
Stick to non side-effecting or pure functions of their parameters (verbs) as much as possible except at the topmost levels of your abstractions. You can organize those functions however you like. At the topmost levels you will have some application state, there are many approaches to manage that, but the one I use the most is to hide these top-level services under a clojure protocol, then implement it in a clojure record (which may hold references to database connections or some-such).
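A minimal sketch of that protocol-plus-record shape, with made-up names (ArticleStore, db-spec) standing in for the real thing:

(defprotocol ArticleStore
  (save-article! [store article])
  (find-article [store id]))

;; The record holds the configuration/connection state that would
;; otherwise leak into a namespace:
(defrecord DbArticleStore [db-spec]
  ArticleStore
  (save-article! [_ article]
    ;; e.g. write to the database identified by db-spec
    (println "saving to" (:dbname db-spec)))
  (find-article [_ id]
    ;; e.g. look up the article by id
    nil))

;; Two blogs are just two instances with different parameters --
;; no on-the-fly namespaces, no global state:
(def blog-a (->DbArticleStore {:dbname "blog_a"}))
(def blog-b (->DbArticleStore {:dbname "blog_b"}))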
This approach maximizes flexibility and prevents you from writing yourself into a corner. It's analogous to Java's dependency injection. Stuart Sierra gave a good talk on these topics recently at Clojure/West 2013, but the video is not yet available.
Note the difference from your approach. You need to separate the management and resolution of objects from their lifecycles. Tying them to namespaces is quick for access, but it means any functions you write as clients that use that code are now accessing global state. With protocols, you can separate the implementation detail of global state from the interface of access.
If you need a motivating example of why this is useful, consider: how would you intercept all access to a service that's globally accessible? You would have to push the full implementation down and make the entry point a wrapper function, instead of pushing the relevant details closer to the client code. What if you wanted some behavior for some clients of the code and not others? Now you're stuck. This is just making those inevitable trade-offs preemptively, to make your life easier.

What is the clojure equivalent to google guice?

I came across Google Guice and could not really understand it and what it does, although there seems to be a lot of hype around it. I was hoping to get a Clojurian perspective on the library: why it is or isn't needed in Clojure applications, and whether there is anything similar built into the language.
Because of Java's OO and type system, dynamically switching between different underlying implementations (for test (mocking) purposes for instance) can be difficult to manage. Libraries like Google Guice are intended to handle these dependency injections in Java more gracefully.
In Clojure and other functional languages functions can be passed around, which makes using different implementations much easier.
There are several ways this can be done in Clojure (all three are sketched after this list):
Using your choice of function as parameters in higher order functions.
(Re)Binding your choice of function to a var.
Encapsulating your choice of function inside closures that can then be passed around and called.
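A sketch of all three options; fetch-user and the stub implementations are hypothetical:

;; 1. Pass the dependency in as a higher-order-function parameter:
(defn greet [fetch-user id]
  (str "Hello, " (:name (fetch-user id))))

(greet (fn [_] {:name "Ada"}) 42) ;=> "Hello, Ada"

;; 2. (Re)bind a dynamic var, e.g. to a mock in a test:
(def ^:dynamic *fetch-user*
  (fn [id] {:name (str "user-" id)}))

(binding [*fetch-user* (fn [_] {:name "Mock"})]
  (:name (*fetch-user* 7))) ;=> "Mock"

;; 3. Capture the choice in a closure and pass that around:
(defn make-greeter [fetch-user]
  (fn [id] (str "Hello, " (:name (fetch-user id)))))

((make-greeter (fn [_] {:name "Stub"})) 1) ;=> "Hello, Stub"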
Chapter 12 of Clojure Programming has some nice examples of OO patterns like dependency injection and the alternative ways to handle these in Clojure.
Sean Devlin also has a Full Disclojure video on Dependency Injection in Clojure. His example might have been chosen better, though. Instead of using completely different function implementations in his closure, he uses a factory that returns different 'versions' of a function. The gist stays the same though.
Basically, dependency injection is a pattern that is a necessary evil in OOP, and can be solved easily (or is not even a problem) in FP.
The rough Clojure equivalents are still in development. Two libraries are in progress (as of Oct '12): Prismatic's Graph (not yet open sourced) and Flow by Stuart Sierra.
Note that I consider Guice to be more than dependency injection. It provides a framework for application configuration / modularization. The above libraries aim to accomplish that goal.

Prefixing interfaces with I?

I am currently reading "Clean Code" by Robert Martin (UncleBob), and I am generally loving his musings. However, I got a bit confused when I read that he avoids prefixing interfaces, as in "IPerson". He states: "I don't want my users knowing that I'm handing them an interface."
Thinking from a TDD/injection perspective, I will always be very interested in telling the "users" of my classes that I am handing them an interface. The primary reason is that I consider interfaces contracts between the different "agents" of a system. An agent working with one corner of my system should not know the concrete implementation of another agent's work; they should only exchange contracts, and expect the contracts to be fulfilled without knowing how. The other, but also very important, reason is that an interface can be mocked fully, thus making unit-testing much easier. There are limits to how much you can mock on a concrete class.
Therefore, I prefer to make it visible that I am indeed handing over an interface... or taking an interface as an argument. But since UncleBob is a heavyweight champ in our community, and I am just another flyweight desk jockey, I would like to know if I am missing something.
Is it wrong for me to insist on I's in interfaces??
There are a number of conventions in Java and C# that we have grown comfortable with, but that are backwards. For example, the convention of putting private variables at the top of each class is quite silly from a technical point of view. The most important things about a class are its public methods. The least important things, the things we hide behind a privacy barrier, are the instance variables. So why would we put them at the top?
The "I" in front of interfaces is another backwards convention. When you are passed a reference to an object, you should expect it to be an interface. Interfaces should be the default; so there is no point in doing something extra, like using an I prefix, to announce that you are doing what everyone expects you to do. It would be better (though still wrong) if we reserved a special marker for the exceptional condition of passing a concrete class.
Another problem with using I is that (oddly) we use it to communicate the implementation decision of using an interface. Usually we don't want implementation decisions expressed so loudly, because that makes them hard to change. Consider, for example, what might happen if you decided that IFoo really ought to be an abstract class instead of an interface. Should you change the name to Foo, or CFoo, or ACFoo?
I can hear the wheels turning in your head. You are thinking: "Yeah, but interfaces have a special place in the language, and so it's reasonable to mark them with a special naming convention." That's true. But integers also have a special place in the language, and we don't mark them (any more). Besides, ask yourself this, why do interfaces have a special place in the language?
The whole idea behind interfaces in Java and C# was a cop-out. The language designers could have just used abstract classes, but they were worried about the difficulties of implementing multiple inheritance. So they made a back-room deal with themselves. They invented an artificial construct (i.e. interfaces) that would provide some of the power of multiple inheritance, and they constrained normal classes to single inheritance.
This was one of the worst decisions the language designers made. They invented a new and heavyweight syntax element in order to exclude a useful and powerful (albeit controversial) language feature. Interfaces were not invented to enable, they were invented to disable. Interfaces are a hack placed in the language by designers who didn't want to solve the harder problem of MI. So when you use the I prefix, you are putting a big spotlight on one of the largest hacks in language history.
The next time you write a function signature like this:
public void myFunction(IFoo foo) {...}
Ask yourself this: "Why do I want to know that the author of IFoo used the word 'interface'? What difference does it make to me whether he used 'interface' or 'class' or even 'struct'? That's his business, not mine! So why is he forcing me to know his business by putting this great big I in front of his type name? Why doesn't he zip his declarations up and keep his privates out of my face?"
I consider interfaces contracts between the different "agents" of a system. An agent working with one corner of my system should not know the concrete implementation of another agent's work; they should only exchange contracts, and expect the contracts to be fulfilled without knowing how. The other, but also very important, reason is that an interface can be mocked fully, thus making unit-testing much easier. There are limits to how much you can mock on a concrete class.
All of this is true - but how does it necessitate a naming convention for interfaces?
Basically, prefixing interfaces with "I" is nothing but another example of the useless kind of Hungarian notation, because in a statically typed language (the only kind where interfaces as a language construct make sense) you can always easily and quickly find out what a type is, usually by hovering the mouse over it in the IDE.
If you're talking about .NET, then interfaces with I at the beginning are so ubiquitous that dropping them would confuse the hell out of everyone.
Plus I'd much rather have
public class Foo : IFoo {}
than
public class FooImpl : Foo {}
It all boils down to personal preference; I did play with the idea myself for a while, but I went back to the I prefix. YMMV