I'm relatively new to Clojure but I've noticed many projects implement a "core" namespace; e.g. foobar.core seems to be a very common pattern. What's the history behind this and why is it the de facto standard?
It's just a generic name that represents the "core" functionality of something. The core functions of Clojure itself define a general purpose programming language, with very generic functions you might need in many different problem domains. Functions which are only applicable to a specific problem domain go into domain-specific namespaces, such as clojure.string or clojure.set.
Tools like Leiningen can't know a priori what sort of program you're trying to write, so they set you up with a foo.core namespace by default, modelled after clojure.core: clojure.core is the core functionality of Clojure, so foo.core is the core functionality of foo.
If you can give your core namespace a less generic name, or just use foo.core to kick off bootstrapping (such as by holding your -main function), that's encouraged, so that the bulk of your code resides in more semantically meaningful namespaces instead. This helps you find the specific code you want to work on later.
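As a rough sketch of that layout, the following keeps the real logic in a descriptively named namespace and leaves foo.core holding only -main. The names foo.greeting and greet are made up for illustration, and the two ns forms would normally live in separate files (src/foo/greeting.clj and src/foo/core.clj); they share one file here only so the snippet can be loaded as-is.

```clojure
;; Normally src/foo/greeting.clj: a semantically named namespace
;; that holds the actual logic (hypothetical example).
(ns foo.greeting)

(defn greet
  "Builds a greeting for the given name."
  [name]
  (str "Hello, " name "!"))

;; Normally src/foo/core.clj: only bootstraps the program.
(ns foo.core
  (:require [foo.greeting :as greeting]))

(defn -main
  "Entry point; delegates straight to the domain namespace."
  [& args]
  (println (greeting/greet (or (first args) "world"))))
```

With this split, foo.core stays tiny and anything worth searching for later lives under a name that says what it does.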
In my Clojure project I am using Clojure Spec, but if I need to use a library like compojure-api then I need to use Schema.
What are the advantages of one over the others?
Why would I consider one over the others?
Which one is good for compile-time type checking?
These are merely three different approaches to giving the developer some type safety. All three offer their own DSL to describe the schema or type of data, but they are very different in philosophy. They are all actively maintained and have a nice community.
This is an opinionated overview based on my experience.
core.typed
core.typed tries to extend the Clojure language with additional macros that annotate functions and vars with static type information. It then uses static type analysis to ensure that the code matches the type information (that is, it produces and consumes data of the right types).
Some advantages:
Static typing in general is a very strong tool. If you are familiar with statically typed programming languages you will appreciate this very much.
Many bugs can be found during compilation time. No more NullPointerExceptions!
Some drawbacks:
Changing a type or a piece of code may require extra work to propagate the change through all parts of your code. And sometimes it is just too complicated to write the type information, or to write programs that type-check.
Static code checking will slow down your compile times and may slow down your development workflow.
Schema
In Schema you also write type annotations, but type checking happens at runtime. It encourages you to construct schema declarations dynamically and lets you specify where you want schemas checked and where you do not want that functionality.
Some advantages:
Very friendly DSL to describe data schema.
Various tools. For example: Data generation for generative testing, tools to explain why a schema does not match.
Some disadvantages:
Only checks for schema where and when you tell it to do so.
External library, not supported by the core team.
Spec
Spec is the latest player, with a philosophy borrowed from the Racket language. It is (going to be) part of the Clojure core library as of Clojure 1.9.
The basic idea is to have entity types specified by the (namespaced) keys in a map object. Spec declarations are stored in the application's registry bound to namespaced keywords. Spec is very strong in sequence validation.
Some advantages:
Part of the Clojure core, not an external library. It is now used for parsing macro arguments and also for documentation purposes.
The community is very excited about it resulting in interesting ideas such as using spec in genetic programming and generative testing.
Some disadvantages:
Will be available in Clojure 1.9 which is not yet a released stable version. It is still a new technology not widely used.
Specs do not look like the data they are describing.
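To make the Spec points above concrete, here is a small sketch. It assumes Clojure 1.9 or later, where spec is available under the clojure.spec.alpha namespace; the :blog/... keywords are invented for illustration. It shows an entity described by its namespaced keys, and a sequence spec that both validates and parses its input.

```clojure
(require '[clojure.spec.alpha :as s])

;; Attribute specs are registered globally under namespaced keywords.
(s/def :blog.person/name string?)
(s/def :blog.person/age pos-int?)

;; An entity is described by the keys it must or may contain.
(s/def :blog/person
  (s/keys :req [:blog.person/name]
          :opt [:blog.person/age]))

(s/valid? :blog/person {:blog.person/name "Ada" :blog.person/age 36}) ; valid
(s/valid? :blog/person {:blog.person/name "Ada" :blog.person/age -1}) ; invalid: bad age
(s/valid? :blog/person {:blog.person/age 36})                         ; invalid: no name

;; Sequence specs read like regular expressions over data;
;; conform returns the parsed parts labelled by the tags given to s/cat.
(s/def :blog/command
  (s/cat :op keyword? :args (s/* number?)))

(s/conform :blog/command [:add 1 2 3])
;; => {:op :add, :args [1 2 3]}
(s/conform :blog/command ["add" 1])
;; => :clojure.spec.alpha/invalid
```

Note how the s/keys form lists key names rather than mirroring the shape of the map, which is the sense in which specs do not look like the data they describe.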
Personally, core.typed feels intimidating and clojure.spec feels immature, so I use Schema in production. My advice is the following:
If you need static type checking then core.typed is the way to go.
If you want to do parsing then clojure.spec is a nice choice.
If you want simple type descriptions then schema will be a good fit.
In Clojure, some tasks (such as instantiating a PersistentQueue or using deftype to implement a custom data type that is compatible with the clojure.core functions) require knowledge of the classes and/or interfaces in clojure.lang.
However, according to clojure.lang/package.html:
The only class considered part of the public API is clojure.lang.IFn. All other classes should be considered implementation details.
Are these statements incorrect or outdated? If so, are there plans to correct them in the future? If not, is there a more preferred way to perform the tasks mentioned above, or should they simply not be done at all in idiomatic Clojure code?
Alex Miller has commented on this in the past (the whole thread is worth reading though):
I'd say there is a range of "public"-ness to the internals of Clojure.
The new Clojure API (clojure.java.api.Clojure) is an official public API for external callers of Clojure. This API basically consists of ways to resolve vars and invoke functions.
For Clojure users in Clojure, pretty much any var that's public and has a docstring, and shows up in the api docs can be considered public API.
Clojure vars that are private or have no docstring (such that the var is omitted from public api docs) are likely places to tread very carefully.
The Clojure internal Java interfaces [clojure.lang] are certainly intended to allow library builders to create useful stuff that plays in the Clojure world. I do not know that anyone has ever said that they are "public", but I certainly think that any change to a core interface likely to break external users would be considered very carefully.
The Clojure internal Java classes [clojure.lang] should in most cases be considered private and subject to change without notice. There are grey areas even there.
In general, we do not place a high value on encapsulation or hiding internals. In most cases, the internals are left available if they might be useful to an advanced user doing interesting things, with the caveat that the weirder things you do, the more likely you are to be accidentally broken in a future release.
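For reference, the two tasks named in the question look roughly like this. The Pair type is a made-up example; the sketch touches a clojure.lang class (PersistentQueue) in the first half and clojure.lang interfaces (Counted, Seqable) in the second, which are the two categories the comment above distinguishes.

```clojure
;; There is no queue literal or constructor function in clojure.core,
;; so the empty queue is reached through the clojure.lang class.
(def q (conj clojure.lang.PersistentQueue/EMPTY :a :b :c))

(peek q)       ; => :a  (front of the queue)
(seq (pop q))  ; => (:b :c)

;; A custom type becomes usable with core functions by implementing
;; the clojure.lang interfaces those functions look for.
(deftype Pair [a b]
  clojure.lang.Counted
  (count [_] 2)
  clojure.lang.Seqable
  (seq [_] (list a b)))

(count (->Pair 1 2))    ; => 2
(map inc (->Pair 1 2))  ; => (2 3)
```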
I'm trying to create a modular application in Clojure.
Let's suppose that we have a blog engine which consists of two modules, for example a database module and an article module (something that stores articles for the blog), each with some configuration parameters.
So the article module depends on storage, and having two instances of the article module and the database module (with different parameters) allows us to host two different blogs in two different databases.
I tried to implement this by creating new namespaces for each initialized module on the fly, and defining functions in these namespaces with partially applied parameters. But this approach feels like a hack, I think.
What is the right way to do this?
A 'module' is a noun, as in the 'Kingdom of Nouns' by Steve Yegge.
Stick to non-side-effecting, pure functions of their parameters (verbs) as much as possible, except at the topmost levels of your abstractions. You can organize those functions however you like. At the topmost levels you will have some application state. There are many approaches to managing it, but the one I use most is to hide these top-level services behind a Clojure protocol and then implement it in a Clojure record (which may hold references to database connections or some such).
This approach maximizes flexibility and prevents you from writing yourself into a corner. It's analogous to Java's dependency injection. Stuart Sierra gave a good talk on these topics at Clojure/West 2013, but the video is not yet available.
Note the difference from your approach. You need to separate the management and resolution of objects from their lifecycles. Tying them to namespaces is quick for access, but it means any functions you write as clients that use that code are now accessing global state. With protocols, you can separate the implementation detail of global state from the interface of access.
If you need a motivating example of why this is useful, consider, how would you intercept all access to a service that's globally accessible? Well, you would push the full implementation down and make the entry point a wrapper function, instead of pushing the relevant details closer to the client code. What if you wanted some behavior for some clients of the code and not others? Now you're stuck. This is just anticipating making those inevitable trade-offs preemptively and making your life easier.
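A minimal sketch of the protocol-plus-record approach for the blog example might look like the following. All names (ArticleStore, InMemoryStore, LoggingStore, publish!) are hypothetical, and an atom stands in for a real database connection. Two independent store instances give two blogs, and a wrapper record shows how access can be intercepted for some clients without touching the others.

```clojure
;; The interface that client code programs against.
(defprotocol ArticleStore
  (save-article! [store article])
  (find-article [store id]))

;; One implementation; the atom stands in for a database connection.
(defrecord InMemoryStore [state]
  ArticleStore
  (save-article! [_ article]
    (swap! state assoc (:id article) article)
    article)
  (find-article [_ id]
    (get @state id)))

(defn new-store []
  (->InMemoryStore (atom {})))

;; Client code receives the store as a parameter instead of
;; reaching for global state tied to a namespace.
(defn publish! [store article]
  (save-article! store (assoc article :published? true)))

;; Two instances with separate state: two blogs, two "databases".
(def blog-a (new-store))
(def blog-b (new-store))
(publish! blog-a {:id 1 :title "Hello"})

;; Interception: wrap any store and record every access.
(defrecord LoggingStore [inner log]
  ArticleStore
  (save-article! [_ article]
    (swap! log conj [:save (:id article)])
    (save-article! inner article))
  (find-article [_ id]
    (swap! log conj [:find id])
    (find-article inner id)))

(def access-log (atom []))
(def audited-a (->LoggingStore blog-a access-log))
(find-article audited-a 1)
```

Clients handed audited-a are logged, while clients handed blog-a directly are not, which is the kind of per-client choice that is hard to make once the service is a global.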
I came across Google Guice and could not really understand it or what it does, although there seems to be a lot of hype around it. I was hoping to get a Clojurian perspective on the library: why it is or is not needed in Clojure applications, and whether there is anything similar built into the language.
Because of Java's OO and type system, dynamically switching between different underlying implementations (for test (mocking) purposes for instance) can be difficult to manage. Libraries like Google Guice are intended to handle these dependency injections in Java more gracefully.
In Clojure and other functional languages functions can be passed around, which makes using different implementations much easier.
There are several ways this can be done in Clojure:
Using your choice of function as parameters in higher order functions.
(Re)Binding your choice of function to a var.
Encapsulating your choice of function inside closures that can then be passed around and called.
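The three options above can be sketched as follows. The functions smtp-send and fake-send are invented stand-ins that only record what they were asked to send, so the snippet has no real side effects.

```clojure
;; Records every "sent" message so the effect can be inspected.
(def outbox (atom []))

(defn smtp-send [to msg]            ; stand-in for the production implementation
  (swap! outbox conj [:smtp to msg]))

(defn fake-send [to msg]            ; stand-in for a test double
  (swap! outbox conj [:fake to msg]))

;; 1. Pass the implementation as an argument to a higher-order function.
(defn notify [send-fn user msg]
  (send-fn (:email user) msg))

;; 2. Rebind a dynamic var for the extent of a call.
(def ^:dynamic *send* smtp-send)

(defn notify-via-var [user msg]
  (*send* (:email user) msg))

;; 3. Close over the implementation and pass the resulting function around.
(defn make-notifier [send-fn]
  (fn [user msg]
    (send-fn (:email user) msg)))

(def user {:email "ada@example.com"})

(notify fake-send user "one")
(binding [*send* fake-send]
  (notify-via-var user "two"))
((make-notifier fake-send) user "three")
(notify-via-var user "four")        ; outside the binding: uses smtp-send
```

Only the last call goes through smtp-send; the first three each swap in fake-send by a different mechanism.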
Chapter 12 of Clojure Programming has some nice examples of OO patterns like dependency injection and the alternative ways to handle these in Clojure.
Sean Devlin also has a Full Disclojure video on Dependency Injection in Clojure. His example might have been chosen better, though. Instead of using completely different function implementations in his closure, he uses a factory that returns different 'versions' of a function. The gist stays the same though.
Basically, dependency injection is a pattern that is a necessary evil in OOP, and can be solved easily (or is not even a problem) in FP.
The rough Clojure equivalents are still in development. There are two libraries currently in development (as of Oct '12): Prismatic's Graph (not yet open sourced) and Flow by Stuart Sierra.
Note that I consider Guice to be more than dependency injection. It provides a framework for application configuration / modularization. The above libraries aim to accomplish that goal.
I can understand the use for one level of namespaces. But three levels of namespaces looks insane. Is there any practical use for that? Or is it just a misconception?
Hierarchical namespaces do have a use in that they allow progressively more refined definitions. Certainly a single provider may produce two classes with the same name. Often the first level is occupied by the company name, the second specifies the product, and the third (and possibly more) may provide the domain.
There are also other uses of namespace segregation. One popular situation is placing the base classes for a factory pattern in their own namespace and then derived factories in their own namespaces by provider. E.g. System.Data, System.Data.SqlClient and System.Data.OleDb.
Obviously it's a matter of opinion. But it really boils down to organization. For example, I have a project which has a plugin api that has functions/objects which look something like this:
plugins::v1::function
When 2.0 is rolled out they will be put into the v2 sub-namespace. I plan to only deprecate but never remove v1 members which should nicely support backwards compatibility in the future. This is just one example of "sane" usage. I imagine some people will differ, but like I said, it's a matter of opinion.
Big codebases will need it. Look at boost for an example. I don't think anyone would call the boost code 'insane'.
If you consider the fact that at any one level of a hierarchy, people can only comprehend somewhere very roughly on the order of 10 items, then two levels only gives you 100 maximum. A sufficiently big project is going to need more, so can easily end up 3 levels deep.
I work on XXX application in my company yyy, and I am writing a GUI subsystem. So I use yyy::xxx::gui as my namespace.
You can easily find yourself in a situation when you need more than one level. For example, your company has a giant namespace for all of its code to separate it from third party code, and you are writing a library which you want to put in its own namespace. Generally, whenever you have a very large and complex system, which is broken down hierarchically, it is reasonable to use several namespace levels.
It depends on your needs and programming style. But one of the benefits of namespaces is to help partition the name space (hence the name). With a single namespace, as your project increases in size and complexity, so does the likelihood of name collisions.
If you're writing code that's meant to be shared or reused, this becomes even more important.
I agree for applications. Most people that use multiple levels of namespaces (in my experience) come from a Java or .NET background where the noise is significantly less. I find that good class prefixes can take the place of multiple levels of namespaces.
But I have seen good use of multiple namespace levels in boost (and other libraries). Everything is in the boost namespace, but libraries are allowed (encouraged?) to be in their own namespace. For example - boost::this_thread namespace. It allows things like...
boost::this_thread::get_id()
boost::this_thread::interruption_requested()
"this_thread" is just a namespace for a collection of free functions. You could do the same thing with a class and static functions (i.e. the Java way of defining a free function), but why do something unnatural when the language has a natural way of doing it?
Just look at the .NET base class library to see a namespace hierarchy put to good use. It goes four or five levels deep in a few places, but mostly it's just two or three, and the organization is very nice for finding things.
The bigger the codebase the bigger the need for hierarchical namespaces. As your project gets bigger and bigger you find you need to break it out in ways to make it easier to find stuff.
For instance, we currently use a two-level hierarchy. However, we are now talking about breaking some of the bigger portions out into three levels.