How does Clojure approach Separation of Concerns? - clojure

How does Clojure approach Separation of Concerns ? Since code is data, functions can be passed as parameters and used as returns...
And, since there is that principle "Better 1000 functions that work on 1 data structure, than 100 functions on 100 data structures" (or something like that).
I mean, pack everything a map, give it a keyword as key, and that's it ? functions, scalars, collections, everything...
The idea of Separation of Concerns is implemented, in Java, by means of Aspects (aspect oriented programming) and annotations. This is my view of the concept and might be somewhat limited, so don't take it for granted.
What is the right way (idiomatic way) to go about in Clojure, to avoid the WTFs of fellow programmers _

In a functional language, the best way to handle separation of concerns is to convert any programming problem into a set of transformations on a data structure. For instance, if you write a web app, the overall goal is to take a request and transform it into a response, which can be thought of as simply transforming the request data into response data. (In a non-trivial web app, the starting data would probably include not only the request, but also session and database information) Most programming tasks can be thought of in this way.
Each "concern" would be a function in a "pipeline" that helps make the transform possible. In this way, each function is completely decoupled from the other steps.
Note that this means that your data, as it undergoes these transformations, needs to be rich in its structure. Essentially, we want to put all the "intelligence" of our program into the data, not in the code. In a complicated functional program, the data at the different levels may be complex enough that in needs to look like a programming language in its own right- This is where the idea of "domain-specific languages" comes into play.
Clojure has excellent support for manipulating complex heterogenous data structures, which makes this less cumbersome than it may sound (i.e. it's not cumbersome at all if done right)
In addition, Clojure's support for lazy data structures allows these intermediate data structures to actually be (conceptually) infinite in size, which makes this decoupling possible in most scenarios. See the following paper for info on why having infinite data structures is so valuable in this situation: http://www.cs.kent.ac.uk/people/staff/dat/miranda/whyfp90.pdf
This "pipeline" approach can handle 90% of your needs for separating concerns. For the remaining 10% you can use Clojure macros, which, at a high level, can be thought of as a very powerful tool for aspect-oriented programming.
That's how I believe you can best decouple concerns in Clojure- Note that "objects" or "aspects" are not really necessary concepts in this approach.

Related

Lisp-family: how to escape object-oriented java-like thinking? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
Improve this question
Backstory: I've made a lot of large and relatively complex projects in Java, have a lot of experience in embedded C programming. I've got acquainted with scheme and CL syntax and wrote some simple programms with racket.
Question: I've planned a rather big project and want to do it in racket. I've heard a lot of "if you "get" lisp, you will become a better programmer", etc. But every time I try to plan or write a program I still "decompose" the task with familiar stateful objects with interfaces.
Are there "design patterns" for lisp? How to "get" lisp-family "mojo"? How to escape object-oriented constraint on your thinking? How to apply functional programming ideas boosted by powerful macro-facitilties? I tried studying source code of big projects on github (Light Table, for instance) and got more confused, rather than enlightened.
EDIT1 (less ambigious questions): is there a good literatue on the topic, that you can recommend or are there good open source projects written in cl/scheme/clojure that are of high quality and can serve as a good example?
A number of "paradigms" have come into fashion over the years:
structured programming, object oriented, functional, etc. More will come.
Even after a paradigm falls out of fashion, it can still be good at solving the particular problems that first made it popular.
So for example using OOP for a GUI is still natural. (Most GUI frameworks have a bunch of states modified by messages/events.)
Racket is multi-paradigm. It has a class system. I rarely use it,
but it's available when an OO approach makes sense for the problem.
Common Lisp has multimethods and CLOS. Clojure has multimethods and Java class interop.
And anyway, basic stateful OOP ~= mutating a variable in a closure:
#lang racket
;; My First Little Object
(define obj
(let ([val #f])
(match-lambda*
[(list) val]
[(list 'double) (set! val (* 2 val))]
[(list v) (set! val v)])))
obj ;#<procedure:obj>
(obj) ;#f
(obj 42)
(obj) ;42
(obj 'double)
(obj) ;84
Is this a great object system? No. But it helps you see that the essence of OOP is encapsulating state with functions that modify it. And you can do this in Lisp, easily.
What I'm getting at: I don't think using Lisp is about being "anti-OOP" or "pro-functional". Instead, it's a great way to play with (and use in production) the basic building blocks of programming. You can explore different paradigms. You can experiment with ideas like "code is data and vice versa".
I don't see Lisp as some sort of spiritual experience. At most, it's like Zen, and satori is the realization that all of these paradigms are just different sides of the same coin. They're all wonderful, and they all suck. The paradigm pointing at the solution, is not the solution. Blah blah blah. :)
My practical advice is, it sounds like you want to round out your experience with functional programming. If you must do this the first time on a big project, that's challenging. But in that case, try to break your program into pieces that "maintain state" vs. "calculate things". The latter are where you can try to focus on "being more functional". Look for opportunities to write pure functions. Chain them together. Learn how to use higher-order functions. And finally, connect them to the rest of your application -- which can continue to be stateful and OOP and imperative. That's OK, for now, and maybe forever.
A way to compare programming in OO vs Lisp (and "functional" programming in general) is to look at what each "paradigm" enables for the programmer.
One viewpoint in this line of reasoning, which looks at representations of data, is that the OO style makes it easier to extend data representations, but makes it more difficult to add operations on data. In contrast, the functional style makes it easier to add operations but harder to add new data representations.
Concretely, if there is a Printer interface, with OO, it's very easy to add a new HPPrinter class that implements the interface, but if you want to add a new method to an existing interface, you must edit every existing class that implements the interface, which is more difficult and may be impossible if the class definitions are hidden in a library.
In contrast, with the functional style, functions (instead of classes) are the unit of code, so one can easily add a new operation (just write a function). However, each function is responsible for dispatching according to the kind of input, so adding a new data representation requires editing all existing functions that operate on that kind of data.
Determining which style is more appropriate for your domain depends on whether you are more likely to add representations or operations.
This is a high-level generalization of course, and each style has developed solutions to cope with the tradeoffs mentioned (eg mixins for OO), but I think it still holds to a large degree.
Here is a well-known academic paper that captured the idea 25 years ago.
Here are some notes from a recent course (I taught) describing the same philosophy.
(Note that the course follows the How to Design Programs curriculum, which initially emphasizes the functional approach, but later transitions to the OO style.)
edit: Of course this only answers part of your question and does not address the (more or less orthogonal) topic of macros. For that I refer to Greg Hendershott's excellent tutorial.
A personal view:
If you parameterise an object design in the names of the classes and their methods - as you might do with C++ templates - then you end up with something that looks quite like a functional design. In other words, functional programming does not make useless distinctions between similar structures because their parts go by different names.
My exposure has been to Clojure, which tries to steal the good bit from object programming
working to interfaces
while discarding the dodgy and useless bits
concrete inheritance
traditional data hiding.
Opinions vary about how successful this programme has been.
Since Clojure is expressed in Java (or some equivalent), not only can objects do what functions can do, there is a regular mapping from one to the other.
So where can any functional advantage lie? I'd say expressiveness. There are lots of repetitive things you do in programs that are not worth capturing in Java - who used lambdas before Java provided compact syntax for them? Yet the mechanism was always there.
And Lisps have macros, which have the effect of making all structures first class. And there's a synergy between these aspects that you will enjoy.
The "Gang of 4" design patterns apply to the Lisp family just as much as they do to other languages. I use CL, so this is more of a CL perspective/commentary.
Here's the difference: Think in terms of methods that operate on families of types. That's what defgeneric and defmethod are all about. You should use defstruct and defclass as containers for your data, keeping in mind that all you really get are accessors to the data. defmethod is basically your usual class method (more or less) from the perspective of an operator on a group of classes or types (multiple inheritance.)
You'll find that you'll use defun and define a lot. That's normal. When you do see commonality in parameter lists and associated types, then you'll optimize using defgeneric/defmethod. (Look for CL quadtree code on github, for an example.)
Macros: Useful when you need to glue code around a set of forms. Like when you need to ensure that resources are reclaimed (closing files) or the C++ "protocol" style using protected virtual methods to ensure specific pre- and post-processing.
And, finally, don't hesitate to return a lambda to encapsulate internal machinery. That's probably the best way to implement an iterator ("let over lambda" style.)
Hope this gets you started.

Immutable Data Structure - Application maintenance

I have been reading about Immutable data structures and understood that change detection has been made easy . And quite often, I hear that it makes the application maintenance simpler and provides an easy to understand programming model.
I need help to understand the way it simplifies the job.
The Clojure community has embraced immutability and it is an eye opener. The best I can do is send you to the source: Rich Hickey's essay on State and his talk The Value of Values. Rich explains how separating the concept of a variable into three distinct concepts: identity, state, and value helps you model your system and reason about it.
It boils down to this: in your programming model, you should only allow things to change if they change in the system you are trying to model. Otherwise you are adding moving parts (mutable variables and objects) to a model that doesn't need them. This makes it harder to understand the model (specially as time evolves) but has little or no benefit.
Even though reading helps, the only way to grok this is to program in a language that takes immutability as a default until you realize how most of the systems you model actually have only a handful of things that change instead of pages and pages of mutable variables.
Immutability is certainly more embraced in functional languages than in imperative ones, even if you can have a Java programming style that limits mutability (see this for immutability in Java). That said, I will just comment on [functional/immutability] and [object/mutability].
I'm Clojure fan and find functional programming really powerful, but...
May be I spent too much time with C++ & Java and not enough with Lisp & Clojure, but I reckon that the simpler maintenance argument has yet to be proven by facts. I'm not sure there are reliable surveys on the actual cost of maintenance in big production systems with data on the technology used and associated costs.
Certainly, in terms of LOC, language like Clojure are really more focused and concise than Java. Hence you can say that less code leads to less maintenance, but I think functional style gives really more compact code that needs a very focused attention to fully understand what a function is doing comparing to imperative style which is more verbose but kind of straightforward. One big advantage of functional programming associated with immutability, is the ability to isolate a function and experiment with it without the need to drag a heavy context of satellite objects or build a bunch of mocks, which is very often the case with OO languages. Putting aside the experimentation, a pure function won't modify its arguments, which ease the fear to break unintentionally some piece of code outside the scope of the function.
But, putting aside the merits of functional/immutability over oop/mutability, in terms of maintenance, my experience leads me to think that it's not the technology which is the main issue, but the design, code quality and evolution of this code over time even when the initial one was of good quality. By "good", I mean that the code is respectful of style conventions (like basic naming), managed complexity, and has a sensible test harness, in a continuous (or at least automated) build environment.
Then, the question becomes: is there a paradigm (functional/immutability, object-oriented/mutability) that enforced a better design and better code. My feeling is that functional languages are the land of computer science passionates, OTOH OOP is more mainstream. Isn't it because OOP is easier to apprehend or is ot just a matter of education? but then, in order to maintain a system in the long run, should one go for a "clever" functional environment with few people able to tackle it, or some mainstream OO technology - with its unsafeness or permissiveness - but lots of people having some knowledge in it?
Certainly the solution is to choose the right technologies (plural) with the right, motivated people...

Clojure Protocols vs Types

Disclaimer
Despite the title, this is a genuine question, not an attempt at Emacs/Vi flamewars.
Context
I've used Haskell for a few months, and written a small ~10K LOC interpreter. In the past year, I've switched to Clojure. For quite a while, I struggled with Clojure's lack of types. Then I switched into using defrecords in Clojure, and now, switched to Clojure's defprotocols.
I really really like defprotocols. In fact, more than types.
I'm now at the point where for my Clojure functions, for it's documentation string, I just specify:
* the protocols of the inputs
* the protocols of the outputs
Using this, it appears I now have an ad-hoc type system (not compiler checked; but human checked).
Question
I suspect there's something about types that I'm missing. What does types provide over protocols?
Questioning the question...
Your question "What [do] types provide over protocols?" seems awkward to me. Types and protocols are perpendicular; They describe different things. Types/records define structure of data, while Protocols define the structure of some behavior or functionality. And part of why this question seems weird to me is that these things are not mutually exlusive! You can have types implement a protocol, thereby giving them whatever behaviour/functionality that protocol describes. In fact, since your context makes it clear that you have been using protocols, I have to wonder how you've been using them. My guess is that you've been using them with records (or possibly reifying them), but you could just as easily use protocols and (def)types together.
So to me, it seems you've compared apples with oranges here. To help clarify, let me compare apples to apples and oranges to oranges with a couple of different questions:
What problems do protocols solve, and what are the alternatives and their respective advantages/disadvantages?
Protocols let you define functions that operate in different ways on different types. The only other ways to do this are multimethods and simple function logic:
multimethods: have value in being extremely flexible. You can dispatch behavior on type by passing type as the dispatch function, but you can also use any other arbitrary function for dispatching.
internal function logic: You can also (of course) manually check for types in conditionals in your function definitions to decide how to process differently given different types. This is more primitive than multimethod dispatch, and also less extensible. Except in simple cases, multimethods are preferred.
Protocols have the advantage of being much more performant, being based on JVM class/method dispatch, which has been highly optimized. Additionally, protocols were designed to address the expression problem (great read), which makes them really powerful tools for crafting nice, modular, extensible APIs.
What are the advantages/disadvantages of (def)records or reify over (def)types?
On the side of how we specify the structure of data, we have a number of options available:
(def)records: produce a type good for "representing application domain information" (from http://clojure.org/datatypes; worth a read)
(def)types: produce a lighter weight type for creating "artifacts of the implementation/programming domain", such as the standard collection types
reify: construct a one-off object with an anonymous type implementing one or more protocols; good for... one-off things which need to implement a protocol(s)
Practically, records behave like clojure hash-maps, but have the added benefit of being able to implement protocols and have faster attribute lookup. Conveniently, the remain extensible via assoc, though attributes added in this fashion do not share the compiled lookup performance. This is what makes these constructs convenient for implementing applciation logic. Using deftype is advantageous for aspects of implementation/programming domain because they don't implement excess bagage, making the the use cleaner for these cases.
Protocols create interfaces and interfaces are a well, the interface to a type. they describe some aspects of a type though with much less rigor than you would come to expect in a language like Haskell.
machine checking
type inference (you don't get some of your protocols generated from docs of others)
parametric polymorphism (parameterised protocols / protocols with generics don't exist)
higher order protocols (what is the protocol for a function that returns a protocol?)
automatic generation of code / boilerplate
inter-operation with automated tools

How should I design a mechanism in C++ to manage relatively generic entities within a simulation?

I would like to start my question by stating that this is a C++ design question, more then anything, limiting the scope of the discussion to what is accomplishable in that language.
Let us pretend that I am working on a vehicle simulator that is intended to model modern highway systems. As part of this simulation, entities will be interacting with each other to avoid accidents, stop at stop lights and perhaps eventually even model traffic enforcement with radar guns and subsequent exciting high speed chases.
Being a spatial simulation written in C++, it seems like it would be ideal to start with some kind of Vehicle hierarchy, with cars and trucks deriving from some common base class. However, a common problem I have run in to is that such a hierarchy is usually very rigidly defined, and introducing unexpected changes - modeling a boat for instance - tends to introduce unexpected complexity that tends to grow over time into something quite unwieldy.
This simple aproach seems to suffer from a combinatoric explosion of classes. Imagine if I created a MoveOnWater interface and a MoveOnGround interface, and used them to define Car and Boat. Then lets say I add RadarEquipment. Now I have to do something like add the classes RadarBoat and RadarCar. Adding more capabilities using this approach and the whole thing rapidly becomes quite unreasonable.
One approach I have been investigating to address this inflexibility issue is to do away with the inheritance hierarchy all together. Instead of trying to come up with a type safe way to define everything that could ever be in this simulation, I defined one class - I will call it 'Entity' - and the capabilities that make up an entity - can it drive, can it fly, can it use radar - are all created as interfaces and added to a kind of capability list that the Entity class contains. At runtime, the proper capabilities are created and attached to the entity and functions that want to use these interfaced must first query the entity object and check for there existence. This approach seems to be the most obvious alternative, and is working well for the time being. I, however, worry about the maintenance issues that this approach will have. Effectively any arbitrary thing can be added, and there is no single location in which all possible capabilities are defined. Its not a problem currently, when the total number of things is quite small, but I worry that it might be a problem when someone else starts trying to use and modify the code.
As one potential alternative, I pondered using the template system to achieve type safe while keeping the same kind of flexibility. I imagine I could create entities that inherited whatever combination of interfaces I wanted. Using these objects would entail creating a template class or function that used any combination of the interfaces. One example might be the simple move on road using just the MoveOnRoad interface, whereas more complex logic, like a "high speed freeway chase", could use methods from both MoveOnRoad and Radar interfaces.
Of course making this approach usable mandates the use of boost concept check just to make debugging feasible. Also, this approach has the unfortunate side effect of making "optional" interfaces all but impossible. It is not simple to write a function that can have logic to do one thing if the entity has a RadarEquipment interface, and do something else if it doesn't. In this regard, type safety is somewhat of a curse. I think some trickery with boost any may be able to pull it off, but I haven't figured out how to make that work and it seems like way to much complexity for what I am trying to achieve.
Thus, we are left with the dynamic "list of capabilities" and achieving the goal of having decision logic that drives behavior based on what the entity is capable of becomes trivial.
Now, with that background in mind, I am open to any design gurus telling me where I err'd in my reasoning. I am eager to learn of a design pattern or idiom that is commonly used to address this issue, and the sort of tradeoffs I will have to make.
I also want to mention that I have been contemplating perhaps an even more random design. Even though I my gut tells me that this should be designed as a high performance C++ simulation, a part of me wants to do away with the Entity class and object-orientated foo all together and uses a relational model to define all of these entity states. My initial thought is to treat entities as an in memory database and use procedural query logic to read and write the various state information, with the necessary behavior logic that drives these queries written in C++. I am somewhat concerned about performance, although it would not surprise me if that was a non-issue. I am perhaps more concerned about what maintenance issues and additional complexity this would introduce, as opposed to the relatively simple list-of-capabilities approach.
Encapsulate what varies and Prefer object composition to inheritance, are the two OOAD principles at work here.
Check out the Bridge Design pattern. I visualize Vehicle abstraction as one thing that varies, and the other aspect that varies is the "Medium". Boat/Bus/Car are all Vehicle abstractions, while Water/Road/Rail are all Mediums.
I believe that in such a mechanism, there may be no need to maintain any capability. For example, if a Bus cannot move on Water, such a behavior can be modelled by a NOP behavior in the Vehicle Abstraction.
Use the Bridge pattern when
you want to avoid a permanent binding
between an abstraction and its
implementation. This might be the
case, for example, when the
implementation must be selected or
switched at run-time.
both the abstractions and their
implementations should be extensible
by subclassing. In this case, the
Bridge pattern lets you combine the
different abstractions and
implementations and extend them
independently.
changes in the implementation of an
abstraction should have no impact on
clients; that is, their code should
not have to be recompiled.
Now, with that background in mind, I am open to any design gurus telling me where I err'd in my reasoning.
You may be erring in using C++ to define a system for which you as yet have no need/no requirements:
This approach seems to be the most
obvious alternative, and is working
well for the time being. I, however,
worry about the maintenance issues
that this approach will have.
Effectively any arbitrary thing can be
added, and there is no single location
in which all possible capabilities are
defined. Its not a problem currently,
when the total number of things is
quite small, but I worry that it might
be a problem when someone else starts
trying to use and modify the code.
Maybe you should be considering principles like YAGNI as opposed to BDUF.
Some of my personal favourites are from Systemantics:
"15. A complex system that works is invariably found to have evolved from a simple system that works"
"16. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over, beginning with a working simple system."
You're also worring about performance, when you have no defined performance requirements, and no problems with performance:
I am somewhat concerned about
performance, although it would not
surprise me if that was a non-issue.
Also, I hope you know about double-dispatch, which might be useful for implementing anything-to-anything interactions (it's described in some detail in More Effective C++ by Scott Meyers).

What are the best resources for learning how to avoid side effects and state in OOP?

I've been playing with functional programming lately and there are pretty good treatments on the topic of side effects, why they should be contained, etc. In projects where OOP is used, I'm looking for some resources which lay out some strategies for minimizing side effect and/or state.
A good example of this is the book RESTful Web Services which gives you strategies for minimizing state in a web application. What others exist?
Remember I'm not looking for another OOP analysts/design patterns book (though good encapsulation and loose coupling help avoid side effects) but rather a resource where the topic itself is state/side effects.
Some compiled answers
OOP programmers who mostly care about state do so because of concurrency, so read Java Concurrency in Practice. [exactly what I was looking for]
Use TDD to make side effects more visible [I like it, example: the bigger your setUps are, the more state you need in place to run your tests = good warning]
Command-query separation [Good stuff, prevents the side effect of changing a function argument which is generally confusing]
Methods do only one thing, perhaps use descriptive names if they change the state of their object so it's simple and clear.
Make objects immutable [I really like this]
Pass values as parameters, instead of storing them in member variables. [I don't link this; it clutters up function prototype and is actively discouraged by Clean Code and other books, though I do admit it helps the state issue]
Recompute values instead of storing and updating them [I also really like this; in the apps I work on performance is a minor concern]
Similarly, don't copy state around if you can avoid it. Make one object responsible for keeping it and let others access it there. [Basic OOP principle, good advice]
I don't think you'll find a lot current material in the OO world on this topic, simply because OOP (and most imperative programming, for that matter) relies on state and side effects. Consider logging, for instance. It's pure side-effect, yet in any self-respecting J2EE app, it's everywhere. Hoare's original QuickSort relies on mutable state, since you have to swap values around a pivot, and yet it too is everywhere.
This is why many OO programmers have trouble wrapping their heads around functional programming paradigms. They try to reassign the value of "x," discover that it can't be done (at least not in the way it can in every other language they've worked in), and they throw up their hands and shout "This is impossible!" Eventually, if they're patient, they learn recursion and currying and how the map function replaces the need for loops, and they calm down. But the learning curve can be very steep for some.
The OO programmers these days who care most about avoiding state are those working on concurrency. The reasons for this are obvious -- mutable state and side effects cause huge headaches when you're trying to manage concurrency between threads. As a result, the best discussion I've seen in the OO world about avoiding state is Java Concurrency in Practice.
I think the rules are quite simple: methods should only ever do one thing, and the intent should be communicated clearly in the method name.
Methods should either query or change data, but never both.
Some small things I do:
Prefer immutable state, it is relatively benign. E.g. in Java I make member variables final and set them in the constructor wherever possible.
Pass around values as parameters, instead of storing them in member variables.
Recompute values instead of storing and updating them, if that can be done cheaply enough. This helps to avoid inconsistent data by forgetting to update it.
Similarly, don't copy state around if you can avoid it. Make one object responsible for keeping it and let others access it there.
One way to isolate side-effects in OO is to let operations only return a description object of the side-effects to cause.
Command-query separation is a pattern that is close to this idea.
By practising TDD (or at least writing unit tests) one will typically be much more aware of side-effects and use them more sparingly, and also separate them from other side-effect free expressions that are easy to write data driven (expected, actual) unit-tests for.