I need to do a big trick and am keen on hearing your suggestions.
What I need is a macro that takes ordinary Clojure code peppered with a special "await" form. The await forms contain only Clojure code and are supposed to return that code's return value. What I want is that when I run whatever this macro produces, it stops executing when the first "await" form is due for evaluation.
Then, it should dump all the variables defined in its scope so far to the database (I will ignore the problem that not all Clojure types can be serialised to EDN, e.g. functions can't), together with some marker of the place it has stopped in.
Then, if I want to run this code again (possibly on a different machine, another day) - it will read its state from the DB and continue where it stopped.
Therefore I could have, for example:
(defexecutor my-executor
  (let [x 7
        y (await (+ 3 x))]
    (if (await (> y x))
      "yes"
      "no")))
Now, when I do:
(my-executor db-conn "unique-job-id")
the first time I should get a special return value, something like
:deferred
The second call should return :deferred as well; only the third call should return the real value.
The question I have is not how to write such executor, but rather how to gather information from within the macro about all the declared variables to be able to store them. Later I also want to re-establish them when I continue execution. The await forms can be nested, of course :)
I had a peek into core.async source code because it is doing a similar thing inside, but what I have found there made me shiver - it seems they employ the Clojure AST analyser to get this info. Is this really so complex? I know of &env variable inside a macro, but do not have any idea how to use it in this situation. Any help would be appreciated.
And one more thing. Please do not ask me why I need this or that there is a different way of solving a problem - I want this specific solution.
> I will ignore the problem that not all Clojure types can be serialised to EDN, e.g. functions can't
If you ignore this, it will be very restrictive for the kinds of Clojure expressions you can handle. Functions are everywhere, e.g. in the implementation of things like doseq and for. Likewise, a lot of interesting programs will depend on some Java object like a file handle or whatever.
> The question I have is not how to write such executor, but rather how to gather information from within the macro about all the declared variables to be able to store them.
If you manage to write such an executor, I suspect its implementation will need to know about local variables anyway. So you can put off this question until you are done implementing your executor - you will probably find it obsolete, if you can implement your executor.
> I had a peek into core.async source code because it is doing a similar thing inside, but what I have found there made me shiver - it seems they employ the Clojure AST analyser to get this info. Is this really so complex?
Yes, this is very intrusive. You are basically writing a compiler. Thank your lucky stars they wrote the analyzer for you already, instead of having to analyze expressions yourself.
> I know of &env variable inside a macro, but do not have any idea how to use it in this situation.
This is the easy part. If you like, you can write a simple macro that gives you all the locals in scope. This question has been asked and answered before, e.g. in Clojure get local lets.
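As a minimal sketch of that "easy part" (the macro name `locals` is made up for illustration, and this is not how core.async does it): inside a macro, `&env` is a map whose keys are the symbols of the locals in scope at the call site, so a macro can expand into a map from names to their current values.

```clojure
;; Hypothetical helper: expands into a map of {:name value} for every
;; local binding visible where the macro is called.
(defmacro locals
  []
  (let [syms (keys &env)]            ; symbols of the locals in scope
    (zipmap (map keyword syms) syms)))

;; Usage: capture the locals of a let block.
(def snapshot
  (let [x 7
        y (+ 3 x)]
    (locals)))
```

Here `snapshot` holds the values of `x` and `y` keyed by `:x` and `:y`. Note that destructuring and named functions can put extra compiler-generated symbols into `&env`, so a real implementation would probably need to filter the keys.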
> And one more thing. Please do not ask me why I need this or that there is a different way of solving a problem - I want this specific solution.
This is generally an unproductive attitude when asking a question. It's admitting you're posing an XY problem, and still refusing to tell anyone what the Y is.
Backstory: I've built a lot of large and relatively complex projects in Java and have a lot of experience in embedded C programming. I've become acquainted with Scheme and CL syntax and have written some simple programs in Racket.
Question: I've planned a rather big project and want to do it in Racket. I've heard a lot of "if you 'get' Lisp, you will become a better programmer", etc. But every time I try to plan or write a program, I still "decompose" the task into familiar stateful objects with interfaces.
Are there "design patterns" for Lisp? How do you "get" the Lisp-family "mojo"? How do you escape the object-oriented constraint on your thinking? How do you apply functional programming ideas boosted by powerful macro facilities? I tried studying the source code of big projects on GitHub (Light Table, for instance) and ended up more confused than enlightened.
EDIT1 (less ambiguous questions): is there good literature on the topic that you can recommend, or are there good open source projects written in CL/Scheme/Clojure that are of high quality and can serve as a good example?
A number of "paradigms" have come into fashion over the years:
structured programming, object oriented, functional, etc. More will come.
Even after a paradigm falls out of fashion, it can still be good at solving the particular problems that first made it popular.
So for example using OOP for a GUI is still natural. (Most GUI frameworks have a bunch of states modified by messages/events.)
Racket is multi-paradigm. It has a class system. I rarely use it,
but it's available when an OO approach makes sense for the problem.
Common Lisp has multimethods and CLOS. Clojure has multimethods and Java class interop.
And anyway, basic stateful OOP ~= mutating a variable in a closure:
#lang racket
;; My First Little Object
(define obj
  (let ([val #f])
    (match-lambda*
      [(list) val]
      [(list 'double) (set! val (* 2 val))]
      [(list v) (set! val v)])))
obj           ;#<procedure:obj>
(obj)         ;#f
(obj 42)
(obj)         ;42
(obj 'double)
(obj)         ;84
Is this a great object system? No. But it helps you see that the essence of OOP is encapsulating state with functions that modify it. And you can do this in Lisp, easily.
What I'm getting at: I don't think using Lisp is about being "anti-OOP" or "pro-functional". Instead, it's a great way to play with (and use in production) the basic building blocks of programming. You can explore different paradigms. You can experiment with ideas like "code is data and vice versa".
I don't see Lisp as some sort of spiritual experience. At most, it's like Zen, and satori is the realization that all of these paradigms are just different sides of the same coin. They're all wonderful, and they all suck. The paradigm pointing at the solution, is not the solution. Blah blah blah. :)
My practical advice is, it sounds like you want to round out your experience with functional programming. If you must do this the first time on a big project, that's challenging. But in that case, try to break your program into pieces that "maintain state" vs. "calculate things". The latter are where you can try to focus on "being more functional". Look for opportunities to write pure functions. Chain them together. Learn how to use higher-order functions. And finally, connect them to the rest of your application -- which can continue to be stateful and OOP and imperative. That's OK, for now, and maybe forever.
A way to compare programming in OO vs Lisp (and "functional" programming in general) is to look at what each "paradigm" enables for the programmer.
One viewpoint in this line of reasoning, which looks at representations of data, is that the OO style makes it easier to extend data representations, but makes it more difficult to add operations on data. In contrast, the functional style makes it easier to add operations but harder to add new data representations.
Concretely, if there is a Printer interface, with OO, it's very easy to add a new HPPrinter class that implements the interface, but if you want to add a new method to an existing interface, you must edit every existing class that implements the interface, which is more difficult and may be impossible if the class definitions are hidden in a library.
In contrast, with the functional style, functions (instead of classes) are the unit of code, so one can easily add a new operation (just write a function). However, each function is responsible for dispatching according to the kind of input, so adding a new data representation requires editing all existing functions that operate on that kind of data.
Determining which style is more appropriate for your domain depends on whether you are more likely to add representations or operations.
This is a high-level generalization of course, and each style has developed solutions to cope with the tradeoffs mentioned (eg mixins for OO), but I think it still holds to a large degree.
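To make the functional half of that tradeoff concrete, here is a small sketch in Clojure. The shape maps and the `:kind` tag are assumptions made up for this illustration. Each operation is a plain function that dispatches on the kind of data it receives.

```clojure
;; Data representations are plain maps tagged with :kind (hypothetical shapes).
(defn area [shape]
  (case (:kind shape)
    :circle (* Math/PI (:r shape) (:r shape))
    :rect   (* (:w shape) (:h shape))))

;; Adding an operation is easy: write one more function,
;; without touching area or the data definitions.
(defn perimeter [shape]
  (case (:kind shape)
    :circle (* 2 Math/PI (:r shape))
    :rect   (* 2 (+ (:w shape) (:h shape)))))
```

Adding a new representation, say `:triangle`, is the harder direction: both `area` and `perimeter` would need a new `case` branch, which is exactly the cost described above.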
Here is a well-known academic paper that captured the idea 25 years ago.
Here are some notes from a recent course (I taught) describing the same philosophy.
(Note that the course follows the How to Design Programs curriculum, which initially emphasizes the functional approach, but later transitions to the OO style.)
edit: Of course this only answers part of your question and does not address the (more or less orthogonal) topic of macros. For that I refer to Greg Hendershott's excellent tutorial.
A personal view:
If you parameterise an object design in the names of the classes and their methods - as you might do with C++ templates - then you end up with something that looks quite like a functional design. In other words, functional programming does not make useless distinctions between similar structures because their parts go by different names.
My exposure has been to Clojure, which tries to steal the good bit from object programming
working to interfaces
while discarding the dodgy and useless bits
concrete inheritance
traditional data hiding.
Opinions vary about how successful this programme has been.
Since Clojure is expressed in Java (or some equivalent), not only can objects do what functions can do, there is a regular mapping from one to the other.
So where can any functional advantage lie? I'd say expressiveness. There are lots of repetitive things you do in programs that are not worth capturing in Java - who used lambdas before Java provided compact syntax for them? Yet the mechanism was always there.
And Lisps have macros, which have the effect of making all structures first class. And there's a synergy between these aspects that you will enjoy.
The "Gang of 4" design patterns apply to the Lisp family just as much as they do to other languages. I use CL, so this is more of a CL perspective/commentary.
Here's the difference: Think in terms of methods that operate on families of types. That's what defgeneric and defmethod are all about. You should use defstruct and defclass as containers for your data, keeping in mind that all you really get are accessors to the data. defmethod is basically your usual class method (more or less) from the perspective of an operator on a group of classes or types (multiple inheritance.)
You'll find that you'll use defun and define a lot. That's normal. When you do see commonality in parameter lists and associated types, then you'll optimize using defgeneric/defmethod. (Look for CL quadtree code on github, for an example.)
Macros: Useful when you need to glue code around a set of forms. Like when you need to ensure that resources are reclaimed (closing files) or the C++ "protocol" style using protected virtual methods to ensure specific pre- and post-processing.
And, finally, don't hesitate to return a lambda to encapsulate internal machinery. That's probably the best way to implement an iterator ("let over lambda" style.)
Hope this gets you started.
Quite often, I swap! an atom value using an anonymous function that uses one or more external values in calculating the new value. There are two ways to do this, one with what I understand is a closure and one not, and my question is which is the better / more efficient way to do it?
Here's a simple made-up example -- adding a variable numeric value to an atom -- showing both approaches:
(def my-atom (atom 0))
(defn add-val-with-closure [n]
  (swap! my-atom
         (fn [curr-val]
           ;; we pull 'n' from outside the scope of the function
           ;; asking the compiler to do some magic to make this work
           (+ curr-val n))))
(defn add-val-no-closure [n]
  (swap! my-atom
         (fn [curr-val val-to-add]
           ;; we bring 'n' into the scope of the function as the second function parameter
           ;; so no closure is needed
           (+ curr-val val-to-add))
         n))
This is a made-up example, and of course, you wouldn't actually write this code to solve this specific problem, because:
(swap! my-atom + n)
does the same thing without any need for an additional function.
But in more complicated cases you do need a function, and then the question arises. For me, the two ways of solving the problem are of about equal complexity from a coding perspective. If that's the case, which should I prefer? My working assumption is that the non-closure method is the better one (because it's simpler for the compiler to implement).
There's a third way to solve the problem, which is not to use an anonymous function. If you use a separate named function, then you can't use a closure and the question doesn't arise. But inlining an anonymous function often makes for more readable code, and I'd like to leave that pattern in my toolkit.
Thanks!
edit in response to A. Webb's answer below (this was too long to put into a comment):
My use of the word "efficiency" in the question was misleading. Better words might have been "elegance" or "simplicity."
One of the things that I like about Clojure is that while you can write code to execute any particular algorithm faster in other languages, if you write idiomatic Clojure code it's going to be decently fast, and it's going to be simple, elegant, and maintainable. As the problems you're trying to solve get more complex, the simplicity, elegance and maintainability get more and more important. IMO, Clojure is the most "efficient" tool in this sense for solving a whole range of complex problems.
My question was really -- given that there are two ways that I can solve this problem, what's the more idiomatic and Clojure-esque way of doing it? For me when I ask that question, how 'fast' the two approaches are is one consideration. It's not the most important one, but I still think it's a legitimate consideration if this is a common pattern and the different approaches are a wash from other perspectives. I take A. Webb's answer below to be, "Whoa! Pull back from the weeds! The compiler will handle either approach just fine, and the relative efficiency of each approach is anyway unknowable without getting deeper into the weeds of target platforms and the like. So take your hint from the name of the language and when it makes sense to do so, use closures."
closing edit on April 10, 2014
I'm going to mark A. Webb's answer as accepted, although I'm really accepting A. Webb's answer and omiel's answer -- unfortunately I can't accept them both, and adding my own answer that rolls them up seems just a bit gratuitous.
One of the many things that I love about Clojure is the community of people who work together on it. Learning a computer language doesn't just mean learning code syntax -- more fundamentally it means learning patterns of thinking about and understanding problems. Clojure, and Lisp behind it, has an incredibly powerful set of such patterns. For example, homoiconicity ("code as data") means that you can dynamically generate code at compile time using macros, or destructuring allows you to concisely and readably unpack complex data structures. None of the patterns are unique to Clojure, but Clojure brings them all together in ways that make solving problems a joy. And the only way to learn those patterns is from people who know and use them already. When I first picked Clojure more than a year ago, one of the reasons that I picked it over Scala and other contenders was the reputation of the Clojure community for being helpful and constructive. And I haven't been disappointed -- this exchange around my question, like so many others on StackOverflow and elsewhere, shows how willing the community is to help a newcomer like me -- thank you!
After you figure out the implementation details of the current compiler version for the current version of your current target host, then you'll have to start worrying about the optimizer and the JIT and then the target computer's processors.
You are too deep in the weeds, turn back to the main path.
Closing over free variables when applicable is the natural thing to do and an extremely important idiom. You may assume a language named Clojure has good support for closures.
I prefer the first approach as being simpler (as long as the closure is simple) and somewhat easier to read. I often struggle reading code where an anonymous function is immediately called with parameters; I have to resort to counting parentheses to be sure of what's happening, and I feel that's not a good thing.
I think the only way it could be the wrong thing to do is if the closure closes over a value that shouldn't be captured, like the head of a long lazy sequence.
The question of how to understand a large code base has been well answered before. But I feel I should ask it again to address the specific problems I have been facing.
I have just started a student job. I am a beginner programmer and only learned about classes two months ago. At the job, though, I have been handed code that is part of a large piece of software. I understand what that code is supposed to do (read a file). But after spending a few weeks trying to understand the code and modify it to achieve our desired results, I have come to the conclusion that I need to understand each line of it. The code is about 1300 lines.
Now when I start reading the code, I find that, for example, a variable is defined as:
VarType VarName
Now VarType is not a built-in type like int or float. It is a user-defined type, so I have to go to the class to see what this type is.
In the next line, I see a function being called, like points.interpolate(x);
Now I have to go into another class and see what the interpolate function does.
This happens a lot, which means that even to understand a small part of the code, I have to go to 3 or 4 different classes and keep them all in mind at once without losing sight of the main objective, and that is tough.
I may not be a skilled programmer, but I want to be able to do this. Can I have some suggestions on how I should approach this?
Also (I will sound really stupid when I ask this) what is a debugger? I hope this gives you an idea of where I stand (and the need to ask this question again). :(
With any luck, those functions and classes should have at least some documentation to describe what they do. You do not need to know how they work to understand what they do. When you see the use of interpolate, don't start looking at how it works, otherwise you end up in a deep depth-first search through the code base. Instead, read its documentation, and that should tell you everything you need to know to understand the code that uses it.
If there is no documentation, I feel for you. I can suggest two tips:
Make general assumptions about what a function or class will do from its name, return type and arguments and the surrounding code that uses it until something happens that contradicts those assumptions. I can make a pretty good guess about what interpolate does without reading how it works. This only works when the names of the functions or classes are sufficiently self-documenting.
If you need a deep understanding of how some code works, start from the bottom and work upwards. Doing this means that you won't end up having to remember where you were in some high level code as you search through the code base. Get a good understanding of the low level fundamental classes before you attempt to understand the high level application of those types.
This also means that you will understand the functions and classes in a generic sense, rather than in the context of the code that led you to them. When you find points.interpolate(x), instead of wondering what interpolate does to these specific points with this specific x argument, find out what it does in general. Later, you will be able to apply your new-found knowledge to any code that uses the same function.
Nonetheless, I wouldn't worry about 1300 lines of code. That's basically a small project. It's only larger than examples and college assignments. If you take these tips into account, that amount of code should be easily manageable.
A debugger is a program that helps you debug your code. Common features of debuggers allow you to step through your code line-by-line and watch as the values of variables change. You can also set up breakpoints in your code that are of interest and the debugger will let you know when it's hit them. Some debuggers even let you change code while executing. There are many different debuggers that all have different sets of features.
Try making assumptions about what the code does based on its name. For example, assume that the interpolate function correctly interpolates your point; only go digging in that bit of code if the output looks suspicious.
First, consider getting an editor/IDE that has the following features:
parens/brackets/braces matching
collapsing/uncollapsing of blocks of code between curly braces
type highlighting (in tooltips)
macro expansion (in tooltips or in a separate window/panel)
function prototype expansion (in tooltips or in a separate window/panel)
quick navigation to types, functions and classes and back
opening the same file in multiple windows/panels at different positions
search for all mentions/uses of a specific type, variable, function or class and presentation of that as a list
call tree/graph construction/navigation
regex search in addition to simple search
bookmarks?
Source Insight is one such tool. There must be others.
Second, consider annotating the code as you go through it. While doing this, note (write down) the following:
invariants (what's always true or must always be true)
assumptions (what may not be true, e.g. missing checks/validations or unwarranted expectations), think "what if"
objectives (the what) of a piece of code
peculiarities/details of implementation (the how; e.g. whether exceptions are thrown and which, which error codes are returned and when)
a simplified call tree/graph to see the code flow
do the same for data flow
Draw diagrams (in ASCII or on paper/board); I sometimes photograph my papers or the board. Specifically, draw block diagrams and state machines.
Work with code at different levels of abstraction/detail. Zoom in to see the details, zoom out to see the structure. Collapse/uncollapse blocks of code and branches of the call tree/graph.
Also, have a checklist of what you are going to do. Check the items you've done. Add more as necessary. Assign priorities to work items, if it's appropriate.
A debugger is a program that lets you execute your program step by step and examine its state (variables). It also lets you modify the state and that may be useful at times too.
You may use a debugger to understand your code if you're not very well familiar with it or with the programming language.
Another thing that may come in handy is writing tests or input data test sets for your program. They may reveal problems and limitations in terms of logic and performance.
Also, don't neglect documentation and people! If there's something or someone that can give you more information about the project/code, use that something or someone. Ask for advice.
I know this sounds like a lot, but you'll end up doing some of this at some point anyway. Just wait for a big enough project. :)
You mainly need to understand, first, what a called function does, and then what its inputs and outputs are. Only if you really need to know how, for example, interpolate is implemented should you go into the details. Usually function names are self-explanatory: if the code is well written, you can get a feeling for what a function does from its name.
Another thing you may want to try is running some toy examples through the code; a debugger or IDE can help you navigate it. Understanding large-scale code takes time and experience, so be patient.
"Try the Debugger Approach"
[Update: A debugger is a special program that lets you pause a running program to examine its state (variable values, which function is running, which function called it, etc.).]
The way I do it is by step-debugging the code for the use case I want to understand.
If you are using an advanced/modern IDE, then setting breakpoints at the entry point (like main() or a point of interest) is fairly easy. From there, just step into the function you want to examine, or step over it.
To give you a step-by-step approach:
Set a breakpoint at the first expression of the main() method (the entry point).
Run the program with debugging active.
The program will stop at the breakpoint.
Now step over until you come across a function/expression that seems interesting (say, your points.interpolate(x); call).
Step into the function and examine the program state, such as the variables and the function stack, live.
Avoid complex system libraries; just step over/step out. (Example: avoid something like MathLib.boringComputation().)
Repeat until the program exits.
I have found that this way of learning is very fast and gives you a quick understanding of any complex/large piece of software.
Use Eclipse, or if you can't, try GDB if it's C/C++. Every popular programming language has a decent debugger.
Understanding the basic debugging operations will be a benefit:
Setting-up a breakpoint.
Stopping at a breakpoint.
Examine/Watch Variables.
Examine Function Stack (the hierarchy of function calls)
Single-Step - Stepping to next Line in Code.
Step-Into a function.
Step-Out of a function.
Step-over a function.
Jumping to the next breakpoint (point of interest).
Hope it helps!
Many great answers have already been given. I thought I would add my perspective as a former student (not too long ago) and what I learned that helped me understand code. This particularly helped me because I began a project to convert a database I wrote in Java many years ago to C++.
1. **Code Reading** - Do not underestimate this important task. The ability to write code does not always translate into the ability to read it -- and reading it can be more frustrating than writing it.
Take your time and carefully work out what each line of the code does. This will help you avoid making assumptions, except where you come across code you are already familiar with and can skim over.
2. Don't hesitate to follow references, locate declarations, and uncover definitions of the code elements you are reading. What you learn about how a particular variable, method call, or class is defined all contributes to learning and ultimately to your being able to perform your task.
This is particularly important because effective detective work is an essential part of being able to understand the small parts of the code, so that you can later grasp the larger parts with less difficulty.
Others have already posted information about what a debugger is and you will find it is an invaluable asset at tracking down code errors and, I think, helps with code reading, knowledge gain, and understanding so you can be a successful programmer.
Here is a link to a debugger tutorial that uses Visual Studio; it may give you a good understanding of at least the process at hand.
A coworker and I are Clojure newbies. We started a project a couple months back, but quickly found that we had a tough time dealing with our code base -- by 500 LOC we basically had no idea where to start with the debugging, when things went wrong (which was often). Instead of pairs, functions were getting lists, or numbers, or what-have-you.
Now we're starting a new but related project and migrating a lot of the old code over. But we're again hitting a wall.
We're wondering, how do we effectively manage a Clojure project, especially as we make changes to existing code?
What we've come up with:
liberal use of unit-tests
liberal use of pre-, post-conditions
informal type declarations in function comments
use defrecord/defstruct/defprotocol to implement a data model, which would really simplify testing
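For readers who have not used them, pre- and post-conditions in Clojure are a map placed right after the argument vector. The sketch below is only an illustration; the function `midpoint` and its "pair of numbers" contract are invented here to mirror the "pairs vs. lists" problem described above.

```clojure
;; Hypothetical example: midpoint expects a pair, i.e. a 2-element
;; vector of numbers, and promises to return a number.
(defn midpoint
  [pair]
  {:pre  [(vector? pair) (= 2 (count pair)) (every? number? pair)]
   :post [(number? %)]}
  (/ (+ (first pair) (second pair)) 2))

;; Passing a list instead of a pair fails fast with an AssertionError
;; at the function boundary, rather than somewhere deeper in the code.
(defn rejects-bad-input? []
  (try
    (midpoint '(1 2 3))
    false
    (catch AssertionError _ true)))
```

These checks run only while `*assert*` is true, which is the default.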
But post-, pre-conditions seem not to be used very often. Unit-testing + comments will only help so much. And it seems like Clojure programmers don't typically implement formal data models.
Do we just not get Clojure? How do Clojure programmers know that their code is robust and correct?
I think this is actually an evolving area - Clojure hasn't really been around long enough for all of the best practices and associated tools for managing a large code base to be developed yet.
Some suggestions from my experience:
Structure your code in a "bottom up" way - in general, you want the "utility" code at the top of the file (or imported from another namespace) and the "business logic" code that uses these utility functions towards the end of the file. If this seems difficult to do, then it's probably a hint that your code needs some refactoring.
Tests as examples - Test code in Clojure works very well both as a sanity check for your code and as documentation (e.g. "what kind of parameter is this function expecting?"). If you hit a bug, refer to your tests to check your assumptions and write a couple of new tests to flush out what is going wrong.
Keep functions simple and compose them - Kind of an extension of the "single responsibility principle" to functional programming. I consider more than 5-10 lines in a Clojure function as a major code smell (if this seems extreme, just remember that you can probably achieve as much in 5-10 lines of Clojure as you could with 50-100 lines of Java/C#)
Watch out for "imperative habits" - when I first started using Clojure, I wrote a lot of pseudo-imperative code in Clojure. An example would be emulating a for loop with "dotimes" and accumulating some result within an atom. This can be painful - it's not idiomatic, it's confusing and usually there is a much smarter, simpler and less error-prone functional way of doing it. This takes practice, but it is worth it in the long run...
Debug at the REPL - usually when I hit an issue, coding at the REPL is the easiest way to flush it out. Generally this means running some specific parts of the larger algorithm to check assumptions etc.
Refactor common utility functions out - you'll probably find a bunch of common code or structure repeated in many functions. It is well worth pulling this out into a function or macro that you can re-use in other places or projects - that way you can test it much more rigorously and have the benefits in multiple places. Bonus points if you can get it all the way upstream into Clojure itself! If you do this well enough, then your main code base will be extremely succinct and therefore easy to manage, containing nothing but the genuinely domain-specific code.
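The "imperative habits" point in the list above can be sketched as follows. Both function names are made up for illustration; the first version emulates a for loop with `dotimes` and an atom, the second expresses the same computation functionally.

```clojure
;; Pseudo-imperative style: loop with dotimes and accumulate in an atom.
(defn sum-squares-imperative [n]
  (let [acc (atom 0)]
    (dotimes [i n]
      (swap! acc + (* i i)))
    @acc))

;; Functional style: describe the values, then reduce them.
(defn sum-squares [n]
  (reduce + (map #(* % %) (range n))))
```

Both return the sum of the squares of 0 up to n - 1, but the second has no mutable state to reason about and can be tested and composed like any other pure function.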
simple composable abstractions
"It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures." - Alan J. Perlis
For me it's all about composing simple functions. Try to break every function down into the smallest units you can, and then have another function that composes them to do the work you need. You know you are in good shape if every function can be tested independently. If you go too heavy on the macros, it can make this step harder, because macros compose differently.
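A small sketch of what this can look like in practice. The function names and the comma-separated input format are assumptions for illustration only. Each piece does one thing and can be tested on its own, and `comp` glues them together.

```clojure
(require '[clojure.string :as str])

;; Three tiny, independently testable steps (hypothetical names).
(defn parse-line [line] (str/split line #","))
(defn trim-all   [fields] (map str/trim fields))
(defn non-blank  [fields] (remove str/blank? fields))

;; The composed function: comp applies right to left, so the line is
;; parsed first, then trimmed, then filtered.
(def clean-fields (comp non-blank trim-all parse-line))
```

For example, `(clean-fields "a, b, ,c")` yields the three fields "a", "b" and "c".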
D.R.Y, Seriously, just don't repeat yourself
I start with well-decomposed functions in a bunch of namespaces; every time I need one of the composable parts somewhere else, I "hoist" that function up to a library included by both namespaces. This way your commonly used abstractions evolve over the course of the project into "just enough framework". It is very difficult to do this unless you really have discrete composable abstractions.
Sorry to dig up this old question, the answers by mikera and Arthur are excellent, but it's something I've also wondered about as I've been learning Clojure, and thought I'd mention how we organise files.
In a similar vein to ensuring each function has a single job, we group related functions into namespaces to make it easier to navigate the code. So we might have a namespace for functions providing access to a particular database, or providing a collection of HTTP-related utilities. This keeps each file relatively small, and makes tests easier to find. It also makes refactoring much more straightforward. This is hardly anything new, but it's worth bearing in mind.