Common problems with Clojure multi-methods and protocols? - clojure

I am asking this question as I am starting to really use multimethods and protocols alot, but in doing so I'm also wondering if I'm making my code too un-maintainable. For example in the good old (or bad old :) OO days I would know where to find everything related to a particulat type, which would mean that all interfaces and methods would be in the same source file, but now they can be spread out all over the place. Any experiences on this?

It is true that everything can be scattered in different places if you are not forced to organize code in certain ways, like Java forces you.
It's completly up to you as a developer to organize code in logical units so it could be easier to find them and keep in mind that With great power comes great responsibility.
The more you work in functional style, you'll find better ways to organize your code, the key is that you are not afraid of refactoring. Besides M-. in Emacs/Slime will bring you to the definition of symbol wherever you are. I suppose other Clojure IDE plugins have a similar feature.

Related

How do I effectively manage a Clojure code base?

A coworker and I are Clojure newbies. We started a project a couple months back, but quickly found that we had a tough time dealing with our code base -- by 500 LOC we basically had no idea where to start with the debugging, when things went wrong (which was often). Instead of pairs, functions were getting lists, or numbers, or what-have-you.
Now we're starting a new but related project and migrating a lot of the old code over. But we're again hitting a wall.
We're wondering, how do we effectively manage a Clojure project, especially as we make changes to existing code?
What we've come up with:
liberal use of unit-tests
liberal use of pre-, post-conditions
informal type declarations in function comments
use defrecord/defstruct/defprotocol to implement a data model, which would really simplify testing
But post-, pre-conditions seem not to be used very often. Unit-testing + comments will only help so much. And it seems like Clojure programmers don't typically implement formal data models.
Do we just not get Clojure? How do Clojure programmers know that their code is robust and correct?
I think this is actually an evolving area - Clojure hasn't really been around long enough for all of the best practices and associated tools for managing a large code base to be developed yet.
Some suggestions from my experience:
Structure your code in a "bottom up" way - in general, the way you want to structure you code will have the "utility" code at the top of the file (or imported from another namespace) and the "business logic" code that uses these utility functions towards the end of the file. If this seems difficult to do, then it's probably a hint that your code needs some refactoring.
Tests as examples - Test code in clojure works very well both to sanity check your code but also as documentation (e.g. "what kind of parameter is this function expecting?"). If you hit a bug, refer to your tests to check your assumptions and write a couple of new tests to flush out what is going wrong.
Keep functions simple and compose them - Kind of an extension of the "single responsibility principle" to functional programming. I consider more than 5-10 lines in a Clojure function as a major code smell (if this seems extreme, just remember that you can probably achieve as much in 5-10 lines of Clojure as you could with 50-100 lines of Java/C#)
Watch out for "imperative habits" - when I first started using Clojure, I wrote a lot of pseudo-imperative code in Clojure. An example would be emulating a for loop with "dotimes" and accumulating some result within an atom. This can be painful - it's not idiomatic, it's confusing and usually there is a much smarter, simpler and less error-prone functional way of doing it. This takes practice, but it is worth it in the long run...
Debug at the REPL - usually when I hit an issue, coding at the REPL is the easiest way to flush it out. Generally this means running some specific parts of the larger algorithm to check assumptions etc.
Refactor common utility functions out - you'll probably find a bunch of common or structure repeated in many functions. Well worth pulling this out into a function or macro that you can re-use in other places or projects - that way you can test it much more rigorously and have the benefits in multiple places. Bonus points if you can get it all the way upstream into Clojure itself! If you do this well enough, then your main code base will be extremely succinct and therefore easy to manage, containing nothing but the genuinely domain-specific code.
simple composable abstractions
"It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures." - Alan J. Perlis
For me its all about composing simple functions. Try to break every function down into the smallest units you can and then have another function that composes them to do the work your need. You know you are in good shape is every function can be tested independently. If you go too heavy on the macroes then it can make this step harder because macroes compose differently.
D.R.Y, Seriously, just don't repeat yourself
starting with well decomposed functions in a a bunch of namespaces; every time I need one of the composable parts somewhere else I "hoist" that function up to a library included by both namespaces. This way your commonly used abstractions sort of evolve over the course of the project into "just enough framework". It is very difficult to do this unless you really have discrete composable abstractions.
Sorry to dig up this old question, the answers by mikera and Arthur are excellent, but it's something I've also wondered about as I've been learning Clojure, and thought I'd mention how we organise files.
In a similar vein to ensuring each function has a single job, we group related functions into namespaces to make it easier to navigate the code. So we might have a namespace for functions providing access to a particular database, or providing a collection of HTTP-related utilities. This keeps each file relatively small, and makes tests easier to find. It also makes refactoring much more straightforward. This is hardly anything new, but it's worth bearing in mind.

Learning Clojure by reading core.clj

I came across the tweet today:
Start each day by reading the implementation of a function or macro in Clojure's core.clj.
My Clojure knowledge is really basic, I can hardly read other's Clojure (or Lisp) code.
Can I do well with core.clj, especially I have the feeling it is full of complicated macros?
I think a better place to start is by doing a project; anything that interests you and seems manageable is good.
core.clj is not readable right now; perhaps the latter half is, but the first half isn't something I'd wish on anyone as an introduction to the language. The truth is, even if you read it very carefully, you'd not have a solid idea of what was going on without also reading a lot of Java code, too.
Make an asynchronous text-based game (technomancy—Phil Hagelburg—has a nice one to look through on his Github, though it's a little dated by now)
Scrape websites using the Enlive library.
Maybe just solve some math problems, and/or
Graph things using Incanter.
Build first. Once you acquaint yourself with the tools you are using, start reading them. The libraries mentioned here are well written (you can't go wrong with anything by Christophe Grand, for instance), and once you start using them, you'll understand what they do, which makes it much simpler to figure out the why and how later.
That tweet is probably a great idea once you have enough experience to be able to read Clojure code well enough to understand what is going on.
Before that point, I'd recommend gaining a strong familiarity with the language by writing a lot of small mini-programs. It's best to learn by doing, after all.
I personally found Project Euler very useful while I was learning Clojure.
You might also take a look at the Clojure Koans, though they may be a bit beginner-oriented for you, depending on how much Clojure you already know.
As a good starting point, I would recommend 4Clojure. It looks similar to Project Euler and Code Katas, but has more forgiving learning curve.
There's a lot I don't understand in it, but there's also a lot I do, and it's fascinating looking at the language getting bootstrapped from its minimal initial implementation.
Another place to look for little projects to get you going would be any of the Kata repos. My personal favorite at this point is Coding Kata.org, but there's also Coding Dojo, Ruby Koans (with a little translation effort), and of course Google.
Try to do one of those a day and you'll quickly pick up the language.
Try reading the Clojure contrib libraries or look at some Clojure projects on github to get a better feel for idiomatic Clojure code.
The advice above is good. Another fine example of learning-by-doing is well illustrated here by the creator of Ruby on Rails: How Do I Learn to Program? I hope you have an app that you care about that you could just go for in Clojure.
I find the clojure.core a good way to find ideas of how things are implemented, as more of a reference guide. As noted earlier, the early parts or neither pretty nor exemplary. However, the later parts can be a good example especially when needing to write a function that is close to core function but different.

Is it possible for programmer to analyze unknown code fast?

I got a task related to ANCIENT C++ project which hasn't any documentation, comments at all and all code/variables is written in foreign language. Do I have a chance to analyze this code in a 1 working day and make a design/UML to create new features? I have been sitting around for 3 hours already and I feel so frustrated... Maybe somebody also had same problem? Any advice?
BR,
I suspect the biggest issue may be the fact that it's in a foreign language. You can use various static code analysis tools to try and understand what's going on, but if everything is presented in an unfamiliar language then that's still no use. Your first step (I believe) is to find someone who can speak this language and get them to translate as you go...
1) Use Doxygen , You can configure doxygen to extract the code structure from undocumented source files.
2) Use source Insight, Source Insight is an advanced code editor and browser with built-in analysis for C/C++, C#, and Java programs
Short answer, no - you probably don't have a chance to understand the code in one day. Reading/maintaining code is one of the hardest things to do, especially when it's lacking documentation. The fact that the code is in a foreign language (!) makes it even harder.
Sounds like you are on a very restricted (unrealistic) time-budget, but Working With Legacy Software is a good book if you're working with legacy systems. If you are planning to keep adding new features to the legacy system it's your responsibility to make your management aware of the scope of the operation. Or at least try.
Under this time constraint (1 day) it may or may not be doable depending on the size of the project - if its a few hundred lines of code then for sure. If its a serious project with several tens of thousands code lines, then likely no.
The first thing you need to know is what is this program supposed to do at all. If you have no idea what it does and how it does it, then analyzing the code will give you the answer but it will be a long and frustrating task. So my first suggestion would be to get yourself familiar with the outer workings of the software - what does it supposed to do and generally how it is supposed to do it. If you are doing it as part as your work then you should be able to get someone to walk you through using the program - even if its UI is in a foreign language (which I hope it doesn't, even if the code is written by a foreign language speaker).
Once you know what the software is attempting to do, then it should be fairly straight forward (even if lengthy and daunting) to rewrite all the comments in your own language for you to understand. I suggest doing so in a bottoms-up approach: its easier to understand the small and trivial things a program does, then to understand the top-level logic - and a lot of trivial things in order make up the logic of the software.
Only once you understand - to a large degree, anyway - the inner workings of the program you may write its functional spec and work on features.
Non-free way on Windows:
You can use CppDepend. This application is able to parse your visual project or your source files. It gives you a lot of information like dependency trees. You can try the trial (Maybe it will be enough for what you have to do).
Free way multi-platform:
You can use doxygen with a special configuration (extract code structure from undocumented code) and analyze the result.
I was quite happy with a tool called Understand (15-day eval license available) for this kind of task. However, I agree with Guss that the time you'll need depends a lot on the size of the code, and one day is probably just enough for a small program.
cscope & ctags are a must when I do my own code, and even more when looking to other's code.
You may also try this ::
http://www.sgvsarc.com/product_crystalflow.htm

What are some good approaches to learning the Half-Life 2 SDK?

I have been a Half-Life lover for years. I have a BS in CS and have been informally programming since High-School. When I was still in college I tried to become a mod programmer for fun..using the first Half-Life engine...didn't work so good. So i figured after all my great college learrning :-) I would have more insight on how to tackle this problem and could finally do it. So here I am..finally out in the business world programming java...so I downloaded the HL2 SDk and started looking through the class structure. I feel like I did that last time I tried this...dazed and confused. Sorry about all the back ground.
So what is the best way to systematically learn the code structure? I know java and I know c++..i just dont know what any of the classes do...the comments are few and far between and the documentation seems meager. Any good approahces? I **don'**t wanna start my own mod... I just wanna maybe be a spare-time mod programmer on some cool MOD one day...to keep the fun in learning programming along with the business side.
the comments are few and far between
and the documentation seems meager.
Any good approahces?
Welcome to the wonder that is the Source SDK. No, it's not documented. Experiment, hack, place breakpoints and see what happens if you change bits of code.
There is a wiki you may find helpful in some cases, but it's filled in by the community, and not by Valve, which means that you won't find any actual documentation there, just explanations of how previous modders have hacked the engine.
Honestly, it sucks. The only way around it is to dive in. Try to achieve various changes to the game and don't be afraid to rip the existing code apart. It won't be pretty, but if it works, who's going to complain? Their code is pretty horrible, and most likely, yours will be too.
You can start at the Valve Developer Wiki.
I think the best way is to check out the source code of one of the few open source mods out there, Open Source Jail Break. It will help you at least get familiar with the code.
Beyond that, its just developer resources and forums.
Edit:Plan of Attack seems great too.
Also: This is a great list, including both general and specific topics.
I'd do what I do with any other vague system... set lots of breakpoints and get a feel for the structure by watching it function. Add your own comments/documentation as you go. Test your understanding by making small changes and see if you get expected results.
I've worked with the Source SDK for a little and made some modifications. Really you have to have a good understanding of C and C++. The Source SDK isn't modern C++ and is much more akin to C with classes than any real OOP.
The SDK is simply fashioned in that the major of code is comprised of entities, of which many you can ignore.
Also know that the SDK uses inheritance very heavily, so look to base classes for functionality that you may desire.
I'd say make a list of important files and classes that maybe relevant to what you want to do with the SDK. Then start sorting these files using virtual folders in VS (or real folders on the filesystem) and use the find in files option (or grep) to find your way around.
Some sample files:
eiface.h - Engine interfaces
gameinterface.cpp/.h - Lots of interfaces from external dlls for server
cdll_client_int.cpp/.h - Lots of interfaces from external dlls for client
*_gamerules.cpp/.h - Gamerules (determines logic of game)
world.cpp - Entity that determines the map properties and loads the gamerules and other entities
Also try to use the Source SDK Base instead of the HL2MP Base for a mod. The former is a lot cleaner and easier to build off of.

Write C++ in a graphical scratch-like way?

I am considering the possibility of designing an application that would allow people to develop C++ code graphically. I was amazed when I discovered Scratch (see site and tutorial videos).
I believe most of C++ can be represented graphically, with the exceptions of preprocessor instructions and possibly function pointers.
What C++ features do you think could be (or not be) represented by graphical items?
What would be the pros and cons of such an application ? How much simpler would it be than "plain" C++?
RECAP and MORE:
Pros:
intuitive
simple for small applications
helps avoid typos
Cons:
may become unreadable for large (medium?) - sized applications
manual coding is faster for experienced programmers
C++ is too complicated a language for such an approach
Considering that we -at my work- already have quite a bit of existing C++ code, I am not looking for a completely new way of programming. I am considering an alternate way of programming that is fully compatible with legacy code. Some kind of "viral language" that people would use for new code and, hopefully, would eventually use to replace existing code as well (where it could be useful).
How do you feel towards this viral approach?
When it comes to manual vs graphical programming, I tend to agree with your answers. This is why, ideally, I'll find a way to let the user always choose between typing and graphical programming. A line-by-line parser (+partial interpreter) might be able to convert typed code into graphical design. It is possible. Let's all cross our fingers.
Are there caveats to providing both typing and graphical programming capabilities that I should think about and analyze carefully?
I have already worked on template classes (and more generally type-level C++) and their graphical representation.
See there for an example of graphical representation of template classes. Boxes represent classes or class templates. First top node is the class itself, the next ones (if any) are typedef instructions inside the class. Bottom nodes are template arguments. Edges, of course, connect classes to template arguments for instantiations.
I already have a prototype for working on such type-level diagrams.
If you feel this way of representing template classes is plain wrong, don't hesitate to say so and why!
Much as I like Scratch, it is still much quicker for an experienced programmer to write code using a text editor than it is to drag blocks around, This has been proved time and again with any number of graphical programming environments.
Writing code is the easiest part of a developers day. I don't think we need more help with that. Reading, understanding, maintaining, comparing, annotating, documenting, and validating is where - despite a gargantuan amount of tools and frameworks - we still are lacking.
To dissect your pros:
Intuitive and simple for small applications - replace that with "misleading". It makes it look simple, but it isn't: As long as it is simple, VB.NET is simpler. When it gets complicated, visual design would get in the way.
Help avoid typos - that's what a good style, consistency and last not least intellisense are for. The things you need anyway when things aren't simple anymore.
Wrong level
You are thinking on the wrong level: C++ statements are not reusable, robust components, they are more like a big bag of gears that need to be put together correctly. C++ with it's complexity and exceptions (to rules) isn't even particulary suited.
If you want to make things easy, you need reusable components at a much higher level. Even if you have these, plugging them together is not simple. Despite years of struggle, and many attempts in many environments, this sometimes works and often fails.
Viral - You are correct IMO about that requriement: allow incremental adoption. This is closely related to switching smoothly between source code and visual representation, which in turn probably means you must be able to generate the visual representation from modified source code.
IDE Support - here's where most language-centered approaches go astray. A modern IDE is more than just a text editor and a compiler. What about debugging your graph - with breakpoints, data inspection etc? Will profilers, leak detectors etc. highlight nodes in your graph? Will source control give me a Visual Diff of yesterday's graph vs. today's?
Maybe you are on to something, despite all my "no"s: a better way to visualize code, a way to put different filters on it so that I see just what I need to see.
The early versions of C++ were originally written so that they compiled to C, then the C was compiled as normal.
What it sounds like you are describing is a graphical language that is compiled to C++, which will then be compiled as normal.
So really you are not creating a graphical C++, you are creating a new language that happens to be graphical. Nothing wrong with that, but don't let C++ restrict what you do, because eventually you may want to compile the graphical language straight to machine code, or even to something like CIL, Java ByteCode, or whatever else tickles your fancy.
Other graphical languages you may want to check out are LabVIEW, and more generally the category of visual programming languages.
Good luck in your efforts.
The complexity of a nontrivial program is usually too high to be represented with graphical symbols, which are low in their information content. Unless your approach is markedly different in some way, I am skeptical that this would be of value based on past efforts.
So, practically speaking, his will be useful only for instructional purposes and very simple programs. But that would still be a great target market for a product like this. sometimes people have trouble grasping the fundamentals, and a visual model might be just the thing to help things click.
Interesting idea. I doubt I'd use it though. I tend to prefer coding in a flat text editor, not even an IDE, and for tough problems I prefer a pad of paper. Most of the really experienced programmers I know work this way, Maybe it's because we grew up in a different environment, but I think it's also because of the way we think about programming. As you get more experience, you start seeing the code in your head more clearly than any GUI tool will show it to you.
As for your question, I'd nominate templates as one of the harder / more interesting sort of thing to try to represent well. They are ubiquitous and carry information that you won't have access to as you are designing your tool. Getting that to the user in a useful way should pose an interesting challenge.
What C++ features do you think could be [...] represented by graphical items?
Object Oriented Design. Hence classes, inheritance, polymorphism, mutability, const-ness etc. And, templates.
What would be the pros and cons of such an application?
It may be easier for beginners to start writing programs. For the experienced, it may be get rid of the boring parts of programming.
Think of any other code generator. They create a framework for you to write the more involved portion(s). They also lead to bloated-code (think of any WYSIWYG HTML editor).
The biggest challenge, as I see it, is that any such UI necessarily hinders the user's imagination.
How much simpler would it be than "plain" c++ ?
It can be a real pain, when you wade through truckloads of errors which is typical of code generators.
Further, since a lot of code is generated, you have no idea of what is going on -- debugging becomes difficult.
Also, for the experienced there may be some irritation to find that the generated code is not per their preferred coding style.
I prefer hot-keys instead graphical menus and buttons.
And I think same thing will happen with graphical development tool. Many peoples will prefer manual codding.
But, source code visualizer - should be nice thing.
I like the idea, but I suspect there comes a point where things get far too complicated to be represented graphically.
However, given recent experience at work; it would be useful to give such a graphical interface to a non-techie person to use to create basic drag-and-drop programs, leaving myself free to get on with some "proper" programming ;-) If it can do the job of allowing somebody non-skilled to build something functional it can be a very good thing (even if programming logic escapes them)
There comes a point in such a system where it becomes easier to define what you want to do using literal C++ code, rather than have a user interface getting in the way; it can get frustrating to the sessioned programmer knowing the precise code that needs to be written but then only being limited to the design GUI. I'm specifically thinking about a more common application, such as html editors/designers in which they allow newbies to build their websites without knowing any html at all.
It would be interesting to see how such a system would handle the dynamic allocation of memory, and the different states of a program as time progressed; I suspect that there are some very basic programming concepts that may be difficult to represent graphically.. Polymorphism..? Virtual Classes, LinkList, Stacks/Circular Queues. I wonder for a moment how you would explain an image compression algorithm (such as jpg) successfully too without the help of a gigantic display screen.
I also wonder if such a system would even go to such a low level, and whether you would be dealing with abstracted concepts and the compiler would be working out the best way to do something.
I've been working on a new model-driven software development paradigm named ABSE (http://www.abse.info) that supports end-user programming: It's a template-based system that can be complemented with transformation code. I also have an IDE (named AtomWeaver) implementing ABSE that is in pre-alpha stage right now.
With AtomWeaver, as an expert/architect, you build your knowledge Templates, and then the developers (or end-users if you make your meta-models simpler) can just "assemble" systems by building blocks, and then filling template parameters in form-style editors.
At the end, pressing the "Generate" button will create the final system as specified by the architect/expert.
I'm surprised you think function pointers would be a particular problem. How about anything at all to do with pointers?
A programming language can be represented by a hierarchy of nodes - that's exactly what the compiler turns it into. It is very strange that the UI for editing programs is still a sequence of characters that get parsed, because the degrees of freedom in the editor is way larger than the available set of allowed choices. But intellisense helps to reduce this problem a lot.
C++ would be a strange choice to base such a system on.
I think the major problem of this kind of IDEs are that the code generated becomes unmantainable easily.
This happened to Delphi. It's a really nice tool to develop some kind of applications, however, when we start adding complex relationships between the components, start adding Design Patterns, etc. the code grows to an unmantainable size.
I believe it's also because graphical tools don't apply the concept of MVC (or if they do, it's only in the way that the IDE understands).
It can be really helpful for prototypes and very small applications that don't tend to grow, otherwise it can become a mess for the developer(s)