Why are Clojure's multimethods better than 'if' or 'case' statements - clojure

I've spent some time trying to understand Clojure multimethods. The main "pro" multimethod argument, as far as I understand, is their flexibility. However, I'm confused by the argument for why multimethods are better than a simple if or case statement. Could someone please explain where the line between polymorphism and an overglorified case statement is drawn?
EDIT: I should have been clearer in the question that I'm more interested in a comparison with the 'if' statement. Thanks a lot for the answers!

Say we have types A, B, C, D and E, and methods m1, m2, m3 taking single argument of the previous types. You can put them in a table like this:
   | A | B | C | D | E |
m1 |   |   |   |   |   |
m2 |   |   |   |   |   |
m3 |   |   |   |   |   |
The "switch" statement strategy is implementing one row of this table at a time. Suppose you add a new type F. You'll have to modify all implementations to support it.
The class-based polymorphism (C++, Java, etc.) allows you to implement a whole column instead. Adding a new type is thus easy, as you don't have to change the already defined classes. But adding a new method is hard, as you'll have to add it to all other types.
Multimethods allow you to implement single cells of the table independently of each other.
This flexibility is even greater if you have to dispatch on multiple arguments. Each new argument adds another dimension to this table, and both switch-based and class-based dispatch become very complex pretty quickly (cf. the Visitor pattern).
Note that multimethods are actually even more generic than depicted, as you can dispatch on pretty much anything, not just on the types of the arguments.
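The "one cell at a time" idea can be sketched outside Clojure as well. The following is a minimal Python imitation, not Clojure's actual implementation: the MultiMethod class, the collide function and the shape classes are all hypothetical names, and the lookup matches dispatch values exactly (Clojure additionally consults isa? hierarchies).

```python
class MultiMethod:
    """Hypothetical mini-multimethod: a dispatch function plus a table of cells."""

    def __init__(self, dispatch_fn):
        self.dispatch_fn = dispatch_fn   # computes the dispatch value from the arguments
        self.methods = {}                # dispatch value -> implementation

    def register(self, dispatch_value):
        """Decorator that fills in one cell of the table."""
        def decorator(fn):
            self.methods[dispatch_value] = fn
            return fn
        return decorator

    def __call__(self, *args):
        key = self.dispatch_fn(*args)
        try:
            impl = self.methods[key]
        except KeyError:
            raise TypeError(f"no method for dispatch value {key!r}") from None
        return impl(*args)


class Circle: pass
class Square: pass

# Dispatch on the types of BOTH arguments (a two-dimensional table).
collide = MultiMethod(lambda a, b: (type(a), type(b)))

@collide.register((Circle, Square))
def circle_square(a, b):
    return "circle hits square"

@collide.register((Square, Circle))
def square_circle(a, b):
    return "square hits circle"

# Later, possibly in another module: a new type and a new cell,
# added without touching any of the code above.
class Triangle: pass

@collide.register((Triangle, Circle))
def triangle_circle(a, b):
    return "triangle hits circle"
```

Because the dispatch function is an arbitrary callable, the key does not have to be a type at all; it could just as well be a value read from the arguments.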

The difference between multimethods and a big if-statement is that, to add a case to an if-statement, you need to modify the function that contains it. With a multimethod, you can add a new method without touching the previously existing methods.
So if you define a multimethod inside your library and you want your users to be able to extend it for their own data types, that's no problem. If you had used an if-statement instead, it would be a big problem.

ivant's answer above can be expanded by taking a look at this article. It does a good job of explaining the power of protocols. Think of multimethods as protocols with many dimensions.

Haskell - Why is Alternative implemented for List

I have read some of this post Meaning of Alternative (it's long)
What led me to that post was learning about Alternative in general. The post gives a good answer to why it is implemented the way it is for List.
My question is:
Why is Alternative implemented for List at all?
Is there perhaps an algorithm that uses Alternative and a List might be passed to it so define it to hold generality?
I thought that, because Alternative by default defines some and many, that may be part of it, but "What are some and many useful for" contains the comment:
To clarify, the definitions of some and many for the most basic types such as [] and Maybe just loop. So although the definition of some and many for them is valid, it has no meaning.
In the "What are some and many useful for" link above, Will gives an answer to the OP that may contain the answer to my question, but at this point in my Haskelling, the forest is a bit thick to see the trees.
Thanks
There's something of a convention in the Haskell library ecology that if a thing can be an instance of a class, then it should be an instance of the class. I suspect the honest answer to "why is [] an Alternative?" is "because it can be".
...okay, but why does that convention exist? The short answer there is that instances are sort of the one part of Haskell that succumbs only to whole-program analysis. They are global, and if there are two parts of the program that both try to make a particular class/type pairing, that conflict prevents the program from working right. To deal with that, there's a rule of thumb that any instance you write should live in the same module either as the class it's associated with or as the type it's associated with.
Since instances are expected to live in specific modules, it's polite to define those instances whenever you can -- since it's not really reasonable for another library to try to fix up the fact that you haven't provided the instance.
Alternative is useful when viewing [] as the nondeterminism-monad. In that case, <|> represents a choice between two programs and empty represents "no valid choice". This is the same interpretation as for e.g. parsers.
some and many indeed do not make sense for lists, since they try to iterate greedily through all possible lists of elements from the given options, starting from the infinite list of just the first option. The list monad isn't lazy enough to do even that, since it might always need to abort if it was given an empty list. There is, however, one case in which both terminate: when given an empty list.
Prelude Control.Applicative> many []
[[]]
Prelude Control.Applicative> some []
[]
If some and many were defined as lazy (in the regex sense), meaning they prefer short lists, you would get results, but not very useful ones, since the output starts with the infinitely many lists that contain only the first option:
Prelude Control.Applicative> some' v = liftA2 (:) v (many' v); many' v = pure [] <|> some' v
Prelude Control.Applicative> take 100 . show $ (some' [1,2])
"[[1],[1,1],[1,1,1],[1,1,1,1],[1,1,1,1,1],[1,1,1,1,1,1],[1,1,1,1,1,1,1],[1,1,1,1,1,1,1,1],[1,1,1,1,1,"
Edit: I believe the some and many functions correspond to a star-semiring, while <|> and empty correspond to plus and zero in a semiring. So mathematically (I think), it would make sense to split those operations out into a separate typeclass, but it would also be kind of silly, since they can be implemented in terms of the other operators in Alternative.
Consider a function like this:
fallback :: Alternative f => a -> (a -> f b) -> (a -> f e) -> f (Either e b)
fallback x f g = (Right <$> f x) <|> (Left <$> g x)
Not spectacularly meaningful, but you can imagine it being used in, say, a parser: try one thing, falling back to another if that doesn't work.
Does this function have a meaning when f ~ []? Sure, why not. If you think of a list's "effects" as being a search through some space, this function seems to represent some kind of biased choice, where you prefer the first option to the second, and while you're willing to try either, you also tag which way you went.
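For readers who find the list case easier to follow outside Haskell, here is a rough Python imitation of what fallback computes when f ~ []. It relies only on the fact that <|> for lists is concatenation and that fmap maps over every element; the helper names halves and describe are made up for the example, and the tagged tuples stand in for Either.

```python
def fallback(x, f, g):
    """List version of fallback: every result of f x tagged "Right",
    followed by every result of g x tagged "Left".
    For lists, <|> is concatenation, so the first option's results come first."""
    return [("Right", b) for b in f(x)] + [("Left", e) for e in g(x)]


def halves(n):
    # Hypothetical "search": succeeds (one result) only for even numbers.
    return [n // 2] if n % 2 == 0 else []


def describe(n):
    # Hypothetical second option: always yields exactly one result.
    return [f"fallback for {n}"]


print(fallback(4, halves, describe))  # both options contribute, first option first
print(fallback(3, halves, describe))  # first option is empty, only the fallback remains
```

Note that nothing is discarded here: unlike a parser that stops at the first success, the list keeps the results of both branches, in order of preference.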
Could a function like this be part of some algorithm which is polymorphic in the Alternative it computes in? Again I don't see why not. It doesn't seem unreasonable for [] to have an Alternative instance, since there is an implementation that satisfies the Alternative laws.
As to the answer linked to by Will Ness that you pointed out: it covers that some and many don't "just loop" for lists. They loop for non-empty lists. For empty lists, they immediately return a value. How useful is this? Probably not very, I must admit. But that functionality comes along with (<|>) and empty, which can be useful.

Dynamically switch parser while parsing

I'm parsing spice netlists, for which I already have a parser. Since I actually use spectre (cadence, integrated electronics), I want to support both simulator languages (they differ, unfortunately). I could use a switch (e.g. commandline) and use the correct parser from start. However, spectre allows simulator lang=spectre statements, which I would also want to support (and vice versa, of course). How can this be done with boost::spirit?
My grammar looks roughly like this:
line = component_parser |
command_parser |
comment_parser |
subcircuit_parser |
subcircuit_instance_parser;
main = -line % qi::eol >> qi::eoi;
This toplevel structure is fine for both languages, so I need to change the subparsers. My first idea would be to have the toplevel parser hold instances of (or references to) the respective parsers and to switch on finding the simulator lang statement (with a semantic action). Is this a good approach? If not, how else would one do this?
You can use qi::lazy (https://www.boost.org/doc/libs/1_68_0/libs/spirit/doc/html/spirit/qi/reference/auxiliary/lazy.html).
There's an idiomatic pattern related to that, known as The Nabialek Trick.
I have several answers up on this site that show these various techniques.
https://stackoverflow.com/search?q=user%3A85371+qi%3A%3Alazy
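The underlying idea, independent of Spirit, is that the top-level loop does not call a fixed sub-parser but looks up "the current parser" each time, and a language statement rebinds it. The sketch below shows only that control flow in Python; it is not Boost.Spirit code, the two line parsers are placeholders, and the exact syntax accepted for the simulator lang statement is an assumption.

```python
import re

def parse_spice_line(line):
    # Placeholder for the real SPICE line parser.
    return ("spice", line)

def parse_spectre_line(line):
    # Placeholder for the real Spectre line parser.
    return ("spectre", line)

# Table of available sub-parsers; the active one is chosen at parse time.
PARSERS = {"spice": parse_spice_line, "spectre": parse_spectre_line}

# Assumed form of the switch statement: "simulator lang=<name>".
LANG_SWITCH = re.compile(r"^\s*simulator\s+lang\s*=\s*(\w+)\s*$", re.IGNORECASE)

def parse_netlist(text, initial_lang="spice"):
    current = PARSERS[initial_lang]
    results = []
    for line in text.splitlines():
        if not line.strip():
            continue                      # optional empty lines, as in -line % eol
        switch = LANG_SWITCH.match(line)
        if switch:
            lang = switch.group(1).lower()
            if lang not in PARSERS:
                raise ValueError(f"unknown simulator language: {lang}")
            current = PARSERS[lang]       # rebind the active parser
            continue
        results.append(current(line))     # late lookup: always the current parser
    return results
```

In Qi terms, the rebinding step is what a semantic action would do, and the late lookup is the role qi::lazy plays.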

Defining how long a list can be in Haskell

So I'm new to Haskell and I'm trying to define a list which is at most 4 elements long.
So far I have type IntL = [Int,Int,Int,Int]
but I was thinking there must be a better/proper way of doing this.
Is there?
This is problematic in Haskell because phantom types encoding sizes need proper compiler support (otherwise it's pretty annoying to use), and type nats in GHC appeared somewhat recently.
That being said libraries exist, just to give you an idea.
Alternatively, just use a tuple.
It might look stupid and it certainly does not scale, but what about
data Max4 a
= Empty
| One a
| Two a a
| Three a a a
| Four a a a a
with type IntL = Max4 Int? It's basic, you should be able to understand it and you can learn a lot by implementing operations on it.
Basic Haskell types are not powerful enough to encode the maximum length of a list. To do that, you must rely on extensions such as GADTs and phantom types, and even then it is not straightforward.
If you are really a newbie, I advise you to learn other basic concepts first, like Monads, IO and other idioms.
This site is a very good reading for an initial approach to Haskell:
http://learnyouahaskell.com

Dynamically Describing Mathematical Rules

I want to be able to specify mathematical rules in an external location to my application and have my application process them at run time and perform actions (again described externally). What is the best way to do this?
For example, I might want to execute function MyFunction1() when the following evaluates to true:
(a < b) & MyFunction2() & (myWord == "test").
Thanks in advance for your help.
(If it is of any relevance, I wish to use C++, C or C++/CLI)
I'd consider not reinventing the wheel --- use an embedded scripting engine. This means you'd be using a standard form for describing the actions and logic. There are several great options out there that will probably be fine for your needs.
Good options include:
JavaScript through Google V8. (I don't love this from an embedding point of view, but JavaScript is easy to work with, and many people already know it.)
Lua. It's fast and portable. The syntax is maybe not as nice as JavaScript's, but embedding is easy.
Python. Clean syntax, lots of libraries. Not much fun to embed though.
I'd consider using SWIG to help generate the bindings ... I know it works for python and lua, not sure about v8.
I would look at the command design pattern to handle calling external mathematical predicates, and the Factory design pattern to run externally defined code.
If your mathematical expression language is that simple then you could define a grammar for it, e.g.:
expr = bin-op-expr | rel-expr | func-expr | var-expr | "(" expr ")"
bin-op = "&" | "|" | "!"
bin-op-expr = expr bin-op expr
rel-op = "<" | ">" | "==" | "!=" | "<=" | ">="
rel-expr = expr rel-op expr
func-args = "(" ")"
func-expr = func-name func-args
var-expr = name
and then translate that into a grammar for a parser. E.g. you could use Boost.Spirit which provides a DSL to allow you to express a grammar within your C++ code.
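To give a feel for how small such an interpreter can be, here is a hand-written recursive-descent evaluator in Python for the example rule from the question. It is a sketch under assumptions, not a translation of the grammar above: precedence is fixed as | below & below ! below the relational operators, functions take no arguments, both sides of & and | are always evaluated, and the variable and function tables are supplied by the host program.

```python
import operator
import re

TOKEN = re.compile(
    r'\s*(?:(\d+(?:\.\d+)?)|"([^"]*)"|([A-Za-z_]\w*)|(<=|>=|==|!=|[<>&|!()]))')
REL_OPS = {"<": operator.lt, ">": operator.gt, "==": operator.eq,
           "!=": operator.ne, "<=": operator.le, ">=": operator.ge}

def tokenize(text):
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected input: {text[pos:]!r}")
        number, string, name, op = m.groups()
        if number is not None:
            tokens.append(("num", float(number)))
        elif string is not None:
            tokens.append(("str", string))
        elif name is not None:
            tokens.append(("name", name))
        else:
            tokens.append(("op", op))
        pos = m.end()
    return tokens

class RuleEvaluator:
    def __init__(self, text, variables, functions):
        self.tokens, self.pos = tokenize(text), 0
        self.variables, self.functions = variables, functions

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else (None, None)

    def expect(self, op):
        if self.peek() != ("op", op):
            raise SyntaxError(f"expected {op!r}, got {self.peek()[1]!r}")
        self.pos += 1

    def evaluate(self):
        result = self.or_expr()
        if self.pos != len(self.tokens):
            raise SyntaxError(f"unexpected token {self.peek()[1]!r}")
        return result

    def or_expr(self):                      # and_expr ("|" and_expr)*
        value = self.and_expr()
        while self.peek() == ("op", "|"):
            self.pos += 1
            rhs = self.and_expr()           # always evaluated (no short-circuit)
            value = bool(value) or bool(rhs)
        return value

    def and_expr(self):                     # not_expr ("&" not_expr)*
        value = self.not_expr()
        while self.peek() == ("op", "&"):
            self.pos += 1
            rhs = self.not_expr()
            value = bool(value) and bool(rhs)
        return value

    def not_expr(self):                     # "!" not_expr | rel_expr
        if self.peek() == ("op", "!"):
            self.pos += 1
            return not self.not_expr()
        return self.rel_expr()

    def rel_expr(self):                     # primary (rel_op primary)?
        left = self.primary()
        kind, op = self.peek()
        if kind == "op" and op in REL_OPS:
            self.pos += 1
            return REL_OPS[op](left, self.primary())
        return left

    def primary(self):
        kind, value = self.peek()
        if kind is None:
            raise SyntaxError("unexpected end of rule")
        self.pos += 1
        if (kind, value) == ("op", "("):
            inner = self.or_expr()
            self.expect(")")
            return inner
        if kind in ("num", "str"):
            return value
        if kind == "name":
            if self.peek() == ("op", "("):  # zero-argument function call
                self.pos += 1
                self.expect(")")
                return self.functions[value]()
            return self.variables[value]
        raise SyntaxError(f"unexpected token {value!r}")

def evaluate_rule(text, variables, functions):
    return RuleEvaluator(text, variables, functions).evaluate()
```

A host application would load the rule text from its external location and call something like evaluate_rule('(a < b) & MyFunction2() & (myWord == "test")', {"a": 1, "b": 2, "myWord": "test"}, {"MyFunction2": my_function_2}), running MyFunction1 when the result is true.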
If that calculation happens in an inner loop and you want high performance, you cannot go with scripting languages. Your options depend on how "deployable" and how platform-independent you would like the result to be:
1) You could express the equations in C++ and let g++ compile it for you at run-time, and you could link to the resulting shared object. But this method is very much platform dependent at every step! The necessary system calls, the compiler to use, the flags, loading a shared object (or a DLL)... That would be super-fast in the end though, especially if you compile the innermost loop with the equation. The equation would be inlined and all.
2) You could use java in the same way. You can get a nice java compiler in java (from Eclipse I think, but you can embed it easily). With this solution, the result would be slightly slower (depending on how much template magic you want), I would expect, by a factor of 2 for most purposes. But this solution is extremely portable. Once you get it working, there's no reason it shouldn't work anywhere, you don't need anything external to your program. Another down side to this is having to write your equations in Java syntax, which is ugly for complex math. The first solution is much better in that respect, since operator overloading greatly helps math equations.
3) I don't know much about C#, but there could be a solution similar to (2). If there is, I know that there's operator overloading in C#, so your equations would be more pleasant to write and look at.

How to find all mutual friendships in large C++ source tree?

In a large C++ source tree, with about 600 or so classes defined, I want to find all pairs of classes where each declares the other a friend.
There are many cases of one class being a friend of another, too many to make it worth wading through a simple grep result.
You could implement a kind of triple loop here; the algorithm could be as follows:
First loop: find all classes who have friends, and remember the name of the friend and the name of the actual class;
Then run inner loop for all the classes and find a class with the name of the friend from step 1.
Then run another inner loop through all the friends of the class found at step 2. If you find the class name from step 1 among them - voila - they're mutual friends.
I believe Perl and regexes are the best tools for such things.
P.S. Sure, this approach has its limits, because not everything in C++ can be parsed with regexes (using namespace stuff is the first thing that comes to mind). But, to some extent, this is a working approach, and if you don't have alternatives, you could give it a try.
EDIT:
An idea came to my mind this morning, while I was still lying in bed. :) The idea is quite simple and clear (like all morning ideas): use SQL! Imagine you have a table of classes with 2 columns, where the first column is the class name and the second column is its friend's name. Say, something like this:
ClassName  FriendName
C1         C2
C1         C3
C1         C4
C2         C1
C2         C8
C3         C1
C3         C2
...        ...
Then you can run a simple query against it. Say, something like this (sorry, I don't have any SQL DB handy, so I have not checked the query, but I hope you'll get the idea and implement it as needed):
SELECT a.ClassName, a.FriendName
FROM T a
JOIN T b ON b.ClassName = a.FriendName AND b.FriendName = a.ClassName
WHERE a.ClassName < a.FriendName -- report each mutual pair only once
The idea behind this variant is that we should employ those tools which exactly fit the task. What can compare to SQL when you need to crunch some sets of data?
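If no database server is at hand, the idea can be tried with an in-memory SQLite database from Python. This is a small sketch using the sample rows from the table above; the self-join is one way to phrase "A lists B and B lists A", and the table and column names simply follow the example.

```python
import sqlite3

# Sample (ClassName, FriendName) rows from the table above.
pairs = [
    ("C1", "C2"), ("C1", "C3"), ("C1", "C4"),
    ("C2", "C1"), ("C2", "C8"),
    ("C3", "C1"), ("C3", "C2"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (ClassName TEXT, FriendName TEXT)")
conn.executemany("INSERT INTO T VALUES (?, ?)", pairs)

# Join the table with itself: row a says "X befriends Y", row b says "Y befriends X".
# The WHERE clause keeps only one ordering of each mutual pair.
mutual = conn.execute("""
    SELECT a.ClassName, a.FriendName
    FROM T AS a
    JOIN T AS b
      ON b.ClassName = a.FriendName AND b.FriendName = a.ClassName
    WHERE a.ClassName < a.FriendName
    ORDER BY a.ClassName, a.FriendName
""").fetchall()

print(mutual)
```

Note that C3 lists C2 as a friend, but C2 does not list C3, so that pair is correctly left out.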
I) Some elegant ways:
1) Doxygen ( http://www.doxygen.nl/ ) might be able to give you what you need. (If it doesn't already give this information, you could hack Doxygen's C++ parser a bit to get what you need).
2) There are also existing ANTLR grammar files for C++ as well.
II) Quicker way (perhaps the right approach here):
Regex should be just fine for your purpose, as others suggest. Consider the following shell sketch (it assumes GNU grep and takes the source files as arguments):
rm -f result_file
for source_file in "$@"
do
tr '\t\n' '  ' < "$source_file" | tr -s ' ' > temp_file ## join lines and collapse whitespace
grep -o -P -i "friend [^;]*;" temp_file >> result_file ## you can improve this regex for eliminating some possible unwanted matches or post-process result_file later
done
rm -f temp_file
Now you have all friend relations in result_file. You can remove "friend functions" using another simple regex and/or process the result_file further as per needs.
This answer is similar to @user534498's, but I'm going to go into a bit more detail, as the suggestion "parse C++ with regex" is so insane, I don't think it bears consideration.
I also don't think you're going to find an automated tool which can already do this for you. If this were managed code land, I'd be suggesting something like Nitriq, but I don't think anything like that works for C++.
If you're not worried about nested classes, I think you can construct pairings of classes to friends without too much difficulty. You can find instances of the keyword class, followed by curly braces, and within the curly braces look for friend statements. That should give you a listing of which classes have which friends.
Once you've done that, you can check for duplicate references easily. (Depends on the language you're using... if you're in C++ then you'd put your results in a std::multimap with the keys being class name and values being the friends)
I suppose this is similar to what @Haspemulator is suggesting ... but my point is that it will probably be easier to split out the parsing, then implement the circular ref checking in terms of sets or maps, than it will be to try to intertwine these operations.
Use a Perl, Python or C++ regex to parse all files and record all class-friend pairs. Finding the mutual pairs should then be trivial for some 600 classes.
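A rough Python sketch of that two-step approach (extract pairs first, then check them with a set) might look as follows. It is deliberately naive and the function names are made up: each friend declaration is attributed to the nearest preceding class or struct header, so nested classes, templates with class parameters, macros, namespaces and the C++11 form friend B; will confuse it.

```python
import re

# A class/struct header: the keyword, a name, then "{" before any ";" (skips forward declarations).
CLASS_RE = re.compile(r"\b(?:class|struct)\s+(\w+)[^;{]*\{")
# Only friend CLASS declarations; friend functions are ignored.
FRIEND_RE = re.compile(r"\bfriend\s+(?:class|struct)\s+(\w+)\s*;")

def friend_pairs(source):
    """Yield (class_name, friend_name) pairs found in one source text."""
    headers = [(m.start(), m.group(1)) for m in CLASS_RE.finditer(source)]
    for friend in FRIEND_RE.finditer(source):
        owner = None
        for start, name in headers:       # last header that starts before the friend line
            if start < friend.start():
                owner = name
        if owner is not None:
            yield (owner, friend.group(1))

def mutual_friends(pairs):
    """Return sorted, de-duplicated pairs (a, b) where a befriends b and b befriends a."""
    pairs = set(pairs)
    return sorted({tuple(sorted(p)) for p in pairs
                   if p[0] != p[1] and (p[1], p[0]) in pairs})

example = """
class A { friend class B; friend class C; };
class B { friend class A; };
class C { int x; };
"""
print(mutual_friends(friend_pairs(example)))
```

For a whole tree, the pairs from every file would be collected into one set before calling mutual_friends, since the two halves of a mutual friendship usually live in different headers.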