I've searched around for the concept of GADTs in OCaml: why we need them, when to use them, etc.
I understand that GADTs are not specific to OCaml but a more general concept.
I've found
What are GADTs?
http://caml.inria.fr/pub/docs/manual-ocaml-400/manual021.html#toc85
http://www.reddit.com/r/ocaml/comments/1jmjwf/explain_me_gadts_like_im_5_or_like_im_an/
and so on, but some of them are in Haskell, and others do not have a good example comparing code without GADTs to code with them.
So what I would like is a simple yet concrete example where I can see how things go wrong without GADTs.
Can I have that please?
GADTs are useful for two reasons.
The first (and the most common) is about dynamic typing: you can recover some dynamic typing without losing static checking. It is not simple, but it does guarantee that your type constraints will be met.
The simplest example of that is given in the OCaml manual.
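Here is a minimal sketch in that spirit (the constructor names are mine, not the manual's), comparing an expression evaluator written without GADTs to one written with them:

(* Without a GADT: one value type for everything, and ill-typed
   expressions such as Add (Bool true, Int 1) are only caught at run time. *)
type expr =
  | Int of int
  | Bool of bool
  | Add of expr * expr
  | If of expr * expr * expr

type value = VInt of int | VBool of bool

let rec eval = function
  | Int n -> VInt n
  | Bool b -> VBool b
  | Add (a, b) ->
    (match eval a, eval b with
     | VInt x, VInt y -> VInt (x + y)
     | _ -> failwith "type error at run time")
  | If (c, t, e) ->
    (match eval c with
     | VBool b -> if b then eval t else eval e
     | _ -> failwith "type error at run time")

(* With a GADT: the type parameter tracks the type of each expression,
   so ill-typed terms are rejected by the compiler, and eval needs
   neither run-time checks nor a wrapper type for results. *)
type _ texpr =
  | TInt : int -> int texpr
  | TBool : bool -> bool texpr
  | TAdd : int texpr * int texpr -> int texpr
  | TIf : bool texpr * 'a texpr * 'a texpr -> 'a texpr

let rec teval : type a. a texpr -> a = function
  | TInt n -> n
  | TBool b -> b
  | TAdd (a, b) -> teval a + teval b
  | TIf (c, t, e) -> if teval c then teval t else teval e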
This was used, for instance, in the standard library to rewrite printf in a type-safe manner (before that, it was a pretty horrible collection of Obj.magic).
The second reason you may want to use GADTs is when you have some complex invariant you want to maintain on your type structure. This is pretty hard to express, though, and you often have to put in a lot of effort to do it.
Well, I don't have an example handy, but I once saw a friend write an implementation of AVL trees where the type system proved that the balancing was right, which is pretty cool.
For more on the GADT feature and its good use cases, you can read the pretty good blog post by Mads Hartmann.
I'm also searching for good applications of GADTs, since most of the time when I use them I sooner or later discover that the same thing can be done without them, and usually in a much cleaner way. So this is not a complete survey, just a bit of my own experience.
Universal values, a.k.a. existentials. They allow you to create heterogeneous containers and type-safe serialization. See, for example, Core's Univ and Univ_map modules (a small sketch of the idea follows this list).
Type-safe evaluators for syntax trees. Here GADTs are useful to remove some runtime checks.
The pure and type-safe printf implementation that is now part of OCaml was also rewritten using GADTs.
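Here is a hedged sketch of the first item above (not Core's actual Univ module, just the underlying idea): a GADT constructor can hide the type of its payload, giving a heterogeneous container.

(* The type variable 'a does not appear in the result type, so it is
   existentially quantified: each element carries its own printer. *)
type univ = Univ : ('a -> string) * 'a -> univ

let heterogeneous : univ list =
  [ Univ (string_of_int, 42);
    Univ ((fun s -> s), "hello");
    Univ (string_of_float, 3.14) ]

let to_string (u : univ) : string =
  match u with
  | Univ (show, x) -> show x

let () =
  List.iter (fun u -> print_endline (to_string u)) heterogeneous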
Here is a real-life example of how GADTs can be used. In the example, I use a GADT to specify table relations, e.g., one_to_one, one_to_many, etc. Depending on the relation used, the function type is inferred accordingly. For example, a one_to_maybe_one relation returns a function 'a -> 'b option, while one_to_many creates a function 'a -> 'b list. The same can be achieved by just creating several different functions, like link_one_to_one, link_one_to_many, etc., instead of one function link ~one_to:relation, so one can consider this approach debatable.
I'm reading some Clojure code at the moment that has a bunch of values left uninitialised as nil for numeric fields in a record that gets passed around.
Now, lots of Clojure libraries treat this as idiomatic, which means it is an accepted convention.
But it also leads to NullPointerExceptions, because not all of the Clojure core functions can handle nil as input (nor should they).
Other languages have the concept of Maybe or Option to proxy the value in the event that it is null, as a way of mitigating the NullPointerException risk. This is possible in Clojure - but not very common.
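For comparison, here is a minimal sketch of that idea in one of those other languages, using OCaml's built-in 'a option; the function name is just illustrative:

(* The absent case is part of the type, so the compiler forces
   every caller to handle it explicitly. *)
let describe_balance (balance : float option) : string =
  match balance with
  | Some b -> Printf.sprintf "balance: %.2f" b
  | None -> "balance not yet initialised"

let () =
  print_endline (describe_balance (Some 10.5));
  print_endline (describe_balance None)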
You can do some tricks with fnil but it doesn't solve every problem.
Another alternative is simply to set the uninitialised value to a keyword like :empty-value to force the user to handle this scenario explicitly in all the handling code. But this isn't really a big step up from nil, because you don't really discover all the scenarios (in other people's code) until run time.
My question is: Is there an idiomatic alternative to nil-punning in Clojure?
Not sure if you've read this lispcast post on nil-punning, but I do think it makes a pretty good case for why it's idiomatic and covers various important considerations that I didn't see mentioned in those other SO questions.
Basically, nil is a first-class thing in Clojure. Despite its inherent conventional meaning, it is a proper value and can be treated as such in many contexts, in a context-dependent way. This makes it more flexible and powerful than null in the host language.
For example, something like this won't even compile in Java:
if(null) {
....
}
Whereas in Clojure, (if nil ...) will work just fine. So there are many situations where you can use nil safely. I've yet to see a Java codebase that isn't littered with code like if(foo != null) { ... } everywhere. Perhaps Java 8's Optional will change this.
I think where you can run into issues quite easily is in Java interop scenarios where you are dealing with actual nulls. A good Clojure wrapper library can also help shield you from this in many cases, and it's one good reason to prefer one over direct Java interop where possible.
In light of this, you may want to reconsider fighting this current. But since you are asking about alternatives, here's one I think is great: Prismatic's Schema. Schema has a Maybe schema (and many other useful ones as well), and it works quite nicely in many scenarios. The library is quite popular and I have used it with success. FWIW, it is recommended in the recent Clojure Applied book.
Is there an idiomatic alternative to nil-punning in Clojure?
No. As leeor explains, nil-punning is idiomatic. But it's not as prevalent as in Common Lisp, where (I'm told) an empty list equates to nil.
Clojure used to work this way, but the CL functions that deal with lists correspond to Clojure functions that deal with sequences in general. These sequences may be lazy, so there is a premium on unifying lazy sequences with other sequences so that any laziness can be preserved. I think this evolution happened around Clojure 1.2. Rich described it in detail here.
If you want option/maybe types, take a look at the core.typed library. In contrast to Prismatic Schema, this operates at compile time.
This might be a stupid question, but since OCaml is not pure and has side effects built-in, what is the use of monads in OCaml?
Monads have nothing to do with purity, except that a pure language without Monads would be almost entirely useless.
In layman's terms, a Monad is just a set of rules that describe how a sequence of steps can be executed. Having a Monad abstraction gives you the ability to define a DSL for executing stuff. A Monad can be built to intelligently handle things like exceptions, atomic rollbacks/commits, retry logic, sleeping between each step, or whatever.
Here are some examples of Monads:
https://wiki.haskell.org/Monad#Interesting_monads
I realize that this list is for Haskell, which is a pure language, but don't let that confuse you.
You don't need to understand category theory to understand what a Monad is, contrary to popular belief. A Monad basically has two things (paraphrased from this Wikipedia article):
A unit function, defined as (a -> M a), called "return" in Haskell, used to put a value into the context of a Monad.
A binding operation, defined as (M t -> (t -> M u) -> M u), which looks scary, but if you look carefully it is just a function that gets invoked between each step of the process; this is where you inject the good stuff (see the sketch after this list).
Depending on the language, there may be more things, but this is the heart of it.
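Here is a minimal sketch of those two operations written out in OCaml for the option type ("maybe" computations); the module and operator names are just illustrative:

module Option_monad = struct
  (* unit/return : 'a -> 'a option, puts a value into the monad *)
  let return x = Some x

  (* bind : 'a option -> ('a -> 'b option) -> 'b option,
     invoked between each step; a failed step short-circuits the rest *)
  let bind m f =
    match m with
    | None -> None
    | Some x -> f x

  let ( >>= ) = bind
end

(* Usage: each step may fail, and bind threads the steps together. *)
let safe_div x y = if y = 0 then None else Some (x / y)

let result =
  let open Option_monad in
  return 100 >>= fun a ->
  safe_div a 5 >>= fun b ->
  safe_div b 0 >>= fun c ->
  return (c + 1)
(* result = None, because the third step failed *)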
Whilst OCaml supports the standard side-effects provided by most languages, this does not include all possible side-effects. There are a number of effects which OCaml does not provide native support for. Many of these effects can be encoded using Monads. For example,
Concurrency (see Lwt and Async libraries)
Non-deterministic choice (see the sketch after this list)
First-class continuations
Ambivalent choice and backtracking
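As a hedged sketch of the non-deterministic choice item above, here is that effect encoded as the familiar list monad in plain OCaml, with no special language support:

(* return produces a single choice; bind runs the rest of the
   computation for every possible choice and collects the results. *)
let return x = [x]
let ( >>= ) choices f = List.concat_map f choices

(* All pairs (x, y) with x drawn from [1; 2] and y from [10; 11]. *)
let pairs =
  [1; 2] >>= fun x ->
  [10; 11] >>= fun y ->
  return (x, y)
(* pairs = [(1, 10); (1, 11); (2, 10); (2, 11)] *)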
Using more sophisticated representations of computation, such as parameterised monads, even more exotic effects can be encoded. For example,
Polymorphic state
Linear resources
While OCaml allows one to write imperative code, it is still functional by nature and is used by functional programmers, and we prefer persistent data structures and algorithms whenever possible.
Concerning your question: in particular, monads are useful for asynchronous computations, e.g., Lwt and Async, where they're used to bind computations together (instead of the usual way of setting callbacks). Monads are also used for error handling instead of exceptions, and they are very helpful in writing parsers (see the mparser library). There are other uses as well; I enumerated only the most popular.
In general, monads just allow you to hide complex control flow under simple sequential syntax.
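As a hedged sketch of the error-handling use just mentioned, here is sequential-looking code over OCaml's built-in result type; the helper names are illustrative:

(* bind: an Ok value flows to the next step; an Error skips the rest. *)
let ( >>= ) r f =
  match r with
  | Ok x -> f x
  | Error e -> Error e

let parse_int s =
  match int_of_string_opt s with
  | Some n -> Ok n
  | None -> Error (Printf.sprintf "not an integer: %s" s)

let divide x y =
  if y = 0 then Error "division by zero" else Ok (x / y)

(* Reads as straight-line code, but errors propagate automatically. *)
let compute a b =
  parse_int a >>= fun x ->
  parse_int b >>= fun y ->
  divide x y
(* compute "10" "2" = Ok 5;  compute "10" "0" = Error "division by zero" *)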
This may be a lot more naive than the answer you want, but a monad is just a simple abstraction that's useful for structuring computation. It's a little math thing, like an equivalence relation (or, for people smarter than I am, like a group). Once you learn what they are, you see them everywhere, and they help organize your thinking.
Disclaimer
Despite the title, this is a genuine question, not an attempt at Emacs/Vi flamewars.
Context
I've used Haskell for a few months and written a small ~10K LOC interpreter. In the past year, I've switched to Clojure. For quite a while, I struggled with Clojure's lack of types. Then I switched to using defrecords in Clojure, and now I've switched to Clojure's defprotocols.
I really really like defprotocols. In fact, more than types.
I'm now at the point where, in the documentation string of each of my Clojure functions, I just specify:
* the protocols of the inputs
* the protocols of the outputs
Using this, it appears I now have an ad-hoc type system (not compiler-checked, but human-checked).
Question
I suspect there's something about types that I'm missing. What do types provide over protocols?
Questioning the question...
Your question "What [do] types provide over protocols?" seems awkward to me. Types and protocols are perpendicular; They describe different things. Types/records define structure of data, while Protocols define the structure of some behavior or functionality. And part of why this question seems weird to me is that these things are not mutually exlusive! You can have types implement a protocol, thereby giving them whatever behaviour/functionality that protocol describes. In fact, since your context makes it clear that you have been using protocols, I have to wonder how you've been using them. My guess is that you've been using them with records (or possibly reifying them), but you could just as easily use protocols and (def)types together.
So to me, it seems you've compared apples with oranges here. To help clarify, let me compare apples to apples and oranges to oranges with a couple of different questions:
What problems do protocols solve, and what are the alternatives and their respective advantages/disadvantages?
Protocols let you define functions that operate in different ways on different types. The only other ways to do this are multimethods and simple function logic:
multimethods: have value in being extremely flexible. You can dispatch behavior on type by passing type as the dispatch function, but you can also use any other arbitrary function for dispatching.
internal function logic: You can also (of course) manually check for types in conditionals in your function definitions to decide how to process differently given different types. This is more primitive than multimethod dispatch, and also less extensible. Except in simple cases, multimethods are preferred.
Protocols have the advantage of being much more performant, being based on JVM class/method dispatch, which has been highly optimized. Additionally, protocols were designed to address the expression problem (great read), which makes them really powerful tools for crafting nice, modular, extensible APIs.
What are the advantages/disadvantages of (def)records or reify over (def)types?
On the side of how we specify the structure of data, we have a number of options available:
(def)records: produce a type good for "representing application domain information" (from http://clojure.org/datatypes; worth a read)
(def)types: produce a lighter weight type for creating "artifacts of the implementation/programming domain", such as the standard collection types
reify: construct a one-off object with an anonymous type implementing one or more protocols; good for... one-off things which need to implement a protocol(s)
Practically, records behave like Clojure hash-maps, but have the added benefit of being able to implement protocols and having faster attribute lookup. Conveniently, they remain extensible via assoc, though attributes added in this fashion do not share the compiled lookup performance. This is what makes these constructs convenient for implementing application logic. Using deftype is advantageous for aspects of the implementation/programming domain because it doesn't carry excess baggage, making its use cleaner in these cases.
Protocols create interfaces, and interfaces are, well, the interface to a type. They describe some aspects of a type, though with much less rigour than you would come to expect in a language like Haskell. Compared to protocols, a static type system gives you:
machine checking
type inference (you don't get some of your protocols generated from docs of others)
parametric polymorphism (parameterised protocols / protocols with generics don't exist)
higher order protocols (what is the protocol for a function that returns a protocol?)
automatic generation of code / boilerplate
inter-operation with automated tools
I want to ask what sort of type-safety language constructs there are in Clojure.
I've read 'Practical Clojure' by Luke VanderHart and Stuart Sierra several times now, but I still have the distinct impression that Clojure (like other Lisps) doesn't take compile-time validation checking very seriously. Type safety is just one (very popular) strategy for doing compile-time checking of correct semantics.
I'm asking this question because I'm aching to be proven wrong: what sort of design patterns are available in Clojure to validate (at compile time, not at run time) that a function that expects a string doesn't get called with, say, a list of integers?
Also, I've read very smart people like Paul Graham openly advocate that Lisp allows you to implement everything from lower-level languages on top of it (most would say that the languages themselves end up being reimplemented on top of it), so if that assertion is true, then something like type checking should trivially be a piece of cake. So do you feel that there exist type systems (or the ability to implement such type systems) in Clojure or other Lisps that give the programmer the ability to shift validation checking from run time to compile time, or even better, design time?
Compilation units in Clojure are very small: a single function. Lispers tend to change small portions of running programs while they develop. Introducing static type checking into this style of development is problematic; for a deeper discussion of why, I recommend the post Types are Anti-Modular by Gilad Bracha. Thus Clojure prefers pre/post-conditions, which jibe better with Lisp's highly REPL-oriented development.
That said, it's certainly desirable and possible to build an a la carte type system for Clojure. This trail has been blazed by Qi/Shen and Typed Racket. This functionality could easily be provided as a library. I'm hoping to build something like that in the future with core.logic - https://github.com/clojure/core.logic.
Since Clojure is a dynamic language, the whole idea is not to check types (or much of anything) at compile time.
Even when you add type hints to your functions, they do not get checked at compile time.
Since Clojure is a Lisp, you can do whatever you want at compile time with macros, and macros are powerful enough that you can write your own type system. Some people have made type systems for Lisps, such as Typed Racket and Qi. These type systems can be just as powerful as any type system in a "normal" language.
OK, we now know that it is possible, but does Clojure have such an optional type system? The answer is currently no, but there is a logic engine (core.logic) that could be used to implement a type system; its author has not (yet) worked in that direction.
There is a library that adds an optional type system to Clojure:
http://typedclojure.org/
Rationale
Static typing has well-known benefits. For example, statically typed languages catch many common programming errors at the earliest time possible: compile time. Types also serve as an excellent form of (machine-checkable) documentation that almost always augments existing hand-written documentation.
Languages without static type checking (dynamically typed) bring other benefits. Without the strict rigidity of mandatory static typing, they can provide more flexible and forgiving idioms that can help in rapid prototyping. Often the benefits of static type checking are desired as the program grows.
This work adds static type checking (and some of its benefits) to Clojure, a dynamically typed language, while still preserving idioms that characterise the language. It allows static and dynamically typed code to be mixed so the programmer can use whichever is more appropriate.
I have been learning about various functional languages for some time now including Haskell, Scala and Clojure. Haskell has a very strict and well-defined static type system. Scala is also statically typed. Clojure on the other hand, is dynamically typed.
So my questions are
What role does the type system play in a functional language?
Is it necessary for a language to have a type system for it to be functional?
How is the "functional" level of a language related to the kind of the type system of the language?
A language does not need to be typed to be functional - at the heart of functional programming is the lambda calculus, which comes in untyped and typed variants.
The type system plays two roles:
it provides a guarantee at compile time that a class of errors cannot occur at run-time. The class of errors usually includes things like trying to add two strings together, or trying to apply an integer as a function.
it has some efficiency benefits, in that objects at runtime do not need to carry their types around, because the types have already been established at compile-time. This is known as type erasure.
In advanced type systems like Haskell's, the type system can provide more benefits:
overloading: using one identifier to refer to operations on different types
it allows a library to automatically choose an optimised implementation based on what type it is used at (using Type Families)
it allows powerful invariants to be proven at compile time, such as the invariant in a red-black tree (using Generalised Algebraic Datatypes)
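As a hedged OCaml sketch of that last point (the type names here are illustrative, and this is far simpler than a red-black tree): a GADT can index a list by its length, so taking the head of an empty vector becomes a compile-time error instead of a run-time one.

(* Type-level natural numbers used only as indices. *)
type z
type 'n s

(* ('a, 'n) vec is a list of 'a whose length is encoded in 'n. *)
type (_, _) vec =
  | Nil : ('a, z) vec
  | Cons : 'a * ('a, 'n) vec -> ('a, 'n s) vec

(* head only accepts provably non-empty vectors, so no Nil case is needed. *)
let head : type a n. (a, n s) vec -> a = function
  | Cons (x, _) -> x

let v = Cons (1, Cons (2, Nil))
let _ = head v
(* let _ = head Nil   -- rejected by the compiler *)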
What role does the type system play in a functional language?
To Simon Marlow's excellent answer, I would add that a type system, especially one that includes algebraic data types, makes it easier to write programs:
Software designs, which in object-oriented languages are sometimes expressed using UML diagrams, are very clearly expressed using types. This clarity manifests especially when not only values have types, but also modules have types, as in Objective Caml or Standard ML.
When a person is writing code, a couple of simple heuristics make it very, very easy to write pure functions based on the types:
A value of function type can always be created with a lambda.
A value of function type can always be consumed by applying it.
A value of an algebraic data type can be created by applying any of the type's constructors.
A value of an algebraic data type can be consumed by scrutinizing it with a case expression.
Based on these observations, and on the simple rule that unless there's a good reason, a function should consume each of its arguments, it's pretty easy to cut down the space of possible code you could write to a very small number of candidates. For example, there just aren't that many sensible functions of type (using Haskell notation)
forall a . (a -> Bool) -> [a] -> Bool
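As a hedged sketch in OCaml, where that type is spelled ('a -> bool) -> 'a list -> bool, the sensible inhabitants are essentially just these two (plus trivial constant functions):

(* "Some element satisfies p" and "every element satisfies p":
   the type steers you toward applying p to list elements and
   combining the resulting booleans. *)
let rec exists p = function
  | [] -> false
  | x :: rest -> p x || exists p rest

let rec for_all p = function
  | [] -> true
  | x :: rest -> p x && for_all p rest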
The art of using types to create code is called type-directed programming. When it works well, you hear functional programmers say things like "once we got the types right, the code practically wrote itself." Since the types are usually much smaller than the code, this is a big win.
Same as in any programming language: it helps you avoid and find errors in your code. In the case of static typing, a good type system prevents programs with certain kinds of errors from compiling.
No. The untyped lambda calculus is what you could call the prototype of functional programming languages and it is, as the name suggests, entirely untyped.
In a functional language (as well as any other language where a function can be used as a value) the type system needs to know what the type of a function is. Other than that there is nothing special about type systems for functional languages.
In a purely functional language you need to abstract side-effects, so you'd want the type system to somehow be able to support that. For example if you want to have a world type like in Clean, you'd want the type system to support uniqueness types to ensure proper usage.
If you want to have an IO monad like in Haskell, you'd need an IO type (though a monad typeclass like Haskell's is not required in order to have an IO monad, so you don't need a type system that supports that).
1: Same as in any other language: it stops you from doing operations that are either ill-defined or whose result would be 'nonsensical' to humans, like float addition on integers.
2: Nope, the oldest programming language in the world, the (untyped) lambda calculus, is both functional and untyped.
3: Hardly; functional just means no side effects, no mutations, referential transparency, et cetera.
Just remember that the oldest functional language, the untyped lambda calculus, has no type system.