What is the difference between the different ways to type variables? [duplicate] - ocaml

OCaml has several different syntaxes for a polymorphic type annotation:
let f : 'a -> 'a = … (* Isn’t this one already polymorphic? (answer: NO) *)
let f : 'a. 'a -> 'a = …
let f : type a. a -> a = …
We often see them when using fancy algebraic datatypes (typically, GADTs), where they seem to be necessary.
What is the difference between these syntaxes? When and why must each one be used?

Below are alternative explanations with a varying amount of detail, depending on how much of a hurry you’re in. ;-)
I will use the following code (drawn from that other question) as a running example. Here, the type annotation on the definition of reduce is actually required to make it typecheck.
(* The type [('a, 'c) fun_chain] represents a chainable list of functions, i.e.
 * such that the output type of one function is the input type of the next one;
 * ['a] and ['c] are the input and output types of the whole chain.
 * This is implemented as a recursive GADT (generalized algebraic data type). *)
type (_, _) fun_chain =
  | Nil : ('a, 'a) fun_chain
  | Cons : ('a -> 'b) * ('b, 'c) fun_chain -> ('a, 'c) fun_chain
(* [reduce] reduces a chain to just one function by composing all
 * functions of the chain. *)
let rec reduce : type a c. (a, c) fun_chain -> a -> c =
  fun chain x ->
    begin match chain with
    | Nil -> x
    | Cons (f, chain') -> reduce chain' (f x)
    end
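To see what reduce computes, here is a minimal, self-contained usage sketch. It repeats the definitions above so that it can be compiled on its own; the sample chain and the name digits are made up for illustration.

```ocaml
type (_, _) fun_chain =
  | Nil : ('a, 'a) fun_chain
  | Cons : ('a -> 'b) * ('b, 'c) fun_chain -> ('a, 'c) fun_chain

let rec reduce : type a c. (a, c) fun_chain -> a -> c =
  fun chain x ->
    match chain with
    | Nil -> x
    | Cons (f, chain') -> reduce chain' (f x)

(* Hypothetical chain: int -> string -> int (number of decimal digits). *)
let digits : (int, int) fun_chain =
  Cons (string_of_int, Cons (String.length, Nil))

(* 12345 is turned into "12345", whose length is 5. *)
let () = assert (reduce digits 12345 = 5)

(* The empty chain behaves like the identity function. *)
let () = assert (reduce Nil "unchanged" = "unchanged")
```

Note that the intermediate type (string here) does not appear in the type of digits; that hidden type is what makes the recursive call in reduce polymorphic.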
The short story
On let-definitions, an annotation like : 'a -> 'a does not force polymorphism: the type-checker may refine the unification variable 'a to something more specific (for instance int). This bit of syntax is misleading, because the same annotation on a val-declaration, i.e. in a module signature, does enforce polymorphism.
: type a. … is a type annotation with explicit (forced) polymorphism. You can think of this as the universal quantifier (∀ a, “for all a“). For instance,
let some : type a. a -> a option =
fun x -> Some x
means that “for all” type a, you can give an a to some and then it will return an a option.
The code at the beginning of this answer makes use of advanced features of the type system, namely, polymorphic recursion and branches with different types, and that leaves type inference at a loss. In order to have a program typecheck in such a situation, we need to force polymorphism like this. Beware that in this syntax, a is a type name (no leading quote) rather than a type unification variable.
: 'a. … is another syntax that forces polymorphism, but it is practically subsumed by : type a. … so you will hardly need it at all.
The pragmatic story
: type a. … is a short-hand syntax that combines two features:
an explicitly polymorphic annotation : 'a. …
useful for ensuring a definition is as general as intended
required when recursion is done with type parameters different from those of the initial call (“polymorphic recursion” i.e. recursion on “non-regular” ADTs)
a locally abstract type (type a) …
required when different branches have different types (i.e. when pattern-matching on “generalized” ADTs)
allows you to refer to type a from inside the definition, typically when building a first-class module (I won’t say more about this)
Here we use the combined syntax because our definition of reduce falls under both of the situations marked "required" above.
We have polymorphic recursion because Cons builds a (a, c) fun_chain from a (b, c) fun_chain: the first type parameter differs (we say that fun_chain is a “non-regular” ADT).
We have branches with different types because Nil builds a (a, a) fun_chain whereas Cons builds a (a, c) fun_chain (we say that fun_chain is a “generalized” ADT, or GADT for short).
Just to be clear: : 'a. … and : type a. … produce the same signature for the definition. Choosing one syntax or the other only influences how its body is typechecked. For most intents and purposes, you can forget about : 'a. … and just remember the combined form : type a. …. Alas, the latter does not completely subsume the former: there are rare situations where writing : type a. … wouldn’t work and you would need : 'a. … (see @octachron’s answer) but, hopefully, you won’t stumble upon them often.
The long story
Explicit polymorphism
OCaml type annotations have a dirty little secret: writing let f : 'a -> 'a = … doesn’t force f to be polymorphic in 'a. The compiler unifies the provided annotation with the inferred type and is free to instantiate the type variable 'a while doing so, leading to a less general type than intended. For instance let f : 'a -> 'a = fun x -> x+1 is an accepted program and leads to val f : int -> int. To ensure the function is indeed polymorphic (i.e. to have the compiler reject the definition if it is not general enough), you have to make the polymorphism explicit, with the following syntax:
let f : 'a. 'a -> 'a = …
For a non-recursive definition, this is merely the human programmer adding a constraint which makes more programs be rejected.
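The contrast between the two annotations can be checked with a small sketch. The names f, id and bad are chosen for illustration only; the rejected definition is left in a comment so that the block compiles.

```ocaml
(* Accepted: 'a is only a unification variable, so it is refined to int
   and the definition gets the type int -> int. *)
let f : 'a -> 'a = fun x -> x + 1

(* Accepted: the body really is polymorphic in 'a. *)
let id : 'a. 'a -> 'a = fun x -> x

(* Rejected if uncommented: the body has type int -> int, which is less
   general than the explicitly polymorphic annotation.
let bad : 'a. 'a -> 'a = fun x -> x + 1 *)

let () = assert (f 41 = 42)
let () = assert (id 1 = 1 && id "one" = "one")
```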
In the case of a recursive definition however, this has another implication. When typechecking the body, the compiler will unify the provided type with the types of all occurrences of the function being defined. Type variables which are not marked as polymorphic will be made equal in all recursive calls. But polymorphic recursion is precisely when we recurse with differing type parameters; without explicit polymorphism, that would either fail or infer a less general type than intended. To make it work, we explicitly mark which type variables should be polymorphic.
Note that there is a good reason why OCaml cannot typecheck polymorphic recursion on its own: there is undecidability around the corner (see Wikipedia for references).
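Polymorphic recursion can also be illustrated without any GADT. The sketch below uses a hypothetical non-regular type, nested, and a hypothetical function depth; neither comes from the original question.

```ocaml
(* A non-regular ("nested") type: each level stores pairs of the previous
   level's element type. *)
type 'a nested =
  | End
  | More of 'a * ('a * 'a) nested

(* The recursive call is made at type ('a * 'a) nested rather than
   'a nested, so the explicit quantifier 'a. is needed here. *)
let rec depth : 'a. 'a nested -> int = function
  | End -> 0
  | More (_, rest) -> 1 + depth rest

let () = assert (depth (More (1, More ((2, 3), End))) = 2)
```

With the annotation weakened to 'a nested -> int, the type-checker would try to make 'a equal to 'a * 'a in the recursive call and reject the definition.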
As an example, let’s do the job of the typechecker on this faulty definition, where polymorphism is not made explicit:
(* does not typecheck! *)
let rec reduce : ('a, 'c) fun_chain -> 'a -> 'c =
  fun chain x ->
    begin match chain with
    | Nil -> x
    | Cons (f, chain') -> reduce chain' (f x)
    end
We start with reduce : ('a, 'c) fun_chain -> 'a -> 'c and chain : ('a, 'c) fun_chain for some type variables 'a and 'c.
In the first branch, chain = Nil, so we learn that in fact chain : ('c, 'c) fun_chain and 'a == 'c. We unify both type variables. (That doesn’t matter right now, though.)
In the second branch, chain = Cons (f, chain') so there exists an arbitrary type b such that f : 'a -> b and chain' : (b, 'c) fun_chain. Then we must typecheck the recursive call reduce chain', so the expected argument type ('a, 'c) fun_chain must unify with the provided argument type (b, 'c) fun_chain; but nothing tells us that b == 'a. So we reject this definition, preferably (as is tradition) with a cryptic error message:
Error: This expression has type ($Cons_'b, 'c) fun_chain
but an expression was expected of type ('c, 'c) fun_chain
The type constructor $Cons_'b would escape its scope
If now we make polymorphism explicit:
(* still does not typecheck! *)
let rec reduce : 'a 'c. ('a, 'c) fun_chain -> 'a -> 'c =
…
Then typechecking the recursive call is not a problem anymore, because we now know that reduce is polymorphic with two “type parameters” (non-standard terminology), and these type parameters are instantiated independently at each occurrence of reduce; the recursive call uses b and 'c even though the enclosing call uses 'a and 'c.
Locally abstract types
But we have a second problem: the other branch, for constructor Nil, has made 'a be unified with 'c. Hence we end up inferring a less general type than what the annotation mandated, and we report an error:
Error: This definition has type 'c. ('c, 'c) fun_chain -> 'c -> 'c
which is less general than 'a 'c. ('a, 'c) fun_chain -> 'a -> 'c
The solution is to turn the type variables into locally abstract types, which cannot be unified (but we can still have type equations about them). That way, type equations are derived locally to each branch and they do not leak outside of the match with construct.
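A smaller example may help to see branch-local type equations at work. This is only a sketch: the type ty and the function default are hypothetical names, not part of the running example.

```ocaml
(* A GADT whose constructors fix the type parameter differently. *)
type _ ty =
  | Int : int ty
  | Str : string ty

(* With the locally abstract type a, each branch learns its own equation
   (a = int or a = string), and that equation stays inside the branch. *)
let default : type a. a ty -> a = function
  | Int -> 0
  | Str -> ""

let () = assert (default Int = 0)
let () = assert (default Str = "")
```

Replacing type a. with 'a. in this annotation should make the definition fail to typecheck, for the same reason as the Nil branch of reduce.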

The practical answer when hesitating between 'a . ... and type a. ... is to always use the latter form:
type a. ... works with:
polymorphic recursion
GADTs
raising type errors early
whereas:
'a. ... works with
polymorphic recursion
polymorphic quantification over row type variables
Thus type a. ... is generally a strictly superior version of 'a. ..., except for that last, rather exotic point.
For the sake of exhaustiveness, let me give an example of quantification over a row type variable:
let f: 'a. ([> `X ] as 'a) -> unit = function
| `X -> ()
| _ -> ()
Here the universal quantification allows us to control precisely the row variable type. For instance,
let f: 'a. ([> `X ] as 'a) -> unit = function
| `X | `Y -> ()
| _ -> ()
yields the following error
Error: This pattern matches values of type [? `Y ]
but a pattern was expected which matches values of type [> `X ]
The second variant type is bound to the universal type variable 'a,
it may not allow the tag(s) `Y
This use case is not supported by the form type a. ..., mostly because the interaction of locally abstract types, GADT type refinement and type constraints has not been formalized.
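For completeness, here is how the accepted definition of f behaves when called. This is a small sketch; the tag `Other is made up for illustration.

```ocaml
let f : 'a. ([> `X ] as 'a) -> unit = function
  | `X -> ()
  | _ -> ()

(* The row variable stays open, so any variant type containing `X is fine. *)
let () = assert (f `X = ())
let () = assert (f `Other = ())
```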

TL;DR: In your question, only the last two forms are polymorphic type annotations. The latter of these two forms, in addition to annotating a type as polymorphic, introduces a locally abstract type (see note 1 below). This is the only difference.
The longer story
Now let's speak a little bit about the terminology. The following is not a type annotation (or, more properly, doesn't contain any type annotations),
let f : 'a -> 'a = …
It is called a type constraint. A type constraint requires the type of the defined value to be compatible with the specified type schemata.
In this definition,
let f : 'a. 'a -> 'a = …
we have a type constraint that includes a type annotation. The phrase "type annotation" in OCaml parlance means: annotating a type with some information, i.e., attaching some attribute or a property to a type. In this case, we annotate type 'a as polymorphic. We're not annotating the value f as polymorphic, nor are we annotating the value f with type 'a -> 'a or 'a. 'a -> 'a. We are constraining the value of f to be compatible with type 'a -> 'a and annotating 'a as a polymorphic type variable.
For a long time, the 'a. syntax was the only way to annotate a type as polymorphic, but later OCaml introduced locally abstract types. They have the following syntax, which you could also add to your collection.
let f (type t) : t -> t = ...
This creates a fresh abstract type constructor that you can use in the scope of the definition. It doesn't annotate t as polymorphic though, so if you want it to be explicitly annotated as polymorphic you could write,
let f : 'a. 'a -> 'a = fun (type t) (x : t) : t -> ...
which includes both an explicit type annotation of 'a as polymorphic and the introduction of a locally abstract type. Needless to say, it is cumbersome to write such constructions, so a little bit later (OCaml 4.00) syntactic sugar was introduced so that the above expression could be written as simply as,
let f : type t. t -> t = ...
Therefore, this syntax is just an amalgamation of two rather orthogonal features: locally abstract types and explicitly polymorphic types.
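The equivalence between the sugared and the expanded form can be sketched on the identity function. The names id_sugar and id_expanded are illustrative, and the expanded form below follows the shape described in the OCaml manual (an explicitly polymorphic annotation plus fun (type t) -> ...), which differs slightly in layout from the one-liner above.

```ocaml
(* Sugared form: polymorphic annotation and locally abstract type at once. *)
let id_sugar : type t. t -> t = fun x -> x

(* Expanded form: the two features written separately. *)
let id_expanded : 'a. 'a -> 'a =
  fun (type t) -> ((fun x -> x) : t -> t)

let () = assert (id_sugar 1 = 1)
let () = assert (id_expanded "a" = "a")
```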
However, the result of this amalgamation is not stronger than its parts; it is more like an intersection. While the generated type is both locally abstract and polymorphic, it is constrained to be a ground type. In other words, it constrains the kind of the type, but that is a completely different problem, namely higher-kinded polymorphism.
And to conclude the story, despite the syntax similarities, the following is not a type annotation,
val f : 'a -> 'a
It is called a value specification, which is a part of a signature, and it denotes that the value f has type 'a -> 'a.
1) Locally abstract types have two main use cases. First, you can use them inside your expression in places where type variables are not permitted, e.g., in module and exception definitions. Second, the scope of the locally abstract type exceeds the scope of the function, which you can exploit by unifying types that are local to your expression with the abstract type to extend their scopes. The underlying idea is that an expression cannot outlive its type, and since in OCaml types can be created at run time, we have to be careful with the extent of the type as well. Unifying a locally created type with a locally abstract type via a function parameter guarantees that this type will be unified with some existing type at the place of the function application. Intuitively, it is like passing a reference to a type, so that the type can be returned from the function.
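The first use case (naming the type inside a local module definition) can be sketched as follows. The function sort_uniq is a hypothetical helper written for this illustration; Set.Make, of_list and elements are from the standard library.

```ocaml
(* The locally abstract type a can be mentioned inside the local module,
   where a type variable such as 'a would not be allowed. *)
let sort_uniq (type a) (cmp : a -> a -> int) (xs : a list) : a list =
  let module S = Set.Make (struct
    type t = a
    let compare = cmp
  end) in
  S.elements (S.of_list xs)

let () = assert (sort_uniq compare [3; 1; 3; 2] = [1; 2; 3])
```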

Related

What does it mean to define a type with no constructors in OCaml?

I saw this OCaml code in the Coq codebase:
type ('a, 'b, 'c) tag
which seems to create a type tag that takes three type arguments 'a, 'b, 'c but has no constructors…? So how do we even construct values of this type?
Because this is in reference to a specification in an interface, this simply means that constructors are not exposed as part of the interface.
However, this does not mean there is no way to obtain a value of this type. In this case, it appears the primary (possibly only; I am unfamiliar with Coq's API) way to obtain a value of this type would be to use get_arg_tag from the containing module.

How to use Js.Option.map?

For this code:
// val map : ('a -> 'b [@bs]) -> 'a option -> 'b option
let optTest: Js.Option.t(int) = Js.Option.map(x => x, Js.Option.some(1));
I am getting the following error:
This expression should not be a function, the expected type is (. 'a) => 'b
where x => x is red. I am really confused: why isn't map working? From its type signature it looks like I am using it correctly, yet the compiler says the first argument is not supposed to be a function?
Short answer - use Belt.Option.map instead:
let optTest: option(int) = Belt.Option.map(Some(1), x => x);
Long answer:
The Js namespace is intended mostly for bindings to JavaScript's standard APIs. And while Js.Option was included in this namespace for historical reasons, the conventions used in the Js namespace are still those of very thin bindings.
The type you see for the callback function in the documentation, 'a -> 'b [@bs], and the type you see in the error message, (. 'a) => 'b, are exactly the same type. But the former is in OCaml's syntax while the latter is in Reason's, and also sugared up to look less offensive. Either way, the problem is that you're passing it an ordinary function when it expects this weird other kind of function.
The weird other kind of function is called an uncurried function. It's called that because "normal" functions in Reason are curried, while JavaScript functions are not. An uncurried function is therefore essentially just a native JavaScript function, which you sometimes need to deal with because you might receive one or need to pass one to a higher-order JavaScript function, like those in the Js namespace.
So how do you create an uncurried function in Reason? Just add a ., like in the type:
let optTest: option(int) = Js.Option.map((. x) => x, Some(1));
Or if you want to do it without the sugar (which you don't, but for the sake of completeness):
let optTest: option(int) = Js.Option.map([@bs] x => x, Some(1));
Addendum:
You might have noticed that I've replaced Js.Option.t and Js.Option.some in your example with option and Some. That's because those are the actual primitives. The option type is essentially defined as
type option('a) =
| Some('a)
| None
and is available everywhere.
Js.Option.t('a) (and Belt.Option.t('a)) is just an alias. And Js.Option.some is just a convenience function which doesn't have any equivalent in Belt.Option. They're mostly just there for consistency, and you should normally use the actual type and variant constructors instead.

Prevent SML type from becoming eqtype without hiding constructors

In datatype declarations, Standard ML will produce an equality type if all of the type arguments to all of the variants are themselves eqtypes.
I've seen comments in a few places lamenting the inability of users to provide their own definition of equality and construct their own eqtypes and unexpected consequences of the SML rules (e.g. bare refs and arrays are eqtypes, but datatype Foo = Foo of (real ref) is not an eqtype).
Source: http://mlton.org/PolymorphicEquality
one might expect to be able to compare two values of type real t, because pointer comparison on a ref cell would suffice. Unfortunately, the type system can only express that a user-defined datatype admits equality or not.
I'm wondering whether it is possible to block eqtyping. Say, for instance, I am implementing a set as a binary tree (with an unnecessary variant) and I want to pledge away the ability to structurally compare sets with each other.
datatype 'a set = EmptySet | SetLeaf of 'a | SetNode of 'a * 'a set * 'a set;
Say I don't want people to be able to distinguish SetLeaf(5) and SetNode(5, EmptySet, EmptySet) with = since it's an abstraction-breaking operation.
I tried a simple example with datatype on = On | Off just to see if I could demote the type to a non-eqtype using signatures.
(* attempt to hide the "eq"-ness of eqtype *)
signature S = sig
  type on
  val foo : on
end
(* opaque ascription to kill eqtypeness *)
structure X :> S = struct
  datatype on = On | Off
  val foo = On
end
It seems that transparent ascription fails to prevent X.on from becoming an eqtype, but opaque ascription does prevent it. However, these solutions are not ideal because they introduce a new module and hide the data constructors. Is there a way to prevent a custom type or type constructor from becoming an eqtype or admitting equality without hiding its data constructors or introducing new modules?
The short answer is no. When a type's definition is visible, its eq-ness is whatever the definition implies. The only way to prevent it from being eq then is to tweak the definition such that it isn't, for example, by adding a dummy constructor with a real parameter.
Btw, small correction: your type Foo should be an equality type. If your SML implementation disagrees then it has a bug. A different case is real bar when datatype 'a bar = Bar of 'a ref (which is what the MLton manual discusses). The reason that the first one works but the second doesn't is that ref is magic in SML: it has a form of polymorphic eq-ness that user types cannot have.

printing polymorphic containers in ocaml toplevel

Say I have my own data structure, as a silly example,
type 'a mylist = Empty | Cons of 'a * ('a mylist).
I would like the toplevel to print this list in the form {a,b,...}. Here a, b of type 'a are printed according to a printing function installed in the toplevel with #install_printer, or if none is available, as <abstr>.
I know how I would define a printing function for a monomorphic mylist, but is there a polymorphic way to tell the toplevel to just put {, , and } and use what it already knows for any type that comes in between?
I don't think it's possible. The reason is that OCaml throws away types at run time, and therefore it is not possible to have a function which behaves differently depending on a type at run time. So you can't define such a polymorphic printing function. Note that #install_printer is not part of the OCaml language; it is a directive for the toplevel, which still knows about types. The only possible solution is to define a generic printing function which takes the printing function for 'a as a parameter. Something like
('a -> string) -> 'a mylist -> unit
But I think you already know that, don't you?
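A minimal sketch of such a generic printer, assuming the element printer is passed as a function of type 'a -> string. The helper names to_list, string_of_mylist and print_mylist are hypothetical.

```ocaml
type 'a mylist = Empty | Cons of 'a * 'a mylist

(* Convert to a standard list so that String.concat can be reused. *)
let rec to_list : 'a mylist -> 'a list = function
  | Empty -> []
  | Cons (x, rest) -> x :: to_list rest

(* Render as {a,b,...}, delegating each element to [to_string]. *)
let string_of_mylist (to_string : 'a -> string) (l : 'a mylist) : string =
  "{" ^ String.concat "," (List.map to_string (to_list l)) ^ "}"

(* Printing wrapper with the type suggested above. *)
let print_mylist to_string l = print_string (string_of_mylist to_string l)

let () =
  assert (string_of_mylist string_of_int (Cons (1, Cons (2, Empty))) = "{1,2}")
```

The caller still has to pick the element printer by hand for each instantiation, which is exactly the limitation described above.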

SML Warning: Type Vars Not Generalized when using Empty Lists or NONE option

I can't for the life of me figure out why the following SML function is throwing a Warning in my homework problem:
fun my_func f ls =
  case ls of
    [] => raise MyException
  | head :: rest => case f head of
                      SOME v => v
                    | NONE => my_func f rest
fun f a = if isSome a then a else NONE;
Whenever I call my_func with the following test functions:
my_func f [NONE, NONE];
my_func f [];
I always get the warning:
Warning: type vars not generalized because of
value restriction are instantiated to dummy types (X1,X2,...)
Whenever I pass in an options list containing at least one SOME value, this Warning is not thrown. I know it must be something to do with the fact that I am using polymorphism in my function currying, but I've been completely stuck as to how to get rid of these warnings.
Please help if you have any ideas - thank you in advance!
The value restriction referenced in the warning is one of the trickier things to understand in SML; however, I will do my best to explain why it comes up in this case and try to point you towards a few resources to learn more.
As you know, SML will use type inference to deduce most of the types in your programs. In this program, the type of my_func will be inferred to be ('a -> 'b option) -> 'a list -> 'b. As you noted, it's a polymorphic type. When you call my_func like this
my_func f [NONE, SOME 1, NONE];
... the type variables 'a and 'b are instantiated to int option and int.
However, when you call it without a value such as SOME 1 above
my_func f [NONE, NONE];
What do you think the type variables should be instantiated to? The types should be polymorphic -- something like 't option and 't for all types 't. However, there is a limitation which prevents values like this from taking on polymorphic types.
SML defines some expressions as non-expansive values and only these values may take on polymorphic types. They are:
literals (constants)
variables
function expressions
constructors (except for ref) applied to non-expansive values
a non-expansive value with a type annotation
tuples where each field is a non-expansive value
records where each field is a non-expansive value
lists where each element is a non-expansive value
All other expressions, notably function calls (which is what the call to my_func is) cannot be polymorphic. Neither can references. You might be curious to see that the following does not raise a warning:
fn () => my_func f [NONE, NONE];
Instead, the type inferred is unit -> 'a. If you were to call this function however, you would get the warning again.
My understanding of the reason for this restriction is a little weak, but I believe that the underlying root issue is mutable references. Here's an example I've taken from the MLton site linked below:
val r: 'a option ref = ref NONE
val r1: string option ref = r
val r2: int option ref = r
val () = r1 := SOME "foo"
val v: int = valOf (!r2)
This program does not typecheck under SML, due to the value restriction. Were it not for the value restriction, this program would have a type error at runtime.
As I said, my understanding is shaky. However I hope I've shed a little light on the issue you've run into, although I believe that in your case you could safely ignore the warning. Here are some references should you decide you'd like to dig deeper:
http://users.cis.fiu.edu/~smithg/cop4555/valrestr.html
http://mlton.org/ValueRestriction
(BTW the MLton site is solid gold. There's so much hidden away there, so if you're trying to understand something weird about SML, I highly recommend searching it because you'll likely turn up a lot more than you initially wanted.)
Since it seems like you're actually using SML/NJ, this is a pretty handy guide to the error messages and warnings that it will give you at compile time:
http://flint.cs.yale.edu/cs421/smlnj/doc/errors.html