Value restriction for records - ocaml

I face a situation where a record is given a weak polymorphic type and I am not sure why.
Here is a minimized example
module type S = sig
type 'a t
val default : 'a t
end
module F (M : S) = struct
type 'a record = { x : 'a M.t; n : int }
let f = { x = M.default; n = (fun x -> x) 3 }
end
Here f is given the type '_weak1 record.
There are (at least) two ways to solve that problem.
The first one consists in using an auxiliary definition for the function application.
let n = (fun x -> x) 3
let f = { x = M.default; n }
The second one consists in declaring the type parameter of t as covariant.
module type S = sig
type +'a t
val default : 'a t
end
What I find strange is that the function application is used to initialize the field of type int, which has no link at all with the type variable 'a of type t. I also fail to see why declaring 'a as covariant suddenly allows arbitrary expressions in this unrelated field without losing polymorphism.

For your first point, the relaxed value restriction is triggered as soon as any computation happens in any sub-expression. Thus
neither
{ x = M.default; n = (fun x -> x) 3 }
nor
let n = Fun.id 3 in { x = M.default; n }
is considered a value, and the value restriction applies to both of them.
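As a rough sketch of the first workaround from the question: moving the application into its own structure-level binding (rather than a local let ... in) leaves the record expression built only from identifiers, so it counts as a value and is generalized. The module List_default and the instance R are hypothetical names introduced only for this illustration.

```ocaml
module type S = sig
  type 'a t
  val default : 'a t
end

module F (M : S) = struct
  type 'a record = { x : 'a M.t; n : int }

  (* The computation happens in its own binding... *)
  let n = (fun x -> x) 3

  (* ...so this record is a syntactic value and stays polymorphic. *)
  let f = { x = M.default; n }
end

(* Hypothetical argument module, assumed only for this example. *)
module List_default = struct
  type 'a t = 'a list
  let default = []
end

module R = F (List_default)

(* The same R.f is used at two different instances of 'a. *)
let as_ints : int R.record = R.f
let as_strings : string R.record = R.f
```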
For your second point, this is the relaxed value restriction at work: if a type variable only appears in strictly covariant positions, it can always be generalized. For instance, the type of
let none = Fun.id None
is 'a. 'a option and not '_weak1 option because the option type constructor is covariant in its first parameter.
The brief explanation for this relaxation of the value restriction is that a covariant type parameter corresponds to a positive immutable piece of data, for instance
type !+'a option = None | Some of 'a
or
type +'a t = A
Thus if we have a type variable that only appears in strictly covariant positions, we know that it is not bound to any mutable data, and it can thus be safely generalized.
An important point to notice, however, is that the only values of type 'a t, for a t covariant in its first parameter, are precisely those that do not contain any 'a. Thus, if I have a value of type 'a. 'a option, I know that I have in fact a None. We can in fact check that point with the help of the typechecker:
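To make the contrast concrete, here is a small sketch (the binding names are made up for illustration) comparing an application whose result type is covariant with one whose result type mentions a mutable reference:

```ocaml
(* Covariant position: the result of an application is still generalized. *)
let none = Fun.id None
let as_int : int option = none
let as_string : string option = none

(* Invariant position: 'a ref is mutable, so the variable stays weak
   ('_weak1 option ref) until its first use fixes it. *)
let cell = Fun.id (ref None)
let () = cell := Some 1
(* From here on, cell : int option ref; writing
   cell := Some "x" would be a type error. *)
```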
type forall_option = { x:'a. 'a option }
type void = |
let for_all_option_is_none {x} = match (x: void option) with
| None -> ()
| _ -> . (* this `.` means that this branch cannot happen *)
Here, by constraining the type 'a. 'a option to void option, we have made the typechecker aware that x was in fact a None.

Related

Module, what does "with type" do?

module type ORDER = sig
type t
val leq : t -> t -> bool
val equal : t -> t -> bool
end
module Int:ORDER with type t = int = struct
type t = int
let leq = (<=)
let equal = (=)
end
Can someone explain to me this line :
module Int:ORDER with type t = int = struct
this --> with type t = int
I've tried without it :
Int.equal 3 3
Line 1, characters 10-11:
Error: This expression has type int but an expression was expected of type
Int.t
I can see what it does, but I'm unable to explain in words what is happening. Thank you.
The colon operator in a module expression isn't just a signature constraint, it's also an abstraction construct. M : S requires the module M to have the signature S, and tells the compiler to forget everything about typing of M except for what is specified in S. This is where abstraction is born.
Given the definition module Int : S = struct … end (which is syntactic sugar for module Int = (struct … end : S)), all the compiler knows about the types of elements of Int is what is recorded in S. If S is ORDER, its type t is abstract, and therefore Int.t is an abstract type: the fact that Int.t is actually an alias for int is hidden. Hiding the real implementation of a type is exactly what abstract types are about.
The signature that is actually desired for Int is
sig
type t = int
val leq : t -> t -> bool
val equal : t -> t -> bool
end
This is almost exactly the signature called ORDER, but with the type t being an alias for int rather than abstract. The with type construct allows using the name ORDER to construct the module type expression above. Given the definition of ORDER, writing
module Int : ORDER with type t = int = struct … end
is equivalent to writing
module Int : sig
type t = int
val leq : t -> t -> bool
val equal : t -> t -> bool
end = struct … end
Since the type Int.t is transparently equal to int, it can be used interchangeably with int.
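A minimal sketch of the difference, using the same ORDER signature; Impl, Opaque and Transparent are hypothetical names chosen for this illustration:

```ocaml
module type ORDER = sig
  type t
  val leq : t -> t -> bool
  val equal : t -> t -> bool
end

module Impl = struct
  type t = int
  let leq = ( <= )
  let equal = ( = )
end

(* Sealed with the bare signature: Opaque.t is abstract. *)
module Opaque : ORDER = Impl

(* Sealed with the refined signature: Transparent.t is known to be int. *)
module Transparent : ORDER with type t = int = Impl

let ok = Transparent.equal 3 3
let smaller = Transparent.leq 2 3

(* Opaque.equal 3 3 would be rejected: 3 has type int,
   but a value of type Opaque.t is expected. *)
```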
An ORDER signature is mostly useful to pass the type to functors like Set and Map that build data structures that rely on an ordering relation for their elements. The data structure only depends on the abstract properties of the order relation, but the code that uses the data structure can still be aware of the type of the elements (thanks to another with type constraint, this one equating the type of data structure elements with the type of the functor argument). See “functors and type abstraction” in the language introduction.
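The following sketch imitates that pattern on a much smaller scale than Set or Map. The signature MAX, the functor Make_max and the module Int_order are invented for this example and are not part of the standard library:

```ocaml
module type ORDER = sig
  type t
  val leq : t -> t -> bool
  val equal : t -> t -> bool
end

module Int_order : ORDER with type t = int = struct
  type t = int
  let leq = ( <= )
  let equal = ( = )
end

(* Hypothetical output signature with an abstract element type. *)
module type MAX = sig
  type elt
  val max_of : elt list -> elt option
end

(* The functor only relies on the ordering, but its result signature
   equates elt with the argument's type via "with type". *)
module Make_max (O : ORDER) : MAX with type elt = O.t = struct
  type elt = O.t

  let max_of = function
    | [] -> None
    | x :: rest ->
      Some (List.fold_left (fun acc y -> if O.leq acc y then y else acc) x rest)
end

module Int_max = Make_max (Int_order)

(* Plain ints are accepted because Int_max.elt = Int_order.t = int. *)
let m = Int_max.max_of [ 3; 7; 5 ]
```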

Why do the inferred types differ between let f = List.map fst vs let g x = List.map fst x

In OCaml, the inferred type of
let f = List.map fst
is
val f : ('_weak1 * '_weak2) list -> '_weak1 list = <fun>
while the inferred type of
let g x = List.map fst x
is
val g : ('a * 'b) list -> 'a list = <fun>
(types taken from utop).
As a result of this, f cannot be used polymorphically, while g can.
Why does this eta conversion between pure functions cause such a difference in type inference?
The difference is due to the value restriction, which does not allow the first definition to be polymorphic: it is defined by an application, which is not a value. The second form is defined as a function, which is a value. The notation '_weakN indicates a monomorphic type that is not yet resolved, as opposed to a polymorphic type variable like 'a.
See this chapter for more background.
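A short sketch of the practical consequence, with f and g defined as in the question (the other binding names are only illustrative):

```ocaml
(* Defined by an application: the type variables are weak. *)
let f = List.map fst

(* Defined as a function: fully polymorphic. *)
let g x = List.map fst x

(* g can be used at several types. *)
let ints = g [ (1, "a"); (2, "b") ]
let strs = g [ ("x", 1.0) ]

(* The first use of f fixes its type to (int * char) list -> int list. *)
let first_use = f [ (1, 'a') ]
(* A later call such as f [ ("x", 1.0) ] would be a type error. *)
```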

OCaml why does an empty array have polymorphic type?

OCaml arrays are mutable. For most mutable types, even an "empty" value does not have polymorphic type.
For example,
# ref None;;
- : '_a option ref = {contents = None}
# Hashtbl.create 0;;
- : ('_a, '_b) Hashtbl.t = <abstr>
However, an empty array does have a polymorphic type
# [||];;
- : 'a array = [||]
This seems like it should be impossible since arrays are mutable.
It happens to work out in this case because the length of an array can't change and thus there's no opportunity to break soundness.
Are arrays special-cased in the type system to allow this?
The answer is simple: an empty array has a polymorphic type because it is a constant. Is it special-cased? Well, sort of, mostly because an array is a built-in type that is not represented as an ADT. So yes, in the is_nonexpansive function in typecore.ml, there is a case for the empty array
| Texp_array [] -> true
However, this is not a special case, it is just a matter of inferring which syntactic expressions form constants.
Note that, in general, the relaxed value restriction allows generalization of expressions that are non-expansive (not just syntactic constants as in the classical value restriction). A non-expansive expression is either an expression in normal form (i.e., a constant) or an expression whose computation wouldn't have any observable side effects. In our case, [||] is a perfect constant.
The OCaml value restriction is even more relaxed than that, as it allows the generalization of some expansive expressions when the type variables have positive variance. But this is a completely different story.
Also, ref None is not an empty value. A ref value by itself is just a record with one mutable field, type 'a ref = {mutable contents : 'a}, so it can never be empty. The fact that it contains an immutable value (or references the immutable value, if you like) doesn't make it either empty or polymorphic. The same goes for [|None|], which is also non-empty. It is a singleton. Besides, the latter has a weakly polymorphic type.
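As a small sketch of the difference between the empty array and a singleton array (binding names are illustrative only):

```ocaml
(* [||] is a constant, so its type 'a array is generalized. *)
let empty = [||]
let ints : int array = Array.append empty [| 1; 2 |]
let strs : string array = Array.append empty [| "a" |]

(* [| None |] is not a constant: its element type is weak
   until the first use fixes it. *)
let cell = [| None |]
let () = cell.(0) <- Some 1
(* cell now has type int option array. *)
```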
I don't believe so. Similar situations arise with user-defined data types, and the behaviour is the same.
As an example, consider:
type 'a t = Empty | One of { mutable contents : 'a }
As with an array, an 'a t is mutable. However, the Empty constructor can be used in a polymorphic way just like an empty array:
# let n = Empty in n, n;;
- : 'a t * 'b t = (Empty, Empty)
# let o = One {contents = None};;
val o : '_weak1 option t = One {contents = None}
This works even when there is a value of type 'a present, so long as it is not in a nonvariant position:
type 'a t = NonMut of 'a | Mut of { mutable contents : 'a }
# let n = NonMut None in n, n;;
- : 'a option t * 'b option t = (NonMut None, NonMut None)
Note that the argument of 'a t is still nonvariant and you will lose polymorphism when hiding the constructor inside a function or module (roughly because variance will be inferred from arguments of the type constructor).
# (fun () -> Empty) ();;
- : '_weak1 t = Empty
Compare with the empty list:
# (fun () -> []) ();;
- : 'a list = []

How to understand a type definition?

type 'a tree = Empty | Node of 'a * 'a tree * 'a tree;;
Is it a type definition, where a is a type parameter, and tree is the type name?
In Node of, is Node a built-in type of OCaml? What does of mean?
Thanks.
Yes, 'a is a type parameter and tree is a type name (these are usually called variants in OCaml). This is reversed order from most other languages. Node is a constructor (called tag in OCaml) and of is just a keyword in OCaml to specify the types of the constructor arguments. Node is not a built-in type of OCaml (it is not even a type, but rather, as I said, a constructor).
Hence Node (5, Empty, Node (6, Empty, Empty)) is something of type int tree (something like Tree<Int> in Java).
It may make more sense if you start with a simpler variant.
type shape = Square of int | Rectangle of int * int
Square and Rectangle are tags (again constructors) that I have just made up that allow me to construct values of type shape (in this case I've chosen to have Square take only one argument because only length is needed to specify a square, whereas Rectangles need both length and width). Nothing ever has a type of Square or Rectangle, but things can have a type of shape.
One way to read that line in English is "I have defined a type called shape. A shape is either a Square of a single integer, or a Rectangle of two integers."
Now maybe for some reason I also want to label my shapes.
type 'label labelledshape = LabelledSquare of 'label * int | LabelledRectangle of 'label * int * int
The quote ' distinguishes that label is not a type (such as int), but is rather a variable. This allows me to write something like LabelledSquare ("a label for a square", 5) which is of type string labelledshape
Note that although this allows for polymorphism, these are not what is known as "polymorphic variants" in OCaml. I will not talk about that here, rather I'll just recommend either looking at OCaml documentation or browsing Stack Overflow for more details on that.
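To see the shape and labelledshape definitions in use, here is a minimal sketch. The functions area and label_of are hypothetical helpers added for illustration:

```ocaml
type shape = Square of int | Rectangle of int * int

type 'label labelledshape =
  | LabelledSquare of 'label * int
  | LabelledRectangle of 'label * int * int

(* Pattern matching inspects which constructor built the value. *)
let area = function
  | Square side -> side * side
  | Rectangle (length, width) -> length * width

(* Works for any label type: 'label labelledshape -> 'label. *)
let label_of = function
  | LabelledSquare (l, _) -> l
  | LabelledRectangle (l, _, _) -> l

let a = area (Rectangle (2, 3))
let l = label_of (LabelledSquare ("a label for a square", 5))
```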
There are several kinds of type definitions:
Definition of a new sum type
From that type definition:
type 'a tree = Empty | Node of 'a * 'a tree * 'a tree;;
Here are roughly the "facts" that the OCaml compiler knows about based on it:
tree is a unary type constructor. That means that for any type t, t tree is a type as well (example: int tree).
Empty is a 0-ary constructor for 'a tree. That means that Empty has type t tree for any t.
Node is a 3-ary constructor. Here this means that if a has type t, b has type t tree, and c has type t tree, then Node (a, b, c) has type t tree (note: that's the same t).
those are the only two ways to have a value of type t tree. That means that you can use pattern matching (match ... with | Empty -> ... | Node (a, b, c) -> ...).
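These facts can be sketched with a small function over the tree type; size and example are names made up for this illustration:

```ocaml
type 'a tree = Empty | Node of 'a * 'a tree * 'a tree

(* Since Empty and Node are the only constructors,
   these two cases cover every possible tree. *)
let rec size = function
  | Empty -> 0
  | Node (_, left, right) -> 1 + size left + size right

(* A value of type int tree. *)
let example = Node (5, Empty, Node (6, Empty, Empty))
```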
Definition of a type alias
A definition can also be something that looks like:
type t = existing_type
In that case, t is just a new name for the existing_type.
For example:
type 'a pp = Format.formatter -> 'a -> unit
This means that something that is an int pp has type Format.formatter -> int -> unit.
This is the type of a function that takes a formatter and an integer and returns unit. For example:
type 'a pp = Format.formatter -> 'a -> unit
module M : sig
val pp_int : int pp
end = struct
let pp_int fmt n = Format.fprintf fmt "%d" n
end
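As a possible usage sketch, a printer of type 'a pp can be passed to the %a conversion of the Format module. The combinator pp_list below is a hypothetical helper written for this example:

```ocaml
type 'a pp = Format.formatter -> 'a -> unit

let pp_int : int pp = fun fmt n -> Format.fprintf fmt "%d" n

(* Builds a list printer from an element printer. *)
let pp_list (pp_elt : 'a pp) : 'a list pp =
 fun fmt xs ->
  Format.pp_print_list
    ~pp_sep:(fun fmt () -> Format.fprintf fmt "; ")
    pp_elt fmt xs

(* %a expects a printer followed by the value to print. *)
let s = Format.asprintf "[%a]" (pp_list pp_int) [ 1; 2; 3 ]
```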

Do OCaml 'underscore types' (e.g. '_a) introduce the possibility of runtime type errors / soundness violations?

I was reading a little bit about the value restriction in Standard ML and tried translating the example to OCaml to see what it would do. It seems like OCaml produces these types in contexts where SML would reject a program due to the value restriction. I've also seen them in other contexts like empty hash tables that haven't been "specialized" to a particular type yet.
http://mlton.org/ValueRestriction
Here's an example of a rejected program in SML:
val r: 'a option ref = ref NONE
val r1: string option ref = r
val r2: int option ref = r
val () = r1 := SOME "foo"
val v: int = valOf (!r2)
If you enter the first line verbatim into the SML of New Jersey repl you get
the following error:
- val r: 'a option ref = ref NONE;
stdIn:1.6-1.33 Error: explicit type variable cannot be generalized at its binding declaration: 'a
If you leave off the explicit type annotation you get
- val r = ref NONE
stdIn:1.6-1.18 Warning: type vars not generalized because of
value restriction are instantiated to dummy types (X1,X2,...)
val r = ref NONE : ?.X1 option ref
What exactly is this dummy type? It seems like it's completely inaccessible and fails to unify with anything
- r := SOME 5;
stdIn:1.2-1.13 Error: operator and operand don't agree [overload conflict]
operator domain: ?.X1 option ref * ?.X1 option
operand: ?.X1 option ref * [int ty] option
in expression:
r := SOME 5
In OCaml, by contrast, the dummy type variable is accessible and unifies with the first thing it can.
# let r : 'a option ref = ref None;;
val r : '_a option ref = {contents = None}
# r := Some 5;;
- : unit = ()
# r ;;
- : int option ref = {contents = Some 5}
This is sort of confusing and raises a few questions.
1) Could a conforming SML implementation choose to make the "dummy" type above accessible?
2) How does OCaml preserve soundness without the value restriction? Does it make weaker guarantees than SML does?
3) The type '_a option ref seems less polymorphic than 'a option ref. Why isn't let r : 'a option ref = ref None;; (with an explicit annotation) rejected in OCaml?
Weakly polymorphic types (the '_-style types) are a programming convenience rather than a true extension of the type system.
2) How does OCaml preserve soundness without the value restriction? Does it make weaker guarantees than SML does?
OCaml does not sacrifice the value restriction; rather, it implements a heuristic that saves you from systematically annotating the type of values like ref None whose type is only “weakly” polymorphic. This heuristic works by looking at the current “compilation unit”: if it can determine the actual type for a weakly polymorphic type, then everything works as if the initial declaration had the appropriate type annotation; otherwise the compilation unit is rejected with the message:
Error: The type of this expression, '_a option ref,
contains type variables that cannot be generalized
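A minimal sketch of the accepted case, assuming all three bindings live in the same compilation unit (the names are illustrative):

```ocaml
(* At this point r has the weak type '_weak1 option ref. *)
let r = ref None

(* This use determines the weak variable: r : int option ref. *)
let () = r := Some 5

let v = match !r with Some n -> n | None -> 0
(* Without any use that fixes the type, compiling the unit
   would fail with the error shown above. *)
```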
3) The type '_a option ref seems less polymorphic than 'a option ref. Why isn't let r : 'a option ref = ref None;; (with an explicit annotation) rejected in OCaml?
This is because '_a is not a “real” type, for instance it is forbidden to write a signature explicitly defining values of this “type”:
# module A : sig val table : '_a option ref end = struct let option = ref None end;;
Characters 27-30:
module A : sig val table : '_a option ref end = struct let option = ref None end;;
^^^
Error: The type variable name '_a is not allowed in programs
It is possible to avoid using these weakly polymorphic types by using recursive declarations to pack together the weakly polymorphic variable declaration and the later function usage that completes the type definition, e.g.:
# let rec r = ref None and set x = r := Some(x + 1);;
val r : int option ref = {contents = None}
val set : int -> unit = <fun>
1) Could a conforming SML implementation choose to make the "dummy" type above accessible?
The revised Definition (SML97) doesn't specify that there be a "dummy" type; all it formally specifies is that the val can't introduce a polymorphic type variable, since the right-hand-side expression isn't a non-expansive expression. (There are also some comments about type variables not leaking into the top level, but as Andreas Rossberg points out in his Defects in the Revised Definition of Standard ML, those comments are really about undetermined types rather than the type variables that appear in the definition's formalism, so they can't really be taken as part of the requirements.)
In practice, I think there are four approaches that implementations take:
some implementations reject the affected declarations during type-checking, and force the programmer to specify a monomorphic type.
some implementations, such as MLton, prevent generalization, but defer unification, so that the appropriate monomorphic type can become clear later in the program.
SML/NJ, as you've seen, issues a warning and instantiates a dummy type that cannot subsequently be unified with any other type.
I think I've heard that some implementation defaults to int? I'm not sure.
All of these options are presumably permitted and apparently sound, though the "defer unification" approach does require care to ensure that the type doesn't unify with an as-yet-ungenerated type name (especially a type name from inside a functor, since then the monomorphic type may correspond to different types in different applications of the functor, which would of course have the same sorts of problems as a regular polymorphic type).
2) How does OCaml preserve soundness without the value restriction? Does it make weaker guarantees than SML does?
I'm not very familiar with OCaml, but from what you write, it sounds like it uses the same approach as MLton; so, it should not have to sacrifice soundness.
(By the way, despite what you imply, OCaml does have the value restriction. There are some differences between the value restriction in OCaml and the one in SML, but none of your code-snippets relates to those differences. Your code snippets just demonstrate some differences in how the restriction is enforced in OCaml vs. one implementation of SML.)
3) The type '_a option ref seems less polymorphic than 'a option ref. Why isn't let r : 'a option ref = ref None;; (with an explicit annotation) rejected in OCaml?
Again, I'm not very familiar with OCaml, but — yeah, that seems like a mistake to me!
To answer the second part of your last question,
3) [...] Why isn't let r : 'a option ref = ref None;; (with an explicit annotation) rejected in OCaml?
That is because OCaml has a different interpretation of type variables occurring in type annotations: it interprets them as existentially quantified, not universally quantified. That is, a type annotation only has to be right for some possible instantiation of its variables, not for all. For example, even
let n : 'a = 5
is totally valid in OCaml. Arguably, this is rather misleading and not the best design choice.
To enforce polymorphism in OCaml, you have to write something like
let n : 'a. 'a = 5
which would indeed cause an error. However, this introduces a local quantifier, so it is still somewhat different from SML, and doesn't work for examples where 'a needs to be bound elsewhere, e.g. the following:
fun pair (x : 'a) (y : 'a) = (x, y)
In OCaml, you have to rewrite this to
let pair : 'a. 'a -> 'a -> 'a * 'a = fun x y -> (x, y)
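The following sketch contrasts the two kinds of annotation. The locally abstract type form (type a) is shown as one more option that is not mentioned in the answer above; the names pair' , ints and strs are only illustrative:

```ocaml
(* Accepted: 'a in a plain annotation is just unified with int. *)
let n : 'a = 5

(* Explicitly universal annotation: the body must really be polymorphic. *)
let pair : 'a. 'a -> 'a -> 'a * 'a = fun x y -> (x, y)

(* Alternative: a locally abstract type shared by both arguments. *)
let pair' (type a) (x : a) (y : a) = (x, y)

let ints = pair 1 2
let strs = pair' "a" "b"

(* let bad : 'a. 'a -> 'a = fun x -> x + 1
   would be rejected, because the body forces 'a to be int. *)
```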