Module, what does "with type" do? - ocaml

module type ORDER = sig
type t
val leq : t -> t -> bool
val equal : t -> t -> bool
end
module Int:ORDER with type t = int = struct
type t = int
let leq = (<=)
let equal = (=)
end
Can someone explain to me this line :
module Int:ORDER with type t = int = struct
this --> with type t = int
I've tried without it :
Int.equal 3 3
Line 1, characters 10-11:
Error: This expression has type int but an expression was expected of type
Int.t
I can see what it does, but I'm unable to explain in words what is happening. Thank you.

The colon operator in a module expression isn't just a signature constraint, it's also an abstraction construct. M : S requires the module M to have the signature S, and tells the compiler to forget everything about the typing of M except for what is specified in S. This is where abstraction is born.
Given the definition module Int : S = struct … end (which is syntactic sugar for module Int = (struct … end : S)), all the compiler knows about the types of elements of Int is what is recorded in S. If S is ORDER, its type t is abstract, and therefore Int.t is an abstract type: the fact that Int.t is actually an alias for int is hidden. Hiding the real implementation of a type is exactly what abstract types are about.
The signature that is actually desired for Int is
sig
type t = int
val leq : t -> t -> bool
val equal : t -> t -> bool
end
This is almost exactly the signature called ORDER, but with the type t being an alias for int rather than abstract. The with type construct allows using the name ORDER to construct the module type expression above. Given the definition of ORDER, writing
module Int : ORDER with type t = int = struct … end
is equivalent to writing
module Int : sig
type t = int
val leq : t -> t -> bool
val equal : t -> t -> bool
end = struct … end
Since the type Int.t is transparently equal to int, it can be used interchangeably with int.
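As a small sketch of the difference, the same structure can be sealed twice, once with plain ORDER and once with the with type constraint. The name AbstractInt is made up for this illustration; Int follows the question.

```ocaml
module type ORDER = sig
  type t
  val leq : t -> t -> bool
  val equal : t -> t -> bool
end

(* Sealed with plain ORDER: outside the module, t is abstract. *)
module AbstractInt : ORDER = struct
  type t = int
  let leq = ( <= )
  let equal = ( = )
end

(* Sealed with the constraint: the equation t = int stays visible. *)
module Int : ORDER with type t = int = struct
  type t = int
  let leq = ( <= )
  let equal = ( = )
end

(* Accepted, because Int.t is known to be int. *)
let same = Int.equal 3 3
let ordered = Int.leq 2 5

(* AbstractInt.equal 3 3 would be rejected with the error from the
   question: 3 has type int, but AbstractInt.t was expected. *)
```

With AbstractInt, no value of type AbstractInt.t can be built from outside the module, which is why the constraint is needed here.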
An ORDER signature is mostly useful for passing the type to functors like Set and Map that build data structures that rely on an ordering relation for their elements. The data structure only depends on the abstract properties of the order relation, but the code that uses the data structure can still be aware of the type of the elements (thanks to another with type constraint, this one equating the type of data structure elements with the type of the functor argument). See “functors and type abstraction” in the language introduction.
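Here is a minimal sketch of that second with type constraint, using a hand-written functor rather than the real Set or Map. The names SORTED, MakeSorted, IntOrder and IntSorted are hypothetical and only serve the illustration.

```ocaml
module type ORDER = sig
  type t
  val leq : t -> t -> bool
  val equal : t -> t -> bool
end

module IntOrder : ORDER with type t = int = struct
  type t = int
  let leq = ( <= )
  let equal = ( = )
end

(* Signature of the data structure: both elt and t are abstract here. *)
module type SORTED = sig
  type elt
  type t
  val empty : t
  val insert : elt -> t -> t
  val to_list : t -> elt list
end

(* The constraint on the result equates elt with the argument's type,
   while the representation t stays hidden. *)
module MakeSorted (O : ORDER) : SORTED with type elt = O.t = struct
  type elt = O.t
  type t = elt list

  let empty = []

  (* Insert x before the first element that is not smaller than x. *)
  let rec insert x = function
    | [] -> [ x ]
    | y :: rest as l -> if O.leq x y then x :: l else y :: insert x rest

  let to_list l = l
end

module IntSorted = MakeSorted (IntOrder)

(* IntSorted.elt = IntOrder.t = int, so plain integers can be inserted. *)
let sorted = IntSorted.(to_list (insert 2 (insert 3 (insert 1 empty))))
```

Both constraints are needed for the last line to type-check: one makes IntOrder.t equal to int, the other makes IntSorted.elt equal to IntOrder.t.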


Value restriction for records

I face a situation where a record is given a weak polymorphic type and I am not sure why.
Here is a minimized example
module type S = sig
type 'a t
val default : 'a t
end
module F (M : S) = struct
type 'a record = { x : 'a M.t; n : int }
let f = { x = M.default; n = (fun x -> x) 3 }
end
Here f is given the type '_weak1 record.
There are (at least) two ways to solve that problem.
The first one consists in using an auxiliary definition for the function application.
let n = (fun x -> x) 3
let f = { x = M.default; n }
The second one consists in declaring the type parameter of t as covariant.
module type S = sig
type +'a t
val default : 'a t
end
What I find strange is that the function application is used to initialize the field of type int, which has no link at all with the type variable 'a of type t. I also fail to see why declaring 'a as covariant suddenly allows using arbitrary expressions in this unrelated field without losing polymorphism.
For your first point, the relaxed value restriction is triggered as soon as any computation happens in any sub-expression. Thus neither
{ x = M.default; n = (fun x -> x) 3 }
nor
let n = Fun.id 3 in { x = M.default; n }
is considered a value, and the value restriction applies to both of them.
For your second point, this is the relaxed value restriction at work: if a type variable only appears in strictly covariant positions, it can always be generalized. For instance, the type of
let none = Fun.id None
is 'a. 'a option and not '_weak1 option because the option type constructor is covariant in its first parameter.
The brief explanation for this relaxation of the value restriction is that a covariant type parameter corresponds to a positive immutable piece of data, for instance
type !+'a option = None | Some of 'a
or
type +'a t = A
Thus if we have a type variable that only appears in strictly covariant positions, we know that it is not bound to any mutable data, and it can thus be safely generalized.
An important point to notice, however, is that the only values of type 'a t, for a t covariant in its first parameter, are precisely those that do not contain any 'a. Thus, if I have a value of type 'a. 'a option, I know that I have in fact a None. We can in fact check that point with the help of the typechecker:
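A short sketch contrasting the two situations, using only Fun.id and ref from the standard library (the names none, a, b and cell are just for illustration):

```ocaml
(* The right-hand side is an application, not a value, but 'a only
   occurs covariantly in 'a option, so it is still generalized. *)
let none = Fun.id None

(* The same binding can be used at two different types. *)
let a : int option = none
let b : string option = none

(* 'a ref is invariant, so here the variable stays weak and is fixed
   by the first use. *)
let cell = Fun.id (ref None)
let () = cell := Some 1

(* cell := Some "x" would now be rejected: cell is an int option ref. *)
```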
type forall_option = { x:'a. 'a option }
type void = |
let for_all_option_is_none {x} = match (x: void option) with
| None -> ()
| _ -> . (* this `.` means that this branch cannot happen *)
Here, by constraining the type 'a. 'a option to void option, we have made the typechecker aware that x was in fact a None.

How to understand a type definition?

type 'a tree = Empty | Node of 'a * 'a tree * 'a tree;;
Is it a type definition, where a is a type parameter, and tree is the type name?
In Node of, is Node a built-in type of OCaml? What does of mean?
Thanks.
Yes, 'a is a type parameter and tree is a type name (these are usually called variants in OCaml). This is the reverse of the order used in most other languages. Node is a constructor (called a tag in OCaml) and of is just a keyword in OCaml to specify the types of the constructor arguments. Node is not a built-in type of OCaml (it is not even a type, but rather, as I said, a constructor).
Hence Node (5, Empty, Node (6, Empty, Empty)) is something of type int tree (something like Tree<Int> in Java).
It may make more sense if you start with a simpler variant.
type shape = Square of int | Rectangle of int * int
Square and Rectangle are tags (again, constructors) that I have just made up that allow me to construct values of type shape (in this case I've chosen to have Square take only one argument because only the side length is needed to specify a square, whereas Rectangles need both length and width). Nothing ever has a type of Square or Rectangle, but things can have a type of shape.
One way to read that line in English is "I have defined a type called shape. A shape is either a Square of a single integer, or a Rectangle of two integers."
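To make that reading concrete, here is a small sketch of building and consuming such values. The function area is made up for the example and is not part of the original definition.

```ocaml
type shape = Square of int | Rectangle of int * int

(* Pattern matching distinguishes the two constructors and gives
   names to their arguments. *)
let area = function
  | Square side -> side * side
  | Rectangle (width, height) -> width * height

let shapes = [ Square 3; Rectangle (2, 5) ]
let areas = List.map area shapes
```

Every element of shapes has type shape, whichever constructor built it.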
Now maybe for some reason I also want to label my shapes.
type 'label labelledshape = LabelledSquare of 'label * int | LabelledRectangle of 'label * int * int
The quote ' indicates that label is not a type (such as int), but rather a type variable. This allows me to write something like LabelledSquare ("a label for a square", 5), which is of type string labelledshape.
Note that although this allows for polymorphism, these are not what is known as "polymorphic variants" in OCaml. I will not talk about that here, rather I'll just recommend either looking at OCaml documentation or browsing Stack Overflow for more details on that.
There are several kinds of type definitions:
Definition of a new sum type
From that type definition:
type 'a tree = Empty | Node of 'a * 'a tree * 'a tree;;
Here are roughly the "facts" that the OCaml compiler knows about based on it:
tree is a unary type constructor. That means that for any type t, t tree is a type as well (example: int tree).
Empty is a 0-ary constructor for 'a tree. That means that Empty has type t tree for any t.
Node is a 3-ary constructor. Here this means that if a has type t, b has type t tree, and c has type t tree, then Node (a, b, c) has type t tree (note: that's the same t).
Those are the only two ways to have a value of type t tree. That means that you can use pattern matching (match ... with | Empty -> ... | Node (a, b, c) -> ...).
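A minimal sketch of the pattern matching mentioned in the last point; the function size and the value example are illustrative names only.

```ocaml
type 'a tree = Empty | Node of 'a * 'a tree * 'a tree

(* Count the nodes. The two cases cover every possible 'a tree. *)
let rec size = function
  | Empty -> 0
  | Node (_, left, right) -> 1 + size left + size right

(* Empty is used here at type int tree. *)
let example : int tree = Node (5, Empty, Node (6, Empty, Empty))
let n = size example
```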
Definition of a type alias
A definition can also be something that looks like:
type t = existing_type
In that case, t is just a new name for the existing_type.
For example:
type 'a pp = Format.formatter -> 'a -> unit
This means that something that is a int pp has type Format.formatter -> int -> unit.
This is the type of a function that takes a formatter and an integer and returns unit. For example:
type 'a pp = Format.formatter -> 'a -> unit
module M : sig
val pp_int : int pp
end = struct
let pp_int fmt n = Format.fprintf fmt "%d" n
end
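Because 'a pp is only an alias, values of that type are ordinary functions and can be passed to the %a conversion of Format. A small sketch follows; pp_pair is a made-up helper, not something from the standard library.

```ocaml
type 'a pp = Format.formatter -> 'a -> unit

let pp_int : int pp = fun fmt n -> Format.fprintf fmt "%d" n

(* Build a printer for pairs out of two element printers. *)
let pp_pair (pa : 'a pp) (pb : 'b pp) : ('a * 'b) pp =
 fun fmt (a, b) -> Format.fprintf fmt "(%a, %a)" pa a pb b

(* %a expects a printer followed by the value to print. *)
let s = Format.asprintf "%a" (pp_pair pp_int pp_int) (1, 2)
```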

Do OCaml 'underscore types' (e.g. '_a) introduce the possibility of runtime type errors / soundness violations?

I was reading a little bit about the value restriction in Standard ML and tried translating the example to OCaml to see what it would do. It seems like OCaml produces these types in contexts where SML would reject a program due to the value restriction. I've also seen them in other contexts like empty hash tables that haven't been "specialized" to a particular type yet.
http://mlton.org/ValueRestriction
Here's an example of a rejected program in SML:
val r: 'a option ref = ref NONE
val r1: string option ref = r
val r2: int option ref = r
val () = r1 := SOME "foo"
val v: int = valOf (!r2)
If you enter the first line verbatim into the SML of New Jersey repl you get
the following error:
- val r: 'a option ref = ref NONE;
stdIn:1.6-1.33 Error: explicit type variable cannot be generalized at its binding declaration: 'a
If you leave off the explicit type annotation you get
- val r = ref NONE
stdIn:1.6-1.18 Warning: type vars not generalized because of
value restriction are instantiated to dummy types (X1,X2,...)
val r = ref NONE : ?.X1 option ref
What exactly is this dummy type? It seems like it's completely inaccessible and fails to unify with anything
- r := SOME 5;
stdIn:1.2-1.13 Error: operator and operand don't agree [overload conflict]
operator domain: ?.X1 option ref * ?.X1 option
operand: ?.X1 option ref * [int ty] option
in expression:
r := SOME 5
In OCaml, by contrast, the dummy type variable is accessible and unifies with the first thing it can.
# let r : 'a option ref = ref None;;
val r : '_a option ref = {contents = None}
# r := Some 5;;
- : unit = ()
# r ;;
- : int option ref = {contents = Some 5}
This is sort of confusing and raises a few questions.
1) Could a conforming SML implementation choose to make the "dummy" type above accessible?
2) How does OCaml preserve soundness without the value restriction? Does it make weaker guarantees than SML does?
3) The type '_a option ref seems less polymorphic than 'a option ref. Why isn't let r : 'a option ref = ref None;; (with an explicit annotation) rejected in OCaml?
Weakly polymorphic types (the '_-style types) are a programming convenience rather than a true extension of the type system.
2) How does OCaml preserve soundness without the value restriction? Does it make weaker guarantees than SML does?
OCaml does not sacrifice the value restriction; rather, it implements a heuristic that saves you from systematically annotating the type of values like ref None whose type is only “weakly” polymorphic. This heuristic works by looking at the current “compilation unit”: if it can determine the actual type for a weakly polymorphic type, then everything works as if the initial declaration had the appropriate type annotation; otherwise the compilation unit is rejected with the message:
Error: The type of this expression, '_a option ref,
contains type variables that cannot be generalized
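The empty hash table mentioned in the question behaves the same way. In this sketch (table and found are illustrative names), the weak type variables are resolved by a later use in the same compilation unit, so the unit is accepted:

```ocaml
(* At this point the key and value types are weak type variables:
   not polymorphic, just not yet known. *)
let table = Hashtbl.create 16

(* The first use fixes them: table is now a (string, int) Hashtbl.t. *)
let () = Hashtbl.add table "one" 1

let found = Hashtbl.find_opt table "one"

(* Hashtbl.add table 2 "two" would be rejected after this point. *)
```

Without the call to Hashtbl.add, nothing would determine the two types and the compilation unit would be rejected as described above.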
3) The type '_a option ref seems less polymorphic than 'a option ref. Why isn't let r : 'a option ref = ref None;; (with an explicit annotation) rejected in OCaml?
This is because '_a is not a “real” type, for instance it is forbidden to write a signature explicitly defining values of this “type”:
# module A : sig val table : '_a option ref end = struct let option = ref None end;;
Characters 27-30:
module A : sig val table : '_a option ref end = struct let option = ref None end;;
^^^
Error: The type variable name '_a is not allowed in programs
It is possible to avoid using these weakly polymorphic types by using recursive declarations to pack together the weakly polymorphic variable declaration and the later function usage that completes the type definition, e.g.:
# let rec r = ref None and set x = r := Some(x + 1);;
val r : int option ref = {contents = None}
val set : int -> unit = <fun>
1) Could a conforming SML implementation choose to make the "dummy" type above accessible?
The revised Definition (SML97) doesn't specify that there be a "dummy" type; all it formally specifies is that the val can't introduce a polymorphic type variable, since the right-hand-side expression isn't a non-expansive expression. (There are also some comments about type variables not leaking into the top level, but as Andreas Rossberg points out in his Defects in the Revised Definition of Standard ML, those comments are really about undetermined types rather than the type variables that appear in the definition's formalism, so they can't really be taken as part of the requirements.)
In practice, I think there are four approaches that implementations take:
some implementations reject the affected declarations during type-checking, and force the programmer to specify a monomorphic type.
some implementations, such as MLton, prevent generalization, but defer unification, so that the appropriate monomorphic type can become clear later in the program.
SML/NJ, as you've seen, issues a warning and instantiates a dummy type that cannot subsequently be unified with any other type.
I think I've heard that some implementation defaults to int? I'm not sure.
All of these options are presumably permitted and apparently sound, though the "defer unification" approach does require care to ensure that the type doesn't unify with an as-yet-ungenerated type name (especially a type name from inside a functor, since then the monomorphic type may correspond to different types in different applications of the functor, which would of course have the same sorts of problems as a regular polymorphic type).
2) How does OCaml preserve soundness without the value restriction? Does it make weaker guarantees than SML does?
I'm not very familiar with OCaml, but from what you write, it sounds like it uses the same approach as MLton; so, it should not have to sacrifice soundness.
(By the way, despite what you imply, OCaml does have the value restriction. There are some differences between the value restriction in OCaml and the one in SML, but none of your code-snippets relates to those differences. Your code snippets just demonstrate some differences in how the restriction is enforced in OCaml vs. one implementation of SML.)
3) The type '_a option ref seems less polymorphic than 'a option ref. Why isn't let r : 'a option ref = ref None;; (with an explicit annotation) rejected in OCaml?
Again, I'm not very familiar with OCaml, but — yeah, that seems like a mistake to me!
To answer the second part of your last question,
3) [...] Why isn't let r : 'a option ref = ref None;; (with an explicit annotation) rejected in OCaml?
That is because OCaml has a different interpretation of type variables occurring in type annotations: it interprets them as existentially quantified, not universally quantified. That is, a type annotation only has to be right for some possible instantiation of its variables, not for all. For example, even
let n : 'a = 5
is totally valid in OCaml. Arguably, this is rather misleading and not the best design choice.
To enforce polymorphism in OCaml, you have to write something like
let n : 'a. 'a = 5
which would indeed cause an error. However, this introduces a local quantifier, so it is still somewhat different from SML, and doesn't work for examples where 'a needs to be bound elsewhere, e.g. the following:
fun pair (x : 'a) (y : 'a) = (x, y)
In OCaml, you have to rewrite this to
let pair : 'a. 'a -> 'a -> 'a * 'a = fun x y -> (x, y)
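Putting the two kinds of annotation side by side, as a sketch (ints and strings are illustrative names):

```ocaml
(* Accepted: in a plain annotation 'a is a unification variable,
   and here it is simply unified with int. *)
let n : 'a = 5

(* An explicitly quantified annotation must hold for every 'a. *)
let pair : 'a. 'a -> 'a -> 'a * 'a = fun x y -> (x, y)

(* pair is really polymorphic, so both uses type-check. *)
let ints = pair 1 2
let strings = pair "a" "b"

(* let m : 'a. 'a = 5 would be rejected: int is less general than 'a. 'a *)
```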

Does compare work for all types?

Let's consider a type t and two variables x,y of type t.
Will the call compare x y be valid for any type t? I couldn't find any counterexample.
The polymorphic compare function works by recursively exploring the structure of values, providing an ad-hoc total ordering on OCaml values, used to define structural equality tested by the polymorphic = operator.
It is, by design, not defined on functions and closures, as observed by @antron. The recursive nature of the definition implies that structural equality is not defined on values containing a function or a closure. This recursive nature also implies that the compare function is not defined on recursive values, as mentioned by @antron as well.
Structural equality, and therefore the compare function and the comparison operators, is not aware of structure invariants and cannot be used to compare (mildly) advanced data structures such as Sets, Maps, Hashtbls and so on. If comparison of these structures is desired, a specialised function has to be written; this is why Set and Map define such a function.
When defining your own structures, a good rule of thumb is to distinguish between
concrete types, which are defined only in terms of primitive types and other concrete types. Concrete types should not be used for structures whose processing expects some invariants, because it is easy to create arbitrary values of this type breaking these invariants. For these types, the polymorphic comparison function and operators are appropriate.
abstract types, whose concrete definition is hidden. For these types, it is best to provide a specialised comparison function. The mixture library defines a compare mixin that can be used to derive comparison operators from the implementation of a specialised compare function. Its use is illustrated in the README.
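For the standard Set module, the specialised functions are equal and compare. A small sketch, where IntSet, s1 and s2 are illustrative names:

```ocaml
module IntSet = Set.Make (Int)

(* The same elements, inserted in two different orders. The internal
   trees are not guaranteed to have the same shape. *)
let s1 = IntSet.of_list [ 1; 2; 3; 4; 5 ]
let s2 = List.fold_right IntSet.add [ 5; 4; 3; 2; 1 ] IntSet.empty

(* The specialised functions compare the sets as sets of elements,
   independently of the internal representation. *)
let same_elements = IntSet.equal s1 s2
let ordering = IntSet.compare s1 s2
```

The polymorphic compare s1 s2 would instead walk the internal trees, so its result depends on how each set was built.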
It doesn't work for function types:
# compare (fun x -> x) (fun x -> x);;
Exception: Invalid_argument "equal: functional value".
Likewise, it won't (generally) work for other types whose values can contain functions:
# type t = A | B of (int -> int);;
type t = A | B of (int -> int)
# compare A A;;
- : int = 0
# compare (B (fun x -> x)) A;;
- : int = 1
# compare (B (fun x -> x)) (B (fun x -> x));;
Exception: Invalid_argument "equal: functional value".
It also doesn't (generally) work for recursive values:
# type t = {self : t};;
type t = { self : t; }
# let rec v = {self = v};;
val v : t = {self = <cycle>}
# let rec v' = {self = v'};;
val v' : t = {self = <cycle>}
# compare v v;;
- : int = 0
# compare v v';;
(* Does not terminate. *)
These cases are also listed in the documentation for compare in Pervasives.
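If a comparison might hit a functional value, one option is to catch the exception. This is only a sketch: safe_compare is a made-up helper, and it does nothing about the non-termination on cyclic values shown above.

```ocaml
(* Return None instead of raising when compare meets a closure. *)
let safe_compare x y =
  try Some (compare x y) with Invalid_argument _ -> None

let on_ints = safe_compare 1 1

(* Two distinct closures: compare raises Invalid_argument here. *)
let on_functions = safe_compare (fun x -> x + 1) (fun x -> x * 2)
```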

Why is the binding of the functor result with a new module name required to call nested functor of initial functor?

I have:
module Functor(M : sig end) = struct
module NestedFunctor(M : sig end) = struct
end
end
This code is valid:
module V = Functor(struct end)
module W = V.NestedFunctor(struct end)
But this is invalid:
module M = Functor(struct end).NestedFunctor(struct end)
(* ^ Error: Syntax error *)
As I understand, a functor is a relation between a set of input modules and a set of possible output modules. But this example confuses my understanding. Why is the binding of the functor result with a new module name required to call nested functor of initial functor?
My compiler version = 4.01.0
I'm new to OCaml. When I found functors I imagined something as
Engine.MakeRunnerFor(ObservationStation
.Observe(Matrix)
.With(Printer))
I thought it is a good tool for the human-friendly architecture notation.
Then I was disappointed. Of course, this is a syntax error and I understand that. But I think this restriction inflates grammar and makes it less intuitive. And my "Why?" in the main question is in the context of the concept of language.
While I don't believe that this restriction is strictly necessary, it is probably motivated by certain limitations in OCaml's module type system. Without going into too much technical detail, OCaml requires all intermediate module types to be expressible as syntactic signatures. But with functors, that sometimes isn't possible. For example, consider:
module Functor(X : sig end) = struct
type t = T of int
module Nested(Y : sig end) = struct let x = T 5 end
end
Given this definition, the type of the functor Functor(struct end).Nested can't be expressed in OCaml syntax. It would need to be something like
functor(Y : sig end) -> sig val x : Functor(struct end).t end (* not legal OCaml! *)
but Functor(struct end).t isn't a valid type expression in OCaml, for reasons that are rather technical (in short, allowing a type like that would make deciding what types are equal -- as necessary during type checking -- much more involved).
Naming intermediate modules often avoids this dilemma. Given
module A = Functor(struct end)
the functor A.Nested has type
functor(Y : sig end) -> sig val x : A.t end
by referring to the named intermediate result A.
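Put together, the version with a named intermediate module looks like this. B and value are names added for the illustration.

```ocaml
module Functor (X : sig end) = struct
  type t = T of int

  module Nested (Y : sig end) = struct
    let x = T 5
  end
end

(* Naming the intermediate result gives its type a path: A.t *)
module A = Functor (struct end)

(* A.Nested is an ordinary module path, so it can be applied. *)
module B = A.Nested (struct end)

(* B.x has type A.t, so it can be matched with the constructor A.T *)
let value = match B.x with A.T n -> n
```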
Using the terminology in the manual, types and the like (module types, class types, etc.) can be qualified by an extended-module-path where a qualifier can be a functor call, whereas non-types (core expressions, module expressions, classes, etc.) can only be qualified by a module-path where qualifiers have to be plain module names.
For example, you can write a type Functor(struct end).NestedFunctor(struct end).t but not an expression Functor(struct end).NestedFunctor(struct end).x or a module expression Functor(struct end).NestedFunctor(struct end).
Syntax-wise, allowing extended-module-path in expressions would be ambiguous: the expression F(M).x is parsed as the constructor F applied to the expression (M).x, where M is a constructor and the . operator is the record field access operator. This won't ever typecheck since M is obviously a variant to which the . operator can't be applied, but eliminating this at the parser would be complicated. There may be other ambiguities I'm not thinking of right now (with first-class modules?).
As far as the type checker is concerned, functor calls in type designations aren't a problem: they're allowed. However, the argument itself has to be a path: you can write Set.Make(String).t but not Set.Make(struct type t = string let compare = … end).t. Allowing structures and first-class modules in type expressions would make the type checker more complex, because of the way OCaml manages abstract types. Every time you write Set.Make(String).t, it designates the same abstract type; but if you write
module M1 = Set.Make(struct type t = string let compare = String.compare end)
module M2 = Set.Make(struct type t = string let compare = String.compare end)
then M1.t and M2.t are distinct abstract types. The technical way to formulate this is that in OCaml, functor application is applicative: applying the same functor to the same argument always returns the same abstract type. But structures are generative: writing struct … end twice produces distinct abstract types, so Set.Make(struct type t = string let compare = String.compare end).t ≠ Set.Make(struct type t = string let compare = String.compare end).t. Generative types lead to a non-reflexive equality between type expressions if you aren't careful about what you allow in type expressions.
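The applicative behaviour can be checked with a short sketch, where S1, S2, s and count are illustrative names:

```ocaml
(* Two applications of the same functor to the same path. *)
module S1 = Set.Make (String)
module S2 = Set.Make (String)

(* Accepted: S1.t and S2.t are both equal to Set.Make(String).t *)
let s : S1.t = S2.singleton "a"

(* A functor application is allowed inside a type expression,
   because its argument is a path. *)
let count (set : Set.Make(String).t) = S1.cardinal set

let n = count s
```

Replacing String with an anonymous struct … end in both applications would make the annotation on s fail, since each application would then produce its own abstract type.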
Code generation wouldn't be impacted much, because it could desugar Functor(struct … end).field as let module TMP = struct … end in Functor(TMP).field.
As far as I can see, there's no deep answer. The reported error is a syntax error. I.e., the grammar of OCaml just doesn't support this notation.
One way to summarize it is that in the grammar for a module expression, the dot always appears as part of a "long module identifier", i.e., between two capitalized identifiers. I checked this just now, and that's what I saw.