What can be done with the "constraint" keyword in OCaml

The OCaml manual describes the "constraint" keyword, which can be used in a type definition. However, I cannot figure out any use for this keyword. When is this keyword useful? Can it be used to remove polymorphic type variables (so that a type 'a t in a module becomes just t, and the module can be used as a functor argument that requires t with no variables)?

The constraint keyword, used in type or class definitions, lets one "reduce the scope" of the types applicable to a type parameter, so to speak. The documentation states that the type expressions on both sides of the constraint equation are unified to "refine" the types the constraint relates to. Because they are type expressions, you may use all the usual type-level operators.
Examples:
# type 'a t = int * 'a constraint 'a * int = float * int;;
type 'a t = int * 'a constraint 'a = float
# type ('a,'b) t = 'c r constraint 'c = 'a * 'b
and 'a r = {v1 : 'a; v2 : int };;
type ('a,'b) t = ('a * 'b) r
and 'a r = { v1 : 'a; v2 : int; }
Observe how type unification simplifies the equations, in the first example by getting rid of the extraneous type product (* int), and in the second case eliminating it altogether. Note also that I used a type variable 'c which only appears in the right hand side of the type definition.
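As a minimal sketch of what the first constraint means in practice (the names p, second and shifted are hypothetical, introduced only for illustration): once 'a is constrained to float, float t is the only instance of 'a t, and a wildcard parameter is resolved to float automatically.

```ocaml
(* The simplified form of the first example: 'a can only be float. *)
type 'a t = int * 'a constraint 'a = float

(* float t is accepted; something like "string t" would be rejected
   by the type checker because string does not unify with float. *)
let p : float t = (1, 2.5)

(* The wildcard is unified with float because of the constraint,
   so the result can be used with float arithmetic. *)
let second (x : _ t) = snd x

let shifted = second p +. 1.0
```

Here second is inferred with a float result, which is why the float addition on the last line type-checks.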
Two interesting uses are with polymorphic variants and class types, both based on row polymorphism. Constraints let you express certain subtyping relations. For variants, a type is a subtype of another when every constructor of the first is also present in the second. Some of these relations can already be expressed monomorphically:
# type sum_op = [ `add | `subtract ];;
type sum_op = [ `add | `subtract ]
# type prod_op = [ `mul | `div ];;
type prod_op = [ `mul | `div ]
# type op = [ sum_op | prod_op ];;
type op = [ `add | `div | `mul | `subtract ]
There, both sum_op and prod_op are subtypes of op.
But in some cases you have to introduce polymorphism, and this is where constraints come in handy:
# type 'a t = 'a constraint [> op ] = 'a;;
type 'a t = 'a constraint 'a = [> op ]
The above lets you denote the family of types that include at least the constructors of op, that is, the supertypes of op: for a given instance of 'a t, the type instance is 'a itself.
If we try to define the same type without a parameter, the type checker complains:
# type t' = [> op];;
Error: A type variable is unbound in this type declaration.
In type [> op ] as 'a the variable 'a is unbound
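To illustrate how the parameterized version can be used, here is a small sketch. The type ext_op, the function name and the values plain and extended are hypothetical names introduced for this example only.

```ocaml
type sum_op = [ `add | `subtract ]
type prod_op = [ `mul | `div ]
type op = [ sum_op | prod_op ]

(* Same definition as in the answer: 'a must contain at least op. *)
type 'a t = 'a constraint 'a = [> op ]

(* Hypothetical extension of op with one more constructor. *)
type ext_op = [ op | `pow ]

(* Works for every instance of 'a t; the wildcard case handles
   any constructor that is not part of op. *)
let name : 'a t -> string = function
  | `add -> "add"
  | `subtract -> "subtract"
  | `mul -> "mul"
  | `div -> "div"
  | _ -> "unknown"

let plain : op t = `mul
let extended : ext_op t = `pow
```

Both op t and ext_op t are valid instances, whereas an instance such as sum_op t would be rejected because sum_op lacks some constructors of op.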
The same sort of constraints may be expressed with class types, and the same problem may arise if the type definition is implicitly polymorphic by subtyping.
# class type ct = object method v : int end;;
class type ct = object method v : int end
# type i = #ct;;
Error: A type variable is unbound in this type declaration.
In type #ct as 'a the variable 'a is unbound
# type 'a i = 'a constraint 'a = #ct;;
type 'a i = 'a constraint 'a = #ct
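The class-type version can be exercised in the same way. In this sketch, get_v, small and big are hypothetical names; the point is only that any object with at least the method v : int is an acceptable instance.

```ocaml
class type ct = object method v : int end

(* 'a ranges over every object type that has at least the methods of ct. *)
type 'a i = 'a constraint 'a = #ct

(* Accepts any object whose type is an instance of #ct. *)
let get_v (o : 'a i) = o#v

(* An object with exactly the methods of ct. *)
let small = object method v = 1 end

(* An object with an extra method; still an instance of #ct. *)
let big = object method v = 2 method name = "big" end
```

Calling get_v on small and on big both type-check, even though the two objects have different types.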

Related

Polymorphic type declaration in OCaml

When I write a type that, e.g., only accepts strings: let type t1 = string
I can do let name : t1 = "A", but not let age : t1 = 1.
But when I want to have a generic type that accepts any data type, I have to do this: let type 'a t2 = 'a. Then I can do both let name : t2 = "A" and let age : t2 = 1.
But why do I have to write let type 'a t2 = 'a instead of let type t2 = 'a?
The form
let type t1 = string
is not syntactically valid OCaml.
I imagine that you meant:
type t1 = string
Similarly, with the type constructor t2 defined as
type 'a t2 = 'a
then
let x : t2 = "hi"
is a type error because t2 is not a type but a type constructor of arity one.
The closest valid definition would be:
let x: string t2 = "hi"
which is equivalent to
let x: 'a t2 = "hi"
because the type variable 'a is unified with string when inferring the type of x. But 'a t2 is an abbreviation for 'a, so the above is still the same as
let x : string = "hi"
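Putting these pieces together, a minimal sketch of valid uses of t2 could look like the following (name, age and also_name are illustrative bindings, not part of the original question's code):

```ocaml
(* t2 is a type constructor of arity one: it must be applied. *)
type 'a t2 = 'a

(* string t2 is just string, and int t2 is just int. *)
let name : string t2 = "A"
let age : int t2 = 1

(* A wildcard argument lets inference fill in the parameter (string here). *)
let also_name : _ t2 = "B"
```

Each binding has an ordinary concrete type, so name can be concatenated with other strings and age can be used in integer arithmetic.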
At a higher level, there is no useful generic type that accepts any data¹. Indeed, if such a type existed, it would break the type system.
¹ There are advanced features (GADTs or records with polymorphic fields) that allow creating either black-hole types, which can carry data of any kind but forbid any use of that data, or types without any values. However, it is probably better to first familiarize yourself with the core part of the type system before exploring those areas.
To add to @octachron's answer, you can think of a declaration like type 'a t = 'a as the declaration of a function over types: given a type 'a, t will produce a new type 'a t (which in this case is just 'a). So you need the 'a just like you would need to declare the parameters of a function.
This shouldn't be confused with occurrences of 'a in an explicit type annotation such as expr : 'a t. There, OCaml treats 'a as a unification variable, which may or may not end up denoting polymorphism. If you actually want to require a polymorphic type, you have to introduce the type variable explicitly, either with 'a. ... or with (type a):
let id : 'a. 'a -> 'a = fun x -> x
let id (type a) (x : a) : a = x
However, note that these two forms are not quite equivalent; there are subtle differences. In any case, they differ (in general, though not in this specific example) from let id : 'a -> 'a = fun x -> x or let id (x : 'a) : 'a = x.
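The difference between a plain 'a annotation and an explicitly polymorphic one can be seen in a short sketch (add_one and id2 are hypothetical names used only here):

```ocaml
(* Here 'a is only a unification variable: the annotation is accepted
   and the function is inferred as int -> int. *)
let add_one : 'a -> 'a = fun x -> x + 1

(* With "'a." the function must really be polymorphic. *)
let id : 'a. 'a -> 'a = fun x -> x

(* A locally abstract type has a similar effect in this simple case. *)
let id2 (type a) (x : a) : a = x

(* The following would be rejected, because the body forces 'a = int:
   let bad : 'a. 'a -> 'a = fun x -> x + 1 *)
```

So the unquantified annotation constrains nothing beyond "argument and result have the same type", while the quantified forms make the compiler check that the definition is generic.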

Combining parametric polymorphism and polymorphic variants (backtick types)

Suppose I have a type consisting of multiple polymorphic variants (covariantly) such as the following:
[> `Ok of int | `Error of string]
Let's further suppose that I want to factor this definition into some kind of type constructor and a concrete type int. My first attempt was the following:
type 'a error = [> `Ok of 'a | `Error of string]
However, using a definition like this produces a really strange type error mentioning a type variable 'b that doesn't appear anywhere in the definition.
$ ocaml
OCaml version 4.07.0
# type 'a error = [> `Ok of 'a | `Error of string ];;
Error: A type variable is unbound in this type declaration.
In type [> `Error of string | `Ok of 'a ] as 'b the variable 'b is unbound
This 'b is an autogenerated name; adding an explicit 'b shifts the variable to 'c.
$ ocaml
OCaml version 4.07.0
# type ('a, 'b) error = [> `Ok of 'a | `Error of 'b ];;
Error: A type variable is unbound in this type declaration.
In type [> `Error of 'b | `Ok of 'a ] as 'c the variable 'c is unbound
Using the invariant construction [ `Thing1 of type1 | `Thing2 of type2 ] appears to work fine in this context.
$ ocaml
OCaml version 4.07.0
# type 'a error = [ `Ok of 'a | `Error of string ] ;;
type 'a error = [ `Error of string | `Ok of 'a ]
#
However, explicitly marking the type parameter as covariant does not salvage the original example.
$ ocaml
OCaml version 4.07.0
# type +'a error = [> `Ok of 'a | `Error of string];;
Error: A type variable is unbound in this type declaration.
In type [> `Error of string | `Ok of 'a ] as 'b the variable 'b is unbound
And, just for good measure, adding a contravariance annotation also does not work.
$ ocaml
OCaml version 4.07.0
# type -'a error = [> `Ok of 'a | `Error of string];;
Error: A type variable is unbound in this type declaration.
In type [> `Error of string | `Ok of 'a ] as 'b the variable 'b is unbound
Attempting to guess the name that the compiler will use for the unbound type variable and adding it as a parameter on the left also does not work and produces a very bizarre error message.
$ ocaml
OCaml version 4.07.0
# type ('a, 'b) string = [> `Ok of 'a | `Error of string] ;;
Error: The type constructor string expects 2 argument(s),
but is here applied to 0 argument(s)
Is there a way of making a type constructor that can effectively "substitute different types" for int in [> `Ok of int | `Error of string]?
This isn't an issue of variance or parametric polymorphism, but of row polymorphism. Adding > or < also adds an implicit type variable, the row variable, that holds the "full" type. You can see this type variable made explicit in the error:
[> `Error of string | `Ok of 'a ] as 'b
Note the as 'b part at the end.
In order to alias the type you have to make the type variable explicit, so you can reference it as a type parameter on the alias:
type ('a, 'r) error = [> `Ok of 'a | `Error of string ] as 'r
Note also, in case you have run into objects or ever do, that this applies there as well. An object type with .. has an implicit type variable that you need to make explicit in order to alias it:
type 'r obj = < foo: int; .. > as 'r
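A brief sketch of how these two aliases might be used follows. The functions describe and get_foo and the `Timeout tag are hypothetical; the wildcard _ in the annotations leaves the row variable for the compiler to fill in.

```ocaml
type ('a, 'r) error = [> `Ok of 'a | `Error of string ] as 'r
type 'r obj = < foo : int; .. > as 'r

(* The row stays open, so a catch-all case is needed for other tags. *)
let describe (r : (int, _) error) =
  match r with
  | `Ok n -> "ok " ^ string_of_int n
  | `Error msg -> "error " ^ msg
  | _ -> "other"

(* Accepts any object that has at least a method foo : int. *)
let get_foo (o : _ obj) = o#foo

(* An object with an extra method, used to show that the row is open. *)
let o = object method foo = 1 method bar = "x" end
```

Because the row variable is a real parameter, describe also accepts values built from tags that are not listed in the alias, such as `Timeout.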

Adding parametrically a `Null constructor to polymorphic variants

The following type declarations do not work:
type 'a or_null = [ 'a | `Null ]
and
type 'a or_null = [ 'a | `Null ] constraint 'a = [> `A | `B ]
With the message:
Error: The type 'a does not expand to a polymorphic variant type
Hint: Did you mean `a
I would like to achieve this without using another layer in the memory representation (and in the syntax). In particular, I want to avoid using an option type such as
type 'a or_null = | A of 'a | Null
Is there a way to have such a type using only polymorphic variants? The final goal would be to write e.g. monads on 'a or_null types. (And this is actually the tricky part.)
Polymorphic variants cannot track the absence of a specific constructor. This implies that we cannot really write the usual bind. If we try
let bind x f =
match x with
| `Null -> `Null
| x -> f x
we get
val bind: ([> `Null] as 'a) -> ('a -> ([>`Null] as 'b)) -> 'b
If, for readability's sake, we add the following type abbreviation
type 'a m = [> `Null] as 'a
(which is an alternative definition of or_null) the previous type reads as
val bind: 'a m -> ('a m -> 'b m) -> 'b m
In other words, the function argument f of bind must already handle the `Null case in its argument by itself because the type system cannot express the constraint x <> `Null in the second branch of the match.
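The consequence can be seen in a small sketch. The function bump and the `Int tag are hypothetical names chosen for illustration; bind is the definition given above.

```ocaml
let bind x f =
  match x with
  | `Null -> `Null
  | x -> f x

(* The function passed to bind must accept `Null itself, even though
   bind never forwards `Null to it at run time. *)
let bump = function
  | `Int n -> `Int (n + 1)
  | `Null -> `Null

let r1 = bind (`Int 1) bump
let r2 = bind `Null bump
```

If the `Null case were removed from bump, the calls to bind would be rejected, since the argument type of the function has to include `Null.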

Function signature as type in OCaml

Is there a way to declare something like
type do = ('a -> 'b)
in OCaml? Specifically, to declare a function signature as a type
For free types 'a and 'b, 'a -> 'b is not the type of any well-behaved OCaml function, because it requires the function to produce a value of an arbitrary type.
So, you can't give a name to a type with unbound parameters:
# type uabfun = 'a -> 'b;;
Error: Unbound type parameter 'a
If you use specific types, there's no problem giving it a name:
# type iifun = int -> int;;
type iifun = int -> int
If the types 'a and 'b are parameters (rather than being free), there is also no problem:
# type ('a, 'b) abfun = 'a -> 'b;;
type ('a, 'b) abfun = 'a -> 'b
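As a short usage sketch, the named function types can then annotate ordinary definitions (double, length_of and apply are hypothetical names for this example):

```ocaml
type iifun = int -> int
type ('a, 'b) abfun = 'a -> 'b

(* A value of the monomorphic function type. *)
let double : iifun = fun x -> 2 * x

(* An instance of the parameterized function type. *)
let length_of : (string, int) abfun = String.length

(* A higher-order function written against the parameterized alias. *)
let apply (f : ('a, 'b) abfun) (x : 'a) : 'b = f x
```

Since both aliases are plain abbreviations, double and length_of can be passed anywhere a function of the corresponding arrow type is expected.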

OCaml functor taking polymorphic variant type

Trying to compile
module F (M : sig
type t = [> `Foo ]
end) = struct
type t = [ M.t | `Bar ]
end
gets me
Error: A type variable is unbound in this type declaration.
In type [> `Foo ] as 'a the variable 'a is unbound
What am I doing wrong?
type t = [> `Foo] is invalid since [> `Foo] is an open type and implicitly contains a type variable. The definition is rejected just as the following type definition is rejected, since the RHS has a type variable that is not quantified on the LHS:
type t = 'a list
You have to make it closed:
type t = [ `Foo ]
or quantify the type variable:
type 'a t = [> `Foo] as 'a
which is equivalent to
type 'a t = 'a constraint 'a = [> `Foo]
This seems to work:
module F ( M : sig
type base_t = [ `Foo ]
type 'a t = [> base_t] as 'a
end) = struct
type t = [ M.base_t | `Bar ] M.t
end
M.base_t is closed, while 'a M.t is polymorphic. F constructs an instance of M.t using M.base_t extended with `Bar.
Here is a ReasonML Try link that includes the snippet above in both OCaml and ReasonML syntax and shows that it compiles.
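For completeness, here is a sketch of how the functor might be applied. The module names Base and FB and the function show are assumptions made for this example; the functor F is the one from the snippet above.

```ocaml
module F (M : sig
  type base_t = [ `Foo ]
  type 'a t = [> base_t ] as 'a
end) = struct
  type t = [ M.base_t | `Bar ] M.t
end

(* A hypothetical argument module matching the functor's signature. *)
module Base = struct
  type base_t = [ `Foo ]
  type 'a t = [> base_t ] as 'a
end

module FB = F (Base)

(* FB.t expands to the closed variant with both constructors. *)
let show : FB.t -> string = function
  | `Foo -> "Foo"
  | `Bar -> "Bar"
```

The match in show needs no catch-all case, which suggests that FB.t is indeed the closed type containing exactly `Foo and `Bar.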