I want to exploit GADTs to implement the type ('a, 'b) liInstr_t in order to hold various types of instructions which are recursively decoded into basic operations (if need be) that are then executed. (Eventually, I should construct more abstract types over them, but in a mechanical, compositional and scripted fashion.) Unfortunately, I have difficulty associating the locally abstract types from the pattern-matched function argument with the alternative concrete return types desired for the GADT.
I believe I'm missing something fundamental or making wrong assumptions, though I have looked at the OCaml 4.10.0 manual on locally abstract types and GADTs, the Real World OCaml book, and the responses to similar questions such as here and here. I seem to follow their explanations but somehow cannot apply them to my task.
From the above, I understand that polymorphic type variables annotate functions so that they can take on (be unified with) arbitrary types compatible with their constraints, and that locally abstract types let us have different types along alternative paths through a pattern-matching expression, say. Also, the local abstract types cannot be unified but can be refined to concrete types compatible with GADT result types. As such, GADTs can recurse over polymorphic types, and aggregate multiple result types into a single sum type.
I deliberately let the type ('a, 'b) liInstr_t have two type variables (so I can add more later), and its variants capture various constraint formats and scenarios I may have to use together.
type
liLabel_t = string (* Instruction name (label) *)
and
context_t = string (* TODO: execution context *)
and
'a context_list_t = 'a list
and
'a liChooser_t = 'a -> int (* get index of i-th list entry *)
and
('a, 'b) liInstr_t =
LiExec: 'a -> ('a, 'b) liInstr_t (* executable operation *)
| LiExecTRY: ('a, _) liInstr_t (* Ignore: Experiment on GADT *)
| LiLab: liLabel_t -> ('a, 'b) liInstr_t (* instruction label *)
| LiLabTRY: (liLabel_t, _) liInstr_t (* Ignore: Experiment on GADT *)
| LiSeq: 'a liChooser_t * 'b list -> ('a, 'b) liInstr_t (* sequence *)
| LiAlt: 'a liChooser_t * 'b list -> ('a, 'b) liInstr_t (* choice *)
| LiLoop: 'a liChooser_t * 'b list -> ('a, 'b) liInstr_t (* loop *)
| LiName: 'a liChooser_t * liLabel_t * 'b context_list_t ->
('a, 'b) liInstr_t (* change context *)
| Err_LiInstr: ('a, 'b) liInstr_t (* error handling *)
| Nil_LiInstr: ('a, 'b) liInstr_t (* no action *)
After experimenting, the sample function used is:
let ft1: type b c. (b, c) liInstr_t -> b = function
(* *) | LiExec n -> n
(* *) | LiExecTRY -> "4"
(* *) | LiLab s -> "LiLab"
(* *) | LiLabTRY -> "LiLabTRY"
(* *) | LiSeq (f, il) -> "LiSeq"
(* *) | LiAlt (f, il) -> "LiAlt"
(* *) | LiLoop (f, il) -> "LiLoop"
(* *) | LiName (f, il, ic) -> "LiName"
(* *) | Err_LiInstr -> "Err_LiInstr"
(* *) | Nil_LiInstr -> "Nil_LiInstr"
;;
and it gave the error:
Line 3, characters 22-25:
3 | (* *) | LiExecTRY -> "4"
^^^
Error: This expression has type string but an expression was expected of type
b
I still got errors when I changed the function annotation (and typing), or commented out some alternatives in the function pattern matching and the GADT type variants. Some of the errors (elided for brevity) were obtained as follows:
Using an extra locally-typed variable:
let ft1 : type b c d. (b, c) liInstr_t -> d = function ...
2 | (* *) | LiExec n -> n
^
Error: This expression has type b but an expression was expected of type d
Using only polymorphic type variables:
let ft1: 'b 'c. ('b, 'c) liInstr_t -> 'b = function ...
Error: This definition has type 'c. (liLabel_t, 'c) liInstr_t -> liLabel_t
which is less general than 'b 'c. ('b, 'c) liInstr_t -> 'b
My questions then are the following:
How can we capture and use the abstract types identified with alternative paths? A locally abstract type should bind (or be refined) to compatible concrete type(s) for values found in the resulting expression, or can be ignored, right? Ignoring recursion, this example:
let rec eval : type a. a term -> a = function
| Int n -> n (* a = int *)
| Add -> (fun x y -> x+y) (* a = int -> int -> int *)
| App(f,x) -> (eval f) (eval x)
(* eval called at types (b->a) and b for fresh b *)
on expression evaluation in the OCaml manual seems to suggest that this is the case, at least for a 1-parameter GADT type. So why aren't my types b and c suitably bound (or refined) in the return type? And if they are being bound (or refined), which should bind to abstract type b and which to c, if at all? How can I find values for my return type so that they correctly associate with the abstract, value-less types reaching them? There seems to be no way to obtain a result that has the type b in my first error above!
Why am I forced to have the same result type for the alternative paths to succeed (string type in my examples) whereas all possible result types of the GADT should be admissible. In this regard, the first variant (LiExec n -> n) seemed forced to have type string! Also, the abstract types and polymorphic variables along execution paths seem irrelevant to the result type!
I could not reproduce it but at one point, making the first variant LiExec n -> 4 seemed to require integer return values from all alternative pattern matches. If indeed this is the case, why should abstract types on alternative paths require values from the same non-GADT return type? (This behaviour is of non-polymorphic types, right?)
To work around incomprehensible issues on polymorphic types and locally abstract types, is there a simple way to mix them in a constraint? Various permutations to mix them always seem to result in a syntax error. e.g.:
let ft1: (type d) 'b 'c. ('b, 'c) liInstr_t -> d = function
^^^^
Error: Syntax error
Suppose we have the following GADT:
type _ simple_gadt =
| Con : 'a -> 'a simple_gadt
The type signature of Con can be understood as ('a : Type) -> 'a -> 'a simple_gadt (not real OCaml syntax); in other words, it takes a type as its first argument, and the rest of the type is dependent on this input type. The client provides the type; for example:
let value : int simple_gadt = Con 0
Implicitly, you can understand this definition as really meaning let value = Con(type int, 0), where the type is given as an argument (again, not real OCaml syntax).
When you write a function that takes a 'a simple_gadt as an argument, you don't know what 'a is. 'a is said to be an "existential type" provided by the caller of the function. Consider the following function:
let f (type a) (param : a simple_gadt) : a = match param with
| Con x -> x
The type of f is 'a . 'a simple_gadt -> 'a. A client can evaluate f (Con 0) and get back 0, of type int. A client can also evaluate f (Con true) and get back true, of type bool. The definition of the function has no control over what the actual type 'a is; only the caller does.
Suppose we attempt to define:
let g (type a) (param : a simple_gadt) : a = match param with
| Con _ -> ""
One would be able to evaluate g (Con 0) and get back "", a string, but based on the type of Con 0, the output of the function should be an int. This is clearly a type error, so g has an ill-typed definition, and the compiler rightfully rejects it. Likewise, your definition
let ft1: type b c. (b, c) liInstr_t -> b = function
(* ... *)
(* *) | LiExecTRY -> "4"
(* ... *)
is ill-typed because it assumes that b is string, while b could be any type that the caller provides. It looks like you have other similar type errors because you are attempting to pick more specific types for the existential types.
If the caller can choose any type, how can one use GADTs to "refine" the type variable to a more concrete type? The only way to do this is through the information that the caller provides.
Consider the following type definition:
type _ term =
| Abs : ('a -> 'b) -> ('a -> 'b) term
| App : ('a -> 'b) term * 'a term -> 'b term
| Bool : bool -> bool term
In a GADT, each constructor can make the type parameters more specific. Therefore, by pattern matching against each constructor, a function can refine the existential type parameter.
Consider this function on the GADT defined above:
let rec eval : 'a . 'a term -> 'a =
fun (type a) (term : a term) : a ->
match term with
| Abs f -> f
| App(f, x) -> (eval f) (eval x)
| Bool b -> b
In the Abs f case, Abs f is known to have type ('a -> 'b) term for some 'a and 'b by the definition of Abs. Similar reasoning applies for the App(f, x) and Bool b cases.
What's a universally quantified type from the caller's perspective (i.e. the caller can pick any type) must be an existentially quantified type from the callee's perspective (i.e. the callee must work with some fixed arbitrary type that the caller provides).
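To make the refinement concrete, here is a minimal, self-contained sketch of the same term type and eval, written with the shorter type a. annotation, followed by two calls. The names negated and identity are only illustrative and do not come from the question.

```ocaml
(* Each constructor fixes the type parameter of the term it builds. *)
type _ term =
  | Abs : ('a -> 'b) -> ('a -> 'b) term
  | App : ('a -> 'b) term * 'a term -> 'b term
  | Bool : bool -> bool term

(* "type a." makes a locally abstract AND allows polymorphic recursion. *)
let rec eval : type a. a term -> a = function
  | Abs f -> f                          (* here a = x -> y for some x, y *)
  | App (f, x) -> (eval f) (eval x)     (* eval used at two other types *)
  | Bool b -> b                         (* here a = bool *)

(* The caller picks the type by the way it builds the term. *)
let negated : bool = eval (App (Abs not, Bool true))
let identity : bool -> bool = eval (Abs (fun b -> b))
```

In every branch the returned value comes from the argument itself, which is why the result can have whatever type the caller chose.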
In brief, the type liInstr_t as defined is not an interesting GADT, and it can be rewritten as the strictly equivalent ADT
type ('a, 'b) liInstr_t =
| LiExec of 'a
| LiExecTRY
| LiLab of liLabel_t
| LiLabTRY
| LiSeq of 'a liChooser_t * 'b list
| LiAlt of 'a liChooser_t * 'b list
| LiLoop of 'a liChooser_t * 'b list
| LiName of 'a liChooser_t * liLabel_t * 'b context_list_t
| Err_LiInstr
| Nil_LiInstr
because the type declaration never introduces equations (or existential quantifications) between the result type and the constructor of the GADT.
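With an ordinary ADT like this, the usual options are to return one fixed type from every branch, or to return the payload wrapped in an option. The following is a rough sketch under the assumption that a trimmed-down version of the type is enough for illustration; the functions describe and payload are hypothetical names, not part of the question.

```ocaml
type liLabel_t = string
type 'a context_list_t = 'a list
type 'a liChooser_t = 'a -> int

(* Trimmed copy of the ADT, keeping one constructor of each shape. *)
type ('a, 'b) liInstr_t =
  | LiExec of 'a
  | LiLab of liLabel_t
  | LiSeq of 'a liChooser_t * 'b list
  | LiName of 'a liChooser_t * liLabel_t * 'b context_list_t
  | Nil_LiInstr

(* Every branch returns a string, so the result type is simply string. *)
let describe : ('a, 'b) liInstr_t -> string = function
  | LiExec _ -> "LiExec"
  | LiLab _ -> "LiLab"
  | LiSeq _ -> "LiSeq"
  | LiName _ -> "LiName"
  | Nil_LiInstr -> "Nil_LiInstr"

(* A value of type 'a can only come out of the argument itself. *)
let payload : ('a, 'b) liInstr_t -> 'a option = function
  | LiExec n -> Some n
  | _ -> None
```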
If we look at a simple example for GADT:
type ('elt,'array) compact_array =
| String: (char, String.t) compact_array
| Float_array: (float, Float.Array.t) compact_array
| Standard: ('a, 'a array) compact_array
let init: type elt array.
(elt,array) compact_array -> int -> (int -> elt) -> array =
fun kind n f -> match kind with
| String -> String.init n f
| Float_array -> Float.Array.init n f
| Standard -> Array.init n f
The difference is that the constructor String constrains the type of compact_array to be (char, string) compact_array. Thus, when I observe String in the pattern matching above, I can introduce the equations elt=char and array=string in the branch String and use those equations locally. Similarly, after observing the constructor Float_array in the pattern matching, I can work with the equations elt=float and array=Float.Array.t inside the corresponding branch.
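As a quick check of how the equations show up at call sites, here is a self-contained sketch of the same definitions with three calls. It assumes OCaml 4.08 or later for Float.Array, and it renames the locally abstract type to arr (my choice, purely to avoid confusion with the built-in array type).

```ocaml
type ('elt, 'array) compact_array =
  | String : (char, String.t) compact_array
  | Float_array : (float, Float.Array.t) compact_array
  | Standard : ('a, 'a array) compact_array

let init : type elt arr.
  (elt, arr) compact_array -> int -> (int -> elt) -> arr =
  fun kind n f ->
    match kind with
    | String -> String.init n f             (* elt = char,  arr = string *)
    | Float_array -> Float.Array.init n f   (* elt = float, arr = Float.Array.t *)
    | Standard -> Array.init n f            (* arr = elt array *)

(* The constructor passed by the caller determines the result type. *)
let s : string = init String 3 (fun i -> Char.chr (Char.code 'a' + i))
let fa : Float.Array.t = init Float_array 2 float_of_int
let ia : int array = init Standard 3 (fun i -> i * i)
```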
By contrast, with the definition of liInstr_t as it stands, observing a constructor of a value of type ('a, 'b) liInstr_t brings no information about the parameters 'a and 'b. Consequently, the function ft1 of type type b c. (b, c) liInstr_t -> b is promising to return a float array when called as ft1 (LiExecTRY : (float array, 'c) liInstr_t). More generally, a function of type type a b. (a, b) liInstr_t -> a, where no constructor imposes a constraint on a, must return some value of type a that was contained inside its (a, b) liInstr_t argument (or not return at all).
Using that knowledge, we can update your type liInstr_t to make the function ft1 work by adding the equations corresponding to the expected return type of ft1 to the definition of the type:
type liLabel_t = string
and context_t = string
and 'a context_list_t = 'a list
and 'a liChooser_t = 'a -> int
and ('a, 'b, 'ft1) liInstr_t =
| LiExec: 'a -> ('a, 'b,'a) liInstr_t (* ft1 returns the argument of LiExec *)
(* ft1 returns a string in all other cases *)
| LiExecTRY: ('a, 'b, string) liInstr_t
| LiLab: liLabel_t -> ('a, 'b, string) liInstr_t
| LiLabTRY: (liLabel_t, 'b, string) liInstr_t
| LiSeq: 'a liChooser_t * 'b list -> ('a,'b, string) liInstr_t
| LiAlt: 'a liChooser_t * 'b list -> ('a,'b, string) liInstr_t
| LiLoop: 'a liChooser_t * 'b list -> ('a,'b, string) liInstr_t
| LiName: 'a liChooser_t * liLabel_t * 'b context_list_t ->
('a,'b, string) liInstr_t
| Err_LiInstr: ('a, 'b, string) liInstr_t
| Nil_LiInstr: ('a, 'b, string) liInstr_t
and now that we have the right equation in place, we can define ft1 as:
let ft1: type a b c. (a, b, c) liInstr_t -> c = function
| LiExec n -> n
| LiExecTRY -> "4"
| LiLab s -> "LiLab"
| LiLabTRY -> "LiLabTRY"
| LiSeq (f, il) -> "LiSeq"
| LiAlt (f, il) -> "LiAlt"
| LiLoop (f, il) -> "LiLoop"
| LiName (f, il, ic) -> "LiName"
| Err_LiInstr -> "Err_LiInstr"
| Nil_LiInstr -> "Nil_LiInstr"
which typechecks without any error.
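To see what callers get back, here is a sketch using a shortened copy of the three-parameter type (only some constructors are kept, which is my simplification) and a few calls. The names as_int, as_float and as_string are illustrative.

```ocaml
type liLabel_t = string
type 'a liChooser_t = 'a -> int

(* Shortened copy: the third parameter records what ft1 returns. *)
type ('a, 'b, 'ft1) liInstr_t =
  | LiExec : 'a -> ('a, 'b, 'a) liInstr_t
  | LiExecTRY : ('a, 'b, string) liInstr_t
  | LiLab : liLabel_t -> ('a, 'b, string) liInstr_t
  | LiSeq : 'a liChooser_t * 'b list -> ('a, 'b, string) liInstr_t
  | Nil_LiInstr : ('a, 'b, string) liInstr_t

let ft1 : type a b c. (a, b, c) liInstr_t -> c = function
  | LiExec n -> n                 (* c = a *)
  | LiExecTRY -> "4"              (* c = string in all remaining cases *)
  | LiLab _ -> "LiLab"
  | LiSeq (_, _) -> "LiSeq"
  | Nil_LiInstr -> "Nil_LiInstr"

(* The result type now follows the constructor that was used. *)
let as_int : int = ft1 (LiExec 42)
let as_float : float = ft1 (LiExec 1.5)
let as_string : string = ft1 (LiSeq ((fun _ -> 0), [1; 2]))
```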
Here is a simple OCaml module type for a monad:
module type Monad = sig
type 'a t
val return : 'a -> 'a t
val bind : 'a t -> ('a -> 'b t) -> 'b t
end
I can instantiate this with any particular monad, such as the reader monad for some type r:
module Reader_monad : Monad = struct
type 'a t = r -> 'a
let return a = fun _ -> a
let bind o f = fun x -> f (o x) x
end
And I can parametrize it over the type r by using a functor:
module type Readable = sig type r end
module Reader (R : Readable) : Monad = struct
type 'a t = R.r -> 'a
let return a = fun _ -> a
let bind o f = fun x -> f (o x) x
end
However, the latter approach requires that I instantiate different instances of the functor for different types r. Is there any way to define a "parametrically polymorphic" module of type Monad that would give parametrically polymorphic functions like return : 'a -> ('r -> 'a)?
I think I can get more or less what I want with a separate module type for "families of monads":
module type Monad_family = sig
type ('c, 'a) t
val return : 'a -> ('c, 'a) t
val bind : ('c, 'a) t -> ('a -> ('c, 'b) t) -> ('c, 'b) t
end
module Reader_family : Monad_family = struct
type ('c, 'a) t = 'c -> 'a
let return a = fun _ -> a
let bind o f = fun x -> f (o x) x
end
But if I have a substantial library of general facts about monads, this would require modifying it everywhere manually to use families. And then some monads are parametrized by a pair of types (although I suppose that could be encoded by a product type), etc. So I would rather avoid having to do it this way.
If this isn't directly possible, is there at least a way to instantiate the module Reader locally inside a parametrically polymorphic function? I thought I might be able to do this with first-class modules, but my naive attempt
let module M = Reader(val (module struct type r = int end) : Readable) in M.return "hello";;
produces the error message
Error: This expression has type string M.t
but an expression was expected of type 'a
The type constructor M.t would escape its scope
which I don't understand. Isn't the type M.t equal to int -> string?
I think this is the same issue as The type constructor "..." would escape its scope when using first class modules, where the module M doesn't live long enough. If you instead wrote
# module M = Reader(struct type r = int end);;
# M.return "hello";;
- : string M.t = <fun>
then this would work fine.
Separately, the Reader functor loses some type equalities that you might want. You can restore them by defining it as such:
module Reader (R : Readable) : Monad with type 'a t = R.r -> 'a = struct
type 'a t = R.r -> 'a
let return a = fun _ -> a
let bind o f = fun x -> f (o x) x
end
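With the type equality exposed, the functor can also be instantiated locally inside a function, because string M.t can then be expanded to a type that exists outside the local module. A minimal sketch, assuming the hypothetical names Int_env and greet:

```ocaml
module type Monad = sig
  type 'a t
  val return : 'a -> 'a t
  val bind : 'a t -> ('a -> 'b t) -> 'b t
end

module type Readable = sig type r end

(* The "with type" constraint keeps 'a t transparent to clients. *)
module Reader (R : Readable) : Monad with type 'a t = R.r -> 'a = struct
  type 'a t = R.r -> 'a
  let return a = fun _ -> a
  let bind o f = fun x -> f (o x) x
end

module Int_env = struct type r = int end

(* M is local, but string M.t expands to int -> string, so nothing escapes. *)
let greet () : int -> string =
  let module M = Reader (Int_env) in
  M.bind (M.return "hello") (fun s -> fun n -> s ^ " " ^ string_of_int n)
```

If the functor result is sealed with plain Monad instead, M.t stays abstract and the same function is rejected with the "would escape its scope" error.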
I understand that you can't do this, but want to understand precisely why.
module M : sig
type 'a t
val call : 'a t -> 'a option
end = struct
type 'a t
let state : ('a t -> 'a option) ref = ref (fun _ -> None)
let call : ('a t -> 'a option) = fun x -> !state x
end
Results in:
Error: Signature mismatch:
Modules do not match:
sig
type 'a t
val state : ('_a t -> '_a option) ref
val call : '_a t -> '_a option
end
is not included in
sig
type 'a t
val call : 'a t -> 'a option
end
Values do not match:
val call : '_a t -> '_a option
is not included in
val call : 'a t -> 'a option
Why are the abstract types not compatible here?
My gut tells me it has everything to do with early vs late binding, but I'm looking for an exact description of what the type system is doing here.
One way to look at it is that your field state can't have the polymorphic value you ascribe to it, because mutable values can't be polymorphic. References are at most monomorphic (as indicated by the '_a notation for the type variable).
If you just try to declare a similar reference in the toplevel, you'll see the same effect:
# let lfr: ('a list -> 'a option) ref = ref (fun x -> None);;
val lfr : ('_a list -> '_a option) ref = {contents = <fun>}
The type variable '_a indicates some single type that hasn't yet been determined.
The reason that references can't be polymorphic is that it's unsound. If you allow references to be generalized (polymorphic) it's easy to produce programs that go horribly wrong. (In practice this usually means a crash and core dump.)
The issue of soundness is discussed near the beginning of this paper: Jacques Garrigue, Relaxing the Value Restriction (which I refer to periodically when I forget how things work).
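A small sketch of what "some single type that hasn't yet been determined" means in practice: the first use of a weakly polymorphic reference fixes its type for good. The names cell and total are illustrative; recent OCaml versions print the weak variable as '_weak1 rather than '_a.

```ocaml
(* cell has a weak type: its element type is still undetermined here. *)
let cell = ref []

(* The first use fixes the element type to int, once and for all. *)
let () = cell := [1; 2]

(* From here on cell is an int list ref; writing
     cell := ["a"]
   would be rejected by the type checker. *)
let total = List.fold_left ( + ) 0 !cell
```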
Update
What I think you want is "rank 2 polymorphism". I.e., you want a field whose type is polymorphic. You can actually get this in OCaml as long as you declare the type. The usual method is to use a record type:
# type lfrec = { mutable f: 'a. 'a list -> 'a option };;
type lfrec = { mutable f : 'a. 'a list -> 'a option; }
# let x = { f = fun x -> None };;
val x : lfrec = {f = <fun>}
# x.f ;;
- : 'a list -> 'a option = <fun>
The following code compiles for me using lfrec instead of a reference:
module M : sig
type 'a t
val call : 'a t -> 'a option
end = struct
type 'a t
type lfrec = { mutable f: 'a. 'a t -> 'a option }
let state: lfrec = { f = fun _ -> None }
let call : ('a t -> 'a option) = fun x -> state.f x
end
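To show that the polymorphic field really stays polymorphic after being mutated, here is a sketch based on the list version of lfrec from above. The helper call and the replacement function are assumptions made for the example.

```ocaml
(* The field f must be polymorphic in 'a for every value stored in it. *)
type lfrec = { mutable f : 'a. 'a list -> 'a option }

let state = { f = (fun _ -> None) }

(* call is fully polymorphic because it is a syntactic function. *)
let call l = state.f l

let before_int = call [1; 2]
let before_str = call ["a"]

(* Only a genuinely polymorphic function may be assigned to the field. *)
let () = state.f <- (function [] -> None | x :: _ -> Some x)

let after_int = call [1; 2]
let after_str = call ["a"]
```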
In attempting to implement the exercises from Purely Functional Data Structures in OCaml I'm not sure how I can create instances of my solutions.
Say I have the following code:
module type Stack =
sig
type 'a t
val empty : 'a t
val isEmpty : 'a t -> bool
val cons : 'a -> 'a t -> 'a t
val head : 'a t -> 'a
val tail : 'a t -> 'a t
end
(* Implementation using OCaml lists *)
module MyStack : Stack = struct
type 'a t = 'a list
exception Empty
let empty = []
let isEmpty l =
match l with
| [] -> true
| _ -> false
let cons x l = x :: l
let head l =
match l with
| h :: _ -> h
| [] -> raise Empty
let tail l =
match l with
| _ :: r -> r
| [] -> raise Empty
end
I want to provide a Make function similar to Set.Make(String) for creating a specialised instance.
But I'm not sure how to do that.
Seems to me it's natural to parameterize a set by a notion of order (or you could get away with just equality). But a stack doesn't need to be parameterized in that way; i.e., it doesn't depend on a notion of order or equality. It just depends on algebraic properties of its structure.
You already have a parametrically polymorphic module that can be used to make a stack of any type of object.
I'm looking at the code for the Set module. If you want to make a functor like Set.Make, you need a module type for the elements. Since you can use any type at all (unlike Set, which needs an ordered type), you could use something like this:
module type AnyType = sig type t end
Then your functor might look like this (again, I'm just copying code from the Set module):
module Make(Any: AnyType) =
struct
type elt = Any.t
type t = elt list
...
end
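For illustration only, here is one way such a functor could be completed. The body is my own sketch modelled on the list-based MyStack from the question, not the code elided above, and IntStack is a hypothetical instance.

```ocaml
module type AnyType = sig type t end

module Make (Any : AnyType) = struct
  type elt = Any.t
  type t = elt list              (* monomorphic stack of elt values *)
  exception Empty
  let empty : t = []
  let isEmpty : t -> bool = function [] -> true | _ -> false
  let cons (x : elt) (s : t) : t = x :: s
  let head : t -> elt = function h :: _ -> h | [] -> raise Empty
  let tail : t -> t = function _ :: r -> r | [] -> raise Empty
end

(* A specialised instance, in the style of Set.Make (String). *)
module IntStack = Make (struct type t = int end)

let s = IntStack.(cons 1 (cons 2 empty))
```

Compared with the polymorphic MyStack, this buys nothing in expressiveness; it only fixes the element type per instance.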
Update
If you just want to try out your stack code as is, you can just start using it:
$ ocaml
OCaml version 4.01.0
# #use "mystack.ml";;
module type Stack =
sig
type 'a t
val empty : 'a t
val isEmpty : 'a t -> bool
val cons : 'a -> 'a t -> 'a t
val head : 'a t -> 'a
val tail : 'a t -> 'a t
end
module MyStack : Stack
# let x = MyStack.cons 3 MyStack.empty;;
val x : int MyStack.t = <abstr>
# MyStack.head x;;
- : int = 3
#
module MapHelpers (Ord : Map.OrderedType) = struct
include Map.Make (Ord)
let add_all a b = fold add a b
end
works but the seemingly equivalent
module MapHelpers (Ord : Map.OrderedType) = struct
include Map.Make (Ord)
let add_all = fold add
end
fails to compile with
File "Foo.ml", line 2, characters 18-104:
Error: The type of this module,
functor (Ord : Map.OrderedType) ->
sig
...
val add_all : '_a t -> '_a t -> '_a t
end,
contains type variables that cannot be generalized
Command exited with code 2.
and adding an explicit type annotation
: 'a . 'a t -> 'a t -> 'a t
causes compilation to fail earlier with
Error: This definition has type 'a t -> 'a t -> 'a t
which is less general than 'a0. 'a0 t -> 'a0 t -> 'a0 t
Why does adding the explicit formals a b change the way these two modules are typed?
This is a consequence of the value restriction, as described in the following FAQ item:
A function obtained through partial application is not polymorphic enough
The more common case to get a ``not polymorphic enough'' definition is when defining a function via partial application of a general polymorphic function. In Caml polymorphism is introduced only through the “let” construct, and results from application are weakly polymorph; hence the function resulting from the application is not polymorph. In this case, you recover a fully polymorphic definition by clearly exhibiting the functionality to the type-checker : define the function with an explicit functional abstraction, that is, add a function construct or an extra parameter (this rewriting is known as eta-expansion):
# let map_id = List.map (function x -> x) (* Result is weakly polymorphic *)
val map_id : '_a list -> '_a list = <fun>
# map_id [1;2]
- : int list = [1;2]
# map_id (* No longer polymorphic *)
- : int list -> int list = <fun>
# let map_id' l = List.map (function x -> x) l
val map_id' : 'a list -> 'a list = <fun>
# map_id' [1;2]
- : int list = [1;2]
# map_id' (* Still fully polymorphic *)
- : 'a list -> 'a list = <fun>
The two definitions are semantically equivalent, and the new one can be assigned a polymorphic type scheme, since it is no more a function application.
See also this discussion about what the _ in '_a indicates -- weak, non-polymorphic type variables.
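Applied to the MapHelpers functor from the question, the eta-expanded definition can be used at several value types within one instance. A brief sketch, where SM, ints and strs are illustrative names:

```ocaml
module MapHelpers (Ord : Map.OrderedType) = struct
  include Map.Make (Ord)
  (* Eta-expanded, so add_all is generalized: 'a t -> 'a t -> 'a t *)
  let add_all a b = fold add a b
end

module SM = MapHelpers (String)

(* The same add_all is used with int values and with string values. *)
let ints = SM.add_all (SM.singleton "a" 1) (SM.singleton "b" 2)
let strs = SM.add_all (SM.singleton "x" "one") SM.empty
```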