ocaml GADT : why "type a." needed? - ocaml

In the basic example of GADT from§7.20 of ocaml manual, what is the meaning of 'type a.' ?
Why declaring "eval : a term -> a" is not enough ?
type _ term =
| Int : int -> int term
| Add : (int -> int -> int) term
| App : ('b -> 'a) term * 'b term -> 'a term
let rec eval : type a. a term -> a = function
| Int n -> n (* a = int *)
| Add -> (fun x y -> x+y) (* a = int -> int -> int *)
| App(f,x) -> (eval f) (eval x)

Jacque's slide on ML'2011 workshop has a nice introduction. The idea to use syntax of locally abstract type to introduce universal expression-scoped variable.

Related

Why Peano numbers in OCaml not working due to scope error?

I have the following peano number written with GADTs:
type z = Z of z
type 'a s = Z | S of 'a
type _ t = Z : z t | S : 'n t -> 'n s t
module T = struct
type nonrec 'a t = 'a t
end
type 'a nat = 'a t
type e = T : 'n nat -> e
The following function to decode a 'a nat (or 'a t) into the number it encoded, works:
let to_int : type n. n t -> int =
let rec go : type n. int -> n t -> int =
fun acc n -> match n with Z -> acc | S n -> go (acc + 1) n
in
fun x -> go 0 x
but if I try to rewrite it almost exactly the same this way:
let to_int2 (type a) (a: a nat) : int =
let rec go (type a) (acc : int) (x : a nat) : int =
match x with
| Z -> acc
| S v -> go (acc + 1) v
in
go 0 a
I get a scope error. What's the difference between the two functions?
138 | | S v -> go (acc + 1) v
^
Error: This expression has type $0 t but an expression was expected of type
'a
The type constructor $0 would escape its scope
The root issue is polymorphic recursion, GADTs are a red herring here.
Without an explicit annotation, recursive functions are not polymorphic in their own definition.
For instance, the following function has type int -> int
let rec id x =
let _discard = lazy (id 0) in
x;;
because id is not polymorphic in
let _discard = lazy (id 0) in
and thus id 0 implies that the type of id is int -> 'a which leads to id having type int -> int.
In order to define polymorphic recursive function, one need to add an explicit universally quantified annotation
let rec id : 'a. 'a -> 'a = fun x ->
let _discard = lazy (id 0) in
x
With this change, id recovers its expected 'a -> 'a type.
This requirement does not change with GADTs. Simplifying your code
let rec to_int (type a) (x : a nat) : int =
match x with
| Z -> 0
| S v -> 1 + to_int v
the annotation x: a nat implies that the function to_int only works with a nat, but you are applying to an incompatible type (and ones that lives in a too narrow scope but that is secondary).
Like in the non-GADT case, the solution is to add an explicit polymorphic annotation:
let rec to_int: 'a. 'a nat -> int = fun (type a) (x : a nat) ->
match x with
| Z -> 0
| S v -> 1 + to_int v
Since the form 'a. 'a nat -> int = fun (type a) (x : a nat) -> is both a mouthful and quite often needed with recursive function on GADTs, there is a shortcut notation available:
let rec to_int: type a. a nat -> int = fun x ->
match x with
| Z -> 0
| S v -> 1 + to_int v
For people not very familiar with GADTs, this form is the one to prefer whenever one write a GADT function. Indeed, not only this avoids the issue with polymorphic recursion, writing down the explicit type of a function before trying to implement it is generally a good idea with GADTs.
See also https://ocaml.org/manual/polymorphism.html#s:polymorphic-recursion , https://ocaml.org/manual/gadts-tutorial.html#s%3Agadts-recfun , and https://v2.ocaml.org/manual/locallyabstract.html#p:polymorpic-locally-abstract .

How to define "apply" in OCaml

I am trying to define a function that is similar to Lisp's apply. Here is my attempt:
type t =
| Str of string
| Int of int
let rec apply f args =
match args with
| (Str s)::xs -> apply (f s) xs
| (Int i)::xs -> apply (f i) xs
| [] -> f
(* Example 1 *)
let total = apply (fun x y z -> x + y + z)
[Int 1; Int 2; Int 3]
(* Example 2 *)
let () = apply (fun name age ->
Printf.printf "Name: %s\n" name;
Printf.printf "Age: %i\n" age)
[Str "Bob"; Int 99]
However, this fails to compile. The compiler gives this error message:
File "./myprog.ml", line 7, characters 25-30:
7 | | (Str s)::xs -> apply (f s) xs
^^^^^
Error: This expression has type 'a but an expression was expected of type
string -> 'a
The type variable 'a occurs inside string -> 'a
What is the meaning of this error message? How can I fix the problem and implement apply?
You cannot mix an untyped DSL for data:
type t =
| Int of int
| Float of float
and a shallow embedding (using OCaml functions as functions inside the DSL) for functions in apply
let rec apply f args =
match args with
| (Str s)::xs -> apply (f s) xs (* f is int -> 'a *)
| (Int i)::xs -> apply (f i) xs (* f is string -> 'a *)
| [] -> f (* f is 'a *)
The typechecker is complaining that if f has type 'a, f s cannot also have for type 'a since it would mean that f has simultaneously type string -> 'a and 'a (without using the recursive types flag).
And more generally, your function apply doesn't use f with a coherent type: sometimes it has type 'a, sometimes it has type int -> 'a, other times it would rather have type string -> 'a. In other words, it is not possible to write a type for apply
val apply: ??? (* (int|string) -> ... *) -> t list -> ???
You have to choose your poison.
Either go with a fully untyped DSL which contains functions, that can be applied:
type t =
| Int of int
| Float of float
| Fun of (t -> t)
exception Type_error
let rec apply f l = match f, l with
| x, [] -> f
| Fun f, a :: q -> apply (f a) q
| (Int _|Float _), _ :: _ -> raise Type_error
or use OCaml type system and define a well-typed list of arguments with a GADT:
type ('a,'b) t =
| Nil: ('a,'a) t
| Cons: 'a * ('b,'r) t -> ('a -> 'b,'r) t
let rec apply: type f r. f -> (f,r) t -> r = fun f l ->
match l with
| Nil -> f
| Cons (x,l) -> apply (f x) l
EDIT:
Using the GADT solution is quite direct since we are using usual OCaml type without much wrapping:
let three = apply (+) (Cons(1, Cons(2,Nil)))
(and we could use a heterogeneous list syntactic sugar to make this form even lighter syntactically)
The untyped DSL requires to build first a function in the DSL:
let plus = Fun(function
| Float _ | Fun _ -> raise Type_error
| Int x -> Fun(function
| Float _ | Fun _ -> raise Type_error
| Int y -> Int (x+y)
)
)
but once we have built the function, it is relatively straightforward:
let three = apply_dsl plus [Int 2; Int 1]
type t =
| Str of string
| Int of int
| Unit
let rec apply f args =
match args with
| x::xs -> apply (f x) xs
| [] -> f Unit
Let's go step by step:
line 1: apply : 'a -> 'b -> 'c (we don't know the types of f, args and apply's return type
line 2 and beginning of line 3: args : t list so apply : 'a -> t list -> 'c
rest of line 3: Since f s (s : string), f : string -> 'a but f t : f because apply (f s). This means that f contains f in its type, this is a buggy behaviour
It's actually buggy to call f on s and i because this means that f can take a string or an int, the compiler will not allow it.
And lastly, if args is empty, you return f so the return type of f is the type of f itself, another buggy part of this code.
Looking at your examples, a simple solution would be:
type t = Str of string | Int of int
let rec apply f acc args =
match args with x :: xs -> apply f (f acc x) xs | [] -> acc
(* Example 1 *)
let total =
apply
(fun acc x ->
match x with Int d -> d + acc | Str _ -> failwith "Type error")
0 [ Int 1; Int 2; Int 3 ]
(* Example 2 *)
let () =
apply
(fun () -> function
| Str name -> Printf.printf "Name: %s\n" name
| Int age -> Printf.printf "Age: %i\n" age)
() [ Str "Bob"; Int 99 ]
Since you know the type you want to work on, you don't need GADT shenanigans, just let f handle the pattern matching and work with an accumulator

Please explain the ppx_variants_conv make_matcher method signature

I found myself wanting a way to do some codegen around some large variant types in my code, and I found ppx_variants_conv (https://github.com/janestreet/ppx_variants_conv)
The make_matcher method sounds potentially useful to me, but there are no docs and tbh I am struggling to read the example signature:
val make_matcher :
a:(('a -> 'a t) Variant.t -> 'b -> ('c -> 'd) * 'e)
-> b:((char -> 'f t) Variant.t -> 'e -> (char -> 'd) * 'g)
-> c:('h t Variant.t -> 'g -> (unit -> 'd) * 'i)
-> d:((int -> int -> 'j t) Variant.t -> 'i -> (int -> int -> 'd) * 'k)
-> 'b
-> ('c t -> 'd) * 'k
a b c and d labelled arguments correspond to the cases of the variant and the first part of each signature corresponds to the constructor for each case... I get a bit lost after that 🤔
and it seems a odd to me that 'b 'c 'd and 'k appear in the latter part of the signature but not 'e 'f 'g 'h 'i 'j
In the make_matcher function each labeled argument takes a function of two arguments that has a general form, fun v x -> f, y, where v is the first-class variant that represents the corresponding constructor, x is the value that is folded over all matchers-generating functions. The function returns a pair, in which the first constituent, the function f, is actually the matcher that will be called if that variant matches and some value y that will be passed to the next matcher-generating function.
Let's do some examples to illustrate this. First, let's define some simple matcher, e.g.,
type 'a t =
| A of 'a
| B of char
| C
| D of int * int
[##deriving variants]
let matcher init = Variants.make_matcher
~a:(fun v (x1 : char) ->
(fun x -> x+1),Char.code x1)
~b:(fun v (x2 : int) ->
(fun c -> Char.code c),float x2)
~c:(fun v (x3 : float) ->
(fun () -> 0), string_of_float x3)
~d:(fun v (x4 : string) ->
(fun x y -> x + y),[x4])
init
and here's how we can use, it,
# let f,s = matcher '*';;
val f : int t -> int = <fun>
val s : Base.string list = ["42."]
# f (A 10);;
- : int = 11
# f (B '*');;
- : int = 42
# f C;;
- : int = 0
# f (D (1,2));;
- : int = 3
To be honest, I don't know the purpose of the extra parameter that is passed to each matcher-generating function1. Probably, the idea is that depending on the initial parameter we could generate different matchers. But if you don't need this, then just pass () to it and/or define your own simplified matcher that ignores this additional information, e.g.,
let make_simple_matcher ~a ~b ~c ~d =
fst ##Variants.make_matcher
~a:(fun _ _ -> a,())
~b:(fun _ _ -> b,())
~c:(fun _ _ -> c,())
~d:(fun _ _ -> d,())
()
The make_simple_matcher function has an expected type,
a:('a -> 'b) ->
b:(char -> 'b) ->
c:(unit -> 'b) ->
d:(int -> int -> 'b) ->
'a t -> 'b
1) looking into the guts of the code that generates this function doesn't help a lot, as they use the generic name acc for this parameter, which is not very helpful.

Writing an interpreter with OCaml GADTs

I am writing a small interpreter in OCaml and am using GADTs to type my expressions:
type _ value =
| Bool : bool -> bool value
| Int : int -> int value
| Symbol : string -> string value
| Nil : unit value
| Pair : 'a value * 'b value -> ('a * 'b) value
and _ exp =
| Literal : 'a value -> 'a exp
| Var : name -> 'a exp
| If : bool exp * 'a exp * 'a exp -> 'a exp
and name = string
exception NotFound of string
type 'a env = (name * 'a) list
let bind (n, v, e) = (n, v)::e
let rec lookup = function
| (n, []) -> raise (NotFound n)
| (n, (n', v)::e') -> if n=n' then v else lookup (n, e')
let rec eval : type a. a exp -> a value env -> a value = fun e rho ->
match e with
| Literal v -> v
| Var n -> lookup (n, rho)
| If (b, l, r) ->
let Bool b' = eval b rho in
if b' then eval l rho else eval r rho
But I cannot get my code to compile. I get the following error:
File "gadt2.ml", line 33, characters 33-36:
Error: This expression has type a value env = (name * a value) list
but an expression was expected of type
bool value env = (name * bool value) list
Type a is not compatible with type bool
My understanding is that for some reason rho is being coerced into a bool value env, but I don't know why. I also tried the following:
let rec eval : 'a. 'a exp -> 'a value env -> 'a value = fun e rho ->
match e with
| Literal v -> v
| Var n -> lookup (n, rho)
| If (b, l, r) ->
let Bool b = eval b rho in
if b then eval l rho else eval r rho
But I am not sure how exactly that is different, and it also gives me an error -- albeit a different one:
File "gadt2.ml", line 38, characters 56-247:
Error: This definition has type bool exp -> bool value env -> bool value
which is less general than 'a. 'a exp -> 'a value env -> 'a value
Guidance on GADTs, differences between the two evals, and this particular problem are all appreciated. Cheers.
The type 'a env is intended to represent a list of name/value bindings, but the values in a list must all be the same type. Two different value types (such as bool value and int value) are not the same type. If eval b rho returns Bool b, rho must be a list of string * bool value. So eval l rho and eval r rho will return bool value. But your annotation says the function returns a value.
There are a few possible approaches to typed binding with GADTs. Here's a design that associates type info with both variables and environment entries.
Environment lookup involves attempting to construct a correspondence between the types of the variable and the environment entry (which is a bit slow, but does recover the type in a safe way). This is what allows the lookup to return an unwrapped value of arbitrary type.
type var = string
type _ ty =
| TyInt : int ty
| TyArrow : 'a ty * 'b ty -> ('a -> 'b) ty
type _ term =
| Int : int -> int term
| Var : 'a ty * var -> 'a term
| Lam : 'a ty * var * 'b term -> ('a -> 'b) term
| App : ('a -> 'b) term * 'a term -> 'b term
type ('a, 'b) eq = Refl : ('a, 'a) eq
let rec types_equal : type a b . a ty -> b ty -> (a, b) eq option =
fun a b ->
match a, b with
| TyInt, TyInt -> Some Refl
| TyArrow (x1, y1), TyArrow (x2, y2) ->
begin match types_equal x1 x2, types_equal y1 y2 with
| Some Refl, Some Refl -> Some Refl
| _, _ -> None
end
| _, _ -> None
type env = Nil | Cons : var * 'a ty * 'a * env -> env
let rec lookup : type a . a ty -> var -> env -> a =
fun ty var -> function
| Nil -> raise Not_found
| Cons (xname, xty, x, rest) ->
if var = xname then
match types_equal ty xty with
| Some Refl -> x
| None -> assert false
else
lookup ty var rest
let rec eval : type a . env -> a term -> a =
fun env -> function
| Int n -> n
| Var (ty, var) -> lookup ty var env
| App (f, x) -> (eval env f) (eval env x)
| Lam (arg_ty, arg_name, body) ->
fun arg_value ->
eval (Cons (arg_name, arg_ty, arg_value, env)) body
It is possible to have a typed interpreter that avoids the type reconstruction (and the string comparison!) by enforcing the correspondence between variable indices and environments at the type level, but that gets complicated.

GADTs for Representing Function Application with Multiple Parameters (AST)

I saw in the OCaml manual this example to use GADT for an AST with function application:
type _ term =
| Int : int -> int term
| Add : (int -> int -> int) term
| App : ('b -> 'a) term * 'b term -> 'a term
let rec eval : type a. a term -> a = function
| Int n -> n
| Add -> (fun x y -> x+y)
| App(f,x) -> (eval f) (eval x)
Is this the right way of representing function application for a language not supporting partial application?
Is there a way to make a GADT supporting function application with an arbitrary number of arguments?
Finally, is GADT a good way to represent a typed AST? Is there any alternative?
Well, partial eval already works here:
# eval (App(App(Add, Int 3),Int 4)) ;;
- : int = 7
# eval (App(Add, Int 3)) ;;
- : int -> int = <fun>
# eval (App(Add, Int 3)) 4 ;;
- : int = 7
What you don't have in this small gadt is abstraction (lambdas), but it's definitely possible to add it.
If you are interested in the topic, there is an abundant (academic) literature. This paper presents various encoding that supports partial evaluation.
There are also non-Gadt solutions, as shown in this paper.
In general, GADT are a very interesting way to represent evaluators. They tend to fall a bit short when you try to transform the AST for compilations (but there are ways).
Also, you have to keep in mind that you are encoding the type system of the language you are defining in your host language, which means that you need an encoding of the type feature you want. Sometimes, it's tricky.
Edit: A way to have a GADT not supporting partial eval is to have a special value type not containing functions and a "functional value" type with functions. Taking the simplest representation of the first paper, we can modify it that way:
type _ v =
| Int : int -> int v
| String : string -> string v
and _ vf =
| Base : 'a v -> ('a v) vf
| Fun : ('a vf -> 'b vf) -> ('a -> 'b) vf
and _ t =
| Val : 'a vf -> 'a t
| Lam : ('a vf -> 'b t) -> ('a -> 'b) t
| App : ('a -> 'b) t * 'a t -> 'b t
let get_val : type a . a v -> a = function
| Int i -> i
| String s -> s
let rec reduce : type a . a t -> a vf = function
| Val x -> x
| Lam f -> Fun (fun x -> reduce (f x))
| App (f, x) -> let Fun f = reduce f in f (reduce x)
let eval t =
let Base v = reduce t in get_val v
(* Perfectly defined expressions. *)
let f = Lam (fun x -> Lam (fun y -> Val x))
let t = App (f, Val (Base (Int 3)))
(* We can reduce t to a functional value. *)
let x = reduce t
(* But we can't eval it, it's a type error. *)
let y = eval t
(* HOF are authorized. *)
let app = Lam (fun f -> Lam (fun y -> App(Val f, Val y)))
You can make that arbitrarly more complicated, following your needs, the important property is that the 'a v type can't produce functions.