How do I interpret this GADT error in OCaml? - ocaml

Sorry for the "what am I missing here" style of question, but I can't see what I'm doing wrong.
I was trying to understand how GADTs work in OCaml, I define the following (in utop):
type value =
| Bool : bool -> value
| Int : int -> value
;;
type _ value =
| Bool : bool -> bool value
| Int : int -> int value
;;
type _ expr =
| Value : 'a value -> 'a expr
| If : bool expr * 'a expr * 'a expr -> 'a expr
| Lt : 'a expr * 'a expr -> bool expr
| Eq : 'a expr * 'a expr -> bool expr
| Gt : 'a expr * 'a expr -> bool expr
;;
I defined an eval function:
let rec eval : type a. a expr -> a = function
| Value (Int i) -> i
| Value (Bool b) -> b
| Lt (a, b) -> (eval a) < (eval b)
| Gt (a, b) -> (eval a) > (eval b)
| Eq (a, b) -> (eval a) = (eval b)
| If (c, a, b) -> if eval c then (eval a) else (eval b)
;;
but got an error:
Line 4, characters 15-23:
Error: This expression has type $Lt_'a but an expression was expected of type
int
What exactly does this mean?
Just to test further, I modified the expression GADT to be:
type _ expr =
| Value : 'a value -> 'a expr
| If : bool expr * 'a expr * 'a expr -> 'a expr
| Lt : int expr * int expr -> bool expr
| Eq : 'a expr * 'a expr -> bool expr
| Gt : int expr * int expr -> bool expr
;;
and then I see
Line 6, characters 15-23:
Error: This expression has type $Eq_'a but an expression was expected of type
int
When I finally modify it to be
type _ expr =
| Value : 'a value -> 'a expr
| If : bool expr * 'a expr * 'a expr -> 'a expr
| Lt : int expr * int expr -> bool expr
| Eq : int expr * int expr -> bool expr
| Gt : int expr * int expr -> bool expr
;;
it works fine.
Update (more context):
Ocaml version: 4.08.1
Libraries opened during this session: Base
Update (solution):
It turned out to be (as mentioned in the first line of the selected answer) because I had previously run open Base ;; within utop.
In a fresh session I'm able to enter the types initially mentioned and eval is happy with that.

The direct cause of the error is that you are using a library (maybe Base or Core?) that shadows the polymorphic comparison operators (<, <=, =, >=, >) and replaces them with integer comparison operators.
Concerning the error message, when you pattern match a GADT constructor with existential types,
| Lt (a, b) -> (eval a) < (eval b)
the typechecker introduces new types to represent the existential types.
Here, in the (original) definition of Lt,
| Lt : 'a expr * 'a expr -> bool expr
there is one existentially quantified type variable: 'a.
When pattern matching on Lt, we need to replace this type variable with
a new type. Moreover, it is quite useful in error messages to try to pick
a meaningful name for this type. To do so, the typechecker constructs a
new type name piece by piece as $ + Lt + 'a:
$: to mark an existential type
Lt: to indicate that it was introduced by the constructor Lt
a: to remember that the existential type variable was named 'a in the definition of the constructor
In other words, in the pattern match above, we have something akin to
| Lt ( (a: $Lt_'a expr), (b: $Lt_'a expr)) -> (eval a) < (eval b)
And when typing:
(eval a) < (eval b)
the typechecker compares the type of <: int -> int -> bool with the type of eval a: $Lt_'a and outputs your original error message:
Line 4, characters 15-23:
Error: This expression has type $Lt_'a but an expression was expected of type
int
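To make the cause concrete, here is a minimal sketch that reproduces the situation without Base. The three int-only operator definitions are my stand-in (an assumption) for what open Base does to the comparison operators; they are not Base's actual code. Calling the polymorphic versions explicitly through Stdlib lets eval type check against the original, fully polymorphic Lt, Eq and Gt.

```ocaml
type _ value =
  | Bool : bool -> bool value
  | Int : int -> int value

type _ expr =
  | Value : 'a value -> 'a expr
  | If : bool expr * 'a expr * 'a expr -> 'a expr
  | Lt : 'a expr * 'a expr -> bool expr
  | Eq : 'a expr * 'a expr -> bool expr
  | Gt : 'a expr * 'a expr -> bool expr

(* Stand-in for the shadowing done by `open Base` (assumption for this sketch):
   from here on, the bare operators only accept ints. *)
let ( < ) : int -> int -> bool = Stdlib.( < )
let ( > ) : int -> int -> bool = Stdlib.( > )
let ( = ) : int -> int -> bool = Stdlib.( = )

(* Using the bare operators below would give the $Lt_'a error, because
   eval a has the existential type $Lt_'a, not int. The Stdlib versions
   are polymorphic, so they accept it. *)
let rec eval : type a. a expr -> a = function
  | Value (Int i) -> i
  | Value (Bool b) -> b
  | Lt (a, b) -> Stdlib.( < ) (eval a) (eval b)
  | Gt (a, b) -> Stdlib.( > ) (eval a) (eval b)
  | Eq (a, b) -> Stdlib.( = ) (eval a) (eval b)
  | If (c, a, b) -> if eval c then eval a else eval b
```

With Base actually opened, the polymorphic operators are also reachable through its Poly module.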

Related

Why does OCaml think that this function takes an int parameter when nothing suggests that it should be the case?

I was working on chapter 1 of Modern Compiler Implementation in ML by Andrew Appel and I decided to implement it in OCaml instead of SML. I'm new to OCaml and I came across a very frustrating problem. OCaml seems to think that the below function has the signature int * (int * 'a) -> 'a option.
let rec lookupTable = function
| name, (i, v) :: _ when name = i -> Some v
| name, (_, _) :: rest -> lookupTable (name, rest)
| _, [] -> None
But as far as I can tell, there should be nothing that suggests that the first element in the tuple is an int. This is a problem because when the lookupTable function is called down the line, the compiler complains that I am not passing it an integer. Perhaps I am missing something incredibly obvious, but it has been pretty mind-boggling. Here is the rest of the program
open Base
type id = string
type binop = Plus | Minus | Times | Div
type stm =
| CompoundStm of stm * stm
| AssignStm of id * exp
| PrintStm of exp list
and exp =
| IdExp of id
| NumExp of int
| OpExp of exp * binop * exp
| EseqExp of stm * exp
(* Returns the maximum number of arguments of any print
statement within any subexpression of a given statement *)
let rec maxargs s =
match s with
| CompoundStm (stm1, stm2) -> Int.max (maxargs stm1) (maxargs stm2)
| AssignStm (_, exp) -> maxargs_exp exp
(* Might be more nested expressions *)
| PrintStm exps -> Int.max (List.length exps) (maxargs_explist exps)
and maxargs_exp e = match e with EseqExp (stm, _) -> maxargs stm | _ -> 0
and maxargs_explist exps =
match exps with
| exp :: rest -> Int.max (maxargs_exp exp) (maxargs_explist rest)
| [] -> 0
type table = (id * int) list
let updateTable name value t : table = (name, value) :: t
let rec lookupTable = function
| name, (i, v) :: _ when name = i -> Some v
| name, (_, _) :: rest -> lookupTable (name, rest)
| _, [] -> None
exception UndefinedVariable of string
let rec interp s =
let t = [] in
interpStm s t
and interpStm s t =
match s with
| CompoundStm (stm1, stm2) -> interpStm stm2 (interpStm stm1 t)
| AssignStm (id, exp) ->
let v, t' = interpExp exp t in
updateTable id v t'
(* Might be more nested expressions *)
| PrintStm exps ->
let interpretAndPrint t e =
let v, t' = interpExp e t in
Stdio.print_endline (Int.to_string v);
t'
in
List.fold_left exps ~init:t ~f:interpretAndPrint
and interpExp e t =
match e with
| IdExp i -> (
match lookupTable (i, t) with
| Some v -> (v, t)
| None -> raise (UndefinedVariable i))
| NumExp i -> (i, t)
| OpExp (exp1, binop, exp2) ->
let exp1_val, t' = interpExp exp1 t in
let exp2_val, _ = interpExp exp2 t' in
let res =
match binop with
| Plus -> exp1_val + exp2_val
| Minus -> exp1_val - exp2_val
| Times -> exp1_val * exp2_val
| Div -> exp1_val / exp2_val
in
(res, t')
| EseqExp (s, e) -> interpExp e (interpStm s t)
Base defines = as int -> int -> bool, so when you have the expression name = i the compiler will infer them as ints.
You can access the polymorphic functions and operators through the Poly module, or use a type-specific operator by locally opening the relevant module, e.g. String.(name = i).
The reason Base does not expose polymorphic operators by default is briefly explained in the documentation's introduction:
The comparison operators exposed by the OCaml standard library are polymorphic:
What they implement is structural comparison of the runtime representation of values. Since these are often error-prone, i.e., they don't correspond to what the user expects, they are not exposed directly by Base.
There's also a performance argument to be made, because the polymorphic/structural operators also need to inspect what kind of values they are given at runtime in order to compare them correctly.
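As a sketch of the type-specific route, the lookup can be written with String.equal, which exists in both the standard library and Base. The explicit type annotation is my addition; it only documents the intended signature.

```ocaml
type id = string
type table = (id * int) list

(* String.equal pins the comparison to strings, so the inferred type no
   longer depends on which ( = ) happens to be in scope. *)
let rec lookupTable : id * table -> int option = function
  | name, (i, v) :: _ when String.equal name i -> Some v
  | name, _ :: rest -> lookupTable (name, rest)
  | _, [] -> None
```

For example, lookupTable ("b", [("a", 1); ("b", 2)]) evaluates to Some 2.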

When is OCaml's warning 27 "Innocuous unused variable" useful?

This is the description of warning 27 from the OCaml manual:
27 Innocuous unused variable: unused variable that is not bound with let nor as, and doesn't start with an underscore (_) character.
This warning is turned on by jbuilder --dev, and I'm curious to know in which cases people find it useful. For me, it's an annoyance to get warnings when I write code like this:
$ utop -w +27
utop # fun (x, y) -> x;;
Characters 8-9:
Warning 27: unused variable y.
- : 'a * 'b -> 'a = <fun>
or like that:
utop # let error loc msg = failwith (loc ^ ": " ^ msg);;
val error : string -> string -> 'a = <fun>
utop # let rec eval = function
| `Plus (loc, a, b) -> eval a + eval b
| `Minus (loc, a, b) -> eval a - eval b
| `Star (loc, a, b) -> eval a * eval b
| `Slash (loc, a, b) ->
let denom = eval b in
if denom = 0 then
error loc "division by zero"
else
eval a / denom
| `Int (loc, x) -> x
;;
Characters 33-36:
Warning 27: unused variable loc.
Characters 73-76:
Warning 27: unused variable loc.
Characters 112-115:
Warning 27: unused variable loc.
Characters 287-290:
Warning 27: unused variable loc.
val eval :
([< `Int of 'b * int
| `Minus of 'c * 'a * 'a
| `Plus of 'd * 'a * 'a
| `Slash of 'e * 'a * 'a
| `Star of 'f * 'a * 'a ]
as 'a) ->
int = <fun>
I know that prepending an underscore to the identifiers as in _loc suppresses the warnings, but it's not compatible with my notions that:
variables starting with an underscore are ugly and are meant for use in generated code, hidden from the programmer;
a name given to something should not have to change based on how it's used (including unused).
Using underscores, the code becomes:
(* Here we have _loc or loc depending on whether it's used. *)
let rec eval = function
| `Plus (_loc, a, b) -> eval a + eval b
| `Minus (_loc, a, b) -> eval a - eval b
| `Star (_loc, a, b) -> eval a * eval b
| `Slash (loc, a, b) ->
let denom = eval b in
if denom = 0 then
error loc "division by zero"
else
eval a / denom
| `Int (_loc, x) -> x
or
(* Here it can be hard to know what _ stands for. *)
let rec eval = function
| `Plus (_, a, b) -> eval a + eval b
| `Minus (_, a, b) -> eval a - eval b
| `Star (_, a, b) -> eval a * eval b
| `Slash (loc, a, b) ->
let denom = eval b in
if denom = 0 then
error loc "division by zero"
else
eval a / denom
| `Int (_, x) -> x
It is very useful in the monadic code, where instead of the common syntactic let bindings you're forced to use monadic >>= bind operator. Basically, where
let x = something in
code
translates to
something >>= fun x ->
code
If x is not used in code, the let form produces a warning by default, but the >>= form is flagged only when warning 27 is enabled. Enabling this warning revealed lots of bugs for us. For example, it showed us that this code is buggy :)
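A small sketch of the monadic case, using the option monad from the standard library. The names parse, add_buggy and add_fixed are made up for illustration. In add_buggy the second bound variable is never used, which is the kind of mistake warning 27 reports.

```ocaml
(* Option.bind is in the standard library from OCaml 4.08 on. *)
let ( >>= ) = Option.bind

let parse (s : string) : int option = int_of_string_opt s

(* Bug: y is bound by the second `fun` but never used. Warning 27 is the one
   that reports it; it is silenced here only so the sketch compiles under
   strict warning settings. Remove the attribute to see the warning. *)
let add_buggy a b =
  parse a >>= fun x ->
  parse b >>= fun y ->
  Some (x + x)
[@@warning "-27"]

(* Intended version: both bound variables are used. *)
let add_fixed a b =
  parse a >>= fun x ->
  parse b >>= fun y ->
  Some (x + y)
```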
Another source of use cases are higher-order functions, i.e., map, fold, etc. It captures one of the most common bugs:
let bug init =
List.fold ~init ~f:(fun acc xs ->
List.fold ~init ~f:(fun acc x -> x :: acc))
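My reading of the snippet above (an assumption, since it is abbreviated) is that the inner fold restarts from init instead of continuing from acc, so the outer acc is never used, and that unused acc is what warning 27 points at. A sketch of the presumably intended version, written with the standard library's List.fold_left rather than Base's labelled List.fold; the name rev_concat is hypothetical:

```ocaml
(* Pushes every element of every inner list onto the accumulator.
   The inner fold starts from acc, not from init. *)
let rev_concat init xss =
  List.fold_left
    (fun acc xs -> List.fold_left (fun acc x -> x :: acc) acc xs)
    init xss
```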
Concerning the ugliness, I totally agree that underscores are ugly, but in most cases that is their main purpose: to highlight suspicious code. As for the example you're showing, in modern OCaml it can easily be addressed with inline records, e.g.,
type exp =
| Plus of {loc : loc; lhs : exp; rhs: exp}
| ...
so that instead of using the underscores, you can just omit the unused field,
let rec eval = function
| Plus {lhs; rhs} -> eval lhs + eval rhs
You can use the same approach without using inline records by sparing some extra space in your program and defining all those records separately. The real-world example.
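A complete version of the inline-record idea might look like the sketch below. The loc type (a plain string here), the value field and the exact set of constructors are placeholders of mine. The trailing ; _ in the patterns says explicitly that the remaining fields are ignored, which also keeps warning 9 quiet when it is enabled.

```ocaml
type loc = string (* placeholder location type, an assumption of this sketch *)

type exp =
  | Int of {loc : loc; value : int}
  | Plus of {loc : loc; lhs : exp; rhs : exp}
  | Slash of {loc : loc; lhs : exp; rhs : exp}

let error loc msg = failwith (loc ^ ": " ^ msg)

(* loc is named only in the branch that needs it; no underscore-prefixed
   variables are required anywhere else. *)
let rec eval = function
  | Int {value; _} -> value
  | Plus {lhs; rhs; _} -> eval lhs + eval rhs
  | Slash {loc; lhs; rhs} ->
    let denom = eval rhs in
    if denom = 0 then error loc "division by zero"
    else eval lhs / denom
```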
For me this warning is useful as a reminder to make my intention more explicit. If we take your example:
fun (x, y) -> x;;
Your intention is to use only the first element. If we rewrite it this way:
fun (x, _ ) -> x;;
You use pattern matching in the parameter to make your code more concise, but you also state your intention of using only the first element. The added value in this example is small because the implementation is so simple. But in real-life functions, this warning promotes a good coding habit.

Writing an interpreter with OCaml GADTs

I am writing a small interpreter in OCaml and am using GADTs to type my expressions:
type _ value =
| Bool : bool -> bool value
| Int : int -> int value
| Symbol : string -> string value
| Nil : unit value
| Pair : 'a value * 'b value -> ('a * 'b) value
and _ exp =
| Literal : 'a value -> 'a exp
| Var : name -> 'a exp
| If : bool exp * 'a exp * 'a exp -> 'a exp
and name = string
exception NotFound of string
type 'a env = (name * 'a) list
let bind (n, v, e) = (n, v)::e
let rec lookup = function
| (n, []) -> raise (NotFound n)
| (n, (n', v)::e') -> if n=n' then v else lookup (n, e')
let rec eval : type a. a exp -> a value env -> a value = fun e rho ->
match e with
| Literal v -> v
| Var n -> lookup (n, rho)
| If (b, l, r) ->
let Bool b' = eval b rho in
if b' then eval l rho else eval r rho
But I cannot get my code to compile. I get the following error:
File "gadt2.ml", line 33, characters 33-36:
Error: This expression has type a value env = (name * a value) list
but an expression was expected of type
bool value env = (name * bool value) list
Type a is not compatible with type bool
My understanding is that for some reason rho is being coerced into a bool value env, but I don't know why. I also tried the following:
let rec eval : 'a. 'a exp -> 'a value env -> 'a value = fun e rho ->
match e with
| Literal v -> v
| Var n -> lookup (n, rho)
| If (b, l, r) ->
let Bool b = eval b rho in
if b then eval l rho else eval r rho
But I am not sure how exactly that is different, and it also gives me an error -- albeit a different one:
File "gadt2.ml", line 38, characters 56-247:
Error: This definition has type bool exp -> bool value env -> bool value
which is less general than 'a. 'a exp -> 'a value env -> 'a value
Guidance on GADTs, differences between the two evals, and this particular problem are all appreciated. Cheers.
The type 'a env is intended to represent a list of name/value bindings, but the values in a list must all be the same type. Two different value types (such as bool value and int value) are not the same type. If eval b rho returns Bool b, rho must be a list of string * bool value. So eval l rho and eval r rho will return bool value. But your annotation says the function returns a value.
There are a few possible approaches to typed binding with GADTs. Here's a design that associates type info with both variables and environment entries.
Environment lookup involves attempting to construct a correspondence between the types of the variable and the environment entry (which is a bit slow, but does recover the type in a safe way). This is what allows the lookup to return an unwrapped value of arbitrary type.
type var = string
type _ ty =
| TyInt : int ty
| TyArrow : 'a ty * 'b ty -> ('a -> 'b) ty
type _ term =
| Int : int -> int term
| Var : 'a ty * var -> 'a term
| Lam : 'a ty * var * 'b term -> ('a -> 'b) term
| App : ('a -> 'b) term * 'a term -> 'b term
type ('a, 'b) eq = Refl : ('a, 'a) eq
let rec types_equal : type a b . a ty -> b ty -> (a, b) eq option =
fun a b ->
match a, b with
| TyInt, TyInt -> Some Refl
| TyArrow (x1, y1), TyArrow (x2, y2) ->
begin match types_equal x1 x2, types_equal y1 y2 with
| Some Refl, Some Refl -> Some Refl
| _, _ -> None
end
| _, _ -> None
type env = Nil | Cons : var * 'a ty * 'a * env -> env
let rec lookup : type a . a ty -> var -> env -> a =
fun ty var -> function
| Nil -> raise Not_found
| Cons (xname, xty, x, rest) ->
if var = xname then
match types_equal ty xty with
| Some Refl -> x
| None -> assert false
else
lookup ty var rest
let rec eval : type a . env -> a term -> a =
fun env -> function
| Int n -> n
| Var (ty, var) -> lookup ty var env
| App (f, x) -> (eval env f) (eval env x)
| Lam (arg_ty, arg_name, body) ->
fun arg_value ->
eval (Cons (arg_name, arg_ty, arg_value, env)) body
It is possible to have a typed interpreter that avoids the type reconstruction (and the string comparison!) by enforcing the correspondence between variable indices and environments at the type level, but that gets complicated.
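For the curious, here is a rough sketch of what that index-based design can look like. It is not the code above reworked, just one common encoding: variables are de Bruijn indices (Z, S), and the environment's shape is tracked as a nested tuple type. All type and constructor names here are my own choices.

```ocaml
(* A variable is a position in the environment; its type records both the
   environment shape and the type found at that position. *)
type (_, _) var =
  | Z : ('a * 'env, 'a) var
  | S : ('env, 'a) var -> ('b * 'env, 'a) var

type (_, _) term =
  | Int : int -> ('env, int) term
  | Var : ('env, 'a) var -> ('env, 'a) term
  | Lam : ('a * 'env, 'b) term -> ('env, 'a -> 'b) term
  | App : ('env, 'a -> 'b) term * ('env, 'a) term -> ('env, 'b) term

(* Environments are heterogeneous lists indexed by their shape. *)
type _ env =
  | Nil : unit env
  | Cons : 'a * 'e env -> ('a * 'e) env

(* No string comparison and no runtime type test: the index types guarantee
   the variable is present, so the Nil case cannot occur. *)
let rec lookup : type e a. (e, a) var -> e env -> a =
  fun v env ->
    match v, env with
    | Z, Cons (x, _) -> x
    | S v', Cons (_, rest) -> lookup v' rest

let rec eval : type e a. e env -> (e, a) term -> a =
  fun env -> function
    | Int n -> n
    | Var v -> lookup v env
    | Lam body -> fun x -> eval (Cons (x, env)) body
    | App (f, x) -> (eval env f) (eval env x)
```

With this, eval Nil (App (Lam (Var Z), Int 42)) applies the identity function to 42. The price is that terms with unbound or ill-typed variables cannot even be constructed, so a parser has to do the scope and type resolution up front.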

ocaml GADT : why "type a." needed?

In the basic example of GADTs from §7.20 of the OCaml manual, what is the meaning of 'type a.'?
Why declaring "eval : a term -> a" is not enough ?
type _ term =
| Int : int -> int term
| Add : (int -> int -> int) term
| App : ('b -> 'a) term * 'b term -> 'a term
let rec eval : type a. a term -> a = function
| Int n -> n (* a = int *)
| Add -> (fun x y -> x+y) (* a = int -> int -> int *)
| App(f,x) -> (eval f) (eval x)
Jacques's slides from the ML 2011 workshop give a nice introduction. The idea is to use the syntax of locally abstract types to introduce universal, expression-scoped type variables.
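To spell that out a little: an ordinary annotation such as 'a term -> 'a only introduces a unification variable, which the type checker cannot refine to int in one branch and to int -> int -> int in another. A locally abstract type (type a) can be refined by a GADT pattern match, and the explicit 'a. quantifier is what permits the polymorphic recursive calls. According to the manual, type a. is shorthand for the combination of the two. A sketch of the expanded form:

```ocaml
type _ term =
  | Int : int -> int term
  | Add : (int -> int -> int) term
  | App : ('b -> 'a) term * 'b term -> 'a term

(* Expanded form of  let rec eval : type a. a term -> a = function ...
   'a.            : eval is polymorphic, so recursive calls at other types are fine
   fun (type a)   : a is a fresh abstract type that each branch may refine *)
let rec eval : 'a. 'a term -> 'a =
  fun (type a) ->
    (function
      | Int n -> n                      (* a = int *)
      | Add -> (fun x y -> x + y)       (* a = int -> int -> int *)
      | App (f, x) -> (eval f) (eval x)
      : a term -> a)
```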

Creating GADT expression in OCaml

There is my toy GADT expression:
type _ expr =
| Num : int -> int expr
| Add : int expr * int expr -> int expr
| Sub : int expr * int expr -> int expr
| Mul : int expr * int expr -> int expr
| Div : int expr * int expr -> int expr
| Lt : int expr * int expr -> bool expr
| Gt : int expr * int expr -> bool expr
| And : bool expr * bool expr -> bool expr
| Or : bool expr * bool expr -> bool expr
Evaluation function:
let rec eval : type a. a expr -> a = function
| Num n -> n
| Add (a, b) -> (eval a) + (eval b)
| Sub (a, b) -> (eval a) - (eval b)
| Mul (a, b) -> (eval a) * (eval b)
| Div (a, b) -> (eval a) / (eval b)
| Lt (a, b) -> (eval a) < (eval b)
| Gt (a, b) -> (eval a) > (eval b)
| And (a, b) -> (eval a) && (eval b)
| Or (a, b) -> (eval a) || (eval b)
Creating an expression is trivial when we are limited to int expr:
let create_expr op a b =
match op with
| '+' -> Add (a, b)
| '-' -> Sub (a, b)
| '*' -> Mul (a, b)
| '/' -> Div (a, b)
| _ -> assert false
The question is how to support both int expr and bool expr in the create_expr function.
My try:
type expr' = Int_expr of int expr | Bool_expr of bool expr
let concrete : type a. a expr -> expr' = function
| Num _ as expr -> Int_expr expr
| Add _ as expr -> Int_expr expr
| Sub _ as expr -> Int_expr expr
| Mul _ as expr -> Int_expr expr
| Div _ as expr -> Int_expr expr
| Lt _ as expr -> Bool_expr expr
| Gt _ as expr -> Bool_expr expr
| And _ as expr -> Bool_expr expr
| Or _ as expr -> Bool_expr expr
let create_expr (type a) op (a:a expr) (b:a expr) : a expr =
match op, concrete a, concrete b with
| '+', Int_expr a, Int_expr b -> Add (a, b)
| '-', Int_expr a, Int_expr b -> Sub (a, b)
| '*', Int_expr a, Int_expr b -> Mul (a, b)
| '/', Int_expr a, Int_expr b -> Div (a, b)
| '<', Int_expr a, Int_expr b -> Lt (a, b)
| '>', Int_expr a, Int_expr b -> Gt (a, b)
| '&', Bool_expr a, Bool_expr b -> And (a, b)
| '|', Bool_expr a, Bool_expr b -> Or (a, b)
| _ -> assert false
It still can't return a value of the generalized type.
Error: This expression has type int expr
but an expression was expected of type a expr
Type int is not compatible with type a
UPDATE
Thanks to #gsg, I was able to implement a type-safe evaluator. Two tricks are crucial there:
an existential wrapper, Any
type tagging (TyInt and TyBool), which lets us pattern match on what Any contains
see
type _ ty =
| TyInt : int ty
| TyBool : bool ty
type any_expr = Any : 'a ty * 'a expr -> any_expr
let create_expr op a b =
match op, a, b with
| '+', Any (TyInt, a), Any (TyInt, b) -> Any (TyInt, Add (a, b))
| '-', Any (TyInt, a), Any (TyInt, b) -> Any (TyInt, Sub (a, b))
| '*', Any (TyInt, a), Any (TyInt, b) -> Any (TyInt, Mul (a, b))
| '/', Any (TyInt, a), Any (TyInt, b) -> Any (TyInt, Div (a, b))
| '<', Any (TyInt, a), Any (TyInt, b) -> Any (TyBool, Lt (a, b))
| '>', Any (TyInt, a), Any (TyInt, b) -> Any (TyBool, Gt (a, b))
| '&', Any (TyBool, a), Any (TyBool, b) -> Any (TyBool, And (a, b))
| '|', Any (TyBool, a), Any (TyBool, b) -> Any (TyBool, Or (a, b))
| _, _, _ -> assert false
let eval_any : any_expr -> [> `Int of int | `Bool of bool] = function
| Any (TyInt, expr) -> `Int (eval expr)
| Any (TyBool, expr) -> `Bool (eval expr)
As you've found, this approach doesn't type check. It also has a more fundamental problem: GADTs can be recursive, in which case it is flatly impossible to enumerate their cases.
Instead you can reify types as a GADT and pass them around. Here's a cut-down example:
type _ expr =
| Num : int -> int expr
| Add : int expr * int expr -> int expr
| Lt : int expr * int expr -> bool expr
| And : bool expr * bool expr -> bool expr
type _ ty =
| TyInt : int ty
| TyBool : bool ty
let bin_op (type a) (type b) op (l : a expr) (r : a expr) (arg_ty : a ty) (ret_ty : b ty) : b expr =
match op, arg_ty, ret_ty with
| '+', TyInt, TyInt -> Add (l, r)
| '<', TyInt, TyBool -> Lt (l, r)
| '&', TyBool, TyBool -> And (l, r)
| _, _, _ -> assert false
At some point you are going to want to have a value that can be 'any expression'. Introducing an existential wrapper allows this. Cheesy example: generating random expression trees:
type any_expr = Any : _ expr -> any_expr
let rec random_int_expr () =
if Random.bool () then Num (Random.int max_int)
else Add (random_int_expr (), random_int_expr ())
let rec random_bool_expr () =
if Random.bool () then Lt (Num (Random.int max_int), Num (Random.int max_int))
else And (random_bool_expr (), random_bool_expr ())
let random_expr () =
if Random.bool () then Any (random_int_expr ())
else Any (random_bool_expr ())
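Once an expression is hidden behind the wrapper, its type index is unknown, so you can only apply functions that work for every index. A sketch with a hypothetical size function (the wrapper is declared with 'a here, which I take to be equivalent to the _ form above):

```ocaml
type _ expr =
  | Num : int -> int expr
  | Add : int expr * int expr -> int expr
  | Lt : int expr * int expr -> bool expr
  | And : bool expr * bool expr -> bool expr

type any_expr = Any : 'a expr -> any_expr

(* Works for any index, so it can be applied to the contents of Any. *)
let rec size : type a. a expr -> int = function
  | Num _ -> 1
  | Add (l, r) -> 1 + size l + size r
  | Lt (l, r) -> 1 + size l + size r
  | And (l, r) -> 1 + size l + size r

let size_any (Any e) = size e
```

Evaluating the wrapped expression to a typed result needs more information, which is where a type tag such as TyInt / TyBool stored next to the expression comes in.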
Your stated type for create_expr is char -> 'a expr -> 'a expr -> 'a expr. But the type for the '>' case would be char -> int expr -> int expr -> bool expr. So it seems there are problems with the basic plan.
In essence you want the type of the result to depend on the value of the character. I'm not absolutely positive, but I suspect this isn't possible in OCaml. Seems like it would require a more powerful type system.
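One possible way around this, offered as a sketch rather than a definitive answer: a plain char carries no type information, but if the operator is itself a GADT indexed by its argument and result types, the result type can follow the operator. The op type and its constructor names below are hypothetical, and the expression type is cut down to four constructors.

```ocaml
type _ expr =
  | Num : int -> int expr
  | Add : int expr * int expr -> int expr
  | Lt : int expr * int expr -> bool expr
  | And : bool expr * bool expr -> bool expr

(* Hypothetical operator type: first index is the argument type,
   second index is the result type. *)
type (_, _) op =
  | Op_add : (int, int) op
  | Op_lt : (int, bool) op
  | Op_and : (bool, bool) op

(* Matching on op refines a and b in each branch. *)
let create_expr : type a b. (a, b) op -> a expr -> a expr -> b expr =
  fun op l r ->
    match op with
    | Op_add -> Add (l, r)
    | Op_lt -> Lt (l, r)
    | Op_and -> And (l, r)

let rec eval : type a. a expr -> a = function
  | Num n -> n
  | Add (a, b) -> eval a + eval b
  | Lt (a, b) -> eval a < eval b
  | And (a, b) -> eval a && eval b
```

Going from an actual character to such an operator still requires hiding the indices behind an existential wrapper at the parsing boundary, so this does not remove the need for something like Any.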