I am trying to define a function that is similar to Lisp's apply. Here is my attempt:
type t =
| Str of string
| Int of int
let rec apply f args =
match args with
| (Str s)::xs -> apply (f s) xs
| (Int i)::xs -> apply (f i) xs
| [] -> f
(* Example 1 *)
let total = apply (fun x y z -> x + y + z)
[Int 1; Int 2; Int 3]
(* Example 2 *)
let () = apply (fun name age ->
Printf.printf "Name: %s\n" name;
Printf.printf "Age: %i\n" age)
[Str "Bob"; Int 99]
However, this fails to compile. The compiler gives this error message:
File "./myprog.ml", line 7, characters 25-30:
7 | | (Str s)::xs -> apply (f s) xs
^^^^^
Error: This expression has type 'a but an expression was expected of type
string -> 'a
The type variable 'a occurs inside string -> 'a
What is the meaning of this error message? How can I fix the problem and implement apply?
You cannot mix an untyped DSL for data:
type t =
| Str of string
| Int of int
and a shallow embedding (using OCaml functions as functions inside the DSL) for functions in apply
let rec apply f args =
match args with
| (Str s)::xs -> apply (f s) xs (* f is string -> 'a *)
| (Int i)::xs -> apply (f i) xs (* f is int -> 'a *)
| [] -> f (* f is 'a *)
The typechecker is complaining that if f has type 'a, f s cannot also have type 'a, since that would mean f simultaneously has type string -> 'a and type 'a (unless the recursive types flag, -rectypes, is used).
And more generally, your function apply doesn't use f with a coherent type: sometimes it has type 'a, sometimes it has type int -> 'a, other times it would rather have type string -> 'a. In other words, it is not possible to write a type for apply
val apply: ??? (* (int|string) -> ... *) -> t list -> ???
You have to choose your poison.
Either go with a fully untyped DSL which contains functions, that can be applied:
type t =
| Int of int
| Float of float
| Fun of (t -> t)
exception Type_error
let rec apply_dsl f l = match f, l with
| _, [] -> f
| Fun f, a :: q -> apply_dsl (f a) q
| (Int _|Float _), _ :: _ -> raise Type_error
or use OCaml's type system and define a well-typed list of arguments with a GADT:
type ('a,'b) t =
| Nil: ('a,'a) t
| Cons: 'a * ('b,'r) t -> ('a -> 'b,'r) t
let rec apply: type f r. f -> (f,r) t -> r = fun f l ->
match l with
| Nil -> f
| Cons (x,l) -> apply (f x) l
EDIT:
Using the GADT solution is quite direct, since we are using the usual OCaml types without much wrapping:
let three = apply (+) (Cons(1, Cons(2,Nil)))
(and we could use a heterogeneous list syntactic sugar to make this form even lighter syntactically)
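As a rough sketch of that syntactic sugar: since OCaml 4.03 the constructors [] and (::) can be redefined, so the list literal syntax can build the GADT argument list directly. The module name Hlist is an assumption made for this illustration, not something from the answer above.

```ocaml
(* Hypothetical module: a heterogeneous argument list that reuses list syntax. *)
module Hlist = struct
  type ('f, 'r) t =
    | [] : ('r, 'r) t
    | ( :: ) : 'a * ('f, 'r) t -> ('a -> 'f, 'r) t

  (* Same apply as above; [] and (::) now refer to the constructors of t. *)
  let rec apply : type f r. f -> (f, r) t -> r =
   fun f args ->
    match args with
    | [] -> f
    | x :: rest -> apply (f x) rest
end

(* Inside the local open, [ 1; 2 ] desugars to the Hlist constructors. *)
let three = Hlist.(apply ( + ) [ 1; 2 ])
let greeting = Hlist.(apply (Printf.sprintf "%s is %d") [ "Bob"; 99 ])
```

Keeping the redefinition inside a module limits the shadowing of ordinary lists to the places where Hlist is opened.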
The untyped DSL requires first building a function in the DSL:
let plus = Fun(function
| Float _ | Fun _ -> raise Type_error
| Int x -> Fun(function
| Float _ | Fun _ -> raise Type_error
| Int y -> Int (x+y)
)
)
but once we have built the function, it is relatively straightforward:
let three = apply_dsl plus [Int 2; Int 1]
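The nested pattern matching in plus gets repetitive once there are several primitives. A minimal sketch of one way to factor it out, assuming a hypothetical helper int_fun2 (not part of the answer above) that wraps any two-argument OCaml function on integers:

```ocaml
type t =
  | Int of int
  | Float of float
  | Fun of (t -> t)

exception Type_error

(* Apply a DSL value to a list of DSL arguments, one at a time. *)
let rec apply_dsl f args =
  match f, args with
  | _, [] -> f
  | Fun g, a :: rest -> apply_dsl (g a) rest
  | (Int _ | Float _), _ :: _ -> raise Type_error

(* Hypothetical helper: extract an OCaml int or fail with a DSL type error. *)
let to_int = function
  | Int n -> n
  | Float _ | Fun _ -> raise Type_error

(* Hypothetical helper: lift a binary int function into the DSL.
   Each argument is checked as soon as it is supplied. *)
let int_fun2 op =
  Fun (fun x ->
    let a = to_int x in
    Fun (fun y -> Int (op a (to_int y))))

let plus = int_fun2 ( + )
let times = int_fun2 ( * )
```

With this, apply_dsl times [Int 6; Int 7] evaluates to Int 42, while supplying a Float or a third argument raises Type_error at run time.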
type t =
| Str of string
| Int of int
| Unit
let rec apply f args =
match args with
| x::xs -> apply (f x) xs
| [] -> f Unit
Let's go step by step:
line 1: apply : 'a -> 'b -> 'c (we don't know the types of f, args and apply's return type)
line 2 and beginning of line 3: args : t list so apply : 'a -> t list -> 'c
rest of line 3: since f is applied to s (with s : string), f : string -> 'd. But apply (f s) xs passes f s as the first argument of apply, so f s must have the same type as f. This means the type of f would have to contain itself, which the compiler rejects.
It's also a problem to call f on both s and i, because this means that f would have to accept a string or an int, and the compiler will not allow it.
And lastly, if args is empty, you return f, so the return type of apply is the type of f itself, another problematic part of this code.
Looking at your examples, a simple solution would be:
type t = Str of string | Int of int
let rec apply f acc args =
match args with x :: xs -> apply f (f acc x) xs | [] -> acc
(* Example 1 *)
let total =
apply
(fun acc x ->
match x with Int d -> d + acc | Str _ -> failwith "Type error")
0 [ Int 1; Int 2; Int 3 ]
(* Example 2 *)
let () =
apply
(fun () -> function
| Str name -> Printf.printf "Name: %s\n" name
| Int age -> Printf.printf "Age: %i\n" age)
() [ Str "Bob"; Int 99 ]
Since you know the type you want to work on, you don't need GADT shenanigans: just let f handle the pattern matching and work with an accumulator.
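Note that this accumulator-based apply has the same shape as the standard library's List.fold_left, and the unit-accumulator case is List.iter. A small sketch, where the names sum_ints and print_all are made up for this illustration:

```ocaml
type t = Str of string | Int of int

(* Example 1 rewritten with List.fold_left: the accumulator is the running sum. *)
let sum_ints args =
  List.fold_left
    (fun acc x ->
      match x with
      | Int d -> d + acc
      | Str _ -> failwith "Type error")
    0 args

(* Example 2 rewritten with List.iter: there is nothing to accumulate. *)
let print_all args =
  List.iter
    (function
      | Str name -> Printf.printf "Name: %s\n" name
      | Int age -> Printf.printf "Age: %i\n" age)
    args

let total = sum_ints [ Int 1; Int 2; Int 3 ]
```

The trade-off is the same as in the answer above: type errors in the argument list are only detected at run time.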
It seems that only a * b can fit into _ and only (a,b) can fit into (a,_).
I can imagine that a*b is a proper type for the internal product with components a and b, whereas (a,b) is an external product of type a and type b (just a guess).
But are there examples distinguishing the two ?
type zero = Z : zero
type 'a succ = S : 'a succ
type _ ptree1 =
| Leaf : 'a -> ('a * zero) ptree1
| Node : (('a * 'n) ptree1 * ('a * 'n) ptree1) -> ('a * 'n succ) ptree1
type (_, _) ptree =
| Leaf : 'a -> ('a, zero) ptree
| Node : (('a, 'n) ptree * ('a, 'n) ptree) -> ('a, 'n succ) ptree
(* bad
type ('a * _) ptree =
| Leaf : 'a -> ('a, zero) ptree
| Node : (('a, 'n) ptree * ('a, 'n) ptree) -> ('a, 'n succ) ptree
*)
let rec last1 : type n a. (a * n) ptree1 -> a = function
| Leaf x -> x
| Node (_, t) -> last1 t
let rec last : type n a. (a, n) ptree -> a = function
| Leaf x -> x
| Node (_, t) -> last t
Type constructors have an arity in OCaml.
For instance, in
type ('a,'b) either =
| Left of 'a
| Right of 'b
the type constructor either has an arity of two. And ('a,'b) either denotes the type constructor either applied to two arguments, 'a and 'b. The form ('a,'b) does not exist by itself in the language.
However, it is possible to encode a type constructor with an arity of n as a type constructor of arity 1 constrained to only accept an n-tuple type as its argument.
Typically, this means rewriting either to
type 'p either2 =
| Left2 of 'a
| Right2 of 'b
constraint 'p = 'a * 'b
let translation: type a b. (a*b) either2 -> (a,b) either = function
| Left2 x -> Left x
| Right2 x -> Right x
Here, either2 is a type constructor of arity one, but its argument must be a 2-tuple type.
This is the equivalent of translating a function of type 'a -> 'b -> 'c to a function of type 'a * 'b -> 'c at the type-level.
Another point of view is that, if type-level applications were written like function applications, ('a,'b) either would be written either 'a 'b and ('a * 'b) either2 would become either2 ('a * 'b).
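For comparison, here is the value-level counterpart of that translation, as a small sketch (the names curry, uncurry, add_pair and add are chosen for this illustration): a two-argument function and a function on pairs carry the same information, just as (a, b) either and (a * b) either2 do at the type level.

```ocaml
(* From a function on pairs to a two-argument function. *)
let curry (f : 'a * 'b -> 'c) : 'a -> 'b -> 'c = fun a b -> f (a, b)

(* From a two-argument function to a function on pairs. *)
let uncurry (f : 'a -> 'b -> 'c) : 'a * 'b -> 'c = fun (a, b) -> f a b

(* "Arity one, argument must be a pair", like either2. *)
let add_pair : int * int -> int = uncurry ( + )

(* "Arity two", like either. *)
let add : int -> int -> int = curry add_pair
```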
Without GADTs, this kind of encoding requires the use of an explicit constraint and they are thus not that frequent.
With GADTs, since the definition of a GADT is free to construct its own type indices, this choice is simply more apparent. For instance, one can define an eccentric version of either as
type (_,_,_) either3 =
| Left3: 'a -> ('a list -> _, 'a * unit, _) either3
| Right3: 'a -> ( _ -> 'a array, _, unit * 'a) either3
let translate: type a b. (a list -> b array, a * unit, unit * b) either3 -> (a,b) either =
function
| Left3 x -> Left x
| Right3 x -> Right x
Here, either3 is a type constructor of arity 3, which stores the left and right types all over the place among its 3 arguments.
I saw in the OCaml manual this example to use GADT for an AST with function application:
type _ term =
| Int : int -> int term
| Add : (int -> int -> int) term
| App : ('b -> 'a) term * 'b term -> 'a term
let rec eval : type a. a term -> a = function
| Int n -> n
| Add -> (fun x y -> x+y)
| App(f,x) -> (eval f) (eval x)
Is this the right way of representing function application for a language not supporting partial application?
Is there a way to make a GADT supporting function application with an arbitrary number of arguments?
Finally, is GADT a good way to represent a typed AST? Is there any alternative?
Well, partial application already works here:
# eval (App(App(Add, Int 3),Int 4)) ;;
- : int = 7
# eval (App(Add, Int 3)) ;;
- : int -> int = <fun>
# eval (App(Add, Int 3)) 4 ;;
- : int = 7
What you don't have in this small GADT is abstraction (lambdas), but it's definitely possible to add it.
If you are interested in the topic, there is an abundant (academic) literature. This paper presents various encodings that support partial evaluation.
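One possible way to add abstraction, sketched here with higher-order abstract syntax: the body of a lambda is an OCaml function from terms to terms. The constructors Lam and Lift are assumptions added for this illustration and are not in the manual's example; Lift embeds an already evaluated host value so that eval can feed an argument to the lambda body.

```ocaml
type _ term =
  | Int : int -> int term
  | Add : (int -> int -> int) term
  | App : ('b -> 'a) term * 'b term -> 'a term
  | Lam : ('a term -> 'b term) -> ('a -> 'b) term (* hypothetical addition *)
  | Lift : 'a -> 'a term (* hypothetical: wraps an evaluated value *)

let rec eval : type a. a term -> a = function
  | Int n -> n
  | Add -> (fun x y -> x + y)
  | App (f, x) -> (eval f) (eval x)
  | Lam body -> (fun v -> eval (body (Lift v)))
  | Lift v -> v

(* fun x -> x + x *)
let double = Lam (fun x -> App (App (Add, x), x))

(* fun f x -> f (f x) *)
let twice = Lam (fun f -> Lam (fun x -> App (f, App (f, x))))

let forty_two = eval (App (double, Int 21))
let twenty = eval (App (App (twice, double), Int 5))
```

This keeps the evaluator short, at the cost of terms that contain OCaml closures and are therefore harder to inspect or transform.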
There are also non-Gadt solutions, as shown in this paper.
In general, GADTs are a very interesting way to represent evaluators. They tend to fall a bit short when you try to transform the AST for compilation (but there are ways).
Also, you have to keep in mind that you are encoding the type system of the language you are defining in your host language, which means that you need an encoding of the type features you want. Sometimes, it's tricky.
Edit: A way to have a GADT not supporting partial application is to have a special value type not containing functions and a "functional value" type with functions. Taking the simplest representation of the first paper, we can modify it that way:
type _ v =
| Int : int -> int v
| String : string -> string v
and _ vf =
| Base : 'a v -> ('a v) vf
| Fun : ('a vf -> 'b vf) -> ('a -> 'b) vf
and _ t =
| Val : 'a vf -> 'a t
| Lam : ('a vf -> 'b t) -> ('a -> 'b) t
| App : ('a -> 'b) t * 'a t -> 'b t
let get_val : type a . a v -> a = function
| Int i -> i
| String s -> s
let rec reduce : type a . a t -> a vf = function
| Val x -> x
| Lam f -> Fun (fun x -> reduce (f x))
| App (f, x) -> let Fun f = reduce f in f (reduce x)
let eval t =
let Base v = reduce t in get_val v
(* Perfectly defined expressions. *)
let f = Lam (fun x -> Lam (fun y -> Val x))
let t = App (f, Val (Base (Int 3)))
(* We can reduce t to a functional value. *)
let x = reduce t
(* But we can't eval it, it's a type error. *)
let y = eval t
(* HOF are authorized. *)
let app = Lam (fun f -> Lam (fun y -> App(Val f, Val y)))
You can make that arbitrarily more complicated, depending on your needs; the important property is that the 'a v type can't produce functions.
How do you define a simple lambda calculus-like DSL in OCaml using GADTs? Specifically, I can't figure out how to properly define the type checker to translate from an untyped AST to a typed AST nor can I figure out the correct type for the context and environment.
Here's some code for a simple lambda calculus-like language using the traditional approach in OCaml
(* Here's a traditional implementation of a lambda calculus like language *)
type typ =
| Boolean
| Integer
| Arrow of typ*typ
type exp =
| Add of exp*exp
| And of exp*exp
| App of exp*exp
| Lam of string*typ*exp
| Var of string
| Int of int
| Bol of bool
let e1=Add(Int 1,Add(Int 2,Int 3))
let e2=Add(Int 1,Add(Int 2,Bol false)) (* Type error *)
let e3=App(Lam("x",Integer,Add(Var "x",Var "x")),Int 4)
let rec typecheck con e =
match e with
| Add(e1,e2) ->
let t1=typecheck con e1 in
let t2=typecheck con e2 in
begin match (t1,t2) with
| (Integer,Integer) -> Integer
| _ -> failwith "Tried to add with something other than Integers"
end
| And(e1,e2) ->
let t1=typecheck con e1 in
let t2=typecheck con e2 in
begin match (t1,t2) with
| (Boolean,Boolean) -> Boolean
| _ -> failwith "Tried to and with something other than Booleans"
end
| App(e1,e2) ->
let t1=typecheck con e1 in
let t2=typecheck con e2 in
begin match t1 with
| Arrow(t11,t12) ->
if t11 <> t2 then
failwith "Mismatch of types on a function application"
else
t12
| _ -> failwith "Tried to apply a non-arrow type"
end
| Lam(x,t,e) ->
Arrow (t,typecheck ((x,t)::con) e)
| Var x ->
let (y,t) = List.find (fun (y,t)->y=x) con in
t
| Int _ -> Integer
| Bol _ -> Boolean
let t1 = typecheck [] e1
(* let t2 = typecheck [] e2 *)
let t3 = typecheck [] e3
type value =
| VBoolean of bool
| VInteger of int
| VArrow of ((string*value) list -> value -> value)
let rec eval env e =
match e with
| Add(e1,e2) ->
let v1=eval env e1 in
let v2=eval env e2 in
begin match (v1,v2) with
| (VInteger i1,VInteger i2) -> VInteger (i1+i2)
| _ -> failwith "Tried to add with something other than Integers"
end
| And(e1,e2) ->
let v1=eval env e1 in
let v2=eval env e2 in
begin match (v1,v2) with
| (VBoolean b1,VBoolean b2) -> VBoolean (b1 && b2)
| _ -> failwith "Tried to and with something other than Booleans"
end
| App(e1,e2) ->
let v1=eval env e1 in
let v2=eval env e2 in
begin match v1 with
| VArrow a1 -> a1 env v2
| _ -> failwith "Tried to apply a non-arrow type"
end
| Lam(x,t,e) ->
VArrow (fun env' v' -> eval ((x,v')::env') e)
| Var x ->
let (y,v) = List.find (fun (y,t)->y=x) env in
v
| Int i -> VInteger i
| Bol b -> VBoolean b
let v1 = eval [] e1
let v3 = eval [] e3
Now, I'm trying to translate this into something that uses GADTs. Here's my start
(* Now, we try to GADT the process *)
type exp =
| Add of exp*exp
| And of exp*exp
| App of exp*exp
| Lam of string*typ*exp
| Var of string
| Int of int
| Bol of bool
let e1=Add(Int 1,Add(Int 2,Int 3))
let e2=Add(Int 1,Add(Int 2,Bol false))
let e3=App(Lam("x",Integer,Add(Var "x",Var "x")),Int 4)
type _ texp =
| TAdd : int texp * int texp -> int texp
| TAnd : bool texp * bool texp -> bool texp
| TApp : ('a -> 'b) texp * 'a texp -> 'b texp
| TLam : string*'b texp -> ('a -> 'b) texp
| TVar : string -> 'a texp
| TInt : int -> int texp
| TBol : bool -> bool texp
let te1 = TAdd(TInt 1,TAdd(TInt 2,TInt 3))
let rec typecheck : type a. exp -> a texp = fun e ->
match e with
| Add(e1,e2) ->
let te1 = typecheck e1 in
let te2 = typecheck e2 in
TAdd (te1,te2)
| _ -> failwith "todo"
Here's the problem. First, I'm not sure how to define the correct type for TLam and TVar in the type texp. Generally, I would provide the type with the variable name, but I'm not sure how to do that in this context. Second, I don't know the correct type for the context in the function typecheck. Before, I used some kind of list, but now I'm not sure of the type of the list. Third, after leaving out the context, the typecheck function doesn't type check itself. It fails with the message
File "test03.ml", line 32, characters 8-22:
Error: This expression has type int texp
but an expression was expected of type a texp
Type int is not compatible with type a
which makes complete sense. This is more of an issue of that I'm not sure what the correct type for typecheck should be.
In any case, how do you go about fixing these functions?
Edit 1
Here's a possible type for the context or environment
type _ ctx =
| Empty : unit ctx
| Item : string * 'a * 'b ctx -> ('a*'b) ctx
Edit 2
The trick with the environment is to make sure that the type of the environment is embedded into the type of the expression. Otherwise, there's not enough information in order to make things type safe. Here's a completed interpreter. At the moment, I do not have a valid type checker to move from untyped expressions to typed expressions.
type (_,_) texp =
| TAdd : ('e,int) texp * ('e,int) texp -> ('e,int) texp
| TAnd : ('e,bool) texp * ('e,bool) texp -> ('e,bool) texp
| TApp : ('e,('a -> 'b)) texp * ('e,'a) texp -> ('e,'b) texp
| TLam : (('a*'e),'b) texp -> ('e,('a -> 'b)) texp
| TVar0 : (('a*'e),'a) texp
| TVarS : ('e,'a) texp -> (('b*'e),'a) texp
| TInt : int -> ('e,int) texp
| TBol : bool -> ('e,bool) texp
let te1 = TAdd(TInt 1,TAdd(TInt 2,TInt 3))
(*let te2 = TAdd(TInt 1,TAdd(TInt 2,TBol false))*)
let te3 = TApp(TLam(TAdd(TVar0,TVar0)),TInt 4)
let te4 = TApp(TApp(TLam(TLam(TAdd(TVar0,TVarS(TVar0)))),TInt 4),TInt 5)
let te5 = TLam(TLam(TVarS(TVar0)))
let rec eval : type e t. e -> (e,t) texp -> t = fun env e ->
match e with
| TAdd (e1,e2) ->
let v1 = eval env e1 in
let v2 = eval env e2 in
v1 + v2
| TAnd (e1,e2) ->
let v1 = eval env e1 in
let v2 = eval env e2 in
v1 && v2
| TApp (e1,e2) ->
let v1 = eval env e1 in
let v2 = eval env e2 in
v1 v2
| TLam e ->
fun x -> eval (x,env) e
| TVar0 ->
let (v,vs)=env in
v
| TVarS e ->
let (v,vs)=env in
eval vs e
| TInt i -> i
| TBol b -> b
Then, we have
# eval () te1;;
- : int = 6
# eval () te3;;
- : int = 8
# eval () te5;;
- : '_a -> '_b -> '_a = <fun>
# eval () te4;;
- : int = 9
If you want the term representation to enforce well-typedness, you need to change the way type environments (and variables) are represented: you cannot finely type a mapping from strings to values (the types used to represent mappings are homogeneous). The classic solution is to move to a representation of variables using De Bruijn indices (strongly-typed numbers) instead of variable names. It may help you to perform that conversion in the untyped world first, and then only care about typing in the untyped -> GADT pass.
Here is, roughly sketched, a GADT declaration for strongly typed variables:
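The untyped conversion from names to De Bruijn indices could look roughly like the sketch below. The types named and db and the function to_db are made up for this illustration; they cover only variables, lambdas and applications, and an index counts how many binders lie between a variable and the lambda that introduced it.

```ocaml
(* Hypothetical named syntax. *)
type named =
  | Var of string
  | Lam of string * named
  | App of named * named

(* Hypothetical nameless syntax: Idx 0 is the innermost bound variable. *)
type db =
  | Idx of int
  | DLam of db
  | DApp of db * db

(* scope lists the bound names, innermost binder first. *)
let rec to_db scope term =
  match term with
  | Var x ->
    let rec find i = function
      | [] -> failwith ("Unbound variable " ^ x)
      | y :: rest -> if String.equal x y then i else find (i + 1) rest
    in
    Idx (find 0 scope)
  | Lam (x, body) -> DLam (to_db (x :: scope) body)
  | App (f, a) -> DApp (to_db scope f, to_db scope a)

(* fun x -> fun y -> x *)
let k = to_db [] (Lam ("x", Lam ("y", Var "x")))
```

Because the innermost binder is searched first, shadowed names resolve to the nearest enclosing lambda.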
type (_, _) var =
| Z : ('a, 'a * 'g) var
| S : ('a, 'g) var -> ('a, 'b * 'g) var
A value of type ('a, 'g) var should be understood as a description of a way to extract a value of type 'a out of an environment of type 'g. The environment is represented by a cascade of right-nested tuples. The Z case corresponds to picking the first variable in the environment, while the S case ignores the topmost variable and looks deeper in the environment.
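To make that reading concrete, here is a minimal sketch of a lookup function that follows a var into a matching environment. The name lookup and the sample environment are assumptions for illustration; the var type is repeated so the block stands alone.

```ocaml
type (_, _) var =
  | Z : ('a, 'a * 'g) var
  | S : ('a, 'g) var -> ('a, 'b * 'g) var

(* Follow the variable into the environment. Each constructor reveals that
   the environment is a pair, so the destructuring below is well-typed. *)
let rec lookup : type a g. (a, g) var -> g -> a =
 fun v env ->
  match v with
  | Z ->
    let x, _ = env in
    x
  | S v' ->
    let _, rest = env in
    lookup v' rest

(* A right-nested environment holding an int, a string and a float. *)
let env = (1, ("two", (3.0, ())))

let first = lookup Z env
let second = lookup (S Z) env
let third = lookup (S (S Z)) env
```

An out-of-range variable such as S (S (S Z)) is rejected at compile time for this environment, because unit is not a pair type.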
Shayan Najd has a (Haskell) implementation of this idea on github. Feel free to have a look at the GADT representation or the type-checking/translating code.
Alright, so I finally worked things out. Since I may not be the only one who finds this interesting, here's a complete set of code that does both type checking and evaluation:
type (_,_) texp =
| TAdd : ('gamma,int) texp * ('gamma,int) texp -> ('gamma,int) texp
| TAnd : ('gamma,bool) texp * ('gamma,bool) texp -> ('gamma,bool) texp
| TApp : ('gamma,('t1 -> 't2)) texp * ('gamma,'t1) texp -> ('gamma,'t2) texp
| TLam : (('gamma*'t1),'t2) texp -> ('gamma,('t1 -> 't2)) texp
| TVar0 : (('gamma*'t),'t) texp
| TVarS : ('gamma,'t1) texp -> (('gamma*'t2),'t1) texp
| TInt : int -> ('gamma,int) texp
| TBol : bool -> ('gamma,bool) texp
type _ typ =
| Integer : int typ
| Boolean : bool typ
| Arrow : 'a typ * 'b typ -> ('a -> 'b) typ
type (_,_) iseq = IsEqual : ('a,'a) iseq
let rec is_equal : type a b. a typ -> b typ -> (a,b) iseq option = fun a b ->
match a, b with
| Integer, Integer -> Some IsEqual
| Boolean, Boolean -> Some IsEqual
| Arrow(t1,t2), Arrow(u1,u2) ->
begin match is_equal t1 u1, is_equal t2 u2 with
| Some IsEqual, Some IsEqual -> Some IsEqual
| _ -> None
end
| _ -> None
type _ isint = IsInt : int isint
let is_integer : type a. a typ -> a isint option = fun a ->
match a with
| Integer -> Some IsInt
| _ -> None
type _ isbool = IsBool : bool isbool
let is_boolean : type a. a typ -> a isbool option = fun a ->
match a with
| Boolean -> Some IsBool
| _ -> None
type _ context =
| CEmpty : unit context
| CVar : 'a context * 't typ -> ('a*'t) context
type exp =
| Add of exp*exp
| And of exp*exp
| App of exp*exp
| Lam : 'a typ * exp -> exp
| Var0
| VarS of exp
| Int of int
| Bol of bool
type _ exists_texp =
| Exists : ('gamma,'t) texp * 't typ -> 'gamma exists_texp
let rec typecheck
: type gamma t. gamma context -> exp -> gamma exists_texp =
fun ctx e ->
match e with
| Int i -> Exists ((TInt i) , Integer)
| Bol b -> Exists ((TBol b) , Boolean)
| Var0 ->
begin match ctx with
| CEmpty -> failwith "Tried to grab a nonexistent variable"
| CVar(ctx,t) -> Exists (TVar0 , t)
end
| VarS e ->
begin match ctx with
| CEmpty -> failwith "Tried to grab a nonexistent variable"
| CVar(ctx,_) ->
let tet = typecheck ctx e in
begin match tet with
| Exists (te,t) -> Exists ((TVarS te) , t)
end
end
| Lam(t1,e) ->
let tet2 = typecheck (CVar (ctx,t1)) e in
begin match tet2 with
| Exists (te,t2) -> Exists ((TLam te) , (Arrow(t1,t2)))
end
| App(e1,e2) ->
let te1t1 = typecheck ctx e1 in
let te2t2 = typecheck ctx e2 in
begin match te1t1,te2t2 with
| Exists (te1,t1),Exists (te2,t2) ->
begin match t1 with
| Arrow(t11,t12) ->
let p = is_equal t11 t2 in
begin match p with
| Some IsEqual ->
Exists ((TApp (te1,te2)) , t12)
| None ->
failwith "Mismatch of types on a function application"
end
| _ -> failwith "Tried to apply a non-arrow type"
end
end
| Add(e1,e2) ->
let te1t1 = typecheck ctx e1 in
let te2t2 = typecheck ctx e2 in
begin match te1t1,te2t2 with
| Exists (te1,t1),Exists (te2,t2) ->
let p = is_equal t1 t2 in
let q = is_integer t1 in
begin match p,q with
| Some IsEqual, Some IsInt ->
Exists ((TAdd (te1,te2)) , t1)
| _ ->
failwith "Tried to add with something other than Integers"
end
end
| And(e1,e2) ->
let te1t1 = typecheck ctx e1 in
let te2t2 = typecheck ctx e2 in
begin match te1t1,te2t2 with
| Exists (te1,t1),Exists (te2,t2) ->
let p = is_equal t1 t2 in
let q = is_boolean t1 in
begin match p,q with
| Some IsEqual, Some IsBool ->
Exists ((TAnd (te1,te2)) , t1)
| _ ->
failwith "Tried to and with something other than Booleans"
end
end
let e1 = Add(Int 1,Add(Int 2,Int 3))
let e2 = Add(Int 1,Add(Int 2,Bol false))
let e3 = App(Lam(Integer,Add(Var0,Var0)),Int 4)
let e4 = App(App(Lam(Integer,Lam(Integer,Add(Var0,VarS(Var0)))),Int 4),Int 5)
let e5 = Lam(Integer,Lam(Integer,VarS(Var0)))
let e6 = App(Lam(Integer,Var0),Int 1)
let e7 = App(Lam(Integer,Lam(Integer,Var0)),Int 1)
let e8 = Lam(Integer,Var0)
let e9 = Lam(Integer,Lam(Integer,Var0))
let tet1 = typecheck CEmpty e1
(*let tet2 = typecheck CEmpty e2*)
let tet3 = typecheck CEmpty e3
let tet4 = typecheck CEmpty e4
let tet5 = typecheck CEmpty e5
let tet6 = typecheck CEmpty e6
let tet7 = typecheck CEmpty e7
let tet8 = typecheck CEmpty e8
let tet9 = typecheck CEmpty e9
let rec eval : type gamma t. gamma -> (gamma,t) texp -> t = fun env e ->
match e with
| TAdd (e1,e2) ->
let v1 = eval env e1 in
let v2 = eval env e2 in
v1 + v2
| TAnd (e1,e2) ->
let v1 = eval env e1 in
let v2 = eval env e2 in
v1 && v2
| TApp (e1,e2) ->
let v1 = eval env e1 in
let v2 = eval env e2 in
v1 v2
| TLam e ->
fun x -> eval (env,x) e
| TVar0 ->
let (env,x)=env in
x
| TVarS e ->
let (env,x)=env in
eval env e
| TInt i -> i
| TBol b -> b
type exists_v =
| ExistsV : 't -> exists_v
let typecheck_eval e =
let tet = typecheck CEmpty e in
match tet with
| Exists (te,t) -> ExistsV (eval () te)
let v1 = typecheck_eval e1
let v3 = typecheck_eval e3
let v4 = typecheck_eval e4
let v5 = typecheck_eval e5
let v6 = typecheck_eval e6
let v7 = typecheck_eval e7
let v8 = typecheck_eval e8
let v9 = typecheck_eval e9
Here are the pieces I had trouble with and how I managed to resolve them
In order to correctly type the typed expressions texp, the type of the environment needed to be built into the type of texp. This implies, as gasche correctly noted, that we needed some sort of De Bruijn notation. The easiest was just Var0 and VarS. In order to use variable names, we'd just have to preprocess the AST.
The type of the expression, typ, needed to include both variant types to match on as well as the type we use in the typed expression. In other words, that also needed to be a GADT.
We require three proofs in order to ferret out the correct types in the type checker. These are is_equal, is_integer, and is_boolean. The code for is_equal is actually in the OCaml manual under Advanced examples. Specifically, look at the definition of eq_type.
The type exp, for the untyped AST, actually needs to be a GADT also. The lambda abstraction needs access to typ, which is a GADT.
The type checker returns an existential type containing both a typed expression and its type. We need both to get the program to type check. Also, we need the existential because the untyped expression may or may not have a type.
The existential type, exists_texp, exposes the type of the environment/context, but not the type. We need this type exposed in order to type check properly.
Once everything is set up, the evaluator follows the type rules exactly.
The result of combining the type checker with the evaluator must be another existential type. A priori, we don't know the resulting type, so we have to hide it in an existential package.
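A value hidden in a bare existential such as exists_v cannot be inspected afterwards. One possible variation, sketched here under the assumption that the typ witness is kept next to the value, is to package both together; the names typed_value, show_typ, show and show_value are hypothetical, and typ is repeated so the block stands alone.

```ocaml
type _ typ =
  | Integer : int typ
  | Boolean : bool typ
  | Arrow : 'a typ * 'b typ -> ('a -> 'b) typ

(* Hypothetical package: a value together with a witness of its type. *)
type typed_value = TypedValue : 't typ * 't -> typed_value

let rec show_typ : type t. t typ -> string = function
  | Integer -> "int"
  | Boolean -> "bool"
  | Arrow (a, b) -> "(" ^ show_typ a ^ " -> " ^ show_typ b ^ ")"

(* Matching on the witness recovers the concrete type of the value. *)
let show : type t. t typ -> t -> string =
 fun ty v ->
  match ty with
  | Integer -> string_of_int v
  | Boolean -> string_of_bool v
  | Arrow _ -> "<fun> : " ^ show_typ ty

let show_value = function
  | TypedValue (ty, v) -> show ty v
```

With this shape, typecheck_eval could return TypedValue (t, eval () te) instead of ExistsV (eval () te), which lets callers print or compare results.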
I have a parameterized type that recursively uses itself but with a type parameter specialized and when I implement a generic operator, the type of that operator is bound too tightly because of the case that handles the specialized sub-tree. The first code sample shows the problem, and the second shows a workaround that I'd rather not use because the real code has quite a few more cases so duplicating code this way is a maintenance hazard.
Here's a minimal test case that shows the problem:
module Op1 = struct
type 'a t = A | B (* 'a is unused but it and the _ below satisfy a sig *)
let map _ x = match x with
| A -> A
| B -> B
end
module type SIG = sig
type ('a, 'b) t =
| Leaf of 'a * 'b
(* Here a generic ('a, 'b) t contains a specialized ('a, 'a Op1.t) t. *)
| Inner of 'a * ('a, 'a Op1.t) t * ('a, 'b) t
val map : ('a -> 'b) -> ('a_t -> 'b_t) -> ('a, 'a_t) t -> ('b, 'b_t) t
end
module Impl : SIG = struct
type ('a, 'b) t =
| Leaf of 'a * 'b
| Inner of 'a * ('a, 'a Op1.t) t * ('a, 'b) t
(* Fails signature check:
Values do not match:
val map :
('a -> 'b) ->
('a Op1.t -> 'b Op1.t) -> ('a, 'a Op1.t) t -> ('b, 'b Op1.t) t
is not included in
val map :
('a -> 'b) -> ('a_t -> 'b_t) -> ('a, 'a_t) t -> ('b, 'b_t) t
*)
let rec map f g n = match n with
| Leaf (a, b) -> Leaf (f a, g b)
(* possibly because rec call is applied to specialized sub-tree *)
| Inner (a, x, y) -> Inner (f a, map f (Op1.map f) x, map f g y)
end
This modified version of Impl.map fixes the problem but introduces a maintenance hazard.
let rec map f g n = match n with
| Leaf (a, b) -> Leaf (f a, g b)
| Inner (a, x, y) -> Inner (f a, map_spec f x, map f g y)
and map_spec f n = match n with
| Leaf (a, b) -> Leaf (f a, Op1.map f b)
| Inner (a, x, y) -> Inner (f a, map_spec f x, map_spec f y)
Is there any way to get this to work without duplicating the body of let rec map?
Applying gasche's solution yields the following working code:
let rec map
: 'a 'b 'c 'd . ('a -> 'b) -> ('c -> 'd) -> ('a, 'c) t -> ('b, 'd) t
= fun f g n -> match n with
| Leaf (a, b) -> Leaf (f a, g b)
| Inner (a, x, y) -> Inner (f a, map f (Op1.map f) x, map f g y)
This style of recursion in datatype definitions is called "non-regular": the recursive type 'a t is reused at an instance foo t where foo is different from the single variable 'a used in the definition. Another well-known example is the type of full binary trees (with exactly 2^n leaves):
type 'a full_tree =
| Leaf of 'a
| Node of ('a * 'a) full_tree
Recursive functions that operate on these datatypes typically suffer from the monomorphic recursion restriction of languages with type inference. When you do type inference you have to make a guess at what the type of a recursive function may be before type-checking its body (as it may be used inside). ML languages refine this guess by unification/inference, but only monomorphic types may be inferred. If your function makes polymorphic uses of itself (it calls itself recursively at a different type than the one it took as input), this cannot be inferred (it is undecidable in the general case).
let rec depth = function
| Leaf _ -> 1
| Node t -> 1 + depth t
^
Error: This expression has type ('a * 'a) full_tree
but an expression was expected of type 'a full_tree
Since 3.12, OCaml allows an explicit polymorphic annotation of
the form 'a 'b . foo, meaning forall 'a 'b. foo:
let rec depth : 'a . 'a full_tree -> int = function
| Leaf _ -> 1
| Node t -> 1 + depth t
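As a further sketch of the same annotation at work, here is a function that actually uses the elements, so each recursive call really happens at a different instance. The name to_list is chosen for this illustration and the type is repeated so the block stands alone.

```ocaml
type 'a full_tree =
  | Leaf of 'a
  | Node of ('a * 'a) full_tree

(* The recursive call is at type ('a * 'a) full_tree, hence the explicit
   polymorphic annotation. Each level splits pairs back into elements. *)
let rec to_list : 'a. 'a full_tree -> 'a list = function
  | Leaf x -> [ x ]
  | Node t -> List.concat (List.map (fun (a, b) -> [ a; b ]) (to_list t))

(* A tree of depth 2 with 4 leaves. *)
let four = Node (Node (Leaf ((1, 2), (3, 4))))
let elements = to_list four
```

Removing the 'a. prefix from the annotation makes this definition fail with the same kind of error as the unannotated depth.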
You could do the same in your example. However, I wasn't able to
compile the type after using the annotation you have in your module
signature, as it appears to be wrong (the 'a_t are just weird). Here
is what I used to make it work:
let rec map : 'a 'b . ('a -> 'b) -> ('a Op1.t -> 'b Op1.t) ->
('a, 'a Op1.t) t -> ('b, 'b Op1.t) t
= fun f g n -> match n with
| Leaf (a, b) -> Leaf (f a, g b)
| Inner (a, x, y) -> Inner (f a, map f (Op1.map f) x, map f g y)