Recursive match function with cyclic function dependences - ocaml

I'm not sure that the title explains my problem, but I can improve it after for the moment I want to explain my problem because are several days that I'm breaking my mind over this problem.
I'm developing a static analysis for my class with Ocaml to check if my c (subset of a C language) program meaning somethings real, and I'm new with this stuff (with the language and paradigm and with the compiler stuff).
The static analysis is traversing the Abstract syntax tree (Ast) and make some checks of it (The check is a problem with TODO comment), for the moment I'm developing the data structure, in particular a Symbol Table, and implementing the code to traverse the Ast.
My complete Ast.
type binop = Add | Sub | Mult | Div | Mod | Equal | Neq | Less | Leq |
Greater | Geq | And | Or | Comma
[##deriving show]
type uop = Neg | Not [##deriving show]
type identifier = string [##deriving show]
type position = Lexing.position * Lexing.position
let dummy_pos = (Lexing.dummy_pos, Lexing.dummy_pos)
type 'a annotated_node = {loc : position[#opaque]; node : 'a}[##deriving show]
type typ =
| TypInt (* Type int *)
| TypBool (* Type bool *)
| TypChar (* Type char *)
| TypArray of typ * int option (* Array type *)
| TypPoint of typ (* Pointer type *)
| TypVoid (* Type void *)
[##deriving show]
and expr = expr_node annotated_node
and expr_node =
| Access of access (* x or *p or a[e] *)
| Assign of access * expr (* x=e or *p=e or a[e]=e *)
| Addr of access (* &x or &*p or &a[e] *)
| ILiteral of int (* Integer literal *)
| CLiteral of char (* Char literal *)
| BLiteral of bool (* Bool literal *)
| UnaryOp of uop * expr (* Unary primitive operator *)
| BinaryOp of binop * expr * expr (* Binary primitive operator *)
| Call of identifier * expr list (* Function call f(...) *)
[##deriving show]
and access = access_node annotated_node
and access_node =
| AccVar of identifier (* Variable access x *)
| AccDeref of expr (* Pointer dereferencing *p *)
| AccIndex of access * expr (* Array indexing a[e] *)
[##deriving show]
and stmt = stmt_node annotated_node
and stmt_node =
| If of expr * stmt * stmt (* Conditional *)
| While of expr * stmt (* While loop *)
| For of expr option * expr option * expr option * stmt (* For loop *)
| Expr of expr (* Expression statement e; *)
| Return of expr option (* Return statement *)
| Block of stmtordec list (* Block: grouping and scope *)
[##deriving show]
and stmtordec = stmtordec_node annotated_node
and stmtordec_node =
| Dec of typ * identifier (* Local variable declaration *)
| Stmt of stmt (* A statement *)
[##deriving show]
type fun_decl = {
typ : typ;
fname : string;
formals : (typ*identifier) list;
body : stmt;
}[##deriving show]
type topdecl = topdecl_node annotated_node
and topdecl_node =
| Fundecl of fun_decl
| Vardec of typ * identifier
[##deriving show]
type program = Prog of topdecl list [##deriving show]
My problem is how to traverse the stmt because inside contains the Block of stmtordec list and the stmtordec and have the stmt and in this case, I'm in a cyclic dependence that I'm not able to resolve with the match function.
My idea to traverse it is to have an OCaml function check_stm -> check_blk -> check_stm, but how I can resolve this idea with the code?
At the moment my code is that but don't compiler because I'm not able to put the function in the OCaml scope at the same time.
My code is that
open Ast
open Symbol_table
open Easy_logging
let logger = Logging.make_logger "Semant" Debug [Cli Debug]
(* Global Scope: This scope contains all the Global declaration
Global declaration types:
- Int, Bool, Char, Array.
- Struct
- function declaration
*)
let global_scope = empty_table
let check_blk blkstm =
match blkstm.node with
| Ast.Dec(tipe, id) ->
begin
logger#debug "Variable declaration check";
(* TODO: I'm missing the variable duplication *)
Symbol_table.add_entry id tipe global_scope
end
| Ast.Stmt(stm) ->
begin
logger#debug "Stm check (recursive call)";
check_stm stm
end
let check_stm node =
match node with
| Ast.If(ex, ifs, els) -> logger#debug "TODO: If stm check"
| Ast.While(ex, stm) -> logger#debug "TODO: While stm check"
| Ast.For(ex1, ex2, ex3, stm) -> logger#debug "TODO: For stm check"
| Ast.Expr(ex) -> logger#debug "TODO: Expression check"
| Ast.Return(optex) -> logger#debug "TODO: Return stm check"
| Ast.Block(blkstm) -> List.iter check_blk blkstm
let check_fundec node =
match node with
| fun_decl as f ->
begin
logger#debug "Checking function declaration";
(* TODO: how I can managed the parameter of the function?*)
global_scope = Symbol_table.begin_block global_scope;
check_stm f.body.node
end
let rec match_type ast_elem =
match ast_elem.node with
| Vardec(tipe, id) ->
begin
logger#debug "Global variable found";
add_entry id tipe global_scope;
()
end
| Fundecl(fundec) ->
begin
logger#debug "Function analysis found";
Symbol_table.add_entry fundec.fname fundec.typ global_scope;
check_fundec fundec;
end
let check (Ast.Prog(topdecls)) = List.iter match_type topdecls
Maybe this question is foolish, and maybe I'm making something wrong inside my idea, but I want to talk about the problem to fix it and learn how to use the OCaml language
p.s: For the moment the Symbol_table implementation is an empty implementation

If I understand your problem correctly you just have to specify the mutually recursive functions explicitly. You do so using the and keyword, just like with type definitions, but also have to use the rec keyword because function definitions are not recursive by default, unlike type definitions:
let rec check_blk blkstm = ...
and check_stm node = ...

Related

Why does OCaml think that this function takes an int parameter when nothing suggests that it should be the case?

I was working on chapter 1 of Modern Compiler Implementation in ML by Andrew Appel and I decided to implement it in OCaml instead of SML. I'm new to OCaml and I came across a very frustrating problem. OCaml seems to think that the below function has the signature int * (int * 'a) -> 'a option.
let rec lookupTable = function
| name, (i, v) :: _ when name = i -> Some v
| name, (_, _) :: rest -> lookupTable (name, rest)
| _, [] -> None
But as far as I can tell, there should be nothing that suggests that the first element in the tuple is an int. This is a problem because when the lookupTable function down the line, the compiler complains that I am not passing it an integer. Perhaps I am missing something incredibly obvious, but it has been pretty mind-boggling. Here is the rest of the program
open Base
type id = string
type binop = Plus | Minus | Times | Div
type stm =
| CompoundStm of stm * stm
| AssignStm of id * exp
| PrintStm of exp list
and exp =
| IdExp of id
| NumExp of int
| OpExp of exp * binop * exp
| EseqExp of stm * exp
(* Returns the maximum number of arguments of any print
statement within any subexpression of a given statement *)
let rec maxargs s =
match s with
| CompoundStm (stm1, stm2) -> Int.max (maxargs stm1) (maxargs stm2)
| AssignStm (_, exp) -> maxargs_exp exp
(* Might be more nested expressions *)
| PrintStm exps -> Int.max (List.length exps) (maxargs_explist exps)
and maxargs_exp e = match e with EseqExp (stm, _) -> maxargs stm | _ -> 0
and maxargs_explist exps =
match exps with
| exp :: rest -> Int.max (maxargs_exp exp) (maxargs_explist rest)
| [] -> 0
type table = (id * int) list
let updateTable name value t : table = (name, value) :: t
let rec lookupTable = function
| name, (i, v) :: _ when name = i -> Some v
| name, (_, _) :: rest -> lookupTable (name, rest)
| _, [] -> None
exception UndefinedVariable of string
let rec interp s =
let t = [] in
interpStm s t
and interpStm s t =
match s with
| CompoundStm (stm1, stm2) -> interpStm stm2 (interpStm stm1 t)
| AssignStm (id, exp) ->
let v, t' = interpExp exp t in
updateTable id v t'
(* Might be more nested expressions *)
| PrintStm exps ->
let interpretAndPrint t e =
let v, t' = interpExp e t in
Stdio.print_endline (Int.to_string v);
t'
in
List.fold_left exps ~init:t ~f:interpretAndPrint
and interpExp e t =
match e with
| IdExp i -> (
match lookupTable (i, t) with
| Some v -> (v, t)
| None -> raise (UndefinedVariable i))
| NumExp i -> (i, t)
| OpExp (exp1, binop, exp2) ->
let exp1_val, t' = interpExp exp1 t in
let exp2_val, _ = interpExp exp2 t' in
let res =
match binop with
| Plus -> exp1_val + exp2_val
| Minus -> exp1_val - exp2_val
| Times -> exp1_val * exp2_val
| Div -> exp1_val / exp2_val
in
(res, t')
| EseqExp (s, e) -> interpExp e (interpStm s t)
Base defines = as int -> int -> bool, so when you have the expression name = i the compiler will infer them as ints.
You can access the polymorphic functions and operators through the Poly module, or use a type-specific operator by locally opening the relevant module, e.g. String.(name = i).
The reason Base does not expose polymorphic operators by default is briefly explained in the documentation's introduction:
The comparison operators exposed by the OCaml standard library are polymorphic:
What they implement is structural comparison of the runtime representation of values. Since these are often error-prone, i.e., they don't correspond to what the user expects, they are not exposed directly by Base.
There's also a performance-argument to be made, because the polymorphic/structural operators need to also inspect what kind of value it is at runtime in order to compare them correctly.

expanding type equation generator in OCaml

type exp =
| CONST of int
| VAR of var
| ADD of exp * exp
| SUB of exp * exp
| ISZERO of exp
| IF of exp * exp * exp
| LET of var * exp * exp
| PROC of var * exp
| CALL of exp * exp
and var = string
type typ = TyInt | TyBool | TyFun of typ * typ | TyVar of tyvar
and tyvar = string
type typ_eqn = (typ * typ) list
module TEnv = struct
type t = var -> typ
let empty = fun _ -> raise (Failure "Type Env is empty")
let extend (x,t) tenv = fun y -> if x = y then t else (tenv y)
let find tenv x = tenv x
end
let rec gen_equations : TEnv.t -> exp -> typ -> typ_eqn
=fun tenv e ty -> match e with
| CONST n -> [(ty, TyInt)]
| VAR x -> [(ty, TEnv.find tenv x)]
| ADD (e1,e2) -> [(ty, TyInt)]#
[gen_equations (tenv, e1, TyInt)]#
[gen_equations (tenv, e2, TyInt)]
Implementing type equation generator in OCaml
I'm expanding type checker based on above code
What I want to do is add "EQUAL" expression, which take two input and return TyBool as output
Problem is how to make equation of input, as input is not a fixed type.
EQUAL can take both TyInt and TyBool as input
ex)
EQUAL (FALSE, FALSE) have output TRUE (because false == false)
EQUAL (5, 3) have output FALSE (because 5 != 3)
How can I make equation?
Your EQUAL expression has a polymorphic type, i.e., it is typable for integers and variables. It is your choice, as a language designer, which flavor of polymorphism you will implement. You can start with the classical parametric polymorphism. In that case, you will type EQUAL as 'a -> 'a -> bool, or, in the parlance of your representation,
TyFun (TyVar a, TyFun (TyVar a, TyBool))
Then, during the unification process, the type variable a will be unified either with TyBool or with TyVar (well, it could also unify with functional types, which you can turn into a type error if you wish).
Alternatively, you can implement EQUAL using ad-hoc polymorphism. This will require you to change your type system though, i.e., to add new constructors to your type typ. You can take the type classes approach and have TyCls of string * typ and give EQUAL the following type,
TyFun (TyCls ("comparable",t), TyFun (TyCls ("comparable",t), TyBool))
But inference with type classes is hard and is not always decidable.

How do I interpret this GADT error in OCaml?

Sorry about the "what am I missing here" style of question here, but I'm just missing something here.
I was trying to understand how GADTs work in OCaml, I define the following (in utop):
type value =
| Bool : bool -> value
| Int : int -> value
;;
type _ value =
| Bool : bool -> bool value
| Int : int -> int value
;;
type _ expr =
| Value : 'a value -> 'a expr
| If : bool expr * 'a expr * 'a expr -> 'a expr
| Lt : 'a expr * 'a expr -> bool expr
| Eq : 'a expr * 'a expr -> bool expr
| Gt : 'a expr * 'a expr -> bool expr
;;
I defined an eval function:
let rec eval : type a. a expr -> a = function
| Value (Int i) -> i
| Value (Bool b) -> b
| Lt (a, b) -> (eval a) < (eval b)
| Gt (a, b) -> (eval a) > (eval b)
| Eq (a, b) -> (eval a) = (eval b)
| If (c, a, b) -> if eval c then (eval a) else (eval b)
;;
but got an error:
Line 4, characters 15-23:
Error: This expression has type $Lt_'a but an expression was expected of type
int
What exactly does this mean?
Just to test further, I modified the expression GADT to be:
type _ expr =
| Value : 'a value -> 'a expr
| If : bool expr * 'a expr * 'a expr -> 'a expr
| Lt : int expr * int expr -> bool expr
| Eq : 'a expr * 'a expr -> bool expr
| Gt : int expr * int expr -> bool expr
;;
and then I see
Line 6, characters 15-23:
Error: This expression has type $Eq_'a but an expression was expected of type
int
When I finally modify it to be
type _ expr =
| Value : 'a value -> 'a expr
| If : bool expr * 'a expr * 'a expr -> 'a expr
| Lt : int expr * int expr -> bool expr
| Eq : int expr * int expr -> bool expr
| Gt : int expr * int expr -> bool expr
;;
it works fine.
Update (more context):
Ocaml version: 4.08.1
Libraries opened during this session: Base
Update (solution):
it turned out to be (as mentioned in the first line of the selected answer) because I had previously, within utop run open Base ;;
In a fresh session I'm able to enter the types initially mentioned and eval is happy with that.
The direct cause of the error is that you are using a library (maybe Base or Core?) that shadows the polymorphic comparison operators (<,<=,=,>=,>) and replace them with integer comparison operators.
Concerning the error message, when you pattern match a GADT constructor with existential types,
| Lt (a, b) -> (eval a) < (eval b)
the typechecker introduces new types to represent the existential types.
Here, in the (original) definition of Lt,
| Lt : 'a expr * 'a expr -> bool expr
there is one existentially quantified type variable: 'a.
When pattern matching on Lt, we need to replace this type variable with
a new type. Moreover, it is quite useful in error message to try to pick
a meaningful name for this type. To do so, the typechecker constructs a
new type name piece by piece as $ + Lt + 'a:
$: to mark an existential type
Lt: to indicate that it was introduced by the constructor Lt
a: to remember that the existential type variable was named 'a in the definition of the constructor
In other words, in the pattern match above, we have something akin to
| Lt ( (a: $Lt_'a eval), (b: $Lt_'a eval)) -> (eval a) < (eval b)
And when typing:
(eval a) < (eval b)
the typechecker compare the type of <: int -> int with the type of eval a: $Lt_'a and outputs your original error message:
Line 4, characters 15-23:
Error: This expression has type $Lt_'a but an expression was expected of type
int

OCaml: How to handle sum type properly?

suppose I have the following code:
type s = A of a | B of b
let foo (a:?) =
let bar (input:s) = match input with
| A a -> foo input
| B b -> ...
My question is what should I fill in the question mark in the signature of foo so I won't need a (redundant) match statement in it? Or is there a better pattern to do this?
If you want to avoid rematch, I see 3 solutions:
have the function foo simply take the "payload" of the value constructor A, and reconstruct a value of type s as its output (or any other type matching the output type of bar).
# type a;;
type a
# type b;;
type b
# module Ex1 = struct
type s = A of a | B of b
let foo (a:a) = A a
let bar (input:s) = match input with
| A a -> foo a
| B b -> (* ... *) input
end;;
module Ex1 :
sig
type s = A of a | B of b
val foo : a -> s
val bar : s -> s
end
use polymorphic variants:
# module Ex2 = struct
type s = [ `A of a | `B of b ]
let foo (`A a) = `A a
let bar (input:s) = match input with
| `A _ as a -> foo a
| `B b -> (* ... *) input
end;;
module Ex2 :
sig
type s = [ `A of a | `B of b ]
val foo : [< `A of 'a ] -> [> `A of 'a ]
val bar : s -> s
end
use GADTs:
# module Ex3 = struct
type _ s =
| A : a -> a s
| B : b -> b s
let foo (a: a s) : a s =
match a with
| A a -> A a
let bar: type x. x s -> x s = fun input ->
match input with
| A _ as a -> foo a
| B b -> (* ... *) input
end;;
module Ex3 :
sig
type _ s = A : a -> a s | B : b -> b s
val foo : a s -> a s
val bar : 'x s -> 'x s
end
Starting with your example, the solution would be simple:
type s = A of a | B of b
let foo (a:a) =
let bar (input:s) = match input with
| A a -> foo a
| B b -> ...
But constraint here is not needed. Looks like that you're misunderstanding the idea of type constraints. In general, in OCaml type constraints cannot affect the program. Programs with and without type constraints have the same behavior. So, here you don't need to put any constraints at all. You must think of type annotations only as a helper tool for programmer.
Update
I'm still not sure, that I understand what actually you want, but if you want to split your variants into subsets, and keep this split refutable, then, indeed, you can use polymorphic variants, as Pascal suggested.
Let me first rephrase the questions. Suppose I have type:
type t = A | B | C | D | E
and I have a pattern match
let f = function
| A | B | C as x -> handle_abc x
| D | E as x -> handle_de x
How can I prove to a compiler, that handle_abc takes only a subset of all possible constructors, namely A | B | C ?
The answer is, with regular variants it is impossible. But it is possible with polymorphic variants:
type t = [`A | `B | `C | `D | `E]
let f = function
| `A | `B | `C as x -> handle_abc x
| `D | `E as -> handle_de x
So, handle_abc now needs only to pattern match on three variants, and don't need to have any redundant matches. Moreover, you can give names to a groups of constructors, and pattern match on this names:
type abc = [`A | `B | `C ]
type de = [`D | `E ]
type t = [ abc | de ]
let f = function
| #abc as x -> handle_abc x
| #de as -> handle_de x
As a real world example, you can take a look at BAP project where polymorphic variants are used to represent instruction code. Here we split all codes into different subgroups, like all move instructions, all branch instructions and so on. And later we can pattern match on the groups directly.
One solution, that incurs a runtime cost, would be to have the variants wrap tuples instead of individual values. Then it's easier to capture the whole tuple and send it to a specialized function:
type s =
(* Note the extra parentheses! *)
| Foo of (int * string)
| Bar of (char * int * string)
let foo (i, s) = "foo"
let bar (c, i, s) = "bar"
let main input =
match input with
| Foo f -> foo f (* `f` is bound to a tuple of type `int * string` *)
| Bar b -> bar b (* `b` is bound to a tuple of type `char * int * string` *)
You would have to fill in the type in the question mark for the signature of Foo, and then use a match statement in it. The place where the question mark is denotes a type. In a way it is assisting the compiler by telling it what exact type you want, and it will strictly ensure that operations you carry out on a or input is of a matching type.
The match statement is not that redundant and does not hurt performance much as it is very efficient in OCaml. However we have another approach as below.
Alternatively if you only have one parameter, you could save some typing by doing function in place of match. For example:
let foo (c:s) = match c with ....
we can do
let foo = function
| A -> ...
| B -> ...
Note that the function word will only work if you have one parameter passed in (you could definitely wrap up all your parameters into a list and pass it in if you like)
Here's an additional example to get my point across:
type number = |Int of int
|Float of float
let to_int: (number -> int option) = function
| Int n -> Some n
| _ -> None
(*this is the same as *) (*also note that int option denotes return type*)
let to_int (input:number) : int option =
match input with
| Int n -> Some n
| _ -> None
(*notice how the first one does not have a input as a parameter name*)
let n1:number = Int 1;;
let n2:number = Int 2;;
Example use: `to_int(n1);`
Just to be clear, there isn't a need to fill it in, and type assists helps the programmer as well, and for me in some ambiguous cases helped make sure the compiler knew what I wanted. According to my professor a few semesters ago, it is a good practice to explicitly mention it to keep types in check.

Simple lambda calculus DSL using GADTs in OCaml

How do you define a simple lambda calculus-like DSL in OCaml using GADTs? Specifically, I can't figure out how to properly define the type checker to translate from an untyped AST to a typed AST nor can I figure out the correct type for the context and environment.
Here's some code for a simple lambda calculus-like language using the traditional approach in OCaml
(* Here's a traditional implementation of a lambda calculus like language *)
type typ =
| Boolean
| Integer
| Arrow of typ*typ
type exp =
| Add of exp*exp
| And of exp*exp
| App of exp*exp
| Lam of string*typ*exp
| Var of string
| Int of int
| Bol of bool
let e1=Add(Int 1,Add(Int 2,Int 3))
let e2=Add(Int 1,Add(Int 2,Bol false)) (* Type error *)
let e3=App(Lam("x",Integer,Add(Var "x",Var "x")),Int 4)
let rec typecheck con e =
match e with
| Add(e1,e2) ->
let t1=typecheck con e1 in
let t2=typecheck con e2 in
begin match (t1,t2) with
| (Integer,Integer) -> Integer
| _ -> failwith "Tried to add with something other than Integers"
end
| And(e1,e2) ->
let t1=typecheck con e1 in
let t2=typecheck con e2 in
begin match (t1,t2) with
| (Boolean,Boolean) -> Boolean
| _ -> failwith "Tried to and with something other than Booleans"
end
| App(e1,e2) ->
let t1=typecheck con e1 in
let t2=typecheck con e2 in
begin match t1 with
| Arrow(t11,t12) ->
if t11 <> t2 then
failwith "Mismatch of types on a function application"
else
t12
| _ -> failwith "Tried to apply a non-arrow type"
end
| Lam(x,t,e) ->
Arrow (t,typecheck ((x,t)::con) e)
| Var x ->
let (y,t) = List.find (fun (y,t)->y=x) con in
t
| Int _ -> Integer
| Bol _ -> Boolean
let t1 = typecheck [] e1
(* let t2 = typecheck [] e2 *)
let t3 = typecheck [] e3
type value =
| VBoolean of bool
| VInteger of int
| VArrow of ((string*value) list -> value -> value)
let rec eval env e =
match e with
| Add(e1,e2) ->
let v1=eval env e1 in
let v2=eval env e2 in
begin match (v1,v2) with
| (VInteger i1,VInteger i2) -> VInteger (i1+i2)
| _ -> failwith "Tried to add with something other than Integers"
end
| And(e1,e2) ->
let v1=eval env e1 in
let v2=eval env e2 in
begin match (v1,v2) with
| (VBoolean b1,VBoolean b2) -> VBoolean (b1 && b2)
| _ -> failwith "Tried to and with something other than Booleans"
end
| App(e1,e2) ->
let v1=eval env e1 in
let v2=eval env e2 in
begin match v1 with
| VArrow a1 -> a1 env v2
| _ -> failwith "Tried to apply a non-arrow type"
end
| Lam(x,t,e) ->
VArrow (fun env' v' -> eval ((x,v')::env') e)
| Var x ->
let (y,v) = List.find (fun (y,t)->y=x) env in
v
| Int i -> VInteger i
| Bol b -> VBoolean b
let v1 = eval [] e1
let v3 = eval [] e3
Now, I'm trying to translate this into something that uses GADTs. Here's my start
(* Now, we try to GADT the process *)
type exp =
| Add of exp*exp
| And of exp*exp
| App of exp*exp
| Lam of string*typ*exp
| Var of string
| Int of int
| Bol of bool
let e1=Add(Int 1,Add(Int 2,Int 3))
let e2=Add(Int 1,Add(Int 2,Bol false))
let e3=App(Lam("x",Integer,Add(Var "x",Var "x")),Int 4)
type _ texp =
| TAdd : int texp * int texp -> int texp
| TAnd : bool texp * bool texp -> bool texp
| TApp : ('a -> 'b) texp * 'a texp -> 'b texp
| TLam : string*'b texp -> ('a -> 'b) texp
| TVar : string -> 'a texp
| TInt : int -> int texp
| TBol : bool -> bool texp
let te1 = TAdd(TInt 1,TAdd(TInt 2,TInt 3))
let rec typecheck : type a. exp -> a texp = fun e ->
match e with
| Add(e1,e2) ->
let te1 = typecheck e1 in
let te2 = typecheck e2 in
TAdd (te1,te2)
| _ -> failwith "todo"
Here's the problem. First, I'm not sure how to define the correct type for TLam and TVar in the type texp. Generally, I would provide the type with the variable name, but I'm not sure how to do that in this context. Second, I don't know the correct type for the context in the function typecheck. Before, I used some kind of list, but now I'm sure sure of the type of the list. Third, after leaving out the context, the typecheck function doesn't type check itself. It fails with the message
File "test03.ml", line 32, characters 8-22:
Error: This expression has type int texp
but an expression was expected of type a texp
Type int is not compatible with type a
which makes complete sense. This is more of an issue of that I'm not sure what the correct type for typecheck should be.
In any case, how do you go about fixing these functions?
Edit 1
Here's a possible type for the context or environment
type _ ctx =
| Empty : unit ctx
| Item : string * 'a * 'b ctx -> ('a*'b) ctx
Edit 2
The trick with the environment is to make sure that the type of the environment is embedded into the type of the expression. Otherwise, there's not enough information in order to make things type safe. Here's a completed interpreter. At the moment, I do not have a valid type checker to move from untyped expressions to typed expressions.
type (_,_) texp =
| TAdd : ('e,int) texp * ('e,int) texp -> ('e,int) texp
| TAnd : ('e,bool) texp * ('e,bool) texp -> ('e,bool) texp
| TApp : ('e,('a -> 'b)) texp * ('e,'a) texp -> ('e,'b) texp
| TLam : (('a*'e),'b) texp -> ('e,('a -> 'b)) texp
| TVar0 : (('a*'e),'a) texp
| TVarS : ('e,'a) texp -> (('b*'e),'a) texp
| TInt : int -> ('e,int) texp
| TBol : bool -> ('e,bool) texp
let te1 = TAdd(TInt 1,TAdd(TInt 2,TInt 3))
(*let te2 = TAdd(TInt 1,TAdd(TInt 2,TBol false))*)
let te3 = TApp(TLam(TAdd(TVar0,TVar0)),TInt 4)
let te4 = TApp(TApp(TLam(TLam(TAdd(TVar0,TVarS(TVar0)))),TInt 4),TInt 5)
let te5 = TLam(TLam(TVarS(TVar0)))
let rec eval : type e t. e -> (e,t) texp -> t = fun env e ->
match e with
| TAdd (e1,e2) ->
let v1 = eval env e1 in
let v2 = eval env e2 in
v1 + v2
| TAnd (e1,e2) ->
let v1 = eval env e1 in
let v2 = eval env e2 in
v1 && v2
| TApp (e1,e2) ->
let v1 = eval env e1 in
let v2 = eval env e2 in
v1 v2
| TLam e ->
fun x -> eval (x,env) e
| TVar0 ->
let (v,vs)=env in
v
| TVarS e ->
let (v,vs)=env in
eval vs e
| TInt i -> i
| TBol b -> b
Then, we have
# eval () te1;;
- : int = 6
# eval () te3;;
- : int = 8
# eval () te5;;
- : '_a -> '_b -> '_a = <fun>
# eval () te4;;
- : int = 9
If you want the term representation to enforce well-typedness, you need to change the way type environments (and variables) are represented: you cannot finely type a mapping from strings to value (type to represent mapping are homogeneous). The classic solution is to move to a representation of variables using De Bruijn indices (strongly-typed numbers) instead of variable names. It may help you to perform that conversion in the untyped world first, and then only care about typing in the untyped -> GADT pass.
Here is, rouhgly sketched, a GADT declaration for strongly typed variables:
type (_, _) var =
| Z : ('a, 'a * 'g) var
| S : ('a, 'g) var -> ('a, 'b * 'g) var
A value at type ('a, 'g) var should be understood as a description of a way to extract a value of type 'a out of an environment of type 'g. The environment is represented by a cascade of right-nested tuples. The Z case corresponds to picking the first variable in the environment, while the S case ignores the topmost variables and looks deeper in the environment.
Shayan Najd has a (Haskell) implementation of this idea on github. Feel free to have a look at the GADT representation or the type-checking/translating code.
Alright, so I finally worked things out. Since I may not be the only one who finds this interesting, here's a complete set of code that does both type checking and evaluation:
type (_,_) texp =
| TAdd : ('gamma,int) texp * ('gamma,int) texp -> ('gamma,int) texp
| TAnd : ('gamma,bool) texp * ('gamma,bool) texp -> ('gamma,bool) texp
| TApp : ('gamma,('t1 -> 't2)) texp * ('gamma,'t1) texp -> ('gamma,'t2) texp
| TLam : (('gamma*'t1),'t2) texp -> ('gamma,('t1 -> 't2)) texp
| TVar0 : (('gamma*'t),'t) texp
| TVarS : ('gamma,'t1) texp -> (('gamma*'t2),'t1) texp
| TInt : int -> ('gamma,int) texp
| TBol : bool -> ('gamma,bool) texp
type _ typ =
| Integer : int typ
| Boolean : bool typ
| Arrow : 'a typ * 'b typ -> ('a -> 'b) typ
type (_,_) iseq = IsEqual : ('a,'a) iseq
let rec is_equal : type a b. a typ -> b typ -> (a,b) iseq option = fun a b ->
match a, b with
| Integer, Integer -> Some IsEqual
| Boolean, Boolean -> Some IsEqual
| Arrow(t1,t2), Arrow(u1,u2) ->
begin match is_equal t1 u1, is_equal t2 u2 with
| Some IsEqual, Some IsEqual -> Some IsEqual
| _ -> None
end
| _ -> None
type _ isint = IsInt : int isint
let is_integer : type a. a typ -> a isint option = fun a ->
match a with
| Integer -> Some IsInt
| _ -> None
type _ isbool = IsBool : bool isbool
let is_boolean : type a. a typ -> a isbool option = fun a ->
match a with
| Boolean -> Some IsBool
| _ -> None
type _ context =
| CEmpty : unit context
| CVar : 'a context * 't typ -> ('a*'t) context
type exp =
| Add of exp*exp
| And of exp*exp
| App of exp*exp
| Lam : 'a typ * exp -> exp
| Var0
| VarS of exp
| Int of int
| Bol of bool
type _ exists_texp =
| Exists : ('gamma,'t) texp * 't typ -> 'gamma exists_texp
let rec typecheck
: type gamma t. gamma context -> exp -> gamma exists_texp =
fun ctx e ->
match e with
| Int i -> Exists ((TInt i) , Integer)
| Bol b -> Exists ((TBol b) , Boolean)
| Var0 ->
begin match ctx with
| CEmpty -> failwith "Tried to grab a nonexistent variable"
| CVar(ctx,t) -> Exists (TVar0 , t)
end
| VarS e ->
begin match ctx with
| CEmpty -> failwith "Tried to grab a nonexistent variable"
| CVar(ctx,_) ->
let tet = typecheck ctx e in
begin match tet with
| Exists (te,t) -> Exists ((TVarS te) , t)
end
end
| Lam(t1,e) ->
let tet2 = typecheck (CVar (ctx,t1)) e in
begin match tet2 with
| Exists (te,t2) -> Exists ((TLam te) , (Arrow(t1,t2)))
end
| App(e1,e2) ->
let te1t1 = typecheck ctx e1 in
let te2t2 = typecheck ctx e2 in
begin match te1t1,te2t2 with
| Exists (te1,t1),Exists (te2,t2) ->
begin match t1 with
| Arrow(t11,t12) ->
let p = is_equal t11 t2 in
begin match p with
| Some IsEqual ->
Exists ((TApp (te1,te2)) , t12)
| None ->
failwith "Mismatch of types on a function application"
end
| _ -> failwith "Tried to apply a non-arrow type"
end
end
| Add(e1,e2) ->
let te1t1 = typecheck ctx e1 in
let te2t2 = typecheck ctx e2 in
begin match te1t1,te2t2 with
| Exists (te1,t1),Exists (te2,t2) ->
let p = is_equal t1 t2 in
let q = is_integer t1 in
begin match p,q with
| Some IsEqual, Some IsInt ->
Exists ((TAdd (te1,te2)) , t1)
| _ ->
failwith "Tried to add with something other than Integers"
end
end
| And(e1,e2) ->
let te1t1 = typecheck ctx e1 in
let te2t2 = typecheck ctx e2 in
begin match te1t1,te2t2 with
| Exists (te1,t1),Exists (te2,t2) ->
let p = is_equal t1 t2 in
let q = is_boolean t1 in
begin match p,q with
| Some IsEqual, Some IsBool ->
Exists ((TAnd (te1,te2)) , t1)
| _ ->
failwith "Tried to and with something other than Booleans"
end
end
let e1 = Add(Int 1,Add(Int 2,Int 3))
let e2 = Add(Int 1,Add(Int 2,Bol false))
let e3 = App(Lam(Integer,Add(Var0,Var0)),Int 4)
let e4 = App(App(Lam(Integer,Lam(Integer,Add(Var0,VarS(Var0)))),Int 4),Int 5)
let e5 = Lam(Integer,Lam(Integer,VarS(Var0)))
let e6 = App(Lam(Integer,Var0),Int 1)
let e7 = App(Lam(Integer,Lam(Integer,Var0)),Int 1)
let e8 = Lam(Integer,Var0)
let e9 = Lam(Integer,Lam(Integer,Var0))
let tet1 = typecheck CEmpty e1
(*let tet2 = typecheck CEmpty e2*)
let tet3 = typecheck CEmpty e3
let tet4 = typecheck CEmpty e4
let tet5 = typecheck CEmpty e5
let tet6 = typecheck CEmpty e6
let tet7 = typecheck CEmpty e7
let tet8 = typecheck CEmpty e8
let tet9 = typecheck CEmpty e9
let rec eval : type gamma t. gamma -> (gamma,t) texp -> t = fun env e ->
match e with
| TAdd (e1,e2) ->
let v1 = eval env e1 in
let v2 = eval env e2 in
v1 + v2
| TAnd (e1,e2) ->
let v1 = eval env e1 in
let v2 = eval env e2 in
v1 && v2
| TApp (e1,e2) ->
let v1 = eval env e1 in
let v2 = eval env e2 in
v1 v2
| TLam e ->
fun x -> eval (env,x) e
| TVar0 ->
let (env,x)=env in
x
| TVarS e ->
let (env,x)=env in
eval env e
| TInt i -> i
| TBol b -> b
type exists_v =
| ExistsV : 't -> exists_v
let typecheck_eval e =
let tet = typecheck CEmpty e in
match tet with
| Exists (te,t) -> ExistsV (eval () te)
let v1 = typecheck_eval e1
let v3 = typecheck_eval e3
let v4 = typecheck_eval e4
let v5 = typecheck_eval e5
let v6 = typecheck_eval e6
let v7 = typecheck_eval e7
let v8 = typecheck_eval e8
let v9 = typecheck_eval e9
Here are the pieces I had trouble with and how I managed to resolve them
In order to correctly type the typed expressions texp, the type of the environment needed to be built into the type of texp. This implies, as gasche correctly noted, that we needed some sort of De Bruijin notation. The easiest was just Var0 and VarS. In order to use variable names, we'd just have to preprocess the AST.
The type of the expression, typ, needed to include both variant types to match on as well as the type we use in the typed expression. In other words, that also needed to be a GADT.
We require three proofs in order to ferret out the correct types in the type checker. These are is_equal, is_integer, and is_bool. The code for is_equal is actually in the OCaml manual under Advanced examples. Specifically, look at the definition of eq_type.
The type exp, for the untyped AST, actually needs to be a GADT also. The lambda abstraction needs access to typ, which is a GADT.
The type checker returns an existential type of both a typed expression as well as the type. We need both to get the program to check type. Also, we need the existential because the untyped expression may or may not have a type.
The existential type, exists_texp, exposes the type of the environment/context, but not the type. We need this type exposed in order to type check properly.
Once everything is setup, the evaluator follows the type rules exactly.
The result of combining the type checker with the evaluator must be another existential type. A priori, we don't know the resulting type, so we have to hide it in an existential package.