I saw in the OCaml manual this example to use GADT for an AST with function application:
type _ term =
| Int : int -> int term
| Add : (int -> int -> int) term
| App : ('b -> 'a) term * 'b term -> 'a term
let rec eval : type a. a term -> a = function
| Int n -> n
| Add -> (fun x y -> x+y)
| App(f,x) -> (eval f) (eval x)
Is this the right way of representing function application for a language not supporting partial application?
Is there a way to make a GADT supporting function application with an arbitrary number of arguments?
Finally, is GADT a good way to represent a typed AST? Is there any alternative?
Well, partial eval already works here:
# eval (App(App(Add, Int 3),Int 4)) ;;
- : int = 7
# eval (App(Add, Int 3)) ;;
- : int -> int = <fun>
# eval (App(Add, Int 3)) 4 ;;
- : int = 7
What you don't have in this small gadt is abstraction (lambdas), but it's definitely possible to add it.
If you are interested in the topic, there is an abundant (academic) literature. This paper presents various encoding that supports partial evaluation.
There are also non-Gadt solutions, as shown in this paper.
In general, GADT are a very interesting way to represent evaluators. They tend to fall a bit short when you try to transform the AST for compilations (but there are ways).
Also, you have to keep in mind that you are encoding the type system of the language you are defining in your host language, which means that you need an encoding of the type feature you want. Sometimes, it's tricky.
Edit: A way to have a GADT not supporting partial eval is to have a special value type not containing functions and a "functional value" type with functions. Taking the simplest representation of the first paper, we can modify it that way:
type _ v =
| Int : int -> int v
| String : string -> string v
and _ vf =
| Base : 'a v -> ('a v) vf
| Fun : ('a vf -> 'b vf) -> ('a -> 'b) vf
and _ t =
| Val : 'a vf -> 'a t
| Lam : ('a vf -> 'b t) -> ('a -> 'b) t
| App : ('a -> 'b) t * 'a t -> 'b t
let get_val : type a . a v -> a = function
| Int i -> i
| String s -> s
let rec reduce : type a . a t -> a vf = function
| Val x -> x
| Lam f -> Fun (fun x -> reduce (f x))
| App (f, x) -> let Fun f = reduce f in f (reduce x)
let eval t =
let Base v = reduce t in get_val v
(* Perfectly defined expressions. *)
let f = Lam (fun x -> Lam (fun y -> Val x))
let t = App (f, Val (Base (Int 3)))
(* We can reduce t to a functional value. *)
let x = reduce t
(* But we can't eval it, it's a type error. *)
let y = eval t
(* HOF are authorized. *)
let app = Lam (fun f -> Lam (fun y -> App(Val f, Val y)))
You can make that arbitrarly more complicated, following your needs, the important property is that the 'a v type can't produce functions.
Related
I have the following peano number written with GADTs:
type z = Z of z
type 'a s = Z | S of 'a
type _ t = Z : z t | S : 'n t -> 'n s t
module T = struct
type nonrec 'a t = 'a t
end
type 'a nat = 'a t
type e = T : 'n nat -> e
The following function to decode a 'a nat (or 'a t) into the number it encoded, works:
let to_int : type n. n t -> int =
let rec go : type n. int -> n t -> int =
fun acc n -> match n with Z -> acc | S n -> go (acc + 1) n
in
fun x -> go 0 x
but if I try to rewrite it almost exactly the same this way:
let to_int2 (type a) (a: a nat) : int =
let rec go (type a) (acc : int) (x : a nat) : int =
match x with
| Z -> acc
| S v -> go (acc + 1) v
in
go 0 a
I get a scope error. What's the difference between the two functions?
138 | | S v -> go (acc + 1) v
^
Error: This expression has type $0 t but an expression was expected of type
'a
The type constructor $0 would escape its scope
The root issue is polymorphic recursion, GADTs are a red herring here.
Without an explicit annotation, recursive functions are not polymorphic in their own definition.
For instance, the following function has type int -> int
let rec id x =
let _discard = lazy (id 0) in
x;;
because id is not polymorphic in
let _discard = lazy (id 0) in
and thus id 0 implies that the type of id is int -> 'a which leads to id having type int -> int.
In order to define polymorphic recursive function, one need to add an explicit universally quantified annotation
let rec id : 'a. 'a -> 'a = fun x ->
let _discard = lazy (id 0) in
x
With this change, id recovers its expected 'a -> 'a type.
This requirement does not change with GADTs. Simplifying your code
let rec to_int (type a) (x : a nat) : int =
match x with
| Z -> 0
| S v -> 1 + to_int v
the annotation x: a nat implies that the function to_int only works with a nat, but you are applying to an incompatible type (and ones that lives in a too narrow scope but that is secondary).
Like in the non-GADT case, the solution is to add an explicit polymorphic annotation:
let rec to_int: 'a. 'a nat -> int = fun (type a) (x : a nat) ->
match x with
| Z -> 0
| S v -> 1 + to_int v
Since the form 'a. 'a nat -> int = fun (type a) (x : a nat) -> is both a mouthful and quite often needed with recursive function on GADTs, there is a shortcut notation available:
let rec to_int: type a. a nat -> int = fun x ->
match x with
| Z -> 0
| S v -> 1 + to_int v
For people not very familiar with GADTs, this form is the one to prefer whenever one write a GADT function. Indeed, not only this avoids the issue with polymorphic recursion, writing down the explicit type of a function before trying to implement it is generally a good idea with GADTs.
See also https://ocaml.org/manual/polymorphism.html#s:polymorphic-recursion , https://ocaml.org/manual/gadts-tutorial.html#s%3Agadts-recfun , and https://v2.ocaml.org/manual/locallyabstract.html#p:polymorpic-locally-abstract .
I am trying to define a function that is similar to Lisp's apply. Here is my attempt:
type t =
| Str of string
| Int of int
let rec apply f args =
match args with
| (Str s)::xs -> apply (f s) xs
| (Int i)::xs -> apply (f i) xs
| [] -> f
(* Example 1 *)
let total = apply (fun x y z -> x + y + z)
[Int 1; Int 2; Int 3]
(* Example 2 *)
let () = apply (fun name age ->
Printf.printf "Name: %s\n" name;
Printf.printf "Age: %i\n" age)
[Str "Bob"; Int 99]
However, this fails to compile. The compiler gives this error message:
File "./myprog.ml", line 7, characters 25-30:
7 | | (Str s)::xs -> apply (f s) xs
^^^^^
Error: This expression has type 'a but an expression was expected of type
string -> 'a
The type variable 'a occurs inside string -> 'a
What is the meaning of this error message? How can I fix the problem and implement apply?
You cannot mix an untyped DSL for data:
type t =
| Int of int
| Float of float
and a shallow embedding (using OCaml functions as functions inside the DSL) for functions in apply
let rec apply f args =
match args with
| (Str s)::xs -> apply (f s) xs (* f is int -> 'a *)
| (Int i)::xs -> apply (f i) xs (* f is string -> 'a *)
| [] -> f (* f is 'a *)
The typechecker is complaining that if f has type 'a, f s cannot also have for type 'a since it would mean that f has simultaneously type string -> 'a and 'a (without using the recursive types flag).
And more generally, your function apply doesn't use f with a coherent type: sometimes it has type 'a, sometimes it has type int -> 'a, other times it would rather have type string -> 'a. In other words, it is not possible to write a type for apply
val apply: ??? (* (int|string) -> ... *) -> t list -> ???
You have to choose your poison.
Either go with a fully untyped DSL which contains functions, that can be applied:
type t =
| Int of int
| Float of float
| Fun of (t -> t)
exception Type_error
let rec apply f l = match f, l with
| x, [] -> f
| Fun f, a :: q -> apply (f a) q
| (Int _|Float _), _ :: _ -> raise Type_error
or use OCaml type system and define a well-typed list of arguments with a GADT:
type ('a,'b) t =
| Nil: ('a,'a) t
| Cons: 'a * ('b,'r) t -> ('a -> 'b,'r) t
let rec apply: type f r. f -> (f,r) t -> r = fun f l ->
match l with
| Nil -> f
| Cons (x,l) -> apply (f x) l
EDIT:
Using the GADT solution is quite direct since we are using usual OCaml type without much wrapping:
let three = apply (+) (Cons(1, Cons(2,Nil)))
(and we could use a heterogeneous list syntactic sugar to make this form even lighter syntactically)
The untyped DSL requires to build first a function in the DSL:
let plus = Fun(function
| Float _ | Fun _ -> raise Type_error
| Int x -> Fun(function
| Float _ | Fun _ -> raise Type_error
| Int y -> Int (x+y)
)
)
but once we have built the function, it is relatively straightforward:
let three = apply_dsl plus [Int 2; Int 1]
type t =
| Str of string
| Int of int
| Unit
let rec apply f args =
match args with
| x::xs -> apply (f x) xs
| [] -> f Unit
Let's go step by step:
line 1: apply : 'a -> 'b -> 'c (we don't know the types of f, args and apply's return type
line 2 and beginning of line 3: args : t list so apply : 'a -> t list -> 'c
rest of line 3: Since f s (s : string), f : string -> 'a but f t : f because apply (f s). This means that f contains f in its type, this is a buggy behaviour
It's actually buggy to call f on s and i because this means that f can take a string or an int, the compiler will not allow it.
And lastly, if args is empty, you return f so the return type of f is the type of f itself, another buggy part of this code.
Looking at your examples, a simple solution would be:
type t = Str of string | Int of int
let rec apply f acc args =
match args with x :: xs -> apply f (f acc x) xs | [] -> acc
(* Example 1 *)
let total =
apply
(fun acc x ->
match x with Int d -> d + acc | Str _ -> failwith "Type error")
0 [ Int 1; Int 2; Int 3 ]
(* Example 2 *)
let () =
apply
(fun () -> function
| Str name -> Printf.printf "Name: %s\n" name
| Int age -> Printf.printf "Age: %i\n" age)
() [ Str "Bob"; Int 99 ]
Since you know the type you want to work on, you don't need GADT shenanigans, just let f handle the pattern matching and work with an accumulator
I found myself wanting a way to do some codegen around some large variant types in my code, and I found ppx_variants_conv (https://github.com/janestreet/ppx_variants_conv)
The make_matcher method sounds potentially useful to me, but there are no docs and tbh I am struggling to read the example signature:
val make_matcher :
a:(('a -> 'a t) Variant.t -> 'b -> ('c -> 'd) * 'e)
-> b:((char -> 'f t) Variant.t -> 'e -> (char -> 'd) * 'g)
-> c:('h t Variant.t -> 'g -> (unit -> 'd) * 'i)
-> d:((int -> int -> 'j t) Variant.t -> 'i -> (int -> int -> 'd) * 'k)
-> 'b
-> ('c t -> 'd) * 'k
a b c and d labelled arguments correspond to the cases of the variant and the first part of each signature corresponds to the constructor for each case... I get a bit lost after that 🤔
and it seems a odd to me that 'b 'c 'd and 'k appear in the latter part of the signature but not 'e 'f 'g 'h 'i 'j
In the make_matcher function each labeled argument takes a function of two arguments that has a general form, fun v x -> f, y, where v is the first-class variant that represents the corresponding constructor, x is the value that is folded over all matchers-generating functions. The function returns a pair, in which the first constituent, the function f, is actually the matcher that will be called if that variant matches and some value y that will be passed to the next matcher-generating function.
Let's do some examples to illustrate this. First, let's define some simple matcher, e.g.,
type 'a t =
| A of 'a
| B of char
| C
| D of int * int
[##deriving variants]
let matcher init = Variants.make_matcher
~a:(fun v (x1 : char) ->
(fun x -> x+1),Char.code x1)
~b:(fun v (x2 : int) ->
(fun c -> Char.code c),float x2)
~c:(fun v (x3 : float) ->
(fun () -> 0), string_of_float x3)
~d:(fun v (x4 : string) ->
(fun x y -> x + y),[x4])
init
and here's how we can use, it,
# let f,s = matcher '*';;
val f : int t -> int = <fun>
val s : Base.string list = ["42."]
# f (A 10);;
- : int = 11
# f (B '*');;
- : int = 42
# f C;;
- : int = 0
# f (D (1,2));;
- : int = 3
To be honest, I don't know the purpose of the extra parameter that is passed to each matcher-generating function1. Probably, the idea is that depending on the initial parameter we could generate different matchers. But if you don't need this, then just pass () to it and/or define your own simplified matcher that ignores this additional information, e.g.,
let make_simple_matcher ~a ~b ~c ~d =
fst ##Variants.make_matcher
~a:(fun _ _ -> a,())
~b:(fun _ _ -> b,())
~c:(fun _ _ -> c,())
~d:(fun _ _ -> d,())
()
The make_simple_matcher function has an expected type,
a:('a -> 'b) ->
b:(char -> 'b) ->
c:(unit -> 'b) ->
d:(int -> int -> 'b) ->
'a t -> 'b
1) looking into the guts of the code that generates this function doesn't help a lot, as they use the generic name acc for this parameter, which is not very helpful.
let app = fun f -> fun x -> f (x);;
(*val app : ('a -> 'b) -> 'a -> 'b = <fun>*)
let app = fun f -> fun x -> x (f+1);;
(*val app : int -> (int -> 'a) -> 'a = <fun>*)
let app = fun f -> fun x -> x (f);;
(*val app : 'a -> ('a -> 'b) -> 'b = <fun>*)
let app2 = fun f -> fun g -> fun x -> g ( f x );;
(*val app2 : ('a -> 'b) -> ('b -> 'c) -> 'a -> 'c = <fun>*)
let app2 = fun f -> fun g -> fun x -> f (g x );;
(*val app2 : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b = <fun>*)
let app2 = fun f -> fun g -> fun x -> g (f x+1);;
(*val app2 : ('a -> int) -> (int -> 'b) -> 'a -> 'b = <fun>*)
let app2 = fun f -> fun g -> fun x -> f (g x+1 );;
(*val app2 : (int -> 'a) -> ('b -> int) -> 'b -> 'a = <fun>*)
let app3 = fun f -> fun g -> fun x -> g f x+1 ;;
(*val app3 : 'a -> ('a -> 'b -> int) -> 'b -> int = <fun>*)
How do I know what happens in the val line without tapping enter on the let app line.
To be more specific I dont understand the relation between let and val.
For example, for the first one:
let app = fun f -> fun x -> f (x);;
(*val app : ('a -> 'b) -> 'a -> 'b = <fun>*)
What I see is that the first x takes the name 'a
and after it goes to the f which takes the name 'b
and after it goes to the fun x which is 'a again
and after it goes to the fun y which is 'b again.
But it is clearly not the case with other functions.
How do I know how they are related?
Someone has put the types of these functions in comments below the actual code. Comments in OCaml are surrounded by (* and *).
These type signatures are identical to what you'd see in an OCaml toplevel. You enter an expression, followed by ;;, this code is evaluated, and you're shown the resulting value.
Consider:
# 4 + 5;;
- : int = 9
# let a = 4 + 5;;
val a : int = 9
#
The val only shows up if I've introduced a name for the value. In either case we're shown the type of the resulting value and the value itself.
When you see things like 'a and 'b you're seeing type variables that OCaml infers when it can't infer a specific type. This very recent question is worth reading.
let is used to create name-bindings(alternatively named definitions if you would like to call it), where a binding will have a name and will be bound to an expression, which when evaluated will produce a value.
fun is used to create/define anonymous function, Anonymous functions are function expression which means an expression of type function, which when evaluated given its arguments will produce a value, and this function expression can itself be passed around like any other regular value.. and this function expression is bound using a let.
For example
let app = fun f -> fun x -> x (f+1)
In this case, we are defining a name app which will be bound to the expression produced by fun on the right. The fun takes a parameter f whose type will be inferred by the compiler once it is done processing the body of that function expression. The same applies to the next function expression which takes a parameter named x.
By looking at the right-most expression, which is responsible for generating the reduced value, we see f getting applied to the operator + and so f will receive it's type from that expression. And then x (...) entails a function application, so the x will receive it's type from that expression.
And we can go on applying the same to every expression cited in the question.
Essentially, let bindings, lambda expressions, type inference, function application, function expression etc... are some of the topics one should get acquainted with to understand what's going on.
In the basic example of GADT from§7.20 of ocaml manual, what is the meaning of 'type a.' ?
Why declaring "eval : a term -> a" is not enough ?
type _ term =
| Int : int -> int term
| Add : (int -> int -> int) term
| App : ('b -> 'a) term * 'b term -> 'a term
let rec eval : type a. a term -> a = function
| Int n -> n (* a = int *)
| Add -> (fun x y -> x+y) (* a = int -> int -> int *)
| App(f,x) -> (eval f) (eval x)
Jacque's slide on ML'2011 workshop has a nice introduction. The idea to use syntax of locally abstract type to introduce universal expression-scoped variable.