How to implement lambda-calculus in OCaml?

In OCaml, "fun" looks to me like the binding operator. Does OCaml have built-in substitution? If so, how is it implemented? Does it use de Bruijn indices?
I just want to know how the untyped lambda-calculus can be implemented in OCaml, but I could not find such an implementation.

Like Bromind, I don't exactly understand what you mean by "Does OCaml have built-in substitution?"
As for the lambda-calculus, I'm again not sure what you are asking, but if you are talking about writing some sort of lambda-calculus interpreter, then you first need to define your "syntax":
(* de Bruijn index *)
type index = int
type term =
| Var of index
| Lam of term
| App of term * term
So (λx.x) y becomes (λ 0) 1, which in our syntax is App(Lam (Var 0), Var 1).
Now you need to implement reduction, substitution and so on. For example, you may have something like this:
(* identity substitution: 0 1 2 3 ... *)
let id i = Var i
(* particular case of lift substitution: 1 2 3 4 ... *)
let lift_one i = Var (i + 1)
(* cons substitution: t σ(0) σ(1) σ(2) ... *)
let cons (sigma: index -> term) t = function
| 0 -> t
| x -> sigma (x - 1)
(* by definition of substitution:
1) x[σ] = σ(x)
2) (λ t)[σ] = λ(t[cons(0, (σ; lift_one))])
where (σ1; σ2)(x) = (σ1(x))[σ2]
3) (t1 t2)[σ] = t1[σ] t2[σ] *)
let rec apply_subs (sigma: index -> term) = function
| Var i -> sigma i
| Lam t -> Lam (apply_subs (function
| 0 -> Var 0
| i -> apply_subs lift_one (sigma (i - 1))
) t)
| App (t1, t2) -> App (apply_subs sigma t1, apply_subs sigma t2)
As you can see, the OCaml code is a direct transcription of the definition.
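To see why the Lam case lifts the images of σ, here is a small self-contained sketch that repeats the definitions above and performs one beta step at the root of a term. The helper beta_root and the names k and result are hypothetical, introduced only for this illustration.

```ocaml
(* Terms with de Bruijn indices, as in the answer above. *)
type term =
  | Var of int
  | Lam of term
  | App of term * term

let id i = Var i
let lift_one i = Var (i + 1)

let cons sigma t = function
  | 0 -> t
  | x -> sigma (x - 1)

let rec apply_subs sigma = function
  | Var i -> sigma i
  | Lam t ->
      (* Under a binder: index 0 is kept, every other image is lifted. *)
      let sigma' = function
        | 0 -> Var 0
        | i -> apply_subs lift_one (sigma (i - 1))
      in
      Lam (apply_subs sigma' t)
  | App (t1, t2) -> App (apply_subs sigma t1, apply_subs sigma t2)

(* Hypothetical helper: one beta step at the root of a term, if any. *)
let beta_root = function
  | App (Lam body, arg) -> Some (apply_subs (cons id arg) body)
  | _ -> None

(* k is λx. λy. x; applying it to the free variable 0 gives λy. 1,
   because the free variable is shifted when it moves under the binder. *)
let k = Lam (Lam (Var 1))
let result = beta_root (App (k, Var 0))
```

Without the lifting step, the free variable would be captured by the inner λ and the result would wrongly be λy. y.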
And now small-step reduction:
let is_value = function
| Lam _ | Var _ -> true
| _ -> false
let rec small_step = function
| App (Lam t, v) when is_value v ->
apply_subs (cons id v) t
| App (t, u) when is_value t ->
App (t, small_step u)
| App (t, u) ->
App (small_step t, u)
| t when is_value t -> t
| _ -> failwith "You will never see me"
let rec eval = function
| t when is_value t -> t
| t -> let t' = small_step t in
if t' = t then t
else eval t'
For example you can evaluate (λx.x) y:
eval (App(Lam (Var 0), Var 1))
- : term = Var 1

OCaml does not perform normal-order reduction; it uses call-by-value semantics. Some lambda-calculus terms have a normal form that cannot be reached with this evaluation strategy.
See The Substitution Model of Evaluation, as well as How would you implement a beta-reduction function in F#?.

I don't exactly understand what you mean by "Does OCaml have built-in substitution? ...", but as for how the lambda-calculus can be implemented in OCaml, you can indeed use fun: just replace every lambda with fun, e.g.:
For the Church numerals, you know that zero = \f -> (\x -> x) and one = \f -> (\x -> f x), so in OCaml you'd have
let zero = fun f -> (fun x -> x)
let succ = fun n -> (fun f -> (fun x -> f (n f x)))
and succ zero gives you one, as you would expect, i.e. fun f -> (fun x -> f x) (to see this, you can for instance try (succ zero) (fun s -> "s" ^ s) ("0") or (succ zero) (fun s -> s + 1) (0)).
As far as I remember, you can play with let and fun to change the evaluation strategy, but this remains to be confirmed.
N.B.: I wrote out all the parentheses for clarity; some of them can be removed.
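As a small sketch of the same idea, the numerals can be converted back to ordinary integers to check the encoding. The helpers add and to_int are assumptions added for illustration; they are not part of the answer above.

```ocaml
(* Church numerals written directly with OCaml functions. *)
let zero = fun _f x -> x
let succ n = fun f x -> f (n f x)

(* Hypothetical helpers: addition, and conversion to a native int. *)
let add m n = fun f x -> m f (n f x)
let to_int n = n (fun k -> k + 1) 0

(* one + two, read back as an OCaml int *)
let three = to_int (add (succ zero) (succ (succ zero)))
```

Here three evaluates to 3, since a numeral n simply applies its first argument n times to its second.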


Why does OCaml think that this function takes an int parameter when nothing suggests that it should be the case?

I was working on chapter 1 of Modern Compiler Implementation in ML by Andrew Appel and decided to implement it in OCaml instead of SML. I'm new to OCaml and I came across a very frustrating problem. OCaml seems to think that the function below has the signature int * (int * 'a) list -> 'a option.
let rec lookupTable = function
| name, (i, v) :: _ when name = i -> Some v
| name, (_, _) :: rest -> lookupTable (name, rest)
| _, [] -> None
But as far as I can tell, nothing suggests that the first element of the tuple is an int. This is a problem because when the lookupTable function is used down the line, the compiler complains that I am not passing it an integer. Perhaps I am missing something incredibly obvious, but it has been pretty mind-boggling. Here is the rest of the program:
open Base
type id = string
type binop = Plus | Minus | Times | Div
type stm =
| CompoundStm of stm * stm
| AssignStm of id * exp
| PrintStm of exp list
and exp =
| IdExp of id
| NumExp of int
| OpExp of exp * binop * exp
| EseqExp of stm * exp
(* Returns the maximum number of arguments of any print
statement within any subexpression of a given statement *)
let rec maxargs s =
match s with
| CompoundStm (stm1, stm2) -> Int.max (maxargs stm1) (maxargs stm2)
| AssignStm (_, exp) -> maxargs_exp exp
(* Might be more nested expressions *)
| PrintStm exps -> Int.max (List.length exps) (maxargs_explist exps)
and maxargs_exp e = match e with EseqExp (stm, _) -> maxargs stm | _ -> 0
and maxargs_explist exps =
match exps with
| exp :: rest -> Int.max (maxargs_exp exp) (maxargs_explist rest)
| [] -> 0
type table = (id * int) list
let updateTable name value t : table = (name, value) :: t
let rec lookupTable = function
| name, (i, v) :: _ when name = i -> Some v
| name, (_, _) :: rest -> lookupTable (name, rest)
| _, [] -> None
exception UndefinedVariable of string
let rec interp s =
let t = [] in
interpStm s t
and interpStm s t =
match s with
| CompoundStm (stm1, stm2) -> interpStm stm2 (interpStm stm1 t)
| AssignStm (id, exp) ->
let v, t' = interpExp exp t in
updateTable id v t'
(* Might be more nested expressions *)
| PrintStm exps ->
let interpretAndPrint t e =
let v, t' = interpExp e t in
Stdio.print_endline (Int.to_string v);
t'
in
List.fold_left exps ~init:t ~f:interpretAndPrint
and interpExp e t =
match e with
| IdExp i -> (
match lookupTable (i, t) with
| Some v -> (v, t)
| None -> raise (UndefinedVariable i))
| NumExp i -> (i, t)
| OpExp (exp1, binop, exp2) ->
let exp1_val, t' = interpExp exp1 t in
let exp2_val, _ = interpExp exp2 t' in
let res =
match binop with
| Plus -> exp1_val + exp2_val
| Minus -> exp1_val - exp2_val
| Times -> exp1_val * exp2_val
| Div -> exp1_val / exp2_val
in
(res, t')
| EseqExp (s, e) -> interpExp e (interpStm s t)
Base defines = as int -> int -> bool, so when you write the expression name = i, the compiler infers both operands to be ints.
You can access the polymorphic functions and operators through the Poly module, or use a type-specific operator by locally opening the relevant module, e.g. String.(name = i).
The reason Base does not expose polymorphic operators by default is briefly explained in the documentation's introduction:
The comparison operators exposed by the OCaml standard library are polymorphic:
What they implement is structural comparison of the runtime representation of values. Since these are often error-prone, i.e., they don't correspond to what the user expects, they are not exposed directly by Base.
There is also a performance argument to be made, because the polymorphic/structural operators also need to inspect what kind of value they are given at runtime in order to compare correctly.
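The inference effect can be reproduced without installing Base. The sketch below is only a stand-in: the Int_only module is hypothetical and mimics, under that assumption, how an int-only = changes the inferred type, while String.equal from the standard library plays the role of a type-specific comparison.

```ocaml
(* Hypothetical stand-in for the effect of `open Base` on (=):
   equality restricted to ints. *)
module Int_only = struct
  let ( = ) : int -> int -> bool = Stdlib.( = )
end

(* With the int-only (=), the key is forced to be an int:
   lookup_int : int * (int * 'a) list -> 'a option *)
let rec lookup_int = function
  | name, (i, v) :: _ when Int_only.(name = i) -> Some v
  | name, _ :: rest -> lookup_int (name, rest)
  | _, [] -> None

(* With a string-specific comparison, the key is a string:
   lookup_string : string * (string * 'a) list -> 'a option *)
let rec lookup_string = function
  | name, (i, v) :: _ when String.equal name i -> Some v
  | name, _ :: rest -> lookup_string (name, rest)
  | _, [] -> None

let found = lookup_string ("x", [ ("y", 1); ("x", 2) ])
```

In the real program, the same choice is made by writing String.(name = i) or by going through Poly, as described above.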

How to define "apply" in OCaml

I am trying to define a function that is similar to Lisp's apply. Here is my attempt:
type t =
| Str of string
| Int of int
let rec apply f args =
match args with
| (Str s)::xs -> apply (f s) xs
| (Int i)::xs -> apply (f i) xs
| [] -> f
(* Example 1 *)
let total = apply (fun x y z -> x + y + z)
[Int 1; Int 2; Int 3]
(* Example 2 *)
let () = apply (fun name age ->
Printf.printf "Name: %s\n" name;
Printf.printf "Age: %i\n" age)
[Str "Bob"; Int 99]
However, this fails to compile. The compiler gives this error message:
File "./", line 7, characters 25-30:
7 | | (Str s)::xs -> apply (f s) xs
Error: This expression has type 'a but an expression was expected of type
string -> 'a
The type variable 'a occurs inside string -> 'a
What is the meaning of this error message? How can I fix the problem and implement apply?
You cannot mix an untyped DSL for data:
type t =
| Str of string
| Int of int
and a shallow embedding (using OCaml functions as functions inside the DSL) for functions in apply
let rec apply f args =
match args with
| (Str s)::xs -> apply (f s) xs (* f is string -> 'a *)
| (Int i)::xs -> apply (f i) xs (* f is int -> 'a *)
| [] -> f (* f is 'a *)
The typechecker is complaining that if f has type 'a, then f s cannot also have type 'a, since that would mean f has both type string -> 'a and type 'a at the same time (which is only accepted with the recursive types flag, -rectypes).
More generally, your function apply doesn't use f at a coherent type: sometimes it has type 'a, sometimes int -> 'a, and at other times string -> 'a. In other words, it is not possible to write a type for apply:
val apply: ??? (* (int|string) -> ... *) -> t list -> ???
You have to choose your poison.
Either go with a fully untyped DSL which contains functions, that can be applied:
type t =
| Int of int
| Float of float
| Fun of (t -> t)
exception Type_error
let rec apply_dsl f l = match f, l with
| _, [] -> f
| Fun f, a :: q -> apply_dsl (f a) q
| (Int _|Float _), _ :: _ -> raise Type_error
or use the OCaml type system and define a well-typed list of arguments with a GADT:
type ('a,'b) t =
| Nil: ('a,'a) t
| Cons: 'a * ('b,'r) t -> ('a -> 'b,'r) t
let rec apply: type f r. f -> (f,r) t -> r = fun f l ->
match l with
| Nil -> f
| Cons (x,l) -> apply (f x) l
Using the GADT solution is quite direct, since we are using ordinary OCaml types without much wrapping:
let three = apply (+) (Cons(1, Cons(2,Nil)))
(and we could use a heterogeneous list syntactic sugar to make this form even lighter syntactically)
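One possible form of that syntactic sugar is sketched below, assuming a compiler recent enough to allow redefining the [] and (::) constructors. The module name Args and the value names are hypothetical; the local open Args.[ ... ] keeps ordinary lists unaffected elsewhere.

```ocaml
module Args = struct
  (* Redefining [] and (::) lets ordinary list syntax build this GADT. *)
  type ('f, 'r) t =
    | [] : ('r, 'r) t
    | ( :: ) : 'a * ('f, 'r) t -> ('a -> 'f, 'r) t

  (* Apply f to each argument in turn; the types track what is left. *)
  let rec apply : type f r. f -> (f, r) t -> r =
   fun f args ->
    match args with
    | [] -> f
    | x :: rest -> apply (f x) rest
end

(* Homogeneous arguments *)
let three = Args.apply ( + ) Args.[ 1; 2 ]

(* Heterogeneous arguments: a string followed by an int *)
let greeting = Args.apply (Printf.sprintf "%s is %d") Args.[ "Bob"; 99 ]
```

Passing too many arguments, or arguments of the wrong type, is rejected at compile time rather than raising an exception at runtime.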
The untyped DSL requires first building a function in the DSL:
let plus = Fun(function
| Float _ | Fun _ -> raise Type_error
| Int x -> Fun(function
| Float _ | Fun _ -> raise Type_error
| Int y -> Int (x+y)))
but once we have built the function, it is relatively straightforward:
let three = apply_dsl plus [Int 2; Int 1]
type t =
| Str of string
| Int of int
| Unit
let rec apply f args =
match args with
| x::xs -> apply (f x) xs
| [] -> f Unit
Let's go through your code step by step:
line 1: apply : 'a -> 'b -> 'c (we don't know the types of f, args, or apply's return value yet)
line 2 and beginning of line 3: args : t list, so apply : 'a -> t list -> 'c
rest of line 3: since f is applied to s (with s : string), f : string -> 'd for some 'd. But f s is then passed as the first argument of apply, so f s must have the same type as f. This means that the type of f contains itself, which the typechecker rejects.
It is also an error to call f on both s and i, because that would mean f can take either a string or an int, which the compiler does not allow.
Lastly, if args is empty you return f, so the return type of apply must be the type of f itself, which is another problem with this code.
Looking at your examples, a simple solution would be:
type t = Str of string | Int of int
let rec apply f acc args =
match args with x :: xs -> apply f (f acc x) xs | [] -> acc
(* Example 1 *)
let total =
apply
(fun acc x ->
match x with Int d -> d + acc | Str _ -> failwith "Type error")
0 [ Int 1; Int 2; Int 3 ]
(* Example 2 *)
let () =
apply
(fun () -> function
| Str name -> Printf.printf "Name: %s\n" name
| Int age -> Printf.printf "Age: %i\n" age)
() [ Str "Bob"; Int 99 ]
Since you know the types you want to work with, you don't need GADT shenanigans: just let f handle the pattern matching and work with an accumulator.

GADTs for Representing Function Application with Multiple Parameters (AST)

I saw this example in the OCaml manual, which uses a GADT for an AST with function application:
type _ term =
| Int : int -> int term
| Add : (int -> int -> int) term
| App : ('b -> 'a) term * 'b term -> 'a term
let rec eval : type a. a term -> a = function
| Int n -> n
| Add -> (fun x y -> x+y)
| App(f,x) -> (eval f) (eval x)
Is this the right way of representing function application for a language not supporting partial application?
Is there a way to make a GADT supporting function application with an arbitrary number of arguments?
Finally, is GADT a good way to represent a typed AST? Is there any alternative?
Well, partial application already works here:
# eval (App(App(Add, Int 3),Int 4)) ;;
- : int = 7
# eval (App(Add, Int 3)) ;;
- : int -> int = <fun>
# eval (App(Add, Int 3)) 4 ;;
- : int = 7
What you don't have in this small GADT is abstraction (lambdas), but it's definitely possible to add it.
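One common way to add abstraction is higher-order abstract syntax, where a lambda is represented by an OCaml function over terms. The sketch below extends the manual's example under that assumption; the constructors Lift and Lam and the names incr and forty_two are hypothetical additions, not part of the manual.

```ocaml
type _ term =
  | Int : int -> int term
  | Add : (int -> int -> int) term
  | App : ('b -> 'a) term * 'b term -> 'a term
  (* Hypothetical: injects an already evaluated value back into a term. *)
  | Lift : 'a -> 'a term
  (* Hypothetical: a lambda, represented by an OCaml function on terms. *)
  | Lam : ('a term -> 'b term) -> ('a -> 'b) term

let rec eval : type a. a term -> a = function
  | Int n -> n
  | Add -> fun x y -> x + y
  | App (f, x) -> (eval f) (eval x)
  | Lift v -> v
  | Lam body -> fun x -> eval (body (Lift x))

(* λx. x + 1, then applied to 41 *)
let incr = Lam (fun x -> App (App (Add, x), Int 1))
let forty_two = eval (App (incr, Int 41))
```

The host language handles variable binding here, so no substitution function is needed; the trade-off is that such terms are harder to inspect or transform.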
If you are interested in the topic, there is an abundant (academic) literature. This paper presents various encodings that support partial evaluation.
There are also non-GADT solutions, as shown in this paper.
In general, GADTs are a very interesting way to represent evaluators. They tend to fall a bit short when you try to transform the AST for compilation (but there are ways).
Also, keep in mind that you are encoding the type system of the language you are defining in your host language, which means that you need an encoding of each type feature you want. Sometimes, it's tricky.
Edit: A way to have a GADT that does not support partial application is to have a special value type that does not contain functions and a "functional value" type with functions. Taking the simplest representation from the first paper, we can modify it this way:
type _ v =
| Int : int -> int v
| String : string -> string v
and _ vf =
| Base : 'a v -> ('a v) vf
| Fun : ('a vf -> 'b vf) -> ('a -> 'b) vf
and _ t =
| Val : 'a vf -> 'a t
| Lam : ('a vf -> 'b t) -> ('a -> 'b) t
| App : ('a -> 'b) t * 'a t -> 'b t
let get_val : type a . a v -> a = function
| Int i -> i
| String s -> s
let rec reduce : type a . a t -> a vf = function
| Val x -> x
| Lam f -> Fun (fun x -> reduce (f x))
| App (f, x) -> let Fun f = reduce f in f (reduce x)
let eval t =
let Base v = reduce t in get_val v
(* Perfectly defined expressions. *)
let f = Lam (fun x -> Lam (fun y -> Val x))
let t = App (f, Val (Base (Int 3)))
(* We can reduce t to a functional value. *)
let x = reduce t
(* But we can't eval it, it's a type error. *)
let y = eval t
(* HOF are authorized. *)
let app = Lam (fun f -> Lam (fun y -> App(Val f, Val y)))
You can make this arbitrarily more complicated, depending on your needs; the important property is that the 'a v type can't produce functions.

Function with unknown value works. OCaml

I can't understand some things in my code. It is an OCaml program that generates all distinct pairs from the elements of a list. Here's my code:
let rec tmp f list x =
match list with
| [] -> x
| h :: t -> f h (tmp f t x);;
(* ^ ^ (^ ) (1) *)
let rec distinctpairs lst =
match lst with
| [] -> []
| h :: t -> tmp ( fun x lt -> (h,x)::lt) t (distinctpairs t);;
(* ^ ^ (2) *)
Does the function tmp return three values?
How can I give an argument to func when I don't know what x is?
If I assume that tmp returns three values, then why does it work when I pass (fun x lt -> (h,x)::lt) as an argument to tmp?
1. Does the function tmp return three values?
let rec tmp f list x =
match list with
| [] -> x
| h :: t -> f h (tmp f t x);;
The simple answer to this question is no.
f h (tmp f t x) is not three values; it is the application of the function f to two arguments, h and (tmp f t x).
2. How can I give an argument to func when I don't know what x is?
let rec distinctpairs lst =
match lst with
| [] -> []
| h :: t -> tmp ( fun x lt -> (h,x)::lt) t (distinctpairs t);;
The truth is that you do know x: x is a parameter of the anonymous function fun x lt -> (h, x)::lt.
If I assume that tmp returns three values, then why does it work when I pass (fun x lt -> (h,x)::lt) as an argument to tmp?
First of all, when OCaml sees tmp f list x, it knows nothing except that tmp accepts 3 parameters.
When OCaml reaches | [] -> x, it knows that, whatever the type of x is, tmp returns that same type.
When OCaml reaches | h :: t -> f h (tmp f t x), it knows that f must be a function of 2 parameters: one with the type of h and one with the type of x.
Then, in your distinctpairs function, (fun x lt -> (h,x)::lt) is an anonymous function that matches exactly this inferred type.
A better way to write the two functions:
let rec tmp f x = function
| [] -> x
| h :: t -> f h (tmp f x t)
let rec distinctpairs = function
| [] -> []
| h :: t -> tmp (fun x lt -> (h,x)::lt) (distinctpairs t) t
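It may also help to notice that the original tmp f list x behaves like the standard library's List.fold_right f list x. A minimal sketch using it directly follows; the names distinct_pairs and pairs are chosen here for illustration.

```ocaml
(* For each head h, pair it with every later element, then prepend those
   pairs to the pairs generated from the tail. *)
let rec distinct_pairs = function
  | [] -> []
  | h :: t ->
      List.fold_right (fun x acc -> (h, x) :: acc) t (distinct_pairs t)

let pairs = distinct_pairs [ 1; 2; 3 ]
```

For the list [1; 2; 3] this yields [(1, 2); (1, 3); (2, 3)].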
I also suggest reading the book Real World OCaml. It is the newest and most comprehensive book on OCaml, and it is good.
When you try to enter the world of functional programming, there is no shortcut. It is not like learning Spanish as an English speaker; it is more like learning Chinese or Japanese as an English speaker.
The whole idea is quite different from Java, C# or C, and of course much better than Java (my personal feeling). So I suggest you learn it from the ground up.

How can I skip a term with List.map in OCaml?

Suppose I have some code like this: (fun e -> if (e <> 1) then e + 1 else (*add nothing to the list*))
Is there a way to do this? If so, how?
I want to both manipulate the item if it matches some criteria and ignore it if it does not. Thus List.filter wouldn't seem to be the solution.
SML has a function mapPartial which does exactly this. Sadly this function does not exist in OCaml. However you can easily define it yourself like this:
let map_partial f xs =
let prepend_option x xs = match x with
| None -> xs
| Some x -> x :: xs in
List.rev (List.fold_left (fun acc x -> prepend_option (f x) acc) [] xs)
map_partial (fun x -> if x <> 1 then Some (x+1) else None) [0;1;2;3]
will return [1;3;4].
Or you can use filter_map from extlib as ygrek pointed out.
Both Batteries and Extlib provide an equivalent of mapPartial: their extended List modules provide a filter_map function of type ('a -> 'b option) -> 'a list -> 'b list, which lets the mapping function select items as well.
Another solution would be to use a left fold directly (the accumulator is built in reverse order, hence the final List.rev):
let f l e = if (e <> 1)
then (e + 1)::l
else l
in List.rev (List.fold_left f [] list)
But my preference is filter_map, as Michael Ekstrand suggested.
Alternatively, you can filter your list and then apply the map to the resulting list, as follows:
# let map_bis predicate map_function lst = List.map map_function (List.filter predicate lst);;
val map_bis : ('a -> bool) -> ('a -> 'b) -> 'a list -> 'b list = <fun>
Usage:
# map_bis (fun e -> e<>1) (fun e -> e+1) [0;1;2;3];;
- : int list = [1; 3; 4]
You can also map each value to a singleton list if you want to keep it, or to an empty list if you don't, and then concatenate the results:
List.concat (List.map (fun e -> if (e <> 1) then [e + 1] else []) my_list)
let rec process = function
| 1 :: t -> process t
| h :: t -> (h + 1) :: (process t)
| [] -> []
or tail recursive
let process =
let rec f acc = function
| 1 :: t -> f acc t
| h :: t -> f ((h + 1) :: acc) t
| [] -> List.rev acc in
f []
or with a composition of standard functions
let process l =
l |> List.filter ((<>)1)
|> List.map ((+) 1)
The OCaml standard library has had List.filter_map since 4.08. This can therefore now be written as:
List.filter_map (fun e -> if e <> 1 then Some (e + 1) else None)
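For completeness, here is a minimal usage sketch of that function on the list from the earlier answers. It assumes OCaml 4.08 or later; the names bump_unless_one and result are illustrative only.

```ocaml
(* Drop every 1 and increment everything else, in a single pass. *)
let bump_unless_one xs =
  List.filter_map (fun e -> if e <> 1 then Some (e + 1) else None) xs

let result = bump_unless_one [ 0; 1; 2; 3 ]
```

As with map_partial above, result is [1; 3; 4].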