Sneaking lenses and CPS past the value restriction - ocaml

I'm encoding a form of van Laarhoven lenses in OCaml, but am having difficulty due to the value restriction.
The relevant code is as follows:
module Optic : sig
type (-'s, +'t, +'a, -'b) t
val lens : ('s -> 'a) -> ('s -> 'b -> 't) -> ('s, 't, 'a, 'b) t
val _1 : ('a * 'x, 'b * 'x, 'a, 'b) t
end = struct
type (-'s, +'t, +'a, -'b) t =
{ op : 'r . ('a -> ('b -> 'r) -> 'r) -> ('s -> ('t -> 'r) -> 'r) }
let lens get set =
let op cont this read = cont (get this) (fun b -> read (set this b))
in { op }
let _1 = let build (_, b) a = (a, b) in lens fst build
end
Here I am representing a lens as a higher order type, a transformer of CPS-transformed functions ('a -> 'b) -> ('s -> 't) (as was suggested here and discussed here). The functions lens, fst, and build all have fully generalized types but their composition lens fst build does not.
Error: Signature mismatch:
...
Values do not match:
val _1 : ('_a * '_b, '_c * '_b, '_a, '_c) t
is not included in
val _1 : ('a * 'x, 'b * 'x, 'a, 'b) t
As shown in the gist, it's perfectly possible to write _1
let _1 = { op = fun cont (a, x) read -> cont a (fun b -> read (b, x)) }
but having to manually construct these lenses each time is tedious and it would be nice to build them using higher order functions like lens.
Is there any way around the value restriction here?

The value restriction is a limitation of the OCaml type system that prevents some polymorphic values from being generalized, i.e. having a type that is universally quantified over all type variables. This is done to preserve soundness of the type system in the presence of mutable references and side effects.
In your case, the value restriction applies to the _1 value, which is defined as the result of applying the lens function to two other functions, fst and build. The lens function is polymorphic, but its result is not, because it depends on the type of the arguments it receives. Therefore, the type of _1 is not fully generalized, and it cannot be given the type signature you expect.
There are a few possible ways to work around the value restriction in this case:
Use explicit type annotations to specify the type variables you want to generalize. For example, you can write:
let _1 : type a b x. (a * x, b * x, a, b) Optic.t = lens fst (fun (_, b) a -> (a, b))
This tells the compiler that you want to generalize over the type variables a, b, and x, and that the type of _1 should be a lens that works on pairs with any types for the first and second components.
Use functors to abstract over the type variables and delay the instantiation of the lens function. For example, you can write:
module MakeLens (A : sig type t end) (B : sig type t end) (X : sig type t end) = struct
let _1 = lens fst (fun (_, b) a -> (a, b))
end
This defines a functor that takes three modules as arguments, each defining a type t, and returns a module that contains a value _1 of type (A.t * X.t, B.t * X.t, A.t, B.t) Optic.t. You can then apply this functor to different modules to get different instances of _1. For example, you can write:
module IntLens = MakeLens (struct type t = int end) (struct type t = int end) (struct type t = string end)
let _1_int = IntLens._1
This gives you a value _1_int of type (int * string, int * string, int, int) Optic.t.
Use records instead of tuples to represent the data types you want to manipulate with lenses. Records have named fields, which can be accessed and updated using the dot notation, and they are more amenable to polymorphism than tuples. For example, you can write:
type ('a, 'x) pair = { first : 'a; second : 'x }
let lens_first = lens (fun p -> p.first) (fun p b -> { p with first = b })
let lens_second = lens (fun p -> p.second) (fun p b -> { p with second = b })
This defines two lenses, lens_first and lens_second, that work on any record type that has a first and a second field, respectively. You can then use them to manipulate different kinds of records, without having to worry about the value restriction. For example, you can write:
type point = { x : int; y : int }
type person = { name : string; age : int }
let p = { x = 1; y = 2 }
let q = lens_first.op (fun x f -> x + 1) p (fun p -> p)
(* q is { x = 2; y = 2 } *)
let r = { name = "Alice"; age = 25 }
let s = lens_second.op (fun x f -> x + 1) r (fun r -> r)
(* s is { name = "Alice"; age = 26 } *)

Related

Why Peano numbers in OCaml not working due to scope error?

I have the following peano number written with GADTs:
type z = Z of z
type 'a s = Z | S of 'a
type _ t = Z : z t | S : 'n t -> 'n s t
module T = struct
type nonrec 'a t = 'a t
end
type 'a nat = 'a t
type e = T : 'n nat -> e
The following function to decode a 'a nat (or 'a t) into the number it encoded, works:
let to_int : type n. n t -> int =
let rec go : type n. int -> n t -> int =
fun acc n -> match n with Z -> acc | S n -> go (acc + 1) n
in
fun x -> go 0 x
but if I try to rewrite it almost exactly the same this way:
let to_int2 (type a) (a: a nat) : int =
let rec go (type a) (acc : int) (x : a nat) : int =
match x with
| Z -> acc
| S v -> go (acc + 1) v
in
go 0 a
I get a scope error. What's the difference between the two functions?
138 | | S v -> go (acc + 1) v
^
Error: This expression has type $0 t but an expression was expected of type
'a
The type constructor $0 would escape its scope
The root issue is polymorphic recursion, GADTs are a red herring here.
Without an explicit annotation, recursive functions are not polymorphic in their own definition.
For instance, the following function has type int -> int
let rec id x =
let _discard = lazy (id 0) in
x;;
because id is not polymorphic in
let _discard = lazy (id 0) in
and thus id 0 implies that the type of id is int -> 'a which leads to id having type int -> int.
In order to define polymorphic recursive function, one need to add an explicit universally quantified annotation
let rec id : 'a. 'a -> 'a = fun x ->
let _discard = lazy (id 0) in
x
With this change, id recovers its expected 'a -> 'a type.
This requirement does not change with GADTs. Simplifying your code
let rec to_int (type a) (x : a nat) : int =
match x with
| Z -> 0
| S v -> 1 + to_int v
the annotation x: a nat implies that the function to_int only works with a nat, but you are applying to an incompatible type (and ones that lives in a too narrow scope but that is secondary).
Like in the non-GADT case, the solution is to add an explicit polymorphic annotation:
let rec to_int: 'a. 'a nat -> int = fun (type a) (x : a nat) ->
match x with
| Z -> 0
| S v -> 1 + to_int v
Since the form 'a. 'a nat -> int = fun (type a) (x : a nat) -> is both a mouthful and quite often needed with recursive function on GADTs, there is a shortcut notation available:
let rec to_int: type a. a nat -> int = fun x ->
match x with
| Z -> 0
| S v -> 1 + to_int v
For people not very familiar with GADTs, this form is the one to prefer whenever one write a GADT function. Indeed, not only this avoids the issue with polymorphic recursion, writing down the explicit type of a function before trying to implement it is generally a good idea with GADTs.
See also https://ocaml.org/manual/polymorphism.html#s:polymorphic-recursion , https://ocaml.org/manual/gadts-tutorial.html#s%3Agadts-recfun , and https://v2.ocaml.org/manual/locallyabstract.html#p:polymorpic-locally-abstract .

How can you make a function that returns a function in ocaml

for an example, if a function receives a function as a factor and iterates it twice
func x = f(f(x))
I have totally no idea of how the code should be written
You just pass the function as a value. E.g.:
let apply_twice f x = f (f x)
should do what you expect. We can try it out by testing on the command line:
utop # apply_twice ((+) 1) 100
- : int = 102
The (+) 1 term is the function that adds one to a number (you could also write it as (fun x -> 1 + x)). Also remember that a function in OCaml does not need to be evaluated with all its parameters. If you evaluate apply_twice only with the function you receive a new function that can be evaluated on a number:
utop # let add_two = apply_twice ((+) 1) ;;
val add_two : int -> int = <fun>
utop # add_two 1000;;
- : int = 1002
To provide a better understanding: In OCaml, functions are first-class
values. Just like int is a value, 'a -> 'a -> 'a is a value (I
suppose you are familiar with function signatures). So, how do you
implement a function that returns a function? Well, let's rephrase it:
As functions = values in OCaml, we could phrase your question in three
different forms:
[1] a function that returns a function
[2] a function that returns a value
[3] a value that returns a value
Note that those are all equivalent; I just changed terms.
[2] is probably the most intuitive one for you.
First, let's look at how OCaml evaluates functions (concrete example):
let sum x y = x + y
(val sum: int -> int -> int = <fun>)
f takes in two int's and returns an int (Intuitively speaking, a
functional value is a value, that can evaluate further if you provide
values). This is the reason you can do stuff like this:
let partial_sum = sum 2
(int -> int = <fun>)
let total_sum = partial_sum 3 (equivalent to: let total_sum y = 3 + y)
(int = 5)
partial_sum is a function, that takes in only one int and returns
another int. So we already provided one argument of the function,
now one is still missing, so it's still a functional value. If that is
still not clear, look into it more. (Hint: f x = x is equivalent to
f = fun x -> x) Let's come back to your question. The simplest
function, that returns a function is the function itself:
let f x = x
(val f:'a -> 'a = <fun>)
f
('a -> 'a = <fun>)
let f x = x Calling f without arguments returns f itself. Say you
wanted to concatenate two functions, so f o g, or f(g(x)):
let g x = (* do something *)
(val g: 'a -> 'b)
let f x = (* do something *)
(val f: 'a -> 'b)
let f_g f g x = f (g x)
(val f_g: ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b = <fun>)
('a -> 'b): that's f, ('c -> 'a): that's g, c: that's x.
Exercise: Think about why the particular signatures have to be like that. Because let f_g f g x = f (g x) is equivalent to let f_g = fun f -> fun g -> fun x -> f (g x), and we do not provide
the argument x, we have created a function concatenation. Play around
with providing partial arguments, look at the signature, and there
will be nothing magical about functions returning functions; or:
functions returning values.

GADTs for Representing Function Application with Multiple Parameters (AST)

I saw in the OCaml manual this example to use GADT for an AST with function application:
type _ term =
| Int : int -> int term
| Add : (int -> int -> int) term
| App : ('b -> 'a) term * 'b term -> 'a term
let rec eval : type a. a term -> a = function
| Int n -> n
| Add -> (fun x y -> x+y)
| App(f,x) -> (eval f) (eval x)
Is this the right way of representing function application for a language not supporting partial application?
Is there a way to make a GADT supporting function application with an arbitrary number of arguments?
Finally, is GADT a good way to represent a typed AST? Is there any alternative?
Well, partial eval already works here:
# eval (App(App(Add, Int 3),Int 4)) ;;
- : int = 7
# eval (App(Add, Int 3)) ;;
- : int -> int = <fun>
# eval (App(Add, Int 3)) 4 ;;
- : int = 7
What you don't have in this small gadt is abstraction (lambdas), but it's definitely possible to add it.
If you are interested in the topic, there is an abundant (academic) literature. This paper presents various encoding that supports partial evaluation.
There are also non-Gadt solutions, as shown in this paper.
In general, GADT are a very interesting way to represent evaluators. They tend to fall a bit short when you try to transform the AST for compilations (but there are ways).
Also, you have to keep in mind that you are encoding the type system of the language you are defining in your host language, which means that you need an encoding of the type feature you want. Sometimes, it's tricky.
Edit: A way to have a GADT not supporting partial eval is to have a special value type not containing functions and a "functional value" type with functions. Taking the simplest representation of the first paper, we can modify it that way:
type _ v =
| Int : int -> int v
| String : string -> string v
and _ vf =
| Base : 'a v -> ('a v) vf
| Fun : ('a vf -> 'b vf) -> ('a -> 'b) vf
and _ t =
| Val : 'a vf -> 'a t
| Lam : ('a vf -> 'b t) -> ('a -> 'b) t
| App : ('a -> 'b) t * 'a t -> 'b t
let get_val : type a . a v -> a = function
| Int i -> i
| String s -> s
let rec reduce : type a . a t -> a vf = function
| Val x -> x
| Lam f -> Fun (fun x -> reduce (f x))
| App (f, x) -> let Fun f = reduce f in f (reduce x)
let eval t =
let Base v = reduce t in get_val v
(* Perfectly defined expressions. *)
let f = Lam (fun x -> Lam (fun y -> Val x))
let t = App (f, Val (Base (Int 3)))
(* We can reduce t to a functional value. *)
let x = reduce t
(* But we can't eval it, it's a type error. *)
let y = eval t
(* HOF are authorized. *)
let app = Lam (fun f -> Lam (fun y -> App(Val f, Val y)))
You can make that arbitrarly more complicated, following your needs, the important property is that the 'a v type can't produce functions.

OCaml's let polymorphism implementation

I'm confused about let polymorphism in OCaml.
Consider the following code:
A:
let f = (fun v -> v) in
((f 3), (f true))
B:
let f = (fun v -> v) in
((fun g ->
let f = g in
f) f)
C:
let f = (fun v -> v) in
((fun g ->
let f = g in
((f 3), (f true))) f)
For A and B, there is no problem. But for C, OCaml reports error:
Error: This expression has type bool but an expression was expected of type
int
So for A, when evaluating ((f 3), (f true)), f's type is 'a -> 'a,
for B, when evaluating let f = g in f, f's type is 'a -> 'a.
But for C, when evaluating ((f 3), (f true)), f's type is int -> int.
Why C's f doesn't have type 'a -> 'a?
I have difficulty in understanding the implementation of OCaml's
let polymorphism, I'll appreciate it a lot if anyone can give a concise
description of it with respect to the question.
Your code is unnecessarily confusing because you're using the same name f for two different things in B and also two different things in C.
Inside C you have this function:
fun g -> let f = g in (f 3, f true)
Again this is unnecessarily complicated; it's the same as:
fun g -> (g 3, g true)
The reason this isn't allowed is that it only works if g is a polymorphic function. This requires rank 2 polymorphism, i.e., it requires the ability to define function parameters that are polymorphic.
I'm not exactly sure what you're trying to do, but you can have a record type whose field is a polymorphic function. You can then use this record type to define something like your function:
# type r = { f : 'a . 'a -> 'a };;
type r = { f : 'a. 'a -> 'a; }
# (fun { f = g } -> (g 3, g true)) { f = fun x -> x };;
- : int * bool = (3, true)
# let myfun { f = g } = (g 3, g true);;
val myfun : r -> int * bool = <fun>
# myfun { f = fun x -> x };;
- : int * bool = (3, true)
The downside is that you need to pack and unpack your polymorphic function.
As a side comment, your example doesn't seem very compelling, because the number of functions of type 'a -> 'a is quite limited.

Is this a case where I need a functor?

I'm using the Bitstring module in the following code:
let build_data_32 v wid =
let num = wid / 32 in
let v' = Int32.of_int(v) in
let rec aux lst vv w = match w with
0 -> lst
| _ -> (BITSTRING { vv : 32 } ) :: ( aux lst (Int32.succ vv) (w-1)) in
Bitstring.concat ( aux [] v' num ) ;;
Note that when you have BITSTRING { vv : 32 }
that vv is expected to be an Int32 value. I'd like to generalize this function to work with different widths of bitstrings; ie, I'd like to create a build_data_n function where the bitstring would be constructied with BITSTRING { vv : n } .
However, the problem here is that if n is less than 32 then the succ function used above would just be the succ for type int. If it's greater than 32 it would be Int64.succ Same issue above in the line let v' = Int32.of_int(v) in - for values less than 32 it would simply be: let v' = v in , whereas for values greater than 32 it would be: let v' = Int64.of_int(v) in
Is this a case where a functor would come in handy to generalize this function and if so, how would I set that up? (and if there's some other way to do this that doesn't require functors, that would be nice to know as well)
There are a few approaches available. One is to use a functor, similar to the following:
(* The signature a module needs to match for use below *)
module type S = sig
type t
val succ : t -> t
val of_int : int -> t
end
(* The functor *)
module Make(M : S) = struct
(* You could "open M" here if you wanted to save some typing *)
let build_data v =
M.succ (M.of_int v)
end
(* Making modules with the functor *)
module Implementation32 = Make(Int32)
module Implementation64 = Make(Int64)
let foo32 = Implementation32.build_data 12
let foo64 = Implementation64.build_data 12
Another is to wrap your data type in a record:
(* A record to hold the relevant functions *)
type 'a wrapper_t = { x : 'a; succ : 'a -> 'a }
(* Use values of type 'a wrapper_t in *)
let build_data v =
v.succ v.x
(* Helper function to create 'a wrapper_t values *)
let make_int32_wrapper x = { x = Int32.of_int x; succ = Int32.succ }
let make_int64_wrapper x = { x = Int64.of_int x; succ = Int64.succ }
(* Do something with a wrapped int32 *)
let foo32 = build_data (make_int32_wrapper 12)
let foo64 = build_data (make_int64_wrapper 12)
And finally, if you are using OCaml 3.12.0 or later, you can use first class modules:
(* You can use the module type S from the first example here *)
let build_data (type s) m x =
let module M = (val m : S with type t = s) in
M.succ x
let int32_s = (module Int32 : S with type t = Int32.t)
let int64_s = (module Int64 : S with type t = Int64.t)
let foo32 = build_data int32_s 12l
let foo64 = build_data int64_s 12L
Each of these approaches can be mixed and matched. You may also be able to wrap your values in variant types or objects to get a similar result.
With BITSTRING { vv : n }, i.e. using runtime-specified field length, the type of vv cannot depend on n as it is not the compile-time constant anymore, so vv is forced to int64.