OCaml - GADT mapping - ocaml

I'm struggling with understanding some aspects of using GADTs in OCaml. Let me try and talk you through my example and my understanding ...
I'm trying to implement Simon Peyton-Jones' classic paper on Combinatorial Contracts (https://www.microsoft.com/en-us/research/wp-content/uploads/2000/09/pj-eber.pdf)
I want to be able to manipulate an Observable, defined as a function from a date to a value of some type (realistically a float or a bool). So, using GADTs I define a function type and an Observation type
type _ value =
| Float : float -> float value
| Bool : bool -> bool value
type _ obs = Observation : (date -> 'v value) -> (date -> 'v) obs
What I think I am defining is that
a value is either a float or a bool (built using the Float or Bool constructors, and
an obs is a function from a date to one of the value types, so either a date -> float or a date -> bool
Now I define an expression, which enables me to combine Observables
type _ expr =
| Konst : 'v value -> 'v expr
| Obs : 'v obs -> 'v expr
| Lift : ('v1 -> 'v2) * 'v1 expr -> 'v2 expr
So far so good. An expression is either a constant, an observation (which is either a date->float or date->bool), or a function applied to an expression.
Now I want to be able to evaluate an Observable. In reality, an Observable is built on a random process, so I have a Distribution module (Note that in the original paper the idea is to separate out the concept of an Observable, from it's implementation - so we could implement through lattices, monte carlo, or whatever approach we want).
module type Distribution_intf =
sig
type element = Element of float * float
type t = element list
val lift : t -> f:(float -> float) -> t
val expected : t -> float
end
so, given a compose function
let compose f g = fun x -> f (g x)
I should be able to think about the evaluation of an Observable. Here is a mapping function (I've taken out the Boolean case for clarity)
type rv = date -> Distribution.t
let rec observable_to_rv : type o. o Observable.expr -> rv =
let open Distribution in
function
| Konst (Float f) -> fun (_:date) -> [Element(1.0, f)]
| Lift (f,obs) -> compose (lift ~f) (observable_to_rv o) (*ERROR HERE *)
Now things get problematic. When I try and compile, I get the following error message (on the Lift pattern match):
Error: This expression has type v1#0 -> o
but an expression was expected of type float -> float
Type v1#0 is not compatible with type float
I don't understand why: A Lift expression has type
Lift: ('v1 -> 'v2) * 'v1 expr -> 'v2 expr
So given a Lift(f,o), then shouldn't the compiler constrain that since observable_to_rv has type date -> float, then 'v2 must be a float, and since lift has type float -> float then 'v1 must be a float, so Lift should be defined on any tuple of type (float -> float, float expr).
What am I missing ?
Steve

You are missing the fact that the type variable 'v1 in Lift is existentially quantified:
Lift: ('v1 -> 'v2) * 'v1 expr -> 'v2 expr
means
Lift: ∃'v1. ('v1 -> 'v2) * 'v1 expr -> 'v2 expr
In other words, given a value Lift(f,x): 'b expr the only information that the type checker has is that there is a type 'a such that f:'a -> 'b and x:'a expr, nothing more.
In particular going back to your error message (with a somewhat recent version of compiler(≥ 4.03)):
Type $Lift_'v1 is not compatible with type float
it is not possible to unify this existential type introduced by the Lift constructor to the float type, or any concrete type, because there no information left in the type system on the actual concrete type hidden behind 'v1 at this point.
EDIT:
The essence of your problem is that you are trying to build a float random variable from a chain of expression 'a expr which can contain intermediary expressions of any type and just need to result with 'a expression. Since you can neither express the condition only float sub-expression in the type system nor construct non-float random variable with your current design, this spells trouble.
The solution is either to remove the ability to build non-float expression or to add the ability to handle non-float random variables .Taking the second option would look like
module Distribution:
sig
type 'a element = {probability:float; value: 'a}
type 'a t = 'a element list
val lift : 'a t -> f:('a -> 'b) -> 'b t
val expected : float t -> float
end = struct … end
Then your observable_to_rv random variable typed as 'a expr -> 'a rv will type-check.

Related

What does f: 'a -> b' -> c' -> d' mean in ocaml?

If I have a function f defined as
f: 'a -> 'b -> c' -> d'
Does that mean it takes one argument? Or 3? And then it outputs one argument? How would I use or call such a function?
As Glennsl notes in the comments, it means both.
Very briefly, and by no means comprehensively, from an academic perspective, no function in OCaml takes more than one argument or returns more or less than one value. For instance, a function that takes a single argument and adds 1 to it.
fun x -> x + 1
We can give that function a name in one of two ways:
let inc = fun x -> x + 1
Or:
let inc x = x + 1
Either way, inc has the type int -> int which indicates that it takes an int and returns an int value.
Now, if we want to add two ints, well, functions only take one argument... But functions are first class things, which means a function can create and return another function.
let add =
fun x ->
fun y -> x + y
Now add is a function that takes an argument x and returns a function that takes an argument y and returns the sum of x and y.
We could use a similar method to define a function that adds three ints.
let add3 =
fun a ->
fun b ->
fun c -> a + b + c
The type of add would be int -> int -> int and add3 would have type int -> int -> int -> int.
Of course, OCaml is not purely an academic language, so there is convenience syntax for this.
let add3 a b c = a + b + c
Inferred types
In your question, you ask about a type 'a -> 'b -> 'c -> 'd``. The examples provided work with the concrete type int. OCaml uses type inferencing. The compiler/interpreter looks at the entire program to figure out at compile time what the types should be, without the programmer having to explicitly state them. In the examples I provided, the +operator only works on values of typeint, so the compiler _knows_ incwill have typeint -> int`.
However, if we defined an identity function:
let id x = x
There is nothing her to say what type x should have. In fact, it can be anything. But what can be determined, if that the function will have the same type for argument and return value. Since we can't put a concrete type on that, OCaml uses a placeholder type 'a.
If we created a function to build a tuple from two values:
let make_tuple x y = (x, y)
We get type 'a -> 'b -> 'a * 'b.
In conclusion
So when you ask about:
f: 'a -> 'b -> 'c -> 'd
This is a function f that takes three arguments of types 'a, 'b, and 'c and returns a value of type 'd.

What is the purpose of an arrow "->" in OCaml

I've been learning OCaml recently and as of now it would seem an arrow is used by the compiler to signify what the next type would be. For instance, int -> int -> <fun> an integer which returns an integer, which returns a function.
However, I was wondering if I can use it natively in OCaml code. In addition, if anyone would happen to know the appropriate name for it. Thank you.
The operator is usually called type arrow where T1 -> T2 represents functions from type T1 to type T2. For instance, the type of + is int -> (int -> int) because it takes two integers and returns another one.
The way -> is defined, a function always takes one argument and returns only one element. A function with multiple parameters can be translated into a sequence of unary functions. We can interpret 1 + 2 as creating a +1 increment function (you can create it by evaluating (+) 1 in the OCaml command line) to the number 2. This technique is called Currying or Partial Evaluation.
Let's have a look at OCaml's output when evaluating a term :
# 1 + 2;;
- : int = 3
# (+) 1 ;;
- : int -> int = <fun>
The term1+2 is of type integer and has a value of 3 and the term (+) 1 is a function from integers to integers. But since the latter is a function, OCaml cannot print a single value. As a placeholder, it just prints <fun>, but the type is left of the =.
You can define your own functions with the fun keyword:
# (fun x -> x ^ "abc");;
- : bytes -> bytes = <fun>
This is the function which appends "abc" to a given string x. Let's take the syntax apart: fun x -> term means that we define a function with argument x and this x can now appear within term. Sometimes we would like to give function names, then we use the let construction:
# let append_abc = (fun x -> x ^ "abc") ;;
val append_abc : bytes -> bytes = <fun>
Because the let f = fun x -> ... is a bit cumbersome, you can also write:
let append_abc x = x ^ "abc" ;;
val append_abc : bytes -> bytes = <fun>
In any case, you can use your new function as follows:
# append_abc "now comes:" ;;
- : bytes = "now comes:abc"
The variable x is replaced by "now comes:" and we obtain the expression:
"now comes:" ^ "abc"
which evaluates to "now comes:abc".
#Charlie Parker asked about the arrow in type declarations. The statement
type 'a pp = Format.formatter -> 'a -> unit
introduces a synonym pp for the type Format.formatter -> 'a -> unit. The rule for the arrow there is the same as above: a function of type 'a pp takes a formatter, a value of arbitrary type 'a and returns () (the only value of unit type)
For example, the following function is of type Format.formatter -> int -> unit (the %d enforces the type variable 'a to become int):
utop # let pp_int fmt x = Format.fprintf fmt "abc %d" x;;
val pp_int : formatter -> int -> unit = <fun>
Unfortunately the toplevel does not infer int pp as a type so we don't immediately notice(*). We can still introduce a new variable with a type annotation that we can see what's going on:
utop # let x : int pp = pp_int ;;
val x : int pp = <fun>
utop # let y : string pp = pp_int ;;
Error: This expression has type formatter -> int -> unit
but an expression was expected of type
string pp = formatter -> string -> unit
Type int is not compatible with type string
The first declaration is fine because the type annotation agrees with the type inferred by OCaml. The second one is in conflict with the inferred type ('a' can not be both int and string at the same time).
Just a remark: type can also be used with records and algebraic data types to create new types (instead of synonyms). But the type arrow keeps its meaning as a function.
(*) Imagine having multiple synonymes, which one should the toplevel show? Therefore synonyms are usually expanded.
The given answer doesn't work for ADTs, GADTs, for that see: In OCaml, a variant declaring its tags with a colon
e.g.
type 'l generic_argument =
| GenArg : ('a, 'l) abstract_argument_type * 'a -> 'l generic_argument

OCaml variance (+'a, -'a) and invariance

After writing this piece of code
module type TS = sig
type +'a t
end
module T : TS = struct
type 'a t = {info : 'a list}
end
I realised I needed info to be mutable.
I wrote, then :
module type TS = sig
type +'a t
end
module T : TS = struct
type 'a t = {mutable info : 'a list}
end
But, surprise,
Type declarations do not match:
type 'a t = { mutable info : 'a list; }
is not included in
type +'a t
Their variances do not agree.
Oh, I remember hearing about variance. It was something about covariance and contravariance. I'm a brave person, I'll find about my problem alone!
I found these two interesting articles (here and here) and I understood!
I can write
module type TS = sig
type (-'a, +'b) t
end
module T : TS = struct
type ('a, 'b) t = 'a -> 'b
end
But then I wondered. How come that mutable datatypes are invariant and not just covariant?
I mean, I understand that an 'A list can be considered as a subtype of an ('A | 'B) list because my list can't change. Same thing for a function, if I have a function of type 'A | 'B -> 'C it can be considered as a subtype of a function of type 'A -> 'C | 'D because if my function can handle 'A and 'B's it can handle only 'A's and if I only return 'C's I can for sure expect 'C or 'D's (but I'll only get 'C's).
But for an array? If I have an 'A array I can't consider it as a an ('A | 'B) array because if I modify an element in the array putting a 'B then my array type is wrong because it truly is an ('A | 'B) array and not an 'A array anymore. But what about a ('A | 'B) array as an 'A array. Yes, it would be strange because my array can contain 'B but strangely I thought it was the same as a function. Maybe, in the end, I didn't understand everything but I wanted to put my thoughts on it here because it took me long to understand it.
TL;DR :
persistent : +'a
functions : -'a
mutable : invariant ('a) ? Why can't I force it to be -'a ?
I think that the easiest explanation is that a mutable value has two intrinsic operations: getter and setter, that are expressed using field access and field set syntaxes:
type 'a t = {mutable data : 'a}
let x = {data = 42}
(* getter *)
x.data
(* setter *)
x.data <- 56
Getter has a type 'a t -> 'a, where 'a type variable occurs on the right-hand side (so it imposes a covariance constraint), and the setter has type 'a t -> 'a -> unit where the type variable occurs to the left of the arrow, that imposes a contravariant constraint. So, we have a type that is both covariant and contravariant, that means that type variable 'a is invariant.
Your type t basically allows two operations: getting and setting. Informally, getting has type 'a t -> 'a list and setting has type 'a t -> 'a list -> unit. Combined, 'a occurs both in a positive and in a negative position.
[EDIT: The following is a (hopefully) clearer version of what I wrote in the first place. I consider it superior, so I deleted the previous version.]
I will try to make it more explicit. Suppose sub is a proper subtype of super and witness is some value of type super which is not a value of type sub. Now let f : sub -> unit be some function which fails on the value witness. Type safety is there to ensure that witness is never passed to f. I will show by example that type safety fails if one is allowed to either treat sub t as a subtype of super t, or the other way around.
let v_super = ({ info = [witness]; } : super t) in
let v_sub = ( v_super : sub t ) in (* Suppose this was allowed. *)
List.map f v_sub.info (* Equivalent to f witness. Woops. *)
So treating super t as a subtype of sub t cannot be allowed. This shows covariance, which you already knew. Now for contravariance.
let v_sub = ({ info = []; } : sub t) in
let v_super = ( v_sub : super t ) in (* Suppose this was allowed. *)
v_super.info <- [witness];
(* As v_sub and v_super are the same thing,
we have v_sub.info=[witness] once more. *)
List.map f v_sub.info (* Woops again. *)
So, treating sub t as a subtype of super t cannot be allowed either, showing contravariance. Together, 'a t is invariant.

Walking OCaml tuples of arbitrary depth

I'm trying to understand better the OCaml type inference. I created this example:
let rec f t = match t with
| (l,r) -> (f l)+(f r)
| _ -> 1
and I want to apply it on any binary tuple (pair) with nested pairs, to obtain the total number of leafs. Example: f ((1,2),3)
The function f refuses to compile, because a contradiction in types at (f l): "This expression has type 'a but an expression was expected of type 'a * 'b".
Question: 'a being any type, could not also be a pair, or else be handled by the _ case? Is any method to walk tuples of arbitrary depth without converting them to other data structures, such as variants?
PS: In C++ I would solve this kind of problem by creating two template functions "f", one to handle tuples and one other types.
There is a way to do this, although I wouldn't recommend it to a new user due to the resulting complexities. You should get used to writing regular OCaml first.
That said, you can walk arbitrary types in a generic way by capturing the necessary structure as a GADT. For this simple problem it is quite easy:
type 'a ty =
| Pair : 'a ty * 'b ty -> ('a * 'b) ty
| Other : 'a ty
let rec count_leaves : type a . a -> a ty -> int =
fun a ty ->
match ty with
| Pair (ta, tb) -> count_leaves (fst a) ta + count_leaves (snd a) tb
| Other -> 1
Notice how the pattern matching on the a ty here corresponds to the pattern matching on values in your (poorly typed) example function.
More useful functions could be written with a more complete type representation, although the machinery becomes heavy and complicated once arbitrary tuples, records, sum types, etc have to be supported.
Any combination of tuples will have a value shape completely described by it's type (because there is no "choice" in the type structure) - hence the "number of leaves" question can be answered completely statically at compile-time. Once you have a function operating on such type - this function is fixed to operate on that specific type (and shape) only.
If you want to build a tree that can have different shapes (but same type - hence can be handled by same function) - you need to add variants to the mix, i.e. classic type 'a tree = Leaf of 'a | Node of 'a tree * 'a tree, or any other type that describes value with some dynamic "choice" of shape.

Write a spec for anonymous function

I have anonymous function:
fun x -> x;;
- : 'a -> 'a = <fun>
As you may see, this function accepts argument of any type. I want to specify concrete type, say int.
I know that I can annotate functions with type specs, but do not know syntax for it.
It would be helpful to get some reference to this syntax and extend this example with such annotation.
Thanks.
# fun (x: int) -> x;;
- : int -> int = <fun>
#
The reason this works is that
Function parameters are specified as patterns.
One alternative for a patttern is of the form:
( pattern : typexpr )
Syntax for patterns is given in Section 6.6 of the OCaml manual.
The most general form is:
(fun x -> x : int -> int)
Since fun x -> x is a value by itself, it can be annotated with a type, as any other expression. Indeed, in this type annotation you can omit one of the int's, since the other can be inferred by a compiler:
(fun x -> x : int -> 'a)
or
(fun x -> x : 'a -> int)
all will result in:
- : int -> int = <fun>
This also demonstrates that 'a in type annotations has different meaning from 'a in signatures. In type annotation it stands for "I don't care, you decide". Thats why the proper name for type annotations is type constraining, thus you're not annotating your expression with type, but you're giving extra constraint for type inference system. In this example, you're saying to it: I have this expression, and please infer its type, giving it is a function that returns int.
Also, you can use _ instead of type variables, the same way as you can do this for a normal variables:
(fun x -> x : _ -> int)
The result will be the same.