Variant vs GADT approach - ocaml

In OCaml, I want to define a function f that accepts an input to update a record x. Of the following two approaches, I'm interested in whether one has an advantage over the other (readability aside).
Variant approach
type input =
| A of int
| B of string
let f x = function
| A a -> { x with a }
| B b -> { x with b }
GADT approach
type _ input =
| A : int input
| B : string input
let f (type t) x (i: t input) (v: t) =
match i with
| A -> { x with a = v }
| B -> { x with b = v }

ADT pros:
Straightforward, no need for type annotations or anything fancy
Writing a function of type string -> input is straightforward.
GADT pros:
Avoid one layer of boxing.
However, this is completely negated if you need a parsing function, which would force you to pack things under an existential.
To be more precise, the GADT version can be seen as a decomposition of the ADT version. You can transform one into the other in a systematic way, and the memory layout will be similar (with the help of a small annotation):
type a and b and c
type sum =
| A of a
| B of b
| C of c
type _ tag =
| A : a tag
| B : b tag
| C : c tag
type deppair = Pair : ('a tag * 'a) -> deppair [@@ocaml.unboxed]
let pack (type x) (tag : x tag) (x : x) = Pair (tag, x)
let to_sum (Pair (tag, v)) : sum = match tag with
| A -> A v
| B -> B v
| C -> C v
let of_sum : sum -> deppair = function
| A x -> pack A x
| B x -> pack B x
| C x -> pack C x
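To make the point about parsing concrete, here is a minimal sketch under stated assumptions: the record type, the "key=value" input format, and the names packed, update, parse and apply are all hypothetical and not part of the question. It shows that a parser cannot return a bare t input for an unknown t, so it has to return the tag and the value packed together under an existential, which reintroduces the boxing the GADT was meant to avoid.

```ocaml
(* Hypothetical record, standing in for the record x of the question. *)
type record = { a : int; b : string }

type _ input =
  | A : int input
  | B : string input

(* Existential wrapper: hides the type index so a parser can return it. *)
type packed = Packed : 'a input * 'a -> packed

let update (type t) (x : record) (i : t input) (v : t) : record =
  match i with
  | A -> { x with a = v }
  | B -> { x with b = v }

(* Hypothetical parser for strings of the form "a=3" or "b=text". *)
let parse (s : string) : packed option =
  match String.index_opt s '=' with
  | None -> None
  | Some k ->
    let key = String.sub s 0 k in
    let value = String.sub s (k + 1) (String.length s - k - 1) in
    (match key with
     | "a" -> Option.map (fun n -> Packed (A, n)) (int_of_string_opt value)
     | "b" -> Some (Packed (B, value))
     | _ -> None)

(* Unpacking brings the tag and the value back into scope together. *)
let apply (x : record) (Packed (i, v)) : record = update x i v
```

With the plain variant, parse could return an input option directly and no wrapper type would be needed.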

As you noticed, the (non)readability of GADTs is a big drawback. Try to avoid GADTs when possible: plain variants are easier to type and easier to read, and they give less complex error messages too.
Simplified: at runtime they are the same. Both are represented as simple ints or as blocks with a tag and fields, and the code uses the tag to match and branch. So neither gives you an advantage there.
At compile time GADTs are more powerful, as the compiler can check types in ways that ADTs don't allow. One example is existential types, as in the other answer. So use GADTs when you can't use ADTs.
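As an illustration of the extra compile-time checking, here is the usual small typed-expression sketch. The type expr and the function eval are hypothetical examples, not taken from the question. The type index records what each expression evaluates to, so eval needs no runtime tag on its result and ill-typed expressions are rejected by the type checker.

```ocaml
(* The index of expr is the type of the value the expression evaluates to. *)
type _ expr =
  | Int : int -> int expr
  | Bool : bool -> bool expr
  | Add : int expr * int expr -> int expr
  | If : bool expr * 'a expr * 'a expr -> 'a expr

(* The locally abstract type a lets each branch refine the result type. *)
let rec eval : type a. a expr -> a = function
  | Int n -> n
  | Bool b -> b
  | Add (x, y) -> eval x + eval y
  | If (c, t, e) -> if eval c then eval t else eval e
```

An expression such as Add (Int 1, Bool true) does not type check, whereas with an ordinary variant the same mistake would only show up when the evaluator runs.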

Related

Is there a shorthand to match e.g. "anything of int"

Say I have a variant like:
type myvar = A of int | B of int
I can write a function like:
let myvar_to_int = function
| A i -> i
| B i -> i
Let's say I have lots more elements in the variant, all <something> of int ...
Is there any shorthand for writing the to_int function? e.g. a way to express <anything> of int in a match case?
(In other places in the code I want to be able to distinguish my As from Bs and match on them explicitly still)
If every variant has an int then you really have a pair of distinct values:
type ab = A | B
type myvar = ab * int
let myvar_to_int = snd
Otherwise, no, there's no way to do what you want. You can write it slightly more compactly with an or-pattern:
let myvar_to_int = function
| A i | B i -> i
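For the pair representation suggested above, here is a small sketch showing that the tags can still be matched explicitly elsewhere in the code. The function describe is a hypothetical example added for illustration.

```ocaml
type ab = A | B
type myvar = ab * int

(* The payload is always the second component, whatever the tag is. *)
let myvar_to_int : myvar -> int = snd

(* Explicit matching on the tag still works on the pair. *)
let describe : myvar -> string = function
  | A, i -> "A of " ^ string_of_int i
  | B, i -> "B of " ^ string_of_int i
```

The trade-off is that values are now written (A, 3) instead of A 3.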

Call 2 or more functions inside a match expression

I am a beginner in OCaml. I am curious to know how, syntactically speaking, to call two or more functions within a match expression. Or is that possible at all?
For example :
let rec foo l:list =
match l with
| [x,y] -> (foo1 x) (foo2 y)
| _ -> doSome
I have tried using the ; operator but that seems to be used for something else. I have tried different combinations of bracketing but in all cases I get
This is not a function it cannot be applied under foo1 x.
You just need a semicolon (no begin/end). You don't need the parentheses (they don't hurt but they're not especially idiomatic OCaml).
let rec foo (l : 'a list) = match l with
| [x,y] -> foo1 x; foo2 y
| _ -> doSome
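A runnable sketch of the same idea, assuming hypothetical stand-ins for foo1 and foo2 that return unit and record their calls in a log so that the order of evaluation is visible:

```ocaml
(* Hypothetical stand-ins for foo1 and foo2: each appends to a log. *)
let log : string list ref = ref []
let foo1 x = log := ("foo1 " ^ x) :: !log
let foo2 y = log := ("foo2 " ^ y) :: !log

let foo l =
  match l with
  | [ (x, y) ] ->
    foo1 x;  (* evaluated first; its result should have type unit *)
    foo2 y   (* evaluated second; its value is the value of the arm *)
  | _ -> ()
```

Inside a match arm the sequence needs no delimiters. If the sequence sits inside an if branch, it does need parentheses or begin ... end, because if c then e1; e2 is parsed as (if c then e1); e2.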

Subtyping for Yojson element in a yojson list

I ran into an error about subtyping.
For this code: List.map (fun ((`String goal_feat) :> Basic.t) -> goal_feat) (goal_feats_json :> Basic.t list)
I get the following error in VS Code:
This expression cannot be coerced to type
Yojson.Basic.t =
[ `Assoc of (string * Yojson.Basic.t) list
| `Bool of bool
| `Float of float
| `Int of int
| `List of Yojson.Basic.t list
| `Null
| `String of string ];
it has type [< `String of 'a ] -> 'b but is here used with type
[< Yojson.Basic.t ].
When compiling, I get the following error:
Error: Syntax error: ')' expected.
If I change the code to List.map (fun ((`String goal_feat) : Basic.t) -> goal_feat) (goal_feats_json :> Basic.t list), which uses an explicit type constraint instead of a coercion, the error disappears. I cannot understand what the problem is with my code when I use subtyping. I would much appreciate any help.
First of all, most likely the answer that you're looking for is
let to_strings xs =
List.map (function `String x -> x | _ -> assert false) (xs :> t list)
The compiler is telling you that your function handles only one case while you're passing it a list that may contain many other things, so there is a possibility of a runtime error. So it is better to indicate to the compiler that you know that only the variants tagged with `String are expected. This is what we did in the example above. Now our function accepts the open type [> `String of 'a ], which includes every value of type Yojson.Basic.t.
Now back to your direct question. The syntax for coercion is (expr :> typeexpr). However, in the fun ((`String goal_feat) :> Basic.t) -> goal_feat snippet, `String goal_feat is a pattern, and you cannot coerce a pattern, so we should use a type-annotated parenthesized pattern here to give it the right, more general, type (see footnote 1), e.g.,
let exp xs =
List.map (fun (`String x : t) -> x ) (xs :> t list)
This will tell the compiler that the parameter of your function shall belong to a wider type and immediately turn the error into warning 8,
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a case that is not matched:
(`Bool _|`Null|`Assoc _|`List _|`Float _|`Int _)
which says what I was saying in the first part of the post. It is usually a bad idea to leave warning 8 unattended, so I would suggest using the first solution or, otherwise, finding a way to prove to the compiler that your list doesn't contain any other variants, e.g., you can use List.filter_map for that:
let collect_strings : t list -> [`String of string] list = fun xs ->
List.filter_map (function
| `String s -> Some (`String s)
| _ -> None) xs
A more natural solution would be to return untagged strings (unless you really need them to be tagged, e.g., when you need to pass this list to a function that is polymorphic over [> t]). Note that I am using t for Yojson.Basic.t to make the post shorter, but you should use the right name in your code. So here is the solution that will extract strings and make everyone happy (it will throw away values with other tags):
let collect_strings : t list -> string list = fun xs ->
List.filter_map (function
| `String s -> Some s
| _ -> None) xs
Note that there is no need for type annotations here, and we can easily remove them to get the most general polymorphic type:
let collect_strings xs =
List.filter_map (function
| `String s -> Some s
| _ -> None) xs
It will get the type
[> `String of 'a ] list -> 'a list
which means it takes a list of polymorphic variants with any tags and returns a list of the values that were tagged with `String.
1) It is not a limitation that coercion doesn't work on patterns; it simply wouldn't make sense to coerce a pattern. A coercion takes an expression with an existing type and upcasts (weakens) it to a supertype. A function parameter is not an expression, so there is nothing here to coerce. You can just annotate it with the type, e.g., fun (x : #t) -> x says that our function expects values of type [< t], which is less general than the unannotated type 'a. To summarize, coercion is needed when a function accepts a value of an object or polymorphic variant type and, in some expression, you would like to pass it a value of a subtype, which must first be weakened (upcast), for example:
type a = [`A]
type b = [`B]
type t = [a | b]
let f : t -> unit = fun _ -> ()
let example : a -> unit = fun x -> f (x :> t)
Here we have type t with two subtypes a and b. Our function f accepts the base type t, but example is specific to a. In order to be able to use f on a value of type a we need an explicit type coercion to weaken its type to t (we lose type information here). Notice that we do not change the type of x per se, so the following example still type checks:
let rec example : a -> unit = fun x -> f (x :> t); example x
I.e., we weakened the type of the argument passed to f, but the variable x still has the stronger type a, so we can still use it as a value of type a.
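To try these pieces without installing anything, here is a self-contained sketch. It assumes a local type json declared with the same shape as Yojson.Basic.t; this is a stand-in written for the example, not the library type itself. The names only_strings and widened are likewise hypothetical.

```ocaml
(* Local stand-in with the same shape as Yojson.Basic.t (an assumption
   made so that the example compiles without the yojson library). *)
type json =
  [ `Assoc of (string * json) list
  | `Bool of bool
  | `Float of float
  | `Int of int
  | `List of json list
  | `Null
  | `String of string ]

(* Annotated version: accepts exactly json lists, drops non-strings. *)
let collect_strings (xs : json list) : string list =
  List.filter_map (function `String s -> Some s | _ -> None) xs

(* Unannotated version: gets the open type [> `String of 'a ] list -> 'a list. *)
let collect_any xs =
  List.filter_map (function `String s -> Some s | _ -> None) xs

(* A list whose static type allows only the `String tag ... *)
let only_strings : [ `String of string ] list = [ `String "a"; `String "b" ]

(* ... must be coerced (an expression, not a pattern) to be used as json list. *)
let widened : json list = (only_strings :> json list)
```

The annotated collect_strings needs the coerced list, while the unannotated collect_any accepts only_strings directly because its parameter type is open.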

Assertion in type definition

I have some type
type my_type_name =
| A of (float * float)
| B of (float * float)
| E
;;
I know that my code has a bug whenever p >= q in B (p, q) or A (p, q). So I want to put an assertion in the type, so that if I try to build A (5, 1) I will be informed.
Is it possible to put an assertion in a type constructor? What should it look like?
I'm trying to avoid something like that:
assert (p < q) ; A(p, q)
Because I have a lot of A or B objects in the code.
@Jeffrey Scofield already gave a fully accurate answer to this, but here is the example requested:
module type MY_TYPE =
sig
type my_type_name = private
| A of (float * float)
| B of (float * float)
| E
val a : float -> float -> my_type_name
val b : float -> float -> my_type_name
val e : my_type_name
end
module My_type : MY_TYPE =
struct
type my_type_name =
| A of (float * float)
| B of (float * float)
| E
let a p q = assert true; A (p, q)
let b p q = assert true; B (p, q)
let e = E
end
let () =
(* My_type.A (1., 2.); *) (* Type error - private type. *)
let x = My_type.a 1. 2. in
(* Can still pattern match. *)
match x with
| My_type.A (_, _) -> ()
| My_type.B (_, _) -> ()
| My_type.E -> ()
The body of the signature could be an .mli file instead and the body of My_type could be an .ml file. The key is that my_type_name is followed by the keyword private in MY_TYPE. This causes the direct use of the constructor at the bottom to be flagged as an error, forcing all construction to go through the functions a, b, e, where you can include assertions or arbitrary other expressions.
EDIT To make the functions look even more like the constructors, you can of course turn their arguments into tuple types. To make the module nearly transparent you can open it at the bottom. But these are matters of taste and up to you.
There's no straightforward way to put such an assertion into the type definition itself.
One good approach would be to define your type as an abstract type (in a module) and export functions for constructing values of the type. These will be the only way to produce values of the type, and you can include any assertions you like in the functions.
Update
As @antron points out, making the type private rather than abstract might be a better fit for your problem.
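For comparison with the private-type version, here is a minimal sketch of the abstract-type approach. It assumes a hypothetical module Interval that owns the invariant p < q, and it reports violations with Invalid_argument rather than assert, since assertions can be disabled with the -noassert compiler flag. The function width is only there to show how the pair is read back.

```ocaml
module Interval : sig
  type t                              (* abstract: representation hidden *)
  val make : float -> float -> t      (* raises Invalid_argument unless p < q *)
  val bounds : t -> float * float
end = struct
  type t = float * float
  let make p q =
    if p < q then (p, q) else invalid_arg "Interval.make: expected p < q"
  let bounds iv = iv
end

(* The constructors stay public; the invariant lives in the payload type. *)
type my_type_name =
  | A of Interval.t
  | B of Interval.t
  | E

let width = function
  | A iv | B iv ->
    let p, q = Interval.bounds iv in
    q -. p
  | E -> 0.
```

With this layout A (Interval.make 5. 1.) fails as soon as the interval is built, and there is no way to construct an A or B that skips the check.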

Cartesian (outer) product of lists in OCaml

I would like to iterate over all combinations of elements from a list of lists which have the same length but not necessarily the same type. This is like the cartesian product of two lists (which is easy to do in OCaml), but for an arbitrary number of lists.
First I tried to write a general cartesian (outer) product function which takes a list of lists and returns a list of tuples, but that can't work because the input list of lists would not have elements of the same type.
Now I'm down to a function of the type
'a list * 'b list * 'c list -> ('a * 'b * 'c) list
which unfortunately fixes the number of inputs to three (for example). It's
let outer3 (l1, l2, l3) =
let open List in
l1 |> map (fun e1 ->
l2 |> map (fun e2 ->
l3 |> map (fun e3 ->
(e1,e2,e3))))
|> concat |> concat
This works but it's cumbersome since it has to be redone for each number of inputs. Is there a better way to do this?
Background: I want to feed the resulting flat list to Parmap.pariter.
To solve your task for an arbitrary n-tuple we need to use existential types. We could use GADTs, but they are closed by default. Of course we could use open variants, but I prefer a slightly more syntactically heavy but more portable solution with first-class modules (and it works because GADTs can be expressed via first-class modules). But enough theory. First of all we need a function that will produce the n-ary cartesian product for us, with type 'a list list -> 'a list list:
let rec n_cartesian_product = function
| [] -> [[]]
| x :: xs ->
let rest = n_cartesian_product xs in
List.concat (List.map (fun i -> List.map (fun rs -> i :: rs) rest) x)
Now we need to fit different types into one type 'a, and this is where existential types come in. Let's define a signature:
module type T = sig
type t
val x : t
end
Now let's try to write a lifter to this existential:
let int x = (module struct type t = int let x = x end : T)
it has type:
int -> (module T)
Let's extend the example with a few more cases:
let string x = (module struct type t = string let x = x end : T)
let char x = (module struct type t = char let x = x end : T)
let xxs = [
List.map int [1;2;3;4];
List.map string ["1"; "2"; "3"; "4"];
List.map char ['1'; '2'; '3'; '4']
]
# n_cartesian_product xxs;;
- : (module T) list list =
[[<module>; <module>; <module>]; [<module>; <module>; <module>];
[<module>; <module>; <module>]; [<module>; <module>; <module>];
...
Instead of first class modules you can use other abstractions, like objects or functions, if your type requirements allow this (e.g., if you do not need to expose the type t). Of course, our existential is very terse, and maybe you will need to extend the signature.
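As a sketch of what extending the signature could look like, the version below adds a show function so that packed values can actually be consumed after the product is built. The signature name SHOWABLE and the helpers show_packed and rows are hypothetical names chosen for this example.

```ocaml
(* Extended existential: the hidden type t comes with a way to print it. *)
module type SHOWABLE = sig
  type t
  val x : t
  val show : t -> string
end

let int x =
  (module struct type t = int let x = x let show = string_of_int end : SHOWABLE)

let string x =
  (module struct type t = string let x = x let show s = s end : SHOWABLE)

(* Unpack the module locally; only a string escapes, never the hidden type. *)
let show_packed (m : (module SHOWABLE)) : string =
  let module M = (val m) in
  M.show M.x

let rec n_cartesian_product = function
  | [] -> [ [] ]
  | x :: xs ->
    let rest = n_cartesian_product xs in
    List.concat (List.map (fun i -> List.map (fun rs -> i :: rs) rest) x)

(* Each combination rendered as a comma-separated line. *)
let rows =
  n_cartesian_product [ List.map int [ 1; 2 ]; List.map string [ "a"; "b" ] ]
  |> List.map (fun row -> String.concat "," (List.map show_packed row))
```

Here rows evaluates to ["1,a"; "1,b"; "2,a"; "2,b"]. Any operation the consumer needs has to be part of the signature, because the hidden type t cannot be recovered once the value is packed.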
I used @ivg's answer but in a version with a GADT. I reproduce it here for reference. In a simple case where only the types float and int can appear in the input lists, first set
type wrapped = Int : int -> wrapped | Float : float -> wrapped
this is a GADT without type parameter. Then
let wrap_f f = Float f
let wrap_i i = Int i
wrap values into the sum type. On lists of wrapped values we can call n_cartesian_product from @ivg's answer. The result is a list combinations : wrapped list list, which is flat enough for the present purposes.
Now, to use Parmap, I have e.g. a worker function work : float * int * float * float -> float. To get the arguments out of the wrappers, I pattern match:
combinations |> List.map (function
| [Float f1; Int i; Float f2; Float f3] -> (f1, i, f2, f3)
| _ -> raise (Invalid_argument "wrong parameter number or types"))
to construct the flat list of tuples. This can be finally fed to Parmap.pariter with the worker function work.
This setup is essentially the same as using a regular sum type type wrapsum = F of float | I of int instead of wrapped: since wrapped has no type parameter, the GADT syntax adds no extra static checking. The pattern matching is the same, and in both cases an input of the wrong length or with the wrong element types, e.g. [F 1.0; I 1; F 2.0], is detected only at runtime by the catch-all case, not at compile time.
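A self-contained sketch of this pipeline for three input lists, written with the ordinary variant syntax and stopping at the flat list of tuples. Parmap is left out so that the example has no dependencies, and the names unwrap and tuples are hypothetical.

```ocaml
type wrapped = Int of int | Float of float

let rec n_cartesian_product = function
  | [] -> [ [] ]
  | x :: xs ->
    let rest = n_cartesian_product xs in
    List.concat (List.map (fun i -> List.map (fun rs -> i :: rs) rest) x)

(* Recover a typed tuple from one combination; a combination of the
   wrong shape is rejected at run time by the catch-all case. *)
let unwrap = function
  | [ Float f1; Int i; Float f2 ] -> (f1, i, f2)
  | _ -> invalid_arg "wrong parameter number or types"

(* Flat list of tuples, ready to be handed to a worker function. *)
let tuples : (float * int * float) list =
  n_cartesian_product
    [ List.map (fun f -> Float f) [ 1.0; 2.0 ];
      List.map (fun i -> Int i) [ 10 ];
      List.map (fun f -> Float f) [ 0.5 ] ]
  |> List.map unwrap
```

Here tuples is [(1.0, 10, 0.5); (2.0, 10, 0.5)]. The order of the input lists has to match the pattern in unwrap, since nothing in the types ties the two together.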