GADT pattern matching - ocaml

I’ve been playing around with GADTs lately and was wondering if anybody could point me in the right direction for learning how to type this so it would compile, if it’s possible, or if I’m overly complicating things.
I have seen a few other answers to GADT pattern matching here but this seems to be a little different.
I’ve seen this type of thing done to represent a type with no possible values:
module Nothing = {
  type t =
    | Nothing(t);
};
So I wanted to use it to lock down this Exit.t type so I could have a type of Exit.t('a, Nothing.t) to represent the Success case, capturing the fact that there is no recoverable Failure value.
module Exit = {
  type t('a, 'e) =
    | Success('a): t('a, Nothing.t)
    | Failure('e): t(Nothing.t, 'e);
This seemed to be okay, until I tried to write a flatMap function for it.
  let flatMap: ('a => t('a1, 'e), t('a, 'e)) => t('a1, 'e) = (f, exit) =>
    switch (exit) {
    | Success(a) => f(a)
    | Failure(_) as e => e
    };
};
As is, the compiler infers Exit.t to always be Exit.t(Nothing.t, Nothing.t), which I kind of understand, since the Failure case sets the first type parameter to Nothing.t and the Success case sets the second to Nothing.t.
I've tried the one thing I know, making some of those types local using type a. I've tried type a a1 e and type a e leaving 'a1 but I just don't seem to be able to capture the idea.

(I am using the OCaml syntax below, since the question is also tagged "ocaml", but the same should hold for Reason.)
First, your type Nothing.t is not empty. The cyclic value Nothing (Nothing (Nothing (...))), which can be built with let rec, is a valid inhabitant. If you really want the type to be empty, give it no constructors at all.
Second, as you guessed, in flat_map, your Failure branch forces 'a1 to be instantiated with Nothing.t. There is no way around that; it is not a deficiency of the compiler, just the only way to interpret this code.
Third, you are making things a bit too complicated, as Exit.t does not have to be a GADT in the first place, to achieve your goals.
Below is a simpler example that shows that, if Nothing.t is actually empty, then the compiler correctly allows irrelevant branches.
type nothing = |
type ('a, 'b) result =
  | Success of 'a
  | Failure of 'b
let only_success (x : ('a, nothing) result) : 'a =
  match x with
  | Success v -> v
  | Failure _ -> . (* this branch can be removed, as it is correctly inferred *)
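
The three points above can be put together in a minimal sketch. The names looping, exit_, flat_map and unwrap are made up for illustration and are not from the question; the sketch assumes a compiler that accepts empty variant declarations (type nothing = |).

```ocaml
(* All names below (looping, exit_, flat_map, unwrap) are illustrative only. *)

(* A type with a single recursive constructor still has a cyclic inhabitant. *)
type looping = Loop of looping
let rec cyclic : looping = Loop cyclic

(* A genuinely empty type has no constructors at all. *)
type nothing = |

(* A plain (non-GADT) variant: the two parameters stay independent. *)
type ('a, 'e) exit_ =
  | Success of 'a
  | Failure of 'e

let flat_map (f : 'a -> ('b, 'e) exit_) (x : ('a, 'e) exit_) : ('b, 'e) exit_ =
  match x with
  | Success a -> f a
  | Failure e -> Failure e (* rebuilt, so the success type is free to change *)

(* With an empty error type, the Failure case is provably unreachable. *)
let unwrap (x : ('a, nothing) exit_) : 'a =
  match x with
  | Success v -> v
  | Failure _ -> . (* refutation case *)
```

Note that the Failure branch of flat_map rebuilds the value instead of returning it through an alias: the input has type ('a, 'e) exit_ while the result has type ('b, 'e) exit_, so the same value cannot be reused unchanged.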

Related

OCaml - What is an unsound type?

Recently I was given the code
List.fold_left (fun acc x -> raise x ; acc) 3
I'm completely fine with this partial application having a functional
value of type exn list -> int, and the fact it yields a warning
isn't surprising. I am, however, not certain what half of the warning
means:
Warning 21: this statement never returns (or has an unsound type.)
I can't actually find any reference to this warning where it isn't the
result of a non-returning statement. Even the man page for ocamlc only
mentions non-returning statements for this warning, and warnings.ml
refers to it merely as Nonreturning_statement.
I am familiar with the concept of soundness as it relates to type
systems, but the idea of a type itself being inherently unsound seems
odd to me.
So my questions are:
What exactly is an unsound type?
What's a situation in which an unsound type would arise when OCaml
would only issue a warning rather than failing hard outright?
Someone posted this question, and while I was writing an answer, it was deleted. I believe the question is very interesting and worth reposting. Please consider that there may be someone who is willing to help you :-(
How Warning 21 is reported
First, consider functions that return an unrelated 'a. I do not mean a function like let id x = x, since it has type 'a -> 'a and the return type 'a is tied to the input. I mean functions like raise : exn -> 'a and exit : int -> 'a.
Functions that return an unrelated 'a are considered never to return, since the type 'a (more precisely forall 'a. 'a) has no inhabitant. The only things such functions can do are terminate the program (exit), raise an exception, or fall into an infinite loop: let rec loop () = loop ().
Warning 21 is reported when the type of a statement is 'a. (Actually there is another condition, but I skip it for simplicity.) For example,
# loop (); print_string "end of the infinite loop";;
Warning 21: this statement never returns (or has an unsound type.)
This is the main purpose of warning 21. Then what is the latter half?
"Unsound type"
Warning 21 can be reported even if the statement actually returns something. In this case, as the warning message suggests, the statement has an unsound type.
Why unsound? Because the expression does return a value of type forall 'a. 'a, which has no inhabitant. It breaks the basis of the type theory OCaml depends on.
In OCaml, there are several ways to write an expression with such an unsound type:
Use of Obj.magic. It bypasses the type system, so you can write an expression of type 'a that does return:
(Obj.magic 1); print_string "2"
Use of external. As with Obj.magic, you can give an arbitrary type to any external value or function:
external crazy : unit -> 'a = "%identity"
let f () = crazy () (* val f : unit -> 'a *)
let _ = f (); print_string "3"
OCaml's type system cannot distinguish non-returning expressions from expressions with unsound types. This is why it cannot reject unsound things as errors. Tracking definitions to tell whether a statement has an unsound type is also impossible in general, and costly even when possible.
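
For comparison, here is a small sketch of the sound case: a function whose result type is an unrelated 'a because it never returns normally. The exception Stop and the function abort are hypothetical names introduced only for this illustration.

```ocaml
(* Stop and abort are illustrative names, not part of the standard library. *)
exception Stop of string

(* Like raise itself, abort has result type 'a: it never returns normally. *)
let abort (msg : string) : 'a = raise (Stop msg)

(* Because the result type is unconstrained, the same function fits an int
   context and a string context. *)
let as_int : int = try abort "no int" with Stop _ -> 0
let as_string : string = try abort "no string" with Stop m -> m
```

Nothing of type 'a is ever produced here; both bindings get their value from the exception handler. A statement such as abort "x"; print_string "unreachable" is the kind of code the first half of the warning is aimed at.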

Why does a partial application have value restriction?

I understand that mutability is the reason for the value restriction and weak polymorphism. Basically, a mutable ref inside a function may change the type involved and affect future uses of the function, so real polymorphism cannot be introduced, in case of a type mismatch.
For example,
# let remember =
    let cache = ref None in
    (fun x ->
       match !cache with
       | Some y -> y
       | None -> cache := Some x; x)
  ;;
val remember : '_a -> '_a = <fun>
In remember, cache originally has type 'a option, but once remember is called for the first time, as in let () = remember 1, cache becomes an int option, so the type is pinned down. The value restriction solves this potential problem.
What I still don't understand is the value restriction on partial application.
For example,
let identity x = x
val identity: 'a -> 'a = <fun>
let map_rep = List.map identity
val map_rep: '_a list -> '_a list = <fun>
In the functions above, I don't see any ref or mutable location, so why is the value restriction still applied?
Here is a good paper that describes OCaml's current handling of the value restriction:
Garrigue, Relaxing the Value Restriction
It has a good capsule summary of the problem and its history.
Here are some observations, for what they're worth. I'm not an expert, just an amateur observer:
The meaning of "value" in the term "value restriction" is highly technical, and isn't directly related to the values manipulated by a particular language. It's a syntactic term; i.e., you can recognize values by just looking at the symbols of the program, without knowing anything about types.
It's not hard at all to produce examples where the value restriction is too restrictive. I.e., where it would be safe to generalize a type when the value restriction forbids it. But attempts to do a better job (to allow more generalization) resulted in rules that were too difficult to remember and follow for mere mortals (such as myself).
The impediment to generalizing exactly when it would be safe to do so is not separate compilation (IMHO) but the halting problem. I.e., it's not possible in theory even if you see all the program text.
The value restriction is pretty simple: only let-bound expressions that are syntactically values are generalized. Applications, including partial applications, are not values and thus are not generalized.
Note that in general it is impossible to tell whether an application is partial, and thus whether the application could have an effect on the value of a reference cell. Of course in this particular case it is obvious that no such thing occurs, but the inference rules are designed to be sound in the event that it does.
A 'let' expression is not a (syntactic) value. While there is a precise definition of 'value', roughly the only values are identifiers, functions, constants, and constructors applied to values.
This paper and those it references explain the problem in detail.
Partial application doesn't preclude mutation. For example, here is a refactored version of your code that would also be incorrect without value restriction:
let aux cache x =
  match !cache with
  | Some y -> y
  | None -> cache := Some x; x
let remember = aux (ref None)
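
Since only syntactic values are generalized, the usual way to get the polymorphic type back is to eta-expand the partial application into an explicit function. The sketch below reuses identity from the question; map_weak and map_poly are names chosen here for illustration.

```ocaml
let identity x = x

(* A partial application is not a syntactic value, so its type is only
   weakly polymorphic. *)
let map_weak = List.map identity
(* The first use fixes the weak type variable to int. *)
let ints = map_weak [1; 2; 3]

(* Eta-expanded: a function definition is a syntactic value, so the type
   'a list -> 'a list is fully generalized. *)
let map_poly l = List.map identity l
let ints' = map_poly [1; 2; 3]
let strs = map_poly ["a"; "b"]
```

After the first call, map_weak can only be used on int lists, whereas map_poly works at every element type.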

OCaml: unspecified type variables

Is there any way to define a variable that matches any type in OCaml?
I came up with this: type term = Save of '_
after reading this tutorial
But it doesn't work.
Error: Syntax error
Could someone tell me why I have the error in the code above?
Every type variable matches "any type" in OCaml. However, all type variables in a type definition have to be bound, usually as a parameter:
type 'a term = Save of 'a
The data constructor defined here will have type Save : 'a -> 'a term. Consequently, in a value of type int term, the constructor is known to carry an integer.
But I'm not sure what you are trying to achieve. Maybe you also want an existential type, which "forgets" the type the variable is instantiated with? Then you need to use GADT syntax:
type term = Save : 'a -> term
Here, the data constructor will have type Save : 'a -> term. However, this type is not particularly useful, since you cannot do anything with the constructor's argument later, as it will be fully abstract when you match it (because it can be anything, and there is no way to tell what it is at that point -- unlike with the type above). So without understanding your use case, it's difficult to give a better answer.
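
One common way to make such an existential type usable, sketched here under the assumption that you know in advance which operation you will need, is to pack the value together with a function that consumes it. The names showable, Show and render are hypothetical.

```ocaml
(* showable, Show and render are illustrative names. *)

(* The type 'a is hidden, but a function that can consume it travels
   together with the value. *)
type showable = Show : 'a * ('a -> string) -> showable

let render (s : showable) : string =
  match s with
  | Show (v, to_string) -> to_string v

(* Values of different underlying types can now share one list. *)
let items =
  [ Show (33, string_of_int); Show ("yes", fun s -> s); Show (true, string_of_bool) ]

let rendered = List.map render items
```

Inside the match, v is still fully abstract; the only thing that can be done with it is to pass it to the function packed alongside it.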
A single underscore _ is more of an operator (a pattern, usually) than an identifier in OCaml. So that's your syntax problem. Furthermore, a type variable name can't start with an underscore. If you change to a legal name like 'a, you see the following:
# type term = Save of 'a;;
Error: Unbound type parameter 'a
#
It's not at all clear what you're trying to do, but the most straightforward definition that's similar to what you give might be this:
# type 'a term = Save of 'a;;
type 'a term = Save of 'a
Then you can have values with any type as the contents:
# Save 33;;
- : int term = Save 33
# Save "yes";;
- : string term = Save "yes"
#
I suspect you're trying to do something fancier than this, but if so you'll need to explain it more carefully (at least for me :-).

When does the relaxed value restriction kick in in OCaml?

Can someone give a concise description of when the relaxed value restriction kicks in? I've had trouble finding a concise and clear description of the rules. There's Garrigue's paper:
http://caml.inria.fr/pub/papers/garrigue-value_restriction-fiwflp04.pdf
but it's a little dense. Anyone know of a pithier source?
An Addendum
Some good explanations were added below, but I was unable to find an explanation there for the following behavior:
# let _x = 3 in (fun () -> ref None);;
- : unit -> 'a option ref = <fun>
# let _x = ref 3 in (fun () -> ref None);;
- : unit -> '_a option ref = <fun>
Can anyone clarify the above? Why does the stray definition of a ref within the RHS of the enclosing let affect the heuristic.
I am not a type theorist, but here is my interpretation of Garrigue's explanation. You have a value V. Start with the type that would be assigned to V (in OCaml) under the usual value restriction. There will be some number (maybe 0) of monomorphic type variables in the type. For each such variable that appears only in covariant position in the type (on the right sides of function arrows), you can replace it with a fully polymorphic type variable.
The argument goes as follows. Since your monomorphic variable is a variable, you can imagine replacing it with any single type. So you choose an uninhabited type U. Now since it is in covariant position only, U can in turn be replaced by any supertype. But every type is a supertype of an uninhabited type, hence it's safe to replace with a fully polymorphic variable.
So, the relaxed value restriction kicks in when you have (what would be) monomorphic variables that appear only in covariant positions.
(I hope I have this right. Certainly @gasche would do better, as octref suggests.)
Jeffrey provided the intuitive explanation of why the relaxation is correct. As to when it is useful, I think we can first reproduce the answer octref helpfully linked to:
You may safely ignore those subtleties until, someday, you hit a problem with an abstract type of yours that is not as polymorphic as you would like, and then you should remember that a covariance annotation in the signature may help.
We discussed this on reddit/ocaml a few months ago:
Consider the following code example:
module type S = sig
  type 'a collection
  val empty : unit -> 'a collection
end
module C : S = struct
  type 'a collection =
    | Nil
    | Cons of 'a * 'a collection
  let empty () = Nil
end
let test = C.empty ()
The type you get for test is '_a C.collection, instead of the 'a C.collection that you would expect. It is not a polymorphic type ('_a is a monomorphic inference variable that is not yet fully determined), and you won't be happy with it in most cases.
This is because C.empty () is not a value, so its type is not generalized (~ made polymorphic). To benefit from the relaxed value restriction, you have to mark the abstract type 'a collection covariant:
module type S = sig
  type +'a collection
  val empty : unit -> 'a collection
end
Of course this only happens because the module C is sealed with the signature S : module C : S = .... If the module C was not given an explicit signature, the type-system would infer the most general variance (here covariance) and one wouldn't notice that.
Programming against an abstract interface is often useful (when defining a functor, or enforcing a phantom type discipline, or writing modular programs) so this sort of situation definitely happens and it is then useful to know about the relaxed value restriction.
That's an example of when you need to be aware of it to get more polymorphism: you set up an abstraction boundary (a module signature with an abstract type), and it doesn't work automatically. You have to say explicitly that the abstract type is covariant.
In most cases it happens without your notice, when you manipulate polymorphic data structures. [] @ [] only has the polymorphic type 'a list thanks to the relaxation.
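
Both situations can be checked with a short sketch. It extends the signature S from above with two hypothetical operations, add and length, so that the result can be used at two different types, and it backs the abstract type with a plain list for brevity.

```ocaml
(* add and length are illustrative additions to the signature S above. *)
module type S = sig
  type +'a collection (* covariance declared on the abstract type *)
  val empty : unit -> 'a collection
  val add : 'a -> 'a collection -> 'a collection
  val length : 'a collection -> int
end

module C : S = struct
  type 'a collection = 'a list
  let empty () = []
  let add x c = x :: c
  let length = List.length
end

(* An application, yet generalized to 'a C.collection thanks to +'a. *)
let test = C.empty ()
let ints = C.add 1 test
let strs = C.add "a" test

(* The same relaxation applies to the built-in list type. *)
let nil = [] @ []
let one_int = 1 :: nil
let one_str = "x" :: nil
```

Without the + annotation, test would get a weak type, and using it with both an int and a string would be rejected.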
A concrete but more advanced example is Oleg's Ber-MetaOCaml, which uses a type ('cl, 'ty) code to represent quoted expressions which are built piecewise. 'ty represents the type of the result of the quoted code, and 'cl is a kind of phantom region variable that guarantees that, when it remains polymorphic, the scoping of variable in quoted code is correct. As this relies on polymorphism in situations where quoted expressions are built by composing other quoted expressions (so are generally not values), it basically would not work at all without the relaxed value restriction (it's a side remark in his excellent yet technical document on type inference).
The question why the two examples given in the addendum are typed differently has puzzled me for a couple of days. Here is what I found by digging into the OCaml compiler's code (disclaimer: I'm neither an expert on OCaml nor on the ML type system).
Recap
# let _x = 3 in (fun () -> ref None);; (* (1) *)
- : unit -> 'a option ref = <fun>
is given a polymorphic type (think ∀ α. unit → α option ref) while
# let _x = ref 3 in (fun () -> ref None);; (* (2) *)
- : unit -> '_a option ref = <fun>
is given a monomorphic type (think unit → α option ref, that is, the type variable α is not universally quantified).
Intuition
For the purposes of type checking, the OCaml compiler sees no difference between example (2) and
# let r = ref None in (fun () -> r);; (* (3) *)
- : unit -> '_a option ref = <fun>
since it doesn't look into the body of the let to see if the bound variable is actually used (as one might expect). But (3) clearly must be given a monomorphic type, otherwise a polymorphically typed reference cell could escape, potentially leading to unsound behaviour like memory corruption.
Expansiveness
To understand why (1) and (2) are typed the way they are, let's have a look at how the OCaml compiler actually checks whether a let expression is a value (i.e. "nonexpansive") or not (see is_nonexpansive):
let rec is_nonexpansive exp =
  match exp.exp_desc with
  (* ... *)
  | Texp_let(rec_flag, pat_exp_list, body) ->
      List.for_all (fun vb -> is_nonexpansive vb.vb_expr) pat_exp_list &&
      is_nonexpansive body
  | (* ... *)
So a let-expression is a value if both its body and all of its bound expressions are values.
In both examples given in the addendum, the body is fun () -> ref None, which is a function and hence a value. The difference between the two pieces of code is that 3 is a value while ref 3 is not. Therefore OCaml considers the first let a value but not the second.
Typing
Again looking at the code of the OCaml compiler, we can see that whether an expression is considered expansive determines how the type of the let-expressions is generalised (see type_expression):
(* Typing of toplevel expressions *)
let type_expression env sexp =
  (* ... *)
  let exp = type_exp env sexp in
  (* ... *)
  if is_nonexpansive exp then generalize exp.exp_type
  else generalize_expansive env exp.exp_type;
  (* ... *)
Since let _x = 3 in (fun () -> ref None) is nonexpansive, it is typed using generalize which gives it a polymorphic type. let _x = ref 3 in (fun () -> ref None), on the other hand, is typed via generalize_expansive, giving it a monomorphic type.
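
The difference can be observed from ordinary code by trying to use each result at two different types. This is a small sketch; make_poly and make_mono are names picked here for examples (1) and (2).

```ocaml
(* (1) The bound expression 3 is a value, so the whole let is nonexpansive
   and the result is generalized to unit -> 'a option ref. *)
let make_poly = let _x = 3 in fun () -> ref None

let r_int : int option ref = make_poly ()
let r_str : string option ref = make_poly ()

(* (2) ref 3 is an application, so the whole let is expansive and the type
   variable stays weak. *)
let make_mono = let _x = ref 3 in fun () -> ref None

(* The first use fixes the weak variable to int. *)
let m_int : int option ref = make_mono ()
(* A second use at another type would be rejected:
   let m_str : string option ref = make_mono () *)
```

Each call to make_poly allocates a fresh reference, which is why giving it a polymorphic type is harmless.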
That's as far as I got. If you want to dig even deeper, reading Oleg Kiselyov's Efficient and Insightful Generalization alongside generalize and generalize_expansive may be a good start.
Many thanks to Leo White from OCaml Labs Cambridge for encouraging me to start digging!
Although I'm not very familiar with this theory, I have asked a question about it.
gasche provided me with a concise explanation. The example is just a part of OCaml's map module. Check it out!
Maybe he will be able to provide you with a better answer. @gasche

Casting to and from a type parameter in F#

I'm just starting out in F#, and some of the issues around casting are confusing me mightily. Unfortunately, my background reading to try to figure out why is confusing me even more, so I'm looking for some specific answers I can fit into the general explanations...
I've got a ReadOnlyCollection<'T> of enums, produced by this function:
let GetValues<'T when 'T :> Enum> () =
    (new ReadOnlyCollection<'T>(Enum.GetValues (typeof<'T>) :?> 'T[])) :> IList<'T>
What I want to do with it is find all the bits of the enum that are used by its values (i.e., bitwise-or all the values in the list together), and return that as the generic enum type, 'T. The obvious way to do that seemed to me to be this:
let UsedBits<'T when 'T :> Enum> () =
    GetValues<'T>()
    |> Seq.fold (fun acc a -> acc ||| a) 0
...except that that fails to compile, with the error "The declared type parameter 'T' cannot be used here since the type parameter cannot be resolved at compile time."
I can get the actual job done by converting to Int32 first (which I don't really want to do, because I want this function to work on all enums regardless of underlying type), viz.:
let UsedBits<'T when 'T :> Enum> () =
    GetValues<'T>()
    |> Seq.map (fun a -> Convert.ToInt32(a))
    |> Seq.fold (fun acc a -> acc ||| a) 0
...but then the result is produced as Int32. If I try to cast it back to 'T, I again get compilation errors.
I don't want to get too specific in my question because I'm not sure which specifics I should be asking about, so -- where's the flaw(s) in this approach? How should I be going about it?
(Edited to add, after @Daniel's answer:
Alas, this appears to be one of those situations where I don't understand the context well enough to understand the answer, so...
I think I understand what inline and the different constraint are doing in your answer, but being an F# newbie, would you mind awfully expanding on those things a little so I can check that my understanding isn't way off base? Thanks.
)
You could do this:
let GetValues<'T, 'U when 'T : enum<'U>>() =
    Enum.GetValues(typeof<'T>) :?> 'T[]
let inline GetUsedBits() =
    GetValues() |> Seq.reduce (|||)
inline allows a more flexible constraint, namely 'T (requires member ( ||| )). Without it, the compiler must choose a constraint that can be expressed in IL, or, if unable to do so, choose a concrete type. In this case it chooses int since it supports (|||).
Here's a simpler repro:
let Or a b = a ||| b //add 'inline' to compare
See Statically Resolved Type Parameters on MSDN for more info.