OCaml gives function `A -> 1 | _ -> 0 the type [> `A] -> int, but why isn't that [> ] -> int?
This is my reasoning:
function `B -> 0 has type [<`B] -> int. Adding a `A -> 1 branch to make it function `A -> 1 | `B -> 0 loosens that to [<`A|`B] -> int. The function becomes more permissive in the type of argument it can accept. This makes sense.
function _ -> 0 has type 'a -> int. This type is unifiable with [> ] -> int, and [> ] is an already open type (very permissive). Adding the `A -> 1 branch to make it function `A -> 1 | _ -> 0 restricts the type to [>`A] -> int. That doesn't make sense to me. Indeed, adding yet another branch `C -> 1 will make it [>`A|`C] -> int, further restricting the type. Why?
Note: I am not looking for workarounds, I'd just like to know the logic behind this behavior.
On a related note, function `A -> `A | x -> x has type ([>`A] as 'a) -> 'a, and while that is also a restrictive open type for the parameter, I can understand the reason. The type should unify with 'a -> 'a, [>`A] -> 'b, 'c -> [>`A]; the only way to do it seems to be ([>`A] as 'a) -> 'a.
Is there a similar reason for my first example?
A possible answer is that the type [> ] -> int would allow an argument (`A 3) but this isn't allowed for function `A -> 1 | _ -> 0. In other words, the type needs to record the fact that `A takes no parameters.
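A minimal sketch of the behavior discussed here. The names f and g are only for illustration, and the comments give the types the toplevel is expected to report.

```ocaml
(* f : [> `A ] -> int *)
let f = function `A -> 1 | _ -> 0

(* g : ([> `A ] as 'a) -> 'a *)
let g = function `A -> `A | x -> x

let () =
  (* Any tag other than `A, with or without an argument, hits the wildcard. *)
  assert (f `A = 1);
  assert (f `B = 0);
  assert (f (`C 3) = 0);
  assert (g `B = `B)

(* Rejected by the type checker, because the inferred type records
   that `A carries no argument:
     f (`A 3) *)
```

The open type [> `A ] still accepts every other tag; the only thing it rules out is a use of `A with a different argument type.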
The reason is a very practical one:
In older versions of OCaml the inferred type was
[`A | .. ] -> int
which meant that `A takes no argument but may be absent.
However this type is unifiable with
[`B |`C ] -> int
which results in `A being discarded without any kind of check.
This makes it easy to introduce errors through misspellings.
For this reason, every variant constructor must appear in either an upper or a lower bound.
The typing of (function `A -> 1 | _ -> 0) is reasonable, as explained by Jeffrey. The reason why
(function `A -> 1 | _ -> 0) ((fun x -> (match x with `B -> ()); x) `B)
fails to type-check should be explained, in my opinion, by the latter part of the expression. Indeed the function (fun x -> (match x with `B -> ()); x) has input type [< `B] while its parameter `B has type [> `B ]. The unification of both types gives the closed type [ `B ] which is not polymorphic. It cannot be unified with the input type [> `A ] that you get from (function `A -> 1 | _ -> 0).
Fortunately, polymorphic variants do not only rely on (row) polymorphism. You can also use subtyping in situations such as this one, where you want to enlarge a closed type: [ `B ] is a subtype of, for example, [`A | `B], which is an instance of [> `A ]. Subtyping casts are explicit in OCaml, using the syntax (expr :> ty) (casting to some type), or (expr : ty :> ty) in case the domain type cannot be inferred correctly.
You can therefore write:
let b = (fun x -> (match x with `B -> ()); x) `B in
(function `A -> 1 | _ -> 0) (b :> [ `A | `B ])
which is well-typed.
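The same cast can be sketched with named types and functions. The names b, ab, only_b, count_a and widen are hypothetical, introduced only for this illustration.

```ocaml
type b = [ `B ]
type ab = [ `A | `B ]

(* Returns a value of the closed type [ `B ]. *)
let only_b (x : b) : b = (match x with `B -> ()); x

(* count_a : [> `A ] -> int *)
let count_a = function `A -> 1 | _ -> 0

(* Explicit upcast from the closed type b to the larger closed type ab. *)
let widen (x : b) : ab = (x :> ab)

let () =
  assert (count_a (widen (only_b `B)) = 0);
  (* The two-sided form names the source type as well. *)
  assert (count_a (`B : b :> ab) = 0);
  assert (count_a `A = 1)
```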
What does this error mean and how can I solve it? I am trying to generate types based on the CSS spec at styled-ppx and got stuck on this error; I don't know how to fix it or what exactly it means.
I tried to get OCaml's inferred types for the target file with dune using (ocamlc_flags -i :standard), because my suspicion is that the inferred types and the generated types are clashing since both are polymorphic variants, but I had problems generating targets.
You can check the pull request with the problem being reproducible here
In your ppx code,
let mk_typ = (name, types) => {
let core_type = ptyp_variant(types, Closed, None);
it is unlikely that the meaning of types matches your expectation.
In this context, the list types represents an intersection (aka a conjunction) of types.
For instance, in
type 'a t = [< `A of int & float ] as 'a
the types list would contain the ast nodes for the type int and float.
(You can have a look at the latest version of OCaml manual https://ocaml.org/api/compilerlibref/Parsetree.html#TYPErow_field_desc to have more detailed description of those AST nodes.)
You probably meant to box the list of types in a product type
type t = [ `A of (int * float) ]
Indeed, polymorphic variant constructors always have arity one. Otherwise,
`A _ * _ would not be unifiable with `A of _.
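As a small sketch of the boxed form, the constructor below carries a single argument that happens to be a pair. The name sum is hypothetical.

```ocaml
(* `A has arity one: its single argument is the pair int * float. *)
type t = [ `A of int * float ]

let sum : t -> float = function
  | `A (n, x) -> float_of_int n +. x

let () = assert (sum (`A (1, 2.5)) = 3.5)
```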
Concerning your exact error message, the conjunctive type error that you are half-describing is likely related to the fact that conjunctive types (like int & float) are only allowed as arguments of polymorphic variant constructors in the upper bound of a polymorphic variant type.
In other words,
type 'a t = [< `A of int & float ] as 'a
is fine because we are on the right-hand side of <.
But both
type 'a t = [> `A of int & float ] as 'a
and
type t = [ `A of int & float ]
yield an error
Error: The present constructor A has a conjunctive type
because the conjunction of types appears in the lower bound of the type which lists the constructors that were explicitly present.
EDIT: Why are conjunctive types sometimes allowed?
Having conjunctive types makes it possible to use functions that have incoherent interpretation of some constructors.
For instance, I could have a function that works on either
`A of int or `B of float
let f = function
  | `A n -> float_of_int n
  | `B f -> f
and another function that expects both `A and `B to have a float argument
let g (`A f | `B f ) = f
If I try to apply f and g on the same argument
let h x = f x +. g x
I end up in a situation where h can be applied to `B 0. without trouble
let z = h (`B 0.)
but trying to use h on `A _ for any _ cannot work
let error = h (`A 0)
Error: This expression has type [> `A of int ]
but an expression was expected of type
[< `A of float & int | `B of float ]
Types for tag `A are incompatible
This is the kind of situation where conjunctive types arise: h has type [< `B of float | `A of int & float ] -> float. Moreover, we can wait until we get a concrete argument of the form `A x to check whether the constructor argument x fits in the conjunction of types int & float. In this specific case, we know that there are no types that are both int and float, but more complex cases can happen, and delaying the check is a simple way to handle all cases.
Conversely, when we have a concrete value of type [> `A of 'x ], there is no reason to delay this check. Thus it is not possible to construct a positive value that has a conjunctive type in OCaml. Consequently, conjunctive types cannot be written in a positive position.
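Putting the pieces above into one file gives a compilable sketch. The definitions are the ones from the answer; only the assertions are added for illustration.

```ocaml
let f = function
  | `A n -> float_of_int n
  | `B f -> f

(* g returns the argument of either constructor; inside h it is used
   at type float because of the +. operator. *)
let g (`A f | `B f) = f

(* Expected type:
   h : [< `A of int & float | `B of float ] -> float *)
let h x = f x +. g x

let () =
  (* `B is consistent between f and g, so h accepts it. *)
  assert (h (`B 1.5) = 3.0);
  (* f and g can still each be used on `A separately. *)
  assert (f (`A 2) = 2.0);
  assert (g (`A 2.5) = 2.5)

(* Rejected by the type checker, since no argument is both int and float:
     h (`A 0) *)
```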
Suppose I have a type consisting of multiple polymorphic variants (covariantly) such as the following:
[> `Ok of int | `Error of string]
Let's further suppose that I want to factor this definition into some kind of type constructor and a concrete type int. My first attempt was the following:
type 'a error = [> `Ok of 'a | `Error of string]
However, using a definition like this produces a really strange type error mentioning a type variable 'b that doesn't appear anywhere in the definition.
$ ocaml
OCaml version 4.07.0
# type 'a error = [> `Ok of 'a | `Error of string ];;
Error: A type variable is unbound in this type declaration.
In type [> `Error of string | `Ok of 'a ] as 'b the variable 'b is unbound
This 'b is an autogenerated name; adding an explicit 'b shifts the variable to 'c.
$ ocaml
OCaml version 4.07.0
# type ('a, 'b) error = [> `Ok of 'a | `Error of 'b ];;
Error: A type variable is unbound in this type declaration.
In type [> `Error of 'b | `Ok of 'a ] as 'c the variable 'c is unbound
Using the invariant construction [ `Thing1 of type1 | `Thing2 of type2 ] appears to work fine in this context.
$ ocaml
OCaml version 4.07.0
# type 'a error = [ `Ok of 'a | `Error of string ] ;;
type 'a error = [ `Error of string | `Ok of 'a ]
#
However, explicitly marking the type parameter as covariant does not salvage the original example.
$ ocaml
OCaml version 4.07.0
# type +'a error = [> `Ok of 'a | `Error of string];;
Error: A type variable is unbound in this type declaration.
In type [> `Error of string | `Ok of 'a ] as 'b the variable 'b is unbound
And, just for good measure, adding a contravariance annotation also does not work.
$ ocaml
OCaml version 4.07.0
# type -'a error = [> `Ok of 'a | `Error of string];;
Error: A type variable is unbound in this type declaration.
In type [> `Error of string | `Ok of 'a ] as 'b the variable 'b is unbound
Attempting to guess the name that the compiler will use for the unbound type variable and adding it as a parameter on the left also does not work and produces a very bizarre error message.
$ ocaml
OCaml version 4.07.0
# type ('a, 'b) string = [> `Ok of 'a | `Error of string] ;;
Error: The type constructor string expects 2 argument(s),
but is here applied to 0 argument(s)
Is there a way of making a type constructor that can effectively "substitute different types" for int in [> `Ok of int | `Error of string]?
This isn't an issue of variance, or parametric polymorphism, but of row polymorphism. When you add > or < it also adds an implicit type variable, the row variable, that will hold the "full" type. You can see this type variable made explicit in the error:
[> `Error of string | `Ok of 'a ] as 'b
Note the as 'b part at the end.
In order to alias the type you have to make the type variable explicit, so you can reference it as a type parameter on the alias:
type ('a, 'r) error = [> `Ok of 'a | `Error of string ] as 'r
Note also, in case you have run into objects or when you do, that this applies there as well. An object type with .. has an implicit type variable that you need to make explicit in order to alias it:
type 'r obj = < foo: int; .. > as 'r
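A short sketch of how these two aliases might be used. The functions describe and get_foo are hypothetical names, and the wildcard `_` in the annotations lets the compiler fill in the row variable.

```ocaml
type ('a, 'r) error = [> `Ok of 'a | `Error of string ] as 'r
type 'r obj = < foo : int; .. > as 'r

(* The wildcard branch keeps the variant type open. *)
let describe (x : (int, _) error) =
  match x with
  | `Ok n -> "ok " ^ string_of_int n
  | `Error msg -> "error " ^ msg
  | _ -> "other"

(* Works on any object that has at least a method foo : int. *)
let get_foo (o : _ obj) = o#foo

let () =
  assert (describe (`Ok 3) = "ok 3");
  assert (describe (`Error "boom") = "error boom");
  assert (describe `Timeout = "other");
  assert (get_foo (object method foo = 1 method bar = "x" end) = 1)
```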
The following type declarations do not work:
type 'a or_null = [ 'a | `Null ]
and
type 'a or_null = [ 'a | `Null ] constraint 'a = [> `A | `B ]
With the message:
Error: The type 'a does not expand to a polymorphic variant type
Hint: Did you mean `a
I would like to achieve this without using another layer in the memory representation (and in the syntax). In particular, I want to avoid using an option type such as
type 'a or_null = | A of 'a | Null
Is there a way to have such a type using only polymorphic variants? The final goal would be to write e.g. monads on 'a or_null types. (And this is actually the tricky part.)
Polymorphic variants cannot track the absence of a specific constructor. This implies that we cannot really write the usual bind. If we try
let bind x f =
  match x with
  | `Null -> `Null
  | x -> f x
we get
val bind: ([> `Null] as 'a) -> ('a -> ([>`Null] as 'b)) -> 'b
If for readability's sake, we add the following type abbreviation
type 'a m = [> `Null] as 'a
(which is an alternative definition of or_null) the previous type reads as
val bind: 'a m -> ('a m -> 'b m) -> 'b m
In other words, the function argument f of bind must already handle the `Null case in its argument by itself because the type system cannot express the constraint x <> `Null in the second branch of the match.
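A sketch of what this means in practice. The function incr and the tag `Int are hypothetical, chosen only to show that the callback has to mention `Null even though bind never passes it.

```ocaml
let bind x f =
  match x with
  | `Null -> `Null
  | x -> f x

(* The `Null branch is never reached through bind, but the type of
   bind forces the argument type of f to include `Null. *)
let incr = function
  | `Int n -> `Int (n + 1)
  | `Null -> `Null

let () =
  assert (bind (`Int 1) incr = `Int 2);
  assert (bind `Null incr = `Null)
```

Dropping the `Null branch from incr would make both calls to bind ill-typed, which is the limitation described above.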
If I define
fun id x = x
Then naturally id has type 'a -> 'a
Of course, id 0 evaluates to 0, which makes perfect sense.
Since this makes perfect sense, I should be able to encapsulate it by a function:
fun applyToZero (f: 'a -> 'a) = f 0
With the hope that applyToZero will have type ('a -> 'a) -> int and applyToZero id will evaluate to 0
But when I try to define applyToZero as above, SML/NJ gives an odd error message which begins:
unexpected exception (bug?) in SML/NJ: Match [nonexhaustive match failure]
raised at: ../compiler/Elaborator/types/unify.sml:84.37
This almost looks like a bug in the compiler itself. Weird, but possible.
But PolyML doesn't like it either (though its error message is less odd):
> fun applyToZero (f: 'a -> 'a) = f 0;
poly: : error: Type error in function application.
Function: f : 'a -> 'a
Argument: 0 : int
Reason: Can't unify int to 'a (Cannot unify with explicit type variable)
Found near f 0
The following does work:
fun ignoreF (f: 'a -> 'a) = 1
with the inferred type ('a -> 'a) -> int. This shows that it isn't impossible to create a higher order function of this type.
Why doesn't SML accept my definition of applyToZero? Is there any workaround that will allow me to define it so that its type is ('a -> 'a) -> int?
Motivation: in my attempt to solve the puzzle in this question, I was able to define a function tofun of type int -> 'a -> 'a and another function fromfun with the desired property that fromfun (tofun n) = n for all integers n. However, the inferred type of my working fromfun is ('int -> 'int) -> 'int. All of my attempts to add type annotations so that SML will accept it as ('a -> 'a) -> int have failed. I don't want to show my definition of fromfun since the person that asked that question might still be working on that puzzle, but the definition of applyToZero triggers exactly the same error messages.
It can't be done in plain Hindley-Milner, like used by SML, because it does not support so-called higher-ranked or first-class polymorphism. The type annotation 'a -> 'a and the type ('a -> 'a) -> int do not mean what you think they do.
That becomes clearer if we make the binder for the type variable explicit.
fun ignoreF (f: 'a -> 'a) = 1
actually means
fun 'a ignoreF (f: 'a -> 'a) = 1
that is, 'a is a parameter to the whole function ignoreF, not to its argument f. Consequently, the type of the function is
ignoreF : ∀ 'a. (('a -> 'a) -> int)
Here, I make the binder for 'a explicit in the type as a universal quantifier. That's how you write such types in type theory, while SML keeps all quantifiers implicit in its syntax. Now the type you thought this had would be written
ignoreF : (∀ 'a. ('a -> 'a)) -> int
Note the difference: in the first version, the caller of ignoreF gets to choose how 'a is instantiated, hence it could be anything, and the function cannot assume it is int (which is why applyToZero does not type-check). In the second type, the caller of the argument gets to choose, i.e., ignoreF.
But such a type is not supported by Hindley-Milner. It only supports so-called prenex polymorphism (or rank-1 polymorphism), where all the ∀ are on the outermost level -- which is why it can keep them implicit, since there is no ambiguity under this restriction. The problem with higher-ranked polymorphism is that type inference for it is undecidable in general.
So your applyToZero cannot have the type you want in SML. The only way to achieve something like it is by using the module system and its functors:
functor ApplyToZero (val f : 'a -> 'a) = struct val it = f 0 end
Btw, the error message you quote from SML/NJ cannot possibly be caused by the code you showed. You must have done something else.
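For comparison only, and in OCaml rather than SML: OCaml allows a ∀ to be placed inside a record field, which gives a way to write the higher-ranked type discussed above. The names poly_id, run, apply_to_zero and apply_both are hypothetical.

```ocaml
(* The field type 'a. 'a -> 'a is an explicit universal quantifier:
   whoever reads the field chooses 'a. *)
type poly_id = { run : 'a. 'a -> 'a }

(* Plays the role of (∀ 'a. 'a -> 'a) -> int. *)
let apply_to_zero (p : poly_id) : int = p.run 0

(* The same argument can be used at two different types. *)
let apply_both (p : poly_id) = (p.run 0, p.run "zero")

let () =
  let id = { run = (fun x -> x) } in
  assert (apply_to_zero id = 0);
  assert (apply_both id = (0, "zero"))
```

The quantifier has to be written by hand in the type declaration; it is not inferred.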
If we use Hindley-Milner type inference algorithm on fun applyToZero f = f 0 we are going to get f : int -> 'a because of the term f 0.
Obviously, f is a function f : 'b -> 'a. We apply this function to 0, thus 'b = int. Hence, the explicit type annotation f : 'a -> 'a produces the error you observe.
By the way, SML/NJ v110.80 works fine on my machine and prints the following error message:
stdIn:2.39-2.42 Error: operator and operand don't agree [overload - user bound tyvar]
operator domain: 'a
operand: [int ty]
in expression:
f 0
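The inference step described in this answer can be observed in OCaml as well (shown in OCaml rather than SML; the function names are hypothetical). One difference worth noting: in OCaml a type variable in an annotation is an ordinary unification variable, so the annotated version is accepted and 'a is simply instantiated to int instead of producing an error.

```ocaml
(* Inferred type: (int -> 'a) -> 'a, because f is applied to 0. *)
let apply_to_zero f = f 0

(* In OCaml the annotation unifies 'a with int, so the resulting type
   is (int -> int) -> int, not ('a -> 'a) -> int. *)
let apply_annotated (f : 'a -> 'a) = f 0

let () =
  assert (apply_to_zero string_of_int = "0");
  assert (apply_to_zero (fun x -> x) = 0);
  assert (apply_annotated succ = 1)
```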
Trying to compile
module F (M : sig
  type t = [> `Foo ]
end) = struct
  type t = [ M.t | `Bar ]
end
gets me
Error: A type variable is unbound in this type declaration.
In type [> `Foo ] as 'a the variable 'a is unbound
What am I doing wrong?
type t = [> `Foo] is invalid since [> `Foo] is an open type and contains a type variable implicitly. The definition is rejected just as the following type definition is rejected since the RHS has a type variable which is not quantified in LHS:
type t = 'a list
You have to make it closed:
type t = [ `Foo ]
or quantify the type variable:
type 'a t = [> `Foo] as 'a
which is equivalent to
type 'a t = 'a constraint 'a = [> `Foo]
This seems to work:
module F (M : sig
  type base_t = [ `Foo ]
  type 'a t = [> base_t] as 'a
end) = struct
  type t = [ M.base_t | `Bar ] M.t
end
M.base_t is closed, while 'a M.t is polymorphic. F constructs M.t using M.base_t extended with `Bar.
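A sketch of how this functor might be instantiated and used. The modules Base and Ext and the function to_string are hypothetical names added for illustration.

```ocaml
module F (M : sig
  type base_t = [ `Foo ]
  type 'a t = [> base_t ] as 'a
end) = struct
  type t = [ M.base_t | `Bar ] M.t
end

(* A module matching the functor's parameter signature. *)
module Base = struct
  type base_t = [ `Foo ]
  type 'a t = [> base_t ] as 'a
end

module Ext = F (Base)

(* Ext.t expands to the closed type [ `Bar | `Foo ]. *)
let to_string : Ext.t -> string = function
  | `Foo -> "foo"
  | `Bar -> "bar"

let () =
  assert (to_string `Foo = "foo");
  assert (to_string `Bar = "bar")
```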
Here is a ReasonML "Try" link, which includes the snippet above in both OCaml and ReasonML syntax and shows that it compiles.