I'm working on an assignment where I have to write a function to get the length of a list. This is a trivial task, but I've come across something that I don't understand.
My simple code
val len = foldr (fn(_, y) => y + 1) 0
produces this warning
Warning: type vars not generalized because of
value restriction are instantiated to dummy types (X1,X2,...)
and when I try to run it in the REPL, I get this:
len [1, 2, 3, 4];
stdIn:18.1-18.17 Error: operator and operand don't agree [overload conflict]
operator domain: ?.X1 list
operand: [int ty] list
in expression:
len (1 :: 2 :: 3 :: <exp> :: <exp>)
I don't understand why this doesn't work. I do know some functional programming principles, and this should work, since it's a very simple partial application.
Of course I can make it work without partial application, like this
fun len xs = foldr (fn(_, y) => y + 1) 0 xs
but I would like to understand why the first version doesn't work.
This is an instance of the value restriction rule:
In short, the value restriction says that generalization can only occur if the right-hand side of an expression is syntactically a value.
Syntactically,
foldr (fn(_, y) => y + 1) 0
is not a value but a function application, which is why it has not been assigned a polymorphic type. Instead it has been instantiated with a dummy type, which has very limited use; e.g. this works:
len []
but in most cases len defined as val is useless.
This restriction exists to guarantee type safety in the presence of variable assignment (via references). More details can be found in the linked page.
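If you want to keep the val form, one workaround (a minimal sketch, assuming SML/NJ and the Basis foldr used in the question) is to annotate the binding with a concrete monomorphic type, so that there is no type variable left to generalize and the restriction is never triggered:

val len : int list -> int = foldr (fn (_, y) => y + 1) 0
val three = len [1, 2, 3]   (* evaluates to 3 *)

The fun version in the question works for a different reason: a fun binding is syntactically a value (a lambda), so it can be generalized and stays polymorphic.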
The first and the second flatMap work well. Why doesn't the third one work?
fun flatMap f xs = List.concat(List.map f xs)
fun flatMap f = List.concat o List.map f
val flatMap = (fn mmp => List.concat o mmp) o List.map;
This is due to a rule called "value polymorphism" or the "value restriction". According to this rule, a value declaration can't create a polymorphic binding if the expression might be "expansive"; that is, a value declaration can only create a polymorphic binding if it conforms to a highly restricted grammar that ensures it can't create ref cells or exception names.
In your example, since (fn mmp => List.concat o mmp) o List.map calls the function o, it's not non-expansive; you know that o doesn't create ref cells or exception names, but the grammar can't distinguish that.
So the declaration val flatMap = (fn mmp => List.concat o mmp) o List.map is still allowed, but it can't create a polymorphic binding: it has to give flatMap a monomorphic type, such as (int -> real list) -> int list -> real list. (Note: not all implementations of Standard ML can infer the desired type in all contexts, so you may need to add an explicit type hint.)
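For instance, here is a minimal sketch (reusing the example type from the previous paragraph; any other concrete type works the same way) of the val form being accepted once it is pinned to a monomorphic type:

val flatMap : (int -> real list) -> int list -> real list =
  (fn mmp => List.concat o mmp) o List.map

Of course, the eta-expanded fun versions from the question remain the usual way to keep flatMap polymorphic.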
This restriction exists to ensure that we don't implicitly cast from one type to another by writing to a ref cell using one type and reading from it using a different type, or by wrapping one type in a polymorphic exception constructor and unwrapping it using a different type. For example, the below programs are forbidden by the value restriction, but if they were allowed, each would create a variable named garbage of type string that is initialized from the integer 17:
val refCell : 'a option ref =
    ref NONE
val () = refCell := SOME 17
val garbage : string =
    valOf (! refCell)

val (wrap : 'a -> exn, unwrap : exn -> 'a) =
    let
        exception EXN of 'a
    in
        (fn x => EXN x, fn EXN x => x)
    end
val garbage : string =
    unwrap (wrap 17)
For more information:
"ValueRestriction" in the MLton documentation
"Value restriction" on the English Wikipedia
"Types and Type Checking" in SML/NJ's guide to converting programs from Standard ML '90 to Standard ML '97. (Standard ML '90 had a different version of this rule, that was more permissive — it would have allowed your program — but considered "somewhat subtle" and in some cases "unpleasant", hence its replacement in Standard ML '97.)
the following sections of The Definition of Standard ML (Revised) (PDF):
§4.7 "Non-expansive Expressions", page 21, which defines which expressions are considered "non-expansive" (and can therefore be used in polymorphic value declarations).
§4.8 "Closure", pages 21–22, which defines the operation that makes a binding polymorphic; this operation enforces the value restriction by preventing the binding from becoming polymorphic if the expression might be expansive.
inference rule (15), page 26, which uses the aforementioned operation; see also the comment on page 27.
the comment on inference rule (20), page 27, which explains why the aforementioned operation is not applied to exception declarations. (Technically this is somewhat separate from the value restriction; but the value restriction would be useless without this.)
§G.4 "Value Polymorphism", pages 105–106, which discusses this change from Standard ML '90.
I'm reading Expert F# 4.0 and at some point (p.93) the following syntax is introduced for list:
type 'T list =
| ([])
| (::) of 'T * 'T list
Although I understand conceptually what's going on here, I do not understand the syntax. Apparently you can put [] or :: between parentheses and they mean something special.
Other symbols aren't allowed, for example (++) or (||). So what's going on here?
And another thing is the 'operator' nature of (::). Suppose I have the following (weird) type:
type 'T X =
| None
| Some of 'T * 'T X
| (::) of 'T * 'T X
Now I can say:
let x: X<string> = Some ("", None)
but these aren't allowed:
let x: X<string> = :: ("", None)
let x: X<string> = (::) ("", None)
So (::) is actually something completely different than Some, although both are cases in a discriminated union.
Strictly speaking, the F# spec (see section 8.5) says that union case identifiers must be alphanumeric sequences starting with an upper-case letter.
However, this way of defining list cons is an ML idiomatic thing. There would be riots in the streets if we were forced to write Cons (x, Cons(y, Cons (z, Empty))) instead of x :: y :: z :: [].
So an exception was made for just these two identifiers, ([]) and (::): you can use these, but only these two. Besides them, only capitalized alphanumeric names are allowed.
However, you can define free-standing functions with these funny names:
let (++) a b = a * b
These functions are usually called "operators" and can be called via infix notation:
let x = 5 ++ 6 // x = 30
As opposed to regular functions that only support prefix notation - i.e. f 5 6.
There is a separate, quite intricate set of rules about which characters are allowed in operators, which operators can be only unary, which can be only binary, which can be both, and how they determine the resulting operator precedence. See section 4.1 of the spec for the full reference.
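For comparison with the Standard ML discussed elsewhere on this page, SML takes a different route: there is no fixed operator character set, and any identifier (symbolic or alphanumeric) can be made infix with an infix directive that also fixes its precedence. A minimal sketch mirroring the F# example above:

infix 6 ++
fun a ++ b = a * b
val x = 5 ++ 6   (* 30 *)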
I was reading a little bit about the value restriction in Standard ML and tried translating the example to OCaml to see what it would do. It seems like OCaml produces these types in contexts where SML would reject a program due to the value restriction. I've also seen them in other contexts like empty hash tables that haven't been "specialized" to a particular type yet.
http://mlton.org/ValueRestriction
Here's an example of a rejected program in SML:
val r: 'a option ref = ref NONE
val r1: string option ref = r
val r2: int option ref = r
val () = r1 := SOME "foo"
val v: int = valOf (!r2)
If you enter the first line verbatim into the SML of New Jersey REPL you get
the following error:
- val r: 'a option ref = ref NONE;
stdIn:1.6-1.33 Error: explicit type variable cannot be generalized at its binding declaration: 'a
If you leave off the explicit type annotation you get
- val r = ref NONE
stdIn:1.6-1.18 Warning: type vars not generalized because of
value restriction are instantiated to dummy types (X1,X2,...)
val r = ref NONE : ?.X1 option ref
What exactly is this dummy type? It seems like it's completely inaccessible and fails to unify with anything:
- r := SOME 5;
stdIn:1.2-1.13 Error: operator and operand don't agree [overload conflict]
operator domain: ?.X1 option ref * ?.X1 option
operand: ?.X1 option ref * [int ty] option
in expression:
r := SOME 5
In OCaml, by contrast, the dummy type variable is accessible and unifies with the first thing it can.
# let r : 'a option ref = ref None;;
val r : '_a option ref = {contents = None}
# r := Some 5;;
- : unit = ()
# r ;;
- : int option ref = {contents = Some 5}
This is sort of confusing and raises a few questions.
1) Could a conforming SML implementation choose to make the "dummy" type above accessible?
2) How does OCaml preserve soundness without the value restriction? Does it make weaker guarantees than SML does?
3) The type '_a option ref seems less polymorphic than 'a option ref. Why isn't let r : 'a option ref = ref None;; (with an explicit annotation) rejected in OCaml?
Weakly polymorphic types (the '_-style types) are a programming convenience rather than a true extension of the type system.
2) How does OCaml preserve soundness without the value restriction? Does it make weaker guarantees than SML does?
OCaml does not sacrifice the value restriction; rather, it implements a heuristic that saves you from systematically annotating the type of values like ref None whose type is only "weakly" polymorphic. This heuristic works by looking at the current "compilation unit": if it can determine the actual type for a weakly polymorphic type, then everything works as if the initial declaration had the appropriate type annotation; otherwise the compilation unit is rejected with the message:
Error: The type of this expression, '_a option ref,
contains type variables that cannot be generalized
3) The type '_a option ref seems less polymorphic than 'a option ref. Why isn't let r : 'a option ref = ref None;; (with an explicit annotation) rejected in OCaml?
This is because '_a is not a "real" type; for instance, it is forbidden to write a signature that explicitly defines values of this "type":
# module A : sig val table : '_a option ref end = struct let option = ref None end;;
Characters 27-30:
module A : sig val table : '_a option ref end = struct let option = ref None end;;
^^^
Error: The type variable name '_a is not allowed in programs
It is possible to avoid these weakly polymorphic types by using a recursive declaration to pack together the weakly polymorphic binding and the later use that pins down its type, e.g.:
# let rec r = ref None and set x = r := Some(x + 1);;
val r : int option ref = {contents = None}
val set : int -> unit = <fun>
1) Could a conforming SML implementation choose to make the "dummy" type above accessible?
The revised Definition (SML97) doesn't specify that there be a "dummy" type; all it formally specifies is that the val can't introduce a polymorphic type variable, since the right-hand-side expression isn't a non-expansive expression. (There are also some comments about type variables not leaking into the top level, but as Andreas Rossberg points out in his Defects in the Revised Definition of Standard ML, those comments are really about undetermined types rather than the type variables that appear in the definition's formalism, so they can't really be taken as part of the requirements.)
In practice, I think there are four approaches that implementations take:
some implementations reject the affected declarations during type-checking, and force the programmer to specify a monomorphic type.
some implementations, such as MLton, prevent generalization, but defer unification, so that the appropriate monomorphic type can become clear later in the program (see the sketch below).
SML/NJ, as you've seen, issues a warning and instantiates a dummy type that cannot subsequently be unified with any other type.
I think I've heard that some implementation defaults to int? I'm not sure.
All of these options are presumably permitted and apparently sound, though the "defer unification" approach does require care to ensure that the type doesn't unify with an as-yet-ungenerated type name (especially a type name from inside a functor, since then the monomorphic type may correspond to different types in different applications of the functor, which would of course have the same sorts of problems as a regular polymorphic type).
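To make the second option concrete, here is a minimal sketch (assuming an implementation that defers unification, as MLton does) of a program such an implementation accepts, whereas SML/NJ warns at the first line and then rejects the second, as shown in the question:

val r = ref NONE       (* not generalized; r gets some not-yet-determined monomorphic type *)
val () = r := SOME 5   (* this later use pins that type down to int option ref *)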
2) How does OCaml preserve soundness without the value restriction? Does it make weaker guarantees than SML does?
I'm not very familiar with OCaml, but from what you write, it sounds like it uses the same approach as MLton; so, it should not have to sacrifice soundness.
(By the way, despite what you imply, OCaml does have the value restriction. There are some differences between the value restriction in OCaml and the one in SML, but none of your code-snippets relates to those differences. Your code snippets just demonstrate some differences in how the restriction is enforced in OCaml vs. one implementation of SML.)
3) The type '_a option ref seems less polymorphic than 'a option ref. Why isn't let r : 'a option ref = ref None;; (with an explicit annotation) rejected in OCaml?
Again, I'm not very familiar with OCaml, but — yeah, that seems like a mistake to me!
To answer the second part of your last question,
3) [...] Why isn't let r : 'a option ref = ref None;; (with an explicit annotation) rejected in OCaml?
That is because OCaml has a different interpretation of type variables occurring in type annotations: it interprets them as existentially quantified, not universally quantified. That is, a type annotation only has to be right for some possible instantiation of its variables, not for all. For example, even
let n : 'a = 5
is totally valid in OCaml. Arguably, this is rather misleading and not the best design choice.
To enforce polymorphism in OCaml, you have to write something like
let n : 'a. 'a = 5
which would indeed cause an error. However, this introduces a local quantifier, so it is still somewhat different from SML, and it doesn't work for examples where 'a needs to be bound elsewhere, e.g. the following:
fun pair (x : 'a) (y : 'a) = (x, y)
In OCaml, you have to rewrite this to
let pair : 'a. 'a -> 'a -> 'a * 'a = fun x y -> (x, y)
I'm trying to write an SML function that takes two arguments: the first is an int and the second is a list of lists. The objective is to insert the first argument onto the front of every list in the second argument. For example, append_to_front(1,[[3,4],[6,8],[]]) should return [[1,3,4],[1,6,8],[1]].
I have the code:
fun append_to_front(a:int, L:int list list) =
if L = []
then []
else a::hd(L)::append_to_front(a, tl(L));
and I get the error message: Error: operator and operand don't agree [tycon mismatch]. Why?
The cons operator :: has type 'a * 'a list -> 'a list, that is, it requires an element on the left and a list on the right. Moreover, it is right-associative, i.e., a::b::c = a::(b::c).
In your case, a has type int, hd(L) has type int list, and append_to_front(a, tl(L)) has type int list list. Because of the right-associativity, your expression parses as a :: (hd(L) :: append_to_front(a, tl(L))): the inner :: is fine and produces an int list list, but the outer :: then expects a to be an int list, which it isn't. Since you want to cons a onto the head list first and only then cons the result onto the rest, parenthesize explicitly: (a::hd(L)) :: append_to_front(a, tl(L)).
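For completeness, here is a minimal sketch of a version that produces the intended result, using pattern matching instead of hd, tl and the equality test against []:

fun append_to_front (a : int, [] : int list list) = []
  | append_to_front (a, l :: ls) = (a :: l) :: append_to_front (a, ls)

val test = append_to_front (1, [[3, 4], [6, 8], []])
(* [[1, 3, 4], [1, 6, 8], [1]] *)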
Why is the type of a plus ( + ) considered to be int -> int -> int as opposed to (int * int) -> int? To me, the second makes sense because it "accepts" a 2-tuple (the addends) and returns a single int (their sum).
Thank you!
You can make a language where (+) has the type (int * int) -> int. In fact, SML works exactly this way. It just affects the meaning of infix operators. However OCaml conventions strongly favor the use of curried functions (of the type a -> b -> c) rather than uncurried ones. One nice result is that you can partially apply them. For example ((+) 7) is a meaningful expression of type int -> int. I find this notation useful quite often.
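To make the contrast concrete in Standard ML terms (where, as noted, the built-in + really does have the tupled type), here is a minimal sketch; the op keyword gives the prefix view of the infix +, and a curried addition has to be written by hand if you want partial application:

val three = 1 + 2            (* infix use of the tupled + *)
val also3 = op+ (1, 2)       (* op+ : int * int -> int, applied to a pair *)
fun addCurried x y = x + y   (* int -> int -> int *)
val add7 = addCurried 7      (* partial application: int -> int *)
val fifteen = add7 8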
This might seem a little unhelpful, but it's because the function takes two arguments.
When a function takes a tuple, it is in effect taking a single argument.
Because (+) is an infix function, taking a single tuple argument would not be useful, as applying it would then look like (+) (1, 2) as opposed to 1 + 2.