Prevent SML type from becoming eqtype without hiding constructors - sml

In datatype declarations, Standard ML will produce an equality type if all of the type arguments to all of the variants are themselves eqtypes.
I've seen comments in a few places lamenting the inability of users to provide their own definition of equality and construct their own eqtypes and unexpected consequences of the SML rules (e.g. bare refs and arrays are eqtypes, but datatype Foo = Foo of (real ref) is not an eqtype).
Source: http://mlton.org/PolymorphicEquality
one might expect to be able to compare two values of type real t, because pointer comparison on a ref cell would suffice. Unfortunately, the type system can only express that a user-defined datatype admits equality or not.
I'm wondering whether it is possible to block eqtyping. Say, for instance, I am implementing a set as a binary tree (with an unnecessary variant) and I want to pledge away the ability to structurally compare sets with each other.
datatype 'a set = EmptySet | SetLeaf of 'a | SetNode of 'a * 'a set * 'a set;
Say I don't want people to be able to distinguish SetLeaf(5) and SetNode(5, EmptySet, EmptySet) with = since it's an abstraction-breaking operation.
I tried a simple example with datatype on = On | Off just to see if I could demote the type to a non-eqtype using signatures.
(* attempt to hide the "eq"-ness of eqtype *)
signature S = sig
type on
val foo : on
end
(* opaque ascription to kill eqtypeness *)
structure X :> S = struct
datatype on = On | Off
val foo = On
end
It seems that transparent ascription fails to prevent X.on from becoming an eqtype, but opaque ascription does prevent it. However, these solutions are not ideal because they introduce a new module and hide the data constructors. Is there a way to prevent a custom type or type constructor from becoming an eqtype or admitting equality without hiding its data constructors or introducing new modules?

Short answer is no. When a type's definition is visible, its eq-ness is whatever the definition implies. The only way to prevent it from being an eqtype is to tweak the definition so that it isn't one, for example by adding a dummy constructor with a real parameter.
Btw, small correction: your type Foo should be an equality type. If your SML implementation disagrees then it has a bug. A different case is real bar when datatype 'a bar = Bar of 'a ref (which is what the MLton manual discusses). The reason that the first one works but the second doesn't is that ref is magic in SML: it has a form of polymorphic eq-ness that user types cannot have.

Related

Is it possible to support higher-kinded types in Standard ML?

I have read in this post that ML dialects do not allow type variables of non-ground kind. E.g. the last statement is not representable:
-- Haskell code
type Ground = Int
type FirstOrder a = Maybe a
type SecondOrder c = c Int -- ML does not allow this :c
OCaml has support of higher-kinded only at the level of modules. There are some explanations (here and author's comment here) about which features of OCaml clash with higher-kinded types opportunity.
If I understood it correctly, the main problem is in the following facts:
OCaml does not follow a "freshness" restriction for type definitions: the type construct can define both an alias (and the type will remain the same) and a new fresh type
type alias definition can be hidden
AFAIK, Standard ML has different constructs for type definition and aliases: type for aliases and datatype for new fresh types introduction.
Unfortunately, I do not know SML well enough -- is it possible to export a type alias with its definition hidden? And can someone please show me whether there are any other SML features that still do not go well with higher-kinded types?
Probably there will be some problems with functors -- Could one be so kind to show a code example of it? I've heard several times about such cases but still have not found a complete example of it.
Yes, SML can express the equivalent of higher-kinded types through functors, and can also make them abstract. Useless example:
functor F (type 'a t) :> sig type 'a u end =
struct
type 'a u = ('a t) t
end
However, unlike OCaml, SML does not (officially) have higher-order functors, so per the standard, you can only express second-order type constructors this way.
FWIW, OCaml may use the same keyword for type aliases and generative types (type vs datatype in SML), but they are still distinguished syntactically, by their right-hand side. So that's no real difference from SML. In both languages, an abstract type occurring in a signature can be implemented as either a type alias or a generative type. So the problem for type inference that Leo is alluding to exists equally in both. Haskell can get away without that problem because it does not have the same expressiveness regarding type abstraction (i.e., no "sealing" operator for modules).
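For comparison, here is a minimal sketch of the same functor-based encoding in OCaml, where abstraction over a type constructor also goes through the module system. The names FUNCTOR, Twice, ListF and OptionF are made up for this illustration and do not come from the posts above.

```ocaml
(* Hypothetical signature: any unary type constructor that supports map. *)
module type FUNCTOR = sig
  type 'a t
  val map : ('a -> 'b) -> 'a t -> 'b t
end

(* The functor is parameterized by the type constructor F.t itself and
   seals the result, so 'a u is abstract outside the functor body. *)
module Twice (F : FUNCTOR) : sig
  type 'a u
  val wrap : 'a F.t F.t -> 'a u
  val unwrap : 'a u -> 'a F.t F.t
  val map : ('a -> 'b) -> 'a u -> 'b u
end = struct
  type 'a u = 'a F.t F.t
  let wrap x = x
  let unwrap x = x
  let map f x = F.map (F.map f) x
end

module ListF = struct
  type 'a t = 'a list
  let map = List.map
end

module OptionF = struct
  type 'a t = 'a option
  let map f = function None -> None | Some x -> Some (f x)
end

module LL = Twice (ListF)
module OO = Twice (OptionF)

(* Map over a list of lists and over an option of options. *)
let nested = LL.unwrap (LL.map succ (LL.wrap [ [ 1; 2 ]; [ 3 ] ]))
let opt = OO.unwrap (OO.map string_of_int (OO.wrap (Some (Some 7))))
```

The type variable of "higher kind" never appears in a core-language type here; it only exists as the functor parameter F, which is the restriction the question is about.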

SML Warning: Type Vars Not Generalized when using Empty Lists or NONE option

I can't for the life of me figure out why the following SML function is throwing a Warning in my homework problem:
fun my_func f ls =
case ls of
[] => raise MyException
| head :: rest => case f head of
SOME v => v
| NONE => my_func f rest
fun f a = if isSome a then a else NONE;
Whenever I call my_func with the following test functions:
my_func f [NONE, NONE];
my_func f [];
I always get the warning:
Warning: type vars not generalized because of
value restriction are instantiated to dummy types (X1,X2,...)
Whenever I pass in an options list containing at least one SOME value, this Warning is not thrown. I know it must be something to do with the fact that I am using polymorphism in my function currying, but I've been completely stuck as to how to get rid of these warnings.
Please help if you have any ideas - thank you in advance!
The value restriction referenced in the warning is one of the trickier things to understand in SML. I will do my best to explain why it comes up in this case and point you towards a few resources to learn more.
As you know, SML will use type inference to deduce most of the types in your programs. In this program, the type of my_func will be inferred to be ('a -> 'b option) -> 'a list -> 'b. As you noted, it's a polymorphic type. When you call my_func like this
my_func f [NONE, SOME 1, NONE];
... the type variables 'a and 'b are instantiated to int option and int.
However when you call it without a value such as SOME 1 above
my_func f [NONE, NONE];
What do you think the type variables should be instantiated to? The types should be polymorphic -- something like 't option and 't for all types 't. However, there is a limitation which prevents values like this from taking on polymorphic types.
SML defines some expressions as non-expansive values and only these values may take on polymorphic types. They are:
literals (constants)
variables
function expressions
constructors (except for ref) applied to non-expansive values
a non-expansive value with a type annotation
tuples where each field is a non-expansive value
records where each field is a non-expansive value
lists where each element is a non-expansive value
All other expressions, notably function calls (which is what the call to my_func is) cannot be polymorphic. Neither can references. You might be curious to see that the following does not raise a warning:
fn () => my_func f [NONE, NONE];
Instead, the type inferred is unit -> 'a. If you were to call this function however, you would get the warning again.
My understanding of the reason for this restriction is a little weak, but I believe that the underlying root issue is mutable references. Here's an example I've taken from the MLton site linked below:
val r: 'a option ref = ref NONE
val r1: string option ref = r
val r2: int option ref = r
val () = r1 := SOME "foo"
val v: int = valOf (!r2)
This program does not typecheck under SML, due to the value restriction. Were it not for the value restriction, this program would have a type error at runtime.
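OCaml has a similar (somewhat relaxed) value restriction, so the same situation can be sketched there; this is an OCaml analogue, not the MLton example itself. Instead of rejecting the first binding, OCaml gives the reference a non-generalized "weak" type that is fixed by its first use.

```ocaml
(* `ref None` is a function application, so its type is not generalized:
   the toplevel reports '_weak1 option ref rather than 'a option ref. *)
let r = ref None

(* The first use pins the weak variable: r is now a string option ref. *)
let () = r := Some "foo"

(* Uncommenting the next line is rejected at compile time, because r can no
   longer be used at type int option ref:
   let v : int = Option.get !r *)

(* Reading the cell back at the type it was pinned to is fine. *)
let s : string = Option.get !r
```

Either way the outcome is the same: the reference cannot be used at both string option ref and int option ref.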
As I said, my understanding is shaky. However I hope I've shed a little light on the issue you've run into, although I believe that in your case you could safely ignore the warning. Here are some references should you decide you'd like to dig deeper:
http://users.cis.fiu.edu/~smithg/cop4555/valrestr.html
http://mlton.org/ValueRestriction
(BTW the MLton site is solid gold. There's so much hidden away here, so if you're trying to understand something weird about SML, I highly recommend searching here because you'll likely turn up a lot more than you initially wanted)
Since it seems like you're actually using SML/NJ, this is a pretty handy guide to the error messages and warnings that it will give you at compile time:
http://flint.cs.yale.edu/cs421/smlnj/doc/errors.html

What does `B mean?

In the toplevel, I get the following output:
# `B;;
- : [> `B ] = `B
What does `B mean? Why do we need it?
Sincerely!
An identifier prefixed with a backquote like `B is a constructor of a polymorphic variant type. It's similar to the constructor of an algebraic type:
type abc = A | B | C
However, you can use polymorphic variant values without declaring them, and in general they're much more flexible than the usual algebraic types. The tradeoff is that they're also quite a bit trickier to use.
One thing people use them for is as simple named values, like enum values in C. Or, more precisely, like atoms in Lisp. You can use ordinary algebraic types for this, but you need to carefully maintain your definitions of them and guard against duplication. With polymorphic variants, you don't need to do either of these. You can use them without declaring them, and the constructors aren't required to be unique (two different types can have the same constructor).
Polymorphic variant constructors can also take parameters, as algebraic constructors can. So you can also write (`B 77), a constructor with a single int parameter.
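A small sketch of both points, using made-up function names (describe, is_b): the tags are used without any prior type declaration, and the same tag name `B carries an int in one function and a string in another.

```ocaml
(* No type declaration is needed before using the tags. *)
let describe = function
  | `A -> "A"
  | `B n -> "B " ^ string_of_int n
  | `C -> "C"

(* The same tag name is reused here with a different payload type. *)
let is_b = function `B (_ : string) -> true | _ -> false

let d1 = describe (`B 77)
let d2 = is_b (`B "seventy-seven")
```

With ordinary algebraic types, the two uses of B would need two separate type declarations with distinct constructor names (or separate modules).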
This is a pretty big topic--see the section on polymorphic variants in the OCaml manual for more details.
It's a polymorphic variant. From the documentation:
Variants as presented in section 1.4 are a powerful tool to build data structures and algorithms. However they sometimes lack flexibility when used in modular programming. This is due to the fact every constructor reserves a name to be used with a unique type. One cannot use the same name in another type, or consider a value of some type to belong to some other type with more constructors.
With polymorphic variants, this original assumption is removed. That is, a variant tag does not belong to any type in particular, the type system will just check that it is an admissible value according to its use. You need not define a type before using a variant tag. A variant type will be inferred independently for each of its uses.

Which style is better to declare a type in Ocaml?

I often need to declare a type which contains a map or a list, for instance:
type my_type_1 = my_type_0 IntMap.t
type my_type_2 = my_type_0 list
Also I have seen another style of declaration which encapsulates map or list in a record, for instance:
type my_type_1 =
| Bot_1
| Nb_1 of my_type_0 IntMap.t
type my_type_2 =
| Bot_2
| Nb_2 of my_type_0 list
My question is, whether there are some cases where the second style is necessary and better than the first style?
Thank you very much!
The two types you give are not equivalent, because of the Bot constructor added in the second case. This means that the two my_type_1 do not have the same semantics. Incidentally, the construction Bot | Foo of 'a is already provided by the standard type 'a option, with constructors Some and None, so the type my_type_1 of your second sample is equivalent to a my_type_1 option in the first one.
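To make the equivalence with option concrete, here is a rough sketch. The definition of my_type_0 as string is a placeholder assumption (the question does not say what it is), and to_option / of_option are hypothetical helper names.

```ocaml
module IntMap = Map.Make (Int)

(* Placeholder assumption: the question does not define my_type_0. *)
type my_type_0 = string

(* The second style from the question. *)
type my_type_1 =
  | Bot_1
  | Nb_1 of my_type_0 IntMap.t

(* Bot_1 plays the role of None, Nb_1 the role of Some. *)
let to_option = function Bot_1 -> None | Nb_1 m -> Some m
let of_option = function None -> Bot_1 | Some m -> Nb_1 m
```

Since the two conversions are inverses of each other, the variant carries exactly the same information as my_type_0 IntMap.t option; only the constructor names differ.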
Whether to use an option type or your own constructor names is up to you. In general, I would advise you to use an option type if the semantics of your type coincides with the option idea of failure, being absent, or being undefined. Given your name Bot, I assume this is probably what you're doing, but defining your own constructor names is also ok and can be clearer in some circumstances. The matter has been discussed in depth in this blog post from ezyang.
Now, assuming your two type definitions were equivalent (that is, in the absence of the Bot constructor), what's the purpose of adding an algebraic datatype layer, a new constructor, instead of using a simple type alias? Well, it has the effect of making your type distinct from the representation type. For example, if you define type 'a stack = Stack of 'a list, 'a stack and 'a list cannot be confused for each other, and the compiler will raise an error if you do. So that can be used to enforce a (light) type separation, with the constructor acting as a type annotation:
let empty = Stack []
let length (Stack li) = List.length li
I'd say it's mostly a matter of taste, but I would recommend using an algebraic datatype instead of an alias when you want to be sure that there can be no mistake with the original type. The downside is that you have to wrap the operations of the original datatype, as I did in my length function above.
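A slightly fuller sketch of what that wrapping looks like in practice; push and pop are hypothetical additions beyond the empty and length functions shown in the answer.

```ocaml
type 'a stack = Stack of 'a list

let empty = Stack []

(* Each list operation has to be re-exposed by unwrapping and rewrapping. *)
let push x (Stack li) = Stack (x :: li)

let pop (Stack li) =
  match li with
  | [] -> None
  | x :: rest -> Some (x, Stack rest)

let length (Stack li) = List.length li

let s = push 2 (push 1 empty)

(* `List.length s` would be rejected here: s is an int stack, not an int list. *)
let n = length s
```

The cost is the boilerplate in each function; the benefit is that a stack can never be passed where a plain list is expected, or the other way around.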
Those are not different styles, but different types: the first type declarations are abbreviations for a specialized instance (for my_type_0) of the polymorphic list or IntMap.
The second set of definitions presents a "constructed" type, for which Bot_1 (and Bot_2) provide values. Those "alternatives" can be used, for example, to create functions of type T -> my_type_1 which return Bot_1 in the special case where the computation cannot return a list or map, much as an option type permits. This is impossible with the first set of definitions (which must always provide the required list or map payload).
The second one isn't a "record" (which is a different thing). It creates an algebraic data type. I'm not sure how to explain it but if you've used Haskell or Standard ML you'll know. It's basically a tagged union. A my_type_1 is either a Bot_1 (which carries no data) or a Nb_1 (which carries a my_type_0 IntMap.t as data).
The first one is simply a type synonym (like a typedef in C).

Explicit var type in OCaml

If I have a type t
type t = C of string;;
And want to explicitly define a type of the variable to be type t:
let b : t = C "MyString";;
Can I do it in OCaml?
You don't need to be explicit
let b = C mystring
let b = C "a string literal"
You can be explicit, but it doesn't add anything
let b : t = C foo
The preferred way in general is to use type inference without annotating types, and only be explicit about the type of the identifiers exported to other modules, through the associated .mli interface file.
In this case, a type annotation doesn't add anything as the C constructor already is a kind of tag/annotation : C something is necessarily of type t, there is no possible confusion.
You can, using either that syntax, or the alternate one:
let b = (C foo : t)
Adding type constraints in this fashion does not usually serve any purpose in a well-formed program, because the type inference algorithm can handle all of it correctly on its own. There are a few exceptions (mostly involving the object-oriented side), but they're quite rare.
Such annotations are mostly useful when a type error happens and you need to understand why a certain expression has a certain type when you expected it to have another, because you can type-annotate intermediate values to have the type error move up through your source code.
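As an illustration of that debugging technique, here is a hypothetical function (parse is not from the question) with every intermediate value annotated. If one step produced an unexpected type, the compiler would report the mismatch at that step's annotation instead of at some later use.

```ocaml
(* Hypothetical pipeline; each annotation states what that step should yield. *)
let parse (s : string) : int list =
  let fields : string list = String.split_on_char ',' s in
  let trimmed : string list = List.map String.trim fields in
  let numbers : int list = List.map int_of_string trimmed in
  numbers

let total = List.fold_left ( + ) 0 (parse "1, 2, 3")
```

Once the error is found, the annotations can be removed again; inference yields the same types without them.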
Note that there's another way of annotating types, which is to define a signature for your modules. In your example above, your module body would contain:
let b = C foo
And your module signature would contain:
val b : t
This is especially useful when you need assumptions inside the module to be invisible to other modules. For instance, when using polymorphic variants:
let user_type = `Admin
Here, you only want to handle the administrator account, but you need the rest of your code to be aware that other account types exist, so you would write in the signature that:
val user_type : [`Admin|`Member|`Guest]
This type is technically correct, but could not have been guessed by the type inference algorithm.
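Put together as one self-contained sketch, with the signature written inline rather than in a separate .mli file; the module name Account and the label value are made up for this example.

```ocaml
module Account : sig
  val user_type : [ `Admin | `Member | `Guest ]
end = struct
  (* Inferred on its own as [> `Admin ]; the signature fixes the closed type. *)
  let user_type = `Admin
end

(* Client code now sees all three account types and must handle each one. *)
let label =
  match Account.user_type with
  | `Admin -> "admin"
  | `Member -> "member"
  | `Guest -> "guest"
```

Without the signature, the type of user_type would only mention `Admin, and the other two match cases would be rejected.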