I'm currently learning SML and I have a question about something I have no name for. Let's call it "type alias" for the moment. Suppose I have the following datatype definition:
datatype 'a stack = Stack of 'a list;
I now want to add an explicit "empty stack" type. I can do this by adding it to the datatype:
datatype 'a stack = emptystack | Stack of 'a list;
Now I can pattern match a function like "push":
fun push (emptystack) (e:'a) = Stack([e])
| push (Stack(list):'a stack) (e:'a) = Stack(e::list);
The problem here is that Stack([]) and emptystack are different, but I want them to be the same. So every time SML encounters a Stack([]) it should "know" that this is emptystack (in the case of push, it should then use the emptystack match).
Is there a way to achieve this?
The short answer is: No, it is not possible.
You can create type aliases with the code
type number = int
val foo : number -> int -> number =
fn a => fn b => a+b
val x : int = foo 1 3;
val y : number = foo 1 3;
However, as the name says, it only works for types. Your question concerns value constructors, for which there is no such syntax.
Such an aliasing is not possible in SML.
Instead, you should design your datatypes to be unambiguous in their representation, if that is what you desire.
You'd probably be better suited with something that resembles the definition of 'a list more:
datatype 'a stack = EmptyStack | Stack of 'a * 'a stack;
This has the downside of not letting you use the list functions on it, but you do get an explicit empty stack constructor.
Since what you want is for one value emptystack to be synonymous with another value Stack [], you could call what you are looking for "value aliases". Values that are compared with the built-in operator = or pattern matching will not allow for aliases.
You can achieve this by creating your own equality operator, but you will lose the ability to use the built-in = (since Standard ML does not support custom operator overloading) as well as the ability to pattern match on the value constructors of your type.
Alternatively, you can construct a normal form for your type and always compare the normal form. Whenever practically feasible, follow Sebastian's suggestion of no ambiguity. There might be situations in which an unambiguous algebraic type will be much more complex than a simpler one that allows the same value to be represented in different ways.
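The normal-form idea can be sketched as follows. This is a minimal illustration written in OCaml rather than SML (the rest of this page is mostly OCaml); the names normalize, equal and push are hypothetical helpers, not part of any library.

```ocaml
(* Hypothetical two-constructor stack, mirroring the datatype in the question. *)
type 'a stack = EmptyStack | Stack of 'a list

(* normalize maps both representations of the empty stack to EmptyStack. *)
let normalize = function
  | Stack [] -> EmptyStack
  | s -> s

(* Custom equality: compare normal forms instead of raw values. *)
let equal s1 s2 = normalize s1 = normalize s2

(* push matches on the normal form, so Stack [] takes the EmptyStack branch. *)
let push s e =
  match normalize s with
  | EmptyStack -> Stack [e]
  | Stack l -> Stack (e :: l)
```

Every function that inspects a stack has to go through normalize (or equal), which is exactly the cost that the unambiguous list-like datatype avoids.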
I have this signature for a mutable set:
open Base
open Hashable
module type MutableSet = sig
type 'a t
val contains : 'a t -> 'a -> bool
end
I want to implement the signature with HashSet using the Hashable module from the Base library.
module HashSet(H : Hashable) : MutableSet = struct
let num_buckets = 16
type 'a t = { mutable buckets : ('a list) array }
let contains s e =
let bucket_index = (H.hash e) % num_buckets in
let bucket = s.buckets.(bucket_index) in
List.exists ~f:(fun e' -> H.equal e e') bucket
end
I am getting the error
Error: Signature mismatch:
Modules do not match:
sig
type 'a t = { mutable buckets : 'a list array; }
val contains : 'a H.t t -> 'a H.t -> bool
end
is not included in
MutableSet
Values do not match:
val contains : 'a H.t t -> 'a H.t -> bool
is not included in
val contains : 'a t -> 'a -> Base.bool
I think the issue is that the type of the Hashable key is not constrained to be the same as 'a, the type of the elements that are in the set. How do I constrain the types to be the same?
The crux of the problem is the H.equal function, which has type 'a t -> 'a t -> bool; compare it with H.hash, which has type 'a -> int.
I think everything went wrong because of a mistaken assumption about what Hashable means in Base. The type Hashable.t is a record of three functions and is defined as follows (see note 1):
type 'a t = {
hash : 'a -> int;
compare : 'a -> 'a -> int;
sexp_of_t : 'a -> Sexp.t;
}
Therefore, any type that wants to be hashable must provide an implementation of these three functions. And although there is a module type of module Hashable it is not designed to be used as a parameter to a functor. There is only one module Hashable, that defines the interface (type class if you want) of a hashable value.
Therefore, if you need a monomorphic MutableSet for a key that is hashable, you should write a functor that takes a module of type Hashable.Key.
module HashSet(H : Hashable.Key) = struct
let num_buckets = 16
type elt = H.t
type t = { mutable buckets : H.t list array }
let contains s e =
let bucket_index = H.hash e % num_buckets in
let bucket = s.buckets.(bucket_index) in
List.exists ~f:(fun e' -> H.compare e e' = 0) bucket
end;;
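For readers without Base installed, the same functor shape can be sketched with the standard library only. The signature KEY below is a hypothetical stand-in for Hashable.Key, reduced to the two functions the set needs, and create and add are illustrative additions that are not part of the original signature.

```ocaml
(* KEY is a hypothetical, reduced stand-in for Base's Hashable.Key. *)
module type KEY = sig
  type t
  val hash : t -> int
  val compare : t -> t -> int
end

module HashSet (H : KEY) = struct
  let num_buckets = 16

  (* The array itself is mutable, so the field does not need to be. *)
  type t = { buckets : H.t list array }

  let create () = { buckets = Array.make num_buckets [] }

  (* Clear the sign bit so that a negative hash still gives a valid index. *)
  let index e = (H.hash e land max_int) mod num_buckets

  let contains s e =
    List.exists (fun e' -> H.compare e e' = 0) s.buckets.(index e)

  let add s e =
    if not (contains s e) then begin
      let i = index e in
      s.buckets.(i) <- e :: s.buckets.(i)
    end
end

(* One instantiation per key type: this is what "monomorphic" means here. *)
module IntKey = struct
  type t = int
  let hash (x : int) = Hashtbl.hash x
  let compare (a : int) (b : int) = Stdlib.compare a b
end

module IntSet = HashSet (IntKey)
```

A set of strings would need a second application of the functor to a string key module, yielding a distinct, incompatible type.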
If you want to implement a polymorphic MutableSet, then you do not need to write a functor (if it is polymorphic, then it is already defined for all possible types). You can even use the polymorphic functions from the Hashable module itself, e.g.,
module PolyHashSet = struct
let num_buckets = 16
let {hash; compare} = Hashable.poly
type 'a t = { mutable buckets : 'a list array }
let contains s e =
let bucket_index = hash e % num_buckets in
let bucket = s.buckets.(bucket_index) in
List.exists ~f:(fun e' -> compare e e' = 0) bucket
end
Answers to the follow-up questions
When would you want to use Hashable.equal to compare two type classes?
1) When you need to ensure that two hashtables are using the same comparison function. For example, if you would like to merge two tables or intersect two tables, they should use the same comparison/hash functions, otherwise, the results are undefined.
2) When you need to compare two hashtables for equality.
Is there a way to define the polymorphic version without using the built in polymorphic hash functions and equals methods?
If by "built-in" you mean primitives provided by OCaml, then the answer is no: such a hashtable has to use the polymorphic comparison primitive from the OCaml standard library.
You don't have to use the Hashable module from the Base library to access them. They are also available via the Caml or Polymorphic_compare modules in Base. Or, if you're not using the Base or Core libraries, the compare function from Stdlib is polymorphic by default and has type 'a -> 'a -> int.
With all that said, I think some clarification is needed on what we mean by a polymorphic version. Base's Hash_set and Hashtbl are also polymorphic data structures, as they have types 'a t and ('k,'a) t respectively, which are both polymorphic in their keys. They do not, however, rely on the polymorphic comparison function but require the user to provide a comparison function during construction. In fact, they require an implementation of the hashable interface. Therefore, to create an empty hash table you need to pass it a module that implements it, e.g.,
let empty = Hashtbl.create (module Int)
Here the passed module must implement the Hashable.Key interface, which, among other things, provides the implementation of hashable via the Hashable.of_key function. The hashtable implementation just stores the comparison functions in itself, e.g., roughly,
type ('a,'k) hashtbl = {
mutable buckets : Avltree.t array;
mutable length : int;
hashable : 'k hashable;
}
I think that, given this implementation, it is now more obvious when we need to compare two hashable records.
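The design described above, where the key module is passed at creation time and its functions are stored inside the structure, can be sketched with the standard library alone. Everything here (the hashable record, Key, create, mem, add, same_hashable) is a hypothetical miniature for illustration and not Base's actual implementation.

```ocaml
(* A record of functions, playing the role of the hashable "type class". *)
type 'k hashable = { hash : 'k -> int; compare : 'k -> 'k -> int }

module type Key = sig
  type t
  val hash : t -> int
  val compare : t -> t -> int
end

(* The set is polymorphic in its key and carries its own hashable record. *)
type 'k set = { buckets : 'k list array; hashable : 'k hashable }

(* create takes the key module as a first-class value and keeps its functions. *)
let create (type k) (module K : Key with type t = k) : k set =
  { buckets = Array.make 16 [];
    hashable = { hash = K.hash; compare = K.compare } }

let index s e = (s.hashable.hash e land max_int) mod Array.length s.buckets

let mem s e =
  List.exists (fun e' -> s.hashable.compare e e' = 0) s.buckets.(index s e)

let add s e =
  if not (mem s e) then
    let i = index s e in
    s.buckets.(i) <- e :: s.buckets.(i)

(* Two sets can be merged safely only if they use the same functions;
   physical equality on the stored functions is one way to check that. *)
let same_hashable s1 s2 =
  s1.hashable.hash == s2.hashable.hash
  && s1.hashable.compare == s2.hashable.compare

module IntKey = struct
  type t = int
  let hash (x : int) = Hashtbl.hash x
  let compare (a : int) (b : int) = Stdlib.compare a b
end

module RevIntKey = struct
  type t = int
  let hash (x : int) = Hashtbl.hash x
  let compare (a : int) (b : int) = Stdlib.compare b a
end
```

Note that same_hashable compares functions, not set elements, which is the situation the footnote about Hashable.equal describes.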
Is one version (monomorphic with Functor vs polymorphic) preferable over the other?
First of all, we actually have three versions. Functor, polymorphic, and one that uses the polymorphic comparison function (let's name it universal). The latter is of the least preference and should be avoided if possible.
Concerning the former two, both are good, but a polymorphic version is more versatile without involving too many compromises. Theoretically, a functor version opens more opportunities for compiler optimizations (as the comparison function could be inlined), but it comes with a price that for every key you will have a different module/type.
You can also benefit from both approaches and provide both polymorphic and monomorphic implementations (with the latter being a specialization of the former); e.g., this is how maps and sets are implemented in Jane Street's Base/Core. There is a polymorphic type for a set,
type ('a,'comparator_witness) set
which is a binary tree coupled with a compare function. The function is reflected in the set type by the 'comparator_witness type, so that a fresh type is generated for each comparison function, thus preventing Set.union and similar operations between two sets that have different compare functions stored in them.
There is also, at the same time a Set.Make(K : Key) functor, which creates a module that provides type t = (K.t,K.comparator_witness) set type, which can, theoretically, benefit from inlining. Moreover, each module that implements Core.Comparable.S and below, will also provide a .Map, .Set, etc modules, e.g., Int.Set. Those modules are usually created via the corresponding Make functors (i.e., Map.Make, Set.Make), but they open opportunities for manual specializations.
1) Therefore Hashable.equal function actually compares functions not values. It basically compares two type classes. And, I believe, the Hashable.hash function is typed 'a -> int accidentally, and the type that was intended was also 'a t -> int.
module HashSet(H : Hashable) : (MutableSet with type t = H.t)
This, I guess. Cannot check at the moment, though.
The issue is that the equality H.equal in
List.exists ~f:(fun e' -> H.equal e e') bucket
is the equality over hash function dictionaries ('a H.t). Thus as written, the contains function only works on sets of hash function dictionaries. If you want a polymorphic mutable set, you will have to use the polymorphic equality.
An ordered list is a list in which an element that appears later in the list is never smaller than an element that appears earlier in the list, where the notion of "smaller than" is given by a specific chosen relation. For example, the list
[1; 5; 7; 12; 13]
is an ordered list relative to the usual < relation on numbers, the list
[17; 14; 9; 6; 2]
is an ordered list relative to the > relation on numbers but not relative to the < order and the list
[17; 14; 9; 13; 2]
is not an ordered list relative to either relation.
How would I define a type constructor olist that is like the list type constructor except that values of this type carry enough information to determine whether or not they are ordered lists relative to the intended ordering relation?
For example, I need a function
initOList : ('a -> 'a -> bool) -> 'a olist
that takes an ordering relation over a given type and returns an empty olist of that type. Where do I begin with creating this type constructor and using it to write the initOList function?
A static type represents some property of an expression that can be checked at the compile time, i.e., without relying on any information that is available only at runtime. So you can't really express in the type systems things that depend on runtime values, such as the list length, ordering, contents, etc.
In your particular case, you want to have a type constructor that takes an ordering and returns a new type. However, type constructors are functions in the type domain, i.e., they map types to types (i.e., take types as their parameters and return types). But ordering is a function of type 'a -> 'a -> int, so you can't really express it in OCaml.
A type system that allows values to appear in the domain of type constructors (i.e., to parametrize types with runtime values) is called a dependent type system. OCaml doesn't really provide dependent typing. This is not because it is impossible; it is a matter of choice, as working with a dependent type system is much harder. Moreover, type inference in dependent type systems is undecidable (i.e., it is impossible to write an algorithm that will work for all programs), so significant help from the user is usually required to prove that a program is well-typed. Thus the systems that implement dependent typing are closer to proof assistants, e.g., Coq, Agda, etc. However, there are some that are closer to conventional programming languages, e.g., F*.
So now it is clear that in OCaml a type cannot be parametrized with a runtime value. However, the type system still provides enough power to express the notion of a sorted list. We can't really use the type system to check that our sorting algorithm is correct, but we can tell it: this function ensures that a list is sorted, just believe us and take it for granted. We can represent this using module-level abstraction, e.g.,
module SortedList : sig
type 'a t
val create : ('a -> 'a -> int) -> 'a list -> 'a t
end = struct
type 'a t = 'a list
let create = List.sort
end
The idea is that we have type 'a SortedList.t that can be only constructed with the SortedList.create function and there are no other ways around it. So an expression that has type 'a SortedList.t bears a proof in itself, that it is sorted. We can use it to make preconditions of functions explicit in the type system, e.g., suppose we have a dedup function that removes duplicates from a list, and it works correctly if the input list is sorted. We can't really test this precondition, thus we may rely on a type system:
val dedup : 'a SortedList.t -> 'a SortedList.t
Thus the dedup function type states that (a) it is applicable only to sorted lists, and (b) it preserves the ordering.
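One way dedup could live inside the abstraction is sketched below. It assumes that structural equality (=) agrees with the comparison function used for sorting, which the type system does not check; to_list is included here only so that the result can be observed.

```ocaml
module SortedList : sig
  type 'a t
  val create : ('a -> 'a -> int) -> 'a list -> 'a t
  val dedup : 'a t -> 'a t
  val to_list : 'a t -> 'a list
end = struct
  type 'a t = 'a list

  let create = List.sort

  (* The input is sorted, so equal elements are adjacent and one pass suffices.
     Assumption: (=) agrees with the comparison that was passed to create. *)
  let rec dedup = function
    | x :: (y :: _ as rest) -> if x = y then dedup rest else x :: dedup rest
    | xs -> xs

  let to_list xs = xs
end

let unique =
  SortedList.(to_list (dedup (create compare [3; 1; 2; 3; 1])))
```

Since dedup only removes elements, it cannot break the ordering, which is why it is legitimate for it to return an 'a SortedList.t.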
There are two issues with our approach. First of all, our 'a SortedList.t is too abstract, as it doesn't provide any access operators. We can just provide a to_list operator, that will erase the proof that a list is ordered, but allow all the list operations on it, e.g.,
module SortedList : sig
type 'a t
val create : ('a -> 'a -> int) -> 'a list -> 'a t
val to_list : 'a t -> 'a list
end = struct
type 'a t = 'a list
let create = List.sort
let to_list xs = xs
end
The to_list operation is correct, as the set of all sorted lists is a subset of all lists. That means that 'a SortedList.t is actually a subtype of 'a list. Conveniently, OCaml provides an explicit mechanism for expressing this subtype relation via private type abbreviations.
module SortedList : sig
type 'a t = private 'a list
val create : ('a -> 'a -> int) -> 'a list -> 'a t
end = struct
type 'a t = 'a list
let create = List.sort
end
Such a definition states that 'a SortedList.t is a subtype of 'a list, so it is possible to upcast one to the other. Remember that upcasting is explicit in OCaml, not automatic, so you need to use the coercion operator, e.g., (xs : 'a SortedList.t :> 'a list).
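A small sketch of the coercion in use, repeating the private declaration so that the snippet is self-contained; the names sorted and n are only for illustration.

```ocaml
module SortedList : sig
  type 'a t = private 'a list
  val create : ('a -> 'a -> int) -> 'a list -> 'a t
end = struct
  type 'a t = 'a list
  let create = List.sort
end

let sorted = SortedList.create compare [3; 1; 2]

(* List.length expects an 'a list, so the coercion has to be written out. *)
let n = List.length (sorted :> int list)

(* The opposite direction is rejected by the type checker:
   let bad : int SortedList.t = [1; 2; 3] *)
```

The only way to obtain a value of type 'a SortedList.t is still SortedList.create, so the proof of sortedness is preserved while all read-only list functions remain usable after a coercion.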
The second problem in our implementation is that our 'a SortedList.t doesn't distinguish lists that are sorted differently, i.e., ascending or descending. It's not a problem for the dedup function, but some functions may require that their input is sorted in a specific order (for example, a function that finds a median or another order statistic). So we need to encode the ordering. We will encode it in such a way that each ordering function is treated as a different type. (An alternative solution would be to encode concrete variants, such as Ascending and Descending, but I will leave this as an exercise.) The main drawback of our approach is that our ordered list can't be a parametric type anymore, as the ordering function is defined for a particular ground type. In fact, this means that our SortedList.t is now a higher-order polymorphic type, so we need to use functors to implement it:
module type Ordering = sig
type element
type witness
val compare : element -> element -> int
end
module SortedList : sig
type ('a,'o) t = private 'a list
module Make(Order : Ordering) : sig
type nonrec t = (Order.element, Order.witness) t
val create : Order.element list -> t
end
end = struct
type ('a,'o) t = 'a list
module Make(Order : Ordering) = struct
type nonrec t = (Order.element, Order.witness) t
let create = List.sort Order.compare
end
end
Now, let's play a little bit with our implementation. Let's provide two different orderings:
module AscendingInt = struct
type element = int
type witness = Ascending_int
let compare : int -> int -> int = compare
end
module DescendingInt = struct
type element = int
type witness = Descending_int
let compare : int -> int -> int = fun x y -> compare y x
end
module AscendingSortedList = SortedList.Make(AscendingInt)
module DescendingSortedList = SortedList.Make(DescendingInt)
Now, let's test that the two sorted lists are actually having different types:
# let asorted = AscendingSortedList.create [3;2;1];;
val asorted : AscendingSortedList.t = [1; 2; 3]
# let bsorted = DescendingSortedList.create [3;2;1];;
val bsorted : DescendingSortedList.t = [3; 2; 1]
# compare asorted bsorted;;
Characters 16-23:
compare asorted bsorted;;
^^^^^^^
Error: This expression has type
DescendingSortedList.t =
(DescendingInt.element, DescendingInt.witness) SortedList.t
but an expression was expected of type
AscendingSortedList.t =
(AscendingInt.element, AscendingInt.witness) SortedList.t
Type DescendingInt.witness is not compatible with type
AscendingInt.witness
Now let's check that both of them are actually subtypes of 'a list:
# compare (asorted :> int list) (bsorted :> int list);;
- : int = -1
According to the definition you've provided, an olist is simply a list with a comparison function attached to it. This can be easily achieved using records.
From there, you can write functions that creates and manipulates values of your new type.
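A minimal sketch of the record-based approach follows. Only the names olist and initOList come from the question; the field names and the helpers insert and is_ordered are assumptions made for illustration.

```ocaml
(* A list paired with the relation it is meant to be ordered by. *)
type 'a olist = { order : 'a -> 'a -> bool; items : 'a list }

(* An empty olist for the given "smaller than" relation. *)
let initOList (order : 'a -> 'a -> bool) : 'a olist = { order; items = [] }

(* insert places x in front of the first element that x is smaller than. *)
let insert x ol =
  let rec go = function
    | [] -> [x]
    | y :: ys as l -> if ol.order x y then x :: l else y :: go ys
  in
  { ol with items = go ol.items }

(* Ordered means: no element is smaller than the one right before it. *)
let is_ordered ol =
  let rec check = function
    | x :: (y :: _ as rest) -> not (ol.order y x) && check rest
    | _ -> true
  in
  check ol.items
```

Because the record type is exposed, nothing stops a caller from building an olist whose items are not ordered, which is why is_ordered is a runtime check rather than a guarantee.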
We need to go deeper
If you've managed to implement the solution above, I suggest you put all your definitions inside a module. Put simply, modules are a way to put your type definition and your functions together.
My understanding of the problem comes from "Concrete Abstractions" by Hailperin et al. As I understand it, currying is the translation of the evaluation of a function that takes several arguments into the evaluation of a sequence of functions, each with a single argument. The semantic differences between the two approaches (can I call them this way?) are clear to me, but I am sure I have not grasped their practical implications.
Please consider, in OCaml:
# let foo x y = x * y;;
val foo : int -> int -> int = <fun>
and
# let foo2 (x, y) = x * y;;
val foo2 : int * int -> int = <fun>
The results will be the same for the two functions.
But, practically, what makes the two functions different? Readability? Computational efficiency? I lack the experience to judge this properly.
First of all, I would like to stress that, due to compiler optimizations, the two functions above will be compiled into the same assembly code. Without the optimizations, the cost of currying would be too high, i.e., an application of a curried function would require allocating a number of closures equal to the number of arguments.
In practice, the curried form is useful for partial application. For example, compare
let double = foo 2
let double2 x = foo2 (2,x)
Another implication is that, in the curried form, you do not need to allocate temporary tuples for the arguments. In the example above, the function double2 creates an unnecessary tuple (2,x) every time it is called.
Finally, the curried form actually simplifies reasoning about functions, as now, instead of having N families of N-ary functions, we have only unary functions. That allows functions to be typed uniformly; for example, the type 'a -> 'b is applicable to any function, e.g., int -> int, int -> int -> int, etc. Without currying, we would be required to encode the number of arguments in the type of a function, with all the negative consequences.
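To make the relationship between the two forms concrete, here is a small sketch. The helpers curry and uncurry are hypothetical and defined locally; the OCaml standard library does not provide them.

```ocaml
let foo x y = x * y          (* curried: int -> int -> int *)
let foo2 (x, y) = x * y      (* tupled:  int * int -> int *)

(* Adapters between the two calling conventions. *)
let curry f x y = f (x, y)
let uncurry f (x, y) = f x y

(* Partial application works directly on the curried form... *)
let double = foo 2
(* ...while the tupled form needs an adapter first. *)
let double2 = curry foo2 2

(* Each form is convenient in a different context. *)
let doubled = List.map double [1; 2; 3]
let products = List.map (uncurry foo) [(1, 2); (3, 4)]
```

The two forms are interconvertible, so the choice is mostly about which call sites you want to be convenient.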
With the first implementation you can define, for example
let double = foo 2
The second implementation cannot be partially applied in this way.
For example,
let x = ["a";"b";"c";"d"];;
let listp =
if (x.isa(List)) then true else false;;
Is there something like an "isa" method in OCaml to test whether a variable is a List/Array/Tuple and so on?
OCaml has no constructs for testing the type of something. A good rule of thumb is that types are either fixed or are completely unknown. In the first case, there's no need to test. In the second case, the code is required to work for all possible types.
This works out a lot better than you might expect if you're used to other languages. It's kind of a nice application of the zero/one/infinity rule.
Note that there is no trouble defining a type that contains one of a set of types you are interested in:
type number = MyFloat of float | MyInt of int
Values of this type look like: MyFloat 3.1 or MyInt 30281. You can, in effect, test the type by matching against the constructor:
let is_int x = match x with MyFloat _ -> false | MyInt _ -> true
The same is true for lists and arrays, except that these are parameterized types:
type 'a collection = MyArray of 'a array | MyList of 'a list
let is_list x = match x with MyArray _ -> false | MyList _ -> true
What you get back for the lack of so-called introspection is that you can easily construct and deconstruct values with rich and expressive types, and you can be assured that functions you call can't mess with a value when they don't know what its type is.
Can't you just match x with something specific to your type? For example, for a sequence:
let listp = match x with | h::t -> true | _ -> false
for a tuple, I don't remember the exact syntax, but something like match x with | (k,v) -> true
and so on...
Not really: everything has a type associated with it, so either it's already known that it's a list, or it's polymorphic (like 'a), in which case we're not "allowed" to know about the underlying type. Doing anything type-specific in that case would force specialization of the value's type.
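The specialization mentioned above can be seen in a short sketch; is_nonempty_list is a hypothetical name. Matching on list constructors does not test a type at runtime, it fixes the argument type at compile time.

```ocaml
(* The patterns _ :: _ and [] force x to be a list,
   so the inferred type is 'a list -> bool. *)
let is_nonempty_list x =
  match x with
  | _ :: _ -> true
  | [] -> false

let a = is_nonempty_list ["a"; "b"; "c"; "d"]

(* is_nonempty_list (1, 2) does not compile: a pair is not a list,
   so there is nothing left to test at runtime. *)
```

In other words, the function can distinguish empty from non-empty lists, but it can never be handed a non-list in the first place.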
I was translating the following Haskell code to OCaml:
data NFA q s = NFA
{ initialState :: q
, isAccepting :: q -> Bool
, transition :: q -> s -> [q]
}
Initially I tried a very literal translation:
type ('q,'s) nfa = NFA of { initialState: 'q;
isAccepting: 'q -> bool;
transition: 'q -> 's -> 'q list }
...and of course this gives a syntax error because the type constructor part, "NFA of" isn't allowed. It has to be:
type ('q,'s) nfa = { initialState: 'q;
isAccepting: 'q -> bool;
transition: 'q -> 's -> 'q list }
That got me to wondering why this is so. Why can't you have the type constructor for a record type just as you could for a tuple type (as below)?
type ('q, 's) dfa = NFA of ('q * ('q->bool) * ( 'q -> 's -> 'q list) )
Why would you want a type constructor for record types, except because that's your habit in Haskell?
In Haskell, records are not exactly first-class constructs: they are more like syntactic sugar on top of tuples. You can define record field names, use them as accessors, and do partial record updates, but that desugars into access by position in plain tuples. The constructor name is therefore necessary to tell one record from another after desugaring: if you had no constructor name, two records with different field names but the same field types would desugar into equivalent types, which would be a bad thing.
In OCaml, records are a primitive notion and they have their own identity. Therefore, they don't need a head constructor to distinguish them from tuples or records of the same field types. I don't see why you would like to add a head constructor, as this is more verbose without giving more information or helping expressivity.
Why can't you have the type constructor for a record type just as you could for a tuple type (as below)?
Be careful! There is no tuple in the example you show, only a sum type with multiple parameters. Foo of bar * baz is not the same thing as Foo of (bar * baz): the former constructor has two parameters, and the latter has only one parameter, which is a tuple. This distinction is made for performance reasons (in memory, the two parameters are packed together with the constructor tag, while the tuple adds an indirection pointer). Using tuples instead of multiple parameters is slightly more flexible: you can match as both Foo (x, y) -> ... and Foo p -> ..., the latter not being available for multi-parameter constructors.
There is no asymmetry between tuples and records in that neither of them has a special status in the sum type construction, which is only a sum of constructors of arbitrary arity. That said, it is easier to use tuples as parameter types for the constructors, as tuple types don't have to be declared to be used. E.g., some people have asked for the ability to write
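The difference between the two declarations can be checked with a small sketch; the type and function names here are hypothetical.

```ocaml
type two_args = Two of int * int       (* constructor with two arguments *)
type one_tuple = One of (int * int)    (* constructor with one tuple argument *)

(* Both accept a pattern that looks like a pair. *)
let sum_two (Two (x, y)) = x + y
let sum_one (One (x, y)) = x + y

(* Only the tuple version lets you bind the argument as a whole. *)
let sum_one' (One p) = fst p + snd p

(* let bad (Two p) = fst p + snd p
   is rejected: Two expects 2 arguments, but the pattern supplies 1. *)

let pair = (3, 4)
let v = One pair                       (* an existing tuple can be passed as is *)
(* let w = Two pair  is rejected for the same reason. *)
```

The parentheses in the type declaration are therefore significant, even though the two constructors are written identically when applied to a literal pair.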
type foo =
| Foo of { x : int; y : int }
| Bar of { z : foo list }
instead of the current
type foo = Foo of foo_t | Bar of bar_t
and foo_t = { x : int; y : int }
and bar_t = { z : foo list }
Your request is a particular (and not very interesting) case of this question. However, even with such shorthand syntax, there would still be one indirection pointer between the constructor and the data, making this style unattractive for performance-conscious programs -- what could be useful is the ability to have named constructor parameters.
PS: I'm not saying that Haskell's choice of desugaring records into tuples is a bad thing. By translating one feature into another, you reduce redundancy/overlapping of concepts. That said, I personally think it would be more natural to desugar tuples into records (with numerical field names, as done in eg. Oz). In programming language design, there are often no "good" and "bad" choices, only different compromises.
You can't have it because the language doesn't support it. It would actually be an easy fit into the type system or the data representation, but it would be a small additional complication in the compiler, and it hasn't been done. So yes, you have to choose between naming the constructor and naming the arguments.
Note that record label names are tied to a particular type, so e.g. {initialState=q} is a pattern of type ('q, 's) nfa; you can't usefully reuse the label name in a different type. So naming the constructor is only really useful when the type has multiple constructors; then, if you have a lot of arguments to a constructor, you may prefer to define a separate type for the arguments.
I believe there's a patch floating around for this feature, but I don't know if it's up-to-date for the latest OCaml version or anything, and it would require anyone using your code to have that patch.
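For readers on a recent compiler: OCaml 4.03 and later accept inline records as constructor arguments, so the literal translation from the question is valid there without any patch. The sketch below assumes such a compiler (and OCaml 4.10 or later for List.concat_map); the toggle automaton and accepts_after are hypothetical examples, not part of the original question.

```ocaml
(* Requires OCaml >= 4.03 for the inline record. *)
type ('q, 's) nfa = NFA of {
  initialState : 'q;
  isAccepting : 'q -> bool;
  transition : 'q -> 's -> 'q list;
}

(* A tiny automaton over chars whose state flips on every 'x'. *)
let toggle = NFA {
  initialState = false;
  isAccepting = (fun q -> q);
  transition = (fun q c -> if c = 'x' then [not q] else [q]);
}

(* Run the automaton and report whether any reachable state accepts.
   The bound record m may only be used for field access: an inline
   record is not a value of a standalone type. *)
let accepts_after (NFA m) input =
  List.exists m.isAccepting
    (List.fold_left
       (fun states c -> List.concat_map (fun q -> m.transition q c) states)
       [m.initialState] input)
```

As far as I know, the fields of an inline record are stored directly in the constructor's block, so this form does not pay for the extra indirection that a separately declared record type would.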