In OCaml, how do I test whether a variable is a List?

For example,
let x = ["a";"b";"c";"d"];;
let listp =
if (x.isa(List)) then true else false;;
Is there something like an "isa" method in OCaml to test whether a variable is a List, an Array, a Tuple, and so on?

OCaml has no constructs for testing the type of something. A good rule of thumb is that types are either fixed or are completely unknown. In the first case, there's no need to test. In the second case, the code is required to work for all possible types.
This works out a lot better than you might expect if you're used to other languages. It's kind of a nice application of the zero/one/infinity rule.
Note that there is no trouble defining a type that contains one of a set of types you are interested in:
type number = MyFloat of float | MyInt of int
Values of this type look like: MyFloat 3.1 or MyInt 30281. You can, in effect, test the type by matching against the constructor:
let is_int x = match x with MyFloat _ -> false | MyInt _ -> true
The same is true for lists and arrays, except that these are parameterized types:
type 'a collection = MyArray of 'a array | MyList of 'a list
let is_list x = match x with MyArray _ -> false | MyList _ -> true
What you get back for the lack of so-called introspection is that you can easily construct and deconstruct values with rich and expressive types, and you can be assured that functions you call can't mess with a value when they don't know what its type is.
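To make the constructor-matching idea concrete, here is a small sketch built on the number type from above. The helper to_float and the value total are made up for illustration; they are not part of any library.

```ocaml
(* Wrapper type from the answer above: a value is either a float or an int. *)
type number = MyFloat of float | MyInt of int

(* Testing the type is just matching on the constructor. *)
let is_int = function MyInt _ -> true | MyFloat _ -> false

(* Hypothetical helper: deconstruct either case into a float. *)
let to_float = function MyInt i -> float_of_int i | MyFloat f -> f

(* A list can now mix both kinds of numbers. *)
let total =
  List.fold_left (fun acc n -> acc +. to_float n) 0.0 [MyInt 2; MyFloat 0.5]
```

The compiler checks that every match covers both constructors, which is the safety you get in exchange for not having runtime type tests.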

Can't you just match x with something specific to your type? For example, for a sequence:
let listp = match x with | h::t -> true | _ -> false
for a tuple, I don't remember the exact syntax, but something like match x with | (k,v) -> true
and so on...

Not really: everything has a type associated with it, so either it's already known that it's a list, or it's polymorphic (like 'a), in which case we're not "allowed" to know about the underlying type. Doing anything type-specific in that case would force specialization of the value's type.
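A quick sketch of what that specialization looks like in practice (the names on_list, on_empty and on_int are only for illustration). Note also that a pattern like h::t only matches non-empty lists, so the "test" answers false for [].

```ocaml
(* Matching against a list pattern forces the argument to be a list:
   the inferred type is 'a list -> bool, not 'a -> bool. *)
let listp x = match x with _ :: _ -> true | [] -> false

let on_list = listp [1; 2; 3]   (* true *)
let on_empty = listp []         (* false: the empty list has no head *)

(* let on_int = listp 42 *)
(* The line above is rejected at compile time, because 42 is not a list. *)
```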

Related

How do I constrain the key type of the Hashable module to be the same as the type parameter in the signature?

I have this signature for a mutable set:
open Base
open Hashable
module type MutableSet = sig
type 'a t
val contains : 'a t -> 'a -> bool
end
I want to implement the signature with HashSet using the Hashable module from the Base library.
module HashSet(H : Hashable) : MutableSet = struct
let num_buckets = 16
type 'a t = { mutable buckets : ('a list) array }
let contains s e =
let bucket_index = (H.hash e) % num_buckets in
let bucket = s.buckets.(bucket_index) in
List.exists ~f:(fun e' -> H.equal e e') bucket
end
I am getting the error
Error: Signature mismatch:
Modules do not match:
sig
type 'a t = { mutable buckets : 'a list array; }
val contains : 'a H.t t -> 'a H.t -> bool
end
is not included in
MutableSet
Values do not match:
val contains : 'a H.t t -> 'a H.t -> bool
is not included in
val contains : 'a t -> 'a -> Base.bool
I think the issue is that the type of the Hashable key is not constrained to be the same as 'a, the type of the elements that are in the set. How do I constrain the types to be the same?
The crux of the problem is the H.equal function, which has type 'a t -> 'a t -> bool; compare it with H.hash, which has type 'a -> int.
I think that everything went wrong because of a wrong assumption about what hashable means in Base. The type Hashable.t is a record of three functions, defined as follows [1]:
type 'a t = {
hash : 'a -> int;
compare : 'a -> 'a -> int;
sexp_of_t : 'a -> Sexp.t;
}
Therefore, any type that wants to be hashable must provide an implementation of these three functions. And although there is a module type named Hashable, it is not designed to be used as a parameter to a functor. There is only one module Hashable, which defines the interface (a type class, if you want) of a hashable value.
Therefore, if you need a monomorphic MutableSet for a key that is hashable, you should write a functor that takes a module of type Hashable.Key.
module HashSet(H : Hashable.Key) = struct
let num_buckets = 16
type elt = H.t
type t = { mutable buckets : H.t list array }
let contains s e =
let bucket_index = H.hash e % num_buckets in
let bucket = s.buckets.(bucket_index) in
List.exists ~f:(fun e' -> H.compare e e' = 0) bucket
end;;
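If you want to try the functor idea without installing Base, here is a rough stdlib-only sketch. It swaps Base's Hashable.Key for the standard library's Hashtbl.HashedType (which asks for equal and hash rather than compare and hash), and create, add and IntKey are names I made up for the example.

```ocaml
(* Stdlib-only analogue of the functor above; Hashtbl.HashedType plays
   the role that Hashable.Key plays in Base. *)
module HashSet (H : Hashtbl.HashedType) = struct
  let num_buckets = 16

  type elt = H.t
  type t = { buckets : elt list array }

  let create () = { buckets = Array.make num_buckets [] }

  (* Mask the sign bit so that a negative user-supplied hash still
     yields a valid index. *)
  let index e = (H.hash e land max_int) mod num_buckets

  let contains s e =
    List.exists (fun e' -> H.equal e e') s.buckets.(index e)

  let add s e =
    let i = index e in
    if not (contains s e) then s.buckets.(i) <- e :: s.buckets.(i)
end

(* Hypothetical key module for integers. *)
module IntKey = struct
  type t = int
  let equal = Int.equal
  let hash (x : int) = Hashtbl.hash x
end

module IntSet = HashSet (IntKey)
```

The element type is fixed per functor application, which is exactly the "monomorphic" flavour discussed here.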
If you want to implement a polymorphic MutableSet, then you do not need to write a functor (if it is polymorphic, then it is already defined for all possible types). You can even use the polymorphic functions from the Hashable module itself, e.g.,
module PolyHashSet = struct
let num_buckets = 16
let {hash; compare} = Hashable.poly
type 'a t = { mutable buckets : 'a list array }
let contains s e =
let bucket_index = hash e % num_buckets in
let bucket = s.buckets.(bucket_index) in
List.exists ~f:(fun e' -> compare e e' = 0) bucket
end
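For comparison, a stdlib-only sketch of the same polymorphic set, assuming Stdlib's Hashtbl.hash and compare in place of Hashable.poly (create and add are again made-up helpers):

```ocaml
(* Polymorphic set relying on the built-in polymorphic hash and compare. *)
module PolyHashSet = struct
  let num_buckets = 16

  type 'a t = { buckets : 'a list array }

  let create () = { buckets = Array.make num_buckets [] }

  (* Hashtbl.hash is documented to return a non-negative integer. *)
  let index e = Hashtbl.hash e mod num_buckets

  let contains s e =
    List.exists (fun e' -> compare e e' = 0) s.buckets.(index e)

  let add s e =
    let i = index e in
    if not (contains s e) then s.buckets.(i) <- e :: s.buckets.(i)
end
```

No functor is needed because hash and compare work at every type, at the price of failing at runtime on values such as closures.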
Answers to the follow-up questions
When would you want to use Hashable.equal to compare two type classes?
1) When you need to ensure that two hashtables are using the same comparison function. For example, if you would like to merge two tables or intersect two tables, they should use the same comparison/hash functions, otherwise, the results are undefined.
2) When you need to compare two hashtables for equality.
Is there a way to define the polymorphic version without using the built in polymorphic hash functions and equals methods?
If by "built-in" you mean primitives provided by OCaml, then the answer is no: such a hashtable has to use the polymorphic comparison primitive from the OCaml standard library.
You don't have to use the Hashable module from the Base library to access them. They are also available via the Caml or Polymorphic_compare modules in Base. Or, if you're not using the Base or Core libraries, the compare function from Stdlib is polymorphic by default and has type 'a -> 'a -> int.
With all that said, I think some clarification is needed on what we mean by a polymorphic version. Base's Hash_set and Hashtbl are also polymorphic data structures, as they have types 'a t and ('k,'a) t respectively, which are both polymorphic in their keys. They do not, however, rely on the polymorphic comparison function but require the user to provide a comparison function during construction. In fact, they require an implementation of the hashable interface. Therefore, to create an empty hash table you need to pass it a module which implements it, e.g.,
let empty = Hashtbl.create (module Int)
Here the passed module must implement the Hashable.Key interface, which, among other things, provides the implementation of hashable via the Hashable.of_key function. The hashtable implementation simply stores the comparison functions in itself, e.g., roughly,
type ('a,'k) hashtbl = {
mutable buckets : Avltree.t array;
mutable length : int;
hashable : 'k hashable;
}
I think that, given this implementation, it is now more obvious when we need to compare two hashable records.
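A toy sketch of that design, not Base's actual implementation: the set carries its own record of functions, and two sets can be checked for using the same record. The type and function names (hashable, set, ci, same_hashable) are all invented here, and physical equality (==) stands in for whatever Hashable.equal really does.

```ocaml
(* A record of functions, playing the role of the hashable interface. *)
type 'a hashable = { hash : 'a -> int; compare : 'a -> 'a -> int }

(* The set stores the functions it was built with. *)
type 'a set = { hashable : 'a hashable; buckets : 'a list array }

let create hashable = { hashable; buckets = Array.make 16 [] }

let index s e = (s.hashable.hash e land max_int) mod Array.length s.buckets

let mem s e =
  List.exists (fun e' -> s.hashable.compare e e' = 0) s.buckets.(index s e)

let add s e =
  let i = index s e in
  if not (mem s e) then s.buckets.(i) <- e :: s.buckets.(i)

(* Example instance: case-insensitive strings. *)
let ci =
  { hash = (fun str -> Hashtbl.hash (String.lowercase_ascii str));
    compare =
      (fun a b ->
        String.compare (String.lowercase_ascii a) (String.lowercase_ascii b)) }

(* Two sets may only be merged safely if they share the same functions;
   this sketch approximates that check with physical equality. *)
let same_hashable a b = a.hashable == b.hashable
```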
Is one version (monomorphic with Functor vs polymorphic) preferable over the other?
First of all, we actually have three versions. Functor, polymorphic, and one that uses the polymorphic comparison function (let's name it universal). The latter is of the least preference and should be avoided if possible.
Concerning the former two, both are good, but a polymorphic version is more versatile without involving too many compromises. Theoretically, a functor version opens more opportunities for compiler optimizations (as the comparison function could be inlined), but it comes with a price that for every key you will have a different module/type.
You can also benefit from both approaches and provide both polymorphic and monomorphic implementations (with the latter being a specialization of the former), e.g., this is how maps and sets are implemented in Jane Street's Base/Core. There is a polymorphic type for a set,
type ('a,'comparator_witness) set
which is a binary tree coupled with a compare function, which is reflected in the set type by the 'comparator_witness type, so that for each comparison function a fresh new type is generated, thus preventing Set.union et al operations between two sets that have different compare function stored in them.
There is also, at the same time a Set.Make(K : Key) functor, which creates a module that provides type t = (K.t,K.comparator_witness) set type, which can, theoretically, benefit from inlining. Moreover, each module that implements Core.Comparable.S and below, will also provide a .Map, .Set, etc modules, e.g., Int.Set. Those modules are usually created via the corresponding Make functors (i.e., Map.Make, Set.Make), but they open opportunities for manual specializations.
1) Therefore Hashable.equal function actually compares functions not values. It basically compares two type classes. And, I believe, the Hashable.hash function is typed 'a -> int accidentally, and the type that was intended was also 'a t -> int.
module HashSet(H : Hashable) : (MutableSet with type t = H.t)
This, I guess. Cannot check at the moment, though.
The issue is that the equality H.equal in
List.exists ~f:(fun e' -> H.equal e e') bucket
is the equality over hash function dictionaries ('a H.t). Thus as written, the contains function only works on sets of hash function dictionaries. If you want a polymorphic mutable set, you will have to use the polymorphic equality.

OCaml type constructor help. I need to understand this further

An ordered list is a list in which an element that appears later in the list is never smaller than an element that appears earlier in the list, where the notion of "smaller than" is given by a specific chosen relation. For example, the list
[1; 5; 7; 12; 13]
is an ordered list relative to the usual < relation on numbers, the list
[17; 14; 9; 6; 2]
is an ordered list relative to the > relation on numbers but not relative to the < order and the list
[17; 14; 9; 13; 2]
is not an ordered list relative to either of them.
How would I define a type constructor olist that is like the list type constructor except that values of this type carry enough information to determine whether or not they are ordered lists relative to the intended ordering relation?
For example, I need a function
initOList : ('a -> 'a -> bool) -> 'a olist
that takes an ordering relation over a given type and returns an empty olist of that type. Where do I begin with creating this type constructor and using it to write the initOList function?
A static type represents some property of an expression that can be checked at the compile time, i.e., without relying on any information that is available only at runtime. So you can't really express in the type systems things that depend on runtime values, such as the list length, ordering, contents, etc.
In your particular case, you want to have a type constructor that takes an ordering and returns a new type. However, type constructors are functions in the type domain, i.e., they map types to types (i.e., take types as their parameters and return types). But ordering is a function of type 'a -> 'a -> int, so you can't really express it in OCaml.
A type system that allows values to appear in the domain of type constructors (i.e., to parametrize types with runtime values) is called a dependent type system. OCaml doesn't really provide dependent typing. This is not because it is impossible; it is a matter of choice, as working with a dependent type system is much harder. Moreover, type inference in dependent type systems is undecidable (i.e., it is impossible to write an algorithm that will work for all programs), so significant help from the user is usually required to prove that a program is well-typed. Thus the systems that implement dependent typing are closer to automated theorem provers, e.g., Coq, Isabelle, etc. However, there are those that are closer to conventional programming languages, e.g., F*.
So now we are clear that in OCaml a type cannot be parametrized with a runtime value. However, the type system still provides enough power to express the notion of a sorted list. We can't really use the type system to check that our sorting algorithm is correct, but we can tell it: hey, this function ensures that a list is sorted, just believe us and take it for granted. We can represent this using module-level abstraction, e.g.,
module SortedList : sig
type 'a t
val create : ('a -> 'a -> int) -> 'a list -> 'a t
end = struct
type 'a t = 'a list
let create = List.sort
end
The idea is that we have type 'a SortedList.t that can be only constructed with the SortedList.create function and there are no other ways around it. So an expression that has type 'a SortedList.t bears a proof in itself, that it is sorted. We can use it to make preconditions of functions explicit in the type system, e.g., suppose we have a dedup function that removes duplicates from a list, and it works correctly if the input list is sorted. We can't really test this precondition, thus we may rely on a type system:
val dedup : 'a SortedList.t -> 'a SortedList.t
Thus the dedup function type states that (a) it is applicable only to sorted lists, and (b) it preserves the ordering.
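A minimal sketch of how dedup could live inside the module, where it is allowed to see the underlying list. I am assuming here that structural equality (=) agrees with the ordering that was used to sort, and to_list is an accessor added only so the result can be inspected.

```ocaml
module SortedList : sig
  type 'a t
  val create : ('a -> 'a -> int) -> 'a list -> 'a t
  val dedup : 'a t -> 'a t
  val to_list : 'a t -> 'a list
end = struct
  type 'a t = 'a list

  let create = List.sort

  (* Relies on the input being sorted: equal elements are adjacent,
     so one left-to-right pass removes all duplicates. *)
  let rec dedup = function
    | x :: (y :: _ as rest) when x = y -> dedup rest
    | x :: rest -> x :: dedup rest
    | [] -> []

  let to_list xs = xs
end

let unique = SortedList.(to_list (dedup (create compare [3; 1; 3; 2; 1])))
```

Outside the module, calling dedup on a plain int list is a type error, which is the whole point of the abstraction.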
There are two issues with our approach. First of all, our 'a SortedList.t is too abstract, as it doesn't provide any access operators. We can just provide a to_list operator, that will erase the proof that a list is ordered, but allow all the list operations on it, e.g.,
module SortedList : sig
type 'a t
val create : ('a -> 'a -> int) -> 'a list -> 'a t
val to_list : 'a t -> 'a list
end = struct
type 'a t = 'a list
let create = List.sort
let to_list xs = xs
end
The to_list operation is correct, as a set of all sorted lists is a subset of all lists. That means, that 'a SortedList.t is actually a subtype of 'a list. It is nice, that OCaml provides an explicit mechanism for expressing this subtype relation via private abstract types.
module SortedList : sig
type 'a t = private 'a list
val create : ('a -> 'a -> int) -> 'a list -> 'a t
end = struct
type 'a t = 'a list
let create = List.sort
end
Such a definition states that 'a SortedList.t is a subtype of 'a list, so it is possible to upcast one into the other. Remember that upcasting is explicit in OCaml, not automatic, so you need to use the upcasting operator, e.g., (xs : 'a SortedList.t :> 'a list).
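Here is a short sketch of the private version in use; the values sorted, as_list and len are only for illustration.

```ocaml
module SortedList : sig
  type 'a t = private 'a list
  val create : ('a -> 'a -> int) -> 'a list -> 'a t
end = struct
  type 'a t = 'a list
  let create = List.sort
end

let sorted = SortedList.create compare [3; 1; 2]

(* The coercion forgets the "sorted" guarantee and gives back a plain list. *)
let as_list = (sorted :> int list)

(* Ordinary list functions need the coercion too; it is never implicit. *)
let len = List.length (sorted :> int list)

(* let bad : int SortedList.t = [3; 1; 2] *)
(* The line above is rejected: a plain list cannot be used as a sorted one. *)
```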
The second problem in our implementation is that our 'a SortedList.t doesn't really distinguish lists that are sorted differently, i.e., ascending or descending. It's not a problem for the dedup function, but some functions may require that their input is sorted in a specific order (for example a function that will find a median or any other statistic mode). So we need to encode the ordering. We will encode it in such a way that each ordering function is treated as a different type. (An alternative solution would be to encode concrete variants, such as Ascending and Descending, but I will leave this as an exercise.) The main drawback of our approach is that our ordered list can't be a parametric type anymore, as the ordering function is defined for a particular ground type. In fact, this means that our SortedList.t is now a higher-order polymorphic type, so we need to use functors to implement it:
module type Ordering = sig
type element
type witness
val compare : element -> element -> int
end
module SortedList : sig
type ('a,'o) t = private 'a list
module Make(Order : Ordering) : sig
type nonrec t = (Order.element, Order.witness) t
val create : Order.element list -> t
end
end = struct
type ('a,'o) t = 'a list
module Make(Order : Ordering) = struct
type nonrec t = (Order.element, Order.witness) t
let create = List.sort Order.compare
end
end
Now, let's play a little bit with our implementation. Let's provide two different orders:
module AscendingInt = struct
type element = int
type witness = Ascending_int
let compare : int -> int -> int = compare
end
module DescendingInt = struct
type element = int
type witness = Descending_int
let compare : int -> int -> int = fun x y -> compare y x
end
module AscendingSortedList = SortedList.Make(AscendingInt)
module DescendingSortedList = SortedList.Make(DescendingInt)
Now, let's test that the two sorted lists are actually having different types:
# let asorted = AscendingSortedList.create [3;2;1];;
val asorted : AscendingSortedList.t = [1; 2; 3]
# let bsorted = DescendingSortedList.create [3;2;1];;
val bsorted : DescendingSortedList.t = [3; 2; 1]
# compare asorted bsorted;;
Characters 16-23:
compare asorted bsorted;;
^^^^^^^
Error: This expression has type
DescendingSortedList.t =
(DescendingInt.element, DescendingInt.witness) SortedList.t
but an expression was expected of type
AscendingSortedList.t =
(AscendingInt.element, AscendingInt.witness) SortedList.t
Type DescendingInt.witness is not compatible with type
AscendingInt.witness
Now let's check that both of them are actually subtypes of 'a list:
# compare (asorted :> int list) (bsorted :> int list);;
- : int = -1
According to the definition you've provided, an olist is simply a list with a comparison function attached to it. This can be easily achieved using records (you can read more about them here).
From there, you can write functions that create and manipulate values of your new type.
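A rough sketch of the record approach, under my own assumptions: the field names rel and items, and the helpers insert and is_ordered, are invented for the example. Only initOList and its type come from the question.

```ocaml
(* An olist is a list together with the relation it should be ordered by. *)
type 'a olist = { rel : 'a -> 'a -> bool; items : 'a list }

(* initOList : ('a -> 'a -> bool) -> 'a olist *)
let initOList rel = { rel; items = [] }

(* Insert x in front of the first element y for which rel x y holds. *)
let rec insert_sorted rel x = function
  | [] -> [x]
  | y :: rest as l -> if rel x y then x :: l else y :: insert_sorted rel x rest

let insert x ol = { ol with items = insert_sorted ol.rel x ol.items }

(* Ordered means: a later element is never "smaller" than an earlier one. *)
let is_ordered ol =
  let rec go = function
    | a :: (b :: _ as rest) -> not (ol.rel b a) && go rest
    | _ -> true
  in
  go ol.items

let ol = initOList ( < ) |> insert 5 |> insert 1 |> insert 7
```

Because the record type is not abstract, anyone can still build an unordered value by hand, which is why is_ordered exists as a runtime check.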
We need to go deeper
If you've managed to implement the solution above, I suggest you put all your definitions inside a module. You can read more about these here. Put simply, modules are a way to put your type definition and your functions together.

What's the difference between these two functions?

I have two functions:
let rev_flatten l =
List.fold_left (fun acc x -> List.fold_left (fun acc y -> y::acc) acc x) [] l
The type is val rev_flatten : 'a list list -> 'a list = <fun>
and
let rev_flatten =
List.fold_left (fun acc x -> List.fold_left (fun acc y -> y::acc) acc x) []
The type is val rev_flatten : '_a list list -> '_a list = <fun>
I think it is the same functions, at least the same functionality, but why they have two different types? Why the second has the element type of _a? What is it?
A type variable with an underscore as a prefix tells us that the variable is weakly polymorphic. A weakly polymorphic variable can be used with only one type; however, the compiler can't yet deduce the exact type, so the type variable is marked with an underscore.
When you provide an argument for the first time, a variable will no longer be polymorphic and will be able to accept arguments of a single type only.
Usually, a function is not generalized, but marked as weakly polymorphic, if it might contain mutable state. In your example this is probably the case, because the type system doesn't know whether List.fold_left is a pure or an impure function.
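A small sketch of the practical consequence, using the two definitions from the question under the hypothetical names rev_flatten_eta and rev_flatten_weak:

```ocaml
(* Eta-expanded: fully polymorphic, 'a list list -> 'a list. *)
let rev_flatten_eta l =
  List.fold_left (fun acc x -> List.fold_left (fun acc y -> y :: acc) acc x) [] l

(* Partial application: only weakly polymorphic. *)
let rev_flatten_weak =
  List.fold_left (fun acc x -> List.fold_left (fun acc y -> y :: acc) acc x) []

(* The polymorphic version works at several element types. *)
let ints = rev_flatten_eta [[1; 2]; [3]]
let strs = rev_flatten_eta [["a"]; ["b"; "c"]]

(* The first use fixes the weak version to int list list -> int list. *)
let weak_ints = rev_flatten_weak [[1; 2]; [3]]

(* let weak_strs = rev_flatten_weak [["a"]] *)
(* The line above is rejected: the type variable was already fixed to int. *)
```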
Edit:
Why avoiding partial application (eta expansion) allows function (even impure) to be polymorphic?
Let's define a function that has an internal counter that is incremented and printed every time the function is called. In addition, it takes a function as its argument and applies it after increasing the counter:
let count f =
let inc = ref 0 in
(fun x -> inc := !inc + 1; print_int !inc; f x);;
This function is polymorphic: ('a -> 'b) -> 'a -> 'b.
Next, let's define two more functions. A weakly polymorphic one:
let max' = count max;;
val max' : '_a -> '_a -> '_a = <fun>
and a polymorphic one:
let max'' x = count max x;;
val max'' : 'a -> 'a -> 'a = <fun>
Now notice what is printed when we execute these functions:
max' 1 2;; (* prints 1 *)
max' 1 2;; (* prints 2 *)
max' 1 2;; (* prints 3 *)
max'' 1 2;; (* prints 1 *)
max'' 1 2;; (* prints 1 *)
max'' 1 2;; (* prints 1 *)
So the function that we designed as weakly polymorphic has persistent mutable state inside, which lets the counter work as expected, while the polymorphic function is stateless and is reconstructed with every call, although we wanted to have a mutable variable inside.
This is the reason for a compiler to prefer a weakly polymorphic function that can be used with any single type instead of supporting full-fledged polymorphism.
A function with the type '_a list list -> '_a list is weakly polymorphic. What this means is that if you call the second one on an int list list, rev_flatten will no longer be '_a list list -> '_a list but int list list -> int list.
You can read more about the details of why here:
http://caml.inria.fr/resources/doc/faq/core.en.html
Cheers,
Scott
This is just the ML-style value restriction. There are some good references in a previous SO answer by gasche: What is the difference between 'a and '_l?.
Generally speaking the ML family applies a simple syntactic test to see whether it's safe to fully generalize, that is, to make a type fully polymorphic. If you generalize a case that's not safe, the program has undefined behavior (it can crash or get the wrong answer). So you need to do it only when safe.
The syntactic rule is applied because it's (relatively) easy to remember. A more complex rule was tried for a while, but it caused more harm than good (was the general conclusion). A historic description of the ML family will explain it better than I can.
One of your functions (the second one) is defined as an expression, i.e., as a function application. This is not "safe" according to the value restriction. (Remember, it's a syntactic test only.) The first is a lambda (fun x -> expr). This is "safe".
It's called the value restriction because it considers values to be safe. A function application is not a (syntactic) value. A lambda is a syntactic value. Something like [] is a value. Something like ref [] is not a value.
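A few one-liners to illustrate the syntactic test (the names empty, id and cell are just for the example):

```ocaml
(* Syntactic values: generalized. *)
let empty = []              (* 'a list *)
let id = fun x -> x         (* 'a -> 'a *)

(* A function application is not a syntactic value: cell gets a weak type. *)
let cell = ref []

(* The first use fixes the weak type variable; cell is now an int list ref. *)
let () = cell := [1]

(* The generalized values can be used at several types. *)
let a = 1 :: empty
let b = "x" :: empty
let c = (id 1, id "s")

(* let () = cell := ["oops"] *)
(* The line above is rejected: cell is already known to hold an int list. *)
```

If cell had been generalized, the commented-out line would have been accepted and a string could later be read back as an int, which is the unsafe case the restriction rules out.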

How to write the list in OCaml?

If I want to write list.ml in OCaml,
Q1
which way is correct?
type 'a list =
| Nil
| Cons of 'a * ('a list)
or
type 'a list =
| Nil
| Cons of 'a * 'a list
Any differences?
Q2
Also, how do I define the Cons inside the type definition as ::?
Q3
How do I define Nil inside the type definition as []?
Q1 -
There is no difference; each has two parameters associated with Cons. However, Cons of ('a * 'a list) is different, since it has one parameter, a tuple. You will come across that as an important distinction if you construct a tuple and try to wrap it in Cons, as in let x = a,Nil in Cons x. The choice depends on how you plan on constructing elements or on some semantics of the data. In this particular case, no parentheses should be used.
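A quick sketch of the distinction, with the types and constructors renamed (two_args, one_tuple, Cons2, Cons1 and so on are made up) so that both variants can sit in the same file:

```ocaml
(* Constructor with two arguments (no parentheses). *)
type 'a two_args = Nil2 | Cons2 of 'a * 'a two_args

(* Constructor with a single argument that happens to be a tuple. *)
type 'a one_tuple = Nil1 | Cons1 of ('a * 'a one_tuple)

(* Both accept the same literal syntax when built directly. *)
let a = Cons2 (1, Nil2)
let b_direct = Cons1 (1, Nil1)

(* Only the tuple form accepts a pair that was built beforehand. *)
let pair = (1, Nil1)
let b = Cons1 pair

(* let c = let p = (1, Nil2) in Cons2 p *)
(* The line above is rejected: Cons2 expects 2 arguments, not one tuple. *)
```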
Q2 -
You cannot use : as the first character of infix function names, as it is a keyword in the language -- :: is also a keyword regardless. In general, infix operators can be defined with parentheses around the function name, and there is a special set of symbols allowed,
let ( @: ) a b = Cons (a, b) (* infix use: 1 @: Nil; operators starting with ! are prefix-only *)
Q3 -
This would require naming an identifier [], as in let [] = Nil. Those characters are not allowed in the naming conventions (see same link as above) as they are also individually keywords.
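For what it's worth, more recent compilers (OCaml 4.03 and later, as far as I know) do accept [] and (::) as constructor names inside a type definition, so a user-defined list can reuse the built-in notation. A minimal sketch, with mylist and length as made-up names:

```ocaml
(* From this point on, [] and :: refer to mylist's constructors and
   shadow the ones of the built-in list type. *)
type 'a mylist = [] | (::) of 'a * 'a mylist

let rec length = function
  | [] -> 0
  | _ :: tl -> 1 + length tl

(* xs has type int mylist, not int list. *)
let xs = 1 :: 2 :: 3 :: []
```

Because of the shadowing, it is probably wise to keep such a definition inside its own module.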

Type inference in SML

I'm currently learning SML and I have a question about something I have no name for. Let's call it "type alias" for the moment. Suppose I have the following datatype definition:
datatype 'a stack = Stack of 'a list;
I now want to add an explicit "empty stack" type. I can do this by adding it to the datatype:
datatype 'a stack = emptystack | Stack of 'a list;
Now I can pattern match a function like "push":
fun push (emptystack) (e:'a) = Stack([e])
| push (Stack(list):'a stack) (e:'a) = Stack(e::list);
The problem here is that Stack([]) and emptystack are different but I want them to be the same. So every time SML encounters a Stack([]) it should "know" that this is emptystack (in case of push it should then use the emptystack match).
Is there a way to achieve this?
The short answer is: No, it is not possible.
You can create type aliases with the code
type number = int
val foo : number -> int -> number =
fn a => fn b => a+b
val x : int = foo 1 3;
val y : number = foo 1 3;
However, as the name says, it only works for types. Your question is about value constructors, for which there is no such syntax.
Such an aliasing is not possible in SML.
Instead, you should design your datatypes to be unambiguous in their representation, if that is what you desire.
You'd probably be better suited with something that resembles the definition of 'a list more:
datatype 'a stack = EmptyStack | Stack of 'a * 'a stack;
This has the downside of not letting you use the list functions on it, but you do get an explicit empty stack constructor.
Since what you want is for one value emptystack to be synonymous with another value Stack [], you could call what you are looking for "value aliases". Values that are compared with the built-in operator = or pattern matching will not allow for aliases.
You can achieve this by creating your own equality operator, but you will lose the ability to use the built-in = (since Standard ML does not support custom operator overloading) as well as the ability to pattern match on the value constructors of your type.
Alternatively, you can construct a normal form for your type and always compare the normal form. Whenever practically feasible, follow Sebastian's suggestion of no ambiguity. There might be situations in which an unambiguous algebraic type will be much more complex than a simpler one that allows the same value to be represented in different ways.
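A rough sketch of the normal-form idea, written in OCaml rather than SML to stay consistent with the other examples on this page (the SML version would use datatype and fun in the same shape). The functions normalize and equal_stack are names I made up.

```ocaml
(* Same shape as the datatype in the question: two ways to be empty. *)
type 'a stack = EmptyStack | Stack of 'a list

(* Normal form: the empty stack is always represented by EmptyStack. *)
let normalize = function Stack [] -> EmptyStack | s -> s

(* Compare normal forms instead of raw values. *)
let equal_stack a b = normalize a = normalize b

(* Normalizing first means push needs only one case for "empty". *)
let push s e =
  match normalize s with
  | EmptyStack -> Stack [e]
  | Stack l -> Stack (e :: l)
```

The cost is that every function touching the type has to remember to normalize, which is why an unambiguous datatype is usually the nicer option when it is feasible.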