How to write the list in OCaml? - list

If I want to write list.ml in OCaml,
Q1
which way is correct?
type 'a list =
| Nil
| Cons of 'a * ('a list)
or
type 'a list =
| Nil
| Cons of 'a * 'a list
Any differences?
Q2
Also, how do I define the Cons inside the type definition as ::?
Q3
How do I define Nil inside the type definition as []?

Q1 -
There is no difference; in both, Cons takes two arguments. Cons of ('a * 'a list) is different, though, since it takes a single argument, a tuple. The distinction matters if you construct a tuple and try to wrap it in Cons, as in let x = a, Nil in Cons x, which only type-checks with the tuple form. The choice depends on how you plan to construct elements and on the semantics of the data. In this particular case, no parentheses should be used.
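A minimal sketch of that distinction. The type and constructor names below (two_args, one_tuple, Cons2, ConsT, ...) are made up for illustration so that both variants can live in one file:

```ocaml
(* Two-argument constructor: the form recommended above. *)
type 'a two_args =
  | Nil2
  | Cons2 of 'a * 'a two_args

(* One-argument constructor whose single argument is a tuple. *)
type 'a one_tuple =
  | NilT
  | ConsT of ('a * 'a one_tuple)

(* With the tuple form, a pre-built pair can be passed directly. *)
let pair = (1, NilT)
let from_pair = ConsT pair

(* With the two-argument form, Cons2 applied to a pair variable is rejected
   by the type checker, so the pair has to be destructured first. *)
let parts = (1, Nil2)
let from_parts = let (a, b) = parts in Cons2 (a, b)
```

Both forms are written the same way when the arguments are spelled out (Cons2 (1, Nil2) and ConsT (1, NilT)); they only differ when an existing tuple value is passed.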
Q2 -
You cannot use : as the first character of an infix function name, as it is a keyword in the language; :: is also a keyword regardless. In general, infix operators are defined by putting parentheses around the operator name, and only a special set of symbols is allowed. Note that a name starting with ! is a prefix operator, so an infix alias for Cons needs a different first character, for example:
let ( @: ) a b = Cons (a, b)
Q3 -
This would require naming an identifier [], as in let [] = Nil. Those characters are not allowed in the naming conventions (see same link as above) as they are also individually keywords.
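A rough sketch of both points. The operator name @: and the module name My are arbitrary choices for this example. The second half assumes OCaml 4.03 or later, where, as far as I know, [] and (::) may be declared as constructor names in a type definition; older compilers reject it, as described above.

```ocaml
type 'a mylist =
  | Nil
  | Cons of 'a * 'a mylist

(* Hypothetical infix alias for Cons. Operators starting with @ are
   right-associative, like the built-in ::. *)
let ( @: ) x xs = Cons (x, xs)

let numbers = 1 @: 2 @: 3 @: Nil

(* Assumption: OCaml 4.03 or later. The definition is kept inside a module
   so that the built-in list constructors are not shadowed elsewhere. *)
module My = struct
  type 'a t =
    | []
    | ( :: ) of 'a * 'a t

  let rec length = function
    | [] -> 0
    | _ :: tl -> 1 + length tl
end

(* Inside My.( ... ) the list literal is built from the constructors of My.t. *)
let three = My.(length [1; 2; 3])
```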

Related

OCaml type constructor help. I need to understand this further

An ordered list is a list in which an element that appears later in the list is never smaller than an element that appears earlier in the list, where the notion of "smaller than" is given by a specific chosen relation. For example, the list
[1; 5; 7; 12; 13]
is an ordered list relative to the usual < relation on numbers, the list
[17; 14; 9; 6; 2]
is an ordered list relative to the > relation on numbers but not relative to the < order and the list
[17; 14; 9; 13; 2]
is not an ordered list relative to either of these relations.
How would I define a type constructor olist that is like the list type constructor except that values of this type carry enough information to determine whether or not they are ordered lists relative to the intended ordering relation?
For example, I need a function
initOList : ('a -> 'a -> bool) -> 'a olist
that takes an ordering relation over a given type and returns an empty olist of that type. Where do I begin with creating this type constructor and using it to write the initOList function?
A static type represents some property of an expression that can be checked at the compile time, i.e., without relying on any information that is available only at runtime. So you can't really express in the type systems things that depend on runtime values, such as the list length, ordering, contents, etc.
In your particular case, you want to have a type constructor that takes an ordering and returns a new type. However, type constructors are functions in the type domain, i.e., they map types to types (they take types as their parameters and return types). But an ordering is a value, a function of type 'a -> 'a -> int, so you can't really pass it to a type constructor in OCaml.
A type system that allows values to appear in the domain of type constructors (i.e., to parametrize types with runtime values) is called a dependent type system. OCaml doesn't really provide dependent typing. This is not because it is impossible to do; it is a matter of choice. Working with a dependent type system is much harder, and type inference in dependent type systems is undecidable (i.e., it is impossible to write an algorithm that will work for all programs), so significant help from the user is usually required to prove that a program is well-typed. Thus the systems that implement dependent typing are closer to theorem provers, e.g., Coq, Isabelle, etc. However, there are some that are closer to conventional programming languages, e.g., F*.
So now it is clear that in OCaml a type cannot be parametrized with a runtime value. However, the type system still provides enough power to express the notion of a sorted list. We can't really use the type system to check that our sorting algorithm is correct, but we can tell it: this function ensures that a list is sorted, just believe us and take it for granted. We can represent this using module-level abstraction, e.g.,
module SortedList : sig
type 'a t
val create : ('a -> 'a -> int) -> 'a list -> 'a t
end = struct
type 'a t = 'a list
let create = List.sort
end
The idea is that we have type 'a SortedList.t that can be only constructed with the SortedList.create function and there are no other ways around it. So an expression that has type 'a SortedList.t bears a proof in itself, that it is sorted. We can use it to make preconditions of functions explicit in the type system, e.g., suppose we have a dedup function that removes duplicates from a list, and it works correctly if the input list is sorted. We can't really test this precondition, thus we may rely on a type system:
val dedup : 'a SortedList.t -> 'a SortedList.t
Thus the dedup function type states that (a) it is applicable only to sorted lists, and (b) it preserves the ordering.
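Since the type is abstract, a function like dedup has to live inside the module (or be built from whatever the module exports). A minimal sketch follows, assuming that duplicates are detected with polymorphic equality and that a to_list accessor is exported purely so the result can be inspected; neither choice is prescribed by the signature above.

```ocaml
module SortedList : sig
  type 'a t
  val create : ('a -> 'a -> int) -> 'a list -> 'a t
  val dedup : 'a t -> 'a t
  val to_list : 'a t -> 'a list
end = struct
  type 'a t = 'a list

  let create = List.sort

  (* Relies on the input being sorted: equal elements are adjacent, so one
     pass that drops an element equal to its successor is enough. *)
  let rec dedup = function
    | x :: (y :: _ as rest) when x = y -> dedup rest
    | x :: rest -> x :: dedup rest
    | [] -> []

  let to_list xs = xs
end

let unique = SortedList.(to_list (dedup (create compare [3; 1; 2; 3; 1])))
```

Outside the module there is no way to hand dedup an unsorted list, because the only way to obtain an 'a SortedList.t is through create.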
There are two issues with our approach. First of all, our 'a SortedList.t is too abstract, as it doesn't provide any access operators. We can just provide a to_list operator, that will erase the proof that a list is ordered, but allow all the list operations on it, e.g.,
module SortedList : sig
type 'a t
val create : ('a -> 'a -> int) -> 'a list -> 'a t
val to_list : 'a t -> 'a list
end = struct
type 'a t = 'a list
let create = List.sort
let to_list xs = xs
end
The to_list operation is correct, as the set of all sorted lists is a subset of all lists. That means that 'a SortedList.t is actually a subtype of 'a list. It is nice that OCaml provides an explicit mechanism for expressing this subtype relation via private type abbreviations.
module SortedList : sig
type 'a t = private 'a list
val create : ('a -> 'a -> int) -> 'a list -> 'a t
end = struct
type 'a t = 'a list
let create = List.sort
end
Such a definition states that 'a SortedList.t is a subtype of 'a list, so it is possible to upcast one into the other. Remember that upcasting is explicit in OCaml, not automatic, so you need to use the coercion operator, e.g., (xs : 'a SortedList.t :> 'a list).
The second problem with our implementation is that 'a SortedList.t doesn't distinguish lists that are sorted differently, e.g., ascending or descending. That is not a problem for the dedup function, but some functions may require their input to be sorted in a specific order (for example, a function that finds the median or a similar statistic). So we need to encode the ordering. We will do so by treating each ordering function as a different type. (An alternative solution would be to encode concrete variants, such as Ascending and Descending, but I will leave this as an exercise.) The main drawback of our approach is that our sorted list can't be a parametric type anymore, as the ordering function is defined for a particular ground type. In fact, this means that our SortedList.t is now a higher-order polymorphic type, so we need to use functors to implement it:
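A short usage sketch of the private version. The names sorted, total and first are only for illustration:

```ocaml
module SortedList : sig
  type 'a t = private 'a list
  val create : ('a -> 'a -> int) -> 'a list -> 'a t
end = struct
  type 'a t = 'a list
  let create = List.sort
end

let sorted = SortedList.create compare [3; 1; 2]

(* Ordinary list functions become available after an explicit coercion. *)
let total = List.fold_left ( + ) 0 (sorted :> int list)

let first =
  match (sorted :> int list) with
  | [] -> None
  | x :: _ -> Some x

(* The other direction is rejected: a plain int list cannot be coerced to
   int SortedList.t, so create stays the only way to build one. *)
```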
module type Ordering = sig
type element
type witness
val compare : element -> element -> int
end
module SortedList : sig
type ('a,'o) t = private 'a list
module Make(Order : Ordering) : sig
type nonrec t = (Order.element, Order.witness) t
val create : Order.element list -> t
end
end = struct
type ('a,'o) t = 'a list
module Make(Order : Ordering) = struct
type nonrec t = (Order.element, Order.witness) t
let create = List.sort Order.compare
end
end
Now, let's play a little bit with our implementation. Let's provide two different orderings:
module AscendingInt = struct
type element = int
type witness = Ascending_int
let compare : int -> int -> int = compare
end
module DescendingInt = struct
type element = int
type witness = Descending_int
let compare : int -> int -> int = fun x y -> compare y x
end
module AscendingSortedList = SortedList.Make(AscendingInt)
module DescendingSortedList = SortedList.Make(DescendingInt)
Now, let's test that the two sorted lists actually have different types:
# let asorted = AscendingSortedList.create [3;2;1];;
val asorted : AscendingSortedList.t = [1; 2; 3]
# let bsorted = DescendingSortedList.create [3;2;1];;
val bsorted : DescendingSortedList.t = [3; 2; 1]
# compare asorted bsorted;;
Characters 16-23:
compare asorted bsorted;;
^^^^^^^
Error: This expression has type
DescendingSortedList.t =
(DescendingInt.element, DescendingInt.witness) SortedList.t
but an expression was expected of type
AscendingSortedList.t =
(AscendingInt.element, AscendingInt.witness) SortedList.t
Type DescendingInt.witness is not compatible with type
AscendingInt.witness
Now let's check that both of them are actually subtypes of 'a list:
# compare (asorted :> int list) (bsorted :> int list);;
- : int = -1
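To show what the witness buys us, here is a sketch of a function that only accepts ascending lists. The definitions are repeated so the block stands on its own; the function name smallest is hypothetical.

```ocaml
module type Ordering = sig
  type element
  type witness
  val compare : element -> element -> int
end

module SortedList : sig
  type ('a, 'o) t = private 'a list
  module Make (Order : Ordering) : sig
    type nonrec t = (Order.element, Order.witness) t
    val create : Order.element list -> t
  end
end = struct
  type ('a, 'o) t = 'a list
  module Make (Order : Ordering) = struct
    type nonrec t = (Order.element, Order.witness) t
    let create = List.sort Order.compare
  end
end

module AscendingInt = struct
  type element = int
  type witness = Ascending_int
  let compare : int -> int -> int = compare
end

module AscendingSortedList = SortedList.Make (AscendingInt)

(* Only correct for ascending input, and the argument type says so: the
   minimum of an ascending list is its first element. *)
let smallest (xs : AscendingSortedList.t) : int option =
  match (xs :> int list) with
  | [] -> None
  | x :: _ -> Some x
```

Passing a DescendingSortedList.t to smallest would fail with the same kind of witness mismatch as the compare call above.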
According to the definition you've provided, an olist is simply a list with a comparison function attached to it. This can be easily achieved using records (you can read more about them here).
From there, you can write functions that create and manipulate values of your new type.
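One possible shape for such a record, as a sketch only. The field names before and items and the helper is_ordered are my own choices, and the check assumes the relation is transitive so that comparing adjacent elements is enough.

```ocaml
(* A list paired with the relation it is meant to be ordered by. *)
type 'a olist = {
  before : 'a -> 'a -> bool;  (* hypothetical field: the ordering relation *)
  items : 'a list;
}

(* Takes an ordering relation and returns an empty olist. *)
let initOList (before : 'a -> 'a -> bool) : 'a olist = { before; items = [] }

(* True when no element is smaller than the one just before it. *)
let rec is_ordered_by before = function
  | x :: (y :: _ as rest) -> not (before y x) && is_ordered_by before rest
  | _ -> true

let is_ordered ol = is_ordered_by ol.before ol.items
```

With this, { before = ( < ); items = [1; 5; 7; 12; 13] } and { before = ( > ); items = [17; 14; 9; 6; 2] } are both reported as ordered, while [17; 14; 9; 13; 2] is not ordered under either relation.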
We need to go deeper
If you've managed to implement the solution above, I suggest you put all your definitions inside a module. You can read more about these here. Put simply, modules are a way to put your type definition and your functions together.

Why is Haskell [] (list) not a type class?

I am writing a Haskell function which takes a list as input. That is, there's no reason it couldn't be a queue or dequeue, or anything that allows me to access its "head" and its "tail" (and check if it's empty). So the [a] input type seems too specific. But AFAIK there's no standard library typeclass that captures exactly this interface. Sure, I could wrap my function in a Data.Foldable.toList and make it polymorphic wrt Foldable, but that doesn't quite seem right (idiomatic).
Why is there no standard list type class? (And why is the "container" type class hierarchy in Haskell less developed than I think it should be?) Or am I missing something essential?
A given algebraic datatype can be represented as its catamorphism, a transformation known as Church encoding. That means lists are isomorphic to their foldr:
type List a = forall b. (a -> b -> b) -> b -> b
fromList :: [a] -> List a
fromList xs = \f z -> foldr f z xs
toList :: List a -> [a]
toList l = l (:) []
But foldr also characterises Foldable. You can define foldMap in terms of foldr, and vice versa.
foldMap f = foldr (mappend . f) mempty
foldr f z t = appEndo (foldMap (Endo . f) t) z
(It shouldn't be surprising that foldMap :: Monoid m => (a -> m) -> [a] -> m characterises lists, because lists are a free monoid.) In other words, Foldable basically gives you toList as a class. Instances of Foldable have a "path" through them which can be walked to give you a list; Foldable types have at least as much structure as lists.
Regarding your misgivings:
It's not like Foldable has functions head/tail/isEmpty, which is what I would find more intuitive.
null :: Foldable t => t a -> Bool is your isEmpty, and you can define (a safe version of) head straightforwardly with an appropriate choice of Monoid:
head :: Foldable t => t a -> Maybe a
head = getFirst . foldMap (First . Just)
tail is kinda tricky in my opinion. It's not obvious what tail would even mean for an arbitrary type. You can certainly write tail :: Foldable t => t a -> Maybe [a] (by toListing and then unconsing), but I think any type T for which tail :: T a -> Maybe (T a) is defined would necessarily be structurally similar to lists (eg Seq). Besides, in my experience, the vast majority of cases where you'd think you need access to a list's tail turn out to be folds after all.
That said, abstracting over unconsable types is occasionally useful. megaparsec, for example, defines a Stream class for (monomorphic) streams of tokens to be used as input for a parser.
The Question
Making your question more concrete, let's ask:
Why isn't the type class
class HasHeadAndTail t where
head :: t a -> Maybe a
tail :: t a -> Maybe (t a)
isEmpty :: t a -> Bool
in the base library?
An Answer
This class is only useful for ordered, linear containers. Map, Set, HashMap, HashTable, and Tree all would not be instances. I'd even argue against making Seq and DList an instance since there are really two possible "heads" of that structure.
Also, what can we say about any type that is an instance of this class? I think the only property is that if isEmpty is False then head and tail should be non-Nothing. As a result, isEmpty shouldn't even be in the class and should instead be a function isEmpty :: HasHeadAndTail t => t a -> Bool ; isEmpty = isNothing . head.
So my answer is:
This class lacks utility in so far as it lacks instances.
This class lacks useful properties and classes that lack properties are frequently discouraged.

What's the difference between these two functions?

I have two functions:
let rev_flatten l =
List.fold_left (fun acc x -> List.fold_left (fun acc y -> y::acc) acc x) [] l
The type is val rev_flatten : 'a list list -> 'a list = <fun>
and
let rev_flatten =
List.fold_left (fun acc x -> List.fold_left (fun acc y -> y::acc) acc x) []
The type is val rev_flatten : '_a list list -> '_a list = <fun>
I think it is the same functions, at least the same functionality, but why they have two different types? Why the second has the element type of _a? What is it?
A type variable with an underscore prefix tells us that the variable is weakly polymorphic. A weakly polymorphic variable can be used with only one type; however, the compiler can't deduce the exact type yet, so the type variable is marked with an underscore.
When you provide an argument for the first time, a variable will no longer be polymorphic and will be able to accept arguments of a single type only.
Usually, a function is not generalized, but marked as weakly polymorphic, if it might contain mutable state. In your example this is probably the case, because the type system doesn't know whether List.fold_left is a pure or an impure function.
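A small sketch of the practical consequence, using the two definitions from the question under different names (rev_flatten_eta and rev_flatten_weak are labels chosen for this example):

```ocaml
(* Eta-expanded: a syntactic function, so it is fully polymorphic. *)
let rev_flatten_eta l =
  List.fold_left (fun acc x -> List.fold_left (fun acc y -> y :: acc) acc x) [] l

(* Partial application: only weakly polymorphic. *)
let rev_flatten_weak =
  List.fold_left (fun acc x -> List.fold_left (fun acc y -> y :: acc) acc x) []

(* The polymorphic version can be used at several element types. *)
let ints = rev_flatten_eta [[1; 2]; [3]]
let strs = rev_flatten_eta [["a"]; ["b"]]

(* The first use of the weak version fixes its element type to int. *)
let ints_again = rev_flatten_weak [[1; 2]; [3]]

(* From here on, applying rev_flatten_weak to a string list list would be
   a type error. *)
```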
Edit:
Why avoiding partial application (eta expansion) allows function (even impure) to be polymorphic?
Let's define a function that has an internal counter that is incremented and printed out every time the function is called. In addition, it takes a function as its argument and applies it after increasing the counter:
let count f =
let inc = ref 0 in
(fun x -> inc := !inc + 1; print_int !inc; f x);;
This function is polymorphic: ('a -> 'b) -> 'a -> 'b.
Next, let's define two more functions. A weakly polymorphic one:
let max' = count max;;
val max' : '_a -> '_a -> '_a = <fun>
and a polymorphic one:
let max'' x = count max x;;
val max'' : 'a -> 'a -> 'a = <fun>
Now notice what is printed when we execute these functions:
max' 1 2;; (* prints 1 *)
max' 1 2;; (* prints 2 *)
max' 1 2;; (* prints 3 *)
max'' 1 2;; (* prints 1 *)
max'' 1 2;; (* prints 1 *)
max'' 1 2;; (* prints 1 *)
So the function that turned out weakly polymorphic has a persistent mutable state inside, which lets the counter work as expected, while the polymorphic function is stateless: its counter is recreated on every call, although we wanted to have a mutable variable inside.
This is the reason for a compiler to prefer a weakly polymorphic function that can be used with any single type instead of supporting full-fledged polymorphism.
A function with the type '_a list list -> '_a list is weakly polymorphic. What this means is that once you call the second one on an int list list, rev_flatten will no longer be '_a list list -> '_a list but int list list -> int list.
You can read more about the details of why here:
http://caml.inria.fr/resources/doc/faq/core.en.html
Cheers,
Scott
This is just the ML-style value restriction. There are some good references in a previous SO answer by gasche: What is the difference between 'a and '_l?.
Generally speaking the ML family applies a simple syntactic test to see whether it's safe to fully generalize, that is, to make a type fully polymorphic. If you generalize a case that's not safe, the program has undefined behavior (it can crash or get the wrong answer). So you need to do it only when safe.
The syntactic rule is applied because it's (relatively) easy to remember. A more complex rule was tried for a while, but it caused more harm than good (was the general conclusion). A historic description of the ML family will explain it better than I can.
One of your functions (the second one) is defined as an expression, i.e., as a function application. This is not "safe" according to the value restriction. (Remember, it's a syntactic test only.) The first is a lambda (fun x -> expr). This is "safe".
It's called the value restriction because it considers values to be safe. A function application is not a (syntactic) value. A lambda is a syntactic value. Something like [] is a value. Something like ref [] is not a value.
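A few bindings that illustrate the syntactic test. The names are arbitrary, and the weak type variables are printed as '_weak1 and so on by recent compilers (older ones print '_a).

```ocaml
(* Syntactic values: fully generalized. *)
let empty = []                 (* 'a list *)
let id = fun x -> x            (* 'a -> 'a *)

(* Not syntactic values: the type variable stays weak. *)
let cell = ref []                           (* '_weak1 list ref *)
let id_applied = (fun f -> f) (fun x -> x)  (* '_weak2 -> '_weak2 *)

(* Generalized bindings can be used at several types. *)
let ints = 1 :: empty
let strs = "a" :: empty
let both = (id 1, id "a")

(* The first use pins each weak variable to a single type. *)
let () = cell := [1]
let one = id_applied 1
```

After these lines, cell := ["a"] or id_applied "a" would be rejected, which is exactly what keeps a mutable cell from being read at two different types.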

In OCaml, how to predicate a variable is a List?

For example,
let x = ["a";"b";"c";"d"];;
let listp =
if (x.isa(List)) then true else false;;
Is there something like a "isa" method in OCaml to predicate a variable is a List/Array/Tuple... and so on?
OCaml has no constructs for testing the type of something. A good rule of thumb is that types are either fixed or are completely unknown. In the first case, there's no need to test. In the second case, the code is required to work for all possible types.
This works out a lot better than you might expect if you're used to other languages. It's kind of a nice application of the zero/one/infinity rule.
Note that there is no trouble defining a type that contains one of a set of types you are interested in:
type number = MyFloat of float | MyInt of int
Values of this type look like: MyFloat 3.1 or MyInt 30281. You can, in effect, test the type by matching against the constructor:
let is_int x = match x with MyFloat _ -> false | MyInt _ -> true
The same is true for lists and arrays, except that these are parameterized types:
type 'a collection = MyArray of 'a array | MyList of 'a list
let is_list x = match x with MyArray _ -> false | MyList _ -> true
What you get back for the lack of so-called introspection is that you can easily construct and deconstruct values with rich and expressive types, and you can be assured that functions you call can't mess with a value when they don't know what its type is.
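As a usage sketch, a function over the collection type above can branch on the constructor and then use the payload at its real type. The function name size is hypothetical:

```ocaml
type 'a collection = MyArray of 'a array | MyList of 'a list

let is_list x = match x with MyArray _ -> false | MyList _ -> true

(* Each branch knows the concrete representation it is dealing with. *)
let size c =
  match c with
  | MyArray a -> Array.length a
  | MyList l -> List.length l

let n_array = size (MyArray [| 1; 2; 3 |])
let n_list = size (MyList ["a"; "b"])
```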
Can't you just match x with something specific to your type? For example, for a sequence:
let listp = match x with | h::t -> true | _ -> false
for a tuple, I don't remember the exact syntax, but something like match x with | (k,v) -> true
and so on...
Not really: everything has a type associated with it, so either it's already known that it's a list, or it's polymorphic (like 'a), in which case we're not "allowed" to know about the underlying type. Doing anything type-specific in that case would force specialization of the value's type.
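To make that concrete, here is a sketch of the pattern-matching idea written as a function (is_nonempty is a made-up name). The inferred type already requires a list, so what the match tests is emptiness, not "list-ness":

```ocaml
(* Inferred type: 'a list -> bool. The patterns force the argument to be a
   list at compile time; at run time only the shape is inspected. *)
let is_nonempty x =
  match x with
  | _ :: _ -> true
  | [] -> false

let yes = is_nonempty ["a"; "b"; "c"; "d"]
let no = is_nonempty ([] : int list)

(* is_nonempty (1, 2) or is_nonempty [| 1 |] does not return false; it is
   rejected by the type checker before the program runs. *)
```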

What is [] (list constructor) in Haskell?

I'm having problems understanding functors, specifically what a concrete type is in LYAH. I believe this is because I don't understand what [] really is.
fmap :: (a -> b) -> f a -> f b
Is [], a type-constructor? Or, is it a value constructor?
What does it mean to have the type of: [] :: [a]?
Is it like Maybe type-constructor, or Just value constructor?
If it is like Just then how come Just has a signature like Just :: a -> Maybe a rather than Just :: Maybe a, in other words why isn't [] typed [] :: a -> [a]
LYAH says this as it applies to functors: Notice how we didn't write instance Functor [a] where, because from fmap :: (a -> b) -> f a -> f b, we see that the f has to be a type constructor that takes one type. [a] is already a concrete type (of a list with any type inside it), while [] is a type constructor that takes one type and can produce types such as [Int], [String] or even [[String]]. I'm confused, though: the type of [] implies it is like a literal for [a]. What is LYAH trying to get at?
The type is described (in a GHCI session) as:
$ ghci
Prelude> :info []
data [] a = [] | a : [a] -- Defined
We may also think about it as though it were defined as:
data List a = Nil
| Cons a (List a)
or
data List a = EmptyList
| ListElement a (List a)
Type Constructor
[a] is a polymorphic data type, which may also be written [] a as above. This may be thought about as though it were List a
In this case, [] is a type constructor taking one type argument a and returning the type [] a, which is also permitted to be written as [a].
One may write the type of a function like:
sum :: (Num a) => [a] -> a
Data Constructor
[] is a data constructor which essentially means "empty list." This data constructor takes no value arguments.
There is another data constructor, :, which prepends an element to the front of another list. The signature for this data constructor is (:) :: a -> [a] -> [a]; it takes an element and a list of elements and returns the resulting list of elements.
The [] notation may also be used as shorthand for constructing a list. Normally we would construct a list as:
myNums = 3 : 2 : 4 : 7 : 12 : 8 : []
which is interpreted as
myNums = 3 : (2 : (4 : (7 : (12 : (8 : [])))))
but Haskell permits us also to use the shorthand
myNums = [ 3, 2, 4, 7, 12, 8 ]
as an equivalent in meaning, but slightly nicer in appearance, notation.
Ambiguous Case
There is an ambiguous case that is commonly seen: [a]. Depending on the context, this notation can mean either "a list of a's" or "a list with exactly one element, namely a." The first meaning is the intended meaning when [a] appears within a type, while the second meaning is the intended meaning when [a] appears within a value.
It's (confusingly, I'll grant you) syntactically overloaded to be both a type constructor and a value constructor.
It means that (the value constructor) [] has the type that, for all types a, it is a list of a (which is written [a]). This is because there is an empty list at every type.
The value constructor [] isn't typed a -> [a] because the empty list has no elements, and therefore it doesn't need an a to make an empty list of a's. Compare to Nothing :: Maybe a instead.
LYAH is talking about the type constructor [] with kind * -> *, as opposed to the value constructor [] with type [a].
it is a type constructor (e.g. [Int] is a type), and a data constructor ([2] is a list structure).
The empty list is a list holding any type
[a] is like Maybe a, [2] is like Just 2.
[] is a zero-ary function (a constant) so it doesn't have function type.
Just to make things more explicit, this data type:
data List a = Cons a (List a)
| Nil
...has the same structure as the built-in list type, but without the (nicer, but potentially confusing) special syntax. Here's what some correspondences look like:
List = [], type constructors with kind * -> *
List a = [a], types with kind *
Nil = [], values with polymorphic types List a and [a] respectively
Cons = :, data constructors with types a -> List a -> List a and a -> [a] -> [a] respectively
Cons 5 Nil = [5] or 5:[], single element lists
f Nil = ... = f [] = ..., pattern matching empty lists
f (Cons x Nil) = ... = f [x] = ..., pattern matching single-element lists
f (Cons x xs) = ... = f (x:xs) = ..., pattern matching non-empty lists
In fact, if you ask ghci about [], it tells you pretty much the same definition:
> :i []
data [] a = [] | a : [a] -- Defined in GHC.Types
But you can't write such a definition yourself because the list syntax and its "outfix" type constructor is a special case, defined in the language spec.