Processing Datatype list - sml

How can I get the data out of datatype list?
I have a simple code in SML:
datatype (''a, 'b) dict =
    Nil
  | Dictionary of {key:''a, value:'b} list;
datatype 'dict list =
    nil
  | :: of 'dict * ('dict list);
val d = (Dictionary [{key="hello",value=[1,2]}]);
fun aux ((x::y):{key:''a, value:'b} list) = [x];
I want to get the key from the head of the list but I can't even separate it.
When I insert:
aux d;
I get the following error:
stdIn:2.1-2.6 Error: operator and operand don't agree [tycon mismatch]
  operator domain: {key:''Z, value:'Y} list
  operand:         (string,int ?.list) dict
  in expression:
    aux d
How can I split the head of the list? And how can I get the key?

SML seems to have some trouble interpreting your datatype definition in polymorphic function definitions. I suspect that the problem is that it can't infer a type for Nil in the definition. I recommend dropping that clause and using the datatype definition:
datatype (''a, 'b) dict = Dictionary of {key: ''a, value: 'b} list;
This way all dictionaries are instances of a Dictionary pattern. The base case in function definitions would be the pattern Dictionary([]) or Dictionary(nil) if you need an empty dictionary as a base case.
You do need to include the Dictionary constructor in your function definitions. To extract the key from the first entry you can do the following:
fun aux (Dictionary(entries)) = #key(hd(entries));
This uses the # operator to extract the field labeled key from the head of the list of entries. You can also use more pattern matching like this:
fun aux (Dictionary({key = x, value = y}::entries)) = x;
In either case, with your sample data you get:
- aux d;
val it = "hello" : string

Your first datatype, (''a, 'b) dict, is isomorphic to the built-in type {key:''a, value:'b} list option (because of the extra Nil).
Your second datatype, 'dict list, shadows the built-in type 'a list. Notice that calling your type parameter 'dict does not make it special. You should probably not name it after a concrete type, but rather name it 'a or 'key or such.
Functions and types that refer to the old 'a list (such as your first type and the built-in functions) will produce one kind of list, and functions that rely on this new definition of :: and nil will produce another kind of list. This is extremely confusing: don't override built-in types; make new ones with unique type names and unique value constructor names.
Your type error comes from feeding a value of type (string, int list) dict to a function that expects a {key:''a, value:'b} list (of the new kind of list). Clearly, the value Dictionary [{key="hello",value=[1,2]}] is not such a list.
Here is a simple dictionary implementation that uses the built-in lists:
structure Dictionary =
struct
  type (''a, 'b) dict = (''a * 'b) list
  fun insert (key, value, []) = [(key, value)]
    | insert (key, value, (key2,value2)::rest) =
        if key = key2
        then (key,value)::rest
        else (key2,value2)::insert (key, value, rest)
  fun lookup (key, []) = NONE
    | lookup (key, (key2,value2)::rest) =
        if key = key2
        then SOME value2
        else lookup (key, rest)
end
It does not use datatypes, and it does not abstract away the representation of the dictionary. If you want to improve on the representation, going from a list to a kind of tree, you would probably need to rely on order comparison (less than, greater than, equals) rather than on equality alone.

Related

Difference between iterators, enumerations and sequences

I want to understand what is the difference between iterators, enumerations and sequences in ocaml
enumeration:
type 'a t = {
  mutable count : unit -> int;  (** Return the number of remaining elements in the enumeration. *)
  mutable next  : unit -> 'a;   (** Return the next element of the enumeration or raise [No_more_elements]. *)
  mutable clone : unit -> 'a t; (** Return a copy of the enumeration. *)
  mutable fast  : bool;         (** [true] if [count] can be done without reading all elements, [false] otherwise. *)
}
sequence:
type 'a node =
  | Nil
  | Cons of 'a * 'a t
and 'a t = unit -> 'a node
I don't have any idea about iterators
Enumerations/Generators
BatEnum (what you call "enumeration", but let's use module names instead) is more or less isomorphic to a generator, which is often described as pull-based:
generator : unit -> 'a option
This means "Each time you call generator (), I will give you a new element from the collection, until there are no more elements and it returns None". Note that this means previous elements are not accessible. This behavior is called "destructive".
This is similar to the gen library. Such iterators are fundamentally very imperative (they work by maintaining a current state).
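To make the "destructive" behaviour concrete, here is a minimal sketch of a generator built from a list. The name gen_of_list is made up for this illustration and is not part of BatEnum or gen; the state is kept in a mutable reference.

```ocaml
(* Hypothetical helper: turn a list into a pull-based generator.
   Each call to the returned closure consumes one element. *)
let gen_of_list (l : 'a list) : unit -> 'a option =
  let remaining = ref l in
  fun () ->
    match !remaining with
    | [] -> None
    | x :: rest ->
      remaining := rest;  (* the element is gone after this call *)
      Some x

let g = gen_of_list [1; 2; 3]
let first = g ()   (* Some 1 *)
let second = g ()  (* Some 2; the value 1 is no longer reachable through g *)
```

Once an element has been pulled, the only way to see it again is to have stored it yourself.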
Sequences
Pull-based approaches are not necessarily destructive; this is where the Seq type fits. It's a list-like structure, except each node is hidden behind a closure. It's similar to lazy lists, but without the guarantee of persistence. You can manipulate these sequences pretty much like lists, by pattern matching on them.
type 'a node =
| Nil
| Cons of 'a * 'a seq
and 'a seq = unit -> 'a node
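As a small sketch of what pattern matching on such a sequence looks like, the following uses the standard library's Seq module (so it assumes OCaml 4.07 or later). The names ints_from and take are hypothetical helpers written for this example.

```ocaml
(* An infinite sequence n, n+1, n+2, ...; nothing is computed until forced. *)
let rec ints_from n : int Seq.t =
  fun () -> Seq.Cons (n, ints_from (n + 1))

(* Force at most n nodes of a sequence and collect them in a list. *)
let rec take n (s : 'a Seq.t) : 'a list =
  if n <= 0 then []
  else
    match s () with
    | Seq.Nil -> []
    | Seq.Cons (x, rest) -> x :: take (n - 1) rest

let nats = ints_from 0
let a = take 3 nats  (* [0; 1; 2] *)
let b = take 3 nats  (* [0; 1; 2] again: traversing nats did not consume it *)
```

Unlike a generator, the same value nats can be traversed several times from its beginning.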
Iterators
Iterators such as sequence, also called "push-based", have a type that is similar to the iter function that you find on many data structures:
iterator : ('a -> unit) -> unit
which means "iterator f will apply the f function to all the elements in the collection".
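A minimal sketch of this push-based shape, using only List.iter from the standard library. The names iter_of_list and sum are invented for the example and do not come from the sequence library.

```ocaml
(* Hypothetical helper: view a list as a push-based iterator. *)
let iter_of_list (l : 'a list) : ('a -> unit) -> unit =
  fun f -> List.iter f l

(* The consumer only supplies a callback; the iterator drives the loop. *)
let sum (it : (int -> unit) -> unit) : int =
  let acc = ref 0 in
  it (fun x -> acc := !acc + x);
  !acc

let total = sum (iter_of_list [1; 2; 3])  (* 6 *)
```

Note that sum cannot pause the traversal or ask for "just the next element": control stays with the iterator.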
What's the difference?
One key difference between pull-based and push-based approaches is their expressivity. Consider that you have two generators, gen1 and gen2, it's easy to add them:
let add gen1 gen2 =
  let gen () =
    match gen1(), gen2() with
    | Some v1, Some v2 -> Some (v1+v2)
    | _ -> None
  in
  gen
However, you can't really write such a function with most push-based approaches such as sequence, since you don't completely control the iteration.
On the flip side, push-based iterators are usually easier to define and are faster.
Recommendation
Starting in OCaml 4.07, Seq is available in the standard library. There is a seq compatibility package that you can use right now, and a large library of combinators in the associated oseq library.
Seq is fast, expressive and fairly easy to use, so I recommend using it.
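For comparison with the generator-based add above, the same pointwise addition can be sketched on Seq.t values. This assumes OCaml 4.07 or later for Seq, List.to_seq and List.of_seq; the function itself is only an illustration, not a library combinator.

```ocaml
(* Pointwise sum of two sequences; stops as soon as either one ends. *)
let rec add (s1 : int Seq.t) (s2 : int Seq.t) : int Seq.t =
  fun () ->
    match s1 (), s2 () with
    | Seq.Cons (v1, r1), Seq.Cons (v2, r2) -> Seq.Cons (v1 + v2, add r1 r2)
    | _ -> Seq.Nil

let result =
  List.of_seq (add (List.to_seq [1; 2; 3]) (List.to_seq [10; 20]))
(* [11; 22] *)
```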
An enumeration is not what you wrote, you just defined a record here. An enumeration is a type that contains multiple constructors but a variable can only pick one value at a time in it (you can see it as the union type in C) :
type enum = One | Two | Three
let e = One
A sequence is, as you write it, simply a recursive enumeration type (in your case you defined what is usually called a list).
To simplify, let's call the special structures that contain some elements of the same type containers (some known containers are arrays, lists, sets, maps etc).
An iterator is a function that applies the same function to each element of a container. So you would have the map iterator, which applies a function to each element but keeps the structure as it is (for example, adding 1 to each element of a list l : List.map (fun e -> e + 1) l), and the fold operator, which applies a function to each element and an accumulator and returns the accumulator (for example, adding each element of a list l, starting from 0, and returning the result : List.fold_left (fun acc e -> acc + e) 0 l).
So,
enumeration and sequence : structures
iterators : function over each element of the structures

SML [circularity] error when doing recursion on lists

I'm trying to build a function that zips the 2 given lists, ignoring the extra elements of the longer list.
fun zipTail L1 L2 =
  let
    fun helper buf L1 L2 = buf
      | helper buf [x::rest1] [y::rest2] = helper ((x,y)::buf) rest1 rest2
  in
    reverse (helper [] L1 L2)
  end
When I did this I got the error message:
Error: right-hand-side of clause doesn't agree with function result type [circularity]
I'm curious as to what a circularity error is and how I should fix this.
There are a number of problems here:
1) In helper buf L1 L2 = buf, the pattern buf L1 L2 would match all possible inputs, rendering your next clause (once debugged) redundant. In context, I think that you meant helper buf [] [] = buf, but then you would run into problems of non-exhaustive matching in the case of lists of unequal sizes. The simplest fix would be to move the second clause (the one with x::rest1) into the top line and then have a second pattern to catch the cases in which at least one of the lists is empty.
2) [x::rest1] is a pattern which matches a list of 1 item where the item is a nonempty list. That isn't your intention. You need to use parentheses, (x::rest1), rather than brackets, [x::rest1].
3) reverse should be rev.
Making these changes, your definition becomes:
fun zipTail L1 L2 =
  let
    fun helper buf (x::rest1) (y::rest2) = helper ((x,y)::buf) rest1 rest2
      | helper buf rest1 rest2 = buf
  in
    rev (helper [] L1 L2)
  end;
Which works as intended.
The error message itself is a bit hard to understand, but you can think of it like this. In
helper buf [x::rest1] [y::rest2] = helper ((x,y)::buf) rest1 rest2
the things in the brackets on the left hand side are lists of lists. So their type would be 'a list list, where 'a is the type of x. In x::rest1, the type of rest1 would have to be 'a list. Since rest1 also appears on the other side of the equals sign in the same argument position as [x::rest1], the type of rest1 would have to be the same as the type of [x::rest1], which is 'a list list. Thus rest1 must be both 'a list and 'a list list, which is impossible.
The circularity arises because, to make sense of 'a list list = 'a list, you would need a type 'a with 'a = 'a list. This would be a type whose values consist of lists of values of the same type, and the items in those lists would themselves have to be lists of elements of the same type ... It is a vicious circle which never ends.
The problem with circularity shows up in many other places.
You want (x::rest1) and not [x::rest1].
The problem is a syntactic misconception.
The pattern [foo] will match against a list with exactly one element in it, foo.
The pattern x::rest1 will match against a list with at least one element in it, x, and its (possibly empty) tail, rest1. This is the pattern you want. But the pattern contains an infix operator, so you need to put parentheses around it.
The combined pattern [x::rest1] will match against a list with exactly one element that is itself a list with at least one element. This pattern is valid, although overly specific, and does not provoke a type error in itself.
The reason you get a circularity error is that the compiler can't infer what the type of rest1 is. As it occurs on the right-hand side of the :: pattern constructor, it must be 'a list, and as it occurs all by itself, it must be 'a. Trying to unify 'a = 'a list is like finding solutions to the equation x = x + 1.
You might say "well, as long as 'a = 'a list list list list list ... infinitely, like ∞ = ∞ + 1, that's a solution." But the Damas-Hindley-Milner type system doesn't treat this infinite construction as a well-defined type. And creating the singleton list [[[...x...]]] would require an infinite amount of brackets, so it isn't entirely practical anyways.
Some simpler examples of circularity:
fun derp [x] = derp x: This is a simplification of your case where the pattern in the first argument of derp indicates a list, and the x indicates that the type of element in this list must be the same as the type of the list itself.
fun wat x = wat [x]: This is a very similar case where wat takes an argument of type 'a and calls itself with an argument of type 'a list. Naturally, 'a could be an 'a list, but then so must 'a list be an 'a list list, etc.
As I said, you're getting circularity because of a syntactic misconception wrt. list patterns. But circularity is not restricted to lists. They're a product of composed types and self-reference. Here's an example without lists taken from Function which applies its argument to itself?:
fun erg x = x x: Here, x can be thought of as having type 'a to begin with, but seeing it applied as a function to itself, it must also have type 'a -> 'b. But if 'a = 'a -> 'b, then 'a -> 'b = ('a -> 'b) -> 'b, and ('a -> 'b) -> 'b = (('a -> 'b) -> 'b) -> 'b, and so on. SML compilers are quick to determine that there are no solutions here.
This is not to say that functions with circular types are always useless. As newacct points out, turning purely anonymous functions into recursive ones actually requires this, like in the Y-combinator.
The built-in ListPair.zip is usually tail-recursive, by the way.

How do I avoid the Value Restriction error when the argument is an empty list?

Some functions in the List module fail when the argument is an empty list. List.rev is an example. The problem is the dreaded Value Restriction.
I met the same problem while trying to define a function that returns a list with all but the last element of a list:
let takeAllButLast (xs: 'a list) =
    xs |> List.take (xs.Length - 1)
The function works well with nonempty lists, but a version that would handle empty lists fails:
let takeAllButLast (xs: 'a list) =
    if List.isEmpty xs then []
    else xs |> List.take (xs.Length - 1)
takeAllButLast []
error FS0030: Value restriction. The value 'it' has been inferred to have generic type
val it : '_a list, etc.
I tried several things: making it an inline function, not specifying a type for the argument, specifying a type for the returned value, making the function depend on a type argument, and using the Option type to obtain an intermediate result later converted to list<'a>. Nothing worked.
For example, this function has the same problem:
let takeAllButLast<'a> (xs: 'a list) =
    let empty : 'a list = []
    if List.isEmpty xs then empty
    else xs |> List.take (xs.Length - 1)
A similar question was asked before in SO: F# value restriction in empty list but the only answer also fails when the argument is an empty list.
Is there a way to write a function that handles both empty and nonempty lists?
Note: The question is not specific to a function that returns all but the last element of a list.
The function itself is completely fine. The function does not "fail".
You do not need to modify the body of the function. It is correct.
The problem is only with the way you're trying to call the function: takeAllButLast []. Here, the compiler doesn't know what type the result should have. Should it be string list? Or should it be int list? Maybe bool list? There is no way for the compiler to know, so it complains.
In order to compile such a call, you need to help the compiler out: just tell it what type you expect to get. This can be done either from context:
// The compiler gleans the result type from the type of receiving variable `l`
let l: int list = takeAllButLast []
// Here, the compiler gleans the type from what function `f` expects:
let f (l: int list) = printfn "The list: %A" l
f (takeAllButLast [])
Or you can declare the type of the call expression directly:
(takeAllButLast [] : int list)
Or you can declare the type of the function, and then call it:
(takeAllButLast : int list -> int list) []
You can also do this in two steps:
let takeAllButLast_Int : int list -> int list = takeAllButLast
takeAllButLast_Int []
In every case the principle is the same: the compiler needs to know from somewhere what type you expect here.
Alternatively, you can give it a name and make that name generic:
let x<'a> = takeAllButLast [] : 'a list
Such a value can be accessed as if it were a regular value, but behind the scenes it is compiled as a parameterless generic function, which means that every access to it will result in execution of its body. This is how List.empty and similar "generic values" are implemented in the standard library.
But of course, if you try to evaluate such a value in F# interactive, you'll face the very same gotcha again (the type must be known) and you'll have to work around it anyway:
> x // value restriction
> (x : int list) // works

SML-NJ List Equals Nil vs Null List

I have a question about the way SML of New Jersey interprets lists:
Suppose I have a function f(x : 'a, n : int) : 'a list such that f returns a list of n copies of x, e.g. f(2,5) = [2,2,2,2,2], f(9,0) = [].
So then I go into the REPL, and I check f(9,0) = nil, and it returns true. From this, I assumed that you could use list = nil to check whether a list is the empty list. I used this in a function, and it wouldn't run. I ended up learning that the type definitions are different:
sml:121.2-123.10 Error: operator and operand don't agree [equality type required]
  operator domain: ''Z * ''Z
  operand:         'a list * 'Y list
  in expression:
    xs = nil
(Where xs was my list). I then learned that the way to check if a list is the empty list is with null list. Why is this so? What's going on with nil? Can someone explain this behavior to me?
I also note that apparently case xs of nil is the same as checking null xs. Does this mean nil is a type?
This is an error related to polymorphism. By default, when an empty list is evaluated, it has the type 'a list. This means that the list can contain elements of any type. If you try to evaluate 1::[], you won't get a type error because of that. This is called polymorphism, it is a feature that allows your functions to take arguments of any type. This can be useful in functions like null, because you don't care about the contents of the list in that case, you only care about its length (in fact, you only care if it's empty or not).
However, you can also have empty lists with different types. You can make your function return an empty int list. In fact, you are doing so in your function.
This is the result in a trivial implementation of your function:
- fun f(x : 'a, n : int) : 'a list =
    case n of
        0 => []
      | _ => x::f(x, n-1);
val f = fn : 'a * int -> 'a list
- f(4,5);
val it = [4,4,4,4,4] : int list
- f(4,0);
val it = [] : int list
As you can see, even if the second argument is 0, your function returns an int list. You should be able to compare it directly with a list of type 'a list.
- it = [];
val it = true : bool
However, if you try to compare two empty lists that have different types and are not of type 'a list, you should get an error. You can see an example of it below:
- [];
val it = [] : 'a list
- val list1 : int list = [];
val list1 = [] : int list
- val list2 : char list = [];
val list2 = [] : char list
- list1 = [];
val it = true : bool
- list2 = [];
val it = true : bool
- list1 = list2;
stdIn:6.1-6.14 Error: operator and operand don't agree [tycon mismatch]
  operator domain: int list * int list
  operand:         int list * char list
  in expression:
    list1 = list2
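Also, case xs of nil is a way of checking if a list is empty, but this is because nil (which is just a way to write []) has the type 'a list by default. (Note that case expressions don't directly return a boolean value.) Therefore, nil is not a type, but a value of the polymorphic type 'a list, which you can compare with lists of any type. If your empty lists have different, non-polymorphic types, you will get a type error. In your case, the message [equality type required] points at a related restriction: = only works on equality types, so xs = nil requires the element type to be an equality type (''a), while your function is declared for an arbitrary 'a. null xs and pattern matching on nil never compare elements, so they work for any 'a list.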

nested lists in ocaml

I am new to Ocaml and have defined nested lists as follows:
type 'a node = Empty | One of 'a | Many of 'a node list
Now I want to define a wrapping function that wraps square brackets around the first order members of a nested list. For ex. wrap( Many [ one a; Many[ c; d]; one b; one e;] ) returns Many [Many[one a; Empty]; Many[Many[c;d]; Empty]; Many[b; Empty]; Many[e; Empty]].
Here's my code for the same:
let rec wrap list = function
    Empty -> []
  | Many[x; y] -> Many [ Many[x; Empty]; wrap y;];;
But I am getting an error in the last expression : This expression has the type 'a node but an expression was expected of the type 'b list. Please help.
Your two matches are not returning values of the same type. The first statement returns a 'b list; the second statement returns an 'a node. To get past the type checker, you'll need to change the first statement to read as: Empty -> Empty.
A second issue (which you will run into next) is that your recursive call is not being fed a value of the correct type. wrap : 'a node -> 'a node, but y : 'a node list. One way to address this would be to replace the expression with wrap (Many y).
There will also be an issue in that your current function assumes the Many list only has two elements. I think what you want to do is Many (x::y). This matches x as the head of the list and y as the tail. However, you will then need a case to handle Many ([]) so as to avoid infinite recursion.
Finally, the overall form of your function strikes me as a bit unusual. I would replace function Empty -> ... with match list with | Empty -> ....
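Putting these suggestions together, one possible sketch is shown below. It assumes the goal is the flat output given in the question (each first-order member m becomes Many [m; Empty]). The helper wrap_members is a made-up name, and returning One x unchanged is an assumption, since the question does not say what wrap should do with a bare One.

```ocaml
type 'a node = Empty | One of 'a | Many of 'a node list

(* Hypothetical helper: wrap every member of a node list.
   The [] case is the base case that stops the recursion. *)
let rec wrap_members = function
  | [] -> []
  | x :: rest -> Many [x; Empty] :: wrap_members rest

let wrap (n : 'a node) : 'a node =
  match n with
  | Empty -> Empty
  | One x -> One x  (* assumption: a bare One is left untouched *)
  | Many members -> Many (wrap_members members)

let example = wrap (Many [One 1; Many [One 2; One 3]; One 4])
(* Many [Many [One 1; Empty];
         Many [Many [One 2; One 3]; Empty];
         Many [One 4; Empty]] *)
```

Both branches of every match now return an 'a node (or an 'a node list in the helper), so the type checker has nothing to complain about.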