structural comparison of variants - compare

I want to deal with limits on the integer number line.
I would like to have Pervasives.compare treat RightInfinity > Point x for all x, and the inverse for LeftInfinity.
In the ocaml REPL:
# type open_pt = LeftInfinity | Point of int | RightInfinity
;;
# List.sort Pervasives.compare [LeftInfinity; Point 0; Point 1; RightInfinity]
;;
- : open_pt list = [LeftInfinity; RightInfinity; Point 0; Point 1]
but
# type open_pt = LeftInfinity | Point of int | RightInfinity of unit
;;
# List.sort Pervasives.compare [LeftInfinity; Point 0; Point 1; RightInfinity ()]
;;
- : open_pt list = [LeftInfinity; Point 0; Point 1; RightInfinity ()]
"The perils of polymorphic compare" says
Variants are compared first by their tags, and then, if the tags are equal, descending recursively to the content.
Can one rely on any relationship between the order in which variants appear in a type declaration and the order of the tags?

No, you should not rely on that. You should define your own comparison function. Of course, that means you'll have to lift it through data structures (to be able to compare, say, lists of open_pt), but that's the safe thing to do when you want a domain-specific comparison function.
Note that extended standard libraries such as Batteries or Core provide auxiliary functions to lift comparisons through all common data structures, to help you extend your domain-specific comparison to any type containing an open_pt.
Edit: Note that you can rely on that, as the ordering of non-constant constructors is specified in the OCaml/C interface. I don't think that's a good idea, though -- what if you need to put closures inside your functor argument type next time?
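As a minimal sketch of the "define your own comparison function" advice, the following spells out the intended order explicitly, so it no longer depends on how constructors are represented. The name compare_open_pt is an assumption made for this illustration, not something from the question.

```ocaml
type open_pt = LeftInfinity | Point of int | RightInfinity

(* Explicit total order: LeftInfinity < Point x < RightInfinity for all x.
   compare_open_pt is a hypothetical name chosen for this sketch. *)
let compare_open_pt a b =
  match a, b with
  | LeftInfinity, LeftInfinity | RightInfinity, RightInfinity -> 0
  | LeftInfinity, _ | _, RightInfinity -> -1
  | _, LeftInfinity | RightInfinity, _ -> 1
  | Point x, Point y -> compare x y

(* The declaration order of the constructors no longer matters. *)
let sorted =
  List.sort compare_open_pt [RightInfinity; Point 1; LeftInfinity; Point 0]
```

Here sorted evaluates to [LeftInfinity; Point 0; Point 1; RightInfinity], whichever way the constructors are declared.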

Related

Difference between iterators, enumerations and sequences

I want to understand the difference between iterators, enumerations and sequences in OCaml.
enumeration:
type 'a t = {
mutable count : unit -> int; (** Return the number of remaining elements in the enumeration. *)
mutable next : unit -> 'a; (** Return the next element of the enumeration or raise [No_more_elements].*)
mutable clone : unit -> 'a t;(** Return a copy of the enumeration. *)
mutable fast : bool; (** [true] if [count] can be done without reading all elements, [false] otherwise.*)
}
sequence:
type 'a node =
| Nil
| Cons of 'a * 'a t
and 'a t = unit -> 'a node
I don't have any idea about iterators
Enumerations/Generators
BatEnum (what you call "enumeration", but let's use module names instead) is more or less isomorphic to a generator, which is often described as pull-based:
generator : unit -> 'a option
This means "Each time you call generator (), I will give you a new element from the collection, until there are no more elements and it returns None". Note that this means previous elements are not accessible. This behavior is called "destructive".
This is similar to the gen library. Such iterators are fundamentally very imperative (they work by maintaining a current state).
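To make the destructive behavior concrete, here is a small sketch of a generator over a list. The helper gen_of_list is hypothetical (it is not the API of BatEnum or gen); it only illustrates the unit -> 'a option shape and the hidden mutable state.

```ocaml
(* A generator yields one element per call, then None once exhausted. *)
type 'a gen = unit -> 'a option

(* Hypothetical helper: the reference cell is the "current state". *)
let gen_of_list (l : 'a list) : 'a gen =
  let remaining = ref l in
  fun () ->
    match !remaining with
    | [] -> None
    | x :: rest ->
      remaining := rest;
      Some x

let g = gen_of_list [1; 2]
let first = g ()   (* Some 1 *)
let second = g ()  (* Some 2; the element 1 can no longer be reached through g *)
let third = g ()   (* None: the generator is exhausted *)
```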
Sequences
Pull-based approaches are not necessarily destructive; this is where the Seq type fits. It's a list-like structure, except that each node is hidden behind a closure. It's similar to lazy lists, but without the guarantee of persistence (forcing the same node twice may recompute it). You can manipulate these sequences pretty much like lists, by pattern matching on them.
type 'a node =
| Nil
| Cons of 'a * 'a seq
and 'a seq = unit -> 'a node
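A short sketch of how such a sequence can be built and consumed by pattern matching. The functions from and take are names assumed for this example; the type is redefined locally so that the snippet is self-contained.

```ocaml
type 'a node =
  | Nil
  | Cons of 'a * 'a seq
and 'a seq = unit -> 'a node

(* Infinite sequence n, n+1, n+2, ...; nothing is computed until a node is forced. *)
let rec from n : int seq = fun () -> Cons (n, from (n + 1))

(* Collect the first k elements by forcing nodes one at a time. *)
let rec take k (s : 'a seq) : 'a list =
  if k <= 0 then []
  else
    match s () with
    | Nil -> []
    | Cons (x, rest) -> x :: take (k - 1) rest

let nat = from 0
let a = take 3 nat  (* [0; 1; 2] *)
let b = take 3 nat  (* [0; 1; 2] again: traversing nat did not consume it *)
```

Unlike the generator above, nat can be traversed as many times as needed, which is what "not destructive" means here.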
Iterators
Iterators such as sequence, also called "push-based", have a type that is similar to the iter function that you find on many data structures:
iterator : ('a -> unit) -> unit
which means "iterator f will apply the f function to all the elements in the collection".
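The push-based shape can be sketched as follows. The names iter_of_list, map and sum are assumptions for illustration and are not the API of the sequence library.

```ocaml
(* A push-based iterator: it calls the consumer f on every element. *)
type 'a iter = ('a -> unit) -> unit

let iter_of_list (l : 'a list) : 'a iter = fun f -> List.iter f l

(* map pushes transformed elements to the downstream consumer. *)
let map (g : 'a -> 'b) (it : 'a iter) : 'b iter =
  fun f -> it (fun x -> f (g x))

(* Consuming an iterator means handing it a callback. *)
let sum (it : int iter) : int =
  let total = ref 0 in
  it (fun x -> total := !total + x);
  !total

let result = sum (map (fun x -> x * 2) (iter_of_list [1; 2; 3]))  (* 12 *)
```

Note that the producer drives the loop: the consumer cannot pause the iteration, only react to each element.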
What's the difference?
One key difference between pull-based and push-based approaches is their expressivity. Suppose you have two generators, gen1 and gen2. It's easy to add them element by element:
let add gen1 gen2 =
let gen () =
match gen1(), gen2() with
| Some v1, Some v2 -> Some (v1+v2)
| _ -> None
in
gen
However, you can't really write such a function with most push-based approaches such as sequence, since you don't completely control the iteration.
On the flip side, push-based iterators are usually easier to define and are faster.
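For a runnable version of the add example on generators, the sketch below pairs it with a hypothetical gen_of_list helper (an assumption, not part of any library mentioned here).

```ocaml
(* Hypothetical helper: a generator over a list, backed by mutable state. *)
let gen_of_list l =
  let remaining = ref l in
  fun () ->
    match !remaining with
    | [] -> None
    | x :: rest -> remaining := rest; Some x

(* Pull one element from each generator and add them;
   stop as soon as either generator is exhausted. *)
let add gen1 gen2 =
  fun () ->
    match gen1 (), gen2 () with
    | Some v1, Some v2 -> Some (v1 + v2)
    | _ -> None

let g = add (gen_of_list [1; 2; 3]) (gen_of_list [10; 20])
let r1 = g ()  (* Some 11 *)
let r2 = g ()  (* Some 22 *)
let r3 = g ()  (* None: the shorter generator is exhausted *)
```

This works because the consumer decides when each source advances, which a push-based iterator does not allow.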
Recommendation
Starting in OCaml 4.07, Seq is available in the standard library. There is a seq compatibility package that you can use right now, and a large library of combinators in the associated oseq library.
Seq is fast, expressive and fairly easy to use, so I recommend using it.
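As a brief illustration, assuming OCaml 4.07 or later so that the standard Seq module and List.to_seq / List.of_seq are available, a lazy pipeline can look like this:

```ocaml
(* Keep the even numbers and square them; nothing is computed
   until List.of_seq forces the sequence. *)
let evens_squared =
  [1; 2; 3; 4; 5; 6]
  |> List.to_seq
  |> Seq.filter (fun x -> x mod 2 = 0)
  |> Seq.map (fun x -> x * x)
  |> List.of_seq
```

Here evens_squared is [4; 16; 36]. No intermediate list is built between the filter and the map.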
An enumeration is not what you wrote, you just defined a record here. An enumeration is a type that contains multiple constructors but a variable can only pick one value at a time in it (you can see it as the union type in C) :
type enum = One | Two | Three
let e = One
A sequence is, as you write it, simply a recursive enumeration type (in your case you defined what is usually called a list).
To simplify, let's call the special structures that contains some elements of the same type a container (some known containers are arrays, lists, sets, maps etc)
An iterator is a function that applies the same function to each element of a container. So you would have the map iterator, which applies a function to each element but keeps the structure as it is (for example, adding 1 to each element of a list l : List.map (fun e -> e + 1) l), and the fold operator, which applies a function to each element and an accumulator and returns the accumulator (for example, adding up the elements of a list l and returning the result : List.fold_left (fun acc e -> acc + e) 0 l).
So,
enumeration and sequence : structures
iterators : function over each element of the structures

How to use if-then-else in a recursive function

I am writing a function that will take a list of lists and merge it into a list of sorted pairs. For example [[1],[9],[8],[7],[4],[5],[6]] would return [[1,9],[7,8],[4,5],[6]]. This is my first attempt at SML. I keep getting this error: operator and operand don't agree [overload conflict].
fun mergePass[] = []
  | mergePass(x::[]) = x::[]
  | mergePass(x::y::Z) =
      if x<y
      then (x @ y)::mergePass(Z)
      else (y @ x)::mergePass(Z);
Edit: If mergePass is called on [[1,9],[7,8],[4,5],[6]] I will need it to return [[1,7,8,9],[4,5,6]].
This merge function takes two sorted lists
fun merge([],y) = y
| merge(x,[]) = x
| merge(a::x,b::y) =
if a < b then a::merge(x,b::y)
else b::merge(a::x,y);
You seem reasonably close. A few hints/remarks:
1) Aesthetically, using nil in one line and [] in others seems odd. Either use all nil or use all []
2) Since the input is a list of lists, in x::y::z the identifiers x and y would be lists of integers, rather than individual integers. Thus, x<y wouldn't make sense. You can't compare lists of integers using <.
3) Your problem description strongly suggests that the inner lists are all 1-element lists. Thus you could use the pattern [x]::[y]::z to allow you to compare x and y. In this case, x @ y could be replaced by [x,y]
4) If the inner lists are allowed to be of arbitrary size, then your code needs major revision and would probably require a full-fledged sort function to sort the result of concatenating pairs of inner lists. Also, in this case, the single list in the one inner list case should probably be sorted.
5) You have a typo: mergeP isn't mergePass.
On Edit:
If the sublists are each sorted (and the name of the overall function perhaps suggests this) then you need a function called e.g. merge which will take two sorted lists and combine them into a single sorted list. If this is for a class and you have already seen a merge function as an example (perhaps in a discussion of merge-sort) -- just use that. Otherwise you will have to write your own before you write this function. Once you have the merge function, skip the part of comparing x and y and instead have something as simple as:
| mergePass (xs::ys::zss) = merge (xs, ys) :: mergePass zss
If the sublists are not sorted, then you will need a full-fledged sort, in which case you would use something like:
| mergePass (xs::ys::zss) = sort(xs @ ys) :: mergePass zss
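For comparison, here is a sketch of the same pairwise merge pass written in OCaml rather than SML. It assumes every inner list is already sorted; the names merge and merge_pass mirror the functions discussed above.

```ocaml
(* Merge two sorted lists into one sorted list. *)
let rec merge xs ys =
  match xs, ys with
  | [], l | l, [] -> l
  | a :: xs', b :: ys' ->
    if a < b then a :: merge xs' ys else b :: merge xs ys'

(* Merge adjacent pairs of sorted lists; a trailing odd list is kept as is. *)
let rec merge_pass = function
  | [] -> []
  | [xs] -> [xs]
  | xs :: ys :: rest -> merge xs ys :: merge_pass rest

let once = merge_pass [[1]; [9]; [8]; [7]; [4]; [5]; [6]]
let twice = merge_pass once
```

With this input, once is [[1; 9]; [7; 8]; [4; 5]; [6]] and twice is [[1; 7; 8; 9]; [4; 5; 6]], matching the two examples in the question.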

Remove real element from list - SML

I have written the following code:
fun remove_element(nil, elem) = raise Empty
| remove_element(hd::tl, elem) = if(hd=elem) then tl else hd::remove_element(tl, elem);
but that function (which removed element elem from list) works for int. I need to make it work for real numbers, but I can't do it. I have tried a lot of ways of rewriting the function and also I used :real but these bring me errors.
Any suggestions?
Thank you
The accepted answer should have allowed you to finish your assignment, so I will show two other approaches for variations of your problem without worrying about doing your homework for you. As Kevin Johnson said, it isn't possible to directly compare two reals with =. It is possible to do so indirectly since a=b if and only if a<=b and b<=a. Often an exact comparison like this is a bug, especially if the list in question is of numbers produced by numerical computations. But -- there are some situations where it makes sense to compare reals for equality so you should certainly be able to do so as long as you are clear that this is what you want. This leads to the following modification of your code:
fun remove_real([],x:real) = []
| remove_real(y::ys,x) =
if (y <= x andalso y >= x) then
remove_real(ys,x)
else
y::remove_real(ys,x);
A few points:
1) I changed it to remove all occurrences of the element from the list rather than just the first occurrence. This involved changing the base case to return the empty list, since [] with y removed is just [] rather than an error situation. Also, rather than simply returning the tail if the element is found, I return the recursive call applied to the tail to remove any additional occurrences later on. You could easily modify the code to make it closer to your original code.
2) I needed to put the explicit type annotation x:real so that SML could infer that the list was of type real list rather than type int list.
3) I replaced nil by [] for aesthetic reasons
4) I replaced your pattern hd::tl by y::ys. For one thing, hd and tl are built-in functions -- I see no reason to bind those identifiers to anything else, even if it is just local to a function definition. For another thing, the less visual clutter in a pattern the better.
5) I made more use of white space. Partially a matter of taste, but I think that fairly complicated clauses (like your second line) should be split across multiple lines.
If you want to go the route of including an error tolerance for comparing reals, I think that it makes most sense to include the tolerance as an explicit parameter. I find |x-y| < e to be more natural than two inequalities. The built-in abs is overloaded and defaults to int when nothing else constrains the type (Real.abs is the real-specific version), but the absolute value is also easy to spell out directly. If x - y is real then the expression
if x - y < 0.0 then y - x else x - y
returns the absolute value of x - y (it flips the sign in the case that it is negative). As an added bonus -- the comparison with 0.0 rather than 0 is all that SML needs to infer the type. This leads to:
fun remove_elem([],x,tol) = []
| remove_elem(y::ys,x,tol) =
if (if x - y < 0.0 then y - x else x - y) < tol then
remove_elem(ys,x,tol)
else
y::remove_elem(ys,x,tol);
Typical output:
- remove_real([2.0, 3.1, 3.14, 3.145, 3.14], 3.14);
val it = [2.0,3.1,3.145] : real list
- remove_elem([2.0, 3.1, 3.14, 3.145, 3.14], 3.14,0.01);
val it = [2.0,3.1] : real list
- remove_elem([2.0, 3.1, 3.14, 3.145, 3.14], 3.14,0.001);
val it = [2.0,3.1,3.145] : real list
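The tolerance-based version translates to OCaml roughly as follows. This is a sketch rather than the answer's own code: the argument order is an arbitrary choice, and abs_float from the standard library replaces the hand-written absolute value.

```ocaml
(* Remove every element of the list within tol of x. *)
let rec remove_elem tol x = function
  | [] -> []
  | y :: ys ->
    if abs_float (x -. y) < tol then remove_elem tol x ys
    else y :: remove_elem tol x ys

let data = [2.0; 3.1; 3.14; 3.145; 3.14]
let loose = remove_elem 0.01 3.14 data   (* 3.145 is within 0.01 of 3.14 *)
let tight = remove_elem 0.001 3.14 data  (* 3.145 is kept *)
```

The results agree with the SML session above: loose is [2.0; 3.1] and tight is [2.0; 3.1; 3.145].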
The issue is here: hd=elem
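In SML, real is not an equality type, so = cannot be applied to reals at all. More generally, in any language that uses floating point (JavaScript included), comparing for exact equality is unreliable because reals are subject to rounding errors.
You have to pick a small tolerance lambda and test membership in an interval instead: elem - lambda < hd andalso elem + lambda > hd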

Sorting List with OCaml standard library function

I'm studying OCaml and doing various exercises on ordering data.
I would like to understand how to use the standard library List module for sorting.
For example, I would like to sort this list using these functions: [94; 50; 6; 7; 8; 8]
List.sort
List.stable_sort
List.fast_sort
List.sort_uniq
What is the syntax to do it ?
If you want to use these functions on your list, you have to specify the comparison function.
Quote from the documentation:
The comparison function must return 0 if its arguments compare as
equal, a positive integer if the first is greater, and a negative
integer if the first is smaller
In the module Pervasives (named Stdlib in recent OCaml versions) you have a polymorphic comparison function:
val compare : 'a -> 'a -> int
So, in your case you can just do:
List.sort compare [94; 50; 6; 7; 8; 8]
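A short sketch showing the four functions side by side on the list from the question. Passing a flipped comparison gives descending order; List.sort_uniq additionally drops duplicates.

```ocaml
let l = [94; 50; 6; 7; 8; 8]

let ascending = List.sort compare l
(* Swap the arguments of compare to sort in descending order. *)
let descending = List.sort (fun a b -> compare b a) l
(* stable_sort keeps elements that compare equal in their original order. *)
let stable = List.stable_sort compare l
let fast = List.fast_sort compare l
(* sort_uniq sorts and removes duplicates. *)
let unique = List.sort_uniq compare l
```

On this input, ascending, stable and fast are all [6; 7; 8; 8; 50; 94], descending is [94; 50; 8; 8; 7; 6], and unique is [6; 7; 8; 50; 94].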

Function overloading in OCaml

I have defined some types:
type box = Box of int
type table = Table of int
type compare_result = Lt | Eq | Gt
It seems that in OCaml, we can't define 2 functions with same name but different types of arguments:
let compare (a: box) (b: box): compare_result = (...)
let compare (a: table) (b: table): compare_result = (...)
let res_box = compare (Box 1) (Box 2) in (* which is supposed to call the first function *)
let res_table = compare (Table 1) (Table 2) in (* which is supposed to call the second function *)
So could anyone tell me what is the alternative in OCaml to do this? Do we have to name these 2 functions differently?
Yes, the easiest solution is simply to give the functions different names. Allowing programs that do this vastly complicates the type system (not to the point that it isn't possible for experts to design a solution: to the point that you would find it unusable when they do).
Existing solutions for writing a single function compare are the object system in OCaml, and type classes in Haskell (a different extension to the same base type system). But it's much simpler to stay in the simple fragment and to name your functions compare differently.
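One common way to "name the functions differently" without inventing awkward names is to put each compare in its own module. The sketch below is an illustration under that assumption; the module names Box and Table and the helper of_int are choices made here, not part of the question.

```ocaml
type box = Box of int
type table = Table of int
type compare_result = Lt | Eq | Gt

(* Hypothetical helper: convert the int from the built-in compare. *)
let of_int c = if c < 0 then Lt else if c > 0 then Gt else Eq

module Box = struct
  (* let is not recursive, so the inner compare is the built-in one. *)
  let compare (Box a) (Box b) = of_int (compare a b)
end

module Table = struct
  let compare (Table a) (Table b) = of_int (compare a b)
end

let res_box = Box.compare (Box 1) (Box 2)
let res_table = Table.compare (Table 3) (Table 3)
```

Here res_box is Lt and res_table is Eq. The call site states which comparison is meant, and the type checker rejects Box.compare (Table 1) (Table 2).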