OCaml List: Implement append and map functions

I'm currently trying to extend a friend's OCaml program. It's a huge collection of functions needed for some data analysis. Since I'm not really an OCaml expert, I'm currently stuck on a (for me) strange list implementation:
type 'a cell = Nil
| Cons of ('a * 'a llist)
and 'a llist = (unit -> 'a cell);;
I've figured out that this implements some sort of "lazy" list, but I have absolutely no idea how it really works. I need to implement an Append and a Map Function based on the above type. Has anybody got an idea how to do that?
Any help would really be appreciated!

let rec append l1 l2 =
match l1 () with
Nil -> l2 |
(Cons (a, l)) -> fun () -> (Cons (a, append l l2));;
let rec map f l =
fun () ->
match l () with
Nil -> Nil |
(Cons (a, r)) -> Cons (f a, map f r);;
The basic idea of this implementation of lazy lists is that each computation is encapsulated in a function (the technical term is a closure) via fun () -> x.
The expression x is then only evaluated when the function is applied to () (the unit value, which contains no information).
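As a minimal sketch of how this delayed evaluation behaves in practice, the snippet below builds lazy lists from ordinary lists and counts how often the mapped function actually runs. The helpers of_list and to_list and the counter calls are hypothetical names introduced only for illustration, and append and map are written here in a fully lazy style (everything wrapped in fun () -> ...), which is one possible formulation rather than the only one.

```ocaml
type 'a cell = Nil | Cons of ('a * 'a llist)
and 'a llist = unit -> 'a cell

(* Hypothetical helper: wrap an ordinary list as a lazy list. *)
let rec of_list xs : 'a llist = fun () ->
  match xs with
  | [] -> Nil
  | x :: rest -> Cons (x, of_list rest)

(* Hypothetical helper: force the whole lazy list into an ordinary list. *)
let rec to_list (l : 'a llist) =
  match l () with
  | Nil -> []
  | Cons (x, rest) -> x :: to_list rest

(* Fully lazy variants: nothing runs until the result is applied to (). *)
let rec append l1 l2 : 'a llist = fun () ->
  match l1 () with
  | Nil -> l2 ()
  | Cons (a, l) -> Cons (a, append l l2)

let rec map f l : 'b llist = fun () ->
  match l () with
  | Nil -> Nil
  | Cons (a, r) -> Cons (f a, map f r)

(* Count how many times the mapped function is called. *)
let calls = ref 0
let doubled = map (fun x -> incr calls; 2 * x) (of_list [1; 2; 3])

(* Building the mapped list has not called the function yet. *)
let calls_before_forcing = !calls

(* Forcing the list runs the function once per element. *)
let result = to_list (append doubled (of_list [10]))
```

Only the call to to_list triggers any work; until then doubled is just a closure waiting to be applied.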

It might help to note that function closures are essentially equivalent to lazy values (below, force stands for Lazy.force):
lazy n : 'a Lazy.t <=> (fun () -> n) : unit -> 'a
force x : 'a <=> x () : 'a
So the type 'a llist is equivalent to
type 'a llist = 'a cell Lazy.t
i.e., a lazy cell value.
A map implementation might make more sense in terms of the above definition
let rec map f lst =
match force lst with
| Nil -> lazy Nil
| Cons (hd,tl) -> lazy (Cons (f hd, map f tl))
Translating that back into closures:
let rec map f lst =
match lst () with
| Nil -> (fun () -> Nil)
| Cons (hd,tl) -> (fun () -> Cons (f hd, map f tl))
Similarly with append
let rec append a b =
match force a with
| Nil -> b
| Cons (hd,tl) -> lazy (Cons (hd, append tl b))
becomes
let rec append a b =
match a () with
| Nil -> b
| Cons (hd,tl) -> (fun () -> Cons (hd, append tl b))
I generally prefer to use the lazy syntax, since it makes it more clear what's going on.
Note, also, that a lazy suspension and a closure are not exactly equivalent. For example,
let x = lazy (print_endline "foo") in
force x;
force x
prints
foo
whereas
let x = fun () -> print_endline "foo" in
x ();
x ()
prints
foo
foo
The difference is that force computes the value of the expression exactly once.
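The same difference can be observed without printing, by counting evaluations. This is a small sketch; the counters lazy_runs and closure_runs are hypothetical names used only for this illustration.

```ocaml
let lazy_runs = ref 0
let closure_runs = ref 0

(* A lazy suspension: its body is evaluated at most once. *)
let suspended = lazy (incr lazy_runs; 42)

(* A plain closure: its body is evaluated on every call. *)
let thunk = fun () -> (incr closure_runs; 42)

let a = Lazy.force suspended + Lazy.force suspended
let b = thunk () + thunk ()
```

Both a and b hold the same value, but the suspension's body ran once while the closure's body ran twice.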

Yes, the lists can be infinite. The code given in the other answers will append to the end of an infinite list, but there's no program you can write that can observe what is appended following an infinite list.
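A short sketch of that observation, assuming the llist type from the question: ones is an infinite list, and take is a hypothetical helper that reads a finite prefix. Whatever is appended after ones never shows up in any prefix.

```ocaml
type 'a cell = Nil | Cons of ('a * 'a llist)
and 'a llist = unit -> 'a cell

(* An infinite lazy list of 1s. *)
let rec ones : int llist = fun () -> Cons (1, ones)

(* append as in the answer above: inspects the head of l1, delays the rest. *)
let rec append l1 l2 =
  match l1 () with
  | Nil -> l2
  | Cons (a, l) -> fun () -> Cons (a, append l l2)

(* Hypothetical helper: the first n elements as an ordinary list. *)
let rec take n (l : 'a llist) =
  if n <= 0 then []
  else
    match l () with
    | Nil -> []
    | Cons (x, rest) -> x :: take (n - 1) rest

(* A one-element lazy list containing 9. *)
let nine : int llist = fun () -> Cons (9, fun () -> Nil)

(* The appended 9 is never reached, however long the prefix. *)
let prefix = take 5 (append ones nine)
```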

Related

List.assoc using List.find

I want to implement the List.assoc function using List.find, this is what I have tried:
let rec assoc lista x = match lista with
| [] -> raise Not_found
| (a,b)::l -> try (List.find (fun x -> a = x) lista)
b
with Not_found -> assoc l x;;
but it gives me this error:
This expression has type ('a * 'b) list but an expression was expected of type 'a list
The type variable 'a occurs inside 'a * 'b
I don't know if this is something expected to happen or if I'm doing something wrong. I also tried this as an alternative:
let assoc lista x = match lista with
| [] -> raise Not_found
| (a,b)::l -> match List.split lista with
| (l1,l2) -> let ind = find l1 (List.find (fun s -> compare a x = 0))
in List.nth l2 ind;;
where find is a function that returns the index of the element requested:
let rec find lst x =
match lst with
| [] -> raise Not_found
| h :: t -> if x = h then 0 else 1 + find t x;;
with this code the problem is that the function should have type ('a * 'b) list -> 'a -> 'b, but instead it's (('a list -> 'a) * 'b) list -> ('a list -> 'a) -> 'b, so when I try
assoc [(1,a);(2,b);(3,c)] 2;;
I get:
This expression has type int but an expression was expected of type
'a list -> 'a (referring to the first element of the pair inside the list)
I don't understand why I don't get the expected function type.
First off, a quick suggestion on making your assoc function more idiomatic OCaml: have it take the list as the last argument.
Secondly, why are you attempting to implement this in terms of find? It's much easier without.
let rec assoc x lista =
match lista with
| [] -> raise Not_found
| (a, b) :: xs -> if a = x then b else assoc x xs
Something like this is simpler and substantially more efficient with the way lists work in OCaml.
Having the list as the last argument even means we can write this more tersely.
let rec assoc x =
function
| [] -> raise Not_found
| (a, b) :: xs -> if a = x then b else assoc x xs
As to your question, OCaml infers the types of functions from how they're used.
find l1 (List.find (fun s -> compare a x = 0))
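Your find function compares its second argument with the elements of its first, so the second argument must have the same type as the elements of l1. But here the second argument is
List.find (fun s -> compare a x = 0)
which is a partial application: List.find has been given a predicate but no list, so it is a function of type 'a list -> 'a. OCaml therefore infers that the elements of l1 (and so the keys of lista, and x) are functions of that type, which is exactly the signature you got. It's a mess. Try rethinking your function and you'll end up with something much easier to reason about.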
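If you do want to go through List.find, one way (a sketch, not the only option) is to have the predicate look at the key of each pair and then project out the value with snd. List.find already raises Not_found when nothing matches, so no explicit recursion is needed.

```ocaml
(* assoc : 'a -> ('a * 'b) list -> 'b *)
let assoc x lista =
  (* Find the first pair whose key equals x, then return its value. *)
  snd (List.find (fun (a, _) -> a = x) lista)

let found = assoc 2 [(1, "a"); (2, "b"); (3, "c")]

(* A missing key surfaces as the Not_found raised by List.find. *)
let missing =
  try Some (assoc 4 [(1, "a"); (2, "b")]) with Not_found -> None
```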

OCaml Type error: This expression has type 'a * 'b but an expression was expected of type 'c list

I'm required to output a pair of lists and I'm not understanding why the pair I'm returning is not of the correct type.
let rec split l = match l with
| [] -> []
| [y] -> [y]
| x :: xs ->
let rec helper l1 acc = match l1 with
| [] -> []
| x :: xs ->
if ((List.length xs) = ((List.length l) / 2)) then
(xs, (x :: acc))
else helper xs (x :: acc)
in helper l []
(Please take the time to copy/paste and format your code on SO rather than providing a link to an image. It makes it much easier to help, and more useful in the future.)
The first case of the match in your helper function doesn't return a pair. All the cases of a match need to return the same type (of course).
Note that the cases of your outermost match are also of different types (if you assume that helper returns a pair).
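As an illustration of making every branch return a pair, here is a minimal sketch. It assumes the goal is to cut the list into a first half and a second half (the question does not say so explicitly), and the counter argument n is a hypothetical addition that avoids recomputing List.length at every step.

```ocaml
(* split : 'a list -> 'a list * 'a list *)
let split l =
  let half = List.length l / 2 in
  (* acc collects the first half in reverse; n counts elements still to move. *)
  let rec helper rest acc n =
    match rest with
    | x :: xs when n > 0 -> helper xs (x :: acc) (n - 1)
    | _ -> (List.rev acc, rest)   (* every branch yields a pair *)
  in
  helper l [] half

let even_case = split [1; 2; 3; 4]
let odd_case = split [1; 2; 3; 4; 5]
```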

iterate with fold_right in OCaml

fold_right gives me values starting from the tail of the list, but I want to pass fold_right a function that collects values starting from the head of the list.
I want iter to receive values starting with the head of the list.
Continuation passing is the keyword. Another way to ask the question would be: how do I implement fold_left with fold_right?
let fold f ls acc = List.fold_right f ls acc
val iter : ('a -> unit) -> 'a t -> unit
let iter f my_type =
let rec iiter my_type return =
return (fold (fun x y -> f x) my_type ()) () in iiter my_type (fun x y -> ())
But when I call :
iter (fun a -> print_string a) ["hi";"how";"are";"you"];;
Output:
youarehowhi
I need
hihowareyou
This is quite simple, you must try to match the signatures for the behavior.
Iteration takes no input, and returns unit, while folding takes an input and returns an output of the same type. Now, if the input taken by folding is unit then you'll have a folding function which applies a function on each element of a collection by passing an additional unit and returning an unit, which basically corresponds to the normal iteration, eg:
# let foo = [1;2;3;4;5];;
# List.fold_left (fun _ a -> print_int a; ()) () foo;;
12345- : unit = ()
As you can see the fold function just ignores the first argument, and always returns unit.
let fold_left f init ls =
let res = List.fold_right (fun a b acc -> b (f acc a)) ls (fun a -> a)
in res init
now calling
fold_left (fun a b -> Printf.printf "%s\n" b) () ["how";"are";"you"];;
gives us
how
are
you
fold_left is like List.fold_left but constructed with List.fold_right (Not tail-recursive):
let fold_left f a l = List.fold_right (fun b a -> f a b) (List.rev l) a ;;
This is not a good idea, because this fold_left is not tail-recursive whereas List.fold_left is. It is better to build a tail-recursive fold_right on top of List.fold_left:
let fold_right f l a = List.fold_left (fun a b -> f b a) a (List.rev l) ;;
If you can't use List.rev :
let rev l =
let rec aux acc = function
| [] -> acc
| a::tl -> aux (a::acc) tl
in
aux [] l
;;
iter, using fold_left:
let iter f op = ignore (fold_left (fun a b -> f b;a) [] op ) ;;
Test :
# fold_left (fun a b -> (int_of_string b)::a ) [] ["1";"3"];;
- : int list = [3; 1]
# rev [1;2;3];;
- : int list = [3; 2; 1]
# iter print_string ["hi";"how";"are";"you"];;
hihowareyou- : unit = ()
The continuation that you need to pass through fold in this case is a function that will, once called, iterate through the rest of the list.
EDIT: like so:
let iter f list = fold
(fun head iter_tail -> (fun () -> f head; iter_tail ()))
list
(fun () -> ())
()
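A self-contained sketch of this continuation idea, written directly against List.fold_right and collecting the output in a Buffer so the order can be checked. The names buf and output are hypothetical and exist only for this illustration.

```ocaml
(* The accumulator is a function unit -> unit: "iterate over the rest". *)
let iter f list =
  List.fold_right
    (fun head iter_tail -> fun () -> f head; iter_tail ())
    list
    (fun () -> ())   (* nothing left to do at the end of the list *)
    ()               (* start the chain of continuations *)

let buf = Buffer.create 16
let () = iter (Buffer.add_string buf) ["hi"; "how"; "are"; "you"]
let output = Buffer.contents buf
```

fold_right still walks the list from the tail, but it only builds closures on the way; the calls to f happen afterwards, head first, when the outermost closure is applied to ().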

Does it make sense or is it possible to write a `iter` function for lazy list?

Say we have such a lazy list:
type 'a lazy_list_t = Cons of 'a * (unit -> 'a lazy_list_t)
Does it make sense to have a function like the iter in regular list:
val iter : ('a -> unit) -> 'a list -> unit
List.iter f [a1; ...; an] applies function f in turn to a1; ...; an. It is equivalent to begin f a1; f a2; ...; f an; () end.
Or is it possible to produce iter_lazy like
val iter_lazy: ('a -> unit) -> 'a lazy_list_t -> unit
No, it does not make much sense.
First, as you probably noticed, all your lists are infinite (the type has no empty constructor). So the only inhabitants of your type are built with some recursive function, e.g.:
let omega =
let rec f n = Cons (n, fun () -> f (n + 1)) in
f 0
This implements the infinite stream [0; 1; 2; 3; ...
If you WANT a diverging program you could implement :
let rec iter f (Cons (n, g)) = f n; iter f (g ())
but if you do iter print_int omega it will output every integer in turn, which will take some time.
So iterating over the whole list is not an option. What would work is "mapping"; you can implement the function:
val map: ('a -> 'b) -> 'a lazy_list_t -> 'b lazy_list_t
let rec map f (Cons (x, g)) = Cons (f x, fun () -> map f (g ()))
Notice how the recursive call to map is "protected" by the "fun () ->" so it will not trigger "right away" but only each time the tail of your lazy list is forced.
You can use this to lazily compute on infinite streams, eg :
let evens = map (( * ) 2) omega
computes the stream [0; 2; 4; 6; 8; ...
Note, that you could use it to implement some sort of "iter" by mapping a function that does a side_effect eg.
let units = map print_int evens
will output the number "0" right away and produce the stream [(); (); (); ... Each time you force one of the tails of this stream, it will output the corresponding number (this can happen multiple times, since nothing is memoized). Example:
(* Force the tail *)
(* tl : 'a lazy_list_t -> 'a lazy_list_t *)
let tl (Cons (_, g)) = g ()
let () = begin
ignore (tl units); (* outputs "2" *)
ignore (tl (tl units)); (* outputs "24" *)
ignore (tl units); (* outputs "2" *)
end
(I haven't tried the code so there may be some typos).
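If what you actually need is to run a side effect over a finite prefix of such a stream, a bounded iteration is one option. The sketch below uses a hypothetical helper iter_n (not part of any library) that stops after n elements, and records what it saw in a reference so the result can be inspected.

```ocaml
type 'a lazy_list_t = Cons of 'a * (unit -> 'a lazy_list_t)

let omega =
  let rec f n = Cons (n, fun () -> f (n + 1)) in
  f 0

(* Hypothetical helper: apply f to the first n elements only. *)
let rec iter_n n f (Cons (x, g)) =
  if n > 0 then begin
    f x;
    iter_n (n - 1) f (g ())
  end

(* Record the visited elements instead of printing them. *)
let seen = ref []
let () = iter_n 4 (fun x -> seen := x :: !seen) omega
let first_four = List.rev !seen
```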

Streams (aka "lazy lists") and tail recursion

This question uses the following "lazy list" (aka "stream") type:
type 'a lazylist = Cons of 'a * (unit -> 'a lazylist)
My question is: how to define a tail-recursive function lcycle that takes a non-empty (and non-lazy) list l as argument, and returns the lazylist corresponding to repeatedly cycling over the elements l. For example:
# ltake (lcycle [1; 2; 3]) 10;;
- : int list = [1; 2; 3; 1; 2; 3; 1; 2; 3; 1]
(ltake is a lazy analogue of List::take; I give one implementation at the end of this post.)
I have implemented several non-tail-recursive versions of lcycle, such as:
let lcycle l =
let rec inner l' =
match l' with
| [] -> raise (Invalid_argument "lcycle: empty list")
| [h] -> Cons (h, fun () -> inner l)
| h::t -> Cons (h, fun () -> inner t)
in inner l
...but I have not managed to write a tail-recursive one.
Basically, I'm running into the problem that lazy evaluation is implemented by constructs of the form
Cons (a, fun () -> <lazylist>)
This means that all my recursive calls happen within such a construct, which is incompatible with tail recursion.
Assuming the lazylist type as defined above, is it possible to define a tail-recursive lcycle? Or is this inherently impossible with OCaml?
EDIT: My motivation here is not to "fix" my implementation of lcycle by making it tail-recursive, but rather to find out whether it is even possible to implement a tail recursive version of lcycle, given the definition of lazylist above. Therefore, pointing out that my lcycle is fine misses what I'm trying to get at. I'm sorry I did not make this point sufficiently clear in my original post.
This implementation of ltake, as well as the definition of the lazylist type above, comes from here:
let rec ltake (Cons (h, tf)) n =
match n with
0 -> []
| _ -> h :: ltake (tf ()) (n - 1)
I don't see much of a problem with this definition. The call to inner is within a function which won't be invoked until lcycle has returned. Thus there is no stack safety issue.
Here's an alternative which moves the empty list test out of the lazy loop:
let lcycle = function
| [] -> invalid_arg "lcycle: empty"
| x::xs ->
let rec first = Cons (x, fun () -> inner xs)
and inner = function
| [] -> first
| y::ys -> Cons (y, fun () -> inner ys) in
first
The problem is that you're trying to solve a problem that doesn't exist. The of_list function will not take any stack space, and this is why lazy lists are so great. Let me try to explain the process. When you apply of_list to a non-empty list, it creates a Cons of the head of the list and a closure that captures a reference to the tail of the list. Then it returns immediately. Nothing more. So it takes only a few words of memory, and none of them are on the stack. One word contains the value x, another contains a closure that captures only a pointer to xs.
Then, when you deconstruct this pair, you get the value x, which you can use right away, and the function next, which is the closure that, when invoked, will be applied to the rest of the list and, if it is non-empty, will return another Cons. Note that by then the previous Cons is already garbage, so memory use does not grow.
If you don't believe it, you can construct an of_list function that never terminates (i.e., cycles over the list) and print it with an iter function. It will run forever without consuming more memory.
type 'a lazylist = Cons of 'a * (unit -> 'a lazylist)
let of_list lst =
let rec loop = function
| [] -> loop lst
| x :: xs -> Cons (x, fun () -> loop xs) in
loop lst
let rec iter (Cons (a, next)) f =
f a;
iter (next ()) f
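To check the claim on a finite run, here is a sketch that pairs the lcycle from the question with a tail-recursive take. The name ltake_tr is hypothetical, and the accumulator-then-reverse approach is just one way to keep the consumer itself off the stack.

```ocaml
type 'a lazylist = Cons of 'a * (unit -> 'a lazylist)

(* lcycle as in the question: each recursive call sits inside a closure. *)
let lcycle l =
  let rec inner = function
    | [] -> invalid_arg "lcycle: empty list"
    | [h] -> Cons (h, fun () -> inner l)
    | h :: t -> Cons (h, fun () -> inner t)
  in
  inner l

(* Hypothetical tail-recursive take: accumulate in reverse, flip at the end. *)
let ltake_tr stream n =
  let rec go (Cons (h, tf)) n acc =
    if n <= 0 then List.rev acc
    else if n = 1 then List.rev (h :: acc)
    else go (tf ()) (n - 1) (h :: acc)
  in
  go stream n []

let first_ten = ltake_tr (lcycle [1; 2; 3]) 10

(* A long prefix: the stream only ever does one step of work per element. *)
let long_prefix_length = List.length (ltake_tr (lcycle [1; 2; 3]) 1_000_000)
```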