How could you recreate the List.iter function in OCaml?

I'm quite new to the OCaml language and a newbie in programming in general, so this question may be very basic, but here it is:
I would like to recreate the List.iter function in OCaml to understand it better, and because my teacher asked me to.
Here's what I've done:
let rec iter f = function
|[]->()
|e::q-> f e (iter f q);;
My two very simple problems are:
I don't really understand how List.iter works.
This results in ('a -> unit -> unit) -> 'a list -> unit = <fun>, and I know my f should only be an 'a -> unit, but I don't know how to change it.
(If I made any mistakes, I'm sorry, my native language is French.)

The definition of List.iter is something like this. This function call:
List.iter f [x1; x2; ...; xn]
is equivalent to these separate calls:
f x1;
f x2;
. . .
f xn
Your problem is mostly that you're missing a semicolon (;) to separate statements that should be done sequentially.
This expression:
f e (iter f q)
is one big expression that applies f to two arguments, e and (iter f q), which is why f is inferred as 'a -> unit -> unit. You need to separate it into its two parts.
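A minimal sketch of the separated version, keeping the asker's names. The type annotation on f is an addition made here to pin its type to 'a -> unit; it is not part of the original answer.

```ocaml
(* Apply f to each element in order, discarding the unit results. *)
let rec iter (f : 'a -> unit) = function
  | [] -> ()
  | e :: q ->
    f e;        (* first run f on the head *)
    iter f q    (* then process the tail *)
```

With the annotation in place, the type is ('a -> unit) -> 'a list -> unit, and a call such as iter print_int [1; 2; 3] runs print_int on 1, then 2, then 3.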

Related

Lwt and recursive functions

Is it ok to use Lwt.return as the final call in a recursive function?
I have a function that compiles fine but does not run properly and it looks like the function f below. Please assume that there is no issue with any function provided as g in this example, I am basically just trying to find out if it is ok to have a function with the following form or if there is a better/simpler (and Lwt compliant) way of doing the following:
let rec f (x : string list) (g : string -> unit Lwt.t) =
match List.length x with
| 0 -> Lwt.return ()
| _ -> g (List.hd x) >>= fun () -> f (List.tl x) g
;;
val f : string list -> (string -> unit Lwt.t) -> unit Lwt.t = <fun>
I am pretty sure that I am doing it wrong. But the actual function I am using is much more complex than this example so I am having a difficult time debugging it.
First of all, the idiomatic way of dealing with lists in OCaml is to deconstruct them with pattern matching, like this:
let rec f (xs : string list) (g : string -> unit Lwt.t) =
match xs with
| [] -> return ()
| x :: xs -> g x >>= fun () -> f xs g
The next step is to notice that you are actually just iterating over a list. There is Lwt_list.iter_s for this:
let f g xs = Lwt_list.iter_s g xs
That can be simplified even more:
let f = Lwt_list.iter_s
That means you do not even need to write such a function, since it already exists.
And finally, there were no issues with recursion in your original implementation. The function that you provided was tail recursive.
It depends whether g returns an lwt thread that is already computed such as return () or scheduled and woken up later by the lwt scheduler. In the former case, it's possible that the call to fun () -> f (List.tl x) g is made right away instead of being scheduled for later, and that could grow the stack depending on what optimizations are happening.
I don't think your code should rely on such tricky behavior. For this particular example, as suggested in @ivg's answer, you should use the functions from the Lwt_list module.
It's a good idea to look at the implementation of the Lwt_list module to see how it's done. The same advice goes for the OCaml standard library as well.

How do I write a function to create a circular version of a list in OCaml?

It's possible to create infinite, circular lists using let rec, without needing to resort to mutable references:
let rec xs = 1 :: 0 :: xs ;;
But can I use this same technique to write a function that receives a finite list and returns an infinite, circular version of it? I tried writing
let rec cycle xs =
let rec result = go xs and
go = function
| [] -> result
| (y::ys) -> y :: go ys in
result
;;
But got the following error
Error: This kind of expression is not allowed as right-hand side of `let rec'
Your code has two problems:
result = go xs is not a legal form for the right-hand side of let rec.
The function tries to build the loop by computation, which falls into an infinite loop and causes a stack overflow.
The above code is rejected by the compiler because you cannot write an expression which may cause recursive computation in the right-hand side of let rec (see Limitations of let rec in OCaml).
Even if you fix the issue you still have a problem: cycle does not finish the job:
let rec cycle xs =
  let rec go = function
    | [] -> go xs
    | y::ys -> y :: go ys
  in
  go xs;;
cycle [1;2];;
cycle [1;2] fails due to stack overflow.
In OCaml, let rec can define a looped structure only when its definition is "static" and does not perform any computation. let rec xs = 1 :: 0 :: xs is such an example: (::) is not a function but a constructor, which purely constructs the data structure. On the other hand, cycle performs some code execution to dynamically create a structure and it is infinite. I am afraid that you cannot write a function like cycle in OCaml.
If you want to introduce loops in data, like cycle, in OCaml, you can either use a lazy structure, like Haskell's lazy lists, to prevent immediate infinite loops, or use mutation to close the loop by substitution. OCaml's list is neither lazy nor mutable, therefore you cannot write a function that dynamically constructs looped lists.
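The lazy option mentioned above could be sketched with the standard library's Seq module (available since OCaml 4.07). This is only one possible encoding, not the answerer's code; the names cycle and take are chosen here for illustration, and the result is a lazy sequence rather than a list.

```ocaml
(* Lazily repeat the elements of xs forever.
   Each element is produced only when the consumer asks for it. *)
let cycle (xs : 'a list) : 'a Seq.t =
  match xs with
  | [] -> invalid_arg "cycle: empty list"
  | _ ->
    let rec go rest () =
      match rest with
      | [] -> go xs ()                    (* wrap around to the start *)
      | y :: ys -> Seq.Cons (y, go ys)    (* go ys is a suspended tail *)
    in
    go xs

(* Force only the first n elements of a sequence. *)
let rec take n (s : 'a Seq.t) : 'a list =
  if n <= 0 then []
  else
    match s () with
    | Seq.Nil -> []
    | Seq.Cons (x, rest) -> x :: take (n - 1) rest
```

Because the tail is a suspended function, cycle [1; 2] returns immediately, and take 5 (cycle [1; 2]) forces just five elements.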
If you do not mind using black magic, you could try this code:
let cycle l =
if l = [] then invalid_arg "cycle" else
let l' = List.map (fun x -> x) l in (* copy the list *)
let rec aux = function
| [] -> assert false
| [_] as lst -> (* find the last cons cell *)
(* and set the last pointer to the beginning of the list *)
Obj.set_field (Obj.repr lst) 1 (Obj.repr l')
| _::t -> aux t
in aux l'; l'
Please be aware that using the Obj module is highly discouraged. On the other hand, there are industrial-strength programs and libraries (Coq, Jane Street's Core, Batteries included) that are known to use this sort of forbidden art.
camlspotter's answer is good enough already. I just want to add several more points here.
First of all, a function that receives a finite list and returns an infinite, circular version of it can be written at the code level; it is just that when you actually call it, it overflows the stack and never returns.
A simple version of what you were trying to do is like this:
let rec circle1 xs = List.rev_append (List.rev xs) (circle1 xs)
val circle1 : 'a list -> 'a list = <fun>
It compiles, and in theory it is correct: on [1;2;3] it is supposed to generate [1;2;3;1;2;3;1;2;3;1;2;3;...].
However, it fails in practice, because the recursion never ends and eventually overflows the stack.
So why does let rec circle2 = 1::2::3::circle2 work?
Let's see what will happen if you do it.
First, circle2 is a value, and it is a list. Once OCaml knows this, it can reserve a static address for circle2 with the memory representation of a list.
The value stored there is 1::2::3::circle2, which is really Node (1, Node (2, Node (3, circle2))), i.e., a Node with the int 1 and the address of a Node with the int 2 and the address of a Node with the int 3 and the address of circle2. But we already know circle2's address, right? So OCaml just puts circle2's address there.
Everything will work.
Also, this example shows that an infinite circular list defined like this uses only a finite amount of memory. It does not generate a truly infinite list that would consume all memory; instead, when the cycle ends, it simply points back to the head of the list.
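The "points back to the head" behaviour can be observed with physical equality (==). The following is a small sketch; the helper take and the names back_at_head and first_seven are introduced here for illustration only.

```ocaml
let rec circle2 = 1 :: 2 :: 3 :: circle2

(* Walk three cells forward: physical equality shows that we are
   back at the very same cell, so only three cells exist. *)
let back_at_head = List.tl (List.tl (List.tl circle2)) == circle2

(* Read a finite prefix of a possibly circular list. *)
let rec take n l =
  if n <= 0 then []
  else
    match l with
    | [] -> []
    | x :: rest -> x :: take (n - 1) rest

let first_seven = take 7 circle2
```

Functions that traverse the whole list, such as List.length or structural equality (=) on circle2 itself, would not terminate, so only bounded traversals like take are safe here.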
Let's go back to the circle1 example. circle1 is a function; it does have an address, but that is not what we need. What we want is the address of the function application circle1 xs. Unlike circle2, this is a function application, which means something must be computed to get the address. So
OCaml evaluates List.rev xs, then tries to get the address of circle1 xs, then repeats, and repeats.
OK, then why do we sometimes get Error: This kind of expression is not allowed as right-hand side of `let rec'?
From http://caml.inria.fr/pub/docs/manual-ocaml/extn.html#s%3aletrecvalues
the let rec binding construct, in addition to the definition of
recursive functions, also supports a certain class of recursive
definitions of non-functional values, such as
let rec name1 = 1 :: name2 and name2 = 2 :: name1 in expr which
binds name1 to the cyclic list 1::2::1::2::…, and name2 to the cyclic
list 2::1::2::1::…Informally, the class of accepted definitions
consists of those definitions where the defined names occur only
inside function bodies or as argument to a data constructor.
If you use let rec to define a binding, say let rec name, then name may appear on the right-hand side only inside a function body or as an argument to a data constructor.
In the previous two examples, circle1 appears in a function body (let rec circle1 = fun xs -> ...) and circle2 appears as an argument to a data constructor.
If you write let rec circle = circle, you get an error, because circle is in neither of the two allowed positions. let rec x = let y = x in y is rejected too because, again, x is not inside a constructor or a function.
Here is also a clear explanation:
https://realworldocaml.org/v1/en/html/imperative-programming-1.html
Section Limitations of let rec

Is there an infix function composition operator in OCaml?

Just a quick question. I'm wondering if there is an infix function composition operator in OCaml defined in the standard library (or in Jane Street's Core or in Batteries), like the (.) function in Haskell, which saves a lot of parentheses since we can write (f . g . h) x instead of the less appealing f (g (h x)).
Thanks folks.
The answer here is the same as for flip :-). Function composition isn't defined in the OCaml standard library. In this case, it isn't something I miss once in a while, I miss it all the time.
The OCaml Batteries Included project defines function composition (in the order you give) using the operator -| in the BatStd module. As lukstafi points out (see below), this operator will apparently change to % in a future release of Batteries. (I've verified this in their source tree.)
As far as I can see, the Jane Street Core project doesn't define a function composition operator. It defines a function compose in the Fn module.
I just want to add that the operator is fairly easy to include, in F# it's simply defined as:
let (<<) f g x = f(g(x));;
which has the type signature val ( << ) : f:('a -> 'b) -> g:('c -> 'a) -> x:'c -> 'b, doing exactly what you need:
(f << g << h) x = f(g(h(x)))
so you don't need the Batteries project if you don't otherwise have to use it.
I'd like to add that the reason it looks like << is, as you might guess, because the >> operator does the opposite:
let (>>) f g x = g(f(x));;
(f >> g >> h) x = h(g(f(x)))
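The same two definitions also work in OCaml, where << and >> are valid user-definable infix operators. A minimal sketch, with add1 and double as made-up example functions:

```ocaml
(* Right-to-left composition: (f << g) x = f (g x). *)
let ( << ) f g x = f (g x)

(* Left-to-right composition: (f >> g) x = g (f x). *)
let ( >> ) f g x = g (f x)

let add1 x = x + 1
let double x = x * 2

let a = (add1 << double) 3   (* add1 (double 3) *)
let b = (add1 >> double) 3   (* double (add1 3) *)
```

The parentheses around the composed function are needed because function application binds more tightly than any infix operator.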
There is an Fn.compose function in Core, but it is not an infix operator. Also, it is implemented as a regular function and has runtime overhead.
In practice, it is pretty convenient to use the pipe operator. It has no runtime overhead, as it is implemented directly in the compiler (starting from 4.00). See Optimized Pipe Operators for more details.
The pipe operator is available as '|>' in Core. So you can rewrite your expression as follows: h x |> g |> f
The use of an infix composition operator seems to be discouraged. (see this discussion).
You can write f @@ g @@ h x instead of f (g (h x)).
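Both application operators are also available in the standard library of recent OCaml versions, without Core. The sketch below compares the three spellings; f, g and h are arbitrary example functions.

```ocaml
let f x = x + 1
let g x = x * 2
let h x = x + 3

let nested  = f (g (h 2))      (* plain nested application *)
let piped   = h 2 |> g |> f    (* reverse application, left to right *)
let applied = f @@ g @@ h 2    (* right-associative application *)
```

All three bindings evaluate h first, then g, then f; only the reading order differs.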
In Containers (yet another stdlib replacement for OCaml), the function composition operator is called % and can be found in the CCFun module:
open Containers
open Fun
let is_zero n = (n = 0)
let nonzeros = List.filter (not % is_zero) [0;1;2;3;0]
Maybe that could help you.
let identite f = f
let (>>) = List.fold_right identite
test:
# let f=fun x-> x+1 and
g=fun x-> x*2 and
h=fun x-> x+3;;
# [f;g;h] >> 2;;
- : int = 11

Looking for a trick to iterate over OCaml type constructors

I have an ocaml type :
type t = A | B | ...
and a function to print things about that type :
let pp_t fmt x = match x with
| A -> Format.fprintf fmt "some nice explanations about A"
| B -> Format.fprintf fmt "some nice explanations about B"
| ...
How could I write a function to print all the explanations? Something equivalent to:
let pp_all_t fmt =
pp_t fmt A;
pp_t fmt B;
...
but that would warn me if I forget to add a new constructor.
It would be even better to have something that automatically builds that function,
because my problem is that t is quite big and changes a lot.
I can't imagine how I could "iterate" over the type constructors, but maybe there is a trick...
EDIT: What I finally did is :
type t = A | B | ... | Z
let first_t = A
let next_t = function A -> B | B -> C | ... | Z -> raise Not_found
let pp_all_t fmt =
let rec pp x = pp_t fmt x ; try let x = next_t x in pp x with Not_found -> ()
in pp first_t
so when I update t, the compiler warns me that I have to update pp_t and next_t, and pp_all_t doesn't have to change.
Thanks to you all for the advice.
To solve your problem for a complicated and evolving type, in practice I would probably write an OCaml program that generates the code from a file containing a list of the values and the associated information.
However, if you had a function incr_t : t -> t that incremented a value of type t, and if you let the first and last values of t stay fixed, you could write the following:
let pp_all_t fmt =
let rec loop v =
pp_t fmt v;
if v < Last_t then loop (incr_t v)
in
loop First_t
You can't have a general polymorphic incr_t in OCaml, because it only makes sense for types whose constructors are nullary (take no values). But you can write your own incr_t for any given type.
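A hand-written successor for one specific type might look like the sketch below. The three-constructor type is a stand-in for the real one, and returning an option instead of comparing against a Last_t constructor is a choice made here, not something the answer prescribes.

```ocaml
(* Hypothetical small enumeration standing in for the real type. *)
type t = A | B | C

(* Successor function. The match is exhaustive, so adding a constructor
   to t without updating it triggers a compiler warning. *)
let next_t = function
  | A -> Some B
  | B -> Some C
  | C -> None

(* Enumerate every value reachable from the first constructor. *)
let all_t =
  let rec from x =
    x :: (match next_t x with Some y -> from y | None -> [])
  in
  from A
```

The compiler still cannot tell whether the chain visits every constructor, only that every constructor has a case in next_t.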
This kind of thing is handled quite nicely in Haskell. Basically, the compiler will write some number of functions for you when the definitions are pretty obvious. There is a similar project for OCaml called deriving. I've never used it, but it does seem to handle the problem of enumerating values.
Since you say you want a "trick", if you don't mind using the unsafe part of OCaml (which I personally do mind), you can write incr_t as follows:
let incr_t (v: t) : t =
  (* Please don't use this trick in real code :-) ! See discussion below. *)
  if v < Last_t then
    Obj.magic (Obj.magic v + 1)
  else
    failwith "incr_t: argument out of range"
I try to avoid this kind of code if at all possible, it's too dangerous. For example, it will produce nonsense values if the type t gets constructors that take values. Really it's "an accident waiting to happen".
One needs some form of metaprogramming for such tasks. E.g., you could explore deriving to generate the incr_t from Jeffrey's answer.
Here is a sample code for the similar task : https://stackoverflow.com/a/1781918/118799
The simplest thing you can do is to define a list of all the constructors:
let constructors_t = [A; B; ...]
let pp_all_t fmt = List.iter (pp_t fmt) constructors_t
This is a one-liner, simple to do. Granted, it's slightly redundant (which gray or dark magic would avoid), but it's still probably the best way to go in term of "does what I want" / "has painful side effects" ratio.
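A self-contained sketch of the list-based approach, assuming a small three-constructor type and placeholder messages. Note that pp_t takes the formatter first, so it is partially applied before being passed to List.iter.

```ocaml
(* Hypothetical small enumeration standing in for the real type. *)
type t = A | B | C

let pp_t fmt x =
  match x with
  | A -> Format.fprintf fmt "about A;"
  | B -> Format.fprintf fmt "about B;"
  | C -> Format.fprintf fmt "about C;"

(* Maintained by hand: the compiler does not check that it is complete. *)
let constructors_t = [A; B; C]

let pp_all_t fmt = List.iter (pp_t fmt) constructors_t

(* Render into a string; "%t" expects a function of type formatter -> unit. *)
let rendered = Format.asprintf "%t" pp_all_t
```

The trade-off is the one the answer describes: forgetting to extend constructors_t silently skips the new constructor, whereas forgetting a case in pp_t produces a warning.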

How can I avoid warnings when I apply function to a known list of arguments in OCaml?

How can I transform several values a,b,c etc. to a',b',c' etc, such that x'=f(x)? The values are bound to specific names and their quantity is known at compile-time.
I tried to apply a function to a list in the following way:
let [a';b'] = List.map f [a;b] in ...
But it yields warning:
Warning P: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
[]
Any way to avoid it?
You can write a few functions for mapping on to uniform tuples, i.e.:
let map4 f (x,y,z,w) = (f x, f y, f z, f w)
let map3 f (x,y,z) = (f x, f y, f z)
let map2 f (x,y) = (f x, f y)
and then you can just use them whenever the need arises.
let (x',y') = map2 f (x,y)
Unfortunately not. You can silence the compiler by writing
match List.map f [a;b] with
[a';b'] -> ...
| _ -> assert false
but that's all.
The compiler is trying to help you here. It tells you that you are trying to assign an unknown list to [a';b'] . What if one year later you change this code so that the first list, [a;b], is refactored to a different place in the code so you don't see it, and the function f is changed so that it sometimes returns a different list? You will then sometimes get a run-time exception trying to match [a';b'] with a wrong list. The compiler cannot check that the code is correct, hence the warning.
Why not write
let (a', b', c') = ( f a, f b, f c);;
It's not so much more work to write this, but completely safe against any future changes in the code.
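One caveat for the tuple-based versions: OCaml does not specify the order in which the components of a tuple are evaluated, so if f has side effects, the calls may not run left to right. A sketch of a map3 variant that fixes the order with explicit let bindings (the primed names are local to this example):

```ocaml
(* Like map3 above, but the let chain forces f x, then f y, then f z. *)
let map3 f (x, y, z) =
  let x' = f x in
  let y' = f y in
  let z' = f z in
  (x', y', z')
```

For a pure f the two versions are interchangeable, and the shorter tuple expression is fine.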