OCaml split function - list

I'm currently studying for a CS exam and I'm having a hard time understanding an exercise from my book. The exercise is as follows:
Define, using FOLDR and without using explicit recursion, a function (split : 'a list -> 'a -> 'a list * 'a list) such that split l n returns a pair of lists. The first contains all the values preceding the first occurrence of n in l (in the same order), and the second contains all the remaining elements (in the same order). If n does not appear in l, there are no values preceding the first occurrence of n.
Examples: split [3;-5;1;0;1;-8;0;3] 0 = ([3;-5;1],[0;1;-8;0;3]), split [3;4;5] 7 = ([],[3;4;5])
This is the code written by my professor to solve the exercise:
let split l n =
let f x (l1, l2, b) =
if x = n then ([], x::(l1@l2), true)
else if b then (x::l1, l2, b)
else (l1, x::l2, b)
in let (l1, l2, b) = foldr f ([], [], false) l in (l1, l2) ;;
I don’t understand that second line at all (let f x (l1, l2, b)).
How do those parameters get filled with a value, so that all the logic that comes with it makes sense? For example: what is x and how can it be compared to n if it has no value? What is the meaning of those Boolean values in b?
In addition I don't understand that foldr function in the last line and I don't find any documentation about it. In fact, even my compiler doesn’t understand what foldr is and gives me an error (*Unbound value foldr*). Initially I thought it was some kind of abbreviation for List.fold_right but if I try to replace with the latter I still get an error because the following parameters are not correct (File "split.ml", line 6, characters 41-56:
Error: This expression has type 'a * 'b * 'c
but an expression was expected of type 'd list).
Thank you in advance for any help or advice.

I don't know whether this is allowed by OCaml's syntax rules or not, but let's add some extra white space to make it clearer:
let split l n =
let f x ( l1, l2 , b ) =
if x = n then ( [], x::(l1@l2), true )
else if b then (x::l1, l2 , b ) (* b is true *)
else ( l1, x:: l2 , b ) (* b is false *)
in let (l1, l2, b) = foldr f ( [], [] , false) l
in (l1, l2) ;;
foldr is, in pseudocode,
foldr f z [x1 ; x2 ; ... ; xn1 ; xn ]
=
x1 -f- (x2 -f- (... -f- (xn1 -f- (xn -f- z))...))
where a -f- b denotes simple application f a b, just written infix for convenience. In other words,
foldr f z [x1 ; x2 ; ... ; xn1 ; xn] (* x_{n-1} *)
=
f x1 (foldr f z [x2 ; ... ; xn1 ; xn]) (* x_{n-1} *)
whereas
foldr f z [] = z
Thus the above is actually equivalent to
= let t1 = f xn z
in let t2 = f xn1 t1 (* x_{n-1} *)
in ....
in let tn1 = f x2 tn2 (* t_{n-2} *)
in let tn = f x1 tn1 (* t_{n-1} *)
in tn
You should now be able to see that what this does is to work on the list's elements from right to left, passing interim results to the subsequent applications of f on the left.
You should also be able now to write that missing definition of foldr yourself.
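For reference, here is a minimal sketch of one possible foldr, written directly from the two equations above. It assumes the argument order used in this answer (function, initial value, list); the name foldr' is only an illustrative helper that shows the same thing through the standard library's List.fold_right, which takes the list before the initial value.

```ocaml
(* foldr f z [] = z
   foldr f z (x :: xs) = f x (foldr f z xs) *)
let rec foldr f z = function
  | [] -> z
  | x :: xs -> f x (foldr f z xs)

(* Hypothetical helper: the same fold via the standard library,
   with the last two arguments swapped. *)
let foldr' f z l = List.fold_right f l z

let () =
  (* 1 + (2 + (3 + 0)) *)
  assert (foldr (fun x acc -> x + acc) 0 [1; 2; 3] = 6);
  (* Rebuilding the list with cons gives the list back. *)
  assert (foldr (fun x acc -> x :: acc) [] [1; 2; 3] = [1; 2; 3]);
  assert (foldr' (fun x acc -> x + acc) 0 [1; 2; 3] = 6)
```

Note that this straightforward definition is not tail-recursive, so it uses stack space proportional to the length of the list.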
So if we substitute your specific definition of f, how it works for a list of e.g. three elements [x1; x2; x3] where x2 = n, is equivalent to
let (l1, l2, b ) = ( [], [] , false)
in let (l1_3, l2_3, b_3) = ( l1, x3::l2 , b )
in let (l1_2, l2_2, b_2) = ( [], x2::(l1_3@l2_3) , true )
in let (l1_1, l2_1, b_1) = (x1::l1_2, l2_2 , b_2 )
in (l1_1, l2_1)
i.e.
let (l1, l2, b ) = ( [], [] , false)
in let (l1_3, l2_3, b_3) = ( [], x3::[] , false)
in let (l1_2, l2_2, b_2) = ( [], x2::([]@[x3]) , true )
in let (l1_1, l2_1, b_1) = (x1::[], [x2 ; x3] , true )
in ([x1], [x2; x3])
Thus the resulting lists are being built from the back.
The Boolean flag being passed along allows the function to correctly handle cases where more than one element of the list is equal to n. The same effect can be achieved without any flag, with a small post-processing step instead, as
let split l n =
let f x ( l1, l2 ) =
if x = n then ( [], x::(l1@l2))
else (x::l1, l2 )
in match foldr f ( [], [] ) l with
| (l1, []) -> ([], l1)
| (l1, l2) -> (l1, l2) ;;
(If that is not valid OCaml, take it as pseudocode.)
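As a sanity check, the flag-free version can be sketched in runnable OCaml. This assumes List.fold_right from the standard library in place of foldr (so the list and the initial pair swap places); everything else follows the code above.

```ocaml
let split l n =
  let f x (l1, l2) =
    (* On every occurrence of n, everything collected so far moves behind n. *)
    if x = n then ([], x :: (l1 @ l2))
    else (x :: l1, l2)
  in
  match List.fold_right f l ([], []) with
  | (l1, []) -> ([], l1)   (* n never occurred: nothing precedes it *)
  | (l1, l2) -> (l1, l2)

let () =
  assert (split [3; -5; 1; 0; 1; -8; 0; 3] 0 = ([3; -5; 1], [0; 1; -8; 0; 3]));
  assert (split [3; 4; 5] 7 = ([], [3; 4; 5]))
```

The two assertions are the examples from the exercise statement.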

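If you use List.fold_right, this will work. Note that List.fold_right takes the list before the initial accumulator, the reverse of the argument order used by your professor's foldr.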
let split l n =
let f x (l1, l2, b) =
if x = n
then
([], x::(l1@l2), true)
else if b
then
(x::l1, l2, b)
else
(l1, x::l2, b)
in let (l1, l2, b) = List.fold_right f l ([], [], false)
in (l1, l2)

Fellow CS Major here.
let f x (l1, l2, b)
defines a function that takes two arguments: one called x, and one that is a triple (l1, l2, b). The scope of this function is limited to the following line
in let (l1, l2, b) = foldr f ([], [], false) l in (l1, l2) ;;
The part you are probably struggling with is the keyword "in", which limits the scope of a let binding to the expression that follows it. So in let x = exp1 in exp2, the name x is visible only inside exp2.
Also, x and (l1, l2, b) stand for arbitrary parameters only valid in the function body. Look at which parameters foldr takes (the first one should be a function that has the same parameters as the function f your professor defined). This foldr function then assigns a value to x (and (l1, l2, b)) in the context of foldr.
let f x (l1, l2, b)
....
in let (l1, l2, b) = foldr f ([], [], false) l in (l1, l2) ;;
While (l1, l2, b) in the first line is not the same as (l1, l2, b) in the third line (of the snippet above), here
in let (l1, l2, b) = foldr f ([], [], false) l in (l1, l2) ;;
l1 and l2 are the same (in let (l1, l2, b) and in (l1, l2)).
PS: You need to define the foldr function (either import it or maybe your professor has some definition on the exercise sheet that you can copy).
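To make "the fold assigns a value to x and to the accumulator" concrete, here is a small sketch that records every call the fold makes. It uses List.fold_right from the standard library as a stand-in for foldr; the names calls, f and total are made up for this illustration.

```ocaml
(* Every call to f is logged as a pair (x, accumulator). *)
let calls = ref []

let f x acc =
  calls := (x, acc) :: !calls;
  x + acc

let total = List.fold_right f [1; 2; 3] 0

let () =
  (* The fold supplies the arguments itself, starting from the right:
     f 3 0, then f 2 3, then f 1 5. *)
  assert (List.rev !calls = [(3, 0); (2, 3); (1, 5)]);
  assert (total = 6)
```

In the professor's code the accumulator is the triple (l1, l2, b) instead of an integer, but the mechanism is the same.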


Multiplying Lists through Folding

So I am currently trying to figure out how to write a function where it takes 2 lists of equal lengths and multiplies the same position of both lists through folding, and returns the result as a new List.
eg) prodList [1; 2; 3] [4; 5; 6] ;;
==> (through folding) ==> [1*4; 2*5; 3*6]
==> result = [4; 10; 18]
I feel like I need to use List.combine, since it will put the values that need to be multiplied into tuples. After that, I can't figure out how to break apart the tuple in a way that allows me to multiply the values. Here is what I have so far:
let prodLists l1 l2 =
let f a x = (List.hd(x)) :: a in
let base = [] in
let args = List.rev (List.combine l1 l2) in
List.fold_left f base args
Am I on the right track?
You can use fold_left2 which folds two lists of the same length. The documentation can give you more details (https://caml.inria.fr/pub/docs/manual-ocaml/libref/List.html):
val fold_left2 : ('a -> 'b -> 'c -> 'a) -> 'a -> 'b list -> 'c list -> 'a
List.fold_left2 f a [b1; ...; bn] [c1; ...; cn] is f (... (f (f a b1 c1) b2 c2) ...) bn cn. Raise Invalid_argument if the two lists are determined to have different lengths.
Another way is to fold the output of combine, as you suggested. I would recommend trying it yourself before looking at the solution below.
Solution:
let prod_lists l s =
List.rev (List.fold_left2 (fun acc a b -> (a * b) :: acc) [] l s);;
let prod_lists' l s =
List.fold_left (fun acc (a, b) -> (a * b) :: acc) [] (List.rev (List.combine l s));;
First let me note using fold to implement this operation seems a bit forced, since you have to traverse both lists at the same time. Fold however combines the elements of a single list. Nonetheless here is an implementation.
let e [] = []
let f x hxs (y::ys) = (x*y) :: hxs ys
let prodList xs ys = List.fold_right f xs e ys
Looks a bit complicated, so let me explain.
Universal Property of fold right
First you should be aware of the following property of fold_right.
h xs = fold_right f xs e
if and only if
h [] = e
h (x::xs) = f x (h xs)
This means that if we write the multiplication of lists in the recursive form below, then we can use the e and f to write it using fold as above. Note though that we are operating on two lists, so h takes two arguments.
Base case - empty lists
Multiplying two empty lists returns an empty list.
h [] [] = []
How to write this in the form above? Just abstract over the second argument.
h [] = fun [] -> []
So,
e = fun [] -> []
Or equivalently,
e [] = []
Recursive case - non-empty lists
h (x::xs) (y::ys) = x*y :: h xs ys
Or, using just one argument,
h (x::xs) = fun (y::ys) -> x*y :: h xs ys
Now we need to rewrite this expression in the form h (x::xs) = f x (h xs). It may seem complicated but we just need to abstract over x and h xs.
h (x::xs) = (fun x hxs -> fun (y::ys) -> x*y :: hxs ys) x (h xs)
so we have that f is defined by,
f = fun x hxs -> fun (y::ys) -> x*y :: hxs ys
or equivalently,
f x hxs (y::ys) = x*y :: hxs ys
Solution as a fold right
Having determined both e and f, we just plug them into fold according to the first equation of the property above. And we get,
h xs = List.fold_right f xs e
or equivalently,
h xs ys = List.fold_right f xs e ys
Understanding the implementation
Note that the type of List.fold_right f xs e is int list -> int list, so the fold is building a function on lists, that given some ys will multiply it with the given parameter xs.
For an empty xs you will expect an empty ys and return an empty result so,
e = fun [] -> []
As for the recursive case, the function f in a fold_right must implement a solution for x::xs from a solution for xs. So f takes an x of type int and a function hxs of type int list -> int list which implements the multiplication for the tail, and it must implement multiplication for x::xs.
f x hxs = fun (y::ys) -> x*y :: hxs ys
So f constructs a function that multiplies x with y, and then applies to ys the already constructed hxs which multiplies xs to a list.
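The partial patterns fun [] and fun (y::ys) compile with non-exhaustive match warnings and fail with Match_failure on lists of different lengths. A possible variant with total patterns is sketched below; raising Invalid_argument on a length mismatch is an assumption made for this sketch (it mirrors what List.fold_left2 does), not part of the derivation above.

```ocaml
let prod_list xs ys =
  (* e: the function built for an empty xs; it expects an empty ys. *)
  let e = function
    | [] -> []
    | _ :: _ -> invalid_arg "prod_list: second list is longer"
  in
  (* f: from the function hxs for the tail, build the function for x :: xs. *)
  let f x hxs = function
    | y :: ys -> (x * y) :: hxs ys
    | [] -> invalid_arg "prod_list: first list is longer"
  in
  (* The fold builds a function of type int list -> int list, then applies it to ys. *)
  List.fold_right f xs e ys

let () = assert (prod_list [1; 2; 3] [4; 5; 6] = [4; 10; 18])
```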
You mostly have the right idea; you'll want to combine (zip in other languages) the two lists and then map over each tuple:
let prod_lists l1 l2 =
List.combine l1 l2
|> List.map (fun (a, b) -> a * b)
The key is that you can pattern match on that tuple using (a, b).
You can also fold over the combined list, then rev the result, if you don't want to use map.

OCaml: Combination of elements in Lists, functional reasoning

I am back to coding in OCaml and I missed it so much. I missed it so much I completely lost my reasoning in this language and I hit a wall today.
What I want to do is the combination of elements between a set of n lists.
I decomposed the problem by first attempting the combination of elements between two list of arbitrary sizes.
Assume we have two lists: l1 = [1;2;3] and l2 = [10;20].
What I want to do is obtain the following list:
l_res = [10;20;20;40;30;60]
I know how to do this using loop structures, but I really want to solve this without them.
I tried the following:
let f l1 l2 =
List.map (fun y -> List.map (fun x -> x * y) l1) l2
But this does not seem to work. The type I get is f : int list -> int list -> int list list but I want f : int list -> int list -> int list
I have already tried many different approaches, and I feel I am overcomplicating this.
What did I miss?
What you are missing is that List.map f [a; b; c] gives [f a; f b; f c] so what you'll get from your function will be
f [a; b; c] [d; e] = [[ad; ae]; [bd; be]; [cd; ce]]
but you want
f [a; b; c] [d; e] = [ad; ae; bd; be; cd; ce]
so you need to use another iterator, e.g.:
let f l1 l2 =
let res = List.fold_left (fun acc x ->
List.fold_left (fun acc y -> (x * y) :: acc) acc l2
) [] l1 in
List.rev res
or to flatten your result :
val concat : 'a list list -> 'a list
Concatenate a list of lists. The elements of the argument are all
concatenated together (in the same order) to give the result. Not
tail-recursive (length of the argument + length of the longest
sub-list).
val flatten : 'a list list -> 'a list
Same as concat. Not tail-recursive (length of the argument + length of
the longest sub-list).
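As a sketch of the flattening route: map over l1 on the outside and l2 on the inside, then concatenate the nested result. The function name products is made up for this example; List.concat and List.map are the standard library functions quoted above.

```ocaml
let products l1 l2 =
  (* The inner map gives one list per element of l1; concat joins them in order. *)
  List.concat (List.map (fun x -> List.map (fun y -> x * y) l2) l1)

let () = assert (products [1; 2; 3] [10; 20] = [10; 20; 20; 40; 30; 60])
```

Swapping the two maps (outer over l2, inner over l1) would still flatten to an int list, but the elements would come out grouped by l2 instead.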
Some Core-flavoured answers:
open Core.Std
let f1 l1 l2 =
List.map (List.cartesian_product l1 l2) ~f:(fun (x, y) -> x * y)
let f2 l1 l2 =
List.concat_map l1 ~f:(fun x -> List.map l2 ~f:(fun y -> x * y))
let f4 l1 l2 =
let open List.Monad_infix in
l1 >>= fun x ->
l2 >>| fun y ->
x * y
The last answer explicitly (and arguably the two other answers implicitly) makes use of the list monad, which this is a textbook use case of. I couldn't find the list monad in Batteries, which is possibly not so surprising as it's much less widely used than (say) the option or result monads.
let f l1 l2 =
let multiply x = List.map (( * )x) l2 in
l1 |> List.map multiply
|> List.concat

SML function that takes 2 lists and returns the XOR (fixed)

Anyone able to offer any advice for a function in SML that will take 2 lists and return the XOR of them, so that if you have the lists [a,b,c,d], [c,d,e,f] the function returns [a,b,e,f] ?
I have tried to do it with 2 functions, but even that does not work properly.
fun del(nil,L2) = nil
|del(x::xs,L2)=
if (List.find (fn y => y = x) L2) <> (SOME x) then
del(xs, L2) @ [x]
else
del(xs, L2);
fun xor(L3,L4) =
rev(del(L3,L4)) @ rev(del(L4,L3));
Your attempt seems almost correct, except that fn x => x = x does not make sense, since it always returns true. I think you want fn y => y = x instead.
A couple of other remarks:
You can replace your use of List.find with List.filter which is closer to what you want.
Don't do del(xs,L) @ [x] for the recursive step. Appending to the end of a list has a cost linear in the length of the first list, so if you do it in every step, your function will have quadratic runtime. Do x :: del(xs,L) instead, which also allows you to drop the list reversals at the end.
What you call "XOR" here is usually called the symmetric difference, at least for set-like structures.
The simplest way would be to filter out duplicates from each list and then concatenate the two resulting lists. Using List.filter you can remove any element that is a member (List.exists) of the other list.
However that is quite inefficient, and the below code is more an example of how not to do it in real life, though it is "functionally" nice to look at :)
fun symDiff a b =
let
fun diff xs ys =
List.filter (fn x => not (List.exists ( fn y => x = y) ys)) xs
val a' = diff a b
val b' = diff b a
in
a' @ b'
end
This should be a better solution that is still kept simple. It uses the SML/NJ specific ListMergeSort module for sorting the combined list a @ b.
fun symDiff1 a b =
let
val ab' = ListMergeSort.sort op> (a @ b)
(* Remove elements if they occur more than once. Flag indicates whether x
should be removed when no further matches are found *)
fun symDif' (x :: y :: xs) flag =
(case (x = y, flag) of
(* Element is not flagged for removal, so keep it *)
(false, false) => x :: symDif' (y :: xs) false
(* Reset the flag and remove x as it was marked for removal *)
| (false, true) => symDif' (y::xs) false
(* Remove y and flag x for removal if it wasn't already *)
| (true, _) => symDif' (x::xs) true)
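(* A single remaining element that is still flagged was a duplicate, so drop it *)
| symDif' [_] true = []
| symDif' xs _ = xs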
in
symDif' ab' false
end
However, this is still kind of stupid: the sorting function already goes through all elements in the combined list, so it ought to be the one that is "responsible" for removing duplicates.

Linked list partition function and reversed results

I wrote this F# function to partition a list up to a certain point and no further -- much like a cross between takeWhile and partition.
let partitionWhile c l =
    let rec aux accl accr =
        match accr with
        | [] -> (accl, [])
        | h::t ->
            if c h then
                aux (h::accl) t
            else
                (accl, accr)
    aux [] l
The only problem is that the "taken" items are reversed:
> partitionWhile ((>=) 5) [1..10];;
val it : int list * int list = ([5; 4; 3; 2; 1], [6; 7; 8; 9; 10])
Other than resorting to calling rev, is there a way this function could be written that would have the first list be in the correct order?
Here's a continuation-based version. It's tail-recursive and returns the list in the original order.
let partitionWhileCps c l =
    let rec aux f = function
        | h::t when c h -> aux (fun (acc, l) -> f ((h::acc), l)) t
        | l -> f ([], l)
    aux id l
Here are some benchmarks to go along with the discussion following Brian's answer (and the accumulator version for reference):
let partitionWhileAcc c l =
    let rec aux acc = function
        | h::t when c h -> aux (h::acc) t
        | l -> (List.rev acc, l)
    aux [] l
let test =
    let l = List.init 10000000 id
    (fun f ->
        let r = f ((>) 9999999) l
        printfn "%A" r)
test partitionWhileCps // Real: 00:00:06.912, CPU: 00:00:07.347, GC gen0: 78, gen1: 65, gen2: 1
test partitionWhileAcc // Real: 00:00:03.755, CPU: 00:00:03.790, GC gen0: 52, gen1: 50, gen2: 1
Cps averaged ~7s, Acc ~4s. In short, continuations buy you nothing for this exercise.
I expect you can use continuations, but calling List.rev at the end is the best way to go.
I usually prefer sequences over lists as they are lazy, and the List.toSeq and Seq.toList functions convert between them. Below is an implementation of your partitionWhile function using sequences.
let partitionWhile (c:'a -> bool) (l:'a list) =
let fromEnum (e:'a IEnumerator) =
seq { while e.MoveNext() do yield e.Current}
use e = (l |> List.toSeq).GetEnumerator()
(e |> fromEnum |> Seq.takeWhile c |> Seq.toList)
,(e |> fromEnum |> Seq.toList)
You can rewrite the function like this:
let partitionWhile c l =
    let rec aux xs =
        match xs with
        | [] -> ([], [])
        | h :: t ->
            if c h then
                let (good, bad) = aux t in
                (h :: good, bad)
            else
                ([], h :: t)
    aux l
Yes, as Brian has noted it is no longer tail recursive, but it answers the question as stated. Incidentally, span in Haskell is implemented exactly the same way in Hugs:
span p [] = ([],[])
span p xs@(x:xs')
    | p x = (x:ys, zs)
    | otherwise = ([],xs)
    where (ys,zs) = span p xs'
A good reason for preferring this version in Haskell is laziness: In the first version all the good elements are visited before the list is reversed. In the second version the first good element can be returned immediately.
I don't think I'm the only one to learn a lot from (struggling with) Daniel's CPS solution. In trying to figure it out, it helped me change several potentially (to the beginner) ambiguous list references, like so:
let partitionWhileCps cond l1 =
    let rec aux f l2 =
        match l2 with
        | h::t when cond h -> aux (fun (acc, l3) -> f (h::acc, l3)) t
        | l4 -> f ([], l4)
    aux id l1
(Note that "[]" in the l4 match is the initial acc value.) I like this solution because it feels less kludgey not having to use List.rev, by drilling to the end of the first list and building the second list backwards. I think the other main way to avoid .rev would be to use tail recursion with a cons operation. Some languages optimize "tail recursion mod cons" in the same way as proper tail recursion (but Don Syme has said that this won't be coming to F#).
So this is not tail-recursive in F#, but it makes my answer an answer and avoids List.rev. It is ugly to have to access the two tuple elements; I think it would be a more fitting parallel to the CPS approach if we only returned the first list:
let partitionWhileTrmc cond l1 =
    let rec aux acc l2 =
        match l2 with
        | h::t when cond h -> ( h::fst(aux acc t), snd(aux acc t))
        | l3 -> (acc, l3)
    aux [] l1

Ocaml noobie Q -- how to use accumulating parameters?

I'm trying to learn Ocaml by working on Problem 18 from Project Euler. I know what I want to do, I just can't figure out how to do it.
I've got three lists:
let list1 = [1;2;3;4;5];;
let list2 = [ 6;7;8;9];;
let line = [9999];;
I want to add the numbers in list2 to the max adjacent number in list1; in other words, I would add 6+2, 7+3, 8+4 and 9+5 to get a list [8;10;12;14]. The list line[] is a dummy variable.
Here's my third try:
let rec meld3 l1 l2 accum =
if List.length l2 = 1 then
List.append accum [ (hd l2 + max (hd l1) (hd (tl l1)))]
else
(
List.append accum [ (hd l2 + max (hd l1) (hd (tl l1)))];
meld3 (tl l1) (tl l2) accum ;
)
;;
let fu = meld3 list1 list2 line ;;
List.iter print_int fu;;
After running this, I would expect line = [9999;8;10;12;14] but instead line = [9999].
OTOH, fu prints out as [999914].
When I step through the code, the code is executing as I expect, but nothing is changing; the accum in the else block is never modified.
I just don't get this language. Can anyone advise?
OK, let's break down your code. Here's your original.
let rec meld3 l1 l2 accum =
if List.length l2 = 1 then
List.append accum [ (hd l2 + max (hd l1) (hd (tl l1)))]
else
(
List.append accum [ (hd l2 + max (hd l1) (hd (tl l1)))];
meld3 (tl l1) (tl l2) accum ;
)
The first thing I'm going to do is rewrite it so a Caml programmer will understand it, without changing any of the computations. Primarily this means using pattern matching instead of hd and tl. This transformation is not trivial; it's important to simplify the list manipulation to make it easier to identify the problem with the code. It also makes it more obvious that this function fails if l2 is empty.
let rec meld3 l1 l2 accum = match l1, l2 with
| x1::x2::xs, [y] -> (* here the length of l2 is exactly 1 *)
List.append accum [ y + max x1 x2 ]
| x1::x2::xs, y::ys -> (* here the length of l2 is at least 1 *)
( List.append accum [ y + max x1 x2 ]
; meld3 (x2::xs) ys accum
)
Now I think the key to your difficulty is the understanding of the semicolon operator. If I write (e1; e2), the semantics is that e1 is evaluated for side effect (think printf) and then the result of e1 is thrown away. I think what you want instead is for the result of e1 to become the new value of accum for the recursive call. So instead of throwing away e1, we make it a parameter (this is the key step where the computation actually changes):
let rec meld3 l1 l2 accum = match l1, l2 with
| x1::x2::xs, [y] -> (* here the length of l2 is exactly 1 *)
List.append accum [ y + max x1 x2 ]
| x1::x2::xs, y::ys -> (* here the length of l2 is at least 1 *)
(
meld3 (x2::xs) ys (List.append accum [ y + max x1 x2 ])
)
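The semicolon point can be seen in isolation with a tiny sketch. The names with_semicolon and with_result are made up for this illustration, and ignore is only there to silence the compiler's warning about a discarded non-unit value.

```ocaml
let accum = [9999]

(* e1; e2 evaluates e1, throws its value away, and returns e2.
   Lists are immutable, so accum itself is never changed. *)
let with_semicolon = (ignore (List.append accum [8]); accum)

(* Using the value of List.append is what keeps the new element. *)
let with_result = List.append accum [8]

let () =
  assert (with_semicolon = [9999]);
  assert (with_result = [9999; 8])
```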
Next step is to observe that we've violated the Don't Repeat Yourself principle, and we can fix that by making the base case where l2 is empty:
let rec meld3 l1 l2 accum = match l1, l2 with
| x1::x2::xs, [] -> (* here the length of l2 is 0 *)
accum
| x1::x2::xs, y::ys -> (* here the length of l2 is at least 1 *)
(
meld3 (x2::xs) ys (List.append accum [ y + max x1 x2 ])
)
We then clean up a bit:
let rec meld3 l1 l2 accum = match l1, l2 with
| _, [] -> accum
| x1::x2::xs, y::ys -> meld3 (x2::xs) ys (List.append accum [ y + max x1 x2 ])
Finally, the repeated calls to append make the code quadratic. This is a classic problem with accumulating parameters and has a classic solution: accumulate the answer list in reverse order:
let rec meld3 l1 l2 accum' = match l1, l2 with
| _, [] -> List.rev accum'
| x1::x2::xs, y::ys -> meld3 (x2::xs) ys (y + max x1 x2 :: accum')
I've changed the name accum to accum'; the prime is conventional for a list in reverse order. This last version is the only version I have compiled, and I haven't tested any of the code. (I did test the code in my other answer).
I hope this answer is more helpful.
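For anyone who wants to try the final version on the lists from the question, here is a sketch with one change that is not in the answer above: a catch-all case that raises Invalid_argument when l1 runs out too early, so the match is exhaustive.

```ocaml
let rec meld3 l1 l2 accum' = match l1, l2 with
  | _, [] -> List.rev accum'
  | x1 :: (x2 :: _ as rest), y :: ys ->
    meld3 rest ys (y + max x1 x2 :: accum')
  (* Added for this sketch: l2 still has elements but l1 has fewer than two. *)
  | _ -> invalid_arg "meld3: l1 is too short"

let list1 = [1; 2; 3; 4; 5]
let list2 = [6; 7; 8; 9]

let () =
  assert (meld3 list1 list2 [] = [8; 10; 12; 14]);
  (* Because accum' is kept in reverse order, a starting value ends up in front. *)
  assert (meld3 list1 list2 [9999] = [9999; 8; 10; 12; 14])
```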
Well, I think you haven't grasped the essence of functional programming: instead of calling List.append and throwing the value away, you need to pass that value as the parameter accum to the recursive call.
I would tackle this problem by decoupling the triangle geometry from the arithmetic. The first function takes two lists (rows of the triangle) and produces a new list of triples, each containing an element plus that element's left and right children. Then a simple map produces a list containing the sum of each element with its greater child:
(* function to merge a list l of length N with a list l' of length N+1,
such that each element of the merged lists consists of a triple
(l[i], l'[i], l'[i+1])
*)
let rec merge_rows l l' = match l, l' with
| [], [last] -> [] (* correct end of list *)
| x::xs, y1::y2::ys -> (x, y1, y2) :: merge_rows xs (y2::ys)
| _ -> raise (Failure "bad length in merge_rows")
let sum_max (cur, left, right) = cur + max left right
let merge_and_sum l l' = List.map sum_max (merge_rows l l')
let list1 = [1;2;3;4;5]
let list2 = [ 6;7;8;9]
let answer = merge_and_sum list2 list1
If you are working on Euler 18, I advise you to look up "dynamic programming".
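As a rough sketch of where that leads, the row-merging step above can be folded over the whole triangle from the bottom row upwards. The function max_path and the triangle representation (a list of rows, top row first) are assumptions made for this example; merge_rows, sum_max and merge_and_sum are restated from the code above so the block is self-contained.

```ocaml
let rec merge_rows l l' = match l, l' with
  | [], [_] -> []
  | x :: xs, y1 :: (y2 :: _ as rest) -> (x, y1, y2) :: merge_rows xs rest
  | _ -> failwith "bad length in merge_rows"

let sum_max (cur, left, right) = cur + max left right
let merge_and_sum l l' = List.map sum_max (merge_rows l l')

(* Hypothetical driver: fold the rows from the bottom up, each step replacing
   a row by the best totals reachable from each of its elements. *)
let max_path triangle =
  match List.rev triangle with
  | [] -> 0
  | bottom :: above ->
    (match List.fold_left (fun below row -> merge_and_sum row below) bottom above with
     | [best] -> best
     | _ -> failwith "not a triangle")

(* The four-row example from the Project Euler problem statement. *)
let () = assert (max_path [[3]; [7; 4]; [2; 4; 6]; [8; 5; 9; 3]] = 23)
```

Each row is visited once, so the work grows with the number of entries in the triangle rather than with the number of paths.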