Library function to find difference between two lists - OCaml

Is there a library function to find List1 minus elements that appear in List2? I've been googling around and haven't found much.
It doesn't seem trivial to write myself. I've written a function to remove a specific element from a list, but that's much simpler:
let rec difference l arg = match l with
| [] -> []
| x :: xs ->
if (x = arg) then difference xs arg
else x :: difference xs arg;;

Will this do?
let diff l1 l2 = List.filter (fun x -> not (List.mem x l2)) l1
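A quick check of the one-liner above, plus a hedged variant. List.mem scans l2 once for every element of l1, so the cost grows with the product of the two lengths. The second function, diff_int, is a hypothetical name; it assumes int elements and OCaml 4.08 or later (for the Int module) and builds a set from l2 once.

```ocaml
(* keep the elements of l1 that do not occur in l2 *)
let diff l1 l2 = List.filter (fun x -> not (List.mem x l2)) l1

(* assumption: int elements, OCaml >= 4.08 for the Int module *)
module IntSet = Set.Make (Int)

(* hypothetical variant: one set lookup per element of l1 *)
let diff_int l1 l2 =
  let s = IntSet.of_list l2 in
  List.filter (fun x -> not (IntSet.mem x s)) l1

let () =
  assert (diff [1; 2; 3; 4; 2] [2; 4] = [1; 3]);
  assert (diff_int [1; 2; 3; 4; 2] [2; 4] = [1; 3])
```

Both versions keep the order of l1 and remove every occurrence of an element that appears in l2.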

What I actually ended up doing was writing another function that calls the first one I posted:
let rec difference l arg = match l with
| [] -> []
| x :: xs ->
if (x = arg) then difference xs arg
else x :: difference xs arg;;
let rec list_diff l1 l2 = match l2 with
| [] -> l1
| x :: xs -> list_diff (difference l1 x) xs;;
The solution I accepted is much more elegant, though.

Related

Multiplying Lists through Folding

I am trying to write a function that takes two lists of equal length, multiplies the elements at the same position in both lists using a fold, and returns the results as a new list.
e.g. prodList [1; 2; 3] [4; 5; 6];;
==> (through folding) ==> [1*4; 2*5; 3*6]
==> result = [4; 10; 18]
I feel like I need to use List.combine, since it will put the values that need to be multiplied into tuples. After that, I can't figure out how to break apart the tuple in a way that allows me to multiply the values. Here is what I have so far:
let prodLists l1 l2 =
let f a x = (List.hd(x)) :: a in
let base = [] in
let args = List.rev (List.combine l1 l2) in
List.fold_left f base args
Am I on the right track?
You can use fold_left2 which folds two lists of the same length. The documentation can give you more details (https://caml.inria.fr/pub/docs/manual-ocaml/libref/List.html):
val fold_left2 : ('a -> 'b -> 'c -> 'a) -> 'a -> 'b list -> 'c list -> 'a
List.fold_left2 f a [b1; ...; bn] [c1; ...; cn] is f (... (f (f a b1 c1) b2 c2) ...) bn cn. Raise Invalid_argument if the two lists are determined to have different lengths.
Another way is to fold over the output of combine, as you suggested. I recommend trying it yourself before looking at the solution below.
Solution:
let prod_lists l s =
List.rev (List.fold_left2 (fun acc a b -> (a * b) :: acc) [] l s);;
let prod_lists' l s =
List.fold_left (fun acc (a, b) -> (a * b) :: acc) [] (List.rev (List.combine l s));;
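As a hedged check of the two solutions above, the sketch below runs both on the example from the question and on lists of different lengths. List.fold_left2 and List.combine both raise Invalid_argument in that case; raises_invalid_arg is a hypothetical helper introduced only for this illustration.

```ocaml
let prod_lists l s =
  List.rev (List.fold_left2 (fun acc a b -> (a * b) :: acc) [] l s)

let prod_lists' l s =
  List.fold_left (fun acc (a, b) -> (a * b) :: acc) [] (List.rev (List.combine l s))

(* hypothetical helper: true if calling f raises Invalid_argument *)
let raises_invalid_arg f =
  try ignore (f ()); false with Invalid_argument _ -> true

let () =
  assert (prod_lists [1; 2; 3] [4; 5; 6] = [4; 10; 18]);
  assert (prod_lists' [1; 2; 3] [4; 5; 6] = [4; 10; 18]);
  (* lists of different lengths are rejected by both versions *)
  assert (raises_invalid_arg (fun () -> prod_lists [1; 2] [3]));
  assert (raises_invalid_arg (fun () -> prod_lists' [1; 2] [3]))
```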
First, let me note that using a fold for this operation seems a bit forced: you have to traverse both lists at the same time, whereas a fold combines the elements of a single list. Nonetheless, here is an implementation.
let e [] = []
let f x hxs (y::ys) = (x*y) :: hxs ys
let prodList xs ys = List.fold_right f xs e ys
Looks a bit complicated, so let me explain.
Universal Property of fold right
First you should be aware of the following property of fold_right.
h xs = fold_right f xs e
if and only if
h [] = e
h (x::xs) = f x (h xs)
This means that if we write the multiplication of lists in the recursive form derived below, we can use its e and f to write it as a fold, as above. Note, though, that we are operating on two lists, so h takes two arguments.
Base case - empty lists
Multiplying two empty lists returns an empty list.
h [] [] = []
How to write this in the form above? Just abstract over the second argument.
h [] = fun [] -> []
So,
e = fun [] -> []
Or equivalently,
e [] = []
Recursive case - non-empty lists
h (x::xs) (y::ys) = x*y :: h xs ys
Or, using just one argument,
h (x::xs) = fun (y::ys) -> x*y :: h xs ys
Now we need to rewrite this expression in the form h (x::xs) = f x (h xs). It may seem complicated but we just need to abstract over x and h xs.
h (x::xs) = (fun x hxs -> fun (y::ys) -> x*y :: hxs ys) x (h xs)
so we have that f is defined by,
f = fun x hxs -> fun (y::ys) -> x*y :: hxs ys
or equivalently,
f x hxs (y::ys) = x*y :: hxs ys
Solution as a fold right
Having determined both e and f, we just plug them into fold according to the first equation of the property above, and we get:
h xs = List.fold_right f xs e
or equivalently,
h xs ys = List.fold_right f xs e ys
Understanding the implementation
Note that the type of List.fold_right f xs e is int list -> int list, so the fold is building a function on lists, that given some ys will multiply it with the given parameter xs.
For an empty xs you will expect an empty ys and return an empty result so,
e = fun [] -> []
As for the recursive case, the function f in a fold_right must implement a solution for x::xs from a solution for xs. So f takes an x of type int and a function hxs of type int list -> int list which implements the multiplication for the tail, and it must implement multiplication for x::xs.
f x hxs = fun (y::ys) -> x*y :: hxs ys
So f constructs a function that multiplies x with y, and then applies to ys the already constructed hxs which multiplies xs to a list.
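The definitions of e and f above compile, but with non-exhaustive pattern warnings, and they fail with Match_failure on lists of different lengths. A minimal sketch of the same fold_right construction with every case spelled out follows; the name prod_list and the choice of raising Invalid_argument are assumptions made for this illustration, not part of the answer.

```ocaml
let prod_list xs ys =
  (* e: the function built for an empty xs; it expects an empty ys *)
  let e = function
    | [] -> []
    | _ :: _ -> invalid_arg "prod_list: second list is longer"
  in
  (* f: extends the function hxs built for the tail of xs to x :: xs *)
  let f x hxs = function
    | y :: ys -> (x * y) :: hxs ys
    | [] -> invalid_arg "prod_list: first list is longer"
  in
  (* the fold builds a function of type int list -> int list, then applies it *)
  List.fold_right f xs e ys

let () = assert (prod_list [1; 2; 3] [4; 5; 6] = [4; 10; 18])
```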
You mostly have the right idea; you'll want to combine (zip in other languages) the two lists and then map over each tuple:
let prod_lists l1 l2 =
List.combine l1 l2
|> List.map (fun (a, b) -> a * b)
The key is that you can pattern match on that tuple using (a, b).
You can also fold over the combined list, then rev the result, if you don't want to use map.
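Two hedged variants of the idea above. The first folds over the combined list and reverses the result, as just described. The second uses List.map2 from the standard library, which walks both lists at once without building the intermediate list of tuples. The names prod_lists_fold and prod_lists_map2 are hypothetical.

```ocaml
(* fold over the zipped list, then restore the original order *)
let prod_lists_fold l1 l2 =
  List.combine l1 l2
  |> List.fold_left (fun acc (a, b) -> (a * b) :: acc) []
  |> List.rev

(* List.map2 applies the function to elements at the same position *)
let prod_lists_map2 l1 l2 = List.map2 (fun a b -> a * b) l1 l2

let () =
  assert (prod_lists_fold [1; 2; 3] [4; 5; 6] = [4; 10; 18]);
  assert (prod_lists_map2 [1; 2; 3] [4; 5; 6] = [4; 10; 18])
```

Like List.combine, List.map2 raises Invalid_argument when the lists have different lengths.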

Exception Stack_overflow for big integer in recursive functions

My Quicksort code works for some values of N (size of list), but for big values (for example, N = 82031) the error returned by OCaml is:
Fatal error: exception Stack_overflow.
What am I doing wrong?
Should I create an iterative version due to the fact that OCaml does not support recursive functions for big values?
let rec append l1 l2 =
match l1 with
| [] -> l2
| x::xs -> x::(append xs l2)
let rec partition p l =
match l with
| [] -> ([],[])
| x::xs ->
let (cs,bs) = partition p xs in
if p < x then
(cs,x::bs)
else
(x::cs,bs)
let rec quicksort l =
match l with
| [] -> []
| x::xs ->
let (ys, zs) = partition x xs in
append (quicksort ys) (x :: (quicksort zs));;
The problem is that none of your recursive functions are tail-recursive.
Tail recursion means that the caller has nothing left to do once the recursive call returns. In that case there is no need to keep the caller's environment, and the stack does not fill up with the environments of the recursive calls. A language like OCaml can compile this optimally, but only if you provide tail-recursive functions.
For example, your first function, append :
let rec append l1 l2 =
match l1 with
| [] -> l2
| x::xs -> x::(append xs l2)
As you can see, after append xs l2 has been called, the caller still needs to execute x :: ..., so this function is not tail-recursive.
One way to write it tail-recursively is this:
let append l1 l2 =
  let rec aux l1 l2 =
    match l1 with
    | [] -> l2
    | x::xs -> aux xs (x :: l2)
  in aux (List.rev l1) l2
Alternatively, you can use List.rev_append, keeping in mind that it appends l1 and l2 with l1 reversed (List.rev_append [1;2;3] [4;5;6] gives [3;2;1;4;5;6]).
Try to transform your other functions into tail-recursive ones and see what that gives you.
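A minimal sketch of what that transformation could look like for the code in the question, under stated assumptions: partition carries two accumulators, and append is built from List.rev and List.rev_append, which are both tail-recursive. The recursion in quicksort itself is left as it is, so its depth still depends on how evenly the pivots split the input (it can be linear for an already sorted list).

```ocaml
(* tail-recursive partition: the two result lists are accumulators *)
let partition p l =
  let rec aux smaller larger = function
    | [] -> (smaller, larger)
    | x :: xs ->
      if p < x then aux smaller (x :: larger) xs
      else aux (x :: smaller) larger xs
  in
  aux [] [] l

(* constant-stack append built from standard functions *)
let append l1 l2 = List.rev_append (List.rev l1) l2

(* same structure as the original; only the helpers changed *)
let rec quicksort = function
  | [] -> []
  | x :: xs ->
    let ys, zs = partition x xs in
    append (quicksort ys) (x :: quicksort zs)

let () = assert (quicksort [3; 1; 2; 5; 4] = [1; 2; 3; 4; 5])
```

The accumulators leave each partition in reverse order, which does not matter here because both halves are sorted afterwards.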
Best to fix the underlying problem as noted above, but if you really need a big stack, set ulimit -s. See also:
https://stackoverflow.com/a/71375559/14055985

Appending two lists

So this is one way to append two lists:
let rec append l1 l2 =
match l1 with
| h :: t -> h :: append t l2
| [] -> l2
But I am trying to write a tail-recursive version of append (i.e., do the work before making the recursive call).
This is my code so far, but when I try to add append in the first if statement the code becomes faulty for weird reasons.
let list1 = [1;2;3;4]
let list2 = [5;6;7;8]
let rec append lista listb =
match listb with
| h :: taillist -> if taillist != [] then
begin
lista @ [h];
(* I can't put a recursive call to append here because it causes an error *)
end else
append lista taillist;
| [] -> lista;;
append list1 list2;;
The easiest way to transform a non-tail-recursive list algorithm into a tail-recursive one is to use an accumulator. Consider rewriting your code with a third list that accumulates the result. Use cons (i.e., ::) to prepend new elements to that third list; at the end it holds the concatenation in reverse order, so you just need to reverse it with List.rev, et voilà.
For the sake of completeness, here is a tail-recursive append:
let append l1 l2 =
let rec loop acc l1 l2 =
match l1, l2 with
| [], [] -> List.rev acc
| [], h :: t -> loop (h :: acc) [] t
| h :: t, l -> loop (h :: acc) t l
in
loop [] l1 l2
I would recommend working through the "99 problems" exercises to learn this idiom.
A couple of comments on your code:
It seems like cheating to define a list append function using @, since this is already a function that appends two lists :-)
Your code is written as if OCaml were an imperative language; i.e., you seem to expect the expression lista @ [h] to modify the value of lista. But OCaml doesn't work that way. Lists in OCaml are immutable, and lista @ [h] just calculates a new value without changing any previous values. You would need to pass this new value in your recursive call.
As @ivg says, the most straightforward way to solve your problem is using an accumulator, with a list reversal at the end. This is a common idiom in a language with immutable lists.
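To make the immutability point concrete, here is a small sketch. The names extended and add_all are hypothetical. add_all only illustrates passing the new value along in the recursive call; because acc @ [h] copies acc every time, it is quadratic, which is why the accumulator-and-reverse idiom is preferred.

```ocaml
let lista = [1; 2; 3]

(* (@) returns a new list; lista itself is unchanged *)
let extended = lista @ [4]

(* the new value must be passed on explicitly as an argument *)
let rec add_all acc = function
  | [] -> acc
  | h :: t -> add_all (acc @ [h]) t

let () =
  assert (lista = [1; 2; 3]);
  assert (extended = [1; 2; 3; 4]);
  assert (add_all lista [5; 6] = [1; 2; 3; 5; 6])
```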
A version using constant stack space, implemented with a couple of standard functions (you'll get a tail-recursive solution after unfolding the definitions):
let append xs ys = List.rev_append (List.rev xs) ys
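A short, hedged check that this definition copes with a long first list. The list size and the use of List.init (available from OCaml 4.06) are choices made for this illustration.

```ocaml
let append xs ys = List.rev_append (List.rev xs) ys

let () =
  (* a first list long enough to be a problem for a non-tail-recursive append *)
  let big = List.init 1_000_000 (fun i -> i) in
  let joined = append big [-1] in
  assert (List.length joined = 1_000_001);
  assert (List.hd joined = 0);
  assert (append [1; 2; 3] [4; 5; 6] = [1; 2; 3; 4; 5; 6])
```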
Incidentally, some OCaml libraries implement the append function in a pretty sophisticated way:
(1) see core_list0.ml in the Core_kernel library: search for "slow_append" and "count_append"
(2) or batList.mlv in the Batteries library.
An alternative tail-recursive solution (F#) leveraging continuations:
let concat x =
    let rec concat f = function
        | ([], x) -> f x
        | (x1::x2, x3) -> concat (fun x4 -> f (x1::x4)) (x2, x3)
    concat id x
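For readers following along in OCaml, here is a rough sketch of the same continuation-passing idea. The names append_cps and go are hypothetical, and unlike the F# version it takes the two lists as separate arguments rather than as a pair.

```ocaml
let append_cps l1 l2 =
  (* k is the continuation: it receives the appended tail and adds
     the elements seen so far in front of it *)
  let rec go k = function
    | [] -> k l2
    | x :: xs -> go (fun rest -> k (x :: rest)) xs
  in
  go (fun r -> r) l1

let () = assert (append_cps [1; 2; 3] [4; 5; 6] = [1; 2; 3; 4; 5; 6])
```

Every call, both to go and to the continuation, is in tail position, so the pending work lives in heap-allocated closures instead of on the stack.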
I think the best approach, as some have said, is to reverse the first list and then recursively add its head to the front of list2. The top answer with code uses an accumulator, but you can get the same result without one by consing onto the second list instead:
let reverse list =
  let rec reverse_helper acc list =
    match list with
    | [] -> acc
    | h::t -> reverse_helper (h::acc) t in
  reverse_helper [] list;;
let append list1 list2 =
  let rec append_helper list1_rev list2 =
    match list1_rev with
    | [] -> list2
    | h :: t -> append_helper t (h::list2) in
  append_helper (reverse list1) list2;;
A possible answer to your question is the following code:
let append list1 list2 =
  let rec aux acc list1 list2 = match list1, list2 with
    | [], [] -> List.rev acc
    | head :: tail, [] -> aux (head :: acc) tail []
    | [], head :: tail -> aux (head :: acc) [] tail
    | head :: tail, head' :: tail' -> aux (head :: acc) tail (head' :: tail')
  in aux [] list1 list2;;
It's pretty similar to the code given by another answerer, but this one spells out every case explicitly, including the one where list2 is empty from the beginning and list1 isn't.
Here is a simpler solution:
let rec apptr l k =
let ln = List.rev l in
let rec app ln k acc = match ln with
| [] -> acc
| h::t -> app t k (h::acc) in
app ln k k
;;
let rec append (mylist: 'a list) (myotherlist : 'a list ): 'a list =
match mylist with
| [] -> myotherlist
| a :: rest -> a :: append rest myotherlist

How do you map a function to only certain elements in a list?

E.g. if you have a function (fun x -> x+1) and you want to map it to [1; 2; 3]. But you only want to map it when x=1, so that the output is [2; 2; 3]. How do you do this?
Using OCaml, I tried:
let rec foo (input : int list) : int list =
match input with
| [] -> []
| hd::tl -> List.map (fun x -> if x=1 then (x+1)) input;;
And I've tried 'when' statements, but to no avail.
An else branch is missing here.
You're almost there. You just need to write a complete if/else expression:
if x=1 then (x+1) else x
In OCaml, if is an expression, so both branches must produce a value of the same type; an if without an else only type-checks when the then branch has type unit.
To be clear, a when guard is not needed here, because it is used for conditional pattern matching. Since pattern matching is redundant in this case, your function can be shortened quite a lot:
let foo input =
List.map (fun x -> if x=1 then x+1 else x) input
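If the condition and the function vary, the same pattern can be wrapped in a small helper. This is only a sketch; map_if is a hypothetical name and not a standard library function.

```ocaml
(* apply f only to the elements that satisfy pred; keep the others as-is *)
let map_if pred f l = List.map (fun x -> if pred x then f x else x) l

let foo input = map_if (fun x -> x = 1) (fun x -> x + 1) input

let () = assert (foo [1; 2; 3] = [2; 2; 3])
```

Note that f must return the same type as the list elements, since untouched elements are passed through unchanged.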
You can actually use a when guard, even if I prefer @pad's solution:
let foo (input : int list) : int list =
let rec aux acc input =
match input with
[] -> List.rev acc
| x :: xs when x = 1 -> aux ((x + 1) :: acc) xs
| x :: xs -> aux (x :: acc) xs
in
aux [] input

Linked list partition function and reversed results

I wrote this F# function to partition a list up to a certain point and no further -- much like a cross between takeWhile and partition.
let partitionWhile c l =
    let rec aux accl accr =
        match accr with
        | [] -> (accl, [])
        | h::t ->
            if c h then
                aux (h::accl) t
            else
                (accl, accr)
    aux [] l
The only problem is that the "taken" items are reversed:
> partitionWhile ((>=) 5) [1..10];;
val it : int list * int list = ([5; 4; 3; 2; 1], [6; 7; 8; 9; 10])
Other than resorting to calling rev, is there a way this function could be written that would have the first list be in the correct order?
Here's a continuation-based version. It's tail-recursive and returns the list in the original order.
let partitionWhileCps c l =
    let rec aux f = function
        | h::t when c h -> aux (fun (acc, l) -> f ((h::acc), l)) t
        | l -> f ([], l)
    aux id l
Here are some benchmarks to go along with the discussion following Brian's answer (and the accumulator version for reference):
let partitionWhileAcc c l =
    let rec aux acc = function
        | h::t when c h -> aux (h::acc) t
        | l -> (List.rev acc, l)
    aux [] l
let test =
    let l = List.init 10000000 id
    (fun f ->
        let r = f ((>) 9999999) l
        printfn "%A" r)
test partitionWhileCps // Real: 00:00:06.912, CPU: 00:00:07.347, GC gen0: 78, gen1: 65, gen2: 1
test partitionWhileAcc // Real: 00:00:03.755, CPU: 00:00:03.790, GC gen0: 52, gen1: 50, gen2: 1
Cps averaged ~7s, Acc ~4s. In short, continuations buy you nothing for this exercise.
I expect you can use continuations, but calling List.rev at the end is the best way to go.
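For comparison with the rest of this page, here is the accumulator-plus-List.rev approach sketched in OCaml rather than F#. The name partition_while is hypothetical.

```ocaml
let partition_while pred l =
  let rec aux acc = function
    (* keep taking elements while the predicate holds *)
    | h :: t when pred h -> aux (h :: acc) t
    (* first failing element (or end of list): restore the order of the taken part *)
    | rest -> (List.rev acc, rest)
  in
  aux [] l

let () =
  assert (partition_while (fun x -> x <= 5) [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]
          = ([1; 2; 3; 4; 5], [6; 7; 8; 9; 10]))
```

The reversal touches only the taken prefix, so the extra cost is proportional to the number of elements that satisfied the predicate.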
I usually prefer sequences over lists because they are lazy, and List.toSeq and Seq.toList convert between the two. Below is an implementation of your partitionWhile function using sequences.
open System.Collections.Generic
let partitionWhile (c:'a -> bool) (l:'a list) =
    let fromEnum (e:'a IEnumerator) =
        seq { while e.MoveNext() do yield e.Current}
    use e = (l |> List.toSeq).GetEnumerator()
    // Caveat: Seq.takeWhile has to read the first non-matching element in order to stop,
    // so that element is consumed from the shared enumerator and is missing from the second list.
    (e |> fromEnum |> Seq.takeWhile c |> Seq.toList)
    ,(e |> fromEnum |> Seq.toList)
You can rewrite the function like this:
let partitionWhile c l =
    let rec aux xs =
        match xs with
        | [] -> ([], [])
        | h :: t ->
            if c h then
                let (good, bad) = aux t in
                (h :: good, bad)
            else
                ([], h :: t)
    aux l
Yes, as Brian has noted it is no longer tail recursive, but it answers the question as stated. Incidentally, span in Haskell is implemented exactly the same way in Hugs:
span p [] = ([],[])
span p xs@(x:xs')
    | p x = (x:ys, zs)
    | otherwise = ([],xs)
    where (ys,zs) = span p xs'
A good reason for preferring this version in Haskell is laziness: In the first version all the good elements are visited before the list is reversed. In the second version the first good element can be returned immediately.
I don't think I'm the only one to learn a lot from (struggling with) Daniel's CPS solution. While trying to figure it out, it helped me to rename several list references that are potentially ambiguous to a beginner, like so:
let partitionWhileCps cond l1 =
    let rec aux f l2 =
        match l2 with
        | h::t when cond h -> aux (fun (acc, l3) -> f (h::acc, l3)) t
        | l4 -> f ([], l4)
    aux id l1
(Note that "[]" in the l4 match is the initial acc value.) I like this solution because it feels less kludgey not having to use List.rev, by drilling to the end of the first list and building the second list backwards. I think the other main way to avoid .rev would be to use tail recursion with a cons operation. Some languages optimize "tail recursion mod cons" in the same way as proper tail recursion (but Don Syme has said that this won't be coming to F#).
So the version below is not stack-safe in F#, but it makes my answer an answer and avoids List.rev. (Having to take apart the returned tuple is ugly; if we only returned the first list, this would be a closer parallel to the CPS approach, I think.)
let partitionWhileTrmc cond l1 =
    let rec aux acc l2 =
        match l2 with
        | h::t when cond h ->
            // call aux once and reuse both halves of its result
            let (taken, rest) = aux acc t
            (h::taken, rest)
        | l3 -> (acc, l3)
    aux [] l1
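For comparison with the "tail recursion mod cons" remark above: OCaml 4.14 and later can apply this transformation when asked to, via the [@tail_mod_cons] attribute. A minimal sketch, written in OCaml rather than F# and assuming a compiler version that supports the attribute; the name append_tmc is hypothetical.

```ocaml
(* assumption: OCaml >= 4.14, where [@tail_mod_cons] is available *)
let[@tail_mod_cons] rec append_tmc l1 l2 =
  match l1 with
  | [] -> l2
  (* the recursive call sits directly under a constructor (::),
     which is the shape the transformation handles *)
  | x :: xs -> x :: append_tmc xs l2

let () = assert (append_tmc [1; 2; 3] [4; 5; 6] = [1; 2; 3; 4; 5; 6])
```

The source reads like the naive non-tail-recursive append, and the result comes out in the right order without a final reversal.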