Eliminate consecutive duplicates of list elements in OCaml

I am working on "99 Ocaml Problems" and in the solution, I see this pattern matching:
let rec compress (mylist : 'a list) : 'a list = match mylist with
|a::(b::_ as t) -> if a = b then compress t else a::compress t
|smaller -> smaller
I understand the first case: if element a is equal to element b, I move on to the list t. If not, I prepend a to the result of compressing t.
For the second case, I am not sure what the type of "smaller" is.
I thought the author wanted the second case to match a one-element list, so I tried putting square brackets around it, but then the pattern match is non-exhaustive.
Can you explain to me what "smaller" is in this case?

The variable smaller is an 'a list. It matches anything that doesn't match the earlier branch, i.e., a list with one element or the empty list.
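To make the two shapes covered by smaller visible, here is a sketch that spells them out as separate cases. It is only a rewording of the function from the question, not a different algorithm, and it also shows why [smaller] alone is non-exhaustive: that pattern covers the one-element list but not the empty list.

```ocaml
(* Same behaviour as the original compress, with the catch-all
   branch split into the two cases it actually covers. *)
let rec compress (mylist : 'a list) : 'a list =
  match mylist with
  | a :: (b :: _ as t) -> if a = b then compress t else a :: compress t
  | [x] -> [x]   (* exactly one element: nothing to compress *)
  | [] -> []     (* empty list: nothing to compress *)
```

Removing the last branch is what triggers the non-exhaustive match warning.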

Another way to write the compress function without smaller:
let rec compress (mylist : 'a list) : 'a list = match mylist with
| a::(b::_ as t) -> if a = b then compress t else a::compress t
| _ -> mylist;;
This says the same as user tbrk's answer: if mylist does not match the first pattern, then compress mylist returns mylist.

Related

Using fold_left to search for a list with a specific length in OCaml

I've written a function which searches through a list of int lists and returns the index of the first list with a specific length, using pattern matching:
let rec search x lst i = match lst with
| [] -> raise(Failure "Not found")
| hd :: tl -> if (List.length hd = x) then i else search x tl (i+1)
;;
For example:
utop # search 2 [ [1;2];[1;2;3] ] 0 ;;
- : int = 0
Is there a way to write a function with the same functionality using List.fold_left?
What does List.fold_left actually do?
It takes (in reverse order to the order of arguments) a list, an initial value, and a function that works on that initial value and the first element in the list. If the list is empty, it returns the initial value. Otherwise it applies the function to the initial value and the head of the list, then recurses on the tail with that result as the new initial value.
let rec fold_left f init lst =
match lst with
| [] -> init
| x::xs -> fold_left f (f init x) xs
Now, what information do you need to keep track of as you iterate? The index. Easy enough.
But, what if you don't actually find a list of that length? You need to keep track of whether you've found one. So let's say we use a tuple of the index and a boolean flag.
The function you pass to fold_left first needs to check whether a match has already been found. If so, no update is necessary: essentially we just no-op over the rest of the list. But if we haven't found a match yet, then we need to test the current sublist's length and update the accumulator accordingly.
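The tuple-based approach described above might look like the following sketch. The names search_fold and step are made up for illustration, and the Failure "Not found" behaviour is copied from the function in the question.

```ocaml
(* Accumulator: (index of the current sublist, whether a match was found).
   Once found is true, the accumulator is passed through unchanged. *)
let search_fold x lst =
  let step (i, found) sub =
    if found then (i, found)                     (* already found: no-op *)
    else if List.length sub = x then (i, true)   (* match at index i *)
    else (i + 1, false)                          (* keep counting *)
  in
  match List.fold_left step (0, false) lst with
  | (i, true) -> i
  | (_, false) -> raise (Failure "Not found")
```

Note that this still walks the whole list even after a match, which is the price of using a fold here.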
@glennsl (in a comment) and @Chris already explained that you may use List.fold_left but that it’s not the right tool for the job, because it processes the whole list whereas you want to stop once an occurrence is found. There are solutions, but they are not satisfying:
(@Chris’ solution:) use a folding function that ignores the new elements once an occurrence has been found: you’re just wasting time, walking through the remaining tail for nothing;
evade the loop by throwing and catching an exception: better but hacky, you’re working around the normal functioning of List.fold_left.
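For completeness, the exception-based escape mentioned in the second option could be sketched as follows. The exception Found and the name search_exn are hypothetical, introduced only for this illustration.

```ocaml
(* Hypothetical exception used to leave the fold early, carrying the index. *)
exception Found of int

let search_exn x lst =
  try
    ignore
      (List.fold_left
         (fun i sub -> if List.length sub = x then raise (Found i) else i + 1)
         0 lst);
    (* The fold ran to completion, so no sublist had length x. *)
    raise (Failure "Not found")
  with Found i -> i
```

The fold stops at the first match, but only because the exception interrupts it, which is why this is described as working around List.fold_left rather than with it.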
I'll just mention that there is a generic function in the standard library that matches your situation almost perfectly:
val find : ('a -> bool) -> 'a list -> 'a
find f l returns the first element of the list l that satisfies the predicate f.
Raises Not_found if there is no value that satisfies f in the list l.
However it does not return the index, unlike what you are asking for. This is a deliberate design choice in the standard library, because list indexing is inefficient (linear time) and you shouldn’t do it. If, after these cautionary words, you still want the index, it is easy to write a generic function find_with_index.
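A minimal sketch of such a find_with_index, assuming it should return both the index and the element and raise Not_found like List.find does (the exact return shape is a guess, since the answer does not spell it out):

```ocaml
(* Returns (index, element) for the first element satisfying p.
   Raises Not_found if there is none, mirroring List.find. *)
let find_with_index p l =
  let rec go i = function
    | [] -> raise Not_found
    | x :: xs -> if p x then (i, x) else go (i + 1) xs
  in
  go 0 l
```

With it, the search from the question becomes fst (find_with_index (fun sub -> List.length sub = x) lst).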
Another remark on your code: you can avoid computing the lengths of inner lists fully, thanks to the following standard function:
val compare_length_with : 'a list -> int -> int
Compare the length of a list to an integer. compare_length_with l len is equivalent to compare (length l) len, except that the computation stops after at most len iterations on the list.
Since 4.05.0
So instead of if List.length hd = x, you can do if List.compare_length_with hd x = 0.
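Applied to the function from the question, that substitution might look like this sketch (it requires OCaml 4.05 or later; the name search_short is made up to avoid clashing with the original search):

```ocaml
(* Same recursion as the original search, but each inner list is only
   traversed as far as needed to compare its length with x. *)
let rec search_short x lst i =
  match lst with
  | [] -> raise (Failure "Not found")
  | hd :: tl ->
    if List.compare_length_with hd x = 0 then i
    else search_short x tl (i + 1)
```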

SML [circularity] error when doing recursion on lists

I'm trying to build a function that zips the two given lists, ignoring the extra elements of the longer list.
fun zipTail L1 L2 =
let
fun helper buf L1 L2 = buf
| helper buf [x::rest1] [y::rest2] = helper ((x,y)::buf) rest1 rest2
in
reverse (helper [] L1 L2)
end
When I did this I got the error message:
Error: right-hand-side of clause doesn't agree with function result type [circularity]
I'm curious what a circularity error is and how I should fix this.
There are a number of problems here:
1) In helper buf L1 L2 = buf, the pattern buf L1 L2 would match all possible inputs, rendering your next clause (once debugged) redundant. In context, I think that you meant helper buf [] [] = buf, but then you would run into problems of non-exhaustive matching in the case of lists of unequal sizes. The simplest fix would be to move the second clause (the one with x::rest1) into the top line and then have a second pattern to catch the cases in which at least one of the lists is empty.
2) [x::rest1] is a pattern which matches a list of one item where the item is a nonempty list. That isn't your intention. You need to use parentheses, (x::rest1), rather than square brackets, [x::rest1].
3) reverse should be rev.
Making these changes, your definition becomes:
fun zipTail L1 L2 =
let
fun helper buf (x::rest1) (y::rest2) = helper ((x,y)::buf) rest1 rest2
| helper buf rest1 rest2 = buf
in
rev (helper [] L1 L2)
end;
Which works as intended.
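For readers following along in OCaml, the language of the other questions on this page, the corrected definition can be transcribed roughly as below. This is a sketch for comparison only: the name zip_tail is made up, and the code is not part of the SML answer.

```ocaml
(* Accumulate pairs in reverse, stopping as soon as either list runs out,
   then reverse once at the end. *)
let zip_tail l1 l2 =
  let rec helper buf l1 l2 =
    match l1, l2 with
    | x :: rest1, y :: rest2 -> helper ((x, y) :: buf) rest1 rest2
    | _, _ -> buf   (* at least one list is empty *)
  in
  List.rev (helper [] l1 l2)
```

As in the SML version, the cons pattern needs no square brackets around it; writing [x :: rest1] in OCaml would also mean a one-element list of lists.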
The error message itself is a bit hard to understand, but you can think of it like this. In
helper buf [x::rest1] [y::rest2] = helper ((x,y)::buf) rest1 rest2
the things in the brackets on the left-hand side are lists of lists. So their type would be 'a list list where 'a is the type of x. In x::rest1 the type of rest1 would have to be 'a list. Since rest1 also appears on the other side of the equals sign in the same position as [x::rest1], the type of rest1 would have to be the same as the type of [x::rest1], which is 'a list list. Thus rest1 must be both 'a list and 'a list list, which is impossible.
The circularity comes from this: if you attempt to make sense of 'a list list = 'a list, you would need a type 'a with 'a = 'a list. This would be a type whose values consist of a list of values of the same type, and the items in that list would themselves have to be lists of elements of the same type ... It is a vicious circle which never ends.
The problem with circularity shows up in many other places.
You want (x::rest1) and not [x::rest1].
The problem is a syntactic misconception.
The pattern [foo] will match against a list with exactly one element in it, foo.
The pattern x::rest1 will match against a list with at least one element in it, x, and its (possibly empty) tail, rest1. This is the pattern you want. But the pattern contains an infix operator, so you need to put parentheses around it.
The combined pattern [x::rest1] will match against a list with exactly one element that is itself a list with at least one element. This pattern is valid, although overly specific, and does not provoke a type error in itself.
The reason you get a circularity error is that the compiler can't infer what the type of rest1 is. As it occurs on the right-hand side of the :: pattern constructor, it must be 'a list, and as it occurs all by itself, it must be 'a. Trying to unify 'a = 'a list is like finding solutions to the equation x = x + 1.
You might say "well, as long as 'a = 'a list list list list list ... infinitely, like ∞ = ∞ + 1, that's a solution." But the Damas-Hindley-Milner type system doesn't treat this infinite construction as a well-defined type. And creating the singleton list [[[...x...]]] would require an infinite amount of brackets, so it isn't entirely practical anyways.
Some simpler examples of circularity:
fun derp [x] = derp x: This is a simplification of your case where the pattern in the first argument of derp indicates a list, and the x indicates that the type of element in this list must be the same as the type of the list itself.
fun wat x = wat [x]: This is a very similar case where wat takes an argument of type 'a and calls itself with an argument of type 'a list. Naturally, 'a could be an 'a list, but then so must 'a list be an 'a list list, etc.
As I said, you're getting circularity because of a syntactic misconception wrt. list patterns. But circularity is not restricted to lists. It is a product of composed types and self-reference. Here's an example without lists, taken from the question "Function which applies its argument to itself?":
fun erg x = x x: Here, x can be thought of as having type 'a to begin with, but seeing it applied as a function to itself, it must also have type 'a -> 'b. But if 'a = 'a -> 'b, then 'a -> 'b = ('a -> 'b) -> 'b, and ('a -> 'b) -> 'b = (('a -> 'b) -> 'b) -> 'b, and so on. SML compilers are quick to determine that there are no solutions here.
This is not to say that functions with circular types are always useless. As newacct points out, turning purely anonymous functions into recursive ones actually requires this, like in the Y-combinator.
The built-in ListPair.zip is usually tail-recursive, by the way.

breaking a list into a new list of 2 neighboring elements

I need to break a list like [1;2;3;4;5] into [[1;2]; [3;4]; [5]] in OCaml.
I wrote the following function but it is giving me an error (Error: This expression has type 'a list but an expression was expected of type 'a The type variable 'a occurs inside 'a list)
let rec getNewList l =
match l with
[] -> failwith "empty list"
| [x] -> [x]
| x::(y::_ as t) -> [x;y] :: getNewList t;;
What am I missing? How can I fix it?
You want a function of type 'a list -> 'a list list. However, the second branch of your match returns something of type 'a list.
As a side comment, you shouldn't consider it an error if the input is an empty list. There's a perfectly natural answer for this case. Otherwise you'll have a lot of extra trouble writing your function.
You're not far from a solution. Three things:
if the list is empty, you definitely want your result to be the empty list
second case should be [x] -> [[x]]
for the main case, how many times should y appear in your result?
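Putting the three hints together gives something like the sketch below. The name pairs is made up; the key change in the main case is that the recursion continues on the list after y, so y appears only once in the result.

```ocaml
(* Splits a list into chunks of two neighbouring elements;
   a trailing single element becomes its own one-element chunk. *)
let rec pairs l =
  match l with
  | [] -> []                                   (* empty input, empty result *)
  | [x] -> [[x]]                               (* note the double brackets *)
  | x :: y :: rest -> [x; y] :: pairs rest     (* skip past y as well *)
```

Here pairs [1;2;3;4;5] evaluates to [[1;2]; [3;4]; [5]], as requested in the question.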

nested lists in ocaml

I am new to Ocaml and have defined nested lists as follows:
type 'a node = Empty | One of 'a | Many of 'a node list
Now I want to define a wrapping function that wraps square brackets around the first-order members of a nested list. For example, wrap (Many [One a; Many [c; d]; One b; One e]) returns Many [Many [One a; Empty]; Many [Many [c; d]; Empty]; Many [One b; Empty]; Many [One e; Empty]].
Here's my code for the same:
let rec wrap list = function
Empty -> []
| Many[x; y] -> Many [ Many[x; Empty]; wrap y;];;
But I am getting an error in the last expression : This expression has the type 'a node but an expression was expected of the type 'b list. Please help.
Your two match branches are not returning values of the same type. The first branch returns a 'b list; the second branch returns an 'a node. To get past the type checker, you'll need to change the first branch to read: Empty -> Empty.
A second issue (which you will run into next) is that your recursive call is not being fed a value of the correct type. wrap : 'a node -> 'a node, but y : 'a node list. One way to address this would be to replace the expression with wrap (Many y).
There will also be an issue in that your current function assumes the Many list only has two elements. I think what you want to do is Many (x::y). This matches x as the head of the list and y as the tail. However, you will then need a case to handle Many [] so as to avoid infinite recursion.
Finally, the overall form of your function strikes me as a bit unusual. As written, let rec wrap list = function ... defines a function of two arguments: list, plus the anonymous argument matched by function. I would replace function Empty -> ... with match list with | Empty -> ....
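One way to combine these fixes is sketched below. It is an interpretation of the example in the question rather than the asker's final code: the helper wrap_items is a made-up name, and leaving Empty and One values unchanged is an assumption, since the question only shows a Many input.

```ocaml
type 'a node = Empty | One of 'a | Many of 'a node list

let wrap node =
  (* Wrap each first-order member x as Many [x; Empty]. The [] case
     ends the recursion over the member list. *)
  let rec wrap_items = function
    | [] -> []
    | x :: rest -> Many [x; Empty] :: wrap_items rest
  in
  match node with
  | Many items -> Many (wrap_items items)
  | Empty | One _ -> node   (* assumption: nothing to wrap here *)
```

Recursing over the plain list inside Many, instead of rebuilding a Many node at every step, keeps both branches at type 'a node and avoids the type mismatch from the question.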

What does (x:_) and [x:_] mean?

head' :: [a] -> a
head' [] = error "No head for empty lists!"
head' (x:_) = x
head' :: [a] -> a
head' xs = case xs of [] -> error "No head for empty lists!"
                      (x:_) -> x
I have a fairly basic question about something I don't understand.
In the code above, I see that it takes a list for an input.
But the third line uses (x:_), which confuses me.
Can anyone explain to me why they wrote (x:_) instead of [x:_]?
Also, I don't understand what (x:_) means.
Thank you.
: is a constructor for lists, which takes the head of the new list as its left argument and the tail as its right argument. If you use it as a pattern like here that means that the head of the list you match is given to the left pattern and the tail to the right.
So in this case the head of the list is stored in the variable x and the tail is not used (_ means that you don't care about the value).
And yes, you can also use [] to pattern match against lists, but only lists of fixed size. For example the pattern [x] matches a list with exactly one element, which is then stored in the variable x. Likewise [x,y] would match a list with two elements.
Your proposed pattern [x:y] would thus match a list with one element, which matches the pattern x:y. In other words, it would match a list of lists which contains exactly one non-empty list.
This is a concept called pattern matching. : is an infix constructor, just like + is an infix function. In Haskell you pattern match with constructors.
(1 : 2 : 3 : [])
is the same as [1, 2, 3]; the square bracket notation is just syntactic sugar for creating lists.
Your pattern (x : _) means that you want to bind the first element of the list to x and that you do not care about the rest of the list _.
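The same distinction exists in OCaml, the language of the other questions on this page, where the cons constructor is written :: instead of :. The sketch below is an OCaml analogue rather than Haskell code, and the names head and only_inner_head are made up for illustration.

```ocaml
(* x :: _ matches any non-empty list and binds its first element. *)
let head = function
  | [] -> failwith "No head for empty lists!"
  | x :: _ -> x

(* [x :: _] matches a list containing exactly one element,
   where that element is itself a non-empty list. *)
let only_inner_head = function
  | [x :: _] -> Some x
  | _ -> None
```

The inferred types show the difference: head works on 'a list, while only_inner_head needs an 'a list list.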