Why is it that when I use a library function like
List.filter I do not get a stack overflow error,
but when I use
let rec filter p l =
match l with
| [] -> []
| hd::tl -> if p hd then hd::(filter p l) else filter p l
which does the same thing as List.filter, it produces a stack overflow error?
Your filter function calls itself recursively with the same list l:
... then hd::(filter p l) else filter p l
             ^^^^^^^^^^^^      ^^^^^^^^^^
This means you never converge towards a solution, you are recursing infinitely.
Reaching Stack Overflow on small inputs is a symptom of such problems.
Likewise, a tail-recursive function with the same mistake does not overflow the stack but loops forever, which is another sign that the program is incorrect (although some programs loop forever on purpose).
In your case you need to use tl, the remaining elements from the list. Having something that shrinks at each step of recursion is important if you expect the function to terminate.
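As a minimal sketch of that fix, here is the function from the question with the recursive calls changed to take tl instead of l. Nothing else is altered, and the sample predicate and input are only illustrative:

```ocaml
(* Same function as in the question, except that the recursive calls
   receive tl (the shrinking tail) instead of the whole list l. *)
let rec filter p l =
  match l with
  | [] -> []
  | hd :: tl -> if p hd then hd :: filter p tl else filter p tl

(* Illustrative use: keep the even numbers. *)
let evens = filter (fun x -> x mod 2 = 0) [1; 2; 3; 4; 5; 6]
(* evens = [2; 4; 6] *)
```

This version terminates on any finite list because each call sees a list one element shorter than the previous one.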
I am trying to figure out whether there is any way to use append to turn a list of integer lists into a single list of integers wrapped in a list, for example
[[1];[2];[3]] -> [[1;2;3]]
[] -> [[]]
[[]] -> []
but I am not sure how looping really works in OCaml.
Below is what I have tried, but I don't think it works:
let rec ls (l : 'a list list) =
match l with
| [] -> []
| x :: y -> l@y
I have tried to use @ to do this, but I don't know how to remove the brackets.
Note that in your attempt you never use x, the head of the list, and the function is not actually recursive: it never calls itself. Note also that @ is never necessary in this exercise, which is good because appending repeatedly has some ugly performance implications.
Consider that you can use pattern-matching to identify whether a list is empty or not, and to extract elements from the head and the tail of a list. What should the result of flattening an empty list be? An empty list.
let rec flatten =
function
| [] -> []
Now, if the first list in the list of lists is empty, it should be the result of flattening the tail. This seems pretty obvious so far.
let rec flatten =
function
| [] -> []
| []::tl -> flatten tl
Now, if it's not empty then we can cons the first element of the first list onto the result of flattening... I'll leave that as an exercise for you to fill in.
let rec flatten =
function
| [] -> []
| []::tl -> flatten tl
| (x::xs)::tl -> x :: flatten ...
Looping via recursion
While OCaml does have imperative loops, it is much more idiomatic, especially when dealing with lists, to loop via recursion.
In order to use recursion to loop, there must be at least one exit case where the function does not recursively call itself, but there must also be at least one case where it does, and that function call must in some way update the state being passed in so that it converges on the exit case.
If the exit case is passing in an empty list, the recursive calls must get closer to passing in an empty list on each call or the recursion will never end.
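To make the exit case and the converging recursive case concrete, here is a small sketch. The functions sum and sum_acc are illustrative names and are not part of the question:

```ocaml
(* Exit case: the empty list. Recursive case: the call receives the tail,
   which is one element shorter, so it converges on the exit case. *)
let rec sum l =
  match l with
  | [] -> 0
  | x :: rest -> x + sum rest

(* The same loop with an explicit accumulator: the "state" being updated
   is the pair (acc, remaining list). This form is tail-recursive. *)
let sum_acc l =
  let rec go acc = function
    | [] -> acc
    | x :: rest -> go (acc + x) rest
  in
  go 0 l
```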
If you did want to append...
If you decide you do like @, and don't care about O(n^2) runtime complexity, you can use it with List.fold_left to readily accomplish this goal.
# List.fold_left (@) [] [[1;2]; [3;4]];;
- : int list = [1; 2; 3; 4]
This is equivalent to ([] @ [1;2]) @ [3;4].
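The cost comes from the append operator @ copying its left operand. With List.fold_left the growing result is the left operand at every step, so it is copied again and again. The sketch below compares this with List.fold_right, where each sublist is the left operand and is copied only once, and with List.concat from the standard library. The names lists, left_nested, right_nested and via_stdlib are illustrative:

```ocaml
let lists = [[1; 2]; [3; 4]]

(* ([] @ [1;2]) @ [3;4]: the accumulated result is re-copied each step. *)
let left_nested = List.fold_left (@) [] lists

(* [1;2] @ ([3;4] @ []): each sublist is copied exactly once. *)
let right_nested = List.fold_right (@) lists []

(* The standard library function for the same job. *)
let via_stdlib = List.concat lists
```

All three produce the same list; they differ only in how much copying is done along the way.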
I've written a function which searches through a list of int lists and returns the index of the first list with a specific length, using pattern matching:
let rec search x lst i = match lst with
| [] -> raise(Failure "Not found")
| hd :: tl -> if (List.length hd = x) then i else search x tl (i+1)
;;
For example:
utop # search 2 [ [1;2];[1;2;3] ] 0 ;;
- : int = 0
Is there a way to write a function with the same functionality using List.fold_left?
What does List.fold_left actually do?
It takes (in reverse order to the order of arguments) a list, an initial value, and a function that works on that initial value and the first element in the list. If the list is empty, it returns the initial value. Otherwise it applies the function to the initial value and the head to get an updated value, then recurses on the tail of the list with that value.
let rec fold_left f init lst =
match lst with
| [] -> init
| x::xs -> fold_left f (f init x) xs
Now, what information do you need to keep track of as you iterate? The index. Easy enough.
But, what if you don't actually find a list of that length? You need to keep track of whether you've found one. So let's say we use a tuple of the index and a boolean flag.
The function you pass to fold_left first checks the flag: if a match has already been found, no update is necessary, and we essentially no-op over the rest of the list. If we haven't found a match yet, we test the current sublist's length and update the accumulated value accordingly.
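One possible way to put this description into code is sketched below. The names search_fold and step are made up for this sketch, and the accumulator is the (index, found) tuple described above:

```ocaml
(* Accumulator: (index, found). While found is false, index is the position
   of the sublist about to be examined; once found is true it is frozen at
   the position of the first match. *)
let search_fold x lst =
  let step (i, found) sub =
    if found then (i, found)                      (* already found: no-op *)
    else if List.length sub = x then (i, true)    (* match: freeze the index *)
    else (i + 1, false)                           (* keep looking *)
  in
  match List.fold_left step (0, false) lst with
  | (i, true) -> i
  | (_, false) -> raise (Failure "Not found")
```

For instance, search_fold 2 [[1;2];[1;2;3]] evaluates to 0, as in the original example, but the fold still visits every remaining sublist after the match.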
@glennsl (in a comment) and @Chris already explained that you may use List.fold_left but that it’s not the right tool for the job, because it processes the whole list whereas you want to stop once an occurrence is found. There are solutions but they are not satisfying:
(@Chris’ solution:) use a folding function that ignores the new elements once an occurrence has been found: you’re just wasting time, walking through the remaining tail for nothing;
evade the loop by throwing and catching an exception: better but hacky, you’re working around the normal functioning of List.fold_left.
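For illustration, the second workaround might look like the sketch below. The exception Found and the function search_exn are hypothetical names introduced here:

```ocaml
(* Hypothetical exception used only to carry the index out of the fold. *)
exception Found of int

let search_exn x lst =
  try
    (* The accumulator is just the current index; raising stops the fold. *)
    let _ =
      List.fold_left
        (fun i sub -> if List.length sub = x then raise (Found i) else i + 1)
        0 lst
    in
    (* The fold completed without raising, so nothing matched. *)
    raise (Failure "Not found")
  with Found i -> i
```

It stops at the first match, but the control flow is arguably harder to follow than the plain recursive search from the question.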
I'll just mention that there is a generic function in the standard library that matches your situation almost perfectly:
val find : ('a -> bool) -> 'a list -> 'a
find f l returns the first element of the list l that satisfies the predicate f.
Raises Not_found if there is no value that satisfies f in the list l.
However it does not return the index, unlike what you are asking for. This is a deliberate design choice in the standard library, because list indexing is inefficient (linear time) and you shouldn’t do it. If, after these cautionary words, you still want the index, it is easy to write a generic function find_with_index.
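A minimal sketch of such a function follows. The name find_with_index comes from the sentence above, but its exact signature (returning the index together with the element, and raising Not_found like List.find) is an assumption made here:

```ocaml
(* Hypothetical helper, not part of the standard library: returns the index
   and the first element satisfying f, or raises Not_found like List.find. *)
let find_with_index f l =
  let rec go i = function
    | [] -> raise Not_found
    | x :: rest -> if f x then (i, x) else go (i + 1) rest
  in
  go 0 l

(* Illustrative use: first sublist of length 2. *)
let (idx, sub) =
  find_with_index (fun s -> List.length s = 2) [[1]; [1; 2]; [3; 4]]
(* idx = 1, sub = [1; 2] *)
```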
Another remark on your code: you can avoid computing the lengths of inner lists fully, thanks to the following standard function:
val compare_length_with : 'a list -> int -> int
Compare the length of a list to an integer. compare_length_with l len is equivalent to compare (length l) len, except that the computation stops after at most len iterations on the list.
Since 4.05.0
So instead of if List.length hd = x, you can do if List.compare_length_with hd x = 0.
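Applied to the function from the question, that substitution would look roughly like this (it needs OCaml 4.05 or later, as noted above):

```ocaml
(* Same search as in the question, but the length test stops walking a
   sublist as soon as it is known to be longer than x. *)
let rec search x lst i =
  match lst with
  | [] -> raise (Failure "Not found")
  | hd :: tl ->
    if List.compare_length_with hd x = 0 then i else search x tl (i + 1)
```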
EDIT: see this followup question that simplifies the problem I am trying to identify here, and asks for input on a GHC modification proposal.
So I was trying to write a generic breadth-first search function and came up with the following:
bfs :: (a -> Bool) -> (a -> [a]) -> [a] -> Maybe a
bfs predf expandf xs = find predf bfsList
  where bfsList = xs ++ concatMap expandf bfsList
which I thought was pretty elegant, however in the does-not-exist case it blocks forever.
After all the terms have been expanded to [], concatMap will never return another item, so concatMap is blocking waiting for another item from itself? Could Haskell be made smart enough to realize the list generation is blocked reading the self-reference and terminate the list?
The best replacement I've been able to come up with isn't quite as elegant, since I have to handle the termination case myself:
  where bfsList = concat . takeWhile (not . null) $ iterate (concatMap expandf) xs
For concrete examples, the first search terminates with success, and the second one blocks:
bfs (==3) (\x -> if x<1 then [] else [x/2, x/5]) [5, 3*2**8]
bfs (==3) (\x -> if x<1 then [] else [x/2, x/5]) [5, 2**8]
Edited to add a note to explain my bfs' solution below.
The way your question is phrased ("could Haskell be made smart enough"), it sounds like you think the correct value for a computation like:
bfs (\x -> False) (\x -> []) []
given your original definition of bfs should be Nothing, and Haskell is just failing to find the correct answer.
However, the correct value for the above computation is bottom. Substituting the definition of bfs (and simplifying the [] ++ expression), the above computation is equal to:
find (\x -> False) bfsList
  where bfsList = concatMap (\x -> []) bfsList
Evaluating find requires determining if bfsList is empty or not, so it must be forced to weak head normal form. This forcing requires evaluating the concatMap expression, which also must determine if bfsList is empty or not, forcing it to WHNF. This forcing loop implies bfsList is bottom, and therefore so is find.
Haskell could be smarter in detecting the loop and giving an error, but it would be incorrect to return [].
Ultimately, this is the same thing that happens with:
foo = case foo of [] -> []
which also loops infinitely. Haskell's semantics imply that this case construct must force foo, and forcing foo requires forcing foo, so the result is bottom. It's true that if we considered this definition an equation, then substituting foo = [] would "satisfy" it, but that's not how Haskell semantics work, for the same reason that:
bar = bar
does not have value 1 or "awesome", even though these values satisfy it as an "equation".
So, the answer to your question is, no, this behavior couldn't be changed so as to return an empty list without fundamentally changing Haskell semantics.
Also, as an alternative that looks pretty slick -- even with its explicit termination condition -- maybe consider:
bfs' :: (a -> Bool) -> (a -> [a]) -> [a] -> Maybe a
bfs' predf expandf = look
  where look [] = Nothing
        look xs = find predf xs <|> look (concatMap expandf xs)
This uses the Alternative instance for Maybe, which is really very straightforward:
Just x <|> ... -- yields `Just x`
Nothing <|> Just y -- yields `Just y`
Nothing <|> Nothing -- yields `Nothing` (doesn't happen above)
so look checks the current set of values xs with find, and if it fails and returns Nothing, it recursively looks in their expansions.
As a silly example that makes the termination condition look less explicit, here's its double-monad (Maybe in implicit Reader) version using listToMaybe as the terminator! (Not recommended in real code.)
bfs'' :: (a -> Bool) -> (a -> [a]) -> [a] -> Maybe a
bfs'' predf expandf = look
  where look = listToMaybe *>* find predf *|* (look . concatMap expandf)
(*>*) = liftM2 (>>)
(*|*) = liftM2 (<|>)
infixl 1 *>*
infixl 3 *|*
How does this work? Well, it's a joke. As a hint, the definition of look is the same as:
  where look xs = listToMaybe xs >>
                  (find predf xs <|> look (concatMap expandf xs))
We produce the results list (queue) in steps. On each step we consume what we have produced on the previous step. When the last expansion step added nothing, we stop:
bfs :: (a -> Bool) -> (a -> [a]) -> [a] -> Maybe a
bfs predf expandf xs = find predf queue
  where
    queue = xs ++ gen (length xs) queue                -- start the queue with `xs`, and
    gen 0 _ = []                                       -- when nothing in queue, stop;
    gen n q = let next = concatMap expandf (take n q)  -- take n elements from queue,
              in  next ++                              -- process, enqueue the results,
                  gen (length next) (drop n q)         -- advance by `n` and continue
Thus we get
~> bfs (==3) (\x -> if x<1 then [] else [x/2, x/5]) [5, 3*2**8]
Just 3.0
~> bfs (==3) (\x -> if x<1 then [] else [x/2, x/5]) [5, 2**8]
Nothing
One potentially serious flaw in this solution is that if any expandf step produces an infinite list of results, it will get stuck calculating its length, totally needlessly so.
In general, just introduce a counter and increment it by the length of solutions produced at each expansion step (length . concatMap expandf or something), decrementing by the amount that was consumed. When it reaches 0, do not attempt to consume anything anymore because there's nothing to consume at that point, and you should instead terminate.
This counter serves in effect as a pointer back into the queue being constructed. A value of n indicates that the place where the next result will be placed is n notches ahead of the place in the list from which the input is taken. 1 thus means that the next result is placed directly after the input value.
The following code can be found in Wikipedia's article about corecursion (search for "corecursive queue"):
data Tree a b = Leaf a | Branch b (Tree a b) (Tree a b)
bftrav :: Tree a b -> [Tree a b]
bftrav tree = queue
  where
    queue = tree : gen 1 queue                          -- have one value in queue from the start
    gen 0 _ = []
    gen len (Leaf _ : s) = gen (len-1) s                -- consumed one, produced none
    gen len (Branch _ l r : s) = l : r : gen (len+1) s  -- consumed one, produced two
This technique is natural in Prolog with top-down list instantiation and logical variables which can be explicitly in a not-yet-set state. See also tailrecursion-modulo-cons.
gen in bfs can be re-written to be more incremental, which is usually a good thing to have:
    gen 0 _ = []
    gen n (y:ys) = let next = expandf y
                   in  next ++ gen (n - 1 + length next) ys
bfsList is defined recursively, which is not in itself a problem in Haskell. It does, however, produce an infinite list, which, again, isn't in itself a problem, because Haskell is lazily evaluated.
As long as find eventually finds what it's looking for, it's not an issue that there's still an infinity of elements, because at that point evaluation stops (or, rather, moves on to do other things).
AFAICT, the problem in the second case is that the predicate is never matched, so bfsList just keeps producing new elements, and find keeps on looking.
After all the terms have been expanded to [] concatMap will never return another item
Are you sure that's the correct diagnosis? As far as I can tell, with the lambda expressions supplied above, each input element always expands to two new elements - never to []. The list is, however, infinite, so if the predicate goes unmatched, the function will evaluate forever.
Could Haskell be made smart enough to realize the list generation is blocked reading the self-reference and terminate the list?
It'd be nice if there was a general-purpose algorithm to determine whether or not a computation would eventually complete. Alas, as both Turing and Church (independently of each other) proved in 1936, such an algorithm can't exist. This is also known as the Halting problem. I'm not a mathematician, though, so I may be wrong, but I think it applies here as well...
The best replacement I've been able to come up with isn't quite as elegant
Not sure about that one... If I try to use it instead of the other definition of bfsList, it doesn't compile... Still, I don't think the problem is the empty list.
It's possible to create infinite, circular lists using let rec, without needing to resort to mutable references:
let rec xs = 1 :: 0 :: xs ;;
But can I use this same technique to write a function that receives a finite list and returns an infinite, circular version of it? I tried writing
let rec cycle xs =
let rec result = go xs and
go = function
| [] -> result
| (y::ys) -> y :: go ys in
result
;;
But got the following error
Error: This kind of expression is not allowed as right-hand side of `let rec'
Your code has two problems:
result = go xs is an illegal form for let rec
The function tries to create a loop by some computation, which falls into an infinite loop causing stack overflow.
The above code is rejected by the compiler because you cannot write an expression which may cause recursive computation in the right-hand side of let rec (see Limitations of let rec in OCaml).
Even if you fix the issue you still have a problem: cycle does not finish the job:
let rec cycle xs =
  let rec go = function
    | [] -> go xs
    | y::ys -> y :: go ys
  in
  go xs;;
cycle [1;2];;
cycle [1;2] fails due to stack overflow.
In OCaml, let rec can define a looped structure only when its definition is "static" and does not perform any computation. let rec xs = 1 :: 0 :: xs is such an example: (::) is not a function but a constructor, which purely constructs the data structure. On the other hand, cycle performs some code execution to dynamically create a structure and it is infinite. I am afraid that you cannot write a function like cycle in OCaml.
If you want to introduce loops in data, like cycle does, what you can do in OCaml is use a lazy structure to prevent immediate infinite loops, like Haskell's lazy list, or use mutation to close the loop by assignment. OCaml's list is neither lazy nor mutable, therefore you cannot write a function that dynamically constructs looped lists.
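As a rough sketch of the lazy option, the code below uses the standard Seq module (available since OCaml 4.07), in which every tail is a function that is only run on demand. The names cycle_seq and take are invented for this sketch, and Seq delays computation without memoizing it, so this is only one of several possible delayed encodings:

```ocaml
(* cycle_seq is a hypothetical name. Each tail is a suspended computation,
   so wrapping around to xs does not recurse until an element is requested. *)
let cycle_seq (xs : 'a list) : 'a Seq.t =
  if xs = [] then invalid_arg "cycle_seq";
  let rec go rest () =
    match rest with
    | [] -> go xs ()                    (* end of the list: start over *)
    | y :: ys -> Seq.Cons (y, go ys)    (* go ys is not run yet *)
  in
  go xs

(* Take the first n elements of a sequence as an ordinary list. *)
let rec take n (s : 'a Seq.t) =
  if n <= 0 then []
  else
    match s () with
    | Seq.Nil -> []
    | Seq.Cons (x, rest) -> x :: take (n - 1) rest
```

With these definitions, take 5 (cycle_seq [1; 2]) evaluates to [1; 2; 1; 2; 1] without overflowing the stack.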
If you do not mind using black magic, you could try this code:
let cycle l =
if l = [] then invalid_arg "cycle" else
let l' = List.map (fun x -> x) l in (* copy the list *)
let rec aux = function
| [] -> assert false
| [_] as lst -> (* find the last cons cell *)
(* and set the last pointer to the beginning of the list *)
Obj.set_field (Obj.repr lst) 1 (Obj.repr l')
| _::t -> aux t
in aux l'; l'
Please be aware that using the Obj module is highly discouraged. On the other hand, there are industrial-strength programs and libraries (Coq, Jane Street's Core, Batteries included) that are known to use this sort of forbidden art.
camlspotter's answer is good enough already. I just want to add several more points here.
First of all, regarding the problem of writing a function that receives a finite list and returns an infinite, circular version of it: such a function can be written and will compile, but if you actually call it, it will overflow the stack and never return.
A simple version of what you were trying to do is like this:
let rec circle1 xs = List.rev_append (List.rev xs) (circle1 xs)
val circle1 : 'a list -> 'a list = <fun>
It can be compiled and theoretically it is correct. On [1;2;3], it is supposed to generate [1;2;3;1;2;3;1;2;3;1;2;3;...].
However, it will of course fail, because it recurses endlessly and eventually overflows the stack.
So why does let rec circle2 = 1::2::3::circle2 work?
Let's see what will happen if you do it.
First, circle2 is a value, and it is a list. Once OCaml knows this, it can allocate a static address for circle2 with the memory representation of a list.
The actual value in memory is 1::2::3::circle2, which is really Node (1, Node (2, Node (3, circle2))), i.e., a Node with int 1 and the address of a Node with int 2 and the address of a Node with int 3 and the address of circle2. But we already know circle2's address, right? So OCaml just puts circle2's address there.
Everything will work.
This example also shows that an infinite circular list defined like this only costs a bounded amount of memory. It does not generate a real infinite list that consumes all memory; instead, when a cycle finishes, it just jumps "back" to the head of the list.
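The "jump back" can be observed directly with physical equality (==), which compares addresses rather than contents. In the sketch below, back_at_head, take and prefix are illustrative names:

```ocaml
let rec circle2 = 1 :: 2 :: 3 :: circle2

(* Walking three cells forward lands on the very same cell we started from:
   (==) is physical equality, i.e. same address. *)
let back_at_head = List.tl (List.tl (List.tl circle2)) == circle2

(* Reading a finite prefix is safe. *)
let rec take n l =
  if n <= 0 then []
  else
    match l with
    | [] -> []
    | x :: rest -> x :: take (n - 1) rest

let prefix = take 7 circle2
(* prefix = [1; 2; 3; 1; 2; 3; 1] *)
```

Functions that traverse the whole list, such as List.length, or structural comparison with = on circle2 itself, would never return on this value.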
Let's go back to the circle1 example. circle1 is a function; yes, it has an address, but we do not need or want it. What we want is the address of the function application circle1 xs. Unlike circle2, this is a function application, which means we need to compute something to get the address. So,
OCaml will do List.rev xs, then try to get the address of circle1 xs, then repeat, and repeat.
OK, then why do we sometimes get Error: This kind of expression is not allowed as right-hand side of 'let rec'?
From http://caml.inria.fr/pub/docs/manual-ocaml/extn.html#s%3aletrecvalues
the let rec binding construct, in addition to the definition of
recursive functions, also supports a certain class of recursive
definitions of non-functional values, such as
let rec name1 = 1 :: name2 and name2 = 2 :: name1 in expr which
binds name1 to the cyclic list 1::2::1::2::…, and name2 to the cyclic
list 2::1::2::1::… Informally, the class of accepted definitions
consists of those definitions where the defined names occur only
inside function bodies or as argument to a data constructor.
If you use let rec to define a binding, say let rec name, then on the right-hand side this name may only appear inside a function body or as an argument to a data constructor.
In the previous two examples, circle1 is in a function body (let rec circle1 = fun xs -> ...) and circle2 is in a data constructor.
If you write let rec circle = circle, you get an error because circle is in neither of the two allowed positions. let rec x = let y = x in y won't work either because, again, x is not inside a constructor or a function.
Here is also a clear explanation:
https://realworldocaml.org/v1/en/html/imperative-programming-1.html
Section Limitations of let rec
Here's what I've got so far...
fun positive l1 = positive(l1,[],[])
| positive (l1, p, n) =
if hd(l1) < 0
then positive(tl(l1), p, n @ [hd(l1])
else if hd(l1) >= 0
then positive(tl(l1), p @ [hd(l1)], n)
else if null (h1(l1))
then p
Yes, this is for my educational purposes. I'm taking an ML class in college and we had to write a program that would return the biggest integer in a list and I want to go above and beyond that to see if I can remove the positives from it as well.
Also, if possible, can anyone point me to a decent ML book or primer? Our class text doesn't explain things well at all.
You fail to mention that your code doesn't type-check.
Your first function clause just has the variable l1, which is used in the recursive call. However, there it is used as the first element of the triple that is given as the argument. This doesn't go hand in hand with the Hindley–Milner type system that SML uses. This is perhaps better seen through the following informal reasoning:
Let's start by assuming that l1 has the type 'a, and thus the function must take arguments of that type and return something unknown, 'a -> .... However, on the right-hand side you create an argument (l1, [], []), which must have the type 'a * 'b list * 'c list. But since it is passed as an argument to the function, that must also mean that 'a is equal to 'a * 'b list * 'c list, which clearly is not the case.
Clearly this was not your original intent. It seems that your intent was to have a function that takes a list as argument, and at the same time a recursive helper function that takes two extra accumulator arguments, namely the lists of positive and negative numbers from the original list.
To do this, you at least need to give your helper function another name, such that its definition won't rebind the definition of the original function.
Then you have some options as to which scope this helper function should be in. In general, if it doesn't make any sense to call this helper function other than from the "main" function, then it should not be placed in a scope outside the "main" function. This can be done using a let binding like this:
fun positive xs =
let
fun positive' ys p n = ...
in
positive' xs [] []
end
This way the helper function positive' can't be called outside of the positive function.
With this taken care of, there are some more issues with your original code.
Since you are only returning the list of positive integers, there is no need to keep track of the
negative ones.
You should be using pattern matching to decompose the list elements. This way you eliminate the
use of taking the head and tail of the list, and also the need to verify whether there actually is
a head and tail in the list.
fun foo [] = ... (* input list is empty *)
| foo (x::xs) = ... (* x is now the head, and xs is the tail *)
You should not use the append operator (@) whenever you can avoid it (which you always can).
The problem is that it has a terrible running time when you have a huge list on the left hand
side and a small list on the right hand side (which is often the case for the right hand side, as
it is mostly used to append a single element). Thus it should in general be considered bad
practice to use it.
However there exists a very simple solution to this, which is to always concatenate the element
in front of the list (constructing the list in reverse order), and then just reversing the list
when returning it as the last thing (making it in expected order):
fun foo [] acc = rev acc
| foo (x::xs) acc = foo xs (x::acc)
Given these small notes, we end up with a function that looks something like this
fun positive xs =
let
fun positive' [] p = rev p
| positive' (y::ys) p =
if y < 0 then
positive' ys p
else
positive' ys (y :: p)
in
positive' xs []
end
Have you learned about List.filter? It might be appropriate here - it takes a function (which is a predicate) of type 'a -> bool and a list of type 'a list, and returns a list consisting of only the elements for which the predicate evaluates to true. For example:
List.filter (fn x => Real.>= (x, 0.0)) [1.0, 4.5, ~3.4, 42.0, ~9.0]
Your existing code won't work because you're comparing to integers using the int version of <. The code hd(l1) < 0 will work over a list of int, not a list of real. Numeric literals are not automatically coerced by Standard ML. One must explicitly write 0.0, and use Real.< (hd(l1), 0.0) for your test.
If you don't want to use filter from the standard library, you could consider how one might implement filter yourself. Here's one way:
fun filter f [] = []
| filter f (h::t) =
if f h
then h :: filter f t
else filter f t