Stack overflow with recursive function - list

I'm trying to make a recursive function where I pass in an integer and a list. I want to append a certain number of "-" (dashes) to the list if the length of the list is less than the integer, as follows:
let rec dashes (longest, l1) =
  if length l1 = longest then l1
  else ["-"]@l1@dashes(longest,l1);;
However I get a stack overflow and I'm not sure why.

Your recursive call to dashes passes through the original l1 argument, so the list length never grows, and the terminating condition remains false.
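One possible fix, as a minimal sketch: pass a longer list to the recursive call, so that each call moves closer to the terminating condition. Using `>=` instead of `=` is my own choice here, so that the function also stops when the list is already longer than `longest`.

```ocaml
(* Prepend "-" until the list is at least [longest] elements long.
   The list passed to the recursive call is one element longer each
   time, so the recursion terminates. *)
let rec dashes (longest, l1) =
  if List.length l1 >= longest then l1
  else dashes (longest, "-" :: l1)

let () =
  (* Pads ["a"; "b"] to length 4 and prints the elements. *)
  print_endline (String.concat " " (dashes (4, ["a"; "b"])))
```

Since the recursive call is the last thing the function does, it is also a tail call, so long paddings do not grow the stack.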

Related

How can I calculate the length of a list containing lists in OCaml

I am a beginner in OCaml and I am stuck on my project.
I would like to count the number of elements in each list contained in a list,
then test whether each of those lists has an odd or even length.
let listoflists = [[1;2] ; [3;4;5;6] ; [7;8;9]]
output
l1 = even
l2 = even
l3 = odd
The problem is that
List.tl listoflists
gives the length of the rest of the list,
so 2.
-> How can I calculate the length of the lists one by one?
-> Or how could I get the lists and put them one by one in a variable?
I have already written the odd/even function.
Tell me if I'm not clear,
and thank you for your help.
Unfortunately it's not really possible to help you very much because your question is unclear. Since this is obviously a homework problem I'll just make a few comments.
Since you talk about putting values in variables you seem to have some programming experience. But you should know that OCaml code tends to work with immutable variables and values, which means you have to look at things differently. You can have variables, but they will usually be represented as function parameters (which indeed take different values at different times).
If you have no experience at all with OCaml it is probably worth working through a tutorial. The OCaml.org website recommends the first 6 chapters of the OCaml manual. In the long run this will probably get you up to speed faster than asking questions here.
You ask how to do a calculation on each list in a list of lists. But you don't say what the answer is supposed to look like. If you want separate answers, one for each sublist, the function to use is List.map. If instead you want one cumulative answer calculated from all the sublists, you want a fold function (like List.fold_left).
You say that List.tl calculates the length of a list, or at least that's what you seem to be saying. But that's not the case: List.tl returns all but the first element of a list. The length of a list is calculated by List.length.
If you give a clearer definition of your problem and particularly the desired output you will get better help here.
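To make the map/fold distinction above concrete, here is a small sketch using the list from the question. The names `lengths` and `total` are only illustrative.

```ocaml
let listoflists = [[1; 2]; [3; 4; 5; 6]; [7; 8; 9]]

(* One answer per sublist: map each sublist to its length. *)
let lengths = List.map List.length listoflists

(* One cumulative answer: fold over the sublists, summing their lengths. *)
let total = List.fold_left (fun acc l -> acc + List.length l) 0 listoflists

let () =
  List.iter (Printf.printf "%d ") lengths;
  Printf.printf "\ntotal = %d\n" total
```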
Use List.iter f xs to apply function f to each element of the list xs.
Use List.length to compute the length of each list.
Even numbers are evenly divisible by two, so if you divide an even number by two the remainder will be zero. Use the mod operator to get the remainder of the division. Alternatively, you can rely on the fact that in the binary representation the odd numbers always end with 1, so you can use land (bitwise and) to test the least significant bit.
If you need to refer to the position of the list element, use List.iteri f xs. The List.iteri function will apply f to two arguments, the first will be the position of the element (starting from 0) and the second will be the element itself.
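Putting these hints together, a possible sketch for the list in the question could look like this. The helper names `parity` and `parity_bit` are made up for the example, and the `l1`, `l2`, ... labels follow the output format shown in the question.

```ocaml
let listoflists = [[1; 2]; [3; 4; 5; 6]; [7; 8; 9]]

(* Parity of a list's length, using the remainder of division by two. *)
let parity l = if List.length l mod 2 = 0 then "even" else "odd"

(* The same test, using the least significant bit instead. *)
let parity_bit l = if List.length l land 1 = 0 then "even" else "odd"

let () =
  (* List.iteri passes the 0-based position, so add 1 for the label. *)
  List.iteri
    (fun i l -> Printf.printf "l%d = %s\n" (i + 1) (parity l))
    listoflists
```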

OCaml update value through iteration of a list

I was wondering how I can update the value of a variable while iterating over a list. For example, let's say I want to keep track of the number of elements in a list. I could do something like
let list = [1;2;3;4;5]
let length = 0 in
let getCount elmt =
  length = length+1 in
List.iter getCount list
but I get the error This expression has type 'a -> bool which makes sense because at length = length+1 I am comparing using =. How should I update the value of length?
EDIT:
I tried
let wordMap =
  let getCount word =
    StringMap.add word (1000) wordMap in
  List.fold_left getCount StringMap.empty wordlist;;
but it doesn't know what wordMap is in getCount function...
@PatJ gives a good discussion. But in real code you would just use a fold. The purpose of a fold is precisely what you ask for, to maintain some state (of any type you like) while traversing a list.
Learning to think in terms of folds is a basic skill in functional programming, so it's worth learning.
Once you're good at folds, you can decide on a case-by-case basis whether you need mutable state. In almost all cases you don't.
(You can definitely use a fold to accumulate a map while traversing a list.)
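As a rough sketch of what that looks like: the folded function takes the state accumulated so far as its first argument and returns the new state. Here `StringMap` is assumed to be `Map.Make (String)`, and `count_elements` and `build_word_map` are names invented for the example.

```ocaml
module StringMap = Map.Make (String)

(* Counting elements: the state is an int, incremented once per element. *)
let count_elements xs = List.fold_left (fun n _ -> n + 1) 0 xs

(* Accumulating a map: the state is the map built so far. Each step
   returns a new map with one more binding; nothing is mutated. *)
let build_word_map words =
  List.fold_left
    (fun acc word -> StringMap.add word 1000 acc)
    StringMap.empty words

let () =
  Printf.printf "%d elements\n" (count_elements [1; 2; 3; 4; 5]);
  let m = build_word_map ["foo"; "bar"] in
  Printf.printf "foo -> %d\n" (StringMap.find "foo" m)
```

Compared with the attempt in the question's edit, the difference is that the map is threaded through as the accumulator argument instead of being referred to by the name that is still being defined.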
You have several ways to do this.
The simpler way (often preferred by beginners coming from the imperative world) is to use a reference. A reference is a variable you can legally mutate.
let length l =
  let count = ref 0 in
  let getCount _ = (* _ means we ignore the argument *)
    count := !count + 1
  in
  List.iter getCount l;
  !count
As you can see here, !count returns the value currently in the reference and := allows you to do the imperative update.
But you should not write that code
Yeah, I'm using bold, this is how serious I am about it. Basically, you should avoid using references when you can rely on pure functional programming, that is, when there are no side effects.
So how do you modify a variable when you are not allowed to? That's where recursion comes in. Check this:
let rec length l =
  match l with
  | [] -> 0
  | _::tl -> 1 + length tl
In that code, we no longer have a count variable. Don't worry, we'll get it back soon. But you can see that just by calling length again, we can assign a new value tl to the argument l. Yet it is pure and considered a better practice.
Well, almost.
The last code has the problem of recursion: each call will add (useless) data to the stack and, after being through the list, will do the additions. We don't want that.
However, function calls can be optimized if they are tail calls. As Wikipedia puts it:
a tail call is a subroutine call performed as the final action of a procedure.
In the latter code, the recursive call to length isn't a tail call, as + is the final action of the function. The usual trick is to use an accumulator to store the intermediate results. Let's call it count.
let length l =
  (* count is the accumulator: the number of elements seen so far *)
  let rec length_iterator count l =
    match l with
    | [] -> count
    | _::tl -> length_iterator (count+1) tl
  in
  length_iterator 0 l
And now we have neat, pure, and easy-to-optimize code that calculates the length of your list.
So, to answer the question as stated in the title: iterate with a (tail-)recursive function and have the updatable variables as arguments of this function.
If the goal is to get the length of the list, just use the function provided by the List module: List.length. Otherwise, note that let-bound variables in OCaml are immutable, so what you're trying to do is not allowed and is not functional in style. If you really have to update a value, consider using a ref (for more info: http://www.cs.cornell.edu/courses/cs3110/2011sp/recitations/rec10.htm).

Does using Haskell's (++) operator to append to list cause multiple list traversals?

Does appending to a Haskell list with (++) cause lists to be traversed multiple times?
I tried a simple experiment in GHCI.
The first run:
$ ghci
GHCi, version 7.8.4: http://www.haskell.org/ghc/ :? for help
Prelude> let t = replicate 9999999 'a' ++ ['x'] in last t
'x'
(0.33 secs, 1129265584 bytes)
The second run:
$ ghci
GHCi, version 7.8.4: http://www.haskell.org/ghc/ :? for help
Prelude> let t = replicate 9999999 'a' in last t
'a'
(0.18 secs, 568843816 bytes)
The only difference is the ++ ['x'] to append a last element to a list. It causes the runtime to increase from .18s to .33s, and the memory to increase from 568MB to 1.12GB.
So it seems that indeed it does cause multiple traversals. Can someone confirm on more theoretical grounds?
You can't conclude from these numbers whether the first run does two traversals, or one traversal in which each step takes more time and allocates more memory than the single traversal in the second run.
In fact, it's the latter that is happening here. You can think of the two evaluations like this:
In the second expression let t = replicate 9999999 'a' in last t, in each step but the last one, last evaluates its argument, which causes replicate to allocate a cons cell and decrement a counter, and then the cons cell is consumed by last.
In the first expression let t = replicate 9999999 'a' ++ ['x'] in last t, in each step but the last one, last evaluates its argument, which causes (++) to evaluate its first argument, which causes replicate to allocate a cons cell and decrement a counter, and then that cons cell is consumed by (++) and (++) allocates a new cons cell, and then that new cons cell is consumed by last.
So the first expression is still a single traversal, it's just one that does more work per step.
Now if you wanted to you could divide up all this work into "the work done by last" and "the work done by (++)" and call those two "traversals"; and that can be a useful approach for understanding the total amount of work done by your program. But due to Haskell's laziness, the two "traversals" are really interleaved as described above, so most people would say that the list is traversed just once.
I'd like to talk a bit about what happens when we enable optimizations, because it can transform the performance characteristics of the program pretty radically. I'll be looking at the Core output produced by ghc -O2 Main.hs -ddump-simpl -dsuppress-all. Also, I run the compiled programs with +RTS -s to get info about memory usage and running time.
With GHC 7.8.4 the two versions of the code run in the same amount of time and with the same amount of heap allocation. That's because replicate 9999999 'a' and ++ ['x'] are replaced with a call genlist 9999999, where genlist looks like the following (not exactly the same, as I translate liberally from the original Core):
genlist :: Int -> [Char]
genlist n | n <= 1    = "ax"
          | otherwise = 'a' : genlist (n - 1)
Since we do generation and concatenation in a single step, we allocate each list cell just once.
With GHC 7.10.1, we get fancy new optimizations for list processing. Now both of our programs allocate about as much memory as a print $ "Hello World" program (about 52 Kb on my machine). This is because we skip the list creation entirely. Now last is fused away too; we instead get a call getlast 9999999, with getlast being:
getlast :: Int -> Char
getlast 1 = 'x'
getlast n = getlast (n - 1)
In the executable we'll have a small machine code loop that counts down from 9999999 to 1. GHC is not quite smart enough to skip all computation and go straight to returning 'x', but it does a good job nevertheless, and in the end it gives us something rather different to the original code.

How to split a list into a list of lists by removing a specific separation (Haskell)

I'm a newbie to Haskell and I have a problem. I need to write a function that splits a list into a list of lists everywhere a 'separation' appears.
I will try to help you develop the understanding of how to develop functions that work on lists via recursion. It is helpful to learn how to do it first in a 'low-level' way so you can understand better what's happening in the 'high-level' ways that are more common in real code.
First, you must think about the nature of the type of data that you want to work with. The list is in some sense the canonical example of a recursively-defined type in Haskell: a list is either the empty list [] or it is some list element a combined with a list via a : list. Those are the only two possibilities. We call the empty list the base case because it is the one that does not refer to itself in its definition. If there were no base case, recursion would never "bottom out" and would continue indefinitely!
The fact that there are two cases in the definition of a list means that you must consider two cases in the definition of a function that works with lists. The canonical way to consider multiple cases in Haskell is pattern matching. Haskell syntax provides a number of ways to do pattern matching, but I'll just use the basic case expression for now:
case xs of
  []    -> ...
  x:xs' -> ...
Those are the two cases one must consider for a list. The first matches the literal empty list constructor; the second matches the element-adding constructor : and also binds two variables, x and xs', to the first element in the list and the sublist containing the rest of the elements.
If your function was passed a list that matches the first case, then you know that either the initial list was empty or that you have completed the recursion on the list all the way past its last element. Either way, there is no more list to process; you are either finished (if your calls were tail-recursive) or you need to pass the basic element of your answer construction back to the function that called this one (by returning it). In the case that your answer will be a list, the basic element will usually be the empty list again [].
If your function was passed a list that matches the second case, then you know that it was passed a non-empty list, and furthermore you have a couple of new variables bound to useful values. Based on these variables, you need to decide two things:
How do I do one step of my algorithm on that one element, assuming I have the correct answer from performing it on the rest of the list?
How do I combine the results of that one step with the results of performing it on the rest of the list?
Once you've figured the answers to those questions, you need to construct an expression that combines them; getting the answer for the rest of the list is just a matter of invoking the recursive call on the rest of the list, and then you need to perform the step for the first element and the combining.
Here's a simple example that finds the length of a list:
listLength :: [a] -> Int
listLength as =
  case as of
    []    -> 0                  -- The empty list has a length of 0
    a:as' -> 1 + listLength as' -- If not empty, the length is one more than the
                                -- length of the rest of the list
Here's another example that removes matching elements from a list:
listFilter :: Int -> [Int] -> [Int]
listFilter x ns =
  case ns of
    []    -> []                            -- base element to build the answer on
    n:ns' -> if n == x
               then listFilter x ns'       -- don't include n in the result list
               else n : (listFilter x ns') -- include n in the result list
Now, the question you asked is a little bit more difficult, as it involves a secondary 'list matching' recursion to identify the separator within the basic recursion on the list. It is sometimes helpful to add extra parameters to your recursive function in order to hold extra information about where you are at in the problem. It's also possible to pattern match on two parameters at the same time by putting them in a tuple:
case (xs, ys) of
  ([]   , []   ) -> ...
  (x:xs', []   ) -> ...
  ([]   , y:ys') -> ...
  (x:xs', y:ys') -> ...
Hopefully these hints will help you to make some progress on your problem!
Let's see if the problem can be reduced in an obvious way.
Suppose splitList is called with xs to split and ys as the separator. If xs is empty, the problem is the smallest, so what's the answer to that problem? It is important to have the right answer here, because the inductive solution depends on this decision. But we can make this decision later.
OK, so for the problem to be reducible, the list xs must not be empty. So it has at least a head element h and the smaller problem t, the tail of the list: you can match xs@(h:t). How do we obtain the solution to the smaller problem? Well, splitList can solve that problem by the definition of the function. So now the trick is to figure out how to build the solution for the bigger problem (h:t) when we know the solution to the smaller problem, zs = splitList t ys. Here we know that zs is a list of lists, [[a]], and because t may have been the smallest problem, zs may well be the solution to the smallest problem. So, whatever you do with zs, it must be valid even for the solution to the smallest problem.
splitList [] ys = ... -- some constant is the solution to the smallest problem
splitList xs@(h:t) ys = let zs = splitList t ys
                        in ... -- build a solution to (h:t) from solution to t
I don't know how to test it. Can anybody tell me how to write a function in a .hs file and use WinGHCi to run it?
WinGHCi automatically associates with .hs files, so just double-click on the file and ghci should start up. After making some changes to the file using your favourite editor, you can use the :r command in ghci to reload the file.
To test the program after fixing typos, type-errors, and ensuring correct indentation, try calling functions you have defined with different inputs (or use QuickCheck). Note Maybe is defined as Just x or Nothing. You can use fromMaybe to extract x (and provide default value for the Nothing case).
Also try to make sure that pattern matching is exhaustive.

Stack overflow when generating large sequence of letters in OCaml

Given an alphabet ["a"; "b"; "c"] I want to dump all sequences of length 25 to a file. (Letters can repeat in a sequence; it's not a permutation.) The problem is, I get a Stack overflow during evaluation (looping recursion?) when I try using the following code:
let addAlphabetToPrefix alphabet prefix =
  List.map (function letter -> (prefix ^ letter)) alphabet;;
let rec generateWords alphabet counter words =
  if counter > 25 then
    words
  else
    let newWords = List.flatten(List.map (function word -> addAlphabetToPrefix alphabet word) words) in
    generateWords alphabet (counter + 1) newWords;;
generateWords ["a"; "b"; "c"] 0 [""];; (* Produces a stack overflow. *)
Is there a better way of doing this? I was thinking of generating the entire list first, and then dumping the entire list to a file, but do I have to repeatedly generate partial lists and then dump them? Would making something lazy help?
Why exactly is a stack overflow occurring? AFAICT, my generateWords function is tail-recursive. Is the problem that the words list I'm generating is getting too big to fit into memory?
Your functions are being compiled as tail calls. I confirmed this from the linearized code, obtained with the -dlinear option of the native compiler, ocamlopt[.opt].
The fact of the matter is that your heap is growing exponentially, and length 25 is unsustainable with this method: a 3-letter alphabet has 3^25 (about 8.5 × 10^11) sequences of that length. Trying with 11 works fine (and is the highest I could deal with).
Yes, there is a better way to do this. You can generate the combinations by looking up the index of the combination in lexicographical order, or by using grey codes. These would only require storage for one word, can be run in parallel, and will never cause a segmentation fault. You might overflow the integer when using the index method, though, in which case you can switch to big integers at the cost of speed, or to grey codes (which may be difficult to parallelize, depending on the grey code).
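A minimal sketch of the index method, under a few assumptions of mine: the alphabet is given as a char array rather than a string list, and `word_of_index` and `iter_words` are names invented for this example. The idea is to read the index as a number in base n, where n is the alphabet size, so only one word is in memory at a time.

```ocaml
(* Build the word at position [index] (0-based, lexicographic order)
   by reading [index] as a base-n number, least significant digit last. *)
let word_of_index alphabet len index =
  let n = Array.length alphabet in
  let buf = Bytes.make len alphabet.(0) in
  let rec fill pos i =
    if pos >= 0 then begin
      Bytes.set buf pos alphabet.(i mod n);
      fill (pos - 1) (i / n)
    end
  in
  fill (len - 1) index;
  Bytes.to_string buf

(* Call [f] on every word of length [len], one word at a time. *)
let iter_words alphabet len f =
  let n = Array.length alphabet in
  let rec pow b e = if e = 0 then 1 else b * pow b (e - 1) in
  for i = 0 to pow n len - 1 do
    f (word_of_index alphabet len i)
  done

let () =
  (* Small demonstration with length 2; for a file, pass a function
     that writes each word to an output channel instead. *)
  iter_words [| 'a'; 'b'; 'c' |] 2 print_endline
```

On a 64-bit system 3^25 still fits in OCaml's native int, so this particular case does not need big integers, though iterating over that many words will of course take a long time and produce a very large file.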
OCaml optimizes tail recursion, so your code should work, except: the standard library's List.map function is, unfortunately, not tail-recursive. The stack overflow is potentially occurring in one of those calls, as your lists get rather large.
Batteries Included and Jane Street's Core library both provide tail-recursive versions of map. Try one of those and see if it fixes the problem.
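If you would rather stay with the standard library, a common workaround is to build tail-recursive variants on top of List.rev_map and List.rev_append, both of which are tail-recursive. This is only a sketch; `map_tr` and `concat_map_tr` are made-up names, not library functions.

```ocaml
(* Tail-recursive map: map into a reversed list, then reverse it back. *)
let map_tr f xs = List.rev (List.rev_map f xs)

(* Tail-recursive replacement for List.flatten (List.map f xs):
   accumulate the results in reverse, then reverse once at the end. *)
let concat_map_tr f xs =
  List.rev
    (List.fold_left (fun acc x -> List.rev_append (f x) acc) [] xs)

let () =
  let words = concat_map_tr (fun w -> [w ^ "a"; w ^ "b"]) ["x"; "y"] in
  print_endline (String.concat " " words)
```

This avoids the stack overflow, but not the exponential memory growth described in the other answer.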