Traversing a list in SML - sml

I want to make a function that traverses a list, processes the head, stops after K recursions, and also builds an identical list from the head element in each recursion. Code:
fun trav (0, _, list) = list
| trav(K, x::xs, list) =
trav(K - 1, xs, list@[x])
So if I call trav(4,[1,2,3,4,5,6],[])
I expect
list =[1] ,K=3
=[1,2] ,K=2
=[1,2,3] ,K=1
=[1,2,3,4] ,K=0
However, for very large inputs, list@[x] seems to crash my program (I am not sure why). If I use (x :: list) instead, which gives a different (but same-sized) list at each step, everything works fine. Why is this happening? How could I implement list@[x] using the cons operator?

list@[x] has to traverse the entirety of list and copy it by consing it, element by element, onto [x], which is very inefficient.
The conventional solution is to build the result in reverse, and then reverse it to the desired order when you're done:
fun trav (0, _, list) = List.rev list
| trav (K, x::xs, list) = trav (K-1, xs, x::list)
This may seem inefficient, but it is actually much more efficient than the appending version.
(It has linear time complexity rather than quadratic, in case that means anything to you.)

[...] for very large inputs list@[x] seems to crash my program [...] and if i use x :: list instead giving a different (but same sized) list as a result in each step everything works ok.
Why is this happening?
list@[x] exhausts your program's stack memory because the @ operator is not tail-recursive.
When list is very long, it builds an expression like so:
[a,b,c,d,e,f,...] @ [z]
~> a :: [b,c,d,e,f,g,...] @ [z]
~> a :: b :: [c,d,e,f,g,...] @ [z]
~> ...
~> a :: b :: ... :: [] @ [z]
~> a :: b :: ... :: z :: []
All of these intermediate list elements are kept on the recursive call stack, which your program eventually runs out of. On top of that, this expensive computation is repeated for each element in the list, for a total time cost of O(n²).

Related

How can I implement Tail Recursion on list in Haskell to generate sublists?

I want to make the biggersort function recall itself recursively on the tail of the list. My code below works and gives me the following output :
[6,7,8]
I want to continue by starting next with 3 then 1 .. to the last element.
What I want is something like :
[[6,7,8],[3,5,7,8],[1,5,7,8]..]
My code:
import Data.List
import System.IO

biggersort :: Ord a => [a] -> [a]
biggersort [] = []
biggersort (p:xs) = [p] ++ (biggersort greater)
  where
    greater = filter (>= p) xs

main = do
  print $ biggersort [6,3,1,5,2,7,8,1]
You can access the list starting from each position using tails, so that a particularly concise implementation of the function you want in terms of the function you have is:
biggersorts = map biggersort . tails
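As a self-contained sketch of that one-liner (biggersort is repeated from the question in its cons form, and the Int annotation in main is only there to fix the element type), note that tails also yields the empty suffix, so the result ends with an extra []:

```haskell
import Data.List (tails)

-- The question's function: keep the head, then only elements >= it.
biggersort :: Ord a => [a] -> [a]
biggersort []     = []
biggersort (p:xs) = p : biggersort (filter (>= p) xs)

-- Apply biggersort to every suffix of the input, longest first.
biggersorts :: Ord a => [a] -> [[a]]
biggersorts = map biggersort . tails

main :: IO ()
main = print (biggersorts [6,3,1,5,2,7,8,1 :: Int])
-- prints [[6,7,8],[3,5,7,8],[1,5,7,8],[5,7,8],[2,7,8],[7,8],[8],[1],[]]
```

If the trailing [] is unwanted, composing with init on the left (init . map biggersort . tails) drops it.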
There is one very noticeable downside (and one less noticeable downside) to this implementation, though. The noticeable downside is that computation on shorter lists is repeated when processing longer lists, leading to O(n^2) best-case time; the less obvious downside is that there is no memory sharing between elements of the result list, leading to O(n^2) worst-case memory usage. These bounds can be improved to O(n) average-case/O(n^2) worst-case time* and O(n) worst-case memory usage.
The idea is to start from the end of the list, and build towards the front. At each step, we look at all the rest of the results to see if there's one we can reuse. So:
import Data.List (find)
import Data.Maybe (fromMaybe)

biggersorts :: Ord a => [a] -> [[a]]
biggersorts [] = [[]]
biggersorts (a:as) = (a:as') : rec where
  as' = fromMaybe [] (find bigger rec)
  rec = biggersorts as
  bigger (a':_) = a <= a'
  bigger _ = False -- True is also fine
Beware that because of the sharing, it can be tricky to compare performance of these two fairly. The usual tricks for fully evaluating the output of this function don't play super nicely with sharing; so it's tricky to write something that obviously fully evaluates the outputs but also visits each subterm O(1) times (I think it's less important that this latter property be obvious). The most obvious way to do it (to me) involves rewriting both functions using some mildly advanced techniques, so I will elide that here.
* It seems like it ought to be possible to improve this to O(n*log(n)) worst-case time. During the recursion, build a cache :: Map a [a] to go with the rec :: [[a]]; the intuition for cache is that it tells which element of rec to use at each existing "boundary" a value. Updating the cache at each step involves splitting it at the current value and throwing away the bottom half.
I find the average-case time analysis harder for this variant; is it still O(n)? It seems plausible that the Map has O(1) average size during the run of this variant, but I wasn't able to convince myself of this guess. If not, is there another variant that achieves O(n) average-case/O(n*log(n)) best-case? ...is there one that does O(n) worst-case? (Probably the same counting argument used to bound the runtime of sorting rules this out...?)
Dunno!
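For what it's worth, here is a minimal sketch of that cache idea. It is not benchmarked, and it does not settle the complexity questions above. The name biggersortsCached and the foldr formulation are assumptions of this sketch, and it uses Data.Map from the containers package. The cache maps each still-reachable head value to its result list; lookupGE finds the list to reuse, and split throws away every key below the current value:

```haskell
import qualified Data.Map as M

-- Hypothetical name; intended to produce the same output as the
-- find-based biggersorts, including the final [].
biggersortsCached :: Ord a => [a] -> [[a]]
biggersortsCached = fst . foldr step ([[]], M.empty)
  where
    step a (rec, cache) = (r : rec, M.insert a r above)
      where
        -- Smallest cached head >= a. Because smaller heads are discarded
        -- at every step, this is also the nearest such result.
        as'        = maybe [] snd (M.lookupGE a cache)
        -- Heads below a can never be picked again: a now shadows them.
        (_, above) = M.split a cache
        r          = a : as'

main :: IO ()
main = print (biggersortsCached [6,3,1,5,2,7,8,1 :: Int])
-- prints [[6,7,8],[3,5,7,8],[1,5,7,8],[5,7,8],[2,7,8],[7,8],[8],[1],[]]
```

Each step does one lookup, one split, and one insert on the Map, so the worst case is O(log n) per element.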
You can make a function that will use biggersort as a helper function, so:
biggersort :: Ord a => [a] -> [a]
biggersort [] = []
biggersort (p:xs) = p : biggersort (filter (>= p) xs)
biggersorts :: Ord a => [a] -> [[a]]
biggersorts [] = []
biggersorts xa@(_:xs) = biggersort xa : biggersorts xs
main = print (biggersorts [6,3,1,5,2,7,8,1])
This then prints:
Prelude> biggersorts [6,3,1,5,2,7,8,1]
[[6,7,8],[3,5,7,8],[1,5,7,8],[5,7,8],[2,7,8],[7,8],[8],[1]]

Generate all permutations of a list including different sizes and repeated elements

I wanted to create the function genAllSize ::[a] -> [[a]], that receives a list l and generates all the lists sorted by size that can be built with the elements of the list l; i.e.
> genAllSize [2,4,8]
[[],[2],[4],[8],[2,2],[4,2],[8,2],[2,4],[4,4],[8,4],[2,8],[4,8],[8,8],[2,2,2],[4,2,2],[8,2,2], ...
How would you do it? I came up with a solution using permutations from Data.List but I do not want to use it.
Given an input list xs, select a prefix of that in a non deterministic way
For each element in the prefix, replace it with any element of xs, in a non deterministic way
Result:
> xs = [2,4,8]
> inits xs >>= mapM (const xs)
[[],[2],[4],[8],[2,2],[2,4],[2,8],[4,2],[4,4],[4,8],[8,2],[8,4],
[8,8],[2,2,2],[2,2,4],[2,2,8],[2,4,2],[2,4,4],[2,4,8],[2,8,2],
[2,8,4],[2,8,8],[4,2,2],[4,2,4],[4,2,8],[4,4,2],[4,4,4],[4,4,8],
[4,8,2],[4,8,4],[4,8,8],[8,2,2],[8,2,4],[8,2,8],[8,4,2],[8,4,4],
[8,4,8],[8,8,2],[8,8,4],[8,8,8]]
The other answers seem sort of complicated. I'd do it this way:
> [0..] >>= flip replicateM "abc"
["","a","b","c","aa","ab","ac","ba","bb","bc","ca","cb","cc","aaa","aab",...
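Here is the replicateM approach written out as a small module, as a sketch. The signature comes from the question; the Int annotation is just for the demo. Unlike the inits version, which stops at lists as long as the input, this one keeps going through every length, so the result is infinite and has to be consumed lazily:

```haskell
import Control.Monad (replicateM)

-- All lists over the alphabet xs, grouped by increasing length.
genAllSize :: [a] -> [[a]]
genAllSize xs = [0 ..] >>= \n -> replicateM n xs

main :: IO ()
main = print (take 13 (genAllSize [2,4,8 :: Int]))
-- prints [[],[2],[4],[8],[2,2],[2,4],[2,8],[4,2],[4,4],[4,8],[8,2],[8,4],[8,8]]
```

One caveat: for an empty input this yields [[]] and then searches forever for a next element, so guard that case separately if it can occur.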
Hmm, I guess you need a lazy infinite list of cycling subsequences. One naive way could be:
Prelude> take 100 $ nub . subsequences . cycle $ [2,4,8]
[[],[2],[4],[2,4],[8],[2,8],[4,8],[2,4,8],[2,2],[4,2],[2,4,2],[8,2],[2,8,2],[4,8,2],[2,4,8,2],[4,4],[2,4,4],[8,4],[2,8,4],[4,8,4],[2,4,8,4],[2,2,4],[4,2,4],[2,4,2,4],[8,2,4],[2,8,2,4],[4,8,2,4],[2,4,8,2,4],[8,8],[2,8,8],[4,8,8],[2,4,8,8],[2,2,8],[4,2,8],[2,4,2,8],[8,2,8],[2,8,2,8],[4,8,2,8],[2,4,8,2,8],[4,4,8],[2,4,4,8],[8,4,8],[2,8,4,8],[4,8,4,8],[2,4,8,4,8],[2,2,4,8],[4,2,4,8],[2,4,2,4,8],[8,2,4,8],[2,8,2,4,8],[4,8,2,4,8],[2,4,8,2,4,8],[2,2,2],[4,2,2],[2,4,2,2],[8,2,2],[2,8,2,2],[4,8,2,2],[2,4,8,2,2],[4,4,2],[2,4,4,2],[8,4,2],[2,8,4,2],[4,8,4,2],[2,4,8,4,2],[2,2,4,2],[4,2,4,2],[2,4,2,4,2],[8,2,4,2],[2,8,2,4,2],[4,8,2,4,2],[2,4,8,2,4,2]]
A simple and highly efficient option:
genAllSize [] = [[]]
genAllSize [a] = iterate (a:) []
genAllSize xs =
  [] : [x:q|q<-genAllSize xs,x<-xs]
(Thanks to Will Ness for a small but very nice simplification.)
This solution takes advantage of the fact that a valid solution list is either empty or an element of the argument list consed onto a shorter valid solution list. Unlike Daniel Wagner's solution, this one doesn't resort to counting. My tests suggest that it performs extremely well under typical conditions.
Why do we need a special case for a one-element list? The general case performs extremely badly for that, because it maps over the same list over and over with no logarithmic slowdown.
But what's the deal with that call to genAllSize with the very same argument? Wouldn't it be better to save the result to increase sharing?
genAllSize [] = [[]]
genAllSize xs = p
  where
    p = [] : [x:q|q<-p,x<-xs]
Indeed, on a theoretical machine with unlimited constant-time memory, this is optimal: walking the list takes worst-case O(1) time for each cons. In practice, it's only a good idea if a great many entries will be realized and retained. Otherwise, there's a problem: most of the list entries will be retained indefinitely, dramatically increasing memory residency and the amount of work the garbage collector needs to do. The non-sharing version above still offers amortized O(1) time per cons, but it needs very little memory (logarithmic rather than linear).
Examples
genAllSize "ab" =
["","a","b","aa","ba"
,"ab","bb","aaa","baa"
,"aba","bba","aab","bab"
,"abb","bbb","aaaa",...]
genAllSize "abc" =
["","a","b","c","aa","ba"
,"ca","ab","bb","cb","ac"
,"bc","cc","aaa","baa"
,"caa","aba","bba","cba"
,"aca",...]
An explicit option
You can also use two accumulators:
{-# LANGUAGE BangPatterns #-}

genAllSize [] = [[]]
genAllSize [a] = iterate (a:) []
genAllSize (x:xs) = go ([], []) where
  go (curr, remain) = curr : go (step curr remain)
  step [] [] = ([x], [xs])
  step (_:ls) ((r:rs):rss) =
    (r:ls, rs:rss)
  step (_:ls) ([] : rs) =
    (x : ls', xs : rs')
    where
      !(ls', rs') = step ls rs
This version keeps track of the current "word" and also the remaining available "letters" in each position. The performance seems comparable in general, but a bit better with regard to memory residency. It's also much harder to understand!
This produces the elements in a different order within each length than your example, but it meets the definition of the text of your question. Changing the order is easy - you have to replace <*> with a slightly different operator of your own making.
import Control.Applicative
import Control.Monad
rinvjoin :: Applicative both => both a -> both (both a)
rinvjoin = fmap pure
extendBranches options branches = (<|>) <$> options <*> branches
singletonBranchExtensions = rinvjoin
genAllSize [] = []
genAllSize xs = join <$> iterate (extendBranches extensions) $ initialBranches
  where extensions = singletonBranchExtensions xs
        initialBranches = pure empty

Trying to understand how not to add elements to a list

As a followup to this question, I'm attempting to understand how not to add elements to a list using ++.
From this answer:
Again if you only want to append a single element to the list, that is
not a problem. This is a problem if you want to append n elements that
way to a list, so if you each time append a single element to the
list, and you do that n times, then the algorithm will be O(n^2).
So from my understanding, this means you shouldn't do this:
let numbers = [1,3,5,10,15]
    newNumbers = numbers ++ [27]
    listofnumbers = newNumbers ++ [39]
Is this what the bold text in the quoted answer is telling you not to do? If not, using code, what is the bold text warning you not to do?
The answer is about the poor time complexity of appending elements to the end of a list. When you concatenate a list xs of length m and a list ys of length n using (++), then xs ++ ys has time complexity O(m) (under the assumption that you evaluate xs ++ ys for a number of steps proportional to m).
So if your list ys consists of a single element y (that is, ys == [y]), then [y] ++ xs is O(1) because you add it to the beginning, but xs ++ [y] is O(m) because you add it to the end of another list. If you repeatedly add elements to the end of a list this way, m times, you end up with O(m^2) in total. It is better to build the list in one go, which is O(m).
Note that lists in Haskell are actually stacks which could have an infinite number of elements.
Try these two functions with a large list (like [1..10000]) and see if you notice any difference:
func1 a [] = a
func1 a (x:rest) = func1 (a ++ [x]) rest
func2 a b = a ++ b
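To make the comparison concrete, here is a small sketch that puts the repeated-append pattern next to the usual alternative of consing onto the front and reversing once. The names appendEach and consThenReverse, and the size 5000, are made up for this illustration:

```haskell
-- Quadratic: every step appends to the end of the accumulator,
-- so the accumulated prefix is walked again for each new element.
appendEach :: [a] -> [a]
appendEach = go []
  where
    go acc []     = acc
    go acc (x:xs) = go (acc ++ [x]) xs

-- Linear: cons onto the front (O(1) per element), reverse once at the end.
consThenReverse :: [a] -> [a]
consThenReverse = go []
  where
    go acc []     = reverse acc
    go acc (x:xs) = go (x : acc) xs

main :: IO ()
main = do
  let n = 5000 :: Int
  print (length (appendEach [1 .. n]))        -- prints 5000
  print (length (consThenReverse [1 .. n]))   -- prints 5000
```

Both functions return the input list unchanged; only the amount of work differs, and the gap widens quickly as n grows.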

Calculating the difference between two strings

I have two strings
a :: [String]
a = ["A1","A2","B3","C3"]
and
b :: [String]
b = ["A1","B2","B3","D5"]
I want to calculate the difference between the two lists of strings, comparing the elements position by position (first character, second character, and the combination of the two).
If the two elements at a position are the same, that counts as 1.
The function I declared is
calcP :: [String] -> [String] -> (Int,[String])
calcP (x:xs) (y:ys) = (a,b)
where
a = 0 in
???
b = ????
I know that I should have a counter variable to count the matching elements, but where should I put it? For now I have no idea how to do that. Can anyone give me a hint?
The desired result would be
(2,["B2","D5"])
How should I do that?
I assume that the lists have the same size.
The differences between the two lists
Let's focus on the main part of the problem:
Prelude> a=["A1","A2","B3","C3"]
Prelude> b=["A1","B2","B3","D5"]
First, notice that the zip function zips two lists. If you use it on a and b, you get:
Prelude> zip a b
[("A1","A1"),("A2","B2"),("B3","B3"),("C3","D5")]
Ok. It's now time to compare the terms one to one. There are many ways to do it.
Filter
Prelude> filter(\(x,y)->x/=y)(zip a b)
[("A2","B2"),("C3","D5")]
The lambda function returns True if the elements of the pair are different (/= operator). Thus, the filter keeps only the pairs that don't match.
That works, but you have to do a little more work to keep only the second element of each pair.
Prelude> map(snd)(filter(\(x,y)->x/=y)(zip a b))
["B2","D5"]
map(snd) applies snd, which keeps only the second element of a pair, to every discordant pair.
Fold
A fold is more generic, and may be used to implement a filter. Let's see how:
Prelude> foldl(\l(x,y)->if x==y then l else l++[y])[](zip a b)
["B2","D5"]
The lambda function takes every pair (x,y) and compares the two elements. If they have the same value, the accumulator list remains unchanged, but if the values are different, the second element is appended to the accumulator list.
List comprehension
This is more compact, and should seem obvious to every Python programmer:
Prelude> [y|(x,y)<-zip a b, x/=y] -- in Python [y for (x,y) in zip(a,b) if x!= y]
["B2","D5"]
The number of elements
You want a pair with the number of elements and the elements themselves.
Fold
With a fold, it's easy but cumbersome: you will use a slightly more complicated accumulator, that stores simultaneously the differences (l) and the number of those differences (n).
Prelude> foldl(\(n,l)(x,y)->if x==y then (n,l) else (n+1,l++[y]))(0,[])$zip a b
(2,["B2","D5"])
Lambda
But you can use the fact that your output is redundant: you want a list preceded by the length of that list. Why not apply a lambda that does the job?
Prelude> (\x->(length x,x))[1,2,3]
(3,[1,2,3])
With a list comprehension, it gives:
Prelude> (\x->(length x,x))[y|(x,y)<-zip a b, x/=y]
(2,["B2","D5"])
Bind operator
Finally, and just for fun, you don't need to build the lambda this way. You could do:
Prelude> ((,)=<<length)[y|(x,y)<-zip a b,x/=y]
(2,["B2","D5"])
What happens here? (,) is an operator that makes a pair from two elements:
Prelude> (,) 1 2
(1,2)
and ((,)=<<length): 1. takes a list (technically a Foldable) and passes it to the length function; 2. =<< (the "bind" operator, used here in the function monad) then passes both the length and the list to the (,) operator, hence the expected result.
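To spell that out: for functions, (f =<< g) x is the same as f (g x) x. The sketch below shows the point-free form next to its expanded equivalent; the names withLength and withLength' are made up for the illustration:

```haskell
-- Point-free form, relying on the Monad instance for functions.
withLength :: [a] -> (Int, [a])
withLength = (,) =<< length

-- The same function written out: pass the length and the list to (,).
withLength' :: [a] -> (Int, [a])
withLength' xs = (length xs, xs)

main :: IO ()
main = do
  print (withLength "abc")    -- prints (3,"abc")
  print (withLength' "abc")   -- prints (3,"abc")
```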
Partial conclusion
"There is more than one way to do it" (but it's not Perl!)
Haskell offers a lot of built-in functions and operators to handle this kind of basic manipulation.
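One caveat about the count: length applied to the mismatch list gives the number of differences, which only happens to equal the number of matches (2) in this example. If the first component is meant to count the matching pairs, as the question's wording suggests, a sketch along these lines keeps the two apart (the local names pairs, matches and mismatches are assumptions):

```haskell
calcP :: [String] -> [String] -> (Int, [String])
calcP as bs = (matches, mismatches)
  where
    pairs      = zip as bs
    -- Number of positions where both lists agree.
    matches    = length [() | (x, y) <- pairs, x == y]
    -- Elements of the second list that differ from the first.
    mismatches = [y | (x, y) <- pairs, x /= y]

main :: IO ()
main = do
  print (calcP ["A1","A2","B3","C3"] ["A1","B2","B3","D5"])  -- prints (2,["B2","D5"])
  print (calcP ["A1","A2","B3"] ["A1","A2","C3"])            -- prints (2,["C3"])
```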
What about doing it recursively? If two elements are the same, the first element of the resulting tuple is incremented; otherwise, the mismatched element is added to the list in the second element of the resulting tuple:
calcP :: [String] -> [String] -> (Int,[String])
calcP (x:xs) (y:ys)
  | x == y = increment (calcP xs ys)
  | otherwise = append y (calcP xs ys)
  where
    increment (count, results) = (count + 1, results)
    append y (count, results) = (count, y:results)
calcP [] x = (0, x)
calcP x [] = (0, [])
a = ["A1","A2","B3","C3"]
b = ["A1","B2","B3","D5"]
main = print $ calcP a b
The printed result is (2,["B2","D5"])
Note that
calcP [] x = (0, x)
calcP x [] = (0, [])
are needed to make the pattern matching exhaustive. In other words, you need to handle the case where one of the arguments is an empty list. This also gives the following behaviour:
If the first list is longer than the second by n elements, its last n elements are ignored.
If the second list is longer than the first by n elements, its last n elements are appended to the second element of the resulting tuple.
I'd like to propose a very different method than the other folks: namely, compute a "summary statistic" for each pairing of elements between the two lists, and then combine the summaries into your desired result.
First some imports.
import Data.Monoid
import Data.Foldable
For us, the summary statistic is how many matches there are, together with the list of mismatches from the second argument:
type Statistic = (Sum Int, [String])
I've used Sum Int instead of Int to specify how statistics should be combined. (Other options here include Product Int, which would multiply together the values instead of adding them.) We can compute the summary of a single pairing quite simply:
summary :: String -> String -> Statistic
summary a b | a == b    = (1, [ ])
            | otherwise = (0, [b])
Combining the summaries for all the elements is just a fold:
calcP :: [String] -> [String] -> Statistic
calcP as bs = fold (zipWith summary as bs)
In ghci:
> calcP ["A1", "A2", "B3", "C3"] ["A1", "B2", "B3", "D5"]
(Sum {getSum = 2},["B2","D5"])
This general pattern (of processing elements one at a time into a Monoidal type) is frequently useful, and spotting where it's applicable can greatly simplify your code.
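If you need the plain (Int,[String]) from the question rather than the Sum wrapper, you can unwrap it after folding. This is a sketch: calcPlain is a made-up name, and first comes from Data.Bifunctor in base:

```haskell
import Data.Bifunctor (first)
import Data.Monoid (Sum (..))

type Statistic = (Sum Int, [String])

-- One match counts 1; a mismatch contributes the second argument.
summary :: String -> String -> Statistic
summary a b
  | a == b    = (1, [])
  | otherwise = (0, [b])

-- Hypothetical helper: the same fold as calcP, with the Sum wrapper removed.
calcPlain :: [String] -> [String] -> (Int, [String])
calcPlain as bs = first getSum (mconcat (zipWith summary as bs))

main :: IO ()
main = print (calcPlain ["A1","A2","B3","C3"] ["A1","B2","B3","D5"])
-- prints (2,["B2","D5"])
```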

Inserting an integer into a list at specific place

I want to write a function insertAt, where z is the position in the list and y is the number being inserted into the list xs. I'm new to Haskell, and this is what I have so far.
insertAt :: Int-> Int-> [Int]-> [Int]
insertAt z y xs
  | z==1 = y:xs
but I'm not sure where to go from there.
I have an elementAt function, where
elementAt v xs
  | v==1 = head xs
  | otherwise = elementAt (v-1) (tail xs)
but I'm not sure how I can fit it in or if I even need to. If possible, I'd like to avoid append.
If this isn't homework: let (ys,zs) = splitAt n xs in ys ++ [new_element] ++ zs
For the rest of this post I'm going to assume you're doing this problem as homework or to teach yourself how to do this kind of thing.
The key to this kind of problem is to break it down into its natural cases. You're processing two pieces of data: the list you're inserting into, and the position in that list. In this case, each piece of data has two natural cases: the list you're processing can be empty or not, and the number you're processing can be zero or not. So the first step is to write out all four cases:
insertAt 0 val [] = ...
insertAt 0 val (x:xs) = ...
insertAt n val [] = ...
insertAt n val (x:xs) = ...
Now, for each of these four cases, you need to think about what the answer should be given that you're in that case.
For the first two cases, the answer is easy: if you want to insert into the front of a list, just stick the value you're interested in at the beginning, whether the list is empty or not.
The third case demonstrates that there's actually an ambiguity in the question: what happens if you're asked to insert into, say, the third position of a list that's empty? Sounds like an error to me, but you'll have to answer what you want to do in that case for yourself.
The fourth case is most interesting: Suppose you want to insert a value into not-the-first position of a list that's not empty. In this case, remember that you can use recursion to solve smaller instances of your problem. In this case, you can use recursion to solve, for instance, insertAt (n-1) val xs -- that is, the result of inserting your same value into the tail of your input list at the n-1th position. For example, if you were trying to insert 5 into position 3 (the fourth position) of the list [100,200,300], you can use recursion to insert 5 into position 2 (the third position) of the list [200,300], which means the recursive call would produce [200,300,5].
We can just assume that the recursive call will work; our only job now is to convert the answer to that smaller problem into the answer to the original problem we were given. The answer we want in the example is [100,200,300,5] (the result of inserting 5 into position 3 of the list [100,200,300]), and what we have is the list [200,300,5]. So how can we get the result we want? Just add back on the first element! (Think about why this is true.)
With that case finished, we've covered all the possible cases for combinations of lists and positions to update. Since our function will work correctly for all possibilities, and our possibilities cover all possible inputs, that means our function will always work correctly. So we're done!
I'll leave it to you to translate these ideas into Haskell since the point of the exercise is for you to learn it, but hopefully that lets you know how to solve the problem.
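If you get stuck, one possible translation of the four cases looks like the sketch below. Two things in it are assumptions: an out-of-range position is treated as an error (the third case leaves that choice to you), and the type is generalized from Int to any element type. The two "position 0" cases collapse into a single equation because the answer is the same whether or not the list is empty:

```haskell
insertAt :: Int -> a -> [a] -> [a]
-- Cases 1 and 2: inserting at the front works for any list.
insertAt 0 val xs     = val : xs
-- Case 3 (assumption): a non-zero position in an empty list is an error.
insertAt _ _   []     = error "insertAt: position past the end of the list"
-- Case 4: solve the smaller problem on the tail, then put the head back.
insertAt n val (x:xs) = x : insertAt (n - 1) val xs

main :: IO ()
main = print (insertAt 3 5 [100,200,300 :: Int])
-- prints [100,200,300,5]
```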
You could split the list at index z and then concatenate the first part of the list with the element (using ++ [y]) and then with the second part of the list. Note that this creates a new list, as data is immutable by default. By convention, the first element of the list has index 0 (so adjust z accordingly if you want the first element to have index 1).
insertAt :: Int -> Int-> [Int] -> [Int]
insertAt z y xs = as ++ (y:bs)
  where (as,bs) = splitAt z xs
While the above answers are correct, I think this is more concise:
insertAt :: Int -> Int-> [Int]-> [Int]
insertAt z y xs = (take z xs) ++ y:(drop z xs)