Haskell list creation and concatenation performance [duplicate]

This question already has answers here:
Haskell foldl' poor performance with (++)
(3 answers)
Closed 4 years ago.
While playing with Haskell and looking for a solution to Project Euler problem 40, I found that this code is very fast:
p = concat [show n | n <- [1..]]
dl x = p !! x
y = [10^a - 1| a<-[0..6]]
s = [digitToInt (dl b)::Int | b <-y]
but this one is extremely slow, something like a million times slower:
p = foldl1 (++) (map show [1..1000000])
dl x = p !! x
y = [10^a - 1| a<-[0..6]]
s = [digitToInt (dl b)::Int | b <-y]
Can anybody explain why? Thanks.

As already mentioned in the comments, your lists are associating in the worst possible way. If we look at the usual (and most sensible) definition of (++), we see
(++) :: [a] -> [a] -> [a]
[] ++ ys = ys
(x:xs) ++ ys = x : (xs ++ ys)
(Source)
So (++) is O(n) where n is the length of the first list only. It doesn't depend on the length of the second. When you do foldl1 (or any of the other foldl variants), the derivation looks something like...
foldl1 (++) [a, b, c, d]
==> ((a ++ b) ++ c) ++ d
Since we only look at the first argument to determine the complexity, if we let n be the sum of the lengths of a, b, c, d, then we're iterating over the elements of a three times, those of b twice, and c once. So we're doing, ostensibly, n operations n times, for a total of O(n^2) operations. (You can do the math of this rigorously. You'll end up with a sum resulting in a triangular number, and the triangular number formula is quadratic, which is where the n^2 comes from. Proof is left as an exercise to the reader.)
On the other hand, if we use foldr, then
foldr (++) [] [a, b, c, d]
==> a ++ (b ++ (c ++ (d ++ [])))
Again, we only look at the left-hand argument every time. In this case, the arguments are organized nicely, so we only iterate over each element once, resulting in O(n) recursive steps.
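To make the difference concrete, here is a minimal sketch of a cost model. It assumes that fully forcing xs ++ ys costs one step per element of xs, as the definition of (++) above suggests. The names leftNestedCost and rightNestedCost are made up for this illustration.

```haskell
-- Cost model (an assumption for illustration): forcing xs ++ ys costs
-- one step per element of xs, following the definition of (++).

-- Steps spent by foldl1 (++): every intermediate result becomes the
-- left operand of the next (++), so it is walked again.
leftNestedCost :: [[a]] -> Int
leftNestedCost []     = 0
leftNestedCost (x:xs) = go (length x) xs
  where
    go _   []     = 0
    go acc (y:ys) = acc + go (acc + length y) ys

-- Steps spent by foldr (++) []: each list is a left operand exactly once.
rightNestedCost :: [[a]] -> Int
rightNestedCost = sum . map length
```

Under this model, 1000 one-element lists cost 499500 steps when nested to the left (the triangular number 999*1000/2) and 1000 steps when nested to the right.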
Difference lists solve this in a different way, by "magically" reassociating your parentheses to be better placed. You can read more about those in the linked article.
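As a rough sketch of the difference-list idea (not the API of any particular library; DList, dlFromList, dlAppend, dlToList and concatLeft are names chosen here for illustration), a list can be represented as a function that prepends it. Appending is then function composition, and the real (++) calls only happen, nested to the right, when the final function is applied to [].

```haskell
-- A difference list: a list represented as "prepend me to whatever follows".
newtype DList a = DList ([a] -> [a])

dlFromList :: [a] -> DList a
dlFromList xs = DList (xs ++)

-- Appending is just function composition, whatever the nesting.
dlAppend :: DList a -> DList a -> DList a
dlAppend (DList f) (DList g) = DList (f . g)

dlToList :: DList a -> [a]
dlToList (DList f) = f []

-- Even a left fold is fine here: each (xs ++) ends up applied to the
-- already-built tail, so no list is walked more than once.
concatLeft :: [[a]] -> [a]
concatLeft = dlToList . foldl dlAppend (DList id) . map dlFromList
```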
The reason concat works better is that the people who wrote Haskell knew about these problems and knew to use foldr (or equivalent) to get the better performance.

Related

A faster way of generating combinations with a given length, preserving the order

TL;DR: I want the exact behavior as filter ((== 4) . length) . subsequences. Just using subsequences also creates variable length of lists, which takes a lot of time to process. Since in the end only lists of length 4 are needed, I was thinking there must be a faster way.
I have a list of functions. The list has the type [Wor -> Wor]
The list looks something like this
[f1, f2, f3 .. fn]
What I want is a list of lists of n functions while preserving order like this
input : [f1, f2, f3 .. fn]
argument : 4 functions
output : A list of lists of 4 functions.
Expected output would be where if there's an f1 in the sublist, it'll always be at the head of the list.
If there's an f2 in the sublist and the sublist doesn't have f1, f2 will be at the head. If fn is in the sublist, it'll be last.
In general, if there's an fx in the list, it will never be in front of f(x - 1).
Basically preserving the main list's order when generating sublists.
It can be assumed that the length of the list will always be greater than the given argument.
I'm just starting to learn Haskell so I haven't tried all that much, but this is what I have tried so far:
Generating the sublists with the subsequences function and applying filter ((== 4) . length) to it seems to generate the correct sublists -but it doesn't preserve order- (It does preserve order; I was confusing it with my own function).
So what should I do?
Also if possible, is there a function or a combination of functions present in Hackage or Stackage which can do this? Because I would like to understand the source.

You describe a nondeterministic take:
ndtake :: Int -> [a] -> [[a]]
ndtake 0 _ = [[]]
ndtake n [] = []
ndtake n (x:xs) = map (x:) (ndtake (n-1) xs) ++ ndtake n xs
Either we take an x, and have n-1 more to take from xs; or we don't take the x and have n more elements to take from xs.
Running:
> ndtake 3 [1..4]
[[1,2,3],[1,2,4],[1,3,4],[2,3,4]]
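One way to gain confidence in ndtake is to compare it with the filter-over-subsequences version from the question. The sketch below repeats the definition so that it is self-contained; reference and sameResults are hypothetical helper names. The comparison sorts both sides, since for some inputs the two functions list the same sublists in a different order.

```haskell
import Data.List (sort, subsequences)

-- The nondeterministic take from above.
ndtake :: Int -> [a] -> [[a]]
ndtake 0 _      = [[]]
ndtake _ []     = []
ndtake n (x:xs) = map (x:) (ndtake (n-1) xs) ++ ndtake n xs

-- The slow version from the question, for a general length n.
reference :: Int -> [a] -> [[a]]
reference n = filter ((== n) . length) . subsequences

-- Do both produce the same sublists, ignoring the order of the results?
sameResults :: Ord a => Int -> [a] -> Bool
sameResults n xs = sort (ndtake n xs) == sort (reference n xs)
```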
Update: you wanted efficiency. If we're sure the input list is finite, we can aim at stopping as soon as possible:
ndetake n xs = go (length xs) n xs
  where
    go spare n _  | n >  spare = []
    go spare n xs | n == spare = [xs]
    go spare 0 _      = [[]]
    go spare n []     = []
    go spare n (x:xs) = map (x:) (go (spare-1) (n-1) xs)
                        ++ go (spare-1) n xs
Trying it:
> length $ ndetake 443 [1..444]
444
The former version seems to be stuck on this input, but the latter one returns immediately.
But it measures the length of the whole list, and needlessly so, as pointed out by @dfeuer in the comments. We can achieve the same improvement in efficiency while retaining a bit more laziness:
ndzetake :: Int -> [a] -> [[a]]
ndzetake n xs | n > 0 =
    go n (length (take n xs) == n) (drop n xs) xs
  where
    go n b p ~(x:xs)
      | n == 0    = [[]]
      | not b     = []
      | null p    = [(x:xs)]
      | otherwise = map (x:) (go (n-1) b p xs)
                    ++ go n b (tail p) xs
The last test also returns instantly with this code.
There's still room for improvement here. Just as with the library function subsequences, the search space could be explored even more lazily. Right now we have
> take 9 $ ndzetake 3 [1..]
[[1,2,3],[1,2,4],[1,2,5],[1,2,6],[1,2,7],[1,2,8],[1,2,9],[1,2,10],[1,2,11]]
but it could be finding [2,3,4] before forcing the 5 out of the input list. Shall we leave it as an exercise?

Here's the best I've been able to come up with. It answers the challenge Will Ness laid down to be as lazy as possible in the input. In particular, ndtake m ([1..n]++undefined) will produce as many entries as possible before throwing an exception. Furthermore, it strives to maximize sharing among the result lists (note the treatment of end in ndtakeEnding'). It avoids problems with badly balanced list appends using a difference list. This sequence-based version is considerably faster than any pure-list version I've come up with, but I haven't teased apart just why that is. I have the feeling it may be possible to do even better with a better understanding of just what's going on, but this seems to work pretty well.
Here's the general idea. Suppose we ask for ndtake 3 [1..5]. We first produce all the results ending in 3 (of which there is one). Then we produce all the results ending in 4. We do this by (essentially) calling ndtake 2 [1..3] and adding the 4 onto each result. We continue in this manner until we have no more elements.
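Before the optimized code, the idea of grouping results by their last element can be sketched naively with plain lists. This is only an illustration (ndtakeByEnd is a hypothetical name): it rebuilds every result with (++ [x]) and shares nothing, which is exactly what the real implementation avoids.

```haskell
import Data.List (inits)

-- For every element x, paired with the part of the list before it,
-- produce all combinations that *end* with x.
ndtakeByEnd :: Int -> [a] -> [[a]]
ndtakeByEnd 0 _  = [[]]
ndtakeByEnd n xs =
  concat [ map (++ [x]) (ndtakeByEnd (n - 1) front)
         | (front, x) <- zip (inits xs) xs ]
```

Because it only ever looks at the prefix before the current element, this version can also start producing results from an infinite input list.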
import qualified Data.Sequence as S
import Data.Sequence (Seq, (|>))
import Data.Foldable (toList)
We will use the following simple utility function. It's almost the same as splitAtExactMay from the 'safe' package, but hopefully a bit easier to understand. For reasons I haven't investigated, letting this produce a result when its argument is negative leads to ndtake with a negative argument being equivalent to subsequences. If you want, you can easily change ndtake to do something else for negative arguments.
-- to return an empty list in the negative case.
splitAtMay :: Int -> [a] -> Maybe ([a], [a])
splitAtMay n xs
  | n <= 0 = Just ([], xs)
splitAtMay _ [] = Nothing
splitAtMay n (x : xs) = flip fmap (splitAtMay (n - 1) xs) $
  \(front, rear) -> (x : front, rear)
Now we really get started. ndtake is implemented using ndtakeEnding, which produces a sort of "difference list", allowing all the partial results to be concatenated cheaply.
ndtake :: Int -> [t] -> [[t]]
ndtake n xs = ndtakeEnding n xs []
ndtakeEnding :: Int -> [t] -> ([[t]] -> [[t]])
ndtakeEnding 0 _xs = ([]:)
ndtakeEnding n xs = case splitAtMay n xs of
    Nothing -> id -- Not enough elements
    Just (front, rear) ->
      (front :) . go rear (S.fromList front)
  where
    -- For each element, produce a list of all combinations
    -- *ending* with that element.
    go [] _front = id
    go (r : rs) front =
      ndtakeEnding' [r] (n - 1) front
        . go rs (front |> r)
ndtakeEnding doesn't call itself recursively. Rather, it calls ndtakeEnding' to calculate the combinations of the front part. ndtakeEnding' is very much like ndtakeEnding, but with a few differences:
We use a Seq rather than a list to represent the input sequence. This lets us split and snoc cheaply, but I'm not yet sure why that seems to give amortized performance that is so much better in this case.
We already know that the input sequence is long enough, so we don't need to check.
We're passed a tail (end) to add to each result. This lets us share tails when possible. There are lots of opportunities for sharing tails, so this can be expected to be a substantial optimization.
We use foldr rather than pattern matching. Doing this manually with pattern matching gives clearer code, but worse constant factors. That's because the :<| and :|> patterns exported from Data.Sequence are non-trivial pattern synonyms that perform a bit of calculation, including amortized O(1) allocation, to build the tail or initial segment, whereas folds don't need to build those.
NB: this implementation of ndtakeEnding' works well for recent GHC and containers; it seems less efficient for earlier versions. That might be the work of Donnacha Kidney on foldr for Data.Sequence. In earlier versions, it might be more efficient to pattern match by hand, using viewl for versions that don't offer the pattern synonyms.
-- Needs {-# LANGUAGE BangPatterns #-} at the top of the module for !front.
ndtakeEnding' :: [t] -> Int -> Seq t -> ([[t]] -> [[t]])
ndtakeEnding' end 0 _xs = (end:)
ndtakeEnding' end n xs = case S.splitAt n xs of
    (front, rear) ->
      ((toList front ++ end) :) . go rear front
  where
    go = foldr go' (const id) where
      go' r k !front = ndtakeEnding' (r : end) (n - 1) front . k (front |> r)
    -- With patterns, a bit less efficiently:
    -- go Empty _front = id
    -- go (r :<| rs) !front =
    --   ndtakeEnding' (r : end) (n - 1) front
    --     . go rs (front :|> r)

Haskell - split a list into two sublists with closest sums

I'm a Haskell beginner trying to learn more about the language by solving some online quizzes/problem sets.
The problem/question is quite lengthy but a part of it requires code that can find the number which divides a given list into two (nearly) equal (by sum) sub-lists.
Given [1..10]
Answer should be 7 since 1+2+..7 = 28 & 8+9+10 = 27
This is the way I implemented it
-- partitions list by y
partishner :: (Floating a) => Int -> [a] -> [[[a]]]
partishner 0 xs = [[xs],[]]
partishner y xs = [take y xs : [drop y xs]] ++ partishner (y - 1) xs
-- finds the equal sum
findTheEquilizer :: (Ord a, Floating a) => [a] -> [[a]]
findTheEquilizer xs = fst $ minimumBy (comparing snd) zipParty
  where party = (tail . init) (partishner (length xs) xs) -- removes [xs,[]] types
        afterParty = (map (\[x, y] -> (x - y) ** 2) . init . map (map sum)) party
        zipParty = zip party afterParty -- zips partitions and squared diff betn their sums
Given (last . head) (findTheEquilizer [1..10])
output : 7
For numbers near 50k it works fine
λ> (last . head) (findTheEquilizer [1..10000])
7071.0
The trouble starts when I put in lists with more than 70k elements. It takes forever to compute.
So what do I have to change in the code to make it run better, or do I have to change my whole approach? I'm guessing it's the latter, but I'm not sure how to go about doing that.

It looks to me like the implementation is quite chaotic. For example, partishner seems to construct a list of lists of lists of a where, if I understood it correctly, the outer list contains lists of two elements each: the list of elements on "the left" and the list of elements on "the right". As a result, constructing these lists takes O(n^2) time.
Using lists instead of 2-tuples is also quite "unsafe", since a list can (although here it is probably impossible) contain no elements, one element, or more than two elements. If you make a mistake in one of the functions, that mistake will be hard to track down.
It looks to me that it might be easier to implement a "sweep algorithm": we first calculate the sum of all the elements in the list. This is the value on the "right" in case we decide to split at that specific point, next we start moving from left to right, each time subtracting the element from the sum on the right, and adding it to the sum on the left. We can each time evaluate the difference in score, like:
import Data.List(unfoldr)
sweep :: Num a => [a] -> [(Int, a, [a])]
sweep lst = x0 : unfoldr f x0
  where x0 = (0, sum lst, lst)
        f (_, _, []) = Nothing
        f (i, r, (x: xs)) = Just (l, l)
          where l = (i+1, r-2*x, xs)
For example:
Prelude Data.List> sweep [1,4,2,5]
[(0,12,[1,4,2,5]),(1,10,[4,2,5]),(2,2,[2,5]),(3,-2,[5]),(4,-12,[])]
So if we select to split at the first split point (before the first element), the sum on the right is 12 higher than the sum on the left, if we split after the first element, the sum on the right (11) is 10 higher than the sum on the left (1).
We can then obtain the minimum of these splits with minimumBy :: (a -> a -> Ordering) -> [a] -> a:
import Data.List(minimumBy)
import Data.Ord(comparing)
findTheEquilizer :: (Ord a, Num a) => [a] -> ([a], [a])
findTheEquilizer lst = (take idx lst, tl)
  where (idx, _, tl) = minimumBy (comparing (abs . \(_, x, _) -> x)) (sweep lst)
We then obtain the correct value for [1..10]:
Prelude Data.List Data.Ord Data.List> findTheEquilizer [1..10]
([1,2,3,4,5,6,7],[8,9,10])
or for 70'000:
Prelude Data.List Data.Ord Data.List> head (snd (findTheEquilizer [1..70000]))
49498
The above is not ideal, it can be implemented more elegantly, but I leave this as an exercise.

Okay, first let's analyse why it runs forever (actually not forever, just slowly). Take a look at the partishner function:
partishner y xs = [take y xs : [drop y xs]] ++ partishner (y - 1) xs
where take y xs and drop y xs run in linear time, i.e. O(N), and so
[take y xs : [drop y xs]]
is O(N) too.
However, it is run again and again, recursively, once for each element of the given list. If the list has length M, the calls cost 1, 2, ..., M steps respectively, so finishing the computation needs:
O(1+2+...+M) = O(M(1+M)/2) = O(M^2)
With a list of 70k elements, that is on the order of 70k^2 steps, which is why it seems to hang.
Instead of using the partishner function, you can compute the running sums of the list in linear time:
sumList :: (Floating a) => [a] -> [a]
sumList xs = sum 0 xs
  where sum _ [] = []
        sum s (y:ys) = let s' = s + y in s' : sum s' ys
and findEquilizer just sums the given list from left to right (leftSum) and from right to left (rightSum) and picks the result just as your original program does, but the whole process takes only linear time.
findEquilizer :: (Ord a, Floating a) => [a] -> a
findEquilizer [] = 0
findEquilizer xs =
  let leftSum = reverse $ 0:(sumList $ init xs)
      rightSum = sumList $ reverse $ xs
      afterParty = zipWith (\x y->(x-y) ** 2) leftSum rightSum
  in fst $ minimumBy (comparing snd) (zip (reverse $ init xs) afterParty)

I assume that none of the list elements are negative, and use a "tortoise and hare" approach. The hare steps through the list, adding up elements. The tortoise does the same thing, but it keeps its sum doubled and it carefully ensures that it only takes a step when that step won't put it ahead of the hare.
approxEqualSums
  :: (Num a, Ord a)
  => [a] -> (Maybe a, [a])
approxEqualSums as0 = stepHare 0 Nothing as0 0 as0
  where
    -- ht is the current best guess.
    stepHare _tortoiseSum ht tortoise _hareSum []
      = (ht, tortoise)
    stepHare tortoiseSum ht tortoise hareSum (h:hs)
      = stepTortoise tortoiseSum ht tortoise (hareSum + h) hs
    stepTortoise tortoiseSum ht [] hareSum hare
      = stepHare tortoiseSum ht [] hareSum hare
    stepTortoise tortoiseSum ht tortoise@(t:ts) hareSum hare
      | tortoiseSum' <= hareSum
      = stepTortoise tortoiseSum' (Just t) ts hareSum hare
      | otherwise
      = stepHare tortoiseSum ht tortoise hareSum hare
      where tortoiseSum' = tortoiseSum + 2*t
In use:
> approxEqualSums [1..10]
(Just 6,[7,8,9,10])
6 is the last element before going over half, and 7 is the first one after that.

I asked in the comments, and the OP says [1..n] does not really define the question. I guess what's asked is more like an arbitrary ascending sequence from 1 to n, such as [1,3,7,19,37,...,1453,...,n].
Yet even as per the given answers, for a list like [1..n] we really don't need to do any list operations at all.
The sum of [1..n] is n*(n+1)/2.
Which means we need to find the m whose prefix sum is half of that, n*(n+1)/4.
Which means m*(m+1)/2 = n*(n+1)/4.
So if n == 100 then m^2 + m - 5050 = 0
All we need is
the quadratic formula, m = (-b ± sqrt(b^2 - 4ac)) / (2a), where a = 1, b = 1 and c = -5050, yielding the reasonable root 70.565 ⇒ 71 (rounded). Let's check: 71*72/2 = 2556 and 5050 - 2556 = 2494, so the difference is 2556 - 2494 = 62, which is minimal (< 71). Yes, we must split at 71. So just use result = [[1..71],[72..100]] and we're done.
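The arithmetic above can be sketched as a small function for lists of the form [1..n]. This is an illustration only: splitPoint is a hypothetical name, and the root is computed in Double, so for very large n the floating-point result may be off. The sketch therefore checks the two integers around the root and keeps the better one.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Best split point m for [1..n]: the left part is [1..m].
splitPoint :: Integer -> Integer
splitPoint n = minimumBy (comparing diff) [lo, lo + 1]
  where
    total = n * (n + 1) `div` 2
    -- Positive root of m^2 + m - total = 0 (quadratic formula, a = b = 1).
    root = (-1 + sqrt (1 + 4 * fromIntegral total)) / 2 :: Double
    lo = floor root
    -- |right - left| = |total - 2 * left|, where left = m * (m + 1) / 2.
    diff m = abs (total - m * (m + 1))
```

For n = 100 this gives 71, matching the check above, and for n = 10 it gives 7.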
But when the list is not made of consecutive ascending numbers, that's a different animal. It has to be done by first finding the sum and then doing something like a binary search: jump halfway into the list and compare the sums to decide whether to jump halfway back or forward. I will implement that one later.

Here's some code that empirically behaves better than linear and handles 2,000,000 elements in just over 1 second, even when interpreted:
import Data.List (tails)
g :: (Ord c, Num c) => [c] -> [(Int, c)]
g = head . dropWhile ((> 0) . snd . last) . map (take 2) . tails . zip [1..]
    . (\xs -> zipWith (-) (map (last xs -) xs) xs) . scanl1 (+)
g [1..10] ==> [(6,13),(7,-1)] -- 0.0s
g [1..70000] ==> [(49497,32494),(49498,-66502)] -- 0.09s
g [70000,70000-1..1] ==> [(20502,66502),(20503,-32494)] -- 0.09s
g [1..100000] ==> [(70710,75190),(70711,-66232)] -- 0.11s
g [1..1000000] ==> [(707106,897658),(707107,-516556)] -- 0.62s
g [1..2000000] ==> [(1414213,1176418),(1414214,-1652010)] -- 1.14s n^0.88
g [1..3000000] ==> [(2121320,836280),(2121321,-3406362)] -- 1.65s n^0.91
It works by running the partial sums with scanl1 (+) and taking the total sum as its last, so that for each partial sum, subtracting it from the total gives us the sum of the second part of the split.
The algorithm assumes all the numbers in the input list are strictly positive, so the partial sums list is monotonically increasing. Nothing else is assumed about the numbers.
The value must be chosen from the pair (the g's result) so that its second component's absolute value is the smaller between the two.
This is achieved by minimumBy (comparing (abs . snd)) . g.
clarifications: There's some confusion about "complexity" in the comments below, yet the answer says nothing at all about complexity but uses a specific empirical measurement. You can't argue with empirical data (unless you misinterpret its meaning).
The answer does not claim it "is better than linear", it says "it behaves better than linear" [in the tested range of problem sizes], which the empirical data incontrovertibly show.
Finally, an appeal to authority. Robert Sedgewick is an authority on algorithms. Take it up with him.
(and of course the algorithm handles unordered data as well as it does ordered).
As for the reasons for OP's code inefficiency: map sum . inits can't help being quadratic, but the equivalent scanl (+) 0 is linear. The radical improvement comes about from a lot of redundant calculations in the former being avoided in the latter. (Another example of this can be seen here.)
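The equivalence mentioned above can be written down directly. In this small sketch the names prefixSumsSlow and prefixSumsFast are chosen for illustration; both return the sums of all prefixes, starting with the empty one.

```haskell
import Data.List (inits)

-- Quadratic: every prefix is summed from scratch.
prefixSumsSlow :: Num a => [a] -> [a]
prefixSumsSlow = map sum . inits

-- Linear: each sum reuses the previous one.
prefixSumsFast :: Num a => [a] -> [a]
prefixSumsFast = scanl (+) 0
```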

Fast length of an intersection with duplicates in Haskell

I'm writing a mastermind solver, and in an inner loop I calculate the length of the intersection with duplicates of two lists. Right now the function I have is
overlap :: Eq c => [c] -> [c] -> Int
overlap [] _ = 0
overlap (x:xs) ys
| x `elem` ys = 1 + overlap xs (delete x ys)
| otherwise = overlap xs ys
Is it possible to make this faster? If it helps, the arguments to overlap are short lists of the same length, at most 6 elements, and the c type has less than 10 possible values.

In general it is (almost) impossible to boost the performance of such an algorithm: computing the intersection of two unordered and unhashable lists takes O(n^2) time.
You can, however, boost performance under the following conditions (a different approach per condition):
If you can ensure that every list you create or modify is kept sorted (this can require some engineering), the algorithm can run in O(n).
In that case you can run it with:
--Use this only if xs and ys are sorted
overlap :: Ord c => [c] -> [c] -> Int
overlap (x:xs) (y:ys) | x < y = overlap xs (y:ys)
                      | x > y = overlap (x:xs) ys
                      | otherwise = 1 + overlap xs ys
overlap [] _ = 0
overlap _ [] = 0
In general sorting of a list can be done in O(n log n) and is thus more efficient than your O(n^2) overlap algorithm. The new overlap algorithm runs in O(n).
In case c is ordered, you might use Data.Set as well. In that case you can use the fromList function, which runs in O(n log n), to create a Set for each of the two lists, then use the intersection function to calculate the intersection in O(n) time, and finally use the size function to calculate its size. Note that a Set discards duplicates, so this counts each common value only once.
--Use this only if c can be ordered
import Data.Set (fromList, intersection, size)
overlap :: Ord c => [c] -> [c] -> Int
overlap xs ys = size $ intersection (fromList xs) (fromList ys)
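Since the question asks for the intersection with duplicates, one possible variation is to count occurrences instead of building sets. The following is a sketch, not a benchmarked solution: overlapCounts is a hypothetical name, and it uses Data.Map.Strict from the containers package. For each value it takes the smaller of its two occurrence counts and sums those.

```haskell
import qualified Data.Map.Strict as M

-- Length of the intersection with duplicates: for each value, the
-- smaller of its occurrence counts in xs and ys, summed over all values.
overlapCounts :: Ord c => [c] -> [c] -> Int
overlapCounts xs ys =
  sum (M.elems (M.intersectionWith min (counts xs) (counts ys)))
  where
    counts = M.fromListWith (+) . map (\x -> (x, 1 :: Int))
```

For lists of at most 6 elements, as in the question, the constant factors of building two maps may well outweigh any asymptotic gain, so this is worth measuring against the original.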

Are you using the same ys for multiple xs?
If yes, you can try to calculate hash values for each element in ys and match by this value, but keep in mind that calculating the hash needs to be faster than 6 comparisons.
If either of those is Ord, you may also sort it earlier and check only the necessary part of ys.
However, if you need fast random access, lists aren't the best structure; you should probably take a look at Data.Array and Data.HashMap.

Need to partition a list into lists based on breaks in ascending order of elements (Haskell)

Say I have any list like this:
[4,5,6,7,1,2,3,4,5,6,1,2]
I need a Haskell function that will transform this list into a list of lists which are composed of the segments of the original list which form a series in ascending order. So the result should look like this:
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
Any suggestions?

You can do this by resorting to manual recursion, but I like to believe Haskell is a more evolved language. Let's see if we can develop a solution that uses existing recursion strategies. First some preliminaries.
{-# LANGUAGE NoMonomorphismRestriction #-}
-- because who wants to write type signatures, amirite?
import Data.List.Split -- from package split on Hackage
Step one is to observe that we want to split the list based on a criterion that looks at two elements of the list at once. So we'll need a new list with elements representing a "previous" and "next" value. There's a very standard trick for this:
previousAndNext xs = zip xs (drop 1 xs)
However, for our purposes, this won't quite work: this function always outputs a list that's shorter than the input, and we will always want a list of the same length as the input (and in particular we want some output even when the input is a list of length one). So we'll modify the standard trick just a bit with a "null terminator".
pan xs = zip xs (map Just (drop 1 xs) ++ [Nothing])
Now we're going to look through this list for places where the previous element is bigger than the next element (or the next element doesn't exist). Let's write a predicate that does that check.
bigger (x, y) = maybe False (x >) y
Now let's write the function that actually does the split. Our "delimiters" will be values that satisfy bigger; and we never want to throw them away, so let's keep them.
ascendingTuples = split . keepDelimsR $ whenElt bigger
The final step is just to throw together the bit that constructs the tuples, the bit that splits the tuples, and a last bit of munging to throw away the bits of the tuples we don't care about:
ascending = map (map fst) . ascendingTuples . pan
Let's try it out in ghci:
*Main> ascending [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
*Main> ascending [7,6..1]
[[7],[6],[5],[4],[3],[2],[1]]
*Main> ascending []
[[]]
*Main> ascending [1]
[[1]]
P.S. In the current release of split, keepDelimsR is slightly stricter than it needs to be, and as a result ascending currently doesn't work with infinite lists. I've submitted a patch that makes it lazier, though.

ascend :: Ord a => [a] -> [[a]]
ascend xs = foldr f [] xs
  where
    f a [] = [[a]]
    f a xs'@(y:ys) | a < head y = (a:y):ys
                   | otherwise = [a]:xs'
In ghci
*Main> ascend [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]

This problem is a natural fit for a paramorphism-based solution. Having (as defined in that post)
para  :: (a -> [a] -> b -> b) -> b -> [a] -> b
foldr :: (a ->        b -> b) -> b -> [a] -> b
para  c n (x : xs) = c x xs (para c n xs)
foldr c n (x : xs) = c x    (foldr c n xs)
para  c n []       = n
foldr c n []       = n
we can write
partition_asc xs = para c [] xs where
    c x (y:_) ~(a:b) | x<y = (x:a):b
    c x _ r = [x]:r
Trivial, since the abstraction fits.
BTW they have two kinds of map in Common Lisp: mapcar (processing elements of an input list one by one) and maplist (processing "tails" of a list). With this idea we get
import Data.List (tails)
partition_asc2 xs = foldr c [] . init . tails $ xs where
    c (x:y:_) ~(a:b) | x<y = (x:a):b
    c (x:_) r = [x]:r
Lazy patterns in both versions make it work with infinite input lists in a productive manner (as first shown in Daniel Fischer's answer).
update 2020-05-08: not so trivial after all. Both head . head . partition_asc $ [4] ++ undefined and the same for partition_asc2 fail with *** Exception: Prelude.undefined. The combining function c forces the next element y prematurely. It needs to be more carefully written to be productive right away before ever looking at the next element, as e.g. for the second version,
partition_asc2' xs = foldr c [] . init . tails $ xs where
    c (x:ys) r@(~(a:b)) = (x:g):gs
      where
        (g,gs) | not (null ys)
                 && x < head ys = (a,b)
               | otherwise      = ([],r)
(again, as first shown in Daniel's answer).

You can use a right fold to break up the list at down-steps:
foldr foo [] xs
  where
    foo x yss = (x:zs) : ws
      where
        (zs, ws) = case yss of
                     (ys@(y:_)) : rest
                       | x < y     -> (ys,rest)
                       | otherwise -> ([],yss)
                     _ -> ([],[])
(It's a bit complicated in order to have the combining function lazy in the second argument, so that it works well for infinite lists too.)
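To try the laziness claim, the fold can be wrapped in a named function and applied to an infinite list. The sketch below does that; ascendingRuns is a hypothetical name, and the body is the same right fold, with the "as" pattern written with @.

```haskell
-- The right fold from above, wrapped so it can be called directly.
ascendingRuns :: Ord a => [a] -> [[a]]
ascendingRuns = foldr foo []
  where
    foo x yss = (x:zs) : ws
      where
        -- Lazy pattern binding: (x:zs) : ws is produced before yss is inspected.
        (zs, ws) = case yss of
          (ys@(y:_)) : rest
            | x < y     -> (ys, rest)
            | otherwise -> ([], yss)
          _ -> ([], [])
```

With this definition, take 3 (ascendingRuns (cycle [1,2,3])) returns three copies of [1,2,3] without trying to consume the whole infinite input.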

One other way of approaching this task (which, in fact, lays the foundation of a very efficient sorting algorithm) is to use continuation-passing style (CPS), in this particular case applied to folding from the right with foldr.
As is, this answer only chunks up the ascending runs; it would be nice to chunk up the descending ones at the same time, preferably in reverse order and all in O(n), which would leave us with only binary merging of the obtained chunks for a perfectly sorted output. Yet that's another answer for another question.
chunks :: Ord a => [a] -> [[a]]
chunks xs = foldr go return xs $ []
  where
    go :: Ord a => a -> ([a] -> [[a]]) -> ([a] -> [[a]])
    go c f = \ps -> let (r:rs) = f [c]
                    in case ps of
                         [] -> r:rs
                         [p] -> if c > p then (p:r):rs else [p]:(r:rs)
*Main> chunks [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
*Main> chunks [4,5,6,7,1,2,3,4,5,4,3,2,6,1,2]
[[4,5,6,7],[1,2,3,4,5],[4],[3],[2,6],[1,2]]
In the above code c stands for current and p for previous. Again, remember that we are folding from the right, so the previous item is actually the next item to process.

Haskell: How to simplify or eliminate liftM2?

Consider the following code I wrote:
import Control.Monad
increasing :: Integer -> [Integer]
increasing n
    | n == 1 = [1..9]
    | otherwise = do let ps = increasing (n - 1)
                     let last = liftM2 mod ps [10]
                     let next = liftM2 (*) ps [10]
                     alternateEndings next last
    where alternateEndings xs ys = concat $ zipWith alts xs ys
          alts x y = liftM2 (+) [x] [y..9]
Where 'increasing n' should return a list of n-digit numbers whose numbers increase (or stay the same) from left-to-right.
Is there a way to simplify this? The use of 'let' and 'liftM2' everywhere looks ugly to me. I think I'm missing something vital about the list monad, but I can't seem to get rid of them.

Well, as far as liftM functions go, my preferred way to use those is the combinators defined in Control.Applicative. Using those, you'd be able to write last = mod <$> ps <*> [10]. The ap function from Control.Monad does the same thing, but I prefer the infix version.
What (<$>) and (<*>) do goes like this: liftM2 turns a function a -> b -> c into a function m a -> m b -> m c. Plain liftM is just (a -> b) -> (m a -> m b), which is the same as fmap and also (<$>).
What happens if you do that to a multi-argument function? It turns something like a -> b -> c -> d into m a -> m (b -> c -> d). This is where ap or (<*>) come in: what they do is turn something like m (a -> b) into m a -> m b. So you can keep stringing it along that way for as many arguments as you like.
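A small sketch with lists shows the three spellings side by side. The top-level names are made up for this example; liftM2 and ap come from Control.Monad, while <$> and <*> are in the Prelude of recent GHC versions.

```haskell
import Control.Monad (ap, liftM2)

-- The same computation, written three ways.
viaLiftM2 :: [Integer]
viaLiftM2 = liftM2 mod [12, 27] [10]

viaApplicative :: [Integer]
viaApplicative = mod <$> [12, 27] <*> [10]

viaAp :: [Integer]
viaAp = return mod `ap` [12, 27] `ap` [10]

-- Stringing along one more argument needs no liftM3, just another (<*>).
threeArgs :: [(Int, Char, Bool)]
threeArgs = (,,) <$> [1, 2] <*> "ab" <*> [True]
```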
That said, Travis Brown is correct that, in this case, it seems you don't really need any of the above. In fact, you can simplify your function a great deal: For instance, both last and next can be written as single-argument functions mapped over the same list, ps, and zipWith is the same as a zip and a map. All of these maps can be combined and pushed down into the alts function. This makes alts a single-argument function, eliminating the zip as well. Finally, the concat can be combined with the map as concatMap or, if preferred, (>>=). Here's what it ends up as:
increasing' :: Integer -> [Integer]
increasing' 1 = [1..9]
increasing' n = increasing' (n - 1) >>= alts
  where alts x = map ((x * 10) +) [mod x 10..9]
Note that all refactoring I did to get to that version from yours was purely syntactic, only applying transformations that should have no impact on the result of the function. Equational reasoning and referential transparency are nice!
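One way to back that claim with a quick test is to compare the refactored function against a brute-force filter over all n-digit numbers. This is only a sanity check under the assumption that "increasing" means every digit is at least as large as the one before it; digits, nonDecreasing and bruteForce are hypothetical helper names.

```haskell
-- The refactored version from above.
increasing' :: Integer -> [Integer]
increasing' 1 = [1..9]
increasing' n = increasing' (n - 1) >>= alts
  where alts x = map ((x * 10) +) [x `mod` 10 .. 9]

-- Decimal digits of a positive number, most significant first.
digits :: Integer -> [Integer]
digits = map (\c -> read [c]) . show

nonDecreasing :: [Integer] -> Bool
nonDecreasing ds = and (zipWith (<=) ds (drop 1 ds))

-- Every n-digit number whose digits never decrease, by exhaustive search.
bruteForce :: Integer -> [Integer]
bruteForce n = filter (nonDecreasing . digits) [10 ^ (n - 1) .. 10 ^ n - 1]
```

For small n, increasing' n == bruteForce n should evaluate to True.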

I think what you are trying to do is this:
increasing :: Integer -> [Integer]
increasing 1 = [1..9]
increasing n = do p <- increasing (n - 1)
                  let last = p `mod` 10
                      next = p * 10
                  alt <- [last .. 9]
                  return $ next + alt
Or, using a "list comprehension", which is just special monad syntax for lists:
increasing2 :: Integer -> [Integer]
increasing2 1 = [1..9]
increasing2 n = [next + alt | p <- increasing (n - 1),
                              let last = p `mod` 10
                                  next = p * 10,
                              alt <- [last .. 9]
                ]
The idea in the list monad is that you use "bind" (<-) to iterate over a list of values, and let to compute a single value based on what you have so far in the current iteration. When you use bind a second time, the iterations are nested from that point on.
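To see the nesting explicitly, here is a small made-up example written three ways: as a do-block, with (>>=) spelled out, and with concatMap and map. The names are chosen for illustration; all three definitions produce the same list.

```haskell
-- do-notation: each (<-) iterates, each let computes a single value.
pairsDo :: [(Int, Int)]
pairsDo = do
  x <- [1, 2]
  let y0 = x * 10
  y <- [y0, y0 + 1]
  return (x, y)

-- The same thing with the binds written out; the second bind is
-- nested inside the function passed to the first.
pairsBind :: [(Int, Int)]
pairsBind =
  [1, 2] >>= \x ->
    let y0 = x * 10
    in [y0, y0 + 1] >>= \y ->
         return (x, y)

-- For lists, (>>=) is concatMap with its arguments flipped.
pairsConcatMap :: [(Int, Int)]
pairsConcatMap =
  concatMap (\x -> let y0 = x * 10 in map (\y -> (x, y)) [y0, y0 + 1]) [1, 2]
```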

It looks very unusual to me to use liftM2 (or <$> and <*>) when one of the arguments is always a singleton list. Why not just use map? The following does the same thing as your code:
increasing :: Integer -> [Integer]
increasing n
    | n == 1 = [1..9]
    | otherwise = do let ps = increasing (n - 1)
                     let last = map (flip mod 10) ps
                     let next = map (10 *) ps
                     alternateEndings next last
    where alternateEndings xs ys = concat $ zipWith alts xs ys
          alts x y = map (x +) [y..9]

Here's how I'd write your code:
increasing :: Integer -> [Integer]
increasing 1 = [1..9]
increasing n = let allEndings x = map (10*x +) [x `mod` 10 .. 9]
               in concatMap allEndings $ increasing (n - 1)
I arrived at this code as follows. The first thing I did was to use pattern matching instead of guards, since it's clearer here. The next thing I did was to eliminate the liftM2s. They're unnecessary here, because they're always called with one size-one list; in that case, it's the same as calling map. So liftM2 (*) ps [10] is just map (* 10) ps, and similarly for the other call sites. If you want a general replacement for liftM2, though, you can use Control.Applicative's <$> (which is just fmap) and <*> to replace liftMn for any n: liftMn f a b c ... z becomes f <$> a <*> b <*> c <*> ... <*> z. Whether or not it's nicer is a matter of taste; I happen to like it.1 But here, we can eliminate that entirely.
The next place I simplified the original code is the do .... You never actually take advantage of the fact that you're in a do-block, and so that code can become
let ps = increasing (n - 1)
    last = map (`mod` 10) ps
    next = map (* 10) ps
in alternateEndings next last
From here, arriving at my code essentially involved fusing all of your maps together. One of the only remaining calls that wasn't a map was zipWith. But because you effectively have zipWith alts next last, you only work with 10*p and p `mod` 10 at the same time, so we can calculate them in the same function. This leads to
let ps = increasing (n - 1)
in concat $ map alts ps
  where alts p = map (10*p +) [p `mod` 10..9]
And this is basically my code: concat $ map ... should always become concatMap (which, incidentally, is =<< in the list monad), we only use ps once so we can fold it in, and I prefer let to where.
1: Technically, this only works for Applicatives, so if you happen to be using a monad which hasn't been made one, <$> is `liftM` and <*> is `ap`. All monads can be made applicative functors, though, and many of them have been.

I think it's cleaner to pass last digit in a separate parameter and use lists.
f a 0 = [[]]
f a n = do x <- [a..9]
           k <- f x (n-1)
           return (x:k)
num = foldl (\x y -> 10*x + y) 0
increasing = map num . f 1
