all possibilities of dividing a list in two in Haskell - list

What's the most direct/efficient way to create all possibilities of dividing one (even) list into two in Haskell? I toyed with splitting all permutations of the list but that would add many extras - all the instances where each half contains the same elements, just in a different order. For example,
[1,2,3,4] should produce something like:
[ [1,2], [3,4] ]
[ [1,3], [2,4] ]
[ [1,4], [2,3] ]
Edit: thank you for your comments -- the order of elements and the type of the result is less important to me than the concept - an expression of all two-groups from one group, where element order is unimportant.

Here's an implementation, closely following the definition.
The first element always goes into the left group. After that, we add the next head element into one or the other group. If one of the groups becomes too big, there is no choice anymore and we must add all the rest into the shorter group.
divide :: [a] -> [([a], [a])]
divide []     = [([],[])]
divide (x:xs) = go ([x],[], xs, 1,length xs) []
  where
    go (a,b,   [],     i,j) zs = (a,b) : zs   -- i == length a - length b
    go (a,b, s@(x:xs), i,j) zs                -- j == length s
       | i    >= j = (a,b++s) : zs
       | (-i) >= j = (a++s,b) : zs
       | otherwise = go (x:a, b, xs, i+1, j-1) $ go (a, x:b, xs, i-1, j-1) zs
This produces
*Main> divide [1,2,3,4]
[([2,1],[3,4]),([3,1],[2,4]),([1,4],[3,2])]
The limitation of having an even length list is unnecessary:
*Main> divide [1,2,3]
[([2,1],[3]),([3,1],[2]),([1],[3,2])]
(the code was re-written in the "difference-list" style for efficiency: go2 A zs == go1 A ++ zs).
edit: How does this work? Imagine yourself sitting at a pile of stones, dividing it into two. You put the first stone to a side, which one it doesn't matter (so, left, say). Then there's a choice where to put each next stone — unless one of the two piles becomes too small by comparison, and we thus must put all the remaining stones there at once.

To find all partitions of a non-empty list (of even length n) into two equal-sized parts, we can, to avoid repetitions, posit that the first element shall be in the first part. Then it remains to find all ways to split the tail of the list into one part of length n/2 - 1 and one of length n/2.
-- not to be exported
splitLen :: Int -> Int -> [a] -> [([a],[a])]
splitLen 0 _ xs = [([],xs)]
splitLen _ _ [] = error "Oops"
splitLen k l ys@(x:xs)
    | k == l    = [(ys,[])]
    | otherwise = [(x:us,vs) | (us,vs) <- splitLen (k-1) (l-1) xs]
                  ++ [(us,x:vs) | (us,vs) <- splitLen k (l-1) xs]
does that splitting if called appropriately. Then
partitions :: [a] -> [([a],[a])]
partitions [] = [([],[])]
partitions (x:xs)
    | even len  = error "Original list with odd length"
    | otherwise = [(x:us,vs) | (us,vs) <- splitLen half len xs]
      where
        len = length xs
        half = len `quot` 2
generates all the partitions without redundantly computing duplicates.
luqui raises a good point. I haven't taken into account the possibility that you'd want to split lists with repeated elements. With those, it gets a little more complicated, but not much. First, we group the list into equal elements (done here for an Ord constraint, for only Eq, that could still be done in O(length²)). The idea is then similar, to avoid repetitions, we posit that the first half contains more elements of the first group than the second (or, if there is an even number in the first group, equally many, and similar restrictions hold for the next group etc.).
import Data.List (group, sort)
repartitions :: Ord a => [a] -> [([a],[a])]
repartitions = map flatten2 . halves . prepare
  where
    flatten2 (u,v) = (flatten u, flatten v)
-- group equal elements and count them
prepare :: Ord a => [a] -> [(a,Int)]
prepare = map (\xs -> (head xs, length xs)) . group . sort
halves :: [(a,Int)] -> [([(a,Int)],[(a,Int)])]
halves [] = [([],[])]
halves ((a,k):more)
    | odd total = error "Odd number of elements"
    | even k    = [((a,low):us,(a,low):vs) | (us,vs) <- halves more] ++ [normalise ((a,c):us,(a,k-c):vs) | c <- [low + 1 .. min half k], (us,vs) <- choose (half-c) remaining more]
    | otherwise = [normalise ((a,c):us,(a,k-c):vs) | c <- [low + 1 .. min half k], (us,vs) <- choose (half-c) remaining more]
      where
        remaining = sum $ map snd more
        total = k + remaining
        half = total `quot` 2
        low = k `quot` 2
        normalise (u,v) = (nz u, nz v)
        nz = filter ((/= 0) . snd)
choose :: Int -> Int -> [(a,Int)] -> [([(a,Int)],[(a,Int)])]
choose 0 _ xs = [([],xs)]
choose _ _ [] = error "Oops"
choose need have ((a,k):more) = [((a,c):us,(a,k-c):vs) | c <- [least .. most], (us,vs) <- choose (need-c) (have-k) more]
  where
    least = max 0 (need + k - have)
    most  = min need k
-- expand the (element, count) pairs back into a plain list
flatten :: [(a,Int)] -> [a]
flatten xs = xs >>= uncurry (flip replicate)

Daniel Fischer's answer is a good way to solve the problem. I offer a worse (more inefficient) way, but one which more obviously (to me) corresponds to the problem description. I will generate all partitions of the list into two equal length sublists, then filter out equivalent ones according to your definition of equivalence. The way I usually solve problems is by starting like this -- create a solution that is as obvious as possible, then gradually transform it into a more efficient one (if necessary).
import Data.List (sort, nubBy, permutations)
type Partition a = ([a],[a])
-- Your notion of equivalence (sort to ignore the order)
equiv :: (Ord a) => Partition a -> Partition a -> Bool
equiv p q = canon p == canon q
  where
    canon (xs,ys) = sort [sort xs, sort ys]
-- All ordered partitions
partitions :: [a] -> [Partition a]
partitions xs = map (splitAt l) (permutations xs)
  where
    l = length xs `div` 2
-- All partitions filtered out by the equivalence
equivPartitions :: (Ord a) => [a] -> [Partition a]
equivPartitions = nubBy equiv . partitions
Testing
>>> equivPartitions [1,2,3,4]
[([1,2],[3,4]),([3,2],[1,4]),([3,1],[2,4])]
Note
After using QuickCheck to test the equivalence of this implementation with Daniel's, I found an important difference. Clearly, mine requires an (Ord a) constraint and his does not, and this hints at what the difference would be. In particular, if you give his [0,0,0,0], you will get a list with three copies of ([0,0],[0,0]), whereas mine will give only one copy. Which of these is correct was not specified; Daniel's is natural when considering the two output lists to be ordered sequences (which is what that type is usually considered to be), mine is natural when considering them as sets or bags (which is how this question seemed to be treating them).
Splitting The Difference
It is possible to get from an implementation that requires Ord to one that doesn't, by operating on the positions rather than the values in a list. I came up with this transformation -- an idea which I believe originates with Benjamin Pierce in his work on bidirectional programming.
{-# LANGUAGE RankNTypes #-}  -- needed for the forall in the type of onIndices
import Data.Traversable
import Control.Monad.Trans.State
data Labelled a = Labelled { label :: Integer, value :: a }
instance Eq (Labelled a) where
    a == b = compare a b == EQ
instance Ord (Labelled a) where
    compare a b = compare (label a) (label b)
labels :: (Traversable t) => t a -> t (Labelled a)
labels t = evalState (traverse trav t) 0
  where
    trav x = state (\i -> i `seq` (Labelled i x, i + 1))
onIndices :: (Traversable t, Functor u)
          => (forall a. Ord a => t a -> u a)
          -> forall b. t b -> u b
onIndices f = fmap value . f . labels
Using onIndices on equivPartitions wouldn't speed it up at all, but it would allow it to have the same semantics as Daniel's (up to equiv of the results) without the constraint, and with my more naive and obvious way of expressing it -- and I just thought it was an interesting way to get rid of the constraint.

My own generalized version, added much later, inspired by Will's answer:
import Data.Map (adjust, fromList, toList)
import Data.List (groupBy, sort)
divide xs n evenly = divide' xs (zip [0..] (replicate n [])) where
  evenPSize = div (length xs) n
  divide' []     result = [result]
  divide' (x:xs) result = do
    index <- indexes
    divide' xs (toList $ adjust (x :) index (fromList result)) where
      notEmptyBins = filter (not . null . snd) $ result
      partlyFullBins | evenly == "evenly" = map fst . filter ((<evenPSize) . length . snd) $ notEmptyBins
                     | otherwise          = map fst notEmptyBins
      indexes = partlyFullBins
             ++ if any (null . snd) result
                   then map fst . take 1 . filter (null . snd) $ result
                   else if null partlyFullBins
                           then map fst. head . groupBy (\a b -> length (snd a) == length (snd b)) . sort $ result
                           else []

Related

Breaking a list into sublists of a specified size using foldr

I'm taking a functional programming class and I'm having a hard time leaving the OOP mindset behind and finding answers to a lot of my questions.
I have to create a function that takes an ordered list and converts it into specified size sublists using a variation of fold.
This isn't right, but it's what I have:
splitList :: (Ord a) => Int -> [a] -> [[a]]
splitList size xs
| [condition] = foldr (\item subList -> item:subList) [] xs
| otherwise =
I've been searching and I found out that foldr is the variation that works better for what I want, and I think I've understood how fold works, I just don't know how I'll set up the guards so that when length sublist == size haskell resets the accumulator and goes on to the next list.
If I didn't explain myself correctly, here's the result I want:
> splitList 3 [1..10]
> [[1,2,3],[4,5,6],[7,8,9],[10]]
Thanks!
While Fabián's and chi's answers are entirely correct, there is actually an option to solve this puzzle using foldr. Consider the following code:
splitList :: Int -> [a] -> [[a]]
splitList n =
  foldr (\el acc -> case acc of
            [] -> [[el]]
            (h : t) | length h < n -> (el : h) : t
            _ -> [el] : acc
        ) []
The strategy here is to build up a list by extending its head for as long as its length is less than desired. This solution has, however, two drawbacks:
1. It does something slightly different than in your example: splitList 3 [1..10] produces [[1],[2,3,4],[5,6,7],[8,9,10]].
2. Its complexity is O(n * length l), as we measure the length of a list of up to n elements at each element, which yields a linear number of linear operations.
Let's first take care of first issue. In order to start counting at the beginning we need to traverse the list left–to–right, while foldr does it right–to–left. There is a common trick called "continuation passing" which will allow us to reverse the direction of the walk:
splitList :: Int -> [a] -> [[a]]
splitList n l = map reverse . reverse $
  foldr (\el cont acc ->
           case acc of
             [] -> cont [[el]]
             (h : t) | length h < n -> cont ((el : h) : t)
             _ -> cont ([el] : acc)
        ) id l []
Here, instead of building the list in the accumulator, we build up a function that will transform the list in the right direction. See this question for details. A side effect is that the result comes out reversed, so we need to counter that by applying reverse to the whole list and to each of its elements. This is still linear and tail-recursive, though.
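To see the continuation-passing trick in isolation, here is a minimal sketch. The name reverseViaFoldr is made up for illustration and does not appear in the answer. Each step of the foldr returns a function that is still waiting for the accumulator, so the elements reach the accumulator left to right even though the fold itself is a foldr.

```haskell
-- Hypothetical helper, for illustration only: reversing a list with foldr
-- by threading an accumulator through continuations.
reverseViaFoldr :: [a] -> [a]
reverseViaFoldr xs = foldr step id xs []
  where
    -- 'cont' handles the rest of the list. We pass it the accumulator
    -- extended with the current element, so the leftmost element is
    -- consed first and ends up last in the result.
    step el cont acc = cont (el : acc)
```

The splitList above has the same shape: only the body of the step function differs.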
Now let's work on the performance issue. The problem was that length is linear on ordinary lists. There are two solutions for this:
1. Use another structure that caches the length for constant-time access
2. Cache the value ourselves
Because I guess it is a list exercise, let's go for the latter option:
splitList :: Int -> [a] -> [[a]]
splitList n l = map reverse . reverse . snd $
  foldr (\el cont (countAcc, listAcc) ->
           case listAcc of
             [] -> cont (countAcc, [[el]])
             (h : t) | countAcc < n -> cont (countAcc + 1, (el : h) : t)
             (h : t) -> cont (1, [el] : (h : t))
        ) id l (1, [])
Here we extend our computational state with a counter that at each point stores the length of the current sublist. This gives us a constant-time check on each element and results in linear time complexity overall.
A way to simplify this problem would be to split this into multiple functions. There are two things you need to do:
take n elements from the list, and
keep taking from the list as much as possible.
Let's try taking first:
taking :: Int -> [a] -> [a]
taking n [] = undefined
taking n (x:xs) = undefined
If there are no elements then we cannot take any more, so we can only return an empty list. On the other hand, if we do have an element, then we can think of taking n (x:xs) as x : taking (n-1) xs; we only need to check that n > 0.
taking n (x:xs)
  | n > 0 = x : taking (n-1) xs
  | otherwise = []
Now, we need to do that multiple times with the remainder so we should probably also return whatever remains from taking n elements from a list, in this case it would be whatever remains when n = 0 so we could try to adapt it to
| otherwise = ([], x:xs)
and then you would need to modify the type signature to return ([a], [a]) and the other 2 definitions to ensure you do return whatever remained after taking n.
With this approach your splitList would look like:
splitList n [] = []
splitList n l = chunk : splitList n remainder
  where (chunk, remainder) = taking n l
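One possible way to fill in the steps described above is sketched below. It is an assumption about how the pieces fit together, not the answerer's exact code: taking returns the chunk together with the remainder, and the check for a non-positive size is an addition that guards against looping forever on empty chunks.

```haskell
-- Take up to n elements and also return whatever is left over.
taking :: Int -> [a] -> ([a], [a])
taking _ [] = ([], [])
taking n (x:xs)
  | n > 0     = let (chunk, rest) = taking (n - 1) xs in (x : chunk, rest)
  | otherwise = ([], x : xs)

-- Keep taking chunks until the list is exhausted.
splitList :: Int -> [a] -> [[a]]
splitList _ [] = []
splitList n l
  | n <= 0    = error "splitList: need a positive size"  -- added guard (assumption)
  | otherwise = chunk : splitList n remainder
  where (chunk, remainder) = taking n l
```

With these definitions, splitList 3 [1..10] gives [[1,2,3],[4,5,6],[7,8,9],[10]].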
Note however that folding would not be appropriate since it "flattens" whatever you are working on, for example given a [Int] you could fold to produce a sum which would be an Int. (foldr :: (a -> b -> b) -> b -> [a] -> b or "foldr function zero list produces an element of the function return type")
You want:
splitList 3 [1..10]
> [[1,2,3],[4,5,6],[7,8,9],[10]]
Since the "remainder" [10] is at the tail, I recommend you use foldl instead. E.g.
splitList :: (Ord a) => Int -> [a] -> [[a]]
splitList size xs
  | size > 0 = foldl go [] xs
  | otherwise = error "need a positive size"
  where go acc x = ....
What should go do? Essentially, on your example, we must have:
splitList 3 [1..10]
= go (splitList 3 [1..9]) 10
= go [[1,2,3],[4,5,6],[7,8,9]] 10
= [[1,2,3],[4,5,6],[7,8,9],[10]]
splitList 3 [1..9]
= go (splitList 3 [1..8]) 9
= go [[1,2,3],[4,5,6],[7,8]] 9
= [[1,2,3],[4,5,6],[7,8,9]]
splitList 3 [1..8]
= go (splitList 3 [1..7]) 8
= go [[1,2,3],[4,5,6],[7]] 8
= [[1,2,3],[4,5,6],[7,8]]
and
splitList 3 [1]
= go [] 1
= [[1]]
Hence, go acc x should
- check if acc is empty; if so, produce a singleton list [[x]];
- otherwise, check the last list in acc:
  - if its length is less than size, append x to it;
  - otherwise, append a new list [x] to acc.
Try doing this by hand on your example to understand all the cases.
This will not be efficient, but it will work.
You don't really need the Ord a constraint.
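The case analysis above could be written out as follows. This is only a sketch of one way to define go, with the Ord constraint dropped as noted. It deliberately keeps the inefficient last/init/++ operations so that it mirrors the description step by step.

```haskell
splitList :: Int -> [a] -> [[a]]
splitList size xs
  | size > 0  = foldl go [] xs
  | otherwise = error "need a positive size"
  where
    -- empty accumulator: start the first chunk
    go [] x = [[x]]
    go acc x
      -- the last chunk still has room: append x to it
      | length (last acc) < size = init acc ++ [last acc ++ [x]]
      -- the last chunk is full: start a new chunk
      | otherwise                = acc ++ [[x]]
```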
Checking the accumulator's first sublist's length would lead to information flow from the right and the first chunk ending up the shorter one, potentially, instead of the last. Such function won't work on infinite lists either (not to mention the foldl-based variants).
A standard way to arrange for the information flow from the left with foldr is using an additional argument. The general scheme is
subLists n xs = foldr g z xs n
  where
    g x r i = cons x i (r (i-1))
    ....
The i argument to cons will guide its decision as to where to add the current element into. The i-1 decrements the counter on the way forward from the left, instead of on the way back from the right. z must have the same type as r and as the foldr itself as a whole, so,
z _ = [[]]
This means there must be a post-processing step, and some edge cases must be handled as well,
subLists n xs = post . foldr g z xs $ n
  where
    z _ = [[]]
    g x r i | i == 1 = cons x i (r n)
    g x r i = cons x i (r (i-1))
    ....
cons must be lazy enough not to force the results of the recursive call prematurely.
I leave it as an exercise finishing this up.
For a simpler version with a pre-processing step instead, see this recent answer of mine.
Just going to give another answer: this is quite similar to trying to write groupBy as a fold, and actually has a couple gotchas w.r.t. laziness that you have to bear in mind for an efficient and correct implementation. The following is the fastest version I found that maintains all the relevant laziness properties:
splitList :: Int -> [a] -> [[a]]
splitList m xs = snd (foldr f (const ([],[])) xs 1)
  where
    f x a i
      | i <= 1    = let (ys,zs) = a m in ([], (x : ys) : zs)
      | otherwise = let (ys,zs) = a (i-1) in (x : ys , zs)
The ys and the zs gotten from the recursive processing of the rest of list indicate the first and the rest of the groups into which the rest of the list will be broken up, by said recursive processing. So we either prepend the current element before that first subgroup if it is still shorter than needed, or we prepend before the first subgroup when it is just right and start a new, empty subgroup.
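A quick way to observe the laziness property is to run the function on an infinite list. The sketch below repeats the definition above so that it is self-contained; the name firstChunks is made up for illustration.

```haskell
splitList :: Int -> [a] -> [[a]]
splitList m xs = snd (foldr f (const ([],[])) xs 1)
  where
    f x a i
      | i <= 1    = let (ys,zs) = a m in ([], (x : ys) : zs)
      | otherwise = let (ys,zs) = a (i-1) in (x : ys , zs)

-- Only the first three chunks of an infinite input are demanded, so this
-- terminates: the lazy let-bound pairs are forced only as far as needed.
firstChunks :: [[Int]]
firstChunks = take 3 (splitList 2 [1 ..])
```

Here firstChunks evaluates to [[1,2],[3,4],[5,6]], which the foldl-based variants cannot do.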

Get index of next smallest element in the list in Haskell

I'm a newbie to Haskell. I am pretty good with imperative languages but not with functional ones. Haskell is my first functional language.
I am trying to figure out, how to get the index of the smallest element in the list where the minimum element is defined by me.
Let me explain by examples.
For example :
Function signature
minList :: x -> [x]
let x = 2
let list = [2,3,5,4,6,5,2,1,7,9,2]
minList x list --output 1 <- is index
This should return 1, because list[1] is 3, and 3 is the smallest element after x (=2).
let x = 1
let list = [3,5,4,6,5,2,1,7,9,2]
minList x list -- output 9 <- is index
It should return 9 because at list[9] is 2 and 2 is the smallest element after 1. x = 1 which is defined by me.
What I have tried so far.
minListIndex :: (Ord a, Num a) => a -> [a] -> a
minListIndex x [] = 0
minListIndex x (y:ys)
| x > y = length ys
| otherwise = m
where m = minListIndex x ys
When I load the file I get this error
• Couldn't match expected type ‘a’ with actual type ‘Int’
‘a’ is a rigid type variable bound by
the type signature for:
minListIndex :: forall a. (Ord a, Num a) => a -> [a] -> a
at myFile.hs:36:17
• In the expression: 1 + length ys
In an equation for ‘minListIndex’:
minListIndex x (y : ys)
| x > y = 1 + length ys
| otherwise = 1 + m
where
m = minListIndex x ys
• Relevant bindings include
m :: a (bound at myFile.hs:41:19)
ys :: [a] (bound at myFile.hs:38:19)
y :: a (bound at myFile.hs:38:17)
x :: a (bound at myFile.hs:38:14)
minListIndex :: a -> [a] -> a (bound at myFile.hs:37:1)
When I modify the function like this
minListIndex :: (Ord a, Num a) => a -> [a] -> a
minListIndex x [] = 0
minListIndex x (y:ys)
| x > y = 2 -- <- modified...
| otherwise = 3 -- <- modified
where m = minListIndex x ys
I load the file again then it compiles and runs but ofc the output is not desired.
What is the problem with
| x > y = length ys
| otherwise = m
?
In short: Basically, I want to find the index of the smallest element but higher than the x which is defined by me in parameter/function signature.
Thanks for the help in advance!
minListIndex :: (Ord a, Num a) => a -> [a] -> a
The problem is that you are trying to return a result of the generic type a, but it is actually an index into a list.
Suppose you are trying to evaluate your function on a list of doubles. In this case the compiler would have to instantiate the function's type to Double -> [Double] -> Double, which is nonsense.
In fact, the compiler notices that you are returning something derived from the list's length and tells you that it cannot match the generic type a with the concrete type Int.
length ys returns Int, so you can try this instead:
minListIndex :: Ord a => a -> [a] -> Int
Regarding your original problem, it seems that you can't solve it with plain recursion. Consider defining a recursive helper function with an accumulator. In your case the accumulator can be a pair (min_value_so_far, its_index).
First off, I'd separate the index type from the list element type altogether. There's no apparent reason for them to be the same. I will use the BangPatterns extension to avoid a space leak without too much notation; enable that by adding {-# language BangPatterns #-} to the very top of the file. I will also import Data.Word to get access to the Word64 type.
There are two stages: first, find the index of the given element (if it's present) and the rest of the list beyond that point. Then, find the index of the minimum of the tail.
-- Find the 0-based index of the first occurrence
-- of the given element in the list, and
-- the rest of the list after that element.
findGiven :: Eq a => a -> [a] -> Maybe (Word64, [a])
findGiven given = go 0 where
  go !_k [] = Nothing --not found
  go !k (x:xs)
    | given == x = Just (k, xs)
    | otherwise = go (k+1) xs
-- Find the minimum (and its index) of the elements of the
-- list greater than the given one.
findMinWithIndexOver :: Ord a => a -> [a] -> Maybe (Word64, a)
findMinWithIndexOver given = go 0 Nothing where
  go !_k acc [] = acc
  go !k acc (x : xs)
    | x <= given = go (k + 1) acc xs
    | otherwise
    = case acc of
        Nothing -> go (k + 1) (Just (k, x)) xs
        Just (_ix_min, curr_min)
          | x < curr_min -> go (k + 1) (Just (k, x)) xs
          | otherwise -> go (k + 1) acc xs
You can now put these functions together to construct the one you seek. If you want a general Num result rather than a Word64 one, you can use fromIntegral at the very end. Why use Word64? Unlike Int or Word, it's (practically) guaranteed not to overflow in any reasonable amount of time. It's likely substantially faster than using something like Integer or Natural directly.
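One way the two stages might be combined is sketched below. The name nextSmallestIndex is made up for illustration; the two helpers are repeated so the block is self-contained. The index found in the tail is relative to the tail, so it is shifted by the position of the given element plus one.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.Word (Word64)

findGiven :: Eq a => a -> [a] -> Maybe (Word64, [a])
findGiven given = go 0 where
  go !_k [] = Nothing
  go !k (x:xs)
    | given == x = Just (k, xs)
    | otherwise  = go (k + 1) xs

findMinWithIndexOver :: Ord a => a -> [a] -> Maybe (Word64, a)
findMinWithIndexOver given = go 0 Nothing where
  go !_k acc [] = acc
  go !k acc (x : xs)
    | x <= given = go (k + 1) acc xs
    | otherwise  = case acc of
        Just (_, currMin) | x >= currMin -> go (k + 1) acc xs
        _                                -> go (k + 1) (Just (k, x)) xs

-- Hypothetical combination (not from the answer): index, in the whole list,
-- of the smallest element greater than 'given' that occurs after 'given'.
nextSmallestIndex :: Ord a => a -> [a] -> Maybe Word64
nextSmallestIndex given xs = do
  (k, rest) <- findGiven given xs
  (j, _)    <- findMinWithIndexOver given rest
  pure (k + 1 + j)
```

On the two examples from the question this yields Just 1 and Just 9 respectively, and Nothing when the given element is absent or nothing larger follows it.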
It is not clear to me what you want exactly. Based on the examples, I guess it is: find the index of the smallest element greater than x which appears after x. In that case, this solution uses only the Prelude, with no imports:
minList :: Ord a => a -> [a] -> Int
minList x l = snd . minimum . filter (\a -> x < fst a) . dropWhile (\a -> x /= fst a) $ zip l [0..]
The logic is:
create the list of pairs, [(elem, index)] using zip l [0..]
drop elements until you find the input x using dropWhile (\a -> x /= fst a)
discard elements not greater than x using filter (\a -> x < fst a)
find the minimum of the resulting list. Tuples are ordered using lexicographic order so it fits your problem
take the index using snd
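Note that minimum fails on an empty list, which happens when x is absent or nothing greater follows it. A hedged variant of the same pipeline that returns a Maybe instead could look like this; the name minListSafe is made up for illustration.

```haskell
-- Same steps as above, but the empty case is reported as Nothing
-- instead of crashing in 'minimum'.
minListSafe :: Ord a => a -> [a] -> Maybe Int
minListSafe x l =
  case filter ((x <) . fst) (dropWhile ((x /=) . fst) (zip l [0 ..])) of
    []    -> Nothing
    pairs -> Just (snd (minimum pairs))
```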
Your function can be constructed out of ready-made parts as
import Data.Maybe (listToMaybe)
import Data.List (sortBy)
import Data.Ord (comparing)
foo :: (Ord a, Enum b) => a -> [a] -> Maybe b
foo x = fmap fst . listToMaybe . take 1
      . dropWhile ((<= x) . snd)
      . sortBy (comparing snd)
      . dropWhile ((/= x) . snd)
      . zip [toEnum 0..]
This Maybe finds the index of the next smallest element in the list above the given element, situated after the given element, in the input list. As you've requested.
You can use any Enum type of your choosing as the index.
Now you can implement this higher-level executable spec as direct recursion, using an efficient Map data structure to hold the sorted elements above x seen so far, in order to find the next smallest, etc.
Correctness first, efficiency later!
Efficiency update: dropping after the sort discards elements that have already been sorted, so there's wasted effort there; indeed it should be replaced with filtering (as seen in the answer by Luis Morillo) before the sort. And if our element type is in Integral (so it is a properly discrete type, unlike just an Enum, thanks to @dfeuer for pointing this out!), there's one more chance for an opportunistic optimization: if we hit on the minimal possible element, succ x, by pure chance, there's no further chance of improvement, and so we should bail out right at that point:
bar :: (Integral a, Enum b) => a -> [a] -> Maybe b
bar x = fmap fst . either Just (listToMaybe . take 1
                                . sortBy (comparing snd))
      . findOrFilter ((== succ x).snd) ((> x).snd)
      . dropWhile ((/= x) . snd)
      . zip [toEnum 0..]
findOrFilter :: (a -> Bool) -> (a -> Bool) -> [a] -> Either a [a]
findOrFilter t p = go
  where go [] = Right []
        go (x:xs) | t x = Left x
                  | otherwise = fmap ([x | p x] ++) $ go xs
Testing:
> foo 5 [2,3,5,4,6,5,2,1,7,9,2] :: Maybe Int
Just 4
> foo 2 [2,3,5,4,6,5,2,1,7,9,2] :: Maybe Int
Just 1
> foo 1 [3,5,4,6,5,2,1,7,9,2] :: Maybe Int
Just 9

Haskell - split a list into two sublists with closest sums

I'm a Haskell beginner trying to learn more about the language by solving some online quizzes/problem sets.
The problem/question is quite lengthy but a part of it requires code that can find the number which divides a given list into two (nearly) equal (by sum) sub-lists.
Given [1..10]
Answer should be 7 since 1+2+..7 = 28 & 8+9+10 = 27
This is the way I implemented it
-- partitions list by y
partishner :: (Floating a) => Int -> [a] -> [[[a]]]
partishner 0 xs = [[xs],[]]
partishner y xs = [take y xs : [drop y xs]] ++ partishner (y - 1) xs
-- finds the equal sum
findTheEquilizer :: (Ord a, Floating a) => [a] -> [[a]]
findTheEquilizer xs = fst $ minimumBy (comparing snd) zipParty
where party = (tail . init) (partishner (length xs) xs) -- removes [xs,[]] types
afterParty = (map (\[x, y] -> (x - y) ** 2) . init . map (map sum)) party
zipParty = zip party afterParty -- zips partitions and squared diff betn their sums
Given (last . head) (findTheEquilizer [1..10])
output : 7
For numbers near 50k it works fine
λ> (last . head) (findTheEquilizer [1..10000])
7071.0
The trouble starts when I put in lists with any more than 70k elements in it. It takes forever to compute.
So what do I have to change in the code to make it run better, or do I have to change my whole approach? I'm guessing it's the latter, but I'm not sure how to go about doing that.
It looks to me that the implementation is quite chaotic. For example, partishner seems to construct a list of lists of lists of a, where, if I understood it correctly, the outer list contains lists of two elements each: the list of elements on "the left", and the list of elements on "the right". As a result, this will take O(n²) to construct the lists.
By using lists over 2-tuples, this is also quite "unsafe", since a list can - although here probably impossible - contain no elements, one element, or more than two elements. If you make a mistake in one of the functions, it will be hard to find out that mistake.
It looks to me that it might be easier to implement a "sweep algorithm": we first calculate the sum of all the elements in the list. This is the value on the "right" in case we decide to split at that specific point, next we start moving from left to right, each time subtracting the element from the sum on the right, and adding it to the sum on the left. We can each time evaluate the difference in score, like:
import Data.List(unfoldr)
sweep :: Num a => [a] -> [(Int, a, [a])]
sweep lst = x0 : unfoldr f x0
    where x0 = (0, sum lst, lst)
          f (_, _, []) = Nothing
          f (i, r, (x: xs)) = Just (l, l)
              where l = (i+1, r-2*x, xs)
For example:
Prelude Data.List> sweep [1,4,2,5]
[(0,12,[1,4,2,5]),(1,10,[4,2,5]),(2,2,[2,5]),(3,-2,[5]),(4,-12,[])]
So if we select to split at the first split point (before the first element), the sum on the right is 12 higher than the sum on the left, if we split after the first element, the sum on the right (11) is 10 higher than the sum on the left (1).
We can then obtain the minimum of these splits with minimumBy :: (a -> a -> Ordering) -> [a] -> a:
import Data.List(minimumBy)
import Data.Ord(comparing)
findTheEquilizer :: (Ord a, Num a) => [a] -> ([a], [a])
findTheEquilizer lst = (take idx lst, tl)
    where (idx, _, tl) = minimumBy (comparing (abs . \(_, x, _) -> x)) (sweep lst)
We then obtain the correct value for [1..10]:
Prelude Data.List Data.Ord Data.List> findTheEquilizer [1..10]
([1,2,3,4,5,6,7],[8,9,10])
or for 70'000:
Prelude Data.List Data.Ord Data.List> head (snd (findTheEquilizer [1..70000]))
49498
The above is not ideal, it can be implemented more elegantly, but I leave this as an exercise.
Okay, firstly, let's analyse why it runs forever (actually not forever, just slowly). Take a look at the partishner function:
partishner y xs = [take y xs : [drop y xs]] ++ partishner (y - 1) xs
where take y xs and drop y xs run in linear time, i.e. O(N) for y = N, and so
[take y xs : [drop y xs]]
is O(N) too.
However, it is run again and again, recursively, once for each element of the given list. Now suppose the length of the given list is M. The call with y = N takes O(N) time, so finishing the computation needs
O(1+2+...+M) = O(M(1+M)/2) ~ O(M^2)
Now, if the list has 70k elements, it needs at least 70k^2 steps. That is why it hangs.
Instead of using the partishner function, you can compute the running sums of the list in linear time:
sumList :: (Floating a) => [a] -> [a]
sumList xs = sum 0 xs
    where sum _ [] = []
          sum s (y:ys) = let s' = s + y in s' : sum s' ys
and findEquilizer just sums the given list from left to right (leftSum) and from right to left (rightSum) and picks the result just as your original program does, but the whole process takes only linear time.
findEquilizer :: (Ord a, Floating a) => [a] -> a
findEquilizer [] = 0
findEquilizer xs =
    let leftSum = reverse $ 0:(sumList $ init xs)
        rightSum = sumList $ reverse $ xs
        afterParty = zipWith (\x y->(x-y) ** 2) leftSum rightSum
    in fst $ minimumBy (comparing snd) (zip (reverse $ init xs) afterParty)
I assume that none of the list elements are negative, and use a "tortoise and hare" approach. The hare steps through the list, adding up elements. The tortoise does the same thing, but it keeps its sum doubled and it carefully ensures that it only takes a step when that step won't put it ahead of the hare.
approxEqualSums
  :: (Num a, Ord a)
  => [a] -> (Maybe a, [a])
approxEqualSums as0 = stepHare 0 Nothing as0 0 as0
  where
    -- ht is the current best guess.
    stepHare _tortoiseSum ht tortoise _hareSum []
      = (ht, tortoise)
    stepHare tortoiseSum ht tortoise hareSum (h:hs)
      = stepTortoise tortoiseSum ht tortoise (hareSum + h) hs
    stepTortoise tortoiseSum ht [] hareSum hare
      = stepHare tortoiseSum ht [] hareSum hare
    stepTortoise tortoiseSum ht tortoise@(t:ts) hareSum hare
      | tortoiseSum' <= hareSum
      = stepTortoise tortoiseSum' (Just t) ts hareSum hare
      | otherwise
      = stepHare tortoiseSum ht tortoise hareSum hare
      where tortoiseSum' = tortoiseSum + 2*t
In use:
> approxEqualSums [1..10]
(Just 6,[7,8,9,10])
6 is the last element before going over half, and 7 is the first one after that.
I asked in the comments, and the OP says [1..n] does not really define the question. I guess what's being asked about is an arbitrary ascending sequence from 1 to n, such as [1,3,7,19,37,...,1453,...,n].
Yet! Even going by the given answers, for a list like [1..n] we really don't need to do any list operations at all.
The sum of [1..n] is n*(n+1)/2.
Which means we need to find the m whose prefix sum is closest to n*(n+1)/4.
Which means m(m+1)/2 = n*(n+1)/4.
So if n == 100 then m^2 + m - 5050 = 0
All we need is the quadratic formula
m = (-b ± sqrt(b^2 - 4ac)) / (2a)
where a = 1, b = 1 and c = -5050, yielding the reasonable root 70.565 ⇒ 71 (rounded). Let's check: 71*72/2 = 2556 and 5050-2556 = 2494, which gives 2556 - 2494 = 62 as the minimal difference (<71). Yes, we must split at 71. So just do result = [[1..71],[72..100]] and it's over!
But when it comes to non-consecutive ascending sequences, that's a different animal. It has to be done by first finding the sum and then doing something like a binary search: jumping halfway through the list and comparing the sums to decide whether to jump halfway back or forward accordingly. I will implement that one later.
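The closed-form idea for [1..n] can be sketched as follows. The name splitPoint is made up for illustration, and the code assumes n >= 0 and that n is small enough for Double's sqrt to be accurate. Rather than trusting the rounding, it compares the two integers on either side of the real root.

```haskell
-- Hypothetical helper: the m for which sum [1..m] and sum [m+1..n]
-- are closest, for the special case of the list [1..n].
splitPoint :: Integer -> Integer
splitPoint n = snd (minimum [ (abs (total - 2 * tri m), m) | m <- candidates ])
  where
    tri k = k * (k + 1) `div` 2          -- sum of [1..k]
    total = tri n
    -- positive root of m^2 + m - n(n+1)/2 = 0
    root = (-1 + sqrt (1 + 2 * fromIntegral (n * (n + 1)))) / 2 :: Double
    candidates = [ m | m <- [floor root, ceiling root], m >= 0, m <= n ]
```

For n = 100 this compares m = 70 (difference 80) with m = 71 (difference 62) and picks 71, matching the check above.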
Here's some code which empirically behaves better than linear, and gets to 2,000,000 in just over 1 second even when interpreted (it needs tails from Data.List):
g :: (Ord c, Num c) => [c] -> [(Int, c)]
g = head . dropWhile ((> 0) . snd . last) . map (take 2) . tails . zip [1..]
  . (\xs -> zipWith (-) (map (last xs -) xs) xs) . scanl1 (+)
g [1..10] ==> [(6,13),(7,-1)] -- 0.0s
g [1..70000] ==> [(49497,32494),(49498,-66502)] -- 0.09s
g [70000,70000-1..1] ==> [(20502,66502),(20503,-32494)] -- 0.09s
g [1..100000] ==> [(70710,75190),(70711,-66232)] -- 0.11s
g [1..1000000] ==> [(707106,897658),(707107,-516556)] -- 0.62s
g [1..2000000] ==> [(1414213,1176418),(1414214,-1652010)] -- 1.14s n^0.88
g [1..3000000] ==> [(2121320,836280),(2121321,-3406362)] -- 1.65s n^0.91
It works by running the partial sums with scanl1 (+) and taking the total sum as its last, so that for each partial sum, subtracting it from the total gives us the sum of the second part of the split.
The algorithm assumes all the numbers in the input list are strictly positive, so the partial sums list is monotonically increasing. Nothing else is assumed about the numbers.
The value must be chosen from the pair (the g's result) so that its second component's absolute value is the smaller between the two.
This is achieved by minimumBy (comparing (abs . snd)) . g.
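Put together with its imports, the selection step could look like the sketch below. The name bestSplit is made up for illustration, g is repeated from above so the block is self-contained, and a non-empty input of strictly positive numbers is assumed.

```haskell
import Data.List (minimumBy, tails)
import Data.Ord (comparing)

g :: (Ord c, Num c) => [c] -> [(Int, c)]
g = head . dropWhile ((> 0) . snd . last) . map (take 2) . tails . zip [1..]
  . (\xs -> zipWith (-) (map (last xs -) xs) xs) . scanl1 (+)

-- Hypothetical wrapper: of the two candidates around the sign change,
-- keep the one whose (right sum - left sum) is closest to zero.
-- The first component is the length of the left part.
bestSplit :: (Ord c, Num c) => [c] -> (Int, c)
bestSplit = minimumBy (comparing (abs . snd)) . g
```

For [1..10] this selects (7,-1), i.e. the split [1..7] / [8..10].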
clarifications: There's some confusion about "complexity" in the comments below, yet the answer says nothing at all about complexity but uses a specific empirical measurement. You can't argue with empirical data (unless you misinterpret its meaning).
The answer does not claim it "is better than linear", it says "it behaves better than linear" [in the tested range of problem sizes], which the empirical data incontrovertibly show.
Finally, an appeal to authority. Robert Sedgewick is an authority on algorithms. Take it up with him.
(and of course the algorithm handles unordered data as well as it does ordered).
As for the reasons for OP's code inefficiency: map sum . inits can't help being quadratic, but the equivalent scanl (+) 0 is linear. The radical improvement comes about from a lot of redundant calculations in the former being avoided in the latter. (Another example of this can be seen here.)
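A small sketch of the two formulations side by side (the function names are made up for illustration). Both return the same list of prefix sums, but the first re-sums every prefix from scratch while the second carries a running total.

```haskell
import Data.List (inits)

-- Quadratic: each prefix is summed again from its first element.
prefixSumsQuadratic :: Num a => [a] -> [a]
prefixSumsQuadratic = map sum . inits

-- Linear: each partial sum is the previous one plus the next element.
prefixSumsLinear :: Num a => [a] -> [a]
prefixSumsLinear = scanl (+) 0
```

Both give [0,1,3,6,10] for the input [1,2,3,4].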

haskell: how to get list of numbers which are higher than their neighbours in a starting list

I am trying to learn Haskell and I want to solve one task. I have a list of Integers and I need to add them to another list if they are bigger than both of their neighbors. For example:
I have a starting list of [0,1,5,2,3,7,8,4] and I need to print out a list which is [5, 8]
This is the code I came up with, but it returns an empty list:
largest :: [Integer]->[Integer]
largest n
  | head n > head (tail n) = head n : largest (tail n)
  | otherwise = largest (tail n)
I would solve this as outlined by Thomas M. DuBuisson. Since we want the ends of the list to "count", we'll add negative infinities to each end before creating triples. The monoid-extras package provides a suitable type for this.
import Data.Monoid.Inf
pad :: [a] -> [NegInf a]
pad xs = [negInfty] ++ map negFinite xs ++ [negInfty]
triples :: [a] -> [(a, a, a)]
triples (x:rest@(y:z:_)) = (x,y,z) : triples rest
triples _ = []
isBig :: Ord a => (a,a,a) -> Bool
isBig (x,y,z) = y > x && y > z
scnd :: (a, b, c) -> b
scnd (a, b, c) = b
finites :: [Inf p a] -> [a]
finites xs = [x | Finite x <- xs]
largest :: Ord a => [a] -> [a]
largest = id
  . finites
  . map scnd
  . filter isBig
  . triples
  . pad
It seems to be working appropriately; in ghci:
> largest [0,1,5,2,3,7,8,4]
[5,8]
> largest [10,1,10]
[10,10]
> largest [3]
[3]
> largest []
[]
You might also consider merging finites, map scnd, and filter isBig in a single list comprehension (then eliminating the definitions of finites, scnd, and isBig):
largest :: Ord a => [a] -> [a]
largest xs = [x | (a, b@(Finite x), c) <- triples (pad xs), a < b, c < b]
But I like the decomposed version better; the finites, scnd, and isBig functions may turn out to be useful elsewhere in your development, especially if you plan to build a few variants of this for different needs.
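If pulling in monoid-extras is not an option, the same padding idea can be sketched with Maybe, relying on the standard Ord instance in which Nothing compares below every Just. This is a swapped-in technique rather than the answer's own code, and the name largestMaybe is hypothetical.

```haskell
import Data.Maybe (catMaybes)

-- Pad both ends with Nothing (which acts as negative infinity here), then
-- keep every middle element that is strictly greater than both neighbours.
largestMaybe :: Ord a => [a] -> [a]
largestMaybe xs =
  catMaybes
    [ y
    | (x, y, z) <- zip3 padded (drop 1 padded) (drop 2 padded)
    , y > x
    , y > z
    ]
  where
    padded = [Nothing] ++ map Just xs ++ [Nothing]
```

It gives the same results as the ghci session above: [5,8] for [0,1,5,2,3,7,8,4], [10,10] for [10,1,10], [3] for [3], and [] for the empty list.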
One thing you might try is lookahead. (Thomas M. DuBuisson suggested a different one that will also work if you handle the final one or two elements correctly.) Since it sounds like this is a problem you want to solve on your own as a learning exercise, I’ll write a skeleton that you can take as a starting-point if you want:
largest :: [Integer] -> [Integer]
largest [] = _
largest [x] = _ -- What should this return?
largest [x1,x2] | x1 > x2 = _
                | x1 < x2 = _
                | otherwise = _
largest [x1,x2,x3] | x2 > x1 && x2 > x3 = _
                   | x3 > x2 = _
                   | otherwise = _
largest (x1:x2:x3:xs) | x2 > x1 && x2 > x3 = _
                      | otherwise = _
We need the special case of [x1,x2,x3] in addition to (x1:x2:x3:xs) because, according to the clarification in your comment, largest [3,3,2] should return [], but largest [3,2] should return [3]. Therefore, the final three elements require special handling and cannot simply recurse on the final two.
If you also want the result to include the head of the list if it is greater than the second element, you’d make this a helper function and your largest would be something like largest (x1:x2:xs) = (if x1>x2 then [x1] else []) ++ largest' (x1:x2:xs). That is, you want some special handling for the first elements of the original list, which you don’t want to apply to all the sublists when you recurse.
As suggested in the comments, one approach would be to first group the list into tuples of length 3 using Prelude's zip3 and tail:
*Main> let xs = [0,1,5,2,3,7,8,4]
*Main> zip3 xs (tail xs) (tail (tail xs))
[(0,1,5),(1,5,2),(5,2,3),(2,3,7),(3,7,8),(7,8,4)]
zip3 and tail have the types [a] -> [b] -> [c] -> [(a, b, c)] and [a] -> [a] respectively.
Next you need a way to keep only the tuples where the middle element is bigger than the first and last elements. One way would be to use Prelude's filter function:
*Main> let xs = [(0,1,5),(1,5,2),(5,2,3),(2,3,7),(3,7,8),(7,8,4)]
*Main> filter (\(a, b, c) -> b > a && b > c) xs
[(1,5,2),(7,8,4)]
filter has type (a -> Bool) -> [a] -> [a]. It keeps only the elements of a list for which the predicate passed to it returns True.
Now for the final part, you need to extract the middle element from the filtered tuples above. You can do this easily with Prelude's map function:
*Main> let xs = [(1,5,2),(7,8,4)]
*Main> map (\(_, x, _) -> x) xs
[5,8]
map has type (a -> b) -> [a] -> [b]. It applies a function to every element, turning a list of a into a list of b.
The above code stitched together would look like this:
largest :: (Ord a) => [a] -> [a]
largest xs = map (\(_, x, _) -> x) $ filter (\(a, b, c) -> b > a && b > c) $ zip3 xs (tail xs) (tail (tail xs))
Note that I used the Ord typeclass here, since the above code needs to compare elements with >. It's fine to keep it as Integer here though.

Intersection of infinite lists

I know from computability theory that it is possible to take the intersection of two infinite lists, but I can't find a way to express it in Haskell.
The traditional method fails as soon as the second list is infinite, because you spend all your time checking it for a non-matching element in the first list.
Example:
let ones = 1 : ones -- an unending list of 1s
intersect [0,1] ones
This never yields 1, as it never stops checking ones for the element 0.
A successful method needs to ensure that each element of each list will be visited in finite time.
Probably, this will be by iterating through both lists, and spending approximately equal time checking all previously-visited elements in each list against each other.
If possible, I'd like to also have a way to ignore duplicates in the lists, as it is occasionally necessary, but this is not a requirement.
Using the universe package's Cartesian product operator we can write this one-liner:
import Data.Universe.Helpers
isect :: Eq a => [a] -> [a] -> [a]
xs `isect` ys = [x | (x, y) <- xs +*+ ys, x == y]
-- or this, which may do marginally less allocation
xs `isect` ys = foldr ($) [] $ cartesianProduct
  (\x y -> if x == y then (x:) else id)
  xs ys
Try it in ghci:
> take 10 $ [0,2..] `isect` [0,3..]
[0,6,12,18,24,30,36,42,48,54]
This implementation will not produce any duplicates if the input lists don't have any; but if they do, you can tack on your favorite dup-remover either before or after calling isect. For example, with nub, you might write
> nub ([0,1] `isect` repeat 1)
[1
and then heat up your computer pretty good, since it can never be sure there might not be a 0 in that second list somewhere if it looks deep enough.
This approach is significantly faster than David Fletcher's, produces many fewer duplicates and produces new values much more quickly than Willem Van Onsem's, and doesn't assume the lists are sorted like freestyle's (but is consequently much slower on such lists than freestyle's).
An idea might be to use incrementing bounds. Let us first relax the problem a bit: yielding duplicated values is allowed. In that case you could use:
import Data.List (intersect)
intersectInfinite :: Eq a => [a] -> [a] -> [a]
intersectInfinite xs ys = intersectInfinite' 1
  where intersectInfinite' n = intersect (take n xs) (take n ys) ++ intersectInfinite' (n+1)
In other words we claim that:
A∩B = A1∩B1 ∪ A2∩B2 ∪ ... ∪ Ai∩Bi ∪ ...
where Ai is the set containing the first i elements of A (yes, there is no order in a set, but let's say there is somehow an order). If A contains fewer than i elements, then Ai is the full set.
If c is in A (at index i) and in B (at index j), c will be emitted in segment (not index) max(i,j).
This will thus always generate an infinite list (with an infinite number of duplicates) regardless of whether the given lists are finite or not. The only exception is when the intersection is empty (for instance when you give it an empty list), in which case it produces no output and never terminates. Nevertheless, we have ensured that every element in the intersection will be emitted at least once.
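The claim about segments can be checked with a small sketch. The helpers segments and firstSegment are hypothetical names introduced only for this illustration; segment numbers are 1-based, matching take n.

```haskell
import Data.List (intersect)

-- The n-th entry (counting from 1) intersects the first n elements of each list.
segments :: Eq a => [a] -> [a] -> [[a]]
segments xs ys = [take n xs `intersect` take n ys | n <- [1 ..]]

-- Number of the first segment that contains c.
-- Diverges if c is not in the intersection of infinite inputs.
firstSegment :: Eq a => a -> [a] -> [a] -> Int
firstSegment c xs ys = length (takeWhile (c `notElem`) (segments xs ys)) + 1
```

For [0,2..] and [0,3..], the value 6 is the 4th element of the first list and the 3rd of the second, and firstSegment 6 [0,2..] [0,3..] is 4, the larger of the two positions.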
Making the result finite (if the given lists are finite)
Now we can make our definition better. First we make a more advanced version of take, takeFinite (let's first give a straightforward, but not very efficient definition):
takeFinite :: Int -> [a] -> (Bool,[a])
takeFinite _ [] = (True,[])
takeFinite 0 _ = (False,[])
takeFinite n (x:xs) = let (b,t) = takeFinite (n-1) xs in (b,x:t)
Now we can iteratively deepen until both lists have reached the end:
intersectInfinite :: Eq a => [a] -> [a] -> [a]
intersectInfinite = intersectInfinite' 1
intersectInfinite' :: Eq a => Int -> [a] -> [a] -> [a]
intersectInfinite' n xs ys | fa && fb = intersect xs ys
                           | fa = intersect ys xs
                           | fb = intersect xs ys
                           | otherwise = intersect xfa xfb ++ intersectInfinite' (n+1) xs ys
  where (fa,xfa) = takeFinite n xs
        (fb,xfb) = takeFinite n ys
This will now terminate if both lists are finite, but it still produces a lot of duplicates. There are certainly ways to reduce the duplication further.
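One simple way to drop the duplicates, at the cost of nub's quadratic bookkeeping, is sketched below. The first three definitions repeat the answer's code so that the block is self-contained; intersectUnique is a hypothetical name. On infinite inputs the result is still only productive: once the last common element has been emitted, the search for another one goes on forever.

```haskell
import Data.List (intersect, nub)

-- take that also reports whether the end of the list was reached.
takeFinite :: Int -> [a] -> (Bool, [a])
takeFinite _ [] = (True, [])
takeFinite 0 _ = (False, [])
takeFinite n (x:xs) = let (b, t) = takeFinite (n - 1) xs in (b, x : t)

intersectInfinite :: Eq a => [a] -> [a] -> [a]
intersectInfinite = intersectInfinite' 1

-- Iterative deepening: intersect ever longer prefixes until a list ends.
intersectInfinite' :: Eq a => Int -> [a] -> [a] -> [a]
intersectInfinite' n xs ys
  | fa && fb = intersect xs ys
  | fa = intersect ys xs
  | fb = intersect xs ys
  | otherwise = intersect xfa xfb ++ intersectInfinite' (n + 1) xs ys
  where
    (fa, xfa) = takeFinite n xs
    (fb, xfb) = takeFinite n ys

-- Hypothetical wrapper: keep only the first occurrence of each element.
intersectUnique :: Eq a => [a] -> [a] -> [a]
intersectUnique xs ys = nub (intersectInfinite xs ys)
```

For example, take 4 (intersectUnique [0,2..] [0,3..]) gives [0,6,12,18], and intersectUnique [1,2,3] [2,3,4] gives [2,3].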
Here's one way. For each x we make a list of maybes which has
Just x only where x appeared in ys. Then we interleave all
these lists.
import Data.Maybe (catMaybes)
isect :: Eq a => [a] -> [a] -> [a]
isect xs ys = (catMaybes . foldr interleave [] . map matches) xs
  where
    matches x = [if x == y then Just x else Nothing | y <- ys]
interleave :: [a] -> [a] -> [a]
interleave [] ys = ys
interleave (x:xs) ys = x : interleave ys xs
Maybe it can be improved using some sort of fairer interleaving -
it's already pretty slow on the example below because (I think)
it's doing an exponential amount of work.
> take 10 (isect [0..] [0,2..])
[0,2,4,6,8,10,12,14,16,18]
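The suspicion about exponential work can be probed with a small experiment, at least as far as the scheduling goes. In the sketch below (rowOrder and firstVisit are hypothetical names), every row consists of its own index repeated, so the interleaved stream shows which row each output position is taken from.

```haskell
-- Same interleave as in the answer.
interleave :: [a] -> [a] -> [a]
interleave [] ys = ys
interleave (x:xs) ys = x : interleave ys xs

-- Which row each position of the interleaved stream comes from.
rowOrder :: [Int]
rowOrder = foldr interleave [] [repeat k | k <- [0 ..]]

-- 0-based position at which row k is consulted for the first time.
firstVisit :: Int -> Int
firstVisit k = length (takeWhile (/= k) rowOrder)
```

Here take 8 rowOrder is [0,1,0,2,0,1,0,3] and firstVisit k comes out as 2^k - 1 for the values tried, so row k gets only every 2^(k+1)-th slot. That is consistent with the slowdown described above and is what a fairer (diagonal) interleaving avoids.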
If the elements in both lists are in ascending order, then this is easy to do:
intersectOrd :: Ord a => [a] -> [a] -> [a]
intersectOrd [] _ = []
intersectOrd _ [] = []
intersectOrd (x:xs) (y:ys) = case x `compare` y of
  EQ -> x : intersectOrd xs ys
  LT -> intersectOrd xs (y:ys)
  GT -> intersectOrd (x:xs) ys
Here's yet another alternative, leveraging Control.Monad.WeightedSearch
import Control.Monad (guard)
import Control.Applicative
import qualified Control.Monad.WeightedSearch as W
We first define a cost for digging inside the list. Accessing the tail costs 1 unit more. This will ensure a fair scheduling among the two infinite lists.
eachW :: [a] -> W.T Int a
eachW = foldr (\x w -> pure x <|> W.weight 1 w) empty
Then we can simply ignore the fact that the lists may be infinite.
intersection :: [Int] -> [Int] -> [Int]
intersection xs ys = W.toList $ do
  x <- eachW xs
  y <- eachW ys
  guard (x==y)
  return y
Even better with MonadComprehensions on:
intersection2 :: [Int] -> [Int] -> [Int]
intersection2 xs ys = W.toList [ y | x <- eachW xs, y <- eachW ys, x==y ]
Solution
I ended up using the following implementation; a slight modification of the answer by David Fletcher:
isect :: Eq a => [a] -> [a] -> [a]
isect [] = const [] -- don't bother testing against an empty list
isect xs = catMaybes . diagonal . map matches
  where matches y = [if x == y then Just x else Nothing | x <- xs]
This can be augmented with nub to filter out duplicates:
isectUniq :: Eq a => [a] -> [a] -> [a]
isectUniq xs = nub . isect xs
Explanation
Of the line isect xs = catMaybes . diagonal . map matches
(map matches) ys computes a list of lists of comparisons between elements of xs and ys, where the list indices specify the indices in ys and xs respectively: i.e., (map matches) ys !! 3 !! 0 would represent the comparison of ys !! 3 with xs !! 0, which would be Nothing if those values differ. If those values are the same, it would be Just that value.
diagonals takes a list of lists and returns a list of lists where the nth output list contains an element each from the first n lists. Another way to conceptualise it is that (diagonals . map matches) ys !! n contains comparisons between elements whose indices in xs and ys sum to n.
diagonal is simply a flat version of diagonals (diagonal = concat . diagonals)
Therefore (diagonal . map matches) ys is a list of comparisons between elements of xs and ys, where the elements are approximately sorted by the sum of the indices of the elements of ys and xs being compared; this means that early elements are compared to later elements with the same priority as middle elements being compared to each other.
(catMaybes . diagonal . map matches) ys is a list of only the elements which are in both lists, where the elements are approximately sorted by the sum of the indices of the two elements being compared.
Note
(diagonal . map (catMaybes . matches)) ys does not work: catMaybes . matches only yields when it finds a match, instead of also yielding Nothing on no match, so the interleaving does nothing to distribute the work.
To contrast, in the chosen solution, the interleaving of Nothing and Just values by diagonal means that the program divides its attention between 'searching' for multiple different elements, not waiting for one to succeed; whereas if the Nothing values are removed before interleaving, the program may spend too much time waiting for a fruitless 'search' for a given element to succeed.
Therefore, we would encounter the same problem as in the original question: while one element does not match any elements in the other list, the program will hang; whereas the chosen solution will only hang while no matches are found for any elements in either list.
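To experiment with this without installing the universe package, a hand-rolled stand-in for diagonals and diagonal might look like the sketch below. The names diagonalsSketch, diagonalSketch and isectSketch are hypothetical, and this is an illustration of the idea rather than the package's actual code. Each diagonal is finite, which is what guarantees that every comparison is reached after finitely many steps.

```haskell
import Data.List (transpose)
import Data.Maybe (catMaybes)

-- Diagonal number n holds the elements whose row and column indices sum to n.
-- 'b' holds the rows started so far, each already advanced past the elements
-- that earlier diagonals consumed; 'es_' holds the rows not yet started.
diagonalsSketch :: [[a]] -> [[a]]
diagonalsSketch = drop 1 . go []
  where
    go b es_ = [h | h : _ <- b] : case es_ of
        []     -> transpose ts        -- no new rows left: finish the tails
        e : es -> go (e : ts) es      -- start one more row per diagonal
      where
        ts = [t | _ : t <- b]

diagonalSketch :: [[a]] -> [a]
diagonalSketch = concat . diagonalsSketch

-- The chosen solution, with the stand-in substituted for diagonal.
isectSketch :: Eq a => [a] -> [a] -> [a]
isectSketch [] = const []
isectSketch xs = catMaybes . diagonalSketch . map matches
  where
    matches y = [if x == y then Just x else Nothing | x <- xs]
```

With these definitions, take 10 (isectSketch [0,2..] [0,3..]) yields the multiples of 6 from 0 to 54, and take 1 (isectSketch [0,1] (repeat 1)) returns [1] instead of hanging on the unmatched 0.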