A better way of balancing a list in Haskell - list

I'm learning Haskell by solving some online problems and training exercises.
Right now I'm trying to write a function that'd take a list and balance it like this
[2,3,2,3,2] ---> [[2,2,2], [3,3]]
[1,2,3,7,8] ---> [[1,2,7], [3,8]]
[1,2,9,10] ---> [[2,9],[1,10]]
[1,1,1,1,1,2,3] ---> [[1,1,1,1,1],[2,3]]
(always in two parts, either works in the case of 2nd)
The way I thought of doing this was by using Permutation function from base Data.List
and filtering out valid lists with a function like this
sumCheck x
| sum (take (length x `div` 2) x) == sum (drop (length x `div` 2) x) = True
| otherwise = False
if the length of the list is even. If not then a recursive function that'd
(take y x) (drop (y - (y - 0)) x)
(take (y - 1) x) (drop (y - (y - 1)) x)
and so on.
or by just partitioning them like this
parts :: [a] -> [([a], [a])]
parts [] = [([], [])]
parts (x : xs) = let pts = parts xs in
[(x : ys, zs) | (ys, zs) <- pts] ++ [(ys, x : zs) | (ys, zs) <- pts]
then checking the balance by
checkBal (xs, ys) = abs (sum xs - sum ys)
and sorting in ascending order, where the first one would be most balanced.
Now this works well for smaller lists but as you might have guessed, for large lists, it just keeps on processing.
I think binPack could help with this but I rather not use external packages and try to do it on my own. (with some help ofc!)

This is kind of a fun algorithm problem. I encourage you to keep thinking about it on your own; severe spoilers below.
Here's my plan: I'm going to compute all the possible sums of subsets, together with (just one) witness of who's in and who's out to achieve that sum. Then we'll just check which possible sum is closest to half the entire list's sum. So:
import Data.Maybe
import Data.Map (Map)
import qualified Data.Map as M
type Sums a = Map a ([a], [a])
update :: (Num a, Ord a) => a -> Sums a -> Sums a
update n sums = M.union
( (\(i, o) -> (i, n:o)) <$> sums)
(M.mapKeysMonotonic (n+) $ (\(i, o) -> (n:i, o)) <$> sums)
computeSums :: (Num a, Ord a) => [a] -> Sums a
computeSums = foldr update (M.singleton 0 ([], []))
balance :: (Integral a, Ord a) => [a] -> ([a], [a])
balance xs = snd . fromJust $ M.lookupLE (sum xs `div` 2) (computeSums xs)
Trying it out:
> mapM_ (\xs -> putStrLn $ show xs ++ " ---> " ++ show (balance xs)) [[2,3,2,3,2],[1,2,3,7,8],[1,2,9,10],[1,1,1,1,1,2,3]]
[2,3,2,3,2] ---> ([3,3],[2,2,2])
[1,2,3,7,8] ---> ([3,7],[1,2,8])
[1,2,9,10] ---> ([2,9],[1,10])
[1,1,1,1,1,2,3] ---> ([2,3],[1,1,1,1,1])
It also terminates relatively quickly on large lists:
> :set +s
> (\(a,b) -> (sum a, sum b)) . balance $ replicate 1001 1
(500,501)
(0.12 secs, 156,255,000 bytes)
> (\(a,b) -> (sum a, sum b)) . balance $ [1..200]
(10050,10050)
(1.25 secs, 752,015,256 bytes)
> (\(a,b) -> (sum a, sum b)) . balance $ take 20 (iterate (2*) 1)
(524287,524288)
(0.92 secs, 532,754,784 bytes)
This last example exercises the exponential worst-case behavior of this algorithm (since every subset of the powers of two gives a different sum).

Related

Get index of next smallest element in the list in Haskell

I m a newbie to Haskell. I am pretty good with Imperative languages but not with functional. Haskell is my first as a functional language.
I am trying to figure out, how to get the index of the smallest element in the list where the minimum element is defined by me.
Let me explain by examples.
For example :
Function signature
minList :: x -> [x]
let x = 2
let list = [2,3,5,4,6,5,2,1,7,9,2]
minList x list --output 1 <- is index
This should return 1. Because the at list[1] is 3. It returns 1 because 3 is the smallest element after x (=2).
let x = 1
let list = [3,5,4,6,5,2,1,7,9,2]
minList x list -- output 9 <- is index
It should return 9 because at list[9] is 2 and 2 is the smallest element after 1. x = 1 which is defined by me.
What I have tried so far.
minListIndex :: (Ord a, Num a) => a -> [a] -> a
minListIndex x [] = 0
minListIndex x (y:ys)
| x > y = length ys
| otherwise = m
where m = minListIndex x ys
When I load the file I get this error
• Couldn't match expected type ‘a’ with actual type ‘Int’
‘a’ is a rigid type variable bound by
the type signature for:
minListIndex :: forall a. (Ord a, Num a) => a -> [a] -> a
at myFile.hs:36:17
• In the expression: 1 + length ys
In an equation for ‘minListIndex’:
minListIndex x (y : ys)
| x > y = 1 + length ys
| otherwise = 1 + m
where
m = minListIndex x ys
• Relevant bindings include
m :: a (bound at myFile.hs:41:19)
ys :: [a] (bound at myFile.hs:38:19)
y :: a (bound at myFile.hs:38:17)
x :: a (bound at myFile.hs:38:14)
minListIndex :: a -> [a] -> a (bound at myFile.hs:37:1)
When I modify the function like this
minListIndex :: (Ord a, Num a) => a -> [a] -> a
minListIndex x [] = 0
minListIndex x (y:ys)
| x > y = 2 -- <- modified...
| otherwise = 3 -- <- modifiedd
where m = minListIndex x ys
I load the file again then it compiles and runs but ofc the output is not desired.
What is the problem with
| x > y = length ys
| otherwise = m
?
In short: Basically, I want to find the index of the smallest element but higher than the x which is defined by me in parameter/function signature.
Thanks for the help in advance!
minListIndex :: (Ord a, Num a) => a -> [a] -> a
The problem is that you are trying to return result of generic type a but it is actually index in a list.
Suppose you are trying to evaluate your function for a list of doubles. In this case compiler should instantiate function's type to Double -> [Double] -> Double which is nonsense.
Actually compiler notices that you are returning something that is derived from list's length and warns you that it is not possible to match generic type a with concrete Int.
length ys returns Int, so you can try this instead:
minListIndex :: Ord a => a -> [a] -> Int
Regarding your original problem, seems that you can't solve it with plain recursion. Consider defining helper recursive function with accumulator. In your case it can be a pair (min_value_so_far, its_index).
First off, I'd separate the index type from the list element type altogether. There's no apparent reason for them to be the same. I will use the BangPatterns extension to avoid a space leak without too much notation; enable that by adding {-# language BangPatterns #-} to the very top of the file. I will also import Data.Word to get access to the Word64 type.
There are two stages: first, find the index of the given element (if it's present) and the rest of the list beyond that point. Then, find the index of the minimum of the tail.
-- Find the 0-based index of the first occurrence
-- of the given element in the list, and
-- the rest of the list after that element.
findGiven :: Eq a => a -> [a] -> Maybe (Word64, [a])
findGiven given = go 0 where
go !_k [] = Nothing --not found
go !k (x:xs)
| given == xs = Just (k, xs)
| otherwise = go (k+1) xs
-- Find the minimum (and its index) of the elements of the
-- list greater than the given one.
findMinWithIndexOver :: Ord a => a -> [a] -> Maybe (Word64, a)
findMinWithIndexOver given = go 0 Nothing where
go !_k acc [] = acc
go !k acc (x : xs)
| x <= given = go (k + 1) acc xs
| otherwise
= case acc of
Nothing -> go (k + 1) (Just (k, x)) xs
Just (ix_min, curr_min)
| x < ix_min = go (k + 1) (Just (k, x)) xs
| otherwise = go (k + 1) acc xs
You can now put these functions together to construct the one you seek. If you want a general Num result rather than a Word64 one, you can use fromIntegral at the very end. Why use Word64? Unlike Int or Word, it's (practically) guaranteed not to overflow in any reasonable amount of time. It's likely substantially faster than using something like Integer or Natural directly.
It is not clear for me what do you want exactly. Based on examples I guess it is: find the index of the smallest element higher than x which appears after x. In that case, This solution is plain Prelude. No imports
minList :: Ord a => a -> [a] -> Int
minList x l = snd . minimum . filter (\a -> x < fst a) . dropWhile (\a -> x /= fst a) $ zip l [0..]
The logic is:
create the list of pairs, [(elem, index)] using zip l [0..]
drop elements until you find the input x using dropWhile (\a -> x /= fst a)
discards elements less than x using filter (\a -> x < fst a)
find the minimum of the resulting list. Tuples are ordered using lexicographic order so it fits your problem
take the index using snd
Your function can be constructed out of ready-made parts as
import Data.Maybe (listToMaybe)
import Data.List (sortBy)
import Data.Ord (comparing)
foo :: (Ord a, Enum b) => a -> [a] -> Maybe b
foo x = fmap fst . listToMaybe . take 1
. dropWhile ((<= x) . snd)
. sortBy (comparing snd)
. dropWhile ((/= x) . snd)
. zip [toEnum 0..]
This Maybe finds the index of the next smallest element in the list above the given element, situated after the given element, in the input list. As you've requested.
You can use any Enum type of your choosing as the index.
Now you can implement this higher-level executable specs as direct recursion, using an efficient Map data structure to hold your sorted elements above x seen so far to find the next smallest, etc.
Correctness first, efficiency later!
Efficiency update: dropping after the sort drops them sorted, so there's a wasted effort there; indeed it should be replaced with the filtering (as seen in the answer by Luis Morillo) before the sort. And if our element type is in Integral (so it is a properly discrete type, unlike just an Enum, thanks to #dfeuer for pointing this out!), there's one more opportunity for an opportunistic optimization: if we hit on a succ minimal element by pure chance, there's no further chance of improvement, and so we should bail out at that point right there:
bar :: (Integral a, Enum b) => a -> [a] -> Maybe b
bar x = fmap fst . either Just (listToMaybe . take 1
. sortBy (comparing snd))
. findOrFilter ((== succ x).snd) ((> x).snd)
. dropWhile ((/= x) . snd)
. zip [toEnum 0..]
findOrFilter :: (a -> Bool) -> (a -> Bool) -> [a] -> Either a [a]
findOrFilter t p = go
where go [] = Right []
go (x:xs) | t x = Left x
| otherwise = fmap ([x | p x] ++) $ go xs
Testing:
> foo 5 [2,3,5,4,6,5,2,1,7,9,2] :: Maybe Int
Just 4
> foo 2 [2,3,5,4,6,5,2,1,7,9,2] :: Maybe Int
Just 1
> foo 1 [3,5,4,6,5,2,1,7,9,2] :: Maybe Int
Just 9

Haskell - split a list into two sublists with closest sums

I'm a Haskell beginner trying to learn more about the language by solving some online quizzes/problem sets.
The problem/question is quite lengthy but a part of it requires code that can find the number which divides a given list into two (nearly) equal (by sum) sub-lists.
Given [1..10]
Answer should be 7 since 1+2+..7 = 28 & 8+9+10 = 27
This is the way I implemented it
-- partitions list by y
partishner :: (Floating a) => Int -> [a] -> [[[a]]]
partishner 0 xs = [[xs],[]]
partishner y xs = [take y xs : [drop y xs]] ++ partishner (y - 1) xs
-- finds the equal sum
findTheEquilizer :: (Ord a, Floating a) => [a] -> [[a]]
findTheEquilizer xs = fst $ minimumBy (comparing snd) zipParty
where party = (tail . init) (partishner (length xs) xs) -- removes [xs,[]] types
afterParty = (map (\[x, y] -> (x - y) ** 2) . init . map (map sum)) party
zipParty = zip party afterParty -- zips partitions and squared diff betn their sums
Given (last . head) (findTheEquilizer [1..10])
output : 7
For numbers near 50k it works fine
λ> (last . head) (findTheEquilizer [1..10000])
7071.0
The trouble starts when I put in lists with any more than 70k elements in it. It takes forever to compute.
So what do I have to change in the code to make it run better or do I have to change my whole approach? I'm guessing it's the later, but I'm not sure how to go about do that.
It looks to me that the implementation is quite chaotic. For example partishner seems to construct a list of lists of lists of a, where, given I understood it correctly, the outer list contains lists with each two elements: the list of elements on "the left", and the list of elements at the "right". As a result, this will take O(n2) to construct the lists.
By using lists over 2-tuples, this is also quite "unsafe", since a list can - although here probably impossible - contain no elements, one element, or more than two elements. If you make a mistake in one of the functions, it will be hard to find out that mistake.
It looks to me that it might be easier to implement a "sweep algorithm": we first calculate the sum of all the elements in the list. This is the value on the "right" in case we decide to split at that specific point, next we start moving from left to right, each time subtracting the element from the sum on the right, and adding it to the sum on the left. We can each time evaluate the difference in score, like:
import Data.List(unfoldr)
sweep :: Num a => [a] -> [(Int, a, [a])]
sweep lst = x0 : unfoldr f x0
where x0 = (0, sum lst, lst)
f (_, _, []) = Nothing
f (i, r, (x: xs)) = Just (l, l)
where l = (i+1, r-2*x, xs)
For example:
Prelude Data.List> sweep [1,4,2,5]
[(0,12,[1,4,2,5]),(1,10,[4,2,5]),(2,2,[2,5]),(3,-2,[5]),(4,-12,[])]
So if we select to split at the first split point (before the first element), the sum on the right is 12 higher than the sum on the left, if we split after the first element, the sum on the right (11) is 10 higher than the sum on the left (1).
We can then obtain the minimum of these splits with minimumBy :: (a -> a -> Ordering) -> [a] -> a:
import Data.List(minimumBy)
import Data.Ord(comparing)
findTheEquilizer :: (Ord a, Num a) => [a] -> ([a], [a])
findTheEquilizer lst = (take idx lst, tl)
where (idx, _, tl) = minimumBy (comparing (abs . \(_, x, _) -> x)) (sweep lst)
We then obtain the correct value for [1..10]:
Prelude Data.List Data.Ord Data.List> findTheEquilizer [1..10]
([1,2,3,4,5,6,7],[8,9,10])
or for 70'000:
Prelude Data.List Data.Ord Data.List> head (snd (findTheEquilizer [1..70000]))
49498
The above is not ideal, it can be implemented more elegantly, but I leave this as an exercise.
Okay, firstly, let analyse why it run forever (...actually not forever, just slow), take a look of partishner function:
partishner y xs = [take y xs : [drop y xs]] ++ partishner (y - 1) xs
where take y xs and drop y xs are run linear time, i.e. O(N), and so as
[take y xs : [drop y xs]]
is O(N) too.
However, it is run again and again in recursive way over each element of given list. Now suppose the length of given list is M, each call of partishner function take O(N) times, to finish computation need:
O(1+2+...M) = (M(1+M)/2) ~ O(M^2)
Now, the list has 70k elements, it at least need 70k ^ 2 step. So why it hang.
Instead of using partishner function, you can sum the list in linear way as:
sumList::(Floating a)=>[a]->[a]
sumList xs = sum 0 xs
where sum _ [] = []
sum s (y:ys) = let s' = s + y in s' : sum s' ys
and findEqilizer just sum the given list from left to right (leftSum) and from right to left (rightSum) and take the result just as your original program, but the whole process just take linear time.
findEquilizer::(Ord a, Floating a) => [a] -> a
findEquilizer [] = 0
findEquilizer xs =
let leftSum = reverse $ 0:(sumList $ init xs)
rightSum = sumList $ reverse $ xs
afterParty = zipWith (\x y->(x-y) ** 2) leftSum rightSum
in fst $ minimumBy (comparing snd) (zip (reverse $ init xs) afterParty)
I assume that none of the list elements are negative, and use a "tortoise and hare" approach. The hare steps through the list, adding up elements. The tortoise does the same thing, but it keeps its sum doubled and it carefully ensures that it only takes a step when that step won't put it ahead of the hare.
approxEqualSums
:: (Num a, Ord a)
=> [a] -> (Maybe a, [a])
approxEqualSums as0 = stepHare 0 Nothing as0 0 as0
where
-- ht is the current best guess.
stepHare _tortoiseSum ht tortoise _hareSum []
= (ht, tortoise)
stepHare tortoiseSum ht tortoise hareSum (h:hs)
= stepTortoise tortoiseSum ht tortoise (hareSum + h) hs
stepTortoise tortoiseSum ht [] hareSum hare
= stepHare tortoiseSum ht [] hareSum hare
stepTortoise tortoiseSum ht tortoise#(t:ts) hareSum hare
| tortoiseSum' <= hareSum
= stepTortoise tortoiseSum' (Just t) ts hareSum hare
| otherwise
= stepHare tortoiseSum ht tortoise hareSum hare
where tortoiseSum' = tortoiseSum + 2*t
In use:
> approxEqualSums [1..10]
(Just 6,[7,8,9,10])
6 is the last element before going over half, and 7 is the first one after that.
I asked in the comment and OP says [1..n] is not really defining the question. Yes i guess what's asked is like [1 -> n] in random ascending sequence such as [1,3,7,19,37,...,1453,...,n].
Yet..! Even as per the given answers, for a list like [1..n] we really don't need to do any list operation at all.
The sum of [1..n] is n*(n+1)/2.
Which means we need to find m for n*(n+1)/4
Which means m(m+1)/2 = n*(n+1)/4.
So if n == 100 then m^2 + m - 5050 = 0
All we need is
formula where a = 1, b = 1 and c = -5050 yielding the reasonable root to be 70.565 ⇒ 71 (rounded). Lets check. 71*72/2 = 2556 and 5050-2556 = 2494 which says 2556 - 2494 = 62 minimal difference (<71). Yes we must split at 71. So just do like result = [[1..71],[72..100]] over..!
But when it comes to not subsequent ascending, that's a different animal. It has to be done by first finding the sum and then like binary search by jumping halfway the list and comparing the sums to decide whether to jump halfway back or forward accordingly. I will implement that one later.
Here's a code which is empirically behaving better than linear, and gets to the 2,000,000 in just over 1 second even when interpreted:
g :: (Ord c, Num c) => [c] -> [(Int, c)]
g = head . dropWhile ((> 0) . snd . last) . map (take 2) . tails . zip [1..]
. (\xs -> zipWith (-) (map (last xs -) xs) xs) . scanl1 (+)
g [1..10] ==> [(6,13),(7,-1)] -- 0.0s
g [1..70000] ==> [(49497,32494),(49498,-66502)] -- 0.09s
g [70000,70000-1..1] ==> [(20502,66502),(20503,-32494)] -- 0.09s
g [1..100000] ==> [(70710,75190),(70711,-66232)] -- 0.11s
g [1..1000000] ==> [(707106,897658),(707107,-516556)] -- 0.62s
g [1..2000000] ==> [(1414213,1176418),(1414214,-1652010)] -- 1.14s n^0.88
g [1..3000000] ==> [(2121320,836280),(2121321,-3406362)] -- 1.65s n^0.91
It works by running the partial sums with scanl1 (+) and taking the total sum as its last, so that for each partial sum, subtracting it from the total gives us the sum of the second part of the split.
The algorithm assumes all the numbers in the input list are strictly positive, so the partial sums list is monotonically increasing. Nothing else is assumed about the numbers.
The value must be chosen from the pair (the g's result) so that its second component's absolute value is the smaller between the two.
This is achieved by minimumBy (comparing (abs . snd)) . g.
clarifications: There's some confusion about "complexity" in the comments below, yet the answer says nothing at all about complexity but uses a specific empirical measurement. You can't argue with empirical data (unless you misinterpret its meaning).
The answer does not claim it "is better than linear", it says "it behaves better than linear" [in the tested range of problem sizes], which the empirical data incontrovertibly show.
Finally, an appeal to authority. Robert Sedgewick is an authority on algorithms. Take it up with him.
(and of course the algorithm handles unordered data as well as it does ordered).
As for the reasons for OP's code inefficiency: map sum . inits can't help being quadratic, but the equivalent scanl (+) 0 is linear. The radical improvement comes about from a lot of redundant calculations in the former being avoided in the latter. (Another example of this can be seen here.)

haskell: how to get list of numbers which are higher then their neighbours in a starting list

I am trying to learn Haskell and I want to solve one task. I have a list of Integers and I need to add them to another list if they are bigger then both of their neighbors. For Example:
I have a starting list of [0,1,5,2,3,7,8,4] and I need to print out a list which is [5, 8]
This is the code I came up but it returns an empty list:
largest :: [Integer]->[Integer]
largest n
| head n > head (tail n) = head n : largest (tail n)
| otherwise = largest (tail n)
I would solve this as outlined by Thomas M. DuBuisson. Since we want the ends of the list to "count", we'll add negative infinities to each end before creating triples. The monoid-extras package provides a suitable type for this.
import Data.Monoid.Inf
pad :: [a] -> [NegInf a]
pad xs = [negInfty] ++ map negFinite xs ++ [negInfty]
triples :: [a] -> [(a, a, a)]
triples (x:rest#(y:z:_)) = (x,y,z) : triples rest
triples _ = []
isBig :: Ord a => (a,a,a) -> Bool
isBig (x,y,z) = y > x && y > z
scnd :: (a, b, c) -> b
scnd (a, b, c) = b
finites :: [Inf p a] -> [a]
finites xs = [x | Finite x <- xs]
largest :: Ord a => [a] -> [a]
largest = id
. finites
. map scnd
. filter isBig
. triples
. pad
It seems to be working appropriately; in ghci:
> largest [0,1,5,2,3,7,8,4]
[5,8]
> largest [10,1,10]
[10,10]
> largest [3]
[3]
> largest []
[]
You might also consider merging finites, map scnd, and filter isBig in a single list comprehension (then eliminating the definitions of finites, scnd, and isBig):
largest :: Ord a => [a] -> [a]
largest xs = [x | (a, b#(Finite x), c) <- triples (pad xs), a < b, c < b]
But I like the decomposed version better; the finites, scnd, and isBig functions may turn out to be useful elsewhere in your development, especially if you plan to build a few variants of this for different needs.
One thing you might try is lookahead. (Thomas M. DuBuisson suggested a different one that will also work if you handle the final one or two elements correctly.) Since it sounds like this is a problem you want to solve on your own as a learning exercise, I’ll write a skeleton that you can take as a starting-point if you want:
largest :: [Integer] -> [Integer]
largest [] = _
largest [x] = _ -- What should this return?
largest [x1,x2] | x1 > x2 = _
| x1 < x2 = _
| otherwise = _
largest [x1,x2,x3] | x2 > x1 && x2 > x3 = _
| x3 > x2 = _
| otherwise = _
largest (x1:x2:x3:xs) | x2 > x1 && x2 > x3 = _
| otherwise = _
We need the special case of [x1,x2,x3] in addition to (x1:x2:x3:[]) because, according to the clarification in your comment, largest [3,3,2] should return []. but largest [3,2] should return [3]. Therefore, the final three elements require special handling and cannot simply recurse on the final two.
If you also want the result to include the head of the list if it is greater than the second element, you’d make this a helper function and your largest would be something like largest (x1:x2:xs) = (if x1>x2 then [x1] else []) ++ largest' (x1:x2:xs). That is, you want some special handling for the first elements of the original list, which you don’t want to apply to all the sublists when you recurse.
As suggested in the comments, one approach would be to first group the list into tuples of length 3 using Preludes zip3 and tail:
*Main> let xs = [0,1,5,2,3,7,8,4]
*Main> zip3 xs (tail xs) (tail (tail xs))
[(0,1,5),(1,5,2),(5,2,3),(2,3,7),(3,7,8),(7,8,4)]
Which is of type: [a] -> [b] -> [c] -> [(a, b, c)] and [a] -> [a] respectively.
Next you need to find a way to filter out the tuples where the middle element is bigger than the first and last element. One way would be to use Preludes filter function:
*Main> let xs = [(0,1,5),(1,5,2),(5,2,3),(2,3,7),(3,7,8),(7,8,4)]
*Main> filter (\(a, b, c) -> b > a && b > c) xs
[(1,5,2),(7,8,4)]
Which is of type: (a -> Bool) -> [a] -> [a]. This filters out elements of a list based on a Boolean returned from the predicate passed.
Now for the final part, you need to extract the middle element from the filtered tuples above. You can do this easily with Preludes map function:
*Main> let xs = [(1,5,2),(7,8,4)]
*Main> map (\(_, x, _) -> x) xs
[5,8]
Which is of type: (a -> b) -> [a] -> [b]. This function maps elements from a list of type a to b.
The above code stitched together would look like this:
largest :: (Ord a) => [a] -> [a]
largest xs = map (\(_, x, _) -> x) $ filter (\(a, b, c) -> b > a && b > c) $ zip3 xs (tail xs) (tail (tail xs))
Note here I used typeclass Ord, since the above code needs to compare with > and <. It's fine to keep it as Integer here though.

all possibilities of dividing a list in two in Haskell

What's the most direct/efficient way to create all possibilities of dividing one (even) list into two in Haskell? I toyed with splitting all permutations of the list but that would add many extras - all the instances where each half contains the same elements, just in a different order. For example,
[1,2,3,4] should produce something like:
[ [1,2], [3,4] ]
[ [1,3], [2,4] ]
[ [1,4], [2,3] ]
Edit: thank you for your comments -- the order of elements and the type of the result is less important to me than the concept - an expression of all two-groups from one group, where element order is unimportant.
Here's an implementation, closely following the definition.
The first element always goes into the left group. After that, we add the next head element into one, or the other group. If one of the groups becomes too big, there is no choice anymore and we must add all the rest into the the shorter group.
divide :: [a] -> [([a], [a])]
divide [] = [([],[])]
divide (x:xs) = go ([x],[], xs, 1,length xs) []
where
go (a,b, [], i,j) zs = (a,b) : zs -- i == lengh a - length b
go (a,b, s#(x:xs), i,j) zs -- j == length s
| i >= j = (a,b++s) : zs
| (-i) >= j = (a++s,b) : zs
| otherwise = go (x:a, b, xs, i+1, j-1) $ go (a, x:b, xs, i-1, j-1) zs
This produces
*Main> divide [1,2,3,4]
[([2,1],[3,4]),([3,1],[2,4]),([1,4],[3,2])]
The limitation of having an even length list is unnecessary:
*Main> divide [1,2,3]
[([2,1],[3]),([3,1],[2]),([1],[3,2])]
(the code was re-written in the "difference-list" style for efficiency: go2 A zs == go1 A ++ zs).
edit: How does this work? Imagine yourself sitting at a pile of stones, dividing it into two. You put the first stone to a side, which one it doesn't matter (so, left, say). Then there's a choice where to put each next stone — unless one of the two piles becomes too small by comparison, and we thus must put all the remaining stones there at once.
To find all partitions of a non-empty list (of even length n) into two equal-sized parts, we can, to avoid repetitions, posit that the first element shall be in the first part. Then it remains to find all ways to split the tail of the list into one part of length n/2 - 1 and one of length n/2.
-- not to be exported
splitLen :: Int -> Int -> [a] -> [([a],[a])]
splitLen 0 _ xs = [([],xs)]
splitLen _ _ [] = error "Oops"
splitLen k l ys#(x:xs)
| k == l = [(ys,[])]
| otherwise = [(x:us,vs) | (us,vs) <- splitLen (k-1) (l-1) xs]
++ [(us,x:vs) | (us,vs) <- splitLen k (l-1) xs]
does that splitting if called appropriately. Then
partitions :: [a] -> [([a],[a])]
partitions [] = [([],[])]
partitions (x:xs)
| even len = error "Original list with odd length"
| otherwise = [(x:us,vs) | (us,vs) <- splitLen half len xs]
where
len = length xs
half = len `quot` 2
generates all the partitions without redundantly computing duplicates.
luqui raises a good point. I haven't taken into account the possibility that you'd want to split lists with repeated elements. With those, it gets a little more complicated, but not much. First, we group the list into equal elements (done here for an Ord constraint, for only Eq, that could still be done in O(length²)). The idea is then similar, to avoid repetitions, we posit that the first half contains more elements of the first group than the second (or, if there is an even number in the first group, equally many, and similar restrictions hold for the next group etc.).
repartitions :: Ord a => [a] -> [([a],[a])]
repartitions = map flatten2 . halves . prepare
where
flatten2 (u,v) = (flatten u, flatten v)
prepare :: Ord a => [a] -> [(a,Int)]
prepare = map (\xs -> (head xs, length xs)) . group . sort
halves :: [(a,Int)] -> [([(a,Int)],[(a,Int)])]
halves [] = [([],[])]
halves ((a,k):more)
| odd total = error "Odd number of elements"
| even k = [((a,low):us,(a,low):vs) | (us,vs) <- halves more] ++ [normalise ((a,c):us,(a,k-c):vs) | c <- [low + 1 .. min half k], (us,vs) <- choose (half-c) remaining more]
| otherwise = [normalise ((a,c):us,(a,k-c):vs) | c <- [low + 1 .. min half k], (us,vs) <- choose (half-c) remaining more]
where
remaining = sum $ map snd more
total = k + remaining
half = total `quot` 2
low = k `quot` 2
normalise (u,v) = (nz u, nz v)
nz = filter ((/= 0) . snd)
choose :: Int -> Int -> [(a,Int)] -> [([(a,Int)],[(a,Int)])]
choose 0 _ xs = [([],xs)]
choose _ _ [] = error "Oops"
choose need have ((a,k):more) = [((a,c):us,(a,k-c):vs) | c <- [least .. most], (us,vs) <- choose (need-c) (have-k) more]
where
least = max 0 (need + k - have)
most = min need k
flatten :: [(a,Int)] -> [a]
flatten xs = xs >>= uncurry (flip replicate)
Daniel Fischer's answer is a good way to solve the problem. I offer a worse (more inefficient) way, but one which more obviously (to me) corresponds to the problem description. I will generate all partitions of the list into two equal length sublists, then filter out equivalent ones according to your definition of equivalence. The way I usually solve problems is by starting like this -- create a solution that is as obvious as possible, then gradually transform it into a more efficient one (if necessary).
import Data.List (sort, nubBy, permutations)
type Partition a = ([a],[a])
-- Your notion of equivalence (sort to ignore the order)
equiv :: (Ord a) => Partition a -> Partition a -> Bool
equiv p q = canon p == canon q
where
canon (xs,ys) = sort [sort xs, sort ys]
-- All ordered partitions
partitions :: [a] -> [Partition a]
partitions xs = map (splitAt l) (permutations xs)
where
l = length xs `div` 2
-- All partitions filtered out by the equivalence
equivPartitions :: (Ord a) => [a] -> [Partition a]
equivPartitions = nubBy equiv . partitions
Testing
>>> equivPartitions [1,2,3,4]
[([1,2],[3,4]),([3,2],[1,4]),([3,1],[2,4])]
Note
After using QuickCheck to test the equivalence of this implementation with Daniel's, I found an important difference. Clearly, mine requires an (Ord a) constraint and his does not, and this hints at what the difference would be. In particular, if you give his [0,0,0,0], you will get a list with three copies of ([0,0],[0,0]), whereas mine will give only one copy. Which of these is correct was not specified; Daniel's is natural when considering the two output lists to be ordered sequences (which is what that type is usually considered to be), mine is natural when considering them as sets or bags (which is how this question seemed to be treating them).
Splitting The Difference
It is possible to get from an implementation that requires Ord to one that doesn't, by operating on the positions rather than the values in a list. I came up with this transformation -- an idea which I believe originates with Benjamin Pierce in his work on bidirectional programming.
import Data.Traversable
import Control.Monad.Trans.State
data Labelled a = Labelled { label :: Integer, value :: a }
instance Eq (Labelled a) where
a == b = compare a b == EQ
instance Ord (Labelled a) where
compare a b = compare (label a) (label b)
labels :: (Traversable t) => t a -> t (Labelled a)
labels t = evalState (traverse trav t) 0
where
trav x = state (\i -> i `seq` (Labelled i x, i + 1))
onIndices :: (Traversable t, Functor u)
=> (forall a. Ord a => t a -> u a)
-> forall b. t b -> u b
onIndices f = fmap value . f . labels
Using onIndices on equivPartitions wouldn't speed it up at all, but it would allow it to have the same semantics as Daniel's (up to equiv of the results) without the constraint, and with my more naive and obvious way of expressing it -- and I just thought it was an interesting way to get rid of the constraint.
My own generalized version, added much later, inspired by Will's answer:
import Data.Map (adjust, fromList, toList)
import Data.List (groupBy, sort)
divide xs n evenly = divide' xs (zip [0..] (replicate n [])) where
evenPSize = div (length xs) n
divide' [] result = [result]
divide' (x:xs) result = do
index <- indexes
divide' xs (toList $ adjust (x :) index (fromList result)) where
notEmptyBins = filter (not . null . snd) $ result
partlyFullBins | evenly == "evenly" = map fst . filter ((<evenPSize) . length . snd) $ notEmptyBins
| otherwise = map fst notEmptyBins
indexes = partlyFullBins
++ if any (null . snd) result
then map fst . take 1 . filter (null . snd) $ result
else if null partlyFullBins
then map fst. head . groupBy (\a b -> length (snd a) == length (snd b)) . sort $ result
else []

Count positive values

Here is my function. It checks for positive values, changes them to ones and sums them.
countPositive :: [Integer] -> Integer
countPositive xs = foldr (+) 0 $ map (^0) (filter (>0) xs)
Is there a better strategy to count positive values without using length but just foldr, map and filter?
Foldr doesn't seem right here. You want foldl' instead. This is my solution:
countPos :: (Num a, Ord a) => [a] -> Int
countPos = length . filter (> 0)
As you don't want to use length for some reason you would basically just reinvent it:
countPos xs = sum (1 <$ filter (> 0) xs)
or yet another method:
countPos = foldl' (\x _ -> succ x) 0 . filter (> 0)
There are lots and lots of way to do this. If 100 people answer to this post, chances are you get 100 different ways to do it, but the simplest way is to use filter and length.
Sure, just directly count them with foldr:
countPositive = foldr (\n count -> if n > 0 then count + 1 else count) 0
Or reimplement length with foldr:
countPositive = foldr (const succ) 0 . filter (>0)