Intersection of infinite lists - list

I know from computability theory that it is possible to take the intersection of two infinite lists, but I can't find a way to express it in Haskell.
The traditional method fails as soon as the second list is infinite, because you spend all your time checking it for a non-matching element in the first list.
Example:
let ones = 1 : ones -- an unending list of 1s
intersect [0,1] ones
This never yields 1, as it never stops checking ones for the element 0.
A successful method needs to ensure that each element of each list will be visited in finite time.
Probably, this will be by iterating through both lists, and spending approximately equal time checking all previously-visited elements in each list against each other.
If possible, I'd like to also have a way to ignore duplicates in the lists, as it is occasionally necessary, but this is not a requirement.

Using the universe package's Cartesian product operator we can write this one-liner:
import Data.Universe.Helpers
isect :: Eq a => [a] -> [a] -> [a]
xs `isect` ys = [x | (x, y) <- xs +*+ ys, x == y]
-- or this, which may do marginally less allocation
xs `isect` ys = foldr ($) [] $ cartesianProduct
(\x y -> if x == y then (x:) else id)
xs ys
Try it in ghci:
> take 10 $ [0,2..] `isect` [0,3..]
[0,6,12,18,24,30,36,42,48,54]
This implementation will not produce any duplicates if the input lists don't have any; but if they do, you can tack on your favorite dup-remover either before or after calling isect. For example, with nub, you might write
> nub ([0,1] `isect` repeat 1)
[1
and then heat up your computer pretty good, since it can never be sure there might not be a 0 in that second list somewhere if it looks deep enough.
This approach is significantly faster than David Fletcher's, produces many fewer duplicates and produces new values much more quickly than Willem Van Onsem's, and doesn't assume the lists are sorted like freestyle's (but is consequently much slower on such lists than freestyle's).

An idea might be to use incrementing bounds. Let is first relax the problem a bit: yielding duplicated values is allowed. In that case you could use:
import Data.List (intersect)
intersectInfinite :: Eq a => [a] -> [a] -> [a]
intersectInfinite = intersectInfinite' 1
where intersectInfinite' n = intersect (take n xs) (take n ys) ++ intersectInfinite' (n+1)
In other words we claim that:
A∩B = A1∩B1 ∪ A2∩B2 ∪ ... ∪ ...
with A1 is a set containing the first i elements of A (yes there is no order in a set, but let's say there is somehow an order). If the set contains less elements then the full set is returned.
If c is in A (at index i) and in B (at index j), c will be emitted in segment (not index) max(i,j).
This will thus always generate an infinite list (with an infinite amount of duplicates) regardless whether the given lists are finite or not. The only exception is when you give it an empty list, in which case it will take forever. Nevertheless we here ensured that every element in the intersection will be emitted at least once.
Making the result finite (if the given lists are finite)
Now we can make our definition better. First we make a more advanced version of take, takeFinite (let's first give a straight-forward, but not very efficient defintion):
takeFinite :: Int -> [a] -> (Bool,[a])
takeFinite _ [] = (True,[])
takeFinite 0 _ = (False,[])
takeFinite n (x:xs) = let (b,t) = takeFinite (n-1) xs in (b,x:t)
Now we can iteratively deepen until both lists have reached the end:
intersectInfinite :: Eq a => [a] -> [a] -> [a]
intersectInfinite = intersectInfinite' 1
intersectInfinite' :: Eq a => Int -> [a] -> [a] -> [a]
intersectInfinite' n xs ys | fa && fb = intersect xs ys
| fa = intersect ys xs
| fb = intersect xs ys
| otherwise = intersect xfa xfb ++ intersectInfinite' (n+1) xs ys
where (fa,xfa) = takeFinite n xs
(fb,xfb) = takeFinite n ys
This will now terminate given both lists are finite, but still produces a lot of duplicates. There are definitely ways to resolve this issue more.

Here's one way. For each x we make a list of maybes which has
Just x only where x appeared in ys. Then we interleave all
these lists.
isect :: Eq a => [a] -> [a] -> [a]
isect xs ys = (catMaybes . foldr interleave [] . map matches) xs
where
matches x = [if x == y then Just x else Nothing | y <- ys]
interleave :: [a] -> [a] -> [a]
interleave [] ys = ys
interleave (x:xs) ys = x : interleave ys xs
Maybe it can be improved using some sort of fairer interleaving -
it's already pretty slow on the example below because (I think)
it's doing an exponential amount of work.
> take 10 (isect [0..] [0,2..])
[0,2,4,6,8,10,12,14,16,18]

If elements in the lists are ordered then you can easy to do that.
intersectOrd :: Ord a => [a] -> [a] -> [a]
intersectOrd [] _ = []
intersectOrd _ [] = []
intersectOrd (x:xs) (y:ys) = case x `compare` y of
EQ -> x : intersectOrd xs ys
LT -> intersectOrd xs (y:ys)
GT -> intersectOrd (x:xs) ys

Here's yet another alternative, leveraging Control.Monad.WeightedSearch
import Control.Monad (guard)
import Control.Applicative
import qualified Control.Monad.WeightedSearch as W
We first define a cost for digging inside the list. Accessing the tail costs 1 unit more. This will ensure a fair scheduling among the two infinite lists.
eachW :: [a] -> W.T Int a
eachW = foldr (\x w -> pure x <|> W.weight 1 w) empty
Then, we simply disregard infinite lists.
intersection :: [Int] -> [Int] -> [Int]
intersection xs ys = W.toList $ do
x <- eachW xs
y <- eachW ys
guard (x==y)
return y
Even better with MonadComprehensions on:
intersection2 :: [Int] -> [Int] -> [Int]
intersection2 xs ys = W.toList [ y | x <- eachW xs, y <- eachW ys, x==y ]

Solution
I ended up using the following implementation; a slight modification of the answer by David Fletcher:
isect :: Eq a => [a] -> [a] -> [a]
isect [] = const [] -- don't bother testing against an empty list
isect xs = catMaybes . diagonal . map matches
where matches y = [if x == y then Just x else Nothing | x <- xs]
This can be augmented with nub to filter out duplicates:
isectUniq :: Eq a => [a] -> [a] -> [a]
isectUniq xs = nub . isect xs
Explanation
Of the line isect xs = catMaybes . diagonal . map matches
(map matches) ys computes a list of lists of comparisons between elements of xs and ys, where the list indices specify the indices in ys and xs respectively: i.e (map matches) ys !! 3 !! 0 would represent the comparison of ys !! 3 with xs !! 0, which would be Nothing if those values differ. If those values are the same, it would be Just that value.
diagonals takes a list of lists and returns a list of lists where the nth output list contains an element each from the first n lists. Another way to conceptualise it is that (diagonals . map matches) ys !! n contains comparisons between elements whose indices in xs and ys sum to n.
diagonal is simply a flat version of diagonals (diagonal = concat diagonals)
Therefore (diagonal . map matches) ys is a list of comparisons between elements of xs and ys, where the elements are approximately sorted by the sum of the indices of the elements of ys and xs being compared; this means that early elements are compared to later elements with the same priority as middle elements being compared to each other.
(catMaybes . diagonal . map matches) ys is a list of only the elements which are in both lists, where the elements are approximately sorted by the sum of the indices of the two elements being compared.
Note
(diagonal . map (catMaybes . matches)) ys does not work: catMaybes . matches only yields when it finds a match, instead of also yielding Nothing on no match, so the interleaving does nothing to distribute the work.
To contrast, in the chosen solution, the interleaving of Nothing and Just values by diagonal means that the program divides its attention between 'searching' for multiple different elements, not waiting for one to succeed; whereas if the Nothing values are removed before interleaving, the program may spend too much time waiting for a fruitless 'search' for a given element to succeed.
Therefore, we would encounter the same problem as in the original question: while one element does not match any elements in the other list, the program will hang; whereas the chosen solution will only hang while no matches are found for any elements in either list.

Related

Sort algorithm for list of integers in Haskell with recursion

I need to sort an integer list on haskell, from smaller to greater numbers, but i dont know where to start.
The recursion syntax is kinda difficult for me
A little bit of help would be great.
Ive done this but it does not solve my problem:
ordenarMemoria :: [Int] -> [Int]
ordenarMemoria [] = []
ordenarMemoria (x:y:xs)
| y > x = ordenarMemoria (y:xs)
| otherwise = ordenarMemoria (x:xs)
Thanks
You attempt is on the right track for a bubble sort, which is a good starting place for sorting. A few notes:
You handle the cases when the list is empty or has at least two elements (x and y), but you have forgotten the case when your list has exactly one element. You will always reach this case because you are calling your function recursively on smaller lists.
ordenarMemoria [x] = -- how do you sort a 1-element list?
Second note: in this pattern
ordenarMemoria (x:y:xs)
| y > x = ordenarMemoria (y:xs)
| otherwise = ordenarMemoria (x:xs)
you are sorting a list starting with two elements x and y. You compare x to y, and then sort the rest of the list after removing one of the two elements. This is all good.
The question I have is: what happened to the other element? A sorted list has to have all the same elements as the input, so you should use both x and y in the output. So in:
| y > x = ordenarMemoria (y:xs)
you have forgotten about x. Consider
| y > x = x : ordenarMemoria (y:xs)
which indicates to output x, then the sorted remainder.
The other branch forgets about one of the inputs, too.
After you fix the function, you might notice that the list gets a bit more sorted, but it is still not completely sorted. That's a property of the bubble sort—you might have to run it multiple times.
I'll highly recommend you read Learn You a Haskell, there is an online version here, it has a chapter where you can learn how to sort lists using recursion, like Quicksort for example:
quicksort :: (Ord a) => [a] -> [a]
quicksort [] = []
quicksort (x:xs) =
let smallerSorted = quicksort [a | a <- xs, a <= x]
biggerSorted = quicksort [a | a <- xs, a > x]
in smallerSorted ++ [x] ++ biggerSorted
I need to sort an integer list
How about sort from Data.List?
$ stack ghci
Prelude> :m + Data.List
Prelude Data.List> sort [2,3,1]
[1,2,3]
There are lots of choices. I generally recommend starting with bottom-up mergesort in Haskell, but heapsort isn't a bad choice either. Quicksort poses much more serious difficulties.
-- Given two lists, each of which is in increasing
-- order, produce a list in increasing order.
--
-- merge [1,4,5] [2,4,7] = [1,2,4,4,5,7]
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ???
merge xs [] = ???
merge (x : xs) (y : ys)
| x <= y = ???
| otherwise = ???
-- Turn a list of elements into a list of lists
-- of elements, each of which has only one element.
--
-- splatter [1,2,3] = [[1], [2], [3]]
splatter :: [a] -> [[a]]
splatter = map ????
-- Given a list of sorted lists, merge the adjacent pairs of lists.
-- mergePairs [[1,3],[2,4],[0,8],[1,2],[5,7]]
-- = [[1,2,3,4],[0,1,2,8],[5,7]]
mergePairs :: Ord a => [[a]] -> [[a]]
mergePairs [] = ????
mergePairs [as] = ????
mergePairs (as : bs : more) = ????
-- Given a list of lists of sorted lists, merge them all
-- together into one list.
--
-- mergeToOne [[1,4],[2,3]] = [1,2,3,4]
mergeToOne :: Ord a => [[a]] -> [a]
mergeToOne [] = ???
mergeToOne [as] = ???
mergeToOne lots = ??? -- use mergePairs here
mergeSort :: Ord a => [a] -> [a]
mergeSort as = ???? -- Use splatter and mergeToOne
Once you've filled in the blanks above, try optimizing the sort by making splatter produce sorted lists of two or perhaps three elements instead of singletons.
Here is a modified either quicksort or insertion sort. It uses the fastest method of prefixing or suffixing values to the output list. If the next value is less than or greater than the first or last of the list, it is simply affixed to the beginning or end of the list. If the value is not less than the head value or greater than the last value then it must be inserted. The insertion is the same logic as the so-called quicksort above.
Now, the kicker. This function is made to run as a foldr function just to reduce the complexity of the the function. It can easily be converted to a recursive function but it runs fine with foldr.
f2x :: (Ord a) => a -> [a] -> [a]
f2x n ls
| null ls = [n]
| ( n <= (head ls) ) = n:ls -- ++[11]
| ( n >= (last ls) ) = ls ++ [n] -- ++ [22]
| True = [lx|lx <-ls,n > lx]++ n:[lx|lx <-ls,n < lx]
The comments after two line can be removed and the function can be run with scanr to see how many hits are with simple prefix or suffix of values and which are inserted somewhere other that the first or last value.
foldr f2x [] [5,4,3,2,1,0,9,8,7,6]
Or af = foldr a2x [] ... af [5,4,3,2,1,0,9,8,7,6] >-> [0,1,2,3,4,5,6,7,8,9]
EDIT 5/18/2018
The best thing about Stack Overflow is the people like #dfeuer that make you think. #dfeuer suggested using partition. I am like a child, not knowing how. I expressed my difficulty with partition but #dfeuer forced me to see how to use it. #dfeuer also pointed out that the use of last in the above function was wasteful. I did not know that, either.
The following function uses partition imported from Data.List.
partition outputs a tuple pair. This function is also meant to use with foldr. It is a complete insertion sort function.
ft nv ls = b++[nv]++e where (b,e) = partition (<=nv) ls
Use it like above
foldr ft [] [5,4,3,2,1,0,9,8,7,6]
Haskell and functional programming is all about using existing functions in other functions.
putEleInSortedListA :: Ord a => a -> [a] -> [a]
putEleInSortedListA a [] = [a]
putEleInSortedListA a (b:bs)
| a < b = a : b : bs
| otherwise = b: putEleInSortedListA a bs
sortListA :: Ord a => [a] -> [a]
sortListA la = foldr (\a b -> putEleInSortedListA a b) [] la

Creating a lists of lists with new element in each position

i'm new in the haskell world and i'd like to know how to insert a value in each position of a list in haskell, and return a lists of sublists containing the value in each position. For example:
insert' :: a -> [a] -> [[a]]
insert' a [] = [[a]]
insert' a list = ??
To get something like:
insert' 7 [1,2,3] = [[7,1,2,3],[1,7,2,3],[1,2,7,3],[1,2,3,7]]
insert' :: a -> [a] -> [[a]]
insert' y [] = [[y]]
insert' y xss#(x:xs) = (y : xss) : map (x :) (insert' y xs)
While the empty list case comes natural, let's take a look at insert' y xss#(x:xs). We essentially have two cases we need to cover:
y appears in front of x. Then we can just use y : xss.
y appears somewhere after x. We therefore just insert it in the rest of our list and make sure that x is the first element with map (x:).
Although #delta's answer is definitely more elegant, here a solution with difference lists. If we insert an element x on every location of list ys = [y1,y2,...,yn], the first time we will insert it as head, so that means we can construct x : ys.
. For the second element of the resulting list, we want to construct a list [y1,x,y2,...,yn]. We can do this like y1 : x : y2s. The next lists will all have a structure y1 : ....
The question is: how can we write a recursive structure that keeps track of the fact that we want to put elements in the head. We can use a function for that: we start with a function id. If we now call id (x:ys) then we will of course generate the list (x:ys).
We can however, based on the id function, construct a new function id2 = \z -> id (y1:z). This function will thus put y1 in the head of the list and then add the list with which we call id2 as tail. Next we can construct id3 = \z -> id2 (y2:z). This will put y1 and y2 as first elements followed by the tail z.
So we can put this into the following recursive format:
insert' :: a -> [a] -> [[a]]
insert' x = go id
where go d [] = [d [x]]
go d ys#(yh:yt) = (d (x : ys)) : go (d . (yh :)) yt
So we redirect insert' to go where the initial difference list is simply the id function. Each time we check if we have reached the end of the given list. If that is the case, we return the basecase: we call [x] (as tail) on the difference list, and thus construct a list where we append x as last element.
In case we have not yet reached the last element, we will first emit d (x : ys): we prepend x to the list and provide this as argument to the difference list d. d will prepend y1 : y2 : ... : yk up to the point where we insert x. Furthermore we call recursively go (d . (yh :)) yt on the tail of the list: we thus construct a new difference list, wehere we insert (yh :) as tail of the list. We thus produce a new function with one argument: the tail after the yh element.
This function produces the expected results:
*Main> insert' 4 []
[[4]]
*Main> insert' 4 [1,2,5]
[[4,1,2,5],[1,4,2,5],[1,2,4,5],[1,2,5,4]]
*Main> insert' 7 [1,2,3]
[[7,1,2,3],[1,7,2,3],[1,2,7,3],[1,2,3,7]]
You may also do as follows;
import Data.List
spread :: a -> [a] -> [[a]]
spread x xs = zipWith (++) (inits xs) ((x:) <$> tails xs)
*Main> spread 7 [1,2,3]
[[7,1,2,3],[1,7,2,3],[1,2,7,3],[1,2,3,7]]
*Main> spread 7 []
[[7]]
So this is about three stages.
(x:) <$> tails xs is all about applying the (x:) function to all elements of tails xs function. So tails [1,2,3] would return [[1,2,3],[2,3],[3],[]] and we are to apply an fmap which is designated by <$> in the inline form. This is going to be the third argument of the zipWith function.
(inits xs) which would return [[],[1],[1,2],[1,2,3]], is going to be the second argument to zipWith.
zipWith (++) is obviously will zip two list of lists by concatenating the list elements.
So we may also express the same functionality with applicative function functors as follows;
spread :: a -> [a] -> [[a]]
spread x = zipWith (++) <$> inits <*> fmap (x:) . tails
In this case we fmap the zipWith (++) function with type [[a]] -> [[a]] -> [[a]] over inits and then apply it over to fmap (x:) . tails.
It could get more pointfree but becomes more complicated to read through (at least for me). In my opinion this is as best as it gets.

Better way to write this function: list to matrix (= list of lists)

Is there a better way to write this function (i.e. in one line via folds)?
-- list -> rowWidth -> list of lists
listToMatrix :: [a] -> Int -> [[a]]
listToMatrix [] _ = []
listToMatrix xs b = [(take b xs)] ++ (listToMatrix (drop b xs) b)
Actually this is a nice case for unfolding. Data.List method unfoldr, unlike folding a list, creates a list to a from a seed value by applying a function to it up until this function returns Nothing. Until we reach the terminating condition that will return Nothing the function returns Just (a,b) where a is the current generated item of the list and b is the next value of the seed. In this particular case our seed value is the given list.
import Data.List
chunk :: Int -> [a] -> [[a]]
chunk n = unfoldr (\xs -> if null xs then Nothing else Just (splitAt n xs))
*Main> chunk 3 [1,2,3,4,5,6,7,8,9]
[[1,2,3],[4,5,6],[7,8,9]]
*Main> chunk 3 [1,2,3,4,5,6,7,8,9,10]
[[1,2,3],[4,5,6],[7,8,9],[10]]
Yes, there is a better way to write this function. But I don't think that making it one line is going to improve anything.
Use the prepending operator (:) instead of list concatenation (++) for single-element prepending
The expression [(take b xs)] ++ (listToMatrix (drop b xs) b) is inefficient. I don't mean inefficient in terms of performances, because the compiler probably optimizes that, but, here, you are constructing a list, and then calling a function on it ((++)) which is going to deconstruct it by pattern matching. You could instead build your list directly using the (:) data constructor, which allows you to prepend a single element to your list. The expression becomes take b xs : listToMatrix (drop b xs) b
Use splitAt to avoid running through the list twice
import Data.List (splitAt)
listToMatrix :: [a] -> Int -> [[a]]
listToMatrix xs b = row : listToMatrix remaining b
where (row, remaining) = splitAt b xs
Ensure the correctness of your data by using Maybe
listToMatrix :: [a] -> Int -> Maybe [[a]]
listToMatrix xs b
| length xs `mod` b /= 0 = Nothing
| null xs = Just []
| otherwise = Just $ row : listToMatrix remaining b
where (row, remaining) = splitAt b xs
You can even avoid checking every time if you have the right number of elements by defining a helper function:
listToMatrix :: [a] -> Int -> Maybe [[a]]
listToMatrix xs b
| length xs `mod` b /= 0 = Nothing
| otherwise = Just (go xs)
where go [] = []
go xs = row : go remaining
where (row, remaining) = splitAt b xs
Ensure the correctness of your data by using safe types
A matrix has te same number of elements in each row, whereas nested lists don't allow to ensure that kind of conditions. To be certain that every row has the same number of elements, you can either use a library such as matrix or hmatrix
No, although I wish chunksOf was in the standard library. You can get that from here if you want: https://hackage.haskell.org/package/split-0.2.3.2/docs/Data-List-Split.html
Note a better definition would be:
listToMatrix xs b = (take b xs) : (listToMatrix (drop b xs) b)
although this might compile to the same core. You could also use splitAt, though again this is likely to perform the same.
listToMatrix xs b = let (xs,xss) = splitAt b xs in xs : listToMatrix xss
You can use this, the idea is to take all the chunks of the main list, it is a different approach but basicly do the same:
Prelude> let list2Matrix n xs = map (\(x ,y)-> (take n) $ (drop (n*x)) y) $ zip [0..] $ replicate (div (length xs) n) xs
Prelude> list2Matrix 10 [1..100]
[[1,2,3,4,5,6,7,8,9,10],[11,12,13,14,15,16,17,18,19,20],[21,22,23,24,25,26,27,28,29,30],[31,32,33,34,35,36,37,38,39,40],[41,42,43,44,45,46,47,48,49,50],[51,52,53,54,55,56,57,58,59,60],[61,62,63,64,65,66,67,68,69,70],[71,72,73,74,75,76,77,78,79,80],[81,82,83,84,85,86,87,88,89,90],[91,92,93,94,95,96,97,98,99,100]]

all possibilities of dividing a list in two in Haskell

What's the most direct/efficient way to create all possibilities of dividing one (even) list into two in Haskell? I toyed with splitting all permutations of the list but that would add many extras - all the instances where each half contains the same elements, just in a different order. For example,
[1,2,3,4] should produce something like:
[ [1,2], [3,4] ]
[ [1,3], [2,4] ]
[ [1,4], [2,3] ]
Edit: thank you for your comments -- the order of elements and the type of the result is less important to me than the concept - an expression of all two-groups from one group, where element order is unimportant.
Here's an implementation, closely following the definition.
The first element always goes into the left group. After that, we add the next head element into one, or the other group. If one of the groups becomes too big, there is no choice anymore and we must add all the rest into the the shorter group.
divide :: [a] -> [([a], [a])]
divide [] = [([],[])]
divide (x:xs) = go ([x],[], xs, 1,length xs) []
where
go (a,b, [], i,j) zs = (a,b) : zs -- i == lengh a - length b
go (a,b, s#(x:xs), i,j) zs -- j == length s
| i >= j = (a,b++s) : zs
| (-i) >= j = (a++s,b) : zs
| otherwise = go (x:a, b, xs, i+1, j-1) $ go (a, x:b, xs, i-1, j-1) zs
This produces
*Main> divide [1,2,3,4]
[([2,1],[3,4]),([3,1],[2,4]),([1,4],[3,2])]
The limitation of having an even length list is unnecessary:
*Main> divide [1,2,3]
[([2,1],[3]),([3,1],[2]),([1],[3,2])]
(the code was re-written in the "difference-list" style for efficiency: go2 A zs == go1 A ++ zs).
edit: How does this work? Imagine yourself sitting at a pile of stones, dividing it into two. You put the first stone to a side, which one it doesn't matter (so, left, say). Then there's a choice where to put each next stone — unless one of the two piles becomes too small by comparison, and we thus must put all the remaining stones there at once.
To find all partitions of a non-empty list (of even length n) into two equal-sized parts, we can, to avoid repetitions, posit that the first element shall be in the first part. Then it remains to find all ways to split the tail of the list into one part of length n/2 - 1 and one of length n/2.
-- not to be exported
splitLen :: Int -> Int -> [a] -> [([a],[a])]
splitLen 0 _ xs = [([],xs)]
splitLen _ _ [] = error "Oops"
splitLen k l ys#(x:xs)
| k == l = [(ys,[])]
| otherwise = [(x:us,vs) | (us,vs) <- splitLen (k-1) (l-1) xs]
++ [(us,x:vs) | (us,vs) <- splitLen k (l-1) xs]
does that splitting if called appropriately. Then
partitions :: [a] -> [([a],[a])]
partitions [] = [([],[])]
partitions (x:xs)
| even len = error "Original list with odd length"
| otherwise = [(x:us,vs) | (us,vs) <- splitLen half len xs]
where
len = length xs
half = len `quot` 2
generates all the partitions without redundantly computing duplicates.
luqui raises a good point. I haven't taken into account the possibility that you'd want to split lists with repeated elements. With those, it gets a little more complicated, but not much. First, we group the list into equal elements (done here for an Ord constraint, for only Eq, that could still be done in O(length²)). The idea is then similar, to avoid repetitions, we posit that the first half contains more elements of the first group than the second (or, if there is an even number in the first group, equally many, and similar restrictions hold for the next group etc.).
repartitions :: Ord a => [a] -> [([a],[a])]
repartitions = map flatten2 . halves . prepare
where
flatten2 (u,v) = (flatten u, flatten v)
prepare :: Ord a => [a] -> [(a,Int)]
prepare = map (\xs -> (head xs, length xs)) . group . sort
halves :: [(a,Int)] -> [([(a,Int)],[(a,Int)])]
halves [] = [([],[])]
halves ((a,k):more)
| odd total = error "Odd number of elements"
| even k = [((a,low):us,(a,low):vs) | (us,vs) <- halves more] ++ [normalise ((a,c):us,(a,k-c):vs) | c <- [low + 1 .. min half k], (us,vs) <- choose (half-c) remaining more]
| otherwise = [normalise ((a,c):us,(a,k-c):vs) | c <- [low + 1 .. min half k], (us,vs) <- choose (half-c) remaining more]
where
remaining = sum $ map snd more
total = k + remaining
half = total `quot` 2
low = k `quot` 2
normalise (u,v) = (nz u, nz v)
nz = filter ((/= 0) . snd)
choose :: Int -> Int -> [(a,Int)] -> [([(a,Int)],[(a,Int)])]
choose 0 _ xs = [([],xs)]
choose _ _ [] = error "Oops"
choose need have ((a,k):more) = [((a,c):us,(a,k-c):vs) | c <- [least .. most], (us,vs) <- choose (need-c) (have-k) more]
where
least = max 0 (need + k - have)
most = min need k
flatten :: [(a,Int)] -> [a]
flatten xs = xs >>= uncurry (flip replicate)
Daniel Fischer's answer is a good way to solve the problem. I offer a worse (more inefficient) way, but one which more obviously (to me) corresponds to the problem description. I will generate all partitions of the list into two equal length sublists, then filter out equivalent ones according to your definition of equivalence. The way I usually solve problems is by starting like this -- create a solution that is as obvious as possible, then gradually transform it into a more efficient one (if necessary).
import Data.List (sort, nubBy, permutations)
type Partition a = ([a],[a])
-- Your notion of equivalence (sort to ignore the order)
equiv :: (Ord a) => Partition a -> Partition a -> Bool
equiv p q = canon p == canon q
where
canon (xs,ys) = sort [sort xs, sort ys]
-- All ordered partitions
partitions :: [a] -> [Partition a]
partitions xs = map (splitAt l) (permutations xs)
where
l = length xs `div` 2
-- All partitions filtered out by the equivalence
equivPartitions :: (Ord a) => [a] -> [Partition a]
equivPartitions = nubBy equiv . partitions
Testing
>>> equivPartitions [1,2,3,4]
[([1,2],[3,4]),([3,2],[1,4]),([3,1],[2,4])]
Note
After using QuickCheck to test the equivalence of this implementation with Daniel's, I found an important difference. Clearly, mine requires an (Ord a) constraint and his does not, and this hints at what the difference would be. In particular, if you give his [0,0,0,0], you will get a list with three copies of ([0,0],[0,0]), whereas mine will give only one copy. Which of these is correct was not specified; Daniel's is natural when considering the two output lists to be ordered sequences (which is what that type is usually considered to be), mine is natural when considering them as sets or bags (which is how this question seemed to be treating them).
Splitting The Difference
It is possible to get from an implementation that requires Ord to one that doesn't, by operating on the positions rather than the values in a list. I came up with this transformation -- an idea which I believe originates with Benjamin Pierce in his work on bidirectional programming.
import Data.Traversable
import Control.Monad.Trans.State
data Labelled a = Labelled { label :: Integer, value :: a }
instance Eq (Labelled a) where
a == b = compare a b == EQ
instance Ord (Labelled a) where
compare a b = compare (label a) (label b)
labels :: (Traversable t) => t a -> t (Labelled a)
labels t = evalState (traverse trav t) 0
where
trav x = state (\i -> i `seq` (Labelled i x, i + 1))
onIndices :: (Traversable t, Functor u)
=> (forall a. Ord a => t a -> u a)
-> forall b. t b -> u b
onIndices f = fmap value . f . labels
Using onIndices on equivPartitions wouldn't speed it up at all, but it would allow it to have the same semantics as Daniel's (up to equiv of the results) without the constraint, and with my more naive and obvious way of expressing it -- and I just thought it was an interesting way to get rid of the constraint.
My own generalized version, added much later, inspired by Will's answer:
import Data.Map (adjust, fromList, toList)
import Data.List (groupBy, sort)
divide xs n evenly = divide' xs (zip [0..] (replicate n [])) where
evenPSize = div (length xs) n
divide' [] result = [result]
divide' (x:xs) result = do
index <- indexes
divide' xs (toList $ adjust (x :) index (fromList result)) where
notEmptyBins = filter (not . null . snd) $ result
partlyFullBins | evenly == "evenly" = map fst . filter ((<evenPSize) . length . snd) $ notEmptyBins
| otherwise = map fst notEmptyBins
indexes = partlyFullBins
++ if any (null . snd) result
then map fst . take 1 . filter (null . snd) $ result
else if null partlyFullBins
then map fst. head . groupBy (\a b -> length (snd a) == length (snd b)) . sort $ result
else []

Need to partition a list into lists based on breaks in ascending order of elements (Haskell)

Say I have any list like this:
[4,5,6,7,1,2,3,4,5,6,1,2]
I need a Haskell function that will transform this list into a list of lists which are composed of the segments of the original list which form a series in ascending order. So the result should look like this:
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
Any suggestions?
You can do this by resorting to manual recursion, but I like to believe Haskell is a more evolved language. Let's see if we can develop a solution that uses existing recursion strategies. First some preliminaries.
{-# LANGUAGE NoMonomorphismRestriction #-}
-- because who wants to write type signatures, amirite?
import Data.List.Split -- from package split on Hackage
Step one is to observe that we want to split the list based on a criteria that looks at two elements of the list at once. So we'll need a new list with elements representing a "previous" and "next" value. There's a very standard trick for this:
previousAndNext xs = zip xs (drop 1 xs)
However, for our purposes, this won't quite work: this function always outputs a list that's shorter than the input, and we will always want a list of the same length as the input (and in particular we want some output even when the input is a list of length one). So we'll modify the standard trick just a bit with a "null terminator".
pan xs = zip xs (map Just (drop 1 xs) ++ [Nothing])
Now we're going to look through this list for places where the previous element is bigger than the next element (or the next element doesn't exist). Let's write a predicate that does that check.
bigger (x, y) = maybe False (x >) y
Now let's write the function that actually does the split. Our "delimiters" will be values that satisfy bigger; and we never want to throw them away, so let's keep them.
ascendingTuples = split . keepDelimsR $ whenElt bigger
The final step is just to throw together the bit that constructs the tuples, the bit that splits the tuples, and a last bit of munging to throw away the bits of the tuples we don't care about:
ascending = map (map fst) . ascendingTuples . pan
Let's try it out in ghci:
*Main> ascending [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
*Main> ascending [7,6..1]
[[7],[6],[5],[4],[3],[2],[1]]
*Main> ascending []
[[]]
*Main> ascending [1]
[[1]]
P.S. In the current release of split, keepDelimsR is slightly stricter than it needs to be, and as a result ascending currently doesn't work with infinite lists. I've submitted a patch that makes it lazier, though.
ascend :: Ord a => [a] -> [[a]]
ascend xs = foldr f [] xs
where
f a [] = [[a]]
f a xs'#(y:ys) | a < head y = (a:y):ys
| otherwise = [a]:xs'
In ghci
*Main> ascend [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
This problem is a natural fit for a paramorphism-based solution. Having (as defined in that post)
para :: (a -> [a] -> b -> b) -> b -> [a] -> b
foldr :: (a -> b -> b) -> b -> [a] -> b
para c n (x : xs) = c x xs (para c n xs)
foldr c n (x : xs) = c x (foldr c n xs)
para c n [] = n
foldr c n [] = n
we can write
partition_asc xs = para c [] xs where
c x (y:_) ~(a:b) | x<y = (x:a):b
c x _ r = [x]:r
Trivial, since the abstraction fits.
BTW they have two kinds of map in Common Lisp - mapcar
(processing elements of an input list one by one)
and maplist (processing "tails" of a list). With this idea we get
import Data.List (tails)
partition_asc2 xs = foldr c [] . init . tails $ xs where
c (x:y:_) ~(a:b) | x<y = (x:a):b
c (x:_) r = [x]:r
Lazy patterns in both versions make it work with infinite input lists
in a productive manner (as first shown in Daniel Fischer's answer).
update 2020-05-08: not so trivial after all. Both head . head . partition_asc $ [4] ++ undefined and the same for partition_asc2 fail with *** Exception: Prelude.undefined. The combining function g forces the next element y prematurely. It needs to be more carefully written to be productive right away before ever looking at the next element, as e.g. for the second version,
partition_asc2' xs = foldr c [] . init . tails $ xs where
c (x:ys) r#(~(a:b)) = (x:g):gs
where
(g,gs) | not (null ys)
&& x < head ys = (a,b)
| otherwise = ([],r)
(again, as first shown in Daniel's answer).
You can use a right fold to break up the list at down-steps:
foldr foo [] xs
where
foo x yss = (x:zs) : ws
where
(zs, ws) = case yss of
(ys#(y:_)) : rest
| x < y -> (ys,rest)
| otherwise -> ([],yss)
_ -> ([],[])
(It's a bit complicated in order to have the combining function lazy in the second argument, so that it works well for infinite lists too.)
One other way of approaching this task (which, in fact lays the fundamentals of a very efficient sorting algorithm) is using the Continuation Passing Style a.k.a CPS which, in this particular case applied to folding from right; foldr.
As is, this answer would only chunk up the ascending chunks however, it would be nice to chunk up the descending ones at the same time... preferably in reverse order all in O(n) which would leave us with only binary merging of the obtained chunks for a perfectly sorted output. Yet that's another answer for another question.
chunks :: Ord a => [a] -> [[a]]
chunks xs = foldr go return xs $ []
where
go :: Ord a => a -> ([a] -> [[a]]) -> ([a] -> [[a]])
go c f = \ps -> let (r:rs) = f [c]
in case ps of
[] -> r:rs
[p] -> if c > p then (p:r):rs else [p]:(r:rs)
*Main> chunks [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
*Main> chunks [4,5,6,7,1,2,3,4,5,4,3,2,6,1,2]
[[4,5,6,7],[1,2,3,4,5],[4],[3],[2,6],[1,2]]
In the above code c stands for current and p is for previous and again, remember we are folding from right so previous, is actually the next item to process.