Getting combination pairs of list elements - list

I am trying to get all combinations of a list. The result for [1,2,3] should be [(1,2),(1,3),(2,3)]. My implementation below, however, gives [(1,2),(2,3)].
parings [d] = []
parings (y:ys) = (y, head ys): parings ys

The list comprehension mentioned in 9000's answer can be trivially factored into a map call.
pairwise :: [a] -> [(a, a)]
pairwise [] = []
pairwise (x:xs) = map (\y -> (x, y)) xs ++ pairwise xs
Every list comprehension can be factored into some combination of map, filter, and concatMap, possibly with some let bindings interspersed. Doing so is a good exercise for learning how to manipulate functions.

Here is one implementation using tails from Data.List:
import Data.List
pairings :: [a] -> [(a, a)]
pairings = concatMap makePairs . tails
where
makePairs [] = []
makePairs (x:xs) = fmap ((,) x) xs
Notes:
I don't know whether tails counts as a "special import" for you -- though it is not in the Prelude, it can be found in the base library, which is always available.
To see what tails does, try tails [1..3].
((,) x) is just a compact way of writing (\y -> (x, y)). If you find it ugly, you can write the longer version instead, or enable the TupleSections extension and spell it as (x,).
makePairs might be written without explicit recursion as...
makePairs = maybe [] (\(x, xs) -> fmap ((,) x) xs) . uncons
... with uncons also being found in Data.List.
All the implementations in the answers here have a not insignificant problem: they retain consumed list segments in memory. To see what I'm talking about, watch the memory usage as you run head . drop 8000000 $ pairings [1..] in GHCi. I'm not confident about there being a workaround for that -- a simple concat . tails, for instance, appears to run into the same issue, while fmap makePairs . tails doesn't (even head . drop (10^9) . head . drop (10^9) . fmap makePairs . tails $ [1..] won't eat up all your memory).

I don't know why you're opposed to list comprehensions; with them, the solution is trivial:
pairwise :: [a] -> [(a, a)]
pairwise [] = []
pairwise (x:xs) = [(x, y) | y <- xs] ++ pairwise xs
You can factor out the comprehension into a separate function with explicit tail recursion if you want.
(The whole thing can be made tail-recursive, too, by keeping parameters for both sides of the ++ and having an accumulator parameter.)

Related

How to find all minimum elements in a list of tuples?

How can I find all the minimum elements in a list? Right now I have a list of tuples, i.e.
[(10,'a'),(5,'b'),(1,'c'),(8,'d'),(1,'e')]
So I want the output which is all the minimum elements of the list, in a new list. For example
[(1,'c'),(1,'e')]
I tried
minimumBy (comparing fst) xs
but that only returns the first minimum element.
After you obtain the minimum of the first value, we can filter the list on these items. Because you here want to retrieve a list of minimum items, we can cover the empty list as well by returning an empty list:
minimumsFst :: Ord a => [(a, b)] -> [(a, b)]
minimumsFst [] = []
minimumsFst xs = filter ((==) minfst . fst) xs
where minfst = minimum (map fst xs)
For example:
Prelude> minimumsFst [(10,'a'),(5,'b'),(1,'c'),(8,'d'),(1,'e')]
[(1,'c'),(1,'e')]
Oneliner. The key is sorting.
Prelude Data.List> let a = [(1,'c'),(2,'b'),(1,'w')]
Prelude Data.List> (\xs#((m,_):_) -> takeWhile ((== m) . fst ) xs) . sortOn fst $ a
[(1,'c'),(1,'w')]
Here's a solution that works in one pass (most other answers here do two passes: one to find the minimum value and one to filter on it), and doesn't rely on how the sorting functions are implemented to be efficient.
{-# LANGUAGE ScopedTypeVariables #-}
import Data.Foldable (foldl')
minimumsBy :: forall a. (a -> a -> Ordering) -> [a] -> [a]
minimumsBy _ [] = []
minimumsBy f (x:xs) = postprocess $ foldl' go (x, id) xs
where
go :: (a, [a] -> [a]) -> a -> (a, [a] -> [a])
go acc#(x, xs) y = case f x y of
LT -> acc
EQ -> (x, xs . (y:))
GT -> (y, id)
postprocess :: (a, [a] -> [a]) -> [a]
postprocess (x, xs) = x:xs []
Note that the [a] -> [a] type I'm using here is called a difference list, aka a Hughes list.
You tried
minimumBy (comparing fst) xs
which can also be written as
= head . sortBy (comparing fst) $ xs
= head . sortOn fst $ xs
= head . head . group . sortOn fst $ xs
= head . head . groupBy ((==) `on` fst) . sortOn fst $ xs
This returns just the first element instead of the list of them, so just drop that extra head to get what you want:
= head . groupBy ((==) `on` fst) . sortOn fst $ xs
Of course having head is no good since it'll error out on the [] input. Instead, we can use the safe option,
= concat . take 1 . groupBy ((==) `on` fst) . sortOn fst $ xs
By the way any solution that calls minimum is also unsafe for the empty input list:
> head []
*** Exception: Prelude.head: empty list
> minimum []
*** Exception: Prelude.minimum: empty list
but takeWhile is safe:
> takeWhile undefined []
[]
edit: thanks to laziness, the overall time complexity of the final version should still be O(n) even in the worst case.
You can do it easily too with foldr:
minimumsFst :: Ord a => [(a, b)] -> [(a, b)]
minimumsFst xs = go (minfst xs) xs
where
go mn ls = foldr (\(x, y) rs -> if (x == mn) then (x,y) : rs else rs) [] xs
minfst ls = minimum (map fst ls)
with your example:
minimumsFst [(10,'a'),(5,'b'),(1,'c'),(8,'d'),(1,'e')]
=> [(1,'c'),(1,'e')]

Create a list of sublists of a given list

I need a function that returns a list of all possible sublists, without skipping elements, e.g. sublists [1,2,3,4] should return [[1,2,3,4],[1,2,3] etc.] but the list should NOT contain [1,2,4].
My current "solution" is
>sublists :: [Integer] -> [[Integer]]
>sublists [] = [[]]
>sublists (x:xs) = [x:ys | ys <- sublists xs] ++ sublists xs
which does include [1,2,4]
Thanks in advance
EDIT: Found a solution (with a little help of my friend)
Looks a bit clumsy but it works
>sublists :: [Integer] -> [[Integer]]
>sublists [] = [[]]
>sublists (x:xs) = subs [] (x:xs) ++ sublists xs
> where
> subs :: [Integer] -> [Integer] -> [[Integer]]
> subs xs [] = [xs]
> subs xs (a:as) = (xs ++ [a]) : (subs (xs ++ [a]) as)
Data.List contains both inits and tails. What you want is the inits of each member of the tails list (or possibly vice-versa, but see later for the reason why this way round is better)
sublists = concatMap inits . tails
> sublists [1,2,3,4]
[[],[1],[1,2],[1,2,3],[1,2,3,4],[],[2],[2,3],[2,3,4],[],[3],[3,4],[],[4],[]]
If you prefer, you might want to get rid of all the null lists:
sublists = filter (not . null) . concatMap inits . tails
Or if you prefer to avoid generating the null lists in the first place:
sublists = concatMap (tail . inits) . tails
The result of inits always starts with the empty list, while the result of tails always ends with the empty list. So tail . inits is safe because tail will never be applied to an empty list; it just returns the result without the leading empty list. inits [] just returns [[]], so the last empty list from tails gets dropped.

Algorithm to merge two lists

I'm trying to merge two lists of tuples, x and y. Basically is I have these lists:
[("hello", "hi"), ("foo", "baz"), ("this", "that")]
--and
[("foo", "bar"), ("hello", "world"), ("goo", "boo")]
--the result should be
[("hello", "world"), ("foo", "bar"), ("this", "that")]
I've written this so far:
merge :: (Eq a) => [(a, b)] -> [(a, b)] -> [(a, b)]
merge [] _ = []
merge _ [] = []
merge (x:xs) (y:ys)
| fst x == fst y = (fst y, snd y) : merge xs ys
| otherwise = (fst x, snd x) : merge xs ys
The problem with this solution is that it only merges that are the same index. How can I efficiently iterate over the second list and merge it into the first?
Right now, if the otherwise clause doesn’t match, your code discards both x and y. It should be trying to merge x with the rest of ys. Some hints: [x] is a list you can pass to merge, and if you can think of a way to divide and conquer the problem, you can concatenate a pair of lists with ++.
The correct solution is going to involve combining the results of different steps, and when you start doing that, the efficient approach is going to be tail-recursion.

What does this list permutations implementation in Haskell exactly do?

I am studying the code in the Data.List module and can't exactly wrap my head around this implementation of permutations:
permutations :: [a] -> [[a]]
permutations xs0 = xs0 : perms xs0 []
where
perms [] _ = []
perms (t:ts) is = foldr interleave (perms ts (t:is)) (permutations is)
where interleave xs r = let (_,zs) = interleave' id xs r in zs
interleave' _ [] r = (ts, r)
interleave' f (y:ys) r = let (us,zs) = interleave' (f . (y:)) ys r
in (y:us, f (t:y:us) : zs)
Can somebody explain in detail how these nested functions connect/work with each other?
Sorry about the late answer, it took a bit longer to write down than expected.
So, first of all to maximize lazyness in a list function like this there are two goals:
Produce as many answers as possible before inspecting the next element of the input list
The answers themselves must be lazy, and so there the same must hold.
Now consider the permutation function. Here maximal lazyness means:
We should determine that there are at least n! permutations after inspecting just n elements of input
For each of these n! permutations, the first n elements should depend only on the first n elements of the input.
The first condition could be formalized as
length (take (factorial n) $ permutations ([1..n] ++ undefined))) `seq` () == ()
David Benbennick formalized the second condition as
map (take n) (take (factorial n) $ permutations [1..]) == permutations [1..n]
Combined, we have
map (take n) (take (factorial n) $ permutations ([1..n] ++ undefined)) == permutations [1..n]
Let's start with some simple cases. First permutation [1..]. We must have
permutations [1..] = [1,???] : ???
And with two elements we must have
permutations [1..] = [1,2,???] : [2,1,???] : ???
Note that there is no choice about the order of the first two elements, we can't put [2,1,...] first, since we already decided that the first permutation must start with 1. It should be clear by now that the first element of permutations xs must be equal to xs itself.
Now on to the implementation.
First of all, there are two different ways to make all permutations of a list:
Selection style: keep picking elements from the list until there are none left
permutations [] = [[]]
permutations xxs = [(y:ys) | (y,xs) <- picks xxs, ys <- permutations xs]
where
picks (x:xs) = (x,xs) : [(y,x:ys) | (y,ys) <- picks xs]
Insertion style: insert or interleave each element in all possible places
permutations [] = [[]]
permutations (x:xs) = [y | p <- permutations xs, y <- interleave p]
where
interleave [] = [[x]]
interleave (y:ys) = (x:y:ys) : map (y:) (interleave ys)
Note that neither of these is maximally lazy. The first case, the first thing this function does is pick the first element from the entire list, which is not lazy at all. In the second case we need the permutations of the tail before we can make any permutation.
To start, note that interleave can be made more lazy. The first element of interleave yss list is [x] if yss=[] or (x:y:ys) if yss=y:ys. But both of these are the same as x:yss, so we can write
interleave yss = (x:yss) : interleave' yss
interleave' [] = []
interleave' (y:ys) = map (y:) (interleave ys)
The implementation in Data.List continues on this idea, but uses a few more tricks.
It is perhaps easiest to go through the mailing list discussion. We start with David Benbennick's version, which is the same as the one I wrote above (without the lazy interleave). We already know that the first elment of permutations xs should be xs itself. So, let's put that in
permutations xxs = xxs : permutations' xxs
permutations' [] = []
permutations' (x:xs) = tail $ concatMap interleave $ permutations xs
where interleave = ..
The call to tail is of course not very nice. But if we inline the definitions of permutations and interleave we get
permutations' (x:xs)
= tail $ concatMap interleave $ permutations xs
= tail $ interleave xs ++ concatMap interleave (permutations' xs)
= tail $ (x:xs) : interleave' xs ++ concatMap interleave (permutations' xs)
= interleave' xs ++ concatMap interleave (permutations' xs)
Now we have
permutations xxs = xxs : permutations' xxs
permutations' [] = []
permutations' (x:xs) = interleave' xs ++ concatMap interleave (permutations' xs)
where
interleave yss = (x:yss) : interleave' yss
interleave' [] = []
interleave' (y:ys) = map (y:) (interleave ys)
The next step is optimization. An important target would be to eliminate the (++) calls in interleave. This is not so easy, because of the last line, map (y:) (interleave ys). We can't immediately use the foldr/ShowS trick of passing the tail as a parameter. The way out is to get rid of the map. If we pass a parameter f as the function that has to be mapped over the result at the end, we get
permutations' (x:xs) = interleave' id xs ++ concatMap (interleave id) (permutations' xs)
where
interleave f yss = f (x:yss) : interleave' f yss
interleave' f [] = []
interleave' f (y:ys) = interleave (f . (y:)) ys
Now we can pass in the tail,
permutations' (x:xs) = interleave' id xs $ foldr (interleave id) [] (permutations' xs)
where
interleave f yss r = f (x:yss) : interleave' f yss r
interleave' f [] r = r
interleave' f (y:ys) r = interleave (f . (y:)) ys r
This is starting to look like the one in Data.List, but it is not the same yet. In particular, it is not as lazy as it could be.
Let's try it out:
*Main> let n = 4
*Main> map (take n) (take (factorial n) $ permutations ([1..n] ++ undefined))
[[1,2,3,4],[2,1,3,4],[2,3,1,4],[2,3,4,1]*** Exception: Prelude.undefined
Uh oh, only the first n elements are correct, not the first factorial n.
The reason is that we still try to place the first element (the 1 in the above example) in all possible locations before trying anything else.
Yitzchak Gale came up with a solution. Considered all ways to split the input into an initial part, a middle element, and a tail:
[1..n] == [] ++ 1 : [2..n]
== [1] ++ 2 : [3..n]
== [1,2] ++ 3 : [4..n]
If you haven't seen the trick to generate these before before, you can do this with zip (inits xs) (tails xs).
Now the permutations of [1..n] will be
[] ++ 1 : [2..n] aka. [1..n], or
2 inserted (interleaved) somewhere into a permutation of [1], followed by [3..n]. But not 2 inserted at the end of [1], since we already go that result in the previous bullet point.
3 interleaved into a permutation of [1,2] (not at the end), followed by [4..n].
etc.
You can see that this is maximally lazy, since before we even consider doing something with 3, we have given all permutations that start with some permutation of [1,2]. The code that Yitzchak gave was
permutations xs = xs : concat (zipWith newPerms (init $ tail $ tails xs)
(init $ tail $ inits xs))
where
newPerms (t:ts) = map (++ts) . concatMap (interleave t) . permutations3
interleave t [y] = [[t, y]]
interleave t ys#(y:ys') = (t:ys) : map (y:) (interleave t ys')
Note the recursive call to permutations3, which can be a variant that doesn't have to be maximally lazy.
As you can see this is a bit less optimized than what we had before. But we can apply some of the same tricks.
The first step is to get rid of init and tail. Let's look at what zip (init $ tail $ tails xs) (init $ tail $ inits xs) actually is
*Main> let xs = [1..5] in zip (init $ tail $ tails xs) (init $ tail $ inits xs)
[([2,3,4,5],[1]),([3,4,5],[1,2]),([4,5],[1,2,3]),([5],[1,2,3,4])]
The init gets rid of the combination ([],[1..n]), while the tail gets rid of the combination ([1..n],[]). We don't want the former, because that would fail the pattern match in newPerms. The latter would fail interleave. Both are easy to fix: just add a case for newPerms [] and for interleave t [].
permutations xs = xs : concat (zipWith newPerms (tails xs) (inits xs))
where
newPerms [] is = []
newPerms (t:ts) is = map (++ts) (concatMap (interleave t) (permutations is))
interleave t [] = []
interleave t ys#(y:ys') = (t:ys) : map (y:) (interleave t ys')
Now we can try to inline tails and inits. Their definition is
tails xxs = xxs : case xxs of
[] -> []
(_:xs) -> tails xs
inits xxs = [] : case xxs of
[] -> []
(x:xs) -> map (x:) (inits xs)
The problem is that inits is not tail recursive. But since we are going to take a permutation of the inits anyway, we don't care about the order of the elements. So we can use an accumulating parameter,
inits' = inits'' []
where
inits'' is xxs = is : case xxs of
[] -> []
(x:xs) -> inits'' (x:is) xs
Now we make newPerms a function of xxs and this accumulating parameter, instead of tails xxs and inits xxs.
permutations xs = xs : concat (newPerms' xs [])
where
newPerms' xxs is =
newPerms xxs is :
case xxs of
[] -> []
(x:xs) -> newPerms' xs (x:is)
newPerms [] is = []
newPerms (t:ts) is = map (++ts) (concatMap (interleave t) (permutations3 is))
inlining newPerms into newPerms' then gives
permutations xs = xs : concat (newPerms' xs [])
where
newPerms' [] is = [] : []
newPerms' (t:ts) is =
map (++ts) (concatMap (interleave t) (permutations is)) :
newPerms' ts (t:is)
inlining and unfolding concat, and moving the final map (++ts) into interleave,
permutations xs = xs : newPerms' xs []
where
newPerms' [] is = []
newPerms' (t:ts) is =
concatMap interleave (permutations is) ++
newPerms' ts (t:is)
where
interleave [] = []
interleave (y:ys) = (t:y:ys++ts) : map (y:) (interleave ys)
Then finally, we can reapply the foldr trick to get rid of the (++):
permutations xs = xs : newPerms' xs []
where
newPerms' [] is = []
newPerms' (t:ts) is =
foldr (interleave id) (newPerms' ts (t:is)) (permutations is)
where
interleave f [] r = r
interleave f (y:ys) r = f (t:y:ys++ts) : interleave (f . (y:)) ys r
Wait, I said get rid of the (++). We got rid of one of them, but not the one in interleave.
For that, we can see that we are always concatenating some tail of yys to ts. So, we can unfold the calculating (ys++ts) along with the recursion of interleave, and have the function interleave' f ys r return the tuple (ys++ts, interleave f ys r). This gives
permutations xs = xs : newPerms' xs []
where
newPerms' [] is = []
newPerms' (t:ts) is =
foldr interleave (newPerms' ts (t:is)) (permutations is)
where
interleave ys r = let (_,zs) = interleave' id ys r in zs
interleave' f [] r = (ts,r)
interleave' f (y:ys) r =
let (us,zs) = interleave' (f . (y:)) ys r
in (y:us, f (t:y:us) : zs)
And there you have it, Data.List.permutations in all its maximally lazy optimized glory.
Great write-up by Twan! I (#Yitz) will just add a few references:
The original email thread where Twan developed this algorithm, linked above by Twan, is fascinating reading.
Knuth classifies all possible algorithms that satisfy these criteria in Vol. 4 Fasc. 2 Sec. 7.2.1.2.
Twan's permutations3 is essentially the same as Knuth's "Algorithm P". As far as Knuth knows, that algorithm was first published by English church bell ringers in the 1600's.
The basic algorithm is based on the idea of taking one item from the list at a time, finding every permutation of items including that new one, and then repeating.
To explain what this looks like, [1..] will mean a list from one up, where no values (no even the first) have been examined yet. It is the parameter to the function. The resulting list is something like:
[[1..]] ++
[[2,1,3..]] ++
[[3,2,1,4..], [2,3,1,4..]] ++ [[3,1,2,4..], [1,3,2,4..]]
[[4,3,2,1,5..], etc
The clustering above reflects the core idea of the algorithm... each row represents a new item taken from the input list, and added to the set of items that are being permuted. Furthermore, it is recursive... on each new row, it takes all the existing permutations, and places the item in each place it hasn't been yet (all the places other then the last one). So, on the third row, we have the two permutations [2,1] and [1,2], and then we take place 3 in both available slots, so [[3,2,1], [2,3,1]] and [[3,1,2], [1,3,2]] respectively, and then append whatever the unobserved part is.
Hopefully, this at least clarifies the algorithm a little. However, there are some optimizations and implementation details to explain.
(Side note: There are two central performance optimizations that are used: first, if you want to repeatedly prepend some items to multiple lists, map (x:y:z:) list is a lot faster then matching some conditional or pattern matching, because it has not branch, just a calculated jump. Second, and this one is used a lot, it is cheap (and handy) to build lists from the back to the front, by repeatedly prepending items; this is used in a few places.
The first thing the function does is establish a two bases cases: first, every list has one permutation at least: itself. This can be returned with no evaluation whatsoever. This could be thought of as the "take 0" case.
The outer loop is the part that looks like the following:
perms (t:ts) is = <prepend_stuff_to> (perms ts (t:is))
ts is the "untouched" part of the list, that we are not yet permuting and haven't even examined yet, and is initially the entire input sequence.
t is the new item we will be sticking in between the permutations.
is is the list of items that we will permute, and then place t in between, and is initially empty.
Each time we calculate one of the above rows, we reach the end of the items we have prepended to the thunk containing (perms ts (t:is)) and will recurse.
The second loop in is a foldr. It for each permutation of is (the stuff before the current item in the original list), it interleaves the item into that list, and prepends it to the thunk.
foldr interleave <thunk> (permutations is)
The third loop is one of the most complex. We know that it prepends each possible interspersing of our target item t in a permutation, followed by the unobserved tail onto the result sequence. It does this with a recursive call, where it folds the permutation into a stack of functions as it recurses, and then as it returns, it executes what amounts to a two little state machines to build the results.
Lets look at an example: interleave [<thunk>] [1,2,3] where t = 4 and is = [5..]
First, as interleave' is called recursively, it builds up ys and fs on the stack, like this:
y = 1, f = id
y = 2, f = (id . (1:))
y = 3, f = ((id . (1:)) . (2:))
(the functions are conceptually the same as ([]++), ([1]++), and ([1,2]++) respectively)
Then, as we go back up, we return and evalute a tuple containing two values, (us, zs).
us is the list to which we prepend the ys after our target t.
zs is the result accumulator, where each time we get a new permutation, we prepend it to the results lists.
Thus, to finish the example, f (t:y:us) gets evaluated and returned as a result for each level of the stack above.
([1,2]++) (4:3:[5..]) === [1,2,4,3,5..]
([1]++) (4:2[3,5..]) === [1,4,2,3,5..]
([]++) (4:1[2,3,5..]) === [4,1,2,3,5..]
Hopefully that helps, or at least supplements the material linked in the author's comment above.
(Thanks to dfeuer for bringing this up on IRC and discussing it for a few hours)

Generating all contiguous sublists of a list

I'm kinda new to Haskell and I'm trying to generate all contiguous sublists of a list.
I current have the following:
listSublists :: [a] -> [[a]]
listSublists [] = [[]]
listSublists xs = [xs] ++ listSublists (init xs)
I know the function above would generate sublists with the last element removed but I've no idea how to finish my pseudocode.
My pseudocode is basically,
Take the whole complete list, remove tail. Pass xs of (x:xs) into
listSublists
For example, xs = [1,2,3]
[xs] ++ listSublists (init xs) would generate [1,2,3,4], [1,2,3], [1,2], [1], [] and I'm trying to continue that with passing in [2,3,4] as xs until the list is exhausted.
Can someone give me some pointers? Or am I thinking in a completely wrong way?
The listSublists function that you have is functionally almost identical to
the inits
function. You are on the right track, in that you can currently list all of
the prefixes of a given list.
What you want to ask is "what is a sublist of a list?" One answer is that it
is a suffix of a prefix of the list (i.e. chop off some portion from the
end of a list and then chop off some elements off the front of that list, and
you have one of the contiguous sublists).
So, if you have your prefixes, then what you want is a way to generate all
the suffixes of a given prefix (i.e. of some list). So, if you have
prefixes :: [a] -> [[a]]
prefixes [] = [[]]
prefixes xs = [xs] ++ prefixes (init xs)
you also want a corresponding function suffixes
suffixes :: [a] -> [[a]]
suffixes [] = [[]]
suffixes xs = [xs] ++ suffixes (??? xs)
I will leave it to you to figure out what to use for ???. With these two
functions, you then just take all the prefixes, and produce all the suffixes
to get all the contiguous sublists
allSublists :: [a] -> [[a]]
allSublists = concat . map suffixes . prefixes
You may want to remove all of the empty lists that will be in the result set,
as they are not that interesting of a case.
All sublists (not necessarily contiguous):
sublists [] = [[]]
sublists (x:xs) = [x:sublist | sublist <- sublists xs] ++ sublists xs
Only contiguous sublists:
nub $ concat $ map tails $ inits ls
or
(:) [] $ filter (\x -> length x /= 0) $ concat $ map tails $ inits ls