Related
So i've been praticing Haskell, and i was doing just fine, until i got stuck in this exercise. Basically i want a function that receives a list like this :
xs = [("a","b"),("a","c"),("b","e")]
returns something like this :
xs = [("a",["b","c"]), ("b",["e"])].
I come up with this code:
list xs = [(a,[b])|(a,b) <- xs]
but the problem is that this doesn't do what i want. i guess it's close, but not right.
Here's what this returns:
xs = [("a",["b"]),("a",["c"]),("b",["e"])]
If you don't care about the order of the tuples in the final list, the most efficient way (that doesn't reinvent the wheel) would be to make use of the Map type from Data.Map in the containers package:
import Data.Map as Map
clump :: Ord a => [(a,b)] -> [(a, [b])]
clump xs = Map.toList $ Map.fromListWith (flip (++)) [(a, [b]) | (a,b) <- xs]
main = do print $ clump [("a","b"),("a","c"),("b","e")]
If you do care about the result order, you'll probably have to do something ugly and O(n^2) like this:
import Data.List (nub)
clump' :: Eq a => [(a,b)] -> [(a, [b])]
clump' xs = [(a, [b | (a', b) <- xs, a' == a]) | a <- nub $ map fst xs]
main = do print $ clump' [("a","b"),("a","c"),("b","e")]
You could use right fold with Data.Map.insertWith:
import Data.Map as M hiding (foldr)
main :: IO ()
main = print . M.toList
$ foldr (\(k, v) m -> M.insertWith (++) k [v] m)
M.empty
[("a","b"),("a","c"),("b","e")]
Output:
./main
[("a",["b","c"]),("b",["e"])]
The basic principle is that you want to group "similar" elements together.
Whenever you want to group elements together, you have the group functions in Data.List. In this case, you want to specify yourself what counts as similar, so you will need to use the groupBy version. Most functions in Data.List have a By-version that lets you specify more in detail what you want.
Step 1
In your case, you want to define "similarity" as "having the same first element". In Haskell, "having the same first element on a pair" means
(==) `on` fst
In other words, equality on the first element of a pair.
So to do the grouping, we supply that requirement to groupBy, like so:
groupBy ((==) `on` fst) xs
This will get us back, in your example, the two groups:
[[("a","b"),("a","c")]
,[("b","e")]]
Step 2
Now what remains is turning those lists into pairs. The basic principle behind that is, if we let
ys = [("a","b"),("a","c")]
as an example, to take the first element of the first pair, and then just smash the second element of all pairs together into a list. Taking the first element of the first pair is easy!
fst (head ys) == "a"
Taking all the second elements is fairly easy as well!
map snd ys == ["b", "c"]
Both of these operations together give us what we want.
(fst (head ys), map snd ys) == ("a", ["b", "c"])
Finished product
So if you want to, you can write your clumping function as
clump xs = (fst (head ys), map snd ys)
where ys = groupBy ((==) `on` fst) xs
Here is an sample problem I'm working upon:
Example Input: test [4, 1, 5, 6] 6 returns 5
I'm solving this using this function:
test :: [Int] -> Int -> Int
test [] _ = 0
test (x:xs) time = if (time - x) < 0
then x
else test xs $ time - x
Any better way to solve this function (probably using any inbuilt higher order function) ?
How about
test xs time = maybe 0 id . fmap snd . find ((>time) . fst) $ zip sums xs
where sums = scanl1 (+) xs
or equivalently with that sugary list comprehension
test xs time = headDef 0 $ [v | (s, v) <- zip sums xs, s > time]
where sums = scanl1 (+) xs
headDef is provided by safe. It's trivial to implement (f _ (x:_) = x; f x _ = x) but the safe package has loads of useful functions like these so it's good to check out.
Which sums the list up to each point and finds the first occurence greater than time. scanl is a useful function that behaves like foldl but keeps intermediate results and zip zips two lists into a list of tuples. Then we just use fmap and maybe to manipulate the Maybe (Integer, Integer) to get our result.
This defaults to 0 like yours but I like the version that simply goes to Maybe Integer better from a user point of view, to get this simply remove the maybe 0 id.
You might like scanl and its close relative, scanl1. For example:
test_ xs time = [curr | (curr, tot) <- zip xs (scanl1 (+) xs), tot > time]
This finds all the places where the running sum is greater than time. Then you can pick the first one (or 0) like this:
safeHead def xs = head (xs ++ [def])
test xs time = safeHead 0 (test_ xs time)
This is verbose, and I don't necessarily recommend writing such a simple function like this (IMO the pattern matching & recursion is plenty clear). But, here's a pretty declarative pipeline:
import Control.Error
import Data.List
deadline :: (Num a, Ord a) => a -> [a] -> a
deadline time = fromMaybe 0 . findDeadline time
findDeadline :: (Num a, Ord a) => a -> [a] -> Maybe a
findDeadline time xs = decayWithDifferences time xs
>>= findIndex (< 0)
>>= atMay xs
decayWithDifferences :: Num b => b -> [b] -> Maybe [b]
decayWithDifferences time = tailMay . scanl (-) time
-- > deadline 6 [4, 1, 5, 6]
-- 5
This documents the code a bit and in principle lets you test a little better, though IMO these functions fit more-or-less into the 'obviously correct' category.
You can verify that it matches your implementation:
import Test.QuickCheck
prop_equality :: [Int] -> Int -> Bool
prop_equality time xs = test xs time == deadline time xs
-- > quickCheck prop_equality
-- +++ OK, passed 100 tests.
In this particular case zipping suggested by others in not quite necessary:
test xs time = head $ [y-x | (x:y:_) <- tails $ scanl1 (+) $ 0:xs, y > time]++[0]
Here scanl1 will produce a list of rolling sums of the list xs, starting it with 0. Therefore, tails will produce a list with at least one list having two elements for non-empty xs. Pattern-matching (x:y:_) extracts two elements from each tail of rolling sums, so in effect it enumerates pairs of neighbouring elements in the list of rolling sums. Filtering on the condition, we reconstruct a part of the list that starts with the first element that produces a rolling sum greater than time. Then use headDef 0 as suggested before, or append a [0], so that head always returns something.
If you want to retain readability, I would just stick with your current solution. It's easy to understand, and isn't doing anything wrong.
Just because you can make it into a one line scan map fold mutant doesn't mean that you should!
Say I have any list like this:
[4,5,6,7,1,2,3,4,5,6,1,2]
I need a Haskell function that will transform this list into a list of lists which are composed of the segments of the original list which form a series in ascending order. So the result should look like this:
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
Any suggestions?
You can do this by resorting to manual recursion, but I like to believe Haskell is a more evolved language. Let's see if we can develop a solution that uses existing recursion strategies. First some preliminaries.
{-# LANGUAGE NoMonomorphismRestriction #-}
-- because who wants to write type signatures, amirite?
import Data.List.Split -- from package split on Hackage
Step one is to observe that we want to split the list based on a criteria that looks at two elements of the list at once. So we'll need a new list with elements representing a "previous" and "next" value. There's a very standard trick for this:
previousAndNext xs = zip xs (drop 1 xs)
However, for our purposes, this won't quite work: this function always outputs a list that's shorter than the input, and we will always want a list of the same length as the input (and in particular we want some output even when the input is a list of length one). So we'll modify the standard trick just a bit with a "null terminator".
pan xs = zip xs (map Just (drop 1 xs) ++ [Nothing])
Now we're going to look through this list for places where the previous element is bigger than the next element (or the next element doesn't exist). Let's write a predicate that does that check.
bigger (x, y) = maybe False (x >) y
Now let's write the function that actually does the split. Our "delimiters" will be values that satisfy bigger; and we never want to throw them away, so let's keep them.
ascendingTuples = split . keepDelimsR $ whenElt bigger
The final step is just to throw together the bit that constructs the tuples, the bit that splits the tuples, and a last bit of munging to throw away the bits of the tuples we don't care about:
ascending = map (map fst) . ascendingTuples . pan
Let's try it out in ghci:
*Main> ascending [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
*Main> ascending [7,6..1]
[[7],[6],[5],[4],[3],[2],[1]]
*Main> ascending []
[[]]
*Main> ascending [1]
[[1]]
P.S. In the current release of split, keepDelimsR is slightly stricter than it needs to be, and as a result ascending currently doesn't work with infinite lists. I've submitted a patch that makes it lazier, though.
ascend :: Ord a => [a] -> [[a]]
ascend xs = foldr f [] xs
where
f a [] = [[a]]
f a xs'#(y:ys) | a < head y = (a:y):ys
| otherwise = [a]:xs'
In ghci
*Main> ascend [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
This problem is a natural fit for a paramorphism-based solution. Having (as defined in that post)
para :: (a -> [a] -> b -> b) -> b -> [a] -> b
foldr :: (a -> b -> b) -> b -> [a] -> b
para c n (x : xs) = c x xs (para c n xs)
foldr c n (x : xs) = c x (foldr c n xs)
para c n [] = n
foldr c n [] = n
we can write
partition_asc xs = para c [] xs where
c x (y:_) ~(a:b) | x<y = (x:a):b
c x _ r = [x]:r
Trivial, since the abstraction fits.
BTW they have two kinds of map in Common Lisp - mapcar
(processing elements of an input list one by one)
and maplist (processing "tails" of a list). With this idea we get
import Data.List (tails)
partition_asc2 xs = foldr c [] . init . tails $ xs where
c (x:y:_) ~(a:b) | x<y = (x:a):b
c (x:_) r = [x]:r
Lazy patterns in both versions make it work with infinite input lists
in a productive manner (as first shown in Daniel Fischer's answer).
update 2020-05-08: not so trivial after all. Both head . head . partition_asc $ [4] ++ undefined and the same for partition_asc2 fail with *** Exception: Prelude.undefined. The combining function g forces the next element y prematurely. It needs to be more carefully written to be productive right away before ever looking at the next element, as e.g. for the second version,
partition_asc2' xs = foldr c [] . init . tails $ xs where
c (x:ys) r#(~(a:b)) = (x:g):gs
where
(g,gs) | not (null ys)
&& x < head ys = (a,b)
| otherwise = ([],r)
(again, as first shown in Daniel's answer).
You can use a right fold to break up the list at down-steps:
foldr foo [] xs
where
foo x yss = (x:zs) : ws
where
(zs, ws) = case yss of
(ys#(y:_)) : rest
| x < y -> (ys,rest)
| otherwise -> ([],yss)
_ -> ([],[])
(It's a bit complicated in order to have the combining function lazy in the second argument, so that it works well for infinite lists too.)
One other way of approaching this task (which, in fact lays the fundamentals of a very efficient sorting algorithm) is using the Continuation Passing Style a.k.a CPS which, in this particular case applied to folding from right; foldr.
As is, this answer would only chunk up the ascending chunks however, it would be nice to chunk up the descending ones at the same time... preferably in reverse order all in O(n) which would leave us with only binary merging of the obtained chunks for a perfectly sorted output. Yet that's another answer for another question.
chunks :: Ord a => [a] -> [[a]]
chunks xs = foldr go return xs $ []
where
go :: Ord a => a -> ([a] -> [[a]]) -> ([a] -> [[a]])
go c f = \ps -> let (r:rs) = f [c]
in case ps of
[] -> r:rs
[p] -> if c > p then (p:r):rs else [p]:(r:rs)
*Main> chunks [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
*Main> chunks [4,5,6,7,1,2,3,4,5,4,3,2,6,1,2]
[[4,5,6,7],[1,2,3,4,5],[4],[3],[2,6],[1,2]]
In the above code c stands for current and p is for previous and again, remember we are folding from right so previous, is actually the next item to process.
The function needs to take an ordered list of integer elements and return all the combinations of adjacent elements in the original list. e.g [1,2,3] would return [[1,2,3],[1],[1,2],[2],[2,3],[3]].
Note that [1,3] should not be included, as 1 and 3 are not adjacent in the original list.
Apart from the fact that inits and tails aren't found in Prelude, you can define your function as such:
yourFunction :: [a] -> [[a]]
yourFunction = filter (not . null) . concat . map inits . tails
This is what it does, step by step:
tails gives all versions of a list with zero or more starting elements removed: tails [1,2,3] == [[1,2,3],[2,3],[3],[]]
map inits applies inits to every list given by tails, and does exactly the opposite: it gives all versions of a list with zero or more ending elements removed: inits [1,2,3] == [[],[1],[1,2],[1,2,3]]
I hope you already know concat: it applies (++) where you see (:) in a list: concat [[1,2],[3],[],[4]] == [1,2,3,4]. You need this, because after map inits . tails, you end up with a list of lists of lists, while you want a list of lists.
filter (not . null) removes the empty lists from the result. There will be more than one (unless you use the function on the empty list).
You could also use concatMap inits instead of concat . map inits, which does exactly the same thing. It usually also performs better.
Edit: you can define this with Prelude-only functions as such:
yourFunction = concatMap inits . tails
where inits = takeWhile (not . null) . iterate init
tails = takeWhile (not . null) . iterate tail
So, if you need consecutive and non empty answers (as you've noticed in comment).
At first, let's define a simple sublist function.
sublist' [] = [[]]
sublist' (x:xs) = sublist' xs ++ map (x:) (sublist' xs)
It returns all sublists with empty and non-consecutive lists. So we need to filtering elements of that list. Something like sublists = (filter consecutive) . filter (/= []) . sublist'
To check list for it's consecution we need to get pairs of neighbors (compactByN 2) and check them.
compactByN :: Int -> [a] -> [[a]]
compactByN _ [] = [[]]
compactByN n list | length list == n = [list]
compactByN n list#(x:xs)= take n list : compactByN n xs
And finally
consecutive :: [Int] -> Bool
consecutive [_] = True
consecutive x = all (\[x,y] -> (x + 1 == y)) $ compact_by_n 2 x
And we have
λ> sublists [1,2,3]
[[3],[2],[2,3],[1],[1,2],[1,2,3]]
Done. http://hpaste.org/53965
Unless, I'm mistaken, you're just asking for the superset of the numbers.
The code is fairly self explanatory - our superset is recursively built by building the superset of the tail twice, once with our current head in it, and once without, and then combining them together and with a list containing our head.
superset xs = []:(superset' xs) -- remember the empty list
superset' (x:xs) = [x]:(map (x:) (superset' xs)) ++ superset' xs
superset' [] = []
Inspired by Comparing list length
If I want to find the longest list in a list of lists, the simplest way is probably:
longestList :: [[a]] -> [a]
longestList = maximumBy (comparing length)
A more efficient way would be to precompute the lengths:
longest :: [[a]] -> [a]
longest xss = snd $ maximumBy (comparing fst) [(length xs, xs) | xs <- xss]
Now, I want to take it one step further. It may not be more efficient for normal cases, but can you solve this using arrows? My idea is basically, step through all of the lists simultaneously, and keep stepping until you've overstepped the length of every list except the longest.
longest [[1],[1],[1..2^1000],[1],[1]]
In the forgoing (very contrived) example, you would only have to take two steps through each list in order to determine that the list [1..2^1000] is the longest, without ever needing to determine the entire length of said list. Am I right that this can be done with arrows? If so, then how? If not, then why not, and how could this approach be implemented?
OK, as I was writing the question, it dawned on me a simple way to implement this (without arrows, boo!)
longest [] = error "it's ambiguous"
longest [xs] = xs
longest xss = longest . filter (not . null) . map (drop 1) $ xss
Except this has a problem...it drops the first part of the list and doesn't recover it!
> take 3 $ longest [[1],[1],[1..2^1000],[1]]
[2,3,4]
Needs more bookkeeping :P
longest xs = longest' $ map (\x -> (x,x)) xs
longest' [] = error "it's ambiguous"
longest' [xs] = fst xs
longest' xss = longest . filter (not . null . snd) . map (sndMap (drop 1)) $ xss
sndMap f (x,y) = (x, f y)
Now it works.
> take 3 $ longest [[1],[1],[1..2^1000],[1]]
[1,2,3]
But no arrows. :( If it can be done with arrows, then hopefully this answer can give you someplace to start.
Thinking about this some more, there is a far simpler solution which gives the same performance characteristics. We can just use maximumBy with a lazy length comparison function:
compareLength [] [] = EQ
compareLength _ [] = GT
compareLength [] _ = LT
compareLength (_:xs) (_:ys) = compareLength xs ys
longest = maximumBy compareLength
Here's the most straightforward implementation I could think of. No arrows involved, though.
I keep a list of pairs where the first element is the original list, and the second is the remaining tail. If we only have one list left, we're done. Otherwise we try taking the tail of all the remaining lists, filtering out those who are empty. If some still remain, keep going. Otherwise, they are all the same length and we arbitrarily pick the first one.
longest [] = error "longest: empty list"
longest xss = go [(xs, xs) | xs <- xss]
where go [(xs, _)] = xs
go xss | null xss' = fst . head $ xss
| otherwise = go xss'
where xss' = [(xs, ys) | (xs, (_:ys)) <- xss]