Related
Given a list of tuples like this:
dic = [(1,"aa"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg"),(1,"bb")]
How to group items of dic resulting in a list grp where,
grp = [(1,["aa","bb","cc"]), (2, ["aa"]), (3, ["ff","gg"])]
I'm actually a newcomer to Haskell...and seems to be falling in love with it..
Using group or groupBy in Data.List will only group similar adjacent items in a list.
I wrote an inefficient function for this, but it results in memory failures as I need to process a very large coded string list. Hope you would help me find a more efficient way.
Whenever possible, reuse library code.
import Data.Map
sortAndGroup assocs = fromListWith (++) [(k, [v]) | (k, v) <- assocs]
Try it out in ghci:
*Main> sortAndGroup [(1,"aa"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg"),(1,"bb")]
fromList [(1,["bb","cc","aa"]),(2,["aa"]),(3,["gg","ff"])]
EDIT In the comments, some folks are worried about whether (++) or flip (++) is the right choice. The documentation doesn't say which way things get associated; you can find out by experimenting, or you can sidestep the whole issue using difference lists:
sortAndGroup assocs = ($[]) <$> fromListWith (.) [(k, (v:)) | (k, v) <- assocs]
-- OR
sortAndGroup = fmap ($[]) . M.fromListWith (.) . map (fmap (:))
These alternatives are about the same length as the original, but they're a bit less readable to me.
Here's my solution:
import Data.Function (on)
import Data.List (sortBy, groupBy)
import Data.Ord (comparing)
myGroup :: (Eq a, Ord a) => [(a, b)] -> [(a, [b])]
myGroup = map (\l -> (fst . head $ l, map snd l)) . groupBy ((==) `on` fst)
. sortBy (comparing fst)
This works by first sorting the list with sortBy:
[(1,"aa"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg"),(1,"bb")]
=> [(1,"aa"),(1,"bb"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg")]
then grouping the list elements by the associated key with groupBy:
[(1,"aa"),(1,"bb"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg")]
=> [[(1,"aa"),(1,"bb"),(1,"cc")],[(2,"aa")],[(3,"ff"),(3,"gg")]]
and then transforming the grouped items to tuples with map:
[[(1,"aa"),(1,"bb"),(1,"cc")],[(2,"aa")],[(3,"ff"),(3,"gg")]]
=> [(1,["aa","bb","cc"]), (2, ["aa"]), (3, ["ff","gg"])]`)
Testing:
> myGroup dic
[(1,["aa","bb","cc"]),(2,["aa"]),(3,["ff","gg"])]
Also you can use TransformListComp extension, for example:
Prelude> :set -XTransformListComp
Prelude> import GHC.Exts (groupWith, the)
Prelude GHC.Exts> let dic = [ (1, "aa"), (1, "bb"), (1, "cc") , (2, "aa"), (3, "ff"), (3, "gg")]
Prelude GHC.Exts> [(the key, value) | (key, value) <- dic, then group by key using groupWith]
[(1,["aa","bb","cc"]),(2,["aa"]),(3,["ff","gg"])]
If the list is not sorted on the first element, I don't think you can do better than O(nlog(n)).
One simple way would be to just sort and then use anything from the answer of second part.
You can use from Data.Map a map like Map k [a] to use first element of tuple as key and keep on adding to the values.
You can write your own complex function, which even after you all the attempts will still take O(nlog(n)).
If list is sorted on the first element as is the case in your example, then the task is trivial for something like groupBy as given in the answer by #Mikhail or use foldr and there are numerous other ways.
An example of using foldr is here:
grp :: Eq a => [(a,b)] -> [(a,[b])]
grp = foldr f []
where
f (z,s) [] = [(z,[s])]
f (z,s) a#((x,y):xs) | x == z = (x,s:y):xs
| otherwise = (z,[s]):a
{-# LANGUAGE TransformListComp #-}
import GHC.Exts
import Data.List
import Data.Function (on)
process :: [(Integer, String)] -> [(Integer, [String])]
process list = [(the a, b) | let info = [ (x, y) | (x, y) <- list, then sortWith by y ], (a, b) <- info, then group by a using groupWith]
I am having trouble with an assignment question!
Write the function
freq2 :: String -> -> [(Int,[Char])]
Like freq, the function freq2 counts frequency of occurrence of alphabetic characters.
Given the string:
We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness
I need to end up with:
[(1,"qv"), (2,"gm"), (3,"cfpwy"), (4,"b"), (5,"u"), (6,"do"),(8,"s"), (9,"ln"), (10,"i"), (12,"r"), (13,"h"), (16,"a"),(22,"t"), (28,"e")]
So far I can get to:
[('q',1),('v',1),('g',2),('m',2),('c',3),('f',3),('p',3),('w',3),('y',3),('b',4),('u',5),('d',6),('o',6),('s',8),('l',9),('n',9),('i',10),('r',12),('h',13),('a',16),('t',22),('e',28)]
Using:
freq2 :: String -> [(Char,Int)]
freq2 input = result2
where
lower_case_list = L.map C.toLower input
filtered_list = L.filter C.isAlpha lower_case_list
result = L.map (\a -> (L.head a, L.length a)) $ L.group $ sort filtered_list
result2 = sortBy (compare `on` snd) result
Is there an easy way to get to the last stage or to do the whole thing, possibly using library functions? Or can you please provide some direction on how to finish off this question?
Thanks
Something like this appended to your solution should work:
result3 = map (\xs#((_,x):_) -> (x, map fst xs)) $ L.groupBy ((==) `on` snd) result2
My preference would be to use a Map for these types of problems though:
import qualified Data.Map as Map
import qualified Data.Char as C
import qualified Data.Tuple as T
string = filter C.isAlpha $ map C.toLower "We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness"
swapMapWith f = Map.fromListWith f . map T.swap . Map.toList
freq2 :: String -> [(Int, String)]
freq2 = Map.toList . swapMapWith (++) . foldl (\agg c -> Map.insertWith (+) [c] 1 agg) Map.empty
Method 1:
import needed modules
import Data.Char
import Data.List
filter out uninterested characters and convert the rest to lower case
toLowerAlpha :: String -> String
toLowerAlpha = map toLower . filter isAlpha
sort first, then group, after that the length of each group is the frequency of character in that group
elemFreq :: (Ord a) => [a] -> [(Int, a)]
elemFreq = map (\l -> (length l, head l)) . group . sort
sort and group as step 2, but according to frequency at here, then combine all those characters that have the same frequencies
groupByFreq :: (Integral a, Ord b) => [(a, b)] -> [[(a, b)]]
groupByFreq = groupBy (onFreq (==)) . sortBy (onFreq compare)
where onFreq op (f1,_) (f2,_) = op f1 f2
collectByFreq :: (Integral a) => [[(a, b)]] -> [(a, [b])]
collectByFreq = map (\ls -> (fst . head $ ls, map snd ls))
sequence the above functions will give the required function
freq2 = collectByFreq . groupByFreq . elemFreq . toLowerAlpha
Method 2:
import needed modules
import qualified Data.Char as Char
import qualified Data.Map as Map
filter out uninterested characters and convert the rest to lower case
toLowerAlpha :: String -> String
toLowerAlpha = map Char.toLower . filter Char.isAlpha
create a map, key and value are character and corresponding frequency, respectively
toFreqMap :: (Ord a, Num b) => [a] -> Map.Map a b
toFreqMap = foldr (\c -> Map.insertWith (+) c 1) Map.empty
convert the map created in step 2 to another map, using frequency as key, and characters have that frequency as value
toFreqCol :: (Ord a, Ord b) => Map.Map a b -> Map.Map b [a]
toFreqCol = Map.foldrWithKey (\k a m -> Map.insertWith (++) a [k] m) Map.empty
sequence the above functions will give the required function
freq2 = Map.toAscList . toFreqCol . toFreqMap . toLowerAlpha
I'm trying to write a function that given a list of numbers, returns a list where every 2nd number is doubled in value, starting from the last element. So if the list elements are 1..n, n-th is going to be left as-is, (n-1)-th is going to be doubled in value, (n-2)-th is going to be left as-is, etc.
So here's how I solved it:
MyFunc :: [Integer] -> [Integer]
MyFunc xs = reverse (MyFuncHelper (reverse xs))
MyFuncHelper :: [Integer] -> [Integer]
MyFuncHelper [] = []
MyFuncHelper (x:[]) = [x]
MyFuncHelper (x:y:zs) = [x,y*2] ++ MyFuncHelper zs
And it works:
MyFunc [1,1,1,1] = [2,1,2,1]
MyFunc [1,1,1] = [1,2,1]
However, I can't help but think there has to be a simpler solution than reversing the list, processing it and then reversing it again. Could I simply iterate the list backwards? If yes, how?
The under reversed f xs idiom from the lens library will apply f to xs in reverse order:
under reversed (take 5) [1..100] => [96,97,98,99,100]
When you need to process the list from the end, usually foldr works pretty well. Here is a solution for you without reversing the whole list twice:
doubleOdd :: Num a => [a] -> [a]
doubleOdd = fst . foldr multiplyCond ([], False)
where multiplyCond x (rest, flag) = ((if flag then (x * 2) else x) : rest, not flag)
The multiplyCond function takes a tuple with a flag and the accumulator list. The flag constantly toggles on and off to track whether we should multiply the element or not. The accumulator list simply gathers the resulting numbers. This solution may be not so concise, but avoids extra work and doesn't use anything but prelude functions.
myFunc = reverse
. map (\(b,x) -> if b then x*2 else x)
. zip (cycle [False,True])
. reverse
But this isn't much better. Your implementation is sufficiently elegant.
The simplest way to iterate the list backwards is to reverse the list. I don't think you can really do much better than that; I suspect that if you have to traverse the whole list to find the end, and remember how to get back up, you might as well just reverse it. If this is a big deal, maybe you should be using some other data structure instead of lists—Vector or Seq might be good choices.
Another way to write your helper function is to use Traversable:
import Control.Monad.State
import Data.Traversable (Traversable, traverse)
toggle :: (Bool -> a -> b) -> a -> State Bool b
toggle f a =
do active <- get
put (not active)
return (f active a)
doubleEvens :: (Num a, Traversable t) => t a -> t a
doubleEvens xs = evalState (traverse (toggle step) xs) False
where step True x = 2*x
step False x = x
yourFunc :: Num a => [a] -> [a]
yourFunc = reverse . doubleEvens
Or if we go a bit crazy with Foldable and Traversable, we can try this:
Use Foldable's foldl to extract a reverse-order list from any of its instances. For some types this will be more efficient than reversing a list.
Then we can use traverse and State to map each element of the original structure to its counterpart in the reversed order.
Here's how to do it:
import Control.Monad.State
import Data.Foldable (Foldable)
import qualified Data.Foldable as F
import Data.Traversable (Traversable, traverse)
import Data.Map (Map)
import qualified Data.Map as Map
toReversedList :: Foldable t => t a -> [a]
toReversedList = F.foldl (flip (:)) []
reverse' :: Traversable t => t a -> t a
reverse' ta = evalState (traverse step ta) (toReversedList ta)
where step _ = do (h:t) <- get
put t
return h
yourFunc' :: (Traversable t, Num a) => t a -> t a
yourFunc' = reverse' . doubleEvens
-- >>> yourFunc' $ Map.fromList [(1, 1), (2, 1), (3, 1), (4, 1)]
-- fromList [(1,2),(2,1),(3,2),(4,1)]
-- >>> yourFunc' $ Map.fromList [(1, 1), (2, 1), (3, 1)]
-- fromList [(1,1),(2,2),(3,1)]
There's probably a better way to do this, though...
func xs = zipWith (*) xs $ reverse . (take $ length xs) $ cycle [1,2]
So i've been praticing Haskell, and i was doing just fine, until i got stuck in this exercise. Basically i want a function that receives a list like this :
xs = [("a","b"),("a","c"),("b","e")]
returns something like this :
xs = [("a",["b","c"]), ("b",["e"])].
I come up with this code:
list xs = [(a,[b])|(a,b) <- xs]
but the problem is that this doesn't do what i want. i guess it's close, but not right.
Here's what this returns:
xs = [("a",["b"]),("a",["c"]),("b",["e"])]
If you don't care about the order of the tuples in the final list, the most efficient way (that doesn't reinvent the wheel) would be to make use of the Map type from Data.Map in the containers package:
import Data.Map as Map
clump :: Ord a => [(a,b)] -> [(a, [b])]
clump xs = Map.toList $ Map.fromListWith (flip (++)) [(a, [b]) | (a,b) <- xs]
main = do print $ clump [("a","b"),("a","c"),("b","e")]
If you do care about the result order, you'll probably have to do something ugly and O(n^2) like this:
import Data.List (nub)
clump' :: Eq a => [(a,b)] -> [(a, [b])]
clump' xs = [(a, [b | (a', b) <- xs, a' == a]) | a <- nub $ map fst xs]
main = do print $ clump' [("a","b"),("a","c"),("b","e")]
You could use right fold with Data.Map.insertWith:
import Data.Map as M hiding (foldr)
main :: IO ()
main = print . M.toList
$ foldr (\(k, v) m -> M.insertWith (++) k [v] m)
M.empty
[("a","b"),("a","c"),("b","e")]
Output:
./main
[("a",["b","c"]),("b",["e"])]
The basic principle is that you want to group "similar" elements together.
Whenever you want to group elements together, you have the group functions in Data.List. In this case, you want to specify yourself what counts as similar, so you will need to use the groupBy version. Most functions in Data.List have a By-version that lets you specify more in detail what you want.
Step 1
In your case, you want to define "similarity" as "having the same first element". In Haskell, "having the same first element on a pair" means
(==) `on` fst
In other words, equality on the first element of a pair.
So to do the grouping, we supply that requirement to groupBy, like so:
groupBy ((==) `on` fst) xs
This will get us back, in your example, the two groups:
[[("a","b"),("a","c")]
,[("b","e")]]
Step 2
Now what remains is turning those lists into pairs. The basic principle behind that is, if we let
ys = [("a","b"),("a","c")]
as an example, to take the first element of the first pair, and then just smash the second element of all pairs together into a list. Taking the first element of the first pair is easy!
fst (head ys) == "a"
Taking all the second elements is fairly easy as well!
map snd ys == ["b", "c"]
Both of these operations together give us what we want.
(fst (head ys), map snd ys) == ("a", ["b", "c"])
Finished product
So if you want to, you can write your clumping function as
clump xs = (fst (head ys), map snd ys)
where ys = groupBy ((==) `on` fst) xs
The standard libraries include a function
unzip :: [(a, b)] -> ([a], [b])
The obvious way to define this is
unzip xs = (map fst xs, map snd xs)
However, this means traversing the list twice to construct the result. What I'm wondering is, is there some way to do this with only one traversal?
Appending to a list is expensive - O(n) in fact. But, as any newbie knows, we can make clever use of laziness and recursion to "append" to a list with a recursive call. Thus, zip may easily be implemented as
zip :: [a] -> [b] -> [(a, b)]
zip (a:as) (b:bs) = (a,b) : zip as bs
This trick appear to only work if you're returning one list, however. I can't see how to extend this to allow constructing the tails of multiple lists simultaneously without ending up duplicating the source traversal.
I always presumed that the unzip from the standard library manages to do this in a single traversal (that's kind of the whole point of implementing this otherwise trivial function in a library), but I don't actually know how it works.
Yes, it is possible:
unzip = foldr (\(a,b) ~(as,bs) -> (a:as,b:bs)) ([],[])
With explicit recursion, this would look thus:
unzip [] = ([], [])
unzip ((a,b):xs) = (a:as, b:bs)
where ( as, bs) = unzip xs
The reason that the standard library has the irrefutable pattern match ~(as, bs) is to allow it to work actually lazily:
Prelude> let unzip' = foldr (\(a,b) ~(as,bs) -> (a:as,b:bs)) ([],[])
Prelude> let unzip'' = foldr (\(a,b) (as,bs) -> (a:as,b:bs)) ([],[])
Prelude> head . fst $ unzip' [(n,n) | n<-[1..]]
1
Prelude> head . fst $ unzip'' [(n,n) | n<-[1..]]
*** Exception: stack overflow
The following ideas stem from The Beautiful Folding.
When you have two folding operations over a list, you can always perform them at once by folding with keeping both their states. Let's express this in Haskell. First we need to capture what is a folding operation:
{-# LANGUAGE ExistentialQuantification #-}
import Control.Applicative
data Foldr a b = forall r . Foldr (a -> r -> r) r (r -> b)
A folding operation has a folding function, a start value, and a function that produces a result from a final state. By using existential quantification we can hide the type of the state, which is necessary to combine folds with different states.
Applying a Foldr to a list is just the matter of calling foldr with the appropriate arguments:
fold :: Foldr a b -> [a] -> b
fold (Foldr f s g) = g . foldr f s
Naturally, Foldr is a functor, we can always append a function to the finalizing one:
instance Functor (Foldr a) where
fmap f (Foldr k s r) = Foldr k s (f . r)
More interestingly, it's also an Applicative functor. Implementing pure is easy, we just return a given value and don't fold anything. The most interesting part is <*>. It creates a new fold that keeps the states of both give folds and at the end, combines the results.
instance Applicative (Foldr a) where
pure x = Foldr (\_ _ -> ()) () (\_ -> x)
(Foldr f1 s1 r1) <*> (Foldr f2 s2 r2)
= Foldr foldPair (s1, s2) finishPair
where
foldPair a ~(x1, x2) = (f1 a x1, f2 a x2)
finishPair ~(x1, x2) = r1 x1 (r2 x2)
f *> g = g
f <* g = f
Notice (as in leftaroundabout's answer) that we have lazy pattern matches ~ on tuples. This ensures that <*> is sufficiently lazy.
Now we can express map as a Foldr:
fromMap :: (a -> b) -> Foldr a [b]
fromMap f = Foldr (\x xs -> f x : xs) [] id
With that, defining unzip becomes easy. We just combine two maps, one using fst and another using snd:
unzip' :: Foldr (a, b) ([a], [b])
unzip' = (,) <$> fromMap fst <*> fromMap snd
unzip :: [(a, b)] -> ([a], [b])
unzip = fold unzip'
We can verify that it processes an input only once (and lazily): Both
head . snd $ unzip (repeat (3,'a'))
head . fst $ unzip (repeat (3,'a'))
yield the correct result.