Joining lists in Haskell - list

So I've been practicing Haskell, and I was doing just fine until I got stuck on this exercise. Basically, I want a function that receives a list like this:
xs = [("a","b"),("a","c"),("b","e")]
and returns something like this:
xs = [("a",["b","c"]), ("b",["e"])].
I came up with this code:
list xs = [(a,[b])|(a,b) <- xs]
but the problem is that this doesn't do what I want. I guess it's close, but not right.
Here's what this returns:
xs = [("a",["b"]),("a",["c"]),("b",["e"])]

If you don't care about the order of the tuples in the final list, the most efficient way (that doesn't reinvent the wheel) would be to make use of the Map type from Data.Map in the containers package:
import qualified Data.Map as Map
clump :: Ord a => [(a,b)] -> [(a, [b])]
clump xs = Map.toList $ Map.fromListWith (flip (++)) [(a, [b]) | (a,b) <- xs]
main = do print $ clump [("a","b"),("a","c"),("b","e")]
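The reason for flip (++) rather than plain (++): fromListWith calls the combining function with the newly inserted value first and the value accumulated so far second, so plain (++) would put later values in front. A small sketch to check this in GHCi (the names pairs, withAppend and withFlip are only for illustration):

```haskell
import qualified Data.Map as Map

pairs :: [(String, [String])]
pairs = [("a", ["b"]), ("a", ["c"]), ("b", ["e"])]

-- (++) new old: values that appear later in the input end up in front
withAppend :: [(String, [String])]
withAppend = Map.toList (Map.fromListWith (++) pairs)

-- flip (++) new old == old ++ new: input order is kept within each key
withFlip :: [(String, [String])]
withFlip = Map.toList (Map.fromListWith (flip (++)) pairs)
```

Here withAppend evaluates to [("a",["c","b"]),("b",["e"])] and withFlip to [("a",["b","c"]),("b",["e"])].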
If you do care about the result order, you'll probably have to do something ugly and O(n^2) like this:
import Data.List (nub)
clump' :: Eq a => [(a,b)] -> [(a, [b])]
clump' xs = [(a, [b | (a', b) <- xs, a' == a]) | a <- nub $ map fst xs]
main = do print $ clump' [("a","b"),("a","c"),("b","e")]

You could use right fold with Data.Map.insertWith:
import qualified Data.Map as M
main :: IO ()
main = print . M.toList
  $ foldr (\(k, v) m -> M.insertWith (++) k [v] m)
      M.empty
      [("a","b"),("a","c"),("b","e")]
Output:
./main
[("a",["b","c"]),("b",["e"])]

The basic principle is that you want to group "similar" elements together.
Whenever you want to group elements together, you have the group functions in Data.List. In this case, you want to specify yourself what counts as similar, so you will need to use the groupBy version. Most functions in Data.List have a By-version that lets you specify more in detail what you want.
Step 1
In your case, you want to define "similarity" as "having the same first element". In Haskell, "having the same first element in a pair" is written (with on imported from Data.Function) as
(==) `on` fst
In other words, equality on the first element of a pair.
So to do the grouping, we supply that requirement to groupBy, like so:
groupBy ((==) `on` fst) xs
This will get us back, in your example, the two groups:
[[("a","b"),("a","c")]
,[("b","e")]]
Step 2
Now what remains is turning those lists into pairs. The basic principle behind that is, if we let
ys = [("a","b"),("a","c")]
as an example, to take the first element of the first pair, and then just smash the second element of all pairs together into a list. Taking the first element of the first pair is easy!
fst (head ys) == "a"
Taking all the second elements is fairly easy as well!
map snd ys == ["b", "c"]
Both of these operations together give us what we want.
(fst (head ys), map snd ys) == ("a", ["b", "c"])
Finished product
So if you want to, you can write your clumping function as
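import Data.Function (on)
import Data.List (groupBy)
-- apply the pair-building step to every group; groups are never empty, so head is safe
clump xs = map (\ys -> (fst (head ys), map snd ys)) (groupBy ((==) `on` fst) xs)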
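One caveat with groupBy: it only merges adjacent elements, so an input in which equal keys are not next to each other yields several groups for the same key. A minimal sketch of a variant that sorts by key first (clumpSorted is a hypothetical name, and it needs an Ord constraint that plain groupBy does not):

```haskell
import Data.Function (on)
import Data.List (groupBy, sortBy)
import Data.Ord (comparing)

-- Hypothetical variant: sort by key first so that equal keys become adjacent.
clumpSorted :: Ord a => [(a, b)] -> [(a, [b])]
clumpSorted =
  map (\ys -> (fst (head ys), map snd ys))  -- groups are never empty
    . groupBy ((==) `on` fst)
    . sortBy (comparing fst)                -- stable: keeps value order per key
```

For example, clumpSorted [("a","b"),("b","e"),("a","c")] gives [("a",["b","c"]),("b",["e"])], whereas grouping without the sort would leave "a" split across two groups.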

Related

Haskell combine the elements of 2 lists at different index's

Apologies for the awful title, I'm not too sure how to describe it in words but here is what I mean. If you know a better way to phrase this please let me know.
Suppose I had 2 lists, of equal length.
[a, b, c] [x, y, z]
I want to create the list
[[a, y, z], [b, x, z], [c, x, y]]
Essentially for every element of list1, I want the 2 elements at different indexes to the first element in list2.
so for "a" at index 0, the other 2 are "y" at index 1 and "z" at index 2.
I'm pretty sure I know how to do it using indexes, however, I know that that's not very efficient and wanted to see if there is a more functional solution out there.
Thank you.
You haven't described any edge cases, but you can get the basic behavior you're looking for with something like:
import Data.List (inits, tails)
combine :: [a] -> [a] -> [[a]]
combine xs ys = zipWith3 go xs (tails ys) (inits ys)
  where
    go a (_:xs) ys = a:ys ++ xs
    go _ _ _ = []
The key is that tails returns successive suffixes of its list starting with the full list, and inits returns successive prefixes starting with the empty list.
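To make that concrete, here is a small sketch of what the two functions return for "xyz" and how they line up position by position (suffixes, prefixes and aligned are illustrative names only):

```haskell
import Data.List (inits, tails)

suffixes, prefixes :: [String]
suffixes = tails "xyz"  -- ["xyz","yz","z",""]
prefixes = inits "xyz"  -- ["","x","xy","xyz"]

-- Position i pairs the suffix starting at index i with the prefix before index i.
aligned :: [(String, String)]
aligned = zip suffixes prefixes
-- [("xyz",""),("yz","x"),("z","xy"),("","xyz")]
```

Dropping the head of each suffix, as go does with its (_:xs) pattern, leaves exactly the elements after index i, and the final pair with the empty suffix is the one the fallback case handles.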
I would do it using zippers. Here's a function I've written into so many projects I've memorized it:
zippers :: [a] -> [([a], a, [a])]
zippers = go [] where
  go b [] = []
  go b (h:e) = (b, h, e) : go (h:b) e
(This actually returns a bit more information than we technically need for this application. But it is the general form -- useful in many situations where the restricted version that only returned the prefix/suffix pair and omitted the current focus would sometimes not be enough.)
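As a quick illustration of what zippers produces, here is the same definition again (repeated only so the snippet loads on its own) with a sample value named example, which is an illustrative name:

```haskell
zippers :: [a] -> [([a], a, [a])]
zippers = go []
  where
    go _ []    = []
    go b (h:e) = (b, h, e) : go (h:b) e

-- Each triple is (elements before the focus, the focus, elements after it).
example :: [(String, Char, String)]
example = zippers "xyz"
-- [("",'x',"yz"),("x",'y',"z"),("yx",'z',"")]
```

Note that the "before" part is accumulated in reverse order ("yx" rather than "xy"), which is why any code consuming it has to apply reverse to get the original order back.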
With this tool in hand, we can just zip (different kind of zip!) together the values from the one list with the zippers of the other list.
combine :: [a] -> [a] -> [[a]]
combine xs ys = zipWith (\x (b, h, e) -> reverse b ++ [x] ++ e) xs (zippers ys)
Try it out in ghci:
> combine "abc" "xyz"
["ayz","xbz","xyc"]
You can try this:
\xs ys -> zipWith3 (((++) .) . (:)) xs (init $ inits ys) (tail $ tails ys)

Haskell function to keep the repeating elements of a list

Here is the expected input/output:
repeated "Mississippi" == "ips"
repeated [1,2,3,4,2,5,6,7,1] == [1,2]
repeated " " == " "
And here is my code so far:
repeated :: String -> String
repeated "" = ""
repeated x = group $ sort x
I know that the last part of the code doesn't work. I was thinking to sort the list then group it, then I wanted to make a filter on the list of list which are greater than 1, or something like that.
Your code already does half of the job
> group $ sort "Mississippi"
["M","iiii","pp","ssss"]
You said you want to filter out the non-duplicates. Let's define a predicate which identifies the lists having at least two elements:
atLeastTwo :: [a] -> Bool
atLeastTwo (_:_:_) = True
atLeastTwo _ = False
Using this:
> filter atLeastTwo . group $ sort "Mississippi"
["iiii","pp","ssss"]
Good. Now, we need to take only the first element from such lists. Since the lists are non-empty, we can use head safely:
> map head . filter atLeastTwo . group $ sort "Mississippi"
"ips"
Alternatively, we could replace the filter with filter (\xs -> length xs >= 2) but this would be less efficient.
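The efficiency difference comes from laziness: the pattern match looks at no more than the first two cells of a list, while length has to walk all of it. A sketch that makes this visible with an infinite list (lazyCheck is an illustrative name):

```haskell
atLeastTwo :: [a] -> Bool
atLeastTwo (_:_:_) = True
atLeastTwo _       = False

-- Only the first two cells are inspected, so this terminates.
lazyCheck :: Bool
lazyCheck = atLeastTwo [1 :: Int ..]

-- By contrast, length [1 :: Int ..] >= 2 would never return,
-- because length must reach the end of the list first.
```

For the short groups produced by group the difference is small, but the pattern-matching version never does more work than needed.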
Yet another option is to use a list comprehension
> [ x | (x:_y:_) <- group $ sort "Mississippi" ]
"ips"
This pattern matches on the lists starting with x and having at least another element _y, combining the filter with taking the head.
Okay, good start. One immediate problem is that the specification requires the function to work on lists of numbers, but you define it for strings. The list must be sorted, so its elements must have the typeclass Ord. Therefore, let’s fix the type signature:
repeated :: Ord a => [a] -> [a]
After calling sort and group, you will have a list of lists, [[a]]. Let’s take your idea of using filter. That works. Your predicate should, as you said, check the length of each list in the list, then compare that length to 1.
Filtering a list of lists gives you a subset, which is another list of lists, of type [[a]]. You need to flatten this list. What you want to do is map each entry in the list of lists to one of its elements. For example, the first. There’s a function in the Prelude to do that.
So, you might fill in the following skeleton:
module Repeated (repeated) where
import Data.List (group, sort)
repeated :: Ord a => [a] -> [a]
repeated = map _
         . filter (\x -> _)
         . group
         . sort
I’ve written this in point-free style with the filtering predicate as a lambda expression, but many other ways to write this are equally good. Find one that you like! (For example, you could also write the filter predicate in point-free style, as a composition of two functions: a comparison on the result of length.)
When you try to compile this, the compiler will tell you that there are two typed holes, the _ entries to the right of the equal signs. It will also tell you the type of the holes. The first hole needs a function that takes a list and gives you back a single element. The second hole needs a Boolean expression using x. Fill these in correctly, and your program will work.
Here are some other approaches, to evaluate @chepner's comment on the solution using group $ sort. (Those solutions look simpler, because some of the complexity is hidden in the library routines.)
While it's true that sorting is O(n lg n), ...
It's not just the sorting but especially the group: that uses span, and both of them build and destroy temporary lists. I.e. they do this:
a linear traversal of an unsorted list will require some other data structure to keep track of all possible duplicates, and lookups in each will add to the space complexity at the very least. While carefully chosen data structures could be used to maintain an overall O(n) running time, the constant would probably make the algorithm slower in practice than the O(n lg n) solution, ...
group/span adds considerably to that complexity, so O(n lg n) is not a correct measure.
while greatly complicating the implementation.
The following all traverse the input list just once. Yes they build auxiliary lists. (Probably a Set would give better performance/quicker lookup.) They maybe look more complex, but to compare apples with apples look also at the code for group/span.
repeated2, repeated3, repeated4 :: Ord a => [a] -> [a]
repeated2/inserter2 builds an auxiliary list of pairs [(a, Bool)], in which the Bool is True if the a appears more than once, False if only once so far.
repeated2 xs = sort $ map fst $ filter snd $ foldr inserter2 [] xs
inserter2 :: Ord a => a -> [(a, Bool)] -> [(a, Bool)]
inserter2 x [] = [(x, False)]
inserter2 x (xb@(x', _): xs)
  | x == x' = (x', True): xs
  | otherwise = xb: inserter2 x xs
repeated3/inserter3 builds an auxiliary list of pairs [(a, Int)], in which the Int counts how many of the a appear. The aux list is sorted anyway, just for the heck of it.
repeated3 xs = map fst $ filter ((> 1).snd) $ foldr inserter3 [] xs
inserter3 :: Ord a => a -> [(a, Int)] -> [(a, Int)]
inserter3 x [] = [(x, 1)]
inserter3 x xss@(xc@(x', c): xs) = case x `compare` x' of
  { LT -> ((x, 1): xss)
  ; EQ -> ((x', c+1): xs)
  ; GT -> (xc: inserter3 x xs)
  }
repeated4/go4 builds an output list of elements known to repeat. It maintains an intermediate list of elements met once (so far) as it traverses the input list. If it meets a repeat: it adds that element to the output list; deletes it from the intermediate list; filters that element out of the tail of the input list.
repeated4 xs = sort $ go4 [] [] xs
go4 :: Ord a => [a] -> [a] -> [a] -> [a]
go4 repeats _ [] = repeats
go4 repeats onces (x: xs) = case findUpd x onces of
  { (True, oncesU) -> go4 (x: repeats) oncesU (filter (/= x) xs)
  ; (False, oncesU) -> go4 repeats oncesU xs
  }
findUpd :: Ord a => a -> [a] -> (Bool, [a])
findUpd x [] = (False, [x])
findUpd x (x': os) | x == x' = (True, os) -- i.e. x' removed
                   | otherwise =
                       let (b, os') = findUpd x os in (b, x': os')
(That last bit of list-fiddling in findUpd is very similar to span.)
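The remark above that a Set would probably give quicker lookup can be sketched as follows. This is not part of the answer's code: repeatedSet is a hypothetical name, and it assumes Data.Set from the containers package. It still traverses the input once, but membership tests take logarithmic rather than linear time.

```haskell
import Data.List (foldl')
import qualified Data.Set as Set

-- Hypothetical Set-based variant of the single-pass idea.
repeatedSet :: Ord a => [a] -> [a]
repeatedSet = Set.toAscList . snd . foldl' step (Set.empty, Set.empty)
  where
    -- seen: every element met so far; dups: elements met at least twice
    step (seen, dups) x
      | x `Set.member` seen = (seen, Set.insert x dups)
      | otherwise           = (Set.insert x seen, dups)
```

With this definition, repeatedSet "Mississippi" gives "ips" and repeatedSet [1,2,3,4,2,5,6,7,1] gives [1,2], in ascending order because of toAscList.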

Haskell add unique combinations of list to tuple

Say for example that I have a list like this
list = ["AC", "BA"]
I would like to add every unique combination of this list to a tuple so the result is like this:
[("AC", "AC"),("AC","BA"),("BA", "BA")]
where ("BA","AC") is excluded.
My first approach was to use a list comprehension like this:
ya = [(x,y) | x <- list, y <- list]
But I couldn't manage to get it to work, is there anyway to achieve my result by using list comprehensions?
My preferred solution uses a list comprehension
import Data.List (tails)
f :: [t] -> [(t, t)]
f list = [ (a,b) | theTail@(a:_) <- tails list , b <- theTail ]
I find this to be quite readable: first you choose (non-deterministically) a suffix theTail, starting with a, and then you choose (non-deterministically) an element b of the suffix. Finally, the pair (a,b) is produced, which clearly ranges over the wanted pairs.
It should also be optimally efficient: every time you demand an element from it, that is produced in constant time.
ThreeFx's answer will work, but it adds the constraint that your elements must be orderable. Instead, you can get away with functions in Prelude and Data.List to implement this more efficiently and more generically:
import Data.List (tails)
permutations2 :: [a] -> [(a, a)]
permutations2 list
  = concat
  $ zipWith (zip . repeat) list
  $ tails list
It doesn't use list comprehensions, but it works without having to perform potentially expensive comparisons and without any constraints on what kind of values you can put through it.
To see how this works, consider that if you had the list [1, 2, 3], you'd have the groups
[(1, 1), (1, 2), (1, 3),
(2, 2), (2, 3),
(3, 3)]
This is equivalent to
[(1, [1, 2, 3]),
(2, [2, 3]),
(3, [3])]
since it contains neither more nor less information. The transformation from this form to our desired output is to map the function f (x, ys) = map (\y -> (x, y)) ys over each tuple, then concat the results together. Now we just need to figure out how to get the second element of those tuples. Quite clearly, all it's doing is dropping successive elements off the front of the list. Luckily, this is already implemented for us by the tails function in Data.List. The first elements of these tuples together just make up the original list, so we know we can use a zip. Initially, you could implement this with
> concatMap (\(x, ys) -> map (\y -> (x, y)) ys) $ zip list $ tails list
But I personally prefer zips, so I'd turn the inner function into one that doesn't use lambdas more than necessary:
> concatMap (\(x, ys) -> zip (repeat x) ys) $ zip list $ tails list
And since I prefer zipWith f over map (uncurry f) . zip, I'd turn this into
> concat $ zipWith (\x ys -> zip (repeat x) ys) list $ tails list
Now, we can reduce this further:
> concat $ zipWith (\x -> zip (repeat x)) list $ tails list
> concat $ zipWith (zip . repeat) list $ tails list
thanks to eta-reduction and function composition. We could make this entirely pointfree, using ap from Control.Monad:
> permutations2 = concat . ap (zipWith (zip . repeat)) tails
But I find this pretty hard to read and understand, so I think I'll stick with the previous version.
Just use a list comprehension:
f :: (Ord a) => [a] -> [(a, a)]
f list = [ (a, b) | a <- list, b <- list, a <= b ]
Since Haskell's String is in the Ord typeclass, which means it can be ordered, you first tell Haskell to get all possible combinations and then exclude every combination where a is greater than b, which removes all "duplicate" combinations.
Example output:
> f [1,2,3,4]
[(1,1),(1,2),(1,3),(1,4),(2,2),(2,3),(2,4),(3,3),(3,4),(4,4)]
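The tails-based and the Ord-based comprehensions agree on sorted input without duplicates, but they are not interchangeable in general: one pairs by position, the other by value. A sketch with both side by side (byPosition and byOrder are illustrative names for the two definitions of f given in the answers):

```haskell
import Data.List (tails)

-- Pairs each element with itself and everything after it in the list.
byPosition :: [a] -> [(a, a)]
byPosition list = [ (a, b) | theTail@(a:_) <- tails list, b <- theTail ]

-- Pairs elements by value, keeping only those with a <= b.
byOrder :: Ord a => [a] -> [(a, a)]
byOrder list = [ (a, b) | a <- list, b <- list, a <= b ]
```

On the unsorted input ["BA","AC"], byPosition gives [("BA","BA"),("BA","AC"),("AC","AC")] while byOrder gives [("BA","BA"),("AC","BA"),("AC","AC")], so which one is right depends on whether "unique" is meant positionally or by ordering.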

Haskell:: how to compare/extract/add each element between lists

I'm trying to get each element from list of lists.
For example, [1,2,3,4] [1,2,3,4]
I need to create a list which is [1+1, 2+2, 3+3, 4+4]
list can be anything. "abcd" "defg" => ["ad","be","cf","dg"]
The thing is that two list can have different length so I can't use zip.
That's one thing and the other thing is comparing.
I need to compare [1,2,3,4] with [1,2,3,4,5,6,7,8]. First list can be longer than the second list, second list might be longer than the first list.
So, if I compare [1,2,3,4] with [1,2,3,4,5,6,7,8], the result should be [5,6,7,8]. Whatever that first list doesn't have, but the second list has, need to be output.
I also CAN NOT USE ANY RECURSIVE FUNCTION. I can only import Data.Char
The thing is that two list can have different length so I can't use zip.
And what should the result be in this case?
CAN NOT USE ANY RECURSIVE FUNCTION
Then it's impossible. There is going to be recursion somewhere, either in the library functions you use (as in other answers), or in functions you write yourself. I suspect you are misunderstanding your task.
For your first question, you can use zipWith:
zipWith f [a1, a2, ...] [b1, b2, ...] == [f a1 b1, f a2 b2, ...]
like, as in your example,
Prelude> zipWith (+) [1 .. 4] [1 .. 4]
[2,4,6,8]
I'm not sure what you need in the case of lists with different lengths. The standard zip and zipWith just ignore the elements of the longer list that don't have a partner. You could keep them unchanged by writing your own analog of zipWith, but it would be something like zipWithRest :: (a -> a -> a) -> [a] -> [a] -> [a], which contradicts the types of your second example with strings.
For the second, you can use list comprehensions:
Prelude> [e | e <- [1 .. 8], e `notElem` [1 .. 4]]
[5,6,7,8]
It would be O(nm) slow, though.
For your second question (if I'm reading it correctly), a simple filter or list comprehension would suffice:
uniques a b = filter (not . flip elem a) b
I believe you can solve this using a combination of concat and nub http://www.haskell.org/ghc/docs/6.12.1/html/libraries/base-4.2.0.0/Data-List.html#v%3anub which will remove all duplicates ...
nub (concat [[0,1,2,3], [1,2,3,4]])
you will need to remove unique elements from the first list before doing this. ie 0
(using the same functions)
Padding then zipping
You suggested in a comment the examples:
[1,2,3,4] [1,2,3] => [1+1, 2+2, 3+3, 4+0]
"abcd" "abc" => ["aa","bb","cc"," d"]
We can solve those sorts of problems by padding the list with a default value:
padZipWith :: a -> (a -> a -> b) -> [a] -> [a] -> [b]
padZipWith def op xs ys = zipWith op xs' ys' where
  maxlen = max (length xs) (length ys)
  xs' = take maxlen (xs ++ repeat def)
  ys' = take maxlen (ys ++ repeat def)
so for example:
ghci> padZipWith 0 (+) [4,3] [10,100,1000,10000]
[14,103,1000,10000]
ghci> padZipWith ' ' (\x y -> [x,y]) "Hi" "Hello"
["HH","ie"," l"," l"," o"]
(You could rewrite padZipWith to have two separate defaults, one for each list, so you could allow the two lists to have different types, but that doesn't sound super useful.)
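For completeness, the two-default variant mentioned in the parenthesis could look roughly like this. It is only a sketch: padZipWith2, defX and defY are hypothetical names, and the body is the same padding idea with one default per list.

```haskell
-- Hypothetical generalisation of padZipWith with a separate default per list,
-- which lets the two lists have different element types.
padZipWith2 :: a -> b -> (a -> b -> c) -> [a] -> [b] -> [c]
padZipWith2 defX defY op xs ys = zipWith op xs' ys'
  where
    maxlen = max (length xs) (length ys)
    xs'    = take maxlen (xs ++ repeat defX)
    ys'    = take maxlen (ys ++ repeat defY)
```

For instance, padZipWith2 0 'x' (,) [1,2,3] "ab" gives [(1,'a'),(2,'b'),(3,'x')]. Like padZipWith, it calls length on both inputs, so it is not suitable for infinite lists.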
General going beyond the common length
For your first question about zipping beyond common length:
How about splitting your lists into an initial segment both have and a tail that only one of them has, using splitAt :: Int -> [a] -> ([a], [a]) from Data.List:
bits xs ys = (frontxs,frontys,backxs,backys) where
  (frontxs,backxs) = splitAt (length ys) xs
  (frontys,backys) = splitAt (length xs) ys
Example:
ghci> bits "Hello Mum" "Hi everyone else"
("Hello Mum","Hi everyo","","ne else")
You could use that various ways:
larger xs ys = let (frontxs,frontys,backxs,backys) = bits xs ys in
  zipWith (\x y -> if x > y then x else y) frontxs frontys ++ backxs ++ backys
needlesslyComplicatedCmpLen xs ys = let (_,_,backxs,backys) = bits xs ys in
  if null backxs && null backys then EQ
  else if null backxs then LT else GT
-- better written as compare (length xs) (length ys)
so
ghci> larger "Hello Mum" "Hi everyone else"
"Hillveryone else"
ghci> needlesslyComplicatedCmpLen "Hello Mum" "Hi everyone else"
LT
but once you've got the hang of splitAt, take, takeWhile, drop etc, I doubt you'll need to write an auxiliary function like bits.
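As one example of using those Prelude functions directly, the "what the second list has beyond the first" part of the question could be read positionally and written with drop. This is a sketch under that assumption (extras is a hypothetical name); if the comparison is meant by value rather than by position, a filter with notElem is the better fit.

```haskell
-- Hypothetical helper: the part of ys that lies beyond the length of xs.
-- Uses only Prelude functions.
extras :: [a] -> [a] -> [a]
extras xs ys = drop (length xs) ys
```

With this reading, extras [1,2,3,4] [1,2,3,4,5,6,7,8] gives [5,6,7,8], and swapping the arguments gives the empty list.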

Haskell - how to iterate list elements in reverse order in an elegant way?

I'm trying to write a function that given a list of numbers, returns a list where every 2nd number is doubled in value, starting from the last element. So if the list elements are 1..n, n-th is going to be left as-is, (n-1)-th is going to be doubled in value, (n-2)-th is going to be left as-is, etc.
So here's how I solved it:
myFunc :: [Integer] -> [Integer]
myFunc xs = reverse (myFuncHelper (reverse xs))
myFuncHelper :: [Integer] -> [Integer]
myFuncHelper [] = []
myFuncHelper (x:[]) = [x]
myFuncHelper (x:y:zs) = [x,y*2] ++ myFuncHelper zs
And it works:
myFunc [1,1,1,1] == [2,1,2,1]
myFunc [1,1,1] == [1,2,1]
However, I can't help but think there has to be a simpler solution than reversing the list, processing it and then reversing it again. Could I simply iterate the list backwards? If yes, how?
The under reversed f xs idiom from the lens library will apply f to xs in reverse order:
under reversed (take 5) [1..100] => [96,97,98,99,100]
When you need to process the list from the end, usually foldr works pretty well. Here is a solution for you without reversing the whole list twice:
doubleOdd :: Num a => [a] -> [a]
doubleOdd = fst . foldr multiplyCond ([], False)
  where multiplyCond x (rest, flag) = ((if flag then (x * 2) else x) : rest, not flag)
The multiplyCond function takes a tuple with a flag and the accumulator list. The flag constantly toggles on and off to track whether we should multiply the element or not. The accumulator list simply gathers the resulting numbers. This solution may be not so concise, but avoids extra work and doesn't use anything but prelude functions.
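To see the flag toggling at work, the intermediate accumulators can be listed with scanr, which runs the same right-to-left fold but keeps every step. In this sketch multiplyCond is lifted to the top level and steps is an illustrative name:

```haskell
multiplyCond :: Num a => a -> ([a], Bool) -> ([a], Bool)
multiplyCond x (rest, flag) = ((if flag then x * 2 else x) : rest, not flag)

-- scanr keeps each accumulator; the initial one comes last in the result.
steps :: [([Integer], Bool)]
steps = scanr multiplyCond ([], False) [1, 2, 3]
-- [([1,4,3],True),([4,3],False),([3],True),([],False)]
```

Reading the list from right to left: 3 is kept as-is and the flag becomes True, 2 is doubled and the flag becomes False, and 1 is kept again. The first component of the head is the final answer, which is what fst . foldr multiplyCond ([], False) extracts.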
myFunc = reverse
  . map (\(b,x) -> if b then x*2 else x)
  . zip (cycle [False,True])
  . reverse
But this isn't much better. Your implementation is sufficiently elegant.
The simplest way to iterate the list backwards is to reverse the list. I don't think you can really do much better than that; I suspect that if you have to traverse the whole list to find the end, and remember how to get back up, you might as well just reverse it. If this is a big deal, maybe you should be using some other data structure instead of lists—Vector or Seq might be good choices.
Another way to write your helper function is to use Traversable:
import Control.Monad.State
import Data.Traversable (Traversable, traverse)
toggle :: (Bool -> a -> b) -> a -> State Bool b
toggle f a =
  do active <- get
     put (not active)
     return (f active a)
doubleEvens :: (Num a, Traversable t) => t a -> t a
doubleEvens xs = evalState (traverse (toggle step) xs) False
  where step True x = 2*x
        step False x = x
yourFunc :: Num a => [a] -> [a]
yourFunc = reverse . doubleEvens . reverse
Or if we go a bit crazy with Foldable and Traversable, we can try this:
Use Foldable's foldl to extract a reverse-order list from any of its instances. For some types this will be more efficient than reversing a list.
Then we can use traverse and State to map each element of the original structure to its counterpart in the reversed order.
Here's how to do it:
import Control.Monad.State
import Data.Foldable (Foldable)
import qualified Data.Foldable as F
import Data.Traversable (Traversable, traverse)
import Data.Map (Map)
import qualified Data.Map as Map
toReversedList :: Foldable t => t a -> [a]
toReversedList = F.foldl (flip (:)) []
reverse' :: Traversable t => t a -> t a
reverse' ta = evalState (traverse step ta) (toReversedList ta)
  where step _ = do ~(h:t) <- get -- lazy pattern, since State has no MonadFail instance
                    put t
                    return h
yourFunc' :: (Traversable t, Num a) => t a -> t a
yourFunc' = reverse' . doubleEvens . reverse'
-- >>> yourFunc' $ Map.fromList [(1, 1), (2, 1), (3, 1), (4, 1)]
-- fromList [(1,2),(2,1),(3,2),(4,1)]
-- >>> yourFunc' $ Map.fromList [(1, 1), (2, 1), (3, 1)]
-- fromList [(1,1),(2,2),(3,1)]
There's probably a better way to do this, though...
func xs = zipWith (*) xs $ reverse . (take $ length xs) $ cycle [1,2]