How to merge two lists of tuples?

I have two lists in Scala, how to merge them such that the tuples are grouped together?
Is there an existing Scala list API which can do this or need I do it by myself?
Input:
List((a,4), (b,1), (c,1), (d,1))
List((a,1), (b,1), (c,1))
Expected output:
List((a,5),(b,2),(c,2),(d,1))

You can try the following one-line:
scala> ( l1 ++ l2 ).groupBy( _._1 ).map( kv => (kv._1, kv._2.map( _._2).sum ) ).toList
res6: List[(Symbol, Int)] = List(('a,5), ('c,2), ('b,2), ('d,1))
Where l1 and l2 are the lists of tuples you want to merge.
Now, the breakdown:
(l1 ++ l2) concatenates both lists.
.groupBy( _._1) groups all tuples by their first element. The result is a Map with the first element as key and the list of tuples starting with that element as value.
.map( kv => (kv._1, kv._2.map( _._2).sum ) ) builds a new map with the same keys, where each value is the sum of all second elements.
.toList converts the result back to a list. The order of that list follows the Map and is not guaranteed, as res6 above shows.
Alternatively, you can use pattern matching to access the tuple elements.
( l1 ++ l2 ).groupBy( _._1 ).map{
case (key,tuples) => (key, tuples.map( _._2).sum )
}.toList

Alternatively you can also use mapValues to shorten the code.
mapValues, as you can probably guess, allows you to re-map just the value for each (key, value) pair in the Map created by groupBy.
In this case the function passed to mapValues reduces each (Char, Int) tuple to just the Int then sums the resulting List of Ints.
(l1 ::: l2).groupBy(_._1).mapValues(_.map(_._2).sum).toList
If the order of the output list needs to follow your example, just add sorted which relies on an Ordering[(Char, Int)] implicit instance.
(l1 ::: l2).groupBy(_._1).mapValues(_.map(_._2).sum).toList.sorted

If you can assume that both List[(A,B)] are ordered according to Ordering[A], you could write something like:
def mergeLists[A,B](one: List[(A,B)], two: List[(A,B)])(op: (B,B) => B)(implicit ord: Ordering[A]): List[(A,B)] = (one, two) match {
  case (xs, Nil) => xs
  case (Nil, ys) => ys
  case ((a,b) :: xs, (aa,bb) :: ys) =>
    if (a == aa) (a, op(b,bb)) :: mergeLists(xs, ys)(op)(ord)
    else if (ord.lt(a,aa)) (a, b) :: mergeLists(xs, (aa,bb) :: ys)(op)(ord)
    else (aa, bb) :: mergeLists((a,b) :: xs, ys)(op)(ord)
}
Unfortunately this isn't tail recursive.

Using foldLeft and toMap:
Get a map out of one list, and iterate through the second list. We upsert entries into the map.
l1.foldLeft(l2.toMap)((accumulator, tuple) =>
accumulator + (tuple._1 -> (accumulator.getOrElse(tuple._1, 0) + tuple._2))
).toList
results in:
List((Symbol(a),5), (Symbol(b),2), (Symbol(c),2), (Symbol(d),1))
Explanation:
l2.toMap converts List((a,1), (b,1), (c,1)) into immutable.Map(a->1, b->1, c->1)
foldLeft iterates through each tuple of the first list, l1.
(a,4) of l1 is added to the generated map, resulting in immutable.Map(a->1+4, b->1, c->1)
(b,1) of l1 is added to the generated map, resulting in immutable.Map(a->5, b->2, c->1)
(c,1) of l1 is added to the generated map, resulting in immutable.Map(a->5, b->2, c->2)
(d,1) of l1 is added to the generated map, resulting in immutable.Map(a->5, b->2, c->2, d->1)
toList converts the map back to the original input form, i.e. List[(Symbol, Int)]

Related

Get the first elements of a list of tuples

I have this list of tuples
[(4,'a'), (1,'b'), (2,'c'), (2,'a'), (1,'d'), (4,'e')]
I want to get the first elements of every tuple then replicate it to make the following: "aaaabccaadeeee"
I came up with this code, but it only gives me the replicate of the first tuple.
replicate (fst ( head [(4,'a'), (1,'b')])) ( snd ( head [(4,'a'), (1,'b')]))
--output is: "aaaa"
I was thinking of using map to get the replicate of every tuple, but I didn't succeed.
Since you already know how to find the correct answer for a single element, all you need is a little recursion
func :: [(Int, a)] -> [a]
func [] = []
func ((n, elem):rest) = (replicate n elem) ++ (func rest)
Mapping the values should also work. You just need to concatenate the resulting strings into one.
func :: [(Int, a)] -> [a]
func xs = concat $ map func2 xs
  where
    func2 (n, elem) = replicate n elem
Or, if you are familiar with currying:
func :: [(Int, a)] -> [a]
func xs = concat $ map (uncurry replicate) xs
Finally, if you are comfortable using function composition, the definition becomes:
func :: [(Int, a)] -> [a]
func = concat . map (uncurry replicate)
Using concat and map is so common, there is a function to do just that. It's concatMap.
func :: [(Int, a)] -> [a]
func = concatMap (uncurry replicate)
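To try the concatMap version end to end, here is a minimal, self-contained sketch. The name expand is only an illustrative choice, not something from the answers above; the input list is the one from the question.

```haskell
module Main where

-- Expand each (count, value) pair into a run of `count` copies of `value`.
-- `expand` is a hypothetical name chosen for this sketch.
expand :: [(Int, a)] -> [a]
expand = concatMap (uncurry replicate)

main :: IO ()
main = do
  let ls = [(4,'a'), (1,'b'), (2,'c'), (2,'a'), (1,'d'), (4,'e')]
  putStrLn (expand ls)        -- prints aaaabccaadeeee
  print (expand [(0,'x')])    -- a zero count contributes nothing: ""
  print (expand [(2, True)])  -- works for any element type: [True,True]
```

Because a String is just [Char], the same function covers the question's case and any other element type.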
Let
ls = [(4,'a'), (1,'b'), (2,'c'), (2,'a'), (1,'d'), (4,'e')]
in
concat [replicate i x | (i, x) <- ls]
will give
"aaaabccaadeeee"
The point-free version
concat . map (uncurry replicate)
You are correct about trying to use map. But first let's see why your code did not work
replicate (fst ( head [(4,'a'), (1,'b')])) ( snd ( head [(4,'a'), (1,'b')]))
Your first parameter to replicate starts from the head of your list, which is (4, 'a'). You then call fst on it, so the first parameter is 4. The same thing happens with the second parameter, and you get 'a'. That is the result you see.
Before using map, let's try to do this with recursion. You want to take one element of the list, apply replicate to it, and then combine it with the result of applying replicate to the remaining elements.
generate [] = []
generate (x:xs) = replicate (fst x) (snd x) ++ generate xs
Do note I am using pattern matching to get the first element of the list. You can use pattern matching to get the elements inside the tuple as well, and then you would not need the fst/snd functions. Also note I am using pattern matching to define the base case of the empty list.
generate [] = []
generate ((x,y):xs) = replicate x y ++ generate xs
Now for map: map applies your function to every element of the list. Here's a first try:
generate (x,y) = replicate x y
map generate xs
The result of the above will be slightly different from recursion. Think about it, map is going to apply generate to every element and store the result in a list. generate creates a list. So when you apply map you are creating a list of list. You can use concat to flatten it if you want, which will give you the same result as recursion.
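A small sketch of that difference, using the generate function from the first try and a shortened input list (the three-element list is just an illustration):

```haskell
module Main where

-- Same per-tuple function as in the text above.
generate :: (Int, a) -> [a]
generate (x, y) = replicate x y

main :: IO ()
main = do
  let xs = [(4,'a'), (1,'b'), (2,'c')]
  -- map alone yields one list per tuple, i.e. a list of lists.
  print (map generate xs)           -- ["aaaa","b","cc"]
  -- concat flattens that into a single list.
  print (concat (map generate xs))  -- "aaaabcc"
```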
Last thing, if you can use recursion, then you can use fold as well. Fold will just apply a function to every element of the list and return the accumulated results (broadly speaking).
-- first parameter is the function to apply, second is the accumulator, third is your list
foldr step [] xs
  where step (x,y) acc = replicate x y ++ acc
Again here I have used pattern matching in the function step to extract the elements of the tuple out.

How do I pair the elements in the first list with all the elements in a second list in SML?

The Problem Statement: Write a function pair that takes two lists of integers and generates a list of pairs, where each pair is a combination of each element from each list.
For example, pair ([1,2], [3,4,5]) should return
[(1,3), (1,4), (1,5), (2,3), (2,4), (2,5)].
My work so far:
-fun pair(a:int list, b:int list) = if null a then nil else if null b then nil else (hd a, hd b)::pair(a, tl b)@pair(tl a, b);
val pair = fn : int list * int list -> (int * int) list
-pair([1,2],[3,4,5]);
val it = [(1,3),(1,4),(1,5),(2,5),(2,4),(2,5),(2,3),(2,4),(2,5)]
I've tried to trace the function to find out why the (2,5), (2,4), (2,5) are showing up but I'm not seeing it clearly still.
It seems straightforward enough but I can't seem to get the last bits ironed out. Some assistance pinpointing why those elements are being added in the middle would be of help.
Thanks.
Peter
The main problem is that you're recursing over both lists.
If you look at your example,
pair ([1,2], [3,4,5]) -> [(1,3), (1,4), (1,5), (2,3), (2,4), (2,5)]
you will see that it has two sublists,
[(1,3), (1,4), (1,5)]
[(2,3), (2,4), (2,5)]
where the first consists of pairs formed from the first element of [1,2] and every element of [3,4,5], and the second is the second element of [1,2] also paired with every element of [3,4,5].
Note that each sublist contains all of [3,4,5] but only one element of [1,2] - the first is the same as pair ([1], [3,4,5]) and the second is pair ([2], [3,4,5]) - so you should only need to recurse over the first list.
You can create such a list like this:
If either input list is empty, the result is empty.
Otherwise:
1. Take the first element of a and pair it with every element of b in a list (hint: think about map).
2. Recursively make pairs from the tail of a and all of b.
3. Combine the results of 1 and 2.
With pattern matching:
fun pair ([], _) = []
| pair (_, []) = []
| pair (x::xs, ys) = <something involving x and ys, suitably combined with 'pair (xs, ys)'>
It might help if you write step 1 as a separate function.
Since this is an exercise, I'm not going to show you the answer to the problem statement.
What you're trying to generate is called the cartesian product of the two lists.
Your current approach (formatted a bit nicer),
fun pair (a, b) =
  if null a then nil else
  if null b then nil else
  (hd a, hd b) :: pair (a, tl b) @ pair (tl a, b);
produces duplicate results down the line because you leave out hd b in pair (a, tl b) and you leave out hd a in pair (tl a, b), but both calls keep recursing over both lists. For example, pair (a, tl b) itself goes on to call pair (tl a, tl b), so the pairs for the rest of a are generated again for each remaining suffix of b.
You can avoid this duplication of work by addressing each element once. I would recommend that you look at the functions map and concat. The general approach is this: For every element x of a, generate (x,y) for every element y of b. "For every element" is map. And
map (fn x => ...something with (x,y)...) a
produces a list of results, just like you want. But if you repeat the same approach of map (fn y => ...) b as the ...something with (x,y)... part, you'll run into an inconvenient surprise that concat can help you with.
You can solve this exercise without using map and concat and instead using manual recursion, but you might have to split the work into two functions, since you'll want one function that folds over the x of a and, for each x, fold once over b. The function map takes the recursion part that both of these functions would have in common and lets you only write the things they don't have in common.
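The "inconvenient surprise" is easiest to see by running it. The sketch below is written in Haskell rather than SML, so it does not solve the exercise as stated; it only illustrates the shape of the nested map and what concat does to it. The inputs are the ones from the problem statement.

```haskell
module Main where

main :: IO ()
main = do
  let a = [1, 2] :: [Int]
      b = [3, 4, 5] :: [Int]
      -- For every x of a, build the list of (x, y) for every y of b.
      nested = map (\x -> map (\y -> (x, y)) b) a
  -- The nested map gives a list of lists, one inner list per element of a.
  print nested
  -- [[(1,3),(1,4),(1,5)],[(2,3),(2,4),(2,5)]]
  -- concat flattens it into the single list of pairs the exercise asks for.
  print (concat nested)
  -- [(1,3),(1,4),(1,5),(2,3),(2,4),(2,5)]
```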

Calculating the difference between two strings

I have two strings
a :: [String]
a = ["A1","A2","B3","C3"]
and
b :: [String]
b = ["A1","B2","B3","D5"]
And I want to calculate the difference between the two lists of strings, comparing them element by element on both characters.
If the two elements at the same position are the same, that counts as 1.
The function I declared is
calcP :: [String] -> [String] -> (Int,[String])
calcP (x:xs) (y:ys) = (a,b)
where
a = 0 in
???
b = ????
I know that I should have a increment variable to count the correct element, and where I should put it in? For now I totally have no idea about how to do that, can anyone give me some hint??
The desired result would be
(2,["B2","D5"])
How should I do that?
I assume that the lists have the same size.
The differences between the two lists
Let's focus on the main part of the problem:
Prelude> a=["A1","A2","B3","C3"]
Prelude> b=["A1","B2","B3","D5"]
First, notice that the zip function zips two lists. If you use it on a and b, you get:
Prelude> zip a b
[("A1","A1"),("A2","B2"),("B3","B3"),("C3","D5")]
Ok. It's now time to compare the terms one to one. There are many ways to do it.
Filter
Prelude> filter(\(x,y)->x/=y)(zip a b)
[("A2","B2"),("C3","D5")]
The lambda function returns True if the elements of the pair are different (/= operator). Thus, the filter keeps only the pairs that don't match.
That works, but you have to do a little more work to keep only the second element of each pair.
Prelude> map(snd)(filter(\(x,y)->x/=y)(zip a b))
["B2","D5"]
map(snd) applies snd, which keeps only the second element of a pair, to every discordant pair.
Fold
A fold is more generic, and may be used to implement a filter. Let's see how:
Prelude> foldl(\l(x,y)->if x==y then l else l++[y])[](zip a b)
["B2","D5"]
The lambda function takes every pair (x,y) and compares the two elements. If they have the same value, the accumulator list remains unchanged, but if the values are different, the second element is appended to the accumulator list.
List comprehension
This is more compact, and should seem obvious to every Python programmer:
Prelude> [y|(x,y)<-zip a b, x/=y] -- in Python [y for (x,y) in zip(a,b) if x!= y]
["B2","D5"]
The number of elements
You want a pair with the number of elements and the elements themselves.
Fold
With a fold, it's easy but cumbersome: you will use a slightly more complicated accumulator, that stores simultaneously the differences (l) and the number of those differences (n).
Prelude> foldl(\(n,l)(x,y)->if x==y then (n,l) else (n+1,l++[y]))(0,[])$zip a b
(2,["B2","D5"])
Lambda
But you can use the fact that your output is redundant: you want a list preceded by the length of that list. Why not apply a lambda that does the job?
Prelude> (\x->(length x,x))[1,2,3]
(3,[1,2,3])
With a list comprehension, it gives:
Prelude> (\x->(length x,x))[y|(x,y)<-zip a b, x/=y]
(2,["B2","D5"])
Bind operator
Finally, and for the fun, you don't need to build the lambda this way. You could do:
Prelude> ((,)=<<length)[y|(x,y)<-zip a b,x/=y]
(2,["B2","D5"])
What happens here? (,) is an operator that makes a pair from two elements:
Prelude> (,) 1 2
(1,2)
and ((,)=<<length) : 1. takes a list (technically a Foldable) and passes it to the length function; 2. the list and the length are then passed by =<< (the "bind" operator) to the (,) operator, hence the expected result.
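To make the bind trick concrete: for functions, (f =<< g) x evaluates to f (g x) x, so (,) =<< length behaves like the explicit lambda shown earlier. The sketch below checks that; withLength is a hypothetical name introduced only for this illustration.

```haskell
module Main where

-- For functions, (f =<< g) x is f (g x) x,
-- so this builds (length xs, xs) from xs.
-- `withLength` is a hypothetical name for this sketch.
withLength :: [a] -> (Int, [a])
withLength = (,) =<< length

main :: IO ()
main = do
  print (withLength [1, 2, 3 :: Int])                    -- (3,[1,2,3])
  -- Same result as the explicit lambda version.
  print (withLength "ab" == (\x -> (length x, x)) "ab")  -- True
```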
Partial conclusion
"There is more than one way to do it" (but it's not Perl!)
Haskell offers a lot of built-in functions and operators to handle this kind of basic manipulation.
What about doing it recursively? If two elements are the same, the first element of the resulting tuple is incremented; otherwise, the second element of the resulting tuple is appended by the mismatched element:
calcP :: [String] -> [String] -> (Int,[String])
calcP (x:xs) (y:ys)
  | x == y = increment (calcP xs ys)
  | otherwise = append y (calcP xs ys)
  where
    increment (count, results) = (count + 1, results)
    append y (count, results) = (count, y:results)
calcP [] x = (0, x)
calcP x [] = (0, [])
a = ["A1","A2","B3","C3"]
b = ["A1","B2","B3","D5"]
main = print $ calcP a b
The printed result is (2,["B2","D5"])
Note, that
calcP [] x = (0, x)
calcP x [] = (0, [])
are needed to make the pattern matching exhaustive. In other words, you need to handle the case where one of the arguments is an empty list. They also give the following behaviour:
If the first list is longer than the second by n elements, those last n elements are ignored.
If the second list is longer than the first by n elements, those last n elements are appended to the second element of the resulting tuple.
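The two length rules can be checked with a small runnable sketch. It reuses the recursive calcP from above; only the short test inputs in main are made up for illustration.

```haskell
module Main where

calcP :: [String] -> [String] -> (Int, [String])
calcP (x:xs) (y:ys)
  | x == y    = increment (calcP xs ys)
  | otherwise = append y (calcP xs ys)
  where
    increment (count, results) = (count + 1, results)
    append z (count, results)  = (count, z : results)
calcP [] ys = (0, ys)  -- leftover elements of the second list are kept
calcP _  [] = (0, [])  -- leftover elements of the first list are dropped

main :: IO ()
main = do
  print (calcP ["A1","A2","B3","C3"] ["A1","B2","B3","D5"])  -- (2,["B2","D5"])
  -- Second list longer: its extra elements end up in the result list.
  print (calcP ["A1"] ["A1","B2","C3"])                      -- (1,["B2","C3"])
  -- First list longer: its extra elements are ignored.
  print (calcP ["A1","A2","A3"] ["A1"])                      -- (1,[])
```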
I'd like to propose a very different method than the other folks: namely, compute a "summary statistic" for each pairing of elements between the two lists, and then combine the summaries into your desired result.
First some imports.
import Data.Monoid
import Data.Foldable
For us, the summary statistic is how many matches there are, together with the list of mismatches from the second argument:
type Statistic = (Sum Int, [String])
I've used Sum Int instead of Int to specify how statistics should be combined. (Other options here include Product Int, which would multiply together the values instead of adding them.) We can compute the summary of a single pairing quite simply:
summary :: String -> String -> Statistic
summary a b | a == b    = (1, [ ])
            | otherwise = (0, [b])
Combining the summaries for all the elements is just a fold:
calcP :: [String] -> [String] -> Statistic
calcP as bs = fold (zipWith summary as bs)
In ghci:
> calcP ["A1", "A2", "B3", "C3"] ["A1", "B2", "B3", "D5"]
(Sum {getSum = 2},["B2","D5"])
This general pattern (of processing elements one at a time into a Monoidal type) is frequently useful, and spotting where it's applicable can greatly simplify your code.
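If the plain (Int, [String]) shape from the question is wanted rather than the Sum wrapper, one possible way is to unwrap with getSum after the fold. The sketch below assumes that approach; calcPlain is a hypothetical name, and the rest mirrors the definitions above.

```haskell
module Main where

import Data.Foldable (fold)
import Data.Monoid (Sum (..))

type Statistic = (Sum Int, [String])

-- One matching pair counts as 1; a mismatch records the second element.
summary :: String -> String -> Statistic
summary a b
  | a == b    = (Sum 1, [])
  | otherwise = (Sum 0, [b])

-- Hypothetical wrapper: the same fold as above, with the Sum newtype removed.
calcPlain :: [String] -> [String] -> (Int, [String])
calcPlain as bs = (getSum n, mismatches)
  where
    (n, mismatches) = fold (zipWith summary as bs)

main :: IO ()
main = print (calcPlain ["A1","A2","B3","C3"] ["A1","B2","B3","D5"])
-- prints (2,["B2","D5"])
```

The pair's Monoid instance combines the two components independently: the counts are added and the mismatch lists are concatenated.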

How can I find the index where one list appears as a sublist of another?

I have been working with Haskell for a little over a week now so I am practicing some functions that might be useful for something. I want to compare two lists recursively. When the first list appears in the second list, I simply want to return the index at where the list starts to match. The index would begin at 0. Here is an example of what I want to execute for clarification:
subList [1,2,3] [4,4,1,2,3,5,6]
the result should be 2
I have attempted to code it:
subList :: [a] -> [a] -> a
subList [] = []
subList (x:xs) = x + 1 (subList xs)
subList xs = [ y:zs | (y,ys) <- select xs, zs <- subList ys]
where select [] = []
select (x:xs) = x
I am receiving an "error on input" and I cannot figure out why my syntax is not working. Any suggestions?
Let's first look at the function signature. You want to take in two lists whose contents can be compared for equality and return an index like so
subList :: Eq a => [a] -> [a] -> Int
So now we go through pattern matching on the arguments. First off, when the second list is empty then there is nothing we can do, so we'll return -1 as an error condition
subList _ [] = -1
Then we look at the recursive step
subList as xxs@(x:xs)
  | all (uncurry (==)) $ zip as xxs = 0
  | otherwise = 1 + subList as xs
You should be familiar with the guard syntax I've used, although you may not be familiar with the @ syntax (an as-pattern). It binds xxs to the whole list while (x:xs) still destructures it into head and tail.
You may not be familiar with all, uncurry, and possibly zip, so let me elaborate on those. zip has the signature zip :: [a] -> [b] -> [(a,b)], so it takes two lists and pairs up their elements (and if one list is longer than the other, it just chops off the excess). uncurry is a bit unusual, so let's just look at (uncurry (==)): its signature is (uncurry (==)) :: Eq a => (a, a) -> Bool, and it checks whether the first and second elements of a pair are equal. Finally, all walks over the list and returns True if the first and second elements of every pair are equal.
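One caveat with the zip-based check: because zip chops off the excess, a shorter remaining list can compare equal to a prefix of the first list, so subList [1,2,3] [4,1,2] evaluates to 1 even though there is no match. Adding 1 to the -1 sentinel on the way back up also hides a failed search. A possible alternative, sketched here under the assumption that returning Maybe Int is acceptable, uses isPrefixOf, tails and findIndex from Data.List. The name subListIndex is hypothetical.

```haskell
module Main where

import Data.List (findIndex, isPrefixOf, tails)

-- Index of the first suffix of `haystack` that starts with `needle`,
-- or Nothing if there is none. `subListIndex` is a hypothetical name.
subListIndex :: Eq a => [a] -> [a] -> Maybe Int
subListIndex needle haystack =
  findIndex (needle `isPrefixOf`) (tails haystack)

main :: IO ()
main = do
  print (subListIndex [1,2,3] [4,4,1,2,3,5,6 :: Int])  -- Just 2
  print (subListIndex [1,2,3] [4,4 :: Int])            -- Nothing
  -- A truncated match at the end of the list is not reported.
  print (subListIndex [1,2,3] [4,1,2 :: Int])          -- Nothing
```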

How can I write a function in Haskell that takes a list of Ints and returns all the contiguous sublists of that list?

The function needs to take an ordered list of integer elements and return all the combinations of adjacent elements in the original list. e.g [1,2,3] would return [[1,2,3],[1],[1,2],[2],[2,3],[3]].
Note that [1,3] should not be included, as 1 and 3 are not adjacent in the original list.
inits and tails aren't in Prelude (they are exported by Data.List), but with them you can define your function like this:
yourFunction :: [a] -> [[a]]
yourFunction = filter (not . null) . concat . map inits . tails
This is what it does, step by step:
tails gives all versions of a list with zero or more starting elements removed: tails [1,2,3] == [[1,2,3],[2,3],[3],[]]
map inits applies inits to every list given by tails, and does exactly the opposite: it gives all versions of a list with zero or more ending elements removed: inits [1,2,3] == [[],[1],[1,2],[1,2,3]]
I hope you already know concat: it applies (++) where you see (:) in a list: concat [[1,2],[3],[],[4]] == [1,2,3,4]. You need this, because after map inits . tails, you end up with a list of lists of lists, while you want a list of lists.
filter (not . null) removes the empty lists from the result. There will be more than one (unless you use the function on the empty list).
You could also use concatMap inits instead of concat . map inits, which does exactly the same thing. It usually also performs better.
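A self-contained sketch of the concatMap variant, assuming the Data.List versions of inits and tails. The name contiguous is only illustrative.

```haskell
module Main where

import Data.List (inits, tails)

-- All non-empty contiguous sublists: every prefix of every suffix.
-- `contiguous` is a hypothetical name for this sketch.
contiguous :: [a] -> [[a]]
contiguous = filter (not . null) . concatMap inits . tails

main :: IO ()
main = do
  print (contiguous [1, 2, 3 :: Int])
  -- [[1],[1,2],[1,2,3],[2],[2,3],[3]]
  print (length (contiguous "abcd"))  -- 10, i.e. n*(n+1)/2 for n = 4
```

The order differs from the example in the question, but the set of sublists is the same, and [1,3] is not among them.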
Edit: you can define this with Prelude-only functions as such:
yourFunction = concatMap inits . tails
  where inits = takeWhile (not . null) . iterate init
        tails = takeWhile (not . null) . iterate tail
So, if you need consecutive and non empty answers (as you've noticed in comment).
At first, let's define a simple sublist function.
sublist' [] = [[]]
sublist' (x:xs) = sublist' xs ++ map (x:) (sublist' xs)
It returns all sublists, including the empty list and non-consecutive ones, so we need to filter the elements of that list. Something like sublists = (filter consecutive) . filter (/= []) . sublist'
To check whether a list is consecutive, we take pairs of neighbours (compactByN 2) and check them.
compactByN :: Int -> [a] -> [[a]]
compactByN _ [] = [[]]
compactByN n list | length list == n = [list]
compactByN n list@(x:xs)= take n list : compactByN n xs
And finally
consecutive :: [Int] -> Bool
consecutive [_] = True
consecutive x = all (\[x,y] -> (x + 1 == y)) $ compactByN 2 x
And we have
λ> sublists [1,2,3]
[[3],[2],[2,3],[1],[1,2],[1,2,3]]
Done. http://hpaste.org/53965
Unless I'm mistaken, you're just asking for the power set of the numbers (called superset in the code below).
The code is fairly self explanatory - our superset is recursively built by building the superset of the tail twice, once with our current head in it, and once without, and then combining them together and with a list containing our head.
superset xs = []:(superset' xs) -- remember the empty list
superset' (x:xs) = [x]:(map (x:) (superset' xs)) ++ superset' xs
superset' [] = []
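Note that this construction enumerates every subset, not only the contiguous ones, so for [1,2,3] it also yields [1,3], which the question explicitly excludes. The standard library function subsequences from Data.List computes the same collection (in a different order), which makes this easy to check:

```haskell
module Main where

import Data.List (subsequences)

main :: IO ()
main = do
  -- Every subset of [1,2,3], contiguous or not.
  print (subsequences [1, 2, 3 :: Int])
  -- [[],[1],[2],[1,2],[3],[1,3],[2,3],[1,2,3]]
  -- The non-adjacent combination the question rules out is present.
  print ([1, 3] `elem` subsequences [1, 2, 3 :: Int])  -- True
```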