Removing inverted duplicates from list of tuples

So basically I have a list of tuples [(a,b)] on which I have to do some filtering. One job is to remove inverted duplicates, such that if (a,b) and (b,a) both exist in the list, I keep only one of them. But list comprehensions have not been very helpful. How can I go about this efficiently?
Thanks

Perhaps an efficient way to do this (O(n log(n))) would be to track the tuples already added, using Data.Set:
import qualified Data.Set as Set
removeDups' :: Ord a => [(a, a)] -> Set.Set (a, a) -> [(a, a)]
removeDups' [] _ = []
removeDups' ((a, b):tl) s | (a, b) `Set.member` s = removeDups' tl s
removeDups' ((a, b):tl) s | (b, a) `Set.member` s = removeDups' tl s
removeDups' ((a, b):tl) s = ((a, b):rest) where
    s' = Set.insert (a, b) s
    rest = removeDups' tl s'
removeDups :: Ord a => [(a, a)] -> [(a, a)]
removeDups l = removeDups' l (Set.fromList [])
The function removeDups calls the auxiliary function removeDups' with the list and an empty set. For each pair, if it or its inverse is already in the set, it is skipped; otherwise it is added to the set and kept, and the tail is processed.
The complexity is O(n log(n)): the set never holds more than n pairs, so each membership test and insertion costs O(log(n)).
Example
...
main = do
    putStrLn $ show $ removeDups [(1, 2), (1, 3), (2, 1)]
and
$ ghc ord.hs && ./ord
[1 of 1] Compiling Main ( ord.hs, ord.o )
Linking ord ...
[(1,2),(1,3)]

You can filter them using your own function:
checkEqTuple :: Eq a => (a, a) -> (a, a) -> Bool
checkEqTuple (x, y) (x', y')
    | x == y' && y == x' = True
    | x == x' && y == y' = True
    | otherwise = False
Then use nubBy from Data.List:
Prelude Data.List> nubBy checkEqTuple [(1,2), (2,1)]
[(1,2)]

I feel like I'm repeating myself a bit, but that's okay. None of this code has been tested or even compiled, so there may be bugs. Suppose we can impose an Ord constraint for efficiency. I'll start with a limited implementation of sets of pairs.
import qualified Data.Set as S
import qualified Data.Map.Strict as M
newtype PairSet a b =
  PS (M.Map a (S.Set b))
empty :: PairSet a b
empty = PS M.empty
insert :: (Ord a, Ord b)
       => (a, b) -> PairSet a b -> PairSet a b
insert (a, b) (PS m) = PS $ M.insertWith S.union a (S.singleton b) m
member :: (Ord a, Ord b)
       => (a, b) -> PairSet a b -> Bool
member (a, b) (PS m) =
  case M.lookup a m of
    Nothing -> False
    Just s -> S.member b s
Now we just need to keep track of which pairs we've seen.
order :: Ord a => (a, a) -> (a, a)
order p@(a, b)
  | a <= b = p
  | otherwise = (b, a)
nubSwaps :: Ord a => [(a,a)] -> [(a,a)]
nubSwaps xs = foldr go (`seq` []) xs empty where
  go p r s
    | member op s = r s
    | otherwise = p : r (insert op s)
    where op = order p

If a and b are ordered and comparable (and every pair occurs in both orders), you could just do this:
[(a,b) | (a,b) <- yourList, a<=b]
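The filter keeps exactly the pairs with a <= b, so a pair such as (2,1) is dropped even when (1,2) never occurs, and plain repeats like two copies of (1,2) both survive. A minimal sketch of that behaviour, next to one possible alternative that normalizes each pair before deduplicating. The names keepOrdered, normalize and dedupeNormalized are made up for this illustration, and the alternative assumes that neither the order of the result nor the orientation of each pair matters:

```haskell
import qualified Data.Set as Set

-- The filter from the answer above, given a name for comparison.
keepOrdered :: Ord a => [(a, a)] -> [(a, a)]
keepOrdered xs = [(a, b) | (a, b) <- xs, a <= b]

-- Hypothetical helper: put the smaller component first.
normalize :: Ord a => (a, a) -> (a, a)
normalize (a, b)
  | a <= b = (a, b)
  | otherwise = (b, a)

-- Assumed alternative: normalize every pair, then deduplicate through a Set.
-- The result comes back sorted, and pairs may be reoriented.
dedupeNormalized :: Ord a => [(a, a)] -> [(a, a)]
dedupeNormalized = Set.toList . Set.fromList . map normalize
```

For the input [(2,1),(3,4)], keepOrdered returns [(3,4)], while dedupeNormalized returns [(1,2),(3,4)].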

Related

Reverse list of tuples of nodes and edges (Haskell)

I have a list of nodes and edges, represented as tuples where the first element is a node, and the second element is a list of all nodes it has an edge to. I am trying to reverse the list like so:
ghci> snuN [("a",["b"]),("b",["c"]),("c",["a","d"]),("e",["d"])]
[("a",["c"]),("b",["a"]),("c",["b"]),("d",["c","e"]),("e",[])]
So far, I've written this code:
snuH :: Eq t => [(t,[t])] -> [(t,[t])]
snuH [] = []
snuH ps@((x, xs):rest) =
  if (length xs <= 1) && not (x `isInSublist` ps)
    then [(y,[x])| y <- xs] ++ snuH rest ++ [(x, [])]
    else [(y,[x])| y <- xs] ++ snuH rest
isInSublist :: Eq t => t -> [(t,[t])] -> Bool
isInSublist _ [] = False
isInSublist x ((y, ys):rest) = (x `elem` ys) || isInSublist x rest
combine :: Eq t => [(t,[t])] -> [(t,[t])]
combine ps@((x, xs):(y, ys):rest) = if x == y then (x, xs++ys):rest else (x, xs):combine((y, ys):rest)
snuN :: Eq t => [(t, [t])] -> [(t, [t])]
snuN ls = combine $ snuH ls
The first function gives me this output:
ghci> snuH [("a",["b"]),("b",["c"]),("c",["a","d"]),("e",["d"])]
[("b",["a"]),("c",["b"]),("a",["c"]),("d",["c"]),("d",["e"]),("e",[]),("b",[])]
Which is not quite the result I wanted, because it creates two tuples with the same first element (("d",["c"]),("d",["e"])), and it has the extra ("b",[]) as an element when it shouldn't. I wrote the combine helper-function to fix the problem, which gives me this output:
ghci> snuN [("a",["b"]),("b",["c"]),("c",["a","d"]),("e",["d"])]
[("b",["a"]),("c",["b"]),("a",["c"]),("d",["c","e"]),("e",[]),("b",[])]
This fixes the problem with the two tuples that share a first element, but I still get the extra ("b",[]), which I can't figure out how to fix. I assume there's something wrong with my snuH, but I can't see where the problem is.
Can you tell me what I'm doing wrong here? I don't understand why I get the extra ("b",[]). All help is appreciated!

I'd argue that the following list comprehension gives you what you need:
type Graph node = [(node, [node])]
converse :: Eq node => Graph node -> Graph node
converse g = [(v, [e | (e, es) <- g, v `elem` es]) | (v, _) <- g]
However, if you try it out, you'll get:
> converse [("a",["b"]),("b",["c"]),("c",["a","d"]),("e",["d"])]
[("a",["c"]),("b",["a"]),("c",["b"]),("e",[])]
Compared to the example you gave, the entry for "d" is missing from the output. That's because the input did not mention an explicit entry ("d", []).
To compensate for this, we could put a bit more logic in retrieving the complete list of nodes from the graph, also accounting for the "implied" ones:
nodes :: Eq node => Graph node -> [node]
nodes g = nub $ concat [v : es | (v, es) <- g]
Note: this requires importing nub from Data.List.
Then, we can write:
converse' :: Eq node => Graph node -> Graph node
converse' g = [(v, [e | (e, es) <- g, v `elem` es]) | v <- nodes g]
And, indeed, we yield:
> converse' [("a",["b"]),("b",["c"]),("c",["a","d"]),("e",["d"])]
[("a",["c"]),("b",["a"]),("c",["b"]),("d",["c","e"]),("e",[])]

You have [(a, [a])], which maps nodes to the nodes they have an edge to. One approach to "reversing" this is to first convert it to a list of all the edges. We can actually generalize the type a bit here, to distinguish from and to nodes.
allEdges :: [(a, [b])] -> [(a, b)]
allEdges g = [(a, b) | (a, bs) <- g, b <- bs]
Now it's just a matter of gathering up the nodes with an edge to each particular node:
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as M
gather :: Ord b => [(a,b)] -> Map b [a]
gather edges = M.fromListWith (++) [(b, [a]) | (a, b) <- edges]
Now we can just use M.assocs to convert that map to a list!
The above code will leave out nodes that have no edges going to them. We can patch that up with a bit of extra work.
reverseGraph :: Ord a => [(a, [a])] -> [(a, [a])]
reverseGraph = M.assocs . M.fromListWith (++) . gunk
  where
    gunk g = [q | (a, bs) <- g, q <- (a, []) : [(b, [a]) | b <- bs]]
The idea here is that when we see (a, bs), we insert the empty edge set (a, []) along with the nonempty ones (b, [a]) for each b in bs.
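One detail worth checking is the order inside each predecessor list. M.fromListWith f combines values as f newValue oldValue, so with plain (++) an edge seen later ends up in front of the ones seen earlier. A small variant, sketched here under the assumption that you want predecessors in input order (the name reverseGraphInOrder is made up for this example), flips the combining function:

```haskell
import qualified Data.Map.Strict as M

-- Variant of reverseGraph that keeps predecessors in input order.
-- M.fromListWith f calls f newValue oldValue, so plain (++) would put
-- later edges first; flip (++) appends them instead.
reverseGraphInOrder :: Ord a => [(a, [a])] -> [(a, [a])]
reverseGraphInOrder g = M.assocs (M.fromListWith (flip (++)) entries)
  where
    -- Every source node gets an empty entry, so nodes without incoming
    -- edges still show up in the result.
    entries = [q | (a, bs) <- g, q <- (a, []) : [(b, [a]) | b <- bs]]
```

On the graph from the question this gives [("a",["c"]),("b",["a"]),("c",["b"]),("d",["c","e"]),("e",[])], with the nodes sorted because they come out of a Map.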

How to extract the maximum element from a List in haskell?

I am new to Haskell and I want to extract the maximum element from a given List so that I end up with the maximum element x and the remaining list xs (not containing x). It can be assumed that the elements of the list are unique.
The type of function I want to implement is somewhat like this:
maxElement :: (Ord b) => (a -> b) -> [a] -> (a, [a])
Notably, the first argument is a function that turns an element into a comparable form. Also, this function is non-total as it would fail given an empty List.
My current approach fails to keep the elements in the remainder list in place, meaning given [5, 2, 4, 6] it returns (6, [2, 4, 5]) instead of (6, [5, 2, 4]). Furthermore, it feels like there should be a nicer looking solution.
compareElement :: (Ord b) => (a -> b) -> a -> (b, (a, [a])) -> (b, (a, [a]))
compareElement p x (s, (t, ts))
  | s' > s = (s', (x, t:ts))
  | otherwise = (s, (t, x:ts))
  where s' = p x
maxElement :: (Ord b) => (a -> b) -> [a] -> (a, [a])
maxElement p (t:ts) = snd . foldr (compareElement p) (p t, (t, [])) $ ts
UPDATE
Thanks to the answer by @Ismor and the comment by @chi, I've updated my implementation and I feel happy with the result.
maxElement :: (Ord b) => (a -> b) -> [a] -> Maybe (b, a, [a], [a])
maxElement p =
  let
    f x Nothing = Just (p x, x, [], [x])
    f x (Just (s, m, xs, ys))
      | s' > s = Just (s', x, ys, x:ys)
      | otherwise = Just (s, m, x:xs, x:ys)
      where s' = p x
  in
    foldr f Nothing
The result is either Nothing when the given list is empty or Just (_, x, xs, _). I could write another "wrapper" function with the originally intended type and call maxElement under the hood, but I believe this is also OK.

This answer is more personal advice than a proper answer. As a rule of thumb, whenever you find yourself trying to write a loop with an accumulator (as in this case), try to write it in this form:
foldr updateAccumulator initialAccumulator -- use foldl' if it is better for your use case
Then follow the types to complete it, as shown below.
Step 1
Write undefined where needed. You know the function should look like this
maxElement :: (Ord b) => (a -> b) -> [a] -> (a, [a])
maxElement f xs = foldr updateAccumulator initialAccumulator xs
  where
    updateAccumulator = undefined
    initialAccumulator = undefined
Step 2
"Chase the type": using the types of maxElement and foldr, you can deduce the types of updateAccumulator and initialAccumulator. Try to reduce polymorphism as much as you can. In this case:
You know foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
You know your Foldable is [], so it's easier to substitute it
Hence foldr :: (a -> b -> b) -> b -> [a] -> b
Because you want foldr to produce (a, [a]), you know b ~ (a, [a])
And so on: keep going until you know what types your functions have. You can use GHC's typed holes in this process, which is a very nice feature.
maxElement :: (Ord b) => (a -> b) -> [a] -> (a, [a])
maxElement f xs = foldr updateAccumulator initialAccumulator xs
  where
    -- Notice that you need to enable the ScopedTypeVariables extension (and add an explicit forall) to write these type signatures in the where clause
    -- updateAccumulator :: a -> (a, [a]) -> (a, [a])
    updateAccumulator newElement (currentMax, currentList) = undefined
    -- initialAccumulator :: (a, [a])
    initialAccumulator = undefined
Step 3
Now, writing down the function should be easier. Below I leave some incomplete parts for you to fill
maxElement :: (Ord b) => (a -> b) -> [a] -> (a, [a])
maxElement f xs = foldr updateAccumulator initialAccumulator xs
  where
    -- updateAccumulator :: a -> (a, [a]) -> (a, [a])
    updateAccumulator newElement (currentMax, currentList) =
      if f newElement > f currentMax
        then undefined -- What should the accumulator look like when the new element is bigger than the previous maximum?
        else undefined
    -- initialAccumulator :: (a, [a])
    initialAccumulator = undefined -- Tricky! What happens if xs is empty?
I hope this clarifies some doubts. Note that I deliberately haven't given you a complete answer.

I don't know if you were trying to avoid using certain library functions, but Data.List has a maximumBy and deleteBy that do exactly what you want:
import Data.Function (on)
import Data.List (deleteBy, maximumBy)
import Data.Ord (comparing)
maxElement :: (Ord b) => (a -> b) -> [a] -> (a, [a])
maxElement f xs = (max, remaining) where
  max = maximumBy (comparing f) xs
  remaining = deleteBy ((==) `on` f) max xs

Construct the list of all the "zippers" over the input list, then take the maximumBy (comparing (\(_,x,_) -> foo x)) of it, where foo is your Ord b => a -> b function, then reverse-append the first half to the second and put it in a tuple together with the middle element.
A zipper over a list xs is a triple (revpx, x, suffx) where xs == reverse revpx ++ [x] ++ suffx:
> :t comparing (\(_,x,_) -> x)
comparing (\(_,x,_) -> x)
:: Ord a => (t, a, t1) -> (t, a, t1) -> Ordering
Constructing the zippers list is an elementary exercise (see the function picks3 there).
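A minimal sketch of the zipper approach described above, in case it helps to see the pieces together. The names zippers and maxElementZ are invented for this illustration (this is not the picks3 function the answer refers to), and the result is wrapped in Maybe to cover the empty list:

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- All zippers over a list: (reversed prefix, focused element, suffix).
zippers :: [a] -> [([a], a, [a])]
zippers = go []
  where
    go _ [] = []
    go revPrefix (x : suffix) = (revPrefix, x, suffix) : go (x : revPrefix) suffix

-- Pick the zipper whose focus is maximal under f, then glue the two
-- halves back together in their original order.
maxElementZ :: Ord b => (a -> b) -> [a] -> Maybe (a, [a])
maxElementZ _ [] = Nothing
maxElementZ f xs = Just (x, reverse revPrefix ++ suffix)
  where
    (revPrefix, x, suffix) = maximumBy (comparing (\(_, y, _) -> f y)) (zippers xs)
```

With the list from the question, maxElementZ id [5, 2, 4, 6] evaluates to Just (6,[5,2,4]), so the remaining elements stay in place.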
About your edited solution, it can be coded as a foldr over the tails so it's a bit clearer what's going on there:
import Data.List (tails)
maxElement :: (Ord b) => (a -> b) -> [a] -> Maybe (b, a, [a])
maxElement p [] = Nothing
maxElement p xs = Just $ foldr f undefined (tails xs)
  where
    f [x] _ = (p x, x, [])
    f (x:xs) (b, m, ys)
      | b' > b = (b', x, xs) -- switch over
      | otherwise = (b, m, x:ys)
      where b' = p x
It's also a bit cleaner as it doesn't return the input list's copy for no apparent reason, as your version did since it used it for internal purposes.
Both ways are in fact emulating a paramorphism.

Checking for all Elements in a Set in Haskell using syntactic sugar

I am trying to remove the Integer duplicates from a list of (String, Int), where I am guaranteed that there are no String duplicates.
Is it possible to evaluate something like this in Haskell:
I tried:
[(a,b) | (a,b) <- bs, (c,k) <- bs, ((k == b) <= (a == c))]
but this does not yet work.
Edit: I am well aware that you can achieve this using more complex syntax, for example by recursively searching the list for each element's duplicates...

(NB: this is a completely new version of this answer. Previous was totally off-base.)
To follow your mathematical set comprehension more closely, we can tweak the definition in your answer as
uniquesOnly :: (Eq a, Eq b) => [(a, b)] -> [(a, b)]
uniquesOnly bs =
  [(a,b) | (a,b) <- bs,
           [(c,d) | (c,d) <- bs, d == b] ==
           [(a,d) | (c,d) <- bs, d == b]]
This reads as: "for all (c,d) in bs such that d==b it follows c==a".
uniquesOnly [(1,1),(2,2),(3,1)] returns [(2,2)].

This is a possible solution. I have made up this equivalent statement:
removeDuplicates :: [(String, Int)] -> [(String, Int)]
removeDuplicates bs =
  [(a,b) | (a,b) <- bs,
           length [(c,d) | (c,d) <- bs, d == b] == 1]
But this is not the same statement, only an equivalent one.

The existing answers don't take advantage of the guarantee that the strings are unique or the fact that Int is ordered. Here's one that does.
import Data.List (sortBy, groupBy)
import Data.Function (on)
uniquesOnly :: Ord b => [(a, b)] -> [(a, b)]
uniquesOnly ps
  = [ p
    | [p] <- groupBy ((==) `on` snd) .
             sortBy (compare `on` snd) $ ps ]
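The sort-and-group version returns its results ordered by the Int component. If the original order of the list should be preserved, one possible variation (a sketch only; uniquesInOrder is a name made up here) counts each second component in a Map first and then filters the input list:

```haskell
import qualified Data.Map.Strict as M

-- Keep only the pairs whose second component occurs exactly once,
-- in the order they appear in the input.
uniquesInOrder :: Ord b => [(a, b)] -> [(a, b)]
uniquesInOrder ps = [p | p@(_, b) <- ps, M.lookup b counts == Just 1]
  where
    -- How often each second component occurs in the whole list.
    counts = M.fromListWith (+) [(b, 1 :: Int) | (_, b) <- ps]
```

For example, uniquesInOrder [("w",2),("x",3),("y",1),("z",3)] gives [("w",2),("y",1)]. Building the Map and doing the lookups is O(n log(n)) overall, the same as sorting.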

haskell: how to get list of numbers which are higher than their neighbours in a starting list

I am trying to learn Haskell and I want to solve one task. I have a list of Integers and I need to add them to another list if they are bigger than both of their neighbors. For example:
I have a starting list of [0,1,5,2,3,7,8,4] and I need to print out a list which is [5, 8]
This is the code I came up but it returns an empty list:
largest :: [Integer]->[Integer]
largest n
  | head n > head (tail n) = head n : largest (tail n)
  | otherwise = largest (tail n)

I would solve this as outlined by Thomas M. DuBuisson. Since we want the ends of the list to "count", we'll add negative infinities to each end before creating triples. The monoid-extras package provides a suitable type for this.
import Data.Monoid.Inf
pad :: [a] -> [NegInf a]
pad xs = [negInfty] ++ map negFinite xs ++ [negInfty]
triples :: [a] -> [(a, a, a)]
triples (x:rest@(y:z:_)) = (x,y,z) : triples rest
triples _ = []
isBig :: Ord a => (a,a,a) -> Bool
isBig (x,y,z) = y > x && y > z
scnd :: (a, b, c) -> b
scnd (a, b, c) = b
finites :: [Inf p a] -> [a]
finites xs = [x | Finite x <- xs]
largest :: Ord a => [a] -> [a]
largest = id
    . finites
    . map scnd
    . filter isBig
    . triples
    . pad
It seems to be working appropriately; in ghci:
> largest [0,1,5,2,3,7,8,4]
[5,8]
> largest [10,1,10]
[10,10]
> largest [3]
[3]
> largest []
[]
You might also consider merging finites, map scnd, and filter isBig in a single list comprehension (then eliminating the definitions of finites, scnd, and isBig):
largest :: Ord a => [a] -> [a]
largest xs = [x | (a, b@(Finite x), c) <- triples (pad xs), a < b, c < b]
But I like the decomposed version better; the finites, scnd, and isBig functions may turn out to be useful elsewhere in your development, especially if you plan to build a few variants of this for different needs.

One thing you might try is lookahead. (Thomas M. DuBuisson suggested a different one that will also work if you handle the final one or two elements correctly.) Since it sounds like this is a problem you want to solve on your own as a learning exercise, I’ll write a skeleton that you can take as a starting-point if you want:
largest :: [Integer] -> [Integer]
largest [] = _
largest [x] = _ -- What should this return?
largest [x1,x2] | x1 > x2 = _
                | x1 < x2 = _
                | otherwise = _
largest [x1,x2,x3] | x2 > x1 && x2 > x3 = _
                   | x3 > x2 = _
                   | otherwise = _
largest (x1:x2:x3:xs) | x2 > x1 && x2 > x3 = _
                      | otherwise = _
We need the special case of [x1,x2,x3] in addition to (x1:x2:x3:xs) because, according to the clarification in your comment, largest [3,3,2] should return [], but largest [3,2] should return [3]. Therefore, the final three elements require special handling and cannot simply recurse on the final two.
If you also want the result to include the head of the list if it is greater than the second element, you’d make this a helper function and your largest would be something like largest (x1:x2:xs) = (if x1>x2 then [x1] else []) ++ largest' (x1:x2:xs). That is, you want some special handling for the first elements of the original list, which you don’t want to apply to all the sublists when you recurse.

As suggested in the comments, one approach would be to first group the list into tuples of length 3 using Prelude's zip3 and tail:
*Main> let xs = [0,1,5,2,3,7,8,4]
*Main> zip3 xs (tail xs) (tail (tail xs))
[(0,1,5),(1,5,2),(5,2,3),(2,3,7),(3,7,8),(7,8,4)]
Here zip3 has type [a] -> [b] -> [c] -> [(a, b, c)] and tail has type [a] -> [a].
Next you need to keep only the tuples where the middle element is bigger than the first and last element. One way would be to use Prelude's filter function:
*Main> let xs = [(0,1,5),(1,5,2),(5,2,3),(2,3,7),(3,7,8),(7,8,4)]
*Main> filter (\(a, b, c) -> b > a && b > c) xs
[(1,5,2),(7,8,4)]
filter has type (a -> Bool) -> [a] -> [a]. It keeps the elements of a list for which the predicate returns True.
Now for the final part, you need to extract the middle element from the filtered tuples above. You can do this easily with Prelude's map function:
*Main> let xs = [(1,5,2),(7,8,4)]
*Main> map (\(_, x, _) -> x) xs
[5,8]
map has type (a -> b) -> [a] -> [b]. It applies a function to every element of a list, turning a list of a into a list of b.
The above code stitched together would look like this:
largest :: (Ord a) => [a] -> [a]
largest xs = map (\(_, x, _) -> x) $ filter (\(a, b, c) -> b > a && b > c) $ zip3 xs (tail xs) (tail (tail xs))
Note that I used the Ord typeclass here, since the code compares elements with >. It's fine to keep the type as Integer, though.
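The same pipeline can also be written as a single list comprehension. The sketch below is only an illustration (localMaxima is a made-up name); it swaps tail for drop 1 and drop 2, which are defined on lists of any length, so it does not depend on how lazily zip3 treats its arguments. Like the version above, it does not count the first and last elements:

```haskell
-- Elements that are strictly greater than both of their neighbours.
-- The first and last elements have only one neighbour and are never reported.
localMaxima :: Ord a => [a] -> [a]
localMaxima xs =
  [ b
  | (a, b, c) <- zip3 xs (drop 1 xs) (drop 2 xs)
  , b > a
  , b > c
  ]
```

In ghci, localMaxima [0,1,5,2,3,7,8,4] evaluates to [5,8], and lists with fewer than three elements give [].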

How can I separate tuples in Haskell?

How can I merge a list of tuples without repeating any items in those tuples?
For example:
from the list [("a","b"),("c","d"),("a","b")], it should return ["a","b","c","d"]
I get this error message with my code:
No instance for (Eq a0) arising from a use of `nub'
The type variable `a0' is ambiguous
Possible cause: the monomorphism restriction applied to the following:
merge :: [(a0, a0)] -> [a0] (bound at P.hs:9:1)
Probable fix: give these definition(s) an explicit type signature
or use -XNoMonomorphismRestriction
Note: there are several potential instances:
instance Eq a => Eq (GHC.Real.Ratio a) -- Defined in `GHC.Real'
instance Eq () -- Defined in `GHC.Classes'
instance (Eq a, Eq b) => Eq (a, b) -- Defined in `GHC.Classes'
...plus 22 others
In the first argument of `(.)', namely `nub'
In the expression: nub . mergeTuples
In an equation for `merge':
merge
= nub . mergeTuples
where
mergeTuples = foldr (\ (a, b) r -> a : b : r) []
Failed, modules loaded: none.

Let's separate this out. First, merge the tuples:
mergeTuples :: [(a, a)] -> [a]
mergeTuples = concatMap (\(a, b) -> [a, b]) -- Thanks Chuck
-- mergeTuples = foldr (\(a, b) r -> a : b : r) []
and then we can use nub (from Data.List) to remove the duplicates:
merge :: Eq a => [(a, a)] -> [a]
merge = nub . mergeTuples
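If you want this to all be together (keep the type signature, otherwise the monomorphism restriction triggers the error from the question):
merge :: Eq a => [(a, a)] -> [a]
merge = nub . mergeTuples
  where mergeTuples = concatMap (\(a, b) -> [a, b])
Or if you want to smash it really together (don't do this):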
merge [] = []
merge ((a, b) : r) = a : b : filter (\x -> x /= a && x /= b) (merge r)
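nub only needs Eq, but it compares every element against everything kept so far, which is quadratic. If the elements also have an Ord instance, a Set of already-seen values is one way to get O(n log(n)) while keeping first occurrences in order. This is a sketch under that extra assumption, and mergeOrd is a made-up name:

```haskell
import qualified Data.Set as Set

-- Flatten the pairs, then keep the first occurrence of each element.
mergeOrd :: Ord a => [(a, a)] -> [a]
mergeOrd = go Set.empty . concatMap (\(a, b) -> [a, b])
  where
    go _ [] = []
    go seen (x : xs)
      | x `Set.member` seen = go seen xs               -- already emitted
      | otherwise = x : go (Set.insert x seen) xs      -- first occurrence
```

For the list from the question, mergeOrd [("a","b"),("c","d"),("a","b")] returns ["a","b","c","d"].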
