groupBy with multiple test functions - list

Is there a better and more concise way to write the following code in Haskell? I've tried using if..else but that is getting less readable than the following. I want to avoid traversing the xs list (which is huge!) 8 times to just separate the elements into 8 groups. groupBy from Data.List takes only one test condition function: (a -> a -> Bool) -> [a] -> [[a]].
x1 = filter (check condition1) xs
x2 = filter (check condition2) xs
x3 = filter (check condition3) xs
x4 = filter (check condition4) xs
x5 = filter (check condition5) xs
x6 = filter (check condition6) xs
x7 = filter (check condition7) xs
x8 = filter (check condition8) xs
results = [x1,x2,x3,x4,x5,x6,x7,x8]

This only traverses the list once:
import Data.Functor
import Control.Monad
import Data.List (transpose)
import Data.Maybe (catMaybes)
filterN :: [a -> Bool] -> [a] -> [[a]]
filterN ps =
  map catMaybes . transpose .
  map (\x -> map (\p -> x <$ guard (p x)) ps)
For each element of the list, the map produces a list of Maybes, each Maybe corresponding to one of the predicates; it is Nothing if the element does not satisfy the predicate, or Just x if it does satisfy the predicate. Then, the transpose shuffles all these lists so that the list is organised by predicate, rather than by element, and the map catMaybes discards the entries for elements that did not satisfy a predicate.
Some explanation: x <$ m is fmap (const x) m, and for Maybe, guard b is if b then Just () else Nothing, so x <$ guard b is if b then Just x else Nothing.
The map could also be written as map (\x -> [x <$ guard (p x) | p <- ps]).
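To make the intermediate structure concrete, here is a small sketch that splits the pipeline into its two stages. The helper name marks and the sample predicates are illustrative assumptions, not part of the answer above.

```haskell
import Control.Monad (guard)
import Data.List (transpose)
import Data.Maybe (catMaybes)

-- Hypothetical helper: one row per element, one column per predicate.
marks :: [a -> Bool] -> [a] -> [[Maybe a]]
marks ps = map (\x -> [x <$ guard (p x) | p <- ps])

-- Same pipeline as filterN above, built on top of marks.
filterN :: [a -> Bool] -> [a] -> [[a]]
filterN ps = map catMaybes . transpose . marks ps

-- marks   [even, (> 2)] [1, 2, 3, 4]
--   == [[Nothing,Nothing],[Just 2,Nothing],[Nothing,Just 3],[Just 4,Just 4]]
-- filterN [even, (> 2)] [1, 2, 3, 4]
--   == [[2,4],[3,4]]
```

One thing to be aware of: because transpose [] is [], this pipeline returns [] rather than one empty list per predicate when the input list is empty.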

If you insist on traversing the list only once, you can write
filterMulti :: [a -> Bool] -> [a] -> [[a]]
filterMulti fs xs = go (reverse xs) (map (const []) fs) where
  -- start with one empty bucket per predicate, so an empty xs still yields one list per predicate
  go [] acc = acc
  go (y:ys) acc = go ys $ zipWith (\f a -> if f y then y:a else a) fs acc
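The same bucket-per-predicate idea can also be sketched as a right fold, which keeps the input order without calling reverse first. The name filterMultiR is an assumption made for this illustration.

```haskell
-- Hypothetical foldr-based variant: one accumulator list per predicate.
filterMultiR :: [a -> Bool] -> [a] -> [[a]]
filterMultiR ps = foldr step (map (const []) ps)
  where
    -- Prepend x to every bucket whose predicate accepts it.
    step x = zipWith (\p acc -> if p x then x : acc else acc) ps

-- filterMultiR [(< 2), (>= 5)] [2, 5, 1, -2, 5, 1, 7, 3, -20, 76, 8]
--   == [[1,-2,1,-20],[5,5,7,76,8]]
```

Since foldr conses elements from the right, each bucket ends up in the original order.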

map (\ cond -> filter (check cond) xs) [condition1, condition2, ..., condition8]

I think you could use groupWith from GHC.Exts.
If you write the a -> b function to assign every element in xs its 'class', I believe groupWith would split xs just the way you want it to, traversing the list just once.
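As a rough illustration of this suggestion, the sketch below groups numbers by a hand-written class key. The classify function and the sample data are assumptions made for the example.

```haskell
import GHC.Exts (groupWith)

-- Hypothetical classifier: assigns each number a class key.
classify :: Int -> Int
classify n
  | n < 0     = 0
  | n == 0    = 1
  | otherwise = 2

grouped :: [[Int]]
grouped = groupWith classify [3, -1, 0, 5, -7, 0]
-- == [[-1,-7],[0,0],[3,5]]
```

Two caveats worth knowing: groupWith sorts the list by key before grouping, so it is not literally a single pass, and a class with no members simply does not appear in the result.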

groupBy doesn't really do what you're wanting; even if it did accept multiple predicate functions, it doesn't do any filtering on the list. It just groups together contiguous runs of list elements that satisfy some condition. Even if your filter conditions, when combined, cover all of the elements in the supplied list, this is still a different operation. For instance, groupBy won't modify the order of the list elements, nor will it have the possibility of including a given element more than once in the result, while your operation can do both of those things.
This function will do what you're looking for:
import Control.Applicative
filterMulti :: [a -> Bool] -> [a] -> [[a]]
filterMulti ps as = filter <$> ps <*> pure as
As an example:
> filterMulti [(<2), (>=5)] [2, 5, 1, -2, 5, 1, 7, 3, -20, 76, 8]
[[1, -2, 1, -20], [5, 5, 7, 76, 8]]

As an addendum to nietaki's answer (this should be a comment but it's too long, so if his answer is correct, accept his!), the function a -> b could be written as a series of nested if ... then .. else, but that is not very idiomatic Haskell and not very extensible. This might be slightly better:
import Data.List (elemIndex)
import GHC.Exts (groupWith)
f xs = groupWith test xs
  where test x = elemIndex True . map ($ x) $ [condition1, ..., condition8]
It categorises each element by the first condition_ it satisfies (and puts those that don't satisfy any into their own category).
(The documentation for elemIndex is here.)

The first function returns the "updated" lists for a single value; the second goes through the whole input list and updates the lists for each value.
myfilter :: a -> [a -> Bool] -> [[a]] -> [[a]]
myfilter x (f:fs) (l:ls)
  | f x = (x:l) : myfilter x fs ls
  | otherwise = l : myfilter x fs ls
myfilter _ _ _ = []
filterall :: [a] -> [a -> Bool] -> [[a]] -> [[a]]
filterall [] _ ls = ls
filterall (x:xs) fl ls = filterall xs fl (myfilter x fl ls)
This should be called with filterall xs [condition1,condition2...] [[],[]...], passing one empty list per condition. Each resulting list holds its elements in reverse input order, because new elements are prepended.

Related

Sum corresponding elements of two lists, with the extra elements of the longer list added at the end

I'm trying to add two lists together and keep the extra elements that are unused and add those into the new list e.g.
addLists [1,2,3] [1,3,5,7,9] = [2,5,8,7,9]
I have this so far:
addLists :: Num a => [a] -> [a] -> [a]
addLists xs ys = zipWith (+) xs ys
but I'm unsure how to get the extra elements into the new list.
The next step is changing this to a higher-order function that takes the combining function as an argument:
longZip :: (a -> a -> a) -> [a] -> [a] -> [a]
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c] is implemented as:
zipWith :: (a->b->c) -> [a]->[b]->[c]
zipWith f = go
  where
    go [] _ = []
    go _ [] = []
    go (x:xs) (y:ys) = f x y : go xs ys
It thus uses explicit recursion where go will check if the two lists are non-empty and in that case yield f x y, otherwise it stops and returns an empty list [].
You can implement a variant of zipWith which will continue, even if one of the lists is empty. This will look like:
zipLongest :: (a -> a -> a) -> [a] -> [a] -> [a]
zipLongest f = go
  where go [] ys = …
        go xs [] = …
        go (x:xs) (y:ys) = f x y : go xs ys
where you still need to fill in ….
You can do it with higher order functions as simple as
import Data.List (transpose)
addLists :: Num a => [a] -> [a] -> [a]
addLists xs ys = map sum . transpose $ [xs, ys]
because the length of transpose[xs, ys, ...] is the length of the longest list in its argument list, and sum :: (Foldable t, Num a) => t a -> a is already defined to sum the elements of a list (since lists are Foldable).
transpose is used here as a kind of a zip (but cutting on the longest instead of the shortest list), with [] being a default element for the lists addition ++, like 0 is a default element for the numbers addition +:
cutLongest [xs, ys] $
zipWith (++) (map pure xs ++ repeat []) (map pure ys ++ repeat [])
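To see why transpose behaves like a zip that stops at the longest list, it may help to look at the columns it produces. The sketch below also generalises the idea to an arbitrary combining function; the definition of longZip via foldr1 is one possible assumption, not the only way to write it.

```haskell
import Data.List (transpose)

-- transpose keeps leftover elements of the longer list as singleton columns.
columns :: [[Int]]
columns = transpose [[1, 2, 3], [1, 3, 5, 7, 9]]
-- == [[1,1],[2,3],[3,5],[7],[9]]

-- Hypothetical generalisation: combine each column with f. A singleton
-- column is returned unchanged, because foldr1 f [x] == x.
longZip :: (a -> a -> a) -> [a] -> [a] -> [a]
longZip f xs ys = map (foldr1 f) (transpose [xs, ys])

-- longZip (+) [1, 2, 3] [1, 3, 5, 7, 9] == [2,5,8,7,9]
```

transpose never produces an empty column, so foldr1 is safe here.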
See also:
Zip with default value instead of dropping values?
You're looking for the semialign package. It gives you an operation like zipping, but that keeps going until both lists run out. It also generalizes to types other than lists, such as rose trees. In your case, you'd use it like this:
import Data.Semialign
import Data.These
addLists :: (Semialign f, Num a) => f a -> f a -> f a
addLists = alignWith (mergeThese (+))
longZip :: Semialign f => (a -> a -> a) -> f a -> f a -> f a
longZip = alignWith . mergeThese
The new type signatures are optional. If you want, you can keep using your old ones that restrict them to lists.

How to enhance small Haskell Code Snippet

I recently started trying out Haskell.
It's fun trying out different exercises, but sometimes I get the feeling that my solutions are far from elegant. The following code snippet finds the longest subsequence of a list whose elements satisfy a given condition (for example, uppercase letters).
Could you help a noob make everything shorter and more elegant? Any advice is highly appreciated.
import Data.Char
longer :: [a] -> [a] -> [a]
longer x y = if length x > length y
             then x
             else y
longest :: [[a]]->[a]
longest = foldl longer []
nextSequence :: (a->Bool) -> [a] ->([a],[a])
nextSequence f x = span f (dropWhile (not . f) x)
longestSubsequence :: (a -> Bool) -> [a] -> [a]
longestSubsequence _ x | null x = []
longestSubsequence f x =
  longest $ (\y -> [fst y , longestSubsequence f $ snd y]) (nextSequence f x)
testSequence :: String
testSequence = longestSubsequence Data.Char.isUpper
  "hkerhklehrERJKJKJERKJejkrjekERHkhkerHERKLJHERJKHKJHERdjfkj"
First, you can define your longest like this:
import Data.Function
import Data.List
longest :: [[a]] -> [a]
longest = maximumBy (compare `on` length)
And to get all subsequences that satisfy a given condition you can write a function like this:
import Data.List
getSatisfyingSubseqs :: (a -> Bool) -> [a] -> [[a]]
getSatisfyingSubseqs f = filter (f . head) . groupBy same
  where same x y = f x == f y
Here we group adjacent elements for which the condition yields the same result, then keep only the groups whose elements satisfy the condition.
In total:
longestSubsequence :: (a -> Bool) -> [a] -> [a]
longestSubsequence f = longest . getSatisfyingSubseqs f
UPDATE: If you want to make it shorter, you can drop the auxiliary functions and write the whole thing at once:
longestSubsequence :: (a -> Bool) -> [a] -> [a]
longestSubsequence f = maximumBy (compare `on` length) . filter (f . head) . groupBy same
  where same x y = f x == f y
(Don't forget the imports.)
You can run it there: https://repl.it/#Yuri12358/so-longestsequence
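One edge case in the maximumBy approach: maximumBy throws an error on an empty list, which happens when no element satisfies the condition. A minimal sketch of a variant that returns [] in that case follows; the names runs and longestRun are assumptions made for this example.

```haskell
import Data.Char (isUpper)
import Data.Function (on)
import Data.List (groupBy, maximumBy)

-- Hypothetical helper: maximal contiguous runs whose elements satisfy f.
runs :: (a -> Bool) -> [a] -> [[a]]
runs f xs = [g | g@(x:_) <- groupBy ((==) `on` f) xs, f x]

-- Like longestSubsequence, but returns [] when nothing satisfies f.
longestRun :: (a -> Bool) -> [a] -> [a]
longestRun f xs = case runs f xs of
  [] -> []
  rs -> maximumBy (compare `on` length) rs

-- runs isUpper "abCDeFGH"       == ["CD","FGH"]
-- longestRun isUpper "abCDeFGH" == "FGH"
-- longestRun isUpper "abc"      == ""
```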
The span :: (a -> Bool) -> [a] -> ([a], [a]) function could be very handy here. Also note that f <$> (a,b) = (a,f b). Probably not very efficient due to the length checks but it should do the job.
lss :: (a -> Bool) -> [a] -> [a]
lss f [] = []
lss f ls@(x:xs) = if f x then longer (lss f <$> span f ls)
                         else lss f xs
  where
    longer ::([a],[a]) -> [a]
    longer (xs,ys) = if length xs >= length ys then xs else ys
Your longer function uses length, which means it doesn't work if either input is infinite. However, it can be improved to work when at most one is infinite:
longer l1 l2 = go l1 l2
  where
    go [] _ = l2
    go _ [] = l1
    go (_:xs) (_:ys) = go xs ys
This is also a performance optimization. Before, if you had a 10-element list and a 10-million-element list, it would walk through all 10 million elements of the 10-million-element list before returning it. Here, it will return it as soon as it gets to the 11th element instead.
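A quick way to check the laziness claim is to pass one infinite list. The sketch below repeats the definition from this answer, with a type signature added, so that it can be loaded on its own.

```haskell
-- The definition from the answer above, with a type signature added.
longer :: [a] -> [a] -> [a]
longer l1 l2 = go l1 l2
  where
    go [] _          = l2
    go _ []          = l1
    go (_:xs) (_:ys) = go xs ys

-- One argument may be infinite: the walk stops when the finite list ends.
-- take 3 (longer [1, 2] [1 ..]) == [1,2,3]
-- longer "abc" "de"             == "abc"
```

If both lists have the same length, this version returns the second one, which matches the tie-breaking of the original longer.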

haskell: how to get list of numbers which are higher than their neighbours in a starting list

I am trying to learn Haskell and I want to solve one task. I have a list of Integers and I need to add them to another list if they are bigger than both of their neighbors. For example:
I have a starting list of [0,1,5,2,3,7,8,4] and I need to print out a list which is [5, 8]
This is the code I came up with, but it returns an empty list:
largest :: [Integer]->[Integer]
largest n
  | head n > head (tail n) = head n : largest (tail n)
  | otherwise = largest (tail n)
I would solve this as outlined by Thomas M. DuBuisson. Since we want the ends of the list to "count", we'll add negative infinities to each end before creating triples. The monoid-extras package provides a suitable type for this.
import Data.Monoid.Inf
pad :: [a] -> [NegInf a]
pad xs = [negInfty] ++ map negFinite xs ++ [negInfty]
triples :: [a] -> [(a, a, a)]
triples (x:rest@(y:z:_)) = (x,y,z) : triples rest
triples _ = []
isBig :: Ord a => (a,a,a) -> Bool
isBig (x,y,z) = y > x && y > z
scnd :: (a, b, c) -> b
scnd (a, b, c) = b
finites :: [Inf p a] -> [a]
finites xs = [x | Finite x <- xs]
largest :: Ord a => [a] -> [a]
largest = id
  . finites
  . map scnd
  . filter isBig
  . triples
  . pad
It seems to be working appropriately; in ghci:
> largest [0,1,5,2,3,7,8,4]
[5,8]
> largest [10,1,10]
[10,10]
> largest [3]
[3]
> largest []
[]
You might also consider merging finites, map scnd, and filter isBig in a single list comprehension (then eliminating the definitions of finites, scnd, and isBig):
largest :: Ord a => [a] -> [a]
largest xs = [x | (a, b@(Finite x), c) <- triples (pad xs), a < b, c < b]
But I like the decomposed version better; the finites, scnd, and isBig functions may turn out to be useful elsewhere in your development, especially if you plan to build a few variants of this for different needs.
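If you would rather not depend on monoid-extras, the same padding trick can be sketched with Maybe from base, since the Ord instance for Maybe places Nothing below every Just. The name largestWithEnds is an assumption for this illustration; it is a swapped-in technique, not the code from the answer above.

```haskell
-- Hypothetical variant: Nothing plays the role of negative infinity,
-- because the Ord instance for Maybe puts Nothing below every Just.
largestWithEnds :: Ord a => [a] -> [a]
largestWithEnds xs =
  [ y
  | (a, Just y, c) <- zip3 padded (drop 1 padded) (drop 2 padded)
  , a < Just y
  , c < Just y
  ]
  where
    padded = [Nothing] ++ map Just xs ++ [Nothing]

-- largestWithEnds [0, 1, 5, 2, 3, 7, 8, 4] == [5,8]
-- largestWithEnds [10, 1, 10]              == [10,10]
-- largestWithEnds [3]                      == [3]
```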
One thing you might try is lookahead. (Thomas M. DuBuisson suggested a different one that will also work if you handle the final one or two elements correctly.) Since it sounds like this is a problem you want to solve on your own as a learning exercise, I’ll write a skeleton that you can take as a starting-point if you want:
largest :: [Integer] -> [Integer]
largest [] = _
largest [x] = _ -- What should this return?
largest [x1,x2] | x1 > x2 = _
                | x1 < x2 = _
                | otherwise = _
largest [x1,x2,x3] | x2 > x1 && x2 > x3 = _
                   | x3 > x2 = _
                   | otherwise = _
largest (x1:x2:x3:xs) | x2 > x1 && x2 > x3 = _
                      | otherwise = _
We need the special case of [x1,x2,x3] in addition to (x1:x2:x3:[]) because, according to the clarification in your comment, largest [3,3,2] should return [], but largest [3,2] should return [3]. Therefore, the final three elements require special handling and cannot simply recurse on the final two.
If you also want the result to include the head of the list if it is greater than the second element, you’d make this a helper function and your largest would be something like largest (x1:x2:xs) = (if x1>x2 then [x1] else []) ++ largest' (x1:x2:xs). That is, you want some special handling for the first elements of the original list, which you don’t want to apply to all the sublists when you recurse.
As suggested in the comments, one approach would be to first group the list into tuples of length 3 using Prelude's zip3 and tail:
*Main> let xs = [0,1,5,2,3,7,8,4]
*Main> zip3 xs (tail xs) (tail (tail xs))
[(0,1,5),(1,5,2),(5,2,3),(2,3,7),(3,7,8),(7,8,4)]
Here zip3 has type [a] -> [b] -> [c] -> [(a, b, c)] and tail has type [a] -> [a].
Next you need to keep only the tuples where the middle element is bigger than both the first and the last element. One way would be to use Prelude's filter function:
*Main> let xs = [(0,1,5),(1,5,2),(5,2,3),(2,3,7),(3,7,8),(7,8,4)]
*Main> filter (\(a, b, c) -> b > a && b > c) xs
[(1,5,2),(7,8,4)]
filter has type (a -> Bool) -> [a] -> [a]. It keeps only the elements of a list for which the given predicate returns True.
Now for the final part, you need to extract the middle element from the filtered tuples above. You can do this easily with Prelude's map function:
*Main> let xs = [(1,5,2),(7,8,4)]
*Main> map (\(_, x, _) -> x) xs
[5,8]
map has type (a -> b) -> [a] -> [b]. It applies a function to every element of a list, turning a list of a into a list of b.
The above code stitched together would look like this:
largest :: (Ord a) => [a] -> [a]
largest xs = map (\(_, x, _) -> x) $ filter (\(a, b, c) -> b > a && b > c) $ zip3 xs (tail xs) (tail (tail xs))
Note here I used typeclass Ord, since the above code needs to compare with > and <. It's fine to keep it as Integer here though.

Zip with default value instead of dropping values?

I'm looking for a function in haskell to zip two lists that may vary in length.
All zip functions I could find just drop all values of a list that is longer than the other.
For example:
In my exercise I have two example lists.
If the first one is shorter than the second one I have to fill up using 0's. Otherwise I have to use 1's.
I'm not allowed to use any recursion. I just have to use higher order functions.
Is there any function I can use?
I really could not find any solution so far.
There is some structure to this problem, and here it comes. I'll be using this stuff:
import Control.Applicative
import Data.Traversable
import Data.List
First up, lists-with-padding are a useful concept, so let's have a type for them.
data Padme m = (:-) {padded :: [m], padder :: m} deriving (Show, Eq)
Next, I remember that the truncating-zip operation gives rise to an Applicative instance, in the library as newtype ZipList (a popular example of a non-Monad). The Applicative ZipList amounts to a decoration of the monoid given by infinity and minimum. Padme has a similar structure, except that its underlying monoid is positive numbers (with infinity), using one and maximum.
instance Applicative Padme where
  pure = ([] :-)
  (fs :- f) <*> (ss :- s) = zapp fs ss :- f s where
    zapp [] ss = map f ss
    zapp fs [] = map ($ s) fs
    zapp (f : fs) (s : ss) = f s : zapp fs ss
I am obliged to utter the usual incantation to generate a default Functor instance.
instance Functor Padme where fmap = (<*>) . pure
Thus equipped, we can pad away! For example, the function which takes a ragged list of strings and pads them with spaces becomes a one liner.
deggar :: [String] -> [String]
deggar = transpose . padded . traverse (:- ' ')
See?
*Padme> deggar ["om", "mane", "padme", "hum"]
["om   ","mane ","padme","hum  "]
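To connect this back to the original question, here is a sketch of how the same Applicative could zip two lists with separate default values. The function zipPadded is an assumption made for this example, and the Functor instance is written out directly so that the block stands on its own.

```haskell
-- A list together with the value used to pad it, as defined above.
data Padme m = (:-) { padded :: [m], padder :: m } deriving (Show, Eq)

instance Functor Padme where
  fmap g (xs :- x) = map g xs :- g x

instance Applicative Padme where
  pure = ([] :-)
  (fs :- f) <*> (ss :- s) = zapp fs ss :- f s
    where
      -- When one side runs out, keep going with its padder.
      zapp [] ts           = map f ts
      zapp gs []           = map ($ s) gs
      zapp (g : gs) (t : ts) = g t : zapp gs ts

-- Hypothetical helper: zip to the longer length, padding with da or db.
zipPadded :: a -> b -> [a] -> [b] -> [(a, b)]
zipPadded da db xs ys = padded ((,) <$> (xs :- da) <*> (ys :- db))

-- zipPadded 0 1 [1, 2, 3] [10] == [(1,10),(2,1),(3,1)]
-- zipPadded 0 1 [1] [10, 20, 30] == [(1,10),(0,20),(0,30)]
```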
This can be expressed using These ("represents values with two non-exclusive possibilities") and Align ("functors supporting a zip operation that takes the union of non-uniform shapes") from the these library:
import Data.Align
import Data.These
zipWithDefault :: Align f => a -> b -> f a -> f b -> f (a, b)
zipWithDefault da db = alignWith (fromThese da db)
salign and the other specialised aligns in Data.Align are also worth having a look at.
Thanks to u/WarDaft, u/gallais and u/sjakobi over at r/haskell for pointing out this answer should exist here.
You can append an infinite list of 0 or 1 to each list and then take the number you need from the resulting zipped list:
zipWithDefault :: a -> b -> [a] -> [b] -> [(a,b)]
zipWithDefault da db la lb = let len = max (length la) (length lb)
                                 la' = la ++ (repeat da)
                                 lb' = lb ++ (repeat db)
                             in take len $ zip la' lb'
This should do the trick:
import Data.Maybe (fromMaybe)
myZip dx dy xl yl =
  map (\(x,y) -> (fromMaybe dx x, fromMaybe dy y)) $
    takeWhile (/= (Nothing, Nothing)) $
    zip ((map Just xl) ++ (repeat Nothing)) ((map Just yl) ++ (repeat Nothing))
main = print $ myZip 0 1 [1..10] [42,43,44]
Basically, append an infinite list of Nothing to the end of both lists, then zip them, and drop the results when both are Nothing. Then replace the Nothings with the appropriate default value, dropping the no longer needed Justs while you're at it.
No length, no counting, no hand-crafted recursions, no cooperating folds. transpose does the trick:
zipLongest :: a -> b -> [a] -> [b] -> [(a,b)]
zipLongest x y xs ys = map head . transpose $ -- longest length;
  [ -- view from above:
    zip xs
        (ys ++ repeat y) -- with length of xs
  , zip (xs ++ repeat x)
        ys -- with length of ys
  ]
The result of transpose is as long a list as the longest one in its input list of lists. map head takes the first element in each "column", which is the pair we need, whichever the longest list was.
(update:) For an arbitrary number of lists, efficient padding to the maximal length -- aiming to avoid the potentially quadratic behaviour of other sequentially-combining approaches -- can follow the same idea:
padAll :: a -> [[a]] -> [[a]]
padAll x xss = transpose $
  zipWith const
    (transpose [xs ++ repeat x | xs <- xss]) -- pad all, and cut
    (takeWhile id . map or . transpose $ -- to the longest list
      [ (True <$ xs) ++ repeat False | xs <- xss])
> mapM_ print $ padAll '-' ["ommmmmmm", "ommmmmm", "ommmmm", "ommmm", "ommm",
    "omm", "om", "o"]
"ommmmmmm"
"ommmmmm-"
"ommmmm--"
"ommmm---"
"ommm----"
"omm-----"
"om------"
"o-------"
You don't have to compare list lengths. Try to think about your zip function as a function taking only one argument xs and returning a function which will take ys and perform the required zip. Then, try to write a recursive function which recurses on xs only, as follows.
type Result = [Int] -> [(Int,Int)]
myZip :: [Int] -> Result
myZip [] = map (\y -> (0,y)) -- :: Result
myZip (x:xs) = f x (myZip xs) -- :: Result
  where f x k = ??? -- :: Result
Once you have found f, notice that you can turn the recursion above into a fold!
As you said yourself, the standard zip :: [a] -> [b] -> [(a, b)] drops elements from the longer list. To amend for this fact you can modify your input before giving it to zip. First you will have to find out which list is the shorter one (most likely, using length). E.g.,
zip' x xs y ys | length xs <= length ys = ...
               | otherwise = ...
where x is the default value for shorter xs and y the default value for shorter ys.
Then you extend the shorter list with the desired default elements (enough to account for the additional elements of the other list). A neat trick for doing so without having to know the length of the longer list is to use the function repeat :: a -> [a] that repeats its argument infinitely often.
zip' x xs y ys | length xs <= length ys = zip {-do something with xs-} ys
               | otherwise = zip xs {-do something with ys-}
Here is another solution, that does work on infinite lists and is a straightforward upgrade of Prelude's zip functions:
zipDefault :: a -> b -> [a] -> [b] -> [(a,b)]
zipDefault _da _db [] [] = []
zipDefault da db (a:as) [] = (a,db) : zipDefault da db as []
zipDefault da db [] (b:bs) = (da,b) : zipDefault da db [] bs
zipDefault da db (a:as) (b:bs) = (a,b) : zipDefault da db as bs
and
zipDefaultWith :: a -> b -> (a->b->c) -> [a] -> [b] -> [c]
zipDefaultWith _da _db _f [] [] = []
zipDefaultWith da db f (a:as) [] = f a db : zipDefaultWith da db f as []
zipDefaultWith da db f [] (b:bs) = f da b : zipDefaultWith da db f [] bs
zipDefaultWith da db f (a:as) (b:bs) = f a b : zipDefaultWith da db f as bs
@pigworker, thank you for your enlightening solution!
Yet another implementation:
zipWithDefault :: a -> b -> (a -> b -> c) -> [a] -> [b] -> [c]
zipWithDefault dx _ f [] ys = zipWith f (repeat dx) ys
zipWithDefault _ dy f xs [] = zipWith f xs (repeat dy)
zipWithDefault dx dy f (x:xs) (y:ys) = f x y : zipWithDefault dx dy f xs ys
And also:
zipDefault :: a -> b -> [a] -> [b] -> [(a,b)]
zipDefault dx dy = zipWithDefault dx dy (,)
I would like to address the second part of Will Ness's solution, with its excellent use of known functions, by providing another answer to the original question.
zipPadWith :: a -> b -> (a -> b -> c) -> [a] -> [b] -> [c]
zipPadWith n _ f [] l = [f n x | x <- l]
zipPadWith _ m f l [] = [f x m | x <- l]
zipPadWith n m f (x:xs) (y:ys) = f x y : zipPadWith n m f xs ys
This function pads the shorter list with an element of your choice. The padding element can itself be a list: below, it is the default value repeated once for each of the remaining lists.
rectangularWith :: a -> [[a]] -> [[a]]
rectangularWith _ [] = []
rectangularWith _ [ms] = [[m] | m <- ms]
rectangularWith n (ms:mss) = zipPadWith n [n | _ <- mss] (:) ms (rectangularWith n mss)
The result is a transposed, rectangular list of lists padded with the element we provided, so we only need to import transpose from Data.List to recover the original orientation.
mapM_ print $ transpose $ rectangularWith 0 [[1,2,3,4],[5,6],[7,8],[9]]
[1,2,3,4]
[5,6,0,0]
[7,8,0,0]
[9,0,0,0]

unique elements in a haskell list

okay, this is probably going to be in the prelude, but: is there a standard library function for finding the unique elements in a list? my (re)implementation, for clarification, is:
has :: (Eq a) => [a] -> a -> Bool
has [] _ = False
has (x:xs) a
| x == a = True
| otherwise = has xs a
unique :: (Eq a) => [a] -> [a]
unique [] = []
unique (x:xs)
| has xs x = unique xs
| otherwise = x : unique xs
I searched for (Eq a) => [a] -> [a] on Hoogle.
First result was nub (remove duplicate elements from a list).
Hoogle is awesome.
The nub function from Data.List (no, it's actually not in the Prelude) definitely does something like what you want, but it is not quite the same as your unique function. They both preserve the original order of the elements, but unique retains the last occurrence of each element, while nub retains the first occurrence.
You can do this to make nub act exactly like unique, if that's important (though I have a feeling it's not):
unique = reverse . nub . reverse
Also, nub is only good for small lists.
Its complexity is quadratic, so it starts to get slow if your list can contain hundreds of elements.
If you limit your types to types having an Ord instance, you can make it scale better.
This variation on nub still preserves the order of the list elements, but its complexity is O(n * log n):
import qualified Data.Set as Set
nubOrd :: Ord a => [a] -> [a]
nubOrd xs = go Set.empty xs where
  go s (x:xs)
    | x `Set.member` s = go s xs
    | otherwise = x : go (Set.insert x s) xs
  go _ _ = []
In fact, it has been proposed to add nubOrd to Data.Set.
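As far as I know, a function with this name has since been added to the containers package, in the module Data.Containers.ListUtils rather than Data.Set. The sketch below assumes a reasonably recent containers version that ships that module.

```haskell
import Data.Containers.ListUtils (nubOrd)

-- nubOrd keeps the first occurrence of each element, like nub,
-- but needs an Ord instance and runs in O(n log n).
firstOccurrences :: [Int]
firstOccurrences = nubOrd [3, 1, 3, 2, 1]
-- == [3,1,2]
```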
import Data.Set (toList, fromList)
uniquify lst = toList $ fromList lst
I think that unique should return a list of elements that only appear once in the original list; that is, any elements of the original list that appear more than once should not be included in the result.
May I suggest an alternative definition, unique_alt:
unique_alt :: [Int] -> [Int]
unique_alt [] = []
unique_alt (x:xs)
  | elem x ( unique_alt xs ) = [ y | y <- ( unique_alt xs ), y /= x ]
  | otherwise = x : ( unique_alt xs )
Here are some examples that highlight the differences between unique_alt and unique:
unique [1,2,1] = [2,1]
unique_alt [1,2,1] = [2]
unique [1,2,1,2] = [1,2]
unique_alt [1,2,1,2] = []
unique [4,2,1,3,2,3] = [4,1,2,3]
unique_alt [4,2,1,3,2,3] = [4,1]
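Note that unique_alt as written gives unique_alt [1,1,1] = [1]: the first two occurrences cancel each other and the third is then added back. If "appears exactly once" is the goal, one possible sketch counts occurrences first. The name onlyOnce and the use of Data.Map from containers are assumptions made for this example.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical alternative: keep the elements that occur exactly once,
-- in their original order.
onlyOnce :: Ord a => [a] -> [a]
onlyOnce xs = filter (\x -> Map.findWithDefault 0 x counts == 1) xs
  where
    -- Number of occurrences of each element.
    counts = Map.fromListWith (+) [(x, 1 :: Int) | x <- xs]

-- onlyOnce [1, 2, 1]          == [2]
-- onlyOnce [4, 2, 1, 3, 2, 3] == [4,1]
-- onlyOnce [1, 1, 1]          == []
```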
I think this would do it.
unique [] = []
unique (x:xs) = x:unique (filter ((/=) x) xs)
Another way to remove duplicates:
unique :: [Int] -> [Int]
unique xs = [x | (x,y) <- zip xs [0..], x `notElem` (take y xs)]
Algorithm in Haskell to create a unique list:
data Foo = Foo { id_ :: Int
               , name_ :: String
               } deriving (Show)
alldata = [ Foo 1 "Name"
          , Foo 2 "Name"
          , Foo 3 "Karl"
          , Foo 4 "Karl"
          , Foo 5 "Karl"
          , Foo 7 "Tim"
          , Foo 8 "Tim"
          , Foo 9 "Gaby"
          , Foo 9 "Name"
          ]
isolate :: [Foo] -> [Foo]
isolate [] = []
isolate (x:xs) = (fst f) : isolate (snd f)
  where
    f = foldl helper (x,[]) xs
    -- compare against the running best a, not the original head x
    helper (a,b) y = if name_ a == name_ y
                       then if id_ a >= id_ y
                              then (a,b)
                              else (y,b)
                       else (a,y:b)
main :: IO ()
main = mapM_ (putStrLn . show) (isolate alldata)
Output:
Foo {id_ = 9, name_ = "Name"}
Foo {id_ = 9, name_ = "Gaby"}
Foo {id_ = 5, name_ = "Karl"}
Foo {id_ = 8, name_ = "Tim"}
A library-based solution:
We can use that style of Haskell programming where all looping and recursion activities are pushed out of user code and into suitable library functions. Said library functions are often optimized in ways that are way beyond the skills of a Haskell beginner.
A way to decompose the problem into two passes goes like this:
1. produce a second list that is parallel to the input list, but with duplicate elements suitably marked
2. eliminate elements marked as duplicates from that second list
For the first step, duplicate elements don't need a value at all, so we can use [Maybe a] as the type of the second list. So we need a function of type:
pass1 :: Eq a => [a] -> [Maybe a]
Function pass1 is an example of stateful list traversal where the state is the list (or set) of distinct elements seen so far. For this sort of problem, the library provides the mapAccumL :: (s -> a -> (s, b)) -> s -> [a] -> (s, [b]) function.
Here the mapAccumL function requires, besides the initial state and the input list, a step function argument, of type s -> a -> (s, Maybe a).
If the current element x is not a duplicate, the output of the step function is Just x and x gets added to the current state. If x is a duplicate, the output of the step function is Nothing, and the state is passed unchanged.
Testing under the ghci interpreter:
$ ghci
GHCi, version 8.8.4: https://www.haskell.org/ghc/ :? for help
λ>
λ> stepFn s x = if (elem x s) then (s, Nothing) else (x:s, Just x)
λ>
λ> import Data.List(mapAccumL)
λ>
λ> pass1 xs = mapAccumL stepFn [] xs
λ>
λ> xs2 = snd $ pass1 "abacrba"
λ> xs2
[Just 'a', Just 'b', Nothing, Just 'c', Just 'r', Nothing, Nothing]
λ>
Writing a pass2 function is even easier. To filter out Nothing non-values, we could use:
import Data.Maybe( fromJust, isJust)
pass2 = (map fromJust) . (filter isJust)
but why bother at all ? - as this is precisely what the catMaybes library function does.
λ>
λ> import Data.Maybe(catMaybes)
λ>
λ> catMaybes xs2
"abcr"
λ>
Putting it all together:
Overall, the source code can be written as:
import Data.Maybe(catMaybes)
import Data.List(mapAccumL)
uniques :: (Eq a) => [a] -> [a]
uniques = let stepFn s x = if (elem x s) then (s, Nothing) else (x:s, Just x)
          in catMaybes . snd . mapAccumL stepFn []
This code is reasonably compatible with infinite lists, something occasionally referred to as being “laziness-friendly”:
λ>
λ> take 5 $ uniques $ "abacrba" ++ (cycle "abcrf")
"abcrf"
λ>
Efficiency note:
If we anticipate that it is possible to find many distinct elements in the input list and we can have an Ord a instance, the state can be implemented as a Set object rather than a plain list, this without having to alter the overall structure of the solution.
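A minimal sketch of that Set-based variant follows, assuming an Ord instance is available. The name uniquesOrd is chosen for this example, and Data.Set comes from the containers package.

```haskell
import Data.List (mapAccumL)
import Data.Maybe (catMaybes)
import qualified Data.Set as Set

-- Hypothetical Set-based variant: same structure as uniques,
-- but the state is a Set, so each membership test is O(log n).
uniquesOrd :: Ord a => [a] -> [a]
uniquesOrd = catMaybes . snd . mapAccumL step Set.empty
  where
    step seen x
      | x `Set.member` seen = (seen, Nothing)
      | otherwise           = (Set.insert x seen, Just x)

-- uniquesOrd "abacrba" == "abcr"
-- take 5 (uniquesOrd ("abacrba" ++ cycle "abcrf")) == "abcrf"
```

Only the step function and the initial state change; the catMaybes . snd . mapAccumL skeleton stays the same.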
Here's a solution that uses only Prelude functions:
uniqueList theList =
  if not (null theList)
    then head theList : filter (/= head theList) (uniqueList (tail theList))
    else []
This is roughly equivalent to two nested "for" loops: each element is visited once, and the rest of the result is then filtered against it to drop equal elements. Each of the n elements passes through at most n filters, so I'd estimate this is O(n^2).
Might even be better than reversing a list, nubbing it, then reversing it again, depending on your circumstances.