I'm interested in writing an efficient Haskell function triangularize :: [a] -> [[a]] that takes a (perhaps infinite) list and "triangularizes" it into a list of lists. For example, triangularize [1..19] should return
[[1, 3, 6, 10, 15]
,[2, 5, 9, 14]
,[4, 8, 13, 19]
,[7, 12, 18]
,[11, 17]
,[16]]
By efficient, I mean that I want it to run in O(n) time where n is the length of the list.
Note that this is quite easy to do in a language like Python, because appending to the end of a list (array) is a constant time operation. A very imperative Python function which accomplishes this is:
def triangularize(elements):
row_index = 0
column_index = 0
diagonal_array = []
for a in elements:
if row_index == len(diagonal_array):
diagonal_array.append([a])
else:
diagonal_array[row_index].append(a)
if row_index == 0:
(row_index, column_index) = (column_index + 1, 0)
else:
row_index -= 1
column_index += 1
return diagonal_array
This came up because I have been using Haskell to write some "tabl" sequences in the On-Line Encyclopedia of Integer Sequences (OEIS), and I want to be able to transform an ordinary (1-dimensional) sequence into a (2-dimensional) sequence of sequences in exactly this way.
Perhaps there's some clever (or not-so-clever) way to foldr over the input list, but I haven't been able to sort it out.
Make increasing size chunks:
chunks :: [a] -> [[a]]
chunks = go 0 where
go n [] = []
go n as = b : go (n+1) e where (b,e) = splitAt n as
Then just transpose twice:
diagonalize :: [a] -> [[a]]
diagonalize = transpose . transpose . chunks
Try it in ghci:
> diagonalize [1..19]
[[1,3,6,10,15],[2,5,9,14],[4,8,13,19],[7,12,18],[11,17],[16]]
This appears to be directly related to the set theory argument proving that the set of integer pairs are in one-to-one correspondence with the set of integers (denumerable). The argument involves a so-called Cantor pairing function.
So, out of curiosity, let's see if we can get a diagonalize function that way.
Define the infinite list of Cantor pairs recursively in Haskell:
auxCantorPairList :: (Integer, Integer) -> [(Integer, Integer)]
auxCantorPairList (x,y) =
let nextPair = if (x > 0) then (x-1,y+1) else (x+y+1, 0)
in (x,y) : auxCantorPairList nextPair
cantorPairList :: [(Integer, Integer)]
cantorPairList = auxCantorPairList (0,0)
And try that inside ghci:
λ> take 15 cantorPairList
[(0,0),(1,0),(0,1),(2,0),(1,1),(0,2),(3,0),(2,1),(1,2),(0,3),(4,0),(3,1),(2,2),(1,3),(0,4)]
λ>
We can number the pairs, and for example extract the numbers for those pairs which have a zero x coordinate:
λ>
λ> xs = [1..]
λ> take 5 $ map fst $ filter (\(n,(x,y)) -> (x==0)) $ zip xs cantorPairList
[1,3,6,10,15]
λ>
We recognize this is the top row from the OP's result in the text of the question.
Similarly for the next two rows:
λ>
λ> makeRow xs row = map fst $ filter (\(n,(x,y)) -> (x==row)) $ zip xs cantorPairList
λ> take 5 $ makeRow xs 1
[2,5,9,14,20]
λ>
λ> take 5 $ makeRow xs 2
[4,8,13,19,26]
λ>
From there, we can write our first draft of a diagonalize function:
λ>
λ> printAsLines xs = mapM_ (putStrLn . show) xs
λ> diagonalize xs = takeWhile (not . null) $ map (makeRow xs) [0..]
λ>
λ> printAsLines $ diagonalize [1..19]
[1,3,6,10,15]
[2,5,9,14]
[4,8,13,19]
[7,12,18]
[11,17]
[16]
λ>
EDIT: performance update
For a list of 1 million items, the runtime is 18 sec, and 145 seconds for 4 millions items. As mentioned by Redu, this seems like O(n√n) complexity.
Distributing the pairs among the various target sublists is inefficient, as most filter operations fail.
To improve performance, we can use a Data.Map structure for the target sublists.
{-# LANGUAGE ExplicitForAll #-}
{-# LANGUAGE ScopedTypeVariables #-}
import qualified Data.List as L
import qualified Data.Map as M
type MIL a = M.Map Integer [a]
buildCantorMap :: forall a. [a] -> MIL a
buildCantorMap xs =
let ts = zip xs cantorPairList -- triplets (a,(x,y))
m0 = (M.fromList [])::MIL a
redOp m (n,(x,y)) = let afn as = case as of
Nothing -> Just [n]
Just jas -> Just (n:jas)
in M.alter afn x m
m1r = L.foldl' redOp m0 ts
in
fmap reverse m1r
diagonalize :: [a] -> [[a]]
diagonalize xs = let cm = buildCantorMap xs
in map snd $ M.toAscList cm
With that second version, performance appears to be much better: 568 msec for the 1 million items list, 2669 msec for the 4 millions item list. So it is close to the O(n*Log(n)) complexity we could have hoped for.
It might be a good idea to craete a comb filter.
So what does comb filter do..? It's like splitAt but instead of splitting at a single index it sort of zips the given infinite list with the given comb to separate the items coressponding to True and False in the comb. Such that;
comb :: [Bool] -- yields [True,False,True,False,False,True,False,False,False,True...]
comb = iterate (False:) [True] >>= id
combWith :: [Bool] -> [a] -> ([a],[a])
combWith _ [] = ([],[])
combWith (c:cs) (x:xs) = let (f,s) = combWith cs xs
in if c then (x:f,s) else (f,x:s)
λ> combWith comb [1..19]
([1,3,6,10,15],[2,4,5,7,8,9,11,12,13,14,16,17,18,19])
Now all we need to do is to comb our infinite list and take the fst as the first row and carry on combing the snd with the same comb.
Lets do it;
diags :: [a] -> [[a]]
diags [] = []
diags xs = let (h,t) = combWith comb xs
in h : diags t
λ> diags [1..19]
[ [1,3,6,10,15]
, [2,5,9,14]
, [4,8,13,19]
, [7,12,18]
, [11,17]
, [16]
]
also seems to be lazy too :)
λ> take 5 . map (take 5) $ diags [1..]
[ [1,3,6,10,15]
, [2,5,9,14,20]
, [4,8,13,19,26]
, [7,12,18,25,33]
, [11,17,24,32,41]
]
I think the complexity could be like O(n√n) but i can not make sure. Any ideas..?
Related
I'm playing around with Haskell, mostly trying to learn some new techniques to solve problems. Without any real application in mind I came to think about an interesting thing I can't find a satisfying solution to. Maybe someone has any better ideas?
The problem:
Let's say we want to generate a list of Ints using a starting value and a list of Ints, representing the pattern of numbers to be added in the specified order. So the first value is given, then second value should be the starting value plus the first value in the list, the third that value plus the second value of the pattern, and so on. When the pattern ends, it should start over.
For example: Say we have a starting value v and a pattern [x,y], we'd like the list [v,v+x,v+x+y,v+2x+y,v+2x+2y, ...]. In other words, with a two-valued pattern, next value is created by alternatingly adding x and y to the number last calculated.
If the pattern is short enough (2-3 values?), one could generate separate lists:
[v,v,v,...]
[0,x,x,2x,2x,3x, ...]
[0,0,y,y,2y,2y,...]
and then zip them together with addition. However, as soon as the pattern is longer this gets pretty tedious. My best attempt at a solution would be something like this:
generateLstByPattern :: Int -> [Int] -> [Int]
generateLstByPattern v pattern = v : (recGen v pattern)
where
recGen :: Int -> [Int] -> [Int]
recGen lastN (x:[]) = (lastN + x) : (recGen (lastN + x) pattern)
recGen lastN (x:xs) = (lastN + x) : (recGen (lastN + x) xs)
It works as intended - but I have a feeling there is a bit more elegant Haskell solution somewhere (there almost always is!). What do you think? Maybe a cool list-comprehension? A higher-order function I've forgotten about?
Separate the concerns. First look a just a list to process once. Get that working, test it. Hint: “going through the list elements with some accumulator” is in general a good fit for a fold.
Then all that's left to is to repeat the list of inputs and feed it into the pass-once function. Conveniently, there's a standard function for that purpose. Just make sure your once-processor is lazy enough to handle the infinite list input.
What you describe is
foo :: Num a => a -> [a] -> [a]
foo v pattern = scanl (+) v (cycle pattern)
which would normally be written even as just
foo :: Num a => a -> [a] -> [a]
foo v = scanl (+) v . cycle
scanl (+) v xs is the standard way to calculate the partial sums of (v:xs), and cycle is the standard way to repeat a given list cyclically. This is what you describe.
This works for a pattern list of any positive length, as you wanted.
Your way of generating it is inventive, but it's almost too clever for its own good (i.e. it seems overly complicated). It can be expressed with some list comprehensions, as
foo v pat =
let -- the lists, as you describe them:
lists = repeat v :
[ replicate i 0 ++
[ y | x <- [p, p+p ..]
, y <- map (const x) pat ]
| (p,i) <- zip pat [1..] ]
in
-- OK, so what do we do with that? How do we zipWith
-- over an arbitrary amount of lists?
-- with a fold!
foldr (zipWith (+)) (repeat 0) lists
map (const x) pat is a "clever" way of writing replicate (length pat) x. It can be further shortened to x <$ pat since (<$) x xs == map (const x) xs by definition. It might seem obfuscated, until you've become accustomed to it, and then it seems clear and obvious. :)
Surprised noone's mentioned the silly way yet.
mylist x xs = x : zipWith (+) (mylist x xs) (cycle xs)
(If you squint a bit you can see the connection to scanl answer).
When it is about generating series my first approach would be iterate or unfoldr. iterate is for simple series and unfoldr is for those who carry kind of state but without using any State monad.
In this particular case I think unfoldr is ideal.
series :: Int -> [Int] -> [Int]
series s [x,y] = unfoldr (\(f,s) -> Just (f*x + s*y, (s+1,f))) (s,0)
λ> take 10 $ series 1 [1,1]
[1,2,3,4,5,6,7,8,9,10]
λ> take 10 $ series 3 [1,1]
[3,4,5,6,7,8,9,10,11,12]
λ> take 10 $ series 0 [1,2]
[0,1,3,4,6,7,9,10,12,13]
It is probably better to implement the lists separately, for example the list with x can be implement with:
xseq :: (Enum a, Num a) => a -> [a]
xseq x = 0 : ([x, x+x ..] >>= replicate 2)
Whereas the sequence for y can be implemented as:
yseq :: (Enum a, Num a) => a -> [a]
yseq y = [0,y ..] >>= replicate 2
Then you can use zipWith :: (a -> b -> c) -> [a] -> [b] -> [c] to add the two lists together and add v to it:
mylist :: (Enum a, Num a) => a -> a -> a -> [a]
mylist v x y = zipWith ((+) . (v +)) (xseq x) (yseq y)
So for v = 1, x = 2, and y = 3, we obtain:
Prelude> take 10 (mylist 1 2 3)
[1,3,6,8,11,13,16,18,21,23]
An alternative is to see as pattern that we each time first add x and then y. We thus can make an infinite list [(x+), (y+)], and use scanl :: (b -> a -> b) -> b -> [a] -> [b] to each time apply one of the functions and yield the intermediate result:
mylist :: Num a => a -> a -> a -> [a]
mylist v x y = scanl (flip ($)) v (cycle [(x+), (y+)])
this yields the same result:
Prelude> take 10 $ mylist 1 2 3
[1,3,6,8,11,13,16,18,21,23]
Now the only thing left to do is to generalize this to a list. So for example if the list of additions is given, then you can impelement this as:
mylist :: Num a => [a] -> [a]
mylist v xs = scanl (flip ($)) v (cycle (map (+) xs))
or for a list of functions:
mylist :: Num a => [a -> a] -> [a]
mylist v xs = scanl (flip ($)) v (cycle (xs))
In Haskell I'm trying to create a function with the typing Int -> [a] -> [[a]], that generates a list such as: [[0, 0], [0, 1], [1, 0], [1, 1]] where each element in the smaller lists can take the value of either 1 or 0. Each of the smaller lists has the same size, which in this case is 2. If the size of the smaller lists was 3, I would expect to get the output [[0,0,0], [0,0,1], [0,1,0], [1,0,0], [1,1,0], [0,1,1], [1,0,1], [1,1,1]]
I've looked in to the permutations function, but this does not achieve exactly what I want. I believe there is also a variate function, but I cannot access this library.
Rather than the exact function (which would also be useful), what would be the process to generate such a list?
As oisdk mentions in a comment, a more general version of this exact function is already defined, with the name Control.Monad.replicateM:
Prelude> import Control.Monad (replicateM)
Prelude Control.Monad> replicateM 3 [0,1]
[[0,0,0],[0,0,1],[0,1,0],[0,1,1],[1,0,0],[1,0,1],[1,1,0],[1,1,1]]
We can use the list monad for this:
example :: [[Int]]
example = do
x <- [0,1]
y <- [0,1]
pure [x,y]
ghci> example
[[0,0],[0,1],[1,0],[1,1]]
Play with this. Then you should be able to combine it with recursion on n to create the function you need.
I'm not sure I understood the specification, but from the examples, one possible definition is
lists :: Int -> [[Int]]
lists 0 = [[]]
lists n = map (0:) xss ++ map (1:) xss
where xss = lists (n-1)
-- λ> lists 2
-- [[0,0],[0,1],[1,0],[1,1]]
-- λ> lists 3
-- [[0,0,0],[0,0,1],[0,1,0],[0,1,1],[1,0,0],[1,0,1],[1,1,0],[1,1,1]]
Another definition, using comprehension instead of map, is
lists :: Int -> [[Int]]
lists 0 = [[]]
lists n = [x:xs | x <- [0,1], xs <- lists (n-1)]
-- λ> lists 2
-- [[0,0],[0,1],[1,0],[1,1]]
-- λ> lists 3
-- [[0,0,0],[0,0,1],[0,1,0],[0,1,1],[1,0,0],[1,0,1],[1,1,0],[1,1,1]]
You can use the sequence function.
Like this:
λ>
λ> :t sequence
sequence :: (Traversable t, Monad m) => t (m a) -> m (t a)
λ>
λ> let { allLists :: Int -> [a] -> [[a]] ; allLists n xs = sequence $ replicate n xs ; }
λ>
λ> allLists 3 [0,1]
[[0,0,0],[0,0,1],[0,1,0],[0,1,1],[1,0,0],[1,0,1],[1,1,0],[1,1,1]]
λ>
This is for a class
We're supposed to write 3 functions :
1 : Prints list of fibbonaci numbers
2 : Prints list of prime numbers
3 : Prints list of fibonacci numbers whose indexes are prime
EG : Let this be fibbonaci series
Then In partC - certain elements are only shown
1: 1
*2: 1 (shown as index 2 is prime )
*3: 2 (shown as index 3 is prime )
4: 3
*5: 5 (shown )
6: 8
*7: 13 (shown as index 7 prime and so on)
I'm done with part 1 & 2 but I'm struggling with part 3. I created a function listNum that creates a sort of mapping [Integer, Integer] from the Fibbonaci series - where 1st Int is the index and 2nd int is the actual fibbonaci numbers.
Now my function partC is trying to stitch snd elements of the fibonaci series by filtering the indexes but I'm doing something wrong in the filter step.
Any help would be appreciated as I'm a beginner to Haskell.
Thanks!
fib :: [Integer]
fib = 0 : 1 : zipWith (+) fib (tail fib)
listNum :: [(Integer, Integer)]
listNum = zip [1 .. ] fib
primes :: [Integer]
primes = sieve (2 : [3,5 ..])
where
sieve (p:xs) = p : sieve [x | x <- xs , x `mod` p > 0]
partC :: [Integer] -- Problem in filter part of this function
partC = map snd listNum $ filter (\x -> x `elem` primes) [1,2 ..]
main = do
print (take 10 fib) -- Works fine
print (take 10 primes) --works fine
print (take 10 listNum) --works fine
print ( take 10 partC) -- Causes error
Error :
prog0.hs:14:9: error:
• Couldn't match expected type ‘[Integer] -> [Integer]’
with actual type ‘[Integer]’
• The first argument of ($) takes one argument,
but its type ‘[Integer]’ has none
In the expression:
map snd listNum $ filter (\ x -> x `elem` primes) [1, 2 .. ]
In an equation for ‘partC’:
partC
= map snd listNum $ filter (\ x -> x `elem` primes) [1, 2 .. ]
|
14 | partC = map snd listNum $ filter (\x -> x `elem` primes) [1,2 ..]
Here's what I think you intended as the original logic of partC. You got the syntax mostly right, but the logic has a flaw.
partC = snd <$> filter ((`elem` primes) . fst) (zip [1..] fib)
-- note that (<$>) = fmap = map, just infix
-- list comprehension
partC = [fn | (idx, fn) <- zip [1..] fib, idx `elem` primes]
But this cannot work. As #DanRobertson notes, you'll try to check 4 `elem` primes and run into an infinite loop, because primes is infinite and elem tries to be really sure that 4 isn't an element before giving up. We humans know that 4 isn't an element of primes, but elem doesn't.
There are two ways out. We can write a custom version of elem that gives up once it finds an element larger than the one we're looking for:
sortedElem :: Ord a => a -> [a] -> Bool
sortedElem x (h:tl) = case x `compare` h of
LT -> False
EQ -> True
GT -> sortedElem x tl
sortedElem _ [] = False
-- or
sortedElem x = foldr (\h tl -> case x `compare` h of
LT -> False
EQ -> True
GT -> tl
) False
Since primes is a sorted list, sortedElem will always give the correct answer now:
partC = snd <$> filter ((`sortedElem` primes) . fst) (zip [1..] fib)
However, there is a performance issue, because every call to sortedElem has to start at the very beginning of primes and walk all the way down until it figures out whether or not the index is right. This leads into the second way:
partC = go primeDiffs fib
where primeDiffs = zipWith (-) primes (1:primes)
-- primeDiffs = [1, 1, 2, 2, 4, 2, 4, 2, 4, 6, ...]
-- The distance from one prime (incl. 1) to the next
go (step:steps) xs = x:go steps xs'
where xs'#(x:_) = drop step xs
go [] _ = [] -- unused here
-- in real code you might pull this out into an atOrderedIndices :: [Int] -> [a] -> [a]
We transform the list of indices (primes) into a list of offsets, each one building on the next, and we call it primeDiffs. We then define go to take such a list of offsets and extract elements from another list. It first drops the elements being skipped, and then puts the top element into the result before building the rest of the list. Under -O2, on my machine, this version is twice as fast as the other one when finding partC !! 5000.
Only began using Haskell a couple of weeks ago - I am attempting to randomly shuffle a list of type Card by splitting the list into two at a random point int eh list (depending on an array of random integers produced by the randomList function) and swapping the order of these two parts a number of times, but the output is not at all random, and the parse only seems to be happening once, pretty desperate as I need it working and the deadline is tonight!
randomList :: (Random a) => (a,a) -> Int -> StdGen -> [a]
randomList bnds n = take n . randomRs bnds
randomise :: [Int] -> [Card] -> [Card]
randomise [] p = p
randomise (x : xs) p = do
randomise xs ((drop x p) ++ (take x p))
shuffle :: Int -> [Card] -> [Card]
shuffle r p = do
let g = mkStdGen r
randomise(randomList (1, (length p)-1) 500 g :: [Int]) p
You can just make a random number of permutations on your list. You can do it like:
import System.Random
import Data.List
shuffle xs = do
gen <- getStdGen
let (permNum,newGen) = randomR (0,fac (length xs) -1) gen
return $ permutation permNum xs
permutation makes n permutations on the (assumed sorted) list xs. When randomizing, xs need not be sorted however.
fac is just an implementation of the factorial function.
shuffle makes a random number and applies that many permutations to xs.
It's a bit different from what you are trying to do, but it works wonders. I assumed you didn't need to explicitly use your proposed method. You will have to implement permutation and fac yourself though.
For help on permutation, you could look here. It's a description to solve a Project Euler Problem, but you could use the same procedure to make n permutations.
EDIT: I don't know if anyone cares anymore, but I found another way to do it WAY easier:
import System.Random
randPerm :: StdGen -> [a] -> [a]
randPerm _ [] = []
randPerm gen xs = let (n,newGen) = randomR (0,length xs -1) gen
front = xs !! n
in front : randPerm newGen (take n xs ++ drop (n+1) xs)
Quite late to the party, but a small improvement over the suggested solution is to use splitAt instead of take & drop:
shuffle :: [a] -> StdGen -> [a]
shuffle [] _ = []
shuffle list generator = let (index,newGenerator) = randomR (0,length list -1) generator
(listUntilIndex, element:listAfterIndex) = splitAt index list
in element : shuffle (listUntilIndex ++ listAfterIndex) newGenerator
okay, this is probably going to be in the prelude, but: is there a standard library function for finding the unique elements in a list? my (re)implementation, for clarification, is:
has :: (Eq a) => [a] -> a -> Bool
has [] _ = False
has (x:xs) a
| x == a = True
| otherwise = has xs a
unique :: (Eq a) => [a] -> [a]
unique [] = []
unique (x:xs)
| has xs x = unique xs
| otherwise = x : unique xs
I searched for (Eq a) => [a] -> [a] on Hoogle.
First result was nub (remove duplicate elements from a list).
Hoogle is awesome.
The nub function from Data.List (no, it's actually not in the Prelude) definitely does something like what you want, but it is not quite the same as your unique function. They both preserve the original order of the elements, but unique retains the last
occurrence of each element, while nub retains the first occurrence.
You can do this to make nub act exactly like unique, if that's important (though I have a feeling it's not):
unique = reverse . nub . reverse
Also, nub is only good for small lists.
Its complexity is quadratic, so it starts to get slow if your list can contain hundreds of elements.
If you limit your types to types having an Ord instance, you can make it scale better.
This variation on nub still preserves the order of the list elements, but its complexity is O(n * log n):
import qualified Data.Set as Set
nubOrd :: Ord a => [a] -> [a]
nubOrd xs = go Set.empty xs where
go s (x:xs)
| x `Set.member` s = go s xs
| otherwise = x : go (Set.insert x s) xs
go _ _ = []
In fact, it has been proposed to add nubOrd to Data.Set.
import Data.Set (toList, fromList)
uniquify lst = toList $ fromList lst
I think that unique should return a list of elements that only appear once in the original list; that is, any elements of the orginal list that appear more than once should not be included in the result.
May I suggest an alternative definition, unique_alt:
unique_alt :: [Int] -> [Int]
unique_alt [] = []
unique_alt (x:xs)
| elem x ( unique_alt xs ) = [ y | y <- ( unique_alt xs ), y /= x ]
| otherwise = x : ( unique_alt xs )
Here are some examples that highlight the differences between unique_alt and unqiue:
unique [1,2,1] = [2,1]
unique_alt [1,2,1] = [2]
unique [1,2,1,2] = [1,2]
unique_alt [1,2,1,2] = []
unique [4,2,1,3,2,3] = [4,1,2,3]
unique_alt [4,2,1,3,2,3] = [4,1]
I think this would do it.
unique [] = []
unique (x:xs) = x:unique (filter ((/=) x) xs)
Another way to remove duplicates:
unique :: [Int] -> [Int]
unique xs = [x | (x,y) <- zip xs [0..], x `notElem` (take y xs)]
Algorithm in Haskell to create a unique list:
data Foo = Foo { id_ :: Int
, name_ :: String
} deriving (Show)
alldata = [ Foo 1 "Name"
, Foo 2 "Name"
, Foo 3 "Karl"
, Foo 4 "Karl"
, Foo 5 "Karl"
, Foo 7 "Tim"
, Foo 8 "Tim"
, Foo 9 "Gaby"
, Foo 9 "Name"
]
isolate :: [Foo] -> [Foo]
isolate [] = []
isolate (x:xs) = (fst f) : isolate (snd f)
where
f = foldl helper (x,[]) xs
helper (a,b) y = if name_ x == name_ y
then if id_ x >= id_ y
then (x,b)
else (y,b)
else (a,y:b)
main :: IO ()
main = mapM_ (putStrLn . show) (isolate alldata)
Output:
Foo {id_ = 9, name_ = "Name"}
Foo {id_ = 9, name_ = "Gaby"}
Foo {id_ = 5, name_ = "Karl"}
Foo {id_ = 8, name_ = "Tim"}
A library-based solution:
We can use that style of Haskell programming where all looping and recursion activities are pushed out of user code and into suitable library functions. Said library functions are often optimized in ways that are way beyond the skills of a Haskell beginner.
A way to decompose the problem into two passes goes like this:
produce a second list that is parallel to the input list, but with duplicate elements suitably marked
eliminate elements marked as duplicates from that second list
For the first step, duplicate elements don't need a value at all, so we can use [Maybe a] as the type of the second list. So we need a function of type:
pass1 :: Eq a => [a] -> [Maybe a]
Function pass1 is an example of stateful list traversal where the state is the list (or set) of distinct elements seen so far. For this sort of problem, the library provides the mapAccumL :: (s -> a -> (s, b)) -> s -> [a] -> (s, [b]) function.
Here the mapAccumL function requires, besides the initial state and the input list, a step function argument, of type s -> a -> (s, Maybe a).
If the current element x is not a duplicate, the output of the step function is Just x and x gets added to the current state. If x is a duplicate, the output of the step function is Nothing, and the state is passed unchanged.
Testing under the ghci interpreter:
$ ghci
GHCi, version 8.8.4: https://www.haskell.org/ghc/ :? for help
λ>
λ> stepFn s x = if (elem x s) then (s, Nothing) else (x:s, Just x)
λ>
λ> import Data.List(mapAccumL)
λ>
λ> pass1 xs = mapAccumL stepFn [] xs
λ>
λ> xs2 = snd $ pass1 "abacrba"
λ> xs2
[Just 'a', Just 'b', Nothing, Just 'c', Just 'r', Nothing, Nothing]
λ>
Writing a pass2 function is even easier. To filter out Nothing non-values, we could use:
import Data.Maybe( fromJust, isJust)
pass2 = (map fromJust) . (filter isJust)
but why bother at all ? - as this is precisely what the catMaybes library function does.
λ>
λ> import Data.Maybe(catMaybes)
λ>
λ> catMaybes xs2
"abcr"
λ>
Putting it all together:
Overall, the source code can be written as:
import Data.Maybe(catMaybes)
import Data.List(mapAccumL)
uniques :: (Eq a) => [a] -> [a]
uniques = let stepFn s x = if (elem x s) then (s, Nothing) else (x:s, Just x)
in catMaybes . snd . mapAccumL stepFn []
This code is reasonably compatible with infinite lists, something occasionally referred to as being “laziness-friendly”:
λ>
λ> take 5 $ uniques $ "abacrba" ++ (cycle "abcrf")
"abcrf"
λ>
Efficiency note:
If we anticipate that it is possible to find many distinct elements in the input list and we can have an Ord a instance, the state can be implemented as a Set object rather than a plain list, this without having to alter the overall structure of the solution.
Here's a solution that uses only Prelude functions:
uniqueList theList =
if not (null theList)
then head theList : filter (/= head theList) (uniqueList (tail theList))
else []
I'm assuming this is equivalent to running two or three nested "for" loops (running through each element, then running through each element again to check for other elements with the same value, then removing those other elements) so I'd estimate this is O(n^2) or O(n^3)
Might even be better than reversing a list, nubbing it, then reversing it again, depending on your circumstances.