Haskell: How to update each infinite variable once in list comprehension using infinite lists [duplicate] - list

I wish to produce the Cartesian product of 2 lists in Haskell, but I cannot work out how to do it. The cartesian product gives all combinations of the list elements:
xs = [1,2,3]
ys = [4,5,6]
cartProd :: [a] -> [b] -> [(a,b)]
cartProd xs ys ==> [(1,4),(1,5),(1,6),(2,4),(2,5),(2,6),(3,4),(3,5),(3,6)]
This is not an actual homework question and is not related to any such question, but the way in which this problem is solved may help with one I am stuck on.

This is very easy with list comprehensions. To get the cartesian product of the lists xs and ys, we just need to take the tuple (x,y) for each element x in xs and each element y in ys.
This gives us the following list comprehension:
cartProd xs ys = [(x,y) | x <- xs, y <- ys]

As other answers have noted, using a list comprehension is the most natural way to do this in Haskell.
If you're learning Haskell and want to work on developing intuitions about type classes like Monad, however, it's a fun exercise to figure out why this slightly shorter definition is equivalent:
import Control.Monad (liftM2)
cartProd :: [a] -> [b] -> [(a, b)]
cartProd = liftM2 (,)
You probably wouldn't ever want to write this in real code, but the basic idea is something you'll see in Haskell all the time: we're using liftM2 to lift the non-monadic function (,) into a monad—in this case specifically the list monad.
If this doesn't make any sense or isn't useful, forget it—it's just another way to look at the problem.

If your input lists are of the same type, you can get the cartesian product of an arbitrary number of lists using sequence (using the List monad). This will get you a list of lists instead of a list of tuples:
> sequence [[1,2,3],[4,5,6]]
[[1,4],[1,5],[1,6],[2,4],[2,5],[2,6],[3,4],[3,5],[3,6]]

There is a very elegant way to do this with Applicative Functors:
import Control.Applicative
(,) <$> [1,2,3] <*> [4,5,6]
-- [(1,4),(1,5),(1,6),(2,4),(2,5),(2,6),(3,4),(3,5),(3,6)]
The basic idea is to apply a function on "wrapped" arguments, e.g.
(+) <$> (Just 4) <*> (Just 10)
-- Just 14
In case of lists, the function will be applied to all combinations, so all you have to do is to "tuple" them with (,).
See http://learnyouahaskell.com/functors-applicative-functors-and-monoids#applicative-functors or (more theoretical) http://www.soi.city.ac.uk/~ross/papers/Applicative.pdf for details.

Other answers assume that the two input lists are finite. Frequently, idiomatic Haskell code includes infinite lists, and so it is worthwhile commenting briefly on how to produce an infinite Cartesian product in case that is needed.
The standard approach is to use diagonalization; writing the one input along the top and the other input along the left, we could write a two-dimensional table that contained the full Cartesian product like this:
1 2 3 4 ...
a a1 a2 a3 a4 ...
b b1 b2 b3 b4 ...
c c1 c2 c3 c4 ...
d d1 d2 d3 d4 ...
. . . . . .
. . . . . .
. . . . . .
Of course, working across any single row will give us infinitely elements before it reaches the next row; similarly going column-wise would be disastrous. But we can go along diagonals that go down and to the left, starting again in a bit farther to the right each time we reach the edge of the grid.
a1
a2
b1
a3
b2
c1
a4
b3
c2
d1
...and so on. In order, this would give us:
a1 a2 b1 a3 b2 c1 a4 b3 c2 d1 ...
To code this in Haskell, we can first write the version that produces the two-dimensional table:
cartesian2d :: [a] -> [b] -> [[(a, b)]]
cartesian2d as bs = [[(a, b) | a <- as] | b <- bs]
An inefficient method of diagonalizing is to then iterate first along diagonals and then along depth of each diagonal, pulling out the appropriate element each time. For simplicity of explanation, I'll assume that both the input lists are infinite, so we don't have to mess around with bounds checking.
diagonalBad :: [[a]] -> [a]
diagonalBad xs =
[ xs !! row !! col
| diagonal <- [0..]
, depth <- [0..diagonal]
, let row = depth
col = diagonal - depth
]
This implementation is a bit unfortunate: the repeated list indexing operation !! gets more and more expensive as we go, giving quite a bad asymptotic performance. A more efficient implementation will take the above idea but implement it using zippers. So, we'll divide our infinite grid into three shapes like this:
a1 a2 / a3 a4 ...
/
/
b1 / b2 b3 b4 ...
/
/
/
c1 c2 c3 c4 ...
---------------------------------
d1 d2 d3 d4 ...
. . . . .
. . . . .
. . . . .
The top left triangle will be the bits we've already emitted; the top right quadrilateral will be rows that have been partially emitted but will still contribute to the result; and the bottom rectangle will be rows that we have not yet started emitting. To begin with, the upper triangle and upper quadrilateral will be empty, and the bottom rectangle will be the entire grid. On each step, we can emit the first element of each row in the upper quadrilateral (essentially moving the slanted line over by one), then add one new row from the bottom rectangle to the upper quadrilateral (essentially moving the horizontal line down by one).
diagonal :: [[a]] -> [a]
diagonal = go [] where
go upper lower = [h | h:_ <- upper] ++ case lower of
[] -> concat (transpose upper')
row:lower' -> go (row:upper') lower'
where upper' = [t | _:t <- upper]
Although this looks a bit more complicated, it is significantly more efficient. It also handles the bounds checking that we punted on in the simpler version.
But you shouldn't write all this code yourself, of course! Instead, you should use the universe package. In Data.Universe.Helpers, there is (+*+), which packages together the above cartesian2d and diagonal functions to give just the Cartesian product operation:
Data.Universe.Helpers> "abcd" +*+ [1..4]
[('a',1),('a',2),('b',1),('a',3),('b',2),('c',1),('a',4),('b',3),('c',2),('d',1),('b',4),('c',3),('d',2),('c',4),('d',3),('d',4)]
You can also see the diagonals themselves if that structure becomes useful:
Data.Universe.Helpers> mapM_ print . diagonals $ cartesian2d "abcd" [1..4]
[('a',1)]
[('a',2),('b',1)]
[('a',3),('b',2),('c',1)]
[('a',4),('b',3),('c',2),('d',1)]
[('b',4),('c',3),('d',2)]
[('c',4),('d',3)]
[('d',4)]
If you have many lists to product together, iterating (+*+) can unfairly bias certain lists; you can use choices :: [[a]] -> [[a]] for your n-dimensional Cartesian product needs.

Yet another way to accomplish this is using applicatives:
import Control.Applicative
cartProd :: [a] -> [b] -> [(a,b)]
cartProd xs ys = (,) <$> xs <*> ys

Yet another way, using the do notation:
cartProd :: [a] -> [b] -> [(a,b)]
cartProd xs ys = do x <- xs
y <- ys
return (x,y)

The right way is using list comprehensions, as other people have already pointed out, but if you wanted to do it without using list comprehensions for any reason, then you could do this:
cartProd :: [a] -> [b] -> [(a,b)]
cartProd xs [] = []
cartProd [] ys = []
cartProd (x:xs) ys = map (\y -> (x,y)) ys ++ cartProd xs ys

Well, one very easy way to do this would be with list comprehensions:
cartProd :: [a] -> [b] -> [(a, b)]
cartProd xs ys = [(x, y) | x <- xs, y <- ys]
Which I suppose is how I would do this, although I'm not a Haskell expert (by any means).

something like:
cartProd x y = [(a,b) | a <- x, b <- y]

It's a job for sequenceing. A monadic implementation of it could be:
cartesian :: [[a]] -> [[a]]
cartesian [] = return []
cartesian (x:xs) = x >>= \x' -> cartesian xs >>= \xs' -> return (x':xs')
*Main> cartesian [[1,2,3],[4,5,6]]
[[1,4],[1,5],[1,6],[2,4],[2,5],[2,6],[3,4],[3,5],[3,6]]
As you may notice, the above resembles the implementation of map by pure functions but in monadic type. Accordingly you can simplify it down to
cartesian :: [[a]] -> [[a]]
cartesian = mapM id
*Main> cartesian [[1,2,3],[4,5,6]]
[[1,4],[1,5],[1,6],[2,4],[2,5],[2,6],[3,4],[3,5],[3,6]]

Just adding one more way for the enthusiast, using only recursive pattern matching.
cartProd :: [a]->[b]->[(a,b)]
cartProd _ []=[]
cartProd [] _ = []
cartProd (x:xs) (y:ys) = [(x,y)] ++ cartProd [x] ys ++ cartProd xs ys ++ cartProd xs [y]

Here is my implementation of n-ary cartesian product:
crossProduct :: [[a]] -> [[a]]
crossProduct (axis:[]) = [ [v] | v <- axis ]
crossProduct (axis:rest) = [ v:r | v <- axis, r <- crossProduct rest ]

Recursive pattern matching with out List comprehension
crossProduct [] b=[]
crossProduct (x : xs) b= [(x,b)] ++ crossProduct xs b
cartProd _ []=[]
cartProd x (u:uv) = crossProduct x u ++ cartProd x uv

If all you want is the Cartesian product, any of the above answers will do. Usually, though, the Cartesian product is a means to an end. Usually, this means binding the elements of the tuple to some variables, x and y, then calling some function f x y on them. If this is the plan anyway, you might be better off just going full monad:
do
x <- [1, 2]
y <- [6, 8, 10]
pure $ f x y
This will produce the list [f 1 6, f 1 8, f 1 10, f 2 6, f 2 8, f 2 10].

Related

Generate list of Ints in Haskell by adding Ints from a pattern list

I'm playing around with Haskell, mostly trying to learn some new techniques to solve problems. Without any real application in mind I came to think about an interesting thing I can't find a satisfying solution to. Maybe someone has any better ideas?
The problem:
Let's say we want to generate a list of Ints using a starting value and a list of Ints, representing the pattern of numbers to be added in the specified order. So the first value is given, then second value should be the starting value plus the first value in the list, the third that value plus the second value of the pattern, and so on. When the pattern ends, it should start over.
For example: Say we have a starting value v and a pattern [x,y], we'd like the list [v,v+x,v+x+y,v+2x+y,v+2x+2y, ...]. In other words, with a two-valued pattern, next value is created by alternatingly adding x and y to the number last calculated.
If the pattern is short enough (2-3 values?), one could generate separate lists:
[v,v,v,...]
[0,x,x,2x,2x,3x, ...]
[0,0,y,y,2y,2y,...]
and then zip them together with addition. However, as soon as the pattern is longer this gets pretty tedious. My best attempt at a solution would be something like this:
generateLstByPattern :: Int -> [Int] -> [Int]
generateLstByPattern v pattern = v : (recGen v pattern)
where
recGen :: Int -> [Int] -> [Int]
recGen lastN (x:[]) = (lastN + x) : (recGen (lastN + x) pattern)
recGen lastN (x:xs) = (lastN + x) : (recGen (lastN + x) xs)
It works as intended - but I have a feeling there is a bit more elegant Haskell solution somewhere (there almost always is!). What do you think? Maybe a cool list-comprehension? A higher-order function I've forgotten about?
Separate the concerns. First look a just a list to process once. Get that working, test it. Hint: “going through the list elements with some accumulator” is in general a good fit for a fold.
Then all that's left to is to repeat the list of inputs and feed it into the pass-once function. Conveniently, there's a standard function for that purpose. Just make sure your once-processor is lazy enough to handle the infinite list input.
What you describe is
foo :: Num a => a -> [a] -> [a]
foo v pattern = scanl (+) v (cycle pattern)
which would normally be written even as just
foo :: Num a => a -> [a] -> [a]
foo v = scanl (+) v . cycle
scanl (+) v xs is the standard way to calculate the partial sums of (v:xs), and cycle is the standard way to repeat a given list cyclically. This is what you describe.
This works for a pattern list of any positive length, as you wanted.
Your way of generating it is inventive, but it's almost too clever for its own good (i.e. it seems overly complicated). It can be expressed with some list comprehensions, as
foo v pat =
let -- the lists, as you describe them:
lists = repeat v :
[ replicate i 0 ++
[ y | x <- [p, p+p ..]
, y <- map (const x) pat ]
| (p,i) <- zip pat [1..] ]
in
-- OK, so what do we do with that? How do we zipWith
-- over an arbitrary amount of lists?
-- with a fold!
foldr (zipWith (+)) (repeat 0) lists
map (const x) pat is a "clever" way of writing replicate (length pat) x. It can be further shortened to x <$ pat since (<$) x xs == map (const x) xs by definition. It might seem obfuscated, until you've become accustomed to it, and then it seems clear and obvious. :)
Surprised noone's mentioned the silly way yet.
mylist x xs = x : zipWith (+) (mylist x xs) (cycle xs)
(If you squint a bit you can see the connection to scanl answer).
When it is about generating series my first approach would be iterate or unfoldr. iterate is for simple series and unfoldr is for those who carry kind of state but without using any State monad.
In this particular case I think unfoldr is ideal.
series :: Int -> [Int] -> [Int]
series s [x,y] = unfoldr (\(f,s) -> Just (f*x + s*y, (s+1,f))) (s,0)
λ> take 10 $ series 1 [1,1]
[1,2,3,4,5,6,7,8,9,10]
λ> take 10 $ series 3 [1,1]
[3,4,5,6,7,8,9,10,11,12]
λ> take 10 $ series 0 [1,2]
[0,1,3,4,6,7,9,10,12,13]
It is probably better to implement the lists separately, for example the list with x can be implement with:
xseq :: (Enum a, Num a) => a -> [a]
xseq x = 0 : ([x, x+x ..] >>= replicate 2)
Whereas the sequence for y can be implemented as:
yseq :: (Enum a, Num a) => a -> [a]
yseq y = [0,y ..] >>= replicate 2
Then you can use zipWith :: (a -> b -> c) -> [a] -> [b] -> [c] to add the two lists together and add v to it:
mylist :: (Enum a, Num a) => a -> a -> a -> [a]
mylist v x y = zipWith ((+) . (v +)) (xseq x) (yseq y)
So for v = 1, x = 2, and y = 3, we obtain:
Prelude> take 10 (mylist 1 2 3)
[1,3,6,8,11,13,16,18,21,23]
An alternative is to see as pattern that we each time first add x and then y. We thus can make an infinite list [(x+), (y+)], and use scanl :: (b -> a -> b) -> b -> [a] -> [b] to each time apply one of the functions and yield the intermediate result:
mylist :: Num a => a -> a -> a -> [a]
mylist v x y = scanl (flip ($)) v (cycle [(x+), (y+)])
this yields the same result:
Prelude> take 10 $ mylist 1 2 3
[1,3,6,8,11,13,16,18,21,23]
Now the only thing left to do is to generalize this to a list. So for example if the list of additions is given, then you can impelement this as:
mylist :: Num a => [a] -> [a]
mylist v xs = scanl (flip ($)) v (cycle (map (+) xs))
or for a list of functions:
mylist :: Num a => [a -> a] -> [a]
mylist v xs = scanl (flip ($)) v (cycle (xs))

Building an exponentially sized list in haskell

I have two functions which only do something if C is a specific pattern.
Each function outputs a list of C.
My goal is, given [C], I want to get all possibilities of calling f1 and f2 on the list while leaving the rest unchanged. For example:
suppose the list of C is:
c1
c2 --matches the pattern
c3
then I want a list of two lists
[[c1] ++ (f1 c2) ++ [c3],[c1] ++ (f2 c2) ++ [c3]]
However, if I have
c1
c2 --matches the pattern
c3 --matches the pattern
Then we should have 4 lists because we want all combinations of calling f1 and f2.
So it would be:
[(f1 c1) ++ (f1 c2) ++ [c3], (f2 c1) ++ (f2 c2) ++ [c3],
(f1 c1) ++ (f2 c2) ++ [c3], (f2 c1) ++ (f1 c2) ++ [c3]]
currently, my code is structured roughly in the following way:
f1 :: C -> [C]
f2 :: C -> [C]
combine :: [C] -> [[C]]
combine my_pattern:xs = ?
combine (x:xs) = ?
combine [] = []
where first_set = (f1 my_pattern)
second_set = (f2 my_pattern)
Could someone give intuition on how I could fill the remaining part? Is there any functions from Data.List that can be useful? I looked at the documentation, but wasn't able to immediately notice which one could be helpful.
The other answers seem very complicated to me. In this answer I will expand on my comment: this is just a foldMap combining the nondeterminism monad (lists!) with the sequence monoid (lists!).
First write a thing that works on a single element of the list:
singleElement x
| matchesThePattern x = [f1 x, f2 x]
| otherwise = [[x]]
Then apply it to each element:
import Data.Monoid
combine = foldMap (Ap . singleElement)
That's it. That's the whole code.
For example, suppose we want to repeat each letter either 2 or 3 times, i.e. x -> xx or xxx, and all other characters to stay the same.
singleElement x
| 'a' <= x && x <= 'z' = [[x, x], [x, x, x]]
| otherwise = [[x]]
Then we can try it in ghci:
> combine "123def"
Ap {getAp = ["123ddeeff","123ddeefff","123ddeeeff","123ddeeefff","123dddeeff","123dddeefff","123dddeeeff","123dddeeefff"]}
Pick a better name than singleElement in your own code, of course.
You must have
applicable_f1 :: C -> Bool
applicable_f2 :: C -> Bool
defined somehow. Then,
combinations :: [C] -> [[C]]
combinations cs = map concat . sequence $
[ concat $ [ [ [c] | not (applicable_f1 c || applicable_f2 c)]
, [ f1 c | applicable_f1 c]
, [ f2 c | applicable_f2 c] ]
| c <- cs]
My approach would be to
Solve the problem for the element in the list you're currently looking at (x or my_pattern). This means generating one or more new lists.
Solve the problem for the rest of the list (xs). This will give you back a list of lists ([[C]]).
Combine the two solutions. If you have multiple lists generated from step 1, each of these lists ([C]) will combine with each list (also [C]) in the list of lists ([[C]]) from step 2.
I have two possible approaches.
It isn't clear to me how much help you are looking for, so I've left my answers somewhat "spoiler free." Ask for clarification or more details if you need it.
List comprehension
Without delving into the weeds of the Applicative or Traversable typeclasses, you can accomplish what you want with a list comprehension.
Let's consider the case where your pattern is matched. I would write a list comprehension as follows:
[ x ++ y | x <- _, y <- _] :: [[C]]
-- this means
-- x :: [C]
-- y :: [C]
-- _ :: [[C]]
This list comprehension creates a list of lists. x is what is being prepended, so it would make sense for it to be coming from the application of the functions f1 and f2. y is the tail end of each resulting list. I'll leave you to figure out what it might be.
The non matching case is simpler than this, and can be written like
[ x : y | y <- _] :: [[C]]
-- note that x is not local to the list comprehension
-- y :: [C]
-- _ :: [[C]]
although this really is just a special case of the above list comprehension.
Applicative
Another way of approaching this problem would be by using the Applicative instance of [a].
Let's examine the function (<*>) under the list Applicative instance.
-- this is the type when specialized to lists
(<*>) :: [a -> b] -> [a] -> [b]
This function has a kind of strange type signature. It takes a list of functions, and a list, then returns you another list. It has the effect of applying each function a -> b to each element of [a] in order.
>>> [(+1), (+2)] <*> [1,2,3]
-- [2,3,4] comes from (+1)
-- [3,4,5] comes from (+2)
[2,3,4,3,4,5]
We want to get out [[C]], not [C], so if we want to use (<*>) we can specialize its type more to
(<*>) :: [a -> [C]] -> [a] -> [[C]]
To avoid confusion, I recommend picking a = [C], which gives
(<*>) :: [[C] -> [C]] -> [[C]] -> [[C]]
Your list of functions should be prepending the right elements onto the lists you're generating. The second argument should be the lists returned by a recursive call.

Creating a lists of lists with new element in each position

i'm new in the haskell world and i'd like to know how to insert a value in each position of a list in haskell, and return a lists of sublists containing the value in each position. For example:
insert' :: a -> [a] -> [[a]]
insert' a [] = [[a]]
insert' a list = ??
To get something like:
insert' 7 [1,2,3] = [[7,1,2,3],[1,7,2,3],[1,2,7,3],[1,2,3,7]]
insert' :: a -> [a] -> [[a]]
insert' y [] = [[y]]
insert' y xss#(x:xs) = (y : xss) : map (x :) (insert' y xs)
While the empty list case comes natural, let's take a look at insert' y xss#(x:xs). We essentially have two cases we need to cover:
y appears in front of x. Then we can just use y : xss.
y appears somewhere after x. We therefore just insert it in the rest of our list and make sure that x is the first element with map (x:).
Although #delta's answer is definitely more elegant, here a solution with difference lists. If we insert an element x on every location of list ys = [y1,y2,...,yn], the first time we will insert it as head, so that means we can construct x : ys.
. For the second element of the resulting list, we want to construct a list [y1,x,y2,...,yn]. We can do this like y1 : x : y2s. The next lists will all have a structure y1 : ....
The question is: how can we write a recursive structure that keeps track of the fact that we want to put elements in the head. We can use a function for that: we start with a function id. If we now call id (x:ys) then we will of course generate the list (x:ys).
We can however, based on the id function, construct a new function id2 = \z -> id (y1:z). This function will thus put y1 in the head of the list and then add the list with which we call id2 as tail. Next we can construct id3 = \z -> id2 (y2:z). This will put y1 and y2 as first elements followed by the tail z.
So we can put this into the following recursive format:
insert' :: a -> [a] -> [[a]]
insert' x = go id
where go d [] = [d [x]]
go d ys#(yh:yt) = (d (x : ys)) : go (d . (yh :)) yt
So we redirect insert' to go where the initial difference list is simply the id function. Each time we check if we have reached the end of the given list. If that is the case, we return the basecase: we call [x] (as tail) on the difference list, and thus construct a list where we append x as last element.
In case we have not yet reached the last element, we will first emit d (x : ys): we prepend x to the list and provide this as argument to the difference list d. d will prepend y1 : y2 : ... : yk up to the point where we insert x. Furthermore we call recursively go (d . (yh :)) yt on the tail of the list: we thus construct a new difference list, wehere we insert (yh :) as tail of the list. We thus produce a new function with one argument: the tail after the yh element.
This function produces the expected results:
*Main> insert' 4 []
[[4]]
*Main> insert' 4 [1,2,5]
[[4,1,2,5],[1,4,2,5],[1,2,4,5],[1,2,5,4]]
*Main> insert' 7 [1,2,3]
[[7,1,2,3],[1,7,2,3],[1,2,7,3],[1,2,3,7]]
You may also do as follows;
import Data.List
spread :: a -> [a] -> [[a]]
spread x xs = zipWith (++) (inits xs) ((x:) <$> tails xs)
*Main> spread 7 [1,2,3]
[[7,1,2,3],[1,7,2,3],[1,2,7,3],[1,2,3,7]]
*Main> spread 7 []
[[7]]
So this is about three stages.
(x:) <$> tails xs is all about applying the (x:) function to all elements of tails xs function. So tails [1,2,3] would return [[1,2,3],[2,3],[3],[]] and we are to apply an fmap which is designated by <$> in the inline form. This is going to be the third argument of the zipWith function.
(inits xs) which would return [[],[1],[1,2],[1,2,3]], is going to be the second argument to zipWith.
zipWith (++) is obviously will zip two list of lists by concatenating the list elements.
So we may also express the same functionality with applicative function functors as follows;
spread :: a -> [a] -> [[a]]
spread x = zipWith (++) <$> inits <*> fmap (x:) . tails
In this case we fmap the zipWith (++) function with type [[a]] -> [[a]] -> [[a]] over inits and then apply it over to fmap (x:) . tails.
It could get more pointfree but becomes more complicated to read through (at least for me). In my opinion this is as best as it gets.

Intersection of infinite lists

I know from computability theory that it is possible to take the intersection of two infinite lists, but I can't find a way to express it in Haskell.
The traditional method fails as soon as the second list is infinite, because you spend all your time checking it for a non-matching element in the first list.
Example:
let ones = 1 : ones -- an unending list of 1s
intersect [0,1] ones
This never yields 1, as it never stops checking ones for the element 0.
A successful method needs to ensure that each element of each list will be visited in finite time.
Probably, this will be by iterating through both lists, and spending approximately equal time checking all previously-visited elements in each list against each other.
If possible, I'd like to also have a way to ignore duplicates in the lists, as it is occasionally necessary, but this is not a requirement.
Using the universe package's Cartesian product operator we can write this one-liner:
import Data.Universe.Helpers
isect :: Eq a => [a] -> [a] -> [a]
xs `isect` ys = [x | (x, y) <- xs +*+ ys, x == y]
-- or this, which may do marginally less allocation
xs `isect` ys = foldr ($) [] $ cartesianProduct
(\x y -> if x == y then (x:) else id)
xs ys
Try it in ghci:
> take 10 $ [0,2..] `isect` [0,3..]
[0,6,12,18,24,30,36,42,48,54]
This implementation will not produce any duplicates if the input lists don't have any; but if they do, you can tack on your favorite dup-remover either before or after calling isect. For example, with nub, you might write
> nub ([0,1] `isect` repeat 1)
[1
and then heat up your computer pretty good, since it can never be sure there might not be a 0 in that second list somewhere if it looks deep enough.
This approach is significantly faster than David Fletcher's, produces many fewer duplicates and produces new values much more quickly than Willem Van Onsem's, and doesn't assume the lists are sorted like freestyle's (but is consequently much slower on such lists than freestyle's).
An idea might be to use incrementing bounds. Let is first relax the problem a bit: yielding duplicated values is allowed. In that case you could use:
import Data.List (intersect)
intersectInfinite :: Eq a => [a] -> [a] -> [a]
intersectInfinite = intersectInfinite' 1
where intersectInfinite' n = intersect (take n xs) (take n ys) ++ intersectInfinite' (n+1)
In other words we claim that:
A∩B = A1∩B1 ∪ A2∩B2 ∪ ... ∪ ...
with A1 is a set containing the first i elements of A (yes there is no order in a set, but let's say there is somehow an order). If the set contains less elements then the full set is returned.
If c is in A (at index i) and in B (at index j), c will be emitted in segment (not index) max(i,j).
This will thus always generate an infinite list (with an infinite amount of duplicates) regardless whether the given lists are finite or not. The only exception is when you give it an empty list, in which case it will take forever. Nevertheless we here ensured that every element in the intersection will be emitted at least once.
Making the result finite (if the given lists are finite)
Now we can make our definition better. First we make a more advanced version of take, takeFinite (let's first give a straight-forward, but not very efficient defintion):
takeFinite :: Int -> [a] -> (Bool,[a])
takeFinite _ [] = (True,[])
takeFinite 0 _ = (False,[])
takeFinite n (x:xs) = let (b,t) = takeFinite (n-1) xs in (b,x:t)
Now we can iteratively deepen until both lists have reached the end:
intersectInfinite :: Eq a => [a] -> [a] -> [a]
intersectInfinite = intersectInfinite' 1
intersectInfinite' :: Eq a => Int -> [a] -> [a] -> [a]
intersectInfinite' n xs ys | fa && fb = intersect xs ys
| fa = intersect ys xs
| fb = intersect xs ys
| otherwise = intersect xfa xfb ++ intersectInfinite' (n+1) xs ys
where (fa,xfa) = takeFinite n xs
(fb,xfb) = takeFinite n ys
This will now terminate given both lists are finite, but still produces a lot of duplicates. There are definitely ways to resolve this issue more.
Here's one way. For each x we make a list of maybes which has
Just x only where x appeared in ys. Then we interleave all
these lists.
isect :: Eq a => [a] -> [a] -> [a]
isect xs ys = (catMaybes . foldr interleave [] . map matches) xs
where
matches x = [if x == y then Just x else Nothing | y <- ys]
interleave :: [a] -> [a] -> [a]
interleave [] ys = ys
interleave (x:xs) ys = x : interleave ys xs
Maybe it can be improved using some sort of fairer interleaving -
it's already pretty slow on the example below because (I think)
it's doing an exponential amount of work.
> take 10 (isect [0..] [0,2..])
[0,2,4,6,8,10,12,14,16,18]
If elements in the lists are ordered then you can easy to do that.
intersectOrd :: Ord a => [a] -> [a] -> [a]
intersectOrd [] _ = []
intersectOrd _ [] = []
intersectOrd (x:xs) (y:ys) = case x `compare` y of
EQ -> x : intersectOrd xs ys
LT -> intersectOrd xs (y:ys)
GT -> intersectOrd (x:xs) ys
Here's yet another alternative, leveraging Control.Monad.WeightedSearch
import Control.Monad (guard)
import Control.Applicative
import qualified Control.Monad.WeightedSearch as W
We first define a cost for digging inside the list. Accessing the tail costs 1 unit more. This will ensure a fair scheduling among the two infinite lists.
eachW :: [a] -> W.T Int a
eachW = foldr (\x w -> pure x <|> W.weight 1 w) empty
Then, we simply disregard infinite lists.
intersection :: [Int] -> [Int] -> [Int]
intersection xs ys = W.toList $ do
x <- eachW xs
y <- eachW ys
guard (x==y)
return y
Even better with MonadComprehensions on:
intersection2 :: [Int] -> [Int] -> [Int]
intersection2 xs ys = W.toList [ y | x <- eachW xs, y <- eachW ys, x==y ]
Solution
I ended up using the following implementation; a slight modification of the answer by David Fletcher:
isect :: Eq a => [a] -> [a] -> [a]
isect [] = const [] -- don't bother testing against an empty list
isect xs = catMaybes . diagonal . map matches
where matches y = [if x == y then Just x else Nothing | x <- xs]
This can be augmented with nub to filter out duplicates:
isectUniq :: Eq a => [a] -> [a] -> [a]
isectUniq xs = nub . isect xs
Explanation
Of the line isect xs = catMaybes . diagonal . map matches
(map matches) ys computes a list of lists of comparisons between elements of xs and ys, where the list indices specify the indices in ys and xs respectively: i.e (map matches) ys !! 3 !! 0 would represent the comparison of ys !! 3 with xs !! 0, which would be Nothing if those values differ. If those values are the same, it would be Just that value.
diagonals takes a list of lists and returns a list of lists where the nth output list contains an element each from the first n lists. Another way to conceptualise it is that (diagonals . map matches) ys !! n contains comparisons between elements whose indices in xs and ys sum to n.
diagonal is simply a flat version of diagonals (diagonal = concat diagonals)
Therefore (diagonal . map matches) ys is a list of comparisons between elements of xs and ys, where the elements are approximately sorted by the sum of the indices of the elements of ys and xs being compared; this means that early elements are compared to later elements with the same priority as middle elements being compared to each other.
(catMaybes . diagonal . map matches) ys is a list of only the elements which are in both lists, where the elements are approximately sorted by the sum of the indices of the two elements being compared.
Note
(diagonal . map (catMaybes . matches)) ys does not work: catMaybes . matches only yields when it finds a match, instead of also yielding Nothing on no match, so the interleaving does nothing to distribute the work.
To contrast, in the chosen solution, the interleaving of Nothing and Just values by diagonal means that the program divides its attention between 'searching' for multiple different elements, not waiting for one to succeed; whereas if the Nothing values are removed before interleaving, the program may spend too much time waiting for a fruitless 'search' for a given element to succeed.
Therefore, we would encounter the same problem as in the original question: while one element does not match any elements in the other list, the program will hang; whereas the chosen solution will only hang while no matches are found for any elements in either list.

Help on a list manipulating list

Hi I am a newbie in Haskell.
I am trying to do a simple task.
test :: (RealFloat a) => a -> a -> [a]
test xs ys= [ w : h: [] | w <- xs, h <- ys]
I am getting an error here. (with out a doubt)
In this task, I am simply trying to bind two lists (ex: test [12.12] [14.14])
and hopefully return a new combined list (ex: [12.12,14.14])
thanks for your help
Your signature is wrong. Try:
test xs ys = ...
then in ghci:
> :t test
test :: [t] -> [t] -> [[t]]
You need two arguments, both are lists, not two arguments of single elements.
Drakosha is correct. List concatenation already has an operator in Haskell.
test :: (RealFloat a) => [a] -> [a] -> [a]
test xs ys= xs ++ ys
You probably don't want to use a list comprehension here, unless you want to extract every element in your first and second list and do something with them. For example, a Cartesian Product:
list1 = [1.0,1.1,1.2] :: [Double]
list2 = [2.0,2.1,2.2] :: [Double]
testComps xs ys = [(x,y) | x <- xs, y <- ys]
Or addition:
testComps2 xs ys = [ x + y | x <- xs, y <- ys]
Or even creating lists:
testComps3 xs ys = [x : y : [] | x <- xs, y <- ys]
In GHCi, this will yield the following:
*Main> testComps list1 list2
[(1.0,2.0),(1.0,2.1),(1.0,2.2),(1.1,2.0),(1.1,2.1),(1.1,2.2),(1.2,2.0),(1.2,2.1)
,(1.2,2.2)]
*Main> testComps2 list1 list2
[3.0,3.1,3.2,3.1,3.2,3.3000000000000003,3.2,3.3,3.4000000000000004]
*Main> testComps3 list1 list2
[[1.0,2.0],[1.0,2.1],[1.0,2.2],[1.1,2.0],[1.1,2.1],[1.1,2.2],[1.2,2.0],[1.2,2.1]
,[1.2,2.2]]
The weird results in testComps2 is, of course, normal cruft when you're dealing with floating-point numbers. In the real world you'd compensate for this by rounding.
Another problem you'll run into is the difference between (++) and (:). Simply put, (:) tacks individual items onto a list, whereas (++) concatenates two lists.
You need list concatenation:
[12.12] ++ [14.14]
=> [12.12,14.14]