Comparison of two lists in Haskell - list

I have two lists in Haskell.
Original List: ["hello", "HELLO", "world", "WORLD"]
Only Upper Case List: ["HELLO", "WORLD"]
Could you help me to create a function which should return a list conatining the indexes of intersection of two lists.
I can get the first index by doing this:
let upperIndex = findIndices(==(onlyUpper !! 0)) original
However, this only works for one instance, in this case I can only get index of "HELLO" in the original list, but I want to get all of them.
For this example, the answer should be: [1,3]

Edit: Another version suggested by David Young is
findIndicesIn xs ys = findIndices (`elem` ys) xs
which I prefer to my solution below.
If I understand correctly, you have two lists. Call them xs and ys. You want to find the index in xs of each element in ys. You don't mention what you want to do if the element in ys is not contained in xs, so I am going to choose something reasonable for you. Here it is:
findIndicesIn :: Eq a => [a] -> [a] -> [Maybe Int]
findIndicesIn xs ys = map (`elemIndex` xs) ys
elemIndex :: Eq a => a -> [a] -> Maybe Int finds the index of the given element in the list (comparing with (==)). If the element does not exist, Nothing is returned instead. To find all the indices, we map over each element in ys and try to find it in xs using elemIndex. Section syntax is used instead of flip elemIndex xs or \y -> elemIndex y xs for concision.
The result is a list of Maybe Int representing the possible indices in xs of each element in ys. Note that if you do not keep track of missing elements, the positions of the indices in the resulting list will no longer correspond to the positions of the elements in ys.
You can also write this using fewer points as
findIndicesIn :: Eq a => [a] -> [a] -> [Maybe Int]
findIndicesIn xs = map (`elemIndex` xs)
YMMV on which one is more clear. Both are equivalent. This version is pretty readable IMO. You can go one step further and write
findIndicesIn = map . flip elemIndex
but personally I find this less readable. YMMV again.

let upperIndex original onlyUpper = helper original 0 where helper [] _ = []; helper (x:xs) i = if elem x onlyUpper then i:(helper xs (i+1)) else helper xs (i+1)
Example of use:
Prelude> upperIndex ["hello", "HELLO", "world", "WORLD"] ["HELLO", "WORLD"]
[1,3]

Related

How to get the Index of an element in a list, by not using "list comprehensions"?

I'm new in haskell programming and I try to solve a problem by/not using list comprehensions.
The Problem is to find the index of an element in a list and return a list of the indexes (where the elements in the list was found.)
I already solved the problem by using list comprehensions but now i have some problems to solve the problem without using list comprehensions.
On my recursive way:
I tried to zip a list of [0..(length list)] and the list as it self.
then if the element a equals an element in the list -> make a new list with the first element of the Tupel of the zipped list(my index) and after that search the function on a recursive way until the list is [].
That's my list comprehension (works):
positions :: Eq a => a -> [a] -> [Int]
positions a list = [x | (x,y) <- zip [0..(length list)] list, a == y]
That's my recursive way (not working):
positions' :: Eq a => a -> [a] -> [Int]
positions' _ [] = []
positions' a (x:xs) =
let ((n,m):ns) = zip [0..(length (x:xs))] (x:xs)
in if (a == m) then n:(positions' a xs)
else (positions' a xs)
*sorry I don't know how to highlight words
but ghci says:
*Main> positions' 2 [1,2,3,4,5,6,7,8,8,9,2]
[0,0]
and it should be like that (my list comprehension):
*Main> positions 2 [1,2,3,4,5,6,7,8,8,9,2]
[1,10]
Where is my mistake ?
The problem with your attempt is simply that when you say:
let ((n,m):ns) = zip [0..(length (x:xs))] (x:xs)
then n will always be 0. That's because you are matching (n,m) against the first element of zip [0..(length (x:xs))] (x:xs), which will necessarily always be (0,x).
That's not a problem in itself - but it does mean you have to handle the recursive step properly. The way you have it now, positions _ _, if non-empty, will always have 0 as its first element, because the only way you allow it to find a match is if it's at the head of the list, resulting in an index of 0. That means that your result will always be a list of the correct length, but with all elements 0 - as you're seeing.
The problem isn't with your recursion scheme though, it's to do with the fact that you're not modifying the result to account for the fact that you don't always want 0 added to the front of the result list. Since each recursive call just adds 1 to the index you want to find, all you need to do is map the increment function (+1) over the recursive result:
positions' :: Eq a => a -> [a] -> [Int]
positions' _ [] = []
positions' a (x:xs) =
let ((0,m):ns) = zip [0..(length (x:xs))] (x:xs)
in if (a == m) then 0:(map (+1) (positions' a xs))
else (map (+1) (positions' a xs))
(Note that I've changed your let to be explicit that n will always be 0 - I prefer to be explicit this way but this in itself doesn't change the output.) Since m is always bound to x and ns isn't used at all, we can elide the let, inlining the definition of m:
positions' :: Eq a => a -> [a] -> [Int]
positions' _ [] = []
positions' a (x:xs) =
if a == x
then 0 : map (+1) (positions' a xs)
else map (+1) (positions' a xs)
You could go on to factor out the repeated map (+1) (positions' a xs) if you wanted to.
Incidentally, you didn't need explicit recursion to avoid a list comprehension here. For one, list comprehensions are basically a replacement for uses of map and filter. I was going to write this out explicitly, but I see #WillemVanOnsem has given this as an answer so I will simply refer you to his answer.
Another way, although perhaps not acceptable if you were asked to implement this yourself, would be to just use the built-in elemIndices function, which does exactly what you are trying to implement here.
We can make use of a filter :: (a -> Bool) -> [a] -> [a] and map :: (a -> b) -> [a] -> [b] approach, like:
positions :: Eq a => a -> [a] -> [Int]
positions x = map fst . filter ((x ==) . snd) . zip [0..]
We thus first construct tuples of the form (i, yi), next we filter such that we only retain these tuples for which x == yi, and finally we fetch the first item of these tuples.
For example:
Prelude> positions 'o' "foobaraboof"
[1,2,8,9]
Your
let ((n,m):ns) = zip [0..(length (x:xs))] (x:xs)
is equivalent to
== {- by laziness -}
let ((n,m):ns) = zip [0..] (x:xs)
== {- by definition of zip -}
let ((n,m):ns) = (0,x) : zip [1..] xs
== {- by pattern matching -}
let {(n,m) = (0,x)
; ns = zip [1..] xs }
== {- by pattern matching -}
let { n = 0
; m = x
; ns = zip [1..] xs }
but you never reference ns! So we don't need its binding at all:
positions' a (x:xs) =
let { n = 0 ; m = x } in
if (a == m) then n : (positions' a xs)
else (positions' a xs)
and so, by substitution, you actually have
positions' :: Eq a => a -> [a] -> [Int]
positions' _ [] = []
positions' a (x:xs) =
if (a == x) then 0 : (positions' a xs) -- NB: 0
else (positions' a xs)
And this is why all you ever produce are 0s. But you want to produce the correct index: 0, 1, 2, 3, ....
First, let's tweak your code a little bit further into
positions' :: Eq a => a -> [a] -> [Int]
positions' a = go xs
where
go [] = []
go (x:xs) | a == x = 0 : go xs -- NB: 0
| otherwise = go xs
This is known as a worker/wrapper transform. go is a worker, positions' is a wrapper. There's no need to pass a around from call to call, it doesn't change, and we have access to it anyway. It is in the enclosing scope with respect to the inner function, go. We've also used guards instead of the more verbose and less visually apparent if ... then ... else.
Now we just need to use something -- the correct index value -- instead of 0.
To use it, we must have it first. What is it? It starts as 0, then it is incremented on each step along the input list.
When do we make a step along the input list? At the recursive call:
positions' :: Eq a => a -> [a] -> [Int]
positions' a = go xs 0
where
go [] _ = []
go (x:xs) i | a == x = 0 : go xs (i+1) -- NB: 0
| otherwise = go xs (i+1)
_ as a pattern means we don't care about the argument's value -- it's there but we're not going to use it.
Now all that's left for us to do is to use that i in place of that 0.

Intersection of infinite lists

I know from computability theory that it is possible to take the intersection of two infinite lists, but I can't find a way to express it in Haskell.
The traditional method fails as soon as the second list is infinite, because you spend all your time checking it for a non-matching element in the first list.
Example:
let ones = 1 : ones -- an unending list of 1s
intersect [0,1] ones
This never yields 1, as it never stops checking ones for the element 0.
A successful method needs to ensure that each element of each list will be visited in finite time.
Probably, this will be by iterating through both lists, and spending approximately equal time checking all previously-visited elements in each list against each other.
If possible, I'd like to also have a way to ignore duplicates in the lists, as it is occasionally necessary, but this is not a requirement.
Using the universe package's Cartesian product operator we can write this one-liner:
import Data.Universe.Helpers
isect :: Eq a => [a] -> [a] -> [a]
xs `isect` ys = [x | (x, y) <- xs +*+ ys, x == y]
-- or this, which may do marginally less allocation
xs `isect` ys = foldr ($) [] $ cartesianProduct
(\x y -> if x == y then (x:) else id)
xs ys
Try it in ghci:
> take 10 $ [0,2..] `isect` [0,3..]
[0,6,12,18,24,30,36,42,48,54]
This implementation will not produce any duplicates if the input lists don't have any; but if they do, you can tack on your favorite dup-remover either before or after calling isect. For example, with nub, you might write
> nub ([0,1] `isect` repeat 1)
[1
and then heat up your computer pretty good, since it can never be sure there might not be a 0 in that second list somewhere if it looks deep enough.
This approach is significantly faster than David Fletcher's, produces many fewer duplicates and produces new values much more quickly than Willem Van Onsem's, and doesn't assume the lists are sorted like freestyle's (but is consequently much slower on such lists than freestyle's).
An idea might be to use incrementing bounds. Let is first relax the problem a bit: yielding duplicated values is allowed. In that case you could use:
import Data.List (intersect)
intersectInfinite :: Eq a => [a] -> [a] -> [a]
intersectInfinite = intersectInfinite' 1
where intersectInfinite' n = intersect (take n xs) (take n ys) ++ intersectInfinite' (n+1)
In other words we claim that:
A∩B = A1∩B1 ∪ A2∩B2 ∪ ... ∪ ...
with A1 is a set containing the first i elements of A (yes there is no order in a set, but let's say there is somehow an order). If the set contains less elements then the full set is returned.
If c is in A (at index i) and in B (at index j), c will be emitted in segment (not index) max(i,j).
This will thus always generate an infinite list (with an infinite amount of duplicates) regardless whether the given lists are finite or not. The only exception is when you give it an empty list, in which case it will take forever. Nevertheless we here ensured that every element in the intersection will be emitted at least once.
Making the result finite (if the given lists are finite)
Now we can make our definition better. First we make a more advanced version of take, takeFinite (let's first give a straight-forward, but not very efficient defintion):
takeFinite :: Int -> [a] -> (Bool,[a])
takeFinite _ [] = (True,[])
takeFinite 0 _ = (False,[])
takeFinite n (x:xs) = let (b,t) = takeFinite (n-1) xs in (b,x:t)
Now we can iteratively deepen until both lists have reached the end:
intersectInfinite :: Eq a => [a] -> [a] -> [a]
intersectInfinite = intersectInfinite' 1
intersectInfinite' :: Eq a => Int -> [a] -> [a] -> [a]
intersectInfinite' n xs ys | fa && fb = intersect xs ys
| fa = intersect ys xs
| fb = intersect xs ys
| otherwise = intersect xfa xfb ++ intersectInfinite' (n+1) xs ys
where (fa,xfa) = takeFinite n xs
(fb,xfb) = takeFinite n ys
This will now terminate given both lists are finite, but still produces a lot of duplicates. There are definitely ways to resolve this issue more.
Here's one way. For each x we make a list of maybes which has
Just x only where x appeared in ys. Then we interleave all
these lists.
isect :: Eq a => [a] -> [a] -> [a]
isect xs ys = (catMaybes . foldr interleave [] . map matches) xs
where
matches x = [if x == y then Just x else Nothing | y <- ys]
interleave :: [a] -> [a] -> [a]
interleave [] ys = ys
interleave (x:xs) ys = x : interleave ys xs
Maybe it can be improved using some sort of fairer interleaving -
it's already pretty slow on the example below because (I think)
it's doing an exponential amount of work.
> take 10 (isect [0..] [0,2..])
[0,2,4,6,8,10,12,14,16,18]
If elements in the lists are ordered then you can easy to do that.
intersectOrd :: Ord a => [a] -> [a] -> [a]
intersectOrd [] _ = []
intersectOrd _ [] = []
intersectOrd (x:xs) (y:ys) = case x `compare` y of
EQ -> x : intersectOrd xs ys
LT -> intersectOrd xs (y:ys)
GT -> intersectOrd (x:xs) ys
Here's yet another alternative, leveraging Control.Monad.WeightedSearch
import Control.Monad (guard)
import Control.Applicative
import qualified Control.Monad.WeightedSearch as W
We first define a cost for digging inside the list. Accessing the tail costs 1 unit more. This will ensure a fair scheduling among the two infinite lists.
eachW :: [a] -> W.T Int a
eachW = foldr (\x w -> pure x <|> W.weight 1 w) empty
Then, we simply disregard infinite lists.
intersection :: [Int] -> [Int] -> [Int]
intersection xs ys = W.toList $ do
x <- eachW xs
y <- eachW ys
guard (x==y)
return y
Even better with MonadComprehensions on:
intersection2 :: [Int] -> [Int] -> [Int]
intersection2 xs ys = W.toList [ y | x <- eachW xs, y <- eachW ys, x==y ]
Solution
I ended up using the following implementation; a slight modification of the answer by David Fletcher:
isect :: Eq a => [a] -> [a] -> [a]
isect [] = const [] -- don't bother testing against an empty list
isect xs = catMaybes . diagonal . map matches
where matches y = [if x == y then Just x else Nothing | x <- xs]
This can be augmented with nub to filter out duplicates:
isectUniq :: Eq a => [a] -> [a] -> [a]
isectUniq xs = nub . isect xs
Explanation
Of the line isect xs = catMaybes . diagonal . map matches
(map matches) ys computes a list of lists of comparisons between elements of xs and ys, where the list indices specify the indices in ys and xs respectively: i.e (map matches) ys !! 3 !! 0 would represent the comparison of ys !! 3 with xs !! 0, which would be Nothing if those values differ. If those values are the same, it would be Just that value.
diagonals takes a list of lists and returns a list of lists where the nth output list contains an element each from the first n lists. Another way to conceptualise it is that (diagonals . map matches) ys !! n contains comparisons between elements whose indices in xs and ys sum to n.
diagonal is simply a flat version of diagonals (diagonal = concat diagonals)
Therefore (diagonal . map matches) ys is a list of comparisons between elements of xs and ys, where the elements are approximately sorted by the sum of the indices of the elements of ys and xs being compared; this means that early elements are compared to later elements with the same priority as middle elements being compared to each other.
(catMaybes . diagonal . map matches) ys is a list of only the elements which are in both lists, where the elements are approximately sorted by the sum of the indices of the two elements being compared.
Note
(diagonal . map (catMaybes . matches)) ys does not work: catMaybes . matches only yields when it finds a match, instead of also yielding Nothing on no match, so the interleaving does nothing to distribute the work.
To contrast, in the chosen solution, the interleaving of Nothing and Just values by diagonal means that the program divides its attention between 'searching' for multiple different elements, not waiting for one to succeed; whereas if the Nothing values are removed before interleaving, the program may spend too much time waiting for a fruitless 'search' for a given element to succeed.
Therefore, we would encounter the same problem as in the original question: while one element does not match any elements in the other list, the program will hang; whereas the chosen solution will only hang while no matches are found for any elements in either list.

Inverting List Elements in Haskell

I am trying to invert two-elements lists in xs. For example, invert [[1,2], [5,6,7], [10,20]] will return [[2,1], [5,6,7], [20,10]]. It doesn't invert [5,6,7] because it is a 3 element list.
So I have written this so far:
invert :: [[a]] -> [[a]]
invert [[]] = [[]]
which is just the type declaration and an empty list case. I am new to Haskell so any suggestions on how to implement this problem would be helpful.
Here's one way to do this:
First we define a function to invert one list (if it has two elements; otherwise we return the list unchanged):
invertOne :: [a] -> [a]
invertOne [x, y] = [y, x]
invertOne xs = xs
Next we apply this function to all elements of an input list:
invert :: [[a]] -> [[a]]
invert xs = map invertOne xs
(Because that's exactly what map does: it applies a function to all elements of a list and collects the results in another list.)
Your inert function just operations on each element individually, so you can express it as a map:
invert xs = map go xs
where go = ...
Here go just inverts a single list according to your rules, i.e.:
go [1,2] = [2,1]
go [4,5,6] = [4,5,6]
go [] = []
The definition of go is pretty straight-forward:
go [a,b] = [b,a]
go xs = xs -- go of anything else is just itself
I would do this:
solution ([a,b]:xs) = [b,a] : solution xs
solution (x:xs) = x : solution xs
solution [] = []
This explicitly handles 2-element lists, leaving everything else alone.
Yes, you could do this with map and an auxiliary function, but for a beginner, understanding the recursion behind it all may be valuable.
Note that your 'empty list case' is not empty. length [[]] is 1.
Examine the following solution:
invert :: [[a]] -> [[a]]
invert = fmap conditionallyInvert
where
conditionallyInvert xs
| lengthOfTwo xs = reverse xs
| otherwise = xs
lengthOfTwo (_:_:_) = True
lengthOfTwo _ = False

How can I find the index where one list appears as a sublist of another?

I have been working with Haskell for a little over a week now so I am practicing some functions that might be useful for something. I want to compare two lists recursively. When the first list appears in the second list, I simply want to return the index at where the list starts to match. The index would begin at 0. Here is an example of what I want to execute for clarification:
subList [1,2,3] [4,4,1,2,3,5,6]
the result should be 2
I have attempted to code it:
subList :: [a] -> [a] -> a
subList [] = []
subList (x:xs) = x + 1 (subList xs)
subList xs = [ y:zs | (y,ys) <- select xs, zs <- subList ys]
where select [] = []
select (x:xs) = x
I am receiving an "error on input" and I cannot figure out why my syntax is not working. Any suggestions?
Let's first look at the function signature. You want to take in two lists whose contents can be compared for equality and return an index like so
subList :: Eq a => [a] -> [a] -> Int
So now we go through pattern matching on the arguments. First off, when the second list is empty then there is nothing we can do, so we'll return -1 as an error condition
subList _ [] = -1
Then we look at the recursive step
subList as xxs#(x:xs)
| all (uncurry (==)) $ zip as xxs = 0
| otherwise = 1 + subList as xs
You should be familiar with the guard syntax I've used, although you may not be familiar with the # syntax. Essentially it means that xxs is just a sub-in for if we had used (x:xs).
You may not be familiar with all, uncurry, and possibly zip so let me elaborate on those more. zip has the function signature zip :: [a] -> [b] -> [(a,b)], so it takes two lists and pairs up their elements (and if one list is longer than the other, it just chops off the excess). uncurry is weird so lets just look at (uncurry (==)), its signature is (uncurry (==)) :: Eq a => (a, a) -> Bool, it essentially checks if both the first and second element in the pair are equal. Finally, all will walk over the list and see if the first and second of each pair is equal and return true if that is the case.

Haskell:: how to compare/extract/add each element between lists

I'm trying to get each element from list of lists.
For example, [1,2,3,4] [1,2,3,4]
I need to create a list which is [1+1, 2+2, 3+3, 4+4]
list can be anything. "abcd" "defg" => ["ad","be","cf","dg"]
The thing is that two list can have different length so I can't use zip.
That's one thing and the other thing is comparing.
I need to compare [1,2,3,4] with [1,2,3,4,5,6,7,8]. First list can be longer than the second list, second list might be longer than the first list.
So, if I compare [1,2,3,4] with [1,2,3,4,5,6,7,8], the result should be [5,6,7,8]. Whatever that first list doesn't have, but the second list has, need to be output.
I also CAN NOT USE ANY RECURSIVE FUNCTION. I can only import Data.Char
The thing is that two list can have different length so I can't use zip.
And what should the result be in this case?
CAN NOT USE ANY RECURSIVE FUNCTION
Then it's impossible. There is going to be recursion somewhere, either in the library functions you use (as in other answers), or in functions you write yourself. I suspect you are misunderstanding your task.
For your first question, you can use zipWith:
zipWith f [a1, a2, ...] [b1, b2, ...] == [f a1 b1, f a2 b2, ...]
like, as in your example,
Prelude> zipWith (+) [1 .. 4] [1 .. 4]
[2,4,6,8]
I'm not sure what you need to have in case of lists with different lengths. Standard zip and zipWith just ignore elements from the longer one which don't have a pair. You could leave them unchanged, and write your own analog of zipWith, but it would be something like zipWithRest :: (a -> a -> a) -> [a] -> [a] -> [a] which contradicts to the types of your second example with strings.
For the second, you can use list comprehensions:
Prelude> [e | e <- [1 .. 8], e `notElem` [1 .. 4]]
[5,6,7,8]
It would be O(nm) slow, though.
For your second question (if I'm reading it correctly), a simple filter or list comprehension would suffice:
uniques a b = filter (not . flip elem a) b
I believe you can solve this using a combination of concat and nub http://www.haskell.org/ghc/docs/6.12.1/html/libraries/base-4.2.0.0/Data-List.html#v%3anub which will remove all duplicates ...
nub (concat [[0,1,2,3], [1,2,3,4]])
you will need to remove unique elements from the first list before doing this. ie 0
(using the same functions)
Padding then zipping
You suggested in a comment the examples:
[1,2,3,4] [1,2,3] => [1+1, 2+2, 3+3, 4+0]
"abcd" "abc" => ["aa","bb","cc"," d"]
We can solve those sorts of problems by padding the list with a default value:
padZipWith :: a -> (a -> a -> b) -> [a] -> [a] -> [b]
padZipWith def op xs ys = zipWith op xs' ys' where
maxlen = max (length xs) (length ys)
xs' = take maxlen (xs ++ repeat def)
ys' = take maxlen (ys ++ repeat def)
so for example:
ghci> padZipWith 0 (+) [4,3] [10,100,1000,10000]
[14,103,1000,10000]
ghci> padZipWith ' ' (\x y -> [x,y]) "Hi" "Hello"
["HH","ie"," l"," l"," o"]
(You could rewrite padZipWith to have two separate defaults, one for each list, so you could allow the two lists to have different types, but that doesn't sound super useful.)
General going beyond the common length
For your first question about zipping beyond common length:
How about splitting your lists into an initial segment both have and a tail that only one of them has, using splitAt :: Int -> [a] -> ([a], [a]) from Data.List:
bits xs ys = (frontxs,frontys,backxs,backys) where
(frontxs,backxs) = splitAt (length ys) xs
(frontys,backys) = splitAt (length xs) ys
Example:
ghci> bits "Hello Mum" "Hi everyone else"
("Hello Mum","Hi everyo","","ne else")
You could use that various ways:
larger xs ys = let (frontxs,frontys,backxs,backys) = bits xs ys in
zipWith (\x y -> if x > y then x else y) frontxs frontys ++ backxs ++ backys
needlesslyComplicatedCmpLen xs ys = let (_,_,backxs,backys) = bits xs ys in
if null backxs && null backys then EQ
else if null backxs then LT else GT
-- better written as compare (length xs) (length ys)
so
ghci> larger "Hello Mum" "Hi everyone else"
"Hillveryone else"
ghci> needlesslyComplicatedCmpLen "Hello Mum" "Hi everyone else"
LT
but once you've got the hang of splitAt, take, takeWhile, drop etc, I doubt you'll need to write an auxiliary function like bits.