Haskell, pulling both lists from a tuple of lists

I'm making a "merge sort" using 2 helper functions. The first helper function splits the lists into a tuple of lists putting odd and even indexes in the separate lists.
Example: [1,2,3,4,5,6]
Returns: ([1,3,5],[2,4,6])
The second helper function assumes that the lists are sorted and merges them.
I'm to implement a merge sort of an unsorted list using these 2 functions.
I have this terribly inefficient piece that essentially splits (length - 1) * 2 times and merges the list (length - 1) times.
sort length (z:zs)
  | length == 0 = (z:zs)
  | otherwise = sort (length - 1) (merge (fst (split (z:zs))) (snd (split (z:zs))))
I'm calling split twice to get the same info that was done on the first split, and I'm not recursing far enough (where each list is just a singleton and then merge them all).
How can I recurse to the singleton case and pull out both elements of the tuple at the same time?
Thank you in advance for any help you can offer.

You can use uncurry to convert merge into a function that takes a pair, and pass it the result of split (z:zs):
sort (length - 1) $ uncurry merge $ split (z:zs)
The uncurry function transforms a function of type a -> b -> c into a function of type (a, b) -> c. In your case merge has type [a] -> [a] -> [a], so uncurry merge has type ([a], [a]) -> [a], and ([a], [a]) is the return type of split.
Alternatively you can simply use a let or a where clause to refer to the result of split:
let (left, right) = split (z:zs)
in sort (length - 1) $ merge left right
which is an improved version of:
let res = split (z:zs)
in sort (length - 1) $ merge (fst res) (snd res)
As a side note your sort function is incorrect. Your definition is like:
sort length (z:zs) = ...
However, this matches only non-empty lists. It's also pretty useless to consider the case length == 0 when it can never occur.
Your definition of sort ought to consider the empty case too:
sort _ [] = []
sort length (z:zs) = ...
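To recurse all the way down to singletons, the counter argument can be dropped entirely. The following is a minimal sketch, not the asker's actual code: split and merge here are hypothetical stand-ins written to match the behaviour described in the question (odd/even positions, and merging two sorted lists), and mergeSort is an assumed name.

```haskell
-- Hypothetical stand-in for the first helper: elements at odd and even
-- positions go to separate lists, e.g. [1,2,3,4,5,6] -> ([1,3,5],[2,4,6]).
split :: [a] -> ([a], [a])
split (x:y:rest) = (x:xs, y:ys)
  where (xs, ys) = split rest
split xs = (xs, [])

-- Hypothetical stand-in for the second helper: merges two sorted lists.
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys)
  | x <= y    = x : merge xs (y:ys)
  | otherwise = y : merge (x:xs) ys

-- Lists of length 0 or 1 are already sorted; anything longer is split
-- once, both halves are sorted recursively, and the results are merged.
mergeSort :: Ord a => [a] -> [a]
mergeSort []  = []
mergeSort [x] = [x]
mergeSort xs  = merge (mergeSort left) (mergeSort right)
  where (left, right) = split xs
```

The where clause binds both components of the tuple with a single call to split, which is the same idea as the let (left, right) = ... form above.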

Related

Using foldr on a list of infinite lists

I am trying to write a function in Haskell, that does the following:
You input a list of integers, for these integers, using map, there is a function applied to them that returns an infinite list of these integers. Then, I want to apply foldr to the list of lists, using union, so that the result will be the union of those lists in the list.
Now the problem is that when I do for example take 10 'function' [1,2], it will first calculate the infinite list for 1, and because it is an infinite list, it will never do this for 2. So then it returns only the first 10 elements of this infinite list of the first elements in the input list, with union applied to it, which is just the same list.
My question is: is there a way to create the infinite lists for all the elements in the input list at the same time, so that when I do take 10 'function' [1,2] for example, it will return the first 10 elements of the union of the infinite lists for 1 and 2.
(I don't know the number of elements in the input list)
This is my code, to make it clearer:
pow :: Integer -> [Integer]
pow n = map (^n) [1, 2..]
function :: [Integer] -> [Integer]
function xs = foldr union [] (map pow xs)
The union function works on arbitrary lists and removes duplicates, so it must first evaluate one of its arguments completely before it can continue with the other argument.
I think you want to explicitly introduce the assumption that your lists are sorted, then you can write an efficient function that merges (like the merge in a merge sort) the input lists and computes the union without needing to evaluate one of the lists before the other.
I don't know if such a merge function exists in a library, but you can pretty easily define it yourself:
-- | Computes the union of two sorted lists
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys)
  | x <= y = x : merge (dropWhile (== x) xs) (dropWhile (== x) (y:ys))
  | otherwise = y : merge (x:xs) (dropWhile (== y) ys)
Then your original fold with this new merge function should behave as desired:
ghci> pow n = map (^n) [1..]
ghci> function xs = foldr merge [] (map pow xs)
ghci> take 10 (function [2,3])
[1,4,8,9,16,25,27,36,49,64]
If you intend the input and output lists to be sorted, check out data-ordlist. If you just want all the elements but don't care what order, try concat . transpose.
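As a rough illustration of the concat . transpose suggestion, here is a small sketch. The name interleaved is made up for this example, and pow is the function from the question. Note that this variant neither sorts nor removes duplicates; it only makes sure every input list contributes elements.

```haskell
import Data.List (transpose)

-- pow as defined in the question: n-th powers of 1, 2, 3, ...
pow :: Integer -> [Integer]
pow n = map (^ n) [1 ..]

-- Hypothetical name: takes one element from each infinite list in turn,
-- so no single list has to be exhausted before the next is consulted.
interleaved :: [Integer] -> [Integer]
interleaved = concat . transpose . map pow
```

For example, take 6 (interleaved [2,3]) yields [1,1,4,8,9,27]: the first square and first cube, then the second of each, and so on.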

Haskell function to keep the repeating elements of a list

Here is the expected input/output:
repeated "Mississippi" == "ips"
repeated [1,2,3,4,2,5,6,7,1] == [1,2]
repeated " " == " "
And here is my code so far:
repeated :: String -> String
repeated "" = ""
repeated x = group $ sort x
I know that the last part of the code doesn't work. I was thinking to sort the list then group it, then I wanted to make a filter on the list of list which are greater than 1, or something like that.
Your code already does half of the job
> group $ sort "Mississippi"
["M","iiii","pp","ssss"]
You said you want to filter out the non-duplicates. Let's define a predicate which identifies the lists having at least two elements:
atLeastTwo :: [a] -> Bool
atLeastTwo (_:_:_) = True
atLeastTwo _ = False
Using this:
> filter atLeastTwo . group $ sort "Mississippi"
["iiii","pp","ssss"]
Good. Now, we need to take only the first element from such lists. Since the lists are non-empty, we can use head safely:
> map head . filter atLeastTwo . group $ sort "Mississippi"
"ips"
Alternatively, we could replace the filter with filter (\xs -> length xs >= 2) but this would be less efficient.
Yet another option is to use a list comprehension
> [ x | (x:_y:_) <- group $ sort "Mississippi" ]
"ips"
This pattern matches on the lists starting with x and having at least another element _y, combining the filter with taking the head.
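Putting the pieces above together, a complete definition might look like the following sketch. It generalizes the type from String to any Ord a so that the numeric example from the question also type-checks; that generalization is an assumption about what the asker wants.

```haskell
import Data.List (group, sort)

-- Keeps one copy of every element that occurs at least twice.
-- The pattern (x:_:_) only matches groups with two or more elements.
repeated :: Ord a => [a] -> [a]
repeated xs = [x | (x:_:_) <- group (sort xs)]
```

With this definition, repeated "Mississippi" gives "ips" and repeated [1,2,3,4,2,5,6,7,1] gives [1,2].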
Okay, good start. One immediate problem is that the specification requires the function to work on lists of numbers, but you define it for strings. The list must be sorted, so its elements must have the typeclass Ord. Therefore, let’s fix the type signature:
repeated :: Ord a => [a] -> [a]
After calling sort and group, you will have a list of lists, [[a]]. Let’s take your idea of using filter. That works. Your predicate should, as you said, check the length of each list in the list, then compare that length to 1.
Filtering a list of lists gives you a subset, which is another list of lists, of type [[a]]. You need to flatten this list. What you want to do is map each entry in the list of lists to one of its elements. For example, the first. There’s a function in the Prelude to do that.
So, you might fill in the following skeleton:
module Repeated (repeated) where
import Data.List (group, sort)
repeated :: Ord a => [a] -> [a]
repeated = map _
         . filter (\x -> _)
         . group
         . sort
I’ve written this in point-free style with the filtering predicate as a lambda expression, but many other ways to write this are equally good. Find one that you like! (For example, you could also write the filter predicate in point-free style, as a composition of two functions: a comparison on the result of length.)
When you try to compile this, the compiler will tell you that there are two typed holes, the _ entries to the right of the equal signs. It will also tell you the type of the holes. The first hole needs a function that takes a list and gives you back a single element. The second hole needs a Boolean expression using x. Fill these in correctly, and your program will work.
Here are some other approaches, to evaluate @chepner's comment on the solution using group $ sort. (Those solutions look simpler because some of the complexity is hidden in the library routines.) The quoted lines below are from that comment.
"While it's true that sorting is O(n lg n), ..."
It's not just the sorting but especially the group: that uses span, and both of them build and destroy temporary lists. That is, they do this:
"a linear traversal of an unsorted list will require some other data structure to keep track of all possible duplicates, and lookups in each will add to the space complexity at the very least. While carefully chosen data structures could be used to maintain an overall O(n) running time, the constant would probably make the algorithm slower in practice than the O(n lg n) solution, ..."
group/span adds considerably to that complexity, so O(n lg n) is not a correct measure.
"while greatly complicating the implementation."
The following all traverse the input list just once. Yes, they build auxiliary lists. (Probably a Set would give better performance and quicker lookup.) They may look more complex, but to compare apples with apples, look also at the code for group/span.
repeated2, repeated3, repeated4 :: Ord a => [a] -> [a]
repeated2/inserter2 builds an auxiliary list of pairs [(a, Bool)], in which the Bool is True if the a appears more than once, False if only once so far.
repeated2 xs = sort $ map fst $ filter snd $ foldr inserter2 [] xs
inserter2 :: Ord a => a -> [(a, Bool)] -> [(a, Bool)]
inserter2 x [] = [(x, False)]
inserter2 x (xb@(x', _): xs)
  | x == x' = (x', True): xs
  | otherwise = xb: inserter2 x xs
repeated3/inserter3 builds an auxiliary list of pairs [(a, Int)], in which the Int counts how many of the a appear. The aux list is sorted anyway, just for the heck of it.
repeated3 xs = map fst $ filter ((> 1).snd) $ foldr inserter3 [] xs
inserter3 :: Ord a => a -> [(a, Int)] -> [(a, Int)]
inserter3 x [] = [(x, 1)]
inserter3 x xss@(xc@(x', c): xs) = case x `compare` x' of
  { LT -> ((x, 1): xss)
  ; EQ -> ((x', c+1): xs)
  ; GT -> (xc: inserter3 x xs)
  }
repeated4/go4 builds an output list of elements known to repeat. It maintains an intermediate list of elements met once (so far) as it traverses the input list. If it meets a repeat: it adds that element to the output list; deletes it from the intermediate list; filters that element out of the tail of the input list.
repeated4 xs = sort $ go4 [] [] xs
go4 :: Ord a => [a] -> [a] -> [a] -> [a]
go4 repeats _ [] = repeats
go4 repeats onces (x: xs) = case findUpd x onces of
  { (True, oncesU) -> go4 (x: repeats) oncesU (filter (/= x) xs)
  ; (False, oncesU) -> go4 repeats oncesU xs
  }
findUpd :: Ord a => a -> [a] -> (Bool, [a])
findUpd x [] = (False, [x])
findUpd x (x': os) | x == x' = (True, os) -- i.e. x' removed
                   | otherwise =
                       let (b, os') = findUpd x os in (b, x': os')
(That last bit of list-fiddling in findUpd is very similar to span.)
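For comparison with the list-based auxiliary structures above, here is a sketch that keeps the counts in a Data.Map from the containers package instead. This swaps in a Map where the text only speculates about a Set, and the name repeatedMap is made up for this example; no performance claim is intended.

```haskell
import qualified Data.Map.Strict as Map

-- Counts occurrences in a single pass over the input, then keeps the
-- keys seen more than once. Map.keys returns them in ascending order,
-- so no separate sort is needed.
repeatedMap :: Ord a => [a] -> [a]
repeatedMap xs = Map.keys (Map.filter (> 1) counts)
  where
    counts = Map.fromListWith (+) [(x, 1 :: Int) | x <- xs]
```

Each insertion costs O(log n) here, so the whole function is O(n log n), the same order as the group $ sort version.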

Ordering an Ordered list Function in Haskell

For my coursework I have to take two lists of numbers, sort them, combine them, and output the new list in order. This works if the lists are already in order as they are typed, but not if, say, a 9 is at the start of the first list. So the trouble I'm having is sorting the list after it's combined. In other languages I'd do this with a for loop, but I'm not sure how to do it in Haskell.
Here is the code I have:
merge :: Ord a => [a] -> [a] -> [a]
merge x [] = x
merge [] x = x
merge (x:xs) (y:ys) = if x < y
                        then x:(merge xs (y:ys))
                        else y:(merge (x:xs) ys)
It sounds like what you're actually supposed to implement is merge sort.
In merge sort you merge two sorted lists to get one sorted list, yes. The missing observation is that a list of size 0 or 1 is necessarily already sorted.
This means that if you start applying your function to lists that are of size 0 or 1, then merge the results of that merge, then merge the result of that, eventually you will end up with a fully sorted list.
Here's an example:
-- Your function
merge :: Ord a => [a] -> [a] -> [a]
merge x [] = x
merge [] x = x
merge (x:xs) (y:ys) = if x < y
                        then x:(merge xs (y:ys))
                        else y:(merge (x:xs) ys)
-- Arbitrarily split a list into two ~equally sized smaller lists.
-- e.g. [2,7,1,8,2] -> ([2,7,1], [8,2])
split list = splitAt ((length list) `div` 2) list
-- Split a list into halves until each piece is size 0 or 1,
-- then 'merge' them back together.
mergeSort [] = []
mergeSort [x] = [x]
mergeSort list =
  let (firstHalf, secondHalf) = split list
  in merge (mergeSort firstHalf) (mergeSort secondHalf)
mergeSort [2,7,1,8,2] will evaluate to [1,2,2,7,8]. Using only your merge function, the list has been sorted.
So your current solution will return a sorted list if both input lists are sorted. If the input lists aren't sorted, you have two options: sort the input lists individually and then merge them as you already do, or merge them unsorted and sort the new list.
It seems more reasonable to merge unsorted lists and then sort them as one, so here is the solution. I've used a quick implementation of quicksort, but you could use whatever sorting algorithm you wish.
--takes 2 sorted or unsorted lists, merges them, then sorts them
merge :: (Ord a) => [a] -> [a] -> [a]
merge [] [] = []
merge x [] = sort x
merge [] y = sort y
merge x y = sort (x ++ y)
-- where first element of list is pivot
sort :: (Ord a) => [a] -> [a]
sort [] = []
sort (x:xs) = sort [x'|x'<-xs, x'<=x] ++ [x] ++ sort [x'|x'<-xs, x'>x]
There are many ways to do this, and this way has the downside of having to re-sort the list even if the lists were already sorted. You could get around this by checking whether the lists are sorted, then sorting them only if needed. I hope this answer helps.
For a problem like merge sort, you want to divide and conquer so that your input lists are always ordered. One way to do this is by breaking the input down into singletons, which are always ordered by definition, then making your merge function recursively emit whichever of the two list heads is smaller. When one input list is finally empty, it appends the other.
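The singleton idea can be sketched bottom-up as follows. This is only one possible reading of the paragraph above: the names mergePairs, mergeAll, sortBottomUp and combineSorted are made up for this example, and merge is written with guards but otherwise behaves like the one in the question.

```haskell
-- Merges two sorted lists into one sorted list.
merge :: Ord a => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x:xs) (y:ys)
  | x < y     = x : merge xs (y:ys)
  | otherwise = y : merge (x:xs) ys

-- One pass: merge neighbouring sorted lists pairwise.
mergePairs :: Ord a => [[a]] -> [[a]]
mergePairs (xs:ys:rest) = merge xs ys : mergePairs rest
mergePairs xss          = xss

-- Repeat the pairwise pass until a single sorted list remains.
mergeAll :: Ord a => [[a]] -> [a]
mergeAll []   = []
mergeAll [xs] = xs
mergeAll xss  = mergeAll (mergePairs xss)

-- Break the input into singletons (each trivially sorted), then merge.
sortBottomUp :: Ord a => [a] -> [a]
sortBottomUp = mergeAll . map (: [])

-- The coursework task: sort both inputs, then merge the results.
combineSorted :: Ord a => [a] -> [a] -> [a]
combineSorted xs ys = merge (sortBottomUp xs) (sortBottomUp ys)
```

For instance, combineSorted [9,1,5] [4,2] evaluates to [1,2,4,5,9], even though neither input was sorted.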

How can I find the index where one list appears as a sublist of another?

I have been working with Haskell for a little over a week now so I am practicing some functions that might be useful for something. I want to compare two lists recursively. When the first list appears in the second list, I simply want to return the index at where the list starts to match. The index would begin at 0. Here is an example of what I want to execute for clarification:
subList [1,2,3] [4,4,1,2,3,5,6]
the result should be 2
I have attempted to code it:
subList :: [a] -> [a] -> a
subList [] = []
subList (x:xs) = x + 1 (subList xs)
subList xs = [ y:zs | (y,ys) <- select xs, zs <- subList ys]
where select [] = []
select (x:xs) = x
I am receiving an "error on input" and I cannot figure out why my syntax is not working. Any suggestions?
Let's first look at the function signature. You want to take in two lists whose contents can be compared for equality and return an index like so
subList :: Eq a => [a] -> [a] -> Int
So now we go through pattern matching on the arguments. First off, when the second list is empty then there is nothing we can do, so we'll return -1 as an error condition
subList _ [] = -1
Then we look at the recursive step
subList as xxs@(x:xs)
  | all (uncurry (==)) $ zip as xxs = 0
  | otherwise = 1 + subList as xs
You should be familiar with the guard syntax I've used, although you may not be familiar with the @ syntax. It is an as-pattern: xxs names the whole list that (x:xs) matches.
You may not be familiar with all, uncurry, and possibly zip, so let me elaborate on those. zip has the function signature zip :: [a] -> [b] -> [(a,b)], so it takes two lists and pairs up their elements (and if one list is longer than the other, it just chops off the excess). uncurry is weird, so let's just look at (uncurry (==)): its signature is (uncurry (==)) :: Eq a => (a, a) -> Bool, and it checks whether the first and second elements of a pair are equal. Finally, all walks over the list of pairs and returns True if the first and second elements of every pair are equal.
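Two caveats about the version above are worth keeping in mind. The -1 sentinel has 1 added to it on the way back up the recursion, so a failed search does not actually return -1, and because zip truncates, a pattern that runs past the end of the list can be reported as a match. The sketch below sidesteps both by using library functions and returning a Maybe Int; the name subListIndex and the Maybe return type are choices made for this example, not part of the original question.

```haskell
import Data.List (findIndex, isPrefixOf, tails)

-- Index of the first position at which needle occurs in haystack,
-- or Nothing if it does not occur. tails lists every suffix of the
-- haystack, and isPrefixOf checks whether needle starts that suffix.
subListIndex :: Eq a => [a] -> [a] -> Maybe Int
subListIndex needle haystack =
  findIndex (needle `isPrefixOf`) (tails haystack)
```

With this definition, subListIndex [1,2,3] [4,4,1,2,3,5,6] is Just 2, and subListIndex [1,2,3] [4,1,2] is Nothing.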

Need to partition a list into lists based on breaks in ascending order of elements (Haskell)

Say I have any list like this:
[4,5,6,7,1,2,3,4,5,6,1,2]
I need a Haskell function that will transform this list into a list of lists which are composed of the segments of the original list which form a series in ascending order. So the result should look like this:
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
Any suggestions?
You can do this by resorting to manual recursion, but I like to believe Haskell is a more evolved language. Let's see if we can develop a solution that uses existing recursion strategies. First some preliminaries.
{-# LANGUAGE NoMonomorphismRestriction #-}
-- because who wants to write type signatures, amirite?
import Data.List.Split -- from package split on Hackage
Step one is to observe that we want to split the list based on a criterion that looks at two elements of the list at once. So we'll need a new list with elements representing a "previous" and "next" value. There's a very standard trick for this:
previousAndNext xs = zip xs (drop 1 xs)
However, for our purposes, this won't quite work: this function always outputs a list that's shorter than the input, and we will always want a list of the same length as the input (and in particular we want some output even when the input is a list of length one). So we'll modify the standard trick just a bit with a "null terminator".
pan xs = zip xs (map Just (drop 1 xs) ++ [Nothing])
Now we're going to look through this list for places where the previous element is bigger than the next element (or the next element doesn't exist). Let's write a predicate that does that check.
bigger (x, y) = maybe False (x >) y
Now let's write the function that actually does the split. Our "delimiters" will be values that satisfy bigger; and we never want to throw them away, so let's keep them.
ascendingTuples = split . keepDelimsR $ whenElt bigger
The final step is just to throw together the bit that constructs the tuples, the bit that splits the tuples, and a last bit of munging to throw away the bits of the tuples we don't care about:
ascending = map (map fst) . ascendingTuples . pan
Let's try it out in ghci:
*Main> ascending [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
*Main> ascending [7,6..1]
[[7],[6],[5],[4],[3],[2],[1]]
*Main> ascending []
[[]]
*Main> ascending [1]
[[1]]
P.S. In the current release of split, keepDelimsR is slightly stricter than it needs to be, and as a result ascending currently doesn't work with infinite lists. I've submitted a patch that makes it lazier, though.
ascend :: Ord a => [a] -> [[a]]
ascend xs = foldr f [] xs
  where
    f a [] = [[a]]
    f a xs'@(y:ys) | a < head y = (a:y):ys
                   | otherwise = [a]:xs'
In ghci
*Main> ascend [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
This problem is a natural fit for a paramorphism-based solution. Having (as defined in that post)
para :: (a -> [a] -> b -> b) -> b -> [a] -> b
foldr :: (a -> b -> b) -> b -> [a] -> b
para c n (x : xs) = c x xs (para c n xs)
foldr c n (x : xs) = c x (foldr c n xs)
para c n [] = n
foldr c n [] = n
we can write
partition_asc xs = para c [] xs where
  c x (y:_) ~(a:b) | x<y = (x:a):b
  c x _ r = [x]:r
Trivial, since the abstraction fits.
BTW they have two kinds of map in Common Lisp: mapcar (processing elements of an input list one by one) and maplist (processing "tails" of a list). With this idea we get
import Data.List (tails)
partition_asc2 xs = foldr c [] . init . tails $ xs where
  c (x:y:_) ~(a:b) | x<y = (x:a):b
  c (x:_) r = [x]:r
Lazy patterns in both versions make it work with infinite input lists in a productive manner (as first shown in Daniel Fischer's answer).
Update 2020-05-08: not so trivial after all. Both head . head . partition_asc $ [4] ++ undefined and the same for partition_asc2 fail with *** Exception: Prelude.undefined. The combining function c forces the next element y prematurely. It needs to be more carefully written to be productive right away, before ever looking at the next element, as e.g. for the second version,
partition_asc2' xs = foldr c [] . init . tails $ xs where
  c (x:ys) r@(~(a:b)) = (x:g):gs
    where
      (g,gs) | not (null ys)
               && x < head ys = (a,b)
             | otherwise = ([],r)
(again, as first shown in Daniel's answer).
You can use a right fold to break up the list at down-steps:
foldr foo [] xs
  where
    foo x yss = (x:zs) : ws
      where
        (zs, ws) = case yss of
                     (ys@(y:_)) : rest
                       | x < y -> (ys,rest)
                       | otherwise -> ([],yss)
                     _ -> ([],[])
(It's a bit complicated in order to have the combining function lazy in the second argument, so that it works well for infinite lists too.)
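To make the fragment above loadable on its own, it can be wrapped in a named function. The sketch below does only that: the name ascending and the type signature are additions for this example, while the body is the fold from above.

```haskell
-- Splits a list into maximal strictly ascending runs.
ascending :: Ord a => [a] -> [[a]]
ascending = foldr foo []
  where
    -- The head of the result, x : zs, is produced before yss is
    -- inspected, because (zs, ws) is a lazily matched pattern binding.
    foo x yss = (x : zs) : ws
      where
        (zs, ws) = case yss of
          (ys@(y:_)) : rest
            | x < y     -> (ys, rest)   -- x extends the next run
            | otherwise -> ([], yss)    -- x ends a run of its own
          _ -> ([], [])                 -- x is the last element
```

Because the combining function is lazy in its second argument, take 2 (ascending (cycle [1,2,3])) terminates and gives [[1,2,3],[1,2,3]].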
One other way of approaching this task (which, in fact, lays the foundations of a very efficient sorting algorithm) is to use continuation-passing style (CPS), applied in this particular case to a fold from the right, foldr.
As is, this answer only splits out the ascending chunks. It would be nice to split out the descending ones at the same time, preferably in reverse order and all in O(n), which would leave only a binary merging of the obtained chunks to get a perfectly sorted output. Yet that's another answer for another question.
chunks :: Ord a => [a] -> [[a]]
chunks xs = foldr go return xs $ []
  where
    go :: Ord a => a -> ([a] -> [[a]]) -> ([a] -> [[a]])
    go c f = \ps -> let (r:rs) = f [c]
                    in case ps of
                         [] -> r:rs
                         [p] -> if c > p then (p:r):rs else [p]:(r:rs)
*Main> chunks [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
*Main> chunks [4,5,6,7,1,2,3,4,5,4,3,2,6,1,2]
[[4,5,6,7],[1,2,3,4,5],[4],[3],[2,6],[1,2]]
In the above code c stands for current and p for previous. Again, remember that we are folding from the right, so "previous" is actually the next item to process.