Haskell pattern matching conundrum - list

I was trying to search through a list of pairs that could have the element ("$", Undefined) in it at some arbitrary location. I wanted to ONLY search the part of the list in front of that special element, so I tried something like this (alreadyThere is intended to take the element n and the list xs as arguments):
checkNotSameScope :: Env -> VarName -> Expr -> Expr
checkNotSameScope (xs:("$", Undefined):_) n e = if alreadyThere n xs then BoolLit False
else BoolLit True
But that does not work; the compiler seemed to indicate that (xs: ..) only deals with a SINGLE value prepending my list. I cannot use : to indicate the first chunk of a list; only a single element. Looking back, this makes sense; otherwise, how would the compiler know what to do? Adding an "s" to something like "x" doesn't magically make multiple elements! But how can I work around this?

Unfortunately, even with smart compilers and languages, some programming cannot be avoided...
In your case it seems you want the part of a list up to a specific element. More generally, to find the list up to some condition you can use the standard library takeWhile function. Then you can just run alreadyThere on it:
checkNotSameScope :: Env -> VarName -> Expr -> Expr
checkNotSameScope xs n e = if alreadyThere n (takeWhile (/= ("$", Undefined)) xs)
then BoolLit False
else BoolLit True
It maybe does not what you want for lists where ("$", Undefined) does not occur, so beware.

Similar to Joachim's answer, you can use break, which will allow you to detect when ("$", Undefined) doesn't occur (if this is necessary). i.e.
checkNotSameScope xs n e = case break (== ("$", Undefined)) xs of
(_, []) -> .. -- ("$", Undefined) didn't occur!
(xs', _) -> BoolLit . not $ alreadyThere n xs'
(NB. you lose some laziness in this solution, since the list has to be traversed until ("$", Undefined), or to the end, to check the first case.)

Haskell cannot do this kind of pattern matching out of the box, although there are some languages which can, like CLIPS for example, or F#, by using active patterns.
But we can use Haskell's existing pattern matching capabilities to obtain a similar result. Let us first define a function called deconstruct defined like this:
deconstruct :: [a] -> [([a], a, [a])]
deconstruct [] = []
deconstruct [x] = [([], x, [])]
deconstruct (x:xs) = ([], x, xs) : [(x:ys1, y, ys2) | (ys1, y, ys2) <- deconstruct xs]
What this function does is to obtain all the decompositions of a list xs into triples of form (ys1, y, ys2) such that ys1 ++ [y] ++ ys2 == xs. So for example:
deconstruct [1..4] => [([],1,[2,3,4]),([1],2,[3,4]),([1,2],3,[4]),([1,2,3],4,[])]
Using this you can define your function as follows:
checkNotSameScope xs n e =
case [ys | (ys, ("$", Undefined), _) <- deconstruct xs] of
[ys] -> BoolLit $ not $ alreadyThere n xs
_ -> -- handle the case when ("$", Undefined) doesn't occur at all or more than once
We can use the do-notation to obtain something even closer to what you are looking for:
checkNotSameScope xs n e = BoolLit $ not $ any (alreadyThere n) prefixes
where
prefixes = do
(xs, ("$", Undefined), _) <- deconstruct xs
return xs
There are several things going on here. First of all the prefixes variable will store all the prefix lists which occur before the ("$", Undefined) pair - or none if the pair is not in the input list xs. Then using the any function we are checking whether alreadyThere n gives us True for any of the prefixes. And the rest is to complete your function's logic.

Related

Haskell - using foldl or foldr instead of pattern matching for updating a list with a new value at a given index

I have implemented a function (!!=) that given a list and a tuple containing an index in the list and a
new value, updates the given list with the new value at the given
index.
(!!=) :: [a] -> (Int,a) -> [a]
(!!=) xs (0, a) = a : tail xs
(!!=) [] (i, a) = error "Index not in the list"
(!!=) (x:xs) (i, a) = x : xs !!= (i-1, a)
Being a beginner with the concept of folding I was wondering if there is a way to achieve the same result using foldl or foldr instead?
Thanks a lot in advance.
I'll give you the foldl version which is easier to understand I think and the easiest / most straight-forward version I can think of.
But please note that you should not use foldl (use foldl': https://wiki.haskell.org/Foldr_Foldl_Foldl') - nor should you use ++ like this (use : and reverse after) ;)
Anway this is the idea:
(!!=) xs (i, a) = snd $ foldl
(\(j, ys) x -> (j+1, if j == i then ys ++ [a] else ys ++ [x]))
(0, [])
xs
as the state/accumulator for the fold I take a tuple of the current index and the accumulated result list (therefore the snd because I only want this in the end)
then the folding function just have to look if we are at the index and exchange the element - returning the next index and the new accumulated list
as an exercise you can try to:
use : instead of ++ and a reverse
rewrite as foldr
look at zipWith and rewrite this using this (zipWith (...) [0..] xs) instead of the fold (this is similar to using a map with index
Neither foldl nor foldr can do this particular job efficiently (unless you "cheat" by pattern matching on the list as you fold over it), though foldr can do it a bit less badly. No, what you really need is a different style of fold, sometimes called para:
para :: (a -> [a] -> b -> b) -> b -> [a] -> b
para _f n [] = n
para f n (a : as) = f a as (para f n as)
para is very similar to foldr. Each of them takes a combining function and, for each element, passes the combining function that element and the result of folding up the rest of the list. But para adds something extra: it also passes in the rest of the list! So there's no need to reconstruct the tail of the list once you've reached the replacement point.
But ... how do you count from the beginning with foldr or para? That brings in a classic trick, sometimes called a "higher-order fold". Instead of para go stop xs producing a list, it's going to produce a function that takes the insertion position as an argument.
(!!=) :: [a] -> (Int, a) -> [a]
xs0 !!= (i0, new) = para go stop xs0 i0
where
-- If the list is empty, then no matter what index
-- you seek, it's not there.
stop = \_ -> error "Index not in the list"
-- We produce a function that takes an index. If the
-- index is 0, we combine the new element with "the rest of the list".
-- Otherwise we apply the function we get from folding up the rest of
-- the list to the predecessor of the index, and tack on the current
-- element.
go x xs r = \i -> case i of
0 -> new : xs
_ -> x : r (i - 1)
Note that para is easily powerful enough to implement foldr:
foldr c = para (\a _ b -> c a b)
What's perhaps less obvious is that foldr can implement a (very inefficient version of) para:
para f n = snd . foldr go ([], n)
where
go x ~(xs, r) = (x : xs, f x xs r)
Lest you get the wrong idea and think that para is "better than" foldr, know that when its extra power isn't needed, foldr is simpler to use and will very often be compiled to more efficient code.

Haskell function to keep the repeating elements of a list

Here is the expected input/output:
repeated "Mississippi" == "ips"
repeated [1,2,3,4,2,5,6,7,1] == [1,2]
repeated " " == " "
And here is my code so far:
repeated :: String -> String
repeated "" = ""
repeated x = group $ sort x
I know that the last part of the code doesn't work. I was thinking to sort the list then group it, then I wanted to make a filter on the list of list which are greater than 1, or something like that.
Your code already does half of the job
> group $ sort "Mississippi"
["M","iiii","pp","ssss"]
You said you want to filter out the non-duplicates. Let's define a predicate which identifies the lists having at least two elements:
atLeastTwo :: [a] -> Bool
atLeastTwo (_:_:_) = True
atLeastTwo _ = False
Using this:
> filter atLeastTwo . group $ sort "Mississippi"
["iiii","pp","ssss"]
Good. Now, we need to take only the first element from such lists. Since the lists are non-empty, we can use head safely:
> map head . filter atLeastTwo . group $ sort "Mississippi"
"ips"
Alternatively, we could replace the filter with filter (\xs -> length xs >= 2) but this would be less efficient.
Yet another option is to use a list comprehension
> [ x | (x:_y:_) <- group $ sort "Mississippi" ]
"ips"
This pattern matches on the lists starting with x and having at least another element _y, combining the filter with taking the head.
Okay, good start. One immediate problem is that the specification requires the function to work on lists of numbers, but you define it for strings. The list must be sorted, so its elements must have the typeclass Ord. Therefore, let’s fix the type signature:
repeated :: Ord a => [a] -> [a]
After calling sort and group, you will have a list of lists, [[a]]. Let’s take your idea of using filter. That works. Your predicate should, as you said, check the length of each list in the list, then compare that length to 1.
Filtering a list of lists gives you a subset, which is another list of lists, of type [[a]]. You need to flatten this list. What you want to do is map each entry in the list of lists to one of its elements. For example, the first. There’s a function in the Prelude to do that.
So, you might fill in the following skeleton:
module Repeated (repeated) where
import Data.List (group, sort)
repeated :: Ord a => [a] -> [a]
repeated = map _
. filter (\x -> _)
. group
. sort
I’ve written this in point-free style with the filtering predicate as a lambda expression, but many other ways to write this are equally good. Find one that you like! (For example, you could also write the filter predicate in point-free style, as a composition of two functions: a comparison on the result of length.)
When you try to compile this, the compiler will tell you that there are two typed holes, the _ entries to the right of the equal signs. It will also tell you the type of the holes. The first hole needs a function that takes a list and gives you back a single element. The second hole needs a Boolean expression using x. Fill these in correctly, and your program will work.
Here's some other approaches, to evaluate #chepner's comment on the solution using group $ sort. (Those solutions look simpler, because some of the complexity is hidden in the library routines.)
While it's true that sorting is O(n lg n), ...
It's not just the sorting but especially the group: that uses span, and both of them build and destroy temporary lists. I.e. they do this:
a linear traversal of an unsorted list will require some other data structure to keep track of all possible duplicates, and lookups in each will add to the space complexity at the very least. While carefully chosen data structures could be used to maintain an overall O(n) running time, the constant would probably make the algorithm slower in practice than the O(n lg n) solution, ...
group/span adds considerably to that complexity, so O(n lg n) is not a correct measure.
while greatly complicating the implementation.
The following all traverse the input list just once. Yes they build auxiliary lists. (Probably a Set would give better performance/quicker lookup.) They maybe look more complex, but to compare apples with apples look also at the code for group/span.
repeated2, repeated3, repeated4 :: Ord a => [a] -> [a]
repeated2/inserter2 builds an auxiliary list of pairs [(a, Bool)], in which the Bool is True if the a appears more than once, False if only once so far.
repeated2 xs = sort $ map fst $ filter snd $ foldr inserter2 [] xs
inserter2 :: Ord a => a -> [(a, Bool)] -> [(a, Bool)]
inserter2 x [] = [(x, False)]
inserter2 x (xb#(x', _): xs)
| x == x' = (x', True): xs
| otherwise = xb: inserter2 x xs
repeated3/inserter3 builds an auxiliary list of pairs [(a, Int)], in which the Int counts how many of the a appear. The aux list is sorted anyway, just for the heck of it.
repeated3 xs = map fst $ filter ((> 1).snd) $ foldr inserter3 [] xs
inserter3 :: Ord a => a -> [(a, Int)] -> [(a, Int)]
inserter3 x [] = [(x, 1)]
inserter3 x xss#(xc#(x', c): xs) = case x `compare` x' of
{ LT -> ((x, 1): xss)
; EQ -> ((x', c+1): xs)
; GT -> (xc: inserter3 x xs)
}
repeated4/go4 builds an output list of elements known to repeat. It maintains an intermediate list of elements met once (so far) as it traverses the input list. If it meets a repeat: it adds that element to the output list; deletes it from the intermediate list; filters that element out of the tail of the input list.
repeated4 xs = sort $ go4 [] [] xs
go4 :: Ord a => [a] -> [a] -> [a] -> [a]
go4 repeats _ [] = repeats
go4 repeats onces (x: xs) = case findUpd x onces of
{ (True, oncesU) -> go4 (x: repeats) oncesU (filter (/= x) xs)
; (False, oncesU) -> go4 repeats oncesU xs
}
findUpd :: Ord a => a -> [a] -> (Bool, [a])
findUpd x [] = (False, [x])
findUpd x (x': os) | x == x' = (True, os) -- i.e. x' removed
| otherwise =
let (b, os') = findUpd x os in (b, x': os')
(That last bit of list-fiddling in findUpd is very similar to span.)

Haskell - Removing adjacent duplicates from a list

I'm trying to learn haskell by solving some online problems and training exercises.
Right now I'm trying to make a function that'd remove adjacent duplicates from a list.
Sample Input
"acvvca"
"1456776541"
"abbac"
"aabaabckllm"
Expected Output
""
""
"c"
"ckm"
My first though was to make a function that'd simply remove first instance of adjacent duplicates and restore the list.
module Test where
removeAdjDups :: (Eq a) => [a] -> [a]
removeAdjDups [] = []
removeAdjDups [x] = [x]
removeAdjDups (x : y : ys)
| x == y = removeAdjDups ys
| otherwise = x : removeAdjDups (y : ys)
*Test> removeAdjDups "1233213443"
"122133"
This func works for first found pairs.
So now I need to apply same function over the result of the function.
Something I think foldl can help with but I don't know how I'd go about implementing it.
Something along the line of
removeAdjDups' xs = foldl (\acc x -> removeAdjDups x acc) xs
Also is this approach the best way to implement the solution or is there a better way I should be thinking of?
Start in last-first order: first remove duplicates from the tail, then check if head of the input equals to head of the tail result (which, by this moment, won't have any duplicates, so the only possible pair is head of the input vs. head of the tail result):
main = mapM_ (print . squeeze) ["acvvca", "1456776541", "abbac", "aabaabckllm"]
squeeze :: Eq a => [a] -> [a]
squeeze (x:xs) = let ys = squeeze xs in case ys of
(y:ys') | x == y -> ys'
_ -> x:ys
squeeze _ = []
Outputs
""
""
"c"
"ckm"
I don't see how foldl could be used for this. (Generally, foldl pretty much combines the disadvantages of foldr and foldl'... those, or foldMap, are the folds you should normally be using, not foldl.)
What you seem to intend is: repeating the removeAdjDups, until no duplicates are found anymore. The repetition is a job for
iterate :: (a -> a) -> a -> [a]
like
Prelude> iterate removeAdjDups "1233213443"
["1233213443","122133","11","","","","","","","","","","","","","","","","","","","","","","","","","","",""...
This is an infinite list of ever reduced lists. Generally, it will not converge to the empty list; you'll want to add some termination condition. If you want to remove as many dups as necessary, that's the fixpoint; it can be found in a very similar way to how you implemented removeAdjDups: compare neighbor elements, just this time in the list of reductions.
bipll's suggestion to handle recursive duplicates is much better though, it avoids unnecessary comparisons and traversing the start of the list over and over.
List comprehensions are often overlooked. They are, of course syntactic sugar but some, like me are addicted. First off, strings are lists as they are. This functions could handle any list, too as well as singletons and empty lists. You can us map to process many lists in a list.
(\l -> [ x | (x,y) <- zip l $ (tail l) ++ " ", x /= y]) "abcddeeffa"
"abcdefa"
I don't see either how to use foldl. It's maybe because, if you want to fold something here, you have to use foldr.
main = mapM_ (print . squeeze) ["acvvca", "1456776541", "abbac", "aabaabckllm"]
-- I like the name in #bipll answer
squeeze = foldr (\ x xs -> if xs /= "" && x == head(xs) then tail(xs) else x:xs) ""
Let's analyze this. The idea is taken from #bipll answer: go from right to left. If f is the lambda function, then by definition of foldr:
squeeze "abbac" = f('a' f('b' f('b' f('a' f('c' "")))
By definition of f, f('c' "") = 'c':"" = "c" since xs == "". Next char from the right: f('a' "c") = 'a':"c" = "ac" since 'a' != head("c") = 'c'. f('b' "ac") = "bac" for the same reason. But f('b' "bac") = tail("bac") = "ac" because 'b' == head("bac"). And so forth...
Bonus: by replacing foldr with scanr, you can see the whole process:
Prelude> squeeze' = scanr (\ x xs -> if xs /= "" && x == head(xs) then tail(xs) else x:xs) ""
Prelude> zip "abbac" (squeeze' "abbac")
[('a',"c"),('b',"ac"),('b',"bac"),('a',"ac"),('c',"c")]

How can I find the index where one list appears as a sublist of another?

I have been working with Haskell for a little over a week now so I am practicing some functions that might be useful for something. I want to compare two lists recursively. When the first list appears in the second list, I simply want to return the index at where the list starts to match. The index would begin at 0. Here is an example of what I want to execute for clarification:
subList [1,2,3] [4,4,1,2,3,5,6]
the result should be 2
I have attempted to code it:
subList :: [a] -> [a] -> a
subList [] = []
subList (x:xs) = x + 1 (subList xs)
subList xs = [ y:zs | (y,ys) <- select xs, zs <- subList ys]
where select [] = []
select (x:xs) = x
I am receiving an "error on input" and I cannot figure out why my syntax is not working. Any suggestions?
Let's first look at the function signature. You want to take in two lists whose contents can be compared for equality and return an index like so
subList :: Eq a => [a] -> [a] -> Int
So now we go through pattern matching on the arguments. First off, when the second list is empty then there is nothing we can do, so we'll return -1 as an error condition
subList _ [] = -1
Then we look at the recursive step
subList as xxs#(x:xs)
| all (uncurry (==)) $ zip as xxs = 0
| otherwise = 1 + subList as xs
You should be familiar with the guard syntax I've used, although you may not be familiar with the # syntax. Essentially it means that xxs is just a sub-in for if we had used (x:xs).
You may not be familiar with all, uncurry, and possibly zip so let me elaborate on those more. zip has the function signature zip :: [a] -> [b] -> [(a,b)], so it takes two lists and pairs up their elements (and if one list is longer than the other, it just chops off the excess). uncurry is weird so lets just look at (uncurry (==)), its signature is (uncurry (==)) :: Eq a => (a, a) -> Bool, it essentially checks if both the first and second element in the pair are equal. Finally, all will walk over the list and see if the first and second of each pair is equal and return true if that is the case.

Need to partition a list into lists based on breaks in ascending order of elements (Haskell)

Say I have any list like this:
[4,5,6,7,1,2,3,4,5,6,1,2]
I need a Haskell function that will transform this list into a list of lists which are composed of the segments of the original list which form a series in ascending order. So the result should look like this:
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
Any suggestions?
You can do this by resorting to manual recursion, but I like to believe Haskell is a more evolved language. Let's see if we can develop a solution that uses existing recursion strategies. First some preliminaries.
{-# LANGUAGE NoMonomorphismRestriction #-}
-- because who wants to write type signatures, amirite?
import Data.List.Split -- from package split on Hackage
Step one is to observe that we want to split the list based on a criteria that looks at two elements of the list at once. So we'll need a new list with elements representing a "previous" and "next" value. There's a very standard trick for this:
previousAndNext xs = zip xs (drop 1 xs)
However, for our purposes, this won't quite work: this function always outputs a list that's shorter than the input, and we will always want a list of the same length as the input (and in particular we want some output even when the input is a list of length one). So we'll modify the standard trick just a bit with a "null terminator".
pan xs = zip xs (map Just (drop 1 xs) ++ [Nothing])
Now we're going to look through this list for places where the previous element is bigger than the next element (or the next element doesn't exist). Let's write a predicate that does that check.
bigger (x, y) = maybe False (x >) y
Now let's write the function that actually does the split. Our "delimiters" will be values that satisfy bigger; and we never want to throw them away, so let's keep them.
ascendingTuples = split . keepDelimsR $ whenElt bigger
The final step is just to throw together the bit that constructs the tuples, the bit that splits the tuples, and a last bit of munging to throw away the bits of the tuples we don't care about:
ascending = map (map fst) . ascendingTuples . pan
Let's try it out in ghci:
*Main> ascending [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
*Main> ascending [7,6..1]
[[7],[6],[5],[4],[3],[2],[1]]
*Main> ascending []
[[]]
*Main> ascending [1]
[[1]]
P.S. In the current release of split, keepDelimsR is slightly stricter than it needs to be, and as a result ascending currently doesn't work with infinite lists. I've submitted a patch that makes it lazier, though.
ascend :: Ord a => [a] -> [[a]]
ascend xs = foldr f [] xs
where
f a [] = [[a]]
f a xs'#(y:ys) | a < head y = (a:y):ys
| otherwise = [a]:xs'
In ghci
*Main> ascend [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
This problem is a natural fit for a paramorphism-based solution. Having (as defined in that post)
para :: (a -> [a] -> b -> b) -> b -> [a] -> b
foldr :: (a -> b -> b) -> b -> [a] -> b
para c n (x : xs) = c x xs (para c n xs)
foldr c n (x : xs) = c x (foldr c n xs)
para c n [] = n
foldr c n [] = n
we can write
partition_asc xs = para c [] xs where
c x (y:_) ~(a:b) | x<y = (x:a):b
c x _ r = [x]:r
Trivial, since the abstraction fits.
BTW they have two kinds of map in Common Lisp - mapcar
(processing elements of an input list one by one)
and maplist (processing "tails" of a list). With this idea we get
import Data.List (tails)
partition_asc2 xs = foldr c [] . init . tails $ xs where
c (x:y:_) ~(a:b) | x<y = (x:a):b
c (x:_) r = [x]:r
Lazy patterns in both versions make it work with infinite input lists
in a productive manner (as first shown in Daniel Fischer's answer).
update 2020-05-08: not so trivial after all. Both head . head . partition_asc $ [4] ++ undefined and the same for partition_asc2 fail with *** Exception: Prelude.undefined. The combining function g forces the next element y prematurely. It needs to be more carefully written to be productive right away before ever looking at the next element, as e.g. for the second version,
partition_asc2' xs = foldr c [] . init . tails $ xs where
c (x:ys) r#(~(a:b)) = (x:g):gs
where
(g,gs) | not (null ys)
&& x < head ys = (a,b)
| otherwise = ([],r)
(again, as first shown in Daniel's answer).
You can use a right fold to break up the list at down-steps:
foldr foo [] xs
where
foo x yss = (x:zs) : ws
where
(zs, ws) = case yss of
(ys#(y:_)) : rest
| x < y -> (ys,rest)
| otherwise -> ([],yss)
_ -> ([],[])
(It's a bit complicated in order to have the combining function lazy in the second argument, so that it works well for infinite lists too.)
One other way of approaching this task (which, in fact lays the fundamentals of a very efficient sorting algorithm) is using the Continuation Passing Style a.k.a CPS which, in this particular case applied to folding from right; foldr.
As is, this answer would only chunk up the ascending chunks however, it would be nice to chunk up the descending ones at the same time... preferably in reverse order all in O(n) which would leave us with only binary merging of the obtained chunks for a perfectly sorted output. Yet that's another answer for another question.
chunks :: Ord a => [a] -> [[a]]
chunks xs = foldr go return xs $ []
where
go :: Ord a => a -> ([a] -> [[a]]) -> ([a] -> [[a]])
go c f = \ps -> let (r:rs) = f [c]
in case ps of
[] -> r:rs
[p] -> if c > p then (p:r):rs else [p]:(r:rs)
*Main> chunks [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
*Main> chunks [4,5,6,7,1,2,3,4,5,4,3,2,6,1,2]
[[4,5,6,7],[1,2,3,4,5],[4],[3],[2,6],[1,2]]
In the above code c stands for current and p is for previous and again, remember we are folding from right so previous, is actually the next item to process.