Haskell - Removing adjacent duplicates from a list

I'm trying to learn Haskell by solving some online problems and training exercises.
Right now I'm trying to write a function that removes adjacent duplicates from a list.
Sample Input
"acvvca"
"1456776541"
"abbac"
"aabaabckllm"
Expected Output
""
""
"c"
"ckm"
My first thought was to write a function that simply removes the first instance of each adjacent duplicate pair and rebuilds the list.
module Test where
removeAdjDups :: (Eq a) => [a] -> [a]
removeAdjDups [] = []
removeAdjDups [x] = [x]
removeAdjDups (x : y : ys)
  | x == y = removeAdjDups ys
  | otherwise = x : removeAdjDups (y : ys)
*Test> removeAdjDups "1233213443"
"122133"
This function only removes the pairs it finds on a single pass.
So now I need to apply the same function to its own result.
I think foldl could help with that, but I don't know how I'd go about implementing it.
Something along the lines of
removeAdjDups' xs = foldl (\acc x -> removeAdjDups x acc) xs
Also is this approach the best way to implement the solution or is there a better way I should be thinking of?

Work from the end of the list towards the front: first remove duplicates from the tail, then check whether the head of the input equals the head of the reduced tail. By that point the reduced tail has no adjacent duplicates left, so the only possible new pair is the head of the input against the head of the reduced tail:
main = mapM_ (print . squeeze) ["acvvca", "1456776541", "abbac", "aabaabckllm"]
squeeze :: Eq a => [a] -> [a]
squeeze (x:xs) = let ys = squeeze xs in case ys of
  (y:ys') | x == y -> ys'
  _ -> x:ys
squeeze _ = []
Outputs
""
""
"c"
"ckm"

I don't see how foldl could be used for this. (Generally, foldl pretty much combines the disadvantages of foldr and foldl'... those, or foldMap, are the folds you should normally be using, not foldl.)
What you seem to intend is: repeating the removeAdjDups, until no duplicates are found anymore. The repetition is a job for
iterate :: (a -> a) -> a -> [a]
like
Prelude> iterate removeAdjDups "1233213443"
["1233213443","122133","11","","","","","","","","","","","","","","","","","","","","","","","","","","",""...
This is an infinite list of ever reduced lists. Generally, it will not converge to the empty list; you'll want to add some termination condition. If you want to remove as many dups as necessary, that's the fixpoint; it can be found in a very similar way to how you implemented removeAdjDups: compare neighbor elements, just this time in the list of reductions.
bipll's suggestion to handle recursive duplicates is much better, though: it avoids unnecessary comparisons and repeated traversal of the start of the list.
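The fixpoint idea described above can be sketched as follows. This is only a minimal sketch, not the answerer's code: the names fixpoint and removeAllAdjDups are made up for illustration, and removeAdjDups is repeated from the question so that the block stands alone.

```haskell
-- One pass, as in the question: drops each adjacent pair it meets.
removeAdjDups :: Eq a => [a] -> [a]
removeAdjDups (x : y : ys)
  | x == y    = removeAdjDups ys
  | otherwise = x : removeAdjDups (y : ys)
removeAdjDups xs = xs

-- Hypothetical helper: walk the list of iterates until two neighbours agree.
fixpoint :: Eq a => (a -> a) -> a -> a
fixpoint f = go . iterate f
  where
    go (a : rest@(b : _))
      | a == b    = a
      | otherwise = go rest
    go _ = error "fixpoint: unreachable, iterate yields an infinite list"

-- Hypothetical wrapper: repeat the single pass until nothing changes.
removeAllAdjDups :: Eq a => [a] -> [a]
removeAllAdjDups = fixpoint removeAdjDups
```

With these definitions, removeAllAdjDups "1233213443" evaluates to "" and removeAllAdjDups "aabaabckllm" to "ckm". Each pass re-traverses the whole list, which is the cost the right-to-left version avoids.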

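List comprehensions are often overlooked. They are, of course, syntactic sugar, but some, like me, are addicted. First off, strings are already lists. As written, the function below pads with " " and so works on strings, including singleton and empty ones; with a different padding element the same idea carries over to other lists. Note that it keeps one element of each run of equal neighbours rather than deleting pairs. You can use map to process many lists in a list.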
(\l -> [ x | (x,y) <- zip l $ (tail l) ++ " ", x /= y]) "abcddeeffa"
"abcdefa"

I don't see how to use foldl either. Maybe that's because, if you want to fold something here, you have to use foldr.
main = mapM_ (print . squeeze) ["acvvca", "1456776541", "abbac", "aabaabckllm"]
-- I like the name in @bipll's answer
squeeze = foldr (\ x xs -> if xs /= "" && x == head(xs) then tail(xs) else x:xs) ""
Let's analyze this. The idea is taken from @bipll's answer: go from right to left. If f is the lambda function, then by definition of foldr:
squeeze "abbac" = f 'a' (f 'b' (f 'b' (f 'a' (f 'c' ""))))
By definition of f, f 'c' "" = 'c':"" = "c" since xs == "". Next char from the right: f 'a' "c" = 'a':"c" = "ac" since 'a' /= head "c" = 'c'. f 'b' "ac" = "bac" for the same reason. But f 'b' "bac" = tail "bac" = "ac" because 'b' == head "bac". And so forth...
Bonus: by replacing foldr with scanr, you can see the whole process:
Prelude> squeeze' = scanr (\ x xs -> if xs /= "" && x == head(xs) then tail(xs) else x:xs) ""
Prelude> zip "abbac" (squeeze' "abbac")
[('a',"c"),('b',"ac"),('b',"bac"),('a',"ac"),('c',"c")]

Related

How to write a Haskell function that inserts an element into a sorted list

I tried something like this, but it doesn't work the way I wanted. I'm kind of new to Haskell, and I don't really know how to do it or what's wrong.
insert a (x:xs) = insert2 a (x:xs) []
  where insert2 el (x:xs) hd =
          if (x:xs) == []
            then []
            else if ( a>=x && a < head(xs))
              then hd ++ [x] ++ [a] ++ xs
              else insert2 a xs hd++[x]
main = do
  let list =[1 ,2 ,3 ,4 ,5 ,6]
  let out = insert 2 list
  print out
The output I get is [2,2,3,4,5,6,1]
First a couple of cosmetics:
Ensure indentation is right. When copy/pasting into StackOverflow, it's generally best to use ctrl+k to get it in code-block style.
There's no point matching (x:xs) only to pass the entire thing into your local function.
Omit unnecessary parentheses and use standardised spacing.
With that, your code becomes
insert a allxs = insert2 a allxs []
  where insert2 el (x:xs) hd =
          if x:xs == []
            then []
            else if a >= x && a < head xs
              then hd ++ [x] ++ [a] ++ xs
              else insert2 a xs hd ++ [x]
main = do
  let list = [1, 2, 3, 4, 5, 6]
  let out = insert 2 list
  print out
Algorithmically speaking, there's no point in using an “accumulator argument” here. It's easier and actually more efficient to directly recurse on the input, and simply pass on the remaining tail after done with the insertion. Also remember to have a base case:
insert a [] = [a]
insert a (x:xs) = ...
You also don't need to use head. You've already pattern-matched the head element with the x:xs pattern. If you did need another list element, you should match that right there too, like
insert a (x:x':xs) = ...
...but you don't in fact need that, x is enough to determine what to do. Namely,
insert a (x:xs)
  | a<=x = -- if the list was ordered, this implies that now _all_
           -- its elements must be greater or equal a. Do you
           -- need any recursion anymore?
  | otherwise = -- ok, `x` was smaller, so you need to insert after it.
                -- Recursion is needed here.
Here are some hints. It's a lot simpler than you're making it. You definitely don't need a helper function.
insert a [] = ??
insert a (x : xs)
  | a <= x = ???
  | otherwise = ???
Two things:
Prepending to a list is more efficient than appending to one.
Haskell lets you write separate definitions to avoid having to write single, nested conditional expressions.
There are two kinds of list you can insert into: empty and non-empty. Each can be handled by a separate definition, which the compiler will use to define a single function.
insert a [] = [a]
insert a (x:xs) = ...
The first case is easy: inserting into an empty list produces a singleton list. The second case is trickier: what you do depends on whether a is smaller than x or not. You can use a conditional expression
insert a (x:xs) = if a < x then a : insert x xs else x : insert a xs
though you may see guards used instead:
insert a (x:xs) | a < x = a : insert x xs
                | otherwise = x : insert a xs
In both cases, we know (because the list argument is already sorted) that insert x xs == x : xs, so we can write that directly to "short-circuit" the recursion:
insert a (x:xs) = if a < x then a : x : xs else x : insert a xs
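Putting the two cases together gives a complete definition. The following is a minimal sketch of that; the name insertSorted is chosen here only for illustration, so that it cannot be confused with Data.List.insert.

```haskell
-- Insert a value into an already sorted list, keeping it sorted.
-- insertSorted is an illustrative name, not one used in the answers.
insertSorted :: Ord a => a -> [a] -> [a]
insertSorted a [] = [a]
insertSorted a (x : xs)
  | a < x     = a : x : xs             -- everything from x on is >= a, so stop
  | otherwise = x : insertSorted a xs  -- keep x and insert further right
```

For the list from the question, insertSorted 2 [1, 2, 3, 4, 5, 6] yields [1,2,2,3,4,5,6].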
Don't complicate it; keep it simple:
insertme a list = takeWhile (<a) list ++ [a] ++ dropWhile (<a) list
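The one-liner above scans the prefix twice, once for takeWhile and once for dropWhile. A possible variation, sketched here with the made-up name insertSpan, uses span from the Prelude to split the list in a single pass:

```haskell
-- Same idea as the takeWhile/dropWhile version, but with one traversal
-- of the prefix. insertSpan is an illustrative name.
insertSpan :: Ord a => a -> [a] -> [a]
insertSpan a xs = smaller ++ a : rest
  where
    -- span p xs == (takeWhile p xs, dropWhile p xs)
    (smaller, rest) = span (< a) xs
```

insertSpan 2 [1, 2, 3, 4, 5, 6] also gives [1,2,2,3,4,5,6].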

How can I find the index where one list appears as a sublist of another?

I have been working with Haskell for a little over a week now so I am practicing some functions that might be useful for something. I want to compare two lists recursively. When the first list appears in the second list, I simply want to return the index at where the list starts to match. The index would begin at 0. Here is an example of what I want to execute for clarification:
subList [1,2,3] [4,4,1,2,3,5,6]
the result should be 2
I have attempted to code it:
subList :: [a] -> [a] -> a
subList [] = []
subList (x:xs) = x + 1 (subList xs)
subList xs = [ y:zs | (y,ys) <- select xs, zs <- subList ys]
where select [] = []
select (x:xs) = x
I am receiving an "error on input" and I cannot figure out why my syntax is not working. Any suggestions?
Let's first look at the function signature. You want to take in two lists whose contents can be compared for equality and return an index like so
subList :: Eq a => [a] -> [a] -> Int
So now we go through pattern matching on the arguments. First off, when the second list is empty then there is nothing we can do, so we'll return -1 as an error condition
subList _ [] = -1
Then we look at the recursive step
subList as xxs@(x:xs)
  | all (uncurry (==)) $ zip as xxs = 0
  | otherwise = 1 + subList as xs
You should be familiar with the guard syntax I've used, although you may not be familiar with the @ syntax (an as-pattern). Essentially it means that xxs names the whole list that the pattern (x:xs) matches.
You may not be familiar with all, uncurry, and possibly zip, so let me elaborate on those. zip has the function signature zip :: [a] -> [b] -> [(a,b)], so it takes two lists and pairs up their elements (and if one list is longer than the other, it just chops off the excess). uncurry is weird, so let's just look at (uncurry (==)): its signature is (uncurry (==)) :: Eq a => (a, a) -> Bool, and it essentially checks whether the first and second elements of the pair are equal. Finally, all will walk over the list, check whether the first and second elements of each pair are equal, and return True if that is the case for every pair.
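Two things are worth keeping in mind with the version above: because zip truncates, a partial match at the very end of the second list counts as a match (subList [1,2,3] [4,1] gives 1), and the -1 returned for "not found" gets added to by the 1 + in the recursive step. A different way to write the search, sketched here with Data.List functions and the made-up name subListIndex, returns a Maybe Int instead:

```haskell
import Data.List (findIndex, isPrefixOf, tails)

-- Index of the first position where needle occurs in haystack, if any.
-- subListIndex is an illustrative name; this swaps the hand-written
-- recursion for isPrefixOf, tails and findIndex from Data.List.
subListIndex :: Eq a => [a] -> [a] -> Maybe Int
subListIndex needle haystack = findIndex (needle `isPrefixOf`) (tails haystack)
```

Here subListIndex [1,2,3] [4,4,1,2,3,5,6] is Just 2, and a list that never occurs gives Nothing.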

How to sort a list in Haskell in command line ghci

I am new to Haskell, and I want to make one function that takes two lists, merges them together, and then sorts the combined list from smallest to biggest.
This should be done on the command line without using modules.
This is what I currently have. I am having trouble getting the "sortList" function to work, and I also do not know how to combine these 3 lines into 1 function.
let combineList xs ys = xs++ys
let zs = combineList xs ys
let sortList (z:zs) = if (head zs) < z then (zs:z) else (z:(sortList zs))
How to sort list in ghci:
Prelude> :m + Data.List
Prelude Data.List> sort [1,4,2,0]
[0,1,2,4]
About your functions
let combineList xs ys = xs++ys
What is the point of creating another alias for the append function? But if you really want one, it could be defined as let combineList = (++).
let zs = combineList xs ys
It makes no sense because xs and ys are unknown outside of your combineList.
let sortList (z:zs) = if (head zs) < z then (zs:z) else (z:(sort zs))
This definition is not valid: it doesn't cover the empty-list case, (zs:z) produces an infinite type, and sort is not defined yet. You can also get the head of zs with another pattern match. And maybe you don't want to make another recursive call in the then part of the if expression. Finally, I should point out that this sorting algorithm doesn't work at all.
It's a bit awkward to define a sorting function within ghci. I think the easiest way to do it would be to write the sorting function in a file and then load it into ghci. For instance, you could write this concise (though not in-place!) version of quicksort in a file called sort.hs (taken from the HaskellWiki):
quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
  where
    lesser = filter (< p) xs
    greater = filter (>= p) xs
and load it into ghci:
> :l sort.hs
If you really want to define the function in ghci, you can do something like this (from the Haskell user's guide):
> :{
> let { quicksort [] = []
> ; quicksort (p:xs) = (quicksort (filter (< p) xs)) ++ [p] ++ (quicksort (filter (>= p) xs))
> }
> :}
once this is defined, you can do
> let combineAndSort xs ys = quicksort (xs ++ ys)
As another answer already explained, it would of course be quicker to just import sort from Data.List, but it is definitely a good exercise to do it manually.
Your question suggests that you are a bit confused about the scope of variables in Haskell. In this line
> let combineList xs ys = xs++ys
you introduce the variables xs and ys. Mentioning them to the left of the equals sign just means that combineList takes two variables, and in the body of that function, you are going to refer to these variables as xs and ys. It doesn't introduce the names outside of the function, so the next line
> let zs = combineList xs ys
doesn't really make sense, because the names xs and ys are only valid within the scope of combineList. To make zs have a value, you need to give combineList some concrete arguments, eg.:
> let zs = combineList [2,4,6] [1,3,5] -- zs is [2,4,6,1,3,5]
But since the body of combineList is so simple, it would actually be easier to just do:
> let zs = [2,4,6] ++ [1,3,5] -- zs is [2,4,6,1,3,5]
The last line is
> let sortList (z:zs) = if (head zs) < z then (zs:z) else (z:(sortList zs))
I think this line has confused you a lot, because there are quite a lot of different errors in it. The answer by ДМИТРИЙ МАЛИКОВ mentions most of them; I would encourage you to try to understand each of the errors he mentions.
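For comparison, here is one way a hand-written sortList could look once those errors are avoided. It is a sketch of insertion sort, not a repair of the original algorithm; insertSorted and combineAndSort are names picked for illustration, and the definitions are meant to be saved in a file and loaded with :l as described above.

```haskell
-- Insert one element into an already sorted list.
insertSorted :: Ord a => a -> [a] -> [a]
insertSorted a [] = [a]                -- the empty-list case is covered
insertSorted a (x : xs)
  | a <= x    = a : x : xs
  | otherwise = x : insertSorted a xs

-- Insertion sort: insert every element into a sorted accumulator.
sortList :: Ord a => [a] -> [a]
sortList = foldr insertSorted []

-- Merge two lists and sort the result, in one function.
combineAndSort :: Ord a => [a] -> [a] -> [a]
combineAndSort xs ys = sortList (xs ++ ys)
```

With this, combineAndSort [2,4,6] [1,3,5] evaluates to [1,2,3,4,5,6].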

Need to partition a list into lists based on breaks in ascending order of elements (Haskell)

Say I have any list like this:
[4,5,6,7,1,2,3,4,5,6,1,2]
I need a Haskell function that will transform this list into a list of lists which are composed of the segments of the original list which form a series in ascending order. So the result should look like this:
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
Any suggestions?
You can do this by resorting to manual recursion, but I like to believe Haskell is a more evolved language. Let's see if we can develop a solution that uses existing recursion strategies. First some preliminaries.
{-# LANGUAGE NoMonomorphismRestriction #-}
-- because who wants to write type signatures, amirite?
import Data.List.Split -- from package split on Hackage
Step one is to observe that we want to split the list based on a criterion that looks at two elements of the list at once. So we'll need a new list with elements representing a "previous" and "next" value. There's a very standard trick for this:
previousAndNext xs = zip xs (drop 1 xs)
However, for our purposes, this won't quite work: this function always outputs a list that's shorter than the input, and we will always want a list of the same length as the input (and in particular we want some output even when the input is a list of length one). So we'll modify the standard trick just a bit with a "null terminator".
pan xs = zip xs (map Just (drop 1 xs) ++ [Nothing])
Now we're going to look through this list for places where the previous element is bigger than the next element (or the next element doesn't exist). Let's write a predicate that does that check.
bigger (x, y) = maybe False (x >) y
Now let's write the function that actually does the split. Our "delimiters" will be values that satisfy bigger; and we never want to throw them away, so let's keep them.
ascendingTuples = split . keepDelimsR $ whenElt bigger
The final step is just to throw together the bit that constructs the tuples, the bit that splits the tuples, and a last bit of munging to throw away the bits of the tuples we don't care about:
ascending = map (map fst) . ascendingTuples . pan
Let's try it out in ghci:
*Main> ascending [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
*Main> ascending [7,6..1]
[[7],[6],[5],[4],[3],[2],[1]]
*Main> ascending []
[[]]
*Main> ascending [1]
[[1]]
P.S. In the current release of split, keepDelimsR is slightly stricter than it needs to be, and as a result ascending currently doesn't work with infinite lists. I've submitted a patch that makes it lazier, though.
ascend :: Ord a => [a] -> [[a]]
ascend xs = foldr f [] xs
  where
    f a [] = [[a]]
    f a xs'@(y:ys) | a < head y = (a:y):ys
                   | otherwise = [a]:xs'
In ghci
*Main> ascend [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
This problem is a natural fit for a paramorphism-based solution. Having (as defined in that post)
para :: (a -> [a] -> b -> b) -> b -> [a] -> b
foldr :: (a -> b -> b) -> b -> [a] -> b
para c n (x : xs) = c x xs (para c n xs)
foldr c n (x : xs) = c x (foldr c n xs)
para c n [] = n
foldr c n [] = n
we can write
partition_asc xs = para c [] xs where
  c x (y:_) ~(a:b) | x<y = (x:a):b
  c x _ r = [x]:r
Trivial, since the abstraction fits.
BTW they have two kinds of map in Common Lisp - mapcar (processing elements of an input list one by one) and maplist (processing "tails" of a list). With this idea we get
import Data.List (tails)
partition_asc2 xs = foldr c [] . init . tails $ xs where
  c (x:y:_) ~(a:b) | x<y = (x:a):b
  c (x:_) r = [x]:r
Lazy patterns in both versions make it work with infinite input lists in a productive manner (as first shown in Daniel Fischer's answer).
update 2020-05-08: not so trivial after all. Both head . head . partition_asc $ [4] ++ undefined and the same for partition_asc2 fail with *** Exception: Prelude.undefined. The combining function c forces the next element y prematurely. It needs to be written more carefully so that it is productive right away, before ever looking at the next element, as e.g. for the second version,
partition_asc2' xs = foldr c [] . init . tails $ xs where
  c (x:ys) r@(~(a:b)) = (x:g):gs
    where
      (g,gs) | not (null ys)
               && x < head ys = (a,b)
             | otherwise = ([],r)
(again, as first shown in Daniel's answer).
You can use a right fold to break up the list at down-steps:
foldr foo [] xs
  where
    foo x yss = (x:zs) : ws
      where
        (zs, ws) = case yss of
                     (ys@(y:_)) : rest
                         | x < y -> (ys,rest)
                         | otherwise -> ([],yss)
                     _ -> ([],[])
(It's a bit complicated in order to have the combining function lazy in the second argument, so that it works well for infinite lists too.)
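To try the fold above as a standalone function, it can be wrapped in a definition like the following sketch. The names ascendingRuns and step are made up for illustration; the logic is the same right fold.

```haskell
-- Split a list into maximal strictly ascending runs, lazily.
-- ascendingRuns and step are illustrative names for the fold shown above.
ascendingRuns :: Ord a => [a] -> [[a]]
ascendingRuns = foldr step []
  where
    -- The (x : zs) : ws result is produced before yss is inspected,
    -- because the pair (zs, ws) is bound lazily.
    step x yss = (x : zs) : ws
      where
        (zs, ws) = case yss of
          (ys@(y : _)) : rest
            | x < y     -> (ys, rest)   -- x continues the next run
            | otherwise -> ([], yss)    -- x ends a run of its own
          _ -> ([], [])
```

On the list from the question this gives [[4,5,6,7],[1,2,3,4,5,6],[1,2]], and on an infinite input, take 2 (ascendingRuns (cycle [1,2,3])) returns [[1,2,3],[1,2,3]].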
One other way of approaching this task (which, in fact, lays the foundations of a very efficient sorting algorithm) is to use continuation-passing style (CPS), in this particular case applied to a fold from the right, foldr.
As is, this answer only splits out the ascending chunks. It would be nice to split out the descending ones at the same time, preferably reversed, all in O(n), which would leave only a binary merge of the obtained chunks to get a perfectly sorted output. But that's another answer for another question.
chunks :: Ord a => [a] -> [[a]]
chunks xs = foldr go return xs $ []
  where
    go :: Ord a => a -> ([a] -> [[a]]) -> ([a] -> [[a]])
    go c f = \ps -> let (r:rs) = f [c]
                    in case ps of
                         [] -> r:rs
                         [p] -> if c > p then (p:r):rs else [p]:(r:rs)
*Main> chunks [4,5,6,7,1,2,3,4,5,6,1,2]
[[4,5,6,7],[1,2,3,4,5,6],[1,2]]
*Main> chunks [4,5,6,7,1,2,3,4,5,4,3,2,6,1,2]
[[4,5,6,7],[1,2,3,4,5],[4],[3],[2,6],[1,2]]
In the above code c stands for current and p is for previous and again, remember we are folding from right so previous, is actually the next item to process.

Haskell pattern matching conundrum

I was trying to search through a list of pairs that could have the element ("$", Undefined) in it at some arbitrary location. I wanted to ONLY search the part of the list in front of that special element, so I tried something like this (alreadyThere is intended to take the element n and the list xs as arguments):
checkNotSameScope :: Env -> VarName -> Expr -> Expr
checkNotSameScope (xs:("$", Undefined):_) n e = if alreadyThere n xs then BoolLit False
                                                 else BoolLit True
But that does not work; the compiler seemed to indicate that (xs: ..) only deals with a SINGLE value prepending my list. I cannot use : to indicate the first chunk of a list; only a single element. Looking back, this makes sense; otherwise, how would the compiler know what to do? Adding an "s" to something like "x" doesn't magically make multiple elements! But how can I work around this?
Unfortunately, even with smart compilers and languages, some programming cannot be avoided...
In your case it seems you want the part of a list up to a specific element. More generally, to find the list up to some condition you can use the standard library takeWhile function. Then you can just run alreadyThere on it:
checkNotSameScope :: Env -> VarName -> Expr -> Expr
checkNotSameScope xs n e = if alreadyThere n (takeWhile (/= ("$", Undefined)) xs)
                             then BoolLit False
                             else BoolLit True
It may not do what you want for lists where ("$", Undefined) does not occur, so beware.
Similar to Joachim's answer, you can use break, which will allow you to detect when ("$", Undefined) doesn't occur (if this is necessary). i.e.
checkNotSameScope xs n e = case break (== ("$", Undefined)) xs of
  (_, []) -> .. -- ("$", Undefined) didn't occur!
  (xs', _) -> BoolLit . not $ alreadyThere n xs'
(NB. you lose some laziness in this solution, since the list has to be traversed until ("$", Undefined), or to the end, to check the first case.)
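The break approach can be tried in isolation with stand-in types. In the sketch below, Value, Env, marker, scopePrefix and this version of alreadyThere are all assumptions made for illustration, since the question does not show the real definitions.

```haskell
-- Hypothetical stand-ins for the asker's types, which the post does not show.
data Value = Undefined | IntVal Int
  deriving (Eq, Show)

type VarName = String
type Env = [(VarName, Value)]

-- The special element that separates scopes.
marker :: (VarName, Value)
marker = ("$", Undefined)

-- Bindings in front of the first marker, or Nothing if there is no marker.
scopePrefix :: Env -> Maybe Env
scopePrefix env = case break (== marker) env of
  (_, [])     -> Nothing      -- marker did not occur
  (prefix, _) -> Just prefix

-- Hypothetical alreadyThere: is the name bound somewhere in this list?
alreadyThere :: VarName -> Env -> Bool
alreadyThere n = any ((== n) . fst)
```

A caller can then decide explicitly what a missing marker should mean, for example with maybe False (alreadyThere n) (scopePrefix env).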
Haskell cannot do this kind of pattern matching out of the box, although there are some languages which can, like CLIPS for example, or F#, by using active patterns.
But we can use Haskell's existing pattern matching capabilities to obtain a similar result. Let us first define a function called deconstruct:
deconstruct :: [a] -> [([a], a, [a])]
deconstruct [] = []
deconstruct [x] = [([], x, [])]
deconstruct (x:xs) = ([], x, xs) : [(x:ys1, y, ys2) | (ys1, y, ys2) <- deconstruct xs]
What this function does is to obtain all the decompositions of a list xs into triples of form (ys1, y, ys2) such that ys1 ++ [y] ++ ys2 == xs. So for example:
deconstruct [1..4] => [([],1,[2,3,4]),([1],2,[3,4]),([1,2],3,[4]),([1,2,3],4,[])]
Using this you can define your function as follows:
checkNotSameScope xs n e =
  case [ys | (ys, ("$", Undefined), _) <- deconstruct xs] of
    [ys] -> BoolLit $ not $ alreadyThere n ys
    _ -> -- handle the case when ("$", Undefined) doesn't occur at all or more than once
We can use the do-notation to obtain something even closer to what you are looking for:
checkNotSameScope xs n e = BoolLit $ not $ any (alreadyThere n) prefixes
  where
    prefixes = do
      (xs, ("$", Undefined), _) <- deconstruct xs
      return xs
There are several things going on here. First of all the prefixes variable will store all the prefix lists which occur before the ("$", Undefined) pair - or none if the pair is not in the input list xs. Then using the any function we are checking whether alreadyThere n gives us True for any of the prefixes. And the rest is to complete your function's logic.