I am trying to define a "pairs" function, which given a positive integer, returns a list of pairs of positive integers, with the first component of the pair less than or equal to the second component, the sum of which is equal to the given positive integer. For example:
pairs 7 = [(1,6),(2,5),(3,4)]
pairs 10 = [(1,9),(2,8),(3,7),(4,6),(5,5)]
I've managed to do this without using recursion:
pairs :: Integral a => a -> [(a, a)]
pairs 1 = []
pairs x
| x <= 0 = error "It is not positive"
| mod x 2 == 0 = zip (enumFromTo 1 (div x 2)) (enumFromThenTo (pred x) (pred(pred x)) (div x 2))
| otherwise = zip (enumFromTo 1 (div x 2)) (enumFromThenTo (pred x) (pred(pred x)) (succ(div x 2)))
And using recursion:
generateTupleList:: Integral a => [(a, a)] -> [(a, a)]
generateTupleList[] = undefined
generateTupleList((a,b):[])
| (a + 1) > (b - 1) = ((a,b):[])
| otherwise = [(a,b)] ++ generateTupleList(((a+1),(b-1)):[])
pairs:: Integral a => a -> [(a, a)]
pairs 1 = []
pairs x
| x <= 0 = error "It is not positive"
| otherwise = generateTupleList[(1,x-1)]
My question is: is there another way to write this "pairs" function using recursion, without creating a separate helper function (generateTupleList in my case)?
You could just do this:
pairs :: Integral a => a -> [(a, a)]
pairs x = map (\y -> (y, x - y)) [1..x `div` 2]
In case you don't know, [a..b] is equivalent to enumFromTo a b, and wrapping a function in backticks makes it behave like an operator, so a `div` b is the same as div a b.
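The same idea can also be sketched as a list comprehension, which some readers find easier to scan. The name pairsLC is only illustrative and does not come from the question:

```haskell
-- Hypothetical alternative spelling of the answer above.
-- For x <= 1 the range [1 .. x `div` 2] is empty, so the result is [].
pairsLC :: Integral a => a -> [(a, a)]
pairsLC x = [(y, x - y) | y <- [1 .. x `div` 2]]
```

Unlike the versions in the question, this sketch returns an empty list for non-positive input instead of raising an error.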
Say you have a list of numbers, [1,2,3,5,6,7,8,9,11,12,15,16,17]
and you want a function that takes that as an input and returns something like
[[1,3],[5,9],[11,12],[15,17]] or alternatively maybe
[(1,3), (5,9), (11,12), (15,17)]
How would this be done? All of the solutions I've found online are very long and quite convoluted, when this seems like such an easy problem for a functional language like Haskell.
So we have a list of numbers,
xs = [1,2,3,5,6,7,8,9,11,12,14,16,17] -- 14 sic!
We turn it into a list of segments,
ys = [[x,x+1] | x <- xs]
-- [[1,2], [2,3], [3,4], [5,6], ..., [11,12], [12,13], [14,15], [16,17], [17,18] ]
we join the touching segments,
zs = foldr g [] ys
-- [[1,4], [5,10], [11,13], [14,15], [16,18]]
where
g [a,b] [] = [[a,b]]
g [a,b] r@([c,d]:t) | b==c = [a,d]:t
| otherwise = [a,b]:r
and we subtract 1 from each segment's ending value,
ws = [[a,b-1] | [a,b] <- zs]
-- [[1,3], [5,9], [11,12], [14,14], [16,17]]
All in all we get
ranges :: (Num t, Eq t) => [t] -> [[t]]
ranges = map (\[a,b] -> [a,b-1]) . foldr g [] . map (\x -> [x,x+1])
where
g [a,b] [] = [[a,b]]
g [a,b] r@([c,d]:t) | b==c = [a,d]:t
| otherwise = [a,b]:r
Simple and clear.
edit: or, to be properly lazy,
where
g [a,b] r = [a,x]:y
where
(x,y) = case r of ([c,d]:t) | b==c -> (d,t) -- delay forcing
_ -> (b,r)
update: as dfeuer notes, (a,a) type is better than [a,a]. Wherever [P,Q] appears in this code, replace it with (P,Q). This will improve the code, with zero cost to readability.
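A minimal sketch of what that replacement might look like, combining the tuple representation with the lazier g from the edit above. The name rangesT is an assumption made here so it does not clash with ranges:

```haskell
-- Hypothetical tuple-based variant of ranges; same algorithm, (a, b) instead of [a, b].
rangesT :: (Num t, Eq t) => [t] -> [(t, t)]
rangesT = map (\(a, b) -> (a, b - 1)) . foldr g [] . map (\x -> (x, x + 1))
  where
    -- Emit the left endpoint immediately; decide the right endpoint lazily.
    g (a, b) r = (a, x) : y
      where
        (x, y) = case r of
          ((c, d) : t) | b == c -> (d, t) -- segments touch: merge them
          _                     -> (b, r) -- gap: keep the segment as is
```

With tuples, the lambdas can no longer fail on a list of the wrong length, which is the main gain over the [a,b] version.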
I would definitely prefer the alternative representation to the first one you give.
ranges :: (Num a, Eq a) => [a] -> [(a,a)]
ranges [] = []
ranges (a : as) = ranges1 a as
-- | A version of 'ranges' for non-empty lists, where
-- the first element is supplied separately.
ranges1 :: (Num a, Eq a) => a -> [a] -> [(a,a)]
ranges1 a as = (a, b) : bs
where
-- Calculate the right endpoint and the rest of the
-- result lazily, when needed.
(b, bs) = finish a as
-- | This takes the left end of the current interval
-- and the rest of the list and produces the right endpoint of
-- that interval and the rest of the result.
finish :: (Num a, Eq a) => a -> [a] -> (a, [(a, a)])
finish l [] = (l, [])
finish l (x : xs)
| x == l + 1 = finish x xs
| otherwise = (l, ranges1 x xs)
To solve the Rosetta Code problem linked in the comment above, this isn't really quite an optimal representation. I'll try to explain how to match the representation more precisely later.
So one might do it along the lines of @Will Ness's idea of stateful folding, or mine under the same answer. All explanations are to be found there. Besides, if you get curious and want to read more about it, have a look at the Haskell Continuation Passing Style page. I am currently trying to generalize this so that we can have a stateful variant of foldr1: a foldS :: Foldable t => (a -> a -> b) -> t a -> b. However, this is still not general stateful folding. It's just tailored to this question.
ranges :: (Ord a, Num a) => [a] -> [[a]]
ranges xs = foldr go return xs $ []
where
go :: (Ord a, Num a) => a -> ([a] -> [[a]]) -> ([a] -> [[a]])
go c f = \ps -> let rrs@(r:rs) = f [c]
in case ps of
[] -> (c:r):rs
[p] -> if p + 1 == c then rrs else [p]:(c:r):rs
*Main> ranges [1,2,3,5,6,7,8,9,11,12,15,16,17]
[[1,3],[5,9],[11,12],[15,17]]
I haven't had time to test any edge cases. Any advice is welcome.
I'm a newbie to Haskell. I am pretty good with imperative languages but not with functional ones. Haskell is my first functional language.
I am trying to figure out, how to get the index of the smallest element in the list where the minimum element is defined by me.
Let me explain by examples.
For example :
Function signature
minList :: x -> [x]
let x = 2
let list = [2,3,5,4,6,5,2,1,7,9,2]
minList x list --output 1 <- is index
This should return 1, because list[1] is 3 and 3 is the smallest element after x (= 2).
let x = 1
let list = [3,5,4,6,5,2,1,7,9,2]
minList x list -- output 9 <- is index
It should return 9 because list[9] is 2, and 2 is the smallest element after 1. Here x = 1 is defined by me.
What I have tried so far.
minListIndex :: (Ord a, Num a) => a -> [a] -> a
minListIndex x [] = 0
minListIndex x (y:ys)
| x > y = length ys
| otherwise = m
where m = minListIndex x ys
When I load the file I get this error
• Couldn't match expected type ‘a’ with actual type ‘Int’
‘a’ is a rigid type variable bound by
the type signature for:
minListIndex :: forall a. (Ord a, Num a) => a -> [a] -> a
at myFile.hs:36:17
• In the expression: 1 + length ys
In an equation for ‘minListIndex’:
minListIndex x (y : ys)
| x > y = 1 + length ys
| otherwise = 1 + m
where
m = minListIndex x ys
• Relevant bindings include
m :: a (bound at myFile.hs:41:19)
ys :: [a] (bound at myFile.hs:38:19)
y :: a (bound at myFile.hs:38:17)
x :: a (bound at myFile.hs:38:14)
minListIndex :: a -> [a] -> a (bound at myFile.hs:37:1)
When I modify the function like this
minListIndex :: (Ord a, Num a) => a -> [a] -> a
minListIndex x [] = 0
minListIndex x (y:ys)
| x > y = 2 -- <- modified...
| otherwise = 3 -- <- modified
where m = minListIndex x ys
I load the file again and then it compiles and runs, but of course the output is not what I want.
What is the problem with
| x > y = length ys
| otherwise = m
?
In short: I want to find the index of the smallest element that is greater than x, where x is given by me as a parameter.
Thanks for the help in advance!
minListIndex :: (Ord a, Num a) => a -> [a] -> a
The problem is that you are trying to return a result of the generic type a, but the result is actually an index into a list.
Suppose you evaluate your function on a list of doubles. In that case the compiler would instantiate the function's type to Double -> [Double] -> Double, which is nonsense.
The compiler notices that you are returning something derived from the list's length and tells you that it cannot match the generic type a with the concrete type Int.
length ys returns Int, so you can try this instead:
minListIndex :: Ord a => a -> [a] -> Int
Regarding your original problem, it seems that you can't solve it with plain recursion alone. Consider defining a recursive helper function with an accumulator. In your case the accumulator can be a pair (min_value_so_far, its_index).
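One possible shape for such a helper is sketched below. The name minIndexAfter, the Maybe result, and the extra "seen" flag (used to skip everything up to the first occurrence of x, as the examples in the question suggest) are all assumptions of this sketch, not part of the answer:

```haskell
-- Hypothetical helper-with-accumulator version.
-- The accumulator is Maybe (min_value_so_far, its_index).
minIndexAfter :: Ord a => a -> [a] -> Maybe Int
minIndexAfter x = go 0 False Nothing
  where
    go _ _ acc [] = fmap snd acc
    go i seen acc (y : ys)
      -- Until x itself has been seen, only look for x.
      | not seen = go (i + 1) (y == x) acc ys
      -- A candidate must exceed x and beat the current minimum.
      | y > x && maybe True ((y <) . fst) acc = go (i + 1) True (Just (y, i)) ys
      | otherwise = go (i + 1) True acc ys
```

Returning Maybe Int avoids inventing an index when x is absent or nothing larger follows it.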
First off, I'd separate the index type from the list element type altogether. There's no apparent reason for them to be the same. I will use the BangPatterns extension to avoid a space leak without too much notation; enable that by adding {-# language BangPatterns #-} to the very top of the file. I will also import Data.Word to get access to the Word64 type.
There are two stages: first, find the index of the given element (if it's present) and the rest of the list beyond that point. Then, find the index of the minimum of the tail.
-- Find the 0-based index of the first occurrence
-- of the given element in the list, and
-- the rest of the list after that element.
findGiven :: Eq a => a -> [a] -> Maybe (Word64, [a])
findGiven given = go 0 where
go !_k [] = Nothing --not found
go !k (x:xs)
| given == x = Just (k, xs)
| otherwise = go (k+1) xs
-- Find the minimum (and its index) of the elements of the
-- list greater than the given one.
findMinWithIndexOver :: Ord a => a -> [a] -> Maybe (Word64, a)
findMinWithIndexOver given = go 0 Nothing where
go !_k acc [] = acc
go !k acc (x : xs)
| x <= given = go (k + 1) acc xs
| otherwise
= case acc of
Nothing -> go (k + 1) (Just (k, x)) xs
Just (ix_min, curr_min)
| x < curr_min = go (k + 1) (Just (k, x)) xs
| otherwise = go (k + 1) acc xs
You can now put these functions together to construct the one you seek. If you want a general Num result rather than a Word64 one, you can use fromIntegral at the very end. Why use Word64? Unlike Int or Word, it's (practically) guaranteed not to overflow in any reasonable amount of time. It's likely substantially faster than using something like Integer or Natural directly.
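As a rough sketch of how the two stages might be combined: the block below repeats both helpers so that it compiles on its own, uses $! instead of BangPatterns so no extension is needed, and adds a combinator whose name, minIndexAfterGiven, is made up for this illustration.

```haskell
import Data.Word (Word64)

-- Index of the first occurrence of the given element, plus the rest of the list.
findGiven :: Eq a => a -> [a] -> Maybe (Word64, [a])
findGiven given = go 0
  where
    go _ [] = Nothing
    go k (x : xs)
      | given == x = Just (k, xs)
      | otherwise = (go $! k + 1) xs

-- Minimum (with index) among the elements greater than the given one.
findMinWithIndexOver :: Ord a => a -> [a] -> Maybe (Word64, a)
findMinWithIndexOver given = go 0 Nothing
  where
    go _ acc [] = acc
    go k acc (x : xs)
      | x > given && improves acc = (go $! k + 1) (Just (k, x)) xs
      | otherwise = (go $! k + 1) acc xs
      where
        improves Nothing = True
        improves (Just (_, currMin)) = x < currMin

-- Hypothetical combination: the second index is relative to the tail,
-- so shift it by the position just past the given element.
minIndexAfterGiven :: Ord a => a -> [a] -> Maybe Word64
minIndexAfterGiven given xs = do
  (k, rest) <- findGiven given xs
  (j, _) <- findMinWithIndexOver given rest
  pure (k + 1 + j)
```

The Maybe monad takes care of both failure cases: the given element is missing, or nothing larger follows it.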
It is not clear to me what exactly you want. Based on the examples, I guess it is: find the index of the smallest element greater than x that appears after x. In that case, this solution uses only the Prelude, with no imports:
minList :: Ord a => a -> [a] -> Int
minList x l = snd . minimum . filter (\a -> x < fst a) . dropWhile (\a -> x /= fst a) $ zip l [0..]
The logic is:
create the list of pairs, [(elem, index)] using zip l [0..]
drop elements until you find the input x using dropWhile (\a -> x /= fst a)
discard elements less than or equal to x using filter (\a -> x < fst a)
find the minimum of the resulting list. Tuples are ordered using lexicographic order so it fits your problem
take the index using snd
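One caveat: minimum raises an error on an empty list, which happens when x is absent or nothing larger follows it. A hedged variant that returns Maybe instead is sketched below; the name minListSafe is an assumption of this sketch:

```haskell
-- Hypothetical total version of minList: same pipeline, but the empty case is explicit.
minListSafe :: Ord a => a -> [a] -> Maybe Int
minListSafe x l =
  case filter ((x <) . fst) (dropWhile ((x /=) . fst) (zip l [0 ..])) of
    [] -> Nothing                  -- no element greater than x after x
    ps -> Just (snd (minimum ps))  -- pairs compare lexicographically
```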
Your function can be constructed out of ready-made parts as
import Data.Maybe (listToMaybe)
import Data.List (sortBy)
import Data.Ord (comparing)
foo :: (Ord a, Enum b) => a -> [a] -> Maybe b
foo x = fmap fst . listToMaybe . take 1
. dropWhile ((<= x) . snd)
. sortBy (comparing snd)
. dropWhile ((/= x) . snd)
. zip [toEnum 0..]
This finds (as a Maybe) the index of the smallest element greater than the given element that is situated after the given element in the input list, as you've requested.
You can use any Enum type of your choosing as the index.
Now you can implement this higher-level executable spec as direct recursion, using an efficient Map data structure to hold your sorted elements above x seen so far to find the next smallest, etc.
Correctness first, efficiency later!
Efficiency update: dropping after the sort drops them sorted, so there's a wasted effort there; indeed it should be replaced with the filtering (as seen in the answer by Luis Morillo) before the sort. And if our element type is in Integral (so it is a properly discrete type, unlike just an Enum, thanks to @dfeuer for pointing this out!), there's one more opportunity for an opportunistic optimization: if we hit on a succ minimal element by pure chance, there's no further chance of improvement, and so we should bail out at that point right there:
bar :: (Integral a, Enum b) => a -> [a] -> Maybe b
bar x = fmap fst . either Just (listToMaybe . take 1
. sortBy (comparing snd))
. findOrFilter ((== succ x).snd) ((> x).snd)
. dropWhile ((/= x) . snd)
. zip [toEnum 0..]
findOrFilter :: (a -> Bool) -> (a -> Bool) -> [a] -> Either a [a]
findOrFilter t p = go
where go [] = Right []
go (x:xs) | t x = Left x
| otherwise = fmap ([x | p x] ++) $ go xs
Testing:
> foo 5 [2,3,5,4,6,5,2,1,7,9,2] :: Maybe Int
Just 4
> foo 2 [2,3,5,4,6,5,2,1,7,9,2] :: Maybe Int
Just 1
> foo 1 [3,5,4,6,5,2,1,7,9,2] :: Maybe Int
Just 9
I am trying to learn Haskell and I want to solve one task. I have a list of Integers and I need to add them to another list if they are bigger than both of their neighbors. For example:
I have a starting list of [0,1,5,2,3,7,8,4] and I need to print out a list which is [5, 8]
This is the code I came up with, but it returns an empty list:
largest :: [Integer]->[Integer]
largest n
| head n > head (tail n) = head n : largest (tail n)
| otherwise = largest (tail n)
I would solve this as outlined by Thomas M. DuBuisson. Since we want the ends of the list to "count", we'll add negative infinities to each end before creating triples. The monoid-extras package provides a suitable type for this.
import Data.Monoid.Inf
pad :: [a] -> [NegInf a]
pad xs = [negInfty] ++ map negFinite xs ++ [negInfty]
triples :: [a] -> [(a, a, a)]
triples (x:rest@(y:z:_)) = (x,y,z) : triples rest
triples _ = []
isBig :: Ord a => (a,a,a) -> Bool
isBig (x,y,z) = y > x && y > z
scnd :: (a, b, c) -> b
scnd (a, b, c) = b
finites :: [Inf p a] -> [a]
finites xs = [x | Finite x <- xs]
largest :: Ord a => [a] -> [a]
largest = id
. finites
. map scnd
. filter isBig
. triples
. pad
It seems to be working appropriately; in ghci:
> largest [0,1,5,2,3,7,8,4]
[5,8]
> largest [10,1,10]
[10,10]
> largest [3]
[3]
> largest []
[]
You might also consider merging finites, map scnd, and filter isBig in a single list comprehension (then eliminating the definitions of finites, scnd, and isBig):
largest :: Ord a => [a] -> [a]
largest xs = [x | (a, b@(Finite x), c) <- triples (pad xs), a < b, c < b]
But I like the decomposed version better; the finites, scnd, and isBig functions may turn out to be useful elsewhere in your development, especially if you plan to build a few variants of this for different needs.
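If pulling in monoid-extras is not an option, the same padding trick can be approximated with Maybe from base, because the derived Ord instance puts Nothing below every Just value. This is only a sketch under that assumption, and the name largestM is made up here:

```haskell
-- Hypothetical base-only variant: Nothing plays the role of negative infinity.
largestM :: Ord a => [a] -> [a]
largestM xs =
  [ y
  | (a, b@(Just y), c) <- zip3 padded (drop 1 padded) (drop 2 padded)
  , a < b
  , c < b
  ]
  where
    -- Pad both ends so that the first and last elements have two neighbors.
    padded = [Nothing] ++ map Just xs ++ [Nothing]
```

It reproduces the ghci results shown above, for example largestM [10,1,10] gives [10,10].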
One thing you might try is lookahead. (Thomas M. DuBuisson suggested a different one that will also work if you handle the final one or two elements correctly.) Since it sounds like this is a problem you want to solve on your own as a learning exercise, I’ll write a skeleton that you can take as a starting-point if you want:
largest :: [Integer] -> [Integer]
largest [] = _
largest [x] = _ -- What should this return?
largest [x1,x2] | x1 > x2 = _
| x1 < x2 = _
| otherwise = _
largest [x1,x2,x3] | x2 > x1 && x2 > x3 = _
| x3 > x2 = _
| otherwise = _
largest (x1:x2:x3:xs) | x2 > x1 && x2 > x3 = _
| otherwise = _
We need the special case of [x1,x2,x3] in addition to (x1:x2:x3:xs) because, according to the clarification in your comment, largest [3,3,2] should return [], but largest [3,2] should return [3]. Therefore, the final three elements require special handling and cannot simply recurse on the final two.
If you also want the result to include the head of the list if it is greater than the second element, you’d make this a helper function and your largest would be something like largest (x1:x2:xs) = (if x1>x2 then [x1] else []) ++ largest' (x1:x2:xs). That is, you want some special handling for the first elements of the original list, which you don’t want to apply to all the sublists when you recurse.
As suggested in the comments, one approach would be to first group the list into tuples of length 3 using Prelude's zip3 and tail:
*Main> let xs = [0,1,5,2,3,7,8,4]
*Main> zip3 xs (tail xs) (tail (tail xs))
[(0,1,5),(1,5,2),(5,2,3),(2,3,7),(3,7,8),(7,8,4)]
Here zip3 and tail have types [a] -> [b] -> [c] -> [(a, b, c)] and [a] -> [a] respectively.
Next you need to find a way to filter out the tuples where the middle element is bigger than the first and last element. One way would be to use Prelude's filter function:
*Main> let xs = [(0,1,5),(1,5,2),(5,2,3),(2,3,7),(3,7,8),(7,8,4)]
*Main> filter (\(a, b, c) -> b > a && b > c) xs
[(1,5,2),(7,8,4)]
filter has type (a -> Bool) -> [a] -> [a]. It keeps only the elements of a list for which the predicate returns True.
Now for the final part, you need to extract the middle element from the filtered tuples above. You can do this easily with Prelude's map function:
*Main> let xs = [(1,5,2),(7,8,4)]
*Main> map (\(_, x, _) -> x) xs
[5,8]
map has type (a -> b) -> [a] -> [b]. It applies a function to every element, turning a list of a into a list of b.
The above code stitched together would look like this:
largest :: (Ord a) => [a] -> [a]
largest xs = map (\(_, x, _) -> x) $ filter (\(a, b, c) -> b > a && b > c) $ zip3 xs (tail xs) (tail (tail xs))
Note here I used typeclass Ord, since the above code needs to compare with > and <. It's fine to keep it as Integer here though.
Problem
Let us suppose that we have a list xs (possibly a very big one), and we want to check that all its elements are the same.
I came up with various ideas:
Solution 0
checking that all elements in tail xs are equal to head xs:
allTheSame :: (Eq a) => [a] -> Bool
allTheSame xs = and $ map (== head xs) (tail xs)
Solution 1
checking that length xs is equal to the length of the list obtained by taking elements from xs while they're equal to head xs
allTheSame' :: (Eq a) => [a] -> Bool
allTheSame' xs = (length xs) == (length $ takeWhile (== head xs) xs)
Solution 2
recursive solution: allTheSame returns True if the first two elements of xs are equal and allTheSame returns True on the rest of xs
allTheSame'' :: (Eq a) => [a] -> Bool
allTheSame'' xs
| n == 0 = False
| n == 1 = True
| n == 2 = xs !! 0 == xs !! 1
| otherwise = (xs !! 0 == xs !! 1) && (allTheSame'' $ snd $ splitAt 2 xs)
where n = length xs
Solution 3
divide and conquer:
allTheSame''' :: (Eq a) => [a] -> Bool
allTheSame''' xs
| n == 0 = False
| n == 1 = True
| n == 2 = xs !! 0 == xs !! 1
| n == 3 = xs !! 0 == xs !! 1 && xs !! 1 == xs !! 2
| otherwise = allTheSame''' (fst split) && allTheSame''' (snd split)
where n = length xs
split = splitAt (n `div` 2) xs
Solution 4
I just thought about this while writing this question:
allTheSame'''' :: (Eq a) => [a] -> Bool
allTheSame'''' xs = all (== head xs) (tail xs)
Questions
I think Solution 0 is not very efficient, at least in terms of memory, because map will construct another list before applying and to its elements. Am I right?
Solution 1 is still not very efficient, at least in terms of memory, because takeWhile will again build an additional list. Am I right?
Solution 2 is tail recursive (right?), and it should be pretty efficient, because it will return False as soon as (xs !! 0 == xs !! 1) is False. Am I right?
Solution 3 should be the best one, because its complexity should be O(log n)
Solution 4 looks quite Haskellish to me (is it?), but it's probably the same as Solution 0, because all p = and . map p (from Prelude.hs). Am I right?
Are there other better ways of writing allTheSame? Now, I expect someone will answer this question telling me that there's a built-in function that does this: I've searched with hoogle and I haven't found it. Anyway, since I'm learning Haskell, I believe that this was a good exercise for me :)
Any other comment is welcome. Thank you!
gatoatigrado's answer gives some nice advice for measuring the performance of various solutions. Here is a more symbolic answer.
I think solution 0 (or, exactly equivalently, solution 4) will be the fastest. Remember that Haskell is lazy, so map will not have to construct the whole list before and is applied. A good way to build intuition about this is to play with infinity. So for example:
ghci> and $ map (< 1000) [1..]
False
This asks whether all numbers are less than 1,000. If map constructed the entire list before and were applied, then this question could never be answered. The expression will still answer quickly even if you give the list a very large right endpoint (that is, Haskell is not doing any "magic" depending on whether a list is infinite).
To start my example, let's use these definitions:
and [] = True
and (x:xs) = x && and xs
map f [] = []
map f (x:xs) = f x : map f xs
True && x = x
False && x = False
Here is the evaluation order for allTheSame [7,7,7,7,8,7,7,7]. There will be extra sharing that is too much of a pain to write down. I will also evaluate the head expression earlier than it would be for conciseness (it would have been evaluated anyway, so it's hardly different).
allTheSame [7,7,7,7,8,7,7,7]
allTheSame (7:7:7:7:8:7:7:7:[])
and $ map (== head (7:7:7:7:8:7:7:7:[])) (tail (7:7:7:7:8:7:7:7:[]))
and $ map (== 7) (tail (7:7:7:7:8:7:7:7:[]))
and $ map (== 7) (7:7:7:8:7:7:7:[])
and $ (== 7) 7 : map (== 7) (7:7:8:7:7:7:[])
(== 7) 7 && and (map (== 7) (7:7:8:7:7:7:[]))
True && and (map (== 7) (7:7:8:7:7:7:[]))
and (map (== 7) (7:7:8:7:7:7:[]))
(== 7) 7 && and (map (== 7) (7:8:7:7:7:[]))
True && and (map (== 7) (7:8:7:7:7:[]))
and (map (== 7) (7:8:7:7:7:[]))
(== 7) 7 && and (map (== 7) (8:7:7:7:[]))
True && and (map (== 7) (8:7:7:7:[]))
and (map (== 7) (8:7:7:7:[]))
(== 7) 8 && and (map (== 7) (7:7:7:[]))
False && and (map (== 7) (7:7:7:[]))
False
See how we didn't even have to check the last 3 7's? This is lazy evaluation making a list work more like a loop. All your other solutions use expensive functions like length (which have to walk all the way to the end of the list to give an answer), so they will be less efficient and also they will not work on infinite lists. Working on infinite lists and being efficient often go together in Haskell.
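To make the short-circuiting concrete, here is a small sketch. Both names are invented for this illustration; the first list deliberately has an undefined tail, which would crash if it were ever forced:

```haskell
-- The cell after 8 is undefined; and stops at the first False,
-- so that cell is never evaluated.
shortCircuits :: Bool
shortCircuits = and (map (== (7 :: Int)) (7 : 8 : undefined))

-- all behaves the same way, even on an infinite list.
stopsOnInfinite :: Bool
stopsOnInfinite = all (== (7 :: Int)) (7 : 8 : repeat 7)
```

Both expressions evaluate to False after inspecting only the first two elements.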
First of all, I don't think you want to be working with lists. A lot of your algorithms rely upon calculating the length, which is bad. You may want to consider the vector package, which will give you O(1) length compared to O(n) for a list. Vectors are also much more memory efficient, particularly if you can use Unboxed or Storable variants.
That being said, you really need to consider traversals and usage patterns in your code. Haskell's lists are very efficient if they can be generated on demand and consumed once. This means that you shouldn't hold on to references to a list. Something like this:
average xs = sum xs / fromIntegral (length xs)
requires that the entire list be retained in memory (by either sum or length) until both traversals are completed. If you can do your list traversal in one step, it'll be much more efficient.
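A one-pass version could look roughly like the sketch below. The name averageOnePass and the choice of Double are assumptions made for the example; it uses foldl' with explicit seq calls so that neither running total builds up thunks:

```haskell
import Data.List (foldl')

-- Hypothetical single-traversal average: sum and count are accumulated together,
-- so the consumed prefix of the list can be garbage collected as we go.
averageOnePass :: [Double] -> Double
averageOnePass xs = total / fromIntegral count
  where
    (total, count) = foldl' step (0, 0 :: Int) xs
    step (s, n) x =
      let s' = s + x
          n' = n + 1
       in s' `seq` n' `seq` (s', n') -- force both components at each step
```

On an empty list this divides zero by zero and yields NaN, so a real implementation would probably want to handle that case separately.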
Of course, you may need to retain the list anyway, such as to check if all the elements are equal, and if they aren't, do something else with the data. In this case, with lists of any size you're probably better off with a more compact data structure (e.g. vector).
Now that this is out of the way, here's a look at each of these functions. Where I show core, it was generated with ghc-7.0.3 -O -ddump-simpl. Also, don't bother judging Haskell code performance when compiled with -O0. Compile it with the flags you would actually use for production code, typically at least -O and maybe other options too.
Solution 0
allTheSame :: (Eq a) => [a] -> Bool
allTheSame xs = and $ map (== head xs) (tail xs)
GHC produces this Core:
Test.allTheSame
:: forall a_abG. GHC.Classes.Eq a_abG => [a_abG] -> GHC.Bool.Bool
[GblId,
Arity=2,
Str=DmdType LS,
Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=2, Value=True,
ConLike=True, Cheap=True, Expandable=True,
Guidance=IF_ARGS [3 3] 16 0}]
Test.allTheSame =
\ (@ a_awM)
($dEq_awN :: GHC.Classes.Eq a_awM)
(xs_abH :: [a_awM]) ->
case xs_abH of _ {
[] ->
GHC.List.tail1
`cast` (CoUnsafe (forall a1_axH. [a1_axH]) GHC.Bool.Bool
:: (forall a1_axH. [a1_axH]) ~ GHC.Bool.Bool);
: ds1_axJ xs1_axK ->
letrec {
go_sDv [Occ=LoopBreaker] :: [a_awM] -> GHC.Bool.Bool
[LclId, Arity=1, Str=DmdType S]
go_sDv =
\ (ds_azk :: [a_awM]) ->
case ds_azk of _ {
[] -> GHC.Bool.True;
: y_azp ys_azq ->
case GHC.Classes.== @ a_awM $dEq_awN y_azp ds1_axJ of _ {
GHC.Bool.False -> GHC.Bool.False; GHC.Bool.True -> go_sDv ys_azq
}
}; } in
go_sDv xs1_axK
}
This looks pretty good, actually. It will produce an error with an empty list, but that's easily fixed. This is the case xs_abH of _ { [] ->. After this GHC performed a worker/wrapper transformation, the recursive worker function is the letrec { go_sDv binding. The worker examines its argument. If [], it's reached the end of the list and returns True. Otherwise it compares the head of the remaining to the first element and either returns False or checks the rest of the list.
Three other features.
The map was entirely fused away and doesn't allocate a temporary list.
Near the top of the definition notice the Cheap=True statement. This means GHC considers the function "cheap", and thus a candidate for inlining. At a call site, if a concrete argument type can be determined, GHC will probably inline allTheSame and produce a very tight inner loop, completely bypassing the Eq dictionary lookup.
The worker function is tail-recursive.
Verdict: Very strong contender.
Solution 1
allTheSame' :: (Eq a) => [a] -> Bool
allTheSame' xs = (length xs) == (length $ takeWhile (== head xs) xs)
Even without looking at core I know this won't be as good. The list is traversed more than once, first by length xs then by length $ takeWhile. Not only do you have the extra overhead of multiple traversals, it means that the list must be retained in memory after the first traversal and can't be GC'd. For a big list, this is a serious problem.
Test.allTheSame'
:: forall a_abF. GHC.Classes.Eq a_abF => [a_abF] -> GHC.Bool.Bool
[GblId,
Arity=2,
Str=DmdType LS,
Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=2, Value=True,
ConLike=True, Cheap=True, Expandable=True,
Guidance=IF_ARGS [3 3] 20 0}]
Test.allTheSame' =
\ (@ a_awF)
($dEq_awG :: GHC.Classes.Eq a_awF)
(xs_abI :: [a_awF]) ->
case GHC.List.$wlen @ a_awF xs_abI 0 of ww_aC6 { __DEFAULT ->
case GHC.List.$wlen
@ a_awF
(GHC.List.takeWhile
@ a_awF
(let {
ds_sDq :: a_awF
[LclId, Str=DmdType]
ds_sDq =
case xs_abI of _ {
[] -> GHC.List.badHead @ a_awF; : x_axk ds1_axl -> x_axk
} } in
\ (ds1_dxa :: a_awF) ->
GHC.Classes.== @ a_awF $dEq_awG ds1_dxa ds_sDq)
xs_abI)
0
of ww1_XCn { __DEFAULT ->
GHC.Prim.==# ww_aC6 ww1_XCn
}
}
Looking at the core doesn't tell much beyond that. However, note these lines:
case GHC.List.$wlen @ a_awF xs_abI 0 of ww_aC6 { __DEFAULT ->
case GHC.List.$wlen
This is where the list traversals happen. The first gets the length of the outer list and binds it to ww_aC6. The second gets the length of the inner list, but the binding doesn't happen until near the bottom, at
of ww1_XCn { __DEFAULT ->
GHC.Prim.==# ww_aC6 ww1_XCn
The lengths (both Ints) can be unboxed and compared by a primop, but that's a small consolation after the overhead that's been introduced.
Verdict: Not good.
Solution 2
allTheSame'' :: (Eq a) => [a] -> Bool
allTheSame'' xs
| n == 0 = False
| n == 1 = True
| n == 2 = xs !! 0 == xs !! 1
| otherwise = (xs !! 0 == xs !! 1) && (allTheSame'' $ snd $ splitAt 2 xs)
where n = length xs
This has the same problem as solution 1. The list is traversed multiple times, and it can't be GC'd. It's worse here though, because now the length is calculated for each sub-list. I'd expect this to have the worst performance of all on lists of any significant size. Also, why are you special-casing lists of 1 and 2 elements when you're expecting the list to be big?
Verdict: Don't even think about it.
Solution 3
allTheSame''' :: (Eq a) => [a] -> Bool
allTheSame''' xs
| n == 0 = False
| n == 1 = True
| n == 2 = xs !! 0 == xs !! 1
| n == 3 = xs !! 0 == xs !! 1 && xs !! 1 == xs !! 2
| otherwise = allTheSame''' (fst split) && allTheSame''' (snd split)
where n = length xs
split = splitAt (n `div` 2) xs
This has the same problem as Solution 2. Namely, the list is traversed multiple times by length. I'm not certain a divide-and-conquer approach is a good choice for this problem, it could end up taking longer than a simple scan. It would depend on the data though, and be worth testing.
Verdict: Maybe, if you used a different data structure.
Solution 4
allTheSame'''' :: (Eq a) => [a] -> Bool
allTheSame'''' xs = all (== head xs) (tail xs)
This was basically my first thought. Let's check the core again.
Test.allTheSame''''
:: forall a_abC. GHC.Classes.Eq a_abC => [a_abC] -> GHC.Bool.Bool
[GblId,
Arity=2,
Str=DmdType LS,
Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=2, Value=True,
ConLike=True, Cheap=True, Expandable=True,
Guidance=IF_ARGS [3 3] 10 0}]
Test.allTheSame'''' =
\ (@ a_am5)
($dEq_am6 :: GHC.Classes.Eq a_am5)
(xs_alK :: [a_am5]) ->
case xs_alK of _ {
[] ->
GHC.List.tail1
`cast` (CoUnsafe (forall a1_axH. [a1_axH]) GHC.Bool.Bool
:: (forall a1_axH. [a1_axH]) ~ GHC.Bool.Bool);
: ds1_axJ xs1_axK ->
GHC.List.all
@ a_am5
(\ (ds_dwU :: a_am5) ->
GHC.Classes.== @ a_am5 $dEq_am6 ds_dwU ds1_axJ)
xs1_axK
}
Ok, not too bad. Like Solution 0, this will error on empty lists. The list traversal is hidden in GHC.List.all, but it will probably be expanded to good code at a call site.
Verdict: Another strong contender.
So between all of these, with lists I'd expect that Solutions 0 and 4 are the only ones worth using, and they are pretty much the same. I might consider Solution 3 in some cases.
Edit: in both cases, the errors on empty lists can be simply fixed as in @augustss's answer.
The next step would be to do some time profiling with criterion.
A solution using consecutive pairs:
allTheSame xs = and $ zipWith (==) xs (tail xs)
Q1 -- Yeah, I think your simple solution is fine, there is no memory leak. Q4 -- Solution 3 is not log(n), via the very simple argument that you need to look at all list elements to determine whether they are the same, and looking at 1 element takes 1 time step. Q5 -- yes. Q6, see below.
The way to go about this is to type it in and run it
main = do
print $ allTheSame (replicate 100000000 1)
then run ghc -O3 -optc-O3 --make Main.hs && time ./Main. I like the last solution best (you can also use pattern matching to clean it up a little),
allTheSame (x:xs) = all (==x) xs
Open up ghci and run ":step fcn" on these things. It will teach you a lot about what lazy evaluation is expanding. In general, when you match a constructor, e.g. "x:xs", that's constant time. When you call "length", Haskell needs to compute all of the elements in the list (though their values are still "to-be-computed"), so solution 1 and 2 are bad.
edit 1
Sorry if my previous answer was a bit shallow. It seems like expanding things manually does help a little (though compared to the other options, it's a trivial improvement),
{-# LANGUAGE BangPatterns #-}
allTheSame [] = True
allTheSame ((!x):xs) = go x xs where
go !x [] = True
go !x (!y:ys) = (x == y) && (go x ys)
It seems that GHC is specializing the function already, but you can look at the SPECIALIZE pragma too, in case it doesn't work for your code.
Here is another version (don't need to traverse whole list in case something doesn't match):
import Data.List (find)
import Data.Maybe (isNothing)
allTheSame [] = True
allTheSame (x:xs) = isNothing $ find (x /= ) xs
This may not be syntactically correct , but I hope you got the idea.
Here's another fun way:
{-# INLINABLE allSame #-}
allSame :: Eq a => [a] -> Bool
allSame xs = foldr go (`seq` True) xs Nothing where
go x r Nothing = r (Just x)
go x r (Just prev) = x == prev && r (Just x)
By keeping track of the previous element, rather than the first one, this implementation can easily be changed to implement increasing or decreasing. To check all of them against the first instead, you could rename prev to first, and replace Just x with Just first.
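For instance, a strictly increasing check might be sketched as follows. The name increasing and the choice of strict comparison are assumptions of this sketch; only the comparison in the second equation of go differs from allSame:

```haskell
-- Hypothetical variant of allSame: each element must exceed its predecessor.
increasing :: Ord a => [a] -> Bool
increasing xs = foldr go (`seq` True) xs Nothing
  where
    go x r Nothing = r (Just x)                -- first element: nothing to compare
    go x r (Just prev) = prev < x && r (Just x)
```

Swapping prev < x for prev > x gives the decreasing check, and prev <= x gives the non-strict version.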
How will this be optimized? I haven't checked in detail, but I'm going to tell a good story based on some things I know about GHC's optimizations.
Suppose first that list fusion does not occur. Then foldr will be inlined, giving something like
allSame xs = allSame' xs Nothing where
allSame' [] = (`seq` True)
allSame' (x : xs) = go x (allSame' xs)
Eta expansion then yields
allSame' [] acc = acc `seq` True
allSame' (x : xs) acc = go x (allSame' xs) acc
Inlining go,
allSame' [] acc = acc `seq` True
allSame' (x : xs) Nothing = allSame' xs (Just x)
allSame' (x : xs) (Just prev) =
x == prev && allSame' xs (Just x)
Now GHC can recognize that the Maybe value is always Just on the recursive call, and use a worker-wrapper transformation to take advantage of this:
allSame' [] acc = acc `seq` True
allSame' (x : xs) Nothing = allSame'' xs x
allSame' (x : xs) (Just prev) = x == prev && allSame'' xs x
allSame'' [] prev = True
allSame'' (x : xs) prev = x == prev && allSame'' xs x
Remember now that
allSame xs = allSame' xs Nothing
and allSame' is no longer recursive, so it can be beta-reduced:
allSame [] = True
allSame (x : xs) = allSame'' xs x
allSame'' [] _ = True
allSame'' (x : xs) prev = x == prev && allSame'' xs x
So the higher-order code has turned into efficient recursive code with no extra allocation.
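If you want to convince yourself that the derived form agrees with the original, here is a small sketch that puts both side by side. The names allSameDirect and loop are mine; loop plays the role of allSame'' from the derivation.

```haskell
-- The original higher-order definition.
allSame :: Eq a => [a] -> Bool
allSame xs = foldr go (`seq` True) xs Nothing
  where
    go x r Nothing = r (Just x)
    go x r (Just prev) = x == prev && r (Just x)

-- The hand-derived first-order form.
allSameDirect :: Eq a => [a] -> Bool
allSameDirect [] = True
allSameDirect (x : xs) = loop xs x
  where
    loop [] _ = True
    loop (y : ys) prev = y == prev && loop ys y

main :: IO ()
main = print (all (\xs -> allSame xs == allSameDirect xs) samples)
  where
    samples = [[], [1], [1, 1, 1], [1, 1, 2], [2, 1, 1 :: Int]]
```

This only checks a handful of inputs, so it is a sanity check rather than a proof.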
Compiling the module defining allSame using -O2 -ddump-simpl -dsuppress-all -dno-suppress-type-signatures yields the following (I've cleaned it up a bit):
allSame :: forall a. Eq a => [a] -> Bool
allSame =
  \ (@ a) ($dEq_a :: Eq a) (xs0 :: [a]) ->
    let {
      equal :: a -> a -> Bool
      equal = == $dEq_a } in
    letrec {
      go :: [a] -> a -> Bool
      go =
        \ (xs :: [a]) (prev :: a) ->
          case xs of _ {
            [] -> True;
            : y ys ->
              case equal y prev of _ {
                False -> False;
                True -> go ys y
              }
          }; } in
    case xs0 of _ {
      [] -> True;
      : x xs -> go xs x
    }
As you can see, this is essentially the same as the result I described. The equal = == $dEq_a bit is where the equality method is extracted from the Eq dictionary and saved in a variable so it only needs to be extracted once.
What if list fusion does occur? Here's a reminder of the definition:
allSame xs = foldr go (`seq` True) xs Nothing where
  go x r Nothing = r (Just x)
  go x r (Just prev) = x == prev && r (Just x)
If we call allSame (build g), the foldr will fuse with the build according to the rule foldr c n (build g) = g c n, yielding
allSame (build g) = g go (`seq` True) Nothing
That doesn't get us anywhere interesting unless g is known. So let's choose something simple:
replicate k0 a = build $ \c n ->
  let
    rep 0 = n
    rep k = a `c` rep (k - 1)
  in rep k0
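For anyone who wants to play with this, here is a runnable sketch of a build-based replicate together with a spot check of the foldr/build equation. build comes from GHC.Exts; the names repGen and replicateB are hypothetical, and the real replicate in base is defined differently. The sketch assumes a non-negative count, since a negative one would loop forever.

```haskell
import GHC.Exts (build)

-- The generator g from the text: it abstracts over the list constructors c and n.
repGen :: Int -> a -> (a -> b -> b) -> b -> b
repGen k0 a c n = rep k0
  where
    rep 0 = n
    rep k = a `c` rep (k - 1)

-- replicate written with build, so that it can fuse with a consuming foldr.
replicateB :: Int -> a -> [a]
replicateB k0 a = build (repGen k0 a)

main :: IO ()
main = do
  print (replicateB 3 'x')
  -- foldr c n (build g) should equal g c n; checked here for one choice of c and n.
  print (foldr (+) 0 (replicateB 4 (2 :: Int)) == repGen 4 2 (+) 0)
```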
So if h = allSame (replicate k0 a), h becomes
let
  rep 0 = (`seq` True)
  rep k = go a (rep (k - 1))
in rep k0 Nothing
Eta expanding,
let
  rep 0 acc = acc `seq` True
  rep k acc = go a (rep (k - 1)) acc
in rep k0 Nothing
Inlining go,
let
  rep 0 acc = acc `seq` True
  rep k Nothing = rep (k - 1) (Just a)
  rep k (Just prev) = a == prev && rep (k - 1) (Just a)
in rep k0 Nothing
Again, GHC can see the recursive call is always Just, so
let
  rep 0 acc = acc `seq` True
  rep k Nothing = rep' (k - 1) a
  rep k (Just prev) = a == prev && rep' (k - 1) a
  rep' 0 _ = True
  rep' k prev = a == prev && rep' (k - 1) a
in rep k0 Nothing
Since rep is no longer recursive, GHC can reduce it:
let
  rep' 0 _ = True
  rep' k prev = a == prev && rep' (k - 1) a
in
  case k0 of
    0 -> True
    _ -> rep' (k0 - 1) a
As you can see, this can run with no allocation whatsoever! Obviously, it's a silly example, but something similar will happen in many more interesting cases. For example, if you write an AllSameTest module importing the allSame function and defining
foo :: Int -> Bool
foo n = allSame [0..n]
and compile it as described above, you'll get the following (not cleaned up).
$wfoo :: Int# -> Bool
$wfoo =
  \ (ww_s1bY :: Int#) ->
    case tagToEnum# (># 0 ww_s1bY) of _ {
      False ->
        letrec {
          $sgo_s1db :: Int# -> Int# -> Bool
          $sgo_s1db =
            \ (sc_s1d9 :: Int#) (sc1_s1da :: Int#) ->
              case tagToEnum# (==# sc_s1d9 sc1_s1da) of _ {
                False -> False;
                True ->
                  case tagToEnum# (==# sc_s1d9 ww_s1bY) of _ {
                    False -> $sgo_s1db (+# sc_s1d9 1) sc_s1d9;
                    True -> True
                  }
              }; } in
        case ww_s1bY of _ {
          __DEFAULT -> $sgo_s1db 1 0;
          0 -> True
        };
      True -> True
    }
foo :: Int -> Bool
foo =
  \ (w_s1bV :: Int) ->
    case w_s1bV of _ { I# ww1_s1bY -> $wfoo ww1_s1bY }
That may look disgusting, but you'll note that there are no : constructors anywhere, and that the Ints are all unboxed, so the function can run with zero allocation.
I think I might just be implementing find and redoing this. I think it's instructive, though, to see the innards of it. (Note how the solution depends on equality being transitive, though note also how the problem requires equality to be transitive to be coherent.)
import Data.Maybe (isJust)
-- Only ever called on non-empty lists; allEqual handles [] itself.
sameElement (x:y:xs) = if x /= y then Nothing else sameElement (y:xs)
sameElement [x] = Just x
allEqual [] = True
allEqual xs = isJust $ sameElement xs
I like how sameElement peeks at the first O(1) elements of the list, then either returns a result or recurses on some suffix of the list, in particular the tail. I don't have anything smart to say about that structure, I just like it :-)
I think I do the same comparisons as this. If instead I had recursed with sameElement (x:xs), I would compare the head of the input list to each element like in solution 0.
Tangent: one could, if one wanted, report the two mismatching elements by replacing Nothing with Left (x, y) and Just x with Right x and isJust with either (const False) (const True).
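A sketch of that tangent, with one liberty taken: sameElement here receives the head separately, so it is total and needs no case for the empty list. That restructuring is my assumption, not part of the code above.

```haskell
-- Left carries the first mismatching pair, Right the common element.
sameElement :: Eq a => a -> [a] -> Either (a, a) a
sameElement x [] = Right x
sameElement x (y : ys)
  | x /= y = Left (x, y)
  | otherwise = sameElement y ys

allEqual :: Eq a => [a] -> Bool
allEqual [] = True
allEqual (x : xs) = either (const False) (const True) (sameElement x xs)

main :: IO ()
main = do
  print (sameElement 'a' "aab")
  print (allEqual [1, 1, 1 :: Int])
```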
This implementation is superior.
allSame [ ] = True
allSame (h:t) = aux h t
aux x1 [ ] = True
aux x1 (x2:xs) | x1==x2 = aux x2 xs
               | otherwise = False
Given the transitivity of the (==) operator, and assuming the Eq instance is well implemented, if you wish to ensure the equality of a chain of expressions, e.g. a = b = c = d, you only need to check that a=b, b=c and c=d, instead of checking every pair, e.g. a=b, a=c, a=d, b=c, b=d, c=d.
The solution I proposed grows linearly with the number of elements you wish to test, whereas checking every pair is quadratic, even if you introduce constant factors in hopes of improving its efficiency.
It's also superior to the solution using group since you don't have to use length in the end.
You can also write it nicely in point-free fashion, but I won't bore you with such trivial details.
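For the curious, one way the adjacent-pairs idea could be written without explicit recursion is sketched below. This is my guess at a compact version, not necessarily the one the author had in mind; allSamePF relies on the Applicative instance for functions.

```haskell
-- Compare each element with its successor: a == b, b == c, c == d, ...
allSame :: Eq a => [a] -> Bool
allSame xs = and (zipWith (==) xs (drop 1 xs))

-- The same thing point-free: (f <*> g) x = f x (g x) for functions.
allSamePF :: Eq a => [a] -> Bool
allSamePF = and . (zipWith (==) <*> drop 1)

main :: IO ()
main = do
  print (allSame "aaaa", allSame "aaba")
  print (allSamePF [7, 7, 7 :: Int], allSamePF ([] :: [Int]))
```

Because and stops at the first False, this also exits early on a mismatch.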
While not very efficient (it will traverse the whole list even if the first two elements don't match), here's a cheeky solution:
import Data.List (group)
allTheSame :: (Eq a) => [a] -> Bool
allTheSame = (== 1) . length . group
Just for fun.
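If you like the group idea but not the full traversal, a lazier variant is sketched here (the name allTheSameLazy is made up). Instead of counting the groups it only asks whether a second group exists, so it can stop at the first mismatch, and it treats the empty list as all-the-same, which the length-based version does not.

```haskell
import Data.List (group)

-- True when group produces at most one run of equal elements.
allTheSameLazy :: Eq a => [a] -> Bool
allTheSameLazy = null . drop 1 . group

main :: IO ()
main = do
  print (allTheSameLazy "aaa", allTheSameLazy "")
  -- Terminates: a second group starts at the 2, so no further traversal is needed.
  print (allTheSameLazy (1 : 1 : 2 : [3 :: Int ..]))
```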