Haskell: List Boundary

I have a list of doubles (myList) that I want to add to a new list (someList), but once the new list reaches a set size, e.g. 25, I want to stop adding to it. I tried implementing this function using sum but was unsuccessful. Example code below.
someList = [(a)| a <- myList, sum someList < 30]

The way @DanielFischer phrased the question is compatible with the Haskell way of thinking.
Do you want someList to be the longest prefix of myList that has a sum < 30?
Here's how I'd approach it: let's say our list is
>>> let list = [1..20]
we can find the "cumulative sums" using:
>>> let sums = tail . scanl (+) 0
>>> sums list
[1,3,6,10,15,21,28,36,45,55,66,78,91,105,120,136,153,171,190,210]
Now zip that with the original list to get pairs of elements with the sum up to that point
>>> zip list (sums list)
[(1,1),(2,3),(3,6),(4,10),(5,15),(6,21),(7,28),(8,36),
(9,45),(10,55),(11,66),(12,78),(13,91),(14,105),(15,120),
(16,136),(17,153),(18,171),(19,190),(20,210)]
Then we can takeWhile this list to get the prefix we want:
>>> takeWhile (\x -> snd x < 30) (zip list (sums list))
[(1,1),(2,3),(3,6),(4,10),(5,15),(6,21),(7,28)]
finally we can get rid of the cumulative sums that we used to perform this calculation:
>>> map fst (takeWhile (\x -> snd x < 30) (zip list (sums list)))
[1,2,3,4,5,6,7]
Note that because of laziness, this is as efficient as the recursive solutions -- only the sums up to the point where they fail the test need to be calculated. This can be seen because the solution works on infinite lists (because if we needed to calculate all the sums, we would never finish).
I'd probably abstract this and take the limit as a parameter:
>>> :{
... let initial lim list =
...       map fst (takeWhile (\x -> snd x < lim) (zip list (sums list)))
... :}
This function has an obvious property it should satisfy, namely that the sum of the returned list should always be less than the limit (as long as the limit is greater than 0). So we can use QuickCheck to make sure we did it right:
>>> import Test.QuickCheck
>>> quickCheck (\lim list -> lim > 0 ==> sum (initial lim list) < lim)
+++ OK, passed 100 tests.
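To see the laziness claim concretely, here is a minimal sketch that packages the same pipeline into one function and runs it on an infinite list. The names prefixUnder and demo are made up for this illustration and are not part of the answer above.

```haskell
-- prefixUnder is a hypothetical name for the pipeline built above.
prefixUnder :: (Num a, Ord a) => a -> [a] -> [a]
prefixUnder lim xs = map fst (takeWhile ((< lim) . snd) (zip xs sums))
  where
    -- Running totals: sums !! i is the sum of the first i+1 elements.
    sums = tail (scanl (+) 0 xs)

-- Terminates even though the input never ends, because takeWhile stops
-- demanding running totals at the first one that fails the test.
demo :: [Integer]
demo = prefixUnder 30 [1 ..]
```

Evaluating demo gives [1,2,3,4,5,6,7], the same prefix as for [1..20].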

someList = makeList myList [] 0 where
    makeList [] ys _ = ys   -- input ran out before the limit was reached
    makeList (x:xs) ys total = let newTot = total + x
                               in if newTot >= 25
                                     then ys
                                     else makeList xs (ys ++ [x]) newTot
This takes elements from myList as long as their sum is less than 25.
The logic takes place in makeList. It takes the first element of the input list and adds it to the running total, to see whether the total has reached 25. If it has, we shouldn't add the element to the output list, and we finish recursing. Otherwise, we put x on the end of the output list (ys) and keep going with the rest of the input list. If the input runs out first, we return what we have collected so far.

The behaviour you want is
ghci> appendWhileUnder 25 [1..5] [1..5]
[1,2,3,4,5,1,2,3]
because that sums to 21 and adding the 4 would bring it to 25.
OK, one way to go about this is by just appending them with ++ then taking the initial segment that's under 25.
appendWhileUnder n xs ys = takeWhileUnder n (xs++ys)
I don't want to keep summing intermediate lists, so I'll keep track with how much I'm allowed (n).
takeWhileUnder n [] = []
takeWhileUnder n (x:xs) | x < n     = x:takeWhileUnder (n-x) xs
                        | otherwise = []
Here I allow x through if it doesn't take me beyond what's left of my allowance.
Possibly undesired side effect: it'll chop out bits of the original list if it sums to over 25. Workaround: use
appendWhileUnder' n xs ys = xs ++ takeWhileUnder (n - sum xs) ys
which keeps the entire xs whether it brings you over n or not.

Related

How to create all combination of elements of certain length by intentional lists in Haskell

I would like to ask how to create all combinations of elements of a certain length using intensional lists (list comprehensions) in Haskell. Here is an example:
The function combo takes two arguments, a list of elements xs and a value n; the goal is to create all possible combinations of elements of xs of length n using list comprehensions.
For example:
combo [1,2,3] 2
should return
[[1,1],[1,2],[1,3],[2,1],[2,2],[2,3],[3,1],[3,2],[3,3]]
Thank you in advance for any help
This could be done using a combinatorics library, but can also easily be done yourself.
A small example I made:
import Data.List
permutations' _ [] = []
permutations' 0 _ = []
permutations' n xs | n > length xs = error "n can't be larger than length of input"
                   | otherwise     = permute n xs []
permute 0 xs ys = [ys]
permute n xs ys = concatMap (\x -> permute (n-1) (x `delete` xs) (ys ++ [x])) xs
The 'magic' happens in permute, where some simple combinatorics is applied.
You start off with an empty list of solutions which is extended until the character limit n is reached. The input xs e.g. [1,2,3] is mapped, thus each character in xs is fed into the lambda function. In the lambda the x is appended to the already existing result. In the first loop ys is empty thus only x is added. In subsequent calls to permute the xs list is shrunk and ys is appended with the value that xs is shrunk with. Thus growing the result until char limit is reached and subsequently removing characters from xs to prevent duplicate entries.
A walkthrough of permute 2 [1,2,3] [] might look like this:
(\1 -> permute (2-1) [2,3] ([] ++ [1]))
  - (\2 -> permute (1-1) [3] ([1] ++ [2]))
    - [1,2], since we hit the first pattern where n = 0
  - (\3 -> permute (1-1) [2] ([1] ++ [3]))
    - [1,3], since we hit the first pattern where n = 0
(\2 ....
(\3 ....
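Note that permute deletes each chosen element from xs, so it never produces pairs such as [1,1], which the expected output in the question does contain. A minimal sketch of a list-comprehension version that allows repeats, assuming that is what was asked for (the argument order follows the question's combo [1,2,3] 2):

```haskell
-- All length-n selections from xs, repeats allowed, in input order.
combo :: [a] -> Int -> [[a]]
combo xs n
  | n <= 0    = [[]]   -- exactly one way to pick nothing
  | otherwise = [x : rest | x <- xs, rest <- combo xs (n - 1)]
```

combo [1,2,3] 2 evaluates to [[1,1],[1,2],[1,3],[2,1],[2,2],[2,3],[3,1],[3,2],[3,3]]. For n >= 0 this agrees with replicateM n xs from Control.Monad.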

Haskell - split a list into two sublists with closest sums

I'm a Haskell beginner trying to learn more about the language by solving some online quizzes/problem sets.
The problem/question is quite lengthy but a part of it requires code that can find the number which divides a given list into two (nearly) equal (by sum) sub-lists.
Given [1..10]
Answer should be 7 since 1+2+..7 = 28 & 8+9+10 = 27
This is the way I implemented it
-- partitions list by y
partishner :: (Floating a) => Int -> [a] -> [[[a]]]
partishner 0 xs = [[xs],[]]
partishner y xs = [take y xs : [drop y xs]] ++ partishner (y - 1) xs
-- finds the equal sum
findTheEquilizer :: (Ord a, Floating a) => [a] -> [[a]]
findTheEquilizer xs = fst $ minimumBy (comparing snd) zipParty
  where party = (tail . init) (partishner (length xs) xs) -- removes [xs,[]] types
        afterParty = (map (\[x, y] -> (x - y) ** 2) . init . map (map sum)) party
        zipParty = zip party afterParty -- zips partitions and squared diff betn their sums
Given (last . head) (findTheEquilizer [1..10])
output : 7
For numbers near 50k it works fine
λ> (last . head) (findTheEquilizer [1..10000])
7071.0
The trouble starts when I put in lists with more than 70k elements. It takes forever to compute.
So what do I have to change in the code to make it run better, or do I have to change my whole approach? I'm guessing it's the latter, but I'm not sure how to go about doing that.
It looks to me that the implementation is quite chaotic. For example, partishner seems to construct a list of lists of lists of a where, if I understood it correctly, the outer list contains lists of two elements each: the list of elements on "the left" and the list of elements on "the right". As a result, constructing these lists takes O(n^2).
By using lists instead of 2-tuples, this is also quite "unsafe", since a list can (although that is probably impossible here) contain no elements, one element, or more than two elements. If you make a mistake in one of the functions, it will be hard to track that mistake down.
It looks to me that it might be easier to implement a "sweep algorithm". We first calculate the sum of all the elements in the list: this is how much the "right" side exceeds the "left" side if we split before the first element. Next we move from left to right, each time subtracting the element from the sum on the right and adding it to the sum on the left, so the difference drops by twice the element. At each step we can evaluate the difference, like:
import Data.List(unfoldr)
sweep :: Num a => [a] -> [(Int, a, [a])]
sweep lst = x0 : unfoldr f x0
    where x0 = (0, sum lst, lst)
          f (_, _, []) = Nothing
          f (i, r, (x: xs)) = Just (l, l)
              where l = (i+1, r-2*x, xs)
For example:
Prelude Data.List> sweep [1,4,2,5]
[(0,12,[1,4,2,5]),(1,10,[4,2,5]),(2,2,[2,5]),(3,-2,[5]),(4,-12,[])]
So if we select to split at the first split point (before the first element), the sum on the right is 12 higher than the sum on the left, if we split after the first element, the sum on the right (11) is 10 higher than the sum on the left (1).
We can then obtain the minimum of these splits with minimumBy :: (a -> a -> Ordering) -> [a] -> a:
import Data.List(minimumBy)
import Data.Ord(comparing)
findTheEquilizer :: (Ord a, Num a) => [a] -> ([a], [a])
findTheEquilizer lst = (take idx lst, tl)
    where (idx, _, tl) = minimumBy (comparing (abs . \(_, x, _) -> x)) (sweep lst)
We then obtain the correct value for [1..10]:
Prelude Data.List Data.Ord Data.List> findTheEquilizer [1..10]
([1,2,3,4,5,6,7],[8,9,10])
or for 70'000:
Prelude Data.List Data.Ord Data.List> head (snd (findTheEquilizer [1..70000]))
49498
The above is not ideal, it can be implemented more elegantly, but I leave this as an exercise.
Okay, first let's analyse why it runs forever (actually not forever, just slowly). Take a look at the partishner function:
partishner y xs = [take y xs : [drop y xs]] ++ partishner (y - 1) xs
Here take y xs and drop y xs run in linear time, i.e. O(N), and so
[take y xs : [drop y xs]]
is O(N) too.
However, it is run again and again, recursively, once for each element of the given list. If the length of the given list is M and each call of partishner takes O(N) time, finishing the computation needs:
O(1+2+...+M) = O(M(1+M)/2) ~ O(M^2)
With 70k elements in the list, that is on the order of 70k^2 steps, which is why it appears to hang.
Instead of using the partishner function, you can compute the running sums of the list in linear time:
sumList::(Floating a)=>[a]->[a]
sumList xs = sum 0 xs
    where sum _ [] = []
          sum s (y:ys) = let s' = s + y in s' : sum s' ys
findEquilizer then just sums the given list from left to right (leftSum) and from right to left (rightSum) and picks the result the same way your original program does, but the whole process takes only linear time.
findEquilizer::(Ord a, Floating a) => [a] -> a
findEquilizer [] = 0
findEquilizer xs =
    let leftSum = reverse $ 0:(sumList $ init xs)
        rightSum = sumList $ reverse $ xs
        afterParty = zipWith (\x y->(x-y) ** 2) leftSum rightSum
    in fst $ minimumBy (comparing snd) (zip (reverse $ init xs) afterParty)
I assume that none of the list elements are negative, and use a "tortoise and hare" approach. The hare steps through the list, adding up elements. The tortoise does the same thing, but it keeps its sum doubled and it carefully ensures that it only takes a step when that step won't put it ahead of the hare.
approxEqualSums
  :: (Num a, Ord a)
  => [a] -> (Maybe a, [a])
approxEqualSums as0 = stepHare 0 Nothing as0 0 as0
  where
    -- ht is the current best guess.
    stepHare _tortoiseSum ht tortoise _hareSum []
      = (ht, tortoise)
    stepHare tortoiseSum ht tortoise hareSum (h:hs)
      = stepTortoise tortoiseSum ht tortoise (hareSum + h) hs
    stepTortoise tortoiseSum ht [] hareSum hare
      = stepHare tortoiseSum ht [] hareSum hare
    stepTortoise tortoiseSum ht tortoise@(t:ts) hareSum hare
      | tortoiseSum' <= hareSum
      = stepTortoise tortoiseSum' (Just t) ts hareSum hare
      | otherwise
      = stepHare tortoiseSum ht tortoise hareSum hare
      where tortoiseSum' = tortoiseSum + 2*t
In use:
> approxEqualSums [1..10]
(Just 6,[7,8,9,10])
6 is the last element before going over half, and 7 is the first one after that.
I asked in the comments, and the OP says [1..n] does not really define the question. So I guess what's asked for is an arbitrary ascending sequence between 1 and n, such as [1,3,7,19,37,...,1453,...,n].
Yet even with the given answers in mind, for a list like [1..n] we really don't need to do any list operations at all.
The sum of [1..n] is n*(n+1)/2.
Which means we need to find the m for which the sum of [1..m] is half of that, n*(n+1)/4.
Which means m*(m+1)/2 = n*(n+1)/4, i.e. m^2 + m - n*(n+1)/2 = 0.
So if n == 100 then m^2 + m - 5050 = 0
All we need is the quadratic formula
$m = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
where a = 1, b = 1 and c = -5050, yielding the reasonable (positive) root 70.565 ⇒ 71 (rounded). Let's check: 71*72/2 = 2556 and 5050-2556 = 2494, which gives 2556 - 2494 = 62 as the minimal difference (<71). Yes, we must split at 71. So just do result = [[1..71],[72..100]] and we're done.
But when it comes to ascending sequences that are not consecutive, that's a different animal. It has to be done by first finding the sum and then doing something like a binary search: jumping halfway into the list and comparing the sums to decide whether to jump halfway back or forward accordingly. I will implement that one later.
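The closed-form reasoning for [1..n] can be sketched in code. This is only an illustration under stated assumptions: splitPoint is a hypothetical name, the root is computed in Double (so it is only trustworthy for moderate n), and instead of rounding it compares the two integers on either side of the root.

```haskell
-- Split point m for the list [1..n], chosen so that the sums of [1..m]
-- and [m+1..n] are as close as possible.
splitPoint :: Integer -> Integer
splitPoint n = if diff lo <= diff hi then lo else hi
  where
    total = n * (n + 1) `div` 2
    -- Positive root of m^2 + m - total = 0 (a = 1, b = 1, c = -total).
    root = (-1 + sqrt (1 + 4 * fromIntegral total)) / 2 :: Double
    lo = floor root
    hi = lo + 1
    -- |right sum - left sum| when splitting after m:
    -- left = m(m+1)/2, right = total - left.
    diff m = abs (total - m * (m + 1))
```

For example, splitPoint 100 gives 71 and splitPoint 10 gives 7, matching the figures worked out by hand.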
Here's some code which empirically behaves better than linear, and gets to 2,000,000 in just over 1 second even when interpreted:
g :: (Ord c, Num c) => [c] -> [(Int, c)]
g = head . dropWhile ((> 0) . snd . last) . map (take 2) . tails . zip [1..]
    . (\xs -> zipWith (-) (map (last xs -) xs) xs) . scanl1 (+)
g [1..10] ==> [(6,13),(7,-1)] -- 0.0s
g [1..70000] ==> [(49497,32494),(49498,-66502)] -- 0.09s
g [70000,70000-1..1] ==> [(20502,66502),(20503,-32494)] -- 0.09s
g [1..100000] ==> [(70710,75190),(70711,-66232)] -- 0.11s
g [1..1000000] ==> [(707106,897658),(707107,-516556)] -- 0.62s
g [1..2000000] ==> [(1414213,1176418),(1414214,-1652010)] -- 1.14s n^0.88
g [1..3000000] ==> [(2121320,836280),(2121321,-3406362)] -- 1.65s n^0.91
It works by running the partial sums with scanl1 (+) and taking the total sum as its last, so that for each partial sum, subtracting it from the total gives us the sum of the second part of the split.
The algorithm assumes all the numbers in the input list are strictly positive, so the partial sums list is monotonically increasing. Nothing else is assumed about the numbers.
The value must be chosen from the pair (the g's result) so that its second component's absolute value is the smaller between the two.
This is achieved by minimumBy (comparing (abs . snd)) . g.
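Put together with the imports it needs, the selection step might look like the sketch below. bestSplit is a hypothetical name; g is copied from the answer, and the input is assumed to be a non-empty list of strictly positive numbers.

```haskell
import Data.List (minimumBy, tails)
import Data.Ord (comparing)

-- g as given above: the two neighbouring split candidates around the
-- point where (right sum - left sum) changes sign.
g :: (Ord c, Num c) => [c] -> [(Int, c)]
g = head . dropWhile ((> 0) . snd . last) . map (take 2) . tails . zip [1..]
    . (\xs -> zipWith (-) (map (last xs -) xs) xs) . scanl1 (+)

-- Keep the candidate whose difference is smaller in absolute value.
bestSplit :: (Ord c, Num c) => [c] -> (Int, c)
bestSplit = minimumBy (comparing (abs . snd)) . g
```

bestSplit [1..10] evaluates to (7,-1): split after the 7th element, with the left part exceeding the right by 1.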
clarifications: There's some confusion about "complexity" in the comments below, yet the answer says nothing at all about complexity but uses a specific empirical measurement. You can't argue with empirical data (unless you misinterpret its meaning).
The answer does not claim it "is better than linear", it says "it behaves better than linear" [in the tested range of problem sizes], which the empirical data incontrovertibly show.
Finally, an appeal to authority. Robert Sedgewick is an authority on algorithms. Take it up with him.
(and of course the algorithm handles unordered data as well as it does ordered).
As for the reasons for OP's code inefficiency: map sum . inits can't help being quadratic, but the equivalent scanl (+) 0 is linear. The radical improvement comes about from a lot of redundant calculations in the former being avoided in the latter. (Another example of this can be seen here.)
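A small sketch of the two formulations side by side (the names prefixSumsSlow and prefixSumsFast are made up for this illustration). Both return the same list of prefix sums; only the amount of work differs.

```haskell
import Data.List (inits)

-- Quadratic: every prefix is summed again from scratch.
prefixSumsSlow :: Num a => [a] -> [a]
prefixSumsSlow = map sum . inits

-- Linear: each running total reuses the previous one.
prefixSumsFast :: Num a => [a] -> [a]
prefixSumsFast = scanl (+) 0
```

For [1,2,3] both give [0,1,3,6].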

Haskell Sieve of Eratosthenes with list of composites

I have to implement the classic problem of the Sieve of Eratosthenes in Haskell for a project. Rather than computing each prime I only have to compare numbers between lists. For instance I pass a list of potential primes (parameter 1) and a list of composites (list 2). sieve [2..10] [] results with the list [2,3,5,7].
I think I am very close and it compiles, but it appends every item to the prime list rather than throwing out the composites. My thinking was that it would take the list x of all numbers 2..10 or whatever and a list y of composites use elem to see if the head of list x is found in list y and if so append to list z and print. Thanks in advance!
Currently my code returns everything in the first list and refuses to sort. sieve [2..10] [] results in [2,3,4,5,6,7,8,9,10]
sieve ::[Int]->[Int]->[Int]
z = []
sieve [] [] = []
sieve x [] = x
sieve [] y = y
sieve xs ys = if ((elem (head xs)) ys) then (sieve (tail xs) ys)
              else ((head xs):z)
What you call sieve is usually called minus: subtracting the second list from the first, assuming both are ordered, increasing lists of numbers. Then it is enough to compare just the two head elements, without any elem calls.
But it could still work, had you provided a proper definition for z. z=[] is just a placeholder to make it compile (right?); it's not the right definition. It should have been:
sieve :: [Int] -> [Int] -> [Int]
-- z = []
sieve [] [] = []
sieve x [] = x
sieve [] y = y
sieve xs ys = if ((elem (head xs)) ys) then (sieve (tail xs) z)
              else ((head xs) : sieve (tail xs) ys )
    where
        z = ... -- need to remove (head xs) from ys
For the last comment's task, you could use e.g. delete function.
This still won't give you a list of primes without the list of composites, so the initial call cannot be made with the second list empty (or else you'd get the same first argument back, as you do, because of the sieve x [] = x equation):
primesAmong input = sieve input composites
But what are composites? Eratosthenes's answer is, why, they are multiples of primes! (and trial division says: composites have other primes as their divisors).
Given a prime, say 2, we just count: 2,4,6,...; and for 3, say, it's 3,6,9,12,...; to find its multiples. Let's write it down:
composites = multiplesOf primes
multiplesOf primes = [ mult | p <- primes, mult <- [...] ]
This doesn't quite fit: this multiplesOf needs an argument:
primes = primesAmong input
primesAmong input = sieve input (multiplesOf primes)
We seem to be chasing our own tail here; we don't have primes yet; what can we use instead? Is there a harm in finding multiples of non-primes, as well as primes?
After you do have a running code, try to find a way to use primes after all.
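For reference, the minus operation mentioned at the start of this answer is commonly written along the following lines. This is a sketch, not necessarily the definition the answerer had in mind, and it assumes both lists are sorted in increasing order.

```haskell
-- minus: remove the elements of the second list from the first.
-- Assumes both lists are sorted in increasing order.
minus :: Ord a => [a] -> [a] -> [a]
minus (x:xs) (y:ys) = case compare x y of
  LT -> x : minus xs (y:ys)   -- x cannot occur later in ys: keep it
  EQ -> minus xs ys           -- x is in the second list: drop it
  GT -> minus (x:xs) ys       -- y is too small to matter any more
minus xs _ = xs               -- one of the lists ran out
```

With an explicit list of composites, minus [2..10] [4,6,8,9,10] gives [2,3,5,7], and only the two head elements are compared at each step.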
The program you show doesn't make much sense. First of all, with an empty second list the sieve x [] clause will always be used. Furthermore, you should check whether an element is divisible by an element of the other list. Finally, you should make the call recursive, something head xs : z does not do, since z is defined as the empty list.
Let's start with the base case: in case the left list is empty, regardless of the content of the second list, one returns the empty list. Since sieving nothing will result in nothing:
sieve [] _ = []
Next we look for the inductive case, with as pattern:
sieve (x:xs) ds = ...
Now we need to enumerate over the list of already found elements. From the moment any of the found elements can divide x, we know the number is not (relative) prime. This condition is formalized as:
(==) 0 . mod x :: Integral b => b -> Bool
Or to iterate over the list of ds:
any ((==) 0 . mod x) ds
In case such element exists, we simply skip the element, and call the inductive case with sieve xs ds.
In case there is no such element, we add it to the list of ds and emit it. The result is thus: x : sieve xs (x:ds). The inductive case is thus:
sieve (x:xs) ds | any ((==) 0 . mod x) ds = sieve xs ds
                | otherwise = x : sieve xs (x:ds)
We can shorten this a bit by creating a specific variable for sieve xs:
sieve (x:xs) ds | any ((==) 0 . mod x) ds = rec ds
                | otherwise = x : rec (x:ds)
    where rec = sieve xs
The full function is thus:
sieve [] _ = []
sieve (x:xs) ds | any ((==) 0 . mod x) ds = rec ds
                | otherwise = x : rec (x:ds)
    where rec = sieve xs
You can boost the performance in two ways:
Adding x at the end of ds. This is indeed a more expensive operation, but after a while you don't add numbers that often. It is interesting because in that case ds looks like [2,3,5,7,11] instead of [11,7,5,3,2]. The chance that a number is divisible by 2 (50%) is greater than the chance that it is divisible by 11 (about 9.1%), so it is better to try the test that is most likely to succeed first.
Furthermore, you can end the check once the divisors exceed the square root of the number being tested: if a number is not divisible by any number up to its square root, it has no divisor greater than the square root either (other than itself).
A more efficient approach is thus:
sieve [] _ = []
sieve (x:xs) ds | any ((==) 0 . mod x) $ takeWhile (\y -> y*y <= x) ds = rec ds
                | otherwise = x : rec (ds++[x])
    where rec = sieve xs

Better way to solve this [Int] -> Int -> Int

Here is a sample problem I'm working on:
Example Input: test [4, 1, 5, 6] 6 returns 5
I'm solving this using this function:
test :: [Int] -> Int -> Int
test [] _ = 0
test (x:xs) time = if (time - x) < 0
                   then x
                   else test xs $ time - x
Any better way to solve this function (probably using any inbuilt higher order function) ?
How about
test xs time = maybe 0 id . fmap snd . find ((>time) . fst) $ zip sums xs
    where sums = scanl1 (+) xs
or equivalently with that sugary list comprehension
test xs time = headDef 0 $ [v | (s, v) <- zip sums xs, s > time]
    where sums = scanl1 (+) xs
headDef is provided by safe. It's trivial to implement (f _ (x:_) = x; f x _ = x) but the safe package has loads of useful functions like these so it's worth checking out.
This sums the list up to each point and finds the first occurrence greater than time. scanl is a useful function that behaves like foldl but keeps intermediate results (scanl1 is the variant without a starting value), and zip zips two lists into a list of tuples. Then we just use fmap and maybe to manipulate the Maybe (Integer, Integer) to get our result; find comes from Data.List.
This defaults to 0 like yours, but I like the version that simply returns Maybe Integer better from a user's point of view; to get it, simply remove the maybe 0 id.
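A minimal sketch of that Maybe-returning variant, with the import it needs. The name firstOver is hypothetical, and the types are fixed to Int to match the question's signature.

```haskell
import Data.List (find)

-- The first element whose running sum exceeds the time budget,
-- or Nothing if the budget is never exceeded.
firstOver :: [Int] -> Int -> Maybe Int
firstOver xs time = fmap snd . find ((> time) . fst) $ zip sums xs
  where sums = scanl1 (+) xs
```

firstOver [4, 1, 5, 6] 6 gives Just 5, while firstOver [1, 2] 10 gives Nothing, so the caller decides what "no result" should mean.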
You might like scanl and its close relative, scanl1. For example:
test_ xs time = [curr | (curr, tot) <- zip xs (scanl1 (+) xs), tot > time]
This finds all the places where the running sum is greater than time. Then you can pick the first one (or 0) like this:
safeHead def xs = head (xs ++ [def])
test xs time = safeHead 0 (test_ xs time)
This is verbose, and I don't necessarily recommend writing such a simple function like this (IMO the pattern matching & recursion is plenty clear). But, here's a pretty declarative pipeline:
import Control.Error
import Data.List
deadline :: (Num a, Ord a) => a -> [a] -> a
deadline time = fromMaybe 0 . findDeadline time
findDeadline :: (Num a, Ord a) => a -> [a] -> Maybe a
findDeadline time xs = decayWithDifferences time xs
                   >>= findIndex (< 0)
                   >>= atMay xs
decayWithDifferences :: Num b => b -> [b] -> Maybe [b]
decayWithDifferences time = tailMay . scanl (-) time
-- > deadline 6 [4, 1, 5, 6]
-- 5
This documents the code a bit and in principle lets you test a little better, though IMO these functions fit more-or-less into the 'obviously correct' category.
You can verify that it matches your implementation:
import Test.QuickCheck
prop_equality :: Int -> [Int] -> Bool
prop_equality time xs = test xs time == deadline time xs
-- > quickCheck prop_equality
-- +++ OK, passed 100 tests.
In this particular case the zipping suggested by others is not quite necessary:
test xs time = head $ [y-x | (x:y:_) <- tails $ scanl1 (+) $ 0:xs, y > time]++[0]
Here scanl1 will produce a list of rolling sums of the list xs, starting it with 0. Therefore, tails will produce a list with at least one list having two elements for non-empty xs. Pattern-matching (x:y:_) extracts two elements from each tail of rolling sums, so in effect it enumerates pairs of neighbouring elements in the list of rolling sums. Filtering on the condition, we reconstruct a part of the list that starts with the first element that produces a rolling sum greater than time. Then use headDef 0 as suggested before, or append a [0], so that head always returns something.
If you want to retain readability, I would just stick with your current solution. It's easy to understand, and isn't doing anything wrong.
Just because you can make it into a one line scan map fold mutant doesn't mean that you should!

Ways to get the middle of a list in Haskell?

I've just started learning about Functional Programming, using Haskell.
I'm slowly getting through Erik Meijer's lectures on Channel 9 (I've watched the first 4 so far) and in the 4th video Erik explains how tail works, and it fascinated me.
I've tried to write a function that returns the middle of a list (2 items for even lengths, 1 for odd) and I'd like to hear how others would implement it in
The least amount of Haskell code
The fastest Haskell code
If you could explain your choices I'd be very grateful.
My beginners code looks like this:
middle as | length as > 2 = middle (drop 2 (reverse as))
          | otherwise = as
Just for your amusement, a solution from someone who doesn't speak Haskell:
Write a recursive function that takes two arguments, a1 and a2, and pass your list in as both of them. At each recursion, drop 2 from a2 and 1 from a1. If you're out of elements for a2, you'll be at the middle of a1. You can handle the case of just 1 element remaining in a2 to answer whether you need 1 or 2 elements for your "middle".
I don't make any performance claims, though it only processes the elements of the list once (my assumption is that computing length t is an O(N) operation, so I avoid it), but here's my solution:
mid [] = []                      -- Base case: the list is empty ==> no midpt
mid t = m t t                    -- The 1st t is the slow ptr, the 2nd is fast
  where m (x:_) [_] = [x]        -- Base case: list tracked by the fast ptr has
                                 -- exactly one item left ==> the first item
                                 -- pointed to by the slow ptr is the midpt.
        m (x:y:_) [_,_] = [x,y]  -- Base case: list tracked by the fast ptr has
                                 -- exactly two items left ==> the first two
                                 -- items pointed to by the slow ptr are the
                                 -- midpts
        m (_:t) (_:_:u) = m t u  -- Recursive step: advance slow ptr by 1, and
                                 -- advance fast ptr by 2.
The idea is to have two "pointers" into the list, one that increments one step at each point in the recursion, and one that increments by two.
(which is essentially what Carl Smotricz suggested)
Two versions
Using pattern matching, tail and init:
middle :: [a] -> [a]
middle l@(_:_:_:_) = middle $ tail $ init l
middle l = l
Using length, take, signum, mod, drop and div:
middle :: [a] -> [a]
middle xs = take (signum ((l + 1) `mod` 2) + 1) $ drop ((l - 1) `div` 2) xs
    where l = length xs
The second one is basically a one-liner (but uses where for readability).
I've tried to write a function that returns the middle of a list (2 items for even lengths, 1 for odd) and I'd like to hear how others would implement it in
The right datastructure for the right problem. In this case, you've specified something that only makes sense on a finite list, right? There is no 'middle' to an infinite list. So just reading the description, we know that the default Haskell list may not be the best solution: we may be paying the price for the laziness even when we don't need it. Notice how many of the solutions have difficulty avoiding 2*O(n) or O(n). Singly-linked lazy lists just don't match a quasi-array-problem too well.
Fortunately, we do have a finite list in Haskell: it's called Data.Sequence.
Let's tackle it the most obvious way: 'index (length / 2)'.
Data.Sequence.length is O(1) according to the docs. Data.Sequence.index is O(log(min(i,n-i))) (where i is the index and n is the length). Let's just call it O(log n). Pretty good!
And note that even if we don't start out with a Seq and have to convert a [a] into a Seq, we may still win. Data.Sequence.fromList is O(n). So if our rival was an O(n)+O(n) solution like xs !! (length xs `div` 2), a solution like
middle x = let x' = Seq.fromList x in Seq.index x' (Seq.length x' `div` 2)
will be better since it would be O(1) + O(log n) + O(n), which simplifies to O(log n) + O(n), obviously better than O(n)+O(n).
(I leave as an exercise to the reader modifying middle to return 2 items if length be even and 1 if length be odd. And no doubt one could do better with an array with constant-time length and indexing operations, but an array isn't a list, I feel.)
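One possible shape for that exercise, as a sketch only: it assumes the containers package (which provides Data.Sequence) is available, and middleSeq is a hypothetical name. It uses Seq.take and Seq.drop rather than index so that one or two items can be returned.

```haskell
import Data.Foldable (toList)
import qualified Data.Sequence as Seq

-- Converting with fromList is O(n); after that, length is O(1) and
-- take/drop are logarithmic.
middleSeq :: [a] -> [a]
middleSeq xs = toList (Seq.take k (Seq.drop d s))
  where
    s = Seq.fromList xs
    n = Seq.length s
    k = if even n then 2 else 1   -- two items for even lengths, one for odd
    d = (n - 1) `div` 2           -- position of the first middle item
```

middleSeq [0..6] gives [3] and middleSeq [0..7] gives [3,4]; an empty list yields an empty list.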
Haskell solution inspired by Carl's answer.
middle = m =<< drop 1
    where m []       = take 1
          m [_]      = take 2
          m (_:_:ys) = m ys . drop 1
If the sequence is a linked list, traversal of this list is the dominating factor of efficiency. Since we need to know the overall length, we have to traverse the list at least once. There are two equivalent ways to get the middle elements:
Traverse the list once to get the length, then traverse it half to get at the middle elements.
Traverse the list in double steps and single steps at the same time, so that when the first traversal stops, the second traversal is in the middle.
Both need the same number of steps. The second is needlessly complicated, in my opinion.
In Haskell, it might be something like this:
middle xs = take (2 - r) $ drop ((div l 2) + r - 1) xs
    where l = length xs
          r = rem l 2
middle xs =
    let (ms, len) = go xs 0 [] len
    in ms
go (x:xs) i acc len =
    let acc_ = case len `divMod` 2 of
            (m, 0) -> if m == (i+1) then (take 2 (x:xs))
                                    else acc
            (m, 1) -> if m == i then [x]
                                else acc
    in go xs (i+1) acc_ len
go [] i acc _ = (acc,i)
This solution traverses the list just once using lazy evaluation. While it traverses the list, it calculates the length and then backfeeds it to the function:
let (ms, len) = go xs 0 [] len
Now the middle elements can be calculated:
let acc_ = case len `divMod` 2 of
...
F# solution based on Carl's answer:
let halve_list l =
    let rec loop acc1 = function
        | x::xs, [] -> List.rev acc1, x::xs
        | x::xs, [y] -> List.rev (x::acc1), xs
        | x::xs, y::y'::ys -> loop (x::acc1) (xs, ys)
        | [], _ -> [], []
    loop [] (l, l)
It's pretty easy to modify to get the median elements in the list too:
let median l =
    let rec loop acc1 = function
        | x::xs, [] -> [List.head acc1; x]
        | x::xs, [y] -> [x]
        | x::xs, y::y'::ys -> loop (x::acc1) (xs, ys)
        | [], _ -> []
    loop [] (l, l)
A more intuitive approach uses a counter:
let halve_list2 l =
    let rec loop acc = function
        | (_, []) -> [], []
        | (0, rest) -> List.rev acc, rest
        | (n, x::xs) -> loop (x::acc) (n - 1, xs)
    let count = (List.length l) / 2
    loop [] (count, l)
And a really ugly modification to get the median elements:
let median2 l =
    let rec loop acc = function
        | (n, [], isEven) -> []
        | (0, rest, isEven) ->
            match rest, isEven with
            | x::xs, true -> [List.head acc; x]
            | x::xs, false -> [x]
            | _, _ -> failwith "Should never happen"
        | (n, x::xs, isEven) -> loop (x::acc) (n - 1, xs, isEven)
    let len = List.length l
    let count = len / 2
    let isEven = if len % 2 = 0 then true else false
    loop [] (count, l, isEven)
Getting the length of a list requires traversing its entire contents at least once. Fortunately, it's perfectly easy to write your own list data structure which holds the length of the list in each node, allowing you to get the length in O(1).
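Since the rest of this thread is in Haskell, here is a hedged Haskell sketch of such a structure rather than an F# one. All names (Sized, size, scons, fromList, middleSized) are invented for this illustration.

```haskell
-- A list in which every node caches the length of the list starting there.
data Sized a = Nil | Cons !Int a (Sized a)

size :: Sized a -> Int
size Nil          = 0
size (Cons n _ _) = n   -- O(1): just read the cached length

-- Prepend an element, updating the cached length.
scons :: a -> Sized a -> Sized a
scons x xs = Cons (size xs + 1) x xs

fromList :: [a] -> Sized a
fromList = foldr scons Nil

-- Middle element(s): no separate pass is needed to find the length.
middleSized :: Sized a -> [a]
middleSized xs = go ((n - 1) `div` 2) xs
  where
    n = size xs
    k = if even n then 2 else 1
    go _ Nil           = []
    go 0 ys            = takeS k ys
    go i (Cons _ _ ys) = go (i - 1) ys
    takeS 0 _             = []
    takeS _ Nil           = []
    takeS j (Cons _ y ys) = y : takeS (j - 1) ys
```

middleSized (fromList [0..7]) gives [3,4]. Building the structure still costs O(n), so this only pays off if the list is kept in this form from the start.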
Weird that this perfectly obvious formulation hasn't come up yet:
middle [] = []
middle [x] = [x]
middle [x,y] = [x,y]
middle xs = middle $ init $ tail xs
A very straightforward, yet inelegant and not so terse solution might be:
middle :: [a] -> Maybe [a]
middle xs
    | len <= 2 = Nothing
    | even len = Just $ take 2 . drop (half - 1) $ xs
    | odd len = Just $ take 1 . drop (half) $ xs
    where
        len = length xs
        half = len `div` 2
This iterates twice over the list.
mid xs = m where
    l = length xs
    m | l `elem` [0..2] = xs
      | odd l           = take 1 $ drop (l `div` 2) xs
      | otherwise       = take 2 $ drop (l `div` 2 - 1) xs
I live for one liners, although this example only works for odd lists. I just want to stretch my brain! Thank you for the fun =)
foo d = map (\(Just a) -> a) $ filter (/=Nothing) $ zipWith (\a b -> if a == b then Just a else Nothing) (Data.List.nub d) (Data.List.nub $ reverse d)
I'm not much of a haskeller myself but I tried this one.
First the tests (yes, you can do TDD using Haskell)
module Main
where
import Test.HUnit
import Middle
main = do runTestTT tests
tests = TestList [ test1
                 , test2
                 , test3
                 , test4
                 , test_final1
                 , test_final2
                 ]
test1 = [0] ~=? middle [0]
test2 = [0, 1] ~=? middle [0, 1]
test3 = [1] ~=? middle [0, 1, 2]
test4 = [1, 2] ~=? middle [0, 1, 2, 3]
test_final1 = [3] ~=? middle [0, 1, 2, 3, 4, 5, 6]
test_final2 = [3, 4] ~=? middle [0, 1, 2, 3, 4, 5, 6, 7]
And the solution I came to:
module Middle
where
middle a = midlen a (length a)
midlen (a:xs) 1 = [a]
midlen (a:b:xs) 2 = [a, b]
midlen (a:xs) lg = midlen xs (lg - (2))
It will traverse the list twice, once to get the length and half of it again to get to the middle, but I don't care: it's still O(n) (and getting the middle of something implies getting its length, so there's no reason to avoid it).
My solution, I like to keep things simple:
middle [] = []
middle xs | odd (length xs) = [xs !! ((length xs) `div` 2)]
          | otherwise       = [(reverse $ xs) !! ((length xs) `div` 2), xs !! ((length xs) `div` 2)]
Use of !! in Data.List as the function to get the value at a given index, which in this case is half the length of the list.
Edit: it actually works now
I like Svante's answer. My version:
> middle :: [a] -> [a]
> middle [] = []
> middle xs = take (r+1) . drop d $ xs
> where
> (d,r) = (length xs - 1) `divMod` 2
Here is my version. It was just a quick run up. I'm sure it's not very good.
middleList xs@(_:_:_:_) = take (if odd n then 1 else 2) $ drop en xs
    where n = length xs
          en = (n - 1) `div` 2   -- index of the first middle element
middleList xs = xs
I tried. :)
If anyone feels like commenting and telling me how awful or good this solution is, I would deeply appreciate it. I'm not very well versed in Haskell.
EDIT: Improved with suggestions from kmc on #haskell-blah
EDIT 2: Can now accept input lists with a length of less than 5.
Another one-line solution:
--
import Control.Monad (ap)
middle = ap (take . (1 +) . signum . (`mod` 2) . (1 +) . length) $ drop =<< (`div` 2) . subtract 1 . length
--