Problem with large index when testing my code - list

I am trying to learn Haskell. I want to write a recursive function without using any library functions. The function
nth :: Integer -> [a] -> Maybe a
takes an index n and a list of elements and returns the n-th element of the list (if the index is valid) or Nothing if
the index is invalid.
My code:
nth :: Integer -> [a] -> Maybe a
nth a [] = Nothing
nth a (x:xs) | a == 1 = Just x
             | fromIntegral (length xs) < a = Nothing
             | a == 0 = Nothing
             | otherwise = nth (a-1) xs
I want to run these tests against my code:
spec = do
  describe "nth" $ do
    it "for valid indexes it behaves like (!!)" $
      property $ \n xs -> n < 0 || (fromInteger n) >= length (xs::[Integer]) || Lists.nth n xs == Just (xs!!(fromInteger n))
    it "for negative indexes it returns Nothing" $
      property $ \n xs -> n >= 0 || Lists.nth n (xs::[Integer]) == Nothing
    it "for too large indexes it returns Nothing" $
      property $ \n xs -> (fromInteger n) < length xs || Lists.nth n (xs::[Integer]) == Nothing
but every time I run the tests, two of them fail:
for valid indexes it behaves like (!!) FAILED [1]
for negative indexes it returns Nothing
+++ OK, passed 100 tests.
for too large indexes it returns Nothing FAILED [2]
1) Lists.nth for valid indexes it behaves like (!!)
Falsified (after 5 tests and 5 shrinks):
0
[0]
To rerun use: --match "/Lists/nth/for valid indexes it behaves like (!!)/"
./ListsSpec.hs:23:9:
2) Lists.nth for too large indexes it returns Nothing
Falsified (after 38 tests):
1
[0]

There are some problems with your function. The first test (behaving like (!!)) fails because (!!) :: [a] -> Int -> a uses a zero-based index, whereas your function works with a one-based index. To compare the two you have to shift the index by one: nth n xs corresponds to xs !! (n-1). The third test fails for the same reason: with one-based indexing, nth 1 [0] is Just 0, while the test expects index 1 to be out of range for a one-element list.
Furthermore, your function compares a with fromIntegral (length xs). Since xs is the tail of the list, this check is off by one: an index that points at the last element of a list with two or more elements is rejected. Indeed:
Prelude> nth 2 [0, 2]
Nothing
Furthermore, it is typically not a good idea to call length in every recursive step. length runs in O(n), so your algorithm now runs in O(n^2); as the list grows, this quickly starts taking considerable time.
A shorter and more elegant way to fix this is probably:
nth :: Integral i => i -> [a] -> Maybe a
nth 1 (x:_) = Just x
nth i (_:xs) | i < 1 = Nothing
             | otherwise = nth (i-1) xs
nth _ [] = Nothing
Here we have four cases. If the index is 1 and the list is non-empty, we return the head of the list, wrapped in a Just. If the index is less than one, it is too small, and we return Nothing (for finite lists this case is, strictly speaking, not necessary, since the recursion would run off the end of the list and return Nothing anyway). If i is greater than one, we call nth (i-1) xs. Finally, if we have reached the end of the list (or the list was empty in the first place), we return Nothing as well.
Now in order to test this, we thus need to rewrite these three cases:
describe "nth" $ do
  it "for valid indexes it behaves like (!!)" $
    property $ \n xs -> n <= 0 || n > length (xs :: [Integer]) || Lists.nth n xs == Just (xs !! (n-1))
  it "for negative indexes it returns Nothing" $
    property $ \n xs -> n > 0 || Lists.nth n (xs :: [Integer]) == Nothing
  it "for too large indexes it returns Nothing" $
    property $ \n xs -> n <= length xs || Lists.nth n (xs :: [Integer]) == Nothing
The first property excludes n <= 0 (negative or zero indices) as well as n > length xs, and checks that for the remaining indices the value is Just (xs !! (n-1)).
The second property excludes values greater than zero, and checks that all remaining indices map to Nothing.
Finally, the last property checks that for indices greater than length xs, we obtain Nothing as well.
Note that here nth uses one-based indexing. I leave it as an exercise to make it zero-based.
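For readers who want to compare against their own attempt at that exercise, here is one possible zero-based variant. It is only a sketch: the name nth0 is made up for this illustration, and the index type is fixed to Integer to match the signature in the question.

```haskell
-- Hypothetical zero-based variant of nth (the name nth0 is not from the question).
nth0 :: Integer -> [a] -> Maybe a
nth0 _ [] = Nothing                 -- ran off the end, or the list was empty
nth0 i (x:xs)
  | i < 0     = Nothing             -- negative indices are never valid
  | i == 0    = Just x              -- index 0 is the head
  | otherwise = nth0 (i - 1) xs     -- otherwise step down the list
```

With this version, nth0 0 [0] gives Just 0 and nth0 1 [0] gives Nothing, which are the two inputs QuickCheck reported as counterexamples for the original tests.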

Related

Haskell function that returns a list of elements in a list with more than given amount of occurrences

I tried making a function that, as in the title, takes two arguments: a number that specifies how many times an element must occur, and the list we are working on. I wrote a function that counts the number of appearances of a given element in a list and tried using it in my main function, but I cannot comprehend how if/else and indentation work in Haskell. Fixing errors is so much harder than in other languages. I think I'm missing an else branch, but even so I don't know what to put there.
count el list = count el list 0
  where count el list output
          | list==[] = output
          | head(list)==el = count el (tail(list)) output+1
          | otherwise = count el (tail(list)) output
moreThan :: Eq a => Int -> [a] -> [a]
moreThan a [] = []
moreThan a list = moreThan a list output i
  where moreThan a list [] 0
          if i == length (list)
            then output
          else if elem (list!!i) output
            then moreThan a list output i+1
          else if (count (list!!i) list) >= a
            then moreThan a list (output ++ [list!!i]) i+1
All I get right now is
parse error (possibly incorrect indentation or mismatched brackets)
You just forgot the = sign and some brackets, and the final else case. But also you switched the order of the internal function declaration and call:
moreThan :: Eq a => Int -> [a] -> [a]
moreThan a [] = []
moreThan a list = go a list [] 0      -- call
  where go a list output i =          -- declaration =
          if i == length (list)
            then output
            else if elem (list!!i) output
              then go a list output (i+1)      -- (i+1) !
              else if (count (list!!i) list) >= a
                then go a list (output ++ [list!!i]) (i+1)   -- (i+1) !
                else
                  undefined
I did rename your internal function as go, as is the custom.
As to how to go about fixing errors in general, just read the error messages, slowly, and carefully -- they usually say what went wrong and where.
That takes care of the syntax issues that you asked about.
As to what to put in the missing else clause, you've just dealt with this issue in the line above it -- you include the ith element in the output if its count in the list is greater than or equal to the given parameter, a. What to do else, we say in the else clause.
And that is, most probably, to not include that element in the output:
then go a list (output ++ [list!!i]) (i+1)
else ---------------------
undefined
So, just keep the output as it is there: take the then line, use plain output instead of (output ++ [list!!i]), and put the resulting call in place of the undefined.
More importantly, accessing list elements via an index is an anti-pattern, it is much better to "slide along" by taking a tail at each recursive step, and always deal with the head element only, like you do in your count code (but preferably using the pattern matching, not those functions directly). That way our code becomes linear instead of quadratic as it is now.
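The "slide along" shape described above could look roughly like the following. This is a sketch, not the answer's own code: it keeps the question's names count and moreThan, uses pattern matching instead of head/tail and (!!), and builds the output in reverse to avoid repeated ++. Note that only the traversal itself stops indexing; count and elem still scan their lists on every step.

```haskell
-- Count occurrences by pattern matching on the list.
count :: Eq a => a -> [a] -> Int
count _  []     = 0
count el (x:xs)
  | x == el   = 1 + count el xs
  | otherwise = count el xs

-- Walk the list once, head by head, instead of indexing with (!!).
moreThan :: Eq a => Int -> [a] -> [a]
moreThan a list = go list []
  where
    go []     output = reverse output            -- output was built back to front
    go (x:xs) output
      | x `elem` output   = go xs output         -- already reported
      | count x list >= a = go xs (x : output)   -- occurs often enough
      | otherwise         = go xs output         -- too rare: skip it
```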
Will Ness's answer is correct. I just wanted to offer some general advice for Haskell and some tips for improving your code.
First, I would always avoid using guards. The syntax is quite inconsistent with Haskell's usual fare, and guards aren't composable in the same way that other Haskell syntax is. If I were you, I'd stick to using let, if/then/else, and pattern matching.
Secondly, an if statement in Haskell is very often not the right answer. In many cases, it's better to avoid using if statements entirely (or at least as much as possible). For example, a more readable version of count would look like this:
count el list = go list 0 where
  go [] output = output
  go (x:xs) output = go xs (if x == el
                              then 1 + output
                              else output)
However, this code is still flawed because it is not properly strict in output. For example, consider the evaluation of the expression count 1 [1, 1, 1, 1], which proceeds as follows:
count 1 [1, 1, 1, 1]
go [1, 1, 1, 1] 0
go [1, 1, 1] (1 + 0)
go [1, 1] (1 + (1 + 0))
go [1] (1 + (1 + (1 + 0)))
go [] (1 + (1 + (1 + (1 + 0))))
(1 + (1 + (1 + (1 + 0))))
(1 + (1 + 2))
(1 + 3)
4
Notice the ballooning space usage of this evaluation. We need to force go to make sure output is evaluated before it makes a recursive call. We can do this using seq. The expression seq a b is evaluated as follows: first, a is evaluated to weak head normal form (that is, up to its outermost constructor). Then, seq a b evaluates to b. For numbers, weak head normal form is the same as being fully evaluated.
So the code should in fact be
count el list = go list 0 where
  go [] output = output
  go (x:xs) output =
    let new_output = if x == el
                       then 1 + output
                       else output
    in seq new_output (go xs new_output)
Using this definition, we can again trace the execution:
go [1, 1, 1, 1] 0
go [1, 1, 1] 1
go [1, 1] 2
go [1] 3
go [] 4
4
which is a more efficient way to evaluate the expression. Without using library functions, this is basically as good as it gets for writing the count function.
But we're actually using a very common pattern - a pattern so common, there is a higher-order function named for it. We're using foldl' (which must be imported from Data.List using the statement import Data.List (foldl')). This function has the following definition:
foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f = go where
  go output [] = output
  go output (x:xs) =
    let new_output = f output x
    in seq new_output (go new_output xs)
So we can further rewrite our count function as
count el list = foldl' f 0 list where
  f output x = if x == el
                 then 1 + output
                 else output
This is good, but we can actually improve even further on this code by breaking up the count step into two parts.
count el list should be the number of times el occurs in list. We can break this computation up into two conceptual steps. First, construct the list list', which consists of all the elements in list which are equal to el. Then, compute the length of list'.
In code:
count el list = length (filter (el ==) list)
This is, in my view, the most readable version yet. And it is also just as efficient as the foldl' version of count because of laziness. Here, Haskell's length function takes care of finding the optimal way to do the counting part of count, while the filter (el ==) takes care of the part of the loop where we check whether to increment output. In general, if you're iterating over a list and have an if P x statement, you can very often replace this with a call to filter P.
We can rewrite this one more time in "point-free style" as
count el = length . filter (el ==)
which is most likely how the function would be written in a library. . refers to function composition. The meaning of this is as follows:
To apply the function count el to a list, we first filter the list to keep only the elements which el ==, and then take the length.
Incidentally, the filter function is exactly what we need to write moreThan compactly:
moreThan a list = filter occursOften list where
  occursOften x = count x list >= a
Moral of the story: use higher-order functions whenever possible.
Whenever you solve a list problem in Haskell, the first tool you should reach for is functions defined in Data.List, especially map, foldl'/foldr, filter, and concatMap. Most list problems come down to map/fold/filter. These should be your go-to replacement for loops. If you're replacing a nested loop, you should use concatMap.
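One behavioural difference is worth noting: the filter-based moreThan keeps every occurrence of a qualifying element, whereas the index-based loop in the question skips elements that are already in output. If the deduplicating behaviour is wanted, one option is to pass the result through nub from Data.List. The sketch below assumes that choice; the name moreThanUnique is made up for this illustration.

```haskell
import Data.List (nub)

count :: Eq a => a -> [a] -> Int
count el = length . filter (el ==)

-- The filter-based version from above: keeps duplicates.
moreThan :: Eq a => Int -> [a] -> [a]
moreThan a list = filter occursOften list where
  occursOften x = count x list >= a

-- Hypothetical variant: each qualifying element is reported once,
-- in order of first occurrence.
moreThanUnique :: Eq a => Int -> [a] -> [a]
moreThanUnique a = nub . moreThan a
```

For example, moreThan 2 [1,2,1] gives [1,1], while moreThanUnique 2 [1,2,1] gives [1].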
in a functional way, ;)
moreThan n xs = nub $ concat [ x | x <- ( group(sort(xs))), length x > n ]
... or in a fancy way, lol
moreThan n xs = map head [ x | x <- ( group(sort(xs))), length x > n ]
...
mt1 n xs = [ head x | x <- ( group(sort(xs))), length x > n ]

A faster way of generating combinations with a given length, preserving the order

TL;DR: I want the exact behavior as filter ((== 4) . length) . subsequences. Just using subsequences also creates variable length of lists, which takes a lot of time to process. Since in the end only lists of length 4 are needed, I was thinking there must be a faster way.
I have a list of functions. The list has the type [Wor -> Wor]
The list looks something like this
[f1, f2, f3 .. fn]
What I want is a list of lists of n functions while preserving order like this
input : [f1, f2, f3 .. fn]
argument : 4 functions
output : A list of lists of 4 functions.
In the expected output, if f1 is in a sublist, it is always at the head of that sublist.
If f2 is in a sublist that doesn't contain f1, f2 is at the head. If fn is in a sublist, it is the last element.
In general, if fx is in a sublist, it never appears in front of f(x - 1).
Basically, the sublists preserve the order of the main list.
It can be assumed that the length of the list is always greater than the given argument.
I'm just starting to learn Haskell, so I haven't tried all that much, but this is what I have tried so far:
Generating the candidates with the subsequences function and applying filter ((== 4) . length) to the result seems to produce the correct combinations. (I originally thought it didn't preserve order, but it does; I was confusing it with my own function.)
So what should I do?
Also if possible, is there a function or a combination of functions present in Hackage or Stackage which can do this? Because I would like to understand the source.
You describe a nondeterministic take:
ndtake :: Int -> [a] -> [[a]]
ndtake 0 _ = [[]]
ndtake n [] = []
ndtake n (x:xs) = map (x:) (ndtake (n-1) xs) ++ ndtake n xs
Either we take an x, and have n-1 more to take from xs; or we don't take the x and have n more elements to take from xs.
Running:
> ndtake 3 [1..4]
[[1,2,3],[1,2,4],[1,3,4],[2,3,4]]
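As a sanity check against the expression from the question, the sketch below compares ndtake with filter ((== n) . length) . subsequences. The helper names viaSubsequences and sameCombinations are made up for this illustration. The two produce the same combinations but not always in the same order, so the comparison sorts both sides first.

```haskell
import Data.List (sort, subsequences)

ndtake :: Int -> [a] -> [[a]]
ndtake 0 _      = [[]]
ndtake n []     = []
ndtake n (x:xs) = map (x:) (ndtake (n-1) xs) ++ ndtake n xs

-- The expression from the question, for an arbitrary length n.
viaSubsequences :: Int -> [a] -> [[a]]
viaSubsequences n = filter ((== n) . length) . subsequences

-- Hypothetical helper: same set of combinations, ignoring result order.
sameCombinations :: Ord a => Int -> [a] -> Bool
sameCombinations n xs = sort (ndtake n xs) == sort (viaSubsequences n xs)
```

For instance, ndtake 2 [1..4] starts with [1,2],[1,3],[1,4], while the subsequences-based version yields [1,2],[1,3],[2,3] first.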
Update: you wanted efficiency. If we're sure the input list is finite, we can aim at stopping as soon as possible:
ndetake n xs = go (length xs) n xs
  where
    go spare n _  | n >  spare = []
    go spare n xs | n == spare = [xs]
    go spare 0 _  = [[]]
    go spare n [] = []
    go spare n (x:xs) = map (x:) (go (spare-1) (n-1) xs)
                        ++ go (spare-1) n xs
Trying it:
> length $ ndetake 443 [1..444]
444
The former version seems to be stuck on this input, but the latter one returns immediately.
But it measures the length of the whole list, and needlessly so, as pointed out by @dfeuer in the comments. We can achieve the same improvement in efficiency while retaining a bit more laziness:
ndzetake :: Int -> [a] -> [[a]]
ndzetake n xs | n > 0 =
    go n (length (take n xs) == n) (drop n xs) xs
  where
    go n b p ~(x:xs)
      | n == 0 = [[]]
      | not b  = []
      | null p = [(x:xs)]
      | otherwise = map (x:) (go (n-1) b p xs)
                    ++ go n b (tail p) xs
The last test runs instantly with this code as well.
There's still room for improvement here. Just as with the library function subsequences, the search space could be explored even more lazily. Right now we have
> take 9 $ ndzetake 3 [1..]
[[1,2,3],[1,2,4],[1,2,5],[1,2,6],[1,2,7],[1,2,8],[1,2,9],[1,2,10],[1,2,11]]
but it could be finding [2,3,4] before forcing the 5 out of the input list. Shall we leave it as an exercise?
Here's the best I've been able to come up with. It answers the challenge Will Ness laid down to be as lazy as possible in the input. In particular, ndtake m ([1..n]++undefined) will produce as many entries as possible before throwing an exception. Furthermore, it strives to maximize sharing among the result lists (note the treatment of end in ndtakeEnding'). It avoids problems with badly balanced list appends using a difference list. This sequence-based version is considerably faster than any pure-list version I've come up with, but I haven't teased apart just why that is. I have the feeling it may be possible to do even better with a better understanding of just what's going on, but this seems to work pretty well.
Here's the general idea. Suppose we ask for ndtake 3 [1..5]. We first produce all the results ending in 3 (of which there is one). Then we produce all the results ending in 4. We do this by (essentially) calling ndtake 2 [1..3] and adding the 4 onto each result. We continue in this manner until we have no more elements.
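Before the optimized version, the enumeration order just described can be written naively with plain lists. This sketch is not the implementation that follows: the name ndtakeByEnd is made up, it assumes n >= 0, and it uses repeated (++ [x]) with no sharing of tails, so it only illustrates the "group the results by their last element" idea.

```haskell
import Data.List (inits)

-- Naive illustration: for each element x, take every (n-1)-combination
-- of the elements before x and append x to it.
ndtakeByEnd :: Int -> [a] -> [[a]]
ndtakeByEnd 0 _  = [[]]
ndtakeByEnd n xs =
  concat [ map (++ [x]) (ndtakeByEnd (n - 1) front)
         | (front, x) <- zip (inits xs) xs ]   -- front = elements before x
```

Because each group only looks at the elements before its ending element, take 4 (ndtakeByEnd 3 [1..]) already yields [2,3,4] without touching 5.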
{-# LANGUAGE BangPatterns #-}  -- needed for the !front patterns used below
import qualified Data.Sequence as S
import Data.Sequence (Seq, (|>))
import Data.Foldable (toList)
We will use the following simple utility function. It's almost the same as splitAtExactMay from the 'safe' package, but hopefully a bit easier to understand. For reasons I haven't investigated, letting this produce a result when its argument is negative leads to ndtake with a negative argument being equivalent to subsequences. If you want, you can easily change ndtake to do something else for negative arguments.
-- to return an empty list in the negative case.
splitAtMay :: Int -> [a] -> Maybe ([a], [a])
splitAtMay n xs
  | n <= 0 = Just ([], xs)
splitAtMay _ [] = Nothing
splitAtMay n (x : xs) = flip fmap (splitAtMay (n - 1) xs) $
  \(front, rear) -> (x : front, rear)
Now we really get started. ndtake is implemented using ndtakeEnding, which produces a sort of "difference list", allowing all the partial results to be concatenated cheaply.
ndtake :: Int -> [t] -> [[t]]
ndtake n xs = ndtakeEnding n xs []
ndtakeEnding :: Int -> [t] -> ([[t]] -> [[t]])
ndtakeEnding 0 _xs = ([]:)
ndtakeEnding n xs = case splitAtMay n xs of
    Nothing -> id -- Not enough elements
    Just (front, rear) ->
        (front :) . go rear (S.fromList front)
  where
    -- For each element, produce a list of all combinations
    -- *ending* with that element.
    go [] _front = id
    go (r : rs) front =
      ndtakeEnding' [r] (n - 1) front
        . go rs (front |> r)
ndtakeEnding doesn't call itself recursively. Rather, it calls ndtakeEnding' to calculate the combinations of the front part. ndtakeEnding' is very much like ndtakeEnding, but with a few differences:
We use a Seq rather than a list to represent the input sequence. This lets us split and snoc cheaply, but I'm not yet sure why that seems to give amortized performance that is so much better in this case.
We already know that the input sequence is long enough, so we don't need to check.
We're passed a tail (end) to add to each result. This lets us share tails when possible. There are lots of opportunities for sharing tails, so this can be expected to be a substantial optimization.
We use foldr rather than pattern matching. Doing this manually with pattern matching gives clearer code, but worse constant factors. That's because the :<| and :|> patterns exported from Data.Sequence are non-trivial pattern synonyms that perform a bit of calculation, including amortized O(1) allocation, to build the tail or initial segment, whereas folds don't need to build those.
NB: this implementation of ndtakeEnding' works well for recent GHC and containers; it seems less efficient for earlier versions. That might be the work of Donnacha Kidney on foldr for Data.Sequence. In earlier versions, it might be more efficient to pattern match by hand, using viewl for versions that don't offer the pattern synonyms.
ndtakeEnding' :: [t] -> Int -> Seq t -> ([[t]] -> [[t]])
ndtakeEnding' end 0 _xs = (end:)
ndtakeEnding' end n xs = case S.splitAt n xs of
    (front, rear) ->
        ((toList front ++ end) :) . go rear front
  where
    go = foldr go' (const id) where
      go' r k !front = ndtakeEnding' (r : end) (n - 1) front . k (front |> r)
    -- With patterns, a bit less efficiently:
    -- go Empty _front = id
    -- go (r :<| rs) !front =
    --   ndtakeEnding' (r : end) (n - 1) front
    --     . go rs (front :|> r)

How to switch 2 elements in a list in Haskell

I will start with an example (I think it will show exactly my problem)
switch 1 2 [[1,2,3,4],[5,6,0,7]] -> [[1,2,0,4],[5,6,3,7]]
Here [[1,2,3,4],[5,6,0,7]] !! 1 !! 2 is the zero element. The first integer is always 1 and the second one ranges between 0 and 3. I want to swap the element selected by these indices (in the second component list) with the element at the same position in the first component list.
I know the lists are immutable in Haskell, however I still can't figure it out.
How can I do this?
switch _ n [xs,ys] = [xs',ys']
  where (xs',ys',_) = unzip3 $
                        map (\t@(x,y,m) -> if m==n then (y,x,m) else t) $
                        zip3 xs ys [0..]
switch i j l = [a,b]
  where (a,b) = unzip [if j==n then (l!!1!!n, l!!0!!n) else (l!!0!!n,l!!1!!n) | n<-[0..3]]
the first integer is useless

Efficiently find indices of maxima of a list

Edit: I must not have worded it clearly enough, but I'm looking for a function like the one below, but not exactly it.
Given a list, I wanted to be able to find the index of the largest element in the list
(So, list !! (indexOfMaximum list) == maximum list)
I wrote some code that seems pretty efficient, although I feel I'm reinventing the wheel somewhere.
indexOfMaximum :: (Ord n, Num n) => [n] -> Int
indexOfMaximum list =
   let indexOfMaximum' :: (Ord n, Num n) => [n] -> Int -> n -> Int -> Int
       indexOfMaximum' list' currIndex highestVal highestIndex
          | null list' = highestIndex
          | (head list') > highestVal =
               indexOfMaximum' (tail list') (1 + currIndex) (head list') currIndex
          | otherwise =
               indexOfMaximum' (tail list') (1 + currIndex) highestVal highestIndex
   in indexOfMaximum' list 0 0 0
Now I want to return a list of the indices of the largest n numbers in the list.
My only solution is to store the top n elements in a list and replace (head list') > highestVal with a comparison across the n-largest-so-far list.
It feels like there has to be a more efficient way than to do this, and I also feel I'm making insufficient use of Prelude and Data.List. Any suggestions?
This solution associates each element with its index, sorts the list, so the smallest element is first, reverses it so the largest element is first, takes the first n elements, and then extracts the index.
maxn n xs = map snd . take n . reverse . sort $ zip xs [0..]
The shortest way finds the last index of a maximum element,
maxIndex list = snd . maximum $ zip list [0 .. ]
If you want the first index,
maxIndex list = snd . maximumBy cmp $ zip list [0 .. ]
  where
    cmp (v,i) (w,j) = case compare v w of
                        EQ -> compare j i
                        ne -> ne
The downside is that maximum and maximumBy are too lazy, so these may build large thunks. To avoid that, either use manual recursion (as you did, though some strictness annotations may be necessary) or use a strict left fold with a strict accumulator type. Tuples are not good for that, because foldl' only evaluates to weak head normal form, that is, to the outermost constructor here, and thus you build thunks in the tuple components.
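A strict left fold with a strict accumulator type might look like the following. This is a sketch under assumptions: the Acc type and the name maxIndexStrict are made up for this illustration, and the result is wrapped in Maybe so that the empty list is handled. The strict fields (marked with !) are what keep thunks from building up inside the accumulator.

```haskell
import Data.List (foldl')

-- Strict accumulator: best value so far, its index, and the current index.
data Acc a = Acc !a !Int !Int

-- Index of the first maximum, or Nothing for an empty list.
maxIndexStrict :: Ord a => [a] -> Maybe Int
maxIndexStrict []     = Nothing
maxIndexStrict (x:xs) = Just (bestIndex (foldl' step (Acc x 0 1) xs))
  where
    step (Acc best bestIx ix) y
      | y > best  = Acc y ix (ix + 1)          -- strictly larger: new maximum
      | otherwise = Acc best bestIx (ix + 1)   -- keep the earlier maximum
    bestIndex (Acc _ ix _) = ix
```

Using >= instead of > in step would return the last index of a maximum instead of the first.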
Well, a simple way would be to use maximum to find the largest element and then use findIndices to find each occurrence of it. Something like:
largestIndices :: Ord a => [a] -> [Int]
largestIndices ls = findIndices (== maximum ls) ls
However, this is not perfect because maximum is a partial function and will barf horribly if given an empty list. You can easily avoid this by adding a [] case:
largestIndices :: Ord a => [a] -> [Int]
largestIndices [] = []
largestIndices ls = findIndices (== maximum ls) ls
The real trick to this answer is how I figured it out. I didn't even know about findIndices before now! However, GHCi has a neat command called :browse.
Prelude> :browse Data.List
This lists every single function exported by Data.List. Using this, I just searched first for maximum and then for index to see what the options were. And, right by findIndex, there was findIndices, which was perfect.
Finally, I would not worry about efficiency unless you actually see that code is running slowly. GHC can--and does--perform some very aggressive optimizations because the language is pure and it can get away with it. So the only time you need to worry about performance is when--after compiling with -O2--you see that it's a problem.
EDIT: If you want to find the n top elements' indices, here's an easy idea: sort the list in descending order, grab the first n unique elements, get their indices with elemIndices and take the first n indices from that. I hope this is relatively clear.
Here's a quick version of my idea:
nLargestIndices n ls = take n $ concatMap (`elemIndices` ls) nums
  where nums = take n . reverse . nub $ sort ls

Explain how a recursive Haskell list function works?

I know what the following function does; I would just like an explanation of how it works and the calculations that take place:
sponge :: Int -> [a] -> [a]
sponge 0 xs = xs
sponge n [] = []
sponge n (x:xs) = sponge (n-1) xs
I just seem to have lost the plot with it all now :(
Any help to get me back on track would be much appreciated! :)
It's a recursive function over two variables. You can break it apart line-by-line to understand it:
sponge :: Int -> [a] -> [a]
Two arguments, one an Int, one a list of some elements.
sponge 0 xs = xs
The base case. If the Int argument is zero, just return the list argument unmodified.
sponge n [] = []
Another base case, if the list is empty, immediately return the empty list.
sponge n (x:xs) = sponge (n-1) xs
Finally, the inductive step. If the list is non-empty (i.e. made up of at least one element and a tail, denoted by x:xs), then the result is sponge called on n-1 and the tail of the list.
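To make the three clauses concrete, the following sketch unfolds one call by hand. The reduction steps are written as comments and follow directly from the definition; the name example is made up for this illustration.

```haskell
sponge :: Int -> [a] -> [a]
sponge 0 xs     = xs
sponge n []     = []
sponge n (x:xs) = sponge (n-1) xs

-- sponge 2 [1,2,3,4]
--   = sponge 1 [2,3,4]   -- third clause: drop 1, decrement n
--   = sponge 0 [3,4]     -- third clause: drop 2, decrement n
--   = [3,4]              -- first clause: n is 0, return the rest
example :: [Int]
example = sponge 2 [1,2,3,4]
```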
So what will this function do? It will return the tail of the list after dropping n elements. It is the same as the drop function:
> drop 10 [1..20]
[11,12,13,14,15,16,17,18,19,20]
And
> sponge 10 [1..20]
[11,12,13,14,15,16,17,18,19,20]
In fact, we can ask QuickCheck to confirm:
> quickCheck $ \n xs -> sponge n xs == drop n xs
*** Failed! Falsifiable (after 7 tests and 5 shrinks):
-1
[()]
Ah! They're different. When n is negative! So we can modify the property relating the two functions:
> quickCheck $ \n xs -> n >= 0 ==> sponge n xs == drop n xs
+++ OK, passed 100 tests.
So your function behaves like drop, for cases when n is non-negative.
Here's a trace of the intermediate values of n and xs, obtained via the hood debugger:
It takes two parameters, as you can see: an Int and a list. It pattern-matches to distinguish three cases: 1) the Int is zero; 2) the list is empty; or, 3) the Int is not zero nor is the list empty.
In case 1 it returns the list; in case 2, it returns the empty list (which is what the second parameter was anyway); in case 3, it recursively calls itself with original Int parameter minus 1 and the original list minus its first element.
It looks a lot like "drop" from the Prelude.