Implementing the Sieve of Eratosthenes in Idris

I'm struggling to translate my definition of the Sieve of Eratosthenes into Idris. Here is the function so far:
%default total

eratos : Nat -> (l : List Nat) -> { auto ok: NonEmpty l } -> List Nat
eratos limit (prime :: rest) =
  if prime * prime > limit -- if we've passed the square root of n
    then prime :: xs -- then we're done!
    -- otherwise, subtract the multiples of that prime and recurse
    else prime :: (eratos limit (rest \\ [prime^2,prime^2+prime..limit]))

main : IO ()
main = printLn $ eratos [2..100]
Unfortunately, I'm getting a strange compiler error:
idris --build euler.ipkg
./E003.idr:18:18: error: expected: ")",
dependent type signature
else prime :: (eratos n (xs \\ [prime^2,prime^2+prime..n]))
Why is the compiler looking for a type signature?

I was able to implement it as follows:
eratos : Nat -> (l : List Nat) -> List Nat
eratos _ [] = []
eratos limit (prime :: rest) =
  if prime * prime > limit -- if we've passed the square root of n
    then prime :: rest -- then we're done!
    -- otherwise, subtract the multiples of that prime and recurse
    else prime :: eratos limit (rest \\ [(prime*prime),(prime*prime+prime)..limit])
With this implementation, the type checker considers this function "covering".
Ideally, we wouldn't need the first case and could restrict the input to lists of length >= 1. However, it's hard to show the compiler either that the list will never be empty or that this function's second argument is getting structurally smaller on each recursive call. If anyone has suggestions, please add them as comments or another answer!

Function to find the most frequent element

I am trying to code a function that returns the element that appears the most in a list. So far I have the following
task :: Eq a => [a] -> a
task xs = (map ((\l@(x:xs) -> (x,length l)) (occur (sort xs))))
occur is a function that takes a list and returns a list of pairs with the elements of the input list along with the number of times they appear. So for example for the list [1,1,2,3,3] the output would be [(1,2),(2,1),(3,2)].
However, I am getting some errors related to the arguments of map. Can anyone tell me what I'm doing wrong?
A map maps every item to another item, so here the lambda's parameter l is a 2-tuple, like (1,2), (2, 1) or (3, 2). It thus does not make much sense to work with length l, since length :: Foldable f => f a -> Int will always return one for a 2-tuple: this is because only the second part of the 2-tuple is used in the foldable. But we do not need length in the first place.
What you need is a function that can retrieve the maximum based on the second item of the 2-tuple. We can make use of maximumOn :: Ord b => (a -> b) -> [a] -> a from the extra package, or we can implement our own function to calculate the maximum of a list of items.
Such a function should look like:
maximumSnd :: Ord b => [(a, b)] -> (a, b)
maximumSnd [] = error "Empty list"
maximumSnd (x:xs) = go xs x
    where go [] m = m
          go (x@(xa, xb):xs) (ya, yb)
              | xb > yb   = go … … -- (1)
              | otherwise = go … … -- (2)
Here (1) should be implemented such that we make a recursive call with x as the new maximum found thus far, and (2) should make a recursive call that keeps the maximum we had thus far.
Once we have implemented the maximumSnd function, we can use it as a helper function for:
task :: Eq a => [a] -> (a, Int)
task xs = maximumSnd (occur xs)
or we can use fst :: (a, b) -> a to retrieve the first item of the 2-tuple:
task :: Eq a => [a] -> a
task xs = (fst . maximumSnd) (occur xs)
In case there are two characters with a maximum number of elements, the maximumSnd will return the first one in the list of occurrences.
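For reference, here is one possible way the full pipeline could fit together once the two holes are filled in as the prose above describes. This is only a sketch: occur is re-implemented here for illustration (its real definition was not shown, and this stand-in needs Ord rather than just Eq), and main is added purely as a usage example.

import Data.List (group, sort)

-- Illustrative stand-in for the asker's occur helper: counts how often
-- each element appears, e.g. occur [1,1,2,3,3] == [(1,2),(2,1),(3,2)].
occur :: Ord a => [a] -> [(a, Int)]
occur = map (\g -> (head g, length g)) . group . sort

maximumSnd :: Ord b => [(a, b)] -> (a, b)
maximumSnd [] = error "Empty list"
maximumSnd (x:xs) = go xs x
    where go [] m = m
          go (x@(_, xb):xs) m@(_, yb)
              | xb > yb   = go xs x -- (1) x becomes the new running maximum
              | otherwise = go xs m -- (2) keep the running maximum

task :: Ord a => [a] -> a
task = fst . maximumSnd . occur

main :: IO ()
main = print (task [1, 1, 2, 3, 3, 3]) -- prints 3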

A faster way of generating combinations with a given length, preserving the order

TL;DR: I want exactly the behavior of filter ((== 4) . length) . subsequences. Just using subsequences also creates lists of other lengths, which take a lot of time to process. Since in the end only lists of length 4 are needed, I was thinking there must be a faster way.
I have a list of functions. The list has the type [Wor -> Wor]
The list looks something like this
[f1, f2, f3 .. fn]
What I want is a list of lists of n functions while preserving order like this
input : [f1, f2, f3 .. fn]
argument : 4 functions
output : A list of lists of 4 functions.
The expected output would be such that if there's an f1 in the sublist, it'll always be at the head of the list.
If there's an f2 in the sublist and the sublist doesn't have f1, f2 would be at the head. If fn is in the sublist, it'll be last.
In general, if there's an fx in the list, it will never be in front of f(x - 1).
Basically, preserving the main list's order when generating sublists.
It can be assumed that the length of the list will always be greater than the given argument.
I'm just starting to learn Haskell so I haven't tried all that much, but this is what I have tried so far:
Generating permutations with the subsequences function and applying filter ((== 4) . length) to it seems to generate the correct combinations -but it doesn't preserve order- (it does preserve order; I was confusing it with my own function).
So what should I do?
Also if possible, is there a function or a combination of functions present in Hackage or Stackage which can do this? Because I would like to understand the source.
You describe a nondeterministic take:
ndtake :: Int -> [a] -> [[a]]
ndtake 0 _ = [[]]
ndtake n [] = []
ndtake n (x:xs) = map (x:) (ndtake (n-1) xs) ++ ndtake n xs
Either we take an x, and have n-1 more to take from xs; or we don't take the x and have n more elements to take from xs.
Running:
> ndtake 3 [1..4]
[[1,2,3],[1,2,4],[1,3,4],[2,3,4]]
Update: you wanted efficiency. If we're sure the input list is finite, we can aim at stopping as soon as possible:
ndetake n xs = go (length xs) n xs
    where
    go spare n _  | n >  spare = []
    go spare n xs | n == spare = [xs]
    go spare 0 _      = [[]]
    go spare n []     = []
    go spare n (x:xs) = map (x:) (go (spare-1) (n-1) xs)
                           ++ go (spare-1) n xs
Trying it:
> length $ ndetake 443 [1..444]
444
The former version seems to be stuck on this input, but the latter one returns immediately.
But, it measures the length of the whole list, and needlessly so, as pointed out by @dfeuer in the comments. We can achieve the same improvement in efficiency while retaining a bit more laziness:
ndzetake :: Int -> [a] -> [[a]]
ndzetake n xs | n > 0 =
    go n (length (take n xs) == n) (drop n xs) xs
    where
    go n b p ~(x:xs)
        | n == 0    = [[]]
        | not b     = []
        | null p    = [(x:xs)]
        | otherwise = map (x:) (go (n-1) b p xs)
                         ++ go n b (tail p) xs
Now the last test works instantly with this code as well.
There's still room for improvement here. Just as with the library function subsequences, the search space could be explored even more lazily. Right now we have
> take 9 $ ndzetake 3 [1..]
[[1,2,3],[1,2,4],[1,2,5],[1,2,6],[1,2,7],[1,2,8],[1,2,9],[1,2,10],[1,2,11]]
but it could be finding [2,3,4] before forcing the 5 out of the input list. Shall we leave it as an exercise?
Here's the best I've been able to come up with. It answers the challenge Will Ness laid down to be as lazy as possible in the input. In particular, ndtake m ([1..n]++undefined) will produce as many entries as possible before throwing an exception. Furthermore, it strives to maximize sharing among the result lists (note the treatment of end in ndtakeEnding'). It avoids problems with badly balanced list appends using a difference list. This sequence-based version is considerably faster than any pure-list version I've come up with, but I haven't teased apart just why that is. I have the feeling it may be possible to do even better with a better understanding of just what's going on, but this seems to work pretty well.
Here's the general idea. Suppose we ask for ndtake 3 [1..5]. We first produce all the results ending in 3 (of which there is one). Then we produce all the results ending in 4. We do this by (essentially) calling ndtake 2 [1..3] and adding the 4 onto each result. We continue in this manner until we have no more elements.
{-# LANGUAGE BangPatterns #-} -- needed for the !front pattern used further down (on by default in GHC2021)

import qualified Data.Sequence as S
import Data.Sequence (Seq, (|>))
import Data.Foldable (toList)
We will use the following simple utility function. It's almost the same as splitAtExactMay from the 'safe' package, but hopefully a bit easier to understand. For reasons I haven't investigated, letting this produce a result when its argument is negative leads to ndtake with a negative argument being equivalent to subsequences. If you want, you can easily change ndtake to do something else for negative arguments, for example to return an empty list in the negative case.
splitAtMay :: Int -> [a] -> Maybe ([a], [a])
splitAtMay n xs
  | n <= 0 = Just ([], xs)
splitAtMay _ [] = Nothing
splitAtMay n (x : xs) = flip fmap (splitAtMay (n - 1) xs) $
  \(front, rear) -> (x : front, rear)
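A couple of illustrative calls (not from the original answer), just to pin down the behaviour:
> splitAtMay 2 "abcde"
Just ("ab","cde")
> splitAtMay 9 "abcde"
Nothing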
Now we really get started. ndtake is implemented using ndtakeEnding, which produces a sort of "difference list", allowing all the partial results to be concatenated cheaply.
ndtake :: Int -> [t] -> [[t]]
ndtake n xs = ndtakeEnding n xs []

ndtakeEnding :: Int -> [t] -> ([[t]] -> [[t]])
ndtakeEnding 0 _xs = ([]:)
ndtakeEnding n xs = case splitAtMay n xs of
  Nothing -> id -- Not enough elements
  Just (front, rear) ->
    (front :) . go rear (S.fromList front)
  where
    -- For each element, produce a list of all combinations
    -- *ending* with that element.
    go [] _front = id
    go (r : rs) front =
      ndtakeEnding' [r] (n - 1) front
      . go rs (front |> r)
ndtakeEnding doesn't call itself recursively. Rather, it calls ndtakeEnding' to calculate the combinations of the front part. ndtakeEnding' is very much like ndtakeEnding, but with a few differences:
We use a Seq rather than a list to represent the input sequence. This lets us split and snoc cheaply, but I'm not yet sure why that seems to give amortized performance that is so much better in this case.
We already know that the input sequence is long enough, so we don't need to check.
We're passed a tail (end) to add to each result. This lets us share tails when possible. There are lots of opportunities for sharing tails, so this can be expected to be a substantial optimization.
We use foldr rather than pattern matching. Doing this manually with pattern matching gives clearer code, but worse constant factors. That's because the :<| and :|> patterns exported from Data.Sequence are non-trivial pattern synonyms that perform a bit of calculation, including amortized O(1) allocation, to build the tail or initial segment, whereas folds don't need to build those.
NB: this implementation of ndtakeEnding' works well for recent GHC and containers; it seems less efficient for earlier versions. That might be the work of Donnacha Kidney on foldr for Data.Sequence. In earlier versions, it might be more efficient to pattern match by hand, using viewl for versions that don't offer the pattern synonyms.
ndtakeEnding' :: [t] -> Int -> Seq t -> ([[t]] -> [[t]])
ndtakeEnding' end 0 _xs = (end:)
ndtakeEnding' end n xs = case S.splitAt n xs of
  (front, rear) ->
    ((toList front ++ end) :) . go rear front
  where
    go = foldr go' (const id) where
      go' r k !front = ndtakeEnding' (r : end) (n - 1) front . k (front |> r)
    -- With patterns, a bit less efficiently:
    -- go Empty _front = id
    -- go (r :<| rs) !front =
    --   ndtakeEnding' (r : end) (n - 1) front
    --   . go rs (front :|> r)
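As a quick illustrative check of the laziness claim above (a GHCi session added here for demonstration; it is not part of the original answer):
> take 4 $ ndtake 3 ([1..4] ++ undefined)
[[1,2,3],[1,2,4],[1,3,4],[2,3,4]]
All four combinations that can be built from the known prefix come out before anything forces the undefined tail; asking for a fifth element would throw.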

To leave in the list items whose values coincide with the numbers of their positions in the list

I need to change a list, for example:
[1,2,4,6,5,10]
To this one
[1,2,5] (the list of elements that are in the correct position).
1st element value is 1 - ok,
2nd element value is 2 - ok,
3rd element value is 4 but expected 3 (due to the index) - remove,
and so on. How can I solve the error which is attached below?
My code:
module Count where

import Control.Monad.State

nthel n xs = last xsxs
    where xsxs = take n xs

deleteNth i items = take i items ++ drop (1 + i) items

repeatNTimes 0 _ = return ()
repeatNTimes n xs =
    do
        if (n == nthel n xs) then return()
        else deleteNth (n-1) xs
        repeatNTimes (n-1) xs

list = [1,2,3,4,5]

main = repeatNTimes (length list) list
I have the following error:
* Couldn't match type `Int' with `()'
  Expected type: [()]
    Actual type: [Int]
* In the expression: deleteNth (n - 2) xs
  In a stmt of a 'do' block:
    if (n == nthel n xs) then return () else deleteNth (n - 2) xs
  In the expression:
    do { if (n == nthel n xs) then return () else deleteNth (n - 2) xs;
         repeatNTimes (n - 1) xs }
A really nice way to work with this is to stitch functions together. First one might need to get to know the functions in the Data.List module, which you can find with hoogle: http://hoogle.haskell.org
Data.List Module functions
I'll give you a little bit of a boost here. The functions I would pick out are the zip function (https://hackage.haskell.org/package/base-4.9.1.0/docs/Data-List.html#v:zip), whose type is [a] -> [b] -> [(a, b)]; the filter function (https://hackage.haskell.org/package/base-4.9.1.0/docs/Prelude.html#v:filter), whose type is (a -> Bool) -> [a] -> [a]; the map function, whose type is (a -> b) -> [a] -> [b]; along with fst :: (a, b) -> a.
Function Composition
These functions can be stitched together using the function composition operator (.) :: (b -> c) -> (a -> b) -> a -> c. It takes two functions that share a common input/output point (in the type signature they are the second and first parameters, respectively: a -> b and b -> c) and joins them into one single function.
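As a tiny illustration of composition (a hypothetical example, not from the original answer), with (+ 1) playing the role of a -> b and show playing the role of b -> c:
> (show . (+ 1)) 41
"42"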
Stacking it up - required knowledge
In order to do what you want to do, you really need to know about simple types, parameterised types, ranges (knowing about lazy infinite ranges would help), functions and possibly recursion, as well as some higher-order functions, how Haskell functions are curried, and function composition. It wouldn't hurt to add a basic understanding of what typeclasses are and do into the mix.
I helped author a tutorial which can really help with understanding how this stuff works from a usage point of view by following a series of interesting examples. It's not too long, and you might find it much easier to approach your problem once you have understood some of the more foundational stuff: http://happylearnhaskelltutorial.com — note that it's not tuned to teaching you how to construct stuff, that'll be coming in a later volume, but it should give you enough understanding to be able to at least guess at an answer, or understand the one below.
The Answer - spoilers
If you want to work this out yourself, you should stop here and come back later on when you're feeling more confident. However, I'm going to put one possible answer just below, so don't look if you don't want to know yet!
positionals :: (Enum a, Eq a, Num a) => [a] -> [a]
positionals = map fst . filter (\(x, y) -> x == y) . zip [1..]
Keep in mind this is only one way of doing this. There are simpler, more explanatory ways to do it, and while it might seem inefficient, Haskell has list/stream fusion which compiles that function into something that will do a single pass across your data.
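Trying it on the list from the question (illustrative):
> positionals [1,2,4,6,5,10]
[1,2,5]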

How does one signify an infinite list to be ascending for elem checks?

I have an infinite list of primes initialized by the following list comprehension:
primes = [x | x <- [2..], 0 `notElem` map (x `mod`) [2..(x `quot` 2)]]
This allows me to make checks like 17 `elem` primes to confirm that 17 is a prime. However, when I check whether a non-prime is in the list, the program does not stop computing. I assume that this is because it does not realize that if the number cannot be found in the list before a prime that is greater than the number, it cannot be found anywhere in the list. Therefore, is there any way in Haskell to signify to the compiler that a list contains only ascending numbers, so that an elem check will know to stop and return False if it reaches a number in the list greater than its first argument?
One possibility would be to use dropWhile:
isPrime n = (head $ dropWhile (< n) primes) == n
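For example (an illustrative GHCi session, not part of the original answer):
> isPrime 17
True
> isPrime 18
False
Because primes is ascending, dropWhile (< 18) stops at the first prime not below 18 (namely 19), so the check terminates even though the list is infinite.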
Sure. You can define your own OrderedList newtype, wrap the infinite list in it, and define a more efficient searching function that takes an OrderedList as its argument.
newtype OrderedList a = OL [a]

member a (OL as) = case dropWhile (< a) as of
    []    -> False
    (x:_) -> a == x
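For instance, reusing the primes list from the question (an illustrative session):
> member 17 (OL primes)
True
> member 18 (OL primes)
False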
You cannot override the behavior of elem even though it's a class method of Foldable, since the definition of elem only requires the underlying element type to be an instance of Eq, namely:
member :: (Ord a, Eq a) => a -> OrderedList a -> Bool
elem :: (Eq a, Foldable t) => a -> t a -> Bool
You can verify that by the following code:
instance Foldable OrderedList where
    foldMap f (OL as) = foldMap f as
    elem = member -- error: Could not deduce `Ord a` arising from a use of `member`
Just a note: when your list is not infinite, you'd better consider making use of tree-like structures (e.g. IntSet); they improve the complexity of the search operation from O(n) to O(log n).
One can code it as a fold:
memberOrd :: (Eq a, Ord a) => a -> [a] -> Bool
memberOrd x = foldr (\y b -> y==x || y<x && b) False
The laziness of || makes it work on infinite lists as well.
(Clearly, we must assume that the list does not contain infinitely many elements < x. We are not suddenly able to solve undecidable problems... ;-) )
Will Ness below suggests the following variant, which performs fewer comparisons:
memberOrd x = foldr (\y b -> y<x && b || y==x) False
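Again, with the infinite primes list from the question (an illustrative check; either variant behaves the same here):
> memberOrd 17 primes
True
> memberOrd 18 primes
False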

Is there any way to separate infinite and finite lists?

For example, I am writing some function for lists and I want to use the length function
foo :: [a] -> Bool
foo xs = length xs == 100
How can someone tell whether this function can be used with infinite lists or not?
Or should I always think about infinite lists and use something like this
foo :: [a] -> Bool
foo xs = length (take 101 xs) == 100
instead of using length directly?
What if Haskell had a FiniteList type, so that length and foo would be
length :: FiniteList a -> Int
foo :: FiniteList a -> Bool
length traverses the entire list, but to determine if a list has a particular length n you only need to look at the first n elements.
Your idea of using take will work. Alternatively
you can write a lengthIs function like this:
-- assume n >= 0
lengthIs 0 [] = True
lengthIs 0 _ = False
lengthIs n [] = False
lengthIs n (x:xs) = lengthIs (n-1) xs
You can use the same idea to write the lengthIsAtLeast and lengthIsAtMost variants.
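A possible reading of those variants (a sketch based on the same idea, not code from the original answer):

lengthIsAtLeast :: Int -> [a] -> Bool
lengthIsAtLeast 0 _      = True
lengthIsAtLeast _ []     = False
lengthIsAtLeast n (_:xs) = lengthIsAtLeast (n-1) xs

lengthIsAtMost :: Int -> [a] -> Bool
lengthIsAtMost _ []     = True
lengthIsAtMost 0 _      = False
lengthIsAtMost n (_:xs) = lengthIsAtMost (n-1) xs

With this approach, lengthIs 100 [1..] returns False after inspecting only 101 elements, whereas length [1..] == 100 would never return.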
On edit: I am primarily responding to the question in your title rather than the specifics of your particular example (for which ErikR's answer is excellent).
A great many functions (such as length itself) on lists only make sense for finite lists. If the function that you are writing only makes sense for finite lists, make that clear in the documentation (if it isn't obvious). There isn't any way to enforce the restriction since the halting problem is unsolvable. There simply is no algorithm to determine ahead of time whether or not the expression
takeWhile f [1..]
(where f is a predicate on integers) produces a finite or an infinite list.
Nats and laziness strike again:
import Data.List

data Nat = S Nat | Z deriving (Eq)

instance Num Nat where
    fromInteger 0 = Z
    fromInteger n = S (fromInteger (n - 1))
    Z + m = m
    S n + m = S (n + m)

lazyLength :: [a] -> Nat
lazyLength = genericLength

main = do
    print $ lazyLength [1..] == 100 -- False
    print $ lazyLength [1..100] == 100 -- True
ErikR and John Coleman have already answered the main parts of your question, however I'd like to point out something in addition:
It's best to write your functions in a way that they simply don't depend on the finiteness or infinity of their inputs — sometimes it's impossible but a lot of the time it's just a matter of redesign. For example instead of computing the average of the entire list, you can compute a running average, which is itself a list; and this list will itself be infinite if the input list is infinite, and finite otherwise.
avg :: [Double] -> [Double]
avg = drop 1 . scanl f 0.0 . zip [0..]
    where f avg (n, i) = avg * (dbl n / dbl n') +
                         i / dbl n'  where n' = n+1
          dbl = fromInteger
in which case you could average an infinite list, not having to take its length:
*Main> take 10 $ avg [1..]
[1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5]
In other words, one option is to design as many of your functions as possible to simply not care about the infinity aspect, and to delay the (full) evaluation of lists, and other (potentially infinite) data structures, to as late a phase in your program as possible.
This way they will also be more reusable and composable — anything with fewer or more general assumptions about its inputs tends to be more composable; conversely, anything with more or more specific assumptions tends to be less composable and therefore less reusable.
There are a couple different ways to make a finite list type. The first is simply to make lists strict in their spines:
data FList a = Nil | Cons a !(FList a)
Unfortunately, this throws away all efficiency benefits of laziness. Some of these can be recovered by using length-indexed lists instead:
{-# LANGUAGE GADTs #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# OPTIONS_GHC -fwarn-incomplete-patterns #-}

data Nat = Z | S Nat deriving (Show, Read, Eq, Ord)

data Vec :: Nat -> * -> * where
    Nil :: Vec 'Z a
    Cons :: a -> Vec n a -> Vec ('S n) a

instance Functor (Vec n) where
    fmap _f Nil = Nil
    fmap f (Cons x xs) = Cons (f x) (fmap f xs)

data FList :: * -> * where
    FList :: Vec n a -> FList a

instance Functor FList where
    fmap f (FList xs) = FList (fmap f xs)

fcons :: a -> FList a -> FList a
fcons x (FList xs) = FList (Cons x xs)

funcons :: FList a -> Maybe (a, FList a)
funcons (FList Nil) = Nothing
funcons (FList (Cons x xs)) = Just (x, FList xs)

-- Foldable and Traversable instances are straightforward
-- as well, and in recent GHC versions, Foldable brings
-- along a definition of length.
GHC does not allow infinite types, so there's no way to build an infinite Vec and thus no way to build an infinite FList (1). However, an FList can be transformed and consumed somewhat lazily, with the cache and garbage collection benefits that entails.
(1) Note that the type system forces fcons to be strict in its FList argument, so any attempt to tie a knot with FList will bottom out.
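As an illustrative aside (these helpers are not part of the original answer), a finite ordinary list can be packed into an FList and unpacked again; trying to pack an infinite list bottoms out, in line with note (1):

ffromList :: [a] -> FList a
ffromList = foldr fcons (FList Nil)

ftoList :: FList a -> [a]
ftoList fl = case funcons fl of
    Nothing        -> []
    Just (x, rest) -> x : ftoList rest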