Lazy List of Prime Numbers

How would one implement a list of prime numbers in Haskell so that they could be retrieved lazily?
I am new to Haskell, and would like to learn about practical uses of the lazy evaluation functionality.

Here's a short Haskell function that enumerates primes from Literate Programs:
primes :: [Integer]
primes = sieve [2..]
  where
    sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p > 0]
Apparently, this is not the Sieve of Eratosthenes (thanks, Landei). I think it's still an instructive example: it shows that you can write very elegant, short code in Haskell, and it shows how choosing the wrong data structure can badly hurt efficiency.
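For contrast, here is a minimal sketch of one well-known incremental formulation that does behave like the Sieve of Eratosthenes, keeping a map from upcoming composites to the primes that strike them out. It is not taken from the answer above; the names primesIncremental and reschedule are made up for illustration, and it assumes the containers package (Data.Map) is available.

```haskell
import qualified Data.Map.Strict as Map

-- The map sends each scheduled composite to the primes that divide it.
primesIncremental :: [Integer]
primesIncremental = go [2 ..] Map.empty
  where
    go [] _ = []
    go (x : xs) table =
      case Map.lookup x table of
        -- x was never scheduled as a multiple, so it is prime;
        -- the first multiple worth scheduling is x * x.
        Nothing -> x : go xs (Map.insert (x * x) [x] table)
        -- x is composite: move each prime that hit x on to its next multiple.
        Just ps -> go xs (foldr reschedule (Map.delete x table) ps)
      where
        reschedule p = Map.insertWith (++) (x + p) [p]
```

Because each prime only touches its own multiples, no candidate is tested against primes that do not divide it, which is where the list-comprehension version loses time.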

There are a number of solutions for lazy generation of prime sequences right in the Haskell wiki. The first and simplest is the Postponed Turner sieve: (old revision ... NB)
primes :: [Integer]
primes = 2 : 3 : sieve (tail primes) [5,7..]
  where
    sieve (p:ps) xs = h ++ sieve ps [x | x <- t, x `rem` p /= 0]
                      -- or: filter ((/=0).(`rem`p)) t
      where (h,~(_:t)) = span (< p*p) xs

The accepted answer from @nikie is not very efficient: it gets relatively slow after a few thousand. The answer from @sleepynate is much better. It took me some time to understand it, so here is the same code with more clearly named variables:
lazyPrimes :: [Integer]
lazyPrimes = 2 : 3 : calcNextPrimes (tail lazyPrimes) [5, 7 ..]
  where
    calcNextPrimes (p:ps) candidates =
      let (smallerSquareP, (_:biggerSquareP)) = span (< p * p) candidates
      in smallerSquareP ++ calcNextPrimes ps [c | c <- biggerSquareP, rem c p /= 0]
The main idea is that the candidates for the next primes already contain no numbers divisible by any prime smaller than the first prime given to the function. So if you call
calcNextPrimes (5:ps) [11,13,17..]
the candidate list contains no number divisible by 2 or 3. That means the first non-prime candidate will be 5 * 5, because 5 * 2, 5 * 3 and 5 * 4 have already been eliminated. This lets you take all candidates smaller than the square of 5 and add them straight to the primes, then sieve the rest to eliminate all numbers divisible by 5.
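To make that single step concrete, here is a small sketch that runs it on a finite prefix of the candidates. The helper stepFor and the cut-off at 49 are illustrative assumptions, not part of the answer's code.

```haskell
-- One sieve step: split the candidates at p * p, keep the part below as
-- confirmed primes, drop p * p itself, and filter multiples of p from the rest.
stepFor :: Integer -> [Integer] -> ([Integer], [Integer])
stepFor p candidates =
  let (below, rest) = span (< p * p) candidates
  in (below, [c | c <- drop 1 rest, c `rem` p /= 0])

-- Odd numbers from 11 to 49 that are not divisible by 3,
-- i.e. what is left after the steps for 2 and 3.
candidatesFrom11 :: [Integer]
candidatesFrom11 = [c | c <- [11, 13 .. 49], c `rem` 3 /= 0]

-- stepFor 5 candidatesFrom11
--   == ([11,13,17,19,23], [29,31,37,41,43,47,49])
```

The first component goes straight into the output; the second is what the next step (for 7) receives, which is why 49 is still present.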

primes = 2 : [x | x <- [3..], all (\y -> x `mod` y /= 0)
                   (takeWhile (<= (floor . sqrt $ fromIntegral x)) primes)]
With 2 in the list initially, for each integer x greater than 2, check if for all y in primes such that y <= sqrt(x), x mod y != 0 holds, which means x has no other factors except 1 and itself.
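As for retrieving primes lazily, a sketch along these lines shows that consumers only force as much of the infinite list as they ask for. The definition follows the one above with an explicit type; the helper isqrt and the names firstTen, below30 and hundredth are added here for illustration.

```haskell
primes :: [Integer]
primes = 2 : [x | x <- [3 ..], all (\y -> x `mod` y /= 0) (smallPrimes x)]
  where
    -- Primes no larger than the (floating-point) square root of x.
    smallPrimes x = takeWhile (<= isqrt x) primes
    isqrt :: Integer -> Integer
    isqrt = floor . sqrt . (fromIntegral :: Integer -> Double)

-- Each of these forces only a finite prefix of the infinite list.
firstTen :: [Integer]
firstTen = take 10 primes

below30 :: [Integer]
below30 = takeWhile (< 30) primes

hundredth :: Integer
hundredth = primes !! 99   -- 541
```

Since primes is a top-level value, the prefix computed for one query is shared by later ones.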

Related

Building a list from left-to-right in Haskell without ++

Is there a way to build lists from left-to-right in Haskell without using ++?
cons is a constant time operation and I want to keep the code efficient. I feel like there's a general way to take advantage of Haskell's laziness to do something like this, but I can't think of it.
Right now I'm writing a function that creates a Collatz Sequence, but it's building the list in the wrong direction:
module CollatzSequence where
collatz :: (Integral a) => a -> [a] -> [a]
collatz n l
  | n <= 0 = error "Enter a starting number > 0"
collatz n [] = collatz n [n]
collatz n l@(x:_)
  | x == 1 = l
  | even x = collatz n ((div x 2):l)
  | otherwise = collatz n ((x*3 + 1):l)
In GHCi:
*CollatzSequence> collatz 13 []
[1,2,4,8,16,5,10,20,40,13]
There is indeed a way to take advantage of laziness. In Haskell you can safely do recursive calls inside lazy data constructors, and there will be no risk of stack overflow or divergence. Placing the recursive call inside a constructor eliminates the need for an accumulator, and the order of elements in the list will also correspond to the order in which they are computed:
collatz :: Integer -> [Integer]
collatz n | n <= 1 = []
collatz n = n : collatz next where
  next = if even n then div n 2 else n * 3 + 1
For example, the expression head $ collatz 10 evaluates to head (10 : <thunk>) which evaluates to 10, and the thunk in the tail will stay unevaluated. Another advantage is that list nodes can be garbage collected while iterating over the list. foldl' (+) 0 (collatz n) runs in constant space, since the visited nodes are no longer referenced by the rest of the program and can be freed. This is not the case in your original function, since - being tail recursive - it cannot provide any partial result until the whole list is computed.
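A short sketch of both points, using the definition above (note that this version stops before emitting the final 1). The names firstThree and total are illustrative.

```haskell
import Data.List (foldl')

collatz :: Integer -> [Integer]
collatz n | n <= 1 = []
collatz n = n : collatz next
  where
    next = if even n then n `div` 2 else n * 3 + 1

-- Only the first three cells are ever built; the rest stays a thunk.
firstThree :: [Integer]
firstThree = take 3 (collatz 27)        -- [27,82,41]

-- The strict left fold consumes cells as they are produced,
-- so already-visited cells can be garbage collected.
total :: Integer
total = foldl' (+) 0 (collatz 13)       -- 118
```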
Is this what you are looking for?
collatz :: (Integral a) => a -> [a]
collatz n
  | n <= 0 = error "Enter a starting number > 0"
  | n == 1 = [1]
  | even n = n : collatz (div n 2)
  | otherwise = n : collatz (n*3 + 1)

how to use an incomplete list in its condition

I know.. The title isn't explaining well.. if you have a better title tell me in a comment..I'm making a prime numbers generator for fun and learning purposes..here's my code:
divisors x xs = [ y | y <- [1]++xs++[x], x `mod` y == 0]
isPrime x xs = divisors x xs == [1,x]
primeLst = [ x | x <- [2..], isPrime x primeLst]
As you can see.. I must use the already generated primes in a condition when generating a new one to reduce execution time.. and it's not working.. Is there a way of making it?
Let us start by looking at the divisors function. It is sort of correct, with only two issues:
What is the xs argument supposed to be? From the definition, it looks like it should be all the prime numbers below x - let's call these the candidate primes. So if x was 10, then the candidate primes should be [2,3,5,7]. However, this is not what the function gets as an argument. In your code, xs is the infinite list of primes.
Technically, the divisors function doesn't return all divisors. divisors 16 [2,3,5,7,11,13] wouldn't return 8, for instance. But this is a minor nitpick.
So if we can call divisors with the right list of primes, then we should be ok, and the isPrime function would also be fine.
The problem is getting the list of candidate primes. For clarity, I will give the code first, and then explain:
primeLst = 2 : [ x | x <- [3..], isPrime x (takeWhile (\p -> p*p <= x) primeLst)]
I have made two changes:
I made sure that primeLst includes 2, by sticking it at the front.
I restricted the candidate primes by taking numbers from the infinite list of primes until I reached a number that was higher than the square root of the number I was testing for primeness. In doing this I changed the definition of the candidate primes slightly, so for instance the candidates for 26 are [2,3,5] instead of [2,3,5,7,11,13,17,19,23]. But it still works.
Two questions for you to think about:
Why does it still work with the new definition of candidate primes?
Why doesn't the following line of code work, even though it seems it should give the original definition of the candidate primes?
primeLst = 2 : [ x | x <- [3..], isPrime x (takeWhile (\p -> p < x) primeLst)]
The last question is hard, so if you have questions, post them in the comments.
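For reference, here is the working definition assembled into one self-contained sketch. The helper candidatesFor is a name introduced here for illustration; the rest follows the code above. The key point it lets you check is that the squared bound stops takeWhile before it ever asks for a prime that has not been produced yet.

```haskell
divisors :: Integer -> [Integer] -> [Integer]
divisors x xs = [y | y <- [1] ++ xs ++ [x], x `mod` y == 0]

isPrime :: Integer -> [Integer] -> Bool
isPrime x xs = divisors x xs == [1, x]

-- Primes p with p * p <= x. For x = 3 this is already empty after
-- looking only at the head of primeLst, so no circular demand arises.
candidatesFor :: Integer -> [Integer]
candidatesFor x = takeWhile (\p -> p * p <= x) primeLst

primeLst :: [Integer]
primeLst = 2 : [x | x <- [3 ..], isPrime x (candidatesFor x)]
```

For example, candidatesFor 26 evaluates to [2,3,5], matching the description above.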
dividedBy a b = a `mod` b == 0
isPrime x = null $ filter (x `dividedBy`) $ takeWhile (\y -> y * y <= x) primeLst
primeLst = 2:(filter isPrime [3..])
or, more verbose:
primeLst = 2:(filter isPrime [3..])
  where
    isPrime x = null $ primeDivisors x
    primeDivisors x = filter (x `dividedBy`) $ potentialDivisors x
    potentialDivisors x = takeWhile (\y -> y * y <= x) primeLst
    a `dividedBy` b = a `mod` b == 0
Calculating at least one element of primeLst requires calculating isPrime 2 primeLst which requires calculating at least two elements of divisors 2 primeLst which requires calculating at least two elements of [1]++primeLst++[2], which requires calculating at least one element of primeLst. Not gonna happen.
You might be able to make something work that uses [1]++(primesLessThan x)++[x], but I don't see a straightforward way to define (primesLessThan x) in terms of primeLst which avoids computational circularity.

Making a list of lists to compute Pascal's triangle in Haskell

I'm trying to make a function that takes in an integer m and returns the rows of Pascal's triangle up to that mth row.
I have already constructed a choose function, that takes in two integers n and k, and returns the value n choose k. For example, choose 3 2 returns 3.
So far, I have
pascal 0 = [1]
pascal m = [x | x <- pascal (m-1)] ++ [choose m k | k <- [0,1..m]]
This is returning one big list, but really, I want a list of lists, where each list corresponds to a row in Pascal's triangle. For example pascal 3 should return [[1],[1,1],[1,2,1],[1,3,3,1]]. It currently is returning [1,1,1,1,2,1,1,3,3,1].
There are solutions, and then there are solutions. Let's start with solutions first, and work our way up to solutions.
The first thing to observe is that if we want the result you claimed, we have to change the type, and do a bit more wrapping:
-- was pascal :: Integer -> [Integer]
pascal :: Integer -> [[Integer]]
pascal 0 = [[1]]
pascal m = [x | x <- pascal (m-1)] ++ [[choose m k | k <- [0,1..m]]]
Now then, a few syntactic pointers: [x | x <- foo] is better written just foo, and [0,1..m] is often the same as just [0..m]:
pascal m = pascal (m-1) ++ [[choose m k | k <- [0..m]]]
You'll observe that this is appending singleton lists to the end of another list on each recursive call. This is inefficient; it's better to build lists from the front. So, we'll use a common refactoring: we'll create a helper with an accumulator.
pascal = go [] where
  go acc 0 = [1] : acc
  go acc m = go ([choose m k | k <- [0..m]] : acc) (m-1)
The next observation is that you can do things a bit more efficiently than recomputing choose m k every time: you can compute the next row of Pascal's triangle using only the previous row and some additions. This means we can build a lazy (infinite) list of all the rows of Pascal's triangle.
nextRow vs = [1] ++ zipWith (+) vs (tail vs) ++ [1]
allPascals = iterate nextRow [1]
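To get back the list of rows the question asked for, one can take a prefix of that infinite list. A minimal sketch, where pascalUpTo is a name chosen here and drop 1 stands in for tail (they agree on the non-empty rows used):

```haskell
nextRow :: [Integer] -> [Integer]
nextRow vs = [1] ++ zipWith (+) vs (drop 1 vs) ++ [1]

allPascals :: [[Integer]]
allPascals = iterate nextRow [1]

-- Rows 0 through m; laziness means no further rows are computed.
pascalUpTo :: Int -> [[Integer]]
pascalUpTo m = take (m + 1) allPascals

-- pascalUpTo 3 == [[1],[1,1],[1,2,1],[1,3,3,1]]
```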
Finally, since all of the rows of Pascal's triangle are symmetric across their midpoint, you might try to build an infinite list of just the first halves of each row. This would have the benefit of eliminating the remaining "append to the end of a list" operation. I leave this as an exercise; keep in mind that rows alternate between an even and odd number of elements, which makes this part a bit trickier (and uglier).

haskell, counting how many prime numbers are there in a list

I'm a newbie to Haskell. Currently I need a function 'f' which, given two integers, returns the number of prime numbers between them (i.e., greater than the first integer but smaller than the second).
Main> f 2 4
1
Main> f 2 10
3
Here is my code so far, but it doesn't work. Any suggestions? Thanks.
f :: Int -> Int -> Int
f x y
  | x < y = length [ n | n <- [x..y], y 'mod' n == 0]
  | otherwise = 0
Judging from your example, you want the number of primes in the open interval (x,y), which in Haskell is denoted [x+1 .. y-1].
Your primality testing is flawed; you're testing for factors of y.
To use a function name as an infix operator, use backticks (`), not single quotes (').
Try this instead:
-- note: no need for the otherwise, since [x..y] == [] if x>y
nPrimes a b = length $ filter isPrime [a+1 .. b-1]
Exercise for the reader: implement isPrime. Note that it only takes one argument.
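If you want to check your own attempt against something, here is one possible sketch of isPrime using plain trial division up to the square root. It is only one of many ways to do the exercise, not necessarily what the answer had in mind.

```haskell
-- Trial division: n is prime if no d with d * d <= n divides it.
isPrime :: Int -> Bool
isPrime n
  | n < 2 = False
  | otherwise = all (\d -> n `mod` d /= 0) (takeWhile (\d -> d * d <= n) [2 ..])

-- Number of primes strictly between a and b.
nPrimes :: Int -> Int -> Int
nPrimes a b = length (filter isPrime [a + 1 .. b - 1])

-- nPrimes 2 4  == 1
-- nPrimes 2 10 == 3
```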
Look at what your list comprehension does.
n <- [x..y]
Draw n from a list ranging from x to y.
y `mod` n == 0
Only select those n which evenly divide y.
length (...)
Find how many such n there are.
What your code currently does is find out how many of the numbers between x and y (inclusive) are factors of y. So if you do f 2 4, the list will be [2, 4] (the numbers that evenly divide 4), and the length of that is 2. If you do f 2 10, the list will be [2, 5, 10] (the numbers that evenly divide 10), and the length of that is 3.
It is important to try to understand for yourself why your code doesn't work. In this case, it's simply the wrong algorithm. For algorithms that find whether a number is prime, among many other sources, you can check the wikipedia article: Primality test.
If you want to work with large intervals, then it might be a better idea to compute a list of primes once (instead of doing an isPrime test for every number):
primes = -- A list with all prime numbers
candidates = [a+1 .. b-1]
myprimes = intersectSortedLists candidates primes
nPrimes = length $ myprimes
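The outline above leaves primes and intersectSortedLists undefined. A runnable sketch under some assumptions: the prime list is a simple lazy trial-division one (any ascending list of primes would do), and intersectSorted and nPrimesSieved are names chosen here for illustration.

```haskell
-- A lazy, ascending, infinite list of primes (simple trial division).
primes :: [Int]
primes = 2 : filter isPrime [3, 5 ..]
  where
    isPrime n = all (\p -> n `mod` p /= 0) (takeWhile (\p -> p * p <= n) primes)

-- Intersection of two ascending lists. It stops as soon as either list
-- is exhausted, so the second argument may be infinite.
intersectSorted :: Ord a => [a] -> [a] -> [a]
intersectSorted [] _ = []
intersectSorted _ [] = []
intersectSorted xs@(x : xt) ys@(y : yt)
  | x < y = intersectSorted xt ys
  | x > y = intersectSorted xs yt
  | otherwise = x : intersectSorted xt yt

nPrimesSieved :: Int -> Int -> Int
nPrimesSieved a b = length (intersectSorted [a + 1 .. b - 1] primes)
```

Because the candidate range is finite, the walk over the infinite primes list ends once the candidates run out.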

Concurrent Prime Generator

I'm going through the problems on projecteuler.net to learn how to program in Erlang, and I am having the hardest time creating a prime generator that can create all of the primes below 2 million, in less than a minute. Using the sequential style, I have already written three types of generators, including the Sieve of Eratosthenes, and none of them perform well enough.
I figured a concurrent Sieve would work great, but I'm getting bad_arity messages, and I'm not sure why. Any suggestions on why I have the problem, or how to code it properly?
Here's my code, the commented out sections are where I tried to make things concurrent:
-module(primeserver).
-compile(export_all).
start() ->
register(primes, spawn(fun() -> loop() end)).
is_prime(N) -> rpc({is_prime,N}).
rpc(Request) ->
primes ! {self(), Request},
receive
{primes, Response} ->
Response
end.
loop() ->
receive
{From, {is_prime, N}} ->
if
N < 2 -> From ! {primes, false};
N =:= 2 -> From ! {primes, true};
N rem 2 =:= 0 -> From ! {primes, false};
true ->
Values = is_not_prime(N),
Val = not(lists:member(true, Values)),
From ! {primes, Val}
end,
loop()
end.
for(N,N,_,F) -> [F(N)];
for(I,N,S,F) when I + S < N -> [F(I)|for(I+S, N, S, F)];
for(I,N,S,F) when I + S =:= N -> [F(I)|for(I+S, N, S, F)];
for(I,N,S,F) when I + S > N -> [F(I)].
get_list(I, Limit) ->
if
I
[I*A || A
[]
end.
is_not_prime(N) ->
for(3, N, 2,
fun(I) ->
List = get_list(I,trunc(N/I)),
lists:member(N,lists:flatten(List))
end
).
%%L = for(1,N, fun() -> spawn(fun(I) -> wait(I,N) end) end),
%%SeedList = [A || A
%% lists:foreach(fun(X) ->
%% Pid ! {in_list, X}
%% end, SeedList)
%% end, L).
%%wait(I,N) ->
%% List = [I*A || A lists:member(X,List)
%% end.
I wrote an Eratosthenesque concurrent prime sieve using Go and channels.
Here is the code: http://github.com/aht/gosieve
I blogged about it here: http://blog.onideas.ws/eratosthenes.go
The program can sieve out the first million primes (all primes up to 15,485,863) in about 10 seconds. The sieve is concurrent, but the algorithm is mainly synchronous: there are far too many synchronization points required between goroutines ("actors" -- if you like) and thus they cannot roam freely in parallel.
The 'badarity' error means that you're trying to call a 'fun' with the wrong number of arguments. In this case...
%%L = for(1,N, fun() -> spawn(fun(I) -> wait(I,N) end) end),
The for/3 function expects a fun of arity 1, and the spawn/1 function expects a fun of arity 0. Try this instead:
L = for(1, N, fun(I) -> spawn(fun() -> wait(I, N) end) end),
The fun passed to spawn inherits needed parts of its environment (namely I), so there's no need to pass it explicitly.
While calculating primes is always good fun, please keep in mind that this is not the kind of problem Erlang was designed to solve. Erlang was designed for massive actor-style concurrency. It will most likely perform rather badly on all examples of data-parallel computation. In many cases, a sequential solution in, say, ML will be so fast that any number of cores will not suffice for Erlang to catch up, and e.g. F# and the .NET Task Parallel Library would certainly be a much better vehicle for these kinds of operations.
Primes parallel algorithm : http://www.cs.cmu.edu/~scandal/cacm/node8.html
Another alternative to consider is to use probabilistic prime generation. There is an example of this in Joe's book (the "prime server") which uses Miller-Rabin, I think...
You can find four different Erlang implementations for finding prime numbers (two of which are based on the Sieve of Eratosthenes) here. This link also contains graphs comparing the performance of the 4 solutions.
The Sieve of Eratosthenes is fairly easy to implement but -- as you have discovered -- not the most efficient. Have you tried the Sieve of Atkin?
Sieve of Atkin @ Wikipedia
Two quick single-process erlang prime generators; sprimes generates all primes under 2m in ~2.7 seconds, fprimes ~3 seconds on my computer (Macbook with a 2.4 GHz Core 2 Duo). Both are based on the Sieve of Eratosthenes, but since Erlang works best with lists, rather than arrays, both keep a list of non-eliminated primes, checking for divisibility by the current head and keeping an accumulator of verified primes. Both also implement a prime wheel to do initial reduction of the list.
-module(primes).
-export([sprimes/1, wheel/3, fprimes/1, filter/2]).
sieve([H|T], M) when H=< M -> [H|sieve([X || X<- T, X rem H /= 0], M)];
sieve(L, _) -> L.
sprimes(N) -> [2,3,5,7|sieve(wheel(11, [2,4,2,4,6,2,6,4,2,4,6,6,2,6,4,2,6,4,6,8,4,2,4,2,4,8,6,4,6,2,4,6,2,6,6,4,2,4,6,2,6,4,2,4,2,10,2,10], N), math:sqrt(N))].
wheel([X|Xs], _Js, M) when X > M ->
lists:reverse(Xs);
wheel([X|Xs], [J|Js], M) ->
wheel([X+J,X|Xs], lazy:next(Js), M);
wheel(S, Js, M) ->
wheel([S], lazy:lazy(Js), M).
fprimes(N) ->
fprimes(wheel(11, [2,4,2,4,6,2,6,4,2,4,6,6,2,6,4,2,6,4,6,8,4,2,4,2,4,8,6,4,6,2,4,6,2,6,6,4,2,4,6,2,6,4,2,4,2,10,2,10], N), [7,5,3,2], N).
fprimes([H|T], A, Max) when H*H =< Max ->
fprimes(filter(H, T), [H|A], Max);
fprimes(L, A, _Max) -> lists:append(lists:reverse(A), L).
filter(N, L) ->
filter(N, N*N, L, []).
filter(N, N2, [X|Xs], A) when X < N2 ->
filter(N, N2, Xs, [X|A]);
filter(N, _N2, L, A) ->
filter(N, L, A).
filter(N, [X|Xs], A) when X rem N /= 0 ->
filter(N, Xs, [X|A]);
filter(N, [_X|Xs], A) ->
filter(N, Xs, A);
filter(_N, [], A) ->
lists:reverse(A).
lazy:lazy/1 and lazy:next/1 refer to a simple implementation of pseudo-lazy infinite lists:
lazy(L) ->
repeat(L).
repeat(L) -> L++[fun() -> L end].
next([F]) -> F()++[F];
next(L) -> L.
Prime generation by sieves is not a great place for concurrency (but it could use parallelism in checking for divisibility, although the operation is not sufficiently complex to justify the additional overhead of all parallel filters I have written thus far).
Project Euler problems (I'd say most of the first 50 if not more) are mostly about brute force with a splash of ingenuity in choosing your bounds.
Remember that to test whether N is prime (by brute force), you only need to see if it's divisible by any prime up to floor(sqrt(N)) + 1, not N/2.
Good luck
I love Project Euler.
On the subject of prime generators, I am a big fan of the Sieve of Eratosthenes.
For numbers under 2,000,000 you might try a simple isPrime check implementation. I don't know how you'd do it in Erlang, but the logic is simple.
For Each NUMBER in LIST_OF_PRIMES
If TEST_VALUE % NUMBER == 0
Then FALSE
END
TRUE
if isPrime == TRUE add TEST_VALUE to your LIST_OF_PRIMES
iterate starting at 14 or so with a preset list of your beginning primes.
C# ran a list like this for 2,000,000 in well under the 1 minute mark
Edit: On a side note, the sieve of Eratosthenes can be implemented easily and runs quickly, but gets unwieldy when you start getting into huge lists. The simplest implementation, using a boolean array and int values runs extremely quickly. The trouble is that you begin running into limits for the size of your value as well as the length of your array. -- Switching to a string or bitarray implementation helps, but you still have the challenge of iterating through your list at large values.
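As a rough illustration of the boolean-array approach, here is a sketch in Haskell using an immutable unboxed array. The name sieveUpTo is made up for this example, and it assumes the array package that ships with GHC. For brevity it strikes out multiples of every p up to sqrt(n), not only of primes, which does some redundant work but gives the same result.

```haskell
import Data.Array.Unboxed (UArray, accumArray, assocs)

-- All primes <= n, via an array of flags indexed from 2 to n.
sieveUpTo :: Int -> [Int]
sieveUpTo n
  | n < 2 = []
  | otherwise = [i | (i, True) <- assocs flags]
  where
    -- Every index starts as True; each listed multiple is set to False.
    flags :: UArray Int Bool
    flags =
      accumArray (\_ struck -> struck) True (2, n)
        [(m, False) | p <- [2 .. isqrt n], m <- [p * p, p * p + p .. n]]
    isqrt :: Int -> Int
    isqrt = floor . sqrt . (fromIntegral :: Int -> Double)

-- sieveUpTo 30 == [2,3,5,7,11,13,17,19,23,29]
```

The size limits mentioned above still apply: the array needs one flag per number up to n.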
Here is a VB version
'Sieve of Eratosthenes
'http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
'1. Create a contiguous list of numbers from two to some highest number n.
'2. Strike out from the list all multiples of two (4, 6, 8 etc.).
'3. The list's next number that has not been struck out is a prime number.
'4. Strike out from the list all multiples of the number you identified in the previous step.
'5. Repeat steps 3 and 4 until you reach a number that is greater than the square root of n (the highest number in the list).
'6. All the remaining numbers in the list are prime.
Private Function Sieve_of_Eratosthenes(ByVal MaxNum As Integer) As List(Of Integer)
'tested to MaxNum = 10,000,000 - on 1.8Ghz Laptop it took 1.4 seconds
Dim thePrimes As New List(Of Integer)
Dim toNum As Integer = MaxNum, stpw As New Stopwatch
If toNum > 1 Then 'the first prime is 2
stpw.Start()
thePrimes.Capacity = toNum 'size the list
Dim idx As Integer
Dim stopAT As Integer = CInt(Math.Sqrt(toNum) + 1)
'1. Create a contiguous list of numbers from two to some highest number n.
'2. Strike out from the list all multiples of 2, 3, 5.
For idx = 0 To toNum
If idx > 5 Then
If idx Mod 2 <> 0 _
AndAlso idx Mod 3 <> 0 _
AndAlso idx Mod 5 <> 0 Then thePrimes.Add(idx) Else thePrimes.Add(-1)
Else
thePrimes.Add(idx)
End If
Next
'mark 0,1 and 4 as non-prime
thePrimes(0) = -1
thePrimes(1) = -1
thePrimes(4) = -1
Dim aPrime, startAT As Integer
idx = 7 'starting at 7 check for primes and multiples
Do
'3. The list's next number that has not been struck out is a prime number.
'4. Strike out from the list all multiples of the number you identified in the previous step.
'5. Repeat steps 3 and 4 until you reach a number that is greater than the square root of n (the highest number in the list).
If thePrimes(idx) <> -1 Then ' if equal to -1 the number is not a prime
'not equal to -1 the number is a prime
aPrime = thePrimes(idx)
'get rid of multiples
startAT = aPrime * aPrime
For mltpl As Integer = startAT To thePrimes.Count - 1 Step aPrime
If thePrimes(mltpl) <> -1 Then thePrimes(mltpl) = -1
Next
End If
idx += 2 'increment index
Loop While idx < stopAT
'6. All the remaining numbers in the list are prime.
thePrimes = thePrimes.FindAll(Function(i As Integer) i <> -1)
stpw.Stop()
Debug.WriteLine(stpw.ElapsedMilliseconds)
End If
Return thePrimes
End Function