Factoring a large number - sml

Need a way of factoring a large number in ml, disregarding 1 and the number. My way only works for small numbers, it involves basically starting at the lowest possible 2 factors and checking if when multiplied they equal the number, otherwise keep looping. This does not work for large numbers since it would take too many recursive calls
fun factor n=
let
val f1 = 2
val f2 = 3
fun lp f1 f2 = if f1 *f2 = n then (f1,f2)
else if f2 = (n-1)
then lp (f1+1) 2
else lp f1 (f2+1)
in
(lp f1 f2)
end;

The typical way of factoring a number n is to loop from 2 to sqrt(n), each time checking if n is divisible by your current loop index. There are a couple of problems with your implementation. First, (as mentioned in one of the above comments) if the number given is prime, you will go off into an infinite loop. Second, you are checking redundant cases. A function to get all factors of a number is given below:
fun factor n =
let val sqrtN = Real.ceil(Math.sqrt (Real.fromInt n))
fun lp(i, factors) =
if i > sqrtN
then factors
else
if Int.mod(n, i) = 0
then lp(i+1, (i, n div i)::factors)
else lp(i+1, factors)
in lp(2, nil) end
This seems to be able to handle large numbers, and has the added benefit of giving all factors of a number. if you want it to short circuit after finding one solution, then you can change lp(i+1, (i, n div i)::factors) to (i, n div i) and remove the second parameter from lp

Related

Traversing a binary tree to get from one number to another using only two operations

I'm doing a problem that says that we have to get from one number, n, to another, m, in as few steps as possible, where each "step" can be 1) doubling, or 2) subtracting one. The natural approach is two construct a binary tree and run BFS since we are given that n, m are bounded by 0 ≤ n, m ≤ 104 and so the tree doesn't get that big. However, I ran into a stunningly short solution, and have no idea why it works. It basically goes from m … n instead, halving or adding one as necessary to decrease m until it is less than n, and then just adding to get up to n. Here is the code:
while(n<m){
if (m%2) m++;
else m /= 2;
count++;
}
count = count + n - m;
return count;
Is it obvious why this is necessarily the shortest path? I get that going from m … n is natural because n is lower bounded by zero and so the tree becomes more "finite" in some sense, but this method of modified halving until you get below the number, then adding up until you reach it, doesn't seem like it should necessarily always return the correct answer, yet it does. Why, and how might I have recognized this approach from the get-go?
You only have 2 available operations:
double n
subtract 1 from n
That means the only way to go up is to double and the only way to go down is to subtract 1.
If m is an even number, then you can land on it by doubling n when 2*n = m. Otherwise, you will have to subtract 1 as well (if 2*n = m + 1 then you will have to double n and then subtract 1).
If doubling n lands too far above m then you will have to subtract twice as many times than if you used the subtraction before doubling n.
example:
n = 12 and m = 20.
You can either double n and then subtract 4 times as in 12*2 -4 = 20. - 5 steps
Or you can subtract twice and then double n as in (12-2)*2 = 20. - 3 steps
You might be wondering 'How should I pick between doubling or subtracting when n < m/2?'.
The idea is to use a reccurence-based approach. You know that you want n to reach a value of v such as v = m/2 or v = (m+1)/2. In other words you want n to reach v... and the shortest way to do that is to reach a value v' such as v' = v/2 or v' = (v+1)/2 and so on.
example:
n = 2 and m = 21.
You want n to reach (21+1)/2 = 11 which means you want to reach (11+1)/2 = 6 and thus to reach 6/2=3 and thus to reach (3+1)/2 = 2.
Since n=2 you now know that the shortest path is: (((n*2-1)*2)*2-1)*2-1.
other example:
n = 14 and m = 22.
You want n to reach 22/2 = 11.
n is already above 11 so the shortest path is : (n-1-1-1)*2.
From here, you can see that the shortest path can be deduced without a binary tree.
On top of that, you have to think starting from m and going down to an obvious path for n. This implies that it will be easier to code an algorithm going from m to n than the opposite.
Using recurrence, this function achieves the same result:
function shortest(n, m) {
if (n >= m) return n-m; //only way to go down
if(m%2==0) return 1 + shortest(n, m/2); //if m is even => optimum goal is m/2
else return 2 + shortest(n, (m+1)/2);//else optimum goal is (m+1)/2 which necessitates 2 operations
}

Finding number of divisors of a big integer using prime/quadratic factorization (C#)

I'm trying to get the number of divisors of a 64 bit integer (larger than 32 bit)
My first method (for small numbers) was to divide the number until the resulting number was 1, count the number of matching primes and use the formula (1 + P1)(1+ P2)..*(1 + Pn) = Number of divisors
For example:
24 = 2 * 2 * 2 * 3 = 2^3 * 3^1
==> (3 + 1)*(1 + 1) = 4 * 2 = 8 divisors
public static long[] GetPrimeFactor(long number)
{
bool Found = false;
long i = 2;
List<long> Primes = new List<long>();
while (number > 1)
{
if (number % i == 0)
{
number /= i;
Primes.Add(i);
i = 1;
}
i++;
}
return Primes.ToArray();
}
But for large integers this method is taking to many iterations. I found a method called Quadratic sieve to make a factorization using square numbers. Now using my script this can be much easier because the numbers are much smaller.
My question is, how can I implement this Quadratic Sieve?
The quadatic sieve is a method of finding large factors of large numbers; think 10^75, not 2^64. The quadratic sieve is complicated even in simple pseudocode form, and much more complicated if you want it to be efficient. It is very much overkill for 64-bit integers, and will be slower than other methods that are specialized for such small numbers.
If trial division is too slow for you, the next step up in complexity is John Pollard's rho method; for 64-bit integers, you might want to trial divide up to some small limit, maybe the primes less than a thousand, then switch to rho. Here's simple pseudocode to find a single factor of n; call it repeatedly on the composite cofactors to complete the factorization:
function factor(n, c=1)
if n % 2 == 0 return 2
h := 1; t := 1
repeat
h := (h*h+c) % n
h := (h*h+c) % n
t := (t*t+c) % n
g := gcd(t-h, n)
while g == 1
if g is prime return g
return factor(g, c+1)
There are other ways to factor 64-bit integers, but this will get you started, and is probably sufficient for most purposes; you might search for Richard Brent's variant of the rho algorithm for a modest speedup. If you want to know more, I modestly recommend the essay Programming with Prime Numbers at my blog.

Find {E1,..En} (E1+E2+..En=N, N is given) with the following property that E1* E2*..En is Maximum

Given the number N, write a program that computes the numbers E1, E2, ...En with the following properties:
1) N = E1 + E2 + ... + En;
2) E1 * E2 * ... En is maximum.
3) E1..En, are integers. No negative values :)
How would you do that ? I have a solution based on divide et impera but i want to check if is optimal.
Example: N=10
5,5 S=10,P=25
3,2,3,2 S=10,P=36
No need for an algorithm, mathematic intuition can do it on its own:
Step 1: prove that a result set with numbers higher than 3 is at most as good as a result set with only 3's and 2's
Given any number x in your result set, one might consider whether it would be better to divide it into two numbers.
The sum should still be x.
When x is even, The maximum for t (x - t) is reached when t = x/2 , and except for the special case x = 2, then it is greater than x, and for the special case x = 4, equal to x (see note 1).
When x is odd, The maximum for t (x - t) is reached when t = (x ± 1)/2.
What does this show? Only that you should only have 3's and 2's in your final set, because otherwise it is suboptimal (or equivalent to an optimal set).
Step 2: you should have as many 3's as possible
Now, as 3² > 2³, you should have as many 3's as possible as long as the remainder is not 1.
Conclusion: for every N >= 3:
If N = 0 mod 3, then the result set is only 3's
If N = 1 mod 3, then the result set has one pair of 2's (or a 4) and the rest is 3's
If N = 2 mod 3, then the result set has one 2 and the rest is 3's
Please correct this post. The times when I was writing well-structured mathematical proofs is far away...
Note 1: (2,4) is the only pair of distinct integers such that x^y = y^x. You can prove that with:
x^y = y^x
y ln(x) = x ln(y)
ln(x)/x = ln(y) / y
and the function ln(t)/t is strictly decreasing after its global maximum, reached between 2 and 3, so if you want two distinct integers such that ln(x)/x = ln(y)/y, one of them must be lower or equal to 2. From that you can infer that only (2,4) works
This is not a complete solution, but might help.
First off note that if you fix n, and two of the terms E_i and E_j differ by more than one (for example 3 and 8), then you can do better by "equalizing" them as much as possible, i.e., if the number p = E_i + E_j is even, you do better both terms by p/2. If p is odd, you do better by replacing them with p/2 and p/2+1 (where / is integer division).
That said, then if you knew what the optimal number of terms, n, was, you'd be done: let all E_i's equal N/n and N/n+1 (again integer division), so that their sum is still N (this is now a straightforward problem).
So the question now is what is the optimal n. Suppose for the moment that you are allowed to use real numbers. Then the solution would be N/n for each term and you could write the product as (N/n)^n. If you differentiate this with respect to n and find its root you find that n should be equal to N/e (where e is the Neper number, also known as Euler's number, e = 2.71828....). Therefore, I'd look for a solution where either n = floor(N/e) or n = floor(N/e)+1, and then choose all the E_i's equal to either N/n or N/n+1, as above.
Hope that helps.
The Online Encycolpedia of Integer Sequences gives a recurrence relation for the solution to this problem.
I'll leave it up to someone else to compare complexities. Not sure I can figure out the complexity of OP's method.

haskell, counting how many prime numbers are there in a list

i m a newbie to haskell, currently i need a function 'f' which, given two integers, returns the number of prime numbers in between them (i.e., greater than the first integer but smaller than the second).
Main> f 2 4
1
Main> f 2 10
3
here is my code so far, but it dosent work. any suggestions? thanks..
f :: Int -> Int -> Int
f x y
| x < y = length [ n | n <- [x..y], y 'mod' n == 0]
| otherwise = 0
Judging from your example, you want the number of primes in the open interval (x,y), which in Haskell is denoted [x+1 .. y-1].
Your primality testing is flawed; you're testing for factors of y.
To use a function name as an infix operator, use backticks (`), not single quotes (').
Try this instead:
-- note: no need for the otherwise, since [x..y] == [] if x>y
nPrimes a b = length $ filter isPrime [a+1 .. b-1]
Exercise for the reader: implement isPrime. Note that it only takes one argument.
Look at what your list comprehension does.
n <- [x..y]
Draw n from a list ranging from x to y.
y `mod` n == 0
Only select those n which evenly divide y.
length (...)
Find how many such n there are.
What your code currently does is find out how many of the numbers between x and y (inclusive) are factors of y. So if you do f 2 4, the list will be [2, 4] (the numbers that evenly divide 4), and the length of that is 2. If you do f 2 10, the list will be `[2, 5, 10] (the numbers that evenly divide 10), and the length of that is 3.
It is important to try to understand for yourself why your code doesn't work. In this case, it's simply the wrong algorithm. For algorithms that find whether a number is prime, among many other sources, you can check the wikipedia article: Primality test.
I you want to work with large intervals, then it might be a better idea to compute a list of primes once (instead of doing a isPrime test for every number):
primes = -- A list with all prime numbers
candidates = [a+1 .. b-1]
myprimes = intersectSortedLists candidates primes
nPrimes = length $ myprimes

Concurrent Prime Generator

I'm going through the problems on projecteuler.net to learn how to program in Erlang, and I am having the hardest time creating a prime generator that can create all of the primes below 2 million, in less than a minute. Using the sequential style, I have already written three types of generators, including the Sieve of Eratosthenes, and none of them perform well enough.
I figured a concurrent Sieve would work great, but I'm getting bad_arity messages, and I'm not sure why. Any suggestions on why I have the problem, or how to code it properly?
Here's my code, the commented out sections are where I tried to make things concurrent:
-module(primeserver).
-compile(export_all).
start() ->
register(primes, spawn(fun() -> loop() end)).
is_prime(N) -> rpc({is_prime,N}).
rpc(Request) ->
primes ! {self(), Request},
receive
{primes, Response} ->
Response
end.
loop() ->
receive
{From, {is_prime, N}} ->
if
N From ! {primes, false};
N =:= 2 -> From ! {primes, true};
N rem 2 =:= 0 -> From ! {primes, false};
true ->
Values = is_not_prime(N),
Val = not(lists:member(true, Values)),
From ! {primes, Val}
end,
loop()
end.
for(N,N,_,F) -> [F(N)];
for(I,N,S,F) when I + S [F(I)|for(I+S, N, S, F)];
for(I,N,S,F) when I + S =:= N -> [F(I)|for(I+S, N, S, F)];
for(I,N,S,F) when I + S > N -> [F(I)].
get_list(I, Limit) ->
if
I
[I*A || A
[]
end.
is_not_prime(N) ->
for(3, N, 2,
fun(I) ->
List = get_list(I,trunc(N/I)),
lists:member(N,lists:flatten(List))
end
).
%%L = for(1,N, fun() -> spawn(fun(I) -> wait(I,N) end) end),
%%SeedList = [A || A
%% lists:foreach(fun(X) ->
%% Pid ! {in_list, X}
%% end, SeedList)
%% end, L).
%%wait(I,N) ->
%% List = [I*A || A lists:member(X,List)
%% end.
I wrote an Eratosthenesque concurrent prime sieve using the Go and channels.
Here is the code: http://github.com/aht/gosieve
I blogged about it here: http://blog.onideas.ws/eratosthenes.go
The program can sieve out the first million primes (all primes upto 15,485,863) in about 10 seconds. The sieve is concurrent, but the algorithm is mainly synchronous: there are far too many synchronization points required between goroutines ("actors" -- if you like) and thus they can not roam freely in parallel.
The 'badarity' error means that you're trying to call a 'fun' with the wrong number of arguments. In this case...
%%L = for(1,N, fun() -> spawn(fun(I) -> wait(I,N) end) end),
The for/3 function expects a fun of arity 1, and the spawn/1 function expects a fun of arity 0. Try this instead:
L = for(1, N, fun(I) -> spawn(fun() -> wait(I, N) end) end),
The fun passed to spawn inherits needed parts of its environment (namely I), so there's no need to pass it explicitly.
While calculating primes is always good fun, please keep in mind that this is not the kind of problem Erlang was designed to solve. Erlang was designed for massive actor-style concurrency. It will most likely perform rather badly on all examples of data-parallel computation. In many cases, a sequential solution in, say, ML will be so fast that any number of cores will not suffice for Erlang to catch up, and e.g. F# and the .NET Task Parallel Library would certainly be a much better vehicle for these kinds of operations.
Primes parallel algorithm : http://www.cs.cmu.edu/~scandal/cacm/node8.html
Another alternative to consider is to use probabalistic prime generation. There is an example of this in Joe's book (the "prime server") which uses Miller-Rabin I think...
You can find four different Erlang implementations for finding prime numbers (two of which are based on the Sieve of Eratosthenes) here. This link also contains graphs comparing the performance of the 4 solutions.
The Sieve of Eratosthenes is fairly easy to implement but -- as you have discovered -- not the most efficient. Have you tried the Sieve of Atkin?
Sieve of Atkin # Wikipedia
Two quick single-process erlang prime generators; sprimes generates all primes under 2m in ~2.7 seconds, fprimes ~3 seconds on my computer (Macbook with a 2.4 GHz Core 2 Duo). Both are based on the Sieve of Eratosthenes, but since Erlang works best with lists, rather than arrays, both keep a list of non-eliminated primes, checking for divisibility by the current head and keeping an accumulator of verified primes. Both also implement a prime wheel to do initial reduction of the list.
-module(primes).
-export([sprimes/1, wheel/3, fprimes/1, filter/2]).
sieve([H|T], M) when H=< M -> [H|sieve([X || X<- T, X rem H /= 0], M)];
sieve(L, _) -> L.
sprimes(N) -> [2,3,5,7|sieve(wheel(11, [2,4,2,4,6,2,6,4,2,4,6,6,2,6,4,2,6,4,6,8,4,2,4,2,4,8,6,4,6,2,4,6,2,6,6,4,2,4,6,2,6,4,2,4,2,10,2,10], N), math:sqrt(N))].
wheel([X|Xs], _Js, M) when X > M ->
lists:reverse(Xs);
wheel([X|Xs], [J|Js], M) ->
wheel([X+J,X|Xs], lazy:next(Js), M);
wheel(S, Js, M) ->
wheel([S], lazy:lazy(Js), M).
fprimes(N) ->
fprimes(wheel(11, [2,4,2,4,6,2,6,4,2,4,6,6,2,6,4,2,6,4,6,8,4,2,4,2,4,8,6,4,6,2,4,6,2,6,6,4,2,4,6,2,6,4,2,4,2,10,2,10], N), [7,5,3,2], N).
fprimes([H|T], A, Max) when H*H =< Max ->
fprimes(filter(H, T), [H|A], Max);
fprimes(L, A, _Max) -> lists:append(lists:reverse(A), L).
filter(N, L) ->
filter(N, N*N, L, []).
filter(N, N2, [X|Xs], A) when X < N2 ->
filter(N, N2, Xs, [X|A]);
filter(N, _N2, L, A) ->
filter(N, L, A).
filter(N, [X|Xs], A) when X rem N /= 0 ->
filter(N, Xs, [X|A]);
filter(N, [_X|Xs], A) ->
filter(N, Xs, A);
filter(_N, [], A) ->
lists:reverse(A).
lazy:lazy/1 and lazy:next/1 refer to a simple implementation of pseudo-lazy infinite lists:
lazy(L) ->
repeat(L).
repeat(L) -> L++[fun() -> L end].
next([F]) -> F()++[F];
next(L) -> L.
Prime generation by sieves is not a great place for concurrency (but it could use parallelism in checking for divisibility, although the operation is not sufficiently complex to justify the additional overhead of all parallel filters I have written thus far).
`
Project Euler problems (I'd say most of the first 50 if not more) are mostly about brute force with a splash of ingenuity in choosing your bounds.
Remember to test any if N is prime (by brute force), you only need to see if its divisible by any prime up to floor(sqrt(N)) + 1, not N/2.
Good luck
I love Project Euler.
On the subject of prime generators, I am a big fan of the Sieve of Eratosthenes.
For the purposes of the numbers under 2,000,000 you might try a simple isPrime check implementation. I don't know how you'd do it in erlang, but the logic is simple.
For Each NUMBER in LIST_OF_PRIMES
If TEST_VALUE % NUMBER == 0
Then FALSE
END
TRUE
if isPrime == TRUE add TEST_VALUE to your LIST_OF_PRIMES
iterate starting at 14 or so with a preset list of your beginning primes.
c# ran a list like this for 2,000,000 in well under the 1 minute mark
Edit: On a side note, the sieve of Eratosthenes can be implemented easily and runs quickly, but gets unwieldy when you start getting into huge lists. The simplest implementation, using a boolean array and int values runs extremely quickly. The trouble is that you begin running into limits for the size of your value as well as the length of your array. -- Switching to a string or bitarray implementation helps, but you still have the challenge of iterating through your list at large values.
here is a vb version
'Sieve of Eratosthenes
'http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
'1. Create a contiguous list of numbers from two to some highest number n.
'2. Strike out from the list all multiples of two (4, 6, 8 etc.).
'3. The list's next number that has not been struck out is a prime number.
'4. Strike out from the list all multiples of the number you identified in the previous step.
'5. Repeat steps 3 and 4 until you reach a number that is greater than the square root of n (the highest number in the list).
'6. All the remaining numbers in the list are prime.
Private Function Sieve_of_Eratosthenes(ByVal MaxNum As Integer) As List(Of Integer)
'tested to MaxNum = 10,000,000 - on 1.8Ghz Laptop it took 1.4 seconds
Dim thePrimes As New List(Of Integer)
Dim toNum As Integer = MaxNum, stpw As New Stopwatch
If toNum > 1 Then 'the first prime is 2
stpw.Start()
thePrimes.Capacity = toNum 'size the list
Dim idx As Integer
Dim stopAT As Integer = CInt(Math.Sqrt(toNum) + 1)
'1. Create a contiguous list of numbers from two to some highest number n.
'2. Strike out from the list all multiples of 2, 3, 5.
For idx = 0 To toNum
If idx > 5 Then
If idx Mod 2 <> 0 _
AndAlso idx Mod 3 <> 0 _
AndAlso idx Mod 5 <> 0 Then thePrimes.Add(idx) Else thePrimes.Add(-1)
Else
thePrimes.Add(idx)
End If
Next
'mark 0,1 and 4 as non-prime
thePrimes(0) = -1
thePrimes(1) = -1
thePrimes(4) = -1
Dim aPrime, startAT As Integer
idx = 7 'starting at 7 check for primes and multiples
Do
'3. The list's next number that has not been struck out is a prime number.
'4. Strike out from the list all multiples of the number you identified in the previous step.
'5. Repeat steps 3 and 4 until you reach a number that is greater than the square root of n (the highest number in the list).
If thePrimes(idx) <> -1 Then ' if equal to -1 the number is not a prime
'not equal to -1 the number is a prime
aPrime = thePrimes(idx)
'get rid of multiples
startAT = aPrime * aPrime
For mltpl As Integer = startAT To thePrimes.Count - 1 Step aPrime
If thePrimes(mltpl) <> -1 Then thePrimes(mltpl) = -1
Next
End If
idx += 2 'increment index
Loop While idx < stopAT
'6. All the remaining numbers in the list are prime.
thePrimes = thePrimes.FindAll(Function(i As Integer) i <> -1)
stpw.Stop()
Debug.WriteLine(stpw.ElapsedMilliseconds)
End If
Return thePrimes
End Function