Optimizing my code for finding the factors of a given integer - c++

Here is my code,but i'lld like to optimize it.I don't like the idea of it testing all the numbers before the square root of n,considering the fact that one could be faced with finding the factors of a large number. Your answers would be of great help. Thanks in advance.
unsigned int* factor(unsigned int n)
{
unsigned int tab[40];
int dim=0;
for(int i=2;i<=(int)sqrt(n);++i)
{
while(n%i==0)
{
tab[dim++]=i;
n/=i;
}
}
if(n>1)
tab[dim++]=n;
return tab;
}

Here's a suggestion on how to do this in 'proper' c++ (since you tagged as c++).
PS. Almost forgot to mention: I optimized the call to sqrt away :)
See it live on http://liveworkspace.org/code/6e2fcc2f7956fafbf637b54be2db014a
#include <vector>
#include <iostream>
#include <iterator>
#include <algorithm>
typedef unsigned int uint;
std::vector<uint> factor(uint n)
{
std::vector<uint> tab;
int dim=0;
for(unsigned long i=2;i*i <= n; ++i)
{
while(n%i==0)
{
tab.push_back(i);
n/=i;
}
}
if(n>1)
tab.push_back(n);
return tab;
}
void test(uint x)
{
auto v = factor(x);
std::cout << x << ":\t";
std::copy(v.begin(), v.end(), std::ostream_iterator<uint>(std::cout, ";"));
std::cout << std::endl;
}
int main(int argc, const char *argv[])
{
test(1);
test(2);
test(4);
test(43);
test(47);
test(9997);
}
Output
1:
2: 2;
4: 2;2;
43: 43;
47: 47;
9997: 13;769;

There's a simple change that will cut the run time somewhat: factor out all the 2's, then only check odd numbers.

If you use
... i*i <= n; ...
It may run much faster than i <= sqrt(n)
By the way, you should try to handle factors of negative n or at least be sure you never pass a neg number

I'm afraid you cannot. There is no known method in the planet can factorize large integers in polynomial time. However, there are some methods can help you slightly (not significantly) speed up your program. Search Wikipedia for more references. http://en.wikipedia.org/wiki/Integer_factorization

As seen from your solution , you find basically all prime numbers ( the condition while (n%i == 0)) works like that , especially for the case of large numbers , you could compute prime numbers beforehand, and keep checking only those. The prime number calculation could be done using Sieve of Eratosthenes method or some other efficient method.

unsigned int* factor(unsigned int n)
If unsigned int is the typical 32-bit type, the numbers are too small for any of the more advanced algorithms to pay off. The usual enhancements for the trial division are of course worthwhile.
If you're moving the division by 2 out of the loop, and divide only by odd numbers in the loop, as mentioned by Pete Becker, you're essentially halving the number of divisions needed to factor the input number, and thus speed up the function by a factor of very nearly 2.
If you carry that one step further and also eliminate the multiples of 3 from the divisors in the loop, you reduce the number of divisions and hence increase the speed by a factor close to 3 (on average; most numbers don't have any large prime factors, but are divisible by 2 or by 3, and for those the speedup is much smaller; but those numbers are quick to factor anyway. If you factor a longer range of numbers, the bulk of the time is spent factoring the few numbers with large prime divisors).
// if your compiler doesn't transform that to bit-operations, do it yourself
while(n % 2 == 0) {
tab[dim++] = 2;
n /= 2;
}
while(n % 3 == 0) {
tab[dim++] = 3;
n /= 3;
}
for(int d = 5, s = 2; d*d <= n; d += s, s = 6-s) {
while(n % d == 0) {
tab[dim++] = d;
n /= d;
}
}
If you're calling that function really often, it would be worthwhile to precompute the 6542 primes not exceeding 65535, store them in a static array, and divide only by the primes to eliminate all divisions that are a priori guaranteed to not find a divisor.
If unsigned int happens to be larger than 32 bits, then using one of the more advanced algorithms would be profitable. You should still begin with trial divisions to find the small prime factors (whether small should mean <= 1000, <= 10000, <= 100000 or perhaps <= 1000000 would need to be tested, my gut feeling says one of the smaller values would be better on average). If after the trial division phase the factorisation is not yet complete, check whether the remaining factor is prime using e.g. a deterministic (for the range in question) variant of the Miller-Rabin test. If it's not, search a factor using your favourite advanced algorithm. For 64 bit numbers, I'd recommend Pollard's rho algorithm or an elliptic curve factorisation. Pollard's rho algorithm is easier to implement and for numbers of that magnitude finds factors in comparable time, so that's my first recommendation.

Int is way to small to encounter any performance problems. I just tried to measure the time of your algorithm with boost but couldn't get any useful output (too fast). So you shouldn't worry about integers at all.
If you use i*i I was able to calculate 1.000.000 9-digit integers in 15.097 seconds. It's good to optimize an algorithm but instead of "wasting" time (depends on your situation) it's important to consider if a small improvement really is worth the effort. Sometimes you have to ask yourself if you rally need to be able to calculate 1.000.000 ints in 10 seconds or if 15 is fine as well.

Related

How to find the sum of largest odd factor of a number?

I've a problem to find sum of largest odd factor of a number ( F(x) ) and F(x) = f(1)+f(2)+...+f(x). As you know the largest factor of 1 is 1, 2 is 1, 3 is 3, 4 is 1, and so on...
e.g The sum of largest odd factor of a number 6 is f(1)+f(2)+f(3)+f(4)+f(5)+f(6), that is 1+1+3+1+5+3 or 14.
And I want to try to solve a number until 2*10^9
So this is my code for the f(x) that get 82/100 before timeout
unsigned long long int biggestOddFactor(unsigned long long int n){
//!(n & (n - 1))
while(n%2 == 0){
n /= 2;
}
return n;
}
This is the another method by removing the zero on the last bit of a number, but it only make 77/100
#include <bitset>
unsigned long long int biggestOddFactorUsingBinary(unsigned long long int n){
std::string bin = std::bitset<32>(n).to_string();
int delet = 0;
for(int i = bin.length()-1; i >= 0; i--){
if(bin[i] == '0'){
delet += 1;
}else{
break;
}
}
bin = bin.substr(0, bin.length()-delet);
return std::bitset<32>(bin).to_ulong();;
}
Is there any way to optimize my algorithm?
You're asking the wrong question. The problem is not that you are finding the biggest odd factor too slowly, the problem is that your algorithm for finding the sum is too slow, not that this one part of it is too slow.
For example, the largest odd factor of every odd number is the number itself, and there's a formula for the sum of the first n odd numbers. Why are you not using that to halve the number of times you call biggestOddFactor? That's just for starters.
The largest odd factor of any even number is the same as that for half that number. So the sum of the largest odd factors of, say, 16, 14, 12, and 10 is the same as that for 8, 7, 6, and 5. Yet you compute these two ranges separately? Why?
And so on. You need to optimize your algorithm, not your implementation of a bad algorithm. The concepts above suggest several possible recursive implementations that will be much faster.
I just very quickly whipped up a solution to this problem using a better algorithm and it's thousands of times faster than just calling biggestOddFactor on every number. Note that my solution is recursive.
You should always consider algorithmic optimizations before you try to micro-optimize an implementation. The payoff tends to be much greater and the result is much less fragile.

sieve or eratosthenes calculator -- running into memory issues and crashing with numbers >=1,000,000

I'm not exactly sure why this is. I tried changing the variables to long long, and I even tried doing a few other things -- but its either about the inefficiency of my code (it literally does the whole process of finding all primes up to the number, then checking against the number to see if its divisible by that prime -- very inefficient, but its my first attempt at this and I feel pretty accomplished having it work at all....)
Or the fact that it overflows the stack. Im not sure where it is exactly, but all I know is that it MUST be related to memory and the way its dealing with the number.
If I had to guess, Id say its a memory issue happening when it is dealing with the prime number generation up to that number -- thats where it dies even if I remove the check against the input number.
I'll post my code -- just be aware, I didnt change long long back to int in a few places, and I also have a SquareRoot Variable that is not used, because it was supposed to try and help memory efficiency but was not effective the way I tried to do it. I Just never deleted it. I will clean up the code when and if I can successfully finish it.
As far as I am aware though, it DOES work pretty reliably for 999,999 and down, I actually checked it up against other calculators of the same type and it seemingly does generate the proper answers.
If anyone can help or explain what I screwed up here, your helping a guy trying to learn on his own without any school or anything. so its appreciated.
#include <iostream>
#include <cmath>
void sieve(int ubound, int primes[]);
int main()
{
long long n;
int i;
std::cout << "Input Number: ";
std::cin >> n;
if (n < 2) {
return 1;
}
long long upperbound = n;
int A[upperbound];
int SquareRoot = sqrt(upperbound);
sieve(upperbound, A);
for (i = 0; i < upperbound; i++) {
if (A[i] == 1 && upperbound % i == 0) {
std::cout << " " << i << " ";
}
}
return 0;
}
void sieve(int ubound, int primes[])
{
long long i, j, m;
for (i = 0; i < ubound; i++) {
primes[i] = 1;
}
primes[0] = 0, primes[1] = 0;
for (i = 2; i < ubound; i++) {
for(j = i * i; j < ubound; j += i) {
primes[j] = 0;
}
}
}
If you used legal C++ constructs instead of non-standard variable length arrays, your code will run (whether it produces the correct answers is another question).
The issue is more than likely that you're exceeding the limits of the stack when you declare arrays with a million or more elements.
Therefore instead of this:
long long upperbound = n;
A[upperbound];
Use std::vector:
#include <vector>
//...
long long upperbound = n;
std::vector<int> A(upperbound);
and then:
sieve(upperbound, A.data());
The std::vector does not use the stack space to allocate its elements (unless you have written an allocator for it that uses the stack).
As a matter of fact, you don't even need to pass upperbound to sieve, as a std::vector knows its own size by calling the size() member function. But I leave that as an exercise.
Live example using 2,000,000
First of all, read and apply PaulMcKenzie's advice. That's the most important thing. I'm only addressing some teeny bits of your question that remained open.
It seems that you are trying to factor the number that you misleadingly called upperbound. The mysterious role of the square root of this number is related to this fact: if the number is composite at all - and hence can be computed as the product of some prime factors - then the smallest of these prime factors cannot be greater than the square root of the number. In fact, only one factor can possibly be greater, all others cannot exceed the square root.
However, in its present form your code cannot draw advantage from this fact. The trial division loop as it stands now has to run up to number_to_be_factored / 2 in order not to miss any factors because its body looks like this:
if (sieve[i] == 1 && number_to_be_factored % i == 0) {
std::cout << " " << i << " ";
}
You can factor much more efficiently if you refactor your code a bit: when you have found the smallest prime factor p of your number then the remaining factors to be found must be precisely those of rest = number_to_be_factored / p (or n = n / p, if you will), and none of the remaining factors can be smaller than p. However, don't forget that p might occur more than once as a factor.
During any round of the proceedings you only need to consider the prime factors between p and the square root of the current number; if none of those primes divides the current number then it must be prime. To test whether p exceeds the square root of some number n you can use if (p * p > n), which is computationally more efficient that actually computing the square root.
Hence the square root occurs in two different roles:
the square root of the number to be factored limits the amount of sieving that needs to be done
during the trial division loop, the square root of the current number gives an upper bound for the highest prime factor that you need to consider
That's two faces of the same coin but two different usages in the actual code.
Note: once you got your code working by applying PaulMcKenzie's advice, you might also to consider posting over on Code Review.

How to calculate the sum of the bitwise xor values of all the distinct combination of the given numbers efficiently?

Given n(n<=1000000) positive integer numbers (each number is smaller than 1000000). The task is to calculate the sum of the bitwise xor ( ^ in c/c++) value of all the distinct combination of the given numbers.
Time limit is 1 second.
For example, if 3 integers are given as 7, 3 and 5, answer should be 7^3 + 7^5 + 3^5 = 12.
My approach is:
#include <bits/stdc++.h>
using namespace std;
int num[1000001];
int main()
{
int n, i, sum, j;
scanf("%d", &n);
sum=0;
for(i=0;i<n;i++)
scanf("%d", &num[i]);
for(i=0;i<n-1;i++)
{
for(j=i+1;j<n;j++)
{
sum+=(num[i]^num[j]);
}
}
printf("%d\n", sum);
return 0;
}
But my code failed to run in 1 second. How can I write my code in a faster way, which can run in 1 second ?
Edit: Actually this is an Online Judge problem and I am getting Cpu Limit Exceeded with my above code.
You need to compute around 1e12 xors in order to brute force this. Modern processors can do around 1e10 such operations per second. So brute force cannot work; therefore they are looking for you to figure out a better algorithm.
So you need to find a way to determine the answer without computing all those xors.
Hint: can you think of a way to do it if all the input numbers were either zero or one (one bit)? And then extend it to numbers of two bits, three bits, and so on?
When optimising your code you can go 3 different routes:
Optimising the algorithm.
Optimising the calls to language and library functions.
Optimising for the particular architecture.
There may very well be a quicker mathematical way of xoring every pair combination and then summing them up, but I know it not. In any case, on the contemporary processors you'll be shaving off microseconds at best; that is because you are doing basic operations (xor and sum).
Optimising for the architecture also makes little sense. It normally becomes important in repetitive branching, you have nothing like that here.
The biggest problem in your algorithm is reading from the standard input. Despite the fact that "scanf" takes only 5 characters in your computer code, in machine language this is the bulk of your program. Unfortunately, if the data will actually change each time your run your code, there is no way around the requirement of reading from stdin, and there will be no difference whether you use scanf, std::cin >>, or even will attempt to implement your own method to read characters from input and convert them into ints.
All this assumes that you don't expect a human being to enter thousands of numbers in less than one second. I guess you can be running your code via: myprogram < data.
This function grows quadratically (thanks #rici). At around 25,000 positive integers with each being 999,999 (worst case) the for loop calculation alone can finish in approximately a second. Trying to make this work with input as you have specified and for 1 million positive integers just doesn't seem possible.
With the hint in Alan Stokes's answer, you may have a linear complexity instead of quadratic with the following:
std::size_t xor_sum(const std::vector<std::uint32_t>& v)
{
std::size_t res = 0;
for (std::size_t b = 0; b != 32; ++b) {
const std::size_t count_0 =
std::count_if(v.begin(), v.end(),
[b](std::uint32_t n) { return (n >> b) & 0x01; });
const std::size_t count_1 = v.size() - count_0;
res += count_0 * count_1 << b;
}
return res;
}
Live Demo.
Explanation:
x^y = Sum_b((x&b)^(y&b)) where b is a single bit mask (from 1<<0 to 1<<32).
For a given bit, with count_0 and count_1 the respective number of count of number with bit set to 0 or 1, we have count_0 * (count_0 - 1) 0^0, count_0 * count_1 0^1 and count_1 * (count_1 - 1) 1^1 (and 0^0 and 1^1 are 0).

Need a way to make this code run faster

I'm trying to solve Project Euler problem 401. They only way I could find a way to solve it was brute-force. I've been running this code for like 10 mins without any answer. Can anyone help me with ideas improve it.
Code:
#include <iostream>
#include <cmath>
#define ull unsigned long long
using namespace std;
ull sigma2(ull n);
ull SIGMA2(ull n);
int main()
{
ull ans = SIGMA2(1000000000000000) % 1000000000;
cout << "Answer: " << ans << endl;
cin.get();
cin.ignore();
return 0;
}
ull sigma2(ull n)
{
ull sum = 0;
for(ull i = 1; i<=floor(sqrt(n)); i++)
{
if(n%i == 0)
{
sum += (i*i)+((n/i)*(n/i));
}
if(i*i == n)
{
sum -= n;
}
}
return sum;
}
ull SIGMA2(ull n)
{
ull sum = 0;
for(ull i = 1; i<=n; i++)
{
sum+=sigma2(i);
}
return sum;
}
You're missing some dividers, if a/b=c, and b is a divider of a then c will also be a divider of a but cmight be greater than floor(sqrt(a)), for example 3 > floor(sqrt(6)) but divides 6.
Then you should put your floor(sqrt(n)) in a variable and use the variable in the for, otherwise you recalculate it a every operation which is very expensive.
You can do some straightforward optimizations:
inline sigma2,
calculate floor(sqrt(n)) before the loop (but compiler may be doing it anyway, though),
precalculate squares of all ints from 1 to n and then use array lookup instead of multiplication
You will gain more by changing your approach. Think what you are trying to do - summing squares of all divisors of all integers from 1 to n. You grouped divisors by what they divide, but you can regroup terms in this sum. Let's group divisors by their value:
1 divides everything so it will appear n times in the sum, bringing 1*1*n total,
2 divides evens and will appear n/2 (integer division!) times, bringing 2*2*(n/2) total,
k ... will bring k*k*(n/k) total.
So we should just add up k*k*(n/k) for k from 1 to n.
Think about the problem.
Bruteforce the way you tried is obviously not a good idea.
You should come up with something better...
Isn't there any method how to use some nice prime factorization method to speed up the computation? Isn't there any recursion pattern? Try to find something...
One simple optimization that you can carry out is that there will be many repeated factors in the numbers.
So first estimate in how many numbers would 1 be a factor ( all N numbers ).
In how many numbers would 2 be a factor ( N/2 ).
...
Similarly for others.
Just multiply their squares with their frequency.
Time complexity shall then straight-away reduce to O(N)
There are obvious microoptimizations such as ++i rather than i++ or getting floor(sqrt(n)) out of the loop (these are two floating point operations which are really expensive compared to other integer operation in the loop), and calculting n/i only once (use a dummy variable for it and then calculate the square of the dummy).
There are also rather obvious simplifications in the algorithm. For example SIGMA2(i) = SIGMA2(i-1) + sigma2(i). But do not use recursion since you need a really huge number, this would not work and your stack memory would be exhausted. Use loop instead of recursion. There is a huge potential for improvement.
And well, there is a bigger problem - 10^15 has 15 digits. This number squared has 30 digits. There is no way you can store this into unsigned long long, which has I think about 20 digits. So you need to employ somehow the modulo 10^9 (the end of the assignment) and get additional space for your calculations...
And when using brute force, print out the temporary result every milion number for example to give you idea how fast you are approaching to the final result. Waiting 10 minutes blindly is not a good idea.

C++ program to calculate quotients of large factorials

How can I write a c++ program to calculate large factorials.
Example, if I want to calculate (100!) / (99!), we know the answer is 100, but if i calculate the factorials of the numerator and denominator individually, both the numbers are gigantically large.
expanding on Dirk's answer (which imo is the correct one):
#include "math.h"
#include "stdio.h"
int main(){
printf("%lf\n", (100.0/99.0) * exp(lgamma(100)-lgamma(99)) );
}
try it, it really does what you want even though it looks a little crazy if you are not familiar with it. Using a bigint library is going to be wildly inefficient. Taking exps of logs of gammas is super fast. This runs instantly.
The reason you need to multiply by 100/99 is that gamma is equivalent to n-1! not n!. So yeah, you could just do exp(lgamma(101)-lgamma(100)) instead. Also, gamma is defined for more than just integers.
You can use the Gamma function instead, see the Wikipedia page which also pointers to code.
Of course this particular expression should be optimized, but as for the title question, I like GMP because it offers a decent C++ interface, and is readily available.
#include <iostream>
#include <gmpxx.h>
mpz_class fact(unsigned int n)
{
mpz_class result(n);
while(n --> 1) result *= n;
return result;
}
int main()
{
mpz_class result = fact(100) / fact(99);
std::cout << result.get_str(10) << std::endl;
}
compiles on Linux with g++ -Wall -Wextra -o test test.cc -lgmpxx -lgmp
By the sounds of your comments, you also want to calculate expressions like 100!/(96!*4!).
Having "cancelled out the 96", leaving yourself with (97 * ... * 100)/4!, you can then keep the arithmetic within smaller bounds by taking as few numbers "from the top" as possible as you go. So, in this case:
i = 96
j = 4
result = i
while (i <= 100) or (j > 1)
if (j > 1) and (result % j == 0)
result /= j
--j
else
result *= i
++i
You can of course be cleverer than that in the same vein.
This just delays the inevitable, though: eventually you reach the limits of your fixed-size type. Factorials explode so quickly that for heavy-duty use you're going to need multiple-precision.
Here's an example of how to do so:
http://www.daniweb.com/code/snippet216490.html
The approach they take is to store the big #s as a character array of digits.
Also see this SO question: Calculate the factorial of an arbitrarily large number, showing all the digits
You can use a big integer library like gmp which can handle arbitrarily large integers.
The only optimization that can be made here (considering that in m!/n! m is larger than n) means crossing out everything you can before using multiplication.
If m is less than n we would have to swap the elements first, then calculate the factorial and then make something like 1 / result. Note that the result in this case would be double and you should handle it as double.
Here is the code.
if (m == n) return 1;
// If 'm' is less than 'n' we would have
// to calculate the denominator first and then
// make one division operation
bool need_swap = (m < n);
if (need_swap) std::swap(m, n);
// #note You could also use some BIG integer implementation,
// if your factorial would still be big after crossing some values
// Store the result here
int result = 1;
for (int i = m; i > n; --i) {
result *= i;
}
// Here comes the division if needed
// After that, we swap the elements back
if (need_swap) {
// Note the double here
// If m is always > n then these lines are not needed
double fractional_result = (double)1 / result;
std::swap(m, n);
}
Also to mention (if you need some big int implementation and want to do it yourself) - the best approach that is not so hard to implement is to treat your int as a sequence of blocks and the best is to split your int to series, that contain 4 digits each.
Example: 1234 | 4567 | 2323 | 2345 | .... Then you'll have to implement every basic operation that you need (sum, mult, maybe pow, division is actually a tough one).
To solve x!/y! for x > y:
int product = 1;
for(int i=0; i < x - y; i ++)
{
product *= x-i;
}
If y > x switch the variables and take the reciprocal of your solution.
I asked a similar question, and got some pointers to some libraries:
How can I calculate a factorial in C# using a library call?
It depends on whether or not you need all the digits, or just something close. If you just want something close, Stirling's Approximation is a good place to start.