What operations are time consuming and how to avoid them? - c++

I am fairly new to C++ and I want to learn how to optimize the speed of my programs. I am currently working on a program that computes the perfect numbers from 1 to 1,000,000. A perfect number is one whose proper divisors sum to the number itself. E.g., 28 is a perfect number because 1+2+4+7+14=28. Below is my code:
#include <iostream>
using namespace std;
int main() {
int a = 1000000;
for(int i = 1; i <= a; ++i)
{
int sum = 0;
int q;
// The biggest proper divisor is half the number itself
if(i % 2 == 0) q = i/2;
else q = (i+1)/2;
for(int j = 1; j <= q; ++j)
{
if(i % j == 0) sum += j;
}
//Condition for perfect number
if(sum == i) cout << i << " is a perfect number!" << endl;
}
system("pause");
return 0;
}
What operations in this code are time consuming? How can I improve the speed of the program? In general, how do I learn about what operations that are time consuming and how to avoid them?

The only way to really know what operations are time consuming and are limiting the execution speed of your program is to run the program through a profiler. This tool will tell you where each second of the execution time was spent (on a function call basis, usually).
To answer your question specifically: the most time in this program will be spent at this line:
system("pause");
because, aside from the fact that this is a horrible line of code you should get rid of, it actually waits for user input, and as we all know, the thing between the chair and the screen is what slows things down.

You can reduce computation at the cost of memory consumption with the following:
const int max = 1000000;
std::vector<std::size_t> a(max);
for(std::size_t i = 1; i != a.size(); ++i) {
for (std::size_t j = 2 * i; j < a.size(); j += i) {
a[j] += i;
}
}
for (std::size_t i = 1; i != a.size(); ++i) {
if(a[i] == i) {
std::cout << i << " is a perfect number!" << std::endl;
}
}

Branches: ifs, loops, function calls and goto are costly. They tend to distract the processor from performing data transfers and math operations.
For example, you can eliminate the if statement:
q = (i + (i % 2)) / 2; // This expression not simplified.
Research loop unrolling, although the compiler may perform this on higher optimization settings.
I/O operations are costly, especially using formatted I/O.
Try this:
if(sum == i)
{
static const char text[] = " is a perfect number!\n";
cout << i;
cout.write(text, sizeof(text) - 1); // Output as raw data.
}
Division and modulo operations are costly.
If you can divide by a power of 2, you can convert the division into a shift right.
You may be able to avoid modulo operations by using binary AND.
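As a small illustration of the last two points, here is a sketch for the power-of-two case. The function names are made up for this example, and it assumes unsigned operands (for negative signed values a shift and an AND do not give the same results as / and %). Note that optimizing compilers usually perform this rewrite on their own when the divisor is a compile-time constant, so measure before relying on it.

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only.

// Same result as x / 8 for unsigned x: shift right by log2(8) = 3 bits.
inline std::uint32_t divideBy8(std::uint32_t x) { return x >> 3; }

// Same result as x % 8 for unsigned x: keep only the lowest 3 bits.
inline std::uint32_t moduloBy8(std::uint32_t x) { return x & 7u; }

// Same result as (x % 2 != 0): test the lowest bit.
inline bool isOdd(std::uint32_t x) { return (x & 1u) != 0; }
```

In the perfect-number program this only applies to the i % 2 test; the i % j in the inner loop has a variable divisor and cannot be replaced this way.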

My rules of thumb:
conditional branches (i.e. comparisons) are costly
divisions are costly (as well as modulo %)
prefer integer operations over floating-point
How to avoid them ?
Well, there is no simple answer. In many cases you just can't.
You avoid conditional branches by using branchless expressions, or by improving the program logic.
You avoid divisions by using shifts or lookup-tables or rewriting expressions (when possible).
You avoid floating-point by simulating fixed-point.
In the given example, you have to focus on the body of the innermost loop. That's the line that is the most frequently executed (about 250000000000 times, since the inner loop runs roughly i/2 times for each of the 1000000 values of i, vs 1000000 for others). Unfortunately, there is a comparison and a division, which are hard to remove.
Optimizations of other parts of the code will have no measurable effect. In particular, don't worry about the cout statement: it will be called 4 times in total.

Related

Modulo operation on power function

I am getting this error while performing modulo operation on power function.
invalid operands of types ‘int’ and ‘__gnu_cxx::__promote_2<int, int, double, double>::__type {aka double}’ to binary ‘operator%’
this is my piece of code.
#include <bits/stdc++.h>
using namespace std;
int main() {
int t, n;
cin >> t;
int i, j, sum = 0;
for (i = 0; i < t; i++) {
cin >> n;
for (j = 1; (n % pow(5, j)) == 0; j++)
sum = sum + (n / pow(5, j));
cout << sum;
}
return 0;
}
pow is returning a double, modulo can only operate on int. Throw in some explaining variables and this will become more obvious. The code will also be more readable and more performant.
Wrong Types
The compiler error message results from the fact that the pow routine returns a double type, but the % operator accepts only integer operands.
pow is a tempting routine to use for exponentiation, but it returns a floating-point type, and you generally should not mix floating-point and integer arithmetic, for reasons including:
There are considerable problems and subtleties in using floating-point arithmetic, including issues with handling rounding errors.
Some implementations of pow are deficient in that they return inexact answers when exact answers are representable in the double type. For example, pow(5, 3) might return a number slightly below 125, and then taking a remainder modulo that (or its truncation to an integer) would not give the result you want.
Better Method
One way to resolve the immediate problem is to replace pow with a routine of your own that raises an integer to a non-negative integer power simply by multiplying repeatedly. However, there is a better approach. Change these two lines:
for (j = 1; (n % pow(5, j)) == 0; j++)
sum = sum + (n / pow(5, j));
to this:
for (j = 1; n % 5 == 0; j++)
{
n /= 5;
sum = sum + n;
}
So, instead of repeatedly using a power of 5 (5, 25, 125,…) with n, we instead divide n by 5 repeatedly.
Other Issues
These changes will give code that does what the code in your question would do if pow returned an integer type, for cases where it would not overflow. However, I suspect there are other issues in your code and it does not compute what you intended.
I think it is most likely your assignment was to write a program that computes the number of trailing zeros in n! (n factorial) when written in decimal. The number of trailing zeros in n! is the exponent of the greatest power of 10 that divides n!. This power is determined by the factors of 5 that are available, because every trailing zero requires a factor of 2 and a factor of 5 (to make 10) in n!, but it is constrained by the factors of 5 because factors of 2 are plentiful.
Thus, 1!, 2!, 3!, and 4! have no trailing zeros because they have no factors of 5. 5! has one trailing zero, and so do 6!, 7!, 8!, and 9!. Then 10! has two trailing zeros, as we can see since 1•2•3•4•5•6•7•8•9•10 has two factors of 5. The trailing zeros increase to three at 15! and four at 20!. So far, the number of trailing zeros of n! is n/5, truncated to an integer. Then, at 25!, we add not one but two factors of 5, since 25 is 5^2. Now the number of trailing zeros is not n/5 but n/5 + n/5/5. With some thought, we can see that, in general, the number of trailing zeros of n! is n/5 + n/5/5 + n/5/5/5 + n/5/5/5/5 + …, ending where the term reaches zero.
If your program continued while n / 5^j was not zero instead of while n modulo 5^j was zero, it would calculate this sum. The similarity of your program to this calculation leads me to suspect that was the intent. If so, change the lines to:
for (j = 1; 0 < n; j++)
{
n /= 5;
sum = sum + n;
}
(I have phrased it this way for simplicity, but we can also see that, if n < 5, the final iteration of the loop adds nothing, so we can also change the loop condition from 0 < n to 5 <= n.)
Additionally, the sum is not reset when a new n is read. Remove the declaration of sum before the first for loop and insert int sum = 0; after the first for and before the second.
Generally, it is good practice not to declare things before you need them. So remove the declaration of n from the top of main and put it after the first for. Remove the declarations of i and j before the for loop and define each one in its for loop: for (int i = 0; i < t; i++) and for (int j = 1; 5 <= n; j++).
In cout << sum;, you probably want a new-line character: cout << sum << '\n'; or cout << sum << std::endl;.
Do not include <bits/stdc++.h>. Instead, include standard headers, such as <iostream> for this program.
Avoid using using namespace std;. Use std:: in your code (e.g., std::cin instead of cin) even though it requires more typing, or be selective about taking a few specific things from namespaces, such as using std::cin; instead of the entire namespace. While this initially requires more work, it avoids problems and trains you to be more cognizant of what your program is using.
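Putting the suggestions above together, the core computation could be sketched as follows. This assumes the task really is counting the trailing zeros of n!, which is a guess about the assignment; the function name is made up for this example.

```cpp
// Hypothetical helper: number of trailing zeros of n! in decimal,
// computed as n/5 + n/5/5 + n/5/5/5 + ... with integer division.
long long trailingZerosOfFactorial(long long n) {
    long long sum = 0;      // reset for every n, as recommended above
    while (5 <= n) {        // once n < 5 the remaining terms are all zero
        n /= 5;
        sum += n;
    }
    return sum;
}
```

A main that reads t and then each n would call this once per n and print the result followed by '\n'.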
As @Keynan mentioned, pow returns a double while % requires its operands to be integers. To make it compile, you can cast the result to int with either static_cast or a C-style cast.
// static_cast
for (j = 1; (n % static_cast<int>(pow(5, j))) == 0; j++)
// c-style cast
for (j = 1; (n % (int)pow(5, j)) == 0; j++)
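An alternative to casting, following the earlier suggestion to raise an integer to a power by multiplying repeatedly, is sketched below. The name ipow is made up for this example, and the sketch assumes the result fits in a long long (there is no overflow check). It sidesteps the case where pow returns a value slightly below the exact power and the cast truncates it.

```cpp
// Hypothetical helper: base raised to a non-negative integer exponent,
// using only integer multiplication. No overflow checking is done.
long long ipow(long long base, unsigned exponent) {
    long long result = 1;
    for (unsigned i = 0; i < exponent; ++i) {
        result *= base;
    }
    return result;
}
```

With this helper the loop condition would read (n % ipow(5, j)) == 0, with no floating-point arithmetic involved.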
Related links:
Why isn't int pow(int base, int exponent) in the standard C++ libraries?

How can I improve my prime number program with Sieve of Eratosthenes algorithm?

My program prints all prime numbers from this expression:
((1 + sin(0.1*i))*k) + 1, i = 1, 2, ..., N.
Input Format:
No more than 100 examples. Every example has 2 positive integers on the same line.
Output Format:
Print each number on a separate line.
Sample Input:
4 10
500 100
Sample Output:
5
17
But my algorithm is not efficient enough. How can I add the Sieve of Eratosthenes so that it is efficient enough to not print "Terminated due to timeout"?
#include <iostream>
#include <cmath>
using namespace std;
int main() {
long long k, n;
int j;
while (cin >> k >> n) {
if (n>1000 && k>1000000000000000000) continue;
int count = 0;
for (int i = 1; i <= n; i++) {
int res = ((1 + sin(0.1*i)) * k) + 1;
for (j = 2; j < res; j++) {
if (res % j == 0) break;
}
if (j == res) count++;
}
cout << count << endl;
}
system("pause");
return 0;
}
You can improve your speed by 10x simply by doing a better job with your trial division. You're testing all integers from 2 to res instead of treating 2 as a special case and testing just odd numbers from 3 to the square root of res:
// k <= 10^3, n <= 10^9
int main() {
unsigned k;
unsigned long long n;
while (cin >> k >> n) {
unsigned count = 0;
for (unsigned long long i = 1; i <= n; i++) {
unsigned long long j, res = (1 + sin(0.1 * i)) * k + 1;
bool is_prime = true;
if (res <= 2 || res % 2 == 0) {
is_prime = (res == 2);
} else {
for (j = 3; j * j <= res; j += 2) {
if (res % j == 0) {
is_prime = false;
break;
}
}
}
if (is_prime) {
count++;
}
}
cout << count << endl;
}
}
Though k = 500 and n = 500000000 is still going to take forty seconds or so.
EDIT: I added a third way to improve efficiency
EDIT2: Added an explanation why Sieve should not be the solution and some trigonometry relations. Moreover, I added a note on the history of the question
Your problem is not to count all the prime numbers in a given range, but only those which are generated by your function.
Therefore, I don't think that the Sieve of Eratosthenes is the solution for this particular exercise, for the following reason: n is always rather small while k can be very large. If k is very large, then the Sieve algorithm would have to generate a huge number of prime numbers, only to use them for a small number of candidates.
You can improve the efficiency of your program by three means:
Avoid calculating sin(.) every time. You can use trigonometric relations, for example. Moreover, the first time you calculate these values, store them in an array and reuse them. Calculating sin() is very time consuming.
In your test to check if a number is prime, limit the search to sqrt(res). Moreover, consider testing only 2 and odd values of j.
If a candidate res is equal to the previous one, avoid redoing the test.
A bit of trigonometry
If c = cos(0.1) and s = sin(0.1), you can use the relations :
sin(0.1*(i+1)) = s*cos(0.1*i) + c*sin(0.1*i)
cos(0.1*(i+1)) = c*cos(0.1*i) - s*sin(0.1*i)
If n were large, it would be necessary to recompute sin() with the library function at regular intervals to avoid accumulating too much rounding error. But that should not be the case here, as n is always rather small.
However, as I mentioned, it is better to use only the "memoization" trick as a first step and check whether it is enough.
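As a rough sketch of both ideas combined, the table of sin(0.1*i) can be filled once with the recurrence above and then reused for every (k, n) pair. The function name is made up for this example, and the upper bound of 1000 on n is taken from the check in the question's code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns a table with table[i] = sin(0.1 * i) for
// i = 0..n, built from the angle-addition formulas so that the library
// sin and cos are each called only once.
std::vector<double> sineTable(std::size_t n) {
    const double c = std::cos(0.1);
    const double s = std::sin(0.1);
    std::vector<double> table(n + 1);
    double sinI = 0.0;  // sin(0.1 * 0)
    double cosI = 1.0;  // cos(0.1 * 0)
    for (std::size_t i = 0; i <= n; ++i) {
        table[i] = sinI;
        const double nextSin = s * cosI + c * sinI;  // sin(0.1 * (i + 1))
        const double nextCos = c * cosI - s * sinI;  // cos(0.1 * (i + 1))
        sinI = nextSin;
        cosI = nextCos;
    }
    return table;
}
```

The main loop would then compute res from (1 + table[i]) * k + 1. Because res is truncated to an integer, a tiny rounding difference could in principle change a value that lies extremely close to a whole number, so filling the table with plain sin(0.1 * i) calls is the safer first step.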
A note on the history of this question and why this answer:
Recently, this site received several questions of the form "how to improve my program, to count number of prime numbers generated by this k*sin() function ...". To my knowledge, these questions were all closed as duplicates, on the grounds that the Sieve is the solution and was explained in a previous similar (but slightly different) question. Now the same question has reappeared in a slightly different form: "How can I insert the Sieve algorithm in this program ... (with k*sin() again)". And then I realised that the Sieve is not the solution. This is not a criticism of the previous closures, as I made the same mistake in understanding the question. However, I think it is time to propose a new solution, even if it does not match the new question perfectly.
When you make use of a simple Wheel factorization, you can obtain a very nice speedup of your code. Wheel factorization of order 2 makes use of the fact that all primes bigger than 3 can be written as 6n+1 or 6n+5 for natural n. This means that you only have to do 2 divisions per 6 numbers. Or even further, all primes bigger than 5 can be written as 30n+m, with m in {1,7,11,13,17,19,23,29}. ( 8 divisions per 30 numbers).
Using this simple principle, you can write the following function to test your primes (wheel {2,3}):
bool isPrime(long long num) {
if (num < 2) return false; // 0, 1 and negative numbers are not prime
if (num < 4) return true; // 2 and 3 are prime
if (num % 2 == 0) return false; // divisible by 2
if (num % 3 == 0) return false; // divisible by 3
long long w = 5; // long long so that w*w does not overflow for large num
while (w*w <= num) {
if(num % w == 0) return false; // not prime
if(num % (w+2) == 0) return false; // not prime
w += 6;
}
return true; // must be prime
}
You can adapt the above for the wheel {2,3,5}. This function can be used in the main program as:
int main() {
long long k, n;
while (cin >> k >> n) {
if (n>1000 && k>1000000000000000000) continue;
int count = 0;
for (int i = 1; i <= n; i++) {
long long res = ((1 + sin(0.1*i)) * k) + 1;
if (isPrime(res)) { count++; }
}
cout << count << endl;
}
return 0;
}
A simple timing gives me for the original code (g++ prime.cpp)
% time echo "6000 100000000" | ./a.out
12999811
echo "6000 100000000" 0.00s user 0.00s system 48% cpu 0.002 total
./a.out 209.66s user 0.00s system 99% cpu 3:29.70 total
while the optimized version gives me
% time echo "6000 100000000" | ./a.out
12999811
echo "6000 100000000" 0.00s user 0.00s system 51% cpu 0.002 total
./a.out 10.12s user 0.00s system 99% cpu 10.124 total
Other improvements can be made but might have minor effects:
precompute your sine table sin(0.1*i) for i from 0 to 1000. This will avoid recomputing those sines over and over. This, however, has a minor impact, as most time is spent on the primality test.
Checking if res(i) == res(i+1): this has barely any impact as, depending on n and k, most consecutive res are not equal.
Using a lookup table of primes might be handier; this does have an impact.
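For reference, the wheel {2,3,5} adaptation mentioned earlier could look roughly like the sketch below. The function name is made up for this example, and it assumes num is small enough that w*w does not overflow a long long.

```cpp
#include <initializer_list>

// Hypothetical wheel {2,3,5} variant of isPrime: after handling 2, 3 and 5,
// only trial divisors of the form 30n + m, with m in
// {1,7,11,13,17,19,23,29}, are tested (8 divisions per 30 numbers).
bool isPrimeWheel30(long long num) {
    if (num < 2) return false;
    for (long long p : {2LL, 3LL, 5LL}) {
        if (num == p) return true;
        if (num % p == 0) return false;
    }
    static const int offsets[8] = {1, 7, 11, 13, 17, 19, 23, 29};
    for (long long base = 0;; base += 30) {
        for (int m : offsets) {
            const long long w = base + m;
            if (w == 1) continue;             // 1 is not a useful divisor
            if (w * w > num) return true;     // no divisor up to sqrt(num)
            if (num % w == 0) return false;   // found a divisor
        }
    }
}
```

It can be dropped into the main program in place of isPrime. No timings were taken for this variant.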
original answer:
My suggestion is the following:
Precompute your sinetable sin(0.1*i) for i from 0 to 1000. This will avoid recomputing those sines over and over. Also, do it smart (see point 3)
Find the largest possible value of res which is res_max=(2*k)+1
Find all primes for res_max using the Sieve of Eratosthenes. Also, realize that all primes bigger than 3 can be written as 6n+1 or 6n+5 for natural n. Or even further, all primes bigger than 5 can be written as 30n+m, with m in {1,7,11,13,17,19,23,29}. This is what is called Wheel factorization. So do not bother checking any other number. (a tiny bit more info here)
Have a lookup table that states if a number is a prime.
Do all your looping over the lookup table.

combinatorial exercise in C++ seeing speed decrement vs VBA

Just for recreation I thought I'd model a combinatorial problem with a Monte Carlo implementation. I did an implementation in VBA, and then as an exercise I thought I'd try to write it in C++ (I am a complete novice) to check speed differences etc. Other than my not knowing advanced coding techniques/tricks, I had naively thought that as long as the model was faithfully transferred to C++, with mirroring functions / loops / variable types as far as possible, then other than for minor tweaks the power of C++ would give me an immediate speed improvement, as I am running a lot of sims with lots of embedded sorting etc. Well, quite the opposite is occurring, so there must be something seriously wrong with the C++ implementation, which is about half as fast at best, depending on parameters. They both converge to the same answer, so I am happy mathematically that they work.
The problem:
Suppose you have N days in which to allocate k exams randomly with, eg 2 exam slots per day (AM/PM). What is the probability that say 2 days are full exam days? I think I have a closed form for this which I believe for now, so anyway wanted to test with MC.
Algorithm Heuristic:
Quite simply, say we have 18 days, 6 exams, 2 slots a day, and we want to know the probability we'll have 2 full days.
(i) simulate 6 uniforms U_i
(ii) allocate slots to the exams by randomly allocating them amongst remaining slots using the uniforms, adjusting for slots already allotted. As an example, if Exam 4 got allocated slot 4 in 34-slot space but 3 and 5 were already taken, then in 36-slot space Exam_4 would be allotted slot 6 (that would be the first free slot after rebasing). I have implemented this with some embedded sorting (in VB Bubblesort/quicksort has negligible diff, so far in C++ just using bubblesort).
(iii) just convert the slots into days, then count the sims that hit the target.
Phew - that's just there for background. The spirit of this is not really to optimise the algorithm, just to help me understand what I've done wrong to make it so much slower when 'mirrored' in C++!!
The Code!
// monte carlo
#include "stdafx.h"
#include"AllocateSlots.h"
#include<vector>
#include<string>
#include<iostream>
#include<cmath>
#include<ctime>
using namespace std;
int main()
{
int i, j, k, m;
int days, exams, slotsperday, filledslotsperday, targetfulldays, filleddays;
long sims, count, simctr;
cout << "Days?: ";cin >> days;
cout << "Exams?: ";cin >> exams;
cout << "Slots Per Day?: ";cin >> slotsperday;
cout << "Filled Slots?: ";cin >> filledslotsperday;
cout << "Target Full Days?: ";cin >> targetfulldays;
cout << "No. of sims?: ";cin >> sims;
system("PAUSE");
//timer
clock_t start;
start = clock();
double randomvariate;
//define intervals for remaining slots
vector <double> interval(exams);
int totalslots = (days * slotsperday);
for (k = 1; k <= exams; k++)
{
interval[k-1] = 1 / (static_cast <double> (totalslots - k + 1));
}
vector <int> slots(exams); //allocated slots
vector <int> previousslots(exams); //previously allocated slots
vector <int> slotdays(exams); //days on which slots fall
srand((int) time(0)); //generates seed on current system time
count = 0;
for (simctr = 1; simctr <= sims; simctr++)
{
vector<int> daycounts(days); //initialised at 0
for (i = 1; i <= exams;i++)
{
//rand() generates integers in [0, RAND_MAX] (RAND_MAX is 32767 with MSVC)
randomvariate = (static_cast <double> (rand()+1))/ (static_cast <double> (RAND_MAX+1));
j = 1;
while (j <= totalslots - i + 1)
{
if (randomvariate < j*interval[i - 1]) break;
j++;
}
slots[i - 1] = j;
}
for (i = 2; i <= exams;i++)
{
previousslots.resize(i - 1);
for (m = 1; m <= i - 1; m++)
{
previousslots[m - 1] = slots[m - 1];
}
BubbleSort(previousslots);
for (k = 1; k <= i - 1;k++)
{
if (slots[i - 1] >= previousslots[k - 1])
{
slots[i - 1]++ ;
}
}
}
//convert slots into days
for (i = 1; i <= exams;i++)
{
slotdays[i - 1] = SlottoDays(slots[i - 1], slotsperday);
}
//calculate the filled days
filleddays = 0;
for (j = 1; j <= days; j++)
{
for (k = 1; k <= exams; k++)
{
if (slotdays[k - 1] == j)
{
daycounts[j - 1]++;
}
}
if (daycounts[j - 1] == filledslotsperday)
{
filleddays++;
}
}
//check if target is hit
if (filleddays == targetfulldays)
{
count++;
}
}
cout << count << endl;
cout << "Time: " << (clock() - start) / (double)(CLOCKS_PER_SEC) << " s" << endl;
//cout << (static_cast<double>(count)) / (static_cast<double>(sims));
system("PAUSE");
return 0;
}
And the 2 ancillary functions:
#include "stdafx.h"
#include"AllocateSlots.h"
#include<iostream>
#include<cmath>
#include<vector>
using namespace std;
//returns day for a given slot
int SlottoDays(int &examslot, int &slotsperday)
{
return((examslot % slotsperday == 0) ? examslot/ slotsperday: examslot/ slotsperday + 1);
}
//BubbleSort Algorithm
vector <int> BubbleSort(vector <int> &values)
{
int i;
int j;
int tmpSort;
int N = values.size();
for (i = 0; i < N;i++)
{
for (j = i + 1; j < N; j++)
{
if (values[i] > values[j])
{
tmpSort = values[j];
values[j] = values[i];
values[i] = tmpSort;
}
}
}
return values;
}
So there it is - like I say the algorithm is common to C++ and VBA, happy to post the VBA but in the first instance just wondered if there was anything glaringly obvious in the above. Pretty much the first time I have done this, used vectors etc., unaided, self 'taught', so I have definitely screwed something up even though I have managed to make it run by some miracle! I'd be very grateful for some words of wisdom - I'm trying to teach myself C++ with exercises like this but what I really want to get to is speed (and mathematical accuracy of course!) for much larger projects.
Fyi in my example of 18 days, 6 exams, 2 slots per day, 2 days to get filled it should converge to about 3.77%, which it does with 1mm sims in 38s in VBA and 145s in the implementation above on my duocore 2.7G i7 4GB RAM laptop on x64 windows7.
From the discussion in comments it sounds like you may be running your program in Debug mode. This turns off a number of optimisations and even generates some extra code.
To run in Release mode look for the Solution Configurations drop down in the Standard Tool Bar and use the drop down to change from Debug to Release.
Then rebuild your Solution and rerun your tests.
To explore program performance further in Visual Studio you'll want to use the Performance Profiler tool. There's a tutorial on using the Performance Profiler (including a video) on the Microsoft documentation site: Profile application performance in Visual Studio. There's also Quickstart: First look at profiling tools and a whole bunch more: all under the Profiling in Visual Studio section.

fastest method for finding number of prime numbers between two large numbers x and y

Here x, y <= 10^12 and y - x <= 10^6.
I have looped from left to right and checked each number for primality. This method is very slow when x and y are around 10^11 to 10^12. Is there a faster approach?
I have stored all primes up to 10^6. Can I use them to find primes between huge values like 10^10 to 10^12?
for(i=x;i<=y;i++)
{
num=i;
if(check(num))
{
res++;
}
}
my check function
int check(long long int num)
{
long long int i;
if(num<=1)
return 0;
if(num==2)
return 1;
if(num%2==0)
return 0;
long long int sRoot = sqrt(num*1.0);
for(i=3; i<=sRoot; i+=2)
{
if(num%i==0)
return 0;
}
return 1;
}
Use a segmented sieve of Eratosthenes.
That is, use a bit set to store the numbers between x and y, represented by x as an offset and a bit set for [0, y-x]. Then sieve (eliminate multiples) for all the primes less than or equal to the square root of y. Those numbers that remain in the set are prime.
With y at most 10^12 you have to sieve with primes up to at most 10^6, which will take less than a second in a proper implementation.
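A minimal sketch of this idea follows. The function name is made up for this example, std::vector<bool> stands in for the bit set, and the sketch assumes 0 <= x <= y with y - x small enough to fit in memory (as in the question, where y - x <= 10^6). It is not tuned for speed.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: counts the primes in [x, y] with a segmented sieve.
std::int64_t countPrimesInRange(std::int64_t x, std::int64_t y) {
    if (y < 2 || x > y) return 0;
    x = std::max<std::int64_t>(x, 2);

    // floor(sqrt(y)), corrected for floating-point rounding.
    std::int64_t root = static_cast<std::int64_t>(std::sqrt(static_cast<double>(y)));
    while (root * root > y) --root;
    while ((root + 1) * (root + 1) <= y) ++root;

    // smallComposite[v] covers v in [0, root]; composite[i] refers to x + i.
    std::vector<bool> smallComposite(static_cast<std::size_t>(root + 1), false);
    std::vector<bool> composite(static_cast<std::size_t>(y - x + 1), false);

    for (std::int64_t p = 2; p <= root; ++p) {
        if (smallComposite[p]) continue;  // p is not prime
        for (std::int64_t m = p * p; m <= root; m += p) smallComposite[m] = true;
        // First multiple of p inside [x, y], but never below p*p so that
        // p itself is not crossed out.
        const std::int64_t first = std::max(p * p, (x + p - 1) / p * p);
        for (std::int64_t m = first; m <= y; m += p) composite[m - x] = true;
    }
    return std::count(composite.begin(), composite.end(), false);
}
```

The primes already stored up to 10^6 could replace the inner small sieve; it is recomputed here only to keep the sketch self-contained.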
This resource goes through a number of prime search algorithms in increasing complexity/efficiency. Here's the description of the best, that is PG7.8 (you'll have to translate back to C++, it shouldn't be too hard)
This algorithm efficiently selects potential primes by eliminating multiples of previously identified primes from consideration and minimizes the number of tests which must be performed to verify the primacy of each potential prime. While the efficiency of selecting potential primes allows the program to sift through a greater range of numbers per second the longer the program is run, the number of tests which need to be performed on each potential prime does continue to rise, (but rises at a slower rate compared to other algorithms). Together, these processes bring greater efficiency to generating prime numbers, making the generation of even 10 digit verified primes possible within a reasonable amount of time on a PC.
Further skip sets can be developed to eliminate the selection of potential primes which can be factored by each prime that has already been identified. Although this process is more complex, it can be generalized and made somewhat elegant. At the same time, we can continue to eliminate from the set of test primes each of the primes which the skip sets eliminate multiples of, minimizing the number of tests which must be performed on each potential prime.
You can use the Sieve of Eratosthenes algorithm. This page has some links to implementations in various languages: https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes.
Here is my implementation of the Sieve of Eratosthenes:
#include <string>
#include <iostream>
using namespace std;
const int k = 110000; //you can change this constant to whatever maximum int you would need to calculate
long int p[k + 1]; //here we store the Sieve of Eratosthenes from 2 to k (k + 1 entries, because the loops index up to k inclusive)
long int j;
void init_prime() //in here we set our array
{
for (int i = 2; i <= k; i++)
{
if (p[i] == 0)
{
j = i;
while (j <= k)
{
p[j] = i;
j = j + i;
}
}
}
/*for (int i = 2; i <= k; i++)
cout << p[i] << endl;*/ //if you uncomment this you can see the output of initialization...
}
string prime(int first, int last) //this is example of how you can use initialized array
{
string result = "";
for (int i = first; i <= last; i++)
{
if (p[i] == i)
result = result + to_string(i) + " "; //std::to_string (from <string>); primes are separated by a space
}
return result;
}
int main() //I done this code some time ago for one contest, when first input was number of cases and then actual input came in so nocases means "number of cases"...
{
int nocases, first, last;
init_prime();
cin >> nocases;
for (int i = 1; i <= nocases; i++)
{
cin >> first >> last;
cout << prime(first, last);
}
return 0;
}
You can use the Sieve of Eratosthenes to compute prime factorizations too, since p[j] ends up holding the largest prime factor of j. This is actually the fastest implementation of the Sieve I could manage to create that day (it can calculate the Sieve of this range in less than a second).

C++ Euler-Problem 14 Program Freezing

I'm working on Euler Problem 14:
http://projecteuler.net/index.php?section=problems&id=14
I figured the best way would be to create a vector of numbers that kept track of how big the series was for that number... for example from 5 there are 6 steps to 1, so if I ever reach the number 5 in a series, I know I have 6 steps to go and I have no need to calculate those steps. With this idea I coded up the following:
#include <iostream>
#include <vector>
#include <iomanip>
using namespace std;
int main()
{
vector<int> sizes(1);
sizes.push_back(1);
sizes.push_back(2);
int series, largest = 0, j;
for (int i = 3; i <= 1000000; i++)
{
series = 0;
j = i;
while (j > (sizes.size()-1))
{
if (j%2)
{
j=(3*j+1)/2;
series+=2;
}
else
{
j=j/2;
series++;
}
}
series+=sizes[j];
sizes.push_back(series);
if (series>largest)
largest=series;
cout << setw(7) << right << i << "::" << setw(5) << right << series << endl;
}
cout << largest << endl;
return 0;
}
It seems to work relatively well for smaller numbers but this specific program stalls at the number 113382. Can anyone explain to me how I would go about figuring out why it freezes at this number?
Is there some way I could modify my algorithm to be better? I realize that I am creating duplicates with the current way I'm doing it:
for example, the series of 3 is 3,10,5,16,8,4,2,1. So I already figured out the sizes for 10,5,16,8,4,2,1 but I will duplicate those solutions later.
Thanks for your help!
Have you ruled out integer overflow? Can you guarantee that the result of (3*j+1)/2 will always fit into an int?
Does the result change if you switch to a larger data type?
EDIT: The last forum post at http://forums.sun.com/thread.jspa?threadID=5427293 seems to confirm this. I found this by googling for 113382 3n+1.
I think you are severely overcomplicating things. Why are you even using vectors for this?
Your problem, I think, is overflow. Use unsigned ints everywhere.
Here's working code that's much simpler (it doesn't work with signed ints, however).
int main()
{
unsigned int maxTerms = 0;
unsigned int longest = 0;
for (unsigned int i = 3; i <= 1000000; ++i)
{
unsigned int tempTerms = 1;
unsigned int j = i;
while (j != 1)
{
++tempTerms;
if (tempTerms > maxTerms)
{
maxTerms = tempTerms;
longest = i;
}
if (j % 2 == 0)
{
j /= 2;
}
else
{
j = 3*j + 1;
}
}
}
printf("%u %u\n", maxTerms, longest); // %u matches unsigned int; needs <cstdio>
return 0;
}
Optimize from there if you really want to.
When i = 113383, your j overflows and becomes negative (thus never exiting the "while" loop).
I had to use "unsigned long int" for this problem.
The problem is overflow. Just because the sequence starts below 1 million does not mean that it cannot go above 1 million later. In this particular case, it overflows and goes negative resulting in your code going into an infinite loop. I changed your code to use "long long" and this makes it work.
But how did I find this out? I compiled your code and then ran it in a debugger. I paused the program execution while it was in the loop and inspected the variables. There I found that j was negative. That pretty much told me all I needed to know. To be sure, I added a cout << j; as well as an assert(j > 0) and confirmed that j was overflowing.
I would try using a large array rather than a vector, then you will be able to avoid those duplicates you mention as for every number you calculate you can check if it's in the array, and if not, add it. It's probably actually more memory efficient that way too. Also, you might want to try using unsigned long as it's not clear at first glance how large these numbers will get.
I stored the length of the chain for every number in an array, and during brute force, whenever I got a number less than the one being evaluated, I just added the chain length for that lower number and broke out of the loop.
For example, I already know the Collatz sequence for 10 is 7 terms long.
Now when I'm evaluating 13, I get 40, then 20, then 10, which I have already evaluated. So the total count is 3 + 7.
The result on my machine (for up to 1 million) was 0.2 seconds. With pure brute force it was 5 seconds.
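The caching approach described above could be sketched as follows. The function name is made up for this example, and the code is a reconstruction of the idea rather than the poster's actual program. It uses a 64-bit variable for the running term, because intermediate terms can exceed the 32-bit range for some starting values below one million, and it assumes limit is at least 2.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: returns {start, terms} for the longest Collatz chain
// among starting values below limit. terms counts both the start and the 1.
std::pair<std::uint32_t, std::uint32_t> longestCollatzChain(std::uint32_t limit) {
    std::vector<std::uint32_t> terms(limit, 0);  // terms[n] = chain length of n
    std::uint32_t bestStart = 1;
    std::uint32_t bestTerms = 1;
    if (limit > 1) terms[1] = 1;
    for (std::uint32_t start = 2; start < limit; ++start) {
        std::uint64_t j = start;  // 64-bit: terms can grow well past 2^32
        std::uint32_t steps = 0;
        while (j >= start) {      // every value below start is already cached
            j = (j % 2 == 0) ? j / 2 : 3 * j + 1;
            ++steps;
        }
        terms[start] = steps + terms[static_cast<std::size_t>(j)];
        if (terms[start] > bestTerms) {
            bestTerms = terms[start];
            bestStart = start;
        }
    }
    return {bestStart, bestTerms};
}
```

For 13 this reproduces the count from the example: three steps (40, 20, 10) plus the 7 cached terms for 10.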