C++ Euler-Problem 14 Program Freezing - c++

I'm working on Euler Problem 14:
http://projecteuler.net/index.php?section=problems&id=14
I figured the best way would be to create a vector that keeps track of how long the series is for each number. For example, from 5 there are 6 steps to 1, so if I ever reach the number 5 in a series, I know I have 6 steps to go and have no need to calculate those steps. With this idea I coded up the following:
#include <iostream>
#include <vector>
#include <iomanip>
using namespace std;
int main()
{
vector<int> sizes(1);
sizes.push_back(1);
sizes.push_back(2);
int series, largest = 0, j;
for (int i = 3; i <= 1000000; i++)
{
series = 0;
j = i;
while (j > (sizes.size()-1))
{
if (j%2)
{
j=(3*j+1)/2;
series+=2;
}
else
{
j=j/2;
series++;
}
}
series+=sizes[j];
sizes.push_back(series);
if (series>largest)
largest=series;
cout << setw(7) << right << i << "::" << setw(5) << right << series << endl;
}
cout << largest << endl;
return 0;
}
It seems to work relatively well for smaller numbers but this specific program stalls at the number 113382. Can anyone explain to me how I would go about figuring out why it freezes at this number?
Is there some way I could modify my algorithm to be better? I realize that I am creating duplicates with the current way I'm doing it:
for example, the series of 3 is 3,10,5,16,8,4,2,1. So I already figured out the sizes for 10,5,16,8,4,2,1 but I will duplicate those solutions later.
Thanks for your help!

Have you ruled out integer overflow? Can you guarantee that the result of (3*j+1)/2 will always fit into an int?
Does the result change if you switch to a larger data type?
EDIT: The last forum post at http://forums.sun.com/thread.jspa?threadID=5427293 seems to confirm this. I found this by googling for 113382 3n+1.

I think you are severely overcomplicating things. Why are you even using vectors for this?
Your problem, I think, is overflow. Use unsigned ints everywhere.
Here's working code that's much simpler (it doesn't work with signed ints, however).
#include <cstdio>
int main()
{
unsigned int maxTerms = 0;
unsigned int longest = 0;
for (unsigned int i = 3; i <= 1000000; ++i)
{
unsigned int tempTerms = 1;
unsigned int j = i;
while (j != 1)
{
++tempTerms;
if (tempTerms > maxTerms)
{
maxTerms = tempTerms;
longest = i;
}
if (j % 2 == 0)
{
j /= 2;
}
else
{
j = 3*j + 1;
}
}
}
printf("%u %u\n", maxTerms, longest);
return 0;
}
Optimize from there if you really want to.

When i = 113383, your j overflows and becomes negative (thus never exiting the "while" loop).
I had to use "unsigned long int" for this problem.

The problem is overflow. Just because the sequence starts below 1 million does not mean that it cannot go above 1 million later. In this particular case, it overflows and goes negative resulting in your code going into an infinite loop. I changed your code to use "long long" and this makes it work.
But how did I find this out? I compiled your code and then ran it in a debugger. I paused the program execution while it was in the loop and inspected the variables. There I found that j was negative. That pretty much told me all I needed to know. To be sure, I added a cout << j; as well as an assert(j > 0) and confirmed that j was overflowing.
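The check described above can also be done without a debugger. The following is only a sketch: collatz_peak and overflows_int32 are hypothetical helper names, not part of the original program. It follows the sequence in a 64-bit unsigned type and reports whether any term would not fit in a 32-bit signed int.

```cpp
#include <cstdint>

// Hypothetical helper: largest value reached by the 3n+1 sequence starting at n.
// Uses 64-bit arithmetic so the intermediate terms themselves do not overflow
// for the starting values discussed here.
std::uint64_t collatz_peak(std::uint64_t n) {
    std::uint64_t peak = n;
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        if (n > peak) peak = n;
    }
    return peak;
}

// True if some term of the sequence is larger than the biggest 32-bit signed int.
bool overflows_int32(std::uint64_t n) {
    return collatz_peak(n) > static_cast<std::uint64_t>(INT32_MAX);
}
```

Calling overflows_int32(113383) returns true, which matches the observation that j goes negative for that starting value when it is stored in an int.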

I would try using a large array rather than a vector, then you will be able to avoid those duplicates you mention as for every number you calculate you can check if it's in the array, and if not, add it. It's probably actually more memory efficient that way too. Also, you might want to try using unsigned long as it's not clear at first glance how large these numbers will get.

I stored the length of the chain for every number in an array, and during brute force, whenever I reached a number less than the one being evaluated, I just added the chain length for that lower number and broke out of the loop.
For example, I already know the Collatz sequence for 10 is 7 terms long.
Now when I'm evaluating 13, I get 40, then 20, then 10, which I have already evaluated. So the total count is 3 + 7.
The result on my machine (for up to 1 million) was 0.2 secs. With pure brute force it was 5 seconds.
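A minimal sketch of that caching idea, assuming chain lengths are counted in terms (so the chain for 1 has length 1). The names collatz_lengths and longest_chain_start are made up for illustration. The running value is kept in a 64-bit type because, as the other answers point out, intermediate terms grow far beyond the starting number.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: len[n] = number of terms in the chain starting at n,
// for 1 <= n < limit. len[0] is unused.
std::vector<std::uint32_t> collatz_lengths(std::uint32_t limit) {
    std::vector<std::uint32_t> len(limit, 0);
    if (limit > 1) len[1] = 1;
    for (std::uint32_t i = 2; i < limit; ++i) {
        std::uint64_t j = i;       // 64-bit: terms can exceed 2^32
        std::uint32_t steps = 0;
        while (j >= i) {           // stop once we drop below i: that length is cached
            j = (j % 2 == 0) ? j / 2 : 3 * j + 1;
            ++steps;
        }
        len[i] = steps + len[j];
    }
    return len;
}

// Hypothetical helper: starting number with the longest chain in the table.
std::uint32_t longest_chain_start(const std::vector<std::uint32_t>& len) {
    std::uint32_t best = 1;
    for (std::uint32_t i = 2; i < len.size(); ++i) {
        if (len[i] > len[best]) best = i;
    }
    return best;
}
```

With this table, the entry for 10 is 7 and the entry for 13 is 3 + 7 = 10, as in the example above.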

Related

C++ - Digitwise addition with carryover for arbitrary unsigned ints - running into memory problems [duplicate]

I have done some research on Stack Overflow about reverse for loops in C++ that use an unsigned integer instead of a signed one. But I still do NOT understand why there is a problem (see Unsigned int reverse iteration with for loops). Why does the following code yield a segmentation fault?
#include <vector>
#include <iostream>
using namespace std;
int main(void)
{
vector<double> x(10);
for (unsigned int i = 9; i >= 0; i--)
{
cout << "i= " << i << endl;
x[i] = 1.0;
}
cout << "x0= " << x[0] << endl;
return 0;
}
I understand that the problem occurs when the index i equals zero, because there is something like an overflow. But I think an unsigned integer is allowed to take the value zero, isn't it? Now if I replace it with a signed integer, there is absolutely no problem.
Can somebody explain to me the mechanism behind that reverse loop with an unsigned integer?
Thank you very much!
The problem here is that an unsigned integer is never negative.
Therefore, the loop-test:
i >= 0
will always be true. Thus you get an infinite loop.
When it drops below zero, it wraps around to the largest unsigned value.
Thus, you will also be accessing x[i] out-of-bounds.
This is not a problem for signed integers because it will simply go negative and thus fail i >= 0.
Thus, if you want to use unsigned integers, you can try one of the following possibilities:
for (unsigned int i = 10; i-- != 0; ) // body sees i = 9 down to 0
and
for (unsigned int i = 9; i != -1; i--)
These two were suggested by GManNickG and AndreyT from the comments.
And here's my original 3 versions:
for (unsigned int i = 9; i != (unsigned)0 - 1; i--)
or
for (unsigned int i = 9; i != ~(unsigned)0; i--)
or
for (unsigned int i = 9; i != UINT_MAX; i--)
The problem is, your loop allows i to be as low as zero and only expects to exit the loop if i is less than 0. Since i is unsigned, it can never be less than 0. It rolls over to 2^32-1. That is greater than the size of your vector and so results in a segfault.
Whatever the value of unsigned int i it is always true that i >= 0 so your for loop never ends.
In other words, if at some point i is 0 and you decrement it, it still stays non-negative, because it then contains a huge number, probably 4294967295 (that is 2^32-1).
The problem is here:
for (unsigned int i = 9; i >= 0; i--)
You start with a value of 9 for an unsigned int, and your exit condition is i >= 0, which is always true (an unsigned int is never negative!). Because of this the loop never ends: when i is 0, the decrement wraps it around to the maximum unsigned int value.
As you said a decrease of an unsigned below zero, which happens right after the last step of the loop, creates an overflow, the number wraps around to its maximum value and thus we end up with an infinite loop.
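The wrap-around the answers describe can be seen in isolation. This is a small sketch; decrement is a hypothetical helper name introduced only for illustration.

```cpp
#include <climits>

// Hypothetical helper: what an unsigned counter holds after one decrement.
// Unsigned arithmetic is defined modulo 2^N, so 0 - 1 yields UINT_MAX
// instead of a negative number.
unsigned int decrement(unsigned int i) {
    return i - 1;
}
```

decrement(0) equals UINT_MAX, so a test such as i >= 0 can never fail, and x[i] is then indexed far outside the vector.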
Can somebody explain to me the mechanism behind that reverse loop with an unsigned integer?
My preferred method for a reverse loop with an index is this:
for (unsigned int i = 9; i > 0; --i) {
cout << "i= " << x[i - 1] << endl;
}
and that is because it maps most closely to the normal loop equivalent:
for (unsigned int i = 0; i < 9; ++i) {
cout << "i= " << x[i] << endl;
}
If then you need to access the indexed element multiple times and you don't want to continuously write [i - 1], you can add something like this as the first line in the loop:
auto& my_element = my_vector[i - 1];
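To check that this pattern visits every index exactly once and stops cleanly, here is a short sketch. reverse_indices is a hypothetical helper, and std::size_t is used only because it is the index type of std::vector.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: records the indices a reverse loop over n elements
// visits, highest index first. The counter i runs from n down to 1 and the
// element index is i - 1, so i itself never has to go below zero.
std::vector<std::size_t> reverse_indices(std::size_t n) {
    std::vector<std::size_t> visited;
    for (std::size_t i = n; i > 0; --i) {
        visited.push_back(i - 1);
    }
    return visited;
}
```

For n = 3 the visited indices are 2, 1, 0, and for n = 0 the loop body never runs.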

If we list all the natural numbers below 10 that [C++]

I wrote the code below to solve the following problem. However, its logic is wrong, and as a newbie I cannot figure out why.
After I compile and run it, 'sum' is always 0. If I change the initial value of 'sum' to some other number, that number is printed as the result.
If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.
Find the sum of all the multiples of 3 or 5 below 1000.
#include <iostream>
using std::cout;
using std::endl;
int main()
{
long sum = 0;
for( long i; i < 1000; ++i )
{
if (( i % 3 == 0 ) || ( i % 5 == 0 ))
{
sum = sum + i;
}
}
cout << "The sum is: " << sum << endl;
return 0;
}
You need to initialize i in your loop:
for (long i = 0; i < 1000; ++i )
As it is, i is probably some random number greater than 0 at the top of the loop, and the result is that the loop is never executed.
You need to initialize i to zero otherwise i's value will be whatever happens to be in memory. In this case it's > 1000.
for (long i = 0; i < 1000; ++i)
Also, a nice trick I learned. Use ii as your index variable. It's much easier to find ii than just i in your code.
You forgot to set i = 0 in your loop, so the loop won't iterate.
All answers above are correct: you need to set your i to 0. What you might be interested in is that reading an uninitialized variable is undefined behaviour (UB), and modern compilers are actually pretty good at finding UB and using it to optimize the code. For example, GCC 5.2 with -O2 (optimizations enabled) will generate the same assembly for your code and for a version without the loop at all, regardless of what might happen to be in memory at the address of i.
Try putting this code (which is your code with an additional #if for convenience) into this online disassembler:
#include <iostream>
using std::cout;
using std::endl;
int main()
{
long sum = 0;
#if 1
for (long i; i < 1000; ++i)
{
if ((i % 3 == 0) || (i % 5 == 0))
{
sum = sum + i;
}
}
#endif
cout << "The sum is: " << sum << endl;
return 0;
}
add the -O2 flag to compiler options on top right and try changing #if 1 to #if 0 and observe that the disassembly is the same meaning that the compiler cut out the loop completely.
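For comparison, here is a sketch of the loop with i initialized, wrapped in a function so both the small example from the problem statement and the real limit can be checked. The function name sum_multiples_below is an assumption made for this illustration.

```cpp
// Hypothetical helper: sum of all natural numbers below limit that are
// multiples of 3 or 5. The loop variable is explicitly initialized to 0.
long sum_multiples_below(long limit) {
    long sum = 0;
    for (long i = 0; i < limit; ++i) {
        if (i % 3 == 0 || i % 5 == 0) {
            sum += i;
        }
    }
    return sum;
}
```

sum_multiples_below(10) gives 23, matching the example in the problem statement.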

Sieve of Eratosthenes for large numbers c++

Just like this question, I also am working on the sieve of Eratosthenes. Also from the book "programming principles and practice using c++", chapter 4. I was able to implement it correctly and it is functioning exactly as the exercise asks.
#include <iostream>
#include <vector>
using namespace std;
int main() {
unsigned int amount = 0;
cin >> amount;
vector<int>numbers;
for (unsigned int i = 0; i <= amount; i++) {
numbers.push_back(i);
}
for (unsigned int p = 2; p < amount; p++) {
if (numbers[p] == 0)
continue;
cout << p << '\n';
for (unsigned int i = p + p; i <= amount; i += p) {
numbers[i] = false;
}
}
return 0;
}
Now, how would I be able to handle real big numbers in the amount input? The unsigned int type should allow me to enter a number of 2^32=4,294,967,296. But I can't, I run out of memory. Yes, I've done the math: storing 2^32 amount of int, 32 bits each. So 32/8*2^32=16 GiB of memory. I have just 4 GiB...
So what I am really doing here is setting non-primes to zero. So I could use a boolean. But still, they would take 8 bits, so 1 byte each. Theoretically I could go to the limit for unsigned int (8/8*2^32=4 GiB), using some of my swap space for the OS and overhead. But I have an x86_64 PC, so what about numbers larger than 2^32?
Knowing that primes are important in cryptography, there must be a more efficient way of doing this? And are there also ways to optimize the time needed to find all those primes?
In terms of storage, you could use the std::vector<bool> container. Because of how it works, you trade speed for storage: it uses one bit per boolean, so your storage becomes 8 times as efficient. It should be possible to handle numbers close to 8*4,294,967,296 if all your RAM is available to this one program. The only other thing you need to do is use unsigned long long to get 64-bit numbers.
Note: Testing the program with the code example below, with an amount input of 8 billion, caused the program to run with a memory usage of approx. 975 MiB, proving the theoretical number.
You can also gain some time, because you can declare the complete vector at once, without iteration: vector<bool>numbers (amount, true); creates a vector of size equal to input amount, with all elements set to true. Now, you can adjust the code to set non-primes to false instead of 0.
Furthermore, once you have followed the sieve up to the square root of amount, all numbers that remain true are primes. Insert if (p * p >= amount) as an additional continue condition, just after you output the prime number. Also this is a humble improvement for your processing time.
Edit: In the last loop, p can be squared, because all numbers until the square of p are already proved not to be primes by previous numbers.
All together you should get something like this:
#include <iostream>
#include <vector>
using namespace std;
int main() {
unsigned long long amount = 0;
cin >> amount;
vector<bool>numbers (amount, true);
for (unsigned long long p = 2; p < amount; p++) {
if ( ! numbers[p])
continue;
cout << p << '\n';
if (p * p >= amount)
continue;
for (unsigned long long i = p * p; i < amount; i += p) { // the vector has indices 0..amount-1
numbers[i] = false;
}
}
return 0;
}
You've asked a couple of different questions.
For primes up to 2**32, sieving is appropriate, but you need to work in segments instead of in one big block. My answer here tells how to do that.
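Since the linked answer is not reproduced here, the following is only a rough sketch of what sieving in segments can look like, not that answer's code. The function name count_primes_segmented and the default segment size are assumptions. Only the base primes up to sqrt(limit) and one window of flags are held in memory at a time.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper: counts the primes in [2, limit], one window at a time.
// Assumes segment_size > 0 and limit well below the 64-bit maximum.
std::uint64_t count_primes_segmented(std::uint64_t limit,
                                     std::uint64_t segment_size = 1 << 16) {
    if (limit < 2) return 0;
    // Integer square root of limit, corrected for floating-point rounding.
    std::uint64_t root = static_cast<std::uint64_t>(
        std::sqrt(static_cast<long double>(limit)));
    while (root * root > limit) --root;
    while ((root + 1) * (root + 1) <= limit) ++root;

    // Step 1: ordinary sieve to collect the base primes up to sqrt(limit).
    std::vector<bool> small(root + 1, true);
    std::vector<std::uint64_t> base;
    for (std::uint64_t p = 2; p <= root; ++p) {
        if (!small[p]) continue;
        base.push_back(p);
        for (std::uint64_t m = p * p; m <= root; m += p) small[m] = false;
    }

    // Step 2: sieve each window [low, high] using only the base primes.
    std::uint64_t count = 0;
    std::vector<bool> segment;
    for (std::uint64_t low = 2; low <= limit; low += segment_size) {
        const std::uint64_t high = std::min(low + segment_size - 1, limit);
        segment.assign(high - low + 1, true);
        for (std::uint64_t p : base) {
            if (p * p > high) break;
            // First multiple of p inside the window, never below p * p.
            const std::uint64_t start = std::max(p * p, (low + p - 1) / p * p);
            for (std::uint64_t m = start; m <= high; m += p) {
                segment[m - low] = false;
            }
        }
        count += static_cast<std::uint64_t>(
            std::count(segment.begin(), segment.end(), true));
    }
    return count;
}
```

The window size only affects memory use and speed, not the result, so it can be tuned to fit the CPU cache.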
For cryptographic primes, which are very much larger, the process is to pick a number and then test it for primality, using a probabilistic test such as a Miller-Rabin test or a Baillie-Wagstaff test. This process isn't perfect, and occasionally a composite might be chosen instead of a prime, but such an occurrence is very rare.

What operations are time consuming and how to avoid them?

I am fairly new to C++ and I want to learn how to optimize the speed of my programs. I am currently working on a program that computes the perfect numbers from 1 to 1,000,000. A perfect number is a number that equals the sum of its proper divisors. E.g. 28 is a perfect number because 1+2+4+7+14=28. Below is my code:
#include <iostream>
using namespace std;
int main() {
int a = 1000000;
for(int i = 1; i <= a; ++i)
{
int sum = 0;
int q;
// The biggest proper divisor is half the number itself
if(i % 2 == 0) q = i/2;
else q = (i+1)/2;
for(int j = 1; j <= q; ++j)
{
if(i % j == 0) sum += j;
}
//Condition for perfect number
if(sum == i) cout << i << " is a perfect number!" << endl;
}
system("pause");
return 0;
}
What operations in this code are time consuming? How can I improve the speed of the program? In general, how do I learn about what operations that are time consuming and how to avoid them?
The only way to really know what operations are time consuming and are limiting the execution speed of your program is to run the program through a profiler. This tool will tell you where each second of the execution time was spent (on a function call basis, usually).
To answer your question specifically: the most time in this program will be spent at this line:
system("pause");
because, aside from being a horrible line of code that you should get rid of, it actually waits for user input, and as we all know, the thing between the chair and the screen is what slows things down.
You can trade memory consumption for computation with the following:
const int max = 1000000;
std::vector<std::size_t> a(max);
for(std::size_t i = 1; i != a.size(); ++i) {
for (std::size_t j = 2 * i; j < a.size(); j += i) {
a[j] += i;
}
}
for (std::size_t i = 1; i != a.size(); ++i) {
if(a[i] == i) {
std::cout << i << " is a perfect number!" << std::endl;
}
}
Branches: ifs, loops, function calls and goto are costly. They tend to distract the processor from performing data transfers and math operations.
For example, you can eliminate the if statement:
q = (i + (i % 2)) / 2; // This expression not simplified.
Research loop unrolling, although the compiler may perform this on higher optimization settings.
I/O operations are costly, especially using formatted I/O.
Try this:
if(sum == i)
{
static const char text[] = " is a perfect number!\n";
cout << i;
cout.write(text, sizeof(text) - 1); // Output as raw data.
}
Division and modulo operations are costly.
If you can divide by a power of 2, you can convert the division into a shift right.
You may be able to avoid modulo operations by using binary AND.
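As a sketch of those two replacements for unsigned values (the helper names are made up, and optimizing compilers typically perform this rewrite on their own when the divisor is a constant power of 2):

```cpp
#include <cstdint>

// Hypothetical helpers. k is the exponent, i.e. the divisor is 2^k.
// Both assume k < 32 and an unsigned operand; for negative signed values
// a shift and a division do not round the same way.
std::uint32_t div_pow2(std::uint32_t x, unsigned k) {
    return x >> k;                              // same as x / (1u << k)
}

std::uint32_t mod_pow2(std::uint32_t x, unsigned k) {
    return x & ((std::uint32_t{1} << k) - 1);   // same as x % (1u << k)
}

// Parity test without the modulo operator.
bool is_even(std::uint32_t x) {
    return (x & 1u) == 0;
}
```

For example, div_pow2(100, 3) is 12 and mod_pow2(100, 3) is 4, the same as 100 / 8 and 100 % 8.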
My rules of thumb:
conditional branches (i.e. comparisons) are costly
divisions are costly (as well as modulo %)
prefer integer operations over floating-point
How to avoid them?
Well, there is no simple answer. In many cases you just can't.
You avoid conditional branches by using branchless expressions, or by improving the program logic.
You avoid divisions by using shifts or lookup-tables or rewriting expressions (when possible).
You avoid floating-point by simulating fixed-point.
In the given example, you have to focus on the body of the innermost loop. That's the line that is executed most frequently (about 250000000000 times vs 1000000 for the others, since the inner loop runs roughly i/2 times for each i). Unfortunately, it contains a comparison and a division, which are hard to remove.
Optimizations of other parts of the code will have no measurable effect. In particular, don't worry about the cout statement: it will be called 4 times in total.

main() not executing, but compiling

I have this simple program:
// Include libraries
#include <iostream>
#include <string>
#include <vector>
using namespace std;
// Include locals
// Start
#define NUMBER 600851475143
int main(int argc, const char* argv[])
{
long long int ans = 0;
long long int num = NUMBER;
vector<int> factors;
do
{
// Get lowest factor
for (int i = 1; i <= num; ++i)
{
if (!(num % i))
{
factors.push_back(i);
num /= i;
break;
}
}
} while (num > 1);
cout << "Calculated to 1.\n";
int highestFactor = numeric_limits<int>::min();
for (int i = 0; i < factors.size(); ++i)
{
if (factors[i] > highestFactor)
{
highestFactor = factors[i];
}
}
ans = highestFactor;
cout << ans << endl;
return EXIT_SUCCESS;
}
compiling with g++ -O2 -c -o prob3.o prob3.cpp proved successful, but when I ran it I saw nothing and it just kept running and I had to Ctrl-C (force-kill) it in the end. When I try to add
int main(int argc, const char* argv[])
{
cout << "Test\n";
to the program, Test didn't get printed either. It's like my program is not executed at all.
Any help or advice is appreciated!
Solution
I forgot prime numbers started at 2. Change for (int i = 1 to for (int i = 2.
Those nested loops are going to loop forever. The inner for loop will only ever execute once because of the break so it will only ever do num /= 1. That means num never decreases and so num > 1 will never be false. I suppose you just need to wait longer!
The reason you're not seeing "Test" is probably because you haven't flushed the output. Try:
std::cout << "Test" << std::endl;
Your program is simply running. It takes a long time to execute.
For the cout << "Test\n";, it's a matter of the cout stream not being flushed: what you wrote to the stream is still in your program memory and not yet flushed to the system to be printed.
Have you tried starting your for loop from 2? The modulo test is meaningless if you start from 1:
if (!(num % i))
num % 1 gives 0, so the if condition is true on the very first iteration and num is only ever divided by 1.
Your loop is an infinite loop. The first factor you find is 1 (since num % 1 is 0) and as such you divide num by 1 which results in num which re-enters the for loop, which does the same again and again.
Even with this fixed (initialize i in the loop with 2), your inner for loop can still be an infinite loop and/or cause UB. Otherwise (as the others stated) it is "just" running for a very long time. Which of the two you get depends on the value you are trying to factor (assuming the most common platforms here): if its smallest factor is smaller than std::numeric_limits<int>::max(), the problem does not arise. Let's call the values whose smallest factor is larger than that BIGPRIME (600851475149 would be a good example).
long long int is at least 64 bits in size. int is unlikely to be bigger than 32 bits on most platforms, so i can only go up to std::numeric_limits<int>::max(), which is (again assuming the common 32-bit int) 2147483647. In the comparison it is promoted to long long int but keeps its value, which is always smaller than BIGPRIME. Incrementing i therefore never gets you anywhere, and once you are at max() you enter UB land, because signed integer overflow is undefined in C++. Your code might loop forever there, or do things like record -1 as a valid factor, or make you pregnant.
You can easily observe that by adding some
if( 0 == (i%100000000)){ std::cout << i << std::endl; }
into the for loop.
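Putting the fixes from this thread together, a minimal sketch might look like the following. The function name largest_prime_factor is an assumption, and stopping the search at f * f <= n is a common trial-division shortcut, not something taken from the original program. The loop variable is 64-bit, so it cannot run into the int limit described above, and the search starts at 2 instead of 1.

```cpp
#include <cstdint>

// Hypothetical helper: largest prime factor of n (for n >= 2), by trial division.
std::int64_t largest_prime_factor(std::int64_t n) {
    std::int64_t largest = 1;
    // Start at 2 (dividing by 1 never shrinks n) and keep f 64-bit so that
    // f * f cannot overflow for inputs of this size.
    for (std::int64_t f = 2; f * f <= n; ++f) {
        while (n % f == 0) {   // strip every copy of this factor
            largest = f;
            n /= f;
        }
    }
    if (n > 1) {
        largest = n;           // what remains is a prime larger than any f tried
    }
    return largest;
}
```

For the number in the question, largest_prime_factor(600851475143) returns 6857 almost instantly, because n shrinks every time a factor is divided out.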