I would like to multiply 2 positive signed 32-bit integers i and j. The worst case is when i and j are both INT_MAX, where their product still fits within a 64-bit integer, but when I perform the operation
int i = INT_MAX;
int j = INT_MAX;
long long int res = i * j;
I get garbage due to integer overflow. So I've typically solved that problem by casting i or j to a long long int
int i = INT_MAX;
int j = INT_MAX;
long long int res = (long long int)i * j;
Is this the typical workaround for this issue? Are there other ways that may be better?
Your solution is correct, and standard enough that quality compilers will recognize it. Some CPUs have dedicated 32x32->64 multiplication instructions, and you can reasonably expect a compiler to use such an instruction despite the cast.
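For reference, here is a minimal self-contained version of that pattern (a sketch; it adds nothing beyond what the question already shows):

#include <climits>
#include <cstdio>

int main() {
    int i = INT_MAX;
    int j = INT_MAX;
    // Casting one operand widens the whole multiplication to 64 bits.
    long long res = (long long)i * j;
    printf("%lld\n", res);  // 4611686014132420609 where int is 32 bits
    return 0;
}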
I was solving a problem wherein the task was to output the row of Pascal's triangle at a given row index supplied by the user.
https://leetcode.com/problems/pascals-triangle-ii/
I wrote my solution, which had issues storing the huge intermediate results.
vector<int> getRow(int rowIndex) {
    vector<int> v;
    int C = 1;
    v.push_back(1);
    for (int i = 1; i <= rowIndex; i++)
    {
        printf("%d ", C);
        C = C * (rowIndex +1 - i) / i;
        v.push_back(C);
    }
    return v;
}
On going through these questions,
What range of values can integer types store in C++
How many bytes is unsigned long long?
and going through some other sources, I made the following change, which gave me the required result.
C = (unsigned long long)C * (rowIndex +1 - i) / i;
Since "C" is of type int and my vector v stores int, i wanted to know why would a cast unsigned long long still give me valid results.
The sub-expression C * (rowIndex +1 - i) can overflow before your division. By casting C to a larger data type, the whole expression is evaluated in that type, so the multiplication will not overflow. After the division by i, the result is converted back to an int, and thanks to the division it is within the range of an int.
Note that this is only for the values you currently have. If you continue with even higher values then sooner or later you will have overflows that can't be fixed by such a cast.
When you say
(unsigned long long)C
you are not making the actual variable C an unsigned long long. You are just saying that when evaluating
C * (rowIndex +1 - i) / i;
C (on the right-hand side) should be treated as unsigned long long. That is, only the temporary value holding C, then its multiplication with (rowIndex +1 - i), and then its division by i are carried out at that width. If the whole result were bigger than what an int can hold, this would not work either.
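A minimal sketch of the same point, using made-up values instead of the question's Pascal's triangle terms:

#include <cstdio>

int main() {
    // Hypothetical values: the product 70000 * 70000 = 4,900,000,000
    // does not fit in a 32-bit int, but the quotient 700,000,000 does.
    int C = 70000, k = 70000, i = 7;
    // The cast widens the whole subexpression, so the intermediate
    // product is held in unsigned long long and cannot overflow here;
    // the quotient then fits back into an int.
    int result = (unsigned long long)C * k / i;
    printf("%d\n", result);  // 700000000
    return 0;
}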
I've written a C++ function to calculate factorial and used it to calculate 22C11 (Combination). I have declared a variable ans and set it to 0. I tried to calculate
22C11 = fact(2*n)/(fact(n)*fact(n))
where I passed n as 11. For some reason, I'm getting a negative value stored in ans. How can I fix this?
long int fact(long int n) {
    if (n == 1 || n == 0)
        return 1;
    long int x = 1;
    if (n > 1)
        x = n * fact(n - 1);
    return x;
}
The following lines are included in the main function:
long int ans = 0;
ans = ans + (fact(2 * n) / (fact(n) * fact(n)));
cout << ans;
The answer I'm getting is -784
The correct answer should be 705432
NOTE: This function is working perfectly fine for n<=10. I have tried long long int instead of long int but it still isn't working.
It is unwise to actually calculate factorials - they grow extremely fast. Generally, with combinatorial formulae it's a good idea to look for a way to re-order operations to keep intermediate results somewhat constrained.
For example, let's look at (2*n)!/(n!*n!). It can be rewritten as ((n+1)*(n+2)*...*(2*n)) / (1*2*...*n) == (n+1)/1 * (n+2)/2 * (n+3)/3 * ... * (2*n)/n. By interleaving multiplication and division, the rate of growth of the intermediate result is reduced.
So, something like this:
int f(int n) {
    int ret = 1;
    for (int i = 1; i <= n; ++i) {
        ret *= (n + i);  // bring in the next numerator term
        ret /= i;        // divide immediately; ret is now C(n+i, i), so this is exact
    }
    return ret;
}
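For example, f(11) returns 705432, the expected value of 22C11, and every intermediate result along the way stays well inside int range.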
22! = 1,124,000,727,777,607,680,000
2^64 - 1 = 18,446,744,073,709,551,615
So unless your unsigned long long is a 128-bit integer, you have integer overflow.
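If you want to see exactly where factorials stop fitting, here is a small sketch (not part of the answer above) that checks for overflow before each multiplication:

#include <iostream>
#include <limits>

int main() {
    unsigned long long f = 1;
    for (unsigned long long n = 2; ; ++n) {
        // If f * n would exceed the maximum, stop before overflowing.
        if (f > std::numeric_limits<unsigned long long>::max() / n) {
            std::cout << n << "! is the first factorial that overflows unsigned long long\n";  // prints 21
            break;
        }
        f *= n;
    }
    return 0;
}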
You are triggering integer overflow, which causes undefined behaviour. You could in fact use long long int or unsigned long long int to get a little more range, e.g.:
unsigned long long fact(int n)
{
    if (n < 2)
        return 1;
    return fact(n - 1) * n;
}
You say you tried this and it didn't work, but I'm guessing you forgot to also update the type of x or something (in my version I removed x, as it is redundant). And/or your calculation was still so big that it overflowed unsigned long long int.
You may be interested in this thread which shows an algorithm for working out nCr that doesn't require so much intermediate storage.
You increase your chances of success by avoiding the brute-force method.
COMB(N1, N2) = FACT(N1)/(FACT(N1-N2)*FACT(N2))
You can take advantage of the fact that the numerator and the denominator have a lot of common terms.
COMB(N1, N2) = (N1-N2+1)*(N1-N2+2)*...*N1 / FACT(N2)
Here's an implementation that makes use of that knowledge and computes COMB(22,11) with much less risk of integer overflow.
unsigned long long comb(int n1, int n2)
{
    unsigned long long res = 1;
    // Multiply the terms left after cancelling FACT(n1 - n2) from the numerator.
    for (int i = (n1 - n2) + 1; i <= n1; ++i)
    {
        res *= i;
    }
    // Then divide out FACT(n2).
    for (int i = 2; i <= n2; ++i)
    {
        res /= i;
    }
    return res;
}
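As a quick check, comb(22, 11) returns 705432; the largest intermediate value, 22!/11! = 28,158,588,057,600, fits comfortably in 64 bits.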
When I compile this trivial piece of code via Microsoft's VC 2008:
double maxDistance(unsigned long long* a, unsigned long long* b, int n)
{
    double maxD = 0, currentD = 0;
    for (int i = 0; i < n; ++i)
    {
        currentD = b[i] - a[i];
        if (currentD > maxD)
        {
            maxD = currentD;
        }
    }
    return maxD;
}
The compiler gives me:
warning C4244 stating: conversion from 'unsigned long long' to 'double', possible loss of data. On the line
currentD = b[i] - a[i]
I know that it's better to rewrite the code somehow (I use double to account for possible negative values of the difference), but I'm just curious: why in the world can a conversion from unsigned long long to double lead to data loss, if unsigned long long's range is 0 to 18,446,744,073,709,551,615 and double's is about ±1.7E±308?
An IEEE double-precision floating point number has 53 bits of mantissa. This means that (most) integers greater than 2^53 can't be stored exactly in a double.
Example program (this is for GCC, use %I64u for MSVC):
#include <stdio.h>

int main() {
    unsigned long long ull;
    ull = (1ULL << 53) - 1;
    printf("%llu %f\n", ull, (double)ull);
    ull = (1ULL << 53) + 1;
    printf("%llu %f\n", ull, (double)ull);
    return 0;
}
Output:
9007199254740991 9007199254740991.000000
9007199254740993 9007199254740992.000000
A double supports a larger range of possible values, but cannot represent all values in that range. Some of the values that cannot be represented are integral values, which a long or a long long can represent.
Trying to assign a value that a floating point variable cannot represent means the result is some approximation: a value that is close, but not exactly equal. That represents a potential data loss (depending on what value is being assigned).
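As a side note on the code in the question: because b[i] - a[i] is evaluated in unsigned long long before the conversion, a "negative" difference wraps around to a huge positive value rather than becoming negative. A sketch of one way to get genuinely negative differences, assuming the rounding above 2^53 is acceptable:

double maxDistance(const unsigned long long* a, const unsigned long long* b, int n)
{
    double maxD = 0;
    for (int i = 0; i < n; ++i)
    {
        // Convert each operand first, then subtract in double: the
        // difference can now be negative, at the cost of rounding
        // for values above 2^53.
        double currentD = (double)b[i] - (double)a[i];
        if (currentD > maxD)
            maxD = currentD;
    }
    return maxD;
}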
Couldn't you initialize an unsigned int and then increment it until it doesn't increment anymore? That's what I tried to do and I got a runtime error "Timeout." Any idea why this doesn't work? Any idea how to do it correctly?
#include <iostream>

int main() {
    unsigned int i(0), j(1);
    while (i != j) {
        ++i;
        ++j;
    }
    std::cout << i;
    return 0;
}
Unsigned arithmetic is defined as modulo 2^n in C++ (where n is the number of bits). So when you increment the maximum value, you get 0.
Because of this, the simplest way to get the maximum value is to use -1:
unsigned int i = -1;
std::cout << i;
(If the compiler gives you a warning, and this bothers you, you can use 0U - 1, or initialize with 0 and then decrement.)
Since i will never be equal to j, you have an infinite loop.
Additionally, this is a very inefficient method for determining the maximum value of an unsigned int. std::numeric_limits<unsigned int>::max() gives you the result without looping for 2^16, 2^32, 2^64, or however many values your unsigned int can represent. If you didn't want to do that, you could write a much smaller loop:
unsigned int shifts = sizeof(unsigned int) * 8; // or use CHAR_BIT from <climits>
unsigned int maximum_value = 1;
for (unsigned int i = 1; i < shifts; ++i)
{
    maximum_value <<= 1;
    ++maximum_value;
}
Or simply do
unsigned int maximum = (unsigned int)-1;
i will always be different from j, so you have entered an endless loop. If you want to take this approach, your code should look like this:
unsigned int i(0), j(1);
while (i < j) {
    ++i;
    ++j;
}
std::cout << i;
return 0;
Notice I changed it to while (i<j). Once j overflows i will be greater than j.
When an overflow happens, the value doesn't just stay at the highest, it wraps back around to the lowest possible number.
i and j will never be equal to each other. When an unsigned integral value reaches its maximum, adding 1 to it wraps around to the minimum, which is 0.
For example, consider unsigned char: its maximum is 255. After adding 1 you will get 0.
So your loop is infinite.
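The same wrap-around is easy to demonstrate (a sketch; the cast to int is only so the value prints as a number):

#include <iostream>

int main() {
    unsigned char c = 255;              // maximum for an 8-bit unsigned char
    ++c;                                // wraps around modulo 256
    std::cout << static_cast<int>(c);   // prints 0
    return 0;
}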
I assume you're trying to find the maximum value an unsigned int can store (65,535 if int is 16 bits wide; on most modern systems it is 32 bits, giving 4,294,967,295). The reason the program times out is that when j hits the maximum value it can store, it "goes off the end": the next time j increments, it wraps to 0 while i holds the maximum.
This means that, the way you're going about it, i would NEVER equal j, and the loop would run indefinitely. If you changed it to what Damien has, you'd end up with i == 65,535 and j equal to 0.
Couldn't you initialize an unsigned int and then increment it until it doesn't increment anymore?
No. Unsigned arithmetic is modular, so it wraps around to zero after the maximum value. You can carry on incrementing it forever, as your loop does.
Any idea how to do it correctly?
unsigned int max = -1; // or
unsigned int max = std::numeric_limits<unsigned int>::max();
or, if you want to use a loop to calculate it, change your condition to (j != 0) or (i < j) to stop when j wraps. Since i stays one behind j, it will contain the maximum value when the loop stops. Note that this might not work for signed types, as they give undefined behaviour when they overflow.
This should find the largest prime factor of a number, but it isn't working.
The answer should be 6857, but it is returning 688543.
int isPrime(unsigned long int n)
{
    for (unsigned long int i = 2; i * i < n; i++)
    {
        if (n % i == 0)
        {
            return 0;
        }
    }
    return 1;
}
int main()
{
    unsigned long int num = 600851475143;
    unsigned long int max = 2, i = 2;
    while (num != 1)
    {
        if (num % i == 0 && isPrime(i))
        {
            max = i;
            num /= i;
            i--;
        }
        i++;
    }
    cout << max;
    return 0;
}
Thanks in advance:)
Among other issues, this will be a problem with large numbers:
for (unsigned long int i = 2; i * i < n; i++)
i*i will overflow for large numbers, because unsigned long appears to be 32 bits on the system you are compiling for.
You can fix it by switching to:
for (unsigned long int i = 2; i <= sqrt(n); ++i)
As long as n itself didn't overflow, sqrt(n) will be valid. However, I would still suggest switching to unsigned long long if you are going to use numbers that get close to the bounds of 32-bit integers.
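If you would rather stay in integer arithmetic, a common alternative (not part of the answer above) is to rearrange the condition so the square is never computed:

// i <= n / i behaves like i * i <= n but cannot overflow,
// since the square is never actually formed.
int isPrime(unsigned long long n)
{
    if (n < 2)
        return 0;
    for (unsigned long long i = 2; i <= n / i; i++)
    {
        if (n % i == 0)
            return 0;
    }
    return 1;
}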
unsigned long is apparently 32 bits on your system, so num won't be 600851475143 but instead 600851475143 mod 2^32, which is 3851020999. 688543 is the largest prime factor of that number, so it appears that your algorithm works correctly, at least.
Look up the maximum ranges for the types on your compiler/system combination, then pick an appropriate one.
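Putting the two suggestions together, here is a sketch of the fixed program (the isPrime call is dropped because trial division in increasing order only ever divides by primes):

#include <iostream>

int main() {
    // unsigned long long is guaranteed to be at least 64 bits, so
    // 600851475143 is representable on any conforming implementation.
    unsigned long long num = 600851475143ULL;
    unsigned long long maxFactor = 1, i = 2;
    while (num != 1) {
        while (num % i == 0) {  // divide out each prime factor completely
            maxFactor = i;
            num /= i;
        }
        ++i;
    }
    std::cout << maxFactor << '\n';  // prints 6857
    return 0;
}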