There are 2 large integer numbers. When I multiply them the result is always wrong, even though I store it in a long double and the result should be within the valid range of long double:
long double f = 1000000000 * 99999;
I debugged, and the result is strange: -723552768.00000000. Did I miss something? How can I multiply them?
Thanks and regards!
From the C++ standard:
4 An unsuffixed floating constant has type double. If suffixed by the
letter f or F, it has type float. If suffixed by the letter l or L, it
has type long double
auto fl = 1000000000.L * 99999.L;
std::cout << fl << "\n";
or
long double fl = 1000000000L * 99999.L;
std::cout <<"\n"<< fl << "\n";
An unsuffixed integer literal that fits in int has type int in C++. Thus, the expression 1000000000 * 99999 is the multiplication of two ints, and the result of the * operator is also an int. This int is converted to long double only after the multiplication has taken place, when it is assigned to f. On most platforms int is 4 bytes in size, with a range of -2147483648 to 2147483647. However, the product 1000000000 x 99999 is 9.9999 x 10^13, which falls outside this range, so the multiplication overflows because an int is not large enough to hold the value.
To avoid this, at least one of the operands of * should be a long double literal, written with a decimal point and the suffix l or L, as follows:
long double f = 1000000000.L * 99999;
In the above expression, the other operand is converted to long double and the * operator returns a long double, which is large enough to hold the product before it is assigned to f.
I agree with @feiXiang. You are basically multiplying two ints. To get the correct result, you have to define the large numbers as long double. See the code below:
#include <iostream>
using namespace std;
int main()
{
long double a = 1000000000;
long double b = 99999;
long double f = a * b;
cout<<f;
return 0;
}
Output:
9.9999e+13
Actually you invoke undefined behavior with:
long double f = 1000000000 * 99999;
First, 1000000000 * 99999 is evaluated, which is a multiplication of two int objects. The product of two int objects is always an int. Since int (most likely 32 bits) is not big enough to represent the result, the upper bits are lost.
Since overflow in signed integer types is undefined, you just triggered undefined behavior. But in this case it is possible to explain what happened, even though it is UB.
The computation keeps only the lowest 32 bits, which are (1000000000 * 99999) modulo (2**32) == 3571414528. But this value is too big for int. Since negative ints are represented in two's complement on PCs, we have to subtract 2**32 whenever 2**31 <= result < 2**32. This gives -723552768.
Now, the last step is:
long double f = -723552768;
And that is what you see.
To overcome the issue, either use long long like this:
long double f = 1000000000LL * 99999;
Or double:
long double f = 1000000000.0 * 99999;
1000000000 and 99999 are integers, so the result of 1000000000 * 99999 is an integer before it is assigned to your variable, and that result is out of the range of int.
You should make sure that the result is a long double first:
long double f = (long double) 1000000000 * 99999;
Or
long double f = 1000000000LL * 99999;
Found this issue in C++ while detecting Kaprekar numbers in a range. For number 77778 -
unsigned long long sq = pow(n, 2);
returns 6,049,417,284 while
unsigned long long sq = n * n;
returns 1,754,449,988
Any ideas why? Is this some sort of overflow which pow avoids but a plain n*n does not?
Assuming your n is a typical int or unsigned int, the reason for this is that this line
unsigned long long sq = n * n;
is equivalent to
unsigned long long sq = (int)(n * n);
as n * n is evaluated first (with both operands as integers), before the result is assigned to sq. So this is an overflow problem (and welcome to Stack Overflow too!).
You will probably also want to read more about the terms overflow and casting (they are very common issues in computing, so understanding them early will be of great help!).
This has nothing to do with Kaprekar numbers. On most machines today int is 32 bits wide. Thus it can only hold values from -2,147,483,648 to 2,147,483,647 (or 0 to 4,294,967,295 for its unsigned counterpart).
Thus processing n * n will give you:
n * n = 6,049,417,284 - 4,294,967,296 = 1,754,449,988 //overflow at (4,294,967,295 + 1)!
If you cast beforehand:
unsigned int n = 77778;
unsigned long long sq = pow(n, 2);
unsigned long long sq2 = (unsigned long long)n * n; //note the casting here.
std::cout << sq << std::endl;
std::cout << sq2 << std::endl;
Then the results will be identical, since there won't be overflow.
Your n is declared as a 32-bit int. You need to either change it to long long or cast an operand to long long before the multiplication.
unsigned long long sq=(unsigned long long)n*n;
This will give the right answer.
I suspect that n is declared as unsigned int and that you've compiled with a data model in which int is 32 bits wide. The maximum value that can be represented with this type is 2^32 - 1 = 4294967295. Anything beyond this value wraps around. So assigning 4294967296 would give 0, 4294967297 would give 1, and so on.
You have an overflow; since both operands are unsigned int, the resulting type is unsigned int too. The true result of the operation would be 6049417284. Stored in an unsigned int, it wraps and becomes 1754449988 = 6049417284 - 4294967296. This unsigned int result is then assigned to the wider type unsigned long long, which doesn't change the value. It's necessary to understand the difference between the result's type (the type of the expression) and the destination type (the type of the variable that is going to hold the result).
Wrap-around behaviour (more formally, arithmetic modulo 2^n) in unsigned types is well-defined in C++, so the compiler might not warn you.
Quote from Unsigned Arithmetic:
If an unsigned integer overflows, the result is defined modulo 2^w, where w is the number of bits in that particular unsigned integer. By implication, an unsigned integer is never negative.
I am testing the function fitsBits(int x, int n) on my own and I found a case where the function gives the wrong result. What is the problem?
/*
* fitsBits - return 1 if x can be represented as an
* n-bit, two's complement integer.
* 1 <= n <= 32
* Examples: fitsBits(5,3) = 0, fitsBits(-4,3) = 1
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 15
* Rating: 2
*/
int fitsBits(int x, int n) {
int r, c;
c = 33 + ~n;
r = !(((x << c)>>c)^x);
return r;
}
It seems to give the wrong answer for
fitsBits(0x80000000, 0x20);
It gives me 1, but actually it should be 0...
How could I fix it?
Thank you!
fitsBits(0x80000000, 0x20);
This function returns 1 because the first parameter of your function is an int, which is (in practice these days) a 32-bit signed integer. The largest value that a signed 32-bit integer can represent is 0x7FFFFFFF, which is less than the value you are passing in. Because of that, your value is converted and becomes -0x80000000, something that a 32-bit integer can represent. Therefore your function returns 1 (yes, the first argument is something that can be represented using 0x20 = 32 bits).
If you want your function to properly classify the number 0x80000000 as something that cannot be represented using 32 bits, you need to change the type of the first parameter of your function. One option would have been to use an unsigned int, but from your problem definition it seems like you need to properly handle negative numbers, so your remaining option is long long int, which can hold numbers between -0x8000000000000000 and 0x7FFFFFFFFFFFFFFF.
You will need to make a couple more adjustments: you need to explicitly specify that your constant is of type long long by using the LL suffix, and you now need to shift by 64 - n, not by 32 - n:
#include <stdio.h>
int fitsBits(long long x, int n) {
long long r;
int c;
c = 65 + ~n;
r = !(((x << c)>>c)^x);
return r;
}
int main() {
printf("%d\n", fitsBits(0x80000000LL, 0x20));
return 0;
}
Link to IDEONE: http://ideone.com/G8I3kZ
Left shifts that cause overflow are undefined for signed types. Hence the compiler may optimise (x<<c)>>c as simply x, and the entire function reduces down to return 1;.
Probably you want to use unsigned types.
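A second possible cause of undefined behavior in your code is the shift count: c can be greater than or equal to the width of int if n falls outside the documented range (for example, n = 0 gives c = 32). A shift by an amount greater than or equal to the width of the integer type is undefined behavior.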
r = (((x << c)>>c)^x); //This will give you 0, meaning r = 0;
OR
r = !((x << c)>>c);
Your function can be simplified to
int fitsBits(int x) {
int r, c;
c = 33;
r = (((x << c)>>c)^x);
return r;
}
Note that when NOT (!) is applied, you are asking for the opposite of r.
This question is regarding the modulo operator %. We know in general a % b returns the remainder when a is divided by b and the remainder is greater than or equal to zero and strictly less than b. But does the above hold when a and b are of magnitude 10^9 ?
I seem to be getting a negative output for the following code for input:
74 41 28
However, changing the final output statement fixes it and the result becomes correct!
#include<iostream>
using namespace std;
#define m 1000000007
int main(){
int n,k,d;
cin>>n>>k>>d;
if(d>n)
cout<<0<<endl;
else
{
long long *dp1 = new long long[n+1], *dp2 = new long long[n+1];
//build dp1:
dp1[0] = 1;
dp1[1] = 1;
for(int r=2;r<=n;r++)
{
dp1[r] = (2 * dp1[r-1]) % m;
if(r>=k+1) dp1[r] -= dp1[r-k-1];
dp1[r] %= m;
}
//build dp2:
for(int r=0;r<d;r++) dp2[r] = 0;
dp2[d] = 1;
for(int r = d+1;r<=n;r++)
{
dp2[r] = ((2*dp2[r-1]) - dp2[r-d] + dp1[r-d]) % m;
if(r>=k+1) dp2[r] -= dp1[r-k-1];
dp2[r] %= m;
}
cout<<dp2[n]<<endl;
}
}
changing the final output statement to:
if(dp2[n]<0) cout<<dp2[n]+m<<endl;
else cout<<dp2[n]<<endl;
makes it work, but why was that required?
By the way, the code is actually my solution to this question
This is a limit imposed by the range of int.
A 32-bit int can only hold values from -2,147,483,648 to 2,147,483,647.
Consider using long long for your m, n, k, d and r variables. If your calculations should never produce a negative value, use unsigned long long if possible.
long long can hold values from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807,
while unsigned long long can hold values from 0 to 18,446,744,073,709,551,615 (2^64 - 1).
The range of positive values is approximately halved in signed types compared to unsigned types, because the most significant bit is used for the sign. When you try to store a positive value greater than the range of the data type, the most significant bit is set and the value is interpreted as negative.
Well, no, modulo with positive operands does not produce negative results.
However .....
The int type is only guaranteed by the C standard to support values in the range -32767 to 32767, which means your macro m does not necessarily expand to a literal of type int. It will fit in a long though (which is guaranteed to have a large enough range).
If that's happening (e.g. with a compiler that has a 16-bit int type and a 32-bit long type), the results of your modulo operations will be computed as long, and may have values that exceed what an int can represent. Converting such a value to an int (as would be required by statements like dp1[r] %= m if dp1 pointed to int; in the code as posted it points to long long) gives an implementation-defined result.
Mathematically, there is nothing special about big numbers, but computers only have a limited width to write down numbers in, so when things get too big you get "overflow" errors. A common analogy is the counter of miles traveled on a car dashboard - eventually it will show as all 9s and roll round to 0. Because of the way negative numbers are handled, standard signed integers don't roll round to zero, but to a very large negative number.
You need to switch to larger variable types so that they overflow less quickly - "long int" or "long long int" instead of just "int", the range doubling with each extra bit of width. You can also use unsigned types for a further doubling, since no range is used for negatives.
Earlier I ran into a problem, which I solved, but it came back to puzzle me later.
Let's take a look at an example similar to what I was working on:
int b = 35000000; //35million
int a = 30000000;
unsigned long n = ( 100 * a ) / b;
Output: 4294967260
I simply changed a to unsigned long and the correct output of 85 (85%) came up, because a was a signed 32-bit integer. But this puzzled me later. There is no assignment to a during ( 100 * a ); it is simply a calculation, and the correct value, which is 3 billion, should come up instead of an overflow. To check whether there really was no assignment to a, I removed a from the code and wrote the value in manually instead:
int b = 35000000;
unsigned long n = ( 100 * 30000000 ) / b;
The big surprise was that the output is also: 4294967260
And of course value of 3billion can be assigned to an unsigned long.
My first thought was that ( 100 * 30000000 ) was causing an overflow, but then I asked: an overflow on what? There is nothing to be overflowed.
Then I changed b to unsigned long and, even more surprisingly, the output was the correct 85.
In the first example changing a to unsigned long
int b = 35000000;
unsigned long a = 30000000;
unsigned long n = ( 100 * a ) / b;
and leaving b as an int works, but in the second example it doesn't. What is occurring?
This might be a little overwhelming, so let me rewrite all the examples, grouped into the ones that work and the ones that don't.
Works (Output = 85):
int b = 35000000;
unsigned long a = 30000000;
unsigned long n = ( 100 * a ) / b;
Works (Output = 85):
unsigned long b= 35000000;
unsigned long n = ( 100 * 30000000 ) / b;
Doesn't work (Overflow):
int b = 35000000;
int a = 30000000;
unsigned long n = ( 100 * a ) / b;
Doesn't work (Overflow):
int b = 35000000;
unsigned long n = ( 100 * 30000000 ) / b;
Let me explain what is occurring here.
On:
int b= 35000000;
unsigned long n = ( 100 * 30000000 ) / b;
The value is incorrect because overflow happens at ( 100 * 30000000 )
But on:
unsigned long b= 35000000;
unsigned long n = ( 100 * 30000000 ) / b;
The value is correct, so what is happening?
In the first example b is an int. As Tony said, an overflow happens because 100 is an int and 30000000 is also an int, so the temporary value of ( 100 * 30000000 ) is computed as a 32-bit signed integer. Because b is also an int, every operand on the right side is an int, the division is done in int as well, and the wrapped result is only converted to unsigned long at the assignment. In the second example the multiplication still overflows in int, but because b is an unsigned long, the wrapped product is converted to unsigned long before the division. With a 32-bit unsigned long that conversion happens to restore 3000000000, so the output is 85.
In C++, there are programming elements called "literal constants".
For example (taken from here):
157 // integer constant
0xFE // integer constant
'c' // character constant
0.2 // floating constant
0.2E-01 // floating constant
"dog" // string literal
So, back to your example, 100 * 30000000 is multiplying two ints together. That is why there is overflow. Any time you perform arithmetic operations on operands of the same type (int or wider), you get a result of the same type. Also, in the snippet unsigned long a = 30000000;, you are taking the integer constant 30000000 and converting it to unsigned long when it is assigned to the variable a.
To get your desired output, add the ul suffix to the end: unsigned long n = ( 100ul * 30000000ul ) / b;.
Here is a site that has explanations for the suffixes.
why /b when b is unsigned long is still an interesting question
Because 100 * 30000000 is performed before you divide by b and the operands are both of type int.
The maximum number that can be represented in a 32-bit signed integer without overflow is 2147483647. 100*30000000 is larger than that.
The type of an arithmetic operation is completely independent of the type of the variable you're storing it into. It's based on the type of the operands. If both operands are of type int, the result will be of type int too, and that result will then be converted before it is stored in the variable.
Works (Output = 85):
unsigned long b= 35000000;
unsigned long n = ( 100 * 30000000 ) / b;
Not here, using:
#include <iostream>
int main() {
unsigned long b= 35000000;
unsigned long n = ( 100 * 30000000 ) / b;
std::cout << n << std::endl;
return 0;
}
the output is 527049830640 (and the compiler warned about the overflow even with the default warning level).
The point is that, as Mark Ransom already wrote, the type of an arithmetic operation is determined by the type of its operands.
The type of the constant 100 is int, as is the type of the constant 30000000 (assuming 32-bit or larger ints, would be long int if int is 16 bits). So the multiplication is performed at type int, and with 32-bit ints it overflows. The overflow is undefined behaviour, but wrap-around is the most common manifestation of that undefined behaviour, resulting in the value -1294967296. Then the result of the multiplication is converted to the type of b (since that is an unsigned type and - in C terminology - its integer conversion rank is not smaller than that of int) for the division.
Conversion to an unsigned integer type means reduction modulo 2^WIDTH. If the width of unsigned long is 32, the result of that last conversion is 2^32 - 1294967296 = 3000000000, resulting in the quotient 85. But if - as on my system - the width of unsigned long is 64 bits, the result of that conversion is 2^64 - 1294967296 = 18446744072414584320.
Another common solution is to cast one of the constants to the larger result type prior to operating on it. I prefer this method, since not everyone remembers all the possible suffixes. Myself included.
In this case, I'd use:
unsigned long n = ( (unsigned long)100 * 30000000 ) / b;
The sad part is that this is one thing assembly language—yes, assembly language—gets right that C, C++, and many other languages do not: The result of multiplying an M-bit integer by an N-bit integer is a (M+N)-bit integer, not a (max(M, N))-bit integer.
EDIT: Mark makes an interesting point: the compiler does not "look ahead" to where the result is stored in order to infer a result type. Thus, C++ demands that the result of any sub-expression, by itself, be deterministic. In other words, the exact type of 100 * 30000000 can always be determined without looking at any other piece of code.
I'm doing a program that calculates the probability of lotteries.
Specification is choose 5 numbers out of 47 and 1 out of 27
So I did the following:
#include <iostream>
long int choose(unsigned n, unsigned k);
long int factorial(unsigned n);
int main(){
using namespace std;
long int regularProb, megaProb;
regularProb = choose(47, 5);
megaProb = choose(27, 1);
cout << "The probability of the correct number is 1 out of " << (regularProb * megaProb) << endl;
return 0;
}
long int choose(unsigned n, unsigned k){
return factorial(n) / (factorial(k) * factorial(n-k));
}
long int factorial(unsigned n){
long int result = 1;
for (int i=2;i<=n;i++) result *= i;
return result;
}
However, the program doesn't work. It calculates for 30 seconds, then gives me: Process 4 exited with code -1,073,741,676. I have to change all the long int to long double, but that loses precision. Is it because long int is too short for the big values? I thought long int was 64 bits nowadays. My compiler is g++ win32 (64bit host).
Whether long is 64-bit or not depends on the model. Windows uses a 32-bit long. Use int64_t from <stdint.h> if you need to ensure it is 64-bit.
But even if long is 64-bit it is still too small to hold factorial(47).
47! == 2.58623242e+59
2^64 == 1.84467441e+19
although 47C5 is way smaller than that.
You should never use nCr == n!/(r! (n-r)!) directly to do the calculation, as it overflows easily. Instead, factor out n!/(n-r)! to get:
47C5 = (47 * 46 * 45 * 44 * 43) / (5 * 4 * 3 * 2 * 1)
this can be managed even by a 32-bit integer.
BTW, for @Coffee's question: a double only has 53 bits of precision, where 47! requires 154 bits. 47! and 42! represented in double would be
47! = (0b10100100110011011110001010000100011110111001100100100 << 145) ± (1 << 144)
42! = (0b11110000010101100000011101010010010001101100101001000 << 117) ± (1 << 116)
so 47! / (42! × 5!)'s possible range of value will be
0b101110110011111110011 = 1533939 53 bits
v
max = 0b101110110011111110011.000000000000000000000000000000001001111...
val = 0b101110110011111110010.111111111111111111111111111111111010100...
min = 0b101110110011111110010.111111111111111111111111111111101011010...
that's enough to get the exact value 47C5.
To get a 64-bit integer, you should use long long (as mentioned here).
KennyTM has it right, you're going to overflow no matter what type you use. You need to approach the problem more smartly and factor out lots of work. If you're ok with an approximate answer, then take a look at Stirling's approximation:
ln(n!) ~ n ln(n) - n
So if you have
n!/(k!*(n-k)!)
You could say that's
e^(ln(n!/(k!*(n-k)!)))
which after some math (double check to make sure I got it right) is
e^(n*ln(n)-k*ln(k)-(n-k)*ln(n-k))
And that shouldn't overflow (but it's an approximate answer)
It's easy to calculate binomial coefficients up to 47C5 and beyond without overflow, using standard unsigned long 32-bit arithmetic. See my response to this question: https://math.stackexchange.com/questions/34518/are-there-examples-where-mathematicians-needs-to-calculate-big-combinations/34530#comment-76389