Squaring a number in C++, Kaprekar numbers [duplicate] - c++

This question already has answers here:
Multiplication of two integers in C++
(3 answers)
Closed 6 years ago.
Found this issue in C++ while detecting Kaprekar numbers in a range. For the number 77778,
unsigned long long sq = pow(n, 2);
returns 6,049,417,284, while
unsigned long long sq = n * n;
returns 1,754,449,988.
Any ideas why? Is this some sort of overflow which pow avoids but a plain n * n does not?

Assuming your n is a typical int or unsigned int, the reason is that this line
unsigned long long sq = n * n;
is equivalent to
unsigned long long sq = (int)(n * n);
because n * n is evaluated first (with both operands as ints) before the result is assigned to sq. So this is an overflow problem (and welcome to Stack Overflow too!).
You will probably also want to read up on the terms overflow and casting, since they are very common issues in computing and understanding them early will be of great help!
This has nothing to do with Kaprekar numbers. On most machines nowadays int is 32 bits wide, so it can only hold values from -2,147,483,648 to 2,147,483,647 (or 0 to 4,294,967,295 for the unsigned counterpart).
Thus processing n * n will give you:
n * n = 6,049,417,284 - 4,294,967,296 = 1,754,449,988 //overflow at (4,294,967,295 + 1)!
If you cast beforehand:
unsigned int n = 77778;
unsigned long long sq = pow(n, 2);
unsigned long long sq2 = (unsigned long long)n * n; //note the casting here.
std::cout << sq << std::endl;
std::cout << sq2 << std::endl;
Then the results will be identical, since there won't be overflow.

Your n is declared as a 32-bit int. You need to either change it to long long or cast so that the multiplication itself is done in long long:
unsigned long long sq = (unsigned long long)n * n;
This will give the right answer.
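For completeness, here is a minimal, self-contained sketch of both the failing and the fixed computation (assuming n is an unsigned int, as in the question; the exact wrapped value assumes a 32-bit unsigned int):
#include <iostream>

int main() {
    unsigned int n = 77778;
    // The multiplication happens in unsigned int, wraps modulo 2^32,
    // and only then is the wrapped value widened to unsigned long long.
    unsigned long long wrong = n * n;
    // Widening one operand first makes the whole multiplication
    // happen in unsigned long long, so nothing wraps.
    unsigned long long right = (unsigned long long)n * n;
    std::cout << wrong << std::endl;  // 1754449988
    std::cout << right << std::endl;  // 6049417284
}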

I suspect that n is declared as unsigned int and your compiler uses a data model in which int is 32 bits wide. The maximum value that can be represented in this type is 2^32 - 1 = 4294967295. Anything beyond this value wraps around: assigning 4294967296 would become 0, 4294967297 would become 1, and so on.
You have an overflow; since both operands are unsigned int, the resulting type is the same too. The true mathematical result of the operation would be 6049417284, but because the expression has type unsigned int it wraps and becomes 1754449988 = 6049417284 - 4294967296. This unsigned int result is then assigned to the wider type unsigned long long, which doesn't change the value. It's necessary to understand the difference between the result's type (the type of the expression) and the destination type (the type of the variable that is going to hold the result).
Wrap-around behaviour (more formally, arithmetic modulo 2^N) in unsigned types is well defined in C++, so the compiler might not warn you.
Quote from Unsigned Arithmetic:
If an unsigned integer overflows, the result is defined modulo 2^w, where w is the number of bits in that particular unsigned integer. By implication, an unsigned integer is never negative.
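A tiny illustration of that modulo-2^w rule (the concrete numbers assume a 32-bit unsigned int):
#include <iostream>
#include <limits>

int main() {
    unsigned int u = std::numeric_limits<unsigned int>::max(); // 4294967295 for 32 bits
    u += 1;                       // well defined: wraps modulo 2^32
    std::cout << u << std::endl;  // prints 0
}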


Assign a negative number to an unsigned int

This code gives a meaningful output:
#include <iostream>
int main() {
    unsigned int ui = 100;
    unsigned int negative_ui = -22u;
    std::cout << ui + negative_ui << std::endl;
}
Output:
78
The variable negative_ui is initialized with -22, but it is an unsigned int.
My question is: why does unsigned int negative_ui = -22u; work?
How can an unsigned int store a negative number? Is it safe to use, or does this yield undefined behaviour?
I use the Intel compiler 18.0.3. With the option -Wall, no warnings occurred.
P.S. I have read What happens if I assign a negative value to an unsigned variable? and Why unsigned int contained negative number
How can an unsigned int store a negative number?
It doesn't. Instead, it stores a representable number that is congruent with that negative number modulo the number of all representable values. The same is also true with results that are larger than the largest representable value.
Is it safe to use, or does this yield undefined behaviour?
There is no UB. Unsigned arithmetic overflow is well defined.
It is safe to rely on the result. However, it can be brittle. For example, if you add -22u and 100ull, then you get UINT_MAX + 79 (i.e. a large value, assuming unsigned long long is a wider type than unsigned), which is congruent with 78 modulo UINT_MAX + 1 but is representable in unsigned long long (and not in unsigned), so it does not get reduced to 78.
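Here is a small sketch of both cases described above (the exact values assume a 32-bit unsigned int and a 64-bit unsigned long long):
#include <iostream>

int main() {
    // unsigned + unsigned: both operands are unsigned int,
    // so the sum wraps modulo 2^32 and yields 78.
    std::cout << 100u + -22u << std::endl;    // 78
    // unsigned long long + unsigned: -22u is converted to
    // unsigned long long (still the huge wrapped value), so
    // the sum is not wrapped back down to 78.
    std::cout << 100ull + -22u << std::endl;  // 4294967374, i.e. UINT_MAX + 79
}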
Note that signed arithmetic overflow is undefined.
Signed/unsigned is a convention. It uses the most significant bit of the variable (bit 31 in the case of a 32-bit x86 int). What you store in the variable takes the full bit width.
It's the calculations that follow that either treat the top bit as a sign indicator or ignore it. Therefore, any "unsigned" variable can hold the bit pattern of a negative value, and that pattern is interpreted as unsigned whenever the variable participates in a calculation.
unsigned int x = -1; // x is now 0xFFFFFFFF.
x -= 1; // x is now 0xFFFFFFFE.
if (x < 0) // false. x is compared as 0xFFFFFFFE.

int x = -1; // x stored as 0xFFFFFFFF
x -= 1; // x stored as 0xFFFFFFFE
if (x < 0) // true, x is compared as -2.
Technically valid, bad programming.

3 * 1000000000 overflows as an int, but the variable is long long. Why? [duplicate]

This question already has answers here:
long long is 8 bytes, but I get integer overflow?
(1 answer)
Why does long long n = 2000*2000*2000*2000; overflow?
(6 answers)
Closed 5 years ago.
I have a simple C++ app that performs the following calculations:
long long calcOne = 3 * 100000000; // 3e8, essentially
long long calcTwo = 3 * 1000000000; // 3e9, essentially
long long calcThree = 3 * 10000000000; // 3e10, essentially
If I write the result of each calculation I get the following output:
calcOne = 300000000
calcTwo = -1294967296
calcThree = 30000000000
So why does the second calculation fail? As far as I can tell it is within the limits of a long long type (calcThree was larger...).
I am using Visual Studio 2015 on Windows 10. Thanks in advance.
Integer constants are, by default, ints.
1000000000
That fits into an int, so this constant gets parsed as an int. But multiplying it by 3 overflows int.
10000000000
This is too big for an int, so this constant is a long long, and the resulting multiplication does not overflow.
Solution: explicitly use long long constants:
long long calcOne = 3 * 100000000LL; // 3e8, essentially
long long calcTwo = 3 * 1000000000LL; // 3e9, essentially
long long calcThree = 3 * 10000000000LL; // 3e10, essentially
What you do with a result doesn't affect how that result is calculated. So the fact that you store the result in a long long doesn't change the fact that the numbers you multiplied in the second line of code were not long longs and so they overflowed. In the third line of code, the constant is a long long, so the multiplication is performed on long longs.
The compiler saw this
long long calcOne = (int) 3 * (int) 100000000; // 3e8, essentially
long long calcTwo = (int) 3 * (int) 1000000000; // 3e9, essentially
long long calcThree = (int) 3 * (long long) 10000000000; // 3e10, essentially
And so the calcTwo right-hand value was inferred as an int type and then overflowed. You see the overflow as a negative value.
long long calcOne = 3LL * 100000000LL; // 3e8, essentially
long long calcTwo = 3LL * 1000000000LL; // 3e9, essentially
long long calcThree = 3LL * 10000000000LL; // 3e10, essentially
To avoid this in the future, be explicit about the types of your constant values: to tell the compiler a number is a long long, suffix it with LL, as above.
Most programming languages rank number types by size. The size/rank/type of a numeric expression is (usually) the type of the highest-ranked value in the expression.
Example: int * double -> double
Your program has:
long long int = int * int.
What's happening is that the result of int * int is an int. So your program multiplies first and treats the result as a signed integer (maximum value of roughly 2 billion, so it wraps around into negative numbers). Then this negative value gets stored in the long long int.
300 million (your first multiplication) fits in an int. No problem there. The third works properly because the literal 10000000000 doesn't fit in a 32-bit int, so the compiler automatically gives it a 64-bit type, and the multiplication is done in that type.
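One way to see this for yourself is to compare the sizes of the literals and of a suffixed version (the sizes shown assume a typical platform with 32-bit int and 64-bit long long):
#include <iostream>

int main() {
    // 1000000000 fits in int; 10000000000 does not and automatically gets a wider type.
    std::cout << sizeof(1000000000) << std::endl;    // usually 4
    std::cout << sizeof(10000000000) << std::endl;   // usually 8
    std::cout << sizeof(1000000000LL) << std::endl;  // 8: the LL suffix forces long long
    long long calcTwo = 3 * 1000000000LL;            // multiplication done in long long
    std::cout << calcTwo << std::endl;               // 3000000000
}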

Can n %= m ever return negative value for very large nonnegative n and m?

This question is regarding the modulo operator %. We know that, in general, a % b returns the remainder when a is divided by b, and the remainder is greater than or equal to zero and strictly less than b. But does the above hold when a and b are of magnitude 10^9?
I seem to be getting a negative output for the following code for the input:
74 41 28
However, changing the final output statement does the trick and the result becomes correct!
#include <iostream>
using namespace std;
#define m 1000000007
int main() {
    int n, k, d;
    cin >> n >> k >> d;
    if (d > n)
        cout << 0 << endl;
    else
    {
        long long *dp1 = new long long[n+1], *dp2 = new long long[n+1];
        // build dp1:
        dp1[0] = 1;
        dp1[1] = 1;
        for (int r = 2; r <= n; r++)
        {
            dp1[r] = (2 * dp1[r-1]) % m;
            if (r >= k+1) dp1[r] -= dp1[r-k-1];
            dp1[r] %= m;
        }
        // build dp2:
        for (int r = 0; r < d; r++) dp2[r] = 0;
        dp2[d] = 1;
        for (int r = d+1; r <= n; r++)
        {
            dp2[r] = ((2*dp2[r-1]) - dp2[r-d] + dp1[r-d]) % m;
            if (r >= k+1) dp2[r] -= dp1[r-k-1];
            dp2[r] %= m;
        }
        cout << dp2[n] << endl;
    }
}
changing the final output statement to:
if(dp2[n]<0) cout<<dp2[n]+m<<endl;
else cout<<dp2[n]<<endl;
does the trick, but why was it required?
By the way, the code is actually my solution to this question
This is a limit imposed by the range of int.
int can only hold values between -2,147,483,648 and 2,147,483,647.
Consider using long long for your m, n, k, d and r variables. If possible, use unsigned long long if your calculations should never produce a negative value.
long long can hold values from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807,
while unsigned long long can hold values from 0 to 18,446,744,073,709,551,615 (2^64 - 1).
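If you don't want to memorize those bounds, you can ask the compiler directly via std::numeric_limits; a minimal sketch:
#include <iostream>
#include <limits>

int main() {
    std::cout << std::numeric_limits<int>::min() << " to "
              << std::numeric_limits<int>::max() << std::endl;
    std::cout << std::numeric_limits<long long>::min() << " to "
              << std::numeric_limits<long long>::max() << std::endl;
    std::cout << "0 to "
              << std::numeric_limits<unsigned long long>::max() << std::endl;
}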
The range of positive values is approximately halved in signed types compared to unsigned types, because the most significant bit is used for the sign. When you try to assign a positive value greater than the range of the specified data type, the most significant bit ends up set and the value gets interpreted as negative.
Well, no, modulo with positive operands does not produce negative results.
However .....
The int type is only guaranteed by the C and C++ standards to support values in the range -32767 to 32767, which means your macro m does not necessarily expand to a literal of type int. It will fit in a long, though (which is guaranteed to have a large enough range).
If that's happening (e.g. a compiler that has a 16-bit int type and a 32-bit long type) the results of your modulo operations will be computed as long, and may have values that exceed what an int can represent. Converting that value to an int (as will be required with statements like dp1[r] %= m since dp1 is a pointer to int) gives undefined behaviour.
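A side note that may also explain why the + m correction was needed: in C++ (guaranteed since C++11) the result of % takes the sign of the dividend, and the subtraction dp2[r] -= dp1[r-k-1] in the question's code can make the dividend negative before the %= m. A minimal sketch of that sign rule:
#include <iostream>

int main() {
    const long long m = 1000000007;
    std::cout << 5 % 3 << std::endl;   // 2: positive operands give a non-negative result
    std::cout << -5 % 3 << std::endl;  // -2: the remainder takes the sign of the dividend
    long long v = -123456789;          // e.g. an intermediate value that went negative
    long long r = ((v % m) + m) % m;   // common idiom to force a result in [0, m)
    std::cout << r << std::endl;       // 876543218
}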
Mathematically, there is nothing special about big numbers, but computers only have a limited width to write down numbers in, so when things get too big you get "overflow" errors. A common analogy is the counter of miles traveled on a car dashboard - eventually it will show as all 9s and roll round to 0. Because of the way negative numbers are handled, standard signed integers don't roll round to zero, but to a very large negative number.
You need to switch to larger variable types so that they overflow less quickly - "long int" or "long long int" instead of just "int", the range doubling with each extra bit of width. You can also use unsigned types for a further doubling, since no range is used for negatives.

unsigned long long won't store big numbers

I'm confused by the C/C++ unsigned long long type, because theoretically it should store values up to 2^64 - 1, which is a 20-digit decimal number, but the following code:
unsigned int x = 1000000u; //(One million)
unsigned long long k = (x*x);
cout << k << endl;
prints out 3567587328, which is not correct.
Now 1,000,000^2 is 1,000,000,000,000, a 13-digit number, way below the limit of even signed long long. How could this happen?
Does it have anything to do with the system I am running? (32-bit Ubuntu)
If I need a 64-bit system to perform a 64-bit operation, then another question arises:
Most compilers use a linear congruential generator to produce random numbers, as follows:
x(t) = (a*x(t-1) + c) mod m.
a and c are usually big 32-bit numbers, and m is 2^32-1.
So there is a big chance that a*x(t-1) results in a 64-bit number before the modulo operation is carried out.
If a 64-bit system is needed, then how could gcc have generated random numbers since the 1990s on 16- and 32-bit machines?
Thanks a million.
Sure k is unsigned long long, but x is unsigned int and hence so is x*x. Your expression is calculated as an unsigned int, which results in the usual wraparound when going over the limits of unsigned types. After the damage is done, it is converted to an unsigned long long.
Possible fixes:
make x an unsigned long long
unsigned long long k = ((unsigned long long)x*(unsigned long long)x);
unsigned long long k = (1ULL*x*x);
x is unsigned int --> x*x is unsigned int as well. If the result of the multiplication exceeds the maximum value of unsigned int, wraparound occurs. Only after these operations is the result assigned to the receiving variable (k). If you want the result to be unsigned long long, you need to promote at least one of the operands to this type, e.g.: unsigned long long k = (unsigned long long)x * x;.
Regarding your second question: compilers usually do not generate random numbers; that is done at runtime. I'm not sure where you got the formula x(t) = (a*x(t-1) + c) mod m. Assuming this is indeed the formula, there are ways to keep the intermediate results bounded: the modulo operation can be applied to any operand or intermediate result without changing the outcome. Therefore x(t) = (a*x(t-1) + c) mod m = ((a mod m) * (x(t-1) mod m) + c mod m) mod m.
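As a sketch of how the intermediate product can be kept from overflowing with a 32-bit state: even after reducing the operands mod m, the product of two 32-bit values can still need up to 64 bits, so the usual trick is to do just the multiplication in a 64-bit type. The constants below are only illustrative, not the ones any particular compiler or library uses:
#include <cstdint>
#include <iostream>

// One LCG step with a 32-bit state: widen the multiplication to 64 bits
// so a * x cannot overflow, then reduce modulo m at the end.
std::uint32_t lcg_step(std::uint32_t x) {
    const std::uint64_t a = 1664525;      // illustrative multiplier
    const std::uint64_t c = 1013904223;   // illustrative increment
    const std::uint64_t m = 4294967291u;  // illustrative modulus just below 2^32
    return static_cast<std::uint32_t>((a * x + c) % m);
}

int main() {
    std::uint32_t x = 12345;              // seed
    for (int i = 0; i < 5; ++i) {
        x = lcg_step(x);
        std::cout << x << std::endl;
    }
}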
When you multiply an unsigned int by an unsigned int on the right side, the result is an unsigned int. As such it has the same limits as the two numbers being multiplied, regardless of the fact that this value is subsequently assigned to an unsigned long long.
However, if you cast the unsigned int variables to unsigned long long, the result will be an unsigned long long and the value will not be limited to the size of an unsigned int.
unsigned long long k = (((unsigned long long)x)*((unsigned long long)x));
That should give you the result you want.

Why overflow happens in a calculation depending on the operands' data types, even when the type the result is assigned to can hold the value

Earlier I came across something which I solved, but it kept bugging me afterwards.
Let's take a look at an example similar to what I was working on:
int b = 35000000; //35million
int a = 30000000;
unsigned long n = ( 100 * a ) / b;
Output: 4294967260
I simply changed a to unsigned long and the correct output of 85 came up, because a was a signed 32-bit integer. But this kept bugging me. There is no value being assigned to a during ( 100 * a ); there is simply a calculation, and the correct value, which is 3 billion, should come up instead of an overflow. To check whether there really was no assignment to a, I removed a from the code and wrote the value manually instead:
int b = 35000000;
unsigned long n = ( 100 * 30000000 ) / b;
The big surprise was that the output is also: 4294967260
And of course a value of 3 billion can be assigned to an unsigned long.
My first thought was that ( 100 * 30000000 ) was causing an overflow, but then I asked myself: "an overflow of what? There is nothing to overflow."
Then I changed b to unsigned long, and, even more surprisingly, the output was the correct 85.
In the first example, changing a to unsigned long
int b = 35000000;
unsigned long a = 30000000;
unsigned long n = ( 100 * a ) / b;
and leaving b as an int works, but in the second example it doesn't. What is occurring?
This might be a little overwhelming, so let me re-write all the examples, both the ones that work and the ones that don't.
Works (Output = 85):
int b = 35000000;
unsigned long a = 30000000;
unsigned long n = ( 100 * a ) / b;
Works (Output = 85):
unsigned long b= 35000000;
unsigned long n = ( 100 * 30000000 ) / b;
Doesn't work (Overflow):
int b = 35000000;
int a = 30000000;
unsigned long n = ( 100 * a ) / b;
Doesn't work (Overflow):
int b = 35000000;
unsigned long n = ( 100 * 30000000 ) / b;
Let me explain what is occurring here.
On:
int b= 35000000;
unsigned long n = ( 100 * 30000000 ) / b;
The value is incorrect because overflow happens at ( 100 * 30000000 )
But on:
unsigned long b= 35000000;
unsigned long n = ( 100 * 30000000 ) / b;
The value is correct, so what is happening?
In the first example b is an int. As said by Tony, an overflow happens because the temporary value of ( 100 * 30000000 ) is computed in a type that can only hold 32-bit signed integers; that happens because 100 is an int and 30000000 is also an int AND because b is also an int. When ALL the values on the right side are int, the expression is evaluated entirely in int. But when a mighty unsigned long comes to the party, dividing an int by an unsigned long (/ b) is not done directly, so the value of ( 100 * 30000000 ) gets converted to an unsigned long before the division.
In C++, there are programming elements called "literal constants".
For example (taken from here):
157 // integer constant
0xFE // integer constant
'c' // character constant
0.2 // floating constant
0.2E-01 // floating constant
"dog" // string literal
So, back to your example, 100 * 30000000 is multiplying two ints together. That is why there is overflow. Anytime you perform arithmetic operations on operands of the same type, you get a result of the same type. Also, in the snippet unsigned long a = 30000000;, you are taking an integer constant 30000000 and assigning that to the variable a of type unsigned long.
To get your desired output, add the ul suffix to the end: unsigned long n = ( 100ul * 30000000ul ) / b;.
Here is a site that has explanations for the suffixes.
why /b when b is unsigned long is still an interesting question
Because 100 * 30000000 is performed before you divide by b and the operands are both of type int.
The maximum number that can be represented in a 32-bit signed integer without overflow is 2147483647. 100*30000000 is larger than that.
The type of an arithmetic operation is completely independent of the type of the variable you're storing it into. It's based on the type of the operands. If both operands are of type int, the result will be of type int too, and that result will then be converted before it is stored in the variable.
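A small sketch of that rule, assuming a 32-bit int: the type of each sub-expression is fixed by its operands alone, and the destination on the left never changes how the right-hand side is evaluated.
#include <iostream>
#include <type_traits>

int main() {
    int b = 35000000;
    // The type of each expression is determined by its operands only:
    static_assert(std::is_same<decltype(100 * 30000000), int>::value,
                  "int * int is int, no matter where the result is stored");
    static_assert(std::is_same<decltype(100LL * 30000000), long long>::value,
                  "widening one operand widens the whole multiplication");
    unsigned long long n = (100LL * 30000000) / b;  // multiplication done in long long
    std::cout << n << std::endl;                     // 85
}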
Works (Output = 85):
unsigned long b= 35000000;
unsigned long n = ( 100 * 30000000 ) / b;
Not here, using:
#include <iostream>
int main() {
    unsigned long b = 35000000;
    unsigned long n = ( 100 * 30000000 ) / b;
    std::cout << n << std::endl;
    return 0;
}
the output is 527049830640 (and the compiler warned about the overflow even with the default warning level).
The point is that, as Mark Ransom already wrote, the type of an arithmetic operation is determined by the type of its operands.
The type of the constant 100 is int, as is the type of the constant 30000000 (assuming 32-bit or larger ints, would be long int if int is 16 bits). So the multiplication is performed at type int, and with 32-bit ints it overflows. The overflow is undefined behaviour, but wrap-around is the most common manifestation of that undefined behaviour, resulting in the value -1294967296. Then the result of the multiplication is converted to the type of b (since that is an unsigned type and - in C terminology - its integer conversion rank is not smaller than that of int) for the division.
Conversion to an unsigned integer type means reduction modulo 2^WIDTH. If the width of unsigned long is 32, the result of that last conversion is 2^32 - 1294967296 = 3000000000, resulting in the quotient 85. But if - as on my system - the width of unsigned long is 64 bits, the result of that conversion is 2^64 - 1294967296 = 18446744072414584320.
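Those reductions can be reproduced directly with fixed-width types; a minimal sketch:
#include <cstdint>
#include <iostream>

int main() {
    // The wrapped result of the int multiplication discussed above.
    long long wrapped = -1294967296LL;
    // Conversion to an unsigned type is reduction modulo 2^WIDTH.
    std::uint32_t as32 = static_cast<std::uint32_t>(wrapped);  // 2^32 - 1294967296
    std::uint64_t as64 = static_cast<std::uint64_t>(wrapped);  // 2^64 - 1294967296
    std::cout << as32 << std::endl;             // 3000000000
    std::cout << as64 << std::endl;             // 18446744072414584320
    std::cout << as32 / 35000000 << std::endl;  // 85
    std::cout << as64 / 35000000 << std::endl;  // 527049830640
}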
Another common solution is to typecast one of the constants to the larger, result type prior to operating on it. I prefer this method, since not everyone remembers all the possible suffixes. Including myself.
In this case, I'd use:
unsigned long n = ( (unsigned long)100 * 30000000 ) / b;
The sad part is that this is one thing assembly language—yes, assembly language—gets right that C, C++, and many other languages do not: The result of multiplying an M-bit integer by an N-bit integer is a (M+N)-bit integer, not a (max(M, N))-bit integer.
EDIT: Mark makes an interesting point: the compiler does not "look ahead" to where the result is stored in order to infer a result type. Thus, C++ demands that the result of any sub-expression, by itself, be deterministic. In other words, the exact type of 100 * 30000000 can always be determined without looking at any other piece of code.