Does modulus overflow? - c++

I know that (INT_MIN / -1) overflows, but (INT_MIN % -1) does not. At least this is what happens with two compilers, one pre-C++11 (VC++ 2010) and one post-C++11 (GCC 4.8.1):
int x = INT_MIN;
cout << x / -1 << endl;
cout << x % -1 << endl;
Gives:
-2147483648
0
Is this behavior defined by the standard, or is it implementation-defined?
Also, is there ANY OTHER case where the division operation would overflow?
And is there any case where the modulus operator would overflow?

According to the CERT C++ Secure Coding Standard, modulus can overflow. It says:
[...]Overflow can occur during a modulo operation when the dividend is equal to the minimum (negative) value for the signed integer type and the divisor is equal to -1.
and they recommend the following style of check to prevent overflow:
signed long sl1, sl2, result;
/* Initialize sl1 and sl2 */
if ((sl2 == 0) || ((sl1 == LONG_MIN) && (sl2 == -1))) {
    /* handle error condition */
}
else {
    result = sl1 % sl2;
}
The C++ draft standard, section 5.6 Multiplicative operators, paragraph 4, says (emphasis mine):
The binary / operator yields the quotient, and the binary % operator yields the remainder from the division of the first expression by the second. If the second operand of / or % is zero the behavior is undefined. For integral operands the / operator yields the algebraic quotient with any fractional part discarded;81 if the quotient a/b is representable in the type of the result, (a/b)*b + a%b is equal to a; otherwise, the behavior of both a/b and a%b is undefined.
The C version of the CERT document provides some more insight on how % works on some platforms and in some cases INT_MIN % -1 could produce a floating-point exception.
The logic for preventing overflow for / is the same as the above logic for %.
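As a minimal sketch of that same check applied to division (the helper name and signature are my own, following the CERT pattern quoted above):

#include <climits>

// Returns false instead of evaluating a division that would be undefined:
// a divisor of zero, or INT_MIN / -1, which overflows.
bool safe_divide(int numerator, int denominator, int &result) {
    if (denominator == 0 ||
        (numerator == INT_MIN && denominator == -1)) {
        return false;  /* handle error condition in the caller */
    }
    result = numerator / denominator;
    return true;
}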

It is not the modulo operation that is causing an overflow:
int x = INT_MIN;
int a = x / -1; // ints are two's complement, so this is effectively -INT_MIN, which overflows
int b = x % -1; // dividing anything by +/-1 leaves no remainder, so the result is always 0

Every modern implementation of the C++ language uses two's complement as the underlying representation scheme for integers. As a result, -INT_MIN is not representable as an int (INT_MAX is -(INT_MIN + 1)). There is overflow because the value x / -1 can't be represented as an int when x is INT_MIN.
Think about what % computes. If the result can be represented as an int, there won't be any overflow.

Related

Is ((a + (b & 255)) & 255) the same as ((a + b) & 255)?

I was browsing some C++ code, and found something like this:
(a + (b & 255)) & 255
The double AND annoyed me, so I thought of:
(a + b) & 255
(a and b are 32-bit unsigned integers)
I quickly wrote a test script (JS) to confirm my theory:
for (var i = 0; i < 100; i++) {
    var a = Math.ceil(Math.random() * 0xFFFF),
        b = Math.ceil(Math.random() * 0xFFFF);
    var expr1 = (a + (b & 255)) & 255,
        expr2 = (a + b) & 255;
    if (expr1 != expr2) {
        console.log("Numbers " + a + " and " + b + " mismatch!");
        break;
    }
}
While the script confirmed my hypothesis (both operations are equal), I still don't trust it, because 1) it's random and 2) I'm not a mathematician and have no idea what I'm doing.
Also, sorry for the Lisp-y title. Feel free to edit it.
They are the same. Here's a proof:
First note the identity (A + B) mod C = (A mod C + B mod C) mod C
Let's restate the problem by regarding a & 255 as standing in for a % 256. This is true since a is unsigned.
So (a + (b & 255)) & 255 is (a + (b % 256)) % 256
This is the same as (a % 256 + b % 256 % 256) % 256 (I've applied the identity stated above: note that mod and % are equivalent for unsigned types.)
This simplifies to (a % 256 + b % 256) % 256 which becomes (a + b) % 256 (reapplying the identity). You can then put the bitwise operator back to give
(a + b) & 255
completing the proof.
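As a quick illustration of that chain of equalities (a minimal sketch with arbitrarily chosen values, not part of the original proof):

#include <cassert>
#include <cstdint>

int main() {
    std::uint32_t a = 1000, b = 600;
    assert((b & 255u) == (b % 256u));                       // & 255 stands in for % 256
    assert(((a + (b & 255u)) & 255u) == ((a + b) & 255u));  // both sides evaluate to 64 here
}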
Lemma: a & 255 == a % 256 for unsigned a.
Unsigned a can be rewritten as m * 0x100 + b for some unsigned m and b with 0 <= b <= 0xff and 0 <= m <= 0xffffff. It follows from both definitions that a & 255 == b == a % 256.
Additionally, we need:
the distributive property: (a + b) mod n = [(a mod n) + (b mod n)] mod n
the definition of unsigned addition, mathematically: (a + b) ==> (a + b) % (2 ^ 32)
Thus:
(a + (b & 255)) & 255 = ((a + (b & 255)) % (2^32)) & 255 // def'n of addition
= ((a + (b % 256)) % (2^32)) % 256 // lemma
= (a + (b % 256)) % 256 // because 256 divides (2^32)
= ((a % 256) + (b % 256 % 256)) % 256 // Distributive
= ((a % 256) + (b % 256)) % 256 // a mod n mod n = a mod n
= (a + b) % 256 // Distributive again
= (a + b) & 255 // lemma
So yes, it is true. For 32-bit unsigned integers.
What about other integer types?
For 64-bit unsigned integers, all of the above applies just as well, just substituting 2^64 for 2^32.
For 8- and 16-bit unsigned integers, addition involves promotion to int. This int will definitely neither overflow nor be negative in any of these operations, so they all remain valid.
For signed integers, if either a+b or a+(b&255) overflows, it's undefined behavior. So the equality can't hold in general: there are cases where (a+b)&255 is undefined behavior but (a+(b&255))&255 isn't.
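A minimal sketch of that signed case (the example values are mine): with int operands, a + b can overflow while a + (b & 255) stays in range, so only one of the two expressions has defined behavior.

#include <climits>

int main() {
    int a = INT_MAX;
    int b = 256;
    int ok = (a + (b & 255)) & 255;   // b & 255 == 0, no overflow: well-defined
    // int bad = (a + b) & 255;       // a + b overflows int: undefined behavior
    (void)ok;
}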
In positional addition, subtraction and multiplication of unsigned numbers to produce unsigned results, more significant digits of the input don't affect less-significant digits of the result. This applies to binary arithmetic as much as it does to decimal arithmetic. It also applies to "twos complement" signed arithmetic, but not to sign-magnitude signed arithmetic.
However, we have to be careful when taking rules from binary arithmetic and applying them to C (I believe C++ has the same rules as C on this, but I'm not 100% sure), because C arithmetic has some arcane rules that can trip us up. Unsigned arithmetic in C follows simple binary wraparound rules, but signed arithmetic overflow is undefined behaviour. Worse, under some circumstances C will automatically "promote" an unsigned type to (signed) int.
Undefined behaviour in C can be especially insidious. A dumb compiler (or a compiler on a low optimisation level) is likely to do what you expect based on your understanding of binary arithmetic, while an optimising compiler may break your code in strange ways.
So getting back to the formula in the question, the equivalence depends on the operand types.
If they are unsigned integers whose size is greater than or equal to the size of int then the overflow behaviour of the addition operator is well-defined as simple binary wraparound. Whether or not we mask off the high 24 bits of one operand before the addition operation has no impact on the low bits of the result.
If they are unsigned integers whose size is less than int then they will be promoted to (signed) int. Overflow of signed integers is undefined behaviour, but at least on every platform I have encountered the difference in size between different integer types is large enough that a single addition of two promoted values will not cause overflow. So again we can fall back on the simple binary arithmetic argument to deem the statements equivalent.
If they are signed integers whose size is less than int then again overflow can't happen, and on twos-complement implementations we can rely on the standard binary arithmetic argument to say they are equivalent. On sign-magnitude or ones' complement implementations they would not be equivalent.
OTOH if a and b were signed integers whose size was greater than or equal to the size of int then even on twos complement implementations there are cases where one statement would be well-defined while the other would be undefined behaviour.
Yes, (a + b) & 255 is fine.
Remember addition in school? You add numbers digit by digit, and add a carry value to the next column of digits. There is no way for a later (more significant) column of digits to influence an already processed column. Because of this, it does not make a difference if you zero-out the digits only in the result, or also first in an argument.
The above is not always true; the C++ standard allows an implementation that would break this.
Such a Deathstation 9000 :-) would have to use a 33-bit int, if the OP meant unsigned short with "32-bit unsigned integers". If unsigned int was meant, the DS9K would have to use a 32-bit int, and a 32-bit unsigned int with a padding bit. (The unsigned integers are required to have the same size as their signed counterparts as per §3.9.1/3, and padding bits are allowed in §3.9.1/1.) Other combinations of sizes and padding bits would work too.
As far as I can tell, this is the only way to break it, because:
The integer representation must use a "purely binary" encoding scheme (§3.9.1/7 and the footnote); all bits except padding bits and the sign bit must contribute a value of 2^n
int promotion is allowed only if int can represent all the values of the source type (§4.5/1), so int must have at least 32 bits contributing to the value, plus a sign bit.
the int cannot have more than 32 value bits (not counting the sign bit), because otherwise the addition could not overflow.
You already have the smart answer: unsigned arithmetic is modulo arithmetic and therefore the results will hold, you can prove it mathematically...
One cool thing about computers, though, is that computers are fast. Indeed, they are so fast that enumerating all valid combinations of 32 bits is possible in a reasonable amount of time (don't try with 64 bits).
So, in your case, I personally like to just throw it at a computer; it takes me less time to convince myself that the program is correct than it takes to convince myself that the mathematical proof is correct and that I didn't overlook a detail in the specification1:
#include <cstdint>
#include <iostream>

int main() {
    std::uint64_t const MAX = std::uint64_t(1) << 32;
    for (std::uint64_t i = 0; i < MAX; ++i) {
        for (std::uint64_t j = 0; j < MAX; ++j) {
            std::uint32_t const a = static_cast<std::uint32_t>(i);
            std::uint32_t const b = static_cast<std::uint32_t>(j);
            auto const champion = (a + (b & 255)) & 255;
            auto const challenger = (a + b) & 255;
            if (champion == challenger) { continue; }
            std::cout << "a: " << a << ", b: " << b
                      << ", champion: " << champion
                      << ", challenger: " << challenger << "\n";
            return 1;
        }
    }
    std::cout << "Equality holds\n";
    return 0;
}
This enumerates through all possible values of a and b in the 32-bits space and checks whether the equality holds, or not. If it does not, it prints the case which didn't work, which you can use as a sanity check.
And, according to Clang: Equality holds.
Furthermore, given that the arithmetic rules are bit-width agnostic (above int bit-width), this equality will hold for any unsigned integer type of 32 bits or more, including 64 bits and 128 bits.
Note: how can this program enumerate all 64-bit patterns in a reasonable time frame? It cannot. The loops were optimized out. Otherwise we would all have died before execution terminated.
I initially only proved it for 16-bit unsigned integers; unfortunately C++ is an insane language where small integers (of smaller bit width than int) are first converted to int.
#include <cstdint>
#include <iostream>

int main() {
    unsigned const MAX = 65536;
    for (unsigned i = 0; i < MAX; ++i) {
        for (unsigned j = 0; j < MAX; ++j) {
            std::uint16_t const a = static_cast<std::uint16_t>(i);
            std::uint16_t const b = static_cast<std::uint16_t>(j);
            auto const champion = (a + (b & 255)) & 255;
            auto const challenger = (a + b) & 255;
            if (champion == challenger) { continue; }
            std::cout << "a: " << a << ", b: " << b << ", champion: "
                      << champion << ", challenger: " << challenger << "\n";
            return 1;
        }
    }
    std::cout << "Equality holds\n";
    return 0;
}
And once again, according to Clang: Equality holds.
Well, there you go :)
1 Of course, if a program ever inadvertently triggers Undefined Behavior, it would not prove much.
The quick answer is: both expressions are equivalent.
Since a and b are 32-bit unsigned integers, the result is the same even in case of overflow. Unsigned arithmetic guarantees this: a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
The long answer is: there are no known platforms where these expressions would differ, but the Standard does not guarantee it, because of the rules of integral promotion.
If the type of a and b (unsigned 32 bit integers) has a higher rank than int, the computation is performed as unsigned, modulo 2^32, and it yields the same defined result for both expressions for all values of a and b.
Conversely, if the type of a and b is smaller than int, both are promoted to int and the computation is performed using signed arithmetic, where overflow invokes undefined behavior.
If int has at least 33 value bits, neither of the above expressions can overflow, so the result is perfectly defined and has the same value for both expressions.
If int has exactly 32 value bits, the computation can overflow for both expressions, for example values a=0xFFFFFFFF and b=1 would cause an overflow in both expressions. In order to avoid this, you would need to write ((a & 255) + (b & 255)) & 255.
The good news is there are no such platforms1.
1 More precisely, no such real platform exists, but one could configure a DS9K to exhibit such behavior and still conform to the C Standard.
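A minimal sketch (my own) of the defensive form mentioned above: masking both operands before the addition keeps the sum at or below 510, which cannot overflow even if int had exactly 32 value bits.

#include <cstdint>

std::uint32_t low_byte_sum(std::uint32_t a, std::uint32_t b) {
    return ((a & 255u) + (b & 255u)) & 255u;
}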
Identical, assuming no overflow. Neither version is truly immune to overflow, but the double-AND version is more resistant to it. I am not aware of a system where an overflow in this case is a problem, but I can see the author doing this in case there is one.
Yes you can prove it with arithmetic, but there is a more intuitive answer.
When adding, every bit only influences those more significant than itself; never those less significant.
Therefore, whatever you do to the higher bits before the addition won't change the result, as long as you only keep bits less significant than the lowest bit modified.
The proof is trivial and left as an exercise for the reader
But to actually legitimize this as an answer, your first line of code says take the last 8 bits of b** (all higher bits of b set to zero) and add this to a and then take only the last 8 bits of the result setting all higher bits to zero.
The second line says add a and b and take the last 8 bits with all higher bits zero.
Only the last 8 bits are significant in the result. Therefore only the last 8 bits are significant in the input(s).
** last 8 bits = 8 LSB
Also it is interesting to note that the output would be equivalent to
char a = something;
char b = something;
return (unsigned int)(a + b);
As above, only the 8 LSB are significant, but the result is an unsigned int with all other bits zero. The a + b will overflow, producing the expected result.

How does C++ integer division work for limit and negative values?

I am facing some strange results with integer division in C++. I am trying to calculate this: -2147483648 / -1.
What I get is 3 different results in 3 different scenarios:
int foo(int numerator, int denominator) {
    int res = numerator / denominator; // produces SIGFPE, Arithmetic exception interrupt
    cout << res << endl;
}

int main() {
    int res = -2147483648 / -1;
    cout << res << endl;              // prints -2147483648
    cout << -2147483648 / -1 << endl; // prints 2147483648
    foo(-2147483648, -1);
    return 0;
}
Why does the integer division operation produce different results in different situations?
The expression -2147483648 / -1 is calculated by your compiler as 2147483648, in a data type that is wide enough to hold that value.
When this result is printed out directly, it prints the value correctly.
When the result is stored in res, it is converted to an int. An int appears to be 32 bits wide on your system. The value 2147483648 cannot be represented as a 32-bit signed integer, so the conversion causes an overflow. On your system, this overflow results in the value -2147483648 (likely it's using two's complement).
Finally, when trying to perform the division at runtime (in the foo function), the SIGFPE exception occurs due to the overflow (because the int datatype cannot represent the result).
Note that all three of these outcomes rely on platform-dependent behavior:
the fact that the compiler doesn't generate any errors (or other issues) when the literal calculation overflows and just uses a data type large enough to hold the result
the fact that the int overflow when storing the literal generates that specific value (and no other issues)
the fact that the SIGFPE exception is thrown when overflowing at runtime
int res = -2147483648 / -1;
cout << res << endl; // prints -2147483648
cout << -2147483648 / -1 << endl; // prints 2147483648
int res = numerator / denominator; // produces SIGFPE, Arithmetic exception interrupt
Note that there are no negative integer literals. Expressions such as -1 apply the unary minus operator to the value represented by the literal, which may involve implicit type conversions.
The literal 2147483648 is larger than the max value of int, so its type will be long (or long long, depending on the implementation). Then -2147483648's type is long, and the result of the calculation (-2147483648 / -1) is long too.
For the 1st case, the result 2147483648 of type long is implicitly converted to int, but it's larger than the max value of int, so the result of the conversion is implementation-defined. (It seems the result is wrapped around according to the rules of the representation (2's complement) here, so you get the result -2147483648.)
For the 2nd case, the result with type long is printed out directly, so you get the correct result.
For the 3rd case, you're doing the calculation on two ints, and the result can't fit in the result type (i.e. int); signed integer arithmetic overflows and the behavior is undefined. (It produces SIGFPE, an arithmetic exception interrupt, here.)
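A minimal sketch (assuming a typical LP64 platform where long is 64 bits wide) that makes the wider type visible:

#include <iostream>
#include <type_traits>

int main() {
    auto r = -2147483648 / -1;   // 2147483648 doesn't fit in int, so the literal is long (or long long)
    std::cout << r << "\n";      // prints 2147483648
    std::cout << std::boolalpha
              << std::is_same<decltype(r), long>::value << "\n";  // true on LP64 systems
}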
Your outcome might be INT_MAX+1, in other words it probably overflows. That is Undefined Behavior, and anything can happen. For instance, a compiler may reject the code outright.
(A system might have INT_MAX >= 2147483648, but then you would expect the same result for your 3 testcases)

modulo with unsigned int

In C++ I am trying to use the modulo operator on two unsigned int variables, as in Marsaglia's multiply-with-carry algorithm.
The results seem right, but I'm not sure about the limitations of modulo.
m_upperBits = (36969 * (m_upperBits & 65535) + (m_upperBits >> 16))<<16;
m_lowerBits = 18000 * (m_lowerBits & 65535) + (m_lowerBits >> 16);
unsigned int sum = m_upperBits + m_lowerBits; /* 32-bit result */
unsigned int mod = (max-min+1);
int result=min+sum%mod;
Reference: C++03, clause 5.6, paragraph 4:
The binary / operator yields the quotient, and the binary % operator yields the remainder from the division of the first expression by the second. If the second operand of / or % is zero the behavior is undefined;
otherwise (a/b)*b + a%b is equal to a. If both operands are nonnegative then the remainder is nonnegative; if not, the sign of the remainder is implementation-defined.
I don't see any limitations of modulo in C.
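For completeness, here is a minimal self-contained sketch of the snippet from the question (the seed values and the [min, max] range are assumptions for illustration; the logic mirrors the question's code as written). Unsigned arithmetic wraps modulo 2^32, and % on unsigned operands is always well-defined as long as the divisor is nonzero:

#include <iostream>

unsigned int m_upperBits = 362436069u;  // assumed seed values
unsigned int m_lowerBits = 521288629u;

int nextInRange(int min, int max) {
    m_upperBits = (36969u * (m_upperBits & 65535u) + (m_upperBits >> 16)) << 16;
    m_lowerBits =  18000u * (m_lowerBits & 65535u) + (m_lowerBits >> 16);
    unsigned int sum = m_upperBits + m_lowerBits;            // wraps modulo 2^32, well-defined
    unsigned int mod = static_cast<unsigned int>(max - min + 1);
    return min + static_cast<int>(sum % mod);                // mod must be nonzero
}

int main() {
    for (int i = 0; i < 5; ++i)
        std::cout << nextInRange(1, 6) << "\n";
}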

operator modulo change in c++ 11? [duplicate]

Possible Duplicate:
C++ operator % guarantees
In C++98/03, 5.6/4:
The binary / operator yields the quotient, and the binary % operator
yields the remainder from the division of the first expression by the
second. If the second operand of / or % is zero the behavior is
undefined; otherwise (a/b)*b + a%b is equal to a. If both operands
are nonnegative then the remainder is nonnegative; if not, the sign of
the remainder is implementation-defined.
In C++11, 5.6/4:
The binary / operator yields the quotient, and the binary % operator
yields the remainder from the division of the first expression by the
second. If the second operand of / or % is zero the behavior is
undefined. For integral operands the / operator yields the algebraic
quotient with any fractional part discarded;81 if the quotient a/b is
representable in the type of the result, (a/b)*b + a%b is equal to a.
As you can see, the implementation-defined wording for the sign of the remainder is missing. What happened to it?
The behaviour of % was tightened in C++11, and is now fully specified (apart from division by 0 and the case where the quotient is not representable).
The combination of truncation towards zero and the identity (a/b)*b + a%b == a implies that a%b is always positive for positive a and negative for negative a.
The mathematical reason for this is as follows:
Let ÷ be mathematical division, and / be C++ division.
For any a and b, we have a÷b = a/b + f (where f is the fractional part), and from the standard, we also have (a/b)*b + a%b == a.
a/b is known to truncate towards 0, so we know that the fractional part will always be positive if a÷b is positive, and negative if a÷b is negative:
sign(f) == sign(a)*sign(b)
a÷b = a/b + f can be rearranged to give a/b = a÷b - f. a can be expanded as (a÷b)*b:
(a/b)*b + a%b == a => (a÷b - f)*b+a%b == (a÷b)*b.
Now the left hand side can also be expanded:
(a÷b)*b - f*b + a%b == (a÷b)*b
a%b == f*b
Recall from earlier that sign(f)==sign(a)*sign(b), so:
sign(a%b) == sign(f*b) == sign(a)*sign(b)*sign(b) == sign(a)
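A minimal sketch (example values mine) showing the C++11 rule in practice: the sign of a % b follows the sign of a.

#include <iostream>

int main() {
    std::cout << ( 7 %  3) << "\n";   //  1
    std::cout << (-7 %  3) << "\n";   // -1
    std::cout << ( 7 % -3) << "\n";   //  1
    std::cout << (-7 % -3) << "\n";   // -1
}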
The algorithm says (a/b)*b + a%b = a, which is easier to read if you remember that it's truncate(a/b)*b + a%b = a. Using algebra, a%b = a - truncate(a/b)*b. That is to say, f(a,b) = a - truncate(a/b)*b. For what values is f(a,b) < 0?
It doesn't matter whether b is negative or positive: its sign cancels itself out, because b appears in both the quotient and the product. Even if truncate(a/b) = 0 and b is negative, the product is 0, so b's sign has no effect.
Therefore, it is only the sign of a that determines the sign of f(a,b), or a%b.

Why is such complex code emitted for dividing a signed integer by a power of two?

When I compile this code with VC++10:
DWORD ran = rand();
return ran / 4096;
I get this disassembly:
299: {
300: DWORD ran = rand();
00403940 call dword ptr [__imp__rand (4050C0h)]
301: return ran / 4096;
00403946 shr eax,0Ch
302: }
00403949 ret
which is clean and concise and replaced a division by a power of two with a logical right shift.
Yet when I compile this code:
int ran = rand();
return ran / 4096;
I get this disassembly:
299: {
300: int ran = rand();
00403940 call dword ptr [__imp__rand (4050C0h)]
301: return ran / 4096;
00403946 cdq
00403947 and edx,0FFFh
0040394D add eax,edx
0040394F sar eax,0Ch
302: }
00403952 ret
that performs some manipulations before doing a right arithmetic shift.
What's the need for those extra manipulations? Why is an arithmetic shift not enough?
The reason is that unsigned division by 2^n can be implemented very simply, whereas signed division is somewhat more complex.
unsigned int u;
int v;
u / 4096 is equivalent to u >> 12 for all possible values of u.
v / 4096 is NOT equivalent to v >> 12 - it breaks down when v < 0, as the rounding direction is different for shifting versus division when negative numbers are involved.
the "extra manipulations" compensate for the fact that arithmetic right-shift rounds the result toward negative infinity, whereas division rounds the result towards zero.
For example, -1 >> 1 is -1, whereas -1/2 is 0.
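A minimal sketch (my own, not from the answer) of that adjustment written out in C++; it assumes two's complement and an arithmetic right shift of negative values, which is implementation-defined before C++20:

#include <cassert>

int divBy4096(int v) {
    // For negative v, add divisor-minus-one before shifting so the shift rounds
    // toward zero instead of toward negative infinity (mirrors cdq / and edx,0FFFh).
    int bias = (v >> 31) & 0xFFF;   // 0 for v >= 0, 4095 for v < 0
    return (v + bias) >> 12;        // mirrors add eax,edx / sar eax,0Ch
}

int main() {
    const int values[] = {-8191, -4097, -4096, -1, 0, 1, 4095, 4096, 8191};
    for (int v : values) {
        assert(divBy4096(v) == v / 4096);
    }
}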
From the C standard:
When integers are divided, the result of the / operator is the
algebraic quotient with any fractional part discarded.105) If the
quotient a/b is representable, the expression (a/b)*b + a%b shall
equal a; otherwise, the behavior of both a/b and a%b is undefined.
It's not hard to think of examples where negative values for a don't follow this rule with pure arithmetic shift. E.g.
(-8191) / 4096 -> -1
(-8191) % 4096 -> -4095
which satisfies the equation, whereas
(-8191) >> 12 -> -2 (assuming arithmetic shifting)
is not division with truncation, and therefore -2 * 4096 - 4095 is most certainly not equal to -8191.
Note that shifting of negative numbers is actually implementation-defined, so the C expression (-8191) >> 12 does not have a generally correct result as per the standard.