How to correctly check if overflow occurs in integer multiplication?
int i = X(), j = Y();
i *= j;
How do I check for overflow, given the values of i and j and their type? Note that the check must work correctly for both signed and unsigned types. You can assume that both i and j are of the same type. You can also assume that the type is known while writing the code, so different solutions can be provided for the signed and unsigned cases (no need for template juggling; if it works in C, that is a bonus).
EDIT:
The answer by @pmg is the correct one. I just couldn't wrap my head around its simplicity for a while, so I will share my reasoning here. Suppose we want to check:
i * j > MAX
But we can't really check this, because i * j would overflow and the result would be incorrect (and always less than or equal to MAX). So we modify it like this:
i > MAX / j
But this is not quite correct, as in the division, there is some rounding involved. Rather, we want to know the result of this:
i > floor(MAX / j) + float(MAX % j) / j
So we have the division itself, which is implicitly rounded down by integer arithmetic (the floor is a no-op there, shown merely as an illustration), and we have the remainder of the division, which was missing in the previous inequality (and which evaluates to less than 1).
Assume that i and j are two numbers at the limit, so that if either of them increases by 1, an overflow will occur. Assuming neither of them is zero (in which case no overflow would occur anyway), both (i + 1) * j and i * (j + 1) are at least 1 + (i * j). We can therefore safely ignore the roundoff error of the division, which is less than 1.
Alternately, we can reorganize as such:
i - floor(MAX / j) > float(MAX % j) / j
Basically, this tells us that i - floor(MAX / j) must be greater than a number in a [0, 1) interval. That can be written exactly, as:
i - floor(MAX / j) >= 1
Because 1 is just after the interval. We can rewrite as:
i - floor(MAX / j) > 0
Or as:
i > floor(MAX / j)
So we have shown equivalence of the simple test and the floating-point version. It is because the division does not cause significant roundoff error. We can now use the simple test and live happily ever after.
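For an unsigned type, the simple test can be sketched as below. This is a minimal illustration rather than code from the thread: the name mul_would_overflow is made up here, and the explicit zero check is added because MAX / 0 must never be evaluated.

```cpp
#include <cstdint>
#include <limits>

// Returns true if a * b would not fit in uint32_t.
// Uses the pre-check a > MAX / b, which cannot overflow itself.
bool mul_would_overflow(std::uint32_t a, std::uint32_t b) {
    if (b == 0) return false;  // a * 0 == 0 always fits; also avoids dividing by zero
    return a > std::numeric_limits<std::uint32_t>::max() / b;
}
```

Because the division is evaluated before any multiplication, the check itself never produces a wrapped value.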
You cannot test afterwards. If the multiplication overflows, it triggers undefined behaviour, which can render tests inconclusive.
You need to test before doing the multiplication:
if (x > INT_MAX / y) /* multiplication of positive x and y will overflow */;
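A pre-multiplication check for signed int that covers every sign combination needs a few more cases. The sketch below applies the same division idea to each case; the function name int_mul_would_overflow is an assumption of this example, not something from the answer.

```cpp
#include <climits>

// Returns true if x * y would overflow int. All divisions here are safe:
// none divides by zero and none evaluates INT_MIN / -1.
bool int_mul_would_overflow(int x, int y) {
    if (x == 0 || y == 0) return false;
    if (x > 0) {
        if (y > 0) return x > INT_MAX / y;   // positive * positive
        return y < INT_MIN / x;              // positive * negative
    }
    if (y > 0) return x < INT_MIN / y;       // negative * positive
    return x < INT_MAX / y;                  // negative * negative
}
```

The negative * negative branch also catches INT_MIN * -1, because INT_MAX / -1 is -INT_MAX and only INT_MIN is smaller than that.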
If your compiler has a type that is at least twice as big as int then you can do this:
long long r = 1LL * x * y;
if ( r > INT_MAX || r < INT_MIN ) {
    // overflowed...
} else {
    x = r;
}
For portability you should STATIC_ASSERT( sizeof(long long) >= 2 * sizeof(int) ); or something similar but more extreme if you're worried about padding bits!
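With fixed-width types the size relationship holds by construction, so the widening approach might be sketched as follows. The name checked_mul32 and its out-parameter interface are assumptions of this example.

```cpp
#include <cstdint>
#include <limits>

// Multiplies two 32-bit values in 64-bit arithmetic, where the product always fits,
// then checks whether it fits back into 32 bits.
// Returns true and stores the product in *out on success; returns false on overflow.
bool checked_mul32(std::int32_t x, std::int32_t y, std::int32_t* out) {
    const std::int64_t r = static_cast<std::int64_t>(x) * y;
    if (r > std::numeric_limits<std::int32_t>::max() ||
        r < std::numeric_limits<std::int32_t>::min()) {
        return false;
    }
    *out = static_cast<std::int32_t>(r);
    return true;
}
```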
Try this
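// highestOneBitPosition() is assumed to return the 1-based position of the
// highest set bit (0 for an input of 0); it is not shown here.
bool willoverflow(uint32_t a, uint32_t b) {
    size_t a_bits = highestOneBitPosition(a),
           b_bits = highestOneBitPosition(b);
    // Conservative: a combined width of at most 32 bits can never overflow.
    return (a_bits + b_bits > 32);
}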
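The helper highestOneBitPosition is not shown in the answer. One possible reading, assuming it returns the number of significant bits of its argument, is sketched below; both function names are invented for this sketch.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: number of significant bits in v
// (0 for v == 0, 1 for v == 1, 32 for v >= 2^31).
std::size_t highest_one_bit_position(std::uint32_t v) {
    std::size_t bits = 0;
    while (v != 0) {
        ++bits;
        v >>= 1;
    }
    return bits;
}

// Conservative test: if the bit counts sum to at most 32, then a * b < 2^32
// and the product cannot overflow. A larger sum means it might overflow.
bool may_overflow(std::uint32_t a, std::uint32_t b) {
    return highest_one_bit_position(a) + highest_one_bit_position(b) > 32;
}
```

The test errs on the side of caution: 65536 * 65535 fits in 32 bits, yet the bit counts sum to 33, so it is reported as a possible overflow.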
It is possible to see whether overflow occurred post facto by using a division. In the case of unsigned values, the multiplication z=x*y has overflowed if y!=0 and:
bool overflow_occurred = (y!=0)? z/y!=x : false;
(if y did equal zero, no overflow occurred). For the case of signed values, it is a little trickier.
if(y!=0){
bool overflow_occurred = (y<0 && x==INT32_MIN /* -2^31 */) || (z/y != x);
}
We need the first part of the expression because the division test will fail if x=-2^31 and y=-1. In this case the multiplication overflows, but the machine may give a result of -2^31. Therefore we test for it separately.
This is true for 32-bit values. Extending the code to the 64-bit case is left as an exercise for the reader.
The following C++ template detects overflows from multiplying two unsigned integers.
template<typename UInt> UInt safe_multiply(UInt a, UInt b) {
UInt x = a * b; // x := ab mod n, for n := 2^#bits > 0
if (a != 0 && x / a != b)
cerr << "Overflow for " << a << " * " << b << "." << endl;
return x;
}
Can you give a proof that this algorithm detects every potential overflow, regardless of how many bits UInt uses?
The case a = 0 cannot result in overflows, so we can consider a > 0.
It seems that the correctness proof boils down to leading […] to a contradiction, since x / a actually means floor(x / a).
When assuming […], this leads to the straightforward consequence […], thus […], which contradicts n > 0.
So it remains to show […], or there must be another way.
If the last equation is true, WolframAlpha fails to confirm that (also with exponents).
However, it asserts that the original assumptions have no integer solutions, so the algorithm seems to be correct indeed.
But it doesn't provide an explanation. So why is it correct?
I am looking for the smallest possible explanation that is still mathematically profound, ideally that it fits in a single-line comment. Maybe I am missing something trivial, or the problem is not as easy as it looks.
On a side note, I used Codecogs Equation Editor for the LaTeX markup images, which apparently looks bad in dark mode, so consider switching to light mode or, if you know, please tell me how to use different images depending on the client settings. It is just \bg{white} vs. \bg{black} as part of the image URLs.
To be clear, I'll use the multiplication and division symbols (*, /) mathematically.
Also, for convenience let's name the set N = {0, 1, ..., n - 1}.
Let's clear up what unsigned multiplication is:
Unsigned multiplication for some magnitude, n, is a modulo-n operation on unsigned-n inputs (inputs that are in N) that results in an unsigned-n output (i.e. also in N).
In other words, the result of unsigned multiplication, x, is x = a*b (mod n), and, additionally, we know that x,a,b are in N.
It's important to be able to expand many modular formulas like so: x = a*b - k*n, where k is an integer - but in our case x,a,b are in N so this implies that k is in N.
Now, let's mathematically say what we're trying to prove:
Given positive integers, a,n, and non-negative integers x,b, where x,a,b are in N, and x = a*b (mod n), then a*b >= n (overflow) implies floor(x/a) != b.
Proof:
If overflow (a*b >= n) then x >= n - k*n = (1 - k)*n (for k in N),
As x < n then (1 - k)*n < n, so k > 0.
This means x <= a*b - n.
So, floor(x/a) <= floor([a*b - n]/a) = floor(b - n/a) = b - ceil(n/a) <= b - 1 (because n/a > 0),
Abbreviated: floor(x/a) <= b - 1
Therefore floor(x/a) != b
The multiplication gives either the mathematically correct result, or it is off by some multiple of 2^64. Since you check for a=0, the division always gives the correct result for its input. But in the case of overflow, the input is off by 2^64 or more, so the test will fail as you hoped.
The last bit is that unsigned operations don’t have undefined behaviour except for division by zero, so your code is fine.
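The claim can also be sanity-checked by brute force for a small width. The sketch below assumes n = 2^8 and compares the division test against the exact product computed in a wider type; count_mismatches_u8 is a name made up for this check.

```cpp
#include <cstdint>

// Counts the pairs (a, b) of 8-bit values for which the division test
// disagrees with the true answer computed in wider arithmetic.
int count_mismatches_u8() {
    int mismatches = 0;
    for (unsigned a = 0; a < 256; ++a) {
        for (unsigned b = 0; b < 256; ++b) {
            const unsigned exact = a * b;                              // at most 255 * 255
            const std::uint8_t x = static_cast<std::uint8_t>(exact);   // a*b mod 2^8
            const bool detected = (a != 0) && (x / a != b);
            const bool overflowed = exact > 255;
            if (detected != overflowed) ++mismatches;
        }
    }
    return mismatches;
}
```

A result of zero means the test flagged exactly the overflowing pairs for this width; it illustrates the proof but does not replace it.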
I was trying to solve the reverse integer problem, where we have to keep in mind to deal with overflow.
After reading others' solutions and trying them out, I wrote my solution:
class Solution {
public:
int reverse(int x) {
int result = 0;
int old_result = 0;
while(x) {
old_result = result;
result = result*10 + x%10;
if ((result-old_result*10)!=x%10)
return 0;
x = x/10;
}
return result;
}
};
And the answer was not accepted because overflow was not handled well. It turned out changing
if ((result-old_result*10)!=x%10)
to
if ((result-x%10)/10!=old_result)
would make things work.
I feel these lines are doing the same check. Not sure why one passes and one fails.
Can anyone help explain?
I feel these lines are doing the same check. Not sure why one passes and one fails.
Not necessarily. If the value of old_result ever was more than (or equal to) std::numeric_limits<int>::max () / 10 + 1, the expression old_result*10 would overflow, which would give you the wrong answer.
Overflow of signed integral types is undefined behavior. This is the quote from the C++ (C++11/C++14/C++17) standard draft (I don't have access to the full version of the standard, and in the majority of cases the draft is good enough):
If during the evaluation of an expression, the result is not mathematically defined or not in the range of
representable values for its type, the behavior is undefined.
The second form (reordered) of if removes the multiplication - effectively increasing the range of values, that can be used in old_result.
result = result*10 + x%10;
if ((result-old_result*10)!=x%10)
// or
if ((result-x%10)/10!=old_result)
Both are bad when coded after result*10 + x%10; as the overflow may already have happened.
int overflow is to be avoided in well-behaved code.
Rather than depend on overflow behaving a certain way, detect whether result*10 + x%10 will overflow before computing it.
// for positive numbers
int max = std::numeric_limits<int>::max();
while(x) {
int digit = x%10;
if (result >= max/10 && (result > max/10 || digit > max%10)) {
Overflow();
}
result = result*10 + digit;
x = x/10;
}
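Extending the same pre-check to negative inputs gives a complete function along these lines. This is a sketch, not the accepted solution: the name reverse_digits is made up, and it relies on C++11 division semantics, where x % 10 takes the sign of x.

```cpp
#include <limits>

// Reverses the decimal digits of x; returns 0 if the reversed value does not fit in int.
// Relies on C++11 semantics: / truncates toward zero, so x % 10 has the sign of x.
int reverse_digits(int x) {
    const int max = std::numeric_limits<int>::max();
    const int min = std::numeric_limits<int>::min();
    int result = 0;
    while (x != 0) {
        const int digit = x % 10;  // in [-9, 9]
        // Check before computing result * 10 + digit, so no overflow ever happens.
        if (result > max / 10 || (result == max / 10 && digit > max % 10)) return 0;
        if (result < min / 10 || (result == min / 10 && digit < min % 10)) return 0;
        result = result * 10 + digit;
        x /= 10;
    }
    return result;
}
```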
Note that overflow with signed numbers is undefined behaviour (UB), so I suggest using unsigned instead. Assuming wrap-around behaviour like that of unsigned types, suppose that result = result*10 + x%10; overflows. Then:
result -= old_result * 10;
"reverts" the overflow in the same way,
whereas the following is true:
(result - x % 10) == old_result * 10; // With possible overflow on both sides.
Dividing both sides by 10 removes the overflow only from the simplified right-hand side:
(result - x % 10) / 10 == old_result;
// overflow on left side (before division). No overflow on right side.
I need to divide two numbers and round the result up. Is there a better way to do this?
int myValue = (int) ceil( (float)myIntNumber / myOtherInt );
I find it overkill to have to cast twice (the outer int cast is just to silence the warning).
Note that I have to cast internally to float, otherwise:
int a = ceil(256/11); // Should be 24, but it is 23
Assuming that both myIntNumber and myOtherInt are positive, you could do:
int myValue = (myIntNumber + myOtherInt - 1) / myOtherInt;
With help from DyP, came up with the following branchless formula:
int idiv_ceil ( int numerator, int denominator )
{
return numerator / denominator
+ (((numerator < 0) ^ (denominator > 0)) && (numerator%denominator));
}
It avoids floating-point conversions and passes a basic suite of unit tests, as shown here:
http://ideone.com/3OrviU
Here's another version that avoids the modulo operator.
int idiv_ceil ( int numerator, int denominator )
{
int truncated = numerator / denominator;
return truncated + (((numerator < 0) ^ (denominator > 0)) &&
(numerator - truncated*denominator));
}
http://ideone.com/Z41G5q
The first one will be faster on processors where IDIV returns both quotient and remainder (and the compiler is smart enough to use that).
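One way to gain confidence in a formula like the branchless idiv_ceil is to compare it with std::ceil over a small range of numerators and denominators, negatives included. The harness below is a sketch: count_ceil_mismatches is a name made up here, and the linked ideone test suites are not reproduced.

```cpp
#include <cmath>

// Branchless formula as quoted in the answer.
int idiv_ceil(int numerator, int denominator) {
    return numerator / denominator
         + (((numerator < 0) ^ (denominator > 0)) && (numerator % denominator));
}

// Counts disagreements with std::ceil for all pairs in [-limit, limit],
// skipping denominator == 0.
int count_ceil_mismatches(int limit) {
    int mismatches = 0;
    for (int n = -limit; n <= limit; ++n) {
        for (int d = -limit; d <= limit; ++d) {
            if (d == 0) continue;
            const int expected =
                static_cast<int>(std::ceil(static_cast<double>(n) / d));
            if (idiv_ceil(n, d) != expected) ++mismatches;
        }
    }
    return mismatches;
}
```

For operands this small the double division is precise enough to serve as a reference; for values near the limits of int a purely integer reference would be safer.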
Maybe it is just easier to do a:
int result = dividend / divisor;
if(dividend % divisor != 0)
result++;
Benchmarks
Since a lot of different methods are shown in the answers and none of the answers actually prove any advantages in terms of performance I tried to benchmark them myself. My plan was to write an answer that contains a short table and a definite answer which method is the fastest.
Unfortunately it wasn't that simple. (It never is.) It seems that the performance of the rounding formulas depends on the data type, compiler and optimization level used.
In one case there is an increase of speed by 7.5× from one method to another. So the impact can be significant for some people.
TL;DR
For long integers the naive version using a type cast to float and std::ceil was actually the fastest. This was interesting for me personally since I intended to use it with size_t which is usually defined as unsigned long.
For ordinary ints it depends on your optimization level. For lower levels @Jwodder's solution performs best. For the highest levels std::ceil was the optimal one. With one exception: For the clang/unsigned int combination @Jwodder's was still better.
The solutions from the accepted answer never really outperformed the other two. You should keep in mind however that @Jwodder's solution doesn't work with negatives.
Results are at the bottom.
Implementations
To recap here are the four methods I benchmarked and compared:
@Jwodder's version (Jwodder)
template<typename T>
inline T divCeilJwodder(const T& numerator, const T& denominator)
{
return (numerator + denominator - 1) / denominator;
}
@Ben Voigt's version using modulo (VoigtModulo)
template<typename T>
inline T divCeilVoigtModulo(const T& numerator, const T& denominator)
{
return numerator / denominator + (((numerator < 0) ^ (denominator > 0))
&& (numerator%denominator));
}
@Ben Voigt's version without using modulo (VoigtNoModulo)
template<typename T>
inline T divCeilVoigtNoModulo(const T& numerator, const T& denominator)
{
T truncated = numerator / denominator;
return truncated + (((numerator < 0) ^ (denominator > 0))
&& (numerator - truncated*denominator));
}
OP's implementation (TypeCast)
template<typename T>
inline T divCeilTypeCast(const T& numerator, const T& denominator)
{
return (int)std::ceil((double)numerator / denominator);
}
Methodology
In a single batch the division is performed 100 million times. Ten batches are calculated for each combination of Compiler/Optimization level, used data type and used implementation. The values shown below are the averages of all 10 batches in milliseconds. The errors that are given are standard deviations.
The whole source code that was used can be found here. Also you might find this script useful which compiles and executes the source with different compiler flags.
The whole benchmark was performed on a i7-7700K. The used compiler versions were GCC 10.2.0 and clang 11.0.1.
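The linked source is not reproduced here. As a rough idea of what timing one batch could look like, the sketch below times a loop of Jwodder-style divisions with std::chrono; the function name, the choice of operands and the volatile sink are all assumptions of this sketch, not the benchmark that produced the results.

```cpp
#include <chrono>
#include <cstdint>

// Times one batch of divisions with Jwodder's formula and returns elapsed milliseconds.
// The running sum is stored to a volatile so the loop is not optimized away entirely.
double time_batch_ms(std::uint32_t iterations) {
    volatile std::uint64_t sink = 0;
    std::uint64_t sum = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::uint32_t i = 0; i < iterations; ++i) {
        const std::uint32_t numerator = i + 1;
        const std::uint32_t denominator = (i % 7) + 1;  // arbitrary non-zero divisor
        sum += (numerator + denominator - 1) / denominator;
    }
    sink = sum;
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Repeating such a batch several times and reporting the mean and standard deviation would mirror the methodology described above.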
Results
Now without further ado here are the results:
| Data type | Algorithm | GCC -O0 | GCC -O1 | GCC -O2 | GCC -O3 | GCC -Os | GCC -Ofast | GCC -Og | clang -O0 | clang -O1 | clang -O2 | clang -O3 | clang -Ofast | clang -Os | clang -Oz |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| unsigned | Jwodder | 264.1 ± 0.9 🏆 | 175.2 ± 0.9 🏆 | 153.5 ± 0.7 🏆 | 175.2 ± 0.5 🏆 | 153.3 ± 0.5 | 153.4 ± 0.8 | 175.5 ± 0.6 🏆 | 329.4 ± 1.3 🏆 | 220.0 ± 1.3 🏆 | 146.2 ± 0.6 🏆 | 146.2 ± 0.6 🏆 | 146.0 ± 0.5 🏆 | 153.2 ± 0.3 🏆 | 153.5 ± 0.6 🏆 |
| unsigned | VoigtModulo | 528.5 ± 2.5 | 306.5 ± 1.0 | 175.8 ± 0.7 | 175.2 ± 0.5 🏆 | 175.6 ± 0.7 | 175.4 ± 0.6 | 352.0 ± 1.0 | 588.9 ± 6.4 | 408.7 ± 1.5 | 164.8 ± 1.0 | 164.0 ± 0.4 | 164.1 ± 0.4 | 175.2 ± 0.5 | 175.8 ± 0.9 |
| unsigned | VoigtNoModulo | 375.3 ± 1.5 | 175.7 ± 1.3 🏆 | 192.5 ± 1.4 | 197.6 ± 1.9 | 200.6 ± 7.2 | 176.1 ± 1.5 | 197.9 ± 0.5 | 541.0 ± 1.8 | 263.1 ± 0.9 | 186.4 ± 0.6 | 186.4 ± 1.2 | 187.2 ± 1.1 | 197.2 ± 0.8 | 197.1 ± 0.7 |
| unsigned | TypeCast | 348.5 ± 2.7 | 231.9 ± 3.9 | 234.4 ± 1.3 | 226.6 ± 1.0 | 137.5 ± 0.8 🏆 | 138.7 ± 1.7 🏆 | 243.8 ± 1.4 | 591.2 ± 2.4 | 591.3 ± 2.6 | 155.8 ± 1.9 | 155.9 ± 1.6 | 155.9 ± 2.4 | 214.6 ± 1.9 | 213.6 ± 1.1 |
| unsigned long | Jwodder | 658.6 ± 2.5 | 546.3 ± 0.9 | 549.3 ± 1.8 | 549.1 ± 2.8 | 540.6 ± 3.4 | 548.8 ± 1.3 | 486.1 ± 1.1 | 638.1 ± 1.8 | 613.3 ± 2.1 | 190.0 ± 0.8 🏆 | 182.7 ± 0.5 | 182.4 ± 0.5 | 496.2 ± 1.3 | 554.1 ± 1.0 |
| unsigned long | VoigtModulo | 1,169.0 ± 2.9 | 1,015.9 ± 4.4 | 550.8 ± 2.0 | 504.0 ± 1.4 | 550.3 ± 1.2 | 550.5 ± 1.3 | 1,020.1 ± 2.9 | 1,259.0 ± 9.0 | 1,136.5 ± 4.2 | 187.0 ± 3.4 🏆 | 199.7 ± 6.1 | 197.6 ± 1.0 | 549.4 ± 1.7 | 506.8 ± 4.4 |
| unsigned long | VoigtNoModulo | 768.1 ± 1.7 | 559.1 ± 1.8 | 534.4 ± 1.6 | 533.7 ± 1.5 | 559.5 ± 1.7 | 534.3 ± 1.5 | 571.5 ± 1.3 | 879.5 ± 10.8 | 617.8 ± 2.1 | 223.4 ± 1.3 | 231.3 ± 1.3 | 231.4 ± 1.1 | 594.6 ± 1.9 | 572.2 ± 0.8 |
| unsigned long | TypeCast | 353.3 ± 2.5 🏆 | 267.5 ± 1.7 🏆 | 248.0 ± 1.6 🏆 | 243.8 ± 1.2 🏆 | 154.2 ± 0.8 🏆 | 154.1 ± 1.0 🏆 | 263.8 ± 1.8 🏆 | 365.5 ± 1.6 🏆 | 316.9 ± 1.8 🏆 | 189.7 ± 2.1 🏆 | 156.3 ± 1.8 🏆 | 157.0 ± 2.2 🏆 | 155.1 ± 0.9 🏆 | 176.5 ± 1.2 🏆 |
| int | Jwodder | 307.9 ± 1.3 🏆 | 175.4 ± 0.9 🏆 | 175.3 ± 0.5 🏆 | 175.4 ± 0.6 🏆 | 175.2 ± 0.5 | 175.1 ± 0.6 | 175.1 ± 0.5 🏆 | 307.4 ± 1.2 🏆 | 219.6 ± 0.6 🏆 | 146.0 ± 0.3 🏆 | 153.5 ± 0.5 | 153.6 ± 0.8 | 175.4 ± 0.7 🏆 | 175.2 ± 0.5 🏆 |
| int | VoigtModulo | 528.5 ± 1.9 | 351.9 ± 4.6 | 175.3 ± 0.6 🏆 | 175.2 ± 0.4 🏆 | 197.1 ± 0.6 | 175.2 ± 0.8 | 373.5 ± 1.1 | 598.7 ± 5.1 | 460.6 ± 1.3 | 175.4 ± 0.4 | 164.3 ± 0.9 | 164.0 ± 0.4 | 176.3 ± 1.6 🏆 | 460.0 ± 0.8 |
| int | VoigtNoModulo | 398.0 ± 2.5 | 241.0 ± 0.7 | 199.4 ± 5.1 | 219.2 ± 1.0 | 175.9 ± 1.2 | 197.7 ± 1.2 | 242.9 ± 3.0 | 543.5 ± 2.3 | 350.6 ± 1.0 | 186.6 ± 1.2 | 185.7 ± 0.3 | 186.3 ± 1.1 | 197.1 ± 0.6 | 373.3 ± 1.6 |
| int | TypeCast | 338.8 ± 4.9 | 228.1 ± 0.9 | 230.3 ± 0.8 | 229.5 ± 9.4 | 153.8 ± 0.4 🏆 | 138.3 ± 1.0 🏆 | 241.1 ± 1.1 | 590.0 ± 2.1 | 589.9 ± 0.8 | 155.2 ± 2.4 | 149.4 ± 1.6 🏆 | 148.4 ± 1.2 🏆 | 214.6 ± 2.2 | 211.7 ± 1.6 |
| long | Jwodder | 758.1 ± 1.8 | 600.6 ± 0.9 | 601.5 ± 2.2 | 601.5 ± 2.8 | 581.2 ± 1.9 | 600.6 ± 1.8 | 586.3 ± 3.6 | 745.9 ± 3.6 | 685.8 ± 2.2 | 183.1 ± 1.0 | 182.5 ± 0.5 | 182.6 ± 0.6 | 553.2 ± 1.5 | 488.0 ± 0.8 |
| long | VoigtModulo | 1,360.8 ± 6.1 | 1,202.0 ± 2.1 | 600.0 ± 2.4 | 600.0 ± 3.0 | 607.0 ± 6.8 | 599.0 ± 1.6 | 1,187.2 ± 2.6 | 1,439.6 ± 6.7 | 1,346.5 ± 2.9 | 197.9 ± 0.7 | 208.2 ± 0.6 | 208.0 ± 0.4 | 548.9 ± 1.4 | 1,326.4 ± 3.0 |
| long | VoigtNoModulo | 844.5 ± 6.9 | 647.3 ± 1.3 | 628.9 ± 1.8 | 627.9 ± 1.6 | 629.1 ± 2.4 | 629.6 ± 4.4 | 668.2 ± 2.7 | 1,019.5 ± 3.2 | 715.1 ± 8.2 | 224.3 ± 4.8 | 219.0 ± 1.0 | 219.0 ± 0.6 | 561.7 ± 2.5 | 769.4 ± 9.3 |
| long | TypeCast | 366.1 ± 0.8 🏆 | 246.2 ± 1.1 🏆 | 245.3 ± 1.8 🏆 | 244.6 ± 1.1 🏆 | 154.6 ± 1.6 🏆 | 154.3 ± 0.5 🏆 | 257.4 ± 1.5 🏆 | 591.8 ± 4.1 🏆 | 590.4 ± 1.3 🏆 | 154.5 ± 1.3 🏆 | 135.4 ± 8.3 🏆 | 132.9 ± 0.7 🏆 | 132.8 ± 1.2 🏆 | 177.4 ± 2.3 🏆 |
Now I can finally get on with my life :P
Integer division with round-up.
Only 1 division executed per call, no % or * or conversion to/from floating point, works for positive and negative int. See note (1).
n (numerator) = OPs myIntNumber;
d (denominator) = OPs myOtherInt;
The following approach is simple. int division rounds toward 0. For negative quotients, this is a round up so nothing special is needed. For positive quotients, add d-1 to effect a round up, then perform an unsigned division.
Note (1) The usual divide by 0 blows things up and MININT/-1 fails as expected on two's complement machines.
int IntDivRoundUp(int n, int d) {
// If n and d are the same sign ...
if ((n < 0) == (d < 0)) {
// If n (and d) are negative ...
if (n < 0) {
n = -n;
d = -d;
}
// Unsigned division rounds down. Adding d-1 to n effects a round up.
return (((unsigned) n) + ((unsigned) d) - 1)/((unsigned) d);
}
else {
return n/d;
}
}
[Edit: test code removed, see earlier rev as needed]
For a positive dividend and divisor, just use
int ceil_of_division = ((dividend-1)/divisor)+1;
For example:
for (int i=1;i<20;i++)
std::cout << i << "/8 = " << ((i-1)/8)+1 << std::endl;
A small hack is to do:
int divideUp(int a, int b) {
    return (a-1)/b + 1;
}
// Proof (for a > 0 and b > 0):
a = b*N + k, with 0 <= k < b (always)
if k == 0, then
(a-1) == b*N - 1
(a-1)/b == N - 1
(a-1)/b + 1 == N ---> Good !
if k > 0, then
(a-1) == b*N + l, with l = k - 1 and 0 <= l < b
(a-1)/b == N
(a-1)/b + 1 == N+1 ---> Good !
Instead of using the ceil function before casting to int, you can add a constant which is very nearly (but not quite) equal to 1 - this way, nearly anything (except a value which is exactly or incredibly close to an actual integer) will be increased by one before it is truncated.
Example:
#define EPSILON (0.9999)
int myValue = (int)(((float)myIntNumber)/myOtherInt + EPSILON);
EDIT: after seeing your response to the other post, I want to clarify that this will round up, not away from zero - negative numbers will become less negative, and positive numbers will become more positive.
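The "incredibly close to an actual integer" caveat has a concrete threshold: with the constant 0.9999, any quotient whose fractional part is below 0.0001 is not rounded up. The sketch below contrasts the trick with an exact integer version for positive operands; both function names are made up for this comparison.

```cpp
// The epsilon trick as described in the answer, with the same constant 0.9999.
int div_up_epsilon(int a, int b) {
    return static_cast<int>(static_cast<float>(a) / b + 0.9999);
}

// Exact integer reference for positive a and b.
int div_up_exact(int a, int b) {
    return a / b + (a % b != 0);
}
```

For 256 / 11 both return 24, but for 1 / 20000 the quotient is 0.00005, so the epsilon version truncates 0.99995 to 0 while the exact version returns 1.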
I have two numbers: A and B. I need to calculate A+B somewhere in my code. Both A and B are long long, and they can be positive or negative.
My code gives wrong results, and I suspect the problem occurs when calculating A+B. I simply want to check whether A+B exceeds the long long range. Any method is acceptable, as I will only use it for debugging.
Overflow is possible only when both numbers have the same sign. If both are positive, then you have overflow if mathematically A + B > LLONG_MAX, or equivalently B > LLONG_MAX - A. Since the right hand side is non-negative, the latter condition already implies B > 0. The analogous argument shows that for the negative case, we also need not check the sign of B (thanks to Ben Voigt for pointing out that the sign check on B is unnecessary). Then you can check
if (A > 0) {
return B > (LLONG_MAX - A);
}
if (A < 0) {
return B < (LLONG_MIN - A);
}
return false;
to detect overflow. These computations cannot overflow due to the initial checks.
Checking the sign of the result of A + B would work with guaranteed wrap-around semantics of overflowing integer computations. But overflow of signed integers is undefined behaviour, and even on CPUs where wrap-around is the implemented behaviour, the compiler may assume that no undefined behaviour occurs and remove the overflow-check altogether when implemented thus. So the check suggested in the comments to the question is highly unreliable.
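Wrapped into a function, the check described in this answer might look as follows. The name add_would_overflow is an assumption of this sketch; the logic is the one given in the answer.

```cpp
#include <climits>

// Returns true if a + b would overflow long long.
// The subtractions cannot overflow: LLONG_MAX - a is evaluated only for a > 0,
// and LLONG_MIN - a only for a < 0.
bool add_would_overflow(long long a, long long b) {
    if (a > 0) return b > LLONG_MAX - a;
    if (a < 0) return b < LLONG_MIN - a;
    return false;  // a == 0: the sum is just b
}
```

For debugging, calling this before each suspicious addition and logging a hit avoids relying on the result of an addition that has already overflowed.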
Something like the following:
long long max = std::numeric_limits<long long>::max();
long long min = std::numeric_limits<long long>::min();
if(A < 0 && B < 0)
return B < min - A;
if(A > 0 && B > 0)
return B > max - A;
return false;
We can reason about this as follows:
If A and B are opposite sign, they cannot overflow - the one greater than zero would need to be greater than max or the one less than zero would need to be less than min.
In the other cases, simple algebra suffices. A + B > max => B > max - A will overflow if they are both positive. Otherwise if they are both negative, A + B < min => B < min - A.
Also, if you're only using it for debug, you can use the following 'hack' to read the overflow bit from the last operation directly (assuming your compiler/cpu supports this):
int flags;
_asm {
pushf // push flag register on the stack
pop flags // read the value from the stack
}
if (flags & 0x0800) // bit 11 - overflow
...
Mask the signs, cast to unsigned values, and perform the addition. If the sum reaches 1 << (sizeof(int) * 8 - 1) or more, then you have an overflow.
int x, y;
if (sign(x) == sign(y)){
    unsigned int ux = abs(x), uy = abs(y);
    overflow = ux + uy >= (1u << (sizeof(int) * 8 - 1));
}
Better yet, let's write a template:
template <typename T>
bool overflow(T x, T y){
    typedef typename std::make_unsigned<T>::type U; // needs <type_traits>
    U ux = x < 0 ? -U(x) : U(x);                    // magnitude of x
    U uy = y < 0 ? -U(y) : U(y);                    // magnitude of y
    // Note: if x and y are both the minimum value, ux + uy wraps to 0 and is missed.
    return ( sign(x) == sign(y) && (ux + uy >= (U(1) << (sizeof(T) * 8 - 1))) );
}
For given numbers x,y and n, I would like to calculate x-y mod n in C. Look at this example:
int substract_modulu(int x, int y, int n)
{
return (x-y) % n;
}
As long as x>y, we are fine. In the other case, however, the modulo operation is undefined.
You can assume x,y,n>0. I would like the result to be positive, so if (x-y)<0, then ((x-y)-substract_modulu(x,y,n))/ n shall be an integer.
What is the fastest algorithm you know for that? Is there one that avoids any use of if and the ?: operator?
As many have pointed out, in current C and C++ standards, x % n is no longer implementation-defined for any values of x and n. It is undefined behaviour in the cases where x / n is undefined [1]. Also, x - y is undefined behaviour in the case of integer overflow, which is possible if the signs of x and y might differ.
So the main problem for a general solution is avoiding integer overflow, either in the division or the subtraction. If we know that x and y are non-negative and n is positive, then overflow and division by zero are not possible, and we can confidently say that (x - y) % n is defined. Unfortunately, x - y might be negative, in which case so will be the result of the % operator.
It's easy to correct for the result being negative if we know that n is positive; all we have to do is unconditionally add n and do another modulo operation. That's unlikely to be the best solution, unless you have a computer where division is faster than branching.
If a conditional load instruction is available (pretty common these days), then the compiler will probably do well with the following code, which is portable and well-defined, subject to the constraints that x,y ≥ 0 ∧ n > 0:
((x - y) % n) + ((x >= y) ? 0 : n)
For example, gcc produces this code for my Core i5 (although it's generic enough to work on any non-Paleozoic Intel chip):
idivq %rcx
cmpq %rsi, %rdi
movl $0, %eax
cmovge %rax, %rcx
leaq (%rdx,%rcx), %rax
which is cheerfully branch-free. (Conditional move is usually a lot faster than branching.)
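A variant of the conditional-add idea can be written as a small function. This sketch tests the sign of the remainder instead of comparing x and y, which also covers the case where x - y is a negative multiple of n (the remainder is then already 0, and adding n would give n rather than 0). The name sub_mod is made up, and the preconditions x, y ≥ 0 and n > 0 are the ones stated in the answer.

```cpp
// Computes (x - y) mod n with a result in [0, n).
// Assumes x >= 0, y >= 0 and n > 0, so x - y cannot overflow.
long sub_mod(long x, long y, long n) {
    const long r = (x - y) % n;    // in (-n, n); negative only when x < y
    return r + ((r < 0) ? n : 0);  // shift negative remainders into [0, n)
}
```

Whether a compiler turns the conditional into a conditional move is not guaranteed; inspecting the generated assembly is the only way to know for a given target.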
Another way of doing this would be (except that the function sign needs to be written):
((x - y) % n) + (sign(x - y) & (unsigned long)n)
where sign is all 1s if its argument is negative, and otherwise 0. One possible implementation of sign (adapted from bithacks) is
unsigned long sign(unsigned long x) {
    // The shift isolates the top bit (0 or 1); unsigned negation turns 1 into all 1s.
    return -(x >> (sizeof(long) * CHAR_BIT - 1));
}
This is portable (casting negative integer values to unsigned is defined), but it may be slow on architectures which lack high-speed shift. It's unlikely to be faster than the previous solution, but YMMV. TIAS.
Neither of these produce correct results for the general case where integer overflow is possible. It's very difficult to deal with integer overflow. (One particularly annoying case is n == -1, although you can test for that and return 0 without any use of %.) Also, you need to decide your preference for the result of modulo of negative n. I personally prefer the definition where x%n is either 0 or has the same sign as n -- otherwise why would you bother with a negative divisor -- but applications differ.
The three-modulo solution proposed by Tom Tanner will work if n is not -1 and n + n does not overflow. n == -1 will fail if either x or y is INT_MIN, and the simple fix of using abs(n) instead of n will fail if n is INT_MIN. The cases where n has a large absolute value could be replaced with comparisons, but there are a lot of corner cases, and made more complicated by the fact that the standard does not require 2's complement arithmetic, so it's not easily predictable what the corner cases are [2].
As a final note, some tempting solutions do not work. You cannot just take the absolute value of (x - y):
(-z) % n == -(z % n) == n - (z % n) ≠ z % n (unless z % n happens to be n / 2)
And, for the same reason, you cannot just take the absolute value of the result of modulo.
Also, you cannot just cast (x - y) to unsigned:
(unsigned)z == z + 2^k (for some k) if z < 0
(z + 2^k) % n == (z % n) + (2^k % n) ≠ z % n unless (2^k % n) == 0
[1] x/n and x%n are both undefined if n==0. But x%n is also undefined if x/n is "not representable" (i.e. there was integer overflow), which will happen on twos-complement machines (that is, all the ones you care about) if x is most negative representable number and n == -1. It's clear why x/n should be undefined in this case, but slightly less so in the case of x%n, since that value is (mathematically) 0.
[2] Most people who complain about the difficulty of predicting the results of floating-point arithmetic haven't spent much time trying to write truly portable integer arithmetic code :)
If you want to avoid undefined behaviour, without an if, the following would work
return (x % n - y % n + n) % n;
The efficiency depends on the implementation of the modulo operation, but I'd suspect algorithms involving if would be rather faster.
Alternatively you could treat x and y as unsigned. In which case there are no negative numbers involved and no undefined behaviour.
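As a concrete form of the three-modulo expression, the sketch below assumes non-negative x and y and a positive n small enough that n + n is representable; under those assumptions the result lies in [0, n). The name sub_mod3 is invented for this example.

```cpp
// Three-modulo form: reduces x and y first, so the difference stays within (-n, n)
// and adding n makes it positive before the final reduction.
// Assumes x >= 0, y >= 0, n > 0 and n + n representable in int.
int sub_mod3(int x, int y, int n) {
    return (x % n - y % n + n) % n;
}
```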
With C++11 the undefined behavior was removed. Depending on the exact behavior you want, you can just stick with
return (x-y) % n;
For a full explanation read this answer:
https://stackoverflow.com/a/13100805/1149664
You still get undefined behavior for n==0 or if x-y can not be stored in the type you are using.
Whether branching is going to matter will depend on the CPU to some degree. According to the documentation abs (on MSDN) has intrinsic behavior and it might not be a bottleneck at all. This you'll have to test.
If you want to compute things unconditionally, there are several nice methods that can be adapted from the Bit Twiddling Hacks site.
int v; // we want to find the absolute value of v
unsigned int r; // the result goes here
int const mask = v >> sizeof(int) * CHAR_BIT - 1;
r = (v + mask) ^ mask;
However, I don't know if this will be helpful to your situation without more information about hardware targets and testing.
Just out of curiosity I had to test this myself and when you look at the assembly generated by the compiler we can see there's no real overhead in the use of abs.
unsigned r = abs(i);
====
00381006 cdq
00381007 xor eax,edx
00381009 sub eax,edx
The following is just an alternate form of the above example which according to the Bit Twiddling Site is not patented (while the version used by the Visual C++ 2008 compiler is).
Throughout my answer I have been using MSDN and Visual C++ but I would assume that any sane compiler has similar behavior.
Assuming 0 <= x < n and 0 <= y < n, how about (x + n - y) % n? Then x + n will certainly be larger than y, subtracting y will always result in a positive integer, and the final mod n reduces the result if necessary.
I'm going to guess that it's not really the case here, but I'd like to mention that if the value you are taking modulo with is a power of two, then using the "AND" method is a lot quicker (I'm going to ignore the x-y, and just show how it works for a single x, as x-y is not part of the equation here):
int modpow2(int x, int n)
{
return x & (n-1);
}
If you want to ensure that your code doesn't do anything daft, you could add ASSERT(!(n & n-1)); - this checks that there is only a single bit set in n (so, n is a power of two).
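Combining the mask with the suggested assertion gives something like the sketch below. It uses the standard assert macro in place of the ASSERT mentioned above, assumes x is non-negative, and additionally rejects n ≤ 0; the name mod_pow2 is made up.

```cpp
#include <cassert>

// Reduces x modulo n for n a power of two, using a bit mask instead of %.
// Assumes x >= 0; the assertion rejects n <= 0 and any n with more than one bit set.
int mod_pow2(int x, int n) {
    assert(n > 0 && (n & (n - 1)) == 0);
    return x & (n - 1);
}
```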
Here is the C++ code I use in competitive programming:
#include <iostream>
#include<bits/stdc++.h>
using namespace std;
#define ll long long
#define mod 1000000007
ll subtraction_modulo(ll x, ll y ){
return ( ( (x - y) % mod ) + mod ) % mod;
}
Here,
ll -> long long int
mod -> globally defined mod value to be used.