While answering another question I got curious about this. I'm well aware that
if( __builtin_expect( !!a, 0 ) ) {
// not likely
} else {
// quite likely
}
will make the "quite likely" branch faster (in general) by doing something along the lines of hinting to the processor / changing the assembly code order / some kind of magic. (if anyone can clarify that magic that would also be great).
But does this work for a) inline ifs, b) variables and c) values other than 0 and 1? i.e. will
__builtin_expect( !!a, 0 ) ? /* unlikely */ : /* likely */;
or
int x = __builtin_expect( t / 10, 7 );
if( x == 7 ) {
// likely
} else {
// unlikely
}
or
if( __builtin_expect( a, 3 ) ) {
// likely
// uh-oh, what happens if a is 2?
} else {
// unlikely
}
have any effect? And does all of this depend on the architecture being targeted?
Did you read the GCC documentation?
Built-in Function: long __builtin_expect (long exp, long c)
You may use __builtin_expect to provide the compiler with branch
prediction information. In general, you should prefer to use actual
profile feedback for this (-fprofile-arcs), as programmers are
notoriously bad at predicting how their programs actually perform.
However, there are applications in which this data is hard to collect.
The return value is the value of exp, which should be an integral
expression. The semantics of the built-in are that it is expected that
exp == c. For example:
if (__builtin_expect (x, 0))
foo ();
indicates that we do not expect to call foo, since we expect x to be zero. Since you are limited to integral expressions for exp, you should use constructions such as
if (__builtin_expect (ptr != NULL, 1))
foo (*ptr);
when testing pointer or floating-point values.
To explain this a bit... __builtin_expect is specifically useful for communicating which branch you think the program's likely to take. You ask how the compiler can use that insight - well, consider this code:
if (x == 0)
return 10 * y;
else
return 39;
In machine code, the CPU can typically be asked to "goto" another line (which takes time, and depending on the CPU may prevent other execution optimisations - i.e. beneath the level of machine code - for example, see the Branches heading under http://en.wikipedia.org/wiki/Instruction_pipeline), or to call some other code. But there's no real if/else concept where the true and false code paths are treated equally: you have to branch away to reach the code for one or the other. The way that's done is basically, in pseudo-code:
test whether x is 0
if it was goto else_return_39
return 10 * y
else_return_39:
return 39
Given most CPUs are slower following the goto down to the else_return_39: label than to just fall through to return 10 * y, code for the "true" branch will be reached faster than for the false branch. Of course, the machine code could test whether x is not 0, put the "false" code (return 39) first and thereby reverse the performance characteristics.
This is what __builtin_expect controls - you can tell the compiler to put the true or the false branch where less branching is needed to reach it, thereby getting a tiny performance boost.
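For instance, here is how the hint is commonly wrapped in practice (the likely/unlikely macro names are just a widespread convention, e.g. in the Linux kernel sources - GCC itself only provides the builtin), applied to the example above:
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

int f(int x, int y)
{
    if (likely(x == 0))
        return 10 * y;   // hinted as the likely branch: laid out as the fall-through path
    else
        return 39;
}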
But does this work for a) inline ifs, b) variables and c) values other than 0 and 1?
a) Whether the surrounding function is inlined or not doesn't change the need for branching where the if statement appears (unless the optimiser can see that the tested condition is always true or always false, so that one of the branches can never run). So, it's equally applicable to inlined code.
[ Your comment shows you were interested in conditional expressions - a ? b : c - I'm not sure - there's a disputed answer to that question at https://stackoverflow.com/questions/14784481/can-i-use-gccs-builtin-expect-with-ternary-operator-in-c that might prove insightful one way or the other, or the basis for further exploration ]
b) variables - you postulated:
int x = __builtin_expect( t / 10, 7 );
if( x == 7 ) {
That won't work - the compiler's not obliged to associate such expectations with variables and remember them the next time an if is seen. You can verify this (as I did for gcc 3.4.4) using gcc -S to produce assembly language output: the assembly doesn't change regardless of the expected value.
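If you want that expectation to take effect, attach the hint to the expression the if actually tests - a minimal sketch (my rearrangement, not from the original code):
int x = t / 10;
if (__builtin_expect(x == 7, 1)) {
    // likely
} else {
    // unlikely
}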
c) values other than 0 and 1
It works for integral (long) values, so yes. The last paragraph of the documentation quoted above addresses this, specifically:
you should use constructions such as
if (__builtin_expect (ptr != NULL, 1))
foo (*ptr);
when testing pointer or floating-point values.
Why? Well, if the pointer type is larger than long, then calling __builtin_expect (whose parameters are long) would effectively slice off some of the less-significant bits for the test, and fail to incorporate the (high-order) rest in the test. Similarly, floating-point values might be larger than a long, and the conversion might not produce the result you expect. By using a boolean expression such as ptr != NULL (given true converts to 1 and false to 0) you're sure to get the intended results.
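As for the question's third snippet, __builtin_expect(a, 3) is legal and means "expect a to equal 3"; since any non-zero value (including 2) takes the true branch, the layout hint still points the right way even if the guess is slightly off. If the intent is simply "the true branch is likely", a clearer sketch is to hint the whole comparison:
if (__builtin_expect(a == 3, 1)) {
    // likely
} else {
    // unlikely - "a is 2" just means the expectation was wrong, nothing breaks
}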
But does this work for a) inline ifs, b) variables and c) values other than 0 and 1?
It works for an expression context that is used to determine branching.
So, a) Yes. b) No. c) Yes.
And does all of this depend on the architecture being targeted?
Yes!
It leverages architectures that use instruction pipelining, which allow a CPU to begin working on upcoming instructions before the current instruction has been completed.
(if anyone can clarify that magic that would also be great).
("Branch prediction" complicates this description, so I'm intentionally omitting it)
Any code resembling an if statement implies that an expression may result in the CPU jumping to a different location in the program. These jumps invalidate what's in the CPU's instruction pipeline.
__builtin_expect allows (without guarantee) gcc to try to assemble the code so the likely scenario involves fewer jumps than the alternate.
Related
I am reading this book by Fedor Pikus and it has some very interesting examples that were a surprise to me.
In particular this benchmark caught my attention, where the only difference is that in one of them we use || in the if and in the other we use |.
void BM_misspredict(benchmark::State& state)
{
std::srand(1);
const unsigned int N = 10000;
std::vector<unsigned long> v1(N), v2(N);
std::vector<int> c1(N), c2(N);
for (int i = 0; i < N; ++i)
{
v1[i] = rand();
v2[i] = rand();
c1[i] = rand() & 0x1;
c2[i] = !c1[i];
}
unsigned long* p1 = v1.data();
unsigned long* p2 = v2.data();
int* b1 = c1.data();
int* b2 = c2.data();
for (auto _ : state)
{
unsigned long a1 = 0, a2 = 0;
for (size_t i = 0; i < N; ++i)
{
if (b1[i] || b2[i]) // Only difference
{
a1 += p1[i];
}
else
{
a2 *= p2[i];
}
}
benchmark::DoNotOptimize(a1);
benchmark::DoNotOptimize(a2);
benchmark::ClobberMemory();
}
state.SetItemsProcessed(state.iterations());
}
void BM_predict(benchmark::State& state)
{
std::srand(1);
const unsigned int N = 10000;
std::vector<unsigned long> v1(N), v2(N);
std::vector<int> c1(N), c2(N);
for (int i = 0; i < N; ++i)
{
v1[i] = rand();
v2[i] = rand();
c1[i] = rand() & 0x1;
c2[i] = !c1[i];
}
unsigned long* p1 = v1.data();
unsigned long* p2 = v2.data();
int* b1 = c1.data();
int* b2 = c2.data();
for (auto _ : state)
{
unsigned long a1 = 0, a2 = 0;
for (size_t i = 0; i < N; ++i)
{
if (b1[i] | b2[i]) // Only difference
{
a1 += p1[i];
}
else
{
a2 *= p2[i];
}
}
benchmark::DoNotOptimize(a1);
benchmark::DoNotOptimize(a2);
benchmark::ClobberMemory();
}
state.SetItemsProcessed(state.iterations());
}
I will not go into all the details explained in the book about why the latter is faster, but the idea is that the hardware branch predictor is given two chances to mispredict in the slower (||) version, versus only one in the | (bitwise or) version. See the benchmark results below.
So the question is why don't we always use | instead of || in branches?
Is if(A | B) always faster than if(A || B)?
No, if(A | B) is not always faster than if(A || B).
Consider a case where A is true and the B expression is a very expensive operation. Not doing the operation can save that expense.
So the question is why don't we always use | instead of || in branches?
Besides the cases where the logical or is more efficient, efficiency is not the only concern. There are often operations that have pre-conditions, and there are cases where the result of the left-hand operation signals whether the pre-condition is satisfied for the right-hand operation. In such cases, we must use the logical operator.
if (b1[i]) // maybe this exists somewhere in the program
b2 = nullptr;
if(b1[i] || b2[i]) // OK
if(b1[i] | b2[i]) // NOT OK; indirection through null pointer
It is this possibility that is typically the problem when the optimiser is unable to replace logical with bitwise. In the example of if(b1[i] || b2[i]), the optimiser can do such replacement only if it can prove that b2 is valid at least when b1[i] != 0. That condition might not exist in your example, but that doesn't mean that it would necessarily be easy or - sometimes even possible - for the optimiser to prove that it doesn't exist.
Furthermore, there can be a dependency on the order of the operations, for example if one operand modifies a value read by the other operation:
if(a || (a = b)) // OK
if(a | (a = b)) // NOT OK; undefined behaviour
Also, there are types that are convertible to bool and thus are valid operands for ||, but are not valid operands for |:
if(ptr1 || ptr2) // OK
if(ptr1 | ptr2) // NOT OK; no bitwise or for pointers
TL;DR If we could always use bitwise or instead of logical operators, then there would be no need for logical operators and they probably wouldn't be in the language. But such replacement is not always possible, which is the reason why we use logical operators, and also the reason why the optimiser sometimes cannot use the faster option.
If evaluating A is fast, B is slow, and when the short circuit happens (A returns true), then if (A || B) will avoid the slow path where if (A | B) will not.
If evaluating A almost always gives the same result, the processor's branch prediction may give if (A || B) performance better than if (A | B) even if B is fast.
As others have mentioned, there are cases where the short circuit is mandatory: you only want to execute B if A is known to evaluate false:
if (p == NULL || test(*p)) { ... } // null pointer would crash on *p
if (should_skip() || try_update()) { ... } // active use of side effects
Bitwise-or is a branchless arithmetic operator corresponding to a single ALU instruction. Logical-or is defined to imply short-circuit evaluation, which involves a (potentially costly) conditional branch. The effect of the two can differ when the evaluation of the operands has side effects.
In the case of two boolean variables, a smart compiler might evaluate logical-or as a bitwise-or, or using a conditional move, but who knows...
So the question is why don't we always use | instead of || in branches?
Branch prediction is relevant only to fairly hot pieces of code, and it depends on the branch being predictable enough to matter. In most places, | has little or no performance benefit over ||.
Also, taking A and B as expressions of suitable type (not necessarily single variables), key relevant differences include:
In A || B, B is evaluated only if A evaluates to 0, but in A | B, B is always evaluated. Conditionally avoiding evaluation of B is sometimes exactly the point of using the former.
In A || B there is a sequence point between evaluation of A and evaluation of B, but there isn't one in A | B. Even if you don't care about short-circuiting, you may care about the sequencing, even in relatively simple examples. For example, given an integer x, x-- || x-- has well-defined behavior, but x-- | x-- has undefined behavior.
When used in conditional context, the intent of A || B is clear to other humans, but the reason to substitute A | B less so. Code clarity is extremely important. And after all, if the compiler can see that it is safe to do so (and in most cases it is more reliable than a human at making the determination) then it is at liberty to compile one of those expressions as if it were the other.
If you cannot be sure that A and B both have built-in types -- in a template, for example -- you have to account for the possibility that one or both of | and || have been overloaded. In that case, it is reasonable to suppose that || will still do something that makes sense for branch control, but it is much less safe to assume that | will do something equivalent or even suitable.
As an additional minor matter, the precedence of | is different from that of ||. This can bite you if you rely on precedence instead of parentheses for grouping, and you need to watch out for that if you are considering modifying existing code to change || expressions to | expressions. For example, A && B || C && D groups as (A && B) || (C && D), but A && B | C && D groups as (A && (B | C)) && D.
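A small self-contained illustration of that precedence trap (the values are chosen so the two groupings actually give different results):
#include <iostream>

int main()
{
    bool A = false, B = true, C = true, D = true;

    bool logical = A && B || C && D;   // groups as (A && B) || (C && D) -> true
    bool mixed   = A && B | C && D;    // groups as (A && (B | C)) && D  -> false

    std::cout << logical << ' ' << mixed << '\n';   // prints: 1 0
    return 0;
}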
Even if a and b are automatic-duration Boolean flags, that doesn't mean that an expression like a||b will be evaluated by checking the state of one flag, and then if necessary checking the state of the other. If a section of code performs:
x = y+z;
flag1 = (x==0);
... code that uses flag1
a compiler could replace that with:
x = y+z;
if (processor's Z flag was set)
{
... copy of the code that uses flag1, but with flag1 replaced by the constant 1
}
else
{
... copy of the code that uses flag1, but with flag1 replaced by the constant 0
}
Although hardly required to do so, a compiler may base some of its decisions about whether to perform such a substitution on the programmer's choice of whether to write (flag1 || flag2) or (flag1 | flag2), and many factors may cause the aforementioned substitution to run faster or slower than the original code.
Code readability, short-circuiting, and the fact that it is not guaranteed that | will always outperform ||.
Computer systems are more complicated than expected, even though they are man-made.
There was a case where a for loop with a much more complicated condition ran faster on an IBM machine. A possible reason was that the busier CPU did not cool down, so instructions executed faster. What I am trying to say is: focus on other areas for improving the code rather than fighting micro-cases that will differ depending on the CPU and on how the boolean evaluation is compiled (compiler optimizations).
The expression A | B might be faster in a loop that the compiler can optimize to a bitwise or of two vectors. Another case where | might be slightly more optimized is when the compiler would want to optimize away a branch by combining the two operands with bitmasks. If the operand on the right is something that might have visible side-effects, the compiler must instead insert a branch to guarantee the correct short-circuiting.
In other situations I can think of, A || B will be as fast or faster, including just about all the ones I can think of where you’re comparing a single pair of operands rather than a vector. However, those are almost never critical to performance.
Adding to the list:
Given the case that A and B are totally unpredictable, but usually A||B is true (i.e. when A is wrong, then usually B is true and vice versa).
In this case A||B may lead to a lot of mispredictions, but A|B is predictable and most likely faster.
So the question is why don't we always use | instead of || in branches?
To my mind there are three reasons, but that might just be me.
First and foremost: code is read many times more often than it is written. In the example you have an array of ints whose values are either zero or one, and what the code is really meant to do is hidden from the reader. It might be clear to the original writer exactly what is intended, but years later, after lots and lots of lines have been added, it is probably anyone's guess. In my world, use whatever shows the intended comparison: it is either a bit comparison or a logical one. It should not be both.
Secondly: does the performance gain really matter? The example basically does nothing and could be coded a lot more efficiently. No? Well, don't create the arrays in the first place; as written, the code simply checks the quality of the rand function. This is an example where a better analysis of the problem would probably give a much larger gain than the micro-optimization.
Third: a different compiler or a different processor will probably change the relative speeds. Now, what you are doing here is applying your knowledge of the internals of the current compiler and current processor. Next version of compiler and processor might turn the question around totally. You really cannot know. And we all know that the only way to know if an optimization actually will be quicker is by testing, which this very specific example has done.
So if, for some reason, getting the very last bit of efficiency out of the code is worth it, I would mark the choice as a "hack". I would probably include a macro, something like "DO_PERFORMANCE_HACKS", and have two versions of the code, one with | and one with ||, commenting on the intention (see the sketch below). By changing the macro it will be possible for future readers to see what and where the hack is, and they may elect not to compile it.
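As a sketch of that idea (DO_PERFORMANCE_HACKS is a hypothetical project-specific macro, not something provided by any library):
#include <cstddef>

unsigned long sum_if_either(const int* b1, const int* b2,
                            const unsigned long* p1, std::size_t n)
{
    unsigned long a = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
#ifdef DO_PERFORMANCE_HACKS
        if (b1[i] | b2[i])    // performance hack: a single, well-predicted branch
#else
        if (b1[i] || b2[i])   // readable version: expresses the logical intent
#endif
            a += p1[i];
    }
    return a;
}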
I need to find whether two finite floating point values A and B have different signs or one of them is zero.
In many code examples I see the test as follows:
if ( (A <= 0 && B >= 0) || (A >= 0 && B <= 0) )
It works fine but looks inefficient to me since many conditions are verified here and each condition branch is a slow operation for modern CPUs.
The other option is to use multiplication of the floating-point values:
if ( A * B <= 0 )
It shall work even in case of overflow, since infinity values preserve proper sign.
What is the most efficient way for performing the check (one of these two or probably some other)?
Efficient check that two floating point values have distinct signs
if ( A * B <= 0 ) fails to distinguish the sign when A or B are of the set -0.0, +0.0.
Consider signbit()
if (signbit(A) == signbit(B)) {
; // same sign
} else {
; // distinct signs
}
Trust the compiler to form efficient code - or use a better compiler.
OP then formed a different goal: "whether two finite floating point values A and B have distinct signs or one of them is zero."
Of course if (signbit(A) != signbit(B) || A == 0.0 || B == 0.0) meets this newer goal without the rounding-to-0.0 issue of if ( A * B <= 0 ) (noted by Eric Postpischil), but it may not meet the fuzzy requirement of "most efficient way".
The best way to form efficient code, and to avoid premature optimization (really the root of all evil), is to look at the larger context.
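For completeness, the combined check above as a self-contained sketch (note that signbit distinguishes -0.0 from +0.0, which is why the explicit zero tests are kept):
#include <cmath>

bool distinct_signs_or_zero(double A, double B)
{
    return std::signbit(A) != std::signbit(B) || A == 0.0 || B == 0.0;
}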
It works fine but looks inefficient to me since many conditions are verified here and each condition branch is a slow operation for modern CPUs.
An optimizing compiler won't branch unnecessarily. Don't worry about premature optimization unless you're in a very tight hot loop. Your first snippet compiles to only 2 branches on x86-64 and ARM64 with the compilers I tried. It can also be compiled to a branchless version. Of course it may still be slower than a simple A * B <= 0, but there's no way to know that for sure without a proper benchmark.
If you don't care about NaN then you can simply do some bitwise operations:
auto a = std::bit_cast<uint64_t>(A);
auto b = std::bit_cast<uint64_t>(B);
const auto sign_mask = 1ULL << 63;
return ((a ^ b) & sign_mask) || (((a | b) & ~sign_mask) == 0);
If A and B have different signs then that is caught by (a ^ b) & sign_mask. If they have the same sign then they must both be zero, which is caught by the latter condition. But this works in the integer domain, so it may incur a cross-domain penalty when moving the values from the float domain to the int domain.
If std::bit_cast is not available, just replace it with memcpy(&a, &A, sizeof A).
Demo on Godbolt
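A sketch of that pre-C++20 fallback, with the same bit-pattern logic as above but using memcpy instead of std::bit_cast:
#include <cstdint>
#include <cstring>

bool check(double A, double B)
{
    std::uint64_t a, b;
    std::memcpy(&a, &A, sizeof A);   // same bit pattern, no std::bit_cast required
    std::memcpy(&b, &B, sizeof B);

    const std::uint64_t sign_mask = 1ULL << 63;
    return ((a ^ b) & sign_mask) || (((a | b) & ~sign_mask) == 0);
}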
Again, do a benchmark to determine what's best for your target. There's no solution that's fastest on every microarchitecture available. If you really run this a lot of times in a loop then you should use SIMD instead to check for multiple values at the same time. You should also use profile-guided optimization in order for the compiler to know where and when to branch
In my software I am using the input values from the user at run time and performing some mathematical operations. Consider for simplicity below example:
int multiply(const int a, const int b)
{
if(a >= INT_MAX || b >= INT_MAX)
return 0;
else
return a*b;
}
I can check whether the input values are greater than the limits, but how do I check whether the result will be out of limits? It is quite possible that a = INT_MAX - 1 and b = 2. Since the inputs are perfectly valid, the multiplication invokes undefined behaviour, which makes my program meaningless; any code executed after this point may be arbitrary and eventually result in a crash. So how do I protect my program in such cases?
This really comes down to what you actually want to do in this case.
For a machine where long or long long (or int64_t) is a 64-bit value, and int is a 32-bit value, you could do (I'm assuming long is 64 bit here):
long x = static_cast<long>(a) * b;
if (x > INT_MAX || x < INT_MIN)
return 0;
else
return static_cast<int>(x);
By casting one value to long, the other will have to be converted as well. You can cast both if that makes you happier. The overhead here, above a normal 32-bit multiply, is a couple of clock cycles on modern CPUs, and it's unlikely that you can find a safer solution that is also faster. [You can, in some compilers, add attributes to the if saying that it's unlikely, to encourage branch prediction "to get it right" for the common case of returning x.]
Obviously, this won't work when the type is already as big as the biggest integer you can deal with (although you could possibly use floating point; that may still be a bit dodgy, since the precision of float is not sufficient, but it could be done with some "safety margin" [e.g. compare against less than LONG_MAX / 2] if you don't need the entire range of integers). The penalty here is a bit worse, though; in particular, transitions between float and integer aren't "pleasant".
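(As an aside the answer doesn't mention: recent GCC and Clang also provide checked-arithmetic builtins, which sidestep the "no wider type available" problem entirely. A minimal sketch, assuming one of those compilers:)
int multiply_checked(int a, int b)
{
    int result;
    if (__builtin_mul_overflow(a, b, &result))   // true if the product doesn't fit in int
        return 0;                                // mirror the question's "return 0" convention
    return result;
}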
Another alternative is to actually test the relevant code with "known invalid values" and check that the rest of the code is OK with them. Make sure you test this with the relevant compiler settings, as changing the compiler options may change the behaviour. Note that your code then has to deal with "what do we do when 65536 * 100000 comes out negative", which it may not have expected. Perhaps add something like:
int x = a * b;
if (x < 0) return 0;
[But this only works if you don't expect negative results, of course]
You could also inspect the assembly code generated and understand the architecture of the actual processor [the key here is to understand if "overflow will trap" - which it won't by default in x86, ARM, 68K, 29K. I think MIPS has an option of "trap on overflow"], and determine whether it's likely to cause a problem [1], and add something like
#if (defined(__X86__) || defined(__ARM__))
#error This code needs inspecting for correct behaviour
#endif
return a * b;
One problem with this approach, however, is that even the slightest changes in code, or compiler version may alter the outcome, so it's important to couple this with the testing approach above (and make sure you test the ACTUAL production code, not some hacked up mini-example).
[1] The "undefined behaviour" is undefined to allow C to "work" on processors that have trapping overflows of integer math, as well as the fact that that a * b when it overflows in a signed value is of course hard to determine unless you have a defined math system (two's complement, one's complement, distinct sign bit) - so to avoid "defining" the exact behaviour in these cases, the C standard says "It's undefined". It doesn't mean that it will definitely go bad.
Specifically for the multiplication of a by b the mathematically correct way to detect if it will overflow is to calculate log₂ of both values. If their sum is higher than the log₂ of the highest representable value of the result, then there is overflow.
log₂(a) + log₂(b) < log₂(UINT_MAX)
The difficulty is to calculate log₂ of an integer quickly. For that, there are several bit-twiddling hacks that can be used, like counting bits or counting leading zeros (some processors even have instructions for that). This site has several implementations:
https://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious
The simplest implementation could be:
unsigned int log2(unsigned int v)
{
unsigned int r = 0;
while (v >>= 1)
r++;
return r;
}
Then in your program you only need to check:
if(log2(a) + log2(b) < MYLOG2UINTMAX)
return a*b;
else
printf("Overflow");
The signed case is similar but has to take care of the negative case specifically.
EDIT: My solution is not complete and has an error which makes the test more severe than necessary. The equation only works exactly if the log₂ function returns a floating-point value. In the implementation I limited the value to unsigned integers, which means that completely valid multiplications get refused. Why? Because log₂(UINT_MAX) is truncated:
log₂(UINT_MAX) = log₂(4294967295) ≈ 31.9999999997, truncated to 31.
We therefore have to change the implementation and replace the constant we compare against:
#define MYLOG2UINTMAX (CHAR_BIT*sizeof (unsigned int))
You may try this:
if ( b > ULONG_MAX / a ) // Need to check a != 0 before this division
return 0; // a*b would overflow
else
return a*b;
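Putting that together as a self-contained sketch for unsigned values (the question's code uses int, but this check as written applies to unsigned arithmetic), including the a != 0 guard mentioned in the comment:
#include <climits>

unsigned long multiply_checked(unsigned long a, unsigned long b)
{
    if (a != 0 && b > ULONG_MAX / a)
        return 0;        // a * b would not fit in unsigned long
    return a * b;        // a == 0 falls through harmlessly: the product is 0
}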
Say, I have a function like shown below
void caller()
{
int flag = _getFlagFromConfig();
//this is a flag, which according to the implementation
//is supposed to have only two values, 0 and 1 (as of now)
callee_1(flag);
callee_2(1 == flag);
}
void callee_1(int flag)
{
if (1 == flag)
{
//do operation X
}
}
void callee_2(bool flag)
{
if (flag)
{
//do operation X
}
}
Which of the callee functions will be a better implementation?
I have gone through this link and I'm pretty convinced that there is not much of a performance impact by taking bool for comparison in an if-condition. But in my case, I have the flag as an integer. In this case, is it worth going for the second callee?
It won't make any difference in terms of performance; however, in terms of readability, if there are only two values then a bool makes more sense. Especially if you name your flag something sensible like isFlagSet.
In terms of efficiency, they should be the same.
Note however that they don't do the same thing - you can pass something other than 1 to the first function, and the condition will evaluate to false even if the parameter is not itself false. The extra comparison could account for some overhead, probably not.
So let's assume the following case:
void callee_1(int flag)
{
if (flag)
{
//do operation X
}
}
void callee_2(bool flag)
{
if (flag)
{
//do operation X
}
}
In this case, technically the first variant would be faster, since bool values aren't checked directly for true or false, but promoted to a word-sized type and then checked for 0. Although the assembly generated might be the same, the processor theoretically does more work on the bool option.
If the value or argument is being used as a boolean, declare it bool.
The probability of it making any difference in performance is almost 0,
and the use of bool documents your intent, both to the reader and to
the compiler.
Also, if you have an int which is being used as a flag (due to an
existing interface): either use the implicit conversion (if the
interface documents it as a boolean), or compare it with 0 (not with
1). This conforms with the older definitions of how int served as
a boolean (before the days when C++ had bool).
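A tiny sketch of that advice applied to the question's first callee: treat the int flag as "zero vs non-zero" rather than comparing it against the literal 1:
void callee_1(int flag)
{
    if (flag != 0)   // or simply: if (flag)
    {
        //do operation X
    }
}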
One case where the difference between bool and int results in different (optimized) asm is the negation operator ("!").
For "!b", If b is a bool, the compiler can assume that the integer value is either 1 or 0, and the negation can thus be a simple "b XOR 1". OTOH, if b is an integer, the compiler, barring data-flow analysis, must assume that the variable may contain any integer value, and thus to implement the negation it must generate code such as
(b != 0) ? 0: 1
That being said, code where negation is a performance-critical operation is quite rare, I'm sure.
According to C++ Standard (5/5) dividing by zero is undefined behavior. Now consider this code (lots of useless statements are there to prevent the compiler from optimizing code out):
int main()
{
char buffer[1] = {};
int len = strlen( buffer );
if( len / 0 ) {
rand();
}
}
Visual C++ compiles the if-statement like this:
sub eax,edx
cdq
xor ecx,ecx
idiv eax,ecx
test eax,eax
je wmain+2Ah (40102Ah)
call rand
Clearly the compiler sees that the code is to divide by zero - it uses the xor ecx,ecx pattern to zero out ecx, which then serves as the second operand in the integer division. This code will definitely trigger an "integer division by zero" error at runtime.
IMO such cases (when the compiler knows that the code will divide by zero at all times) are worth a compile-time error - the Standard doesn't prohibit that. That would help diagnose such cases at compile time instead of at runtime.
However I talked to several other developers and they seem to disagree - their objection is "what if the author wanted to divide by zero to... emm... test error handling?"
Intentionally dividing by zero without compiler awareness is not that hard - for example, using the Visual C++-specific __declspec(noinline) function decorator:
__declspec(noinline)
void divide( int what, int byWhat )
{
if( what/byWhat ) {
rand();
}
}
void divideByZero()
{
divide( 0, 0 );
}
which is much more readable and maintainable. One can use that function when he "needs to test error handling" and have a nice compile-time error in all other cases.
Am I missing something? Is it necessary to allow emission of code that the compiler knows divides by zero?
There is probably code out there which has accidental division by zero in functions which are never called (e.g. because of some platform-specific macro expansion), and these would no longer compile with your compiler, making your compiler less useful.
Also, most division by zero errors that I've seen in real code are input-dependent, or at least are not really amenable to static analysis. Maybe it's not worth the effort of performing the check.
Dividing by 0 is undefined behavior because it might trigger, on certain platforms, a hardware exception. We could all wish for better-behaved hardware, but since nobody ever saw fit to give integers -INF/+INF and NaN values, wishing is quite pointless.
Now, because it's undefined behavior, interesting things may happen. I encourage you to read Chris Lattner's articles on undefined behavior and optimizations, I'll just give a quick example here:
int foo(char* buf, int i) {
if (5 / i == 3) {
return 1;
}
if (buf != buf + i) {
return 2;
}
return 0;
}
Because i is used as a divisor, the compiler may assume it is not 0. Therefore buf != buf + i is trivially true, and the second if can be optimized away.
In the face of such transformations, anyone hoping for a sane behavior of a division by 0... will be harshly disappointed.
In the case of integral types (int, short, long, etc.) I can't think of any uses for intentional divide by zero offhand.
However, for floating point types on IEEE-compliant hardware, explicit divide by zero is tremendously useful. You can use it to produce positive & negative infinity (+/- 1/0), and not a number (NaN, 0/0) values, which can be quite helpful.
In the case of sorting algorithms, you can use the infinities as initial values representing greater or less than all possible values.
For data analysis purposes, you can use NaNs to indicate missing or invalid data, which can then be handled gracefully. Matlab, for example, uses explicit NaN values to suppress missing data in plots, etc.
Although you can access these values through macros and std::numeric_limits (in C++), it is useful to be able to create them on your own (and allows you to avoid lots of "special case" code). It also allows implementors of the standard library to avoid resorting to hackery (such as manual assembly of the correct FP bit sequence) to provide these values.
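A minimal sketch of producing those values with explicit division by zero (assuming IEEE-754 floating point, as the answer does):
#include <cmath>
#include <cstdio>

int main()
{
    double pos_inf = 1.0 / 0.0;    // +infinity
    double neg_inf = -1.0 / 0.0;   // -infinity
    double nan_val = 0.0 / 0.0;    // NaN ("not a number")

    std::printf("%f %f %f\n", pos_inf, neg_inf, nan_val);
    std::printf("isinf: %d, isnan: %d\n", std::isinf(pos_inf), std::isnan(nan_val));
    return 0;
}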
If the compiler detects a division-by-0, there is absolutely nothing wrong with a compiler error. The developers you talked to are wrong - you could apply that logic to every single compile error. There is no point in ever dividing by 0.
Detecting divisions by zero at compile time is the sort of thing you'd want to be a compiler warning. That's definitely a nice idea.
I don't keep no company with Microsoft Visual C++, but G++ 4.2.1 does do such checking. Try compiling:
#include <iostream>
int main() {
int x = 1;
int y = x / 0;
std::cout << y;
return 0;
}
And it will tell you:
test.cpp: In function ‘int main()’:
test.cpp:5: warning: division by zero in ‘x / 0’
But considering it an error is a slippery slope that the savvy know not to spend too much of their spare time climbing. Consider why G++ doesn't have anything to say when I write:
int main() {
while (true) {
}
return 0;
}
Do you think it should compile that, or give an error? Should it always give a warning? If you think it must intervene on all such cases, I eagerly await your copy of the compiler you've written that only compiles programs that guarantee successful termination! :-)