Is pointer difference between two arrays defined on specific implementations? - c++

According to the C standard:
When two pointers are subtracted, both shall point to elements of the
same array object, or one past the last element of the array object
(sect. 6.5.6 1173)
[Note: do not assume that I know much of the standard or UB, I just happen to have found out this one]
I understand that in almost all cases, taking the difference of pointers in two different arrays would be a bad idea anyway.
I also know that on some architectures ("segmented machine" as I read somewhere), there are good reasons that the behavior is undefined.
Now on the other hand
It may be useful in some corner cases. For example, in this post, it would allow using a library interface with different arrays, instead of copying everything into one array that will be split just afterwards.
It seems that on "ordinary" architectures, the way of thinking "all objects are stored in a big array starting at approx. 0 and ending at approx. memory size" is a reasonable description of the memory. When you actually do look at pointer differences of different arrays, you get sensible results.
Hence my question: from experiment, it seems that on some architectures (e.g. x86-64), pointer difference between two arrays provides sensible, reproducible results. And it seems to correspond reasonably well to the hardware of these architectures. So does some implementation actually ensure a specific behavior?
For example, is there an implementation out there in the wild that guarantees that, for a and b of type char*, we have a + (reinterpret_cast<std::ptrdiff_t>(b) - reinterpret_cast<std::ptrdiff_t>(a)) == b?

Why make it UB, and not implementation-defined? (where of course, for some architectures, implementation-defined will specify it as UB)
That is not how it works.
If something is documented as "implementation-defined" by the standard, then any conforming implementation is expected to define a behavior for that case, and document it. Leaving it undefined is not an option.
As labeling pointer difference between unrelated arrays "implementation defined" would leave e.g. segmented or Harvard architectures with no way to have a fully-conforming implementation, this case remains undefined by the standard.
Implementations could offer a defined behavior as a non-standard extension. But any program making use of such an extension would no longer be strictly conforming, and would be non-portable.

Any implementation is free to document a behaviour for which the standard does not require behaviour to be documented - it is well within the limits of the standard. The problem with implementation-defined behaviour in this case is that the implementations must then carefully document them, and when C was standardized, the committee presumably found out that the different implementations were so wildly variable, that no sensible common ground would exist, so they decided to make it UB altogether.
I do not know any compilers that do make it defined, but I know a compiler which does explicitly keep it undefined, even if you try to cheat with casts:
When casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined. That is, one may not use integer arithmetic to avoid the undefined behavior of pointer arithmetic as proscribed in C99 and C11 6.5.6/8.
I believe another compiler also has the same behaviour, though, unfortunately it doesn't document it in an accessible way.
That those two compilers do not define it would be a good reason to avoid depending on it in any programs, even if compiled with another compiler that would specify a behaviour, because you can never be too sure what compiler you need to use 5 years from now...

The more implementation-defined behavior you have and someone's code depends on, the less portable that code is. In this case, there's already an implementation-defined way out of this: reinterpret_cast the pointers to integers and do your math there. That makes it clear to everyone that you're relying on behavior specific to the implementation (or at least, behavior that may not be portable everywhere).
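To make the "do your math on integers" suggestion concrete, here is a minimal sketch. The helper name address_distance is made up for illustration, and the result is only meaningful on implementations whose documented pointer-to-integer mapping is a flat address (std::uintptr_t itself is an optional type, though mainstream compilers provide it).

```cpp
#include <cstdint>

// Hypothetical helper (not from the thread): byte distance from a to b,
// computed on integers rather than on pointers.
// The pointer-to-integer mapping is implementation-defined, so the value is
// only meaningful where the implementation documents a flat address space.
std::intptr_t address_distance(const void* a, const void* b) {
    const auto ia = reinterpret_cast<std::uintptr_t>(a);
    const auto ib = reinterpret_cast<std::uintptr_t>(b);
    // Unsigned subtraction wraps instead of overflowing; the conversion to the
    // signed type is implementation-defined before C++20 and modular since.
    return static_cast<std::intptr_t>(ib - ia);
}
```

Note that this only yields a number. Adding that number back onto a pointer to reach an element of a different array is pointer arithmetic again, so it is not covered by this sketch.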
Plus, while the runtime environment may in fact be "all objects are stored in a big array starting at approx. 0 and ending at approx. memory size," that is not true of the compile-time behavior. At compile-time, you can get pointers to objects and do pointer arithmetic on them. But treating such pointers as just addresses into memory could allow a user to start indexing into compiler data and such. By making such things UB, it makes it expressly forbidden at compile-time (and reinterpret_cast is explicitly disallowed at compile-time).

One big reason for saying that things are UB is to allow the compiler to perform optimizations. If you want to allow such a thing, then you remove some optimizations. And as you say, this is only (if even then) useful in some small corner cases. I would say that in most cases where this might seem like a viable option, you should instead reconsider your design.
From comments below:
I agree, but the problem is that while I can reconsider my design, I can't reconsider the design of other libraries.
It is very rare that the standard adapts to such things. It has happened, however. That's the reason why int *p = 0 is perfectly valid, even though p is a pointer and 0 is an int. This made it into the standard because it was so commonly used instead of the more correct int *p = NULL. But in general, this does not happen, and for good reasons.

First, I feel like we need to get some terms straight, at least with respect to C.
From the C2011 online draft:
Undefined behavior - behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements. Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
Unspecified behavior - use of an unspecified value, or other behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance. An example of unspecified behavior is the order in which the arguments to a function are evaluated.
Implementation-defined behavior - unspecified behavior where each implementation documents how the choice is made. An example of implementation-defined behavior is the propagation of the high-order bit when a signed integer is shifted right.
The key point above is that unspecified behavior means that the language definition provides multiple values or behaviors from which the implementation may choose, and there are no further requirements on how that choice is made. Unspecified behavior becomes implementation-defined behavior when the implementation documents how it makes that choice.
This means that there are restrictions on what may be considered implementation-defined behavior.
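As a small illustration of the unspecified-behavior example from the quoted definitions (order of evaluation of function arguments), here is a sketch in C++. The names trace, add and argument_order are invented for this example.

```cpp
#include <string>

// Hypothetical helper: each call appends its tag to `log`.
int trace(std::string& log, char tag) {
    log.push_back(tag);
    return 0;
}

int add(int x, int y) { return x + y; }

// Returns "ab" or "ba": the implementation may evaluate the two arguments in
// either order and is not required to document which one it picks.
std::string argument_order() {
    std::string log;
    add(trace(log, 'a'), trace(log, 'b'));
    return log;
}
```

Both results are allowed, and a program that needs one particular order has to sequence the calls itself, for example by storing each result in a local variable first.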
The other key point is that undefined does not mean illegal, it only means unpredictable. It means you've voided the warranty, and anything that happens afterwards is not the responsibility of the compiler implementation. One possible outcome of undefined behavior is to work exactly as expected with no nasty side effects. Which, frankly, is the worst possible outcome, because it means as soon as something in the code or environment changes, everything could blow up and you have no idea why (been in that movie a few times).
Now to the question at hand:
I also know that on some architectures ("segmented machine" as I read somewhere), there are good reasons that the behavior is undefined.
And that's why it's undefined everywhere. There are some architectures still in use where different objects can be stored in different memory segments, and any differences in their addresses would be meaningless. There are just so many different memory models and addressing schemes that you cannot hope to define a behavior that works consistently for all of them (or the definition would be so complicated that it would be difficult to implement).
The philosophy behind C is to be maximally portable to as many architectures as possible, and to do that it imposes as few requirements on the implementation as possible. This is why the standard arithmetic types (int, float, etc.) are defined by the minimum range of values that they can represent with a minimum precision, not by the number of bits they take up. It's why pointers to different types may have different sizes and alignments.
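The "minimum range, not number of bits" point can be written down as compile-time checks. This is only a sketch; int_storage_bits is a made-up helper name, and the limits used are the minimums the standards guarantee rather than what any particular platform provides.

```cpp
#include <climits>
#include <limits>

// These hold on every conforming implementation, because the standard only
// fixes minimum ranges, not exact widths.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
static_assert(std::numeric_limits<int>::max() >= 32767,
              "int covers at least the 16-bit range");
static_assert(std::numeric_limits<long>::max() >= 2147483647L,
              "long covers at least the 32-bit range");

// Hypothetical helper: number of bits in the storage of an int
// (object representation, so any padding bits are counted too).
constexpr int int_storage_bits() {
    return static_cast<int>(sizeof(int) * CHAR_BIT);
}
```

Code that needs an exact width would typically use the fixed-width types from <cstdint> where they exist, instead of assuming that int is 32 bits.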
Adding language that would make some behaviors undefined on this list of architectures vs. unspecified on that list of architectures would be a headache, both for the standards committee and various compiler implementors. It would mean adding a lot of special-case logic to compilers like gcc, which could make it less reliable as a compiler.

Related

C++: Empty string instead of input string Incase of having longer input string than char array allocated for the input [duplicate]

The classic apocryphal example of "undefined behavior" is, of course, "nasal demons" — a physical impossibility, regardless of what the C and C++ standards permit.
Because the C and C++ communities tend to put such an emphasis on the unpredictability of undefined behavior and the idea that the compiler is allowed to cause the program to do literally anything when undefined behavior is encountered, I had assumed that the standard puts no restrictions whatsoever on the behavior of, well, undefined behavior.
But the relevant quote in the C++ standard seems to be:
[C++14: defns.undefined]: [..] Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). [..]
This actually specifies a small set of possible options:
Ignoring the situation -- Yes, the standard goes on to say that this will have "unpredictable results", but that's not the same as the compiler inserting code (which I assume would be a prerequisite for, you know, nasal demons).
Behaving in a documented manner characteristic of the environment -- this actually sounds relatively benign. (I certainly haven't heard of any documented cases of nasal demons.)
Terminating translation or execution -- with a diagnostic, no less. Would that all UB would behave so nicely.
I assume that in most cases, compilers choose to ignore the undefined behavior; for example, when reading uninitialized memory, it would presumably be an anti-optimization to insert any code to ensure consistent behavior. I suppose that the stranger types of undefined behavior (such as "time travel") would fall under the second category--but this requires that such behaviors be documented and "characteristic of the environment" (so I guess nasal demons are only produced by infernal computers?).
Am I misunderstanding the definition? Are these intended as mere examples of what could constitute undefined behavior, rather than a comprehensive list of options? Is the claim that "anything can happen" meant merely as an unexpected side-effect of ignoring the situation?
Two minor points of clarification:
I thought it was clear from the original question, and I think to most people it was, but I'll spell it out anyway: I do realize that "nasal demons" is tongue-in-cheek.
Please do not write an(other) answer explaining that UB allows for platform-specific compiler optimizations, unless you also explain how it allows for optimizations that implementation-defined behavior wouldn't allow.
This question was not intended as a forum for discussion about the (de)merits of undefined behavior, but that's sort of what it became. In any case, this thread about a hypothetical C-compiler with no undefined behavior may be of additional interest to those who think this is an important topic.
Yes, it permits anything to happen. The note is just giving examples. The definition is pretty clear:
Undefined behavior: behavior for which this International Standard imposes no requirements.
Frequent point of confusion:
You should understand that "no requirement" also means the implementation is NOT required to leave the behavior undefined or do something bizarre/nondeterministic!
The implementation is perfectly allowed by the C++ standard to document some sane behavior and behave accordingly.[1] So, if your compiler claims to wrap around on signed overflow, logic (sanity?) would dictate that you're welcome to rely on that behavior on that compiler. Just don't expect another compiler to behave the same way if it doesn't claim to.
[1] Heck, it's even allowed to document one thing and do another. That'd be stupid, and it'd probably make you toss it into the trash (why would you trust a compiler whose documentation lies to you?), but it's not against the C++ standard.
One of the historical purposes of Undefined Behavior was to allow for the possibility that certain actions may have different potentially-useful effects on different platforms. For example, in the early days of C, given
int i=INT_MAX;
i++;
printf("%d",i);
some compilers could guarantee that the code would print some particular value (for a two's-complement machine it would typically be INT_MIN), while others would guarantee that the program would terminate without reaching the printf. Depending upon the application requirements, either behavior could be useful. Leaving the behavior undefined meant that an application where abnormal program termination was an acceptable consequence of overflow but producing seemingly-valid-but-wrong output would not be, could forgo overflow checking if run on a platform which would reliably trap it, and an application where abnormal termination in case of overflow would not be acceptable, but producing arithmetically-incorrect output would be, could forgo overflow checking if run on a platform where overflows weren't trapped.
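If the program cannot rely on either platform behavior, the overflow has to be ruled out before the increment. A minimal sketch follows; checked_increment is a hypothetical name, not something from the answer.

```cpp
#include <limits>

// Hypothetical helper: increments `value` only when the result is
// representable, so the signed overflow never happens.
// Returns false and leaves `result` untouched otherwise.
bool checked_increment(int value, int& result) {
    if (value == std::numeric_limits<int>::max()) {
        return false;  // value + 1 would overflow
    }
    result = value + 1;
    return true;
}
```

The check costs one comparison, which is exactly the overhead that the platform-specific approaches described above were meant to avoid.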
Recently, however, some compiler authors seem to have gotten into a contest to see who can most efficiently eliminate any code whose existence would not be mandated by the standard. Given, for example...
#include <stdio.h>
int main(void)
{
    int ch = getchar();
    if (ch < 74)
        printf("Hey there!");
    else
        printf("%d", ch*ch*ch*ch*ch);
}
a hyper-modern compiler may conclude that if ch is 74 or greater, the computation of ch*ch*ch*ch*ch would yield Undefined Behavior, and as a consequence the program should print "Hey there!" unconditionally regardless of what character was typed.
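For reference, 73^5 = 2,073,071,593 still fits in a 32-bit int, while 74^5 = 2,219,006,624 does not, which is where the threshold comes from (assuming a 32-bit int). One way to keep the computation defined is to do it in a wider type. This is a sketch with an invented helper name, assuming the input is a getchar()-style value with 8-bit bytes:

```cpp
// Hypothetical helper: computes ch^5 in long long, which is guaranteed to be
// at least 64 bits wide. For inputs up to 255 the result (at most 255^5,
// about 1.08e12) cannot overflow.
long long fifth_power(int ch) {
    const long long c = ch;
    return c * c * c * c * c;
}
```

Printing the result then needs the "%lld" format instead of "%d".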
Nitpicking: You have not quoted a standard.
These are the sources used to generate drafts of the C++ standard. These sources should not be considered an ISO publication, nor should documents generated from them unless officially adopted by the C++ working group (ISO/IEC JTC1/SC22/WG21).
Interpretation: Notes are not normative according to the ISO/IEC Directives Part 2.
Notes and examples integrated in the text of a document shall only be used for giving additional information intended to assist the understanding or use of the document. They shall not contain requirements ("shall"; see 3.3.1 and Table H.1) or any information considered indispensable for the use of the document e.g. instructions (imperative; see Table H.1), recommendations ("should"; see 3.3.2 and Table H.2) or permission ("may"; see Table H.3). Notes may be written as a statement of fact.
Emphasis mine. This alone rules out "comprehensive list of options". Giving examples however does count as "additional information intended to assist the understanding .. of the document".
Do keep in mind that the "nasal demon" meme is not meant to be taken literally, just as using a balloon to explain how universe expansion works holds no truth in physical reality. It's to illustrate that it's foolhardy to discuss what "undefined behavior" should do when it's permissible to do anything. Yes, this means that there isn't an actual rubber band in outer space.
The definition of undefined behaviour, in every C and C++ standard, is essentially that the standard imposes no requirements on what happens.
Yes, that means any outcome is permitted. But there are no particular outcomes that are required to happen, nor any outcomes that are required to NOT happen. It does not matter if you have a compiler and library that consistently yield a particular behaviour in response to a particular instance of undefined behaviour - such a behaviour is not required, and may change even in a future bugfix release of your compiler - and the compiler will still be perfectly correct according to each version of the C and C++ standards.
If your host system has hardware support in the form of connection to probes that are inserted in your nostrils, it is within the realms of possibility that an occurrence of undefined behaviour will cause undesired nasal effects.
I thought I'd answer just one of your points, since the other answers answer the general question quite well, but have left this unaddressed.
"Ignoring the situation -- Yes, the standard goes on to say that this will have "unpredictable results", but that's not the same as the compiler inserting code (which I assume would be a prerequisite for, you know, nasal demons)."
A situation in which nasal demons could very reasonably be expected to occur with a sensible compiler, without the compiler inserting ANY code, would be the following:
if (!spawn_of_satan) {
    printf("Random debug value: %i\n", *x); // oops, null pointer dereference
    nasal_angels();
} else {
    nasal_demons();
}
A compiler, if it can prove that *x is a null pointer dereference, is perfectly entitled, as part of some optimisation, to say "OK, so I see that they've dereferenced a null pointer in this branch of the if. Therefore, as part of that branch I'm allowed to do anything. So I can therefore optimise to this:"
if(!spawn_of_satan)
nasal_demons();
else
nasal_demons();
"And from there, I can optimise to this:"
nasal_demons();
You can see how this sort of thing can in the right circumstances prove very useful for an optimising compiler, and yet cause disaster. I did see some examples a while back of cases where actually it IS important for optimisation to be able to optimise this sort of case. I might try to dig them out later when I have more time.
EDIT: One example that just came from the depths of my memory of such a case where it's useful for optimisation is where you very frequently check a pointer for being NULL (perhaps in inlined helper functions), even after having already dereferenced it and without having changed it. The optimising compiler can see that you've dereferenced it and so optimise out all the "is NULL" checks, since if you've dereferenced it and it IS null, anything is allowed to happen, including just not running the "is NULL" checks. I believe that similar arguments apply to other undefined behaviour.
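The practical consequence for the programmer is to put the null test before the first dereference. A minimal sketch, where read_or is a made-up helper:

```cpp
// Hypothetical helper: the null test comes before any dereference, so there
// is no earlier `*p` from which the compiler could infer that p is non-null
// and drop the check.
int read_or(const int* p, int fallback) {
    if (p == nullptr) {
        return fallback;
    }
    return *p;
}
```

Writing `int v = *p;` first and testing p afterwards would give the compiler exactly the licence described above to remove the test.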
First, it is important to note that it is not only the behaviour of the user program that is undefined, it is the behaviour of the compiler that is undefined. Similarly, UB is not encountered at runtime, it is a property of the source code.
To a compiler writer, "the behaviour is undefined" means, "you do not have to take this situation into account", or even "you can assume no source code will ever produce this situation".
A compiler can do anything, intentionally or unintentionally, when presented with UB, and still be standard compliant, so yes, if you granted access to your nose...
Then, it is not always possible to know if a program has UB or not.
Example:
int * ptr = calculateAddress();
int i = *ptr;
Knowing if this can ever be UB or not would require knowing all possible values returned by calculateAddress(), which is impossible in the general case (See "Halting Problem"). A compiler has two choices:
assume ptr will always have a valid address
insert runtime checks to guarantee a certain behaviour
The first option produces fast programs, and puts the burden of avoiding undesired effects on the programmer, while the second option produces safer but slower code.
The C and C++ standards leave this choice open, and most compilers choose the first, while Java for example mandates the second.
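C++ actually offers both choices side by side in its standard library, which may make the trade-off easier to see: std::vector::operator[] assumes the index is valid, while std::vector::at performs a runtime check and throws std::out_of_range. A small sketch (index_is_rejected is an invented name):

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reports whether the checked accessor rejects index i.
// v[i] with the same out-of-range index would be undefined behavior instead.
bool index_is_rejected(const std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);  // runtime bounds check
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

The checked call pays for a comparison on every access, which is the "safer but slower" side of the choice.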
Why is the behaviour not implementation-defined, but undefined?
Implementation-defined means (N4296, 1.9§2):
Certain aspects and operations of the abstract machine are described in this International Standard as implementation-defined (for example, sizeof(int)). These constitute the parameters of the abstract machine. Each implementation shall include documentation describing its characteristics and behavior in these respects. Such documentation shall define the instance of the abstract machine that corresponds to that implementation (referred to as the “corresponding instance” below).
Emphasis mine. In other words: A compiler-writer has to document exactly how the machine-code behaves, when the source code uses implementation-defined features.
Writing to a random non-null invalid pointer is one of the most unpredictable things you can do in a program, so this would require performance-reducing runtime-checks too.
Before we had MMUs, you could destroy hardware by writing to the wrong address, which comes very close to nasal demons ;-)
Undefined behavior is simply the result of a situation coming up that the writers of the specification did not foresee.
Take the idea of a traffic light. Red means stop, yellow means prepare for red, and green means go. In this example people driving cars are the implementation of the spec.
What happens if both green and red are on? Do you stop, then go? Do you wait until red turns off and it's just green? This is a case that the spec did not describe, and as a result, anything the drivers do is undefined behavior. Some people will do one thing, some another. Since there is no guarantee about what will happen you want to avoid this situation. The same applies to code.
One of the reasons for leaving behavior undefined is to allow the compiler to make whatever assumptions it wants when optimizing.
If there exists some condition that must hold if an optimization is to be applied, and that condition is dependent on undefined behavior in the code, then the compiler may assume that it's met, since a conforming program can't depend on undefined behavior in any way. Importantly, the compiler does not need to be consistent in these assumptions (which is not the case for implementation-defined behavior).
So suppose your code contains an admittedly contrived example like the one below:
int bar = 0;
int foo = (undefined behavior of some kind);
if (foo) {
    f();
    bar = 1;
}
if (!foo) {
    g();
    bar = 1;
}
assert(1 == bar);
The compiler is free to assume that !foo is true in the first block and foo is true in the second, and thus optimize the entire chunk of code away. Now, logically either foo or !foo must be true, and so looking at the code, you would reasonably be able to assume that bar must equal 1 once you've run the code. But because the compiler optimized in that manner, bar never gets set to 1. And now that assertion becomes false and the program terminates, which is behavior that would not have happened if foo hadn't relied on undefined behavior.
Now, is it possible for the compiler to actually insert completely new code if it sees undefined behavior? If doing so will allow it to optimize more, absolutely. Is it likely to happen often? Probably not, but you can never guarantee it, so operating on the assumption that nasal demons are possible is the only safe approach.
Undefined behaviors allow compilers to generate faster code in some cases. Consider two different processor architectures that ADD differently:
Processor A inherently discards the carry bit upon overflow, while processor B generates an error. (Of course, Processor C inherently generates Nasal Demons - it's just the easiest way to discharge that extra bit of energy in a snot-powered nanobot...)
If the standard required that an error be generated, then all code compiled for processor A would basically be forced to include additional instructions to check for overflow and, if it occurred, generate an error. This would result in slower code, even if the developer knows that they were only going to end up adding small numbers.
Undefined behavior sacrifices portability for speed. By allowing 'anything' to happen, the compiler can avoid writing safety-checks for situations that will never occur. (Or, you know... they might.)
Additionally, when a programmer knows exactly what an undefined behavior will actually cause in their given environment, they are free to exploit that knowledge to gain additional performance.
If you want to ensure that your code behaves exactly the same on all platforms, you need to ensure that no 'undefined behavior' ever occurs - however, this may not be your goal.
Edit: (In response to the OP's edit)
Implementation Defined behavior would require the consistent generation of nasal demons. Undefined behavior allows the sporadic generation of nasal demons.
That's where the advantage that undefined behavior has over implementation specific behavior appears. Consider that extra code may be needed to avoid inconsistent behavior on a particular system. In these cases, undefined behavior allows greater speed.

I thought I'd answer just one of your points, since the other answers answer the general question quite well, but have left this unaddressed.
"Ignoring the situation -- Yes, the standard goes on to say that this will have "unpredictable results", but that's not the same as the compiler inserting code (which I assume would be a prerequisite for, you know, nasal demons)."
A situation in which nasal demons could very reasonably be expected to occur with a sensible compiler, without the compiler inserting ANY code, would be the following:
if (!spawn_of_satan) {
    printf("Random debug value: %i\n", *x); // oops, null pointer dereference
    nasal_angels();
} else {
    nasal_demons();
}
A compiler, if it can prove that *x is a null pointer dereference, is perfectly entitled, as part of some optimisation, to say "OK, so I see that they've dereferenced a null pointer in this branch of the if. Therefore, as part of that branch I'm allowed to do anything. So I can therefore optimise to this:"
if (!spawn_of_satan)
    nasal_demons();
else
    nasal_demons();
"And from there, I can optimise to this:"
nasal_demons();
You can see how this sort of thing can in the right circumstances prove very useful for an optimising compiler, and yet cause disaster. I did see some examples a while back of cases where actually it IS important for optimisation to be able to optimise this sort of case. I might try to dig them out later when I have more time.
EDIT: One example that just came from the depths of my memory of such a case where it's useful for optimisation is where you very frequently check a pointer for being NULL (perhaps in inlined helper functions), even after having already dereferenced it and without having changed it. The optimising compiler can see that you've dereferenced it and so optimise out all the "is NULL" checks, since if you've dereferenced it and it IS null, anything is allowed to happen, including just not running the "is NULL" checks. I believe that similar arguments apply to other undefined behaviour.
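The practical consequence is that a null check only protects you if it runs before the dereference. A minimal sketch of the safe ordering (read_or_default is a made-up name), with the problematic ordering kept as a comment because calling it with a null pointer would itself be UB:

```cpp
// Hypothetical helper illustrating check-before-use ordering.
int read_or_default(const int* p, int fallback) {
    if (p == nullptr) return fallback;  // check first: nothing has been assumed about p yet
    return *p;                          // reached only with a non-null pointer
}

// Contrast, not compiled here:
//   int bad(const int* p) {
//       int v = *p;            // dereference first...
//       if (!p) return -1;     // ...so the compiler may treat this check as dead code
//       return v;
//   }
```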
First, it is important to note that it is not only the behaviour of the user program that is undefined, it is the behaviour of the compiler that is undefined. Similarly, UB is not encountered at runtime, it is a property of the source code.
To a compiler writer, "the behaviour is undefined" means, "you do not have to take this situation into account", or even "you can assume no source code will ever produce this situation".
A compiler can do anything, intentionally or unintentionally, when presented with UB, and still be standard compliant, so yes, if you granted access to your nose...
Then, it is not always possible to know if a program has UB or not.
Example:
int * ptr = calculateAddress();
int i = *ptr;
Knowing if this can ever be UB or not would require knowing all possible values returned by calculateAddress(), which is impossible in the general case (See "Halting Problem"). A compiler has two choices:
1. assume ptr will always have a valid address
2. insert runtime checks to guarantee a certain behaviour
The first option produces fast programs, and puts the burden of avoiding undesired effects on the programmer, while the second option produces safer but slower code.
The C and C++ standards leave this choice open, and most compilers choose the first, while Java for example mandates the second.
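C++ does let you opt into the second choice where you want it. As an illustration with array indexing instead of raw pointers: std::vector::operator[] assumes the index is valid (an out-of-range index is UB), while std::vector::at performs a runtime check and throws std::out_of_range. The wrapper name get_or below is hypothetical:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns v[i] when i is in range, otherwise `fallback`.
// The bounds check inside at() is the "runtime check" cost being paid for
// defined behaviour.
int get_or(const std::vector<int>& v, std::size_t i, int fallback) {
    try {
        return v.at(i);  // checked access: throws on a bad index
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```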
Why is the behaviour not implementation-defined, but undefined?
Implementation-defined means (N4296, 1.9§2):
Certain aspects and operations of the abstract machine are described in this International Standard as implementation-defined (for example, sizeof(int)). These constitute the parameters of the abstract machine. Each implementation shall include documentation describing its characteristics and behavior in these respects. Such documentation shall define the instance of the abstract machine that corresponds to that implementation (referred to as the “corresponding instance” below).
Emphasis mine. In other words: A compiler-writer has to document exactly how the machine-code behaves, when the source code uses implementation-defined features.
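Because implementation-defined parameters have to be fixed and documented, a program can query them and even refuse to compile when they don't match its expectations. A small sketch (the constant names int_bits and char_is_signed are mine):

```cpp
#include <climits>
#include <limits>

// Width of int in bits on this implementation (implementation-defined).
constexpr int int_bits = static_cast<int>(sizeof(int)) * CHAR_BIT;

// The standard only guarantees minimums: int can hold at least -32767..32767.
static_assert(int_bits >= 16, "int must have at least 16 bits");
static_assert(std::numeric_limits<int>::max() >= 32767, "INT_MAX too small");

// Whether plain char is signed is also implementation-defined.
constexpr bool char_is_signed = std::numeric_limits<char>::is_signed;
```

Nothing comparable exists for undefined behaviour: there is no value to query, because the implementation has promised nothing.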
Writing to a random non-null invalid pointer is one of the most unpredictable things you can do in a program, so this would require performance-reducing runtime-checks too.
Before we had MMUs, you could destroy hardware by writing to the wrong address, which comes very close to nasal demons ;-)
Undefined behavior is simply the result of a situation coming up that the writers of the specification did not foresee.
Take the idea of a traffic light. Red means stop, yellow means prepare for red, and green means go. In this example people driving cars are the implementation of the spec.
What happens if both green and red are on? Do you stop, then go? Do you wait until red turns off and it's just green? This is a case that the spec did not describe, and as a result, anything the drivers do is undefined behavior. Some people will do one thing, some another. Since there is no guarantee about what will happen you want to avoid this situation. The same applies to code.
One of the reasons for leaving behavior undefined is to allow the compiler to make whatever assumptions it wants when optimizing.
If there exists some condition that must hold if an optimization is to be applied, and that condition is dependent on undefined behavior in the code, then the compiler may assume that it's met, since a conforming program can't depend on undefined behavior in any way. Importantly, the compiler does not need to be consistent in these assumptions. (which is not the case for implementation-defined behavior)
So suppose your code contains an admittedly contrived example like the one below:
int bar = 0;
int foo = (undefined behavior of some kind);
if (foo) {
    f();
    bar = 1;
}
if (!foo) {
    g();
    bar = 1;
}
assert(1 == bar);
The compiler is free to assume that !foo is true in the first block and foo is true in the second, and thus optimize the entire chunk of code away. Now, logically either foo or !foo must be true, and so looking at the code, you would reasonably be able to assume that bar must equal 1 once you've run the code. But because the compiler optimized in that manner, bar never gets set to 1. And now that assertion becomes false and the program terminates, which is behavior that would not have happened if foo hadn't relied on undefined behavior.
Now, is it possible for the compiler to actually insert completely new code if it sees undefined behavior? If doing so will allow it to optimize more, absolutely. Is it likely to happen often? Probably not, but you can never guarantee it, so operating on the assumption that nasal demons are possible is the only safe approach.
Undefined behaviors allow compilers to generate faster code in some cases. Consider two different processor architectures that ADD differently:
Processor A inherently discards the carry bit upon overflow, while processor B generates an error. (Of course, Processor C inherently generates Nasal Demons - it's just the easiest way to discharge that extra bit of energy in a snot-powered nanobot...)
If the standard required that an error be generated, then all code compiled for processor A would basically be forced to include additional instructions, to perform some sort of check for overflow, and if so, generate an error. This would result in slower code, even if the developer knew that they were only going to end up adding small numbers.
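For comparison, unsigned arithmetic in C and C++ is fully defined to behave like processor A: results wrap modulo 2^N. The sketch below (function names are made up) shows both the wrap and the extra comparison that a "generate an error" rule would cost on such hardware:

```cpp
#include <climits>

// Defined behaviour: unsigned addition wraps modulo 2^N, like processor A.
unsigned wrapping_add(unsigned a, unsigned b) {
    return a + b;
}

// The extra work a mandatory error would require: after a wrap, the sum is
// smaller than either operand.
bool add_would_carry(unsigned a, unsigned b) {
    return a + b < a;
}
```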
Undefined behavior sacrifices portability for speed. By allowing 'anything' to happen, the compiler can avoid writing safety-checks for situations that will never occur. (Or, you know... they might.)
Additionally, when a programmer knows exactly what an undefined behavior will actually cause in their given environment, they are free to exploit that knowledge to gain additional performance.
If you want to ensure that your code behaves exactly the same on all platforms, you need to ensure that no 'undefined behavior' ever occurs - however, this may not be your goal.
Edit: (In response to OP's edit)
Implementation Defined behavior would require the consistent generation of nasal demons. Undefined behavior allows the sporadic generation of nasal demons.
That's where the advantage that undefined behavior has over implementation specific behavior appears. Consider that extra code may be needed to avoid inconsistent behavior on a particular system. In these cases, undefined behavior allows greater speed.

Recursion ( No return still works) [duplicate]

The classic apocryphal example of "undefined behavior" is, of course, "nasal demons" — a physical impossibility, regardless of what the C and C++ standards permit.
Because the C and C++ communities tend to put such an emphasis on the unpredictability of undefined behavior and the idea that the compiler is allowed to cause the program to do literally anything when undefined behavior is encountered, I had assumed that the standard puts no restrictions whatsoever on the behavior of, well, undefined behavior.
But the relevant quote in the C++ standard seems to be:
[C++14: defns.undefined]: [..] Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). [..]
This actually specifies a small set of possible options:
Ignoring the situation -- Yes, the standard goes on to say that this will have "unpredictable results", but that's not the same as the compiler inserting code (which I assume would be a prerequisite for, you know, nasal demons).
Behaving in a documented manner characteristic of the environment -- this actually sounds relatively benign. (I certainly haven't heard of any documented cases of nasal demons.)
Terminating translation or execution -- with a diagnostic, no less. Would that all UB would behave so nicely.
I assume that in most cases, compilers choose to ignore the undefined behavior; for example, when reading uninitialized memory, it would presumably be an anti-optimization to insert any code to ensure consistent behavior. I suppose that the stranger types of undefined behavior (such as "time travel") would fall under the second category--but this requires that such behaviors be documented and "characteristic of the environment" (so I guess nasal demons are only produced by infernal computers?).
Am I misunderstanding the definition? Are these intended as mere examples of what could constitute undefined behavior, rather than a comprehensive list of options? Is the claim that "anything can happen" meant merely as an unexpected side-effect of ignoring the situation?
Two minor points of clarification:
I thought it was clear from the original question, and I think to most people it was, but I'll spell it out anyway: I do realize that "nasal demons" is tongue-in-cheek.
Please do not write an(other) answer explaining that UB allows for platform-specific compiler optimizations, unless you also explain how it allows for optimizations that implementation-defined behavior wouldn't allow.
This question was not intended as a forum for discussion about the (de)merits of undefined behavior, but that's sort of what it became. In any case, this thread about a hypothetical C-compiler with no undefined behavior may be of additional interest to those who think this is an important topic.
Yes, it permits anything to happen. The note is just giving examples. The definition is pretty clear:
Undefined behavior: behavior for which this International Standard imposes no requirements.
Frequent point of confusion:
You should understand that "no requirement" also means the implementation is NOT required to leave the behavior undefined or do something bizarre/nondeterministic!
The implementation is perfectly allowed by the C++ standard to document some sane behavior and behave accordingly.[1] So, if your compiler claims to wrap around on signed overflow, logic (sanity?) would dictate that you're welcome to rely on that behavior on that compiler. Just don't expect another compiler to behave the same way if it doesn't claim to.
[1] Heck, it's even allowed to document one thing and do another. That'd be stupid, and it'd probably make you toss it into the trash (why would you trust a compiler whose documentation lies to you?), but it's not against the C++ standard.

Is it legal for the compiler to degrade the time complexity of a program? Is this considered observable behavior?

(Note: This is intended to be a language-lawyer question; I'm not referring to any particular existing compilers.)
When, if ever, is the compiler allowed to degrade the time complexity of a program?
Under what circumstances (if any) is this considered "observable behavior", and why?
(For example, can the compiler legally "reduce" a polynomial-time program to an exponential-time one?)
If the answer differs in C and C++, or in different versions of either, then please explain the differences.
The C standard doesn't actually have a time complexity model, neither for its primitive operations, nor its library functions, so compilers are allowed to do pretty much anything that preserves program semantics (observable behavior).
The C++ standard gives complexity guarantees only for some of its library functions, and says (17.5.1.4 [structure.specifications]):
Complexity requirements specified in the library clauses are upper bounds, and implementations that provide better complexity guarantees satisfy the requirements.
A compiler had better preserve these bounds (and since many of the functions are templated/may be inlined, the compiler is involved), but the bounds are in terms of the number of elements in containers and restrict the number of calls to comparison operators and the like. Otherwise, the compiler is again free to do as it pleases.
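To make "bounds in terms of the number of calls to comparison operators" concrete, here is a sketch that counts the comparisons std::lower_bound makes, which the standard limits to about log2(N) plus a constant. The helper name is my own, and note that the count says nothing about wall-clock time:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns how many times std::lower_bound invoked the
// comparison while searching `sorted` (which must be sorted ascending).
std::size_t count_lower_bound_comparisons(const std::vector<int>& sorted, int key) {
    std::size_t comparisons = 0;
    auto counting_less = [&comparisons](int a, int b) {
        ++comparisons;  // this is the quantity the standard actually bounds
        return a < b;
    };
    (void)std::lower_bound(sorted.begin(), sorted.end(), key, counting_less);
    return comparisons;
}
```

For a sorted vector of 1024 elements I would expect a result of around 10 or 11, though the exact constant is up to the library implementation.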
Performance of the code is not considered observable behavior and could potentially be modified by the compiler in either direction. In practical terms, for quality of implementation (QoI) reasons compilers don't degrade your programs, but there are cases where QoI is not performance.
A compiler, given the appropriate flags, could add instrumentation to the program it is building for debugging purposes (this is often the case in library implementations, for example with checked iterators).
Note that the simple answer to when the compiler would degrade your program is twofold: when the client asks for it, or when the implementor doesn't want to have users for the compiler.
5.1.2.3 in the C standard says
The semantic descriptions in this International Standard describe the behavior of an abstract machine in which issues of optimization are irrelevant.
The C++ standard has similar wording in 1.9 [intro.execution]
Both standards have the same definition of observable behaviour:
The least requirements on a conforming implementation are:
- Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.
- At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.
- The input and output dynamics of interactive devices shall take place as specified in 7.21.3. The intent of these requirements is that unbuffered or line-buffered output appear as soon as possible, to ensure that prompting messages actually appear prior to a program waiting for input.
This is the observable behavior of the program.
So anything else, e.g. performance of a for loop, or the number of reads/writes done for non-volatile variables, is not considered observable and so there are no corresponding performance requirements on the compiler.
If the compiler wanted to re-evaluate a block of code 100 times (assuming it had no observable side-effects, only altering the state of non-volatile variables) and check that the same results were obtained every time (and not affected by cosmic rays or faulty hardware) that would be allowed by the standard.
Others have pointed out that the standard doesn't constrain how the C runtime works, only its observable behaviour. There is no reason why you can't have interpreted or JIT-compiled C, for example.
Consider a C implementation where all memory cells are stored in a linked list on some underlying system. Pointers are then an index into this linked list. All pointer operations would function as normal, except the runtime would have to iterate over the linked list on every memory access. All sorts of common algorithms would suddenly gain an extra factor of N in their complexity, for example the common null-terminated string operations.
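A toy model of that thought experiment, purely as a sketch: the class ListMemory and its step counter are invented for illustration and do not describe any real implementation. Every access walks the list from the head, so reading address k costs k steps even though the values read and written are exactly what a flat memory would give:

```cpp
#include <cstddef>
#include <list>

// Toy "memory" stored as a linked list of cells, all initially zero.
// Precondition for store/load: address < size passed to the constructor.
class ListMemory {
public:
    explicit ListMemory(std::size_t size) : cells_(size, 0) {}

    void store(std::size_t address, int value) { *walk(address) = value; }
    int load(std::size_t address) { return *walk(address); }

    // Total number of list links followed so far (the hidden extra cost).
    std::size_t steps() const { return steps_; }

private:
    std::list<int>::iterator walk(std::size_t address) {
        std::list<int>::iterator it = cells_.begin();
        for (std::size_t i = 0; i < address; ++i) {
            ++it;
            ++steps_;
        }
        return it;
    }

    std::list<int> cells_;
    std::size_t steps_ = 0;
};
```

A strlen-style loop over such memory would do 0 + 1 + ... + (N-1) steps, turning a linear scan into a quadratic one without changing any observable result.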

What exactly do "IB" and "UB" mean?

I've seen the terms "IB" and "UB" used several times, particularly in the context of C++. I've tried googling them, but apparently those two-letter combinations see a lot of use. :P
So, I ask you...what do they mean, when they're said as if they're a bad thing?
IB: Implementation-defined Behaviour. The standard leaves it up to the particular compiler/platform to define the precise behaviour, but requires that it be defined.
Using implementation-defined behaviour can be useful, but makes your code less portable.
UB: Undefined Behaviour. The standard does not specify how a program invoking undefined behaviour should behave. Also known as "nasal demons" because theoretically it could make demons fly out of your nose.
Using undefined behaviour is nearly always a bad idea. Even if it seems to work sometimes, any change to environment, compiler or platform can randomly break your code.
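As one illustration of avoiding UB (signed integer overflow is my choice of example here, not something from the answer above), the sketch below checks the operands before adding instead of adding first and testing afterwards. `checked_add` is a hypothetical helper and assumes C++17 for `std::optional`.

```cpp
#include <limits>
#include <optional>

// Hypothetical helper: returns a + b, or std::nullopt if the sum would
// overflow int. The checks only use values that are themselves in range,
// so the function never executes an overflowing operation.
std::optional<int> checked_add(int a, int b) {
    if (b > 0 && a > std::numeric_limits<int>::max() - b) {
        return std::nullopt;  // would exceed INT_MAX
    }
    if (b < 0 && a < std::numeric_limits<int>::min() - b) {
        return std::nullopt;  // would drop below INT_MIN
    }
    return a + b;
}
```

Writing `a + b < a` as the overflow test would itself rely on the overflow having happened, which is exactly the kind of code that "seems to work sometimes" and then breaks under a different compiler or optimization level.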
Implementation-defined behavior and Undefined behavior
The C++ standard is very specific about the effects of various constructs, and in particular you should always be aware of these categories of trouble:
Undefined behavior means that there are absolutely no guarantees given. The code could work, or it could set fire to your hard drive or make demons fly out your nose. As far as the C++ language is concerned, absolutely anything might happen. In practical terms, this generally means that you have an unrecoverable bug. If this happens, you can't really trust anything about your application (because one of the effects of this undefined behavior might just have been to mess up the memory used by the rest of your app). It's not required to be consistent, so running the program twice might give different results. It may depend on the phases of the moon, the color of the shirt you're wearing, or absolutely anything else.
Unspecified behavior means that the implementation must do something sane, choosing among the behaviors the standard allows, but it is not required to document which one it picks.
Implementation-defined behavior is similar to unspecified, but must also be documented by the compiler writers. An example of this is the result of a reinterpret_cast. Usually, it simply changes the type of a pointer, without modifying the address, but the mapping is actually implementation-defined, so a compiler could map to a completely different address, as long as it documented this choice. Another example is the size of an int. The C++ standard doesn't care if it is 2, 4 or 8 bytes, but it must be documented by the compiler.
But common for all of these is that they're best avoided. When possible, stick with behavior that is 100% specified by the C++ standard itself. That way, you're guaranteed portability.
You often have to rely on some implementation-defined behavior as well. It may be unavoidable, but you should still pay attention to it, and be aware that you're relying on something that may change between different compilers.
Undefined behavior, on the other hand, should always be avoided. In general, you should just assume that it makes your program explode in one way or another.
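For the two implementation-defined examples mentioned above (the size of int and the result of a reinterpret_cast), a small sketch of code that relies only on what is guaranteed could look like this. `int_bits` and `round_trips` are hypothetical names, and the sketch assumes the platform provides `std::uintptr_t`, which is optional in the standard but present on mainstream implementations.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// The width of int is implementation-defined: the value differs between
// platforms, but each implementation documents it and it is at least 16.
constexpr std::size_t int_bits() {
    return sizeof(int) * CHAR_BIT;
}

// The integer value a pointer maps to is implementation-defined, so the
// number itself should not be interpreted portably. What is guaranteed is
// that converting back from a large enough integer type yields the
// original pointer.
bool round_trips(int* p) {
    std::uintptr_t n = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<int*>(n) == p;
}
```

The code is fine on any conforming implementation that has `std::uintptr_t`; it is only the specific numbers (32 bits, a particular address value) that you should not bake into portable code.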
IB: implementation-defined behavior - the compiler must document what it does. Performing a >> operation on a negative value is an example (in C, and in C++ before C++20).
UB: undefined behavior - the compiler can do whatever it likes, including simply crashing or giving unpredictable results. Dereferencing a null pointer falls into this category, but also subtler things like pointer arithmetic that falls outside the bounds of an array object.
Another related term is 'unspecified behavior'. This is kind of between implementation-defined and undefined behavior. For unspecified behavior, the compiler must do something the standard allows, but which of the permitted choices it makes is up to the compiler and need not be documented (or even consistent). Things like the order of evaluation of sub-expressions fall in this category. The compiler can perform these in whatever order it likes, and could do it differently in different builds or even in different runs of the same build (unlikely, but permitted).
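A minimal sketch of the evaluation-order case: the two arguments below are evaluated in an order the standard leaves unspecified, so the log can come out either way. `trace`, `combine` and `run` are hypothetical names used only for this illustration.

```cpp
#include <string>

// Appends a tag to the log as a visible side effect and returns it.
std::string trace(std::string& log, char tag) {
    log += tag;
    return std::string(1, tag);
}

std::string combine(const std::string& a, const std::string& b) {
    return a + b;
}

// The two trace() calls are function arguments, and their relative order
// of evaluation is unspecified. The returned log is "ab" or "ba",
// depending on the compiler; both are conforming.
std::string run() {
    std::string log;
    combine(trace(log, 'a'), trace(log, 'b'));
    return log;
}
```

Neither result is a bug in the compiler. Code that needs a particular order should evaluate the calls in separate statements.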
The short version:
Implementation-defined behaviour (IB): Correctly programmed but indeterminate*
Undefined behaviour (UB): Incorrectly programmed (i.e. a bug!)
*) "indeterminate" as far as the language standard is concerned, it will of course be determinate on any fixed platform.
UB: Undefined Behavior
IB: Implementation-defined Behavior