"Is undefined" versus "undefined behavior" - C++

The cppreference wording on front and back is surprisingly (at least for me) asymmetric.
front:
Calling front on an empty container is undefined.
back:
Calling back on an empty container causes undefined behavior.
I know I am supposed to ask only one question, but still:
Why is it different? and
What is the difference between "is undefined" and "causes undefined behavior"?

In C++, there is no difference between "undefined" and "undefined behavior." Both terms refer to the same concept: a situation in which the standard does not specify the expected outcome of a certain operation.

What is the difference between "is undefined" and "causes undefined behavior"?
They have the same meaning here.
Why is it different?
Most likely because the page has been written by different authors and/or has not been updated for quite some time. Still, both are intended to mean the same thing.
Update
The page has now been updated to make the documentation language more consistent. In particular, now front says:
Calling front on an empty container causes undefined behavior.
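A minimal sketch of the situation both pages describe (std::vector is chosen here just for illustration):
#include <vector>

int main() {
    std::vector<int> v;    // empty container
    int x = v.front();     // undefined behavior: the container has no first element
    (void)x;
}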

Related

Function of .pop() on an empty stack

I'm working on some code for an extra credit project and came across something peculiar that I wanted to see if someone could explain, for a better understanding. I have a for loop that populates a std::stack, and then another that pops the stack the same number of times it was populated. I was wondering what would happen if I attempted to pop() when the stack itself is already empty.
If you are using std::stack with its default underlying container, std::deque, then calling pop() on an empty stack invokes undefined behavior.
"undefined behavior" means that your program's behavior can no longer be relied upon in any way.
Here's a documentation trail to follow. You are asking about the behavior of std::stack::pop() in a certain situation. So start with the documentation for that function.
std::stack<T,Container>::pop
Effectively calls c.pop_back()
It is not explicitly clear what is meant by c, but further down the page is a mention of Container::pop_back, so it is reasonable to infer that that is the next thing to look up. (Note that Container is the second template parameter.) You might have a difficulty here if you did not specify a second template parameter for your stack. In that case, back up to the documentation for std::stack to see what the default is.
std::stack
By default, if no container class is specified for a particular stack class instantiation, the standard container std::deque is used.
Aha! So we need to look up the pop_back() member of std::deque.
std::deque<T,Allocator>::pop_back
Calling pop_back on an empty container results in undefined behavior.
There's your answer: undefined behavior. Now you might be asking yourself: what is undefined behavior in C++? In brief, undefined behavior allows your program's behavior to be whatever is convenient for the compiler. Technically, it allows any behavior whatsoever, but in practice, compilers just do whatever is convenient.
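A minimal sketch of the scenario just traced (assuming the default std::deque container): the program compiles cleanly, but the second pop() runs into exactly this undefined behavior.
#include <stack>

int main() {
    std::stack<int> s;   // default container: std::deque<int>
    s.push(1);
    s.pop();             // fine: the stack had one element
    s.pop();             // undefined behavior: effectively calls pop_back() on an empty deque
}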

Can nested undefined behavior(s) cancel the (potential) hazard of the initial undefined behavior?

In the case of nested undefined behaviors:
Can one of the subsequent undefined behaviors cancel the (potential) hazard of the initial undefined behavior?
Can the combination of all the subsequent undefined behaviors cancel the (potential) hazard of the initial undefined behavior?
Are there examples from practice of the hazard of the initial undefined behavior being canceled (by coincidence) due to the presence of nested undefined behavior(s)?
You are thinking about undefined behaviour wrong. It's not "something that has happened that you can analyse". It's "your program no longer has any guarantees: it may seem to work now. It may not. It may work only if the moon is full. It may start working when you change compilers. It may fail to work on a different OS." No matter how many instances of undefined behaviour you have, they cannot cancel each other out. UB + UB is always UB.
Once you have UB anything can happen including it happening to work. This is true no matter how many instances of UB you had in the original source.
However, taking a compiled binary that was generated from a code base that had undefined behaviour and analysing what that binary does is possible, and you can easily find places where two wrongs cancel each other out. Probably the simplest example would be overwriting the end of an array onto another variable which you forgot to initialise. I've actually seen this one happen.
Note that if you do this, you still don't have any guarantees that it's "ok". If you upgrade compilers or perturb the system in any way (even if you just recompile), then next time the compiler may choose to lay things out differently, resulting in a new binary which doesn't work.
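A hedged sketch of the kind of coincidence described above (the names and memory layout are purely illustrative, and nothing here is guaranteed by the language): an out-of-bounds write happens to land on a variable that was never initialised, so reading it afterwards appears to give a sensible value in one particular binary.
int main() {
    int uninitialised;          // never initialised: reading it is UB
    int arr[4] = {0, 1, 2, 3};  // valid indices are 0 through 3
    arr[4] = 42;                // out-of-bounds write: UB; in some builds it may
                                // happen to land on 'uninitialised'
    return uninitialised;       // may "work" today and break after any recompile
}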
“Undefined” simply means that neither the compiler nor the execution environment are required to handle the situation in any particular way. In general, UB is not predictable or consistent; once your program invokes undefined behavior, you’ve voided the warranty and all bets are off.
UB is kind of like Murphy’s Law - you can’t make it work in your favor.
There may be specific instances of UB that cancel each other out, but you can’t guarantee that they will for any particular build. You’re better off fixing the code that caused the undefined behavior in the first place.
Undefined Behavior, as the C and C++ Standards use the term, means nothing more nor less than that the Standards impose no requirements on how a conforming implementation behaves when given input that would result in it attempting to perform a particular action. The fact that a particular input would cause a program to invoke UB means that, from a standards-compliance standpoint, nothing an implementation could do would render it non-conforming, no matter what else the program does.
As to whether the behavior is predictable, that depends upon whether an implementation opts to define the behavior. According to the authors of the C Standard, regarding undefined behavior: "It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior." Nearly all quality implementations can be configured to extend the semantics of the language in this fashion, though some may require completely disabling many optimizations in order to avoid having them break corner cases that a quality optimizer could have supported easily.
With regard to your particular question about multiple forms of UB that interact, implementations may specify that some particular action over which the Standard would impose no requirements will behave predictably when, and only when, it is preceded by some other action over which the Standard would impose no requirements.
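As one concrete illustration of such an extension (a property of specific toolchains, not of the Standard): GCC and Clang offer the -fwrapv flag, which defines signed integer overflow as two's-complement wrapping, turning an officially undefined operation into documented, implementation-provided behavior.
// Per the Standard, signed overflow here is undefined behavior;
// when built with -fwrapv (GCC/Clang), the result instead wraps predictably.
int wrapping_add(int a, int b) {
    return a + b;
}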

Is it possible to have UB by default?

So, I've been reading the C++ standard and came to [defns.undefined] (3.27 in the C++17 draft that I'm reading; note that while I'm citing C++17 here, I've found similar wording in other standards). That is the definition of Undefined Behavior. I noticed this wording (emphasis mine):
Note: Undefined behavior may be expected when this International Standard omits any explicit definition of
behavior or when a program uses an erroneous construct or erroneous data
Now, thinking about this, this sort of makes sense. It's sort of saying that if the Standard doesn't give a behavior for it, it has undefined behavior. It seems to be saying that if you do something that is out of scope of the Standard, the Standard has nothing to say about it. That makes sense.
However, this is also kind of weird, because I always thought Undefined Behavior had to be explicitly declared by the Standard. Yet, this seems to imply that we should assume Undefined Behavior unless we are told otherwise.
If this is the case, then couldn't there be instances of Undefined Behavior that are Undefined Behavior because the Standard didn't explicitly give a behavior for some construct? And if such a thing is possible, is it possible to generate an example (that would still compile) of Undefined Behavior that is Undefined Behavior because of this wording, or would anything that falls under this be nearly impossible to construct for some reason?
If this is the case, then couldn't there be instances of Undefined Behavior that are Undefined Behavior because the Standard didn't explicitly give a behavior for some construct?
I think this is the correct point of view. If the standard "accidentally" omits a specification of how a particular construct behaves, but it's something that we all know "should" be well-defined, then it's a defect in the standard and needs to be fixed. If, on the other hand, it's a construct that "should" be UB, then the standard is already "correct" (although there are benefits to being explicit).
For example, the standard fails to mention what happens if typeid is applied to an lvalue of polymorphic class type if the object's constructor has not yet begun executing or the destructor has completed. Therefore, the behaviour is undefined by omission. It's also something that's "obviously" UB. So there is no problem.
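A sketch of the destructor half of that observation (the types here are invented for illustration; the program compiles, but the standard simply does not say what the typeid expression means once the object's lifetime has ended):
#include <new>
#include <typeinfo>

struct Base { virtual ~Base() {} };
struct Derived : Base {};

int main() {
    alignas(Derived) unsigned char buf[sizeof(Derived)];
    Base* p = new (buf) Derived;            // object constructed in buf
    p->~Base();                             // destructor has completed
    const std::type_info& ti = typeid(*p);  // undefined by omission
    (void)ti;
}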
is it possible to generate an example (that would still compile) of Undefined Behavior that is Undefined Behavior because of this wording
The classic example is indirection through a null pointer (CWG232):
*(int*)nullptr;
[expr.unary.op]/1 says that the result of applying the indirection operator is an lvalue which denotes the object to which the argument of the operator points, whilst a null pointer doesn't point to any object. So indirection through a null pointer is UB by omission of an explicit definition of behavior for the case when the argument doesn't point to an object.
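Wrapped into a complete translation unit, a minimal sketch of the same expression:
int main() {
    *(int*)nullptr;   // indirection through a null pointer: UB by omission (CWG232)
}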

Why does 'undefined behaviour' exist? [duplicate]

Certain common programming languages, most notably C and C++, have the strong notion of undefined behaviour: When you attempt to perform certain operations outside of the way they are intended to be used, this causes undefined behaviour.
If undefined behaviour occurs, a compiler is allowed to do anything (including nothing at all, 'time traveling', etc.) it wants.
My question is: Why does this notion of undefined behaviour exist? As far as I can see, a huge load of bugs, and programs that work on one version of a compiler but stop working on the next, would be prevented if, instead of causing undefined behaviour, using the operations outside of their intended use caused a compilation error.
Why is this not the way things are?
Why does this notion of undefined behaviour exist?
To allow the language / library to be implemented on a variety of different computer architectures as efficiently as possible (and, perhaps in the case of C, while allowing the implementation to remain simple).
if instead of causing undefined behaviour, using the operations outside of their intended use would cause a compilation error
In most cases of undefined behaviour, it is impossible - or prohibitively expensive in resources - to prove that undefined behaviour exists at compile time for all programs in general.
Some cases are possible to prove for some programs, but it's not possible to exhaustively specify which cases those are, and so the standard won't attempt to do so. Nevertheless, some compilers are smart enough to recognize some simple cases of UB, and those compilers will warn the programmer about it. Example:
int f() {
    int arr[10];     // valid indices are 0 through 9
    return arr[10];  // out-of-bounds read
}
This program has undefined behaviour. A particular version of GCC that I tested shows:
warning: array subscript 10 is above array bounds of 'int [10]' [-Warray-bounds]
It's hardly a good idea to ignore a warning like this.
A more typical alternative to having undefined behaviour would be to have defined error handling in such cases, such as throwing an exception (compare, for example, Java, where accessing a null reference causes an exception of type java.lang.NullPointerException to be thrown). But checking the pre-conditions of well-defined behaviour is slower than not checking them.
By not checking for pre-conditions, the language gives the programmer the option of proving the correctness themselves, and thereby avoiding the runtime overhead of the check in a program that was proven to not need it. Indeed, this power comes with a great responsibility.
These days the burden of proving the program's well-definedness can be somewhat alleviated by using tools (example) which add some of those runtime checks, and neatly terminate the program upon failed check.
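The standard library itself illustrates this trade-off; a small sketch contrasting std::vector's unchecked and checked element access (the checked one verifies the precondition and reports violations as a well-defined exception):
#include <iostream>
#include <stdexcept>
#include <vector>

int main() {
    std::vector<int> v{1, 2, 3};

    // Unchecked: an out-of-range index would be undefined behaviour,
    // but no bounds test is paid for when the index is known to be valid.
    std::cout << v[1] << '\n';

    // Checked: the precondition is verified at runtime; a violation
    // becomes a well-defined std::out_of_range exception instead of UB.
    try {
        std::cout << v.at(10) << '\n';
    } catch (const std::out_of_range& e) {
        std::cout << "caught: " << e.what() << '\n';
    }
}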
Undefined behavior exists mainly to give the compiler freedom to optimize. One thing it allows the compiler to do, for example, is to operate under the assumption that certain things can't happen (without having to first prove that they can't happen, which would often be very difficult or impossible). By allowing it to assume that certain things can't happen, the compiler can eliminate, or avoid generating, code that would otherwise be needed to account for those possibilities.
Good talk on the topic
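A hedged sketch of that idea (the exact transformation is entirely up to the compiler, but this pattern is commonly cited): because dereferencing a null pointer is UB, the compiler may assume the pointer is non-null after the dereference and drop a later null check.
int read_then_check(int* p) {
    int value = *p;        // if p were null this would be UB, so the compiler
                           // is entitled to assume p != nullptr from here on
    if (p == nullptr) {    // this check may legally be optimized away
        return -1;
    }
    return value;
}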
Undefined behavior mostly depends on the target the program is intended to run on. The compiler is not responsible for the dynamic behavior of the program, or the static behavior for that matter. Compiler checks are restricted to the rules of the language, and some modern compilers do some level of static analysis too.
A typical example would be uninitialized variables. They exist because the syntax rules of C allow a variable to be declared without an initial value. Some compilers assign 0 to such variables, and some just reserve memory for the variable and leave it at that. If the program does not initialize these variables, reading them leads to undefined behavior.
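A minimal sketch of that case: the declaration compiles because the grammar allows it, but reading the variable before assigning to it is undefined behavior.
int main() {
    int x;      // declared without an initial value
    return x;   // reading an uninitialized int: undefined behavior
}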

Undefined vs. Unspecified vs. Implementation-defined behavior [duplicate]

Wikipedia has pages about undefined and unspecified behavior and links to them are plentifully used in comments and answers here, in SO.
Each one begins with a note not to confuse it with the other, but apart from one not-very-clear sentence, neither points out the difference between them.
One of them gives an example (comparing the addresses of two variables: &a < &b) with the comment that this results in unspecified behavior in C++ and undefined behavior in C.
Is it possible to pinpoint the substantial difference between undefined and unspecified behavior in a clear, understandable manner?
In short:
Undefined behaviour: this is not okay to do
Unspecified behaviour: this is okay to do, but the result could be anything*
Implementation-defined behaviour: this is okay to do, the result could be anything* but the compiler manual should tell you
Or, in quotes from the C++ standard (N4659 section 3, Terms and Definitions):
3.28 Undefined behavior: behavior for which this International Standard imposes no requirements
3.29 Unspecified behavior: behavior, for a well-formed program construct and correct data, that depends on the implementation
3.12 Implementation-defined behavior: behavior, for a well-formed program construct and correct data, that depends on the implementation and that each implementation documents
EDIT: *As pointed out by M.M in the comments, saying that the result of unspecified behaviour could be anything is not quite right. In fact as the standard itself points out, in a note for paragraph 3.29
The range of possible behaviors is usually delineated by this International Standard.
So in practice you have some idea of what the possible results are, but what exactly will happen depends on your compiler/compiler flags/platform/etc.
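To see the three categories side by side, a short illustrative sketch (the particular constructs are common textbook examples, not an exhaustive list):
#include <iostream>

int f() { std::cout << "f "; return 1; }
int g() { std::cout << "g "; return 2; }

int main() {
    // Implementation-defined: the size of long varies between
    // implementations, but each one documents what it is.
    std::cout << sizeof(long) << '\n';

    // Unspecified: f() and g() may be called in either order; the
    // implementation need not document or even be consistent about it.
    int sum = f() + g();
    std::cout << '\n' << sum << '\n';

    // Undefined: the Standard imposes no requirements at all.
    // int oops = 2147483647 + 1;   // signed overflow, left commented out on purpose
}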
Unspecified behaviour, and its example (&a < &b), seems to say that the compiler writer does not have to commit to where variables are stored on the stack, and the result could change if nearby items were added or deleted (without changing the order of declaration of a and b).
Implementation-defined behaviour covers items such as a % b, where what happens when a is negative is at the implementation's discretion (usually based on the hardware).
Here it is important to describe what will happen, but it would impact performance if the standard committed to a specific behavior.
Undefined behaviour describes the point at which your program is no longer valid - it may work on a particular platform, but not for any good reason.