#include <stdio.h>

int main() {
    int a = 10;
    int b = a * a++;
    printf("%i %i", a, b);
    return 0;
}
Is the output of the above code undefined behavior?
Yes. In
int b = a * a++;
the behavior is undefined, so the result can be anything. That is not what "implementation dependent" means.
You might wonder why it's UB here, since a is modified only once. The reason is that paragraph 5/4 of the Standard also requires that the prior value be accessed only to determine the value to be stored. In other words, a may only be read in order to determine the new value of a, but here a is read twice: once to compute the first multiplier, and once again to compute the result of a++, which has the side effect of writing a new value into a. So even though a is modified only once here, the behavior is undefined.
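One way to sidestep the problem is to keep the reads and the write in separate full statements. The following is only a sketch; the helper name square_then_increment is made up for illustration and is not from the original post.

```cpp
#include <cassert>

// Hypothetical helper (not from the original post): multiplies using the
// current value of a, then increments a in its own statement.
int square_then_increment(int& a) {
    const int b = a * a;  // a is only read in this statement
    ++a;                  // the single write, sequenced after both reads
    return b;
}
```

Starting from a = 10, this returns 100 and leaves a at 11 on every conforming compiler, assuming that is the result the original expression was meant to produce.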
main.cpp
const int& f(int& i) { return ++i; }

int main() {
    int i = 10;
    int a = i++ + i++;   // undefined behavior
    int b = f(i) + f(i); // but this is not
}
compile
$ g++ main.cpp -Wsequence-point
The statement int a = i++ + i++; is undefined behaviour.
The statement int b = f(i) + f(i); is not undefined.
Why?
No, the second statement results in unspecified behavior. In the linked demo, gcc outputs 23 while msvc outputs 24 for the same program.
Pre-C++11, the sequence point rules apply:
Pre-C++11 Undefined behavior
Between the previous and next sequence point, the value of any object in a memory location must be modified at most once by the evaluation of an expression, otherwise the behavior is undefined.
Pre-C++11 Rules
3) There is a sequence point after the copying of a returned value of a function and before the execution of any expressions outside the function.
In int a = i++ + i++;, i is modified twice between sequence points, so the behavior is undefined.
In int b = f(i) + f(i);, each function call introduces sequence points, and i is modified only once between any two of them, so there is no UB.
Note, though, that the order of evaluation is unspecified: the result of the first f(i) might be read before or after the second f(i) is called, which can lead to different results depending on the compiler and optimization level, and even between calls.
Since C++11, the "sequenced before" rules apply, which similarly disallow i++ + i++ but allow f(i) + f(i).
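If a single, portable result is wanted, the unspecified ordering can be removed by returning by value and putting the two calls in separate statements. This is a sketch under that assumption; increment and sum_of_two_increments are illustrative names, not part of the original question.

```cpp
#include <cassert>

// Returns by value, so each result is a snapshot taken inside the call
// (unlike the const int& version, which refers to the live variable).
int increment(int& i) { return ++i; }

// Hypothetical helper: each call is its own full statement, so the first
// call is sequenced before the second one.
int sum_of_two_increments(int& i) {
    const int first = increment(i);
    const int second = increment(i);
    return first + second;
}
```

With i starting at 10, the calls return 11 and 12, so the sum is 23 and i ends at 12.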
I have a pointer which is defined as follows:
A ***b;
What does accessing it as follows do:
A** c = b[-1];
Is it an access violation because we are using a negative index to an array? Or is it a legal operation similar to *--b?
EDIT Note that negative array indexing has different support in C and C++. Hence, this is not a dupe.
X[Y] is identical to *(X + Y) as long as one of X and Y is of pointer type and the other has integral type. So b[-1] is the same as *(b - 1), which is an expression that may or may not be evaluated in a well-formed program – it all depends on the initial value of b! For example, the following is perfectly fine:
int q[24];
int * b = q + 13;
b[-1] = 9;
assert(q[12] == 9);
In general, it is your responsibility as a programmer to guarantee that pointers have permissible values when you perform operations with them. If you get it wrong, your program has undefined behaviour. For example:
int * c = q; // q as above
c[-1] = 0; // undefined behaviour!
Finally, just to reinforce the original statement, the following is fine, too:
std::cout << 2["Good morning"] << 4["Stack"] << 8["Overflow\n"];
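To make the "it depends on the value of the pointer" point concrete, here is a rough sketch of a helper that only forms the element pointer when the target index is inside the array. The function element_at and its parameters are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns a pointer to first[base + offset] when that
// element lies inside the array of `count` elements, otherwise nullptr.
int* element_at(int* first, std::size_t count, std::size_t base,
                std::ptrdiff_t offset) {
    assert(base <= count);  // base must be inside the array or one past its end
    const std::ptrdiff_t target = static_cast<std::ptrdiff_t>(base) + offset;
    if (target < 0 || target >= static_cast<std::ptrdiff_t>(count)) {
        return nullptr;  // out of bounds: refuse rather than form the pointer
    }
    return first + target;
}
```

For the array q above, element_at(q, 24, 13, -1) yields &q[12], while element_at(q, 24, 0, -1) yields nullptr instead of touching memory before the array.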
Consider this simple example:
int a = 0;
int b = (a++, a + 1); // is the value of b UB or well defined? (C++03)
Was this changed in C++11/C++14?
The result is well defined and has been since C++98. The comma operator introduces a sequence point (or a "sequenced before" relationship in later C++ standards) between the write and the second read of a, and I don't see any other potential reason for undefined behavior.
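The comma-operator case can be wrapped in a small function to check the values. This is just a sketch; comma_result is a name invented for the example.

```cpp
#include <cassert>

// Hypothetical wrapper around the expression from the question.
int comma_result(int& a) {
    // The left operand (a++) is fully evaluated, including its side effect,
    // before the right operand (a + 1) is evaluated.
    const int b = (a++, a + 1);
    return b;
}
```

Starting from a = 0, a becomes 1 before a + 1 is computed, so the function returns 2.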
Say I have an array like so:
int val[10];
and I intentionally index it with everything from negative values to anything higher than 9, but WITHOUT using the resulting value in any way. This would be for performance reasons (perhaps it's more efficient to check the input index AFTER the array access has been made).
My questions are:
Is it safe to do so, or will I run into some sort of memory protection barriers, risk corrupting memory or similar for certain indices?
Is it perhaps not at all efficient if I access data out of range like this? (assuming the array has no built in range check).
Would it be considered bad practice? (assuming a comment is written to indicate we're aware of using out of range indices).
It is undefined behavior. By definition, undefined means "anything could happen." Your code could crash, it could work perfectly, it could bring about peace and harmony amongst all humans. I wouldn't bet on the second or the last.
It is Undefined Behavior, and you might actually run afoul of the optimizers.
Imagine this simple code example:
int select(int i) {
int values[10] = { .... };
int const result = values[i];
if (i < 0 or i > 9) throw std::out_of_range("out!");
return result;
}
And now look at it from an optimizer's point of view:
int values[10] = { ... };: valid indexes are in [0, 9].
values[i]: i is used as an index, thus i is in [0, 9].
if (i < 0 or i > 9) throw std::out_of_range("out!");: i is in [0, 9], so the branch is never taken.
And thus the function rewritten by the optimizer:
int select(int i) {
int values[10] = { ... };
return values[i];
}
For more amusing stories about forward and backward propagation of assumptions based on the fact that the developer is not doing anything forbidden, see What every C programmer should know about Undefined Behavior: Part 2.
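The straightforward fix is to check the index before the access, so the optimizer has nothing to assume. A minimal sketch follows; the name select_checked and the array contents are placeholders, since the original elides the values.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical variant of select(): the range check comes first, so the
// array is never indexed with an out-of-range value.
int select_checked(int i) {
    // Placeholder data; the original post does not show the real contents.
    static const int values[10] = {0, 1, 4, 9, 16, 25, 36, 49, 64, 81};
    if (i < 0 || i > 9) throw std::out_of_range("out!");
    return values[i];
}
```

Because the throw happens before values[i] is evaluated, the compiler cannot use the access to conclude that the check is dead code.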
EDIT:
Possible work-around: if you know that you will access from -M to +N you can:
declare the array with appropriate buffer: int values[M + 10 + N]
offset any access: values[M + i]
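The work-around above might look roughly like this. The margins M = 2 and N = 3 and the wrapper type PaddedValues are assumptions chosen for illustration.

```cpp
#include <cassert>

constexpr int M = 2;  // hypothetical margin below index 0
constexpr int N = 3;  // hypothetical margin above index 9

// Hypothetical wrapper: logical indices run from -M to 10 + N - 1 and are
// shifted by M so that every access stays inside the one real array.
struct PaddedValues {
    int storage[M + 10 + N] = {};

    int& at(int i) {
        assert(i >= -M && i < 10 + N);  // still bounded, just with wider limits
        return storage[M + i];
    }
};
```

Here at(-2) maps to storage[0] and at(12) maps to storage[14], so the "out of range" logical indices are ordinary in-range accesses.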
As verbose said, this yields undefined behavior. A bit more precision follows.
5.2.1/1 says
[...] The expression E1[E2] is identical (by definition) to *((E1)+(E2))
Hence, val[i] is equivalent to *((val)+(i)). Since val is an array, the array-to-pointer conversion (4.2/1) occurs before the addition is performed. Therefore, val[i] is equivalent to *(ptr + i) where ptr is an int* set to &val[0].
Then, 5.7/2 explains what ptr + i points to. It also says (emphasis mine):
[...] If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
In the case of ptr + i, ptr is the pointer operand and the result is ptr + i. According to the quote above, both should point to an element of the array or to one past the last element. That is, in the OP's case ptr + i is a well defined expression for all i = 0, ..., 10. Finally, *(ptr + i) is well defined for 0 <= i < 10 but not for i = 10.
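As a sketch of the rule just described, the one-past-the-end pointer ptr + 10 can be formed and compared against, as long as it is never dereferenced. The function sum_ten is an illustrative name, not from the original answer.

```cpp
#include <cassert>

// Hypothetical example: sums a 10-element array using ptr + 10 purely as a
// loop sentinel.
int sum_ten(const int (&val)[10]) {
    const int* ptr = val;       // array-to-pointer conversion: &val[0]
    const int* end = ptr + 10;  // one past the last element: valid to form
    int total = 0;
    for (const int* p = ptr; p != end; ++p) {
        total += *p;  // p points to val[0] .. val[9] here, never to end
    }
    return total;
}
```

The loop dereferences only ptr + 0 through ptr + 9, and ptr + 10 is used for comparison alone.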
Edit:
I'm puzzled as to whether val[10] (or, equivalently, *(ptr + 10)) yields undefined behavior or not (I'm considering C++, not C). In some circumstances it does (e.g. int x = val[10]; is undefined behavior), but in others this is not so clear. For instance,
int* p = &val[10];
As we have seen, this is equivalent to int* p = &*(ptr + 10); which could be undefined behavior (because it dereferences a pointer to one past the last element of val) or the same as int* p = ptr + 10; which is well defined.
I found these two references which show how fuzzy this question is:
May I take the address of the one-past-the-end element of an array?
Take the address of a one-past-the-end array element via subscript: legal by the C++ Standard or not?
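If you put it in a structure with some padding ints, it tends to work in practice (since the pointer actually points to "known" destinations). Formally, though, indexing val outside [0, 10) is still undefined behavior, even when the neighbouring members are there.
So it's better to avoid it.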
struct SafeOutOfBoundsAccess
{
    int paddingBefore[6];
    int val[10];
    int paddingAfter[6];
};
void foo()
{
    SafeOutOfBoundsAccess a;
    bool maybeTrue1 = a.val[-1] == a.paddingBefore[5];
    bool maybeTrue2 = a.val[10] == a.paddingAfter[0];
}
I vaguely remember reading somewhere that it is undefined behaviour if multiple operands in a compound expression modify the same object.
I believe an example of this UB is shown in the code below. However, I've compiled it with g++, clang++ and Visual Studio, and all of them print the same values; I can't seem to get different values out of different compilers.
#include <iostream>
int a( int& lhs ) { lhs -= 4; return lhs; }
int b( int& lhs ) { lhs *= 7; return lhs; }
int c( int& lhs ) { lhs += 1; return lhs; }
int d( int& lhs ) { lhs += 2; return lhs; }
int e( int& lhs ) { lhs *= 3; return lhs; }
int main( int argc, char **argv )
{
int i = 100;
int j = ( b( i ) + c( i ) ) * e( i ) / a( i ) * d( i );
std::cout << i << ", " << j << std::endl;
return 0;
}
Is this behaviour undefined or have I somehow conjured up a description of supposed UB that is not actually undefined?
I would be grateful if someone could post an example of this UB and maybe even point me to where in the C++ standard that it says it is UB.
No, it is not. Undefined behavior is out of the question here (assuming the int arithmetic does not overflow): all modifications of i are isolated by sequence points (using C++03 terminology). There's a sequence point at the entrance to each function and there's a sequence point at the exit.
The behavior is unspecified here.
Your code actually follows the same pattern as the classic example often used to illustrate the difference between undefined and unspecified behavior. Consider this
int i = 1;
int j = ++i * ++i;
People will often claim that in this example the "result does not depend on the order of evaluation and therefore j must always be 6". This is an invalid claim, since the behavior is undefined.
However in this example
int inc(int &i) { return ++i; }
int i = 1;
int j = inc(i) * inc(i);
the behavior is formally only unspecified. Namely, the order of evaluation is unspecified. However, since the result of the expression does not depend on the order of evaluation at all, j is guaranteed to always end up as 6. This is an example of how a generally dangerous unspecified-behavior combination can still lead to a perfectly defined result.
In your case the result of your expression does critically depend on the order of evaluation, which means that the result will be unpredictable. Yet there's no undefined behavior here, i.e. the program is not allowed to format your hard drive. It is only allowed to produce an unpredictable result in j.
P.S. Again, it might turn out that some of the evaluation scenarios for your expression lead to signed integer overflow (I haven't analyzed them all), which by itself triggers undefined behavior. So, there's still a potential for unspecified behavior leading to undefined behavior in your expression. But this is probably not what your question is about.
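The inc example above can be checked with a short sketch. The wrapper product_of_two_incs is a made-up name; inc is as defined in the answer.

```cpp
#include <cassert>

int inc(int& i) { return ++i; }

// Hypothetical wrapper: whichever call runs first returns 2 and the other
// returns 3, so the product is 6 under either evaluation order.
int product_of_two_incs(int& i) {
    return inc(i) * inc(i);
}
```

Because inc returns by value, each operand is a snapshot taken inside its own call, which is why the unspecified order cannot change the product.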
No, it's not undefined behavior.
But it does invoke unspecified behavior.
This is because the order that sub-expressions are evaluated is unspecified.
int j = ( b( i ) + c( i ) ) * e( i ) / a( i ) * d( i );
In the above expression the sub expressions:
b(i)
c(i)
e(i)
a(i)
d(i)
Can be evaluated in any order. Because they all have side-effects the results will depend on this order.
If you divide the expression into all of its sub-expressions (this is pseudo-code), you can see the ordering that is required. Not only can the above expressions be evaluated in any order, they can potentially be interleaved with the higher-level sub-expressions (with only a few constraints).
tmp_1 = b(i) // A
tmp_2 = c(i) // B
tmp_3 = e(i) // C
tmp_4 = a(i) // D
tmp_5 = d(i) // E
tmp_6 = tmp_1 + tmp_2 // F (Happens after A and B)
tmp_7 = tmp_6 * tmp_3 // G (Happens after C and F)
tmp_8 = tmp_7 / tmp_4 // H (Happens after D and G)
tmp_9 = tmp_8 * tmp_5 // I (Happens after E and H)
int j = tmp_9; // J (Happens after I)
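If one particular order is wanted, the pseudo-code above can be turned into real statements, which fixes the order of the five calls. The sketch below picks the order b, c, e, a, d as an assumption; the function name evaluate_in_fixed_order and the temporaries are illustrative.

```cpp
#include <cassert>

int a(int& lhs) { lhs -= 4; return lhs; }
int b(int& lhs) { lhs *= 7; return lhs; }
int c(int& lhs) { lhs += 1; return lhs; }
int d(int& lhs) { lhs += 2; return lhs; }
int e(int& lhs) { lhs *= 3; return lhs; }

// Hypothetical helper: each call is a separate full statement, so the calls
// happen in exactly the order written here.
int evaluate_in_fixed_order(int& i) {
    const int tmp_b = b(i);
    const int tmp_c = c(i);
    const int tmp_e = e(i);
    const int tmp_a = a(i);
    const int tmp_d = d(i);
    assert(tmp_a != 0);  // guard the division below
    return (tmp_b + tmp_c) * tmp_e / tmp_a * tmp_d;
}
```

With i starting at 100, the calls return 700, 701, 2103, 2099 and 2101 in that order, which gives j = 2947703 and leaves i at 2101. A different chosen order would give a different, but equally deterministic, result.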
It isn't undefined behavior, but it has unspecified results. The only modified object is i, through the references passed to the functions. However, the calls to the functions introduce sequence points (I don't have the C++ 2011 standard with me: they are called something different there), i.e. there is no problem of multiple changes within an expression causing undefined behavior.
However, the order in which the expression is evaluated isn't specified. As a result, you may get different results if the order of evaluation changes. This isn't undefined behavior: the result is one of those produced by the possible orders of evaluation. Undefined behavior means that the program can behave in any way it wants, including producing the "expected" (expected by the programmer) results for the expression in question while corrupting all other data.