Sequence point from function call? - c++

This is yet another sequence-point question, but a rather simple one:
#include <stdio.h>
void f(int p, int) {
printf("p: %d\n", p);
}
int g(int* p) {
*p = 42;
return 0;
}
int main() {
int p = 0;
f(p, g(&p));
return 0;
}
Is this undefined behaviour? Or does the call to g(&p) act as a sequence point?

No. It doesn't invoke undefined behavior. It is just unspecified, as the order in which the function arguments are evaluated is unspecified in the Standard. So the output could be 0 or 42 depending on the evaluation order decided by your compiler.

The behavior of the program is unspecified because the order in which function arguments are evaluated is unspecified. From the draft C++ standard, section 1.9 Program execution, paragraph 3:
Certain other aspects and operations of the abstract machine are described in this International Standard as unspecified (for example, order of evaluation of arguments to a function). Where possible, this International Standard defines a set of allowable behaviors. [...]
All side effects of the arguments are sequenced before the function is entered. From section 5.2.2 Function call, paragraph 8:
[ Note: The evaluations of the postfix expression and of the argument expressions are all unsequenced relative to one another. All side effects of argument expression evaluations are sequenced before the function is entered (see 1.9). —end note ]
As for C, both points are covered in the C99 draft standard, section 6.5.2.2 Function calls, paragraph 10:
The order of evaluation of the function designator, the actual arguments, and
subexpressions within the actual arguments is unspecified, but there is a sequence point
before the actual call.
So in both C and C++ you can end up with either f(0,0) or f(42,0).
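A minimal sketch of that conclusion. The helper names (set_to_42, first_arg, observe, observe_ordered) are illustrative stand-ins for g and f, not code from the question. The only portable claim about the original call is that the first argument is 0 or 42; naming the operands in separate statements removes the ambiguity.

```cpp
#include <cassert>

// Stand-ins for g and f from the question (names are illustrative).
int set_to_42(int* p) { *p = 42; return 0; }
int first_arg(int p, int) { return p; }

// Same shape as f(p, g(&p)): the read of p and the call that writes it
// are indeterminately sequenced, so the result is either 0 or 42.
int observe() {
    int p = 0;
    return first_arg(p, set_to_42(&p));
}

// Separate statements are separate full-expressions, so the read of p
// is sequenced before the write performed by set_to_42.
int observe_ordered() {
    int p = 0;
    int before = p;
    int ignored = set_to_42(&p);
    return first_arg(before, ignored);
}
```

Calling observe() may legitimately give different results on different compilers, while observe_ordered() should give 0 everywhere.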

Related

call operator argument evaluation order

#include <string>
#include <utility> // std::move
struct X
{
char y;
std::string z;
X & operator()(std::string && s)
{
z = std::move(s);
return *this;
}
X & operator()(char c)
{
y = c;
return *this;
}
};
int main()
{
X x;
std::string y("abc");
x(y[0])(std::move(y));
}
Is the last line of main undefined behavior? I'm guessing yes, because it would unfold to something like the following, but I just want to make sure that there are no stricter guarantees on call operators or member function invocations in general:
X::operator()(&X::operator()(&x, y[0]), std::move(y))
Please add references from the standard or cppref
Before C++17, chaining function calls that modify the same l-value, like in your example, is indeed undefined behavior, since the order of evaluation of these expressions is unspecified.
However, a proposal to fix that was merged into C++17.
Here's the relevant rule (emphasis mine), which also contains an example from the proposal that shows how this works:
The postfix-expression is sequenced before each expression in the expression-list and any default argument. The initialization of a parameter, including every associated value computation and side effect, is indeterminately sequenced with respect to that of any other parameter. [Note: All side effects of argument evaluations are sequenced before the function is entered (see [intro.execution]). — end note] [Example:
void f() {
std::string s = "but I have heard it works even if you don't believe in it";
s.replace(0, 4, "").replace(s.find("even"), 4, "only").replace(s.find(" don't"), 6, "");
assert(s == "I have heard it works only if you believe in it"); // OK
}
— end example]
While the above rule only strictly refers to the built-in operator(), and you have user-defined operators, the same rules about order of evaluation apply, because of this rule:
If either operand has a type that is a class or an enumeration, a user-defined operator function might be declared that implements this operator or a user-defined conversion can be necessary to convert the operand to a type that is appropriate for a built-in operator. ... However, the operands are sequenced in the order prescribed for the built-in operator.
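If the code has to behave the same under older standards too, one option is to split the chain into separate statements. This is only a sketch: X is copied from the question (with a default initializer added for y so it starts in a known state), and the function name chained_in_steps is made up for the example.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct X {
    char y = '\0';
    std::string z;
    X& operator()(std::string&& s) { z = std::move(s); return *this; }
    X& operator()(char c) { y = c; return *this; }
};

X chained_in_steps() {
    X x;
    std::string y("abc");
    x(y[0]);          // full-expression 1: read y[0] while y still holds "abc"
    x(std::move(y));  // full-expression 2: only now hand y's contents to x.z
    return x;
}
```

Each statement is its own full-expression, so the read of y[0] is sequenced before the move regardless of the language version.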

Order of evaluation with function pointers in C++17

Consider the following program (and its alternative in the comment) in C++17:
#include<iostream>
void a(int) {
std::cout << "a\n";
}
void b(int) {
std::cout << "b\n";
}
int main() {
using T = void(*)(int);
T f = a;
(T(f))((f=b,0)); // alternatively: f((f=b,0))
}
With -O2 option, Clang 9.0.0 prints a and GCC 9.2 prints b. Both warn me about unsequenced modification and access to f. See godbolt.org.
My expectation was that this program has well-defined behavior and will print a, because C++17 guarantees that the left-hand expression of the call (T(f)) is sequenced before any evaluation of the arguments. Because the result of the expression (T(f)) is a new pointer to a, the later modification of f should have no impact on the call at all. Am I wrong?
Both compilers give the same output if I use f((f=b,0)); instead of (T(f))((f=b,0));. Here I am slightly unsure about the undefined behavior aspect. Would this be undefined behavior because f still refers to the declared function pointer after evaluation, which will have been modified by the evaluation of the arguments and if so, why exactly would that cause undefined behavior rather than calling b?
I have asked a related question with regards to order of evaluation of non-static member function calls in C++17 here. I am aware that writing code like this is dangerous and unnecessary, but I want to understand the details of the C++ standard better.
Edit: GCC trunk now also prints a after the bug filed by Barry (see his answer below) has been fixed. Both Clang and GCC trunk do still show false-positive warnings with -Wall, though.
The C++17 rule is, from [expr.call]/8:
The postfix-expression is sequenced before each expression in the expression-list and any default argument. The initialization of a parameter, including every associated value computation and side effect, is indeterminately sequenced with respect to that of any other parameter.
In (T(f))((f=b,0));, (T(f)) is sequenced before the initialization of the parameter from (f=b, 0). All of this is well-defined and the program should print "a". That is, it should behave just like:
auto __tmp = T(f);
__tmp((f=b, 0));
The same is true even if we change your program such that this were valid:
T{f}(f=b, 0); // two parameters now, instead of one
The f=b and 0 expressions are indeterminately sequenced with each other, but T{f} is still sequenced before both, so this would still invoke a.
Filed 91974.
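The temporary-variable rewrite above can be checked with a small sketch. Here a and b are changed to return a char instead of printing, purely so the result can be inspected; call_with_snapshot and callee are names invented for the example.

```cpp
#include <cassert>

using T = char (*)(int);

// Variants of a and b that report which one ran (an assumption made
// for this sketch; the originals print instead).
char a(int) { return 'a'; }
char b(int) { return 'b'; }

char call_with_snapshot() {
    T f = a;
    T callee = f;               // copy made in an earlier full-expression
    return callee((f = b, 0));  // the argument reassigns f; callee is unaffected
}
```

Because callee is a separate object initialized in its own statement, this version does not rely on the C++17 sequencing rule at all and should call a under any standard.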

What are these evaluations in the calling function that are not specifically sequenced before the body of the called function?

[intro.execution]/15 contains these statements in page 11 of N4140 (emphasis is mine):
When calling a function (whether or not the function is inline), every
value computation and side effect associated with any argument
expression, or with the postfix expression designating the called
function, is sequenced before execution of every expression or
statement in the body of the called function. [ Note: Value
computations and side effects associated with different argument
expressions are unsequenced. —end note ]
Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after
the execution of the body of the called function is indeterminately
sequenced with respect to the execution of the called
function.9
9) In other words, function executions do not interleave with each
other.
I'm wondering what are these evaluations in the calling function that are not specifically sequenced before the execution of the body of the called function?
* Edit * I claim both answers in the question "sequence before" and "Every evaluation in the calling function" in C++ have nothing whatsoever to do with my question. Please read my question and the answers given therein. See also my comment to the answer given below by @CoryKramer.
* Edit 2 * This is probably the answer to my question. See Proposed Resolution number 3 in DR 1949:
[Deleted:] Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function.
[Inserted:] For each function invocation F, for every evaluation A that occurs within F and every evaluation B that does not occur within F but is evaluated on the same thread and as part of the same signal handler (if any), either A is sequenced before B or B is sequenced before A.9 [Note: if A and B would not otherwise be sequenced then they are indeterminately sequenced. —end note]
If you have two functions that return int
int Foo();
int Bar();
Then you have some calling function
void SomeFunction(int x, int y);
Then the calling code looks like
SomeFunction(Foo(), Bar());
They are saying the execution order of Foo and Bar is indeterminate.
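A rough sketch of what that means in practice. The bodies of Foo, Bar and SomeFunction, the global call_log, and the wrapper record_call_order are all assumptions added for illustration; the answer only gives the declarations.

```cpp
#include <cassert>
#include <string>

std::string call_log;  // records the order in which the functions run

int Foo() { call_log += "Foo;"; return 1; }
int Bar() { call_log += "Bar;"; return 2; }
void SomeFunction(int, int) { call_log += "SomeFunction;"; }

std::string record_call_order() {
    call_log.clear();
    SomeFunction(Foo(), Bar());  // Foo and Bar may run in either order
    return call_log;
}
```

Either "Foo;Bar;SomeFunction;" or "Bar;Foo;SomeFunction;" is an acceptable result. What cannot happen is the two bodies interleaving, or SomeFunction running before both arguments are evaluated.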
The answer by CoryKramer is entirely correct but perhaps insufficiently elaborated. The clarification in DR 1949 is not relevant.
The key is paragraph 13 of §1.9, which defines the relation "sequenced before" as a partial order and provides four possibilities for two evaluations A and B:
1. A is sequenced before B.
2. B is sequenced before A.
3. One of (1) or (2) holds, but the standard does not specify which. In this case, we say that A and B are indeterminately sequenced.
4. Neither (1) nor (2) holds. In this case, we say that A and B are unsequenced.
There is a difference between indeterminately sequenced and unsequenced, and it is this difference which is addressed in paragraph 15. Paragraph 15 commences with the general rule (emphasis added):
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced.
The consequence of that is that in the function call
f(argument1, argument2);
the evaluations argument1 and argument2 are unsequenced. That makes the following undefined behaviour:
f(i++, i++);
But suppose we had:
int incr(int& i) { return i++; }
and we instead wrote:
f(incr(i), incr(i));
If we applied the general rule, then this would also be undefined behaviour, with exactly the same argument: the two evaluations are unsequenced, so the evaluations of the bodies of the two function calls are unsequenced, and we end up with two unsequenced modifications of the same variable.
But this is really not desirable, as it would lead to chaos. [It should be noted that the above example is brutally simplified; the two functions might be completely different and the common variable might not be named. In particular, as a common case, the two functions might both send output to std::cout, thus performing a mutable operation on the same shared object (std::cout itself).]
So an explicit exception is made for function calls: the body of a function evaluation is always sequenced with respect to sub-expressions of the expression which contains the function call. So in
f(incr1(i), incr2(i));
because these are function calls, the evaluation of the bodies of incr1 and incr2 are indeterminately sequenced, not unsequenced, and since both orders result in i being incremented twice, the value of i at the end of the evaluation of the argument list is well-defined. Furthermore, the actual arguments passed to f are unspecified, not undefined; either the first or the second will be greater, but they will not be equal.
That exception doesn't apply to the evaluation of the calls themselves, only to the evaluation of the bodies of the called function. So f(g(i++), h(i++)) is still undefined behaviour, because the evaluation of the two subexpressions i++ are not part of the evaluation of the body of either function.
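To make the incr case concrete, here is a small sketch. incr is taken from the text above; the body of f, the Outcome struct and call_incr_twice are hypothetical additions so that the arguments can be observed.

```cpp
#include <cassert>
#include <utility>

int incr(int& i) { return i++; }

// Hypothetical f that just reports the arguments it received.
std::pair<int, int> f(int a, int b) { return {a, b}; }

struct Outcome {
    int first;
    int second;
    int final_i;
};

Outcome call_incr_twice() {
    int i = 0;
    // The two incr bodies are indeterminately sequenced: one runs
    // completely before the other, in an unspecified order.
    std::pair<int, int> args = f(incr(i), incr(i));
    return {args.first, args.second, i};
}
```

Whichever order the compiler picks, i ends up as 2 and the arguments are 0 and 1 in some order. Only the assignment of those two values to the parameters is left unspecified.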
Two side issues
Paragraph 15 also extends this exception to function calls which are the result of the semantics of the language, including overloaded operators, with the interesting result that
f(i++, i++);
is unspecified rather than undefined if i is an object of a class type with an operator++(int) overload. Similarly,
f(std::cout << 'a', std::cout << 'b');
will cause either ab or ba to be sent to std::cout (it is unspecified which), but is not undefined behaviour.
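The stream case can be sketched with a std::ostringstream standing in for std::cout so the output can be inspected. The empty f and the wrapper write_two_chars are assumptions made for the example.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical f: it ignores its arguments and only exists to host the call.
void f(std::ostream&, std::ostream&) {}

std::string write_two_chars() {
    std::ostringstream out;  // stands in for std::cout
    // Each << is a call to an overloaded operator function, so the two
    // writes are indeterminately sequenced rather than unsequenced.
    f(out << 'a', out << 'b');
    return out.str();
}
```

The result is "ab" or "ba", and which one you get may differ between compilers.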
The point of DR 1949 is that "sequenced after" has never been formally defined. So rather than saying "A is sequenced either before or after B", the more precise formulation is "either A is sequenced before B or B is sequenced before A". You could achieve the same logical effect by formally defining "A is sequenced after B" as "B is sequenced before A". DR 1949 does both.

Are multiple mutations of the same variable within initializer lists undefined behavior pre C++11

Consider the following code:
int main()
{
int count = 0 ;
int arrInt[2] = { count++, count++ } ;
return 0 ;
}
If we compile the code using clang -std=c++03, it produces the following warning (live example):
warning: multiple unsequenced modifications to 'count' [-Wunsequenced]
int arrInt[2] = { count++, count++ } ;
^ ~~
I am not advocating for code like this but similar code came up in another question and there was disagreement over whether it is defined or not according to the standard pre-C++11. In C++11 this behavior is well defined behavior according to Are multiple mutations within initializer lists undefined behavior and indeed if I use -std=c++11 then the warning goes away.
If we look at a pre-C++11 draft standard it does not have the same language covering initializer-list so it seems we are left with Chapter 5 Expressions paragraph 4 which says:
Except where noted, the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified.57) Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored. The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full expression; otherwise the behavior is undefined.
In order for this to be undefined it would seem we would have to interpret count++, count++ as an expression and therefore each count++ as a subexpression, so is this code undefined pre-C++11?
The code is not undefined pre-C++11, but the evaluation order is unspecified. Section 1.9 Program execution, paragraph 12 of the draft standard says:
A full-expression is an expression that is not a subexpression of another expression. [...]
and paragraph 15 says:
There is a sequence point at the completion of evaluation of each full-expression12).
The question, then, is whether count++, count++ is a full-expression with each count++ a subexpression, or whether each count++ is its own full-expression and is therefore followed by a sequence point. If we look at the grammar for this initialization from section 8.5 Initializers:
initializer-clause:
assignment-expression
{ initializer-list ,opt }
{ }
initializer-list:
initializer-clause
initializer-list , initializer-clause
The only expression we have is an assignment-expression, and the , separating the components is part of the initializer-list and not part of an expression. Therefore each count++ is a full-expression and there is a sequence point after each one.
This interpretation is confirmed by the following gcc bug report, which has very similar code to mine (I came up with my example well before I found this bug report):
int count = 23;
int foo[] = { count++, count++, count++ };
which ends up as defect report 430, which I will quote:
[...]I believe the standard is clear that each initializer expression in the above is a full-expression (1.9 [intro.execution]/12-13; see also issue 392) and therefore there is a sequence point after each expression (1.9 [intro.execution]/16). I agree that the standard does not seem to dictate the order in which the expressions are evaluated, and perhaps it should. Does anyone know of a compiler that would not evaluate the expressions left to right?
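For contrast, here is a sketch of the C++11 situation mentioned in the question, where the initializer-clauses of a braced list are evaluated in the order they appear. It assumes a compiler in C++11 mode or later; the function name init_in_order is made up, and the result is packed into a std::array only so it can be returned.

```cpp
#include <array>
#include <cassert>

// Assumes C++11 or later, where the clauses of a braced-init-list are
// evaluated left to right.
std::array<int, 3> init_in_order() {
    int count = 0;
    int values[2] = { count++, count++ };
    // Report both elements plus the final value of count.
    return {{ values[0], values[1], count }};
}
```

Under those rules the elements are 0 and 1 and count ends at 2. Under the C++03 rules discussed above, each clause is still a full-expression, but the order in which the clauses are evaluated is not dictated.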

Is this code well defined?

I suspect the following chaining of functions would result in unspecified sequence according to the C++ standards (assume C++0x). Just want a confirmation and if anyone could provide an explanation, I'd appreciate it.
#include <iostream>
struct TFoo
{
TFoo(int)
{
std::cout<<"TFoo"<<std::endl;
};
TFoo foobar1(int)
{
std::cout<<"foobar1"<<std::endl;
return *this;
};
TFoo foobar2(int)
{
std::cout<<"foobar2"<<std::endl;
return *this;
};
static int bar1()
{
std::cout<<"bar1"<<std::endl;
return 0;
};
static int bar2()
{
std::cout<<"bar2"<<std::endl;
return 0;
};
static int bar3()
{
std::cout<<"bar3"<<std::endl;
return 0;
}
};
int main(int argc, char *argv[])
{
// is the sequence well defined for bar1, bar2 and bar3?
TFoo(TFoo::bar1()).foobar1(TFoo::bar2()).foobar2(TFoo::bar3());
}
* edit: removed __fastcall specifier for functions (not required/relevant to the question).
The evaluation order is not specified. The relevant section of the draft C++0x spec is 1.9, paragraphs 14 and 15:
14 Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.
15 Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced.
Here the relevant full-expression is:
TFoo(TFoo::bar1()).foobar1(TFoo::bar2()).foobar2(TFoo::bar3());
And so the evaluations of its subexpressions are unsequenced (unless there is an exception noted somewhere that I missed).
I am pretty sure earlier standards include language having the same effect but in terms of "sequence points".
[edit]
Paragraph 15 also says:
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [Note: Value computations and side effects associated with different argument expressions are unsequenced.— end note]
A "postfix expression designating the called function" is something like the foo().bar in foo().bar().
The "note" here merely clarifies that argument evaluation order is not an exception to the "unspecified order" default. By inference, neither is the evaluation order associated with the "postfix expression designating the called function"; or if you prefer, the evaluation order of the expression for the this argument. (If there were an exception, this would be the natural place to specify it. Or possibly section 5.2.2 that talks about function calls. Neither section says anything about the evaluation order for this example, so it is unspecified.)
Yes, the order of evaluation of function arguments is unspecified.
For me, gcc 4.5.2 on linux produces
bar3
bar2
bar1
TFoo
foobar1
foobar2
but clang++ on linux and gcc 3.4.6 on solaris produce
bar1
TFoo
bar2
foobar1
bar3
foobar2
To analyze a simpler example, TFoo(0).foobar1(TFoo::bar2()); is a call to TFoo::foobar1 which takes two arguments: the result of the subexpression TFoo(0) (as the hidden argument this) and the result of the subexpression TFoo::bar2(). For me, gcc executes bar2() first, then TFoo's constructor, and then calls foobar1(), while clang++, for example, executes TFoo's constructor first, then bar2(), and then calls foobar1().
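If a particular order is needed on every compiler, the chain can be broken into separate statements. The following is a sketch only: Traced is a cut-down stand-in for TFoo that appends to a global trace string instead of printing and returns references instead of copies, and run_in_steps is an invented name.

```cpp
#include <cassert>
#include <string>

std::string trace;  // records the call order

// Cut-down stand-in for TFoo (illustrative, not the original class).
struct Traced {
    explicit Traced(int) { trace += "ctor "; }
    Traced& foobar1(int) { trace += "foobar1 "; return *this; }
    Traced& foobar2(int) { trace += "foobar2 "; return *this; }
    static int bar1() { trace += "bar1 "; return 0; }
    static int bar2() { trace += "bar2 "; return 0; }
    static int bar3() { trace += "bar3 "; return 0; }
};

std::string run_in_steps() {
    trace.clear();
    int a = Traced::bar1();  // each statement is its own full-expression,
    Traced foo(a);           // so the order below is fixed
    int b = Traced::bar2();
    foo.foobar1(b);
    int c = Traced::bar3();
    foo.foobar2(c);
    return trace;
}
```

This reproduces the order that clang++ happened to choose above (bar1, constructor, bar2, foobar1, bar3, foobar2), but as a guarantee rather than an accident of the implementation.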