Order of evaluation with function pointers in C++17 - c++

Consider the following program (and its alternative in the comment) in C++17:
#include<iostream>
void a(int) {
std::cout << "a\n";
}
void b(int) {
std::cout << "b\n";
}
int main() {
using T = void(*)(int);
T f = a;
(T(f))((f=b,0)); // alternatively: f((f=b,0))
}
With -O2 option, Clang 9.0.0 prints a and GCC 9.2 prints b. Both warn me about unsequenced modification and access to f. See godbolt.org.
My expectation was that this program has well-defined behavior and will print a, because C++17 guarantees that the left-hand expression of the call (T(f)) is sequenced before any evaluation of the arguments. Because the result of the expression (T(f)) is a new pointer to a, the later modification of f should have no impact on the call at all. Am I wrong?
Both compilers give the same output if I use f((f=b,0)); instead of (T(f))((f=b,0));. Here I am slightly unsure about the undefined behavior aspect. Would this be undefined behavior because f still refers to the declared function pointer after evaluation, which will have been modified by the evaluation of the arguments? If so, why exactly would that cause undefined behavior rather than calling b?
I have asked a related question with regards to order of evaluation of non-static member function calls in C++17 here. I am aware that writing code like this is dangerous and unnecessary, but I want to understand the details of the C++ standard better.
Edit: GCC trunk now also prints a after the bug filed by Barry (see his answer below) has been fixed. Both Clang and GCC trunk do still show false-positive warnings with -Wall, though.

The C++17 rule is, from [expr.call]/8:
The postfix-expression is sequenced before each expression in the expression-list and any default argument. The initialization of a parameter, including every associated value computation and side effect, is indeterminately sequenced with respect to that of any other parameter.
In (T(f))((f=b,0));, (T(f)) is sequenced before the initialization of the parameter from (f=b, 0). All of this is well-defined and the program should print "a". That is, it should behave just like:
auto __tmp = T(f);
__tmp((f=b, 0));
The same is true even if we change your program such that this were valid:
T{f}(f=b, 0); // two parameters now, instead of one
The f=b and 0 expressions are indeterminately sequenced with each other, but T{f} is still sequenced before both, so this would still invoke a.
Filed 91974.

Related

Constant evaluation of self-assignment in member initialization

In the following program, constexpr function foo() makes an object of A with the field x=1, then constructs another object on top of it using std::construct_at and default initialization x=x, then the constant evaluated value is printed:
#include <memory>
#include <iostream>
struct A {
int x = x;
};
constexpr int foo() {
A a{1};
std::construct_at<A>(&a);
return a.x;
}
constexpr int v = foo();
int main() {
std::cout << v;
}
GCC prints 1 here. Both Clang and MSVC print 0. And only Clang issues a warning: field 'x' is uninitialized when used. Demo: https://gcc.godbolt.org/z/WTsxdrj8e
Is there an undefined behavior in the program? If yes, why does no compiler detect it during constant evaluation? If no, which compiler is right?
C++20 [basic.life]/1.5 states that the lifetime of an object (in this case, the object a) ends when
the storage which the object occupies is released, or is reused by an object that is not nested within o (6.7.2).
The standard isn't totally clear about when exactly the memory is considered "reused" (and thus, the old A's lifetime ends) but [intro.object]/1 states that
... An object occupies a region of storage in its period of construction (11.10.5), throughout its lifetime (6.7.3), and in its period of destruction (11.10.5).
In my opinion, the evaluation of the default member initializer = x is something that happens during the "period of construction" of the new A object, and that means that at that point, the new A object has already come into existence (but its lifetime has not yet begun), and the old object's lifetime has already ended. That means the initialization of the new A reads the value of its x member, whose lifetime has not begun because its initialization is not complete, which violates [basic.life]/7.1 and would be UB.
In C++20, the definition of foo violates [dcl.constexpr]/6:
A constexpr function that is neither defaulted nor a template is ill-formed, no diagnostic required, if it is not possible for an evaluation of an invocation of the function to be performed while evaluating any valid manifestly constant-evaluated expression.
This means compilers are not required to issue a diagnostic for your program.
In C++23, this rule will be abolished (see P2448) so you can argue that compilers must issue a diagnostic if they claim C++23 compliance. However, no compiler has ever been able to diagnose all kinds of core language UB in constant expressions (for example, something that seems particularly difficult to diagnose is unsequenced writes or an unsequenced read and write involving the same scalar object) so don't hold your breath for it to be fixed.

constexpr result from non-constexpr call

Recently I was surprised that the following code compiles in clang, gcc and msvc too (at least with their current versions).
struct A {
static const int value = 42;
};
constexpr int f(A a) { return a.value; }
void g() {
A a; // Intentionally non-constexpr.
constexpr int kInt = f(a);
}
My understanding was that the call to f is not constexpr because the argument a isn't, but it seems I am wrong. Is this proper standard-conforming code or some kind of compiler extension?
As mentioned in the comments, the rules for constant expressions do not generally require that every variable mentioned in the expression and whose lifetime began outside the expression evaluation is constexpr.
There is a (long) list of requirements that when not satisfied prevent an expression from being a constant expression. As long as none of them is violated, the expression is a constant expression.
The requirement that a used variable/object be constexpr is formally known as the object being usable in constant expressions (although the exact definition contains more detailed requirements and exceptions, see also linked cppreference page).
Looking at the list you can see that this property is required only in certain situations, namely only for variables/objects whose lifetime began outside the expression and if either a virtual function call is performed on it, a lvalue-to-rvalue conversion is performed on it or it is a reference variable named in the expression.
None of these cases applies here. There are no virtual functions involved and a is not a reference variable. Typically it is the lvalue-to-rvalue conversion that makes the requirement matter. An lvalue-to-rvalue conversion happens whenever you try to use the value stored in the object or one of its subobjects. However, A is an empty class without any state and therefore there is no value to read. When passing a to the function, the implicit copy constructor is called to construct the parameter of f, but because the class is empty, it doesn't actually do anything. It doesn't access any state of a.
Note that, as mentioned above, the rules are stricter if you use references, e.g.
A a;
A& ar = a;
constexpr int kInt = f(ar);
will fail, because ar names a reference variable which is not usable in constant expressions. This will hopefully be fixed soon to be more consistent. (see https://github.com/cplusplus/papers/issues/973)

New-expression with consteval constructor in constexpr context

struct A {
consteval A() {};
};
constexpr bool g() {
auto a = new A;
delete a;
return true;
}
int main() {
static_assert(g());
}
https://godbolt.org/z/jsq35WxKs
GCC and MSVC reject the program, ICC and Clang accept it:
//MSVC:
<source>(6): error C7595: 'A::A': call to immediate function is not a constant expression
Compiler returned: 2
//GCC:
<source>: In function 'constexpr bool g()':
<source>:6:18: error: the value of '<anonymous>' is not usable in a constant expression
6 | auto a = new A;
| ^
<source>:6:18: note: '<anonymous>' was not declared 'constexpr'
<source>:7:12: error: type '<type error>' argument given to 'delete', expected pointer
7 | delete a;
| ^
Compiler returned: 1
However, replacing new A with new A() makes GCC accept the program as well (new A{} is still rejected).
Making at least one of the following changes results in all four compilers accepting the program:
Replace consteval with constexpr
Replace constexpr with consteval
Replace
auto a = new A;
delete a;
with
auto alloc = std::allocator<A>{};
auto a = alloc.allocate(1);
std::construct_at(a);
std::destroy_at(a);
alloc.deallocate(a, 1);
with A a;, with auto&& a = A{}; or with A{};
Only exceptions:
Clang trunk with libstdc++ seems to fail compilation with the std::allocator version seemingly due to an unrelated bug. With Clang 13 or libc++ it is accepted as well.
In file included from <source>:1:
In file included from [...]/memory:78:
[...]/shared_ptr_atomic.h:459:14: error: missing 'typename' prior to dependent type name '_Atomic_count::pointer'
static _Atomic_count::pointer
MSVC rejects the std::allocator version as long as there is consteval on the constructor:
error C7595: 'A::A': call to immediate function is not a constant expression
<source>(10): note: see reference to function template instantiation '_Ty *std::construct_at<_Ty,,void>(_Ty *const ) noexcept(false)' being compiled
with
[
_Ty=A
]
Replacing static_assert(g()); with g() or removing the call completely does not seem to have any impact on these results.
Which compilers are correct and if the original is ill-formed, why is only that particular combination of qualifiers and construction method disallowed?
Motivated by the comments under this answer.
The relevant wording is [expr.const]/13:
An expression or conversion is an immediate invocation if it is a potentially-evaluated explicit or implicit invocation of an immediate function and is not in an immediate function context. An immediate invocation shall be a constant expression.
Note the words 'or conversion' and 'implicit invocation' - this seems to imply that the rule is intended to apply on a per-function-call basis.[1] The evaluation of a single atomic expression can consist of multiple such calls, as in the case of e.g. the new-expression which may call an allocation function, a constructor, and a deallocation function. If the selected constructor is consteval, the part of the evaluation of the new-expression that initializes the object (i.e. the constructor call), and only that part, is an immediate invocation. Under this interpretation, using new with a consteval constructor should not be ill-formed regardless of context - even outside of a constant expression - as long as the initialization of the object is itself constant, of course.
There is an issue with this reading, however: the last sentence clearly says that an immediate invocation must be an expression. A 'sub-atomic call' as described above isn't one, it does not have a value category, and could not possibly satisfy the definition of a constant expression ([expr.const]/11):
A constant expression is either a glvalue core constant expression that refers to an entity that is a permitted result of a constant expression (as defined below), or a prvalue core constant expression whose value satisfies the following constraints [...]
A literal interpretation of this wording would preclude any use of a consteval constructor outside of an immediate function context, since a call to it can never appear as a standalone expression. This is clearly not the intended meaning - among other things, it would render parts of the standard library unusable.
A more optimistic (but also less faithful to the words as written) version of this reading is that the atomic expression containing the call (formally: the expression which the call is an immediate subexpression of [2]) must be a constant expression. This still doesn't allow your new A construct because it is not a constant expression by itself, and also leaves some uncertainty in cases like initialization of function parameters or variables in general.
I'm inclined to believe that the first reading is the intended one, and that new A should be fine, but clearly there's implementation divergence.
As for the contradictory 'shall be a constant expression' requirement, this isn't the only place in the standard where it appears like this. Earlier in the same section, [expr.const]/2.2:
A variable or temporary object o is constant-initialized if [...]
the full-expression of its initialization is a constant expression when interpreted as a constant-expression [...]
Clearly, the following is supposed to be valid:
constinit A a;
But there's no constant expression in sight.
So, to answer your question:
Whether the call to g is being evaluated as part of a manifestly constant-evaluated expression does not matter [3] regardless of which interpretation of [expr.const]/13 you go with. new A is either well-formed even during normal evaluation or ill-formed anywhere outside of an immediate function context.
By the looks of it, Clang and ICC implement the former set of rules while GCC and MSVC adhere to the latter. With the exception of GCC accepting new A() as an outlier (which is clearly a bug), neither is wrong; the wording is just defective.
[1] CWG2410 fixes the wording to properly include things like constructor calls (which are neither expressions nor conversions).
[2] Yes, a non-expression can be a subexpression.
[3] Such a requirement would be impossible to enforce.

What is the order of destruction of function arguments?

If some function f with parameters p_1, ..., p_n of types T_1, ..., T_n respectively is called with arguments a_1, ..., a_n and its body throws an exception, finishes or returns, in what order are the arguments destroyed and why? Please provide a reference to the standard, if possible.
EDIT: I actually wanted to ask about function "parameters", but as T.C. and Columbo managed to clear my confusion, I'm leaving this question be about the arguments and asked a new separate question about the parameters. See the comments on this question for the distinction.
I did not manage to find the answer in the standard, but I was able to test this on the three most popular C++ compilers. The answer of R Sahu pretty much explains that it is implementation defined.
§5.2.2/8: The evaluations of the postfix expression and of the arguments are all unsequenced relative to one another. All side effects of argument evaluations are sequenced before the function is entered.
Visual Studio C++ Compiler (Windows) and gcc (Debian)
Arguments are constructed in reverse order of their declaration and destroyed in the reverse order of construction (thus destroyed in order of declaration):
2
1
-1
-2
Clang (FreeBSD)
Arguments are constructed in order of their declaration and destroyed in reverse order:
1
2
-2
-1
All compilers were instructed to treat the source code as C++11 and I used the following snippet to demonstrate the situation:
#include <iostream>
struct A
{
A(int) { std::cout << "1" << std::endl; }
~A() { std::cout << "-1" << std::endl; }
};
struct B
{
B(double) { std::cout << "2" << std::endl; }
~B() { std::cout << "-2" << std::endl; }
};
void f(A, B) { }
int main()
{
f(4, 5.);
}
In §5.2.2[4] N3337 is quite explicit on what happens (online draft):
During the initialization of a parameter, an implementation may avoid the construction of extra temporaries by combining the conversions on the associated argument and/or the construction of temporaries with the initialization of the parameter (see 12.2). The lifetime of a parameter ends when the function in which it is defined returns.
So for example in
f(g(h()));
the return value from the call h() is a temporary that will be destroyed at the end of the full expression. However the compiler is allowed to avoid this temporary and directly initialize with its value the parameter of g(). In this case the return value will be destroyed once g() returns (i.e. BEFORE calling f()).
If I understand the standard correctly, however, the value returned from h() is not permitted to survive to the end of the full expression unless a copy is made (the parameter), and that copy is destroyed once g() returns.
The two scenarios are:
h return value is used to directly initialize g parameter. This object is destroyed when g returns and before calling f.
h return value is a temporary. A copy is made to initialize g parameter and it is destroyed when g returns. The original temporary is destroyed at the end of the full expression instead.
I don't know if implementations are following the rules on this.
The order in which the arguments to a function are evaluated is not specified by the standard. From the C++11 Standard (online draft):
5.2.2 Function call
8 [ Note: The evaluations of the postfix expression and of the argument expressions are all unsequenced relative to one another. All side effects of argument expression evaluations are sequenced before the function is entered (see 1.9). —end note ]
Hence, it is entirely up to an implementation to decide in what order to evaluate the arguments to a function. This, in turn, implies that the order of construction of the arguments is also implementation dependent.
A sensible implementation would destroy the objects in the reverse order of their construction.

Is this code well defined?

I suspect the following chaining of functions would result in unspecified sequence according to the C++ standards (assume C++0x). Just want a confirmation and if anyone could provide an explanation, I'd appreciate it.
#include <iostream>
struct TFoo
{
TFoo(int)
{
std::cout<<"TFoo"<<std::endl;
};
TFoo foobar1(int)
{
std::cout<<"foobar1"<<std::endl;
return *this;
};
TFoo foobar2(int)
{
std::cout<<"foobar2"<<std::endl;
return *this;
};
static int bar1()
{
std::cout<<"bar1"<<std::endl;
return 0;
};
static int bar2()
{
std::cout<<"bar2"<<std::endl;
return 0;
};
static int bar3()
{
std::cout<<"bar3"<<std::endl;
return 0;
}
};
int main(int argc, char *argv[])
{
// is the sequence well defined for bar1, bar2 and bar3?
TFoo(TFoo::bar1()).foobar1(TFoo::bar2()).foobar2(TFoo::bar3());
}
* edit: removed __fastcall specifier for functions (not required/relevant to the question).
The evaluation order is not specified. The relevant section of the draft C++0x spec is 1.9, paragraphs 14 and 15:
14 Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.
15 Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced.
Here the relevant full-expression is:
TFoo(TFoo::bar1()).foobar1(TFoo::bar2()).foobar2(TFoo::bar3());
And so the evaluations of its subexpressions are unsequenced (unless there is an exception noted somewhere that I missed).
I am pretty sure earlier standards include language having the same effect but in terms of "sequence points".
[edit]
Paragraph 15 also says:
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [Note: Value computations and side effects associated with different argument expressions are unsequenced.— end note]
A "postfix expression designating the called function" is something like the foo().bar in foo().bar().
The "note" here merely clarifies that argument evaluation order is not an exception to the "unspecified order" default. By inference, neither is the evaluation order associated with the "postfix expression designating the called function"; or if you prefer, the evaluation order of the expression for the this argument. (If there were an exception, this would be the natural place to specify it. Or possibly section 5.2.2 that talks about function calls. Neither section says anything about the evaluation order for this example, so it is unspecified.)
Yes, the order of evaluation of function arguments is unspecified.
For me, gcc 4.5.2 on linux produces
bar3
bar2
bar1
TFoo
foobar1
foobar2
but clang++ on linux and gcc 3.4.6 on solaris produce
bar1
TFoo
bar2
foobar1
bar3
foobar2
To analyze a simpler example, TFoo(0).foobar1(TFoo::bar2()); is a call to TFoo::foobar1 which takes two arguments: the result of the subexpression TFoo(0) (as the hidden argument this) and the result of the subexpression TFoo::bar2(). For me, gcc executes bar2() first, then TFoo's constructor, and then calls foobar1(), while clang++, for example, executes TFoo's constructor first, then bar2() and then calls foobar1().