[basic.execution] p5 sentence 2 states:
If a language construct is defined to produce an implicit call of a function, a use of the language construct is considered to be an expression for the purposes of this definition.
However, the intent of this sentence is not immediately clear. My best guess is that it is there to ensure proper sequencing and to make sure that temporaries are not destroyed before any implicit function calls complete, but I cannot see a situation where it would apply and change the meaning of some code. For example:
struct S { };
const S& f() { return {}; }
Here, the return statement would be considered an expression, and the operand {} would also be considered an expression, and therefore a subexpression of the return statement. Is this the intent of the sentence? Where else would this apply and have a meaningful effect?
The key phrase is "for the purposes of this definition", i.e. the definition of full-expression.
It's just saying that the rules of a full-expression (e.g. temporary lifetime) will also apply for the entirety of your return statement, even though it's not otherwise enumerated in the list of things that constitute a full-expression.
And that's because it involves an implicit function call (a ctor call); if it didn't, then the point would be moot.
It doesn't "change the meaning" of any code.
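A minimal sketch of the kind of situation the sentence seems to cover, as I read it. The names Temp, Target, make and events are made up for illustration. The braced-init-list in the return statement is not itself an expression, yet it triggers an implicit constructor call, and the temporary inside it has to stay alive until that call is done:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order of events so it can be inspected afterwards.
std::vector<std::string> events;

struct Temp {
    Temp()  { events.push_back("Temp ctor"); }
    ~Temp() { events.push_back("Temp dtor"); }
};

struct Target {
    // Converting constructor: invoked implicitly by the return statement below.
    Target(const Temp&) { events.push_back("Target ctor"); }
};

Target make() {
    // No call is written out here, yet Target(const Temp&) runs.
    // The Temp temporary is destroyed only after that implicit call completes.
    return {Temp{}};
}

void check_order() {
    events.clear();
    make();
    assert(events.size() == 3);
    assert(events[0] == "Temp ctor");
    assert(events[1] == "Target ctor");
    assert(events[2] == "Temp dtor");
}
```

Calling check_order() passes: the temporary is destroyed after the implicit constructor call, not before it.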
Recently I was surprised that the following code compiles in clang, gcc and msvc too (at least with their current versions).
struct A {
static const int value = 42;
};
constexpr int f(A a) { return a.value; }
void g() {
A a; // Intentionally non-constexpr.
constexpr int kInt = f(a);
}
My understanding was that the call to f is not constexpr because the argument a isn't, but it seems I am wrong. Is this proper standard-supported code or some kind of compiler extension?
As mentioned in the comments, the rules for constant expressions do not generally require that every variable mentioned in the expression and whose lifetime began outside the expression evaluation is constexpr.
There is a (long) list of requirements that when not satisfied prevent an expression from being a constant expression. As long as none of them is violated, the expression is a constant expression.
The requirement that a used variable/object be constexpr is formally known as the object being usable in constant expressions (although the exact definition contains more detailed requirements and exceptions, see also linked cppreference page).
Looking at the list you can see that this property is required only in certain situations, namely only for variables/objects whose lifetime began outside the expression and if either a virtual function call is performed on it, an lvalue-to-rvalue conversion is performed on it or it is a reference variable named in the expression.
None of these cases applies here. There are no virtual functions involved and a is not a reference variable. Typically the lvalue-to-rvalue conversion causes the requirement to become important. An lvalue-to-rvalue conversion happens whenever you try to use the value stored in the object or one of its subobjects. However, A is an empty class without any state and therefore there is no value to read. When passing a to the function, the implicit copy constructor is called to construct the parameter of f, but because the class is empty, it doesn't actually do anything. It doesn't access any state of a.
Note that, as mentioned above, the rules are stricter if you use references, e.g.
A a;
A& ar = a;
constexpr int kInt = f(ar);
will fail, because ar names a reference variable which is not usable in constant expressions. This will hopefully be fixed soon to be more consistent. (see https://github.com/cplusplus/papers/issues/973)
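To make the lvalue-to-rvalue point concrete, here is a small sketch contrasting the empty class with one that carries state. Empty, Stateful, read and demo are names invented for this example, not anything from the question:

```cpp
#include <cassert>

struct Empty {
    static const int value = 42;   // static member: not part of any object's state
};

struct Stateful {
    int value;
};

constexpr int read(Empty e)    { return e.value; }
constexpr int read(Stateful s) { return s.value; }

int demo() {
    Empty e;                      // intentionally not constexpr
    constexpr int a = read(e);    // OK: copying an empty class reads nothing from e
    static_assert(a == 42, "evaluated at compile time");

    Stateful s{7};                // intentionally not constexpr
    // constexpr int b = read(s); // error: copying s reads s.value, and s is
    //                            // not usable in constant expressions
    int b = read(s);              // fine as an ordinary run-time call
    assert(b == 7);

    return a + b;
}
```

The commented-out line is the one that needs an lvalue-to-rvalue conversion on a non-constexpr object, which is why it is rejected while the Empty case is accepted.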
Note 2 to [expr.const]/2 implies that if we have a variable o such that:
the full-expression of its initialization is a constant expression when interpreted as a constant-expression, except that if o is an object, that full-expression may also invoke constexpr constructors for o and its subobjects even if those objects are of non-literal class types
then:
Within this evaluation, std::is_constant_evaluated() [...] returns true.
Consider:
#include <type_traits>
int main() {
int x = std::is_constant_evaluated();
return x;
}
This program returns 0 when executed.
However, I don't see how the full-expression of the initialization of x is not a constant expression. I do not see anything in [expr.const] that bans it. Therefore, my understanding of the note (which is probably wrong) implies that the program should return 1.
Now, if we look at the normative definition of std::is_constant_evaluated, it is only true in a context that is "manifestly constant-evaluated", and the normative definition of the latter, [expr.const]/14, is more clear that the program above should return 0. Specifically, the only item that we really need to look at is the fifth one:
the initializer of a variable that is usable in constant expressions or has constant initialization ...
x is not usable in constant expressions, and it doesn't have constant initialization because no automatic variable does.
So there are two possibilities here. The more likely one is that I haven't understood the note, and I need someone to explain to me why the note does not imply that the program should return 1. The less likely one is that the note contradicts the normative wording.
The full quote here is
A variable or temporary object o is constant-initialized if
(2.1) either it has an initializer or its default-initialization results in some initialization being performed, and
(2.2) the full-expression of its initialization is a constant expression when interpreted as a constant-expression, except that if
o is an object, that full-expression may also invoke constexpr
constructors for o and its subobjects even if those objects are of
non-literal class types. [Note 2: Such a class can have a
non-trivial destructor. Within this evaluation,
std::is_constant_evaluated() ([meta.const.eval]) returns true.
— end note]
The tricky bit here is that the term "is constant-initialized" (note: not "has constant initialization") doesn't mean anything by itself (it probably should be renamed to something else). It's used in exactly three other places, two of which I'll quote below, and the last one ([dcl.constexpr]/6) isn't really relevant.
[expr.const]/4:
A constant-initialized potentially-constant variable V is usable in constant expressions at a point P if V's initializing declaration D is reachable from P and [...].
[basic.start.static]/2:
Constant initialization is performed if a variable or temporary object with static or thread storage duration is constant-initialized ([expr.const]).
Let's replace "constant-initialized" with something less confusing, like "green".
So
A green potentially-constant variable is usable in constant expressions if [some conditions are met]
Constant initialization is performed if a variable or temporary object with static or thread storage duration is green.
Outside of these two cases, the greenness of a variable doesn't matter. You can still compute whether it is green, but that property has no effect. It's an academic exercise.
Now go back to the definition of greenness, which says that a variable or temporary object is green if (among other things) "the full-expression of its initialization is a constant expression when interpreted as a constant-expression" with some exceptions. And the note says that during this hypothetical evaluation to determine the green-ness of the variable, is_constant_evaluated() returns true - which is entirely correct.
So going back to your example:
int main() {
int x = std::is_constant_evaluated();
return x;
}
Is x green? Sure, it is. But it doesn't matter. Nothing cares about its greenness, since x is neither static nor thread local nor potentially-constant. And the hypothetical computation done to determine whether x is green has nothing to do with how it is actually initialized, which is governed by other things in the standard.
Is this invalid? gcc accepts it, clang and msvc don't.
#include <memory>
class Holder {
std::unique_ptr<int> data;
public:
operator std::unique_ptr<int>() && { return std::move(data); }
};
std::unique_ptr<int> test()
{
Holder val;
return val;
}
Assuming that I don't want to add something like std::unique_ptr<int> Holder::TakeData() { return std::move(data); }, the only other workaround I could think of is moving in the return value:
std::unique_ptr<int> test()
{
Holder val;
return std::move(val); // lets the conversion proceed
}
But then gcc 9.3+ has the gall to tell me that the std::move is redundant (with all warnings enabled). WTF? I mean yeah, gcc doesn't need the move, sure, but nothing else accepts the code then. And if it won't be gcc, then some humans inevitably will balk at it later.
What's the authoritative last word on whether it should compile as-is or not?
How should such code be written? Should I put in this seemingly noisy TakeData function and use it? Worse yet - should I maybe make the TakeData function limited to rvalue context, i.e. having to do return std::move(val).TakeData() ?
Adding operator std::unique_ptr<int>() & { return std::move(data); } is not an option, since it obviously leads to nasty bugs - it can be invoked in wrong context.
The "implicit" rvalue conversion is standard mandated. But depending on which standard version you are using, which compiler is "correct" varies.
In C++17
[class.copy.elision] (emphasis mine)
3 In the following copy-initialization contexts, a move operation
might be used instead of a copy operation:
If the expression in a return statement is a (possibly parenthesized) id-expression that names an object with automatic
storage duration declared in the body or parameter-declaration-clause
of the innermost enclosing function or lambda-expression, or
...
overload resolution to select the constructor for the copy is first
performed as if the object were designated by an rvalue. If the first
overload resolution fails or was not performed, or if the type of the
first parameter of the selected constructor is not an rvalue reference
to the object's type (possibly cv-qualified), overload resolution is
performed again, considering the object as an lvalue. [ Note: This
two-stage overload resolution must be performed regardless of whether
copy elision will occur. It determines the constructor to be called if
elision is not performed, and the selected constructor must be
accessible even if the call is elided. — end note ]
Up to C++17, GCC is wrong. Using val implicitly as an rvalue should fail to initialize the return type because of the clause "if the type of the first parameter of the selected constructor is not an rvalue reference to the object's type" in the quote above (the rvalue reference in the unique_ptr c'tor doesn't bind directly to val). But come C++20, that clause is no longer there.
C++20
3 An implicitly movable entity is a variable of automatic storage
duration that is either a non-volatile object or an rvalue reference
to a non-volatile object type. In the following copy-initialization
contexts, a move operation might be used instead of a copy operation:
If the expression in a return ([stmt.return]) or co_return ([stmt.return.coroutine]) statement is a (possibly parenthesized)
id-expression that names an implicitly movable entity declared in the
body or parameter-declaration-clause of the innermost enclosing
function or lambda-expression, or
[...]
overload resolution to select the constructor for the copy or the
return_value overload to call is first performed as if the expression
or operand were an rvalue. If the first overload resolution fails or
was not performed, overload resolution is performed again, considering
the expression or operand as an lvalue. [ Note: This two-stage
overload resolution must be performed regardless of whether copy
elision will occur. It determines the constructor or the return_value
overload to be called if elision is not performed, and the selected
constructor or return_value overload must be accessible even if the
call is elided. — end note ]
The correctness of the code is thus subject to the time travel properties of your compiler(s).
As for how code like that should be written: if you aren't getting consistent results, an option would be to use the exact return type of the function
std::unique_ptr<int> test()
{
Holder val;
std::unique_ptr<int> ret_val = std::move(val);
return ret_val;
}
I agree from the get go that this may not look as appealing as simply returning val, but at least it plays nice with NRVO. So we aren't likely to get more copies of unique_ptr than we desired originally.
If that is too unappealing still, then I find your idea of a resource stealing member function to be most to my liking. But no accounting for taste.
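For completeness, a sketch of the member-function route, using the rvalue-qualified TakeData variant floated in the question. The constructor taking an int is an addition of mine so the example has something to return:

```cpp
#include <cassert>
#include <memory>
#include <utility>

class Holder {
    std::unique_ptr<int> data;
public:
    // Hypothetical constructor, added only to give the example some data.
    explicit Holder(int v) : data(new int(v)) {}

    // rvalue-qualified: callable only on a Holder that is being given up.
    std::unique_ptr<int> TakeData() && { return std::move(data); }
};

std::unique_ptr<int> test()
{
    Holder val(7);
    // The move is spelled out, so no compiler needs implicit-move rules here,
    // and the operand is a call expression rather than a bare local name.
    return std::move(val).TakeData();
}

void check_test() {
    std::unique_ptr<int> p = test();
    assert(p && *p == 7);
}
```

This should behave the same under C++11 through C++20, since it doesn't depend on the two-stage overload resolution quoted above.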
If some function f with parameters p_1, ..., p_n of types T_1, ..., T_n respectively is called with arguments a_1, ..., a_n and its body throws an exception, finishes or returns, in what order are the arguments destroyed and why? Please provide a reference to the standard, if possible.
EDIT: I actually wanted to ask about function "parameters", but as T.C. and Columbo managed to clear my confusion, I'm leaving this question be about the arguments and asked a new separate question about the parameters. See the comments on this question for the distinction.
I did not manage to find the answer in the standard, but I was able to test this on the three most popular C++-compliant compilers. R Sahu's answer pretty much explains that it is implementation-defined.
§5.2.2/8: The evaluations of the postfix expression and of the arguments are all unsequenced relative to one
another. All side effects of argument evaluations are sequenced before the function is entered.
Visual Studio C++ Compiler (Windows) and gcc (Debian)
Arguments are constructed in the reverse of their declaration order and destroyed in the reverse of their construction order (thus destroyed in order of declaration):
2
1
-1
-2
Clang (FreeBSD)
Arguments are constructed in order of their declaration and destroyed in reversed order:
1
2
-2
-1
All compilers were instructed to treat the source code as C++11 and I used the following snippet to demonstrate the situation:
#include <iostream>
struct A
{
A(int) { std::cout << "1" << std::endl; }
~A() { std::cout << "-1" << std::endl; }
};
struct B
{
B(double) { std::cout << "2" << std::endl; }
~B() { std::cout << "-2" << std::endl; }
};
void f(A, B) { }
int main()
{
f(4, 5.);
}
In §5.2.2[4] N3337 is quite explicit on what happens (online draft):
During the initialization of a parameter, an implementation may avoid the construction of extra temporaries by combining the conversions on the associated argument and/or the construction of temporaries with the
initialization of the parameter (see 12.2). The lifetime of a parameter ends when the function in which it is defined returns.
So for example in
f(g(h()));
the return value from the call h() is a temporary that will be destroyed at the end of the full expression. However the compiler is allowed to avoid this temporary and directly initialize with its value the parameter of g(). In this case the return value will be destroyed once g() returns (i.e. BEFORE calling f()).
If I understand the standard correctly, however, the value returned from h() is not permitted to survive to the end of the full expression unless a copy is made for the parameter, and that copy is destroyed once g() returns.
The two scenarios are:
h return value is used to directly initialize g parameter. This object is destroyed when g returns and before calling f.
h return value is a temporary. A copy is made to initialize g parameter and it is destroyed when g returns. The original temporary is destroyed at the end of the full expression instead.
I don't know if implementations are following the rules on this.
The order in which the arguments to a function are evaluated is not specified by the standard. From the C++11 Standard (online draft):
5.2.2 Function call
8 [ Note: The evaluations of the postfix expression and of the argument expressions are all unsequenced relative to one another. All side effects of argument expression evaluations are sequenced before the function
is entered (see 1.9). —end note ]
Hence, it is entirely up to an implementation to decide in what order to evaluate the arguments to a function. This, in turn, implies that the order of construction of the arguments is also implementation dependent.
A sensible implementation would destroy the objects in the reverse order of their construction.
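The one thing the quoted note does pin down, that all argument side effects are sequenced before the function is entered, can be checked without relying on any compiler's particular order. A sketch, with trace and run as made-up names and f given a body that logs:

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> trace;

struct A {
    A(int)    { trace.push_back("A ctor"); }
    ~A()      { trace.push_back("A dtor"); }
};
struct B {
    B(double) { trace.push_back("B ctor"); }
    ~B()      { trace.push_back("B dtor"); }
};

void f(A, B) { trace.push_back("body"); }

void run() {
    trace.clear();
    f(4, 5.);
    // Both arguments are fully constructed before the body runs...
    assert(trace[2] == "body");
    // ...but which one is constructed first is up to the implementation.
    assert((trace[0] == "A ctor" && trace[1] == "B ctor") ||
           (trace[0] == "B ctor" && trace[1] == "A ctor"));
}
```

The destructor entries that follow "body" are deliberately not checked, since their relative order is exactly what differs between the compilers above.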
Unfortunately, I am somewhat confused about constexpr, global constants declared in header files, and the odr.
In short: Can we conclude from here
https://isocpp.org/files/papers/n4147.pdf
that
constexpr MyClass const MyClassObj () { return MyClass {}; }
constexpr char const * Hello () { return "Hello"; }
is preferable over
constexpr MyClass const kMyClassObj = MyClass {};
constexpr char const * kHello = "Hello";
for defining globals in a header file
if I want to "just use" those globally declared/defined entities and do not want to think about how I use them?
Note: as of C++17, you can declare your variables as inline.
TL;DR: If you want to be on the (very) safe side, go with constexpr functions. It isn't inherently necessary though, and certainly won't be if you're performing trivial operations on these objects and are solely interested in their value, or simply don't use them in the dangerous scenarios listed below.
The fundamental issue is that const variables at namespace scope such as yours (generally) have internal linkage ([basic.link]/(3.2)). This implies that each translation unit compiling the corresponding header will observe a different entity (i.e. symbol).
Now imagine we have a template or inline function in a header using those objects. The ODR is very precise about this scenario - [basic.def.odr]/6:
"initialized with a constant expression" is certainly met, since we're talking constexpr. So is "the object has the same value in all definitions of D" if you don't monkey about.
"the object isn't odr-used" is probably the only questionable condition. Basically, it requires that you don't necessitate the variable's runtime existence as a symbol, which in turn implies that
You don't bind it to a reference (=> you don't forward it!)
You don't (neither explicitly nor implicitly) take its address.
The only exception to the second rule is arrays, whose address can be taken implicitly inside a subscript operation as long as the two rules above aren't violated for the resulting glvalue.
More precisely, odr-use is governed by [basic.def.odr]/3:
A variable x whose name appears as a potentially-evaluated expression ex is odr-used by ex unless applying
the lvalue-to-rvalue conversion (4.1) to x yields a constant expression (5.20) that does not invoke any non-trivial functions and, if x is an object, ex is an element of the set of potential results of an expression e, where either the lvalue-to-rvalue conversion (4.1) is applied to e, or e is a discarded-value expression (Clause
5).
Applying l-t-r to any constexpr variable will behave as required by the first part. The second part requires that the variable be used as a value rather than an actual object; that is, it's eventually either discarded or directly evaluated, giving the above rules of thumb.
If you avoid odr-use of the variable inside inline functions, templates or the like, you're fine. But if you use the return value of a corresponding constexpr function, you won't have to worry, since prvalues are already behaving more like values/literals (not objects) and constexpr functions are inline and definitely won't violate the ODR (if you don't use constexpr variables inside there!).
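A sketch of the difference in practice, treating the following as the contents of a header. kLimit, Limit and the clamp functions are hypothetical names chosen for this example:

```cpp
#include <algorithm>
#include <cassert>

// Variable form: const at namespace scope, hence internal linkage,
// i.e. a distinct object in every translation unit that includes this.
constexpr int kLimit = 10;

// Function form: constexpr functions are implicitly inline and return a prvalue.
constexpr int Limit() { return 10; }

// std::min takes its arguments by const reference, so kLimit is odr-used here.
// With this inline function included from several translation units, each
// definition would refer to a different kLimit object.
inline int clamp_with_variable(int x) { return std::min(x, kLimit); }

// Here the reference binds to a temporary holding the returned value;
// no namespace-scope variable is odr-used.
inline int clamp_with_function(int x) { return std::min(x, Limit()); }

// Reading the value directly is not an odr-use of kLimit.
inline int add_limit(int x) { return x + kLimit; }

inline void self_check() {
    assert(clamp_with_variable(15) == clamp_with_function(15));
    assert(add_limit(1) == 11);
}
```

Both clamp functions compute the same result; the difference is only that the first one formally runs afoul of the ODR once the header is included in more than one translation unit, which compilers are not required to diagnose.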