When jumping over a declaration, why is trivial destructor required? - c++

goto or switch can jump over a declaration-statement provided that it has no initializer, that its construction is trivial, and that the object is also trivially destructible.
What's the rationale for the constraint on the destructor?
struct trivial {
    trivial() = default;
    ~trivial() = default;
};
struct semi_trivial {
    semi_trivial() = default;
    ~semi_trivial() noexcept { do_something(); }
};
void foo() {
    goto good_label; // OK
    trivial foo;
good_label:
    goto bad_label; // Error: this goto statement
    semi_trivial bar; // cannot jump over this declaration.
bad_label:
    std::cout << "hi\n";
}

The current wording is a result of N2762. The paper gives the following rationale:
6.7 stmt.dcl:
    Jumping over the definition of an automatic variable will pose the problem of whether the destructor for that variable should be run at the end of the block. Thus, the destructor needs to be trivial, i.e. have no effect. Similarly, the default constructor (the one potentially used to initialize the object) is also required to not do anything, i.e. be trivial. No other requirements are necessary.
I think the case to keep in mind is:
int i = 2;
switch (i) {
case 1:
semi_trivial st;
do_something(st);
break;
case 2:
break; // should st be destructed here?
}
And indeed, this is not an easy question to answer. Calling the destructor there would not be the obviously right thing to do. There's no good way of telling whether it should be called. The st variable here is only used in the case 1 statements, and programmers would be surprised if its destructor got called by the case 2's break statement even though it was completely unused there and not constructed.
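One common way to sidestep the question is to give the case its own block, so that st is already out of scope when control reaches case 2. The following is only a sketch: Tracker, g_live, and handle are hypothetical names, with Tracker standing in for semi_trivial so that the effect of the braces can be observed.

```cpp
#include <cassert>

// Hypothetical stand-in for semi_trivial: counts live instances so the
// effect of scoping is observable.
int g_live = 0;

struct Tracker {
    Tracker() { ++g_live; }
    ~Tracker() { --g_live; }
};

// Returns how many Tracker objects were alive while handling `i`.
int handle(int i) {
    int alive_during_case = 0;
    switch (i) {
    case 1: {
        Tracker st;                  // lives only inside these braces
        alive_during_case = g_live;
        break;                       // leaving the block destroys st
    }
    case 2:
        // st is not in scope here, so no declaration is jumped over
        break;
    }
    assert(g_live == 0);             // nothing outlives the switch
    return alive_during_case;
}
```

With the braces in place, the jump to case 2 never enters the scope of st, so the question of whether to run its destructor does not arise.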

Related

What do clang and gcc qualify as variable being unused [duplicate]

This question already has an answer here:
Why do tuples not get unused variable warnings?
In a PR review I noticed an unused variable, and we wondered why the compiler didn't catch it. So I tested the following code, which has a bunch of unused variables, on godbolt, and was surprised that some were reported as unused while others were not, even though all of them are unused.
#include <string>
struct Index
{
Index(int index) : m_index(index) {}
int m_index;
};
int main()
{
std::string str = "hello"; // case 1. no warning here - unexpected
int someValue = 2; // case 2. warning - as expected
const int someConstant = 2; // case 3. warning - as expected
Index index1(2); // case 4. just as equally not used but no warning - unexpected
// here using the assignment but do get a warning here
// but the str assignment doesn't give a warning - weird
Index index2 = 2; // case 5.
Index index3{2}; // case 6. just as equally not used but no warning - unexpected
Index index4 = {2}; // case 7. just as equally not used but no warning - unexpected
return 0;
}
warning: unused variable 'someValue' [-Wunused-variable]
warning: unused variable 'index2' [-Wunused-variable] (warning only on clang, not on gcc)
warning: unused variable 'someConstant' [-Wunused-variable]
So what do clang and gcc qualify as unused? What if I'm using a lock? I declare it but don't use it directly but use it for automatic releasing of a resource. How do I tell the compiler that I am using it if one day it starts to give a warning about the lock?
int g_i = 0;
std::mutex g_i_mutex; // protects g_i
void safe_increment()
{
const std::lock_guard<std::mutex> lock(g_i_mutex);
++g_i;
// g_i_mutex is automatically released when lock goes out of scope
}
flags: -Wunused-variable
clang: 14.0.0
gcc: 11.3
The reason why there's no warning is that variables of non-trivial class type aren't technically unused when you initialize them but then never access them in your function.
Consider this example:
struct Trivial {};
struct NonTrivial {
NonTrivial() {
//Whatever
}
};
void test() {
Trivial t;
NonTrivial nt;
}
GCC warns about Trivial t; being unused since this declaration never causes any user-defined code to run; the only things that run are the trivial constructor and trivial destructor, which are no-ops. So no operation at all is performed on Trivial t and it is truly unused (its memory is never even touched).
NonTrivial nt; doesn't cause a warning, however, since it is in fact used to run its constructor, which is user-defined code.
That's also why compilers are not going to warn about "unused lock guards" or similar RAII classes - they're used to run user-defined code at construction and destruction, which means that they are used (a pointer to the object is passed to the user-defined constructor/destructor = address taken = used).
This can further be proved by marking the object's constructor with the gnu::pure attribute:
struct Trivial {};
struct NonTrivial {
[[gnu::pure]] NonTrivial() {
//Whatever
}
};
void test() {
Trivial t;
NonTrivial nt;
}
In this case, GCC warns about both of them because it knows that NonTrivial::NonTrivial() doesn't have side-effects, which in turn enables the compiler to prove that construction and destruction of a NonTrivial is a no-op, giving us back our "unused variable" warning. (It also warns about gnu::pure being used on a void function, which is fair enough. You shouldn't usually do that.)
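As for the lock-guard part of the question, C++17 provides the standard [[maybe_unused]] attribute, which can be placed on a variable that exists only for its side effects. A minimal sketch, assuming the hypothetical names g_counter, g_counter_mutex, and demo_increment (the question's own example uses g_i and g_i_mutex):

```cpp
#include <cassert>
#include <mutex>

int g_counter = 0;
std::mutex g_counter_mutex;  // protects g_counter

int safe_increment() {
    // [[maybe_unused]] documents that `lock` is here only for its
    // constructor/destructor and suppresses unused-variable diagnostics.
    [[maybe_unused]] const std::lock_guard<std::mutex> lock(g_counter_mutex);
    return ++g_counter;
}

// Hypothetical single-threaded usage check.
void demo_increment() {
    const int before = g_counter;
    const int after = safe_increment();
    assert(after == before + 1);
}
```

The attribute has no effect on when the mutex is locked or released; it only records the intent, so it is harmless to add even where no compiler currently warns.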
Clang's warning about the following code also does make sense.
struct Hmm {
int m_i;
Hmm(int i): m_i(i) {}
};
void test() {
Hmm hmm = 2; //Case 5 from the question
}
This is equivalent to the following:
void test() {
Hmm hmm = Hmm(2);
}
Construction of the temporary Hmm(2) has side-effects (it calls the user-defined constructor), so this temporary is not unused. However, the temporary then gets moved into the local variable Hmm hmm. Both the move constructor and the destructor of that local variable are trivial (and therefore don't invoke user code), so the variable is indeed unused since the compiler can prove that the behavior of the program would be the same whether or not that variable is present (trivial ctor + trivial dtor + no other access to the variable = unused variable, as explained above). It wouldn't be unused if Hmm had a non-trivial move constructor or a non-trivial destructor.
Note that a trivial move constructor leaves the moved-from object intact, so it truly does not have any side-effects (other than initializing the object that's being constructed).
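Under pre-C++17 rules, this can easily be verified by deleting the move constructor, which causes both Clang and GCC to reject the code. (From C++17 on, guaranteed copy elision removes the move entirely, so that check no longer fails.)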
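The triviality properties this reasoning relies on can be checked with the standard type traits. The sketch below only shows what the traits report; whether a given compiler consults exactly these properties when deciding to warn is an assumption. HmmWithDtor is a hypothetical variant added for contrast.

```cpp
#include <type_traits>

struct Hmm {
    int m_i;
    Hmm(int i) : m_i(i) {}
};

// Hypothetical variant: same shape, but with a user-provided destructor.
struct HmmWithDtor {
    int m_i;
    HmmWithDtor(int i) : m_i(i) {}
    ~HmmWithDtor() {}
};

// Hmm's implicitly declared move constructor and destructor are trivial.
static_assert(std::is_trivially_move_constructible<Hmm>::value,
              "Hmm should be trivially move constructible");
static_assert(std::is_trivially_destructible<Hmm>::value,
              "Hmm should be trivially destructible");

// A user-provided destructor is never trivial, even with an empty body.
static_assert(!std::is_trivially_destructible<HmmWithDtor>::value,
              "HmmWithDtor should not be trivially destructible");
```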

Why does C++ not know to do an implicit move in the return when the variable is used in an initializer list?

Consider this code:
#include <iostream>
template<typename A>
struct S{
S(const A& a){
std::cout << "L\n";
}
S(A&& a){
std::cout << "R\n";
}
};
S<int> f1(){
int a = 1;
return {a};
}
S<int> f2(){
int a = 1;
return a;
}
S<int> f3(){
int a = 1;
return {std::move(a)};
}
int main()
{
f1();
f2();
f3();
}
Output is
L
R
R
As you may know C++ implicitly moves in the return (in f2). When we do it manually in the initializer list it works (f3), but it is not done automagically by C++ in f1.
Is there a good reason why this does not work, or is it just a corner case deemed not important enough to be specified by the standard?
P.S. I know compilers can (sometimes must) do RVO, but I do not see how this could explain the output.
The nice thing about:
return name;
Is that it's a simple case to reason about: you're obviously returning just an object, by name, and there are no other shenanigans going on at all. And yet this specific case has led to patch after patch after patch after patch. So, maybe not so simple after all.
Once we throw in any further complexity on top of that, it gets way more complicated.
With returning an initializer list, we would have to start considering all sorts of other cases:
// obviously can't move
return {name, name};
// 'name' might refer to an automatic storage variable, but
// what if 'other_name' is an alias to it? What if 'other_name'
// is a separate automatic storage variable but is somehow
// dependent on 'name' in a way that matters?
return {name, other_name};
You just... can't know. The only case that we could definitely consider is an initializer list consisting of a single name:
return {name};
That case is probably fine to implicitly move from. But the thing is, in that case, you can just move:
return {std::move(name)};
The problem specifically with the return name; case is that return std::move(name); was sometimes mandatory and sometimes a pessimization, and we would like to get to the point where you always just write the one thing and get the optimal behavior. There's no such concern here: return {std::move(name)}; can't inhibit copy elision in the same way. So it's just less of an issue to have to write that.
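To make the multi-element cases concrete, here is a small sketch that counts copies and moves. Probe, TwoProbes, the two counters, and check_counts are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <utility>

int g_copies = 0;
int g_moves = 0;

// Hypothetical type that records how it was constructed.
struct Probe {
    Probe() = default;
    Probe(const Probe&) { ++g_copies; }
    Probe(Probe&&) noexcept { ++g_moves; }
};

struct TwoProbes {  // aggregate
    Probe first;
    Probe second;
};

TwoProbes same_name_twice() {
    Probe name;
    return {name, name};  // both elements must be copies
}

TwoProbes two_names_moved() {
    Probe name, other_name;
    // The author states explicitly that neither name is needed afterwards.
    return {std::move(name), std::move(other_name)};
}

// Hypothetical helper: runs both functions and checks the counters.
void check_counts() {
    g_copies = g_moves = 0;
    same_name_twice();
    assert(g_copies == 2 && g_moves == 0);

    g_copies = g_moves = 0;
    two_names_moved();
    assert(g_copies == 0 && g_moves == 2);
}
```

In both functions the braced list initializes the returned object directly, so the counters reflect only the element initializations.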

Is it safe to modify RVO values within an RAII construct? [duplicate]

This question already has an answer here:
Clang modifies return value in destructor?
Consider the following program:
#include <functional>
#include <iostream>
class RvoObj {
public:
RvoObj(int x) : x_{x} {}
RvoObj(const RvoObj& obj) : x_{obj.x_} { std::cout << "copied\n"; }
RvoObj(RvoObj&& obj) : x_{obj.x_} { std::cout << "moved\n"; }
int x() const { return x_; }
void set_x(int x) { x_ = x; }
private:
int x_;
};
class Finally {
public:
Finally(std::function<void()> f) : f_{f} {}
~Finally() { f_(); }
private:
std::function<void()> f_;
};
RvoObj BuildRvoObj() {
RvoObj obj{3};
Finally run{[&obj]() { obj.set_x(5); }};
return obj;
}
int main() {
auto obj = BuildRvoObj();
std::cout << obj.x() << '\n';
return 0;
}
Both clang and gcc (demo) output 5 without invoking the copy or move constructors.
Is this behavior well-defined and guaranteed by the C++17 standard?
Copy elision only permits an implementation to remove the presence of the object being generated by a function. That is, it can remove the copy from obj to the return value object of BuildRvoObj, along with the destructor call for obj. However, the implementation can't change anything else.
The copy to the return value would happen before destructors for local objects in the function are called. And the destructor of obj would happen after the destructor of run, because destructors for automatic variables are executed in reverse-order of their construction.
This means that it is safe for run to access obj in its destructor. Whether the object denoted by obj is destroyed after run completes or not does not change this fact.
However, there is one problem. See, return <variable_name>; for a local variable is required to invoke a move operation. In your case, moving from RvoObj is the same as copying from it. So for your specific code, it'll be fine.
But if RvoObj were, for example, unique_ptr<T>, you'd be in trouble. Why? Because the move operation to the return value happens before destructors for local variables are called. So in this case obj will be in the moved-from state, which for unique_ptr means that it's empty.
That's bad.
If the move is elided, then there's no problem. But since elision is not required, there is potentially a problem: your code will behave differently based on whether elision happens or not, and that choice is left to the implementation.
So generally speaking, it's best not to have destructors rely on the existence of local variables that you're returning.
The above purely relates to your question about undefined behavior. It isn't UB to do something that changes behavior based on whether elision happens or not. The standard defines that one or the other will happen.
However, you cannot and should not rely upon it.
Short answer: due to NRVO, the output of the program may be either 3 or 5. Both are valid.
For background, see first:
in C++ which happens first, the copy of a return object or local object's destructors?
What are copy elision and return value optimization?
Guideline:
Avoid destructors that modify return values.
For example, when we see the following pattern:
T f() {
T ret;
A a(ret); // or similar
return ret;
}
We need to ask ourselves: does A::~A() modify our return value somehow? If yes, then our program most likely has a bug.
For example:
A type that prints the return value on destruction is fine.
A type that computes the return value on destruction is not fine.
[From https://stackoverflow.com/a/54566080/9305398 ]
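One way to follow this guideline while keeping a scope guard is to confine the guard to an inner block, so that its destructor runs before the return statement rather than after it. This is a sketch under stated assumptions: Result and build are hypothetical names, and Finally mirrors the class from the question.

```cpp
#include <cassert>
#include <functional>
#include <utility>

class Finally {
public:
    explicit Finally(std::function<void()> f) : f_{std::move(f)} {}
    ~Finally() { f_(); }
    Finally(const Finally&) = delete;
    Finally& operator=(const Finally&) = delete;
private:
    std::function<void()> f_;
};

// Hypothetical result type.
struct Result {
    int x;
};

Result build() {
    Result obj{3};
    {
        Finally run{[&obj]() { obj.x = 5; }};
        // ... work that may leave this inner block early ...
    }  // run's destructor fires here, before the return statement
    assert(obj.x == 5);
    return obj;  // the value is final whether or not the copy is elided
}
```

Because the modification is complete before return obj; is evaluated, the result no longer depends on whether NRVO is applied.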

Why can't the last assignment from a variable in a function be treated as a move?

In code like this:
class X {
public:
    X() = default;
    X(const X&) {
        // ...
    }
    X(X&&) { // move constructor: takes a non-const rvalue reference
        // ...
    }
    // ...
};
void f() {
    X a;
    // ...
    X b = a;
    // ... code that doesn't use a
}
My understanding is that the last statement calls the copy constructor not the move constructor. Assuming a is never used again in f(), can the compiler automatically optimize this statement to use the move constructor instead?
P.S. I know about std::move(), but I'm asking about automatic move.
You'd need to write a spec that somehow correctly handles
void f() {
X a;
g(a); // stash a reference to a somewhere
X b = a; // can't move from a!
g2(); // use the reference stored by g
}
For the move to be safe, you'd need to prove that subsequent code, including all the functions it calls, does not access a directly or indirectly, which is impossible in the general case because the definitions of these functions may not be available to the compiler (e.g., in a different translation unit).
It is difficult/impossible for a compiler to know that a is unreferenced except in trivial scenarios. Any outside function could have saved a pointer or reference to a, and any outside function could be relying on said pointer's contents.
In the situation where no outside functions are involved, I guess that optimization could be possible.
The optimization would not be entirely safe without more rigorous analysis. For example, a local object might have been initialized with the address of a and do something on it upon destruction, which would happen after the last statement X b = a;.
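The destructor hazard described above can be sketched as follows. Observer and copy_then_observe are hypothetical names; the observer reads a during its own destruction, which happens after the statement that looks like the last use of a.

```cpp
#include <cassert>
#include <string>

// Hypothetical observer: remembers the address of a string and copies its
// contents into `sink` when destroyed.
struct Observer {
    const std::string* watched;
    std::string* sink;
    ~Observer() { *sink = *watched; }
};

std::string copy_then_observe(std::string& sink) {
    std::string a = "a payload long enough to live on the heap";
    Observer o{&a, &sink};  // destroyed before a, and reads a at that point
    std::string b = a;      // textually the last use of a, yet it must copy
    assert(a == b);         // a still holds its value after the copy
    return b;
}
```

If X b = a; were silently turned into a move, the observer would read a moved-from string during its destructor, and the program's observable behavior would change.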

What happens when we combine RAII and GOTO?

I'm wondering, for no other purpose than pure curiosity (because no one SHOULD EVER write code like this!) about how the behavior of RAII meshes with the use of goto (lovely idea isn't it).
class Two
{
public:
~Two()
{
printf("2,");
}
};
class Ghost
{
public:
~Ghost()
{
printf(" BOO! ");
}
};
void foo()
{
{
Two t;
printf("1,");
goto JUMP;
}
Ghost g;
JUMP:
printf("3");
}
int main()
{
foo();
}
When running the code above in Visual Studio 2005 I get the following output.
1,2,3 BOO!
However I imagined, guessed, hoped that 'BOO!' wouldn't actually appear as the Ghost should have never been instantiated (IMHO, because I don't know the actual expected behavior of this code).
What's up?
I just realized that if I explicitly define a constructor for Ghost, the code doesn't compile...
class Ghost
{
public:
Ghost()
{
printf(" HAHAHA! ");
}
~Ghost()
{
printf(" BOO! ");
}
};
Ah, the mystery ...
The standard talks about this explicitly - with an example; 6.7/3 "Declaration statement" (emphasis added by me):
Variables with automatic storage duration are initialized each time their declaration-statement is executed. Variables with automatic storage duration declared in the block are destroyed on exit from the block.
It is possible to transfer into a block, but not in a way that bypasses declarations with initialization. A program that jumps from a point where a local variable with automatic storage duration is not in scope to a point where it is in scope is ill-formed unless the variable has POD type and is declared without an initializer.
[Example:
void f()
{
//...
goto lx; //ill-formed: jump into scope of a
//...
ly:
X a = 1;
//...
lx:
goto ly; //OK, jump implies destructor
//call for a, followed by construction
//again immediately following label ly
}
—end example]
So it seems to me that MSVC's behavior is not standards compliant: Ghost is not a POD type, so the compiler should issue an error when the goto statement is coded to jump past it.
A couple other compilers I tried (GCC and Digital Mars) issue errors. Comeau issues a warning (but in fairness, my build script for Comeau has it configured for high MSVC compatibility, so it might be following Microsoft's lead intentionally).
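The backward jump in the standard's example (goto ly) is the permitted direction: leaving the scope of a destroys it, and execution then constructs it again. A small sketch that counts both events, using the hypothetical names Counted, g_constructed, g_destroyed, and run_backward_jumps:

```cpp
#include <cassert>

int g_constructed = 0;
int g_destroyed = 0;

// Hypothetical type that counts its constructions and destructions.
struct Counted {
    Counted() { ++g_constructed; }
    ~Counted() { ++g_destroyed; }
};

// Loops three times using a backward goto across the declaration of `c`.
int run_backward_jumps() {
    int passes = 0;
again:
    Counted c;       // constructed on every pass
    ++passes;
    if (passes < 3)
        goto again;  // leaves c's scope: c is destroyed, then built again
    // On the final pass, c is alive: one more construction than destruction.
    assert(g_constructed == g_destroyed + 1);
    return passes;
}
```

After the function returns, the two counters are equal, since the last instance is destroyed on exit from the block.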
Goto isn't radioactive. Leaving by goto is little different from leaving by exception. Entering by goto should be dictated by convenience, not the limits of the language. Not knowing whether the ghost is constructed or not is a good reason not to do that.
Jump in before the constructor. If you want to jump in after some object is already constructed, enclose it in a new scope or otherwise resolve its lifetime yourself.
In this scenario, I have found the following approach useful.
void foo()
{
{
Two t;
printf("1,");
goto JUMP;
}
{
Ghost g;
// operations that use g.
}
// g is out of scope, so following JUMP is allowed.
JUMP:
printf("3");
}
Confining the scope of the variable g within your foo() function makes the goto jump legal. Now we are no longer jumping from a place where g is not initialized to a place where g is expected to be initialized.