Can copy elision happen across synchronize-with statements?

In the example below, if we ignore the mutex for a second, copy elision may eliminate the two calls to the copy constructor.
user_type foo()
{
unique_lock lock( global_mutex );
return user_type(...);
}
user_type result = foo();
Now, the rules for copy elision don't mention threading, but I'm wondering whether elision should actually happen across such boundaries. In the situation above, in the logical abstract machine, the final copy inter-thread happens after the mutex is released. If, however, the copies are omitted, the result data structure is initialized while the mutex is held, so its initialization inter-thread happens before the mutex is released.
I have yet to think of a concrete example of how copy elision could truly result in a race condition, but the interference in the memory sequence seems like it might be a problem. Can anybody definitively say it cannot cause a problem, or can somebody produce an example that can indeed break?
To ensure the answer doesn't just address a special case, note that copy elision is (according to my reading) still allowed to occur if I have a statement like new (&result)( foo() ). That is, result does not need to be a stack object. user_type itself may also work with data shared between threads.
Answer: I've chosen the first answer as the most relevant discussion. Basically since the standard says elision can happen, the programmer just has to be careful when it happens across synchronization bounds. There is no indication of whether this is an intentional or accidental requirement. We're still lacking in any example showing what could go wrong, so perhaps it isn't an issue either way.

Threads have nothing to do with it, but the order of constructors/destructors of the lock may affect you.
Looking at the low-level steps your code performs without copy elision, one by one (using the GCC option -fno-elide-constructors):
1. Construct lock.
2. Construct the temporary user_type with (...) arguments.
3. Copy-construct the temporary return value of the function, of type user_type, using the value from step 2.
4. Destroy the temporary from step 2.
5. Destroy lock.
6. Copy-construct the user_type result using the value from step 3.
7. Destroy the temporary from step 3.
8. Later on, destroy result.
Naturally, with the multiple copy elision optimizations, it will be just:
1. Construct lock.
2. Construct the result object directly with (...).
3. Destroy lock.
4. Later on, destroy result.
Note that in both cases the user_type constructor with (...) is protected by the lock. Any other copy constructor or destructor call may not be protected.
Afterthoughts:
I think that the most likely place where it can cause problems is in the destructors. That is, if your original object, the one constructed with (...), handles any shared resource differently from its copies, and does something in the destructor that needs the lock, then you have a problem.
Naturally, that would mean that your object is badly designed in the first place, as copies do not behave like the original object.
Reference:
In the C++11 draft, 12.8.31 (a similar wording without all the "moves" is in C++98):
When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class
object, even if the copy/move constructor and/or destructor for the object have side effects. In such cases,
the implementation treats the source and target of the omitted copy/move operation as simply two different
ways of referring to the same object, and the destruction of that object occurs at the later of the times
when the two objects would have been destroyed without the optimization. This elision of copy/move
operations, called copy elision, is permitted in the following circumstances (which may be combined to
eliminate multiple copies):
- in a return statement in a function with a class return type, when the expression is the name of a
non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified
type as the function return type, the copy/move operation can be omitted by constructing
the automatic object directly into the function’s return value
- in a throw-expression, when the operand is the name of a non-volatile automatic object (other than
a function or catch-clause parameter) whose scope does not extend beyond the end of the innermost
enclosing try-block (if there is one), the copy/move operation from the operand to the exception
object can be omitted by constructing the automatic object directly into the exception object
- when a temporary class object that has not been bound to a reference would be copied/moved
to a class object with the same cv-unqualified type, the copy/move operation can be omitted by
constructing the temporary object directly into the target of the omitted copy/move
- when the exception-declaration of an exception handler declares an object of the same type
(except for cv-qualification) as the exception object, the copy/move operation can be omitted
by treating the exception-declaration as an alias for the exception object if the meaning of the program
will be unchanged except for the execution of constructors and destructors for the object declared by
the exception-declaration.
Points 1 and 3 collaborate in your example to elide all the copies.

"To ensure the answer doesn't just address a special case, note that copy elision is (according to my reading) still allowed to occur if I have a statement like new (&result)( foo() ). That is, result does not need to be a stack object. user_type itself may also work with data shared between threads."
There's the rub: if result is shared, you have a data race even without elision. The behavior is undefined to begin with.

Related

Return by value copies instead of moving

Why does this program call the copy constructor instead of the move constructor?
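#include <iostream>
using namespace std;
class Qwe {
public:
    int x=0;
    Qwe(int x) : x(x){}
    // Copy the member as well, so that the returned object carries the value.
    Qwe(const Qwe& q) : x(q.x) {
        cout<<"copy ctor\n";
    }
    Qwe(Qwe&& q) : x(q.x) {
        cout<<"move ctor\n";
    }
};
Qwe foo(int x) {
    Qwe q=42;
    Qwe e=32;
    cout<<"return!!!\n";
    return q.x > x ? q : e;
}
int main(void)
{
    Qwe r = foo(50);
}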
The result is:
return!!!
copy ctor
return q.x > x ? q : e; is used to disable NRVO. When I wrap it in std::move, it is indeed moved. But in "A Tour of C++" the author said that the move c'tor must be called when it is available.
What have I done wrong?

You did not write your function in a way that allows copy/move elision to occur. The requirements for a copy to be replaced by a move are as follows:
[class.copy.elision]/3:
In the following copy-initialization contexts, a move operation might
be used instead of a copy operation:
If the expression in a return statement is a (possibly parenthesized) id-expression that names an object with automatic
storage duration declared in the body or
parameter-declaration-clause of the innermost enclosing function or lambda-expression
overload resolution to select the constructor for the copy is first
performed as if the object were designated by an rvalue. If the first
overload resolution fails or was not performed, or if the type of the
first parameter of the selected constructor is not an rvalue reference
to the object's type (possibly cv-qualified), overload resolution is
performed again, considering the object as an lvalue.
The above is from C++17, but the C++11 wording is pretty much the same. The conditional operator is not an id-expression that names an object in the scope of the function.
An id-expression would be something like q or e in your particular case. You need to name an object in that scope. A conditional expression doesn't qualify as naming an object, so it must perform a copy.
If you want to exercise your English comprehension abilities on a difficult wall of text, then this is how it's written in C++11. Takes some effort to see IMO, but it's the same as the clarified version above:
When certain criteria are met, an implementation is allowed to omit
the copy/move construction of a class object, even if the copy/move
constructor and/or destructor for the object have side effects. [...]
This elision of copy/move operations, called copy elision, is
permitted in the following circumstances (which may be combined to
eliminate multiple copies):
in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other
than a function or catch-clause parameter) with the same
cv-unqualified type as the function return type, the copy/move
operation can be omitted by constructing the automatic object directly
into the function's return value
When the criteria for elision of a copy operation are met or would be
met save for the fact that the source object is a function parameter,
and the object to be copied is designated by an lvalue, overload
resolution to select the constructor for the copy is first performed
as if the object were designated by an rvalue. If overload resolution
fails, or if the type of the first parameter of the selected
constructor is not an rvalue reference to the object's type (possibly
cv-qualified), overload resolution is performed again, considering the
object as an lvalue.

StoryTeller didn't answer the question: Why is the move c'tor not called? (And not: Why is there no copy elision?)
Here's my go: The move c'tor will be called if and only if:
1. Copy elision (RVO) is not performed. Your use of the ternary operator is indeed a way to prevent copy elision. Let me point out though that return (0, q); is a simpler way to do this if you just want to return q while suppressing copy elision. This uses the (in-)famous comma operator. Possibly return ((q)); might work, too, but I am not enough of a language lawyer to tell for sure.
2. The argument to return is an rvalue. This could be a temporary (more precisely, a prvalue), but these are also eligible for copy elision. Therefore, the argument to return must be an xvalue, such as std::move(q) if you want to ensure the move c'tor is called.
See also: C++ value categories
Some more technicalities of your example:
q and e are objects of type Qwe.
q.x > x ? q : e is an lvalue expression of type Qwe. This is because the expressions q and e are lvalues of type Qwe. The ternary operator just selects either of them.
std::move(q.x > x ? q : e) is an xvalue expression of type Qwe. The std::move simply turns (casts) the lvalue into an xvalue. As an aside, q.x > x ? std::move(q) : std::move(e) would also work.
The copy c'tor gets called in return q.x > x ? q : e; because it can be called with an lvalue of type Qwe (constness is optional), while, on the other hand, the move c'tor cannot be called with an lvalue and is therefore eliminated from the candidate set.
UPDATE: Addressing the comments by going into more depth… this is a really confusing aspect of C++!
Conceptually, in C++98, returning an object by value meant returning a copy of the object, so the copy c'tor would be called. However, the standard's authors considered that a compiler should be free to perform an optimization such that this potentially expensive copy (e.g. of a container) could be elided under suitable circumstances.
This copy elision means that, instead of creating the object in one place and then copying it to a memory address controlled by the caller, the callee creates the object directly in the memory controlled by the caller. Therefore, only the "normal" constructor, e.g. a default c'tor, is called.
Therefore, they added a passage such that the compiler is required to check that the copy c'tor — whether generated or user-defined – exists and is accessible (there was no notion yet of deleted functions for that matter), and must ensure that the object is initialized as-if it had been first created in a different place and then copied (cf. as-if rule), but the compiler was not required to ensure that any side effects of the copy c'tor would be observable, such as the stream output in your example.
The reason why the c'tor was still required to be there was that they wanted to avoid a scenario where a compiler was able to accept code that another would have to reject, simply because the former implemented an optional optimization that the latter did not.
In C++11, move semantics were added, and the committee very much wanted to use this in such a manner that a lot of existing return-by-value functions e.g. involving strings or containers would become more efficient. This was done in such a way that conditions were given under which the compiler was actually required to perform a move instead of a copy. However, the idea of copy elision remained important, so basically there were now four different categories:
1. The compiler is required to check for a usable (see above) move c'tor, but is allowed to elide it.
2. The compiler is required to check for a usable move c'tor, and has to call it.
3. The compiler is required to check for a usable copy c'tor, but is allowed to elide it.
4. The compiler is required to check for a usable copy c'tor, and has to call it.
… which in turn leads to four possible outcomes:
Compiler checks for move c'tor, but then elides it. (relates to 1. above)
Compiler checks for move c'tor and actually emits a call to it. (relates to 1. or 2. above)
Compiler checks for copy c'tor, but then elides it. (relates to 3. above)
Compiler checks for copy c'tor and actually emits a call to it. (relates to 3. or 4. above)
And the long optimization story doesn't end here, because, in C++17, the compiler is required to elide certain c'tor calls. In these cases, the compiler is not even allowed to demand that a copy or move c'tor is available.
Note that a compiler has always been free to elide even such c'tor calls that do not meet the standard requirements, under the protection of the as-if rule, for instance by function inlining and the following optimization steps. Anyway, a function call, conceptually, does not have to be backed by the actual machine instruction for the execution of a subroutine. The compiler is just not allowed to remove observable, otherwise defined behavior.
By now you should have noticed that, at least prior to C++17, it is very well possible for the same well-formed program to behave differently, depending on the compiler used and even optimization settings, if the copy or move constructor has observable side effects. Also, a compiler that implements copy/move elision may do so for a subset of the conditions under which the standard allows it to happen. This makes your question almost impossible to answer in detail. Why is the copy/move c'tor called here, but not there? Well, it may be because of the requirements of the C++ standard, but it also may be the preference of your compiler. Maybe the compiler authors had time and leisure implementing the one optimization but not the other. Maybe they found it too difficult in the latter case. Maybe they just had more important stuff to do. Who knows?
What matters 99% of the time for me as a developer is to write my code in such a way that the compiler can apply the best optimizations. Sticking to common cases and standard practice is one thing. Knowing about NRVO and RVO of temporaries is another, as is writing the code such that the standard allows (or, in C++17, requires) copy/move elision, and ensuring that a move c'tor is available where it is beneficial (in case elision does not occur). Don't rely on side effects such as writing a log message or incrementing a global counter. These are not what a copy or move c'tor should commonly do anyway, except possibly for debugging or scholarly interest.

How to initialize a std::shared_ptr from a function returning by value?

I am doing it like this:
class Something;
Something f();
...
std::shared_ptr<Something> ptr(new Something(f()));
but this doesn't feel right. Moreover it needs the copy constructor. Is there a better way?

Use std::make_shared to avoid explicitly calling new. Similarly, use std::make_unique.
make_shared might be more efficient because it can allocate the counters for the smart-pointer and the object in one block together.
Still, it does not come into its own until you have at least one more way for your statement to cause an exception after construction of the object but before it is safely ensconced in its smart-pointer. Said exceptions would otherwise cause a memory-leak.
Example for bad behaviour:
void f(std::shared_ptr<int> a, std::shared_ptr<int> b);
f(std::shared_ptr<int>(new int(0)), std::shared_ptr<int>(new int(4)));
And corrected:
f(std::make_shared<int>(0), std::make_shared<int>(4));
Now, someone advises you to return Something not by value but as a dynamically allocated pointer. For your use-case, there's actually no difference with an acceptable compiler as long as Something is copyable, due to copy elision, aka directly constructing the returned value in the space allocated by new/make_shared/make_unique.
So, just do what you think best there.
Copy elision is explicitly allowed by the standard. Just be aware the copy constructor must be accessible anyway:
12.8. Copying and moving class objects §32
When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class
object, even if the copy/move constructor and/or destructor for the object have side effects. In such cases,
the implementation treats the source and target of the omitted copy/move operation as simply two different
ways of referring to the same object, and the destruction of that object occurs at the later of the times
when the two objects would have been destroyed without the optimization. This elision of copy/move
operations, called copy elision, is permitted in the following circumstances (which may be combined to
eliminate multiple copies):
— in a return statement in a function with a class return type, when the expression is the name of a
non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified
type as the function return type, the copy/move operation can be omitted by constructing
the automatic object directly into the function’s return value
— in a throw-expression, when the operand is the name of a non-volatile automatic object (other than
a function or catch-clause parameter) whose scope does not extend beyond the end of the innermost
enclosing try-block (if there is one), the copy/move operation from the operand to the exception
object (15.1) can be omitted by constructing the automatic object directly into the exception object
— when a temporary class object that has not been bound to a reference (12.2) would be copied/moved
to a class object with the same cv-unqualified type, the copy/move operation can be omitted by
constructing the temporary object directly into the target of the omitted copy/move
— when the exception-declaration of an exception handler (Clause 15) declares an object of the same type
(except for cv-qualification) as the exception object (15.1), the copy/move operation can be omitted
by treating the exception-declaration as an alias for the exception object if the meaning of the program
will be unchanged except for the execution of constructors and destructors for the object declared by
the exception-declaration.

You can use std::make_shared.
It is better to use it for the following reason:
This function typically allocates memory for the T object and for the shared_ptr's control block with a single memory allocation (it is a non-binding requirement in the Standard). In contrast, the declaration std::shared_ptr<T> p(new T(Args...)) performs at least two memory allocations, which may incur unnecessary overhead.

The better way would be to have f() return Something* (allocated with new) or shared_ptr<Something>. Otherwise, the Something returned by f() will have automatic storage and putting it in a shared_ptr doesn't make sense. You could, in theory, use a shared_ptr with a custom deleter, but that wouldn't change the storage class of the underlying object, and you'd most likely just end up with a wild pointer.
If you can't change f(), your solution of making a copy with dynamic storage is really all you can do. If you can give Something a move constructor, you could at least reduce the cost of making the copy (assuming it's expensive enough to be worth reducing).
But see this answer for why the copy isn't worth worrying about. Do whatever you think makes the code most readable.

Why does throwing a local variable invoke the move constructor?

Recently, I've "played" with rvalues to understand their behavior. Most results didn't surprise me, but then I saw that if I throw a local variable, the move constructor is invoked.
Until then, I thought that the purpose of the move semantics rules was to guarantee that an object will be moved from (and become invalid) only if the compiler can detect that it will not be used any more (as with temporary objects), or the user promises not to use it (as with std::move).
However, in the following code, neither of these conditions holds, and my variable is still being moved (at least on g++ 4.7.3).
Why is that?
#include <iostream>
#include <string>
using namespace std;
int main() {
    string s="blabla";
    try {
        throw s;
    }
    catch(...) {
        cout<<"Exception!\n";
    }
    cout<<s; //prints nothing
}

C++ standard says (15.1.3):
Throwing an exception copy-initializes (8.5, 12.8) a temporary object, called the exception object. The temporary is an lvalue and is used to initialize the variable named in the matching handler (15.3).
This paragraph may be also relevant here (12.8.31):
When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the constructor selected for the copy/move operation and/or the destructor for the object have side effects. In such cases, the implementation treats the source and target of the omitted copy/move operation as simply two different ways of referring to the same object, and the destruction of that object occurs at the later of the times when the two objects would have been destroyed without the optimization. This elision of copy/move operations, called copy elision, is permitted in the following circumstances (which may be combined to eliminate multiple copies):
(...)
— in a throw-expression, when the operand is the name of a non-volatile automatic object (other than a function or catch-clause parameter) whose scope does not extend beyond the end of the innermost enclosing try-block (if there is one), the copy/move operation from the operand to the exception object (15.1) can be omitted by constructing the automatic object directly into the exception object
Checked in Visual Studio 2012, effect:
Exception!
blabla
It looks like a bug in GCC indeed.

In the given case, it is probably a compiler bug, because the variable thrown (and moved from) is referenced afterwards.
In the general case, invoking a move on throw is conceptually the same as moving on return. It is good to invoke the move automatically when it is known that the variable cannot be referenced after the given point (throw or return).

Why is my copy constructor only called twice in this scenario?

I have the following two functions:
Class foo(Class arg)
{
return arg;
}
Class bar(Class *arg)
{
return *arg;
}
Now, when I call only foo(arg), the copy constructor is of course called twice. When I call only bar(&arg), it's called just once. Thus, for
foo(bar(&arg));
I would expect the copy constructor to be called three times. However, it's still only called twice. Why is that? Does the compiler recognise that another copy is unneeded?
Thanks in advance!

Does the compiler recognise that another copy is unneeded?
Indeed it does. The compiler is performing copy/move elision. That is the only exception to the so called "as-if" rule, and it allows the compiler (under some circumstances, like the one in your example) to elide calls to the copy or move constructor of a class even if those have side effects.
Per paragraph 12.8/31 of the C++11 Standard:
When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class
object, even if the constructor selected for the copy/move operation and/or the destructor for the object
have side effects. In such cases, the implementation treats the source and target of the omitted copy/move
operation as simply two different ways of referring to the same object, and the destruction of that object
occurs at the later of the times when the two objects would have been destroyed without the optimization.
This elision of copy/move operations, called copy elision, is permitted in the following circumstances (which
may be combined to eliminate multiple copies):
— in a return statement in a function with a class return type, when the expression is the name of a
non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified
type as the function return type, the copy/move operation can be omitted by constructing
the automatic object directly into the function’s return value
— [...]
— when a temporary class object that has not been bound to a reference (12.2) would be copied/moved
to a class object with the same cv-unqualified type, the copy/move operation can be omitted by
constructing the temporary object directly into the target of the omitted copy/move
— [...]
With GCC you can try using the -fno-elide-constructors compilation flag to suppress this optimization and see how the compiler would behave when no copy elision occurs.

Can copy elision occur in catch statements?

Consider an exception class with a copy constructor with side-effects.
Can a compiler skip calling the copy constructor here:
try {
throw ugly_exception();
}
catch(ugly_exception) // ignoring the exception, so I'm not naming it
{ }
What about this:
try {
something_that_throws_ugly_exception();
}
catch(ugly_exception) // ignoring the exception, so I'm not naming it
{ }
(yes, I know this is all very ugly, this was inspired by another question)

Yes, it can be elided both during throwing and catching. For catching it can be elided only when the type specified in the catch clause is the same (save for cv-qualifications) as the type of the exception object. For more formal and detailed description see C++11 12.8/31.
...This elision of copy/move operations, called copy elision, is permitted in the following circumstances (which may be combined to eliminate multiple copies):
...
in a throw-expression, when the operand is the name of a non-volatile automatic object (other than a function or catch-clause parameter) whose scope does not extend beyond the end of the innermost enclosing try-block (if there is one), the copy/move operation from the operand to the exception object (15.1) can be omitted by constructing the automatic object directly into the exception object
...
when the exception-declaration of an exception handler (Clause 15) declares an object of the same type (except for cv-qualification) as the exception object (15.1), the copy/move operation can be omitted by treating the exception-declaration as an alias for the exception object if the meaning of the program will be unchanged except for the execution of constructors and destructors for the object declared by the exception-declaration.

I think this is specifically permitted. For C++03, 15.1/3 says:
A throw-expression initializes a temporary object, called the
exception object,
and 12.8/15 says:
when a temporary class object that has not been bound to a reference
(12.2) would be copied to a class object with the same cv-unqualified
type, the copy operation can be omitted by constructing the temporary
object directly into the target of the omitted copy
So, the secret hiding place where in-flight exceptions are kept, is defined by the standard to be a temporary, and hence is valid for copy-elision.
Edit: oops, I've now read further. 15.1/5:
If the use of the temporary object can be eliminated without changing
the meaning of the program except for the execution of constructors
and destructors associated with the use of the temporary object
(12.2), then the exception in the handler can be initialized directly
with the argument of the throw expression.
Doesn't get much clearer.
Whether it actually will... if the catch clause were to re-raise the exception (including if it called non-visible code that might do so), then the implementation needs that "temporary object called the exception object" still to be around. So there might be some restrictions on when that copy elision is possible. Clearly an empty catch clause can't re-raise it, though.

Yes. If the catch catches the exception by reference, then there will be no copy (well, that is by definition).
But I think that is not your question, and I believe the code you've written deliberately makes no mention of a reference. If that is the case, then yes, even in this case, the copy can be elided. Actually, the initialization of the variable in the catch is direct-initialization in theory. And the copy in a direct-initialization can be elided by the compiler where possible.
C++03 §8.5/14 reads,
[...] In certain cases, an implementation is permitted to eliminate the copying inherent in this direct-initialization by constructing the intermediate result directly into the object being initialized;
