Elide copy/move when wrapping a function receiving a prvalue - c++

I am trying to create a wrapper function that has the exact same interface as the function that it wraps, with zero runtime cost overhead.
In the example code below, is it possible to design my_function_wrapped so that it has the same interface as my_function, so that calling it with the exact same arguments as my_function always yields the same results?
#include <iostream>
using namespace std;
struct SMyStruct
{
SMyStruct() { cout << "SMyStruct constructed" << std::endl;}
SMyStruct(SMyStruct&& Other) { cout << "SMyStruct moved" << std::endl;}
SMyStruct(const SMyStruct& Other) { cout << "SMyStruct copied" << std::endl; }
~SMyStruct() {cout << "SMyStruct destroyed" << std::endl;}
};
void my_function(SMyStruct Arg)
{
}
template<typename T>
void my_function_wrapped(T&& Arg)
{
my_function(std::forward<T>(Arg));
// Some extra logic here that doesn't use Arg
}
int main()
{
cout << "-----------------------------" << endl;
cout << "Direct call to my_function:" << endl;
cout << "-----------------------------" << endl;
my_function(SMyStruct());
cout << "-----------------------------" << endl;
cout << "Wrapped call to my_function:" << endl;
cout << "-----------------------------" << endl;
my_function_wrapped(SMyStruct());
return 0;
}
This program outputs:
-----------------------------
Direct call to my_function:
-----------------------------
SMyStruct constructed
SMyStruct destroyed
-----------------------------
Wrapped call to my_function:
-----------------------------
SMyStruct constructed
SMyStruct moved
SMyStruct destroyed
SMyStruct destroyed
I realize that copy/move is elided on the first call to my_function because SMyStruct() is a prvalue. Is it possible to wrap this call in my_function_wrapped and still get the elided copy/move? Is there any zero-cost way to abstract away the call?
godbolt-link to the code: https://godbolt.org/z/joGTTe64f
Thanks!

No, it is not possible to chain copy elision of prvalues through function calls like this.
Copy elision only works because the caller can construct the function parameter knowing where it needs to place it from the declaration of the function and the calling convention used.
The original caller doesn't know that you are going to simply forward the argument to another function in the body and therefore it cannot know that it is supposed to construct the object into the deeper stack frame. C++ is also designed in such a way that functions can be compiled individually only having to know the declarations of other functions (constant expression evaluation aside). Definitions of the functions don't even have to be available where a call happens.
Allowing this would require some additional language feature to annotate a function declaration to inform callers where they have to construct the parameter and I think it would be difficult to find a good specification for such a feature.
What you can do is pass the arguments for your constructor, or more generally a callable object which creates your prvalue, around, e.g.
template<typename F>
void my_function_wrapped(F&& f)
{
my_function(std::invoke(std::forward<F>(f)));
}
//...
my_function_wrapped([]{ return SMyStruct(); });
The lambda can capture arguments to the constructor if needed.
(Note however that all of this requires C++17. You also tagged C++14, but in C++14 there was no guaranteed copy elision in any of the situations under discussion anyway.)
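To make the "lambda can capture arguments" remark concrete, here is a minimal sketch. Tracked, consume, consume_wrapped and demo_factory are hypothetical names invented for this illustration (Tracked stands in for SMyStruct), and the zero-move result assumes C++17 or a compiler that performs the elision anyway.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for SMyStruct that counts its special member calls.
struct Tracked {
    static int constructed, moved, copied;
    std::string name;
    explicit Tracked(std::string n) : name(std::move(n)) { ++constructed; }
    Tracked(Tracked&& o) : name(std::move(o.name)) { ++moved; }
    Tracked(const Tracked& o) : name(o.name) { ++copied; }
};
int Tracked::constructed = 0;
int Tracked::moved = 0;
int Tracked::copied = 0;

// The wrapped function takes its argument by value, like my_function.
std::size_t consume(Tracked arg) { return arg.name.size(); }

// The wrapper receives a factory instead of an already-built object.
template <typename F>
std::size_t consume_wrapped(F&& make) {
    // The prvalue returned by make() initializes consume's parameter directly.
    return consume(std::forward<F>(make)());
}

std::size_t demo_factory() {
    std::string label = "widget";
    // The lambda captures the constructor argument by reference.
    std::size_t n = consume_wrapped([&label] { return Tracked(label); });
    assert(Tracked::constructed == 1);  // built exactly once, inside consume's frame
    return n;
}
```

Calling demo_factory() from main should leave Tracked::moved and Tracked::copied at zero under the assumptions above, which matches the direct call in the question.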

It depends on your real case (or if there is a real case to begin with), though in your example there is really no point in passing the SMyStruct to the wrapper to forward it to the actual function (because SMyStruct has no state). If instead you forward parameters for the constructor you get desired output:
template<typename...T>
void my_function_wrapped(T&&... Arg)
{
my_function(SMyStruct(std::forward<T>(Arg)...));
}

Related

Copy constructor and returning passed argument

I'm reading Prata's C++ book, and when talking about copy constructors, it says that this constructor is invoked when:
Initializing a class object to a class object of the same type.
Passing an object by value to a function.
Returning an object by value from a function.
Let say that we are working with a Vector class.
For the sake of understanding, throughout the examples in the current chapter we included string outputs in the constructor/destructor definitions, to indicate which of them is called and when.
For instance, if we have
int main()
{
Vector v1; // print "Default constructor called"
Vector v2(9, 5); // print "Constructor called"
Vector v3 = v1; // print "Copy Constructor called"
return 0;
}
And Destructor called would be printed in this case 3 times, when exiting main().
In order to check the 3 points above I've been playing with a dumb_display() function, changing the type of the formal parameter/return value. In doing so, I got confused about what is indeed happening under the hood.
Vector dumb_display(Vector v)
{
Vector test(45, 67);
cout << "In dumb_display() function" << endl;
cout << v << endl;
return v;
}
Here:
Every time we return by value the passed argument as in the above function (argument either passed by value or reference) the copy constructor gets called.
It makes sense. It satisfies point 3.
Every time we return an object defined in the function's body (e.g., changing return v; by return test;), the copy constructor isn't called.
Point 3 isn't satisfied.
I'm having a hard time trying to understand this behaviour.
I don't know whether this is correct, but I think (since an automatic storage duration object is created once for the duration of the function call) once test is created, it doesn't have to be created again, because the object "is already there". This brings the question:
Why does returning the passed argument call the copy constructor twice? Why does the same object have to be created twice for the duration of the call to the function?
#include <vector>
#include <type_traits>
#include <tuple>
#include <iostream>
using namespace std;
struct S {
S(){
cout << "default constructor" << endl;
}
S(S const &) {
cout << "copy constructor" << endl;
}
S(S &&) {
cout << "move constructor" << endl;
}
S & operator=(S const &) {
cout << "copy assignment" << endl;
return *this;
}
S & operator=(S &&) {
cout << "move assignment" << endl;
return *this;
}
};
S f() {
S s2;
cout << "In f()" << endl;
return s2;
}
S f2(S s) {
cout << "In f2()" << endl;
return s;
}
int main() {
cout << "about to call f" << endl;
S s2 = f();
(void)s2;
cout << endl << "about to call f2" << endl;
S s3 = f2(s2);
(void)s3;
}
results in:
about to call f
default constructor
In f()
about to call f2
copy constructor
In f2()
move constructor
In f(), the object is default constructed and return value optimization is used to actually construct it in place where it will actually end up -- in the s2 variable in main. No copy/move constructors are called.
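In f2(), a copy is made for the function's input parameter. Because a function parameter is not eligible for return value optimization, that value is then moved into the return value, which is itself constructed directly in the variable s3 in main (the second copy/move is elided).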
live: https://wandbox.org/permlink/kvBHBJytaIuPj0YN
If you turn off return value optimization, you will see the results that you would expect from what your book says:
live: https://wandbox.org/permlink/BaysuTYJjlJmMGf6
Here's the same two examples without move operators if that's confusing you:
live: https://wandbox.org/permlink/c0brlus92psJtTCf
and without return value optimization:
live: https://wandbox.org/permlink/XSMaBnKTz2aZwgOm
it's the same number of constructor calls, just using the (potentially slower) copy constructor instead of the move constructor.
The copy constructor is called twice because the object is first copied from the function into the temporary value (which is what the function call expression represents, i.e. the returned value), and is then copied into the variable, requiring two copies. Since this is not very efficient, there is also a "move" constructor, which is only needed once.
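The difference between returning a local and returning a parameter can be checked with counters instead of print statements. This is only a sketch: Counted and the helper functions are made-up names, and the exact numbers assume C++17 or a compiler with copy elision left enabled.

```cpp
#include <cassert>

// Hypothetical counter type; the names are illustrative, not from the question.
struct Counted {
    static int copies, moves;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

Counted return_local() {
    Counted local;
    return local;   // NRVO candidate: the compiler may elide the move
}

Counted return_param(Counted p) {
    return p;       // a parameter is not an NRVO candidate: p is moved out
}

// Number of moves caused by passing an lvalue in and returning the parameter.
int moves_when_returning_param() {
    Counted source;
    Counted::copies = Counted::moves = 0;
    Counted result = return_param(source);  // one copy into p, one move out of p
    (void)result;
    assert(Counted::copies == 1);
    return Counted::moves;
}

// Number of moves caused by returning a local object.
int moves_when_returning_local() {
    Counted::copies = Counted::moves = 0;
    Counted result = return_local();
    (void)result;
    assert(Counted::copies == 0);
    return Counted::moves;  // 0 when NRVO is applied, otherwise 1
}
```

With the stated assumptions, moves_when_returning_param() gives 1, while moves_when_returning_local() gives 0 or 1 depending on whether the compiler applies NRVO.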

Cannot understand why perfect forwarding is not working

I am trying to understand how perfect forwarding works but I cannot understand why the copy constructor is called in the code below
#include <utility>
#include <iostream>
using std::cout;
using std::endl;
class Something {
public:
Something() = default;
Something(__attribute__((unused)) const Something& other) {
cout << "Copy constructor called" << endl;
}
Something(__attribute__((unused)) Something&& other) {
cout << "Move constructor called" << endl;
}
void print() {
cout << "Something::print() called" << endl;
}
};
void function_1(Something&& one) {
cout << "version two called" << endl;
Something inner{one};
inner.print();
}
void function_1(const Something& one) {
Something inner(one);
inner.print();
}
template <typename... T>
void test_function(T&&... ts) {
function_1(std::forward<T>(ts)...);
}
int main() {
const Something some1 {Something()};
test_function(some1);
test_function(Something());
return 0;
}
This produces the following output
Copy constructor called
Something::print() called
version two called
Copy constructor called
Something::print() called
Changing the code to include std::move in the rvalue reference works but I did not expect to need it. When a reference is an rvalue reference the correct constructor should be called automatically right? The correct reference is resolved but the wrong constructor is being called. Any help would be greatly appreciated!
An rvalue reference binds to rvalues. It is not itself an rvalue, for it has a name.
But anything with a name at point of use is an lvalue by default, even rvalue references. Your code could use Something&& one three times, and if the first use implicitly moves you would be screwed.
Instead, it is an lvalue at point of use (by default), and it binds to an rvalue.
When you want to signal you no longer require its state to persist, std::move it.
Perfect forwarding can be used to write both of your function_1s by putting a std::forward<Blah>(blah) at the point where you'd want to move from blah if it was an rvalue reference.
Now the above is full of lies, for there are xvalues, prvalues, lvalues, etc. -- the standard is more complex. The use of a variable in return statements can turn a named value into an rvalue, for example. But the basic rule of thumb is worth knowing: it has a name, it is an lvalue (except if explicitly cast, or expiring).
This code will call the copy ctor, not the move ctor.
void function_1(Something&& one) {
cout << "version two called" << endl;
Something inner{one};
inner.print();
}
This code calls the move ctor.
void function_1(Something&& one) {
cout << "version two called" << endl;
Something inner{std::move(one)};
inner.print();
}
The expression one is technically an l-value. It refers to an rvalue-reference. But to actually get the rvalue-reference you have to use std::move. Generally anything that has a name is an l-value. Unnamed temporaries, like your Something() expression in main():
test_function(Something());
are rvalues and can invoke a move without using std::move.
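The "it has a name, so it is an lvalue" rule can also be verified at compile time. In the sketch below, Probe and the two functions are hypothetical names; decltype((one)) with double parentheses reports the value category of the expression one rather than its declared type.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical type that remembers which constructor created it.
struct Probe {
    enum Origin { Default, Copied, Moved };
    Origin origin;
    Probe() : origin(Default) {}
    Probe(const Probe&) : origin(Copied) {}
    Probe(Probe&&) : origin(Moved) {}
};

Probe::Origin without_move(Probe&& one) {
    // The expression `one` is an lvalue even though its type is Probe&&.
    static_assert(std::is_lvalue_reference<decltype((one))>::value,
                  "a named rvalue reference is an lvalue expression");
    Probe inner{one};             // selects the copy constructor
    return inner.origin;
}

Probe::Origin with_move(Probe&& one) {
    // std::move(one) is an xvalue, so its decltype is an rvalue reference.
    static_assert(std::is_rvalue_reference<decltype(std::move(one))>::value,
                  "std::move(one) is an rvalue expression");
    Probe inner{std::move(one)};  // selects the move constructor
    return inner.origin;
}

void demo_value_categories() {
    assert(without_move(Probe()) == Probe::Copied);
    assert(with_move(Probe()) == Probe::Moved);
}
```

Both calls receive a temporary, so the only difference in the outcome comes from how one is used inside the function body.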

Too many destructors called on template classes (N)RVO optimization

I'm trying to write my own smart pointers (C++11) and I'm stuck on one problem, which can be explained by the following example:
#include <iostream>
template<typename T_Type>
class TestTemplateClass {
private:
T_Type _state;
public:
TestTemplateClass() : _state() {
std::cout << "Default constructor" << std::endl;
}
TestTemplateClass(int inState) : _state(inState) {
std::cout << "State constructor" << std::endl;
}
template<typename T_OtherType>
TestTemplateClass(const TestTemplateClass<T_OtherType> &inValue) {
std::cout << "Template-copy constructor" << std::endl;
}
template<typename T_OtherType>
void operator = (const TestTemplateClass<T_OtherType> &inValue) {
std::cout << "Operator" << std::endl;
}
~TestTemplateClass() {
std::cout << "Destructor" << std::endl;
}
};
TestTemplateClass<int> createFunction() {
return TestTemplateClass<int>();
}
int main() {
TestTemplateClass<int> theReference = createFunction();
std::cout << "Finished" << std::endl;
return 0;
}
output:
Default constructor
Destructor
Destructor
Finished
Destructor
As you can see, there are too many destructors here. In my mind, it's some problem with the interaction between copy elision and the template constructor, but I don't know what the reason for such a bug may be. I tried to fix the problem by adding an explicit copy constructor to force the compiler to use my template constructor:
// After TestTemplateClass(int inState), but it's not important
explicit TestTemplateClass(const OwnType &inValue) {
std::cout << "Copy constructor" << std::endl;
}
and got the following output:
Default constructor
Template-copy constructor
Destructor
Template-copy constructor
Destructor
Finished
Destructor
Here all looks good, but it doesn't look like a clean solution. Are there better alternatives?
(N)RVO can never introduce a discrepancy between the number of constructor and destructor calls. It's designed to make that principally impossible.
The problem is with your code. According to the rules of the language, a constructor template is never used to produce a copy constructor. The copy constructor is never a template, period.
So your class template does not actually declare a copy constructor, hence the compiler generates the default one (which of course doesn't print anything). If you need any special processing in the copy constructor, you must always declare it manually. A template will never be used to instantiate one.
Your experiment suggests there isn't a bug at all: the first version simply used the copy constructor which doesn't print anything, and the second version uses a different constructor instead because you effectively disabled it.
(it also looks like whatever compiler and options you're using doesn't do RVO)
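A small sketch of the rule that a constructor template is never used as the copy constructor. Holder and the two helper functions are invented for illustration; the counter only records calls to the template.

```cpp
#include <cassert>

template <typename T>
struct Holder {
    static int converting_calls;
    T value;
    Holder() : value() {}
    // Converting constructor template: never treated as the copy constructor.
    template <typename U>
    Holder(const Holder<U>& other) : value(static_cast<T>(other.value)) {
        ++converting_calls;
    }
};
template <typename T>
int Holder<T>::converting_calls = 0;

// Copying between identical specializations uses the implicit copy constructor.
int same_type_copy_uses_template() {
    Holder<int> a;
    Holder<int> b(a);   // the template is not selected here
    (void)b;
    return Holder<int>::converting_calls;
}

// Converting between different specializations uses the template.
int cross_type_copy_uses_template() {
    Holder<int> a;
    Holder<long> b(a);  // the only viable constructor is the template
    (void)b;
    return Holder<long>::converting_calls;
}
```

The first function reports zero template calls, which is why the original program printed destructor messages without matching constructor messages.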

can a C++ function return an object with a constructor and a destructor

I'm trying to establish whether it is safe for a C++ function to return an object that has a constructor and a destructor. My understanding of the standard is that it ought to be possible, but my tests with simple examples show that it can be problematic. For example the following program:
#include <iostream>
using namespace std;
struct My
{ My() { cout << "My constructor " << endl; }
~My() { cout << "My destructor " << endl; }
};
My function() { My my; cout << "My function" << endl; return my; }
int main()
{ My my = function();
return 0;
}
gives the output:
My constructor
My function
My destructor
My destructor
when compiled on MSVC++, but when compiled with gcc gives the following output:
My constructor
My function
My destructor
Is this a case of "undefined behavior", or is one of the compilers not behaving in a standard way? If the latter, which ? The gcc output is closer to what I would have expected.
To date, I have been designing my classes on the assumption that for each constructor call there will be at most one destructor call, but this example seems to show that this assumption does not always hold, and can be compiler-dependent. Is there anything in the standard that specifies what should happen here, or is it better to avoid having functions return non-trivial objects ? Apologies if this question is a duplicate.
In both cases, the compiler generates a copy constructor for you, that has no output so you won't know if it is called: See this question.
In the first case the compiler generated copy constructor is used, which matches the second destructor call. The line return my; calls the copy constructor, giving it the variable my to be used to construct the return value. This doesn't generate any output.
my is then destroyed. Once the function call has completed, the return value is destroyed at the end of the line { function();.
In the second case, the copy for the return is elided completely (the compiler is allowed to do this as an optimisation). You only ever have one My instance. (Yes, it is allowed to do this even though it changes the observable behaviour of your program!)
These are both ok. Although as a general rule, if you define your own constructor and destructor, you should also define your own copy constructor (and assignment operator, and possibly move constructor and move assignment if you have c++11).
Try adding your own copy constructor and see what you get. Something like
My (const My& otherMy) { cout << "My copy constructor\n"; }
The problem is that your class My violates the Rule of Three; if you write a custom destructor then you should also write a custom copy constructor (and copy assignment operator, but that's not relevant here).
With:
struct My
{ My() { cout << "My constructor " << endl; }
My(const My &) { cout << "My copy constructor " << endl; }
~My() { cout << "My destructor " << endl; }
};
the output for MSVC is:
My constructor
My function
My copy constructor
My destructor
My destructor
As you can see, (copy) constructors match with destructors correctly.
The output under gcc is unchanged, because gcc is performing copy elision as allowed (but not required) by the standard.
You are missing two things here: the copy constructor and NRVO.
The behavior seen with MSVC++ is the "normal" behavior; my is created and the rest of the function is run; then, when returning, a copy of your object is created. The local my object is destroyed, and the copy is returned to the caller, which just discards it, resulting in its destruction.
Why does it seem that you are missing a constructor call? Because the compiler automatically generated a copy constructor, which is called but doesn't print anything. If you added your own copy constructor:
My(const My& Right) { cout << "My copy constructor " << endl; }
you'd see
My constructor <----+
My function | this is the local "my" object
My copy constructor <--|--+
My destructor <----+ | this is the return value
My destructor <-----+
So the point is: it's not that there are more calls to destructors than constructors, it's just that you are not seeing the call to the copy constructor.
In the gcc output, you are also seeing NRVO applied.
NRVO (Named Return Value Optimization) is one of the few cases where the compiler is allowed to perform an optimization that alters the visible behavior of your program. In fact, the compiler is allowed to elide the copy to the temporary return value, and construct the returned object directly, thus eliding temporary copies.
So, no copy is created, and my is actually the same object that is returned.
My constructor <-- called at the beginning of f
My function
My destructor <-- called after f is terminated, since
the caller discarded the return value of f
To date, I have been designing my classes on the assumption that for each constructor call there will be at most one destructor call [...]
You can still "assume" that since it is true. Each constructor call will go in hand with exactly one destructor call. (Remember that if you handle stuff on the free/heap memory on your own.)
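One way to convince yourself of this is to count live objects in every constructor and in the destructor. The sketch below uses a hypothetical Balanced type; whether the return is elided or not, the count returns to zero once the scope ends.

```cpp
#include <cassert>

// Hypothetical type: every constructor increments, the destructor decrements.
struct Balanced {
    static int alive;
    Balanced() { ++alive; }
    Balanced(const Balanced&) { ++alive; }
    Balanced(Balanced&&) { ++alive; }
    ~Balanced() { --alive; }
};
int Balanced::alive = 0;

Balanced make_balanced() {
    Balanced local;
    return local;   // may be elided or moved; the balance holds either way
}

int live_objects_after_scope() {
    {
        Balanced result = make_balanced();
        (void)result;
        // Any temporaries are gone by now; only `result` is alive.
        assert(Balanced::alive == 1);
    }
    return Balanced::alive;
}
```

The apparent mismatch in the question only arises because the implicitly generated copy constructor is not instrumented.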
[..] and can be compiler-dependent [...]
In this case it can't. It is optimization dependent. Both MSVC and GCC behave identically if optimization is applied.
Why don't you see identical behaviour?
1. You don't track everything that happens with your object. Compiler-generated functions bypass your output.
If you want to "follow-up" on the things your compiler does with your objects, you should define all of the special members so you can really track everything and do not get bypassed by any implicit function.
struct My
{
My() { cout << "My constructor " << endl; }
My(My const&) { cout << "My copy-constructor " << endl; }
My(My &&) { cout << "My move-constructor " << endl; }
// The assignment operators must return *this; flowing off the end of a non-void function is undefined behaviour.
My& operator=(My const&) { cout << "My copy-assignment " << endl; return *this; }
My& operator=(My &&) { cout << "My move-assignment " << endl; return *this; }
~My() { cout << "My destructor " << endl; }
};
[Note: The move-constructor and move-assignment will not be implicitly present if you declare the copy ones, but it's still nice to see when the compiler uses which of them.]
2. You don't compile with optimization on both MSVC and GCC.
If compiled with MSVC++11 /O2 option the output is:
My constructor
My function
My destructor
If compiled in debug mode / without optimization:
My constructor
My function
My move-constructor
My destructor
My destructor
I can't do a test on gcc to verify if there's an option that enforces all of these steps but -O0 should do the trick I guess.
What's the difference between optimized and non-optimized compilation here?
The case without any copy elision:
The completely "non-optimized" behaviour in this line My my_in_main = function();
(changed the name to make things clear) would be:
1. Call function().
2. In function, construct a My: My my;
3. Output stuff.
4. Copy-construct my into the return value instance.
5. Return and destroy the my instance.
6. Copy (or move, in my example)-construct the return value instance into my_in_main.
7. Destroy the return value instance.
As you can see: we have at most two copies (or one copy and one move) here but the compilers may possibly omit them.
To my understanding, the first copy is omitted even without optimization turned on (in this case), leaving the process as follows:
Call function()
In function construct My My my; First constructor output!
Output stuff. Function output!
Copy(or move in my example)-construct the return value instance into my_in_main. Move output!
Destroy the return value instance. Destroy output!
my_in_main is destroyed at the end of main, giving the last Destroy output!. So we know what happens in the non-optimized case now.
Copy elision
The copy (or move if the class has a move constructor as in my example) can be elided.
§ 12.8 [class.copy] / 31
When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the copy/move constructor and/or destructor for the object have side effects.
So now the question is when does this happen in this example? The reason for the elision of the first copy is given in the very same paragraph:
[...] in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function’s return value.
Return type matches type in the return statement: function will construct My my; directly into the return value.
The reason for the elision of the second copy/move:
[...] when a temporary class object that has not been bound to a reference (12.2) would be copied/moved to a class object with the same cv-unqualified type, the copy/move operation can be omitted by constructing the temporary object directly into the target of the omitted copy/move.
Target type matches the type returned by the function: The return value of the function will be constructed into my_in_main.
So you have a cascade here:
My my; in your function is directly constructed into the return value, which is directly constructed into my_in_main. So you have in fact only one object here, and function() would (whatever it does) in fact operate on the object my_in_main.
Call function()
In function construct My instance into my_in_main. Constructor output!
Output stuff. Function output!
my_in_main is still destroyed at the end of main giving a Destructor output!.
That makes three outputs in total: Those you observe if optimization is turned on.
An example where elision is not possible.
In the following example both copies mentioned above cannot be omitted because the class types do not match:
The return statement does not match the return type
The target type does not match the return type of the function
I just created two additional types:
#include <iostream>
using namespace std;
struct A
{
A(void) { cout << "A constructor " << endl; }
~A(void) { cout << "A destructor " << endl; }
};
struct B
{
B(A const&) { cout << "B copy from A" << endl; }
~B(void) { cout << "B destructor " << endl; }
};
struct C
{
C(B const &) { cout << "C copy from B" << endl; }
~C(void) { cout << "C destructor " << endl; }
};
B function() { A my; cout << "function" << endl; return my; }
int main()
{
C my_in_main(function());
return 0;
}
Here we have the "completely non-optimized behaviour" I mentioned above. I'll refer to the points I've drawn there.
A constructor (see 2.)
function (see 3.)
B copy from A (see 4.)
A destructor (see 5.)
C copy from B (see 6.)
B destructor (see 7.)
C destructor (instance in main, destroy at end of main)

c++ returning const reference of local variable

Is it possible/OK to return a const reference even if the value the function returns is a local variable of this function? I know that locals are not valid anymore once the function returns, but what if the function is inlined and the returned value is only used within the caller's scope? Then the locals of the function should be included in the caller's stack frame, no?
Don't count on it. Even if this works on 1 compiler, it's not standard supported behavior and is likely to break on others.
No, it's not OK. Local variables are declared on the stack, and the stack keeps changing between method calls. Also, the objects that get out of scope get destroyed. Always return a copy of a local variable.
Consider this code:
#include <iostream>
using namespace std;
class MyClass
{
public:
MyClass() { cout << "ctor" << endl; }
~MyClass() { cout << "dtor" << endl; }
MyClass(const MyClass& r) { cout << "copy" << endl; }
};
const MyClass& Test()
{
MyClass m;
return m;
}
int main()
{
cout << "before Test" << endl;
MyClass m = Test();
cout << "after Test" << endl;
}
This will print out:
before Test
ctor
dtor
copy
after Test
dtor
The object you're trying to copy has already called its destructor and may be in an invalid state.
inline is not a guarantee -- it's a suggestion. Even if you use tricks to force inline, you'll never be sure about the result, especially if you want to remain portable.
Hence, don't do it.
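If the goal is to avoid an extra named copy at the call site, a safe alternative is to return by value and bind the result to a const reference, which extends the temporary's lifetime to that of the reference. The following is a sketch with hypothetical names (Resource, make_resource), not a drop-in replacement for the code above.

```cpp
#include <cassert>

// Hypothetical type that tracks how many instances are currently alive.
struct Resource {
    static int alive;
    int payload;
    explicit Resource(int p) : payload(p) { ++alive; }
    Resource(const Resource& o) : payload(o.payload) { ++alive; }
    ~Resource() { --alive; }
};
int Resource::alive = 0;

// Returns by value: the caller never sees a reference to a dead local.
Resource make_resource() {
    Resource local(42);
    return local;
}

int read_through_const_ref() {
    // The returned temporary lives as long as `ref` does.
    const Resource& ref = make_resource();
    assert(Resource::alive == 1);
    return ref.payload;
}
```

Unlike the const MyClass& Test() version, the object read through ref is still alive when it is used, and it is destroyed when ref goes out of scope.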
Doing that invokes undefined behaviour.
There's no way of forcing a compiler to inline the function. inline is just a suggestion - so is __forceinline
Even if you could guarantee that the function would be inlined, the destructor for the variable in question will still be executed, leaving you with a reference to a dead object.
And the big one - C++'s concept of the stack is delimited by scope - not by function.
#include <iostream>
int main()
{
{
int a = 5;
std::cout << std::hex << "0x" << &a << std::endl;
}
{
int b = 10;
std::cout << std::hex << "0x" << &b << std::endl;
}
}
My compiler puts 'a' and 'b' at different memory addresses, except when I turn optimizations on. Yours may well decide that it's an optimization to reuse the memory your object previously occupied.
Is there a particular problem you're trying to solve here? There are other ways of reducing the number of temporary objects created if that's your concern.
As others have noted, this is dangerous. It's also unnecessary, if your compiler supports the NRVO (Named Return Value Optimization), and your function uses and returns the local variable you would have liked to return by ref in a fairly simple way.
The NRVO allows the compiler to avoid copy construction under certain conditions - typically the main reason to avoid returning objects by value. VC++ 8 supports this (a delta on previous revisions) and it makes quite a bit of perf diff in frequently used code.
The value falls out of scope when the callee falls out of scope. So no, it is gone.
But if you want a fairly ugly solution (and a red flag warning you that your design might need refactoring), you can do something like this:
const MyObj& GetObj()
{
static const MyObj obj_;
return obj_;
}
...but this solution is fraught with peril, especially if the object is modifiable, or does something non-trivial in a multithreaded environment.
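For completeness, a sketch of what the static approach gives you: every call returns a reference to the same object, which is initialized on the first call (since C++11 that initialization is thread-safe, although later use of the object is not synchronized for you). Config and get_config are made-up names for this illustration.

```cpp
#include <cassert>

// Hypothetical read-only settings object.
struct Config {
    int verbosity;
    Config() : verbosity(1) {}
};

const Config& get_config() {
    // Constructed once, on the first call, and destroyed at program exit.
    static const Config instance;
    return instance;
}

bool same_instance_each_call() {
    return &get_config() == &get_config();
}
```

Because the object is shared by all callers, keeping it const as above avoids most of the perils mentioned.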
The inline keyword doesn't guarantee that the function is really inlined. Don't do it.