C++ copy constructor double call on member initialization - c++

Consider the below code, where a composing class with another class as its member is being instantiated:
class CopyAble {
private:
int mem1;
public:
CopyAble(int n1) : mem1(n1) {
cout << "Inside the CopyAble constructor" << endl;
}
CopyAble(const CopyAble& obj) {
cout << "Inside the CopyAble copy constructor" << endl;
this->mem1 = obj.mem1;
return *this;
}
CopyAble& operator=(const CopyAble& obj) {
cout << "Inside the CopyAble assignment constructor" << endl;
this->mem1 = obj.mem1;
}
~CopyAble() {};
};
class CopyAbleComposer {
private:
CopyAble memObj;
public:
CopyAbleComposer(CopyAble m1) : memObj(m1) {
cout << "Composing the composer" << endl;
}
~CopyAbleComposer() {}
};
int main()
{
CopyAble ca(10);
CopyAbleComposer cac(ca);
return 0;
}
When I run this, I get the output:
Inside the CopyAble constructor
Inside the CopyAble copy constructor
Inside the CopyAble copy constructor
Composing the composer
Which means that the CopyAble copy constructor is being run twice - once when the CopyAble object is passed into the CopyAbleComposer constructor, and again when the initializer memObj(m1) runs.
Is this an idiomatic use of the copy constructor? It seems very inefficient that the copy constructor runs twice when we try to initialize a member object with a passed-in object of the same type, and it seems like a trap a lot of C++ programmers can easily fall into without realizing it.
EDIT: I don't think this is a duplicate of the question regarding passing a reference into the copy constructor. Here, we are being forced to pass a reference into a regular constructor to avoid duplicate object creation, my question was that is this generally known that class constructors in C++ should have objects passed in by reference to avoid this kind of duplicate copy?

You should accept CopyAble by reference at CopyAbleComposer(CopyAble m1), otherwise a copy constructor will be called to construct an argument. You should also mark it as explicit to avoid accidental invocations:
explicit CopyAbleComposer(const CopyAble & m1)

Pass-by-value and the associated copying is a pretty widely known property of C++. Actually, in the past C++ was criticized for this gratuitious copying, which happened silently, was hard to avoid and could lead to decreased performance. This is humorously mentioned e.g. here:
You accidentally create a dozen instances of yourself and shoot them all in the foot. Providing emergency medical assistance is impossible since you can't tell which are bitwise copies and which are just pointing at others and saying, "That's me, over there."
C++98
When any function/method is declared to receive an argument by value, this sort of copying happens. It doesn't matter if it's a constructor, a "stand-alone" function or a method. To avoid this, use a const reference:
CopyAbleComposer(const CopyAble& m1) : memObj(m1)
{
...
}
Note: even if you rearrange your code as below, one copy always remains. This has been a major deficiency in C++ for a long time.
CopyAbleComposer cac(CopyAble(10)); // initializing mem1 by a temporary object
C++11
C++11 introduced move semantics, which replaces the additional copy by a "move" operation, which is supposed to be more efficient than copy: in the common case where an object allocates memory dynamically, "move" only reassigns some pointers, while "copy" allocates and deallocates memory.
To benefit from optimization offered by move semantics, you should undo the "optimization" you maybe did for C++98, and pass arguments by value. In addition, when initializing the mem1 member, you should invoke the move constructor:
CopyAbleComposer(CopyAble m1) : memObj(std::move(m1)) {
cout << "Composing the composer" << endl;
}
Finally, you should implement the move constructor:
CopyAble(CopyAble&& obj) {
cout << "Inside the CopyAble move constructor" << endl;
this->mem1 = obj.mem1;
}
Then you should see that the "copy" message doesn't appear, and is replaced by the "move" message.
See this question for more details.
Note: In all these examples, the CopyAble objects are assumed to be much more complex, with copy and move constructors doing non-trivial work (typically, resource management). In modern C++, resource management is considered a separate concern, in the context of separation of concerns. That is, any class that needs a non-default copy or move constructor, should be as small as possible. This is also called the Rule of Zero.

Related

Passing functor to std::thread by value: Why is copy constructor called twice? [duplicate]

Why in the example code below, object is copied twice? According documentation constructor of thread class copies all arguments to thread-local storage so we have reason for the first copy. What about second?
class A {
public:
A() {cout << "[C]" << endl;}
~A() {cout << "[~D]" << endl;}
A(A const& src) {cout << "[COPY]" << endl;}
A& operator=(A const& src) {cout << "[=}" << endl; return *this;}
void operator() () {cout << "#" << endl;}
};
void foo()
{
A a;
thread t{a};
t.join();
}
Output from above:
[C]
[COPY]
[COPY]
[~D]
#
[~D]
[~D]
Edit:
Well yes, after adding move constructor:
A(A && src) {cout << "[MOVE]" << endl;}
The output is like this:
[C]
[COPY]
[MOVE]
[~D]
#
[~D]
[~D]
For anything you want to move or avoid copies, prefer move constructors and std::move.
But Why doesn't this happen automatically for me?
Move in C++ is conservative. It generally will only move if you explicitly write std::move(). This was done because move semantics, if extended beyond very explicit circumstances, might break older code. Automatic-moves are often restricted to very careful set of circumstances for this reason.
In order to avoid copies in this situation, you need to shift a around by using std::move(a) (even when passing it into std::thread). The reason it makes a copy the first time around is because std::thread can't guarantee that the value will exist after you have finished constructing the std::thread (and you haven't explicitly moved it in). Thusly, it will do the safe thing and make a copy (not take a reference/pointer to what you passed in and store it: the code has no idea whether or not you'll keep it alive or not).
Having both a move constructor and using std::move will allow the compiler to maximally and efficiently move your structure. If you're using VC++ (with the CTP or not), you must explicitly write the move constructor, otherwise MSVC will (even sometimes erroneously) declare and use a Copy constructor.
The object is copied twice because the object cannot be moved. The standard does not require this, but it is legitimate behavior.
What's happening inside of the implementation is that it seems to be doing a decay_copy of the parameters, as required by the standard. But it doesn't do the decay_copy into the final destination; it does it into some internal, possibly stack, storage. Then it moves the objects from that temporary storage to the final location within the thread. Since your type is not moveable, it must perform a copy.
If you make your type moveable, you'll find that the second copy becomes a move.
Why might an implementation do this, rather than just copying directly into the final destination? There could be any number of implementation-dependent reasons. It may have just been simpler to build a tuple of the function+parameters on the stack, then move that into the eventual destination.
try with: thread t{std::move(a)};

Is pass by value that much faster?

I've heard that you should always prefer "pass by value" in C++11 because of the introduction of move semantics. I wanted to see what the hype was all about and constructed a test case. First my class:
struct MyClass {
MyClass() { }
MyClass(const MyClass&) { std::cout << "Copy construct" << std::endl; }
MyClass(MyClass&&) { std::cout << "Move construct" << std::endl; }
~MyClass() { }
};
And the test harness:
class Test
{
public:
void pass_by_lvalue_ref(const MyClass& myClass)
{
_MyClass.push_back(myClass);
}
void pass_by_rvalue_ref(MyClass&& myClass)
{
_MyClass.push_back(std::move(myClass));
}
void pass_by_value(MyClass myClass)
{
_MyClass.push_back(std::move(myClass));
}
private:
std::vector<MyClass> _MyClass;
};
Presumably, pass_by_value should outperform pass_by_lvalue_ref and pass_by_rvalue_ref (together, not separately).
int main()
{
MyClass myClass;
Test Test;
std::cout << "--lvalue_ref--\n";
Test.pass_by_lvalue_ref(myClass);
std::cout << "--rvalue_ref--\n";
Test.pass_by_rvalue_ref(MyClass{});
std::cout << "--value - lvalue--\n";
Test.pass_by_value(myClass);
std::cout << "--value - rvalue--\n";
Test.pass_by_value(MyClass{});
}
This is my output on GCC 4.9.2 with -O2:
--lvalue_ref--
Copy construct
--rvalue_ref--
Move construct
Copy construct
--value - lvalue--
Copy construct
Move construct
Copy construct
Copy construct
--value - rvalue--
Move construct
As you can see, the non-pass_by_value functions requires a total of 2 copy constructs and 1 move construct. The pass_by_value function requires a total of 3 copy constructs and 2 move constructs. It looks like that, as expected, the object is going to be copied anyway, so why does everyone say pass by value?
First, your reporting is entirely flawed. Each of your functions pushes back to the same vector. When that vector runs out of capacity (which depends upon how many items you've inserted so far), it is going to trigger a re-allocation which will require more moves and/or copies than an insertion which doesn't trigger an allocation.
Second, std::vector::push_back has a strong exception safety guarantee. So if your move constructor is not noexcept, it will not use it (unless the class is non-copyable). It will use the copy constructor instead.
Third,
I've heard that you should always prefer "pass by value" in C++11
because of the introduction of move semantics.
I'm pretty sure you didn't hear that from any reputable source. Or are actually inappropriately paraphrasing what was actually said. But I don't have the source of the quote. What is usually advised is actually that if you are going to copy your arguments in your function anyway, don't. Just do it in the parameter list (via pass by value). This will allow your function to move r-value arguments straight to their destination. When you pass l-values, they will be copied, but you were going to do that anyway.
If you are going to make an internal copy, then passing by value will do exactly one move construct more than the pair of overloads (pass by rvalue ref)+(pass by const lvalue ref).
If move construct is cheap, this is a small amount of runtime overhead in exchange for less compile time and code maintenance overhead.
The idiom is "Want speed? Making a copy anyhow? Pass by value, instead of by const lvalue reference." in reality.
Finally, your benchmark is flawed as you failed to reserve(enough) before your push backs. Reallocation can cause extra operations. Oh, and make your move constructor noexcept, as conforming libraries will prefer a copy to a move if move can throw in many situations.

Returning automatic local objects in C++

I'm trying to understand some aspects of C++.
I have written this short programme to show different ways of returning objects from functions in C++:
#include <iostream>
using namespace std;
// A simple class with only one private member.
class Car{
private:
int maxSpeed;
public:
Car( int );
void print();
Car& operator= (const Car &);
Car(const Car &);
};
// Constructor
Car::Car( int maxSpeed ){
this -> maxSpeed = maxSpeed;
cout << "Constructor: New Car (speed="<<maxSpeed<<") at " << this << endl;
}
// Assignment operator
Car& Car::operator= (const Car &anotherCar){
cout << "Assignment operator: copying " << &anotherCar << " into " << this << endl;
this -> maxSpeed = anotherCar.maxSpeed;
return *this;
}
// Copy constructor
Car::Car(const Car &anotherCar ) {
cout << "Copy constructor: copying " << &anotherCar << " into " << this << endl;
this->maxSpeed = anotherCar.maxSpeed;
}
// Print the car.
void Car::print(){
cout << "Print: Car (speed=" << maxSpeed << ") at " << this << endl;
}
// return automatic object (copy object on return) (STACK)
Car makeNewCarCopy(){
Car c(120);
return c; // object copied and destroyed here
}
// return reference to object (STACK)
Car& makeNewCarRef(){
Car c(60);
return c; // c destroyed here, UNSAFE!
// compiler will say: warning: reference to local variable ā€˜cā€™ returned
}
// return pointer to object (HEAP)
Car* makeNewCarPointer(){
Car * pt = new Car(30);
return pt; // object in the heap, remember to delete it later on!
}
int main(){
Car a(1),c(2);
Car *b = new Car(a);
a.print();
a = c;
a.print();
Car copyC = makeNewCarCopy(); // safe, but requires copy
copyC.print();
Car &refC = makeNewCarRef(); // UNSAFE
refC.print();
Car *ptC = makeNewCarPointer(); // safe
if (ptC!=NULL){
ptC -> print();
delete ptC;
} else {
// NULL pointer
}
}
The code doesn't seem to crash, and I get the following output:
Constructor: New Car (speed=1) at 0x7fff51be7a38
Constructor: New Car (speed=2) at 0x7fff51be7a30
Copy constructor: copying 0x7fff51be7a38 into 0x7ff60b4000e0
Print: Car (speed=1) at 0x7fff51be7a38
Assignment operator: copying 0x7fff51be7a30 into 0x7fff51be7a38
Print: Car (speed=2) at 0x7fff51be7a38
Constructor: New Car (speed=120) at 0x7fff51be7a20
Print: Car (speed=120) at 0x7fff51be7a20
Constructor: New Car (speed=60) at 0x7fff51be79c8
Print: Car (speed=60) at 0x7fff51be79c8
Constructor: New Car (speed=30) at 0x7ff60b403a60
Print: Car (speed=30) at 0x7ff60b403a60
Now, I have the following questions:
Is makeNewCarCopy safe? Is the local object being copied and destroyed at the end of the function? If so, why isn't it calling the overloaded assignment operator? Does it call the default copy constructor?
My guts tell me to use makeNewCarPointer as the most usual way of returning objects from a C++ function/method. Am I right?
Is makeNewCarCopy safe? Is the local object being copied and destroyed
at the end of the function? If so, why isn't it calling the overloaded
assignment operator? Does it call the default copy constructor?
The important question here is "Is makeNewCarCopy safe?" The answer to that question is, "yes." You are making a copy of the object and returning that copy by-value. You do not attempt to return a reference to a local automatic object, which is a common pitfall among newbies, and that is good.
The answers to the other parts of this question are philisophically less important, although once you know how to do this safely they may become critically important in production code. You may or may not see construction and destruction of the local object. In fact, you probably won't, especially when compiling with optimizations turned on. The reason is because the compiler knows that you are creating a temporary and returning that, which in turn is being copied somewhere else. The temporary becomes meaningless in a sense, so the compiler skips the whole bothersome create-copy-destroy step and simply constructs the new copy directly in the variable where it's ultimately intended. This is called copy elision. Compilers are allowed to make any and all changes to your program so long as the observable behavior is the same as if no changes were made (see: As-If Rule) even in cases where the copy constructor has side-effects (see: Return Value Optimization) .
My guts tell me to use makeNewCarPointer as the most usual way of
returning objects from a C++ function/method. Am I right?
No. Consider copy elision, as I described it above. All contemporary, major compilers implement this optimization, and do a very good job at it. So if you can copy by-value as efficiently (at least) as copy by-pointer, is there any benefit to copy by-pointer with respect to performance?
The answer is no. These days, you generally want to return by-value unless you have compelling need not to. Among those compelling needs are when you need the returned object to outlive the "scope" in which it was created -- but not among those is performance. In fact, dynamic allocation can be significantly more expensive time-wise than automatic (ie, "stack") allocation.
Yes, makeNewCarCopy is safe. Theoretically there will be a copy made as the function exits, however because of the return value optimization the compiler is allowed to remove the copy.
In practice this means that makeNewCarCopy will have a hidden first parameter which is a reference to an uninitialized Car and the constructor call inside makeNewCarCopy will actually initialize the Car instance that resides outside of the function's stack frame.
As to your second question: Returning a pointer that has to be freed is not the preferred way. It's unsafe because the implementation detail of how the function allocated the Car instance is leaked out and the caller is burdened with cleaning it up. If you need dynamic allocation then I suggest that you return an std::shared_ptr<Car> instead.
Yes makeNewCarCopy is safe. And in most cases it is effective as compiler can do certain optimizations like copy elision (and that the reason you do not see assignment operator or copy ctor called) and/or move semantics added by C++11
makeNewCarPointer can be very effective, but is is very dangerous at the same time. The problem is you can easily ignore return value and compiler will not produce any warnings. So at least you should return smart pointer like std::unique_ptr or std::shared_ptr. But IMHO previous method is more preferred and would be at least not slower. Different story if you have to create object on heap by different reason.

std::thread Why object is copied twice?

Why in the example code below, object is copied twice? According documentation constructor of thread class copies all arguments to thread-local storage so we have reason for the first copy. What about second?
class A {
public:
A() {cout << "[C]" << endl;}
~A() {cout << "[~D]" << endl;}
A(A const& src) {cout << "[COPY]" << endl;}
A& operator=(A const& src) {cout << "[=}" << endl; return *this;}
void operator() () {cout << "#" << endl;}
};
void foo()
{
A a;
thread t{a};
t.join();
}
Output from above:
[C]
[COPY]
[COPY]
[~D]
#
[~D]
[~D]
Edit:
Well yes, after adding move constructor:
A(A && src) {cout << "[MOVE]" << endl;}
The output is like this:
[C]
[COPY]
[MOVE]
[~D]
#
[~D]
[~D]
For anything you want to move or avoid copies, prefer move constructors and std::move.
But Why doesn't this happen automatically for me?
Move in C++ is conservative. It generally will only move if you explicitly write std::move(). This was done because move semantics, if extended beyond very explicit circumstances, might break older code. Automatic-moves are often restricted to very careful set of circumstances for this reason.
In order to avoid copies in this situation, you need to shift a around by using std::move(a) (even when passing it into std::thread). The reason it makes a copy the first time around is because std::thread can't guarantee that the value will exist after you have finished constructing the std::thread (and you haven't explicitly moved it in). Thusly, it will do the safe thing and make a copy (not take a reference/pointer to what you passed in and store it: the code has no idea whether or not you'll keep it alive or not).
Having both a move constructor and using std::move will allow the compiler to maximally and efficiently move your structure. If you're using VC++ (with the CTP or not), you must explicitly write the move constructor, otherwise MSVC will (even sometimes erroneously) declare and use a Copy constructor.
The object is copied twice because the object cannot be moved. The standard does not require this, but it is legitimate behavior.
What's happening inside of the implementation is that it seems to be doing a decay_copy of the parameters, as required by the standard. But it doesn't do the decay_copy into the final destination; it does it into some internal, possibly stack, storage. Then it moves the objects from that temporary storage to the final location within the thread. Since your type is not moveable, it must perform a copy.
If you make your type moveable, you'll find that the second copy becomes a move.
Why might an implementation do this, rather than just copying directly into the final destination? There could be any number of implementation-dependent reasons. It may have just been simpler to build a tuple of the function+parameters on the stack, then move that into the eventual destination.
try with: thread t{std::move(a)};

how to get stl map to construct/destruct inserted object only once

I have found a very prejudicial fact about stl maps. For some reason I cant get objects being inserted in the map to get constructed/destructed only once.
Example:
struct MyObject{
MyObject(){
cout << "constructor" << endl;
}
~MyObject(){
cout << "destructor" << endl;
}
};
int main() {
std::map<int, MyObject> myObjectsMap;
myObjectsMap[0] = MyObject();
return 0;
}
returns:
constructor
destructor
destructor
constructor
destructor
If I do:
typedef std::pair<int, MyObject> MyObjectPair;
myObjectsMap.insert( MyObjectPair(0,MyObject()));
returns:
constructor
destructor
destructor
destructor
I'm inserting Objects responsible for their own memory allocation, so when destructed they'll clean themselves up, being destructed several times is causing me some trouble.
I suggest you add a copy constructor - that's what is being used for the 'missing' constructions I think.
Code:
#include <iostream>
#include <map>
using namespace std;
struct MyObject{
MyObject(){
cout << "no-arg constructor" << endl;
}
MyObject(const MyObject&) {
cout << "const copy constructor" << endl;
}
~MyObject(){
cout << "destructor" << endl;
}
};
int main() {
std::map<int, MyObject> myObjectsMap;
myObjectsMap[0] = MyObject();
return 0;
}
Output:
no-arg constructor
const copy constructor
const copy constructor
destructor
destructor
no-arg constructor
destructor
destructor
std::map is allowed to make as many copies of your objects as it wishes. This is implementation defined and you have no control over this. The "missing" constructions you notice, by the way, may be for calling the copy-constructor, which you didn't define.
What you can do, however is use a flyweight so constructing an object in fact fetches a existing object from a pool of pre-existing objects, and destructing an object does nothing. The pool never frees its objects, but it always maintains a handle over all of them. This way, your memory usage is large from start to stop, but it doesn't change much throughout the life of the program.
To be able to be used in a standard container your objects must be copyable and assignable. If your objects don't conform to this you may are likely to have problems.
That said, if (as your sample code indicates) you just need a default constructed object inserted in the map you can just use operator[] for its side effect:
// Insert default constructed MyObject at key 0
myObjectsMap[0];
Edit
I'm not quite clear from your question, but if you're unclear about the number of objects constructed and believe there is a constructor/destructor mismatch then note that the compiler will provide a copy constructor which doesn't log to std::cout as you don't provide a user-declared one.
When you say myObjectsMap[0], you are calling the default constructor for MyObject. This is because there is nothing in [0] yet, and you just accessed it. It's in the manual.
When you hit MyObject(); you are creating a temporary MyObject instance using the default constructor.
Because you allowed the compiler to define your copy constructor, you see more destructor than constructor messages. (There can only be one destructor, but many constructors, unlike building a house.) If you object should never be copied this way, then you likely want to declare a private copy constructor and copy assignment operator.
You're calling the default constructor and the copy constructor twice each with this code:
myObjectsMap[0] = MyObject();
When you do this:
myObjectsMap.insert( MyObjectPair(0,MyObject()));
you call the default constructor once and the copy constructor 3 times.
You should likely use pointers as map values instead of the objects themselves, in particular I suggest looking at shared_ptr.
note: tests were done using GCC 3.4.5 on a Windows NT 5.1 machine.
That's the way map and the other containers work, you can't get around it. That's why std::auto_ptr can't be used in a collection, for example.