Best way to overload the C++ assignment operator - c++

I have a class A which dynamically allocates memory for an integer (pointed to by a class member, say _pPtrMem) in its constructor and deallocates it in its destructor. To avoid a shallow copy, I have overloaded the assignment operator and the copy constructor. The widely used way of overloading the assignment operator is as follows:
A& operator = (const A & iToAssign)
{
if (this == & iToAssign) // Check for self assignment
return *this;
int * pTemp = new int(*(iToAssign._pPtrMem)); // Allocate new memory with same value
if (pTemp)
{
delete _pPtrMem; // Delete the old memory
_pPtrMem = pTemp; // Assign the newly allocated memory
}
return *this; // Return the reference to object for chaining(a = b = c)
}
Another way for implementing the same could be
A& operator = (const A & iToAssign)
{
*_pPtrMem= *(iToAssign._pPtrMem); // Just copy the values
return *this;
}
Since the second version is much simpler and faster (no deallocation or allocation of memory), why is it not widely used? Are there any problems that I am not able to make out?
Also, we return an object of the same type from the assignment operator for chaining (a = b = c). So instead of returning *this, is it fine to return the iToAssign object, as both objects are supposedly now equal?

Usually, the best way to implement a copy assignment operator is to provide a swap() function for your class (or use the standard one if it does what you want) and then implement the copy assignment operator via the copy constructor:
A& A::operator= (A iToAssign) // note pass by value here - will invoke copy constructor
{
iToAssign.swap(*this);
return *this;
}
// void swap(A& other) throw() // C++03
void A::swap(A& other) noexcept
{
std::swap(_pPtrMem, other._pPtrMem);
}
This makes sure that your copy assignment operator and copy constructor never diverge (that is, it cannot happen that you change one and forget to change the other).
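For completeness, here is a minimal sketch of how the pieces above might fit together in one class. The accessors value() and address() and the helper assignment_deep_copies() are hypothetical names added only so the behavior can be checked; they are not part of the question's class.

```cpp
#include <utility>  // std::swap

class A {
public:
    explicit A(int value = 0) : _pPtrMem(new int(value)) {}
    A(const A& other) : _pPtrMem(new int(*other._pPtrMem)) {}  // deep copy
    ~A() { delete _pPtrMem; }

    // Pass by value: the copy constructor builds the new state first.
    // If that copy throws, *this has not been touched yet.
    A& operator=(A iToAssign) {
        iToAssign.swap(*this);  // *this takes the copy, iToAssign takes the old state
        return *this;           // the old state is released when iToAssign dies
    }

    void swap(A& other) noexcept { std::swap(_pPtrMem, other._pPtrMem); }

    int value() const { return *_pPtrMem; }          // hypothetical accessor
    const int* address() const { return _pPtrMem; }  // hypothetical, for checking only

private:
    int* _pPtrMem;
};

// Hypothetical helper: true if assignment produced an independent deep copy.
bool assignment_deep_copies() {
    A a(1), b(2);
    a = b;
    return a.value() == 2 && a.address() != b.address();
}
```

Self-assignment needs no special case in this form: the parameter is already a separate copy by the time the swap happens.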

No, there is no problem with your implementation. But dynamically allocating a single integer is at least very unusual.
This implementation is not widely used because no one allocates a single integer on the free store. You usually use dynamically allocated memory for arrays whose length is unknown at compile time. And in that case it is most of the time a good idea to just use std::vector.
No, it's not fine to return a different object. Identity is not the same as equality:
T a, b, d;
T& c = a = b;
c = d; // should change a, not b
Would you expect that the third line changes b?
Or an even better example:
T a;
T& b = a = T();
This would result in a dangling reference to a temporary object that has already been destroyed.
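The first snippet above can be turned into a small checkable sketch. T here is a hypothetical stand-in with a single int member, and value_of_a_after_chain() is an illustrative helper, not something from the question.

```cpp
struct T {
    int value;
    explicit T(int v = 0) : value(v) {}

    T& operator=(const T& other) {
        value = other.value;
        return *this;  // the left operand itself, not merely an object that compares equal
    }
};

// Hypothetical helper mirroring the snippet above: returns a's value after c = d.
int value_of_a_after_chain() {
    T a(1), b(2), d(3);
    T& c = (a = b);  // c is bound to a because operator= returns *this
    c = d;           // so this changes a and leaves b at 2
    return a.value;
}
```

Had operator= returned its argument instead, c would refer to b and the last assignment would silently modify the wrong object.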

The first version is used in case _pPtrMem is a pointer to some basic type, for instance a dynamically allocated array. In case the pointer is pointing to a single object with a correctly implemented assignment operator, the second version will do just as well. But in that case I don't think you will need to use a pointer at all.

In the second case, if _pPtrMem was initially unassigned, the line
*_pPtrMem= *(iToAssign._pPtrMem); // Just copy the values
causes an assignment to an invalid memory location (possibly resulting in a segmentation fault). It can only work if memory has been allocated for _pPtrMem prior to this call.

In this case, the second implementation is by far the better.
But the usual reason for using dynamic allocation in an object is that
the size may vary. (Another reason is because you want the
reference semantics of shallow copy.) In such cases, the
simplest solution is to use your first assignment (without the
test for self-assignment). Depending on the object, one might
consider reusing the memory already present if the new value
will fit; it adds to the complexity somewhat (since you have to
test whether it will fit, and still do the
allocation/copy/delete if it doesn't), but it can improve
performance in certain cases.
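A rough sketch of the "reuse the memory if the new value fits" variant, under the assumption of a simple int buffer. IntBuffer and all of its members are hypothetical names invented for this illustration.

```cpp
#include <cstddef>  // std::size_t

class IntBuffer {  // hypothetical class, not from the question
public:
    explicit IntBuffer(std::size_t n, int fill = 0)
        : data_(new int[n]), size_(n), capacity_(n) {
        for (std::size_t i = 0; i < n; ++i) data_[i] = fill;
    }
    IntBuffer(const IntBuffer& other)
        : data_(new int[other.size_]), size_(other.size_), capacity_(other.size_) {
        for (std::size_t i = 0; i < size_; ++i) data_[i] = other.data_[i];
    }
    ~IntBuffer() { delete[] data_; }

    IntBuffer& operator=(const IntBuffer& other) {
        if (other.size_ > capacity_) {
            // Does not fit: allocate first, so a failed allocation leaves *this intact.
            int* fresh = new int[other.size_];
            delete[] data_;
            data_ = fresh;
            capacity_ = other.size_;
        }
        // Fits (or now fits): copy element by element into the existing storage.
        for (std::size_t i = 0; i < other.size_; ++i) data_[i] = other.data_[i];
        size_ = other.size_;
        return *this;
    }

    std::size_t size() const { return size_; }
    int operator[](std::size_t i) const { return data_[i]; }
    const int* data() const { return data_; }  // exposed only to observe reuse

private:
    int* data_;
    std::size_t size_;
    std::size_t capacity_;
};
```

No self-assignment test is needed here either: when an object is assigned to itself the size always fits, so nothing is deleted and the loop copies each element onto itself.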

Related

Side effect c++ vector copy and deletion

I have this problem:
void foo(vector<int> &a){
vector<int> b;
b.push_back(1); // in general many push backs
a = b;
}
Since b is a local variable, it will be destroyed when foo ends.
Will a retain the values inserted into b?
I tried it and it does, but maybe that is only due to chance.
Thank you for your help.
EDIT:
I think this answers my question. Isn't it?
http://www.cplusplus.com/reference/vector/vector/operator=/
So basically a will retain the values in C++98. But I cannot fully understand what it does in C++11.
a = b will invoke the copy assignment operator of std::vector. This operator copies all the contents of the vector from b to a so b can be safely deleted afterwards.
Considering your edit:
This answer is also true for C++11. C++11 only adds a move assignment operator, but this operator is not invoked since b is not an r-value.
If b were an r-value (e.g., by wrapping it in std::move(b)), the move assignment operator would be invoked, which would also be fine. It would move the contents of b to a, leaving b in a valid (typically empty) state, so b could be safely destroyed as well.
What happens in C++ 11 is that the assignment operator will do a copy unless you explicitly tell it to move using std::move.
To force the move you could do this:
void foo(vector<int> &a){
vector<int> b;
b.push_back(1); // in general many push backs
a = std::move(b);
}
For this to work, vector has implemented the assignment operator for move something like this:
vector<T>& vector<T>::operator =(vector<T>&& to_move) {}
Using std::move tells the compiler to select the overload that implements a move; otherwise it falls back to the const vector<T>& overload, as it does on older implementations that lack move semantics.
Move might or might not be faster depending on how this is implemented, but for vector you would expect the internal pointer to be moved from the old vector to the new vector, avoiding a copy of all contained objects.
e.g.
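vector<T>& vector<T>::operator =(vector<T>&& to_move)
{
if (this != &to_move) // guard against self-move, e.g. v = std::move(v)
{
delete [] m_ptrs; // release the buffer currently owned
m_ptrs = to_move.m_ptrs; // take over the source's buffer
m_size = to_move.m_size;
to_move.m_ptrs = nullptr; // leave the source empty but safely destructible
to_move.m_size = 0;
}
return *this;
}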
This is of course an educated guess, but hopefully it demonstrates what might happen when using move semantics.
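Both variants of foo can be checked side by side. The function names fill_by_copy and fill_by_move are made up for this sketch; the only library facilities used are std::vector and std::move.

```cpp
#include <utility>  // std::move
#include <vector>

// Copy assignment: a receives its own copy of b's elements.
void fill_by_copy(std::vector<int>& a) {
    std::vector<int> b;
    b.push_back(1);
    b.push_back(2);
    a = b;  // b is an lvalue, so the copy assignment operator is chosen
}           // b is destroyed here; a is unaffected

// Move assignment: a typically takes over b's buffer instead of copying it.
void fill_by_move(std::vector<int>& a) {
    std::vector<int> b;
    b.push_back(1);
    b.push_back(2);
    a = std::move(b);  // b is left in a valid but unspecified state
}                      // destroying the moved-from b is still safe
```

In both cases the caller ends up with a vector holding 1 and 2; the difference is only in how much work the assignment does.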
If those values are not pointers, they will be copied by a = b, so they will be preserved (they were also copied into b when they were inserted, see http://www.cplusplus.com/reference/vector/vector/push_back/).
If they are pointers, they will be copied too, but if they were pointing to local variables you will end up with a collection of invalid pointers.
If, on the other hand, they were pointers to dynamically allocated memory, you would not run into that problem.

calling copy constructor and operator= for class data members

Below is some example code (for learning purposes only). Classes A and B are independent and have copy constructors and operator=.
class C
{
public:
C(string cName1, string cName2): a(cName1), b(new B(cName2)) {}
C(const C &c): a(c.a), b(new B(*(c.b))) {}
~C(){ delete b; }
C& operator=(const C &c)
{
if(&c == this) return *this;
a.operator=(c.a);
//1
delete b;
b = new B(*(c.b));
//What about this:
/*
//2
b->operator=(*(c.b));
//3
(*b).operator=(*(c.b));
*/
return *this;
}
private:
A a;
B *b;
};
There are three ways of assigning to data member b. In fact, the first of them calls the copy constructor. Which one should I use? //2 and //3 seem to be equivalent.
I decided to move my answer to answers and elaborate.
You want to use 2 or 3 because 1 reallocates the object entirely: you do all the work to clean up, and then do all the work to reallocate and reinitialize the object. Copy assignment, however:
*b = *c.b;
and the variants you used in your code simply copy the data.
However, we've got to ask: why are you doing it this way in the first place?
There are two reasons, in my mind, to have pointers as members of the class. The first is using b as an opaque pointer. If that is the case, then you don't need to keep reading.
However, what is more likely is that you are trying to use polymorphism with b, i.e., you have classes D and E that inherit from B. In that case, you CANNOT use the assignment operator! Think about it this way:
B* src_ptr = new D();//pointer to D
B* dest_ptr = new E();//pointer to E
*dest_ptr = *src_ptr;//what happens here?
What happens?
Well, the compiler sees the following function call with the assignment operator:
B& = const B&
It is only aware of the members of B: it can't clean up the no longer used members of E, and it can't really translate from D to E.
In this situation, it is often better to use approach 1 rather than trying to discern the subtypes, and to use a clone-style operation:
class B
{
public:
virtual ~B() {} // needed so that delete through a B* is well defined
virtual B* clone() const = 0;
};
B* src_ptr = new E();//pointer to E
B* dest_ptr = new D();//pointer to D
delete dest_ptr;
dest_ptr = src_ptr->clone();
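A fuller sketch of the clone approach, assuming two trivial subclasses. The name() function and the Holder class are hypothetical additions made only so the result can be observed; unlike the snippet above, the assignment clones before deleting so that a throwing clone() leaves the target usable.

```cpp
#include <string>

class B {
public:
    virtual ~B() {}
    virtual B* clone() const = 0;
    virtual std::string name() const = 0;  // hypothetical, for illustration only
};

class D : public B {
public:
    B* clone() const override { return new D(*this); }
    std::string name() const override { return "D"; }
};

class E : public B {
public:
    B* clone() const override { return new E(*this); }
    std::string name() const override { return "E"; }
};

class Holder {  // hypothetical owner of a polymorphic B
public:
    explicit Holder(B* p) : b(p) {}
    Holder(const Holder& other) : b(other.b->clone()) {}
    ~Holder() { delete b; }

    Holder& operator=(const Holder& other) {
        B* fresh = other.b->clone();  // may throw; *this is still untouched
        delete b;                     // only now give up the old object
        b = fresh;
        return *this;
    }

    std::string name() const { return b->name(); }

private:
    B* b;
};
```

After assigning a Holder that owns an E to one that owns a D, the target owns a fresh E: the dynamic type follows the source, which plain *dest = *src through B& cannot achieve.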
It may be down to the example, but I actually don't even see why b is allocated on the heap. However, the reason why b is allocated on the heap informs how it needs to be copied/assigned. I think there are three reasons for objects to be allocated on the heap rather than being embedded or allocated on the stack:
The object is shared between multiple other objects. Obviously, in this case there is shared ownership and it isn't the object which is actually copied but rather a pointer to the object. Most likely the object is maintained using a std::shared_ptr<T>.
The object is polymorphic and the set of supported types is unknown. In this case the object is actually not copied but rather cloned using a custom, virtual clone() function from the base class. Since the type of the object assigned from doesn't have to be the same, both copy construction and assignment would actually clone the object. The object is probably held using a std::unique_ptr<T> or a custom clone_ptr<T> which automatically takes care of appropriate cloning of the type.
The object is too big to be embedded. Of course, that case doesn't really happen unless you happen to implement the large object and create a suitable handle for it.
In most cases I would actually implement the assignment operator in an identical form, though:
T& T::operator=(T other) {
this->swap(other);
return *this;
}
That is, for the actual copy of the assigned object the code would leverage the already written copy constructor and destructor (both are actually likely to be = defaulted) plus a swap() method which just exchanges resources between two objects (assuming equal allocators; if you need to take care of non-equal allocators things get more fun). The advantage of implementing the code like this is that the assignment is strongly exception-safe.
Getting back to your approach to the assignment: in no case would I first delete an object and then allocate the replacement. Also, I would start off by doing all the operations which may fail, and only then put the results into place:
C& C::operator=(C const& c)
{
std::unique_ptr<B> tmp(new B(*c.b)); // may throw; *this is untouched so far
this->a = c.a; // may throw; tmp cleans up the new B
B* old = this->b;
this->b = tmp.release(); // take ownership of the copy
delete old; // destroy the previous object
return *this;
}
Note that this code does not do a self-assignment check. I claim that any assignment operator which only works for self-assignment by explicitly guarding against it is not exception-safe; at least, it isn't strongly exception-safe. Making the case for the basic guarantee is harder, but in most cases I have seen, the assignment didn't even provide the basic guarantee, and the code in your question is no exception: if the allocation throws, this->b contains a stale pointer which can't be told apart from a valid pointer (it would, at the very least, need to be set to nullptr after the delete b; and before the allocation).
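To see the difference in practice, here is a sketch in which copying B can be made to fail on purpose. The fail_on_copy flag, the accessors, and the helper function are all hypothetical, and std::string stands in for the unspecified class A from the question.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

struct B {  // hypothetical payload whose copy constructor can be made to fail
    std::string name;
    bool fail_on_copy;
    explicit B(const std::string& n, bool fail = false) : name(n), fail_on_copy(fail) {}
    B(const B& other) : name(other.name), fail_on_copy(other.fail_on_copy) {
        if (fail_on_copy) throw std::runtime_error("copy failed");
    }
};

class C {
public:
    C(const std::string& a_name, const std::string& b_name, bool b_fails = false)
        : a(a_name), b(new B(b_name, b_fails)) {}
    C(const C& c) : a(c.a), b(new B(*c.b)) {}
    ~C() { delete b; }

    C& operator=(const C& c) {
        std::unique_ptr<B> tmp(new B(*c.b));  // everything that may throw comes first
        a = c.a;
        B* old = b;
        b = tmp.release();  // from here on nothing throws
        delete old;
        return *this;
    }

    const std::string& a_name() const { return a; }
    const std::string& b_name() const { return b->name; }

private:
    std::string a;  // stand-in for the question's class A
    B* b;
};

// Hypothetical check: a failed assignment must leave the target as it was.
bool failed_assignment_leaves_target_unchanged() {
    C target("a1", "b1");
    C source("a2", "b2", true);  // copying source's B will throw
    try {
        target = source;
    } catch (const std::runtime_error&) {
        return target.a_name() == "a1" && target.b_name() == "b1";
    }
    return false;  // the copy was expected to throw
}
```

With the delete-then-new order from the question, the same failure would leave target.b pointing at an already destroyed object.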
b->operator=(*(c.b));
(*b).operator=(*(c.b));
These two operations are equivalent and should be spelled
*this->b = *c.b;
or
*b = *c.b;
I prefer the qualified version, e.g., because it works even if b is a member of a base class that depends on a template parameter, but I know that most people don't like it. Using operator=() fails if the type of the object happens to be a built-in type. However, a plain assignment of a heap-allocated object doesn't make much sense, because the object shouldn't be allocated on the heap if that actually does the right thing.
If you use method 1 your assignment operator doesn't even provide the basic (exception) guarantee so that's out for sure.
Best is of course to compose by value. Then you don't even have to write your own copy assignment operator and let the compiler do it for you!
Next best, since it appears you will always have a valid b pointer, is to assign into the existing object: *b = *c.b;
a = c.a;
*b = *c.b;
Of course, if there is a possibility that b will be a null pointer the code should check that before doing the assignment on the second line.

Does this C++ code leak memory?

struct Foo
{
Foo(int i)
{
ptr = new int(i);
}
~Foo()
{
delete ptr;
}
int* ptr;
};
int main()
{
{
Foo a(8);
Foo b(7);
a = b;
}
//Do other stuff
}
If I understand correctly, the compiler will automatically create an assignment operator member function for Foo. However, that just takes the value of ptr in b and puts it in a. The memory allocated by a originally seems lost. I could do a call a.~Foo(); before making the assignment, but I heard somewhere that you should rarely need to explicitly call a destructor. So let's say instead I write an assignment operator for Foo that deletes the int pointer of the left operand before assigning the r-value to the l-value. Like so:
Foo& operator=(const Foo& other)
{
//To handle self-assignment:
if (this != &other) {
delete this->ptr;
this->ptr = other.ptr;
}
return *this;
}
But if I do that, then when Foo a and Foo b go out of scope, don't both their destructors run, deleting the same pointer twice (since they both point to the same thing now)?
Edit:
If I understand Anders K correctly, this is the proper way to do it:
Foo& operator=(const Foo& other)
{
//To handle self-assignment:
if (this != &other) {
delete this->ptr;
//Clones the int
this->ptr = new int(*other.ptr);
}
return *this;
}
Now, a cloned the int that b pointed to, and sets its own pointer to it. Perhaps in this situation, the delete and new were not necessary because it just involves ints, but if the data member was not an int* but rather a Bar* or whatnot, a reallocation could be necessary.
Edit 2:
The best solution appears to be the copy-and-swap idiom.
Whether this leaks memory?
No it doesn't.
It seems most of the people have missed the point here. So here is a bit of clarification.
The initial response of "No it doesn't leak" in this answer was Incorrect but the solution that was and is suggested here is the only and the most appropriate solution to the problem.
The solution to your woes is:
Do not use a pointer-to-integer member (int *); use a plain integer (int) instead. You don't really need a dynamically allocated member here; you can achieve the same functionality using an int member.
Note that in C++ you should use new as little as possible.
If for some reason (which I can't see in the code sample) you can't do without a dynamically allocated member, read on:
You need to follow the Rule of Three!
Why do you need to follow Rule of Three?
The Rule of Three states:
If your class needs either
a copy constructor,
an assignment operator,
or a destructor,
then it is likely to need all three of them.
Your class needs an explicit destructor of its own so it also needs an explicit copy constructor and copy assignment operator.
Since the copy constructor and copy assignment operator for your class are implicit, they are implicitly public as well, which means the class design allows objects of this class to be copied or assigned. The implicitly generated versions of these functions will only make a shallow copy of the dynamically allocated pointer member, which exposes your class to:
Memory Leaks &
Dangling pointers &
Potential Undefined Behavior of double deallocation
This basically means you cannot make do with the implicitly generated versions. You need to provide your own overloaded versions, and this is what the Rule of Three says to begin with.
The explicitly provided overloads should make a deep copy of the allocated member and it thus prevents all your problems.
How to implement the Copy assignment operator correctly?
In this case the most efficient and optimized way of providing a copy assignment operator is by using:
copy-and-swap Idiom
@GManNickG's famous answer provides enough detail to explain the advantages it provides.
Suggestion:
Also, you are much better off using a smart pointer as a class member rather than a raw pointer, which burdens you with explicit memory management. A smart pointer will implicitly manage the memory for you. What kind of smart pointer to use depends on the lifetime and ownership semantics intended for your member, and you need to choose an appropriate smart pointer as per your requirements.
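As one possible illustration of both suggestions combined, here is Foo with a std::unique_ptr<int> member and a copy-and-swap assignment. This assumes unique ownership is what is wanted; the helper assigned_copy_is_independent() is a hypothetical name used only for checking.

```cpp
#include <memory>

struct Foo {
    explicit Foo(int i) : ptr(new int(i)) {}
    Foo(const Foo& other) : ptr(new int(*other.ptr)) {}  // deep copy

    // Copy-and-swap: 'other' is already a deep copy when the body runs.
    Foo& operator=(Foo other) {
        ptr.swap(other.ptr);
        return *this;  // the previous int is freed when 'other' is destroyed
    }

    // No destructor needed: unique_ptr releases the int automatically.
    std::unique_ptr<int> ptr;
};

// Hypothetical helper reproducing the block from the question's main().
bool assigned_copy_is_independent() {
    Foo a(8);
    Foo b(7);
    a = b;
    return *a.ptr == 7 && *b.ptr == 7 && a.ptr.get() != b.ptr.get();
}
```

When a and b go out of scope, each releases its own int exactly once, so there is neither a leak nor a double delete.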
The normal way to handle this is to create a clone of the object the pointer points to; that is why it is important to have an assignment operator. When no assignment operator is defined, the default behavior is a memberwise copy (for a raw pointer, only the address is copied), which will cause a crash when both destructors try to delete the same object, and a memory leak since the object that a's ptr previously pointed to will never be deleted.
Foo a
           +-----+
a->ptr->   |     |
           +-----+
Foo b
           +-----+
b->ptr->   |     |
           +-----+
a = b
           +-----+
           |     |
           +-----+
a->ptr
        \  +-----+
b->ptr ->  |     |
           +-----+
When a and b go out of scope, delete will be called twice on the same object.
edit: as Benjamin/Als correctly pointed out the above is just referring to this particular example, see below in comments
The code as presented has Undefined Behavior. As such, if it leaks memory (as expected) then that is just one possible manifestation of the UB. It can also send an angry threatening letter to Barack Obama, or spew out red (or orange) nasal daemons, or do nothing, or act as if there was no memory leak, miraculously reclaiming the memory, or whatever.
Solution: instead of int*, use int, i.e.
struct Foo
{
Foo(int i): blah( i ) {}
int blah;
};
int main()
{
{
Foo a(8);
Foo b(7);
a = b;
}
//Do other stuff
}
That’s safer, shorter, far more efficient and far more clear.
No other solution presented for this question, beats the above on any objective measure.

Questions about a Segmentation Fault in C++ most likely caused by a custom copy constructor

I'm getting a segmentation fault which I believe is caused by the copy constructor. However, I can't find an example like this one anywhere online. I've read about shallow copy and deep copy but I'm not sure which category this copy would fall under. Anyone know?
MyObject::MyObject{
lots of things including const and structs, but no pointers
}
MyObject::MyObject( const MyObject& oCopy){
*this = oCopy;//is this deep or shallow?
}
const MyObject& MyObject::operator=(const MyObject& oRhs){
if( this != oRhs ){
members = oRhs.members;
.....//there is a lot of members
}
return *this;
}
MyObject::~MyObject(){
//there is nothing here
}
Code:
const MyObject * mpoOriginal;//this gets initialized in the constructor
int Main(){
mpoOriginal = new MyObject();
return DoSomething();
}
bool DoSomething(){
MyObject *poCopied = new MyObject(*mpoOriginal);//the copy
//lots of stuff going on
delete poCopied;//this causes the crash - can't step into using GDB
return true;
}
EDIT: Added operator= and constructor
SOLVED: Barking up the wrong tree, it ended up being a function calling delete twice on the same object
It is generally a bad idea to use the assignment operator like this in the copy constructor. This will default-construct all the members and then assign over them. It is much better to either just rely on the implicitly-generated copy constructor, or use the member initializer list to copy those members that need copying, and apply the appropriate initialization to the others.
Without details of the class members, it is hard to judge what is causing your segfault.
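The cost of assigning inside the copy constructor can be made visible with a member that counts how it is constructed. All three types and both helper functions below are hypothetical and exist only for this sketch.

```cpp
#include <string>

struct Counted {  // hypothetical member type that records how it is built
    static int default_constructions;
    std::string text;
    Counted() { ++default_constructions; }
    Counted(const Counted& other) : text(other.text) {}
    Counted& operator=(const Counted& other) { text = other.text; return *this; }
};
int Counted::default_constructions = 0;

class ViaAssignment {  // copy constructor written as in the question
public:
    ViaAssignment() {}
    ViaAssignment(const ViaAssignment& other) { *this = other; }  // default-construct, then assign
    ViaAssignment& operator=(const ViaAssignment& other) {
        member = other.member;
        return *this;
    }
    Counted member;
};

class ViaInitList {  // copy constructor using the member initializer list
public:
    ViaInitList() {}
    ViaInitList(const ViaInitList& other) : member(other.member) {}  // copy-construct directly
    Counted member;
};

// Number of default constructions of Counted caused by making one copy.
int default_constructions_via_assignment() {
    ViaAssignment original;
    int before = Counted::default_constructions;
    ViaAssignment copy(original);
    (void)copy;
    return Counted::default_constructions - before;
}

int default_constructions_via_init_list() {
    ViaInitList original;
    int before = Counted::default_constructions;
    ViaInitList copy(original);
    (void)copy;
    return Counted::default_constructions - before;
}
```

The first variant default-constructs the member and then overwrites it; the second constructs it once, directly from the source.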
According to your code you're not creating the original object... you're just creating a pointer like this:
const MyObject * mpoOriginal;
So the copy constructor copies bad data into the newly created object...
Wow....
MyObject::MyObject( const MyObject& oCopy)
{
*this = oCopy;//is this deep or shallow?
}
It is neither. It is a call to the assignment operator.
Since you have not finished the construction of the object, this is probably ill-advised (though perfectly valid). It is more traditional to define the assignment operator in terms of the copy constructor, though (see the copy and swap idiom).
const MyObject& MyObject::operator=(const MyObject& oRhs)
{
if( this != oRhs ){
members = oRhs.members;
.....//there is a lot of members
}
return *this;
}
Basically fine, though normally the result of assignment is not const.
But if you do it this way you need to divide up your processing a bit to make it exception-safe. It should look more like this:
const MyObject& MyObject::operator=(const MyObject& oRhs)
{
if( this == &oRhs ) // compare addresses, not a pointer with an object
{
return *this;
}
// Stage 1:
// Copy all members of oRhs that can throw during copy into temporaries.
// That way if they do throw you have not destroyed this object.
// Stage 2:
// Copy anything that can **not** throw from oRhs into this object.
// Use swap on the temporaries to copy them into the object in an exception-safe manner.
// Stage 3:
// Free any resources.
return *this;
}
Of course there is a simpler way of doing this using the copy and swap idiom:
MyObject& MyObject::operator=(MyObject oRhs) // use pass by value to get copy
{
this->swap(oRhs);
return *this;
}
void MyObject::swap(MyObject& oRhs) throw()
{
// Call swap on each member.
}
If there is nothing to do in the destructor don't declare it (unless it needs to be virtual).
MyObject::~MyObject(){
//there is nothing here
}
Here you are declaring a pointer (not an object), so the constructor is not called (as pointers don't have constructors).
const MyObject * mpoOriginal;//this gets initialized in the constructor
Here you are calling new to create the object.
Are you sure you want to do this? A dynamically allocated object must be destroyed; ostensibly via delete, but more usually in C++ you wrap pointers inside a smart pointer to make sure the owner correctly and automatically destroys the object.
int main()
{ //^^^^ Note main() has a lower case m
mpoOriginal = new MyObject();
return DoSomething();
}
But you probably don't want a dynamic object. What you want is an automatic object that is destroyed when it goes out of scope. Also, you probably should not be using a global variable (pass it as a parameter; otherwise your code relies on the side effects associated with global state).
int main()
{
const MyObject mpoOriginal;
return DoSomething(mpoOriginal);
}
You do not need to call new to make a copy just create an object (passing the object you want to copy).
bool DoSomething(MyObject const& data)
{
MyObject poCopied (data); //the copy
//lots of stuff going on
// No need to delete.
// delete poCopied;//this causes the crash - can't step into using GDB
// When it goes out of scope it is auto destroyed (as it is automatic).
return true;
}
What you are doing is making your copy constructor use the assignment operator (which you don't seem to have defined). Frankly I'm surprised it compiles, but because you haven't shown all your code maybe it does.
Write your copy constructor in the normal way, and then see if you still get the same problem. If it's true what you say about 'lots of things ... but I don't see any pointers' then you should not be writing a copy constructor at all. Try just deleting it.
I don't have a direct answer as for what exactly causes the segfault, but conventional wisdom here is to follow the rule of three, i.e. when you find yourself needing any of copy constructor, assignment operator, or a destructor, you better implement all three of them (c++0x adds move semantics, which makes it "rule of four"?).
Then, it's usually the other way around - the copy assignment operator is implemented in terms of copy constructor - copy and swap idiom.
MyObject::MyObject{
lots of things including const and structs, but no pointers
}
The difference between a shallow copy and a deep copy is only meaningful if there is a pointer to dynamic memory. If any of those member structs isn't doing a deep copy of its pointer, then you'll have to work around that (how depends on the struct). However, if all members either don't contain pointers, or correctly do deep copies of their pointers, then the copy constructor/assignment is not the source of your problems.
It's either, depending on what your operator= does. That's where the magic happens; the copy constructor is merely invoking it.
If you didn't define an operator= yourself, then the compiler synthesised one for you, and it is performing a shallow copy.

Checklist for writing copy constructor and assignment operator in C++

Please write a list of tasks that a copy constructor and assignment operator need to do in C++ to keep exception safety, avoid memory leaks etc.
First be sure you really need to support copy. Most of the time it is not the case, and thus disabling both is the way to go.
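In C++11 and later, disabling both operations can be sketched as follows. Connection is a hypothetical class name; in C++03 the usual equivalent was to declare both members private and leave them undefined.

```cpp
#include <type_traits>

class Connection {  // hypothetical handle that must not be duplicated
public:
    Connection() {}
    Connection(const Connection&) = delete;             // no copy construction
    Connection& operator=(const Connection&) = delete;  // no copy assignment
};

// Compile-time confirmation that copying is really disabled.
static_assert(!std::is_copy_constructible<Connection>::value,
              "Connection must not be copy constructible");
static_assert(!std::is_copy_assignable<Connection>::value,
              "Connection must not be copy assignable");
```

Any attempt to copy or assign a Connection is then rejected at compile time rather than producing a shallow copy.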
Sometimes, you'll still need to provide duplication on a class from a polymorphic hierarchy, in that case: disable the assignment operator, write a (protected?) copy constructor, and provide a virtual clone() function.
Otherwise, in the case you are writing a value class, you're back into the land of the Orthogonal Canonical Form of Coplien. If you have a member that can't be trivially copied, you'll need to provide a copy-constructor, a destructor, an assignment-operator and a default constructor. This rule can be refined, see for instance: The Law of The Big Two
I'd also recommend to have a look at C++ FAQ regarding assignment operators, and at the copy-and-swap idiom and at GOTW.
The compiler generated versions work in most situations.
You need to think a bit harder about the problem when your object contains a RAW pointer (an argument for not having RAW pointers). So you have a RAW pointer, the second question is do you own the pointer (is it being deleted by you)? If so then you will need to apply the rule of 4.
Owning more than 1 RAW pointer becomes increasingly hard to do correctly (The increase in complexity is not linear either [but that is observational and I have no real stats to back that statement up]). So if you have more than 1 RAW pointer think about wrapping each in its own class (some form of smart pointer).
Rule of 4: If an object is the owner of a RAW pointer then you need to define the following 4 members to make sure you handle the memory management correctly:
Constructor
Copy Constructor
Assignment Operator
Destructor
How you define these will depend on the situations. But things to watch out for:
Default Construction: Set pointer to NULL
Copy Constructor: Use the Copy and Swap idiom to provide the "Strong Exception Guarantee"
Assignment operator: Check for assignment to self
Destructor: Guard against exceptions propagating out of the destructor.
Try reading this:
http://www.icu-project.org/docs/papers/cpp_report/the_anatomy_of_the_assignment_operator.html
It is a very good analysis of the assignment operator.
I have no idea about exception safety here, but I do it this way. Let's imagine it's a templated array wrapper. Hope it helps :)
Array(const Array& rhs)
{
mData = NULL;
mSize = rhs.size();
*this = rhs;
}
Array& operator=(const Array& rhs)
{
if(this == &rhs)
{
return *this;
}
int len = rhs.size();
delete[] mData;
mData = new T[len];
for(int i = 0; i < len; ++i)
{
mData[i] = rhs[i];
}
mSize = len;
return *this;
}