In c++ when classes contains dynamically allocated data it is usually reasonable to explicitly define copy constructor, operator= and destructor. But the activity of these special methods overlaps. More specifically operator= usually first does some destruction and then it does coping similar to the one in copy constructor.
My question is how to write this the best way without repeating the same lines of code and without the need for processor to do unnecessary work (like unnecessary copying).
I usually end up with two helping methods. One for construction and one for destruction. The first is called from both copy constructor and operator=. The second is used by destructor and operator=.
Here is the example code:
template <class T>
class MyClass
{
private:
// Data members
int count;
T* data; // Some of them are dynamicly allocated
void construct(const MyClass& myClass)
{
// Code which does deep copy
this->count = myClass.count;
data = new T[count];
try
{
for (int i = 0; i < count; i++)
data[i] = myClass.data[i];
}
catch (...)
{
delete[] data;
throw;
}
}
void destruct()
{
// Dealocate all dynamicly allocated data members
delete[] data;
}
public: MyClass(int count) : count(count)
{
data = new T[count];
}
MyClass(const MyClass& myClass)
{
construct(myClass);
}
MyClass& operator = (const MyClass& myClass)
{
if (this != &myClass)
{
destruct();
construct(myClass);
}
return *this;
}
~MyClass()
{
destruct();
}
};
Is this even correct?
And is it a good habit to split the code this way?
One initial comment: the operator= does not start by
destructing, but by constructing. Otherwise, it will leave the
object in an invalid state if the construction terminates via an
exception. Your code is incorrect because of this. (Note that
the necessity to test for self assignment is usually a sign that
the assignment operator is not correct.)
The classical solution for handling this is the swap idiom: you
add a member function swap:
void MyClass:swap( MyClass& other )
{
std::swap( count, other.count );
std::swap( data, other.data );
}
which is guaranteed not to throw. (Here, it just swaps an int
and a pointer, neither of which can throw.) Then you
implement the assignment operator as:
MyClass& MyClass<T>::operator=( MyClass const& other )
{
MyClass tmp( other );
swap( tmp );
return *this;
}
This is simple and straight forward, but any solution in which
all operations which may fail are finished before you start
changing the data is acceptable. For a simple case like your
code, for example:
MyClass& MyClass<T>::operator=( MyClass const& other )
{
T* newData = cloneData( other.data, other.count );
delete data;
count = other.count;
data = newData;
return *this;
}
(where cloneData is a member function which does most of what
your construct does, but returns the pointer, and doesn't
modify anything in this).
EDIT:
Not directly related to your initial question, but generally, in
such cases, you do not want to do a new T[count] in
cloneData (or construct, or whatever). This constructs all
of the T with the default constructor, and then assigns them.
The idiomatic way of doing this is something like:
T*
MyClass<T>::cloneData( T const* other, int count )
{
// ATTENTION! the type is a lie, at least for the moment!
T* results = static_cast<T*>( operator new( count * sizeof(T) ) );
int i = 0;
try {
while ( i != count ) {
new (results + i) T( other[i] );
++ i;
}
} catch (...) {
while ( i != 0 ) {
-- i;
results[i].~T();
}
throw;
}
return results;
}
Most often, this will be done using a separate (private) manager
class:
// Inside MyClass, private:
struct Data
{
T* data;
int count;
Data( int count )
: data( static_cast<T*>( operator new( count * sizeof(T) ) )
, count( 0 )
{
}
~Data()
{
while ( count != 0 ) {
-- count;
(data + count)->~T();
}
}
void swap( Data& other )
{
std::swap( data, other.data );
std::swap( count, other.count );
}
};
Data data;
// Copy constructor
MyClass( MyClass const& other )
: data( other.data.count )
{
while ( data.count != other.data.count ) {
new (data.data + data.count) T( other.date[data.count] );
++ data.count;
}
}
(and of course, the swap idiom for assignment). This allows
multiple count/data pairs without any risk of loosing exception
safety.
I don't see any inherent problem with that, as long as you make sure not to declare construct or destruct virtual.
You might be interested in chapter 2 in Effective C++ (Scott Meyers), which is completely devoted to constructors, copy operators and destructors.
As for exceptions, which your code is not handling as it should, consider items 10 & 11 in More effective C++ (Scott Meyers).
Implement the assignment by first copying the right-hand side and then swapping with that. This way you also get exception safety, which your code above doesn't provide. You could end up with a broken container when construct() fails after destruct() succeeded otherwise, because the member pointer references some deallocated data, and on destruction that will be deallocated again, causing undefined behaviour.
foo&
foo::operator=(foo const& rhs)
{
using std::swap;
foo tmp(rhs);
swap(*this, tmp);
return *this;
}
Related
In c++ when classes contains dynamically allocated data it is usually reasonable to explicitly define copy constructor, operator= and destructor. But the activity of these special methods overlaps. More specifically operator= usually first does some destruction and then it does coping similar to the one in copy constructor.
My question is how to write this the best way without repeating the same lines of code and without the need for processor to do unnecessary work (like unnecessary copying).
I usually end up with two helping methods. One for construction and one for destruction. The first is called from both copy constructor and operator=. The second is used by destructor and operator=.
Here is the example code:
template <class T>
class MyClass
{
private:
// Data members
int count;
T* data; // Some of them are dynamicly allocated
void construct(const MyClass& myClass)
{
// Code which does deep copy
this->count = myClass.count;
data = new T[count];
try
{
for (int i = 0; i < count; i++)
data[i] = myClass.data[i];
}
catch (...)
{
delete[] data;
throw;
}
}
void destruct()
{
// Dealocate all dynamicly allocated data members
delete[] data;
}
public: MyClass(int count) : count(count)
{
data = new T[count];
}
MyClass(const MyClass& myClass)
{
construct(myClass);
}
MyClass& operator = (const MyClass& myClass)
{
if (this != &myClass)
{
destruct();
construct(myClass);
}
return *this;
}
~MyClass()
{
destruct();
}
};
Is this even correct?
And is it a good habit to split the code this way?
One initial comment: the operator= does not start by
destructing, but by constructing. Otherwise, it will leave the
object in an invalid state if the construction terminates via an
exception. Your code is incorrect because of this. (Note that
the necessity to test for self assignment is usually a sign that
the assignment operator is not correct.)
The classical solution for handling this is the swap idiom: you
add a member function swap:
void MyClass:swap( MyClass& other )
{
std::swap( count, other.count );
std::swap( data, other.data );
}
which is guaranteed not to throw. (Here, it just swaps an int
and a pointer, neither of which can throw.) Then you
implement the assignment operator as:
MyClass& MyClass<T>::operator=( MyClass const& other )
{
MyClass tmp( other );
swap( tmp );
return *this;
}
This is simple and straight forward, but any solution in which
all operations which may fail are finished before you start
changing the data is acceptable. For a simple case like your
code, for example:
MyClass& MyClass<T>::operator=( MyClass const& other )
{
T* newData = cloneData( other.data, other.count );
delete data;
count = other.count;
data = newData;
return *this;
}
(where cloneData is a member function which does most of what
your construct does, but returns the pointer, and doesn't
modify anything in this).
EDIT:
Not directly related to your initial question, but generally, in
such cases, you do not want to do a new T[count] in
cloneData (or construct, or whatever). This constructs all
of the T with the default constructor, and then assigns them.
The idiomatic way of doing this is something like:
T*
MyClass<T>::cloneData( T const* other, int count )
{
// ATTENTION! the type is a lie, at least for the moment!
T* results = static_cast<T*>( operator new( count * sizeof(T) ) );
int i = 0;
try {
while ( i != count ) {
new (results + i) T( other[i] );
++ i;
}
} catch (...) {
while ( i != 0 ) {
-- i;
results[i].~T();
}
throw;
}
return results;
}
Most often, this will be done using a separate (private) manager
class:
// Inside MyClass, private:
struct Data
{
T* data;
int count;
Data( int count )
: data( static_cast<T*>( operator new( count * sizeof(T) ) )
, count( 0 )
{
}
~Data()
{
while ( count != 0 ) {
-- count;
(data + count)->~T();
}
}
void swap( Data& other )
{
std::swap( data, other.data );
std::swap( count, other.count );
}
};
Data data;
// Copy constructor
MyClass( MyClass const& other )
: data( other.data.count )
{
while ( data.count != other.data.count ) {
new (data.data + data.count) T( other.date[data.count] );
++ data.count;
}
}
(and of course, the swap idiom for assignment). This allows
multiple count/data pairs without any risk of loosing exception
safety.
I don't see any inherent problem with that, as long as you make sure not to declare construct or destruct virtual.
You might be interested in chapter 2 in Effective C++ (Scott Meyers), which is completely devoted to constructors, copy operators and destructors.
As for exceptions, which your code is not handling as it should, consider items 10 & 11 in More effective C++ (Scott Meyers).
Implement the assignment by first copying the right-hand side and then swapping with that. This way you also get exception safety, which your code above doesn't provide. You could end up with a broken container when construct() fails after destruct() succeeded otherwise, because the member pointer references some deallocated data, and on destruction that will be deallocated again, causing undefined behaviour.
foo&
foo::operator=(foo const& rhs)
{
using std::swap;
foo tmp(rhs);
swap(*this, tmp);
return *this;
}
I have an Array class written with RAII in mind (super simplified for the purpose of this example):
struct Array
{
Array(int size) {
m_size = size;
m_data = new int[m_size];
}
~Array() {
delete[] m_data;
}
int* m_data = nullptr;
int m_size = 0;
};
Then I have a function that takes a reference to an Array and does some operations on it. I use a temporary Array temp to perform the processing, because for several reasons I can't work directly with the reference. Once done, I would like to transfer the data from the temporary Array to the real one:
void function(Array& array)
{
Array temp(array.m_size * 2);
// do heavy processing on `temp`...
array.m_size = temp.m_size;
array.m_data = temp.m_data;
}
The obvious problem is that temp goes out of scope at the end of the function: its destructor is triggered, which in turn deletes the memory. This way array would contain non-existent data.
So what is the best way to "move" the ownership of data from an object to another?
What you want is move construction or move assignment for your array class, which saves you redundant copies. Also, it will help you to use std::unique_ptr which will take care of move semantics for allocated memory for you, so that you don't need to do any memory allocation directly, yourself. You will still need to pull up your sleeves and follow the "rule of zero/three/five", which in our case is the rule of five (copy & move constructor, copy & move assignment operator, destructor).
So, try:
class Array {
public:
Array(size_t size) : m_size(size) {
m_data = std::make_unique<int[]>(size);
}
Array(const Array& other) : Array(other.size) {
std::copy_n(other.m_data.get(), other.m_size, m_data.get());
}
Array(Array&& other) : m_data(nullptr) {
*this = other;
}
Array& operator=(Array&& other) {
std::swap(m_data, other.m_data);
std::swap(m_size,other.m_size);
return *this;
}
Array& operator=(const Array& other) {
m_data = std::make_unique<int[]>(other.m_size);
std::copy_n(other.m_data.get(), other.m_size, m_data.get());
return *this;
}
~Array() = default;
std::unique_ptr<int[]> m_data;
size_t m_size;
};
(Actually, it's a bad idea to have m_size and m_data public, since they're tied together and you don't want people messing with them. I would also template that class on the element type, i.e. T instead of int and use Array<int>)
Now you can implement your function in a very straightforward manner:
void function(Array& array)
{
Array temp { array }; // this uses the copy constructor
// do heavy processing on `temp`...
std::swap(array, temp); // a regular assignment would copy
}
if you insist on your example format, then you are missing a couple of things. you need to make sure that the temp array does not destroy memory reused by the non-temp array.
~Array() {
if (md_data != nullptr)
delete [] m_data;
}
and the function needs to do clean move
void function(Array& array)
{
Array temp(array.m_size * 2);
// do heavy processing on `temp`...
array.m_size = temp.m_size;
array.m_data = temp.m_data;
temp.m_data = nullptr;
temp.m_size = 0;
}
To make it more c++ like, you can do multiple things. i.e. in the array you can create member function to move the data from other array (or a constructor, or assign operator, ...)
void Array::move(Array &temp) {
m_size = temp.m_size;
temp.m_size = 0;
m_data = temp.m_data;
temp.m_data = 0;
}
in c++11 and higher you can create a move constructor and return your value from the function:
Array(Array &&temp) {
do your move here
}
Array function() {
Array temp(...);
return temp;
}
there are other ways as well.
In C++11/14, an object can be transfered by move or smark pointer.
(1) This is an example for move:
class MoveClass {
private:
int *tab_;
int alloc_;
void Reset() {
tab_ = nullptr;
alloc_ = 0;
}
void Release() {
if (tab_) delete[] tab_;
tab_ = nullptr;
alloc_ = 0;
}
public:
MoveClass() : tab_(nullptr), alloc_(0) {}
~MoveClass() {
Release();
}
MoveClass(MoveClass && other) : tab_( other.tab_ ), alloc_( other.alloc_ ) {
other.Reset();
}
MoveClass & operator=(MoveClass && other) {
if (this == &other) return *this;
std::swap(tab_, other.tab_);
std::swap(alloc_, other.alloc_);
return *this;
}
void DoSomething() { /*...*/ }
};
When we use this movable MoveClass, we can write code like this :
int main() {
MoveClass a;
a.DoSomething(); // now a has some memory resource
MoveClass b = std::move(a); // move a to b
return 0;
}
Always write move-constructor/move-operator= is boring, use shared_ptr/unique_ptr some times have the same effect, just like java, reference/pointer everywhere.
(2) Here is the example:
class NoMoveClass {
private:
int *tab_;
int alloc_;
void Release() {
if (tab_) delete[] tab_;
tab_ = nullptr;
alloc_ = 0;
}
public:
NoMoveClass() : tab_(nullptr), alloc_(0) {}
~NoMoveClass() {
Release();
}
MoveClass(MoveClass && other) = delete;
MoveClass & operator=(MoveClass && other) = delete;
void DoSomething() { /*...*/ }
};
We can use it like this:
int main() {
std::shared_ptr<NoMoveClass> a(new NoMoveClass());
a->DoSomething();
std::shared_ptr<NoMoveClass> b = a; // also move a to b by copy pointer.
return 0;
}
Is it a good habit to always use the 2nd one?
Why many libraries, STL use the 1st one, not the 1st one ?
Always write move-constructor/move-operator= is boring
You almost never need to write your own move constructor/assignment, because (as you mentioned) C++ supplies you with a number of basic resource managers - smart pointers, containers, smart locks etc.
By relying on those in your class you enable default move operations and that results in minimal code size as well as proper semantics:
class MoveClass {
private:
std::vector<int> data;
public:
void DoSomething() { /*...*/ }
};
Now you can use your class as in (1) or as a member in other classes, you can be sure that it has move semantics and you did it in the minimal possible amount of code.
The point is one usually only needs to implement move operations for the most low-level classes which are probably covered already by STL, or if some weird specific behavior is needed - both cases should be really rare and not result in "Always writing move-constructor/move-operator=".
Also notice that while approach (1) is unnecessarily verbose, (2) is just unacceptable - you have a resource managing class that doesn't do its job and as a result you have to wrap it in smart pointers everywhere in your code, making it harder to understand and eventually resulting in even more code than (1)
I am writting a small wrapper around a library that uses handles.
Basic uses of this library are :
int handle;
createHandle( 1, &handle )
doSomethingWithHandle( handle );
// ...
destroyHandle( handle );
I made a wrapper class following RAII principles:
Handle::Handle()
{
createHandle( 1, &m_handle );
}
Handle::~Handle()
{
if( m_handle!= 0 )
destroyHandle( m_handle );
}
Handle::Handle( Handle&& h ) :
m_handle( h.m_handle )
{
h.m_handle = 0;
}
Handle& Handle::operator=( Handle&& h )
{
m_handle = h.m_handle;
h.m_handle = 0;
return *this;
}
// copy constructor and copy assignment operator are explicitely deleted
It's working, however, a lot a classes depends on those wrappers, which means everytime a class has a Handle member, I have to explicitely write move constructor/move assignment operators:
SomeClass::SomeClass( SomeClass&& s ) :
m_handle( std::move( s.m_handle ) )
{
}
SomeClass& SomeClass::SomeClass( SomeClass&& s )
{
m_handle = std::move( s.m_handle );
return *this;
}
This is, of course, not hard to do, but I wonder if there is a way to avoid that, because thats a lot of redundant code.
If that's not possible, why aren't the move operators not generated by the compiler? Let's take the following lines:
SomeClass a;
m_vector.push_back( a );
In this case, someclass is not copyable so the compiler will have an error because a.m_handle have copy operators deleted. So it means we HAVE to move them. But if we do, doesn't it makes sence that we want to move every members ( if we can't copy them )?
Edit: I just tried something and it seems to work, just declare the move operators using default. This is the way to go I suppose. But the "why" question remains.
Edit2: Another example
class Test
{
public:
Test()
{
m_vec.resize( 10 );
}
Test( const Test& ) = delete;
Test& operator=( const Test& ) = delete;
//Test( Test&& ) = default;
//Test& operator=( Test&& ) = default;
void cout()
{
std::cout << m_vec.size() << std::endl;
}
private:
std::vector< int > m_vec;
};
int main()
{
Test a;
std::vector< Test > v;
v.push_back( std::move( a ) );
v[ 0 ].cout();
}
I have to explicitely write move constructor/move assignment operators:
This is where you are wrong. Either do not write them at all and let the compiler, or =default; them.
SomeClass a;
m_vector.push_back( a );
You did that wrong, it must be:
SomeClass a;
m_vector.push_back( std::move(a) );
After making a lot of changes to a project, I created an error that took me quite a while to track down.
I have a class which contains a dynamically allocated array. I then create a dynamic array of this class. I can then delete[] that array. But, if I replace an item in the array before deleting it, it causes an error. In debug mode, it gives an assertion message from dbgdel.cpp "Expression: _BLOCK_TYPE_IS_VALID(pHead->nBlockUse)". Here is a small program to demonstrate.
class SomeClass {
public:
int *data;
SomeClass() {
data = nullptr;
}
SomeClass(int num) {
data = new int[num];
}
~SomeClass() {
if (data != nullptr) {
delete[] data;
}
}
};
int main(int argc, char *args[]) {
SomeClass *someArray = new SomeClass[10];
//If you comment this out, there is no error. Error gets thrown when deleting
someArray[0] = SomeClass(10);
delete[] someArray;
return 0;
}
I'm curious, why does this happen? When the item in the array gets replaced, its destructor gets called. Then the new item allocates its data in a location separate from the array. Then delete[] calls the destructors of all the objects in the array. When the destructors get called, they should delete the item's data array. I can't imagine what the problem is, but I'd like if someone could explain.
Your class is broken: It has a non-trivial destructor, but you do not define copy constructors and copy assignment operators. That means that the class cannot be correctly copied or assigned-to (since the destructible state is not copied or assigned appropriately), as you are noticing in your example code.
You can either make your class uncopiable (in which case your code won't compile any more), or move-only, in which case you need to define move construction and move-assignment, or properly copyable by implementing a deep copy of the data.
Here's how, add the following definitions:
Non-copyable:
SomeClass(SomeClass const &) = delete;
SomeClass & operator(SomeClass const &) = delete;
Moveable-only:
SomeClass(SomeClass const &) = delete;
SomeClass(SomeClass && rhs) : data(rhs.data) { rhs.data = nullptr; }
SomeClass & operator(SomeClass const &) = delete;
SomeClass & operator(SomeClass && rhs) {
if (this != &rhs) { delete data; data = rhs.data; rhs.data = nullptr; }
return *this;
}
Copyable:
SomeClass(SomeClass const & rhs) : ptr(new int[rhs->GetSizeMagically()]) {
/* copy from rhs.data to data */
}
SomeClass & operator(SomeClass const & rhs) {
if (this == &rhs) return *this;
int * tmp = new int[rhs->GetSizeMagically()];
/* copy data */
delete data;
data = tmp;
}
// move operations as above
The upshot is that the nature of the destructor determines the invariants of the class, because every object must be consistently destructible. From that you can infer the required semantics of the copy and move operations. (This is often called the Rule of Three or Rule of Five.)
Kerrek SB s answer is great. I just want to clarify that in your code memory is freed twice.
This code
someArray[0] = SomeClass(10);
is the same as this
SomeClass temp(10);
someArray[0] = temp; //here temp.data is copied to someArray[0].data
Then ~SomeClass() is called for temp and data is freed first time.
Here
delete[] someArray;
~SomeClass() is called for someArray[0] and data is freed second time.