Move from stack to heap - c++

I'm trying to call some function that includes adding element to vector ( argument passed by value ):
std::vector<Val> vec;
void fun( Val v )
{
...
vec.push_back(std::move( v ));
...
}
The question is: are there any benefits from move semantics?
I'm confused about variable will be "moved" from stack to heap when new vector element constructing: it seems it will be just copied instead.

Whether there is any benefit by moving instead of copying actually depends on Val's implementation.
For example, if Val's copy constructor has a time complexity of O(n), but its move constructor has a complexity of O(1) (e.g.: it consists of just changing a few internal pointers), then there could be a performance benefit.
In the following implementation of Val there can't be any performance benefit by moving instead of copying:
class Val {
int data[1000];
...
};
But for the one below it can:
class Val {
int *data;
size_t num;
...
};

If std::move allows the underlying objects to move pointers from one to the other; rather than (what in done in copying) allocating a new area, and copying the contents of what could be a large amount of memory. For example
class derp {
int* data;
int dataSize;
}
data & dataSize could be just swapped in a move; but a copy could be much more expensive
If however you just have a collection of integers; moving and copying will amount to the same thing; except the old version of the object should be invalidated; whatever that should mean in the context of that object.

A move is a theft of the internal storage of the object, if applicable. Many standard containers can be moved from because they allocate storage on the heap which is trivial to "move". Internally, what happens is something like the following.
template<class T>
class DumbContainer {
private:
size_t size;
T* buffer;
public:
DumbContainer(DumbContainer&& other) {
buffer = other.buffer;
other.buffer = nullptr;
}
// Constructors, Destructors and member functions
}
As you can see, objects are not moved in storage, the "move" is purely conceptual from container other to this. In fact the only reason this is an optimization is because the objects are left untouched.
Storage on the stack cannot be moved to another container because lifetime on the stack is tied to the current scope. For example an std::array<T> will not be able to move its internal buffer to another array, but if T has storage on the heap to pass around, it can still be efficient to move from. For example moving from an std::array<std::vector<T>> will have to construct vector objects (array not moveable), but the vectors will cheaply move their managed objects to the vectors of the newly created array.
So coming to your example, a function such as the following can reduce overhead if Val is a moveable object.
std::vector<Val> vec;
void fun( Val v )
{
...
vec.emplace_back(std::move( v ));
...
}
You can't get rid of constructing the vector, of allocating extra space when it's capacity is filled or of constructing the Val objects inside the vector, but if Val is just a pointer to heap storage, it's as cheap as you can get without actually stealing another vector. If Val is not moveable you don't lose anything by casting it to an rvalue. The nice thing about this type of function signature is that the caller can decide whether they want a copy or a move.
// Consider the difference of these calls
Val v;
fun(v);
fun(std::move v);
The first will call Val's copy constructor if present, or fail to compile otherwise. The second will call the move constructor if present, or the copy constructor otherwise. It will not compile if neither is present.

Related

Concurrency-safe stack interface method: correct or not?

I came across a thread-safe stack implementation of an interface method for stack<>::pop():
void pop(T& value)
{
std::lock_guard<std::mutex> lock(m);
if(data.empty()) throw empty_stack();
value=std::move(data.top()); <----- why not just value = data.top()?
data.pop();
}
Granted, my question has nothing to do with concurrency, but why MOVE the value at the top of the stack into variable value? My understanding is once it is moved, you can't pop it as it is no longer there.
Either it is a mistake in the source where I found it, or if someone can explain it to me I would be thankful.
Thanks,
Amine
value=std::move(data.top()); <----- why not just value = data.top()?
This largely depends on what T is, basically it is going to try and use the move constructor, T(T &&mv) instead of the copy constructor T(const T &cp) if the move constructor exists.
In both cases ~T() will be called for the original object on the data.pop(); line.
Firstly, using the move constructor might be required. Some objects are moveable, but not copyable, e.g. a unique_ptr.
Secondly, where move is provided, it is often more efficient. e.g. say T is a std::vector, the copy constructor will allocate another array and then copy over each element (which might also be expensive), and then it deletes the original array anyway. Which is pretty wasteful.
A move constructor just keeps the original elements by moving the internal data array from one vector to a new one, and leaves the original (which is about to be deleted anyway in this case) in an unspecified but valid state (probably "empty").
It might look something like:
template<typename T> class vector
{
public:
vector<T>(vector<T> &&mv)
: arr(mv.arr) , arr_len(mv.arr_len), arr_capacity(mv.arr_capacity)
{
mv.arr = nullptr;
mv.arr_len = 0;
mv.arr_capacity = 0;
}
...
private:
T *arr;
size_t arr_len;
size_t arr_capacity;
};
Since the original object state is "unspecified" in the general case if you want to keep using the original object, you have to be careful. Destroying it as in the pop case is OK, as is assignment.
T tmp = std::move(some_value);
some_value.foo(); // In general, what state some_value is in is unknown, this might vary even from compiler to compiler
some_value = some_other_value; // But assignment should work
some_value.foo(); // So it is now in a known state
Which can implement for example a "swap" without copying any "contents" where possible.
template<typename T> void swap(T &a, T &b)
{
T tmp = std::move(a);
a = std::move(b);
b = std::move(tmp);
}
Many types will be more specified, for example for std::vector, it promises to be empty().
std::vector<T> tmp = std::move(some_array);
assert(some_array.empty()); // guaranteed
some_array.push_back(x); // guaranteed to have one element
you can't pop it as it is no longer there.
So the important bit here is that data.top() does not remove the element, so it is still there. And the move doesn't actually delete the thing, just leaves it in some unspecified state.
Concurrency-safe stack interface method
On the separate subject of concurrency the thing here is both the top to access the value, and the pop to remove it are under the same lock. To be safe all accesses to this instance stack must use that same lock instance, so make sure any data.push also holds the lock.

C++ Move assignment operator: Do I want to be using std::swap with POD types?

Since C++11, when using the move assignment operator, should I std::swap all my data, including POD types? I guess it doesn't make a difference for the example below, but I'd like to know what the generally accepted best practice is.
Example code:
class a
{
double* m_d;
unsigned int n;
public:
/// Another question: Should this be a const reference return?
const a& operator=(a&& other)
{
std::swap(m_d, other.m_d); /// correct
std::swap(n, other.n); /// correct ?
/// or
// n = other.n;
// other.n = 0;
}
}
You might like to consider a constructor of the form: - ie: there are always "meaningful" or defined values stores in n or m_d.
a() : m_d(nullptr), n(0)
{
}
I think this should be rewriten this way.
class a
{
public:
a& operator=(a&& other)
{
delete this->m_d; // avoid leaking
this->m_d = other.m_d;
other.m_d = nullptr;
this->n = other.n;
other.n = 0; // n may represents array size
return *this;
}
private:
double* m_d;
unsigned int n;
};
should I std::swap all my data
Not generally. Move semantics are there to make things faster, and swapping data that's stored directly in the objects will normally be slower than copying it, and possibly assigning some value to some of the moved-from data members.
For your specific scenario...
class a
{
double* m_d;
unsigned int n;
...it's not enough to consider just the data members to know what makes sense. For example, if you use your postulated combination of swap for non-POD members and assignment otherwise...
std::swap(m_d, other.m_d);
n = other.n;
other.n = 0;
...in the move constructor or assignment operator, then it might still leave your program state invalid if say the destructor skipped deleting m_d when n was 0, or if it checked n == 0 before overwriting m_d with a pointer to newly allocated memory, old memory may be leaked. You have to decide on the class invariants: the valid relationships of m_d and n, to make sure your move constructor and/or assignment operator leave the state valid for future operations. (Most often, the moved-from object's destructor may be the only thing left to run, but it's valid for a program to reuse the moved-from object - e.g. assigning it a new value and working on it in the next iteration of a loop....)
Separately, if your invariants allow a non-nullptr m_d while n == 0, then swapping m_ds is appealing as it gives the moved-from object ongoing control of any buffer the moved-to object may have had: that may save time allocating a buffer later; counter-balancing that pro, if the buffer's not needed later you've kept it allocated longer than necessary, and if it's not big enough you'll end up deleting and newing a larger buffer, but at least you're being lazy about it which tends to help performance (but profile if you have to care).
No, if efficiency is any concern, don't swap PODs. There is just no benefit compared to normal assignment, it just results in unnecessary copies. Also consider if setting the moved from POD to 0 is even required at all.
I wouldn't even swap the pointer. If this is an owning relationship, use unique_ptr and move from it, otherwise treat it just like a POD (copy it and set it to nullptr afterwards or whatever your program logic requires).
If you don't have to set your PODs to zero and you use smart pointers, you don't even have to implement your move operator at all.
Concerning the second part of your question:
As Mateusz already stated, the assignment operator should always return a normal (non-const) reference.

Deep copy of a pointer to an array of pointers to objects

Edit: I originally posed this question out of context so I've reworked it. I've left as much as possible unchanged so most of your responses will still apply.
I'm having trouble understanding how to implement a constructor which accepts a pointer to an array of pointers.
I have the following class which contains a member, bodies, of type Body** (i.e. it is a pointer to an array of pointers to body objects).
class Galaxy
{
private:
int n; // Number of bodies in galaxy.
Body** bodies; // Ptr to arr of ptrs to Body objects.
public:
Galaxy();
Galaxy(int, Body**);
// Some other member functions.
};
Here is the implementation of the constructors:
// Default constructor. Initializes bodies to null pointer.
Galaxy::Galaxy() : bodies(NULL) {}
// Alternate constructor. Here I try to perform a deep copy of bodiesIn.
Galaxy::Galaxy(int nIn, Body** bodiesIn)
{
n = nIn;
// Allocate memory for an array of n pointers to Body objects.
bodies = new Body*[n];
// Perform deep copy.
for (int i=0; i<n; i++)
{
bodies[i] = new Body;
*bodies[i] = *bodiesIn[i];
}
}
Is this method sound, or is there a preferred way to construct such an object.
P.S. I realize it would be easier to code this with std::vector's, however the size of the array doesn't change, and minimizing memory usage is important.
There's lots wrong with your function:
Creating an object and then immediately assigning to it is inefficient, use the copy ctor instead.
If an exception is thrown by any new-expression but the first one or by one of the assignments, you are leaking objects.
Better take a std::size_t for the size, it's designed for it.
Better swap the arguments, that's more idiomatic.
You don't return the copy at the moment
Why not templatize it?
BTW: std::unique_ptr does not add any overhead, but provides plenty of comfort and safety.

Move semantics with a pointer to an internal buffer

Suppose I have a class which manages a pointer to an internal buffer:
class Foo
{
public:
Foo();
...
private:
std::vector<unsigned char> m_buffer;
unsigned char* m_pointer;
};
Foo::Foo()
{
m_buffer.resize(100);
m_pointer = &m_buffer[0];
}
Now, suppose I also have correctly implemented rule-of-3 stuff including a copy constructor which copies the internal buffer, and then reassigns the pointer to the new copy of the internal buffer:
Foo::Foo(const Foo& f)
{
m_buffer = f.m_buffer;
m_pointer = &m_buffer[0];
}
If I also implement move semantics, is it safe to just copy the pointer and move the buffer?
Foo::Foo(Foo&& f) : m_buffer(std::move(f.m_buffer)), m_pointer(f.m_pointer)
{ }
In practice, I know this should work, because the std::vector move constructor is just moving the internal pointer - it's not actually reallocating anything so m_pointer still points to a valid address. However, I'm not sure if the standard guarantees this behavior. Does std::vector move semantics guarantee that no reallocation will occur, and thus all pointers/iterators to the vector are valid?
I'd do &m_buffer[0] again, simply so that you don't have to ask these questions. It's clearly not obviously intuitive, so don't do it. And, in doing so, you have nothing to lose whatsoever. Win-win.
Foo::Foo(Foo&& f)
: m_buffer(std::move(f.m_buffer))
, m_pointer(&m_buffer[0])
{}
I'm comfortable with it mostly because m_pointer is a view into the member m_buffer, rather than strictly a member in its own right.
Which does all sort of beg the question... why is it there? Can't you expose a member function to give you &m_buffer[0]?
I'll not comment the OP's code. All I'm doing is aswering this question:
Does std::vector move semantics guarantee that no reallocation will occur, and thus all pointers/iterators to the vector are valid?
Yes for the move constructor. It has constant complexity (as specified by 23.2.1/4, table 96 and note B) and for this reason the implementation has no choice other than stealing the memory from the original vector (so no memory reallocation occurs) and emptying the original vector.
No for the move assignment operator. The standard requires only linear complexity (as specified in the same paragraph and table mentioned above) because sometimes a reallocation is required. However, in some cirsunstances, it might have constant complexity (and no reallocation is performed) but it depends on the allocator. (You can read the excelent exposition on moved vectors by Howard Hinnant here.)
A better way to do this may be:
class Foo
{
std::vector<unsigned char> m_buffer;
size_t m_index;
unsigned char* get_pointer() { return &m_buffer[m_index];
};
ie rather than store a pointer to a vector element, store the index of it. That way it will be immune to copying/resizing of the vectors backing store.
The case of move construction is guaranteed to move the buffer from one container to the other, so from the point of view of the newly created object, the operation is fine.
On the other hand, you should be careful with this kind of code, as the donor object is left with a empty vector and a pointer referring to the vector in a different object. This means that after being moved from your object is in a fragile state that might cause issues if anyone accesses the interface and even more importantly if the destructor tries to use the pointer.
While in general there won't be any use of your object after being moved from (the assumption being that to be bound by an rvalue-reference it must be an rvalue), the fact is that you can move out of an lvalue by casting or by using std::move (which is basically a cast), in which case code might actually attempt to use your object.

Why does my class's destructor get called when I add instances to a vector?

It seems that every time I add an object to the vector m_test, the destructor method is called. Am I missing something? How can I prevent this from happening?
class TEST
{
public:
TEST();
~TEST();
int * x;
};
TEST::TEST()
{
}
TEST::~TEST()
{
... it is called every time I push_back something to the vector ...
delete x;
}
vector<TEST> m_test;
for (unsigned int i=0; i<5; i++)
{
m_test.push_back(TEST());
}
The problem here is that you're violating the Rule of Three. Your class has a destructor so you need a copy-constructor and an assignment operator, too. Alternatively, you could not allow your class to be copied (for example by making T(T const&) and T& operator=(T const&) private, or by deriving from boost::noncopyable), and then resize the vector instead of using push_back.
In the first case, you can just push_back your class as you usually would. In the second, the syntax would be something like
std::vector<TEST> vec(5);
// vec now has five default-constructed elements of type TEST.
Not doing either of these things is a bad idea, as you are very likely to run into double deletion issues at some point -- even if you think you'll never copy or assign a TEST where x != nullptr, it's much safer to explicitly forbid it.
By the way, if you have member pointers that should be deleted when an object goes out of scope, consider using smart pointers like scoped_ptr, unique_ptr and shared_ptr (and maybe auto_ptr if you're unable to use Boost or C++11).
It's not called when you push_back, it's called when the temporary is destroyed.
To fix it in your example:
TEST test;
for (int i = 0; i < 5; ++i)
{
m_test.push_back(test);
}
Should only call it once.
Your code is creating a temporary TEST within the loop, using it in push_back, then that temporary is going out of scope when the loop ends/repeats and getting destroyed. That occurs exactly as it should, since the temporary TEST needs cleaned up.
If you want to avoid that, you need to do anything else but make a temporary object for each push. One potential solution is to:
vector<TEST> m_test(5); // Note reserving space in the vector for 5 objects
std::fill(m_test.begin(), m_test.end(), TEST()); // Fill the vector with the default ctor
Depending on how your STL is optimized, this may not need to make multiple copies.
You may also be able to get better handling if you implement a copy constructor in your TEST class, like:
TEST::TEST(const TEST & other)
{
x = new int(*other.x); // Not entirely safe, but the simplest copy ctor for this example.
}
Whether this is appropriate, or how you handle it, depends on your class and its needs, but you should typically have a copy constructor when you have defined your own regular constructor and destructor (otherwise the compiler will generate one, and in this case, it will result in copied and hanging pointers to x).
To avoid destruction of a temporary and to avoid copy constructors, consider using vector::resize or vector::emplace_back. Here's an example using emplace_back:
vector<TEST> m_test;
m_test.reserve(5);
for ( uint i=0; i<5; i++ )
{
m_test.emplace_back();
}
The vector element will be constructed in-place without the need to copy. When vt is destroyed, each vector element is automatically destroyed.
c++0x is required (use -std=c++0x with gnu). #include <vector> is of course also required.
If a default constructor is not used (for example, if the TEST::x was a reference instead of a pointer), simply add arguements to the call to emplace_back() as follows:
class TEST
{
public:
TEST( int & arg) : x(arg) {;} // no default constructor
int & x; // reference instead of a pointer.
};
. . .
int someInt;
vector<TEST> m_test;
m_test.reserve(5);
for ( uint i=0; i<5; i++ ) {
m_test.emplace_back( someInt ); // TEST constructor args added here.
}
The reserve() shown is optional but insures that sufficient space is available before beginning to construct vector elements.
vector.push_back() copies the given object into its storage area. The temporary object you're constructing in the push_back() call is destroyed immediately after being copied, and that's what you're seeing. Some compilers may be able to optimize this copy away, but yours apparently can't.
In m_test.push_back(TEST());, TEST() will create an temporary variable. After the vector copy it to its own memory, the temporary variable is destructed.
You may do like this:
vector<TEST> m_test(5, TEST());
The destructor is not only being called for the temporary variable.
The destructor will also get called when the capacity of the vector changes.
This happens often on very small vectors, less so on large vectors.
This causes:
A new allocation of memory (size based on a growth metric, not just size+1)
Copy of the old elements into the new allocation
Destruction of the elements in the old vector
Freeing of the old vector memory.
Copy construction of the new item onto the end of the new new vector.
See the third answer here:
Destructor is called when I push_back to the vector