I have read the below post which gives a very good insight into move semantics:
Can someone please explain move semantics to me?
but I still fail to understand the following things regarding move semantics:
1.
Do copy elision and RVO still work for classes without move constructors?
2.
Even if our classes don't have move constructors, the STL containers do. For an operation like
std::vector<MyClass> vt = CreateMyClassVector();
and for operations like sorting: why can't the STL leverage move semantics internally to speed these up, using techniques like copy elision or RVO that don't require move constructors?
3.
Do we benefit from move semantics in the case below?
std::vector< int > vt1(1000000, 5); // Create and initialize 1 million entries with value 5
std::vector< int > vt2(std::move(vt1)); // move vt1 to vt2
As int is a primitive type, moving the individual elements offers no advantage.
Or does vt2 simply point to vt1's heap memory after the move, with vt1 set to null? What actually happens? If the latter is the case, then point 2 also holds: we may not need move constructors for our own classes.
4.
When push_back() is called with std::move on an lvalue, e.g.:
std::vector<MyClass> vt;
for(int i=0; i<10; ++i)
{
vt.push_back(MyClass());
}
MyClass obj;
vt.push_back(std::move(obj));
Since a vector allocates contiguous memory and obj is defined somewhere else in memory, how would move semantics move obj's memory into vt's contiguous region? Wouldn't moving memory in this case be as good as copying it? How does a move satisfy the vector's contiguous-memory requirement if it simply moves a pointer to memory in a different region of the heap?
Thanks for explanation in advance!
[Originally posted as "Move semantics clarification", but since the context has changed a bit I am posting it as a new question and shall delete the old one ASAP.]
Do copy elision and RVO still work for classes without move constructors?
Yes, RVO still kicks in. Actually, the compiler is expected to pick:
RVO (if possible)
Move construction (if possible)
Copy construction (last resort)
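As a rough illustration of that order, here is a minimal sketch. CopyOnly and the helper functions are hypothetical names, not from the question. The type declares a copy constructor, so it gets no implicit move constructor; returning a temporary is still elided by mainstream compilers (and must be since C++17), while an explicit std::move falls back to the copy constructor.

```cpp
#include <utility>

// Hypothetical type: a user-declared copy constructor suppresses the
// implicitly generated move constructor.
struct CopyOnly {
    static int copies;                       // counts copy-constructor calls
    CopyOnly() = default;
    CopyOnly(const CopyOnly&) { ++copies; }
};
int CopyOnly::copies = 0;

CopyOnly make() { return CopyOnly(); }       // returns a temporary: elision candidate

// Copy-constructor calls when initialising an object from a function result.
int copies_when_returning_temporary() {
    CopyOnly::copies = 0;
    CopyOnly c = make();
    (void)c;
    return CopyOnly::copies;
}

// Copy-constructor calls when "moving" from an lvalue of a copy-only type.
int copies_when_moving_from_lvalue() {
    CopyOnly::copies = 0;
    CopyOnly a;
    CopyOnly b(std::move(a));                // no move constructor: binds to const CopyOnly&
    (void)b;
    return CopyOnly::copies;
}
```

With default compiler settings the first function should report no copies at all, and the second reports exactly one: std::move only produces an rvalue, and the copy constructor is the best match left.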
Why can't STL internally leverage move semantics to improve such operations internally using operations like copy elision or RVO which doesn't require move constructors?
The STL containers are movable, regardless of the types stored within. However, operations on the objects in the container require the objects' cooperation, and as such sort (for example) may only move objects if those objects are movable.
Do we get benefited by move semantics in below case [...] as integer is a primitive type ?
Yes, you do, because containers are movable regardless of their content. As you deduced, vt2 will steal the memory from vt1. The state of vt1 after the move is unspecified though, so I cannot guarantee its storage will have been nullified.
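To see the buffer being stolen, one can compare data() before and after the move. This is only a sketch: MoveReport and move_big_vector are made-up names, and it deliberately does not inspect the moved-from vector.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct MoveReport {
    bool same_buffer;       // did vt2 take over vt1's heap block?
    std::size_t new_size;   // number of elements now owned by vt2
};

MoveReport move_big_vector() {
    std::vector<int> vt1(1000000, 5);
    const int* before = vt1.data();          // address of vt1's heap buffer
    std::vector<int> vt2(std::move(vt1));    // move construction: constant time
    return MoveReport{vt2.data() == before, vt2.size()};
}
```

No int is copied here; only the vector's internal bookkeeping (pointer, size, capacity) changes hands.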
When a push_back() is called using std::move on lvalue [what happens] ?
The move constructor of the lvalue's type is called. Typically this involves a shallow, member-by-member copy of the original into the destination, followed by a nullification of the original.
In general, the cost of a move constructor is proportional to sizeof(object); for example, sizeof(std::string) is stable regardless of how many characters the std::string has, because in effect those characters are stored on the heap (at least when there is a sufficient number of them) and thus only the pointer to the heap storage is moved around (plus some metadata).
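A small sketch of this with std::string, assuming a string long enough that the implementation keeps its characters on the heap (the function name is made up). The string object itself lands in the vector's contiguous storage, while the character buffer stays where it was.

```cpp
#include <string>
#include <utility>
#include <vector>

bool pushed_string_keeps_its_buffer() {
    std::vector<std::string> vt;
    vt.reserve(2);                       // avoid reallocation during the demo
    std::string obj(1000, 'x');          // long enough to be stored on the heap
    const char* chars = obj.data();      // address of the character buffer
    vt.push_back(std::move(obj));        // move-constructs a string inside vt
    // The new element reuses the old character buffer; nothing was copied.
    return vt.back().data() == chars && vt.back().size() == 1000;
}
```

For short strings an implementation may store the characters inside the object itself (small-string optimisation), in which case the move copies those few bytes instead.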
1. Yes.
2. They do, as far as possible.
3. Yes. std::vector has a move constructor that avoids copying all the elements.
4. It is still contiguous.
e.g.
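#include <cstring> // std::memcpy

struct MyClass
{
    // Move: steal the buffer, then leave `other` empty so its destructor is harmless.
    MyClass(MyClass&& other) noexcept
        : xs(other.xs), size(other.size)
    {
        other.xs = nullptr;
        other.size = 0;
    }
    // Copy: allocate a new buffer and duplicate the contents.
    MyClass(const MyClass& other)
        : xs(new int[other.size]), size(other.size)
    {
        std::memcpy(xs, other.xs, size * sizeof(int)); // byte count, not element count
    }
    ~MyClass()
    {
        delete[] xs;
    }
    int* xs;
    int size;
};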
With a move constructor, only xs and size need to be copied into the vector (which keeps its storage contiguous); we do not need to perform the memory allocation and memcpy that the copy constructor does.
Assume the following non-copyable and non-movable struct X, with no default constructor and no single-argument constructor:
struct X
{
X(int x, int y) { }
X(const X&) = delete;
X(X&&) = delete;
};
and a vector std::vector<std::pair<X,X>> v. To insert into v, one could use emplace_back if X were constructible from a single argument, since it effectively calls the constructor of std::pair<X,X>.
We could do something like this:
v.emplace_back(X(42,42),X(69,69));
but in this case the move constructor of X gets called, so the code does not compile. Since this is not possible, we have to make use of the std::piecewise_construct constructor of std::pair and call:
v.emplace_back(std::piecewise_construct, std::forward_as_tuple(42,42), std::forward_as_tuple(69,69));
I would expect this to work properly, but the vector, for some reason, calls the move ctor (or the copy ctor, if only the move ctor is deleted).
Changing the container to std::list, for example, makes everything work just fine. Adding a < operator to X and creating a std::map<X,X> (which has pairs of X as its nodes) or a std::set<std::pair<X,X>>, and using emplace instead of emplace_back, also seems to work. What is wrong with std::vector?
Full code snippet can be found here.
std::vector is subject to reallocation once its size reaches its capacity.
When reallocating the elements into a new memory segment, std::vector has to copy/move the values from the old segment, and it does so by calling the copy/move constructors.
If you don't need the elements to be contiguous in memory, you can use std::deque instead, since std::deque doesn't relocate existing elements when you insert at either end.
You can't insert non-copyable, non-movable objects into a std::vector, because emplace_back and push_back require the element type to be movable or copyable.
EDIT suggested by #François Andrieux
If you still need a std::vector for any reason, you may consider using std::unique_ptr<X> as the value type, i.e. std::vector<std::unique_ptr<X>>.
With this solution your elements are still not contiguous in memory, and they are kept alive for as long as they remain in the vector. So unless you are forced to use std::vector for some reason, I think the best match is still std::deque.
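Both suggestions can be sketched as follows. Unlike the X in the question, this X stores its arguments so there is something to read back, and the two function names are made up for the example. std::make_unique requires C++14.

```cpp
#include <deque>
#include <memory>
#include <tuple>
#include <utility>
#include <vector>

struct X {
    int x, y;
    X(int x_, int y_) : x(x_), y(y_) {}
    X(const X&) = delete;
    X(X&&) = delete;
};

// std::deque does not relocate existing elements on emplace_back, so the
// pair can be built in place from the two argument tuples.
int sum_via_deque() {
    std::deque<std::pair<X, X>> d;
    d.emplace_back(std::piecewise_construct,
                   std::forward_as_tuple(42, 42),
                   std::forward_as_tuple(69, 69));
    return d.back().first.x + d.back().second.y;
}

// With std::vector, store owning pointers instead: a reallocation moves the
// unique_ptr objects, never the pairs they point to.
int sum_via_vector_of_pointers() {
    std::vector<std::unique_ptr<std::pair<X, X>>> v;
    v.push_back(std::make_unique<std::pair<X, X>>(
        std::piecewise_construct,
        std::forward_as_tuple(42, 42),
        std::forward_as_tuple(69, 69)));
    return v.back()->first.x + v.back()->second.y;
}
```

Both functions return 42 + 69 = 111; neither ever copies or moves an X.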
One of my functions takes a vector as a parameter and stores it as a member variable. I am using a const reference to a vector, as shown below.
class Test {
public:
void someFunction(const std::vector<string>& items) {
m_items = items;
}
private:
std::vector<string> m_items;
};
However, sometimes items contains a large number of strings, so I'd like to add a function (or replace the function with a new one) that supports move semantics.
I am thinking of several approaches, but I'm not sure which one to choose.
1) unique_ptr
void someFunction(std::unique_ptr<std::vector<string>> items) {
// Also, make `m_items` std::unique_ptr<std::vector<string>>
m_items = std::move(items);
}
2) pass by value and move
void someFunction(std::vector<string> items) {
m_items = std::move(items);
}
3) rvalue
void someFunction(std::vector<string>&& items) {
m_items = std::move(items);
}
Which approach should I avoid and why?
Unless you have a reason for the vector itself to live on the heap, I would advise against using unique_ptr.
The vector's internal storage lives on the heap anyway, so with unique_ptr you need two levels of indirection: one to dereference the pointer to the vector, and another to dereference the internal storage buffer.
As such, I would advise to use either 2 or 3.
If you go with option 3 (requiring an rvalue reference), you are foisting a requirement on the users of your class that they pass an rvalue (either directly from a temporary, or move from an lvalue), when calling someFunction.
The requirement of moving from an lvalue is onerous.
If your users want to keep a copy of the vector, they have to jump through hoops to do so.
std::vector<string> items = { "1", "2", "3" };
Test t;
std::vector<string> copy = items; // have to copy first
t.someFunction(std::move(items));
However, if you go with option 2, the user can decide if they want to keep a copy, or not - the choice is theirs
Keep a copy:
std::vector<string> items = { "1", "2", "3" };
Test t;
t.someFunction(items); // pass items directly - we keep a copy
Don't keep a copy:
std::vector<string> items = { "1", "2", "3" };
Test t;
t.someFunction(std::move(items)); // move items - we don't keep a copy
On the surface, option 2 seems like a good idea since it handles both lvalues and rvalues in a single function. However, as Herb Sutter notes in his CppCon 2014 talk Back to the Basics! Essentials of Modern C++ Style, this is a pessimization for the common case of lvalues.
If m_items already has enough capacity to hold items, your original code will not allocate memory for the vector's buffer:
// Original code:
void someFunction(const std::vector<string>& items) {
// If m_items.capacity() >= items.size(), the existing
// buffer is reused and there is no allocation for it.
// Copying the strings may still require
// allocations
m_items = items;
}
The copy-assignment operator on std::vector is smart enough to reuse the existing allocation. On the other hand, taking the parameter by value will always have to make another allocation:
// Option 2:
// When passing in an lvalue, we always need to allocate memory and copy over
void someFunction(std::vector<string> items) {
m_items = std::move(items);
}
To put it simply: copy construction and copy assignment do not necessarily have the same cost. It's not unlikely for copy assignment to be more efficient than copy construction — it is more efficient for std::vector and std::string †.
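The difference can be observed by comparing data() pointers. This is a sketch that uses std::vector<int> to keep element allocations out of the picture; the function names are made up, and the buffer reuse is what mainstream standard libraries do rather than something the standard spells out.

```cpp
#include <vector>

// True when copy assignment kept the destination's existing buffer.
bool copy_assignment_reuses_buffer() {
    std::vector<int> source(10, 7);
    std::vector<int> target;
    target.reserve(100);                 // capacity already exceeds source.size()
    const int* before = target.data();
    target = source;                     // copy assignment
    return target.data() == before && target.size() == 10;
}

// Copy construction has no old buffer to reuse, so it allocates its own.
bool copy_construction_allocates() {
    std::vector<int> source(10, 7);
    std::vector<int> copy(source);
    return copy.data() != source.data();
}
```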
The easiest solution, as Herb notes, is to add an rvalue overload (basically your option 3):
// You can add `noexcept` here because there will be no allocation‡
void someFunction(std::vector<string>&& items) noexcept {
m_items = std::move(items);
}
Do note that the copy-assignment optimization only works when m_items already exists, so taking parameters to constructors by value is totally fine - the allocation would have to be performed either way.
TL;DR: Choose to add option 3. That is, have one overload for lvalues and one for rvalues. Option 2 forces copy construction instead of copy assignment, which can be more expensive (and is for std::string and std::vector)
† If you want to see benchmarks showing that option 2 can be a pessimization, at this point in the talk, Herb shows some benchmarks
‡ We shouldn't have marked this as noexcept if std::vector's move-assignment operator wasn't noexcept. Do consult the documentation if you are using a custom allocator.
As a rule of thumb, similar functions should be marked noexcept only if the type's move-assignment operator is noexcept.
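One way to avoid hard-coding that assumption is a conditional noexcept driven by a type trait. This is a sketch of the idea, not something from the talk; the Items alias and count() are added for the example.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

class Test {
public:
    using Items = std::vector<std::string>;

    // noexcept exactly when move-assigning the member cannot throw.
    void someFunction(Items&& items)
        noexcept(std::is_nothrow_move_assignable<Items>::value) {
        m_items = std::move(items);
    }

    std::size_t count() const { return m_items.size(); }

private:
    Items m_items;
};
```

If Items is later changed to a vector with a custom allocator whose move assignment may throw, the exception specification follows automatically.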
It depends on your usage patterns:
Option 1
Pros:
Responsibility is explicitly expressed and passed from the caller to the callee
Cons:
Unless the vector was already wrapped using a unique_ptr, this doesn't improve readability
Smart pointers in general manage dynamically allocated objects. Thus, your vector must become one. Since standard library containers are managed objects that use internal allocations for the storage of their values, this means that there are going to be two dynamic allocations for each such vector: one for the vector object itself (owned by the unique_ptr) and an additional one for the stored items.
Summary:
If you consistently manage this vector using a unique_ptr, keep using it, otherwise don't.
Option 2
Pros:
This option is very flexible, since it allows the caller to decide whether he wants to keep a copy or not:
std::vector<std::string> vec { ... };
Test t;
t.someFunction(vec); // vec stays a valid copy
t.someFunction(std::move(vec)); // vec is moved
When the caller uses std::move() the object is only moved twice (no copies), which is efficient.
Cons:
When the caller doesn't use std::move(), a copy constructor is always called to create the temporary object. If we were to use void someFunction(const std::vector<std::string> & items) and our m_items was already big enough (in terms of capacity) to accommodate items, the assignment m_items = items would have been only a copy operation, without the extra allocation.
Summary:
If you know in advance that this object is going to be re-set many times during runtime, and the caller doesn't always use std::move(), I would avoid it. Otherwise, this is a great option, since it is very flexible, allowing both user-friendliness and higher performance on demand despite the problematic scenario.
Option 3
Cons:
This option forces the caller to give up on his copy. So if he wants to keep a copy to himself, he must write additional code:
std::vector<std::string> vec { ... };
Test t;
t.someFunction(std::vector<std::string>{vec});
Summary:
This is less flexible than Option #2 and thus I would say inferior in most scenarios.
Option 4
Given the cons of options 2 and 3, I would suggest an additional option:
void someFunction(const std::vector<std::string>& items) {
    m_items = items;
}
// AND
void someFunction(std::vector<std::string>&& items) {
    m_items = std::move(items);
}
Pros:
It solves all the problematic scenarios described for options 2 & 3 while enjoying their advantages as well
The caller decides whether to keep a copy or not
Can be optimized for any given scenario
Cons:
If the method accepts many parameters both as const references and/or rvalue references the number of prototypes grows exponentially
Summary:
As long as you don't have such prototypes, this is a great option.
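To check which overload a given call selects, the two functions can record the path they took. The Path enum and last() are additions for this sketch only.

```cpp
#include <string>
#include <utility>
#include <vector>

class Test {
public:
    enum class Path { None, Copied, Moved };

    void someFunction(const std::vector<std::string>& items) {
        m_items = items;               // lvalue argument: copy assignment
        m_last = Path::Copied;
    }
    void someFunction(std::vector<std::string>&& items) {
        m_items = std::move(items);    // rvalue argument: move assignment
        m_last = Path::Moved;
    }

    Path last() const { return m_last; }

private:
    std::vector<std::string> m_items;
    Path m_last = Path::None;
};
```

Passing a named vector picks the const-reference overload and leaves the caller's vector intact; passing std::move(vec) or a temporary picks the rvalue overload.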
The current advice on this is to take the vector by value and move it into the member variable:
void fn(std::vector<std::string> val)
{
m_val = std::move(val);
}
And I just checked, std::vector does supply a move-assignment operator. If the caller doesn't want to keep a copy, they can move it into the function at the call site: fn(std::move(vec));.
I made a sparse matrix class for some work I am doing. For the sparse structures, I used pointers, e.g. int* rowInd = new int[numNonZero]. For the class I wrote copy and move assignment operators and all works fine.
Reading about the move and copy semantics online, I have tangentially found an overwhelming opinion that in modern C++ I should probably not be using raw pointers. If this is the case, then I would like to modify my code to use vectors for good coding practice.
I have mostly read that vectors are preferred over raw pointers. Is there any reason not to change to vectors?
If I change the data to be stored in vectors instead of new[] arrays, do I still need to manually write copy/move constructors and assignment operators for my classes? Are there any important differences between the move/copy operations of vector and of new[] arrays?
Suppose I have a class called Levels, which contains several sparse matrix variables. I would like a function to create a vector of Levels, and return it:
vector<Levels> GetGridLevels(int &n, ... ) {
vector<Levels> grids(n);
// ... Define matrix variables for each Level object in grids ...
return grids;
}
Will move semantics prevent this from being an expensive copy? I would think so, but it's a vector of objects containing objects containing member vector variables, which seems like a lot...
Yes, use std::vector<T> instead of raw T *.
Also yes, the compiler will generate copy and move assignment operators for you and those will very likely have optimal performance, so don't write your own. If you want to be explicit, you can say that you want the generated defaults:
struct S
{
std::vector<int> numbers {};
// I want a default copy constructor
S(const S&) = default;
// I want a default move constructor
S(S &&) noexcept = default;
// I want a default copy-assignment operator
S& operator=(const S&) = default;
// I want a default move-assignment operator
S& operator=(S&&) noexcept = default;
};
Regarding your last question, if I understand correctly, you mean whether returning a move-aware type by-value will be efficient. Yes, it will. To get the most out of your compiler's optimizations, follow these rules:
Return by-value (not by const value, this will inhibit moving).
Don't return std::move(x), just return x (at least if your return type is decltype(x)), so as not to inhibit copy elision.
If you have more than one return statement, return the same object on every path to facilitate named return value optimization (NRVO).
std::string
good(const int a)
{
std::string answer {};
if (a % 7 > 3)
answer = "The argument modulo seven is greater than three.";
else
answer = "The argument modulo seven is less than or equal to three.";
return answer;
}
std::string
not_so_good(const int a)
{
std::string answer {"The argument modulo seven is less than or equal to three."};
if (a % 7 > 3)
return "The argument modulo seven is greater than three.";
return answer;
}
For those types where you write move constructors and move-assignment operators yourself, make sure to declare them noexcept; otherwise some standard library containers (notably std::vector) will fall back to copying the elements when they reallocate, as long as the type is copyable.
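The effect of noexcept can be made visible by counting which constructor std::vector calls during a reallocation. Counters, Element and reallocate_once are hypothetical helpers written for this sketch.

```cpp
#include <vector>

struct Counters { int copies = 0; int moves = 0; };

// Element whose move constructor is noexcept only if NoexceptMove is true.
template <bool NoexceptMove>
struct Element {
    Counters* c;
    explicit Element(Counters* counters) : c(counters) {}
    Element(const Element& o) : c(o.c) { ++c->copies; }
    Element(Element&& o) noexcept(NoexceptMove) : c(o.c) { ++c->moves; }
};

// Fills a vector with four elements, forces one reallocation and reports
// how the four elements were transferred to the new buffer.
template <bool NoexceptMove>
Counters reallocate_once() {
    Counters counters;
    std::vector<Element<NoexceptMove>> v;
    v.reserve(4);
    for (int i = 0; i < 4; ++i) v.emplace_back(&counters);  // built in place
    v.reserve(v.capacity() + 1);                            // forces reallocation
    return counters;
}
```

With the potentially throwing move constructor the vector copies all four elements to keep its strong exception guarantee; with the noexcept one it moves them.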
Nothing related to correctness. Just be aware that constructing a vector of size n means it will initialize all of its elements, so you might prefer to construct an empty vector, then reserve(n), then push_back the elements.
No, the implicit move constructor/assignment should take care of it all - unless you suppress them.
Yes, if you don't write code to prevent the move, you'll get an efficient move from std::vector automatically.
Also, consider using an existing library such as Eigen, so you get some fairly optimized routines for free.
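Here is a sketch of what the vector-based version could look like without any hand-written copy or move members. SparseMatrix, its fields and move_keeps_buffers are stand-ins; the real classes in the question are not shown.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the classes in the question. No destructor,
// copy or move members are declared, so the compiler generates all of them.
struct SparseMatrix {
    std::vector<int> rowInd;
    std::vector<int> colInd;
    std::vector<double> values;
};

struct Levels {
    SparseMatrix A;
    SparseMatrix P;
};

static_assert(std::is_nothrow_move_constructible<Levels>::value,
              "the generated move constructor is noexcept");

std::vector<Levels> GetGridLevels(std::size_t n) {
    std::vector<Levels> grids(n);
    for (std::size_t i = 0; i < n; ++i)
        grids[i].A.values.assign(1000, 1.0);
    return grids;   // elided or moved, never deep-copied
}

// Moving a Levels hands over each member vector's buffer.
bool move_keeps_buffers() {
    Levels src;
    src.A.values.assign(1000, 1.0);
    const double* before = src.A.values.data();
    Levels dst(std::move(src));
    return dst.A.values.data() == before;
}
```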
No. In 99% of the cases the simplest use of std::vector will do the job better and safer than raw pointers, and in the less common cases where you need to manually manage memory, these classes can work with custom allocators/deallocators (for instance, if you want aligned memory for use with aligned SSE intrinsics). If you use custom allocators, the code will be potentially more complex than with raw pointers, but more maintainable and less prone to memory problems.
Depending on what your other members are, and what your class does, you may need to implement move/copy assignment/ctors. But this will be much simpler: even if you have to implement them yourself, for your vectors you just need to call the corresponding operators/ctors. The code will be simple and readable, and you will have no risk of segfaults or memory leaks.
Yes, but move semantics are not even necessary. Return value optimization will be responsible for the optimized copy (in fact there will be no copy). However this is compiler specific, and not guaranteed by the standard.
Suppose I have a class which manages a pointer to an internal buffer:
class Foo
{
public:
Foo();
...
private:
std::vector<unsigned char> m_buffer;
unsigned char* m_pointer;
};
Foo::Foo()
{
m_buffer.resize(100);
m_pointer = &m_buffer[0];
}
Now, suppose I also have correctly implemented rule-of-3 stuff including a copy constructor which copies the internal buffer, and then reassigns the pointer to the new copy of the internal buffer:
Foo::Foo(const Foo& f)
{
m_buffer = f.m_buffer;
m_pointer = &m_buffer[0];
}
If I also implement move semantics, is it safe to just copy the pointer and move the buffer?
Foo::Foo(Foo&& f) : m_buffer(std::move(f.m_buffer)), m_pointer(f.m_pointer)
{ }
In practice, I know this should work, because the std::vector move constructor is just moving the internal pointer - it's not actually reallocating anything so m_pointer still points to a valid address. However, I'm not sure if the standard guarantees this behavior. Does std::vector move semantics guarantee that no reallocation will occur, and thus all pointers/iterators to the vector are valid?
I'd do &m_buffer[0] again, simply so that you don't have to ask these questions. It's clearly not intuitive, so don't do it. And, in doing so, you have nothing to lose whatsoever. Win-win.
Foo::Foo(Foo&& f)
: m_buffer(std::move(f.m_buffer))
, m_pointer(&m_buffer[0])
{}
I'm comfortable with it mostly because m_pointer is a view into the member m_buffer, rather than strictly a member in its own right.
Which does sort of beg the question... why is it there? Can't you expose a member function to give you &m_buffer[0]?
I won't comment on the OP's code. All I'm doing is answering this question:
Does std::vector move semantics guarantee that no reallocation will occur, and thus all pointers/iterators to the vector are valid?
Yes for the move constructor. It has constant complexity (as specified by 23.2.1/4, table 96 and note B) and for this reason the implementation has no choice other than stealing the memory from the original vector (so no memory reallocation occurs) and emptying the original vector.
No for the move assignment operator. The standard requires only linear complexity (as specified in the same paragraph and table mentioned above) because sometimes a reallocation is required. However, in some circumstances, it might have constant complexity (and no reallocation is performed), but that depends on the allocator. (You can read the excellent exposition on moved vectors by Howard Hinnant here.)
A better way to do this may be:
class Foo
{
    std::vector<unsigned char> m_buffer;
    size_t m_index;
public:
    unsigned char* get_pointer() { return &m_buffer[m_index]; }
};
i.e. rather than storing a pointer to a vector element, store its index. That way it will be immune to copying/resizing of the vector's backing store.
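A slightly fuller sketch of the index-based variant, with the compiler-generated copy and move members doing the right thing on their own. The constructor and seek() are additions for the example.

```cpp
#include <cstddef>
#include <vector>

class Foo {
public:
    Foo() : m_buffer(100), m_index(0) {}

    // Recomputed on every call, so it always refers to this object's buffer.
    unsigned char* get_pointer() { return &m_buffer[m_index]; }

    // The caller is responsible for keeping index below the buffer size.
    void seek(std::size_t index) { m_index = index; }

private:
    std::vector<unsigned char> m_buffer;
    std::size_t m_index;
};
```

A copy of such a Foo gets its own buffer and the same index, so get_pointer() on the copy points into the copy.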
The case of move construction is guaranteed to move the buffer from one container to the other, so from the point of view of the newly created object, the operation is fine.
On the other hand, you should be careful with this kind of code, as the donor object is left with an empty vector and a pointer referring to the vector in a different object. This means that after being moved from, your object is in a fragile state that might cause issues if anyone accesses the interface and, even more importantly, if the destructor tries to use the pointer.
While in general there won't be any use of your object after being moved from (the assumption being that to be bound by an rvalue-reference it must be an rvalue), the fact is that you can move out of an lvalue by casting or by using std::move (which is basically a cast), in which case code might actually attempt to use your object.
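One way to keep the donor out of that fragile state is to reset its pointer in the move constructor. This is a sketch under the assumption that a null pointer is an acceptable "empty" state for Foo; pointer() and buffer_start() exist only so the result can be inspected.

```cpp
#include <utility>
#include <vector>

class Foo {
public:
    Foo() : m_buffer(100), m_pointer(m_buffer.data()) {}

    Foo(const Foo& f) : m_buffer(f.m_buffer), m_pointer(m_buffer.data()) {}

    Foo(Foo&& f) noexcept
        : m_buffer(std::move(f.m_buffer)),
          m_pointer(m_buffer.data())   // re-derive from our own buffer
    {
        f.m_pointer = nullptr;         // the donor no longer views any buffer
    }

    unsigned char* pointer() const { return m_pointer; }
    const unsigned char* buffer_start() const { return m_buffer.data(); }

private:
    std::vector<unsigned char> m_buffer;   // declared first, so initialised first
    unsigned char* m_pointer;
};
```

Assignment operators are left out on purpose: declaring the move constructor makes the implicit copy assignment deleted, so nothing can silently copy the raw pointer.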
For some reason, I want to return an object of my::Vector (which is basically a wrapper class that internally uses an STL vector for the actual storage and provides some extra functions). I return the vector by value, as the function creates a vector locally every time.
my::Vector<int> calcOnCPU()
{
my::Vector<int> v....
return v;
}
Now I can have multiple nesting of function calls (considering a library design), so in short something like following:
my::Vector<int> calc()
{
if(...)
return calcOnCPU();
}
AFAIK, returning by value would invoke the copy constructor of the my::Vector class, which looks something like this:
Vector<int>::Vector(const Vector& c)
{
....
m_vec = c.m_vec; // where m_vec is std::vector<int>
}
A few questions:
1) In the copy constructor, does this invoke the copy constructor of std::vector, or its assignment operator? And just to confirm: std::vector creates a deep copy (meaning it copies all elements, considering a basic integer type)?
2) With calcOnCPU() nested in calc(), each returning a Vector of int, will 2 copies of the Vector be created or 1? How could I avoid multiple copies with such simple method nesting? Inline functions, or is there another way?
UPDATE 1: It became apparent to me that I need to keep my own copy constructor as there are some custom requirements. However, I did a simple test in main function:
int main() {
...
my::Vector<int> v = calc();
std::cout<<v;
}
I put some prints using "std::cerr" in my copy constructor to see when it gets called. Interestingly, it is not called even once for the above program (at least nothing gets printed). Is it the copy elision optimization? I am using the GNU C++ compiler (g++) v4.6.3 on Linux.
In copy constructor, is it invoking copy constructor of std::vector? or assignment operator
In your case, it's creating an empty std::vector, then copy-assigning it. Using an initialiser list would copy-construct it directly, which is neater and possibly more efficient:
Vector<int>::Vector(const Vector& c) : m_vec(c.m_vec) {
....
}
Just to confirm, std::vector creates deep copy
Yes, copying a std::vector will allocate a new block of memory and copy all the elements into that.
With nesting of calcOnCPU() in calc() each returning Vector of int: 2 or 1 copies of Vector will be created?
That's up to the compiler. It should apply the "return value optimisation" (a special case of copy elision), in which case it won't create a local object and return a copy, but will create it directly in the space allocated for the returned object. There are some cases where this can't be done - if you have multiple return statements that might return one of several local objects, for example.
A modern compiler will also support move semantics where, even if the copy can't be elided, the vector's contents will be moved to the returned object rather than copied; that is, they will be transferred quickly by setting the vector's internal pointers, with no memory allocation or copying of elements. However, since you're wrapping the vector in your own class, and you've declared a copy constructor, you'll have to give that class a move constructor in order for that to work, which you can only do if you're using C++11.
How could I avoid multiple copies in case of such simple method nesting? Inline functions or there exist another way?
Make sure the structure of your function is simple enough for copy elision to work. If you can, give your class a move constructor that moves the vector (or remove the copy constructor, and the assignment operator if there is one, to allow one to be implicitly generated). Inlining is unlikely to make a difference.
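As a sketch of that advice, here is a wrapper with both a counting copy constructor and a move constructor. Wrapper is a hypothetical stand-in for my::Vector<int>, and the counter exists only to make copies observable.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for my::Vector<int>.
class Wrapper {
public:
    static int copies;   // how many times the copy constructor ran

    Wrapper() = default;
    Wrapper(const Wrapper& c) : m_vec(c.m_vec) { ++copies; }
    Wrapper(Wrapper&& c) noexcept : m_vec(std::move(c.m_vec)) {}

    void push_back(int value) { m_vec.push_back(value); }
    std::size_t size() const { return m_vec.size(); }

private:
    std::vector<int> m_vec;
};
int Wrapper::copies = 0;

Wrapper calcOnCPU() {
    Wrapper v;
    for (int i = 0; i < 1000; ++i) v.push_back(i);
    return v;              // NRVO, or a move if elision is not applied
}

Wrapper calc() {
    return calcOnCPU();    // returning a temporary: no named object to copy
}
```

Whether or not the compiler elides anything, Wrapper v = calc(); never reaches the copy constructor here, because every remaining transfer can use the move constructor.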