does std::vector copy/move elements when re-sizing?

I was messing around with move c'tors for a learning / refresh exercise and I came across something unexpected to me. Below I have a class person that contains a std::string m_name;. I am using this as a test class for copy/move c'tors.
Here is the code for quick reference:
#include <iostream>
#include <string>
#include <utility>
#include <vector>
class person
{
public:
std::string m_name;
explicit person(const std::string &name) : m_name(name)
{
std::cout << "created " << m_name << std::endl;
}
~person()
{
std::cout << "destroyed " << m_name << std::endl;
}
person(const person &other) : m_name(other.m_name)
{
m_name += ".copied";
std::cout << "copied " << other.m_name << " -> " << m_name << std::endl;
}
person(const person &&other) noexcept : m_name(std::move(other.m_name))
{
m_name += ".moved";
std::cout << "moved " << other.m_name << " -> " << m_name << std::endl;
}
};
int main()
{
std::vector<person> people;
people.reserve(10);
std::cout << "\ncopy bob (lvalue):" << std::endl;
person bob{"bob"};
people.push_back(bob);
std::cout << "\nmove fred (lvalue):" << std::endl;
person fred{"fred"};
people.push_back(std::move(fred));
std::cout << "\ntemp joe (rvalue):" << std::endl;
people.push_back(person{"joe"});
std::cout << "\nterminating:" << std::endl;
}
This gives me the output that I would expect (mostly; the one surprise is that the std::string contents do not appear to be "moved"): https://godbolt.org/z/-J_56i
Then I remove the std::vector reserve so that std::vector has to "grow" as I am adding elements. Now I get something that I really don't expect: https://godbolt.org/z/rS6-mj
Now I can see that bob is copied and then moved when fred is added and then moved again when joe is added. I was under the impression that std::vector "moved" when it has to reallocate space. But I thought that it did a memory copy/move, not an object-by-object copy/move. I really did not expect it to call the move constructor.
Now if I remove the move c'tor, I find that bob is copied three times!: https://godbolt.org/z/_BxnvU
This seems really inefficient.
From cplusplus.com:
push_back()
Add element at the end Adds a new element at the end of the vector,
after its current last element. The content of val is copied (or
moved) to the new element.
This effectively increases the container size by one, which causes an
automatic reallocation of the allocated storage space if -and only if-
the new vector size surpasses the current vector capacity.
resize()
Resizes the container so that it contains n elements.
If n is smaller than the current container size, the content is
reduced to its first n elements, removing those beyond (and destroying
them).
If n is greater than the current container size, the content is
expanded by inserting at the end as many elements as needed to reach a
size of n. If val is specified, the new elements are initialized as
copies of val, otherwise, they are value-initialized.
If n is also greater than the current container capacity, an automatic
reallocation of the allocated storage space takes place.
Notice that this function changes the actual content of the container
by inserting or erasing elements from it.
I guess it does not really describe "how" it does the reallocation, but surely a memory copy is the fastest way to move the vector to its newly allocated memory space?
So why are the copy/move c'tors called when std::vector is added to instead of a memory copy?
A side note/question (and maybe this should be a separate question): in the person move c'tor, why is moved fred -> fred.moved printed and not moved -> fred.moved? It appears that the std::string move constructor does not really "move" the data...

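If it needs to relocate, something similar to std::move(xold.begin(), xold.end(), xnew.begin()); will be used, except that the new storage is uninitialized, so the vector constructs each element there with its own internal placement new instead of assigning. Which constructor runs depends on the value type: the vector moves if it can do so safely (the move constructor is noexcept, or the type cannot be copied) and copies otherwise. A raw memory copy is only valid for trivially copyable types; for a class like person it would bypass the constructors.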
Your move constructor
person(const person &&other) noexcept;
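has a flaw though: other should not be const, since the constructor must be allowed to change other to steal its resources. With const person&&, std::move(other.m_name) yields a const std::string&&, which can only bind to std::string's copy constructor; that is why "moved fred -> fred.moved" is printed and fred keeps its name. In this move constructor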
person(person&& other) noexcept : m_name(std::move(other.m_name)) {}
the std::strings own move constructor will do something similar to this:
string(string&& other) noexcept :
the_size(other.the_size),
data_ptr(std::exchange(other.data_ptr, nullptr))
{}
You should also add a move assignment operator:
person& operator=(person &&other) noexcept;

Related

Why is the copy constructor called in this example with std::vector?

#include <iostream>
#include <vector>
using namespace std;
// Move Class
class Move {
private:
// Declare the raw pointer as
// the data member of class
int* data;
public:
// Constructor
Move(int d)
{
// Declare object in the heap
data = new int;
*data = d;
cout << "Constructor is called for "
<< d << endl;
};
// Copy Constructor
Move(const Move& source)
: Move{ *source.data }
{
cout << "Copy Constructor is called -"
<< "Deep copy for "
<< *source.data
<< endl;
}
// Move Constructor
Move(Move&& source)
: data{ source.data }
{
cout << "Move Constructor for "
<< *source.data << endl;
source.data = nullptr;
}
// Destructor
~Move()
{
if (data != nullptr)
cout << "Destructor is called for "
<< *data << endl;
else
cout << "Destructor is called"
<< " for nullptr "
<< endl;
delete data;
}
};
// Driver Code
int main()
{
// Vector of Move Class
vector<Move> vec;
// Inserting Object of Move Class
vec.push_back(Move{ 10 });
vec.push_back(Move{ 20 });
return 0;
}
output:
Constructor is called for 10
Move Constructor for 10
Destructor is called for nullptr
Constructor is called for 20
Move Constructor for 20
Constructor is called for 10
Copy Constructor is called -Deep copy for 10
Destructor is called for 10
Destructor is called for nullptr
Destructor is called for 10
Destructor is called for 20

I think what you are asking is why the copy constructor is called instead of your move constructor. It is because your move constructor is not noexcept. If you mark it as noexcept, the move constructor will be used during reallocation instead.
See this related answer for more information: https://stackoverflow.com/a/47017784/6324364
The common recommendation is that move constructors should be noexcept if at all possible. Many standard library functions will not use the move constructor if it could throw an exception.
If your question is instead about the order of constructor calls - that is just dependent on the implementation. An example implementation of push_back could be:
// The callsite of this function is where "Constructor is called for 20"
template<typename T>
void push_back(T&& value) {
// If the vector is full resize
if (size() == capacity()) {
auto* oldBuffer = m_buffer;
// allocate larger amount of space,
// increasing capacity but keeping size the same
// ...
// Here is your move constructor call
m_buffer[size()] = std::move(value);
// copy/move over old elements. This is your copy constructor
// If your move constructor was noexcept, it would be used.
// Otherwise the copy constructor is called.
// ...
m_size++;
// deallocate old buffer
// ...
} else {
m_buffer[size()] = std::move(value);
m_size++;
}
}
The reason you seemingly get two constructor calls for 10 after the move constructor for 20 is because that is what your code does:
// Constructor
Move(int d)
{
// Declare object in the heap
data = new int;
*data = d;
cout << "Constructor is called for "
<< d << endl;
};
// Copy Constructor
Move(const Move& source)
// Here, your Copy constructor is calling the other constructor
: Move{ *source.data }
{
cout << "Copy Constructor is called -"
<< "Deep copy for "
<< *source.data
<< endl;
}
In your copy constructor you are explicitly calling your other constructor leading to the two log messages.
The implementation may do things in this order (move/copy new element, then move/copy old elements) as part of the strong exception guarantee provided by vector which says that the vector state will not be corrupted if copy/move of the new items fails. If it moved the old items first and then the new item constructor threw an exception, the vector state would be corrupted.
If you can add what you expected to happen in the code, maybe someone can give a better answer. As it stands, I don't fully understand what you are asking. You get those calls because that is how it is implemented.

A vector is stored as one contiguous block of memory. When you add the second element to the vector, its current memory needs to be expanded to accommodate the newly added element. This requires your already existing elements to be copied (or moved) to a new location where enough memory is allocated to accommodate both the first and the second element. Take a look at what reserve does and how you can benefit from it. Or, if your vector is of fixed size, you may also take a look at std::array.

Strange side effect from a copy constructor

This simple code:
#include <iostream>
#include <vector>
struct my_struct
{
int m_a;
my_struct(int a) : m_a(a) { std::cout << "normal const " << m_a << std::endl; }
my_struct(const my_struct&& other) : m_a(other.m_a) { std::cout << "copy move " << other.m_a << std::endl; }
my_struct(const my_struct &other) : m_a(other.m_a) { std::cout << "copy const " << other.m_a << std::endl; }
};
class my_class
{
public:
my_class() {}
void append(my_struct &&m) { m_vec.push_back(m); }
private:
std::vector<my_struct> m_vec;
};
int main()
{
my_class m;
m.append(my_struct(5));
m.append(std::move(my_struct(6)));
}
produces this output:
normal const 5
copy const 5
normal const 6
copy const 6
copy const 5
The first call to append creates the object, and push_back creates a copy. Likewise, the second call to append creates the object, and push_back creates a copy. Now, a copy constructor of the first object is mysteriously called. Could someone explain me what happens? It looks like a strange side effect...

Now, a copy constructor of the first object is mysteriously called. Could someone explain me what happens? It looks like a strange side effect...
When you call push_back on std::vector, the vector may need to grow its capacity, as stated on cppreference:
If the new size() is greater than capacity() then all iterators and references (including the past-the-end iterator) are invalidated. Otherwise only the past-the-end iterator is invalidated.
You can use reserve before pushing anything to your vector. Try this:
class my_class
{
public:
my_class()
{
m_vec.reserve(10); // Use any number that you want.
}
void append(my_struct &&m) { m_vec.push_back(m); }
private:
std::vector<my_struct> m_vec;
};
Few other issues with your program:
You need to fix the signature of your move constructor: a move constructor takes a non-const rvalue reference (which binds to xvalues and prvalues). It should look like this:
my_struct(my_struct&& other) noexcept : m_a(other.m_a)
{
std::cout << "copy move " << other.m_a << std::endl;
}
noexcept is needed to tell std::vector that the move constructor does not throw (destructors are noexcept by default). Only then will the vector use the move constructor, rather than the copy constructor, when it grows. See this.
The method append should be:
void append(my_struct &&m)
{
m_vec.push_back(std::move(m));
}
To know why we need to use std::move on rvalue reference, see this Is an Rvalue Reference an Rvalue?. It says:
Things that are declared as rvalue reference can be lvalues or rvalues. The distinguishing criterion is: if it has a name, then it is an lvalue. Otherwise, it is an rvalue.
If you don't use std::move, then copy constructor would be called.

That's just how std::vector works!
When you call push_back() and the capacity is exhausted, the underlying array needs to grow to make room for the new element.
So internally, a new larger array is allocated and all the elements of the previous smaller array are copied (or moved, if the move constructor is noexcept) into the freshly created array. This comes with some overhead, but you can use some techniques to avoid the copies.
If you have an idea of how large the array could grow, you can use the reserve() method to ensure that no resizing will occur up to that many elements.
vct.reserve(5);
This will ensure that no resizing occurs until the vector holds more than 5 elements.
Also, you can use the emplace_back() function to avoid an additional copy. It constructs the object in place. Simply pass the constructor parameters of the object to emplace_back()

Copy Constructor called multiple times while doing push_back in a vector [duplicate]

I am a bit confused with the way vector push_back behaves. With the following snippet I expected the copy constructor to be invoked only twice, but the output suggests otherwise. Is it the vector's internal restructuring that results in this behaviour?
Output:
Inside default
Inside copy with my_int = 0
Inside copy with my_int = 0
Inside copy with my_int = 1
class Myint
{
private:
int my_int;
public:
Myint() : my_int(0)
{
cout << "Inside default " << endl;
}
Myint(const Myint& x) : my_int(x.my_int)
{
cout << "Inside copy with my_int = " << x.my_int << endl;
}
void set(const int &x)
{
my_int = x;
}
};
vector<Myint> myints;
Myint x;
myints.push_back(x);
x.set(1);
myints.push_back(x);

What happens:
1. x is inserted via push_back. One copy occurs: the newly created element is initialized with the argument. my_int is copied as zero because x's default constructor initialized it so.
2. The second element is push_back'd. The vector needs to reallocate its memory since the internal capacity was reached. As no move constructor is implicitly defined for Myint (see footnote 1), the copy constructor is chosen. The first element is copied into the newly allocated memory (its my_int is still zero, so the copy constructor shows my_int as 0 again) and then x is copied to initialize the second element (as with the first in step 1). This time x has my_int set to one, and that is what the output of the copy constructor tells us.
So the total number of calls is three. This might vary from one implementation to another, as the initial capacity might be different. However, two calls are the minimum.
You can reduce the number of copies by reserving more memory in advance, i.e. raising the vector's capacity so that the reallocation becomes unnecessary:
myints.reserve(2); // Now two elements can be inserted without reallocation.
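Furthermore, you can avoid the copy when inserting by constructing the element in place:
myints.emplace_back();
This "emplaces" a new element. emplace_back is a variadic template and can therefore take an arbitrary number of arguments, which it forwards to the element's constructor, so no temporary has to be copied or moved. Myint as posted only has a default constructor, so no arguments are passed here; emplace_back(0) would require a constructor taking an int.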
1 Because there is a user-declared copy constructor.

You got it...it was the resizing. But I'll just point out that if you're doing some bean counting on your constructors, you might be interested in "emplacement":
#include <iostream>
#include <vector>
using namespace std;
class Myint
{
private:
int my_int;
public:
explicit Myint(int value = 0) : my_int(value)
{
cout << "Inside default " << endl;
}
Myint(const Myint& x) : my_int(x.my_int)
{
cout << "Inside copy with my_int = " << x.my_int << endl;
}
Myint(Myint&& x) noexcept : my_int(x.my_int) {
cout << "Inside move with my_int = " << x.my_int << endl;
}
};
int main() {
vector<Myint> myints;
myints.reserve(2);
myints.emplace_back(0);
myints.emplace_back(1);
return 0;
}
That should give you:
Inside default
Inside default
And, due to the noexcept on the move constructor...if you delete the reserve you'd get a move, not a copy:
Inside default
Inside default
Inside move with my_int = 0
There's no real advantage with this datatype of a move over a copy. But semantically it could be a big difference if your data type was more "heavy weight" and had a way of "moving" its members that was more like transferring ownership of some pointer to a large data structure.

When the size of the vector is increased with the second push_back, the existing contents of the vector must be copied to a new buffer. To verify, output myints.capacity() after the first push_back; on common implementations it is 1.

This depends on how much memory was reserved by the std::vector object. It seems that when push_back was first executed, memory was allocated for only one element. When push_back was called the second time, the memory was reallocated to make room for the second element. In this case the element that is already in the vector is copied to the new location, and then the second element is added.
You can reserve enough memory yourself to avoid the second call of the copy constructor:
vector<Myint> myints;
myints.reserve( 2 );

You're correct in assuming the additional invocation of the copy constructor comes from internal restructuring of the vector.
See this answer for more detail: https://stackoverflow.com/a/10368636/3708904
Or this answer for the reason why copy construction is necessary: https://stackoverflow.com/a/11166959/3708904

How to pass the ownership of an object to the outside of the function

Is there any method that can pass the ownership of an object created in the function on the stack memory to the outside of the function without using copy construction?
Usually, the compiler automatically destroys objects on a function's stack when the function returns. Therefore, if we want to create an object of a class (maybe with some specific parameters), how can we avoid wasting lots of resources copying from temporary objects?
Here is one common case:
while(...){
vectors.push_back(createObject( parameter ));
}
So when we want to create objects in a loop with some parameters and push them into a vector, passing them by value the normal way will take a lot of time copying objects. I don't want to use pointers and new objects on the heap, as users are likely to forget to delete them and consequently cause memory leaks.
Well, a smart pointer may be a solution, but a less elegant one, I think.
Is there any way of applying rvalue reference and move semantics to solve this problem?

Typically, returning an object by value will not copy the object, as the compiler should apply (named) return value optimization and thereby elide the copy.
With this optimization, the space for the returned object is allocated in the calling context (the outer stack frame) and the object is constructed directly there.
In your example, the compiler will allocate space for the object in the context where createObject() is called. As this temporary is the argument to std::vector<T>::push_back(), it binds to the rvalue-reference overload push_back(T&&), which moves (instead of copies) the object into the vector. This is possible only if the generated objects are movable; otherwise, a copy will occur.
In sum, each object will be created and then moved (if movable) into the vector.
Here is a sample code that shows this in more detail:
#include <iostream>
#include <string>
#include <vector>
using Params = std::vector<std::string>;
class Object
{
public:
Object() = default;
Object(const std::string& s) : s_{s}
{
std::cout << "ctor: " << s_ << std::endl;
}
~Object()
{
std::cout << "dtor: " << s_ << std::endl;
}
// Explicitly no copy constructor!
Object(const Object& other) = delete;
Object(Object&& other)
{
std::swap(s_, other.s_);
std::cout << "move: traded '" << s_ << "' for '" << other.s_ << "'" << std::endl;
}
Object& operator=(Object other)
{
std::swap(s_, other.s_);
std::cout << "assign: " << s_ << std::endl;
return *this;
}
private:
std::string s_;
};
using Objects = std::vector<Object>;
Object createObject(const std::string& s)
{
Object o{s};
return o;
}
int main ()
{
Objects v;
v.reserve(4); // avoid moves, if initial capacity is too small
std::cout << "capacity(v): " << v.capacity() << std::endl;
Params ps = { "a", "bb", "ccc", "dddd" };
for (auto p : ps) {
v.push_back(createObject(p));
}
return 0;
}
Note that the class Object explicitly forbids copying. For this to work, the move constructor must be available.
A detailed summary on when copy elision can (or will) happen is available here.

Move semantics and copy elision should mean the elements of the std::vector built inside the function are in fact handed over to your local variable without being copied.
Crudely you can expect the move constructor of std::vector to be something like:
//This is not real code...
vector::vector(vector&& tomove){
elems=tomove.elems; //cheap transfer of elements - no copying of objects.
len=tomove.len;
cap=tomove.cap;
tomove.elems=nullptr;
tomove.len=0;
tomove.cap=0;
}
Execute this code and notice the minimum number of objects are constructed and destructed.
#include <iostream>
#include <vector>
class Heavy{
public:
Heavy(){std::cout<< "Heavy construction\n";}
Heavy(const Heavy&){std::cout<< "Heavy copy construction\n";}
Heavy(Heavy&&){std::cout<< "Heavy move construction\n";}
~Heavy(){std::cout<< "Heavy destruction\n";}
};
std::vector<Heavy> build(size_t size){
std::vector<Heavy> result;
result.reserve(size);
for(size_t i=0;i<size;++i){
result.emplace_back();
}
return result;
}
int main() {
std::vector<Heavy> local=build(5);
std::cout<<local.size()<<std::endl;
return 0;
}
Move semantics and copy elision tend to take care of this problem from C++11 onwards.
Expected output:
Heavy construction
Heavy construction
Heavy construction
Heavy construction
Heavy construction
5
Heavy destruction
Heavy destruction
Heavy destruction
Heavy destruction
Heavy destruction
Notice that I reserved capacity in the vector before filling it and used emplace_back to construct the objects straight into the vector.
You don't have to get the value passed to reserve exactly right, as the vector will grow to accommodate more values, but that will eventually lead to a reallocation and a move of all the elements, which may be costly depending on whether you or the compiler implemented an efficient move constructor.
