Does a vector re-create all objects on a push_back()? - c++

I noticed that whenever I push_back() an object into a vector, all existing objects in the vector seem to be "re-created".
To test this I wrote the test program below. It push_back()s a few instances of MyClass into a vector and prints the addresses of the objects currently in the vector after each push_back(). The constructor and destructor print a message with the address of the object whenever an instance is created/destroyed.
#include "stdafx.h"
#include <vector>
#include <iostream>
using namespace std;
class MyClass {
public:
MyClass() {
cout << "Constructor: " << this << endl;
}
~MyClass() {
cout << "Destructor: " << this << endl;
}
};
void printAdresses(const vector<MyClass> & v) {
for (int i = 0; i < v.size(); i++) {
cout << &v[i] << endl;
}
}
int main()
{
vector<MyClass> v;
for (int i = 0; i < 4; i++) {
v.push_back(MyClass());
cout << "Contains objects: " << endl;
printAdresses(v);
cout << endl;
}
system("pause");
return 0;
}
Down below is the output I got from running the program. Every block is the output from one iteration of the loop.
Constructor: 00EFF6AF
Destructor: 00EFF6AF
Contains objects:
034425A8
Constructor: 00EFF6AF
Destructor: 034425A8
Destructor: 00EFF6AF
Contains objects:
034426F8
034426F9
Constructor: 00EFF6AF
Destructor: 034426F8
Destructor: 034426F9
Destructor: 00EFF6AF
Contains objects:
034424E8
034424E9
034424EA
Constructor: 00EFF6AF
Destructor: 034424E8
Destructor: 034424E9
Destructor: 034424EA
Destructor: 00EFF6AF
Contains objects:
034423F8
034423F9
034423FA
034423FB
Apart from the constructor and destructor of the item that was given as argument to push_back(), destructors of other objects are called during a push_back() (but no corresponding constructors?). Looking more closely at the addresses, we see that the destructors belong to the elements that were in the vector prior to the push_back(). Afterwards the vector contains other objects with different addresses, but the same number of elements (plus the newly pushed one).
What all this tells me is that all objects in the vector are destroyed and re-created after each push_back. If this is the case, why is that? It seems really inefficient when a vector gets too large. If it's not the case, then what exactly happens?

Vector pre-allocates some space.
For example, it can have an initial capacity of 2 elements. When you try to push the 3rd one, it re-allocates for 4 elements. When you try to push the 5th element, it re-allocates for 8 elements and so on.
That's why you have a .size() that reflects how many elements you have and a .capacity() that reflects what is the capacity (how many elements can be stored before a reallocation is needed).
You can manually handle that capacity by calling .reserve()
To answer your original question, reallocation requires copying all the elements from the previous array to a bigger one.
This requires creating new elements in the bigger array and destroying the previous ones.
If you saw the destructor calls but not the matching constructor calls, this is because it does not use the default constructor, but the copy constructor:
MyClass(const MyClass&)
This constructor takes one object as a parameter and uses it to construct and initialize a new object.
Moreover, as mentioned in the comments: even though the re-allocation/copy process can look expensive, on average it is not; push_back() runs in amortized constant time. This is because the cost of each re-allocation and copy is amortized over future insertions: the vector does not re-allocate exactly the space needed, but more than necessary.
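To make the size/capacity distinction concrete, here is a minimal sketch that counts how often capacity() changes while elements are appended, with and without a prior reserve(). The helper name countReallocations is made up for this illustration, and the exact count without reserve() depends on the standard library's growth factor.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts how often capacity() changes while n ints are
// appended. Each change corresponds to one reallocation of the array.
std::size_t countReallocations(std::size_t n, bool reserveFirst) {
    std::vector<int> v;
    if (reserveFirst) {
        v.reserve(n);  // one up-front allocation, capacity() >= n afterwards
    }
    std::size_t reallocations = 0;
    std::size_t lastCapacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != lastCapacity) {  // storage was replaced
            ++reallocations;
            lastCapacity = v.capacity();
        }
    }
    return reallocations;
}
```

Calling countReallocations(1000, true) should report no reallocation at all, while countReallocations(1000, false) should report only a handful, far fewer than 1000, which is where the amortized constant cost comes from.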

Related

Interesting extra destruction call during push_back in std::vector

I find the output of the following code very interesting.
class Name
{
string _name;
public:
Name(const string& name) : _name(name) { cout << "ctor of " << _name << endl; }
~Name(){ cout << "dtor of " << _name << endl; }
};
int main() {
vector<Name> list;
cout << "------------------START push_back" << endl;
list.push_back(Name(string("A")));
cout << "------------------push_back(A) performed..." << endl;
list.push_back(Name(string("B")));
cout << "------------------push_back(B) performed..." << endl;
list.push_back(Name(string("C")));
cout << "------------------push_back(C) performed..." << endl;
cout << "------------------END push_back" << endl;
return 0;
}
I can understand that push_back uses an extra temporary object which is why emplace_back is recommended for better performance. Can someone explain the extra destructor calls shown in the output below with (???) next to it.
------------------START push_back
ctor of A
dtor of A
------------------push_back(A) performed...
ctor of B
dtor of A(???)
dtor of B
------------------push_back(B) performed...
ctor of C
dtor of A(???)
dtor of B(???)
dtor of C
------------------push_back(C) performed...
------------------END push_back
dtor of A
dtor of B
dtor of C
------------------START emplace_back
ctor of A
------------------emplace_back(A) performed...
ctor of B
------------------emplace_back(B) performed...
ctor of C
------------------emplace_back(C) performed...
------------------END emplace_back()
dtor of A
dtor of B
dtor of C
Can someone explain the extra destructor calls shown in the output below with (???) next to it.
The size of a memory allocation cannot change. The size of an array correspondingly cannot change since the memory allocated for the array cannot change.
The elements of a vector are stored in a single block of allocated memory, forming an array. If the memory block cannot change size, then how is it possible to add elements to a vector? Well, what we can do is allocate a larger block of memory, then copy (move) the elements from the smaller block to the larger one, then destroy the old elements in the old block, and finally deallocate the old block. This is what a vector does when you add more elements than fit within the block of memory that it has allocated for its elements (which is called capacity).
Can someone explain the extra destructor calls shown in the output below with (???) next to it.
Those are the destroyed elements from the old memory when the vector reallocates.
In case you're wondering why there is no output from the construction of the objects in the new memory, that is because the elements are copied with the implicitly generated copy constructor of your class, which produces no output (because you declared a destructor, no move constructor is generated implicitly).
In case you're wondering why there is no extra destruction after every added element, that is because vector doesn't just allocate memory for a single element when growing. Instead, it multiplies the capacity by some factor (implementations typically use a factor of 2 or 1.5). This results in better asymptotic complexity for inserting a large number of elements when the final number of elements isn't initially known.
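As a rough illustration of the last point, the sketch below records, for each push_back, how many already-stored elements had to be copied into new storage. Counted and relocationsPerPush are hypothetical names invented for this example; the type has a counting copy constructor and no move constructor, so every relocation shows up as a copy.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical instrumented element type: counts copy constructions.
struct Counted {
    static std::size_t copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
};
std::size_t Counted::copies = 0;

// For each of n push_back calls, records how many existing elements were
// copied into new storage (0 when no reallocation happened).
std::vector<std::size_t> relocationsPerPush(std::size_t n) {
    std::vector<Counted> v;
    std::vector<std::size_t> relocated;
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t before = Counted::copies;
        v.push_back(Counted());
        // One copy always creates the new element; the rest are relocations.
        relocated.push_back(Counted::copies - before - 1);
    }
    return relocated;
}
```

Most entries of the result should be 0; only the few pushes that hit the capacity limit relocate all elements that existed at that point.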
std::vector is a dynamic array.
Every time there isn't enough memory allocated to store more elements, std::vector allocates a larger chunk of memory. After that it needs to copy (move) your elements to the new chunk, which implicitly calls the dtors of the elements in the old one.
To understand what is happening try to extend your class:
class Name
{
string _name;
public:
Name(const string& name) : _name(name) { cout << "ctor of " << _name << endl; }
Name(const Name& other) : _name(other._name) { cout << "copy ctor for " << _name << endl; }
~Name(){ cout << "dtor of " << _name << endl; }
};
Rerun your program now and see how many copy ctors were called.
Now try to call
vector<Name> list;
list.reserve(3);
/* push_backs */
and enjoy :)
You can find more in the reference documentation for std::vector and for its member function reserve.
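Combining reserve() with the emplace_back mentioned in the question, a sketch like the following should construct each element exactly once and never copy it. CountedName and buildWithReserveAndEmplace are hypothetical names for this illustration; the class is a variant of Name that uses counters instead of printing.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical variant of the Name class with counters instead of printing.
struct CountedName {
    static int ctors;
    static int copies;
    static int dtors;
    std::string name;
    explicit CountedName(std::string n) : name(std::move(n)) { ++ctors; }
    CountedName(const CountedName& other) : name(other.name) { ++copies; }
    ~CountedName() { ++dtors; }
};
int CountedName::ctors = 0;
int CountedName::copies = 0;
int CountedName::dtors = 0;

// Builds {"A", "B", "C"} in place inside storage reserved up front.
void buildWithReserveAndEmplace() {
    std::vector<CountedName> list;
    list.reserve(3);         // room for all three elements, so no reallocation
    list.emplace_back("A");  // constructed directly in the vector's storage
    list.emplace_back("B");
    list.emplace_back("C");
}  // list goes out of scope here and destroys its three elements
```

After one call, the counters should show three constructions, zero copies and three destructions, matching the emplace_back part of the output in the question.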

How do vector elements preserve their original address after a vector std::move?

As you can see in the output, the objects of the vector pre not only "moved" to the vector post, but also preserved their original address space in memory. What is really going on behind this move? Is this behaviour expected? Say I need to have a separate vector of pointers to these objects, is it safe to assume that after this move the objects will always have their original addresses?
Actually, I have a class containing a vector like this and the vector of pointers I mentioned as members. I have also deleted the copy ctors, and defined the move ones for the class.
#include <iostream>
#include <vector>
struct B {
int val = 0;
B(int aInt) : val(aInt) { };
};
int main() {
std::vector<B> pre;
pre.push_back(B(1));
pre.push_back(B(2));
std::cout << "pre-move:\t" << (void*)&pre.at(0) << '\n';
std::cout << "pre-move:\t" << (void*)&pre.at(1) << '\n';
std::vector<B> post(std::move(pre));
std::cout << "post-move:\t" << (void*)&post.at(0) << '\n';
std::cout << "post-move:\t" << (void*)&post.at(1) << '\n';
return 0;
}
Output:
pre-move: 0x1d7b150
pre-move: 0x1d7b154 <------|
post-move: 0x1d7b150 |
post-move: 0x1d7b154 <------|
A vector is basically nothing more than a pointer to heap-allocated memory, the current length and the current capacity of the vector.
By "moving" a vector, all you're doing is copying those values, and resetting the values of the moved-from vector.
For the data of the vector, it's basically equivalent to
original_pointer = some_place_in_memory;
new_pointer = original_pointer; // Copies the *value* of original_pointer
original_pointer = nullptr;
There's no need to allocate new memory and copy the data in the vector.
The whole point of the move operation is to avoid copying the elements, so if they got copied (there is no such thing as truly "moving" memory), the move would be just a copy.
Vectors are usually implemented as 3 pointers: begin, end and capacity. All point into a dynamically-allocated array. Moving the vector then just copies those three pointers, so the array and its elements simply change owner.
I think it should be safe to assume that pointers to the elements remain valid.
It will be clear, if we write semantically equal code without std::vector:
B* pre = new B[2]{B(0), B(0)}; // Declare std::vector<B> pre and allocate some space to make the following lines correct
pre[0] = B(1); // pre.push_back(B(1));
pre[1] = B(2); // pre.push_back(B(2));
B* post = pre; // std::vector<B> post(std::move(pre));
Actually, a vector move boils down to pointer copying without reallocation. The data which the pointer points at remains in its place, so the addresses of the vector elements do not change.
In this code example, after the fourth line, both pre and post point to the same data with the same address (a real vector additionally resets pre, so that only post owns the array).
std::vector is a wrapper for a pointer to array with some additional functionality. So after doing std::vector<B> post(std::move(pre));, post will contain a pointer with the same value which was in pre.
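A small check along these lines can confirm the behaviour for move construction with the default allocator. The function name moveKeepsElementAddresses is made up for this sketch, and int is used instead of B to keep it short.

```cpp
#include <utility>
#include <vector>

// Hypothetical check: does move-constructing a vector hand over the same
// heap array instead of copying the elements?
bool moveKeepsElementAddresses() {
    std::vector<int> pre;
    pre.push_back(1);
    pre.push_back(2);
    const int* before = pre.data();  // address of the first element

    std::vector<int> post(std::move(pre));  // takes ownership of pre's array

    // post now owns the original array; pre is left empty.
    return post.data() == before && post.size() == 2 && pre.empty();
}
```

Note that the pointers stay valid only as long as post itself does not reallocate: a later push_back on post that exceeds its capacity invalidates them again, so a separate vector of pointers into it needs the same care as before the move.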

Iterate through vector of objects c++

I've been struggling with this problem so I hope someone can help me.
I am creating a vector of objects and I would like to access the elements of this vector, but when I push back an element into the vector and try to access it, it doesn't work.
Thank you for your answers.
class reserve{
private:
vector<float> coordinates;
public:
reserve(const reserve &);
void addACoordinate(float);
void readCoordinates();
int getDim();
};
reserve::reserve(const reserve &w){
int i;
for(i=0; i<coordinates.size(); i++){
coordinates[i] = w.coordinates[i];
}
}
void reserve::addACoordinate(float c){
coordinates.push_back(c);
}
void reserve::readCoordinates(){
int i;
cout << "the coordinates are : ";
for(i=0; i<coordinates.size(); i++){
cout << coordinates[i] << " ";
}
cout << endl;
}
int reserve::getDim(){
return coordinates.size();
}
Then I create a reserve with 2 coordinates and push it into the reserves vector.
vector<reserve> reserves;
reserve r;
r.addACoordinate(1.9);
r.addACoordinate(1.9);
r.getDim();
reserves.push_back(r);
cout << "size of reserves " << reserves.size() << endl;
for (vector<reserve>::iterator k = reserves.begin(); k != reserves.end(); ++k) {
k->getDim();
}
But the output of the iteration is 0, instead of 2. I don't understand why I am not accessing the reserve r.
The shown copy constructor is completely broken:
reserve::reserve(const reserve &w){
int i;
for(i=0; i<coordinates.size(); i++){
The newly-constructed coordinates class member is always completely empty when the copy-constructor runs. It obviously won't magically have any content right from the start, so coordinates.size() will always be zero here. The copy constructor will never actually copy the element that's being used to copy from, and the alleged "copy" of it will always have an empty vector. And changing this to w.coordinates.size() will result in memory corruption, unless you also replace the assignment in the loop with a push_back(). Speaking of push_back()...
reserves.push_back(r);
push_back() attempts to copy r into the vector, here. That's its job. Guess what's going to be used to do that? Why, the broken copy constructor, of course. Even if r has an initialized coordinates vector, the broken copy constructor will ensure that reserves will end up with a class instance with a completely empty coordinates vector.
You can completely get rid of the copy constructor, here. It serves no useful purpose, and the default copy constructor will do everything correctly.
If you need to manually implement a copy constructor for your homework assignment, fix it so it properly push_back()s the values from the class instance being copied from, or manually copies the class members itself, whatever you need to do the job.
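If a hand-written copy constructor is required, a minimal sketch could copy the member in the initializer list. The defaulted default constructor is an assumption added here (declaring a copy constructor suppresses the implicit one, and the question's `reserve r;` needs it), and storedDim is a hypothetical helper that repeats the question's steps.

```cpp
#include <cstddef>
#include <vector>

class reserve {
    std::vector<float> coordinates;
public:
    reserve() = default;  // assumption: needed so that "reserve r;" compiles
    // Copies the member directly, so the new object starts out with the
    // same coordinates as w.
    reserve(const reserve& w) : coordinates(w.coordinates) {}
    void addACoordinate(float c) { coordinates.push_back(c); }
    std::size_t getDim() const { return coordinates.size(); }
};

// Hypothetical helper: stores a two-coordinate reserve in a vector and
// returns the dimension of the stored copy.
std::size_t storedDim() {
    std::vector<reserve> reserves;
    reserve r;
    r.addACoordinate(1.9f);
    r.addACoordinate(1.9f);
    reserves.push_back(r);  // uses the copy constructor above
    return reserves.front().getDim();
}
```

With this version, storedDim() should return 2 instead of 0.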

Dynamic memory allocation in Vector

I have a doubt regarding memory allocation in vector(STL - C++). As far as I know, its capacity gets doubled dynamically every time the size of vector gets equal to its capacity. If this is the case, how come the allocation be continuous? How does it still allow to use the [] access operator for O(1) access just like arrays? Can anyone explain this behavior?
(List also has dynamic memory allocation but we cannot access its elements using [] access operator, how is it still possible with vector? )
#include<iostream>
#include<vector>
using namespace std;
int main() {
// your code goes here
vector<int> v;
for(int i=0;i<10;i++){
v.push_back(i);
cout<<v.size()<<" "<<v.capacity()<<" "<<"\n";
}
return 0;
}
Output:
1 1
2 2
3 4
4 4
5 8
6 8
7 8
8 8
9 16
10 16
As far as I know, its capacity gets doubled dynamically every time the size of vector gets equal to its capacity.
The capacity does not need to double like in your case; the growth factor is chosen by the implementation (1.5 is also common). So it may differ if you use another compiler or standard library.
If this is the case, how come the allocation be continuous?
The vector always keeps its elements in one contiguous block. When that block is full and cannot be extended, the vector has to move its data to a new, larger contiguous memory block which meets its size requirements. The old block will be marked as free, so that others can use it.
How does it still allow to use the [] access operator for O(1) access just like arrays?
Because the elements always sit in one contiguous block, element i is located at the address of the first element plus i * sizeof(T). So access is possible with the [] operator or with a pointer + offset, and it is O(1).
List also has dynamic memory allocation but we cannot access its elements using [] access operator, how is it still possible with vector?
A list (std::list for example) is totally different from a std::vector. A C++ std::list stores nodes, each holding the data, a pointer to the next node and a pointer to the previous node (doubly-linked list). So you have to walk through the list to get to the specific node you want.
Vectors work like said above.
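The "pointer + offset" claim can be checked directly. The sketch below (isContiguous and staysContiguousWhileGrowing are hypothetical helper names) verifies after every push_back that each element lives at the address of the first element plus its index, even across reallocations.

```cpp
#include <cstddef>
#include <vector>

// Checks that every element lives at "address of first element + index".
bool isContiguous(const std::vector<int>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (&v[i] != v.data() + i) {
            return false;
        }
    }
    return true;
}

// Grows a vector through several reallocations and checks the layout and
// the stored values after each push_back.
bool staysContiguousWhileGrowing(int n) {
    std::vector<int> v;
    for (int i = 0; i < n; ++i) {
        v.push_back(i);
        if (!isContiguous(v) || v.back() != i) {
            return false;
        }
    }
    return true;
}
```

A std::list offers no such guarantee, which is why it has no operator[].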
The vector has to store the objects in one contiguous memory area. Thus when it needs to increase its capacity, it has to allocate a new (larger) memory area (or expand the one it already has, if that's possible), and either copy or move the objects from the "old, small" area to the newly allocated one.
This can be made apparent by using a class with a copy/move constructor with some side effect:
#include <iostream>
#include <vector>
using std::cout;
using std::endl;
using std::vector;
#define V(p) static_cast<void const*>(p)
struct Thing {
Thing() {}
Thing(Thing const & t) {
cout << "Copy " << V(&t) << " to " << V(this) << endl;
}
Thing(Thing && t) /* noexcept */ {
cout << "Move " << V(&t) << " to " << V(this) << endl;
}
};
int main() {
vector<Thing> things;
for (int i = 0; i < 10; ++i) {
cout << "Have " << things.size() << " (capacity " << things.capacity()
<< "), adding another:\n";
things.emplace_back();
}
}
This will lead to output similar to
[..]
Have 2 (capacity 2), adding another:
Move 0x2b652d9ccc50 to 0x2b652d9ccc30
Move 0x2b652d9ccc51 to 0x2b652d9ccc31
Have 3 (capacity 4), adding another:
Have 4 (capacity 4), adding another:
Move 0x2b652d9ccc30 to 0x2b652d9ccc50
Move 0x2b652d9ccc31 to 0x2b652d9ccc51
Move 0x2b652d9ccc32 to 0x2b652d9ccc52
Move 0x2b652d9ccc33 to 0x2b652d9ccc53
[..]
This shows that, when adding a third object to the vector, the two objects it already contains are moved from one contiguous area (look at the addresses, which increase by 1, i.e. by sizeof(Thing)) to another contiguous area. Finally, when adding the fifth object, you can see that the third object was indeed placed directly after the second.
When does it move and when does it copy? During reallocation the move constructor is used only when it is marked as noexcept (or is implicitly noexcept, as compiler-generated move constructors usually are), or when the type cannot be copied at all. Otherwise the vector copies: if a move were allowed to throw halfway through, the vector could end up in a state where some of its objects are in the new memory area, but the rest are still in the old one. For the example above this means that the /* noexcept */ has to be uncommented to get "Move" rather than "Copy" lines.
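The rule can be observed with two otherwise identical types. This is a sketch under the assumption of a mainstream standard library (libstdc++, libc++ or MSVC); ThrowingMove, NoexceptMove and growByEmplacing are hypothetical names for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Move constructor may throw (no noexcept): reallocation is expected to copy.
struct ThrowingMove {
    static int copies;
    static int moves;
    ThrowingMove() {}
    ThrowingMove(const ThrowingMove&) { ++copies; }
    ThrowingMove(ThrowingMove&&) { ++moves; }
};
int ThrowingMove::copies = 0;
int ThrowingMove::moves = 0;

// Move constructor promises not to throw: reallocation is expected to move.
struct NoexceptMove {
    static int copies;
    static int moves;
    NoexceptMove() {}
    NoexceptMove(const NoexceptMove&) { ++copies; }
    NoexceptMove(NoexceptMove&&) noexcept { ++moves; }
};
int NoexceptMove::copies = 0;
int NoexceptMove::moves = 0;

// Default-constructs n elements in place, so every counted copy or move
// comes from a reallocation, not from inserting the new element.
template <class T>
void growByEmplacing(std::size_t n) {
    std::vector<T> v;
    for (std::size_t i = 0; i < n; ++i) {
        v.emplace_back();
    }
}
```

After growByEmplacing<ThrowingMove>(100) only the copy counter should be non-zero, and after growByEmplacing<NoexceptMove>(100) only the move counter.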
The question should be considered at 2 different levels.
From the standard's point of view, a vector is required to provide contiguous storage, so that the programmer can use the address of its first element as the address of the first element of an array. It is also required to let its capacity grow by reallocation when you add new elements, keeping the previous elements, although their addresses may change.
From an implementation point of view, it can try to extend the allocated memory in place and, if it cannot, allocate a brand new piece of memory and move or copy construct the existing elements into the newly allocated memory zone. The size increase is not specified by the standard and is left to the implementation. But you are right, doubling the allocated size each time is common.

Vector erase function deleting wrong object

I have a vector declared as:
vector<Curve> workingSet;
Curve is a class I've created that contains a string "name" and an array of structs, dynamically allocated by the constructor.
I have a loop that is supposed to delete 2 (out of 4) items from the vector.
vector<Curve>::iterator it = workingSet.begin();
//remove tbi's children from the working set
for ( ; it != workingSet.end();){
if (it->thisName == tbi.childAName ||it->thisName == tbi.childBName)
it= workingSet.erase(it);
else
++it;
}
When the debugger reaches the .erase(it) call I can see that the 'it' iterator is pointing to curve 2 in the vector. This is good; I want curve 2 to be removed from the vector.
I'm then taken by the debugger to the destructor (I have a breakpoint there), which presumably should be destructing curve 2. But when I look at the 'this' watch, I can see that the curve being destructed is curve 4! The destructor then proceeds to 'delete[]' the array in the object, as required and sets the array pointer to NULL.
When the debugger returns to the program, having completed the erase() call, I can see that curve 2 has been removed from the vector and curve 4 is still there. Curve 4's array pointer still points to the same location as before, but the memory has been deallocated and the array contains garbage.
Can anyone suggest why curve 4 is being messed with?
N.b. (1) There is a copy constructor for the curve class, which does a 'deep' copy.
N.b. (2) There is more to the class/program than I've mentioned here
Btw, the array pointers in curves 2 and 4 point to different locations and contain different values, according to the debugger.
Edit: I've now implemented copy assignment. Now the correct item seems to be being erased from the vector, but the wrong destructor is still being called! However, when the debugger returns to the array curve 4 is still intact.
When an item is erased from the vector, all elements behind it are shifted towards the front to fill the empty space. If your compiler doesn't support move yet, it's done by copying all the elements, and the last item in the vector, which is now copied to the item before it, is a duplicate and is getting deleted.
At least that's how it should work.
It appears to me that vector::erase is a pitfall for classes that own memory through a raw pointer but do not define a copy assignment operator (the Rule of Three). I see exactly the behavior you describe: the last element gets destructed twice (especially dangerous if your object has memory that is freed by the destructor) and the memory of the element that you removed is never freed. This happens because erase assigns each later element onto the one before it and then destroys the last slot; with the compiler-generated assignment operator only the pointer is copied, so two elements end up sharing one allocation.
Here is one way to fix the problem:
#include <iostream>
#include <vector>
#include <memory>
using namespace std;
class MyClass
{
public:
int *x;
MyClass(int x)
{
cout << "MyClass Constructor " << x << endl;
this->x = new int(x);
}
MyClass(const MyClass& obj)
{
this->x = new int(*obj.x);
cout << "copy constructor " << *this->x << endl;
}
~MyClass()
{
cout << "MyClass Destructor " << *x << endl;
delete x;
}
};
int main(int argc, char* argv[])
{
// incorrect
vector<MyClass> bad_vect;
for(int i=0;i<3;i++){
bad_vect.push_back(MyClass(i));
// causes a bunch of copying to happen.
// std::move does not seem to fix this either
// but in the end everything gets constructed as we'd like
}
cout << " ---- " << endl;
bad_vect.erase(bad_vect.begin() + 1); // we expect this to remove item with x = 1 and destruct it.
// but it does NOT do that, it does remove the item with x=1 from the vector
// but it destructs the last item in the vector, with x=2, clearly
// not what we want. The destructor for object with x=1 never gets called
// and the destructor for the last item gets called twice.
// The first time with x=2 and since that memory is freed, the 2nd time
// it prints garbage. Strangely the double-free doesn't crash the prog
// but I wouldn't count on that all the time.
// The cause is that MyClass has no copy assignment operator (Rule of Three)
cout << " ------------ " << endl;
// below fixes this
vector<unique_ptr<MyClass> >vect;
for(int i=0;i<3;i++){
unique_ptr<MyClass> ptr(new MyClass(i));
vect.push_back( move( ptr )); // move is required since unique_ptr can only have one owner
// or the single one-liner below
//vect.push_back( move( unique_ptr<MyClass>(new MyClass(i)) ));
}
// the above prints out MyClass Constructor 0,1,2, etc
vect.erase(vect.begin() + 1); // remove the 2nd element, ie with x=1
// the above prints out MyClass Destructor 1, which is what we want
for(auto& v : vect){
cout << *(v->x) << endl;
} // prints out 0 and 2
return 0; // since we're using smart pointers, the destructors for the
// remaining 2 objects are called. You could use regular pointers
// but you would have to manually delete everything. shared_ptr
// also works and you don't need the move(), but with a bit more overhead
}
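As an alternative that keeps the objects directly in the vector, the class can be given a copy assignment operator that copies the value instead of the pointer, which matches the asker's edit about implementing copy assignment. This is a sketch: Owner and valuesAfterErasingMiddle are hypothetical names, and the class is a reduced stand-in for MyClass.

```cpp
#include <vector>

// Hypothetical reduced version of MyClass that follows the Rule of Three.
class Owner {
    int* x;
public:
    explicit Owner(int v) : x(new int(v)) {}
    Owner(const Owner& o) : x(new int(*o.x)) {}  // deep copy
    Owner& operator=(const Owner& o) {
        if (this != &o) {
            *x = *o.x;  // copy the value; each object keeps its own allocation
        }
        return *this;
    }
    ~Owner() { delete x; }
    int value() const { return *x; }
};

// Stores 0, 1, 2, erases the middle element and returns what is left.
std::vector<int> valuesAfterErasingMiddle() {
    std::vector<Owner> v;
    for (int i = 0; i < 3; ++i) {
        v.push_back(Owner(i));
    }
    // erase assigns element 2 onto slot 1, then destroys the last slot.
    v.erase(v.begin() + 1);
    std::vector<int> out;
    for (const Owner& o : v) {
        out.push_back(o.value());
    }
    return out;
}
```

The destructor that runs during erase still belongs to the last slot, which is expected: its value has already been assigned to the slot before it, and with the deep-copying assignment no memory is shared, so nothing is freed twice.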