How do vector elements preserve their original address after a vector std::move? - c++

As you can see in the output, the objects of the vector pre not only "moved" to the vector post, but also preserved their original address space in memory. What is really going on behind this move? Is this behaviour expected? Say I need to have a separate vector of pointers to these objects, is it safe to assume that after this move the objects will always have their original addresses?
Actually, I have a class containing a vector like this and the vector of pointers I mentioned as members. I have also deleted the copy ctors, and defined the move ones for the class.
#include <iostream>
#include <vector>
struct B {
int val = 0;
B(int aInt) : val(aInt) { };
};
int main() {
std::vector<B> pre;
pre.push_back(B(1));
pre.push_back(B(2));
std::cout << "pre-move:\t" << (void*)&pre.at(0) << '\n';
std::cout << "pre-move:\t" << (void*)&pre.at(1) << '\n';
std::vector<B> post(std::move(pre));
std::cout << "post-move:\t" << (void*)&post.at(0) << '\n';
std::cout << "post-move:\t" << (void*)&post.at(1) << '\n';
return 0;
}
Output:
pre-move: 0x1d7b150
pre-move: 0x1d7b154 <------|
post-move: 0x1d7b150 |
post-move: 0x1d7b154 <------|

A vector is basically nothing more than a pointer to heap-allocated memory, the current length and the current capacity of the vector.
By "moving" a vector, all you're doing is copying those values, and resetting the values of the moved-from vector.
For the data of the vector, it's basically equivalent to
original_pointer = some_place_in_memory;
new_pointer = original_pointer; // Copies the *value* of original_pointer
original_pointer = nullptr;
There's no need to allocate new memory and copy the data in the vector.
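To make the pointer-copy picture concrete, here is a minimal sketch of an owning buffer whose move constructor does exactly what the three lines above describe. `ToyBuffer` and `toy_move_keeps_address` are hypothetical names made up for this illustration; this is not how any particular standard library implements std::vector.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical, heavily simplified stand-in for a vector of ints.
class ToyBuffer {
public:
    explicit ToyBuffer(std::size_t n) : data_(new int[n]()), size_(n) {}
    ~ToyBuffer() { delete[] data_; }  // delete[] on nullptr is a no-op

    // Copying and assignment are disabled to keep the sketch short.
    ToyBuffer(const ToyBuffer&) = delete;
    ToyBuffer& operator=(const ToyBuffer&) = delete;

    // Move constructor: copy the pointer and size, then reset the source.
    ToyBuffer(ToyBuffer&& other) noexcept
        : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    int* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    int* data_;
    std::size_t size_;
};

// Returns true if the elements stayed where they were across the move
// and the moved-from buffer was reset.
inline bool toy_move_keeps_address() {
    ToyBuffer pre(2);
    int* before = pre.data();
    ToyBuffer post(std::move(pre));
    return post.data() == before && pre.data() == nullptr && pre.size() == 0;
}
```

The heap array is never touched by the move; only the small handle object changes.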

The whole point of the move operation is to avoid copying the elements, so if they got copied (there is no such thing as truly "moving" memory), the move would be just a copy.
Vectors are usually implemented as three pointers: begin, end, and capacity. All point into a dynamically allocated array. Moving the vector just copies those three pointers, so the array and its elements simply change owner.
It should be safe to assume that pointers to the elements remain valid after move construction; they then refer to elements of the new vector. Move assignment is less clear-cut, because with allocators that compare unequal and do not propagate, the elements may have to be moved one by one.
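A small check of that assumption, as a sketch: it records the address of every element before the move and compares afterwards. The function name is hypothetical, and the result described here assumes the default allocator, as in the question.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Returns true if every pointer taken before the move still points at the
// corresponding element of the new vector.
inline bool pointers_survive_move_construction() {
    std::vector<int> pre{1, 2, 3};
    std::vector<const int*> addresses;
    for (const int& x : pre) addresses.push_back(&x);

    std::vector<int> post(std::move(pre));  // move construction, not assignment

    if (post.size() != addresses.size()) return false;
    for (std::size_t i = 0; i < post.size(); ++i) {
        if (&post[i] != addresses[i]) return false;
    }
    return true;
}
```

Keep in mind that the pointers are only as stable as the new vector's buffer: a later push_back() on post that triggers a reallocation invalidates them again.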

It will be clear, if we write semantically equal code without std::vector:
B* pre = new B[2]{B(0), B(0)}; // Declare std::vector<B> and allocate some space (placeholder values, because B has no default constructor)
pre[0] = B(1); // pre.push_back(B(1));
pre[1] = B(2); // pre.push_back(B(2));
B* post = pre; // std::vector<B> post(std::move(pre));
Actually, a vector move boils down to copying a pointer, without reallocation. The data the pointer points at remains in its place, so the addresses of the vector elements do not change.
In this code example after the fourth line, both pre and post point to the same data with same address.
std::vector is a wrapper for a pointer to array with some additional functionality. So after doing std::vector<B> post(std::move(pre));, post will contain a pointer with the same value which was in pre.

Related

C++ : Size of the array didn't change even after deleting it

Assuming I have this program below
The size of the vector is 3, and after deleting its elements and printing the size again, I still get 3.
Why is that?
#include <vector>
#include <iostream>
int main(){
std::vector<int*> v;
v.push_back(new int{1});
v.push_back(new int{2});
v.push_back(new int{3});
std::cout << v.size() << '\n';
for (auto iptr : v){
delete iptr;
}
std::cout << v.size() << '\n';
return 0;
}
Because you deleted only what the pointers pointed to, not the pointers stored in
vector<CPerson *> Persons;
As far as the vector is concerned, it still holds 3 pointers pointing somewhere. (Beware: you are no longer permitted to dereference those pointers; doing so invokes undefined behavior.)
You have a vector of pointers. Using delete on a pointer in that vector will delete the thing the pointer is pointing at. It does not remove the pointer itself from the vector, and so does not change the size of the vector. Why should it? How could it?
Here is the code you were looking for (I guess):
for (int i = 0; i < Persons.size(); i++)
{
CPerson *Student = Persons[i];
if (Student)
{
Persons.erase(Persons.begin() + i);
delete Student;
i--;
}
}
Note:
vector::erase is (one way) to remove items from a vector.
The cast is unnecessary, because your base class has a virtual destructor.
Erasing items from a vector while you are iterating through it is tricky, because the size of the vector changes when you erase an item and all the later items move down one place. That is why I have i-- to compensate; without it you will at least skip items and at worst iterate off the end of the vector.
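An alternative to the index compensation is to use the iterator that vector::erase returns. The following is only a sketch on a vector of int pointers; `delete_negative` and its "negative value" condition are made up for illustration, and the function assumes that no pointer in the vector is null.

```cpp
#include <vector>

// Frees and removes every element whose pointee is negative.
// Assumes all stored pointers are non-null and were allocated with new.
inline void delete_negative(std::vector<int*>& v) {
    for (auto it = v.begin(); it != v.end(); ) {
        if (**it < 0) {
            delete *it;        // free the pointee first
            it = v.erase(it);  // erase() returns the iterator after the erased element
        } else {
            ++it;              // advance only when nothing was erased
        }
    }
}
```

Because the loop advances only in the else branch, no element is skipped and the iterator never runs past end().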
When you delete the pointers that are in the vector, you only free the memory that you allocated for them with new. It does nothing about the pointers themselves, so you are cleaning up memory that lives outside the vector. The vector will clean itself up at the end of its scope (in this case, the end of the function). Just remember: the vector doesn't clean up memory you allocated with new; you have to do that yourself.
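If the vector is meant to own the objects, one option (not what the question's code does) is to store std::unique_ptr elements. The sketch below, with a made-up function name, shows both effects separately: releasing the pointees leaves the size unchanged, while clear() removes the elements.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Returns {size after freeing the pointees, size after clear()}.
inline std::pair<std::size_t, std::size_t> sizes_after_reset_and_clear() {
    std::vector<std::unique_ptr<int>> v;
    v.push_back(std::make_unique<int>(1));
    v.push_back(std::make_unique<int>(2));
    v.push_back(std::make_unique<int>(3));

    // Free each int. The (now null) smart pointers are still elements of v.
    for (auto& p : v) p.reset();
    const std::size_t after_reset = v.size();

    // Remove the elements themselves.
    v.clear();
    return {after_reset, v.size()};
}
```

With unique_ptr elements, calling clear() alone would also have freed the ints, so the explicit reset() loop is only there to mirror the delete loop in the question.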

How to dynamically store address of a QVector<T> in another QVector<QVector<T>*>?

I am working with large amounts of data, something like 100,000 double values, gathered every 100 milliseconds or so. I have to keep 25 or more of these acquisitions in my software at any given time. Both the data length of each acquisition and the number of acquisitions vary depending on the current situation. I do not want to store my data as QVector<QVector<double>>, because QVector stores its data in adjacent memory locations, and each addition of a QVector<double> to the QVector<QVector<double>> results in the entire thing being copied to a new location, causing huge lags (especially if there are already 20 QVector<double> present and I am adding the 21st).
My solution is to store the data as QVector<QVector<double>*>, i.e. for each data acquisition I just store the address of the QVector for the current acquisition. However, I want the data to be stored independently, i.e. as long as I don't clear the pointer I should be able to access the data. I am unable to figure out how to do this, and I have tried multiple methods. Let's say I declare QVector<QVector<double>*> myVec; in my .h file and I use a function to add new data to it on each acquisition like this (the function below is just for testing, which is why I am creating the data inside the function):
void MainWindow::addData()
{
myVec << &QVector<double>(0); // Creating a new empty vector and storing its location
*myVec[0] << 2.7 << 3.4 << 4.5;
}
This doesn't work, most of the time it throws an exception, and a few times it just shows some garbage value.
void MainWindow::addData()
{
QVector<double> tempVec; // Create a temporary vector
tempVec << 2.7 << 3.4 << 4.5;
myVec << &tempVec;
}
This also doesn't work, of course, because tempVec is destroyed as soon as addData() is exited.
void MainWindow::addData()
{
QVector<double> tempVec; // Create a temporary vector
tempVec << 2.7 << 3.4 << 4.5;
myVec << &QVector(tempVec);
}
I thought this would work as I am not storing the address of tempVec but rather copying tempVec in a new QVector and assigning its address to myVec, but it is also throwing an exception.
Is there any way I can store QVector purely as pointers inside another QVector ?
You're storing addresses to objects that are then destroyed. That means your vector of pointers contains dangling pointers:
void MainWindow::addData()
{
myVec << &QVector<double>(0);
// the temporary vector is destroyed here and myVec contains a
// dangling pointer
*myVec[0] << 2.7 << 3.4 << 4.5;
}
I'm surprised your compiler even accepts this code, since taking the address of a temporary should be a compilation error. Naming the temporary allows it to compile, but it's not any better, since you still get a dangling pointer:
void MainWindow::addData()
{
QVector<double> tempVec(0);
myVec << &tempVec;
*myVec[0] << 2.7 << 3.4 << 4.5;
} // <-- tempVec is destroyed here, so you get a dangling pointer
If you store a pointer to an object, you need to make sure this object still exists whenever you access that pointer.
One way around this would be to allocate each vector with new before storing its address. But then you need to remember to delete it when you no longer need the pointers. You can use a smart pointer instead. std::unique_ptr would do the job in this case, but unfortunately QVector cannot store non-copyable elements, so you'd need std::shared_ptr. This way, you don't need to store the pointers anywhere else but myVec:
#include <memory>
QVector<std::shared_ptr<QVector<double>>> myVec;
// ...
void MainWindow::addData()
{
myVec << std::make_shared<QVector<double>>(0);
*myVec.last() << 2.7 << 3.4 << 4.5; // fill the vector that was just added, not always myVec[0]
}
If you don't absolutely require QVector and can do with std::vector instead, I'd recommend that, since it can store std::unique_ptr elements just fine:
#include <memory>
#include <vector>
std::vector<std::unique_ptr<QVector<double>>> myVec;
// ...
void MainWindow::addData()
{
myVec.push_back(std::make_unique<QVector<double>>(0));
*myVec.back() << 2.7 << 3.4 << 4.5; // fill the vector that was just added, not always myVec[0]
}
If you really need to store raw pointers and can't use smart pointers, then I'm afraid you need to manage the pointed-to objects yourself by allocating them with new and then once you're done with myVec and no longer need it, clean up with delete:
QVector<QVector<double>*> myVec;
// ...
void MainWindow::addData()
{
myVec << new QVector<double>(0);
*myVec.last() << 2.7 << 3.4 << 4.5; // fill the vector that was just added, not always myVec[0]
}
// Cleanup code you need to call when you no longer need myVec
for (auto ptr : myVec) {
delete ptr;
}
Keep in mind that you also need to delete a pointer if you remove it from myVec or overwrite it with a different pointer. This is tedious and error-prone. When you forget to do it, you leak memory. So I recommend going the unique_ptr route instead, which will automatically delete the pointer for you.
Finally, and this might be preferable in your case, you can avoid pointers altogether and move the vector into myVec by casting it to an rvalue using std::move(). Not sure how well QVector works with move semantics, but std::vector does the job just fine, in which case you'd use std::vector for both "inner" and "outer" vectors. As an optimization, if you know the amount of doubles you need to store, you can pre-allocate with reserve() to avoid costly re-allocations when you add elements:
#include <utility> // std::move
#include <vector>
std::vector<std::vector<double>> myVec;
// ...
void MainWindow::addData()
{
std::vector<double> tmpVec;
tmpVec.reserve(amount_of_elements);
tmpVec.push_back(2.4);
tmpVec.push_back(3.7);
// etc until filled
myVec.push_back(std::move(tmpVec));
}
With moving, the doubles themselves are neither copied nor reallocated, provided the objects involved support move semantics; only the outer vector may still have to grow. As mentioned, std::vector supports this.
Very important: after an object has been "moved from" (tmpVec in this case), it's in a valid but unspecified state (usually empty.)
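The following sketch checks both claims on plain std::vector: the stored acquisition reuses the buffer that was filled, and the moved-from vector can be reused once it has been put back into a known state with clear(). `store_acquisition` and `stored_buffer_is_reused` are hypothetical helper names, not part of the code above.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper: takes ownership of one acquisition's samples.
inline void store_acquisition(std::vector<std::vector<double>>& store,
                              std::vector<double>&& samples) {
    store.push_back(std::move(samples));
}

// Returns true if the stored acquisition reuses the buffer that was filled.
inline bool stored_buffer_is_reused() {
    std::vector<std::vector<double>> store;

    std::vector<double> samples;
    samples.reserve(3);
    samples.push_back(2.7);
    samples.push_back(3.4);
    samples.push_back(4.5);
    const double* filled = samples.data();

    store_acquisition(store, std::move(samples));

    // The moved-from vector is valid but unspecified; clear() gives it a
    // known (empty) state so it can safely be filled again.
    samples.clear();

    return store.back().data() == filled &&
           store.back().size() == 3 &&
           samples.empty();
}
```

Taking the parameter as an rvalue reference forces callers to write std::move explicitly, which makes the ownership hand-over visible at the call site.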

"Moving" sequential containers to pointers

I'm building a buffer for network connections where you can either explicitly allocate memory or supply your own via some sequential container (e.g. std::vector, std::array). These memory chunks are stored in a list that we later use for read/write operations. (The chunks are needed to handle multiple read/write requests.)
I have a question about the last part: I want to make a pointer to the container's data and then tell the container not to care about its data anymore.
So, something like move semantics.
std::vector<int> v = {9,8,7,6,5,4,3,2,1,0};
std::vector<int> _v(std::move(v));
Where _v has all the values of v and v is left in a safe state.
The problem is that if I just make a pointer to v.data(), then once the container's lifetime ends, the data it points to is released along with the container.
For example:
// I would use span to make sure It's a sequential container
// but for simplicity i use a raw pointer
// gsl::span<int> s;
int *p;
{
std::vector<int> v = {9,8,7,6,5,4,3,2,1,0};
// s = gsl::make_span(v);
p = v.data();
}
for(int i = 0; i < 10; ++i)
std::cout << p[i] << " ";
std::cout << std::endl;
Now p points to garbage, but I need the memory previously owned by the vector.
I also tried v.data() = nullptr, but v.data() is an rvalue, so it can't be assigned to. Do you have any suggestions, or is this even possible?
edit.:
To make it more clear what i'm trying to achieve:
class readbuf_type
{
struct item_type // representation of a chunk
{
uint8_t * const data;
size_t size;
inline item_type(size_t psize)
: size(psize)
, data(new uint8_t[psize])
{}
template <std::ptrdiff_t tExtent = gsl::dynamic_extent>
inline item_type(gsl::span<uint8_t,tExtent> s)
: size(s.size())
, data(s.data())
{}
inline ~item_type()
{ delete[] data; }
};
std::list<item_type> queue; // contains the memory
public:
inline size_t read(uint8_t *buffer, size_t size); // read from queue
inline size_t write(const uint8_t *buffer, size_t size); // write to queue
inline void *get_chunk(size_t size)
{
queue.emplace_back(size);
return queue.back().data;
}
template <std::ptrdiff_t tExtent = gsl::dynamic_extent>
inline void put_chunk(gsl::span<uint8_t,tExtent> arr)
{
queue.emplace_back(arr);
}
} readbuf;
I have the get_chunk function, which basically just allocates memory of the given size, and I have put_chunk, which is what I'm struggling with. I need it because, as things stand, before you can write to this queue you have to allocate memory and then copy all the elements from the buffer (vector, array) you're writing from into the queue.
Something like:
std::vector<int> v = {9,8,7,6,5,4,3,2,1,0};
// instead of this
readbuf.get_chunk(v.size());
readbuf.write(v.data(), v.size());
// we want this
readbuf.put_chunk({v});
Since we're developing for distributed systems, memory is crucial, and that's why we want to avoid the unnecessary allocation and copying.
P.S. This is my first post, so sorry if I wasn't precise in the first place.
No, it is not possible to "steal" the buffer of the standard vector in the manner that you suggest - or any other standard container for that matter.
You've already shown one solution: Move the buffer into another vector, instead of merely taking the address (or another non-owning reference) of the buffer. Moving from the vector transfers the ownership of the internal buffer.
It would be possible to implement a custom vector class whose buffer could be stolen, but there is a reason why vector doesn't make it possible. It can be quite difficult to prove the correctness of your program if you release resources willy-nilly. Have you considered how to prevent the data from leaking? The solution above is much simpler and easier to verify for correctness.
Another approach is to restructure your program in such a way that no references to the data of your container outlive the container itself (or any invalidating operation).
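Applied to the question's queue, "move the buffer into another vector" could look roughly like the sketch below. `ChunkQueue` is a hypothetical simplification of the question's readbuf_type, in which each chunk is itself a std::vector<std::uint8_t>; it only covers callers that already hold a std::vector, since a std::array has no heap buffer to hand over.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <utility>
#include <vector>

// Hypothetical simplification of readbuf_type: every chunk is a vector,
// so ownership of a caller's buffer can be transferred by moving it.
class ChunkQueue {
public:
    // Allocates a fresh, zero-filled chunk and returns its storage.
    std::uint8_t* get_chunk(std::size_t size) {
        chunks_.emplace_back(size);
        return chunks_.back().data();
    }

    // Adopts the caller's buffer without copying its bytes and returns
    // a pointer to the adopted storage.
    const std::uint8_t* put_chunk(std::vector<std::uint8_t>&& bytes) {
        chunks_.push_back(std::move(bytes));
        return chunks_.back().data();
    }

    std::size_t chunk_count() const { return chunks_.size(); }

private:
    std::list<std::vector<std::uint8_t>> chunks_;
};
```

A call would then read `queue.put_chunk(std::move(v));`, after which v should be treated as empty. Using std::list for the chunks keeps each vector object in place when further chunks are added.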
Unfortunately the memory area of the vector cannot be detached from the std::vector object. The memory area can be freed and reallocated as soon as you insert data into the std::vector object. Therefore using this memory area later is not safe, unless you are sure that this particular std::vector object still exists and has not been modified.
The solution to this problem is to allocate a new memory area and copy the content of the vector to this newly allocated memory area. The newly allocated memory area can be safely accessed without worrying about the state of the std::vector object.
std::vector<int> v = {1, 2, 3, 4};
int* p = new int[v.size()];
std::memcpy(p, v.data(), sizeof(int) * v.size()); // std::memcpy is declared in <cstring>
Don't forget to delete the memory area after you are finished using this memory area.
delete [] p;
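The same copy can be written so that the delete[] cannot be forgotten. This is a sketch using std::unique_ptr<int[]> and std::copy instead of new and memcpy; `copy_out` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Copies the vector's elements into a separately owned array.
// The array is freed automatically when the returned unique_ptr dies.
inline std::unique_ptr<int[]> copy_out(const std::vector<int>& v) {
    auto p = std::make_unique<int[]>(v.size());
    std::copy(v.begin(), v.end(), p.get());
    return p;
}
```

Note that the array does not remember its own length, so the caller has to keep v.size() around separately.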
Your mistake is in thinking that the pointer "contains" memory. It doesn't contain anything, trash or ints or otherwise. It is a pointer. It points to stuff. You have deleted that stuff and not transferred it anywhere else, so it can't work any more.
In general, you will need a container to put this information in, be it another vector, or even your own hand-made array. Just having a pointer to data does not mean you have data.
Furthermore, since it is impossible to ask a vector to relinquish its buffer to a non-vector thing, a vector is really your only chance in this particular case. It's not quite clear why that's not a good enough solution for you. :)
Not sure what you are trying to achieve, but I would use move semantics like this:
#include <iostream>
#include <memory>
#include <vector>
int main() {
std::unique_ptr<std::vector<int>> p;
{
std::vector<int> v = {9,8,7,6,5,4,3,2,1,0};
p = std::make_unique<std::vector<int>>(std::move(v)); // moves v's buffer into the heap-allocated vector instead of copying it
}
for(int i = 0; i < 10; ++i)
std::cout << (*p)[i] << " ";
std::cout << std::endl;
return 0;
}

Does a vector re-create all object on a push_back()?

I noticed that whenever I push_back() an object into a vector, all existing objects in the vector seem to be "re-created".
To test this I made the test program below. It pushes a few instances of MyClass into a vector with push_back() and prints the addresses of the objects currently in the vector after each push_back(). The constructor and destructor print a message with the address of the object whenever an instance is created or destroyed.
#include "stdafx.h"
#include <vector>
#include <iostream>
using namespace std;
class MyClass {
public:
MyClass() {
cout << "Constructor: " << this << endl;
}
~MyClass() {
cout << "Destructor: " << this << endl;
}
};
void printAdresses(const vector<MyClass> & v) {
for (int i = 0; i < v.size(); i++) {
cout << &v[i] << endl;
}
}
int main()
{
vector<MyClass> v;
for (int i = 0; i < 4; i++) {
v.push_back(MyClass());
cout << "Contains objects: " << endl;
printAdresses(v);
cout << endl;
}
system("pause");
return 0;
}
Down below is the output I got from running the program. Every block is the output from one iteration of the loop.
Constructor: 00EFF6AF
Destructor: 00EFF6AF
Contains objects:
034425A8
Constructor: 00EFF6AF
Destructor: 034425A8
Destructor: 00EFF6AF
Contains objects:
034426F8
034426F9
Constructor: 00EFF6AF
Destructor: 034426F8
Destructor: 034426F9
Destructor: 00EFF6AF
Contains objects:
034424E8
034424E9
034424EA
Constructor: 00EFF6AF
Destructor: 034424E8
Destructor: 034424E9
Destructor: 034424EA
Destructor: 00EFF6AF
Contains objects:
034423F8
034423F9
034423FA
034423FB
Apart from the constructor and destructor of the item that was given as argument to push_back(), destructors of other objects are called during a push_back() (but no corresponding constructors?). Looking more closely at the addresses, we see that the destructors belong to the elements that were in the vector prior to the push_back(). Afterwards the vector contains objects with different addresses, but the same number of elements (plus the newly pushed one).
What all this tells me is that all objects in the vector are destroyed and re-created after each push_back. If this is the case, why is that? It seems really inefficient when a vector gets too large. If it's not the case, then what exactly happens?
Vector pre-allocates some space.
For example, it can start with a capacity of 2 elements. When you try to push the 3rd one, it reallocates for 4 elements. When you try to push the 5th element, it reallocates for 8 elements, and so on. (The exact growth pattern depends on the implementation.)
That's why you have a .size() that reflects how many elements you have and a .capacity() that reflects what is the capacity (how many elements can be stored before a reallocation is needed).
You can manually handle that capacity by calling .reserve()
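As a rough sketch of the effect of .reserve(), the hypothetical helper below counts how often the capacity changes while n elements are pushed. The exact count without reserve depends on the implementation's growth strategy, so only the reserved case has a fixed answer.

```cpp
#include <cstddef>
#include <vector>

// Counts how many times the capacity changes while pushing n elements.
inline std::size_t count_capacity_changes(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) v.reserve(n);  // one allocation up front

    std::size_t changes = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {  // a reallocation happened
            ++changes;
            last_capacity = v.capacity();
        }
    }
    return changes;
}
```

With reserve_first set to true the function returns 0, because the capacity is already at least n before the first push_back().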
To answer your original question, reallocation requires copying all the elements from the previous array to a bigger one.
This requires destroying the previous elements and creating new ones.
If you saw the destructors calls but not the matching constructor calls, this is because it does not use the default constructor, but the copy constructor:
MyClass(const MyClass&)
This constructor takes one object as a parameter and uses it to construct and initialize a new object.
Moreover, as mentioned in the comments: even though the reallocation/copy process can look expensive, on average it is not; push_back() is an amortized constant-time operation. This is because the cost of reallocating and copying is amortized over future insertions: the vector does not reallocate exactly the space needed, but more than necessary.
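One addition that goes slightly beyond the answer above: since C++11, a vector moves its elements during reallocation instead of copying them when the element type has a noexcept move constructor. The question's MyClass declares a destructor and therefore gets no implicit move constructor, which is why it is copied. The sketch below uses a hypothetical `Tracked` type with counters to make the difference visible.

```cpp
#include <vector>

// Hypothetical element type that counts how it gets constructed.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }  // noexcept lets vector move on reallocation
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Pushes four temporaries without reserving, so reallocations occur.
inline void push_four_tracked() {
    Tracked::copies = 0;
    Tracked::moves = 0;
    std::vector<Tracked> v;
    for (int i = 0; i < 4; ++i) {
        v.push_back(Tracked());
    }
}
```

After calling push_four_tracked(), Tracked::copies stays at 0 while Tracked::moves counts both the insertions and the relocations; the exact number of moves depends on how the implementation grows its capacity.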

Order of inserting elements into a 2D Vector in c++

I was practicing with C++ vectors, and found a problem when I was inserting elements into a 2D vector. In the following example:
#include <iostream>
#include <vector>
void fillVector(std::vector<std::vector<int> > &vec) {
std::vector<int> list1;
vec.push_back(list1);
list1.push_back(1);
list1.push_back(2);
std::vector<int> list2;
list2.push_back(3);
list2.push_back(4);
vec.push_back(list2);
return;
}
int main() {
std::vector<std::vector<int> > vect;
fillVector(vect);
std::cout << "vect size: " << vect.size() << std::endl;
for(int i = 0; i < vect.size(); i++) {
std::cout << "vect in size: " << vect.at(i).size() << std::endl;
}
}
the size of the first inner-list is 0, and the size of the second inner-list is 2. The only difference between list1 and list2 is that list1 is first inserted into the vec 2D vector, before elements are inserted into it, while elements are first inserted into list2, before it is itself inserted into the 2D vector. After returning from the function, the elements inserted into list1 are not printed, and its size remains the same.
I also attempted the first method with pointers instead,
std::vector<int> *list3 = new std::vector<int>();
vec.push_back(*list3);
list3->push_back(5);
list3->push_back(6);
However, the size of list3 when read from the calling function is still 0.
I don't understand the difference between the two approaches. Why does the list have to be appended after its elements are inserted?
It almost seems like you are expecting python-like behavior? In any case, in C++ the distinction between references, pointers, and values is very important.
Your fillVector function has the right idea, as it takes a reference to a 2D vector std::vector<std::vector<int> > &vec - notice the &. However, when you create list1, and use push_back() right away
std::vector<int> list1;
vec.push_back(list1);
you are pushing the empty vector. push_back() will create a copy of this vector, which will be contained in vect (in main), and is a completely separate vector from list1.
At this point, if you want to access the vector already pushed, you can use back(), which returns a reference to the last element in the vector vec, that is, the last one pushed.
vec.back().push_back(1);
vec.back().push_back(2);
You modify list2 before pushing it back, so when the copy is made, it is a copy of the already modified vector. Your attempt with list3 doesn't really change things much: you dereference the pointer when you push_back(), and a copy is made all the same. You could make vect a std::vector<std::vector<int>*>, but I'd strongly advise against it, as you would have to do manual memory management with new.
Note: While it's important for you to learn about them at some point, you should really try to avoid using pointers whenever possible, especially raw pointers (look at smart pointers instead). std::vector, and all other std containers I know of, do their own memory management; they are sure to do it more efficiently than you, and bug-free.
I would suggest that you simply work on the last vector pushed, as such:
void fillVector(std::vector<std::vector<int> > &vec) {
vec.push_back(std::vector<int>());
vec.back().push_back(1);
vec.back().push_back(2);
vec.push_back(std::vector<int>());
vec.back().push_back(3);
vec.back().push_back(4);
return;
}
as you can see it's pretty much the same code repeated twice, so you can easily loop to get this or other results.
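A looped version might look like the sketch below. `fill_rows` and its parameters are made-up names; the point is only that each row is filled through a reference to the element that already lives inside vec.

```cpp
#include <vector>

// Appends `rows` inner vectors holding `cols` consecutive integers each,
// always filling the row that is already stored inside `vec`.
inline void fill_rows(std::vector<std::vector<int>>& vec, int rows, int cols) {
    int next = 1;
    for (int r = 0; r < rows; ++r) {
        vec.emplace_back();                  // new empty row stored in vec
        std::vector<int>& row = vec.back();  // reference to the stored row, not a copy
        for (int c = 0; c < cols; ++c) {
            row.push_back(next++);
        }
    }
}
```

The reference `row` is re-taken in every iteration on purpose: adding another row to vec may reallocate the outer vector and would invalidate a reference obtained earlier.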
vector.push_back(var) makes a copy of var and inserts it into the vector. If you use push_back() on an empty list, it copies the empty list into the vector. Changing values in the list after this does not affect the copy that was inserted into the vector. This is because you are passing an actual object to push_back(), not a pointer to an object.
In the third example, you take a step in the right direction, but you de-reference the list before you pass it in, so push_back() makes a copy of what is at that address.
A simple solution to the problem is to always set your values before you insert the list into the vector.
If you wish to be able to change the values after the list is inserted, use vect.at(i).push_back(val) to add a value to the list at i.
You could also make the vector contain pointers to other vectors, rather than the vectors themselves:
void fillVector(std::vector<std::vector<int> *> &vec) {
std::vector<int> *list1 = new std::vector<int>(); //Remember to allocate memory since we're using pointers now
list1->push_back(1);
list1->push_back(2);
vec.push_back(list1); // Copy the pointer that is list1 into vec
std::vector<int> *list2 = new std::vector<int>();
vec.push_back(list2); // Copy the pointer that is list2 into vec
list2->push_back(3);
list2->push_back(4);
return;
}
int main() {
std::vector<std::vector<int> *> vect; // Vector of pointers to vectors
fillVector(vect);
std::cout << "vect size: " << vect.size() << std::endl;
for(int i = 0; i < vect.size(); i++) {
std::cout << "vect in size: " << vect.at(i)->size() << std::endl;
}
}
When you put something into a std::vector, the vector stores a copy. Manipulating the original will have no effect on the copy. If you put a pointer into a vector of pointers the vector still stores a copy of the pointer. Both the original and the copy in the vector point at the same memory, so you can manipulate the referenced data and see a change in the referenced data.
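The difference between the two kinds of copy can be seen side by side in this small sketch (the function name is hypothetical). The stored pointer refers to a local variable, which is fine here only because it is not used after the function returns.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Returns {size of the copy stored by value, size seen through the stored pointer}.
inline std::pair<std::size_t, std::size_t> value_vs_pointer_copy() {
    std::vector<int> original;
    std::vector<std::vector<int>> by_value;
    std::vector<std::vector<int>*> by_pointer;

    by_value.push_back(original);     // stores an independent copy of the (empty) vector
    by_pointer.push_back(&original);  // stores a copy of the address only

    original.push_back(1);
    original.push_back(2);

    // The copy is still empty; the pointer sees the modified original.
    return {by_value[0].size(), by_pointer[0]->size()};
}
```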
So...
std::vector<int> list1;
vec.push_back(list1);
list1.push_back(1);
list1.push_back(2);
puts a copy of the empty list1 into vec. Then copies of 1 and 2 are placed into the original list1. The copy of list1 in vec is unaffected.
Writing this as
std::vector<int> list1;
vec.push_back(list1);
vec.back().push_back(1);
vec.back().push_back(2);
will correct this. As will a slightly cleaner version
vec.push_back(std::vector<int>());
vec.back().push_back(1);
vec.back().push_back(2);
as it doesn't leave an unused list1 hanging around cluttering up the scope.
And
vec.push_back(std::vector<int>{1,2});
will simplify even further if your compiler supports C++11 or better.
On the other hand...
std::vector<int> list2;
list2.push_back(3);
list2.push_back(4);
vec.push_back(list2);
puts copies of 3 and 4 into list2 and then puts a copy of list2, complete with copies of the copies of 3 and 4, into vec.
Similar to above,
std::vector<int> list2{3,4};
vec.push_back(list2);
can reduce the workload.
Unfortunately your experiment with list3 fails because while list3 is a pointer vec does not hold a pointer, so list3 is dereferenced and the vector referenced is copied. No pointer to the data list3 references is stored, and vec contains an empty vector for the same reason as above.
std::vector<int> *list3 = new std::vector<int>();
vec.push_back(*list3);
list3->push_back(5);
list3->push_back(6);
A few notes on storing pointers in vectors
The vector only stores a copy of the pointer. The data pointed at must be scoped in such a way that it will not be destroyed before the vector is done with it. One solution is to dynamically allocate the storage.
(This applies to dynamically allocated pointers in general) If you dynamically allocate, sooner or later someone has to clean up the mess and delete those pointers. Look into storing smart pointers rather than raw pointers, or not storing pointers at all.
Familiarize yourself with the Rule of Three. vector looks after itself, but if you have two copies of the vector and you remove and delete a pointer from only one of them, you're going to have some debugging to do.
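As a sketch of the smart-pointer suggestion, the question's fillVector could be written with std::unique_ptr elements. `Rows` and `fill_owned` are hypothetical names; the raw pointer list1 is only a non-owning observer, and nothing has to be deleted by hand.

```cpp
#include <memory>
#include <vector>

using Rows = std::vector<std::unique_ptr<std::vector<int>>>;

inline void fill_owned(Rows& vec) {
    vec.push_back(std::make_unique<std::vector<int>>());
    std::vector<int>* list1 = vec.back().get();  // non-owning observer

    // Growing vec may move the unique_ptr objects around, but the inner
    // vector they own stays at the same heap address.
    vec.push_back(std::make_unique<std::vector<int>>(std::vector<int>{3, 4}));

    // So list1 is still valid here, and the insertion order no longer matters.
    list1->push_back(1);
    list1->push_back(2);
}
```

Because unique_ptr cannot be copied, a Rows object cannot be copied by accident either, which sidesteps the double-delete scenario described in the last note.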