I filled a vector with A objects, then stored the addresses of these objects in a multimap [1], but the printed output shows that the addresses of the objects stored in the vector changed [2]. Why does this happen, and how can I avoid it?
//[1]
vector<A> vec;
multimap<const A*, const double> mymultimap;
for (const auto &a : A) {
    double val = a.value();
    vec.push_back(a);
    mymultimap.insert(std::pair<const A*, const double>(&vec.back(), val));
    // displaying addresses while storing them
    cout << "test1: " << &vec.back() << endl;
}
//[2]
// displaying addresses after storing them
for (auto &i : vec)
    cout << "test2: " << &i << endl;
Results:
test1: 0x7f6a13ab4000
test1: 0x7f6a140137c8
test2: 0x7f6a14013000
test2: 0x7f6a140137c8
You are calling vec.push_back(a) inside your for loop, so the vector may reallocate its underlying array when it runs out of space. When that happens, the existing elements are copied to the new memory location and the addresses you stored for them are no longer valid.
For example, say the vector has capacity for 3 elements and you stored their addresses. Pushing back a 4th element forces the vector to reallocate: the first 3 elements are copied to a new location and the 4th is added after them. The addresses you stored for the first 3 are therefore now invalid.
Iterators (as well as references and object addresses) are not guaranteed to be preserved when you call vector<T>::push_back(). If the new size() is greater than the current capacity(), a reallocation happens and all the elements are moved or copied to the new location.
To avoid this, you can call reserve() with the final number of elements before you start inserting.
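As a rough sketch of the reserve() approach, assuming the number of elements is known up front: the helper name collect_addresses and the use of int instead of the question's class A are assumptions made only for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the question): copies `values` into `vec`
// and returns the address of each element as it is stored.
// Because capacity is reserved before the first push_back, none of the
// push_back calls below can trigger a reallocation.
std::vector<const int*> collect_addresses(std::vector<int>& vec,
                                          const std::vector<int>& values) {
    vec.clear();
    vec.reserve(values.size());        // capacity() >= values.size() from here on
    std::vector<const int*> addresses;
    for (int v : values) {
        vec.push_back(v);              // new size() never exceeds capacity()
        addresses.push_back(&vec.back());
    }
    return addresses;
}
```

The stored addresses stay valid only as long as nothing later pushes the vector past the reserved capacity.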
One of the main features of std::vector is that it stores its elements in contiguous memory (which is great for performance when you visit vector's items on modern CPUs).
The downside of that is when the pre-allocated memory for the vector is full and you want to add a new item (e.g. calling vector::push_back()), the vector has to allocate another chunk of contiguous memory, and copy/move the data from the previous location to the new one. As a consequence of this re-allocation and copy/move, the address of the old items can change.
If for some reason you want to preserve the address of your objects, instead of storing instances of those objects inside std::vector, you may consider having a vector of pointers to objects. In this case, even after a reallocation, the addresses of the objects themselves won't change, because only the pointers are copied or moved.
For example, you may use shared_ptr for both the vector's items and the multimap's key:
vector<shared_ptr<const A>> vec;
multimap<shared_ptr<const A>, const double> mymultimap;
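A minimal sketch of how the two declarations above might be filled. The struct A here is a hypothetical stand-in with only a value() member, and store() is an illustrative helper name, not something from the question.

```cpp
#include <map>
#include <memory>
#include <vector>

// Hypothetical stand-in for the question's class A.
struct A {
    double v;
    double value() const { return v; }
};

using APtr = std::shared_ptr<const A>;

// Copies `a` onto the heap once and shares ownership between the vector
// and the multimap. The vector may reallocate its array of shared_ptrs,
// but the A object each one points to never moves.
void store(const A& a,
           std::vector<APtr>& vec,
           std::multimap<APtr, const double>& mymultimap) {
    APtr p = std::make_shared<A>(a);   // shared_ptr<A> converts to shared_ptr<const A>
    vec.push_back(p);
    mymultimap.emplace(p, p->value());
}
```

The multimap orders its keys by the address of the owned object, which matches the original design of keying on const A*.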
Let us say I create std::vector<int*> myVec; and reserve 100 entries and populate the vector with values so that all 100 elements have valid pointers. I then cache a pointer to one of the elements,
int * x = myVec[60];
Then, if I append another int * which triggers a resize along with a move due to heap fragmentation, does the previous pointer to a pointer become invalidated or does it point to the new location in memory?
If my memory serves me correctly, if the example were instead std::vector<int> myVecTwo with the same conditions as above and I stored a pointer like
int * x = &myVecTwo[60];
and proceeded to append and resize, that pointer would be invalidated.
So, my question is as follows. Would the pointer to the pointer become invalidated? I am no longer certain because of the new C++ std functionality of is_trivially_copyable and whether the std::vector makes use of such functionality or POD checks.
Would the pointer be invalidated in C++11?
No.
As you showed, after the reallocation, pointers to the elements of the vector, such as int * x = &myVecTwo[60];, would be invalidated. The elements themselves are copied during the reallocation, but the objects that those element pointers point to are not affected. So for int * x = myVec[60];, x still points to the same object after the reallocation; that object has nothing to do with the vector's own storage.
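The distinction can be sketched as follows. The function name and the separate storage vector that owns the ints are assumptions added for the example; the question does not say where its ints live.

```cpp
#include <cstddef>
#include <vector>

// Returns true if a cached copy of an element of a std::vector<int*>
// still points at the same live int after the vector has reallocated.
bool cached_pointee_survives_growth() {
    std::vector<int> storage(100);     // owns the ints; never resized here
    std::vector<int*> myVec;
    myVec.reserve(100);
    for (int i = 0; i < 100; ++i) {
        storage[i] = i;
        myVec.push_back(&storage[i]);
    }

    int* x = myVec[60];                // a copy of the element, not a pointer to it

    const std::size_t old_capacity = myVec.capacity();
    int extra = -1;
    while (myVec.capacity() == old_capacity) {
        myVec.push_back(&extra);       // keep appending until myVec reallocates
    }

    // The pointer values were copied to the new block; the ints did not move.
    return x == &storage[60] && *x == 60 && myVec[60] == x;
}
```

By contrast, an int** obtained with &myVec[60] before the loop would point into the old block and must not be used afterwards.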
I ran into an issue which I don't quite understand:
I am creating an object Edge with
edge_vec1.push_back(Edge(src,dest));
Then I want to keep a pointer to this Edge in a separate vector:
edge_vec2.push_back(&edge_vec1.back());
However, once I add the second Edge object, the pointer to the first Edge in edge_vec2 is invalidated (it gets some random data). Is it because the pointer in edge_vec2 actually points to some place in edge_vec1, and not the underlying element? I can avoid this by creating my Edge objects on the heap, but I'd like to understand what's going on.
Thank you.
When a new element is added to a vector, the vector may be reallocated, so previously obtained pointers to its elements can become invalid.
You should first reserve enough memory for the vector that holds the Edge objects, to prevent the reallocation:
edge_vec1.reserve( SomeMaxValue );
From http://en.cppreference.com/w/cpp/container/vector/push_back:
If the new size() is greater than capacity() then all iterators and references (including the past-the-end iterator) are invalidated. Otherwise only the past-the-end iterator is invalidated.
It's a bad idea to depend on pointers/references to objects in a vector when you are adding items to it. It is better to store the value of the index and then use the index to fetch the item from the vector.
edge_vec2.push_back(edge_vec1.size()-1);
Later, you can use:
edge_vec1[edge_vec2[i]]
for some valid value of i.
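Put together, the index-based approach could look roughly like this. The Edge definition and the Graph wrapper are hypothetical, since the question shows neither; only edge_vec1 and edge_vec2 come from the original.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical Edge; the question does not show its definition.
struct Edge {
    int src;
    int dest;
    Edge(int s, int d) : src(s), dest(d) {}
};

// Hypothetical wrapper holding both vectors from the question.
struct Graph {
    std::vector<Edge> edge_vec1;          // owns the edges
    std::vector<std::size_t> edge_vec2;   // positions in edge_vec1, not pointers

    void add(int src, int dest) {
        edge_vec1.push_back(Edge(src, dest));
        edge_vec2.push_back(edge_vec1.size() - 1);
    }

    // Looks the edge up again on every call, so a reallocation of
    // edge_vec1 between calls does not matter.
    const Edge& tracked(std::size_t i) const {
        return edge_vec1[edge_vec2[i]];
    }
};
```

Indices stay meaningful only while elements are not erased from or inserted into the middle of edge_vec1.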
The requirements of std::vector are that the underlying storage is a contiguous block of memory. As such, a vector has to reallocate all of its elements when you want to insert an element but the currently allocated block is not large enough to hold the additional element. When this happens, all iterators and pointers are invalidated, as the complete block is reallocated (moved) to a completely different part of memory.
The member function capacity() can be used to query how many elements the vector can hold without reallocating the underlying memory block. Example code querying this:
std::vector<int> vec;
for (int i = 0; i < 1000; i++) {
    bool still_has_space = vec.capacity() > vec.size();
    if (!still_has_space) std::cout << "Reallocating block\n";
    vec.push_back(i);
}
If the strong guarantee of a contiguous memory layout is not needed, you might be better off using std::deque instead of std::vector. It allows pushing elements onto either end without moving any other element, so pointers and references to existing elements stay valid (iterators are still invalidated). You trade this for slightly worse iteration speed.
std::deque<int> deq;
std::vector<int*> pointers;
for (int i = 0; i < 1000; i++) {
    deq.push_back(i);
    pointers.push_back(&deq.back());
}
for (auto p : pointers) std::cout << *p << "\n"; // Valid
When I push a pointer to an element of a vector that is stored in a map, the previously stored pointer becomes garbage, but the original object doesn't.
Here is the minimal code that reproduces the problem:
#include <iostream>
#include <vector>
#include <map>
#include <string>
class foo
{
private:
    std::map<std::string, std::vector<int>> _allObjs;
    std::vector<int*> _someObjs;
public:
    void addObj(const std::string &name, int obj)
    {
        _allObjs[name].push_back(obj);
        _someObjs.push_back(&_allObjs[name].back());
    }
    void doStuff()
    {
        for (auto &obj : _someObjs)
        {
            std::cout << *obj << std::endl;
        }
    }
};
int main()
{
    foo test;
    test.addObj("test1", 5);
    test.addObj("test1", 6);
    test.addObj("test2", 7);
    test.addObj("test2", 8);
    test.doStuff();
}
Expected Output
5
6
7
8
Actual Output
-572662307
6
-572662307
8
When debugging it I found the pointer becomes garbage as soon as I push the object to _allObjs in addObj. I have no idea what is causing this, so I can't be much help there. Thanks!
A vector stores its data in a contiguous block of memory.
When you want to store more than it currently has capacity for, it will allocate a new, larger contiguous block of memory, and copy/move all the existing elements from the previous block of memory into the new one.
When you store pointers to your ints (&_allObjs[name].back()), you're storing the memory address of the int in one of these blocks of memory.
As soon as the vector grows to a size where it needs to create additional space, all these stored addresses will point into deallocated memory. Accessing them is undefined behaviour.
Let us see what this reference page says about inserting new objects to a vector:
If the new size() is greater than capacity() then all iterators and references (including the past-the-end iterator) are invalidated. Otherwise only the past-the-end iterator is invalidated.
So, when you add a new object, the previously stored pointers that refer to objects in that same vector may become invalid i.e. they no longer point to valid objects (unless you have made sure that capacity is not exceeded, which you didn't).
The pointers in your
std::vector<int*> _someObjs;
aren't stable.
When you use
_allObjs[name].push_back(obj);
any addresses obtained earlier might be invalidated due to reallocation.
As written in the reference:
If the new size() is greater than capacity() then all iterators and references (including the past-the-end iterator) are invalidated. Otherwise only the past-the-end iterator is invalidated.
As others have rightfully mentioned, adding items to a vector may invalidate iterators and references when the vector resizes.
If you want to quickly alleviate the situation with minimal code changes, change from a
std::map<std::string, std::vector<int>>
to
std::map<std::string, std::forward_list<int>>
or
std::map<std::string, std::list<int>>
Since std::list and std::forward_list do not invalidate iterators and references when the list grows, the above scenario should work (the only iterators that are invalidated are those pointing to items that you've removed from the list).
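Note that the drawback is that a linked list does not store its items in contiguous memory, unlike std::vector, and std::list takes up more memory. Also, std::forward_list has no push_back() or back(), so with it the code would have to use push_front() and front() instead.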
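A sketch of the question's class with std::list swapped in. The values() member is an assumption added here so the stored pointers can be checked without printing; it is not part of the original code.

```cpp
#include <list>
#include <map>
#include <string>
#include <vector>

class foo {
private:
    // std::list nodes never move when other nodes are added, and
    // std::map never moves its mapped values on insertion either.
    std::map<std::string, std::list<int>> _allObjs;
    std::vector<int*> _someObjs;

public:
    void addObj(const std::string& name, int obj) {
        std::list<int>& objs = _allObjs[name];
        objs.push_back(obj);
        _someObjs.push_back(&objs.back());
    }

    // Hypothetical accessor: reads every stored pointer back into a vector.
    std::vector<int> values() const {
        std::vector<int> out;
        for (const int* p : _someObjs) {
            out.push_back(*p);
        }
        return out;
    }
};
```

With the four addObj calls from the question, values() should yield 5, 6, 7, 8 in insertion order.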
This: _allObjs[name].push_back(obj); (and the other push_back) potentially invalidates all iterators (and pointers) into the vector. You cannot assume anything about them afterwards.
Suppose I have the following:
struct Foo {
    Foo() : bar(NULL), box(true) {}
    Bar* bar;
    bool box;
};
and I declare the following:
std::vector<Foo> vec(3);
I have a function right now which does something like this:
Foo& giveFoo() {
    // finds a certain foo (at some index i) and does
    return vec[i];
}
Then the caller passes along the address of the Foo it obtains by reference as a Foo* to some other guy. What I'm wondering, however, is if this pointer to Foo will remain valid after a vector grow is triggered in vec? If the existing Foo elements in vec are copied over then presumably the Foo* that was floating around will now be dangling? Is this the case or not? I'm debugging an application but cannot reproduce this.
Any pointers or references to elements will be invalidated when the vector is reallocated, just like any iterators are.
The pointer will remain valid until you call a non-const member function on the vector that:
- causes its size to grow beyond its capacity (when that happens, the internal storage will be reallocated and all pointers and references to the elements will be invalidated), or
- inserts an element before the element that the pointer points to, or
- deletes the element from the vector, or
- deletes an element from the vector that was located before the element that the pointer points to.
The first two bullets can happen at the same time. The difference is that references/pointers to elements located before insertion point remain valid as long as the size doesn't exceed capacity.
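The case where an insertion does not reallocate can be sketched like this. The function name and the concrete values are illustrative assumptions.

```cpp
#include <vector>

// Returns true if a pointer to an element located before the insertion
// point is still valid after an insert that does not reallocate.
bool pointer_before_insertion_point_survives() {
    std::vector<int> v;
    v.reserve(8);                  // capacity() >= 8, so the insert below fits
    v.push_back(10);
    v.push_back(20);
    v.push_back(30);

    int* p = &v[0];                // element located before the insertion point

    v.insert(v.begin() + 2, 25);   // size becomes 4 <= capacity: no reallocation

    // p is still valid; a pointer to the old v[2] would have been
    // invalidated, so the shifted element is looked up by index instead.
    return p == &v[0] && *p == 10 && v[2] == 25 && v[3] == 30;
}
```

Without the reserve() call, the same insert could reallocate and invalidate p as well.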
Yes, it may become invalid: when a vector needs to grow beyond its reserved capacity, it allocates a larger internal storage (which is basically an array), copies or moves its previous contents there, and then frees the old storage.
If you are sure that the index stays the same, though, you can access the desired data through that index each time you need it.
Below is what the standard says about the validity of vector iterators for vector modifiers:
vector::push_back(), vector::insert(), vector::emplace_back(), vector::emplace():
Causes reallocation if the new size is greater than the old capacity. If no reallocation happens, all the iterators and references before the insertion point remain valid.
vector::erase():
Invalidates iterators and references at or after the point of the erase.
Any assumption beyond that is not safe.
I am doing something like:
struct ABC {
    int p, q, r;
};
struct X {
    ABC *abc;
    X(ABC &abc) : abc(&abc) {}
};
std::vector<ABC> vec;
... //populate vec
X x(vec[2]);
When I debug, x.abc looks correct directly after the assignment but then shortly afterwards the data in x.abc is garbage. It's making me think the pointer is to a local variable... but vector::operator[] returns a reference so is that possible?
Internally, the std::vector usually maintains a dynamic array of the elements it contains. If the vector's size grows too large and exceeds its old capacity, it allocates a new array, copies the old elements over, then deallocates the old array. As a result, any references or pointers into that old array become invalid and will result in undefined behavior if used.
If you want to store a pointer into a vector, you should be sure that the vector does not end up reallocating its internal buffer. You can do this either by waiting until you've added all the elements you're going to add to the vector before taking references, or by calling vector::reserve to ensure that the capacity is large enough.
Even better, though, would be to instead store a reference to the vector object itself along with the index, then look up the element at that index each time. That way, if the vector resizes its internal buffer, your pointers don't become garbage because you're re-indexing into the vector each time.
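One possible shape for the vector-plus-index idea, reusing the question's ABC and X. The owner and index members and the get() accessor are assumptions made for this sketch.

```cpp
#include <cstddef>
#include <vector>

struct ABC {
    int p, q, r;
};

// Instead of caching ABC*, X remembers which vector and which position.
struct X {
    std::vector<ABC>* owner;   // the vector object itself does not move
    std::size_t index;         // position of the element inside it

    X(std::vector<ABC>& vec, std::size_t i) : owner(&vec), index(i) {}

    // Re-indexes on every access, so it always sees the current buffer.
    ABC& get() const { return (*owner)[index]; }
};
```

This relies on the vector outliving X and on the element keeping its position, so erasing or inserting before that index would still break it.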
Hope this helps!