I've run into an issue that I don't quite understand:
I am creating an object Edge with
edge_vec1.push_back(Edge(src,dest));
Then I want to keep a pointer to this Edge in a separate vector:
edge_vec2.push_back(&edge_vec1.back());
However, once I add the second Edge object, the pointer to the first Edge in edge_vec2 is invalidated (it reads back as garbage). Is it because the pointer in edge_vec2 actually points into edge_vec1's storage rather than to a stable object? I can avoid this by creating my Edge objects on the heap, but I'd like to understand what's going on.
Thank you.
When a new element is added to a vector, the vector may reallocate its storage. So previously obtained pointers to elements of the vector can become invalid.
You should first reserve enough memory for the vector that holds the Edge objects, preventing the reallocation:
edge_vec1.reserve( SomeMaxValue );
From http://en.cppreference.com/w/cpp/container/vector/push_back:
If the new size() is greater than capacity() then all iterators and references (including the past-the-end iterator) are invalidated. Otherwise only the past-the-end iterator is invalidated.
It's a bad idea to depend on pointers/references to objects in a vector when you are adding items to it. It is better to store the value of the index and then use the index to fetch the item from the vector.
edge_vec2.push_back(edge_vec1.size()-1);
Later, you can use:
edge_vec1[edge_vec2[i]]
for some valid value of i.
The requirements of std::vector are that the underlying storage is a contiguous block of memory. As such, a vector has to reallocate all of its elements when you want to insert an element but the currently allocated block is not large enough to hold the additional element. When this happens, all iterators and pointers are invalidated, as the complete block is reallocated (moved) to a completely different part of memory.
The member function capacity can be used to query the maximum number of elements which can be inserted without reallocating the underlying memory block. Example code querying it:
std::vector<int> vec;
for (int i = 0; i < 1000; i++) {
    bool still_has_space = vec.capacity() > vec.size();
    if (!still_has_space) std::cout << "Reallocating block\n";
    vec.push_back(i);
}
In case the strong guarantee of a contiguous memory layout is not needed, you might be better off using std::deque instead of std::vector. It allows pushing elements at either end without moving any other element. You trade this for slightly slower iteration.
std::deque<int> deq;
std::vector<int*> pointers;
for (int i = 0; i < 1000; i++) {
    deq.push_back(i);
    pointers.push_back(&deq.back());
}
for (auto p : pointers) std::cout << *p << "\n"; // Valid
Related
Let us say I create std::vector<int*> myVec;, reserve 100 entries, and populate the vector with values so that all 100 elements hold valid pointers. I then cache one of the elements:
int * x = myVec[60];
Then, if I append another int * which triggers a resize along with a move due to heap fragmentation, does the previous pointer to a pointer become invalidated or does it point to the new location in memory?
If my memory serves me correctly, if the example were instead std::vector<int> myVecTwo under the same conditions as above and I stored a pointer like
int * x = &myVecTwo[60];
and proceeded to append and resize, that pointer would be invalidated.
So, my question is as follows. Would the pointer to the pointer become invalidated? I am no longer certain because of the new C++ std functionality of is_trivially_copyable and whether the std::vector makes use of such functionality or POD checks.
Would the pointer be invalidated in C++11?
No.
As you showed, after a reallocation a pointer to an element of the vector, like int * x = &myVecTwo[60];, would be invalidated. The reallocation copies (or moves) the elements themselves, i.e. the stored int* values, but the objects those pointers point to are unaffected. So for int * x = myVec[60];, x still points to the same object after the reallocation, which has nothing to do with the vector's storage.
I filled a vector with A objects, then stored these objects' addresses in a multimap [1], but the print output shows that the addresses of the objects stored in the vector changed [2]. Do you see why? And how can I avoid any changes?
//[1]
vector<A> vec;
multimap<const A*, const double > mymultimap;
for (const auto &a : inputAs) { // inputAs: some existing range of A objects
    double val = a.value();
    vec.push_back(a);
    mymultimap.insert(std::pair<const A*, const double >( &vec.back(), val));
    // displaying addresses while storing them
    cout << "test1: " << &vec.back() << endl;
}
//[2]
// displaying addresses after storing them
for (auto &i : vec)
    cout << "test2: " << &i << endl;
Results:
test1: 0x7f6a13ab4000
test1: 0x7f6a140137c8
test2: 0x7f6a14013000
test2: 0x7f6a140137c8
You are calling vec.push_back(a) within your for loop, so the vector may reallocate the underlying array if it runs out of space. When that happens, the elements are copied to a new memory location, and the addresses of the previous elements are no longer valid.
For example, say the vector has room for 3 elements and you stored their addresses. After pushing back a 4th element the vector has to reallocate: the first 3 elements are copied to a new location and the 4th is added after them. The addresses you had stored for the first 3 are now invalid.
Iterators (and references and object addresses) are not guaranteed to be preserved when you call vector<T>::push_back(). If the new size() is greater than the current capacity(), a reallocation happens and all the elements are moved or copied to the new location.
To avoid this, you can call reserve() before you start inserting.
One of the main features of std::vector is that it stores its elements in contiguous memory (which is great for performance when you visit vector's items on modern CPUs).
The downside of that is that when the pre-allocated memory for the vector is full and you want to add a new item (e.g. calling vector::push_back()), the vector has to allocate another chunk of contiguous memory and copy/move the data from the previous location to the new one. As a consequence of this reallocation and copy/move, the addresses of the old items can change.
If for some reason you want to preserve the address of your objects, instead of storing instances of those objects inside std::vector, you may consider having a vector of pointers to objects. In this case, even after the reallocation, the object pointers won't change.
For example, you may use shared_ptr for both the vector's items and the multimap's key:
vector<shared_ptr<const A>> vec;
multimap<shared_ptr<const A>, const double> mymultimap;
When I push a pointer to an object that lives in a vector inside a map, the previously stored pointer in my pointer vector becomes garbage, but the original object doesn't.
Here is the minimal code that reproduces the problem:
#include <iostream>
#include <vector>
#include <map>
#include <string>

class foo
{
private:
    std::map<std::string, std::vector<int>> _allObjs;
    std::vector<int*> _someObjs;

public:
    void addObj(const std::string &name, int obj)
    {
        _allObjs[name].push_back(obj);
        _someObjs.push_back(&_allObjs[name].back());
    }

    void doStuff()
    {
        for (auto &obj : _someObjs)
        {
            std::cout << *obj << std::endl;
        }
    }
};

int main()
{
    foo test;
    test.addObj("test1", 5);
    test.addObj("test1", 6);
    test.addObj("test2", 7);
    test.addObj("test2", 8);
    test.doStuff();
}
Expected Output
5
6
7
8
Actual Output
-572662307
6
-572662307
8
When debugging it I found the pointer becomes garbage as soon as I push the next object to _allObjs in addObj. I have no idea what is causing this, so I can't be much help there. Thanks!
A vector stores its data in a contiguous block of memory.
When you want to store more than it currently has capacity for, it will allocate a new, larger contiguous block of memory, and copy/move all the existing elements from the previous block of memory into the new one.
When you store pointers to your ints (&_allObjs[name].back()), you're storing the memory address of the int in one of these blocks of memory.
As soon as the vector grows to a size where it needs to allocate additional space, all these memory addresses point into the old, deallocated block. Accessing them is undefined behaviour.
Let us see what this reference page says about inserting new objects to a vector:
If the new size() is greater than capacity() then all iterators and references (including the past-the-end iterator) are invalidated. Otherwise only the past-the-end iterator is invalidated.
So, when you add a new object, the previously stored pointers that refer to objects in that same vector may become invalid i.e. they no longer point to valid objects (unless you have made sure that capacity is not exceeded, which you didn't).
The pointers in your
std::vector<int*> _someObjs;
aren't stable.
When you use
_allObjs[name].push_back(obj);
any addresses obtained earlier might be invalidated due to reallocation.
As written in the reference:
If the new size() is greater than capacity() then all iterators and references (including the past-the-end iterator) are invalidated. Otherwise only the past-the-end iterator is invalidated.
As others have rightfully mentioned, adding items to a vector may invalidate iterators and references when the vector resizes.
If you want to quickly alleviate the situation with minimal code changes, change from a
std::map<std::string, std::vector<int>>
to
std::map<std::string, std::forward_list<int>>
or
std::map<std::string, std::list<int>>
Since a std::list / std::forward_list does not invalidate iterators and references when resizing the list, then the above scenario should work (the only iterators that will be invalidated are ones pointing to items that you've removed from the list).
Example using std::list
Note that the drawback is that a linked list does not store items in contiguous memory, unlike std::vector, and std::list uses more memory per element.
This: _allObjs[name].push_back(obj); (and the other push_back) potentially invalidates all iterators (and pointers) into the vector. You cannot assume anything about them afterwards.
I have a snippet of code where I first put some values into a std::vector and then hand the address of each of them to one of the objects that will be using them, like this:
std::vector < useSomeObject > uso;
// uso is filled
std::vector < someObject > obj;
for (int i=0; i < numberOfDesiredObjects; ++i){
    obj.push_back(someObject());
    obj.back().fillWithData();
}
for (int i=0; i < numberOfDesiredObjects; ++i){
    uso[i].setSomeObject(&obj[i]);
}
// use objects via uso vector
// deallocate everything
Now, since I'm sometimes a bit of a style freak, I think this is ugly and would like to use only one for loop, like this:
for (int i=0; i < numberOfDesiredObjects; ++i){
    obj.push_back(someObject());
    obj.back().fillWithData();
    uso[i].setSomeObject(&obj.back());
}
Of course, I cannot do that, because reallocation happens occasionally and all the pointers I set become invalid.
So, my question is:
I know that std::vector.reserve() is the way to go if you know how much you will need and want to allocate the memory in advance. If I make sure that I am trying to allocate enough memory in advance with reserve(), does that guarantee that my pointers will stay valid?
Thank you.
Sidenote: this is a similar question, but it does not answer what I would like to know. Just to prevent it from popping up as the first comment on this question.
This is, in fact, one of the principal reasons for using reserve. You are guaranteed that appending to the end of an std::vector will not invalidate iterators, references or pointers to elements in the vector, as long as the new size of the vector does not exceed the old capacity.
If I make sure that I am trying to allocate enough memory in advance with reserve(), does that guarantee that my pointers will stay valid?
Yes, it guarantees your pointers will stay valid unless:
the size increases beyond the current capacity, or
you erase any elements (which you don't).
The iterator invalidation rules for a vector are specified in 23.2.4.3/1 and 23.2.4.3/3 as:
All iterators and references before the point of insertion are unaffected, unless the new container size is greater than the previous capacity (in which case all iterators and references are invalidated)
Every iterator and reference after the point of erase is invalidated.
I am currently filling a vector of elements like so:
std::vector<T*> elemArray;
for (size_t i = 0; i < elemArray.size(); ++i)
{
    elemArray[i] = new T();
}
The code has obviously been simplified. After asking another question (unrelated to this problem but related to the program) I realized I need an array of new'd objects (they can't be on the stack; with too many elements it would overflow) that are nevertheless contiguous. That is, if I were to receive an element without its array index, I should be able to recover the index by computing returnedElement - elemArray[0].
I hope I have explained the problem, if not, please let me know which parts and I will attempt to clarify.
EDIT: I am not sure why the highest voted answer is not being looked into. I have tried this many times. If I try allocating a vector like that with more than 100,000 (approximately) elements, it always gives me a memory error. Secondly, I require pointers, as is clear from my example. Changing it suddenly to not use pointers would require a large amount of code re-write (although I am willing to do that), and it still does not address the issue that allocating vectors like that with a few million elements does not work.
A std::vector<> stores its elements in a heap allocated array, it won't store the elements on the stack. So you won't get any stack overflow even if you do it the simple way:
std::vector<T> elemArray;
for (size_t i = 0; i < elemCount; ++i) {
    elemArray.push_back(T(i));
}
&elemArray[0] will be a pointer to a (contiguous) array of T objects.
If you need the elements to be contiguous, not the pointers, you can just do:
std::vector<T> elemArray(numberOfElements);
The elements themselves won't be on the stack, vector manages the dynamic allocation of memory and as in your example the elements will be value-initialized. (Strictly, copy-initialized from a value-initialized temporary but this should work out the same for objects that it is valid to store in a vector.)
I believe that your index calculation should be: &returnedElement - &elemArray[0] and this will work with a vector. Provided that returnedElement is actually stored in elemArray.
Your original loop should look something like this: (though it doesn't create objects in contiguous memory).
for (size_t i = 0; i < someSize; ++i)
{
    elemArray.push_back(new T());
}
And then you should know two basic things here:
elemArray.size() returns the number of elements elemArray currently holds. That means if you use it in the for loop's condition while pushing elements, the loop never terminates, because the vector's size keeps increasing along with i.
elemArray is a vector of T*, so it can store only T*, and to populate it you have to use the push_back function.
Considering your old code caused a stack overflow I think it probably looked like this:
Item items[2000000]; // stack overflow
If that is the case, then you can use the following syntax:
std::vector<Item> items(2000000);
This will allocate (and construct) the items contiguously on the heap.
I also had the same requirement, though for a different reason: keeping a small number of objects cache-hot. Two options:
int maxelements = 100000;
T *obj = new T[maxelements];
vector<T *> vectorofptr;
for (int i = 0; i < maxelements; i++)
{
    vectorofptr.push_back(&obj[i]);
}
or
int sz = sizeof(T);
int maxelements = 100000;
void *base = calloc(maxelements, sz); // need to save base for free(); note:
                                      // calloc does not run T's constructor,
                                      // so this is only valid for trivial types
vector<T *> vectorofptr;
int offset = 0;
for (int i = 0; i < maxelements; i++)
{
    // offset is in bytes, so advance through the block as char*
    vectorofptr.push_back((T *) ((char *) base + offset));
    offset += sz;
}
Allocate a large chunk of memory and use placement new to construct your vector elements in it:
elemArray[i] = new (GetNextContinuousAddress()) T();
(assuming that you really need pointers to individually new'ed objects in your array and are aware of the reasons why this is not recommended...)
I know this is an old post, but it might be useful information for anyone who needs it at some point. Storing pointers in a sequence container is perfectly fine, as long as you keep a few important caveats in mind:
the objects your stored pointers point to won't be contiguous in memory, since they were allocated (dynamically or not) elsewhere in your program; you end up with a contiguous array of pointers to data likely scattered across memory.
since STL containers copy values rather than hold references, your container won't take ownership of the allocated data: on destruction it only deletes the pointer objects, not the objects they point to. This is fine when those objects were not dynamically allocated; otherwise you need a release mechanism elsewhere for the pointed-to objects, like simply looping through your pointers and deleting them individually, or using shared pointers, which do the job for you when required ;-)
one last but very important thing to remember: if your container is used in a dynamic environment with insertions and deletions at random places, you need a stable container, so that your iterators/pointers to stored elements remain valid after such operations; otherwise you will end up with undesirable side effects. std::vector is not stable: it releases and reallocates extra space when inserting new elements and then copies the shifted elements (unless you insert at the end), invalidating element pointers/iterators. Some implementations do provide stability, like boost::stable_vector, but the penalty is that you lose the contiguity of the container (magic does not exist when programming, life is unfair, right? ;-))
Regards