I have a snippet of code where I first put some values into a std::vector and then give the address of each of them to one of the objects that will be using them, like this:
std::vector<useSomeObject> uso;
// uso is filled
std::vector<someObject> obj;
for (int i = 0; i < numberOfDesiredObjects; ++i) {
    obj.push_back(someObject());
    obj.back().fillWithData();
}
for (int i = 0; i < numberOfDesiredObjects; ++i) {
    uso[i].setSomeObject(&obj[i]);
}
// use objects via uso vector
// deallocate everything
Now, since I'm sometimes a little bit of a style freak, I think this is ugly and would like to use only one for loop, kind of like this:
for (int i = 0; i < numberOfDesiredObjects; ++i) {
    obj.push_back(someObject());
    obj.back().fillWithData();
    uso[i].setSomeObject(&obj.back());
}
Of course, I cannot do that, because reallocation happens occasionally and all the pointers I have set become invalid.
So, my question is:
I know that std::vector::reserve() is the way to go if you know how much you will need and want to allocate the memory in advance. If I make sure to allocate enough memory in advance with reserve(), does that guarantee that my pointers will stay valid?
Thank you.
Sidenote: this is a similar question, but it does not answer what I would like to know; I mention it just to keep it from popping up as the first comment on this question.
This is, in fact, one of the principal reasons for using reserve. You
are guaranteed that appending to the end of an std::vector will not
invalidate iterators, references or pointers to elements in the vector
as long as the new size of the vector does not exceed the old capacity.
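Applied to the code above, a minimal sketch of the single-loop version (assuming numberOfDesiredObjects is known before the loop, as in the question):
obj.reserve(numberOfDesiredObjects); // capacity fixed up front, so no reallocation below
for (int i = 0; i < numberOfDesiredObjects; ++i) {
    obj.push_back(someObject());
    obj.back().fillWithData();
    uso[i].setSomeObject(&obj.back()); // safe: obj never exceeds its reserved capacity
}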
If I make sure that I am trying to allocate enough memory in advance with reserve(), does that guarantee that my pointers will stay valid?
Yes, it guarantees your pointers will stay valid unless:
the size increases beyond the current capacity, or
you erase any elements at or before the pointed-to positions (which you don't).
The iterator invalidation rules for a vector are specified in 23.2.4.3/1 and 23.2.4.3/3 as:
All iterators and references before the point of insertion are unaffected, unless the new container size is greater than the previous capacity (in which case all iterators and references are invalidated)
Every iterator and reference after the point of erase is invalidated.
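As a small self-contained illustration of these two rules (my own sketch, not part of the quoted answer):
#include <cassert>
#include <vector>

int main() {
    std::vector<int> v;
    v.reserve(4);            // capacity is now at least 4
    v.push_back(10);
    int* p = &v[0];
    v.push_back(20);         // new size does not exceed the capacity: p stays valid
    assert(*p == 10);
    v.erase(v.begin() + 1);  // erasing after the point p refers to leaves it valid as well
    assert(*p == 10);
}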
I ran into an issue which I don't quite understand:
I am creating an object Edge with
edge_vec1.push_back(Edge(src,dest));
Then I want to keep a pointer to this Edge in a separate vector:
edge_vec2.push_back(&edge_vec1.back());
However, once I add the second Edge object, the pointer to the first Edge in edge_vec2 is invalidated (it contains some random data). Is it because the pointer in edge_vec2 actually points to some place in edge_vec1, and not to the underlying element? I can avoid this by creating my Edge objects on the heap, but I'd like to understand what's going on.
Thank you.
When a new element is added to a vector, the vector may reallocate its storage, so previously obtained pointers to its elements can become invalid.
You should first reserve enough memory for the vector whose elements you point into, preventing the reallocation:
edge_vec1.reserve( SomeMaxValue );
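A minimal sketch of the whole approach (the Edge fields, the loop bound, and the value of SomeMaxValue below are assumptions for illustration):
#include <cstddef>
#include <vector>

struct Edge {
    int src, dest;
    Edge(int s, int d) : src(s), dest(d) {}
};

int main() {
    const std::size_t SomeMaxValue = 1000;      // assumed upper bound on the number of edges
    std::vector<Edge>  edge_vec1;
    std::vector<Edge*> edge_vec2;
    edge_vec1.reserve(SomeMaxValue);            // no reallocation while size stays within this
    for (int i = 0; i < 1000; ++i) {
        edge_vec1.push_back(Edge(i, i + 1));
        edge_vec2.push_back(&edge_vec1.back()); // pointers remain valid
    }
}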
From http://en.cppreference.com/w/cpp/container/vector/push_back:
If the new size() is greater than capacity() then all iterators and references (including the past-the-end iterator) are invalidated. Otherwise only the past-the-end iterator is invalidated.
It's a bad idea to depend on pointers/references to objects in a vector when you are adding items to it. It is better to store the value of the index and then use the index to fetch the item from the vector.
edge_vec2.push_back(edge_vec1.size()-1);
Later, you can use:
edge_vec1[edge_vec2[i]]
for some valid value of i.
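Put together, a sketch of the index-based approach (reusing the illustrative Edge type from the sketch above):
std::vector<Edge> edge_vec1;
std::vector<std::size_t> edge_vec2;          // indices instead of pointers

edge_vec1.push_back(Edge(1, 2));
edge_vec2.push_back(edge_vec1.size() - 1);   // index of the element just added

// Indices survive reallocation (as long as nothing is erased or reordered):
Edge& e = edge_vec1[edge_vec2[0]];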
The requirements of std::vector are that the underlying storage is a contiguous block of memory. As such, a vector has to reallocate all of its elements when you want to insert an element but the currently allocated block is not large enough to hold the additional element. When this happens, all iterators and pointers are invalidated, because the complete block is reallocated (moved) to a completely different part of memory.
The member function capacity can be used to query how many elements the vector can hold without reallocating the underlying memory block. Example code querying this:
std::vector<int> vec;
for (int i = 0; i < 1000; i++) {
    bool still_has_space = vec.capacity() > vec.size();
    if (!still_has_space) std::cout << "Reallocating block\n";
    vec.push_back(i);
}
In case the strong guarantee of a contiguous memory layout is not needed, you might be better off using std::deque instead of std::vector. It allows pushing elements at either end without moving any other element. You trade this for slightly worse iteration speed.
std::deque<int> deq;
std::vector<int*> pointers;
for (int i = 0; i < 1000; i++) {
    deq.push_back(i);
    pointers.push_back(&deq.back());
}
for (auto p : pointers) std::cout << *p << "\n"; // Valid
When I do:
for (i = 0; i < size; i++) {
    // create objectA here
    vectorA.push_back(objectA);
    pvectorA.push_back(&vectorA[i]);
}
some elements of pvectorA are garbage. However, when I do:
for (i = 0; i < size; i++) {
    // create objectA here
    vectorA.push_back(objectA);
}
for (i = 0; i < size; i++) {
    pvectorA.push_back(&vectorA[i]);
}
Everything is okay. Why does this happen?
Read the documentation of std::vector::push_back
First the description:
Adds a new element at the end of the vector, after its current last element. The content of val is copied (or moved) to the new element.
This effectively increases the container size by one, which causes an automatic reallocation of the allocated storage space if -and only if- the new vector size surpasses the current vector capacity.
Then about validity of iterators:
If a reallocation happens, all iterators, pointers and references related to the container are invalidated.
So, when you add an object to the vector, all the pointers pointing to objects in that vector may become invalid - unless you've guaranteed that the vector has enough capacity with std::vector::reserve.
Invalid means that the pointer no longer points to a valid object and dereferencing it will have undefined behaviour.
In the latter code, you never add objects to the pointed-to vector after you've stored the pointers, so the pointers are valid.
When you use push_back to add an element to a vector, it is copied into the vector, so &vectorA[i] points into vectorA's own storage rather than at the original objectA. The problem in the first example is that a later push_back can force vectorA to reallocate that storage, leaving the previously stored pointers dangling at addresses that no longer hold the elements.
The second example works because no elements are added to vectorA after the addresses are taken, so no reallocation can invalidate them.
When you push elements into vectorA it will occasionally get full, and have to relocate its objects to a larger memory block. That will change the address of each element.
If pvectorA has stored pointers to the elements' original position, those pointers will still point to the old positions even after the vectorA elements have been moved to a new location.
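A short self-contained demonstration (my own example, not from the answer) that the elements' addresses change when the vector grows:
#include <cstdint>
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v;
    v.push_back(1);
    // Record the first element's address as an integer before any further growth.
    std::uintptr_t before = reinterpret_cast<std::uintptr_t>(&v[0]);
    for (int i = 0; i < 1000; ++i) v.push_back(i);   // almost certainly reallocates
    std::uintptr_t after = reinterpret_cast<std::uintptr_t>(&v[0]);
    std::cout << (before == after ? "same block\n" : "elements moved\n");
}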
When you do vectorA.push_back you may cause that vector to reallocate itself to increase capacity, which means all its contents are moved, which means any pointers to its contents that you have saved are made invalid.
Maybe you want to rethink the whole idea of storing pointers to elements of a vector.
If you can't drop the whole idea of storing pointers, but you know the required size in advance, you could use reserve before the loop:
vectorA.reserve(size);
for (i = 0; i < size; i++) {
    // create objectA here
    vectorA.push_back(objectA);
    pvectorA.push_back(&vectorA[i]);
}
In this version, the pointers are valid until you either grow vectorA further or destroy it.
This is because the vector reallocates its internal storage when it grows beyond its current capacity; after the reallocation, the addresses of its elements may have changed.
You can avoid reallocation by reserving a big enough storage beforehand:
vectorA.reserve(size);
//pvectorA.reserve(size); // reserving the pointer vector would not hurt either
for (i = 0; i < size; i++) {
    // create objectA here
    vectorA.push_back(objectA);
    pvectorA.push_back(&vectorA[i]);
}
One final note: if you can use C++11, emplace_back constructs your object in place, hence without copying.
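For instance, a sketch under the assumption that the element type is default-constructible (the names are reused from the question):
vectorA.reserve(size);
pvectorA.reserve(size);                      // optional: avoids growing the pointer vector too
for (size_t i = 0; i < size; ++i) {
    vectorA.emplace_back();                  // constructs the element in place
    pvectorA.push_back(&vectorA.back());
}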
If you want a sequence container with constant-time insertion and no invalidation of iterators pointing to other elements, use a std::list. Note however that std::vector is often the fastest data structure to use (in some cases, you need a pre-sorted std::vector). One prominent reason for this is that arrays are more cache friendly than e.g. trees or linked lists.
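As an illustration of that alternative (my own sketch), pointers into a std::list stay valid as the list grows:
#include <iostream>
#include <list>
#include <vector>

int main() {
    std::list<int> items;
    std::vector<int*> ptrs;
    for (int i = 0; i < 1000; ++i) {
        items.push_back(i);
        ptrs.push_back(&items.back());   // list nodes are never relocated
    }
    std::cout << *ptrs.front() << ' ' << *ptrs.back() << '\n';   // prints 0 999
}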
I'm looking for a way to keep std::vector/std::string from reallocating within an expected range of sizes (say I want to assume that a string will hold around 64 characters, but it can grow if needed). What's the best way to achieve this?
Look at the .reserve() member function. The standard docs at the SGI site say
[4] Reserve() causes a reallocation manually. The main reason for
using reserve() is efficiency: if you know the capacity to which your
vector must eventually grow, then it is usually more efficient to
allocate that memory all at once rather than relying on the automatic
reallocation scheme. The other reason for using reserve() is so that
you can control the invalidation of iterators. [5]
[5] A vector's iterators are invalidated when its memory is
reallocated. Additionally, inserting or deleting an element in the
middle of a vector invalidates all iterators that point to elements
following the insertion or deletion point. It follows that you can
prevent a vector's iterators from being invalidated if you use
reserve() to preallocate as much memory as the vector will ever use,
and if all insertions and deletions are at the vector's end.
That said, as a general rule unless you really know what is going to happen, it may be best to let the STL container deal with the allocation itself.
You reserve space for a vector or string with their reserve(size_type capacity) member function. But it doesn't prevent the container from anything :). You're just telling it to allocate at least that much uninitialized memory up front (that is, no constructors of your type will be called); it will still grow beyond that if needed.
std::vector<MyClass> v;
v.reserve(100); // no constructor of MyClass is called
for (int i = 0; i < 100; ++i)
{
    v.push_back(MyClass()); // no reallocation will happen; there is enough space in the vector
}
For vector:
std::vector<char> v;
v.reserve(64);
For string:
std::string s;
s.reserve(64);
Where's your C++ Standard Library reference got to?
Both of them have a member function called reserve which you can use to reserve space:
c.reserve(100); // where c is a vector (or a string)
I am currently filling a vector of elements like so:
std::vector<T*> elemArray;
for (size_t i = 0; i < elemArray.size(); ++i)
{
    elemArray[i] = new T();
}
The code has obviously been simplified. Now after asking another question (unrelated to this problem but related to the program) I realized I need an array that has new'd objects (can't be on the stack, will overflow, too many elements) but are contiguous. That is, if I were to receive an element, without the array index, I should be able to find the array index by doing returnedElement - elemArray[0] to get the index of the element in the array.
I hope I have explained the problem, if not, please let me know which parts and I will attempt to clarify.
EDIT: I am not sure why the highest voted answer is not being looked into. I have tried this many times. If I try allocating a vector like that with more than 100,000 (approximately) elements, it always gives me a memory error. Secondly, I require pointers, as is clear from my example. Changing it suddenly to not use pointers would require a large amount of code rewrite (although I am willing to do that), but it still does not address the issue that allocating vectors like that with a few million elements does not work.
A std::vector<> stores its elements in a heap-allocated array; it won't store the elements on the stack. So you won't get any stack overflow even if you do it the simple way:
std::vector<T> elemArray;
for (size_t i = 0; i < elemCount; ++i) {
    elemArray.push_back(T(i));
}
&elemArray[0] will be a pointer to a (contiguous) array of T objects.
If you need the elements to be contiguous, not the pointers, you can just do:
std::vector<T> elemArray(numberOfElements);
The elements themselves won't be on the stack, vector manages the dynamic allocation of memory and as in your example the elements will be value-initialized. (Strictly, copy-initialized from a value-initialized temporary but this should work out the same for objects that it is valid to store in a vector.)
I believe that your index calculation should be: &returnedElement - &elemArray[0] and this will work with a vector. Provided that returnedElement is actually stored in elemArray.
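A minimal sketch of that calculation (T here is a placeholder struct):
#include <cstddef>
#include <iostream>
#include <vector>

struct T { int value; };

int main() {
    std::vector<T> elemArray(1000);              // contiguous, heap-allocated elements
    const T& returnedElement = elemArray[42];    // an element handed back by reference
    std::ptrdiff_t index = &returnedElement - &elemArray[0];
    std::cout << index << '\n';                  // prints 42
}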
Your original loop should look something like this (though it doesn't create the objects in contiguous memory):
for (size_t i = 0; i < someSize; ++i)
{
    elemArray.push_back(new T());
}
And then you should know two basic things here:
elemArray.size() returns the number of elements elemArray currently holds. That means that if you use it in the for loop's condition while also adding elements inside the loop, the bound changes on every iteration, so the loop will not behave as you intend (starting from a non-empty vector it would never terminate, because the size keeps increasing).
elemArray is a vector of T*, so you can store only T* in it, and to populate it you have to use the push_back function.
Considering your old code caused a stack overflow I think it probably looked like this:
Item items[2000000]; // stack overflow
If that is the case, then you can use the following syntax:
std::vector<Item> items(2000000);
This will allocate (and construct) the items contiguously on the heap.
I also have the same requirement, though for a different reason: keeping a small number of objects cache-hot. One way is to allocate all the objects in a single block and store pointers to them:
int maxelements = 100000;
T *obj = new T[maxelements];   // one contiguous block of default-constructed objects
vector<T *> vectorofptr;
for (int i = 0; i < maxelements; i++)
{
    vectorofptr.push_back(&obj[i]);
}
or
int sz = sizeof(T);
int maxelements = 100000;
void *base = calloc(maxelements, sz); // need to save base for free()
vector<T *> vectorofptr;
int offset = 0;
for (int i = 0; i < maxelements; i++)
{
    // advance by bytes first, then convert; (T *) base + offset would step in units of sizeof(T)
    vectorofptr.push_back((T *) ((char *) base + offset));
    offset += sz;
}
// Note: calloc only zero-fills the block; for a non-trivial T the objects still need to be
// constructed, e.g. with placement new as in the next answer.
Allocate a large chunk of memory and use placement new to construct your vector elements in it:
elemArray[i] = new (GetNextContinuousAddress()) T();
(assuming that you really need pointers to individually new'ed objects in your array and are aware of the reasons why this is not recommended.)
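A hedged sketch of that idea (the helper below is illustrative, not a known library function; it assumes T is default-constructible and has fundamental alignment):
#include <cstddef>
#include <new>
#include <vector>

template <typename T>
std::vector<T*> buildContiguous(std::size_t count) {
    // One raw block large enough for `count` objects.
    T* storage = static_cast<T*>(::operator new(count * sizeof(T)));
    std::vector<T*> ptrs;
    ptrs.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        ptrs.push_back(new (storage + i) T());   // placement new: construct in the block
    return ptrs;
    // The caller must later call ~T() on each element and ::operator delete(storage).
}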
I know this is an old post, but it might be useful to anyone who needs it at some point. Storing pointers in a sequence container is perfectly fine as long as you keep a few important caveats in mind:
The objects your stored pointers point to won't be laid out contiguously in memory, since they were allocated (dynamically or not) elsewhere in your program; you will end up with a contiguous array of pointers pointing to data that is likely scattered across memory.
Since STL containers store copies of values rather than references, your container won't take ownership of the allocated data: on destruction it will only delete the pointer objects, not the objects they point to. This is fine when those objects were not dynamically allocated; otherwise you need to provide a release mechanism elsewhere for the pointed-to objects, for example by looping through the pointers and deleting them individually, or by using shared pointers, which will do the job for you when required (see the sketch after this list).
One last but very important thing to remember: if your container is used in a dynamic environment, with possible insertions and deletions at random places, you need to use a stable container so that your iterators/pointers to stored elements remain valid after such operations; otherwise you will end up with undesirable side effects. This is the case with std::vector, which will release and reallocate extra space when inserting new elements and then copy the shifted elements (unless you insert at the end), thus invalidating element pointers/iterators. Some containers provide stability, like boost::stable_vector, but the penalty is that you lose the contiguity of the container (magic does not exist when programming, life is unfair, right? ;-))
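For the ownership caveat above, a brief sketch using std::shared_ptr (my own illustration; Node is a placeholder type):
#include <memory>
#include <vector>

struct Node { int value; };

int main() {
    std::vector<std::shared_ptr<Node>> nodes;
    nodes.push_back(std::make_shared<Node>(Node{1}));
    nodes.push_back(std::make_shared<Node>(Node{2}));
    // When `nodes` is destroyed, the shared_ptrs delete the Nodes automatically.
}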