c++ garbage values in vector of pointer - c++

When I do:
for(i=0; i<size; i++){
//create objectA here
vectorA.push_back(objectA);
pvectorA.push_back(&vectorA[i]);
}
some elements of pvectorA are garbage. However, when I do:
for(i=0; i<size; i++){
//create objectA here
vectorA.push_back(objectA);
}
for(i=0; i<size; i++){
pvectorA.push_back(&vectorA[i]);
}
Everything is okay. Why does this happen?

Read the documentation of std::vector::push_back
First the description:
Adds a new element at the end of the vector, after its current last element. The content of val is copied (or moved) to the new element.
This effectively increases the container size by one, which causes an automatic reallocation of the allocated storage space if -and only if- the new vector size surpasses the current vector capacity.
Then about validity of iterators:
If a reallocation happens, all iterators, pointers and references related to the container are invalidated.
So, when you add an object to the vector, all the pointers pointing to objects in that vector may become invalid - unless you've guaranteed that the vector has enough capacity with std::vector::reserve.
Invalid means that the pointer no longer points to a valid object and dereferencing it will have undefined behaviour.
In the latter code, you never add objects to the pointed-to vector after you've stored the pointers, so the pointers are valid.
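The rule quoted above can be observed directly. The following is a minimal sketch, not part of the original answer; the helper name count_reallocations is made up for illustration. It counts how often capacity() changes while elements are pushed, since each change means the elements were moved to a new block and earlier pointers became invalid.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: pushes n ints and counts how often the capacity
// changes. Every capacity change is a reallocation, so any pointer taken
// before it no longer refers to a live element.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) {
        v.reserve(n);  // capacity() >= n from here on
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With reserve_first set, the count is zero for any n. Without it, the count depends on the implementation's growth strategy, but it is greater than zero on common implementations as soon as n exceeds the initial capacity.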

When you use push_back to add an element to a vector, it is copied into the vector. This means that, in the first example, vectorA contains a copy of objectA, and objectA itself goes out of scope and is destroyed at the closing brace. Note, though, that &vectorA[i] is the address of that copy, not of objectA, so the pointer in pvectorA is unaffected by objectA going out of scope; it only stops being valid when vectorA moves or destroys the copy.
For the same reason your second example works even though objectA has gone out of scope after the first loop: the pointers refer to the copies held by vectorA, which is not modified after they are taken.

When you push elements into vectorA it will occasionally get full, and have to relocate its objects to a larger memory block. That will change the address of each element.
If pvectorA has stored pointers to the elements' original position, those pointers will still point to the old positions even after the vectorA elements have been moved to a new location.

When you do vectorA.push_back you may cause that vector to reallocate itself to increase capacity, which means all its contents are moved, which means any pointers to its contents that you have saved are made invalid.
Maybe you want to rethink the whole idea of storing pointers to elements of a vector.
If you can't drop the whole idea of storing pointers, but you know the required size in advance, you could use reserve before the loop:
vectorA.reserve(size);
for(i=0; i<size; i++){
//create objectA here
vectorA.push_back(objectA);
pvectorA.push_back(&vectorA[i]);
}
In this version, the pointers are valid until you either grow vectorA further or destroy it.

This is because the vector reallocates its internal storage when it grows beyond its current capacity; after the reallocation, the addresses of the elements may have changed.
You can avoid reallocation by reserving a big enough storage beforehand:
vectorA.reserve(size);
//pvectorA.reserve(size); // this would not hurt either
for(i=0; i<size; i++){
//create objectA here
vectorA.push_back(objectA);
pvectorA.push_back(&vectorA[i]);
}
One final note: if you can use C++11, emplace_back constructs your object in place, hence, without copying.
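To make the difference concrete, here is a small sketch under stated assumptions: Tracked is a hypothetical element type that counts its own copies, and reserve(1) is used so that no copy is caused by reallocation. Note that emplace_back only avoids the copy of the argument; it can still trigger a reallocation and invalidate pointers exactly like push_back.

```cpp
#include <string>
#include <vector>

// Hypothetical element type that counts how many times it is copied.
struct Tracked {
    static int copies;
    std::string name;
    explicit Tracked(const std::string& n) : name(n) {}
    Tracked(const Tracked& other) : name(other.name) { ++copies; }
};
int Tracked::copies = 0;

// Returns the number of copies made while adding one element by push_back.
int copies_with_push_back() {
    std::vector<Tracked> v;
    v.reserve(1);          // rule out copies caused by reallocation
    Tracked local("a");
    Tracked::copies = 0;
    v.push_back(local);    // copies the lvalue into the vector
    return Tracked::copies;
}

// Returns the number of copies made while adding one element by emplace_back.
int copies_with_emplace_back() {
    std::vector<Tracked> v;
    v.reserve(1);
    Tracked::copies = 0;
    v.emplace_back("a");   // constructs the element inside the vector
    return Tracked::copies;
}
```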

If you want a sequence container with constant-time insertion and no invalidation of iterators pointing to other elements, use a std::list. Note however that std::vector is often the fastest data structure to use (in some cases, you need a pre-sorted std::vector). One prominent reason for this is that arrays are more cache friendly than e.g. trees or linked lists.
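As a rough illustration of the std::list suggestion, the sketch below fills a list and records a pointer to each element as it goes, which is the pattern that failed with std::vector in the question. The function name is invented for this example.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Fills a std::list and records a pointer to each element as it is added.
// Returns true if every recorded pointer still refers to the right value.
bool list_pointers_stay_valid(std::size_t n) {
    std::list<int> values;
    std::vector<int*> pointers;
    for (std::size_t i = 0; i < n; ++i) {
        values.push_back(static_cast<int>(i));
        pointers.push_back(&values.back());  // safe: list nodes never move
    }
    for (std::size_t i = 0; i < n; ++i) {
        if (*pointers[i] != static_cast<int>(i)) {
            return false;
        }
    }
    return true;
}
```

The price is one allocation per element and scattered memory, which is the cache-friendliness trade-off mentioned above.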

Related

std::vector::assign - reallocates the data?

I am working with the STL and my goal is to minimize the number of data reallocations.
I was wondering, does
std::vector::assign(size_type n, const value_type& val)
reallocate the data if the size is not changed, or does it actually just assign the new values (for example, using operator=)?
The STL documentation at http://www.cplusplus.com/ says the following (C++98):
In the fill version (2), the new contents are n elements, each initialized to a copy of val.
If a reallocation happens,the storage needed is allocated using the internal allocator.
Any elements held in the container before the call are destroyed and replaced by newly constructed elements (no assignments of elements take place).
This causes an automatic reallocation of the allocated storage space if -and only if- the new vector size surpasses the current vector capacity.
The phrase "no assignments of elements take place" makes it all a little confusing.
So, for example, I want to have a vector of classes (for example, cv::Vec3i of OpenCV). Does this mean that
the destructor or constructor of cv::Vec3i will be called?
a direct copy of the Vec3i memory will be made to fill the vector?
what happens if my class allocates memory at run time with the new operator? This memory cannot be accounted for by plain memory copying. Does it mean that assign() should not be used for such objects?
EDIT: the whole purpose of using assign in this case is to set all values in the vector to 0 (in case I have std::vector< cv::Vec3i > v). It will be done many, many times. The size of the std::vector itself will not change.
What I want to do (in a shorter way) is the following:
for(int i=0; i<v.size(); i++)
for(int j=0; j<3; j++)
v[i][j] = 0;
Right now I am interested in C++98.
std::vector.assign(...) does not reallocate the vector if it does not have to grow it. Still, it must copy the actual element.
If you want to know what the standard guarantees, look in the standard: C++11 standard plus minor editorial changes.
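A minimal sketch of the case from the question, assuming a plain Vec3 struct as a stand-in for cv::Vec3i so that no OpenCV dependency is needed. It is written to compile as C++98. The first helper checks that the buffer address is the same before and after assign() when the size does not change; the second shows std::fill, which assigns element by element and never touches the allocation.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Stand-in for cv::Vec3i (assumption made for this sketch).
struct Vec3 {
    int c[3];
};

// Resets every element to zero with assign(). Returns true if the buffer
// address is unchanged afterwards. Expects a non-empty vector.
bool reset_with_assign(std::vector<Vec3>& v) {
    const Vec3* before = &v[0];
    const Vec3 zero = {{0, 0, 0}};
    v.assign(v.size(), zero);  // new size == old size <= capacity
    return &v[0] == before;
}

// Same effect with std::fill, which assigns to the existing elements.
void reset_with_fill(std::vector<Vec3>& v) {
    const Vec3 zero = {{0, 0, 0}};
    std::fill(v.begin(), v.end(), zero);
}
```

For a trivially copyable element type like this one, both variants are expected to end up as a simple memory fill on common implementations; std::fill states the intent "overwrite, do not rebuild" more directly.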
I assume that you have a vector filled with some data and you call an assign on it, this will:
destroy all the elements of the vector (their destructor is called) like a call to clear()
fill the now-empty vector with n copies of the given object, which must have a copy constructor.
So if your class allocates some memory you have to:
take care of this in your copy constructor
free it in the destructor
Reallocations happen when the size exceeds the allocated memory (the vector's capacity). You can prevent this with a call to reserve(). However, I think that assign() is smart enough to allocate all the memory it needs (if this is more than the already allocated amount) before starting to fill the vector and after having cleared it.
You may want to avoid reallocations because of their cost, but if you are trying to avoid them because your objects cannot handle being copied properly, then I strongly discourage you from putting them in a vector.
The semantics of assign are defined by the standard in a quite straightforward way:
void assign(size_type n, const T& t);
Effects:
erase(begin(), end());
insert(begin(), n, t);
This means that first the destructors of the elements will be called. The copies of t are made in what is now a raw storage left after the lifetime of the elements ended.
The requirements are that value_type is MoveAssignable (for when erase doesn't erase to the end of the container and needs to move elements towards the beginning).
insert overload used here requires that value_type is CopyInsertable and CopyAssignable.
In any case, vector is oblivious to how your class is managing its own resources. That's on you to take care of. See The Rule of Three.
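For a class that allocates with new, the Rule of Three could look roughly like the sketch below. Buffer is a hypothetical class invented for this example; it is not taken from the question or from OpenCV. With the copy constructor, copy assignment operator and destructor defined consistently, assign() is safe to use because every copy owns its own allocation.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical class that owns a heap buffer of ints.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}

    // Copy constructor: deep copy, so each Buffer owns its own allocation.
    Buffer(const Buffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Copy assignment: allocate and copy first, then release the old block.
    Buffer& operator=(const Buffer& other) {
        if (this != &other) {
            int* fresh = new int[other.size_];
            std::copy(other.data_, other.data_ + other.size_, fresh);
            delete[] data_;
            data_ = fresh;
            size_ = other.size_;
        }
        return *this;
    }

    // Destructor: releases the block this object owns.
    ~Buffer() { delete[] data_; }

    int& operator[](std::size_t i) { return data_[i]; }
    const int* data() const { return data_; }

private:
    std::size_t size_;
    int* data_;
};
```

After a call such as v.assign(3, proto) on a std::vector<Buffer>, the three elements hold equal values in three distinct allocations, and destroying or overwriting one does not affect the others.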
As in the case of vector::resize method,
std::vector::assign(size_type n, const value_type& val)
will initialize each element to a copy of "val". I prefer to use resize as it minimizes the number of object instantiations/destructions, but note that it is not quite the same: resize(n, val) keeps the existing elements as they are and copies val only into the elements it has to add. Use resize if you want to minimize the data realloc, but keep in mind the following:
While doing this is safe for certain data structures, be aware that assigning/pushing elements of a class containing pointers to dynamically allocated data (e.g. with a new in the constructor) may cause havoc.
If your class dynamically allocates data then you should reimplement YourClass::operator= (and the copy constructor) to copy the data to the new object instead of copying the pointer.
Hope that it helps!

Lifetime of element removed from global vector

I basically have a
vector<Object> vec_;
as a class member in a cpp class. At a certain class function this vector will be filled with "Objects" just like this:
vec_.push_back(Object());
At a later time I iterate through the vector elements and keep a pointer to the best element. Then the vector is cleared as shown in the following:
Object* o_ptr = &(vec_[0]);
for (unsigned int i = 1; i < vec_.size(); i++) {
if (o_ptr->getCost() > vec_[i].getCost()) {
o_ptr = &(vec_[i]);
}
}
vec_.clear();
Now my question is: What happens to the objects removed from the vector? Is their lifetime over as soon as they are removed from the vector? And does the pointer also point to empty space then?
And if not when does the lifetime of these objects end?
Regards scr
When vector.clear() is called (or when the vector is destructed) all objects contained within the vector will be destructed (as in this case they are objects and not raw pointers), leaving o_ptr as a dangling pointer.
Note that caching the address (or iterator) of an element in vector is dangerous, even without a call to clear(). For example, push_back() could result in internal reallocation of the vector, invalidating the cached address (or iterator).
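The destruction on clear() can be checked with a live-instance counter. This is only a sketch; Counted and the helper function are hypothetical names introduced here.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical element type that tracks how many instances are alive.
struct Counted {
    static int alive;
    Counted() { ++alive; }
    Counted(const Counted&) { ++alive; }
    ~Counted() { --alive; }
};
int Counted::alive = 0;

// Builds a vector of n elements, clears it, and reports the number of live
// Counted objects before (first) and after (second) the clear().
std::pair<int, int> live_before_and_after_clear(std::size_t n) {
    std::vector<Counted> v(n);
    const int before = Counted::alive;
    v.clear();  // runs the destructor of every element
    const int after = Counted::alive;
    return std::make_pair(before, after);
}
```

The second value being zero is the point: after clear() no element is alive any more, so a pointer saved earlier has nothing valid to point at.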
What happens to the objects removed from the vector?
They are destructed.
Is their lifetime over as soon as they are removed from the vector?
Yes.
Does the pointer also point to empty space then?
Yes it does. If you want to save the object, make a copy of it before clearing the vector.
It's also important to note that certain vector operations might leave pointers pointing to nothing even if the vector is not cleared. Specifically resizes, which typically involve a reallocation. It's generally best to store references to vector elements by index, since you will always be able to get the element via the index (whereas a pointer may be invalidated after certain operations).
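Both suggestions, tracking the best element by index and copying it out before clearing, can be combined as in the following sketch. Object here is a hypothetical stand-in with only a cost field, since the question does not show the real class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's Object.
struct Object {
    double cost;
    double getCost() const { return cost; }
};

// Returns the index of the cheapest element. Expects a non-empty vector.
std::size_t cheapest_index(const std::vector<Object>& v) {
    assert(!v.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < v.size(); ++i) {
        if (v[i].getCost() < v[best].getCost()) {
            best = i;
        }
    }
    return best;
}

// Copies the cheapest element out, then clears the vector.
Object take_cheapest(std::vector<Object>& v) {
    Object best = v[cheapest_index(v)];  // copy made before clear()
    v.clear();
    return best;
}
```

The returned Object is an independent copy, so it stays usable after the vector has been cleared or destroyed.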
The subscript operator, applied to vectors, returns a reference and does not make a copy.
The objects are owned by the vector and .clear() indeed removes them.
In addition, pointing to the objects stored in a std::vector is a bit dangerous. If you push new elements to the vector, at some point the vector might need to allocate more memory - which is likely to copy all the previous elements to a different address using their copy constructor (hence invalidating your pointers).
So: Use integer indices with std::vector instead of pointers, unless you know that you won't be pushing beyond the reserved capacity (which you can assure using .reserve()).
(Also while we're at it, make sure not to confuse the vector's internal buffer size with .size(), which is simply the number of actual elements stored).
The best way to track the lifetime of an object is to add a printf to both the constructor and destructor.
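#include <cstdio>

class MyObject
{
public:
MyObject()
{
// %p expects a void*, hence the cast
std::printf("MyObject constructed, this=%p\n", static_cast<void*>(this));
}
~MyObject()
{
std::printf("MyObject destructed, this=%p\n", static_cast<void*>(this));
}
};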

Reallocation in std::vector after std::vector.reserve()

I have a snippet of code where I first put some values in to a std::vector and then give an address of each of them to one of the objects that will be using them, like this:
std::vector < useSomeObject > uso;
// uso is filled
std::vector < someObject > obj;
for (int i=0; i < numberOfDesiredObjects; ++i){
obj.push_back(someObject());
obj.back().fillWithData();
}
for (int i=0; i < numberOfDesiredObjects; ++i){
uso[i].setSomeObject(&obj[i]);
}
// use objects via uso vector
// deallocate everything
Now, since I'm sometimes a little bit of a style freak, I think this is ugly and would like to use only 1 for loop, kind of like this:
for (int i=0; i < numberOfDesiredObjects; ++i){
obj.push_back(someObject());
obj.back().fillWithData();
uso[i].setSomeObject(&obj.back());
}
Of course, I cannot do that because reallocation happens occasionally, and all the pointers I set become invalid.
So, my question is:
I know that std::vector.reserve() is the way to go if you know how much you will need and want to allocate the memory in advance. If I make sure that I am trying to allocate enough memory in advance with reserve(), does that guarantee that my pointers will stay valid?
Thank you.
Sidenote. This is a similar question, but there is not an answer to what I would like to know. Just to prevent it from popping up as a first comment to this question.
This is, in fact, one of the principal reasons for using reserve. You
are guaranteed that appending to the end of an std::vector will not
invalidate iterators, references or pointers to elements in the vector
as long as the new size of the vector does not exceed the old capacity.
If I make sure that I am trying to allocate enough memory in advance with reserve(), does that guarantee that my pointers will stay valid?
Yes, it guarantees your pointers will stay valid unless:
the size increases beyond the current capacity, or
you erase any elements, which you don't.
The iterator invalidation rules for a vector are specified in 23.2.4.3/1 & 23.2.4.3/3 as:
All iterators and references before the point of insertion are unaffected, unless the new container size is greater than the previous capacity (in which case all iterators and references are invalidated)
Every iterator and reference after the point of erase is invalidated.
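Applied to the question, the single-loop version with reserve() might look like the sketch below. SomeObject and UseSomeObject are hypothetical stand-ins for the question's classes, and fillWithData is given an int parameter here purely so the result can be checked.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the question's someObject / useSomeObject.
struct SomeObject {
    int value;
    void fillWithData(int v) { value = v; }
};

struct UseSomeObject {
    const SomeObject* target;
    UseSomeObject() : target(nullptr) {}
    void setSomeObject(const SomeObject* p) { target = p; }
};

// Fills obj and wires up uso in a single loop. Expects obj to be empty and
// uso to hold at least n entries.
void fill_and_link(std::vector<SomeObject>& obj,
                   std::vector<UseSomeObject>& uso,
                   std::size_t n) {
    obj.reserve(n);  // no reallocation until size() would exceed n
    for (std::size_t i = 0; i < n; ++i) {
        obj.push_back(SomeObject());
        obj.back().fillWithData(static_cast<int>(i));
        uso[i].setSomeObject(&obj.back());  // stays valid thanks to reserve
    }
}
```

The pointers remain valid only as long as obj is not grown past the reserved capacity, has no elements erased, and is not destroyed.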

Array of contiguous new objects

I am currently filling a vector array of elements like so:
std::vector<T*> elemArray;
for (size_t i = 0; i < elemArray.size(); ++i)
{
elemArray = new T();
}
The code has obviously been simplified. Now after asking another question (unrelated to this problem but related to the program) I realized I need an array that has new'd objects (can't be on the stack, will overflow, too many elements) but are contiguous. That is, if I were to receive an element, without the array index, I should be able to find the array index by doing returnedElement - elemArray[0] to get the index of the element in the array.
I hope I have explained the problem, if not, please let me know which parts and I will attempt to clarify.
EDIT: I am not sure why the highest voted answer is not being looked into. I have tried this many times. If I try allocating a vector like that with more than 100,000 (approximately) elements, it always gives me a memory error. Secondly, I require pointers, as is clear from my example. Changing it suddenly to not be pointers will require a large amount of code re-write (although I am willing to do that, it still does not address the issue that allocating vectors like that with a few million elements does not work).
A std::vector<> stores its elements in a heap allocated array, it won't store the elements on the stack. So you won't get any stack overflow even if you do it the simple way:
std::vector<T> elemArray;
for (size_t i = 0; i < elemCount; ++i) {
elemArray.push_back(T(i));
}
&elemArray[0] will be a pointer to a (contiguous) array of T objects.
If you need the elements to be contiguous, not the pointers, you can just do:
std::vector<T> elemArray(numberOfElements);
The elements themselves won't be on the stack, vector manages the dynamic allocation of memory and as in your example the elements will be value-initialized. (Strictly, copy-initialized from a value-initialized temporary but this should work out the same for objects that it is valid to store in a vector.)
I believe that your index calculation should be: &returnedElement - &elemArray[0] and this will work with a vector. Provided that returnedElement is actually stored in elemArray.
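The index calculation can be wrapped in a small helper. This is a sketch with an invented name, index_of, and it relies on the stated precondition that the element really lives in the vector.

```cpp
#include <cstddef>
#include <vector>

// Returns the index of elem within v. Only meaningful if elem really is an
// element of v; otherwise the pointer subtraction is undefined behaviour.
template <typename T>
std::size_t index_of(const std::vector<T>& v, const T& elem) {
    return static_cast<std::size_t>(&elem - &v[0]);
}
```

Pointer subtraction already yields a count of elements, not bytes, so no division by sizeof(T) is needed.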
Your original loop should look something like this: (though it doesn't create objects in contiguous memory).
for (size_t i = 0; i < someSize ; ++i)
{
elemArray.push_back(new T());
}
And then you should know two basic things here:
elemArray.size() returns the number of elements elemArray currently holds. That means, if you use it in the for loop's condition, then your for loop would become an infinite loop, because you're adding elements to it, and so the vector's size would keep on increasing.
elemArray is a vector of T*, so you can store only T*, and to populate it, you've to use push_back function.
Considering your old code caused a stack overflow I think it probably looked like this:
Item items[2000000]; // stack overflow
If that is the case, then you can use the following syntax:
std::vector<Item> items(2000000);
This will allocate (and construct) the items contiguously on the heap.
I have the same requirement but for a different reason, mainly cache locality (for a small number of objects):
const int maxelements = 100000;
T *obj = new T[maxelements]; // one contiguous block; release with delete[] obj
vector<T *> vectorofptr;
for (int i = 0; i < maxelements; i++)
{
vectorofptr.push_back(&obj[i]);
}
or
int maxelements = 100000;
// only valid for types that need no constructor call (plain data)
void *base = calloc(maxelements, sizeof(T)); //need to save base for free()
vector<T *> vectorofptr;
for (int i = 0; i < maxelements; i++)
{
// pointer arithmetic on T* already advances in steps of sizeof(T)
vectorofptr.push_back(static_cast<T *>(base) + i);
}
Allocate a large chunk of memory and use placement new to construct your vector elements in it:
elemArray[i] = new (GetNextContinuousAddress()) T();
(assuming that you really need pointers to individually new'ed objects in your array and are aware of the reasons why this is not recommended ..)
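One way the placement-new idea could be fleshed out is sketched below. Item and ItemPool are hypothetical names, and the sketch replaces the unspecified GetNextContinuousAddress() with plain pointer arithmetic on a single block obtained from ::operator new. It is not exception-safe and is meant only to show the mechanics.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical element type.
struct Item {
    int id;
    explicit Item(int i) : id(i) {}
};

// Owns one raw block, constructs Items in it with placement new, and keeps a
// vector of pointers to them as in the question.
class ItemPool {
public:
    explicit ItemPool(std::size_t n)
        : raw_(static_cast<Item*>(::operator new(n * sizeof(Item)))) {
        pointers.reserve(n);
        for (std::size_t i = 0; i < n; ++i) {
            // construct the i-th Item directly in the block
            pointers.push_back(new (raw_ + i) Item(static_cast<int>(i)));
        }
    }

    ~ItemPool() {
        for (std::size_t i = pointers.size(); i > 0; --i) {
            pointers[i - 1]->~Item();  // placement new needs manual destruction
        }
        ::operator delete(raw_);
    }

    std::vector<Item*> pointers;  // contiguous pointers to contiguous Items

private:
    ItemPool(const ItemPool&);             // non-copyable
    ItemPool& operator=(const ItemPool&);
    Item* raw_;
};
```

Because all Items sit in one block, pool.pointers[i] - pool.pointers[0] equals i, which is the index recovery the question asks for. In most cases a plain std::vector<Item> gives the same layout with far less manual bookkeeping.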
I know this is an old post but it might be useful intel for any who might need it at some point. Storing pointers in a sequence container is perfectly fine only if you have in mind a few important caveats:
the objects your stored pointers point to won't be contiguously aligned in memory since they were allocated (dynamically or not) elsewhere in your program, you will end up with a contiguous array of pointers pointing to data likely scattered across memory.
since STL containers work with value copy and not reference copy, your container won't take ownership of allocated data and thus will only delete the pointer objects on destruction and not the objects they point to. This is fine when those objects were not dynamically allocated, otherwise you will need to provide a release mechanism elsewhere on the pointed objects like simply looping through your pointers and delete them individually or using shared pointers which will do the job for you when required ;-)
one last but very important thing to remember is that if your container is to be used in a dynamic environment with possible insertions and deletions at random places, you need to make sure to use a stable container so your iterators/pointers to stored elements remain valid after such operations otherwise you will end up with undesirable side effects... This is the case with std::vector which will release and reallocate extra space when inserting new elements and then perform a copy of the shifted elements (unless you insert at the end), thus invalidating element pointers/iterators (some implementations provide stability though, like boost::stable_vector, but the penalty here is that you lose the contiguous property of the container, magic does not exist when programming, life is unfair right? ;-))
Regards