Does std::vector change its address? How to avoid - c++

Since vector elements are stored contiguously, I guess it may not have the same address after some push_back's , because the initial allocated space could not suffice.
I'm working on a code where I need a reference to an element in a vector, like:
int main(){
vector<int> v;
v.push_back(1);
int *ptr = &v[0];
for(int i=2; i<100; i++)
v.push_back(i);
cout << *ptr << endl; //?
return 0;
}
But it's not necessarily true that ptr contains a reference to v[0], right? How would be a good way to guarantee it?
My first idea would be to use a vector of pointers and dynamic allocation. I'm wondering if there's an easier way to do that?
PS.: Actually I'm using a vector of a class instead of int, but I think the issues are the same.

Don't use reserve to postpone this dangling pointer bug - as someone who got this same problem, shrugged, reserved 1000, then a few months later spent ages trying to figure out some weird memory bug (the vector capacity exceeded 1000), I can tell you this is not a robust solution.
You want to avoid taking the address of elements in a vector if at all possible precisely because of the unpredictable nature of reallocations. If you have to, use iterators instead of raw addresses, since checked STL implementations will tell you when they have become invalid, instead of randomly crashing.
The best solution is to change your container:
You could use std::list - it does not invalidate existing iterators when adding elements, and only the iterator to an erased element is invalidated when erasing
If you're using C++0x, std::vector<std::unique_ptr<T>> is an interesting solution
Alternatively, using pointers and new/delete isn't too bad - just don't forget to delete pointers before erasing them. It's not hard to get right this way, but you have to be pretty careful to not cause a memory leak by forgetting a delete. (Mark Ransom also points out: this is not exception safe, the entire vector contents is leaked if an exception causes the vector to be destroyed.)
Note that boost's ptr_vector cannot be used safely with some of the STL algorithms, which may be a problem for you.

You can increase the capacity of the underlying array used by the vector by calling its reserve member function:
v.reserve(100);
So long as you do not put more than 100 elements into the vector, ptr will point to the first element.

How would be a good way to guarantee it?
std::vector<T> is guaranteed to be continous, but the implementation is free to reallocate or free storage on operations altering the vector contents (vector iterators, pointers or references to elements become undefined as well).
You can achieve your desired result, however, by calling reserve. IIRC, the standard guarantees that no reallocations are done until the size of the vector is larger than its reserved capacity.
Generally, I'd be careful with it (you can quickly get trapped…). Better don't rely on std::vector<>::reserve and iterator persistence unless you really have to.

If you don't need your values stored contiguously, you can use std::deque instead of std::vector. It doesn't reallocate, but holds elements in several chunks of memory.

Another possibility possibility would be a purpose-built smart pointer that, instead of storing an address would store the address of the vector itself along with the the index of the item you care about. It would then put those together and get the address of the element only when you dereference it, something like this:
template <class T>
class vec_ptr {
std::vector<T> &v;
size_t index;
public:
vec_ptr(std::vector<T> &v, size_t index) : v(v), index(index) {}
T &operator*() { return v[index]; }
};
Then your int *ptr=&v[0]; would be replaced with something like: vec_ptr<int> ptr(v,0);
A couple of points: first of all, if you rearrange the items in your vector between the time you create the "pointer" and the time you dereference it, it will no longer refer to the original element, but to whatever element happens to be at the specified position. Second, this does no range checking, so (for example) attempting to use the 100th item in a vector that only contains 50 items will give undefined behavior.

As James McNellis and Alexander Gessler stated, reserve is a good way of pre-allocating memory. However, for completeness' sake, I'd like to add that for the pointers to remain valid, all insertion/removal operations must occur from the tail of the vector, otherwise item shifting will again invalidate your pointers.

Depending on your requirements and use case, you might want to take a look at Boost's Pointer Container Library.
In your case you could use boost::ptr_vector<yourClass>.

I came across this problem too and spent a whole day just to realize vector's address changed and the saved addresses became invalid. For my problem, my solution was that
save raw data in the vector and get relative indices
after the vector stopped growing, convert the indices to pointer addresses
I found the following works
pointers[i]=indices[i]+(size_t)&vector[0];
pointers[i]=&vector[ (size_t)indices[i] ];
However, I haven't figured out how to use vector.front() and I am not sure whether I should use
pointers[i]=indices[i]*sizeof(vector)+(size_t)&vector[0] . I think the reference way(2) should be very safe.

Related

How to access element(having a unique identifier) in a vector using a map in constant time? [duplicate]

When vector needs more memory it will reallocate memory somewhere, I don't know where yet! and then pointers become invalid, is there any good explanation on this?
I mean where they go, what happen to my containers? ( not linked list ones )
Short answer: Everything will be fine. Don't worry about this and get back to work.
Medium answer: Adding elements to or removing them from a vector invalidates all iterators and references/pointers (possibly with the exception of removing from the back). Simple as that. Don't refer to any old iterators and obtain new ones after such an operation. Example:
std::vector<int> v = get_vector();
int & a = v[6];
int * b = &v[7];
std::vector<int>::iterator c = v.begin();
std::advance(it, 8);
v.resize(100);
Now a, b and c are all invalid: You cannot use a, and you cannot dereference b or c.
Long answer: The vector keeps track of dynamic memory. When the memory is exhausted, it allocates a new, larger chunk elsewhere and copies (or moves) all the old elements over (and then frees up the old memory, destroying the old objects). Memory allocation and deallocation is done by the allocator (typically std::allocator<T>), which in turn usually invokes ::operator new() to fetch memory, which in turn usually calls malloc(). Details may vary and depend on your platform. In any event, any previously held references, pointers or iterators are no longer valid (presumably because they refer to the now-freed memory, though it's not specified in the standard why they're invalid).
When you add or remove items from a vector, all iterators (and pointers) to items within it are invalidated. If you need to store a pointer to an item in a vector, then make the vector const, or use a different container.
It shouldn't matter to you where the vector stores things. You don't need to do anything, just let it do its job.
When you use std::vector, the class takes care of all the details regarding memory allocation, pointers, resizing and so on.
The vector class exposes its contents through iterators and references. Mutations of the vector will potentially invalidate iterators and references because reallocation may be necessary.
It is valid to access the contents using pointers because the vector class guarantees to store its elements at contiguous memory locations. Clearly any mutation of the list will potentially invalidate any pointers to its contents, because of potential reallocation. Therefore, if you ever access an element using pointers, you must regard those pointers as invalid once you mutate the vector. In short the same rules apply to pointers to the contents as do to references.
If you want to maintain a reference to an item in the vector, and have this reference be valid even after mutation, then you should remember the index rather than a pointer or reference to the item. In that case it is perfectly safe to add to the end of the vector and your index value still refers to the same element.

handling for references affected by adding elements to std::vector

As pointed out in answer to another question, all pointers to a vector's elements may become invalid after new elements have been added to that vector, due to reallocation of the underlying contiguous buffer.
Is there a safe way to handle this issue at compile-time?
Are there best-practices to deal with or to avoid a situation, where references may become invalid after altering the data-structure?
Are there best-practices to deal with or to avoid a situation, where references may become invalid after altering the data-structure?
preallocate enough space for vector in advance
keep index of object in array instead of reference/pointer to object itself
use different container, for example std::list
which way will work for you the best way depends on your situation
A pointer or reference to the std::vector itself won't change. What could change is the address of a specific element inside the std::vector due to reallocation policy which is implementation dependent.
Preallocating enough space is a risk because you shouldn't relay on a particular implementation policy.
Storing the address of an element in a std::vector is a bad idea and can be easily avoided because std::vector [] operator is very fast so store the index instead the address of the element. This the right way.

std::vector and pointer predictability

when you push_back() items into an std::vector, and retain the pointers to the objects in the vector via the back() reference -- are you guaranteed (assuming no deletions occur) that the address of the objects in the vector will remain the same?
It seems like my vector changes the pointers of the objects I use, such that if I push 10 items into it, and retain the pointers to those 10 items by remembering the back() reference after each push_back.
if your vector is to store objects, not pointers to objects, are the addresses of those objects subject to constant change upon pushing more items?
Any method that causes the vector to resize itself will invalidate all iterators, pointers, and references to the elements contained within. This can be avoided by reserving mememory, or using boost::stable_vector.
23.3.6.5/1:
Remarks: Causes reallocation if the new size is greater than the old capacity. If no reallocation happens,
all the iterators and references before the insertion point remain valid.
No, std::vector is not a stable container, i.e. pointers and iterators may get invalidated by resizing the vector (or, better, by the corresponding reallocation). If you want to avoid this behaviour, use boost::stable_vector or std::list or std::deque instead (I would prefer the last). Or, more easily, you can simply store your locations by indices.
For more information, consider also the answer to this question here.
It's not guaranteed. If you push_back enough items to exceed the size of the memory buffer that's the backing store of the vector, a new buffer will be created, all the contents will be copied to the new location, and the old buffer will be deleted. At that point, old pointers (as well as iterators!) will be invalid.
If you know exactly how much maximum space you will ever need, you could set the size of the vector's buffer to that size when you create it, to avoid reallocation. However, I prefer to store "references" to elements of a vector as a reference to the vector and a size_t index into the vector, instead of using pointers. It's not necessarily slower than pointers (depending on the CPU type) but, even if it is, it won't be much slower and in my opinion it's worth it for the peace of mind knowing that no matter how the vector is used in the future or reallocated, it'll still refer to the proper element.

why not implement c++ std::vector::pop_front() by shifting the pointer to vector[0]?

Why can't pop_front() be implemented for C++ vectors by simply shifting the pointer contained in the vector's name one spot over? So in a vector containing an array foo, foo is a pointer to foo[0], so pop_front() would make the pointer foo = foo[1] and the bracket operator would just do the normal pointer math. Is this something to do with how C++ keeps track of the memory you're using for what when it allocates space for an array?
This is similar to other questions I've seen about why std::vector doesn't have a pop_front() function, I will admit, but i haven't anyone asking about why you can't shift the pointer.
The vector wouldn't be able to free its memory if it did this.
Generally, you want the overhead per vector object to be small. That means you only store three items: the pointer to the first element, the capacity, and the length.
In order to implement what you suggest, every vector ever (all of them) would need an additional member variable: the offset from the start pointer at which the zeroth element resides. Otherwise, the memory could not be freed, since the original handle to it would have been lost.
It's a tradeoff, but generally the memory consumption of an object which may have millions of instances is more valuable than the corner case of doing the absolute worst thing you can do performance-wise to the vector.
Because implementers want to optimize the size of a vector. They usually use 3 pointers, one for the beginning, one for the capacity (the allocated end) and one for the end.
Doing what you require adds another 4 bytes to every vector (and there are a lot of those in a c++ program) for very little benefit: the contract of vector is to be fast when pushing back new elements, removing and inserting are "unsual" operations and their performance matter less than the size of the class.
I started typing out an elaborate answer explaining how the memory is allocated and freed but after typing it all out I realized that memory issues alone don't justify why pop_front isn't there as other answers here suggested.
Having pop_front in a vector where the extra cost is another pointer is justifiable in most circumstances. The problem, in my opinion, is push_front. If the container has pop_front then it should also have push_front otherwise the container is not being consistent. push_front is definitely expensive for a vector container (unless you match your pushes with your pops which is not a good design). Without push_front the vector is really wasting memory if one does lots of pop_front operations with no push_front functionality.
Now the need for pop_front and push_front is there for a container that is similar to a vector (constant time random access) which is why there is deque.
You could, but it would complicate the implementation a bit, and add a pointer of overhead to the type's size (so it could track the actual allocation's address). Is that worth it? Sometimes. First consider other structures which may handle your usage better (maybe deque?).
You could do that, but vector is designed to be a simple container with constant time index lookups and push/pop from the end. Doing what you suggest would complicate the implementation as it would have to track the allocated beginning and the "current" beginning. Not to mention that you still couldn't guarantee constant time insertion at the front but you might get it sometimes.
If you need a container with constant time front and back insertion and removal, that's precisely what deque is for, there's no need to modify vector to handle it.
You can use std::deque instead of std::vector. It's a double-ended-queue with also the vector-like access members. It implements both front and back push/pop.
http://www.cplusplus.com/reference/stl/deque/
Another shortcoming of your suggestion is that you'll waste memory spaces as you can't make use of those on the left of the array after shifting. The more you execute pop_front(), the more you'll waste until the vector is destructed.

Using a pointer to an object stored in a vector... c++

I have a vector of myObjects in global scope.
std::vector<myObject>
A method is passed a pointer to one of the elements in the vector.
Can this method increment the pointer, to get to the next element,
myObject* pmObj;
++pmObj; // the next element ??
or should it be passed an std::Vector<myObject>::iterator and increment that instead?
Assume for now that the vector will not get changed in the meantime.
Yes - the standard guarantees in a technical correction that the storage for a vector is contiguous, so incrementing pointers into a vector will work.
Yes, this will work as expected since std::vector is mandated to use contiguous storage by the Standard. I would suggest passing in a pair of iterators if you are working with a range of objects. This is pretty much the standard idiom as employed by the STL. This will make your code a little safer as well since you have an explicit endpoint for iteration instead of relying on a count or something like that.
If the vector is not reallocated and you're sure not to get out of vector's bounds then you can use this approach. Incrementing pointers is legal and while you have space to move to the next element you can do so by incrementing the pointer since the vector's buffer is a single block of memory.