STL's vector resizing - C++

I can't find this piece of information. I'm dealing with an odd situation here where I'm inside a loop and I can receive a random piece of information at any given time. This information has to be stored in a vector. Now, each frame I have to set the size of this vector to ensure that I won't exceed its space (I'm writing values into random positions in the vector using indexing).
Now, assuming there's no way to change this piece of code, I want to know: does the vector "ignore" the resize() function if I pass an argument that's exactly the current size of the vector? Where can I find this information?

From the MSDN reference:
If the container's size is less than the requested size, _Newsize, elements are added to the vector until it reaches the requested size. If the container's size is larger than the requested size, the elements closest to the end of the container are deleted until the container reaches the size _Newsize. If the present size of the container is the same as the requested size, no action is taken.
The ISO C++ standard (page 485) specifies this behaviour for vector::resize:
void resize(size_type sz, T c = T());

if (sz > size())
    insert(end(), sz - size(), c);
else if (sz < size())
    erase(begin() + sz, end());
else
    ;  // does nothing
So yes, the vector ignores it and you don't need to perform a check on your own.

Kinda-sorta.
Simply resizing a vector with resize() can only increase the amount of memory used by the vector itself (it will, of course, change how much is used by its elements). If there's not enough room in the reserved space, it will reallocate (and implementations sometimes like to pad their allocations, so you may end up growing a bit more than requested). If there is already enough room for the requested size and whatever padding it wants, it will not regrow.
When the specification says that the elements past the end of the new size will be deleted, it means destroyed in place. Basically it will call _M_buff[i].~T() for each element it is removing. Thus any memory your objects allocate will be freed, assuming a working destructor, but the space that the objects themselves occupy (their size) will not be. The vector will grow, and grow, and grow to the maximum size you ever tell it to, and will not shrink again while it exists.
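As a small illustration of both answers (a sketch; the exact capacity values printed are implementation-defined), resize() to the current size is a no-op, and shrinking the size never reduces the capacity:

#include <iostream>
#include <vector>

int main() {
    std::vector<int> v(100);                               // size 100, capacity >= 100
    std::cout << v.size() << ' ' << v.capacity() << '\n';

    v.resize(v.size());                                    // same size: nothing happens
    std::cout << v.size() << ' ' << v.capacity() << '\n';

    v.resize(10);                                          // elements 10..99 destroyed in place,
    std::cout << v.size() << ' ' << v.capacity() << '\n';  // but the capacity stays where it was
}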

Related

Filling a vector with out-of-order data in C++

I'd like to fill a vector with a (known at runtime) quantity of data, but the elements arrive in (index, value) pairs rather than in the original order. These indices are guaranteed to be unique (each index from 0 to n-1 appears exactly once) so I'd like to store them as follows:
vector<Foo> myVector;
myVector.reserve(n); //total size of data is known
myVector[i_0] = v_0; //data v_0 goes at index i_0 (not necessarily 0)
...
myVector[i_n_minus_1] = v_n_minus_1;
This seems to work fine for the most part; at the end of the code, all n elements are in their proper places in the vector. However, some of the vector functions don't quite work as intended:
...
cout << myVector.size(); //prints 0, not n!
It's important to me that functions like size() still work -- I may want to check, for example, whether all the elements were actually inserted successfully by testing size() == n. Am I initializing the vector wrong, and if so, how should I approach this instead?
myVector.reserve(n) just tells the vector to allocate enough storage for n elements, so that when you push_back new elements into the vector, the vector won't have to repeatedly reallocate more storage (without the hint it may have to do this more than once, because it doesn't know in advance how many elements you will insert). In other words, you're helping out the vector implementation by telling it something it wouldn't otherwise know, allowing it to be more efficient.
But reserve doesn't actually make the vector be n long. The vector is empty, and in fact statements like myVector[0] = something are illegal, because the vector is of size 0: on my implementation I get an assertion failure, "vector subscript out of range". This is on Visual C++ 2012, but I think that gcc is similar.
To create a vector of the required length simply do
vector<Foo> myVector(n);
and forget about the reserve.
(As noted in the comments, you can also call resize to set the vector's size, but in your case it's simpler to pass the size as a constructor parameter.)
You need to call myVector.resize(n) to set (change) the size of the vector. Calling reserve doesn't actually resize the vector; it just makes it possible to resize it later without reallocating memory. Writing past the end of the vector (as you are doing here -- the vector's size is still 0 when you write to it) is undefined behavior.
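Putting the two answers together, a minimal sketch of the corrected approach (using int in place of the questioner's Foo, with made-up data) looks like this:

#include <cassert>
#include <vector>

int main() {
    const std::size_t n = 5;
    std::vector<int> myVector(n);    // size() == n, elements value-initialized

    // (index, value) pairs arriving out of order
    myVector[3] = 30;
    myVector[0] = 0;
    myVector[4] = 40;
    myVector[1] = 10;
    myVector[2] = 20;

    assert(myVector.size() == n);    // size() now reports n, unlike with reserve()
}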

Heap_size in heap_sort

I'm reading Cormen's "Introduction to Algorithms", and I'm trying to implement a heap-sort, and there's one thing I continually fail to understand: how do we calculate the heap_size for a given array?
My textbook says
An array A that represents a heap is an object with two attributes:
A.length, which (as usual) gives the number of elements in the array,
and A.heap-size, which represents how many elements in the heap are
stored within array A. That is, although A[1 .. A.length] may contain
numbers, only the elements in A[1..A.heap-size],where 0 <= A.heap-size <=
A.length, are valid elements of the heap.
If I implement the array as std::vector<T> Arr, then its size would be Arr.size(), but what its heap_size should be is currently beyond me.
The heap size should be a separately stored variable, which you manage yourself.
Whenever you remove from or add to the heap, you should decrement or increment the value appropriately.
In C++, using a vector, you may actually be able to use the size, since the underlying representation is an array that's at least as big as the size of the vector, and it's guaranteed not to shrink if you call resize with a smaller size. (So the underlying array plays the role of A.length, and the vector's size() plays the role of A.heap-size.)
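For concreteness, here is a minimal heap-sort sketch (0-based indexing rather than the book's 1-based pseudocode) in which heap_size is just a plain variable tracked alongside a std::vector:

#include <cstddef>
#include <iostream>
#include <utility>
#include <vector>

// Sift A[i] down within the first heap_size elements (0-based max-heap).
void max_heapify(std::vector<int>& A, std::size_t heap_size, std::size_t i) {
    for (;;) {
        std::size_t largest = i;
        std::size_t l = 2 * i + 1, r = 2 * i + 2;
        if (l < heap_size && A[l] > A[largest]) largest = l;
        if (r < heap_size && A[r] > A[largest]) largest = r;
        if (largest == i) break;
        std::swap(A[i], A[largest]);
        i = largest;
    }
}

void heap_sort(std::vector<int>& A) {
    std::size_t heap_size = A.size();              // A.heap-size from the book
    for (std::size_t i = heap_size / 2; i-- > 0; ) // build the max-heap
        max_heapify(A, heap_size, i);
    while (heap_size > 1) {
        std::swap(A[0], A[heap_size - 1]);         // move the maximum to the end
        --heap_size;                               // shrink the heap, not the vector
        max_heapify(A, heap_size, 0);
    }
}

int main() {
    std::vector<int> v{5, 1, 4, 2, 3};
    heap_sort(v);
    for (int x : v) std::cout << x << ' ';         // prints 1 2 3 4 5
}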

Control over std::vector reallocation

By reading the std::vector reference I understood that
calling insert when the maximum capacity is reached will cause the reallocation of the std::vector (causing iterator invalidation), because new memory is allocated for it with a bigger capacity. The goal is to keep the guarantee about contiguous data.
As long as I stay below the maximum capacity, insert will not cause that (and iterators will remain valid).
My question is the following:
When reserve is called automatically by insert, is there any way to control how much new memory must be reserved?
Suppose that I have a vector with an initial capacity of 100 and, when the maximum capacity is hit, I want to allocate an extra 20 bytes.
Is it possible to do that?
You can always track it yourself and call reserve before it would allocate, e.g.
static const int N = 20;  // amount to grow by
if (vec.capacity() == vec.size()) {
    vec.reserve(vec.size() + N);
}
vec.insert(...);
You can wrap this in a function of your own and call that function instead of calling insert() directly.
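A minimal sketch of such a wrapper (the function name and the fixed growth increment are made up for illustration):

#include <vector>

// Insert at the end, growing the capacity in fixed increments of N
// instead of relying on the implementation's growth factor.
template <typename T>
void insert_with_fixed_growth(std::vector<T>& vec, const T& value,
                              std::size_t N = 20) {
    if (vec.size() == vec.capacity()) {
        vec.reserve(vec.capacity() + N);   // grow by (at least) N extra elements
    }
    vec.push_back(value);                  // guaranteed not to reallocate now
}

Keep in mind that reserve() may allocate more than requested, so the increment is a lower bound, not an exact amount.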

How to make sure that recently created area points to NULL?

How can I make sure that, whenever new space is allocated on the heap, the newly created elements of a vector of pointers point to NULL?
Ex:
vector<Sometype*>

vector   ----------------------
         |  |  |  |   ...  |
         ----------------------

new element is pushed back, but there is no available area, so the space is doubled

index              x  x+1              y
vector   -------------------------------------------
         |  |  |  |  ...  |  |  |  ...  |
         -------------------------------------------
                          ^^^^^^^^^^^^^^^
                          recently created

x, x+1, ..., y should all point to NULL
I want each slot in the recently created part to point to NULL. How can I do that?
This new space is part of the capacity of the vector, but not part of the size. You shouldn't need to care what values it contains, since you're not allowed to access it anyway. Other than the one value you pushed back, the extra space is not "elements of the vector", it's just unused space.
As far as the standard is concerned, the implementation could use it to store something meaningful, if it wanted. For example, an implementation could legally store some eye-catcher value in the unused memory, which conflicts with your desire for the unused memory to contain null pointers.
You could write code like this:
v.push_back(some_value);
if (v.capacity() > v.size()) {
    size_t oldsize = v.size();
    v.resize(v.capacity(), NULL);
    v.resize(oldsize);
}
There's no guarantee this will actually leave the memory set to 0 once you resize back down again, but it probably will. So it might be good enough for debugging. If the purpose you have in mind is not debugging, please say what it is, because if not debugging then either your purpose is illegitimate or else one of us has misunderstood something.
If I correctly understood your question, one straightforward solution is to call resize() yourself passing NULL as second argument to be used as default value for newly created items:
if (v.size() == v.capacity()) // vector is full
{
    // compute the new size
    size_t newSize = 2 * v.size();
    // second argument is the default value for newly added items
    v.resize(newSize, NULL);
}
Why would you need it to be NULL unless it's been constructed? If you create a vector of 10 objects, say, and then push an extra, 11th item onto the vector, then the vector may reserve enough space for another 10 items, but you cannot use those items unless you either push items onto the vector, increasing its size, or you call resize.
size is not the same as capacity
Why would you need that? vector does not allow you to access these elements anyway. The expansion of its capacity is an implementation detail of vector, and the values in the newly allocated space are not relevant. These elements will get overwritten once you push_back something there, or resize the vector with a given value.
There is an important difference between capacity and size of a vector.
When you push new elements into a vector and there is no room, std::vector allocates extra memory for future elements (similar to a reserve() call), but it does not create them (does not call constructors). See placement new to understand how this can work. There's no real way to enforce a certain value for new elements, because there are no new elements -- only a raw memory block allocated for future elements. By using std::vector::at instead of operator[] you can ensure that you're accessing elements within the valid range.
If you resize the vector yourself by calling std::vector::resize, then simply provide the default value for new elements in the 2nd parameter. However, there's a catch: when you resize a std::vector yourself and do not provide a value for the 2nd argument of std::vector::resize, std::vector will value-initialize the new elements, meaning that if the stored type has a constructor it is called, and otherwise the elements are zero-initialized. This means that if you do std::vector<int*> v; v.resize(200);, all new elements of v will be initialized to zero (null). See this answer for details.
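As a small illustration of that last point (a sketch, not from the answers above): resizing a vector of pointers without a second argument already leaves the new elements null, and at() rejects indices beyond size() even while spare capacity exists:

#include <cassert>
#include <stdexcept>
#include <vector>

int main() {
    std::vector<int*> v;
    v.resize(200);                  // new elements are value-initialized: all null
    assert(v[0] == NULL && v[199] == NULL);

    v.resize(100);                  // shrink the size; the capacity stays >= 200
    try {
        v.at(150);                  // past size(), even though the capacity allows it
    } catch (const std::out_of_range&) {
        // at() throws instead of letting you touch the unused space
    }
}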

basic question on std::vector in C++

C++ textbooks, and threads like this one, say that vector elements are physically contiguous in memory.
But when we do operations like v.push_back(3.14) I would assume the STL is using the new operator to get more memory to store the new element 3.14 just introduced into the vector.
Now say the vector of size 4 is stored in computer memory cells labelled 0x7, 0x8, 0x9, 0xA. If cell 0xB contains some other unrelated data, how will 3.14 go into this cell? Does that mean cell 0xB will be copied somewhere else, erased to make room for 3.14?
The short answer is that the entire array holding the vector's data is moved around to a location where it has space to grow. The vector class reserves a larger array than is technically required to hold the number of elements in the vector. For example:
vector< int > vec;
for( int i = 0; i < 100; i++ )
vec.push_back( i );
cout << vec.size(); // prints "100"
cout << vec.capacity(); // prints some value greater than or equal to 100
The capacity() method returns the size of the array that the vector has reserved, while the size() method returns the number of elements in the array which are actually in use. capacity() will always return a number larger than or equal to size(). You can change the size of the backing array by using the reserve() method:
vec.reserve( 400 );
cout << vec.capacity(); // prints a value greater than or equal to 400
Note that size(), capacity(), reserve(), and all related methods are expressed in numbers of elements of the type the vector holds, not in bytes. For example, if vec's type parameter T is a struct that takes 10 bytes of memory, then vec.capacity() returning 400 means that the vector actually has 4000 bytes of memory reserved (400 x 10 = 4000).
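A quick check of that arithmetic (the 10-byte struct is a made-up example):

#include <iostream>
#include <vector>

struct T { char data[10]; };       // hypothetical element type of 10 bytes

int main() {
    std::vector<T> vec;
    vec.reserve(400);
    std::cout << vec.capacity() * sizeof(T) << '\n';   // at least 4000 bytes reserved
}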
So what happens if more elements are added to the vector than it has capacity for? In that case, the vector allocates a new backing array (generally twice the size of the old array), copies the old array to the new array, and then frees the old array. In pseudo-code:
if (capacity() < size() + items_added)
{
    size_t sz = capacity() ? capacity() : 1;  // start from 1 if nothing is reserved yet
    while (sz < size() + items_added)
        sz *= 2;
    T* new_data = new T[sz];
    for (size_t i = 0; i < size(); i++)
        new_data[i] = old_data[i];
    delete[] old_data;
    old_data = new_data;
}
So the entire data store is moved to a new location in memory that has enough space to store the current data plus a number of new elements. Some vector implementations may also dynamically decrease the size of their backing array if they have far more space allocated than is actually required, but std::vector will not shrink its capacity unless you ask it to (for example via the swap trick or, in C++11, shrink_to_fit()).
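To watch this happen, a small sketch (the printed capacities are implementation-defined; common growth factors are 1.5x or 2x):

#include <iostream>
#include <vector>

int main() {
    std::vector<double> v;
    std::size_t last_capacity = v.capacity();
    for (int i = 0; i < 1000; ++i) {
        v.push_back(3.14);
        if (v.capacity() != last_capacity) {      // a reallocation just happened
            last_capacity = v.capacity();
            std::cout << "size " << v.size()
                      << " -> capacity " << v.capacity() << '\n';
        }
    }
}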
std::vector first allocates a bigger buffer, then copies existing elements from the "old" buffer to the "new" buffer, then it deletes the "old buffer", and finally adds the new element into the "new" buffer.
Generally, std::vector implementations grow their internal buffer by doubling the capacity each time it's necessary to allocate a bigger buffer.
As Chris mentioned, every time the buffer grows, all existing iterators are invalidated.
When std::vector allocates memory for the values, it allocates more than it needs; you can find out how much by calling capacity. When that capacity is used up, it allocates a bigger chunk, again larger than it needs, and copies everything from the old memory to the new; then it releases the old memory.
If there is not enough space to add the new element, more space will be allocated (as you correctly indicated), and the old data will be copied to the new location. So cell 0xB will still contain the old value (as it might have pointers to it in other places, it is impossible to move it without causing havoc), but the whole vector in question will be moved to the new location.
A vector is backed by an array of memory. A typical implementation grabs more memory than is required. If that footprint needs to expand over any other memory, the whole lot is copied and the old storage is freed. Note that the vector object itself may live on the stack, but the memory for its elements is allocated dynamically on the heap. It is also a good idea to reserve the maximum size up front if you know it.
In C++, which comes from C, memory is not 'managed' the way you describe - Cell 0x0B's contents will not be moved around. If you did that, any existing pointers would be made invalid! (The only way this could be possible is if the language had no pointers and used only references for similar functionality.)
std::vector allocates a new, larger buffer and stores the value 3.14 at the "end" of the buffer.
Usually, though, to keep push_back fast, a std::vector allocates memory amounting to about twice its current size(). This trades a reasonable amount of memory for performance. So it is not guaranteed that 3.14 will cause a reallocation; if size() < capacity(), it is simply placed into buffer[size()++].