Vector Resize vs Reserve for nested vectors - c++

I'm trying to use another project's code and they have structs of this form:
struct data{
std::vector<sparse_array> cols,rows;
}
struct sparse_array {
std::vector<unsigned int> idxs;
std::vector<double> values;
void add(unsigned int idx, double value) {
idxs.push_back(idx);
values.push_back(value);
}
}
For my code, I tried using the following lines:
data prob;
prob.cols.reserve(num_cols);
prob.rows.reserve(num_rows);
// Some loop that calls
prob.cols[i].add(idx, value);
prob.rows[i].add(idx, value);
And when I output the values, prob.rows[i].value[j] to a file I get all zeros. But when I use resize instead of reserve I get the actual value that I read in. Can someone give me an explanation about this?

Function reserve() simply allocates a contiguous region of memory big enough to hold the number of items you specify and moves the vector's old content into this new block, which makes sure no more reallocations for the vectors' storage will be done upon insertions as long as the specified capacity is not exceeded. This function is used to reduce the number of reallocations (which also invalidate iterators), but does not insert any new items at the end of your vector.
From the C++11 Standard, Paragraph 23.3.6.3/1 about reserve():
A directive that informs a vector of a planned change in size, so that it can manage the storage
allocation accordingly. After reserve(), capacity() is greater or equal to the argument of reserve if reallocation happens; and equal to the previous value of capacity() otherwise. Reallocation happens at this point if and only if the current capacity is less than the argument of reserve(). If an exception is thrown other than by the move constructor of a non-CopyInsertable type, there are no effects.
Notice that by doing prob.cols[i].push_back(idx, value); you are likely to get undefined behavior, since i is probably an out-of-bounds index.
On the other hand, function resize() does insert items at the end of your vector, so that the final size of the vector will be the one you specified (this means it can even erase elements, if you specify a size smaller than the current one). If you specify no second argument to a call to resize(), the newly inserted items will be value-initialized. Otherwise, they will be copy-initialized from the value you provide.
From the C++11 Standard, Paragraph 23.3.6.3/9 about resize():
If sz <= size(), equivalent to erase(begin() + sz, end());. If size() < sz, appends
sz - size() value-initialized elements to the sequence.
So to sum it up, the reason why accessing your vector after invoking resize() gives the expected result is that items are actually being added to the vector. On the other hand, since the call to reserve() does not add any item, subsequent accesses to non-existing elements will give you undefined behavior.

If the vector is empty, then std::vector::resize(n) expands the content of this vector by inserting n new elements at the end. std::vector::reserve(n) only reallocates the memory block that your vector uses for storing its elements so that it's big enough to hold n elements.
Then when you call prob.cols[i], you are trying to access the element at index i. In case you used reserve before, this results in accessing the memory where no element resides yet, which produces the undefined behavior.
So just use resize in this case :)

Related

C++ std::vector clear all elements

Consider a std::vector:
std::vector<int> vec;
vec.push_back(1);
vec.push_back(2);
Would vec.clear() and vec = std::vector<int>() do the same job? What about the deallocation in second case?
vec.clear() clears all elements from the vector, leaving you with a guarantee of vec.size() == 0.
vec = std::vector<int>() calls the copy/move(Since C++11) assignment operator , this replaces the contents of vec with that of other. other in this case is a newly constructed empty vector<int> which means that it's the same effect as vec.clear();. The only difference is that clear() doesn't affect the vector's capacity while re-assigning does, it resets it.
The old elements are deallocated properly just as they would with clear().
Note that vec.clear() is always as fast and without the optimizer doing it's work most likely faster than constructing a new vector and assigning it to vec.
They are different:
clear is guaranteed to not change capacity.
Move assignment is not guaranteed to change capacity to zero, but it may and will in a typical implementation.
The clear guarantee is by this rule:
No reallocation shall take place during insertions that happen after a call to reserve() until the time when an insertion would make the size of the vector greater than the value of capacity()
Post conditions of clear:
Erases all elements in the
container. Post: a.empty()
returns true
Post condition of assignment:
a = rv;
a shall be equal to
the value that rv
had before this
assignment
a = il;
Assigns the range
[il.begin(),il.end()) into a. All existing
elements of a are either assigned to or
destroyed.

Use std::vector::data after reserve

I have a std::vector on which I call reserve with a large value. Afterwards I retrieve data().
Since iterating data is then crashing I am wondering whether this is even allowed. Is reserve forced to update data to the allocated memory range?
The guarantee of reserve is that subsequent insertions do not reallocate, and thus do not cause invalidation. That's it. There are no further guarantees.
Is reserve forced to update data to the allocated memory range?
No. The standard only guarantees that std::vector::data returns a pointer and [data(), data() + size()) is a valid range, the capacity is not concerned.
ยง23.3.11.4/1 vector data
[vector.data]:
Returns: A pointer such that [data(), data() + size()) is a valid
range. For a non-empty vector, data() == addressof(front()).
There is no requirement that data() returns dereferencable pointer for empty (size() == 0) vector, even if it has nonzero capacity. It might return nullptr or some arbitrary value (only requirement in this case is that it should be able to be compared with itself and 0 could be added to it without invoking UB).
I'd say the documentation is pretty clear on this topic: anything after data() + size() may be allocated but not initialized memory: if you want to also initialize this memory you should use vector::resize.
void reserve (size_type n);
Request a change in capacity
Requests that the vector capacity be at least enough to contain n elements.
If n is greater than the current vector capacity, the function causes
the container to reallocate its storage increasing its capacity to n
(or greater).
In all other cases, the function call does not cause a reallocation
and the vector capacity is not affected.
This function has no effect on the vector size and cannot alter its
elements.
I'm not sure why you would want to access anything after data() + size() after reserve() in the first place: the intended use of reserve() is to prevent unnecessary reallocations when you know or can estimate the expected size of your container, but at the same time avoid the unnecessary initializon of memory which may be either inefficient or impractical (e.g. non-trivial data for initialization is not available). In this situation you could replace log(N) reallocations and copies with only 1 improving performance.

Is std::vector::reserve(0); legal?

Is std::vector::reserve(0); legal and what will it do?
There's nothing to prohibit it. The effect of reserve is:
After reserve(), capacity() is greater or equal to the argument of reserve if
reallocation happens; and equal to the previous value of capacity() otherwise. Reallocation happens
at this point if and only if the current capacity is less than the argument of reserve().1
Since the value of capacity() can never be less than 0 (it's unsigned), this can never have any effect; it can never cause a reallocation.
1. c++ standard, [vector.capacity]
Yes, it is a legal no-op.
If new_cap is greater than the current capacity(), new storage is allocated, otherwise the method does nothing.
(Source, emphasis mine.)
Since capacity() will always be >= 0 (due to size_type being unsigned), passing a zero is guaranteed to do nothing.
According to the C++ Standard
After reserve(), capacity() is greater or equal to the argument of
reserve if reallocation happens; and equal to the previous value of
capacity() otherwise. Reallocation happens at this point if and only
if the current capacity is less than the argument of reserve().
So there simply will not be a reallocation if the argument of reserve is equal to 0.
The function itself throws an exception only in one case
Throws: length_error if n > max_size().
Take into account that reserve( 0 ) is not equivalent to resize( 0 ). In the last case all elements of the vector will be removed.
It is legal and will reserve no space. Though if the call is lower than its capacity the call will do nothing.
The documentation provides a clear answer to this:
Increase the capacity of the container to a value that's greater or equal to new_cap. If new_cap is greater than the current capacity(), new storage is allocated, otherwise the method does nothing.
capacity() returns a value that cannot be negative. Hence, passing zero for new_cap always falls into the second category - i.e. when the function does nothing.
void reserve (size_type n);
If n is greater than the current vector capacity, the function causes the container to reallocate its storage increasing its capacity to n (or greater).
In all other cases, the function call does not cause a reallocation and the vector capacity is not affected.
First of all, you should try to understand how Vector works. It is an array that reserve memory in order to use it when you need to store a new value trying to do the insert operation faster and efficient.
With std::vector::reserve() you can determine the amount of memory that you want to reserve, in your case, zero.
In case you want to add another value to your vector and the reserve space is zero, it will work with no problem at all, but the operation will be slower. It could be a problem if you want to do this for a lot of values, but probably you won't notice this if you do it just a few times.

Which is most costly in C++, remove last element or resize of a vector?

Which operation is most costly in C++?
1. Resize of a vector (decrease size by 1)
2. Remove last element in vector
From http://en.cppreference.com/w/cpp/container/vector which essentially quotes the Standard:
void pop_back();
Removes the last element of the container.
No iterators or references except for back() and end() are invalidated.
void resize( size_type count );
Resizes the container to contain count elements. If the current size
is greater than count, the container is reduced to its first count
elements as if by repeatedly calling pop_back().
So in this case, calling resize(size() - 1) should be equivalent to calling pop_back(). However, calling pop_back() is the right thing to do as it expresses your intent.
NOTE: the answer is reflecting the changed interface of C++11's std::vector::resize(), which used to contain a hidden default argument which was being copied around (and which may or may not have been optimized away).
In my opinion they are equivalent. The both operations remove the last element and decrease the size.:)
According to the C++ Standard
void resize(size_type sz); 12 Effects: If sz <= size(), equivalent to
calling pop_back() size() - sz times
So they are simply equivalent according to my opinion and the point of view of the Standard.:)
Also if to consider member function erase instead of pop_back (in fact they do the same in this case) then according to the same Standard
4 Complexity: The destructor of T is called the number of times equal
to the number of the elements erased, but the move assignment
operator of T is called the number of times equal to the number of
elements in the vector after the erased elements.
As there are no move operations for the last element then the cost is the same.

Resizing an "std::vector"; which elements are affected?

std::vector<AClass> vect;
AClass Object0, Object1, Object2, Object3, Object4;
vect.push_back(Object0); // 0th
vect.push_back(Object1); // 1st
vect.push_back(Object2); // 2nd
vect.push_back(Object3); // 3rd
vect.push_back(Object4); // 4th
Question 1 (Shrinking): Is it guarantied that the 0th, 1st and 2nd elements are protected (i.e.; their values do not change) after resizing this vector with this code: vect.resize(3)?
Question 2 (Expanding): After expanded this vector by the code vect.resize(7);
a. Are the first 5 elements (0th through 4th) kept unchanged?
b. What happens to the newly added two elements (5th and 6th)? What are their default values?
Question 1: Yes, the standard says:
void resize(size_type sz);
If sz < size(), equivalent to erase(begin() + sz, end());.
Question 2: If no resizing is required, yes. Otherwise, your elements will be copied to a different place in memory. Their values will remain unchanged, but those values will be stored somewhere else. All iterators, pointers and references to those objects will be invalidated. The default value is AClass().
Question 1:
Yes, from cplusplus.com "If sz is smaller than the current vector size, the content is reduced to its first sz elements, the rest being dropped."
Question 2:
a) The first elements are kept unchanged, the vector just increases the size of it's internal buffer to add the new elements.
b) The default constructor of AClass is called for the insertion of each new element.
vector always grows and shrinks at the end, so if you reduce the size of a vector only the last elements are removed. If you grow a vector with resize, new elements are added onto the end using the a default-constructed object as the value for the new entries. For a class, this is the value of a new object created with the default constructormas. For a primitive, this is zero (or false for bool).
And yes, elements that aren't removed are always protected during a resize.
Yes, when you shrink a vector, all the objects that remain retain their prior values.
When you expand a vector, you supply a parameter specifying a value that will be used to fill the new slots. That parameter defaults to T().