std::vector::reserve allows random access before push_back - C++

I recently learned about std::vector::reserve online. The websites say that reserving memory inside a std::vector does NOT change the size of the vector, but instead increases the std::vector's capacity. After reserving, attempting to access random elements should crash.
However, when I run this code:
#include <iostream>
#include <vector>
using namespace std;
int main() {
    vector<int> v;
    v.reserve(1000000);
    v[4] = 5;
    cout << v[4] << endl; // this line and the one above should cause errors
    return 0;
}
Nothing happens. The program runs and prints 5 to the screen, and I don't get any errors at all.
I'm not sure if I'm making a mistake here, so can somebody tell me why the above program runs?

After reserving, attempting to access random elements should crash.
No, it would be more correct to say that "after reserving, attempting to access random elements will result in undefined behaviour"(a).
And undefined behaviour means exactly that, undefined. It may work, it may not. It may seem to work but set up conditions for spectacular failure later on in your program. It may not work in another implementation, it may even not work in the same implementation on certain days of the week.
Bottom line, don't do it.
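If the goal is to have a million usable elements up front, a minimal sketch of the two well-defined alternatives (same vector as above, values illustrative) might be:
#include <vector>
int main() {
    // Option 1: resize() creates real, value-initialized elements,
    // so indexing them afterwards is well-defined.
    std::vector<int> v1;
    v1.resize(1000000);
    v1[4] = 5; // fine: v1.size() == 1000000
    // Option 2: reserve() only raises capacity; elements must still
    // be added (e.g. with push_back) before they can be indexed.
    std::vector<int> v2;
    v2.reserve(1000000);
    for (int i = 0; i < 5; ++i)
        v2.push_back(0);
    v2[4] = 5; // fine: v2.size() == 5
    return 0;
}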
(a) Table 69 in ISO C++20 has the two element access operations stating exactly the same thing:
a[n] - returns reference, or const reference for constant a. Semantics: *(a.begin() + n).
a.at(n) - returns reference, or const reference for constant a. Semantics: *(a.begin() + n).
But the note immediately after that clarifies the difference:
The member function at() provides bounds-checked access to container elements. at() throws out_of_range if n >= a.size().
Hence, if you need it to "crash" (quoted since it's really throwing an exception rather than crashing), use the latter.
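For instance, a small sketch of the difference (the exact what() message is implementation-specific):
#include <iostream>
#include <stdexcept>
#include <vector>
int main() {
    std::vector<int> v;
    v.reserve(1000000); // capacity grows, but size() is still 0
    try {
        v.at(4) = 5; // bounds-checked: throws std::out_of_range
    } catch (const std::out_of_range& e) {
        std::cout << "caught: " << e.what() << '\n';
    }
    // v[4] = 5; // unchecked: undefined behaviour
    return 0;
}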

Related

Why does const_casting a heap.top() of priority_queue have undefined behavior?

I've made a simple Huffman encoding program to output individual encodings for characters and save the encoded file. This was for an assignment, and I was told that using const_cast on heap.top() is considered undefined behavior if we heap.pop() afterwards, but I'm not sure I understand why.
I've read the cppreference page on std::pop_heap, which is the underlying function called when we call heap.pop(), and I believe that a nullptr in the comparison is still defined and understood. It didn't seem to function abnormally when I debugged it.
Here's an example
#include <functional>
#include <queue>
#include <vector>
#include <iostream>
#include <memory>
template<typename T> void print_queue_constcast(T& q) {
    while (!q.empty()) {
        auto temp = std::move(const_cast<int&>(q.top()));
        std::cout << temp << " ";
        q.pop();
    }
    std::cout << '\n';
}
template<typename T> void print_queue(T& q) {
    while (!q.empty()) {
        std::cout << q.top() << " ";
        q.pop();
    }
    std::cout << '\n';
}
int main() {
    std::priority_queue<int> q1;
    std::priority_queue<int> q2;
    for (int n : {1,8,5,6,3,4,0,9,7,2}) {
        q1.push(n);
        q2.push(n);
    }
    print_queue(q1);
    print_queue_constcast(q2);
}
Could anyone explain what is actually going on in the background that would be undefined behavior, or that would cause this to fail under certain circumstances?
tl;dr: Maybe; maybe not.
Language-level safety
Like a set, a priority_queue is in charge of ordering its elements. Any modification to an element could potentially "break" that ordering, so the only safe way to modify one would be via the container's own mutating methods (in fact, neither container actually provides such a thing). Directly modifying elements is dangerous, and to enforce this, these containers only expose const access to your elements.
Now, at the language level, the objects won't actually have a static type of const T; most likely they're just Ts. So modifying them (after a const_cast to cheat the type system) doesn't have undefined behaviour in that sense.
Library-level safety
However, you are potentially breaking a condition of using the container. The rules for priority_queue don't ever actually say this, but since its mutating operations are defined in terms of functions like push_heap and pop_heap, your use of such operations will break preconditions of those functions if the container's ordering is no longer satisfied after your direct mutation.
Thus your program will have undefined behaviour if you break the ordering and later mutate the priority_queue in such a way that depends on the ordering being intact. If you don't, technically your program's behaviour is well-defined; however, in general, you'd still be playing with fire. A const_cast should be a measure of last resort.
So, where do we stand?
The question is: did you break the ordering? What's the state of the element after moving from it, and is the ordering satisfied by having an object in that state at the top of the queue?
Your original example uses shared_ptrs, and we know from the documentation that a moved-from shared_ptr turns safely into a null pointer.
The default priority_queue ordering is defined by std::less, which yields a strict total order over raw pointers; std::less on a shared_ptr will actually invoke its base case of operator<, but that in turn is defined to invoke std::less on its raw pointer equivalent.
Unfortunately, that doesn't mean that a null shared_ptr is ordered "first": though std::less's pointer ordering is strict and total, where null pointers land in this ordering is unspecified.
So, it is unspecified as to whether your mutation will break the ordering, and therefore it is unspecified as to whether your pop() will have undefined behaviour.
(The MCVE example with int is safe because std::move on an int has no work to do: it'll just copy the int. So, the ordering remains unaffected.)
Conclusion
I would agree with what was presumably your driving rationale, that it is unfortunate that pop() doesn't return the popped element, which you could then move from. Similar restrictions with sets and maps are why we now have node-splicing features for those containers. There is no such thing for a priority_queue, which is just a wrapper around another container like a vector. If you need more fine-grained control, you can substitute your own underlying container that has the features you need.
Anyway, for the sake of a shared_ptr increment (as in your original code), I'd probably just take the hit of the copy, unless you have some really extreme performance requirements. That way, you know everything will be well-defined.
Certainly, for the sake of an int copy (as in your MCVE), a std::move is entirely pointless (there are no indirect resources to steal!) and you're doing a copy anyway, so the point is rather moot and all you've done is to create more complex code for no reason.
I would also recommend not writing code where you have to ask whether it's well-defined, even if it turns out it is. That's not ideal for readability or maintainability.
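For example, a minimal sketch of the copy-then-pop approach, using shared_ptr as in the original code (values illustrative):
#include <iostream>
#include <memory>
#include <queue>
#include <vector>
int main() {
    std::priority_queue<std::shared_ptr<int>> q;
    q.push(std::make_shared<int>(42));
    q.push(std::make_shared<int>(7));
    while (!q.empty()) {
        // Copy the top element (one reference-count increment for a
        // shared_ptr), leaving the heap ordering intact, then pop.
        std::shared_ptr<int> p = q.top();
        q.pop();
        std::cout << *p << '\n';
    }
    return 0;
}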

Unexpected result when C++ stores an element into a std::vector from the return value of a function

When the function call involves reallocation, I found that some compilers may compute the element's address before the function call. This leads to the return value being stored at an invalid address.
Here is an example demonstrating the behaviour described above.
#include <stdio.h>
#include <vector>
using namespace std;
vector<int> A;
int func() {
    A.push_back(3);
    A.push_back(4);
    return 5;
}
int main() {
    A.reserve(2);
    A.push_back(0);
    A.push_back(1);
    A[1] = func();
    printf("%d\n", A[1]);
    return 0;
}
I tested with some common C++ compilers; the results are as follows.
GCC (GNU Compiler Collection): runtime error, or outputs 1
Clang: outputs 5
VC++: outputs 5
Is it undefined behavior?
The behaviour is undefined in all C++ versions before C++17. The simple reason is that the two sides of the assignment operator can be evaluated in any order:
Assuming A[1] is evaluated first, you get an int& referring to the second element of A at that point.
Then, the func() is evaluated, which can reallocate the storage for the vector, leaving the previously retrieved int& a dangling reference.
Finally, the assignment is performed, writing to the freed storage. Since the standard allocators cache memory, the OS often won't catch this error.
Only in C++17 was a special rule for assignment added:
In every simple assignment expression E1=E2 and every compound assignment expression E1#=E2, every value computation and side effect of E2 is sequenced before every value computation and side effect of E1.
With C++17, A[1] must be evaluated after the call to func(), which then provides defined, reliable behaviour.
If you check the documentation, under "Iterator invalidation", you'll see that push_back() may invalidate every iterator if it changes capacity, since it would have to reallocate memory. Remember that, for a std::vector, pointers and references into its storage are invalidated just like iterators. Because push_back() may or may not reallocate, and you have no way of knowing if it will, the behavior is undefined.
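One way to make the example well-defined under every standard version, as a sketch, is to force the ordering yourself with a temporary:
#include <stdio.h>
#include <vector>
using namespace std;
vector<int> A;
int func() {
    A.push_back(3);
    A.push_back(4);
    return 5;
}
int main() {
    A.reserve(2);
    A.push_back(0);
    A.push_back(1);
    int tmp = func();     // any reallocation happens here...
    A[1] = tmp;           // ...before the element's address is taken
    printf("%d\n", A[1]); // reliably prints 5
    return 0;
}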

container iterators and operations on containers

I'm studying C++ and I'm reading about STL containers, iterators and the operations that can be performed on them.
I know that every container type (or better, the corresponding template of which each type is an instance) defines a companion type that acts like a pointer and is called an iterator. What I understand is that once you get an iterator into a container, performing operations like adding an element may invalidate that iterator, so I tried to test this statement with an example:
#include <vector>
#include <iostream>
using namespace std;
int main()
{
    vector<int> ivec = {1,2,3,4,5,6,7,8,9,0};
    auto beg = ivec.begin();
    auto mid = ivec.begin() + ivec.size()/2;
    while (beg != mid) {
        if (*beg == 2)
            ivec.insert(beg, 0);
        ++beg;
    }
    for (auto i : ivec)
        cout << i << " ";
}
here, I'm simply constructing a vector of ints, brace-initializing it, and performing a condition-based operation, inserting an element in the first half of the container.
The code is flawed, I think, because I'm initializing two iterator objects beg and mid and then using them as the condition of the while statement.
BUT, if the code changes the contents of the container (and it sure does), what happens to the iterators?
The code seems to run just fine; it adds a 0 at the ivec[1] position and prints the result.
What I thought is that the beg iterator would point to the newly added element and that the mid iterator would now point to the element before the one formerly pointed to by mid (it's like the iterators keep pointing to the same memory locations while the underlying array "slides" underneath... unless it's reallocated, that is).
Can someone explain this behaviour to me?
When the standard says iterators are invalidated, this does not guarantee that they will be invalid in the sense of preventing your program from working. Using an invalidated iterator is undefined behavior, which is a huge and important topic in C++. It doesn't mean your program will crash, but it might. Your program might also do something else entirely; the behavior is completely undefined.
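For what it's worth, here is a sketch of one well-defined way to write that loop, tracking positions by index instead of holding iterators across the insert (indices stay meaningful across a reallocation as long as you adjust them for the shift):
#include <iostream>
#include <vector>
using namespace std;
int main()
{
    vector<int> ivec = {1,2,3,4,5,6,7,8,9,0};
    vector<int>::size_type i = 0, mid = ivec.size() / 2;
    while (i != mid) {
        if (ivec[i] == 2) {
            ivec.insert(ivec.begin() + i, 0); // take a fresh iterator each time
            ++i;   // the 2 moved one slot right; step onto it so the
                   // loop's own ++i then moves past it
            ++mid; // the midpoint element also moved one slot right
        }
        ++i;
    }
    for (auto x : ivec)
        cout << x << " ";
}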

Is there a reason for zero sized std::array in C++11?

Consider the following piece of code, which is perfectly acceptable by a C++11 compiler:
#include <array>
#include <iostream>
auto main() -> int {
    std::array<double, 0> A;
    for (auto i : A) std::cout << i << std::endl;
    return 0;
}
According to the standard § 23.3.2.8 [Zero sized arrays]:
1 Array shall provide support for the special case N == 0.
2 In the case that N == 0, begin() == end() == unique value. The return value of data() is unspecified.
3 The effect of calling front() or back() for a zero-sized array is undefined.
4 Member function swap() shall have a noexcept-specification which is equivalent to noexcept(true).
As displayed above, zero-sized std::arrays are perfectly allowable in C++11, in contrast with zero-sized raw arrays (e.g., int A[0];), which are explicitly forbidden yet allowed by some compilers (e.g., GCC) at the cost of undefined behaviour.
Considering this "contradiction", I have the following questions:
Why did the C++ committee decide to allow zero-sized std::arrays?
Are there any valuable uses?
If you have a generic function, it is bad if that function randomly breaks for special parameters. For example, let's say you have a template function that takes N random elements from a vector:
template<typename T, size_t N>
std::array<T, N> choose(const std::vector<T> &v) {
...
}
Nothing is gained if this causes undefined behavior or compiler errors if N for some reason turns out to be zero.
For raw arrays a reason behind the restriction is that you don't want types with sizeof T == 0, this leads to strange effects in combination with pointer arithmetic. An array with zero elements would have size zero, if you don't add any special rules for it.
But std::array<> is a class, and classes always have size > 0. So you don't run into those problems with std::array<>, and a consistent interface without an arbitrary restriction of the template parameter is preferable.
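To illustrate with a minimal sketch (sum here is just an illustrative generic function, not a standard one): the N == 0 case needs no special handling anywhere:
#include <array>
#include <cstddef>
#include <iostream>
template <typename T, std::size_t N>
T sum(const std::array<T, N>& a) {
    T total{};
    for (const T& x : a) // begin() == end() when N == 0: zero iterations
        total += x;
    return total;
}
auto main() -> int {
    std::array<int, 3> a{{1, 2, 3}};
    std::array<int, 0> b{};
    std::cout << sum(a) << std::endl; // 6
    std::cout << sum(b) << std::endl; // 0: no UB, no compile error
    // Unlike a raw int[0], the object itself is well-formed, with size > 0.
    static_assert(sizeof(b) > 0, "complete class types have nonzero size");
    return 0;
}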
One use that I can think of is that returning zero-length arrays is possible, and there is functionality to check for them specifically.
For example see the documentation on the std::array function empty(). It has the following return value:
true if the array size is 0, false otherwise.
http://www.cplusplus.com/reference/array/array/empty/
I think the ability to return and check for zero-length arrays is in line with the behaviour of other STL types, e.g., vectors and maps, and is therefore useful.
As with other container classes, it is useful to be able to have an object that represents an array of things, and for that array to be able to be or become empty. If that were not possible, one would need to create another object, or a managing class, to represent that state in a legal way. Having that ability already built into all container classes is very helpful. In using it, one just needs to be in the habit of treating the array as a container that might be empty, and checking the size or index before referring to a member in cases where it might not point to anything.
There are actually quite a few cases where you want to be able to do this. It's present in a lot of other languages too. For example Java actually has Collections.emptyList() which returns a list which is not only size zero but cannot be expanded or resized or modified.
An example usage might be if you had a class representing a bus and a list of passengers within that class. The list might be lazy initialized, only created when passengers board. If someone calls getPassengers() though then an empty list can be returned rather than creating a new list each time just to report empty.
Returning null would also work for the internal efficiency of the class - but would then make life a lot more complicated for everyone using the class since whenever you call getPassengers() you would need to null check the result. Instead if you get an empty list back then so long as your code doesn't make assumptions that the list is not empty you don't need any special code to handle it being null.
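The same idea in C++ might look like this sketch (the Bus class and its members are illustrative, not from any post above):
#include <iostream>
#include <string>
#include <vector>
class Bus {
    std::vector<std::string> passengers_; // stays empty until someone boards
public:
    void board(std::string name) { passengers_.push_back(std::move(name)); }
    // Returns an empty container rather than a null sentinel, so
    // callers can iterate unconditionally.
    const std::vector<std::string>& getPassengers() const {
        return passengers_;
    }
};
int main() {
    Bus bus;
    // No null check needed: an empty range simply yields zero iterations.
    for (const auto& p : bus.getPassengers())
        std::cout << p << '\n';
    std::cout << bus.getPassengers().size() << " passengers\n"; // 0
    return 0;
}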

Program crashes if I don't define a vector size

Currently learning C++ and stumbled upon this problem
std::vector<int> example;
example[0] = 27;
std::cout << example[0];
This crashes the program, but if I define a size std::vector<int> example(1) it works fine. What I also noticed was that if I use example.push_back(27) instead of example[0] = 27 without defining the size it also works fine. Is there a reasoning behind this?
An empty vector has no elements allocated in memory.
You should use example.push_back(27) instead of trying to subscript it. push_back() allocates a new element and then adds it to the vector. Once that new element is added to the vector, it can be reassigned using example[0] = something.
The reason is very simple: an empty vector has no elements, so you may not use the subscript operator. No memory was allocated and no element was created that you could access with the subscript operator.
As for the method push_back, it adds an element to the vector.
Also, using the constructor the way you wrote
std::vector<int> example(1)
creates a vector with one element.
To be more specific, you're using the default constructor so the vector is not obligated to allocate any memory (because you did not ask it to).
So if you run the following code:
std::vector<int> e;
cout << e.size() << endl;
cout << e.capacity() << endl;
It'll print 0 and 0, so the size is 0.
And as the documentation states:
If the container size is greater than n, the function never throws exceptions (no-throw guarantee). Otherwise, the behavior is undefined.
e.push_back(5);
std::cout << e[0];
Would work.
From: http://www.cplusplus.com/reference/vector/vector/operator[]/
std::vector::operator[]
Access element
Returns a reference to the element at position n in the vector container.
A similar member function, vector::at, has the same behavior as this operator function, except that vector::at is bound-checked and signals if the requested position is out of range by throwing an out_of_range exception.
Portable programs should never call this function with an argument n that is out of range, since this causes undefined behavior.
This can get confusing in STL since std::map, for example:
std::map<int, std::string> myMap;
myMap[0] = "hello";
will create the mapping if one does not already exist.
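To make the contrast concrete, a small sketch comparing the two subscript operators:
#include <iostream>
#include <map>
#include <string>
#include <vector>
int main() {
    std::map<int, std::string> myMap;
    myMap[0] = "hello"; // map's operator[] inserts a default-constructed
                        // mapped value if the key is missing: well-defined
    std::cout << myMap.size() << '\n'; // 1
    std::vector<int> v;
    // v[0] = 27;     // vector's operator[] never inserts: UB on an empty vector
    v.push_back(27);  // grow the vector first...
    std::cout << v[0] << '\n'; // ...then subscripting index 0 is fine: 27
    return 0;
}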