Unexpected behavior using iterators with nested vectors - c++

This sample program gets an iterator to an element of a vector contained in another vector. I add another element to the containing vector and then print out the value of the previously obtained iterator:
#include <vector>
#include <iostream>
int main(int argc, char const *argv[])
{
std::vector<std::vector<int> > foo(3, std::vector<int>(3, 1));
std::vector<int>::iterator foo_it = foo[0].begin();
std::cout << "*foo_it: " << *foo_it << std::endl;
foo.push_back(std::vector<int>(3, 2));
std::cout << "*foo_it: " << *foo_it << std::endl;
return 0;
}
Since the vector corresponding to foo_it has not been modified, I expect the iterator to still be valid. However, when I run this code I get the following output (also on ideone):
*foo_it: 1
*foo_it: 0
For reference I get this result using g++ versions 4.2 and 4.6 as well as clang 3.1. However I get the expected output with g++ using -std=c++0x (ideone link) and also with clang when using both -std=c++0x and -stdlib=libc++.
Have I somehow invoked undefined behavior here? If so, is this now defined behavior in C++11? Or is this simply a compiler/standard library bug?
Edit: I can see now that in C++03 the iterators are invalidated, since the vector's elements are copied on reallocation. However, I would still like to know whether this would be valid in C++11 (i.e. are the vector's elements guaranteed to be moved instead of copied, and will moving a vector not invalidate its iterators).

push_back invalidates iterators, simple as that.
std::vector<int>::iterator foo_it = foo[0].begin();
foo.push_back(std::vector<int>(3, 2));
After this, foo_it is no longer valid. Any insert/push_back has the potential to internally reallocate the vector.

Since the vector corresponding to foo_it has not been modified
Wrong. The push_back destroyed the vector corresponding to foo_it. foo_it became invalid when foo[0] was destroyed.

I guess the misconception is that vector<vector<int> > behaves like a vector of pointers, so that the inner vectors stay put when the outer one is reallocated, as would be the case for an int**. Instead, the outer vector stores the inner vector objects directly, so reallocating it (in C++03) copies every inner vector to the new storage and destroys the originals, which invalidates iterators into them as well.

Related

iterator invalidation in map C++

I have a sample program in which I am trying to see how the iterator invalidates while deleting the elements from a map.
The program is here:
#include <iostream>
#include <map>
using namespace std;
int main(int argc, char *argv[])
{
map<int, int> myMap;
myMap.insert(pair<int, int>(0, 2));
myMap.insert(pair<int, int>(1, 4));
myMap.insert(pair<int, int>(3, 18));
myMap.insert(pair<int, int>(2, 20));
map<int, int>::iterator it;
for(it = myMap.begin(); it != myMap.end(); ++it)
{
myMap.erase(it); // erasing the element pointed at by iterator
cout << it->first << endl; // iterator is invalid here
}
return 0;
}
The problem is that the output I am getting is:
0
1
2
3
Why is the iterator not invalidated, and why does it not give me wrong results? Any help would be highly appreciated.
Documentation of C++ STL maps says that: References and iterators to
the erased elements are invalidated. Other references and iterators
are not affected.
Using an invalidated iterator is undefined behaviour. In such case, anything could happen.
Why do you see the values? The iterator contains a pointer to some piece of memory, by pure accident, this memory has not yet been returned to the system and has not yet been overwritten. This is why you still can see the already "dead" values.
That changes nothing: it remains undefined behaviour, and the next time you run the program the memory page the map element resided in could already have been returned to the OS, in which case you get an access violation (segmentation fault).
An invalidated iterator does not mean that its internal data was erased. Sometimes, as in this case, the invalidated iterator may still hold a usable reference to the next item. However, using it like this is Undefined Behavior and is likely to cause problems in your application.
There are no run-time checks for invalid iterators by default.
You can enable the debug checks for invalid iterators with -D_GLIBCXX_DEBUG for GNU C++ standard library. That produces the following run-time error:
iterator "this" @ 0x0x7fff9f3d7060 {
type = N11__gnu_debug14_Safe_iteratorISt17_Rb_tree_iteratorISt4pairIKiiEENSt7__debug3mapIiiSt4lessIiESaIS4_EEEEE (mutable iterator);
state = singular;
references sequence with type `NSt7__debug3mapIiiSt4lessIiESaISt4pairIKiiEEEE' @ 0x0x7fff9f3d7150
}
For other standard libraries check the documentation.

'std::bad_alloc' or even wrong vector size when using iterator on vector

I got terminate called after throwing an instance of 'std::bad_alloc' when trying to push an additional string to a middle of a vector. I used g++ 4.8.2.
I even got output with wrong vector sizes ("size of str_vector 0: 1, size of str_vector 1: 1") when using g++ 5.2 on coliru.
The program works correctly when I use an index (e.g., str_vector[0]) to access the vectors, or when I use std::list.
Does this mean there is some restriction on the use of iterators? I assumed that there should not be any difference between using an index and using an iterator to access vectors.
#include <iostream>
#include <iterator> // std::prev
#include <string>
#include <vector>
using std::vector;
using std::string;
int main() {
vector<vector<string>> str_vector;
str_vector.emplace_back();
vector<vector<string>>::iterator it0 = std::prev(str_vector.end());
it0->push_back("a");
str_vector.emplace_back();
vector<vector<string>>::iterator it1 = std::prev(str_vector.end());
it1->push_back("a");
it0->push_back("a"); // bad alloc here
std::cerr << "size of str_vector 0: " << it0->size() << std::endl;
std::cerr << "size of str_vector 1: " << it1->size() << std::endl;
return 0;
}
When you add elements to a vector it might need to reallocate its internal memory, which causes all of its iterators to become invalid. So after the second emplace_back the first iterator, it0, may be invalid.
Iterators are nothing but object oriented pointers. Iterator invalidation is a lot like pointer invalidation.
C++ Spec:
vector: all iterators and references before the point of insertion are
unaffected, unless the new container size is greater than the previous
capacity (in which case all iterators and references are invalidated)
[23.2.4.3/1]
The first time you execute the line below, it is valid. The second time you execute it with the same iterator, the iterator has already been invalidated:
it0->push_back("a"); // bad alloc here
To learn what iterator invalidation is and how to handle it, see the excellent post "Iterator invalidation rules".

std::transform needs special care with sets

I don't understand why this snippet of code compiles:
#include <set>
#include <list>
#include <algorithm>
int modify(int i)
{
return 2*i;
}
int main (int args, char** argv)
{
std::set<int> a;
a.insert(1);
a.insert(2);
a.insert(3);
std::list<int> b; // change to set here
std::transform(a.begin(), a.end(), b.begin(), modify); // line 19
}
while, if I just change the type of b from std::list<int> to std::set<int> it fails at compilation time (at line 19) with the error: read-only variable is not assignable.
To use b as a set I need to change the transform line to
std::transform(a.begin(), a.end(), std::inserter(b, b.begin()), modify);
Why is that? I somehow guess the reason has to do with the fact that set is an associative container, while list is a sequence container, but I might be completely off the point here.
Edit
I forgot to mention: I tried this on gcc 3.4.2 and llvm 3.3 using the default standard (c++98). I tried again on llvm 3.3 using c++03 and I get the same behavior.
In your code without std::inserter, transform assigns to *b.begin(). In the case of set that's a const reference to an element (since C++11). Hence, a compile-time error.
In the case of the list it still assigns to *b.begin(), which compiles but has undefined behavior because the list has size 0. So b.begin() may not be dereferenced.
You are correct that this is to do with the fact that set is an associative container whereas list is a sequence. Associative containers don't let you modify the part of the element used as a key. In the case of set that part is the element, for map you can modify the value but not the key.
The whole point of std::inserter is to arrange that instead of assigning through an iterator, it calls insert.
First, the code as you have it exhibits undefined behavior, since the target list doesn't actually have space. Use a back_inserter to create space as you go.
As for the set, a set's elements are immutable. This is why you can't assign to a dereferenced iterator, even if you had space. But using the inserter is perfectly fine.
In C++03 this code compiles but results in undefined behavior: you cannot simply change the values in a set, because they must stay in ascending order. In C++11 that was fixed and set::iterator is a bidirectional iterator on const T, so you cannot change its values at all. std::inserter does not change existing values; instead it inserts new values in operator= (its operator* and operator++ do nothing), so everything works.
This call
std::transform(a.begin(), a.end(), b.begin(), modify);
is invalid even for std::list<int> (though it can be compiled for std::list or std::set in C++ 2003 where set has non-const iterator) because you defined an empty list.
std::list<int> b;
A call of this form assumes that the output container already has elements in the range
[b.begin(), std::next(b.begin(), std::distance(a.begin(), a.end())))
because those elements are assigned to.
For a set, this would mean that the set already holds all the required elements, but you may not change them. An iterator adapter such as std::insert_iterator instead adds new elements to the container, so in that case you may use a set.

Order of Vector elements for C++

The following piece of c++ code gives
int main()
{
vector <int> myvect(3,0);
vector <int> :: iterator it;
it = myvect.begin();
myvect.insert(it,200);
myvect.insert(it+5,400); //Not sure what 5 makes the difference here
cout << myvect[0] << endl << myvect[1];
}
Output :
200
400
And the same code with minor changes gives
int main()
{
vector <int> myvect(3,0);
vector <int> :: iterator it;
it = myvect.begin();
myvect.insert(it,200);
myvect.insert(it+4,400); //Not sure what 4 makes the difference here
cout << myvect[0] << endl << myvect[1];
}
Output:
400
200
Can someone tell me why adding 4 or 5 to the iterator changes the order of elements?
Thanks
Your program has Undefined Behavior.
You are creating a vector of 3 elements (all initialized to 0), and you are inserting an element at position myvect.begin() + 5, which is beyond the end of the vector.
Moreover, you are using an iterator (it) after inserting an element before the position it points to. According to Paragraph 23.3.6.5/1 of the C++11 Standard:
[...] If no reallocation happens, all the iterators and references before the insertion point remain valid. [...]
Here it points at the insertion point itself, not before it, so it is not guaranteed to be valid after the statement myvect.insert(it, 200), and using it in the next instruction (myvect.insert(it + 4, 400)) is again Undefined Behavior.
You cannot expect anything of a program with Undefined Behavior. It may crash, give you bizarre results, or (in the worst case) behave just as you would expect.
The member function vector::insert(const_iterator, const value_type&) requires a valid iterator that refers to the vector but it+4 and it+5 are not valid iterators.
Before the first insertion, it+3 is a valid (non-dereferenceable) iterator, pointing just past the end of the vector sequence, but it+4 is invalid. After the insertion, it is invalidated because it refers to the insertion point, so no expression using it is valid, and certainly not it+5, because the sequence only has four elements at that point.
The code would be valid if changed like so:
it = myvect.begin();
myvect.insert(it,200);
it = myvect.begin(); // make it valid again
myvect.insert(it+4,400);

What's this unexpected std::vector behavior?

I found something surprising with std::vector that I thought I'd ask about here to hopefully get some interesting answers.
The code below simply copies a string into a char vector and prints the contents of the vector in two ways.
#include <vector>
#include <string>
#include <iostream>
int main()
{
std::string s("some string");
std::vector<char> v;
v.reserve(s.size()+1);
// copy using index operator
for (std::size_t i=0; i<=s.size(); ++i)
v[i] = s[i];
std::cout << "&v[0]: " << &v[0] << "\n";
std::cout << "begin/end: " << std::string(v.begin(), v.end()) << "\n";
// copy using push_back
for (std::size_t i=0; i<=s.size(); ++i)
v.push_back(s[i]);
std::cout << "&v[0]: " << &v[0] << "\n";
std::cout << "begin/end: " << std::string(v.begin(), v.end()) << "\n";
return 0;
}
Building and running this yields:
$ g++ main.cpp -o v && ./v
&v[0]: some string
begin/end:
&v[0]: some string
begin/end: some string
My expectation was that it would print the string correctly in both cases, but assigning character by character using the index operator doesn't print anything when later using begin() and end() iterators.
Why isn't end() updated when using []? If this is intentional, what's the reason it's working like this?
Is there a reasonable explanation for this behaviour? :)
I've only tried this with gcc 4.6.1 so far.
Typical example of Undefined Behavior.
You are only ever allowed to access elements by index (using operator[]) between 0 and v.size()-1 (inclusive).
Using reserve does not modify the size, only the capacity. Had you used resize instead, it would have worked as expected.
In the first case, you have undefined behaviour. reserve sets the capacity, but leaves the size as zero. Your loop then writes to invalid locations beyond the end of the vector. Printing using the (invalid) pointer appears to work (although there is no guarantee of that), since you've written the string to the memory that it points at; printing using the iterator range prints nothing, because the vector is still empty.
The second loop correctly increases the size each time, so that the vector actually contains the expected contents.
Why isn't end() updated when using []? If this is intentional, what's the reason it's working like this?
[] is intended to be as fast as possible, so it does no range checking. If you want a range check, use at(), which will throw an exception on an out-of-range access. If you want to resize the array, you have to do it yourself.