I have a sample program in which I am trying to see how an iterator is invalidated when elements are erased from a map.
The program is here:
#include <iostream>
#include <map>
using namespace std;
int main(int argc, char *argv[])
{
map<int, int> myMap;
myMap.insert(pair<int, int>(0, 2));
myMap.insert(pair<int, int>(1, 4));
myMap.insert(pair<int, int>(3, 18));
myMap.insert(pair<int, int>(2, 20));
map<int, int>::iterator it;
for(it = myMap.begin(); it != myMap.end(); ++it)
{
myMap.erase(it); // erasing the element pointed at by iterator
cout << it->first << endl; // iterator is invalid here
}
return 0;
}
The problem is that the output I get is:
0
1
2
3
Why is the iterator not being invalidated and giving me wrong results? Any help would be highly appreciated.
Documentation of C++ STL maps says that: References and iterators to
the erased elements are invalidated. Other references and iterators
are not affected.
Using an invalidated iterator is undefined behaviour. In such a case, anything could happen.
Why do you see the values? The iterator contains a pointer to some piece of memory. By pure accident, this memory has not yet been returned to the system and has not yet been overwritten, which is why you can still see the already "dead" values.
That does not change anything: it remains undefined behaviour. The next time you run the program, the memory page the map element resided in could already have been returned to the OS, and you would get an access violation (segmentation fault).
An invalidated iterator does not mean that its internal data was erased. Sometimes, as in this case, the invalidated iterator may still appear to lead to the next item. However, using it like this is undefined behaviour and is likely to cause problems in your application.
There are no run-time checks for invalid iterators by default.
You can enable the debug checks for invalid iterators with -D_GLIBCXX_DEBUG for GNU C++ standard library. That produces the following run-time error:
iterator "this" @ 0x0x7fff9f3d7060 {
type = N11__gnu_debug14_Safe_iteratorISt17_Rb_tree_iteratorISt4pairIKiiEENSt7__debug3mapIiiSt4lessIiESaIS4_EEEEE (mutable iterator);
state = singular;
references sequence with type `NSt7__debug3mapIiiSt4lessIiESaISt4pairIKiiEEEE' @ 0x0x7fff9f3d7150
}
For other standard libraries check the documentation.
Related
I am experiencing the following behavior: I create a map, do a find on the key and delete the map entry. After erase, I print the elements using the iterator and I expected it to dump core but it works.
Why does it work?
typedef std::pair<std::string, int> pair;
std::map<pair, int> nameidCntMap;
pair pair1("ABC", 139812);
pair pair2("XYZ", 139915);
pair pair3("PQR", 139098);
nameidCntMap.insert(std::make_pair(pair1, 1));
nameidCntMap.insert(std::make_pair(pair2, 1));
nameidCntMap.insert(std::make_pair(pair3, 1));
std::map<pair, int>::iterator it = nameidCntMap.find(pair1);
if (it != nameidCntMap.end())
{
nameidCntMap.erase(it);
std::cout<<"Pair::first: "<<it->first.first << "Pair::second: "<<it->first.second<<"map second:"<<it->second<<std::endl;
}
Why does it work?
It doesn't "work".
Behaviour of the program is undefined.
expected it to dump core
Your expectation is misguided. The program isn't defined to "dump core" when you indirect through an invalid iterator. No behaviour is defined for such a program. As such, any behaviour is possible. Among all possible behaviours, there is the possibility that the behaviour is what you didn't expect, or what you consider to be "working".
I use poll() with a std::vector.
First I register the listening socket:
std::vector<struct pollfd> fds;
fds.push_back(server_sock);
Then, after poll(), I either add a new client socket or handle a connected client session:
// poll() ...
for(std::vector<struct pollfd>::reverse_iterator it = fds.rbegin(); it != fds.rend(); it++) {
if (it->fd == server_sock) {
struct pollfd newFd;
newFd.fd = newClient;
newFd.events = POLLIN;
fds.push_back(newFd);
} else {
// do something.
}
}
But the reverse_iterator does not work properly when the vector has 1, 2, or 4 elements. I don't understand why it behaves this way.
Sample code is attached:
typedef struct tt_a {
int a;
short b;
short c;
} t_a;
vector<t_a> vec;
for (int i = 0; i < 1; i++) {
t_a t;
t.a = i;
t.b = i;
t.c = i;
vec.push_back(t);
}
for(vector<t_a>::reverse_iterator it = vec.rbegin(); it != vec.rend(); it++) {
if (it->a == 0) {
t_a t;
t.a = 13;
t.b = 13;
t.c = 13;
vec.push_back(t);
}
printf("[&(*it):0x%08X][it->a:%d][&(*vec.rend()):0x%08X]\n",
&(*it), it->a, &(*vec.rend()));
}
printf("---------------------------------------------\n");
for(vector<t_a>::reverse_iterator it = vec.rbegin(); it != vec.rend(); ++it) {
if (it->a == 3) {
it->a = 33;
it->b = 33;
it->c = 33;
}
printf("[&(*it):0x%08X][it->a:%d][&(*vec.rend()):0x%08X]\n",
&(*it), it->a, &(*vec.rend()));
}
result:
[&(*it):0x01ADC010][it->a:0][&(*vec.rend()):0x01ADC028]
[&(*it):0x01ADC008][it->a:33][&(*vec.rend()):0x01ADC028]
[&(*it):0x01ADC000][it->a:0][&(*vec.rend()):0x01ADC048]
If the vector has 5 elements, it works normally:
[&(*it):0x007620A0][it->a:4][&(*vec.rend()):0x00762078]
[&(*it):0x00762098][it->a:3][&(*vec.rend()):0x00762078]
[&(*it):0x00762090][it->a:2][&(*vec.rend()):0x00762078]
[&(*it):0x00762088][it->a:1][&(*vec.rend()):0x00762078]
[&(*it):0x00762080][it->a:0][&(*vec.rend()):0x00762078]
---------------------------------------------
[&(*it):0x007620A8][it->a:13][&(*vec.rend()):0x00762078]
[&(*it):0x007620A0][it->a:4][&(*vec.rend()):0x00762078]
[&(*it):0x00762098][it->a:33][&(*vec.rend()):0x00762078]
[&(*it):0x00762090][it->a:2][&(*vec.rend()):0x00762078]
[&(*it):0x00762088][it->a:1][&(*vec.rend()):0x00762078]
[&(*it):0x00762080][it->a:0][&(*vec.rend()):0x00762078]
push_back invalidates iterators when it causes size to exceed capacity:
If the new size() is greater than capacity() then all iterators and references (including the past-the-end iterator) are invalidated. Otherwise only the past-the-end iterator is invalidated.
Basically, if you must push_back, make sure to reserve ahead of time so you don't invalidate your iterator.
Your program has most likely crashed: you are modifying the container while still iterating over it.
[&(*it):0x01ADC008][it->a:33][&(*vec.rend()):0x01ADC028]
You can see the junk value '33' where it should be '13'.
And why are you even trying to dereference the end iterator?
&(*vec.rend())
This will basically be junk irrespective of the vector size. It's undefined behaviour and will crash your application randomly.
As shadow points out, reserve the vector's capacity before iterating. Still, I am not sure that will fix your code, as your example has other issues that will cause a segfault.
For normal (forward, not reverse) vector iterators, inserting into the vector invalidates any iterators that point anywhere at or after the point of insertion. Furthermore, if the vector must reallocate its storage, all iterators are invalidated.
This alone could explain your problems: because you have not reserved space in your vector (by calling vec.reserve(SIZE)), any of your push_back calls could trigger a reallocation and invalidate your iterators, which results in undefined behaviour when you try to use them afterwards.
However, reverse iterators are more complicated, and the same guarantee does not hold for reverse iterators, and I believe any insertion may invalidate them.
Internally, a reverse iterator holds a forward iterator to the element after the one that it refers to. When dereferenced, the reverse iterator decrements a copy of this forward iterator and returns the dereferenced value. So rbegin() internally holds a copy of end(), and rend() holds a copy of begin(). The above rules for forward iterator invalidation then imply that, at the very least, a reverse iterator is invalidated if an insertion occurs at any point up to one element after the location it refers to. So if you have a reverse iterator referring to index 0 in a length-1 vector, push_back inserts at index 1, which invalidates the iterator. If you then continue to use that iterator (for example by dereferencing it in the subsequent printf call), you have undefined behaviour.
Undefined behaviour means anything could happen, and very commonly different systems will produce different behaviour. Do not assume that just because this code runs as expected on your system with an initial vector size of 5 that it will work on other systems. Any code invoking undefined behaviour is inherently fragile, and should be avoided.
For me (running Visual Studio 2015), I get a crash at the printf line regardless of the size of the vector. If I call vec.reserve(10) to eliminate the resizing-invalidation issue, then it only crashes when vec is initially of length one.
Additionally, you are dereferencing vec.rend() in your printf arguments, which is also undefined behaviour, even if you are just trying to get an address out of it. (I had to comment out this to get your code to run, otherwise it would crash every time even without the push_back call.)
I got terminate called after throwing an instance of 'std::bad_alloc' when trying to push an additional string to a middle of a vector. I used g++ 4.8.2.
I even got output with wrong vector sizes size of str_vector 0: 1, size of str_vector 1: 1 when using g++ 5.2 on coliru.
The program works correctly when I use index (e.g., str_vector[0]) to access vectors or use std::list.
Does this mean there is some restriction on the use of iterators? I assumed that there should not be any difference whether I use an index or an iterator to access vectors.
#include <iostream>
#include <iterator> // std::prev
#include <string>
#include <vector>
using std::vector;
using std::string;
int main() {
vector<vector<string>> str_vector;
str_vector.emplace_back();
vector<vector<string>>::iterator it0 = std::prev(str_vector.end());
it0->push_back("a");
str_vector.emplace_back();
vector<vector<string>>::iterator it1 = std::prev(str_vector.end());
it1->push_back("a");
it0->push_back("a"); // bad alloc here
std::cerr << "size of str_vector 0: " << it0->size() << std::endl;
std::cerr << "size of str_vector 1: " << it1->size() << std::endl;
return 0;
}
When you add elements to a vector, it might need to reallocate its internal memory, which causes all of its iterators to become invalid. So after you do the second emplace_back, the first iterator it0 becomes invalid.
Iterators are nothing but object oriented pointers. Iterator invalidation is a lot like pointer invalidation.
C++ Spec:
vector: all iterators and references before the point of insertion are
unaffected, unless the new container size is greater than the previous
capacity (in which case all iterators and references are invalidated)
[23.2.4.3/1]
The first time you execute the line below, it is valid. The second time you do it with the same iterator, the iterator has already been invalidated:
it0->push_back("a"); // bad alloc here
To learn what iterator invalidation is and how to handle it, see the excellent post "Iterator invalidation rules".
I'm wondering if it's "safe" to set a string equal to whatever is returned by dereferencing the off-the-end iterator of a vector of strings. When I run the program
#include <iostream>
#include <vector>
#include <string>
int main()
{
std::vector<std::string> myVec;
std::cout << *myVec.end();
return 0;
}
I get the following error.
/usr/local/lib/gcc/i686-pc-linux-gnu/4.1.2/../../../../include/c++/4.1.2/debug/safe_iterator.h:181:
error: attempt to dereference a past-the-end iterator.
Objects involved in the operation:
iterator "this" @ 0x0xffdb6088 {
type = N11__gnu_debug14_Safe_iteratorIN9__gnu_cxx17__normal_iteratorIPSsN10__gnu_norm6vectorISsSaISsEEEEEN15__gnu_debug_def6vectorISsS6_EEEE (mutable iterator);
state = past-the-end;
references sequence with type `N15__gnu_debug_def6vectorISsSaISsEEE' @ 0x0xffdb6088
}
Disallowed system call: SYS_kill
You can view it at http://codepad.org/fJA2yM30
The reason I'm wondering about all this is because I have a snippet in my code that is like
std::vector<const std::string>::iterator iter(substrings.begin()), offend(substrings.end());
while (true)
{
this_string = *iter;
if (_programParams.count(this_string) > 0)
{
this_string = *++iter;
and I want to make sure something weird doesn't happen if ++iter is equal to offend.
You said:
I'm wondering if it's "safe" to set a string equal to whatever is returned by dereferencing the off-the-end iterator of a vector of strings
No, it is not safe. From http://en.cppreference.com/w/cpp/container/vector/end
Returns an iterator to the element following the last element of the container.
This element acts as a placeholder; attempting to access it results in undefined behavior.
This sample program gets an iterator to an element of a vector contained in another vector. I add another element to the containing vector and then print out the value of the previously obtained iterator:
#include <vector>
#include <iostream>
int main(int argc, char const *argv[])
{
std::vector<std::vector<int> > foo(3, std::vector<int>(3, 1));
std::vector<int>::iterator foo_it = foo[0].begin();
std::cout << "*foo_it: " << *foo_it << std::endl;
foo.push_back(std::vector<int>(3, 2));
std::cout << "*foo_it: " << *foo_it << std::endl;
return 0;
}
Since the vector corresponding to foo_it has not been modified, I expect the iterator to still be valid. However, when I run this code I get the following output (also on ideone):
*foo_it: 1
*foo_it: 0
For reference, I get this result using g++ versions 4.2 and 4.6 as well as clang 3.1. However, I get the expected output with g++ using -std=c++0x (ideone link) and also with clang when using both -std=c++0x and -stdlib=libc++.
Have I somehow invoked undefined behavior here? If so, is this now defined behavior in C++11? Or is this simply a compiler/standard library bug?
Edit: I can see now that in C++03 the iterators are invalidated since the vector's elements are copied on reallocation. However, I would still like to know if this would be valid in C++11 (i.e. are the vector's elements guaranteed to be moved instead of copied, and will moving a vector not invalidate its iterators).
push_back invalidates iterators, simple as that.
std::vector<int>::iterator foo_it = foo[0].begin();
foo.push_back(std::vector<int>(3, 2));
After this, foo_it is no longer valid. Any insert/push_back has the potential to internally re-allocate the vector.
Since the vector corresponding to foo_it has not been modified
Wrong. The push_back destroyed the vector corresponding to foo_it. foo_it became invalid when foo[0] was destroyed.
I guess the misconception is that vector< vector < int > > is a vector of pointers, so that when the outer one is reallocated, the pointers to the inner ones are still valid, which would be true for int**. Instead, reallocating the outer vector also relocates all inner vector objects (by copying them, in C++03), which makes the inner iterators invalid as well.