List iterator vs. vector iterator - c++

I have two questions regarding iterators.
I thought that once you define an iterator to an STL container such as a vector or a list, if you add elements to the container then the iterator won't be able to access them. But the following code defines a list of five elements and then adds another element in each loop iteration and results in an infinite loop:
#include <iostream>
#include <list>
using namespace std;
int main()
{
list<int> ls;
for(int i = 0; i < 5; i++)
{
ls.push_back(i);
}
int idx = 0;
for(list<int>::iterator iter = ls.begin(); iter != ls.end(); iter++)
{
cout << "idx: " << idx << ", *iter: " << *iter << endl;
ls.push_back(7);
idx++;
}
}
However, doing the same for a vector results in an error:
#include <iostream>
#include <vector>
using namespace std;
int main()
{
vector<int> vec;
for(int i = 0; i < 5; i++)
{
vec.push_back(i);
}
int idx = 0;
for(vector<int>::iterator iter = vec.begin(); iter != vec.end(); iter++)
{
cout << "idx: " << idx << ", *iter: " << *iter << endl;
vec.push_back(7);
idx++;
}
}
I thought that when the vector container must be resized, it does so at powers of 2 and is located to a new area of memory, which is why you shouldn't keep an iterator to a vector if you are adding elements to it (since the iterators don't get passed to the new memory location). For example, I thought a vector containing 16 elements, after calling the push_back function, will be allocated space for 32 elements and the entire vector will be relocated. However, this didn't happen for the following code. Was I just mistaken?
#include <iostream>
#include <vector>
using namespace std;
int main()
{
vector<int> vec;
for(int i = 0; i < 4; i++)
{
vec.push_back(i);
cout << "address of vec: " << &vec << ", capacity: " << vec.capacity() << endl;
}
for(int i = 0; i < 20; i++)
{
vec.push_back(i);
cout << "address of vec: " << &vec << ", capacity: " << vec.capacity() << endl;
}
}

Different containers' iterators have different properties. Here are the iterator invalidation rules.
The list loop: When you push onto a list, all previous iterators are still valid. You will never hit the end if every time you iterate forward by one you also add a new element, obviously.
The vector loop: For a vector, your iterators are invalid once a push_back results in the new size exceeding the old capacity. As soon as this happens, using iter is undefined behavior (you will likely crash).
I thought that when the vector container must be resized, it does so
at powers of 2 and is located to a new area of memory
This is unspecified by the standard. Some implementations of the C++ standard library double the capacity of a vector when the size exceeds the old capacity, and others grow at different rates.
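To see what your own implementation does, something like the following sketch may help. Note that &vec in the question is the address of the vector object itself, which never changes; it is the element buffer behind it that moves. The helper name capacity_steps is made up for illustration, and the values it reports will differ between standard libraries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: records every new capacity seen while pushing n
// elements. Each change of capacity means the elements were moved to a
// new buffer, which is what invalidates iterators.
std::vector<std::size_t> capacity_steps(std::size_t n)
{
    std::vector<int> vec;
    std::vector<std::size_t> steps;
    std::size_t last = vec.capacity();
    for (std::size_t i = 0; i < n; ++i)
    {
        vec.push_back(static_cast<int>(i));
        if (vec.capacity() != last)
        {
            last = vec.capacity();
            steps.push_back(last);
        }
    }
    return steps;
}
```

Printing the result of capacity_steps(100) shows the growth sequence for the library in use; a doubling sequence is common but not guaranteed.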

The answer to your first question is contained in your second question.
As for the second question, how the vector allocates memory is up to the implementation. It need not double the capacity each time it is exhausted.

The different containers generally have different guarantees with respect to the validity of iterators, and pointers/references to elements:
For std::list<T> the iterators and pointers/references to elements stay valid until the corresponding node gets erased or the std::list<T> is destroyed.
For std::vector<T> the validity is more complicated:
Iterator and pointer/reference validity is identical (and I'll only use iterators below).
All iterators are invalidated when the std::vector<T> needs to reallocate its internal buffer, i.e., when an insertion makes the size exceed the capacity. How much memory is allocated at that point depends on the implementation (the only requirement is that push_back has amortized constant cost, which in practice means the capacity grows geometrically; a factor of 2 is a reasonable choice but there are many others).
When inserting into a std::vector<T> all iterators before the insertion point stay valid unless reallocation is necessary.
When erasing from a std::vector<T> all iterators at and past the erase point are invalidated.
Other containers have yet other validity constraints (e.g. inserting at either end of a std::deque<T> invalidates iterators but keeps pointers/references valid).
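As a small illustration of the first two rules, here is a sketch with two checks: a list iterator that survives many push_back calls, and a vector iterator that stays valid because reserve() guarantees no reallocation happens. The function names are hypothetical.

```cpp
#include <cassert>
#include <list>
#include <vector>

// A list iterator keeps referring to the same node after push_back.
bool list_iterator_survives_push_back()
{
    std::list<int> ls{0, 1, 2};
    auto it = ls.begin();
    const int* address = &*it;
    for (int i = 0; i < 1000; ++i)
    {
        ls.push_back(i);
    }
    return &*it == address && *it == 0;
}

// A vector iterator stays valid as long as size() never exceeds the
// capacity that was reserved up front.
bool vector_iterator_survives_with_reserve()
{
    std::vector<int> v;
    v.reserve(100);
    v.push_back(0);
    auto it = v.begin();
    for (int i = 1; i < 100; ++i)
    {
        v.push_back(i); // size stays <= 100, so no reallocation
    }
    return it == v.begin() && *it == 0;
}
```

Without the reserve() call the second check would have undefined behavior as soon as the vector reallocates.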

Related

Unexpected behavior using `std::count` on `std::vector` of pairs

My goal is to completely remove all elements in a std::vector<std::pair<int, int>> that occur more than once.
The idea was to utilize std::remove with std::count as part of the predicate. My approach looks something like this:
#include <iostream>
#include <vector>
#include <algorithm>
using std::cout;
using std::endl;
using i_pair = std::pair<int, int>;
int main()
{
std::vector<i_pair> vec;
vec.push_back(i_pair(0,0)); // Expected to stay
vec.push_back(i_pair(0,1)); // Expected to go
vec.push_back(i_pair(1,1)); // Expected to stay
vec.push_back(i_pair(0,1)); // Expected to go
auto predicate = [&](i_pair& p)
{
return std::count(vec.begin(), vec.end(), p) > 1;
};
auto it = std::remove_if(vec.begin(), vec.end(), predicate);
cout << "Reordered vector:" << endl;
for(auto& e : vec)
{
cout << e.first << " " << e.second << endl;
}
cout << endl;
cout << "Number of elements that would be erased: " << (vec.end() - it) << endl;
return 0;
}
The vector gets reordered with both of the (0,1) elements pushed to the end, however the iterator returned by std::remove_if points at the last element. This means that a subsequent erase operation would only get rid of one (0,1) element.
Why is this behavior occurring and how can I delete all elements that occur more than once?
Your biggest problem is that std::remove_if gives very few guarantees about the contents of the vector while it is running.
It only guarantees that, once it finishes, the range from begin() to the returned iterator contains the elements that were not removed, and that the range from there to end() holds elements with unspecified values.
Meanwhile, you are iterating over the container in the middle of this operation.
It is more likely that std::partition would work, as it guarantees (when done) that the elements you are "removing" are actually stored at the end.
An even safer approach would be to make a std::unordered_map<std::pair<int,int>, std::size_t> and count in one pass, then in a second pass remove everything whose count is at least 2. This is also O(n) instead of your algorithm's O(n^2), so it should be faster.
std::unordered_map<i_pair,std::size_t, pair_hasher> counts;
counts.reserve(vec.size()); // no more than this
for (auto&& elem:vec) {
++counts[elem];
}
vec.erase(std::remove_if(begin(vec), end(vec), [&](auto&&elem){return counts[elem]>1;}), end(vec));
You have to write your own pair_hasher. If you are willing to accept O(n log n) performance, you could do
std::map<i_pair,std::size_t> counts;
for (auto&& elem:vec) {
++counts[elem];
}
vec.erase(std::remove_if(begin(vec), end(vec), [&](auto&&elem){return counts[elem]>1;}), end(vec));
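For the unordered_map version, a pair_hasher could look roughly like the sketch below. The hash combination used here is just one common choice and is an assumption, not something prescribed by the standard library; remove_all_duplicated is a hypothetical wrapper name around the counting approach.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

using i_pair = std::pair<int, int>;

// One possible hasher for pairs of ints: hash both members and mix them.
struct pair_hasher
{
    std::size_t operator()(const i_pair& p) const noexcept
    {
        std::size_t h1 = std::hash<int>{}(p.first);
        std::size_t h2 = std::hash<int>{}(p.second);
        return h1 ^ (h2 + 0x9e3779b9u + (h1 << 6) + (h1 >> 2));
    }
};

// Removes every element that occurs more than once, keeping the rest in order.
void remove_all_duplicated(std::vector<i_pair>& vec)
{
    std::unordered_map<i_pair, std::size_t, pair_hasher> counts;
    counts.reserve(vec.size());
    for (const auto& elem : vec)
    {
        ++counts[elem]; // first pass: count occurrences
    }
    // Second pass: the predicate only reads the map, never the vector.
    vec.erase(std::remove_if(vec.begin(), vec.end(),
                             [&counts](const i_pair& elem) { return counts[elem] > 1; }),
              vec.end());
}
```

Because the predicate looks only at the map, it does not matter what state the vector is in while std::remove_if is shuffling elements around.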

Error during the usage of size() function in vectors

So I've started learning vectors for the first time and wrote a simple program which goes like this:
#include <iostream>
#include <vector>
using namespace std;
int main()
{
vector<int> g1;
int n;
cout<<"enter values"<<endl;
do
{
cin>>n;
g1.push_back(n);
} while (n);
cout<<"Vector values are: "<<endl;
for(auto i=g1.begin(); i<g1.size();i++)
cout<<*i<<endl;
}
When I try executing it, an error shows up saying "type mismatch" at the g1.size() part. Why exactly does this happen? I used the auto keyword for the iterator involved and assumed there wouldn't be any problem?
That is the downside of using auto. If you have no idea what type auto deduces, you have no idea why it is something totally different from what you expect!
std::vector::begin returns a std::vector::iterator, and you can't compare it against a size_type value, which is what std::vector::size returns. This type is typically std::size_t.
You have to compare against another iterator which is the representation of the end of the vector like:
for(auto i = g1.begin(); i != g1.end(); i++)
There are at least three ways to iterate through the contents of a vector.
You can use an index:
for (std::size_t i = 0; i < vec.size(); ++i)
std::cout << vec[i] << '\n';
You can use iterators:
for (auto it = vec.begin(); it != vec.end(); ++it)
std::cout << *it << '\n';
You can use a range-based for loop:
for (auto val : vec)
std::cout << val << '\n';
The latter two can be used with any container.
g1.begin() returns an iterator to the 1st element, whereas g1.size() returns the number of elements. You can't compare an iterator to a size, which is why you are getting the error. It has nothing to do with your use of auto, it has to do with you comparing 2 different things that are unrelated to each other.
You need to change your loop to compare your i iterator to the vector's end() iterator, eg:
for(auto i = g1.begin(); i != g1.end(); ++i)
cout << *i << endl;
Or, simply use a range-based for loop instead, which uses iterators internally:
for(auto i : g1)
cout << i << endl;
Otherwise, if you want to use size() then use indexes with the vector's operator[], instead of using iterators, eg:
for(size_t i = 0; i < g1.size(); ++i)
cout << g1[i] << endl;

C++: Segmentation fault while iterating over vector of pointers while push_back

I want to iterate through a vector of pointers to objects. While iterating, I have to push_back new pointers to the vector. Before the loop, the number of push_backs is unknown and there is no abort criterion, so I can't use a while loop.
Here is an example using pointers to integers that shows the same error as the version with objects: Segmentation fault (core dumped) after one iteration.
vector<int*> vec;
int a = 43;
vec.push_back(&a);
for (vector<int*>::iterator it = vec.begin(); it != vec.end(); ++it) {
cout << *(*it) << " " << *it << endl;
vec.push_back(&a);
}
The same code but with integers works great.
vector <int>vec;
int a = 43;
vec.push_back (a);
for (vector < int >::iterator it = vec.begin (); it != vec.end (); ++it){
cout << (*it) << " " << *it << endl;
vec.push_back (a);
}
push_back invalidates the iterator when appending makes size() exceed capacity(), because the vector then reallocates and copies its elements to the new storage.
Appends the given element value to the end of the container.
1) The new element is initialized as a copy of value.
2) value is moved into the new element.
If the new size() is greater than capacity() then all iterators and
references (including the past-the-end iterator) are invalidated.
Otherwise only the past-the-end iterator is invalidated.
Plus, as @Jesper pointed out, you are storing a pointer to a local variable in your vector:
int a = 43;
vec.push_back(&a);
If a goes out of scope before your vector does, the vector will hold dangling pointers.
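If you really need to append while walking the vector, one way that avoids the invalidation problem is to loop by index instead of by iterator, since an index stays meaningful across a reallocation. The sketch below assumes some upper bound on the size (the limit parameter) so that the loop terminates; the function name and that parameter are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Visits every element by index, appending `extra` after each visit until
// the vector holds `limit` elements. Returns the sum of the visited values.
int visit_and_grow(std::vector<int*>& vec, int* extra, std::size_t limit)
{
    int sum = 0;
    // vec.size() is re-read on every iteration, so appended elements are visited too.
    for (std::size_t i = 0; i < vec.size(); ++i)
    {
        int* current = vec[i]; // copy the pointer out before push_back
        sum += *current;
        if (vec.size() < limit)
        {
            vec.push_back(extra); // may reallocate; the index i is unaffected
        }
    }
    return sum;
}
```

The element is copied into a local before push_back on purpose: a reference such as int*& into the vector would dangle after a reallocation, just like an iterator.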

Why does insert invalidate the std::set reverse iterator

My understanding is the iterators of associative containers are not invalidated during insert or erase (unless the node pointed by iterator is erased). But in the below program
the insert seems to invalidate the iterator. Is my understanding wrong?
typedef std::set<unsigned int> myset_t;
int main(int argc, char **argv)
{
myset_t rs;
myset_t::reverse_iterator rit;
myset_t::reverse_iterator srit;
int ii = 500;
rs.insert(10);
rs.insert(11);
rs.insert(12);
rs.insert(13);
rs.insert(14);
rs.insert(100000);
rs.insert(102000);
rs.insert(103000);
rit = rs.rbegin();
while(rit != rs.rend()) {
srit = rit;
if (*rit < 100000) {
cout << "bailing here " << *rit << endl;
return 0;
}
rit++;
cout << "Before erase " << *rit << endl;
rs.erase(*srit);
cout << "Before insert " << *rit << endl;
rs.insert(ii);
cout << "After insert " << *rit << endl;
ii++;
}
cout << "Out of loop" << endl;
}
===
The output is
Before erase 102000
Before insert 102000
After insert 14
bailing here 14
=====
The promised behavior for iterators of a standard container does not hold for reverse iterators of that container.
A reverse iterator actually stores, as a member, the normal (forward moving) iterator which comes after the element to which the reverse iterator refers when dereferenced. Then when you dereference the reverse iterator, essentially it decrements a copy of this stored normal iterator and dereferences that. So this is a problem:
rit = rs.rbegin(); // rit stores rs.end()
srit = rit; // srit also stores rs.end()
rit++; // rit stores a normal iterator pointing to the last element
rs.erase(*srit); // this deletes the last element, invalidating the normal
// iterator which is stored in rit. Funnily enough, the
// one stored in srit remains valid, but now *srit is a
// different value
Reverse iterators behave this way because there is no "before begin" iterator. If they stored the iterator to the element to which they actually refer, what would rs.rend() store? I'm sure there are ways around this, but I guess they required compromises which the standards committee was not willing to make. Or perhaps they never considered this problem, or didn't consider it significant enough.
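One commonly used workaround is to erase through the underlying forward iterator and rebuild the reverse iterator from what erase returns. This is only a sketch; erase_at is a hypothetical helper name, and it relies on std::set::erase returning the iterator that follows the erased element (C++11 and later).

```cpp
#include <cassert>
#include <iterator>
#include <set>

using myset_t = std::set<unsigned int>;

// Erases the element that rit refers to and returns a reverse iterator to
// the next element in reverse order (or rend() if there is none).
myset_t::reverse_iterator erase_at(myset_t& s, myset_t::reverse_iterator rit)
{
    // std::next(rit).base() is the forward iterator to the very element
    // that rit dereferences to.
    myset_t::iterator fwd = std::next(rit).base();
    // erase returns the forward iterator after the erased element; wrapping
    // it yields a reverse iterator to the element before the erased one.
    return myset_t::reverse_iterator(s.erase(fwd));
}
```

In the loop from the question this would replace the rit++ followed by rs.erase(*srit) with a single rit = erase_at(rs, rit);.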

iterating a vector how to check at which position I am?

Example:
for (vector<string>::reverse_iterator it = myV.rbegin(); it != myV.rend(); ++it)
{
cout << "current value is: " << *it << ", and current position is: " << /* */ << endl;
}
I know I could check how many items there are in the vector, make a counter, and so on. But I wonder if there is a more direct way of checking current index without asserting that I got the length of the vector right.
Vector iterators support difference, so you can subtract rbegin() from your current iterator it.
EDIT
As noted in a comment, not all iterators support operator-, so std::distance would have to be used. However, I would not recommend this, as std::distance incurs a linear-time cost for iterators that are not random access, whereas if you use it - begin() the compiler will tell you when that won't work, and then you can use distance if you must.
Subtract std::vector<T>::begin() (or rbegin() in your case) from the current iterator. Here's a small example:
#include <vector>
#include <iostream>
int main()
{
std::vector<int> x;
x.push_back(1);
x.push_back(1);
x.push_back(3);
std::cout << "Elements: " << x.end() - x.begin() << '\n';
std::cout << "R-Elements: " << x.rend() - x.rbegin() << '\n';
return 0;
}
As pointed out in a really great comment above, std::distance may be an even better choice. std::distance supports random access iterators in constant time, but also supports other categories of iterators in linear time.
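Keep in mind that subtracting rbegin() gives the position counted from the back. If you want the ordinary forward index of the element a reverse iterator refers to, a sketch along these lines should work (forward_index is a hypothetical helper name):

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Forward (0-based) index of the element that rit refers to.
// rit must be dereferenceable, i.e. not equal to v.rend().
std::size_t forward_index(const std::vector<std::string>& v,
                          std::vector<std::string>::const_reverse_iterator rit)
{
    // The distance from rit to rend() counts the element itself plus
    // everything in front of it, hence the -1.
    return static_cast<std::size_t>(std::distance(rit, v.rend())) - 1;
}
```

Inside the loop from the question, forward_index(myV, it) would then give the position to print.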
Iterators allow generic algorithms to be written that are invariant to the choice of container. I've read in the STL book that this is great, but may lead to a performance drop because sometimes the member functions of a container are optimized for the container and will run faster than generic code that relies on iterators. In this case, if you are dealing with a large vector, you will be calling std::distance, which although constant time is not necessary. If you know that you will be using only vector for this algorithm, you may recognize that it supports the direct access operator "[]" and write something like this:
#include <vector>
#include <iostream>
using namespace std;
int main ()
{
vector<int> myV;
for (int I = 0; I < 100; ++I)
{
myV.push_back(I);
}
for (int I = 0; I < myV.size(); ++I)
{
cout << "current value is: " << myV[I]
<< ", and current position is: " << I << endl;
}
return 0;
}
In case you are interested in speed, you can always try the different answers proposed here and measure the execution time. It will depend on the vector size probably.
Keep a counter:
int pos = static_cast<int>(myV.size()) - 1; // index of the last element
for (vector<string>::reverse_iterator it = myV.rbegin(); it != myV.rend(); ++it, --pos)
{
cout << "current value is: " << *it << ", and current position is: " << pos << endl;
}