possible inconsistency in std::vector while erasing an element [duplicate] - c++

This question already has answers here:
How to delete an element from a vector while looping over it?
(6 answers)
Closed 9 years ago.
While debugging a vector, I see an inconsistency. Consider the following code, which tries to remove an entry from a vector that has only one element:
#include <iostream>
#include <vector>
std::vector<int> v;
void myremove(int);
int main()
{
v.push_back(10);
std::cout << "10 pushed back\n";
myremove(10);
std::cout << "done :)\n";
return 0;
}
void myremove( int a )
{
std::vector<int>::iterator it = v.begin();
int counter = 0;
for ( ; it != v.end(); it++ ) {
std::cout << "iterating for " << counter << " times and vector size is " << v.size() << "\n";
if ( a == (*it) ) {
v.erase(it);
std::cout << "removed " << a << "\n";
}
++counter;
}
}
This is what I see in the output:
$ g++ test.cpp
$ ./a.out | more
10 pushed back
iterating for 0 times and vector size is 1
removed 10
iterating for 1 times and vector size is 0
iterating for 2 times and vector size is 0
iterating for 3 times and vector size is 0
iterating for 4 times and vector size is 0
iterating for 5 times and vector size is 0
iterating for 6 times and vector size is 0
....
....
iterating for 33790 times and vector size is 0
Segmentation fault
What I understand is that when the element is removed the size becomes 0. However, the iterator still moves one step forward and keeps trying to reach the end, but it doesn't know that it has already passed the end point.
Can someone explain more what is going on and how to avoid that?

After the call to erase() the iterator it is invalidated:
Iterators and references to the erased elements and to the elements between them and the end of the container are invalidated.
Set it to the return value of erase() instead and only increment if no removal occurred:
while (it != v.end())
{
if ( a == (*it) )
{
it = v.erase(it);
std::cout << "removed " << a << "\n";
}
else
{
++it;
}
}
where the return value of erase() is:
iterator following the last removed element.
Instead of hand coding a loop to erase elements you could use std::remove_if() instead:
v.erase(std::remove_if(v.begin(),
v.end(),
[](const int i) { return i == 10; }),
v.end());
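For reference, both approaches can be wrapped in small functions. This is only a minimal sketch: the names remove_with_loop and remove_with_idiom are hypothetical and do not come from the question, and the element type is assumed to be int as in the original code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: erase every element equal to `a` with an explicit loop.
// The iterator is reassigned from erase(), so it is never used after invalidation.
inline std::size_t remove_with_loop(std::vector<int>& v, int a)
{
    std::size_t removed = 0;
    for (auto it = v.begin(); it != v.end(); ) {
        if (*it == a) {
            it = v.erase(it);   // erase() returns the iterator after the removed element
            ++removed;
        } else {
            ++it;               // advance only when nothing was erased
        }
    }
    assert(std::find(v.begin(), v.end(), a) == v.end());  // postcondition: no `a` left
    return removed;
}

// Hypothetical helper: the same result with the erase-remove idiom.
inline std::size_t remove_with_idiom(std::vector<int>& v, int a)
{
    const auto new_end = std::remove(v.begin(), v.end(), a);  // compacts the kept elements
    const auto removed = static_cast<std::size_t>(v.end() - new_end);
    v.erase(new_end, v.end());                                // drops the leftover tail
    return removed;
}
```

Calling remove_with_loop on the one-element vector from the question leaves it empty and terminates normally, because the loop condition is checked against the iterator returned by erase().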

When you erase
v.erase(it);
your iterator is not valid anymore. You have to use the returned iterator from erase. Erase gives you an iterator pointing to the element that followed the element erased by the call. And you have to break the loop if you erased the last element before the loop increments it.
it = v.erase(it);
if(it == v.end())
break;
Suggestion: change the for loop to a while loop and increment the iterator explicitly, i.e. only when you have not erased anything. (If you have erased, the returned iterator already points at the next element.)
Like this
while(it != v.end()) {
if ( a == (*it) )
it = v.erase(it);
else
++it;
}

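For std::vector, erase invalidates every iterator at or after the erased position, and insert invalidates all iterators if it causes a reallocation (otherwise those at or after the insertion point). The simplest safe rule is to continue from the iterator that the method returns.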

In the documentation of std::vector::erase :
Iterator validity
Iterators, pointers and references pointing to position (or first) and beyond are invalidated, with all iterators, pointers and references to elements before position (or first) are guaranteed to keep referring to the same elements they were referring to before the call.
Erasing an element inside a loop that depends on that same iterator is what makes everything go berserk. That's pretty much it!
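The other half of that rule, that iterators before the erased position stay usable, can be illustrated with a small sketch. The function name read_first_after_erasing_last is hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: erase the last element, then read through an iterator
// that was taken to the first element before the erase.
inline int read_first_after_erasing_last(std::vector<int>& v)
{
    assert(v.size() >= 2);        // need a first element distinct from the erased one
    const auto first = v.begin(); // before the erase position: stays valid
    v.erase(v.end() - 1);         // invalidates only iterators at or after the last element
    return *first;                // still refers to the original first element
}
```

Had the iterator pointed at or after the erased element instead, reading through it would be undefined behavior.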

When you call erase, the iterator is no longer valid, and incrementing an invalid iterator is undefined behavior, so you get spurious results.

The mistake is that you expect, after doing an erase on an iterator, that this iterator will still be in a consistent state. This is not the case, and your code illustrates precisely the situation when this occurs.
The semantics of your function are to remove all the elements of the vector equal to a. You can achieve the same result by filtering the vector. See this question:
How to make std::vector from other vector with specific filter?

Related

How come my vector array won't output anything after I erase an element?

Recently I've started learning C++, and every day I do a C++ practice exercise to understand the language better. Today I was learning about vectors and I hit a roadblock.
I'm trying to make a simple program that takes an array, puts it into a vector, then removes all the odd numbers. But for some reason when I erase an element from the vector, and output the modified vector, it doesn't output anything.
If somebody could guide me to the right direction on what I'm doing wrong, that would be great!
remove.cpp
#include <iostream>
#include <vector>
using namespace std;
class removeOddIntegers {
public:
void removeOdd(int numbs[]) {
vector<int> removedOdds;
for(int i = 0; i < 10; ++i) {
removedOdds.push_back(numbs[i]);
}
for(auto i = removedOdds.begin(); i != removedOdds.end(); ++i) {
if(*i % 2 == 1) {
removedOdds.erase(removedOdds.begin() + *i);
std::cout << "Removed: " << *i << endl;
}
}
for(auto i = removedOdds.begin(); i != removedOdds.end(); ++i) {
std::cout << *i << endl; //doesn't output anything.
}
}
};
main.cpp
#include <iostream>
#include "remove.cpp"
using namespace std;
int main() {
removeOddIntegers r;
int numbers[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
r.removeOdd(numbers);
return 0;
}
Now, I understand that I could just filter through the array, and only push_back the even numbers to the vector, and quite frankly, that works like a charm. But I want to understand why my method doesn't work. How come when I remove an element from the vector, it just fails to output anything?
Thanks in advance!
There are a few things wrong, but they mostly boil down to the same fundamental issue. You are violating iterator guarantees of std::vector::erase:
Invalidates iterators and references at or after the point of the erase, including the end() iterator.
You do this both when dereferencing the deleted iterator to display your "removed" message, and also when calling ++i for the loop.
In addition, your call removedOdds.erase(removedOdds.begin() + *i); is wrong, because it's using the actual value in the vector as an offset from the beginning. That assumption is completely wrong.
The proper way to erase an element at an iterator and retain a valid iterator is:
i = removedOdds.erase(i);
Here is your loop with minimum changes required to fix it:
for (auto i = removedOdds.begin(); i != removedOdds.end(); ) {
if (*i % 2 == 1) {
std::cout << "Removed: " << *i << endl;
i = removedOdds.erase(i);
} else {
++i;
}
}
Notice how the iterator is advanced in the body of the loop now. You can do a thought experiment to think about why. Or you can try doing it the wrong way and use an input like { 1, 3, 5, 7, 9 } to demonstrate the problem.
This is still not the idiomatic way to remove elements from a vector. As you alluded to, elements should be swapped to the end of the vector. The reason for this is that std::vector::erase is a linear operation that must shuffle the entire remainder of the vector. If you do this multiple times, you essentially have O(N^2) time complexity.
The recommended approach is to use std::remove_if:
removedOdds.erase(std::remove_if(removedOdds.begin(), removedOdds.end(),
[](int n) { return n % 2 == 1; }),
removedOdds.end());
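As a self-contained sketch of the erase-remove idiom: the iterator returned by std::remove_if is the first argument to erase, and end() is the second. The function name remove_odds is hypothetical. The predicate here uses n % 2 != 0 rather than n % 2 == 1, because the remainder of a negative odd number is -1 in C++.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: remove every odd value in one linear pass.
inline void remove_odds(std::vector<int>& v)
{
    v.erase(std::remove_if(v.begin(), v.end(),
                           [](int n) { return n % 2 != 0; }),  // also catches negative odds
            v.end());
}
```

std::remove_if moves the kept elements to the front and returns the start of the leftover tail, so erase is called only once no matter how many elements are removed.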
The flaw in the shown algorithm is more easily observed with a much simpler example:
for(int i = 0; i < 2; ++i) {
removedOdds.push_back(numbs[i]);
}
Suppose numbs holds just two values, 0 and 1, so that this initializes the vector with them. This is small enough to be able to follow along in your head, as you mentally execute the shown code:
for(auto i = removedOdds.begin(); i != removedOdds.end(); ++i) {
Nothing interesting will happen with the first value, 0, that gets iterated here. ++i increments the iterator to point to the value 1, then:
if(*i % 2 == 1) {
removedOdds.erase(removedOdds.begin() + *i);
std::cout << "Removed: " << *i << endl;
}
This time erase() removes 1 from the vector. But that's what i is also pointing to, of course. Then, if you look in your C++ reference, you will discover that std::vector::erase:
invalidates iterators and references at or after the point of the erase,
including the end() iterator.
i is now "at the point of the erase", therefore, i is no longer a valid iterator, any more. Any subsequent use of i becomes undefined behavior.
And, i immediately gets used, namely incremented in the for loop iteration expression. That's your undefined behavior.
With the original vector containing values 1 through 10: if you use your debugger it will show all sorts of interesting kinds of undefined behavior. You can use your debugger to see if the shown code ever manages to survive when it encounters a higher odd value, like 7 or 9. If it does, at that point the vector will be much smaller, but removedOdds.erase(removedOdds.begin() + *i); will now attempt to remove the element at offset 7 or 9 in a vector that's about half its original size, a completely non-existent element, with much hilarity ensuing.
To summarize: your "method doesn't work" because the algorithm is fundamentally flawed in multiple ways, and the reason you get "no output" is because the program crashes.

map find() function if the wanted key is in the last position

I am a C++ beginner. I know that find() is used to search for a certain key. This function returns an iterator to the element if the element is found; otherwise it returns map.end(), an iterator pointing to the end position of the map.
I read from websites that
if(it == mp.end())
cout << "Key-value pair not present in map" ;
else
cout << "Key-value pair present : ";
What if the found key is in the end position? How can it still work? Keys are sorted in a certain order, and I think the iterator traverses the sorted keys to find the one we want. (Is that correct?)
In the standard library, every .end() refers to a position one past the last valid element, so end() itself never refers to a valid element.
int arr[10];
// arr has valid indices 0,1,2,3,...,7,8,9
// arr[10] is not valid.
for( auto i = 0; i < 10; i++ ){
}
std::vector<int> vec;
vec.resize( 10 );
// vec.end() is equivalent to arr[10] - not part of the vector
for( auto it = vec.begin(); it != vec.end(); it++ ) {
}
So let's rewrite the array loop in the vector idiom:
for( auto i = 0; &arr[i] != &arr[10]; i++ ){
}
Maps are more complicated and use a different guard mechanism, but an iterator equal to end() never refers to a valid element.
According to cplusplus.com , .end() Returns an iterator pointing to the past-the-end element in the sequence:
It does not point to an element in the container ( the map in your case ), rather points outside of it.
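A minimal sketch of the lookup, assuming a std::map<int, std::string>. The helper name value_or and its fallback parameter are hypothetical. Looking up the largest key returns an iterator to the last element, which still compares unequal to end().

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: return the value stored under `key`, or `fallback` if absent.
inline std::string value_or(const std::map<int, std::string>& m, int key,
                            const std::string& fallback)
{
    const auto it = m.find(key);
    if (it == m.end())        // end() is one past the last element, never an element itself
        return fallback;
    return it->second;        // found, even when `key` is the last key in sort order
}
```

With keys 1, 2 and 3 stored, value_or(m, 3, "missing") returns the value for 3, and value_or(m, 4, "missing") returns "missing".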

Keeping values of vector [duplicate]

This question already has answers here:
How to keep only duplicates efficiently?
(10 answers)
Closed 2 years ago.
I'm having trouble writing a program that keeps only duplicates. Here is what I have written so far:
#include <iostream>
#include <vector>
#include <algorithm>
int main()
{
std::vector<int> v;
for (int n; std::cin >> n;)
{v.push_back(n); }
std::sort(v.begin(),v.end());
for(std::vector<int>::iterator b = v.begin();b<v.end();++b)
{
if(*b != *(b+1) )
{
v.erase(b);
}
}
for(std::vector<int>::iterator i = v.begin();i < v.end();++i)
{
std::cout<<*i<<" ";
}
}
What I mean by "keep duplicates" is for example
Input: 13 7 2 13 5 2 1 13
Output : 2 13
I apologise if the code is not perfect; I'm a complete beginner. I hope you understand my difficulties.
When you erase() from a vector, all iterators pointing at the erased element or later in the vector become invalidated, including end(). Fortunately, erase() returns an iterator to the element that was after the erased element, so you could do:
for(auto b = v.begin(); b != v.end() && b + 1 != v.end();) {
if(*b != *(b+1) ) b = v.erase(b);
else ++b;
}
Note that you need to stop iterating one element before end() since you do *(b+1), and that end() must be re-evaluated on every iteration because erase() invalidates any cached copy of it.
The above does however not remove all but one of the repeated elements.
A different approach could be to search for the first element not being part of a repeating sequence and to erase the element if it had no repetitions and to erase all repetitions but one if it had repetitions.
I've used the standard algorithm std::find_if_not in the example below but you can easily replace it with a loop that does that same thing. Just search for the first element not being equal to *b.
#include <algorithm>
for(auto b = v.begin(); b != v.end();) {
// find the first element not being a repetition of *b
auto r = std::find_if_not(b + 1, v.end(), [&b](const auto& v) { return *b==v; });
if(b + 1 == r) b = v.erase(b); // *b had no repetitions, erase b
else b = v.erase(b + 1, r); // erase all repetitions except one element
}
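If erasing in place is not a requirement, the same output can be produced by building a new vector, which avoids iterator invalidation altogether. This is a different technique from the one above, shown only as a sketch: the function name keep_duplicates is hypothetical, and std::upper_bound is used to find the end of each run of equal values in the sorted data.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: return one copy of each value that occurs more than once.
inline std::vector<int> keep_duplicates(std::vector<int> v)
{
    std::sort(v.begin(), v.end());
    std::vector<int> result;
    for (auto b = v.begin(); b != v.end(); ) {
        // [b, r) is the run of elements equal to *b in the sorted vector
        const auto r = std::upper_bound(b, v.end(), *b);
        if (r - b > 1)
            result.push_back(*b);   // the value was repeated: keep a single copy
        b = r;                      // jump to the next distinct value
    }
    return result;
}
```

For the input 13 7 2 13 5 2 1 13 from the question, the result holds 2 and 13.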
You are trying to dereference an invalid iterator. On the last iteration of your second for loop, *(b+1) will try to dereference this iterator.
Change:
for(std::vector<int>::iterator b = v.begin();b<v.end();++b)
to:
for(std::vector<int>::iterator b = v.begin();b<v.end()-1;++b)
And change your erase to:
v.erase(b+1)
You are trying to erase the iterator in the current iteration.

Deleting an object from a vector [duplicate]

I have a std::vector<int>, and I want to delete the n'th element. How do I do that?
std::vector<int> vec;
vec.push_back(6);
vec.push_back(-17);
vec.push_back(12);
vec.erase(???);
To delete a single element, you could do:
std::vector<int> vec;
vec.push_back(6);
vec.push_back(-17);
vec.push_back(12);
// Deletes the second element (vec[1])
vec.erase(std::next(vec.begin()));
Or, to delete more than one element at once:
// Deletes the second through third elements (vec[1], vec[2])
vec.erase(std::next(vec.begin(), 1), std::next(vec.begin(), 3));
The erase method on std::vector is overloaded, so it's probably clearer to call
vec.erase(vec.begin() + index);
when you only want to erase a single element.
template <typename T>
void remove(std::vector<T>& vec, size_t pos)
{
typename std::vector<T>::iterator it = vec.begin(); // typename is required for the dependent type
std::advance(it, pos);
vec.erase(it);
}
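None of the snippets above check that the index is in range, and erasing at or past end() is undefined behavior. A bounds-checked variant might look like the sketch below. The name erase_at and the bool return convention are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: erase the element at `index`; returns false if out of range.
template <typename T>
bool erase_at(std::vector<T>& vec, std::size_t index)
{
    if (index >= vec.size())
        return false;   // erasing at or past end() would be undefined behavior
    using diff_t = typename std::vector<T>::difference_type;
    vec.erase(std::next(vec.begin(), static_cast<diff_t>(index)));
    return true;
}
```

With the vector {6, -17, 12} from the question, erase_at(vec, 1) leaves {6, 12}, and a following erase_at(vec, 2) returns false without touching the vector.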
The erase method will be used in two ways:
Erasing single element:
vector.erase( vector.begin() + 3 ); // Deleting the fourth element
Erasing range of elements:
vector.erase( vector.begin() + 3, vector.begin() + 5 ); // Deleting the fourth and fifth elements (the end of the range is exclusive)
Erase an element with index :
vec.erase(vec.begin() + index);
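Erase the first element with a given value (the value must be present, because erasing the end() iterator returned by a failed find is undefined behavior):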
vec.erase(find(vec.begin(),vec.end(),value));
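If the value may be absent, std::find returns end(), and passing that to erase is undefined behavior. A guarded version could look like this sketch, where the name erase_first is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: erase the first element equal to `value`, if any.
inline bool erase_first(std::vector<int>& vec, int value)
{
    const auto it = std::find(vec.begin(), vec.end(), value);
    if (it == vec.end())
        return false;   // not present: erase(end()) would be undefined behavior
    vec.erase(it);
    return true;
}
```

Only the first match is removed; later occurrences of the same value stay in the vector.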
Actually, the erase function works for two profiles:
Removing a single element
iterator erase (iterator position);
Removing a range of elements
iterator erase (iterator first, iterator last);
Since vec.begin() marks the start of the container, if we want to delete the ith element in our vector, we can use:
vec.erase(vec.begin() + index);
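vec.begin() is an iterator to the first element of the vector, and adding i to it advances it to the ith position. The address of the same element is
&vec[i]
but that is a raw pointer, and a pointer is not guaranteed to convert to std::vector<T>::iterator. The following is therefore not portable and should be avoided in favor of vec.erase(vec.begin() + i):
vec.erase(&vec[i]); // Compiles only if the implementation's iterator is implicitly constructible from a raw pointer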
If you have an unordered vector you can take advantage of the fact that it's unordered and use something I saw from Dan Higgins at CPPCON
template< typename TContainer >
static bool EraseFromUnorderedByIndex( TContainer& inContainer, size_t inIndex )
{
if ( inIndex < inContainer.size() )
{
if ( inIndex != inContainer.size() - 1 )
inContainer[inIndex] = inContainer.back();
inContainer.pop_back();
return true;
}
return false;
}
Since the list order doesn't matter, just take the last element in the list and copy it over the top of the item you want to remove, then pop and delete the last item.
It may seem obvious to some people, but to elaborate on the above answers:
If you are doing removal of std::vector elements using erase in a loop over the whole vector, you should process your vector in reverse order, that is to say using
for (int i = v.size() - 1; i >= 0; i--)
instead of (the classical)
for (int i = 0; i < v.size(); i++)
The reason is that indices are affected by erase so if you remove the 4-th element, then the former 5-th element is now the new 4-th element, and it won't be processed by your loop if you're doing i++.
Below is a simple example illustrating this, where I want to remove all the odd elements of an int vector:
#include <iostream>
#include <vector>
using namespace std;
void printVector(const vector<int> &v)
{
for (size_t i = 0; i < v.size(); i++)
{
cout << v[i] << " ";
}
cout << endl;
}
int main()
{
vector<int> v1, v2;
for (int i = 0; i < 10; i++)
{
v1.push_back(i);
v2.push_back(i);
}
// print v1
cout << "v1: " << endl;
printVector(v1);
cout << endl;
// print v2
cout << "v2: " << endl;
printVector(v2);
// Erase all odd elements
cout << "--- Erase odd elements ---" << endl;
// loop with decreasing indices
cout << "Process v2 with decreasing indices: " << endl;
for (int i = v2.size() - 1; i >= 0; i--)
{
if (v2[i] % 2 != 0)
{
cout << "# ";
v2.erase(v2.begin() + i);
}
else
{
cout << v2[i] << " ";
}
}
cout << endl;
cout << endl;
// loop with increasing indices
cout << "Process v1 with increasing indices: " << endl;
for (int i = 0; i < v1.size(); i++)
{
if (v1[i] % 2 != 0)
{
cout << "# ";
v1.erase(v1.begin() + i);
}
else
{
cout << v1[i] << " ";
}
}
return 0;
}
Output:
v1:
0 1 2 3 4 5 6 7 8 9
v2:
0 1 2 3 4 5 6 7 8 9
--- Erase odd elements ---
Process v2 with decreasing indices:
# 8 # 6 # 4 # 2 # 0
Process v1 with increasing indices:
0 # # # # #
Note that in the second version, with increasing indices, even numbers are not displayed because they are skipped by i++.
Note also that when processing the vector in reverse order with an i >= 0 condition, you CAN'T use unsigned types for indices (for (uint8_t i = v.size() - 1; ... won't work). This is because when i equals 0, i-- wraps around to 255 for uint8_t, for example, so the loop won't stop, as i will always be >= 0, and i will probably be out of bounds of the vector.
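An unsigned index can still be used for a reverse loop if the test and the decrement are combined. In the sketch below the loop starts at size() and uses i-- > 0 as the condition. The function name erase_odds_reverse is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: erase odd values while walking the indices backwards.
inline void erase_odds_reverse(std::vector<int>& v)
{
    // `i-- > 0` compares i before decrementing, so the body sees size()-1 ... 0
    // and the loop exits without ever needing a negative index.
    for (std::size_t i = v.size(); i-- > 0; ) {
        if (v[i] % 2 != 0)
            v.erase(v.begin() + static_cast<std::vector<int>::difference_type>(i));
    }
}
```

The same pattern also handles an empty vector, since the condition is false on the first check.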
If you work with large vectors (size > 100,000) and want to delete lots of elements, I would recommend doing something like this:
int main(int argc, char** argv) {
vector <int> vec;
vector <int> vec2;
for (int i = 0; i < 20000000; i++){
vec.push_back(i);}
for (int i = 0; i < vec.size(); i++)
{
if(vec.at(i) %3 != 0)
vec2.push_back(vec.at(i)); // copy the element itself, not its index
}
vec = vec2;
cout << vec.size() << endl;
}
The code takes every number in vec that can't be divided by 3 and copies it to vec2. Afterwards it copies vec2 into vec. It is pretty fast. To process 20,000,000 elements this algorithm only takes 0.8 sec!
I did the same thing with the erase-method, and it takes lots and lots of time:
Erase-Version (10k elements) : 0.04 sec
Erase-Version (100k elements) : 0.6 sec
Erase-Version (1000k elements): 56 sec
Erase-Version (10000k elements): ...still calculating (>30 min)
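The copy-based approach can also be written with standard algorithms. The sketch below uses std::copy_if into a pre-reserved vector and then swaps the buffers instead of copying back. The function name keep_non_multiples_of_3 is hypothetical, and no timing is claimed for it.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper: keep only the values that are not divisible by 3.
inline void keep_non_multiples_of_3(std::vector<int>& vec)
{
    std::vector<int> kept;
    kept.reserve(vec.size());                        // at most this many elements survive
    std::copy_if(vec.begin(), vec.end(), std::back_inserter(kept),
                 [](int n) { return n % 3 != 0; });
    vec.swap(kept);                                  // constant-time exchange of the buffers
}
```

Each element is visited once, so the cost grows linearly with the size of the vector, unlike repeated single-element erase calls.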
I suggest reading this, since I believe it is what you are looking for: https://en.wikipedia.org/wiki/Erase%E2%80%93remove_idiom
If you use, for example,
vec.erase(vec.begin() + 1, vec.begin() + 3);
you will erase the second and third elements of the vector. When you erase an element, all later elements are shifted down and the vector's size decreases by the number of elements erased. This can be a problem if you loop through the vector, since size() is decreasing. If you have a problem like this, the linked article suggests using the existing algorithms in the standard C++ library, std::remove or std::remove_if.
Hope that this helped
To delete an element use the following way:
// declaring and assigning array1
std::vector<int> array1 {0,2,3,4};
// erasing the value in the array
array1.erase(array1.begin()+n);
For a broader overview you can visit: http://www.cplusplus.com/reference/vector/vector/erase/
If you need to erase elements inside a for loop, advance the index only when nothing was erased; otherwise the element that slides into position i is skipped:
for(int i = 0; i < vec.size(); ){
if(condition)
vec.erase(vec.begin() + i);
else
i++;
}
You need to use the Standard Template Library's std::vector::erase function.
Example: Deleting an element from a vector (using index)
// Deleting the eleventh element from vector vec
vec.erase( vec.begin() + 10 );
Explanation of the above code
std::vector<T,Allocator>::erase Usage:
iterator erase (iterator position); // until C++11
iterator erase( const_iterator pos ); // since C++11 and until C++20
constexpr iterator erase( const_iterator pos ); // since C++20
Here there is a single parameter, position which is an iterator pointing to a single element to be removed from the vector.
Member types iterator and const_iterator are random access iterator types that point to elements.
How it works
erase function does the following:
It removes from the vector either a single element (position) or a range of elements ([first, last)).
It reduces the container size by the number of elements removed, which are destroyed.
Note: The iterator pos must be valid and dereferenceable. Thus the end() iterator (which is valid, but is not dereferenceable) cannot be used as a value for pos.
Return value and Complexity
The return value is an iterator pointing to the new location of the element that followed the last element that was erased by the function call. This is the container end if the operation erased the last element in the sequence.
Member type iterator is a random access iterator type that points to elements.
Here, the time complexity is linear in the number of elements erased (destructions) plus the number of elements after the last element deleted (moving).
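The return value can be observed by converting it back to an index, as in the sketch below. The helper name index_after_erase is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: erase the element at `pos` and report the index that the
// iterator returned by erase() refers to.
inline std::size_t index_after_erase(std::vector<int>& v, std::size_t pos)
{
    assert(pos < v.size());   // pos must refer to an existing element
    const auto next =
        v.erase(v.begin() + static_cast<std::vector<int>::difference_type>(pos));
    return static_cast<std::size_t>(next - v.begin());
}
```

For {10, 20, 30}, erasing index 1 returns 1, which is now the position of 30. Erasing index 1 again returns 1 once more, which now equals size(), i.e. the container end.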
The previous answers assume that you always have a signed index. Sadly, std::vector uses size_type for indexing, and difference_type for iterator arithmetic, so they don't work together if you have "-Wconversion" and friends enabled. This is another way to answer the question, while being able to handle both signed and unsigned:
To remove:
template<class T, class I, class = typename std::enable_if<std::is_integral<I>::value>::type>
void remove(std::vector<T> &v, I index)
{
const auto &iter = v.cbegin() + gsl::narrow_cast<typename std::vector<T>::difference_type>(index);
v.erase(iter);
}
To take:
template<class T, class I, class = typename std::enable_if<std::is_integral<I>::value>::type>
T take(std::vector<T> &v, I index)
{
const auto &iter = v.cbegin() + gsl::narrow_cast<typename std::vector<T>::difference_type>(index);
auto val = *iter;
v.erase(iter);
return val;
}
Here is one more way to do this, if you want to delete an element by finding its value in the vector:
vector<int> ar(n);
ar.erase(remove(ar.begin(), ar.end(), value), ar.end()); // value is the element value to delete
This removes every element equal to value.
thanks
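The following was suggested as the fastest way for programming contests (reported to erase 100M items in 1 second). Note that erase is still linear in the number of elements after pos, not constant time, and the cast compiles only where vector<int>::iterator can be constructed from a raw pointer:
vector<int>::iterator it = (vector<int>::iterator) &vec[pos];
vec.erase(it);
The most readable way: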
vec.erase(vec.begin() + pos);

vector push_back in STL?

I found a sample program which I tried for STL vectors:
#include <vector>
#include <iostream>
using namespace std;
/* This one may well work with some compilers/OS, and crash with
others. Who said the STL was safe ?? */
int main() {
vector<int> v;
v.push_back(1);
v.push_back(2);
v.push_back(3);
v.push_back(4);
for (vector<int>::iterator i = v.begin();
i != v.end(); i++) {
cout << *i << endl;
if (*i == 1) {
v.push_back(5);
}
}
}
I was expecting the result: 1 2 3 4 5
but the result is very weird: 1 0 3 4 0 0 33 0 0 0 0 0 0 0 49 0 1 2 3 4 5 5
I am guessing this has some logic which I am missing, because these don't seem like garbage values: the result is always the same.
After doing push_back, is the iterator getting reset or something?
I have read somewhere that appending to a vector while iterating over it isn't a good idea. My question is why?
After some searching on the internet, I found that
if (*i == 1) {
size_t diff = i-v.begin();
v.push_back(5);
i = v.begin()+diff;
}
this will solve the issue.
You are modifying the std::vector while looping through it. When calling std::vector::push_back(), if its current size() is equal to its current capacity(), the std::vector will have to reallocate its internal data storage to increase its capacity before it can store the new value. That reallocation will invalidate the loop iterator (the end() iterator is always invalidated, but you are re-evaluating it on each iteration, so it is OK in this case).
To do what you are attempting, either:
reserve() the vector's capacity before entering the loop to avoid reallocation and thus avoid invalidating the loop iterator.
reset the iterators after each reallocation.
use a std::list instead, as the loop iterator will not get invalidated by std::list::push_back().
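The third option can be sketched as follows. The function name append_while_iterating is hypothetical; the loop body mirrors the one from the question, with std::list swapped in for std::vector.

```cpp
#include <cassert>
#include <list>

// Hypothetical helper mirroring the question's loop, but on a std::list.
inline std::list<int> append_while_iterating()
{
    std::list<int> v{1, 2, 3, 4};
    for (auto i = v.begin(); i != v.end(); ++i) {
        if (*i == 1)
            v.push_back(5);   // list insertion does not invalidate existing iterators
    }
    return v;
}
```

The loop visits 1, 2, 3, 4 and then the appended 5, and the list ends up holding exactly those five values. The trade-off is that std::list gives up contiguous storage and random access.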
Because std::vector::push_back might invalidate all iterators.
If the new size() is greater than capacity() then all iterators and references (including the past-the-end iterator) are invalidated.
For your code, you can use std::vector::reserve to avoid reallocation.
Increase the capacity of the container to a value that's greater or equal to new_cap.
int main() {
vector<int> v;
v.reserve(5); // ensure no reallocation until size() == 5
v.push_back(1);
v.push_back(2);
v.push_back(3);
v.push_back(4);
for (vector<int>::iterator i = v.begin();
i != v.end(); i++) {
cout << *i << endl;
if (*i == 1) {
v.push_back(5);
}
}
}
And as you said, appending to a vector while iterating over it isn't a good idea.
Some good options in existing answers, but worth mentioning another oft forgotten option when you know you're facing potential iterator invalidation due to push_back resizing beyond capacity:
for (size_t i = 0; i < v.size(); ++i) {
cout << v[i] << endl;
if (v[i] == 1)
v.push_back(5);
}
It's "C style", but simple and intuitive.