Store which elements of a vector to erase

I have a vector V and I would like to store which elements of this vector I will later have to remove.
To do that, I've used another vector Y to store iterators to the elements of V that I want to remove, and I iterate through Y to reach those elements.
The problem is that when you erase elements from V, the iterators in Y (pointing to elements of V) become invalid.
I can't find an answer, but it seems so trivial that there must be a simple workaround, isn't there?

Use V.erase(std::remove_if(V.begin(), V.end(), MyPredicate()), V.end())
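A minimal sketch of that one-liner in context. The predicate IsEven and the function erase_even are made-up names for illustration, standing in for MyPredicate and whatever condition actually decides which elements go:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical predicate for illustration: selects even values for removal.
struct IsEven {
    bool operator()(int value) const { return value % 2 == 0; }
};

// Removes every element for which the predicate holds, in one linear pass.
void erase_even(std::vector<int>& v) {
    // remove_if shifts the kept elements to the front and returns the new
    // logical end; erase() then drops the leftover tail in one call.
    v.erase(std::remove_if(v.begin(), v.end(), IsEven()), v.end());
    // Sanity check: nothing matching the predicate is left.
    assert(std::none_of(v.begin(), v.end(), IsEven()));
}
```

Because the decision is made by the predicate during the pass, no iterators or indices need to be stored beforehand.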

You can use std::vector<unsigned> indices to store the index values of each element.

Iterators, pointers and references pointing to position (or first) and
beyond are invalidated, with all iterators, pointers and references to
elements before position (or first) are guaranteed to keep referring
to the same elements they were referring to before the call.
http://www.cplusplus.com/reference/vector/vector/erase/
So if the elements (iterators) of Y are sorted, you could just iterate over Y backwards and delete the corresponding elements in V. This works since only iterators to later elements in V are invalidated when you erase elements in V.
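The back-to-front order can be sketched as follows. This version stores indices rather than iterators, which is an assumption on my part, and the function name erase_at_indices is hypothetical. It also assumes the indices are unique and in range:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Erases the elements of v at the given positions.
// Assumes the indices are unique and each one is smaller than v.size().
void erase_at_indices(std::vector<int>& v, std::vector<std::size_t> indices) {
    // Sort descending so that each erase only shifts elements that have
    // already been dealt with; smaller indices stay meaningful.
    std::sort(indices.begin(), indices.end(), std::greater<std::size_t>());
    for (std::size_t idx : indices) {
        assert(idx < v.size());
        v.erase(v.begin() + static_cast<std::ptrdiff_t>(idx));
    }
}
```

For example, erasing positions 1, 3 and 0 from {10, 20, 30, 40, 50} processes them as 3, 1, 0 and leaves {30, 50}.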

This is about complexity.
You can erase elements from higher indices to lower indices, in which case your approach will work. However, each time you erase an element of a vector, the elements following it are relocated. That costs time linear in the number of elements following the erased one, so I would expect roughly quadratic complexity overall, or O(number_of_elements * number_of_elements_to_be_erased).
If the number of deletions is high, and the indices of the elements to erase are distributed more or less uniformly over the entire range, a better solution from the complexity point of view is to work on the complement: instead of tracking "elements to be erased", track "elements to be retained". Copy the elements that should stay into a new vector and assign it to the old one. This is linear in the number of elements in the vector, O(number_of_elements).
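A rough sketch of the "retain instead of erase" approach. The per-element flag vector erase_flag and the function name keep_unmarked are assumptions made for illustration; any way of marking the doomed elements would do:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a new vector holding only the elements whose flag is false.
// `erase_flag` is a hypothetical marker with one entry per element of v.
std::vector<int> keep_unmarked(const std::vector<int>& v,
                               const std::vector<bool>& erase_flag) {
    assert(v.size() == erase_flag.size());
    std::vector<int> kept;
    kept.reserve(v.size());  // at most one allocation for the copy
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (!erase_flag[i]) {
            kept.push_back(v[i]);  // each survivor is copied exactly once
        }
    }
    return kept;
}
```

The caller would then assign the result back, for instance V = keep_unmarked(V, flags);, so every element is touched once regardless of how many are dropped.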
Better still, if you have complete control over the implementation and can swap std::vector for std::list, you can proceed exactly as you described, with no ill side effects: erasing a list element invalidates only iterators to that element. Each erase is also done in constant time, so the entire operation is O(number_of_elements_to_be_erased).
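With std::list, the original plan of storing iterators and erasing later might look like this. The odd-value condition and the name erase_odd_deferred are placeholders for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Collects iterators to the odd values first, then erases them afterwards.
void erase_odd_deferred(std::list<int>& values) {
    std::vector<std::list<int>::iterator> to_erase;  // plays the role of Y
    for (auto it = values.begin(); it != values.end(); ++it) {
        if (*it % 2 != 0) to_erase.push_back(it);
    }
    const std::size_t before = values.size();
    // Erasing a list node invalidates only iterators to that node,
    // so the remaining stored iterators stay usable.
    for (auto it : to_erase) values.erase(it);
    assert(values.size() + to_erase.size() == before);
}
```

Whether a list is a net win depends on the rest of the program, since it gives up contiguous storage and random access.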

Related

How to find the index of an element in a sorted vector given a pointer to that element

I have a vector std::vector<MyClass> vec. I'm trying to sort the vector and then find the index of an element given a pointer to it
sort(vec.begin(), vec.end(), ...);
auto index = myPointer - &vec[0];
I noticed that the value of index doesn't change when the vector is sorted and therefore is incorrect. Is there a way to directly get the correct index?

The standard requires the RandomIt passed to std::sort to be ValueSwappable, so sorting moves values between positions; the positions themselves stay where they are. Your pointer's address does not change: if it pointed to the 2nd element before the sort, it still points to the 2nd position afterwards, but that position will usually hold a different value. To locate the original value after sorting, you need to search for it, for example with std::equal_range.
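One possible way to do that, sketched for a vector of int sorted with the default ordering (both assumptions, as is the name index_after_sort). The value is copied before sorting because the pointed-to position will hold something else afterwards. If the value occurs more than once, this returns the first matching index:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sorts vec and returns the index at which the value that `ptr` referred
// to before the sort can now be found.
std::size_t index_after_sort(std::vector<int>& vec, const int* ptr) {
    assert(ptr >= vec.data() && ptr < vec.data() + vec.size());
    const int wanted = *ptr;  // copy the value first; the slot's content changes
    std::sort(vec.begin(), vec.end());
    // Binary search in the sorted range; range.first is the first match.
    auto range = std::equal_range(vec.begin(), vec.end(), wanted);
    return static_cast<std::size_t>(range.first - vec.begin());
}
```

For a class type, the same comparison function used for sorting would have to be passed to std::equal_range as well.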

If I insert or erase an element of a list in std::vector<std::list>, will the subsequent lists in the vector be moved?

Inserting an element into a vector relocates the elements after it, as stated below:
Because vectors use an array as their underlying storage, inserting elements in positions other than the vector end causes the container to relocate all the elements that were after position to their new positions. This is generally an inefficient operation compared to the one performed for the same operation by other kinds of sequence containers (such as list or forward_list).
I want to know whether inserting or erasing elements of one list in a std::vector<std::list<int>> relocates the elements of the other lists in the vector. I am concerned about the efficiency of this kind of insert and erase operation. Is its complexity still constant, as for a normal insert or erase in a std::list<int>?

If you're referring to inserting an element in the list, it would be constant time and not affect other elements in the vector. Why would it? If you're inserting a list into the vector, then that would move the elements behind it and take linear time.
#include <vector>
#include <list>

int main()
{
    std::vector<std::list<int>> foo(3, std::list<int>({2, 3, 4}));
    foo.emplace(foo.begin(), std::list<int>({7, 8})); //shifts all lists behind insertion point
    foo[0].insert(foo[0].begin(), 42); //const time, has nothing to do with other lists
}
Note that even though the lists are shifted, the elements stored inside them likely won't be, since a list only stores pointers to its nodes. This allows a list to be moved in constant time, so the time it takes to insert a list into a vector isn't affected by the number of elements in the lists.

How to dynamically update the condition of a for loop in C++?

Take the following code snippet:
// x is a global vector object that holds values of type string as follows, vector<string> x
// x is filled/populated via the function Populate_x(y,z);
Populate_x(y,z);
for (auto i : x)
{
    string v = check(i);
    Populate_x(v,v);
}
My question is, how can I dynamically update x in the range based for loop shown above when calling Populate_x(v,v) from within the for loop? I'm not sure if this is even possible. If not, how can I restructure my code to achieve this behavior?
Your suggestions are greatly appreciated.

A range-based for loop is equivalent to, approximately: 1) get the container's beginning and ending iterator, 2) Use a temporary iterator to iterate from the beginning iterator value to the ending iterator value 3) On each iteration, dereference the temporary iterator and set the range loop's variable to the dereferenced iterator value.
... more or less. The bottom line is 1) a range-based for loop obtains and uses the beginning and the ending iterators for the range, and 2) you state that your container is a vector, and, as you know, modifying a vector is going to invalidate most iterators to the contents of the vector.
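The three steps above can be written out by hand. This is only an approximation of what the compiler generates, and the function total_length is a made-up example body:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hand-written approximation of: for (auto i : x) total += i.size();
std::size_t total_length(const std::vector<std::string>& x) {
    std::size_t total = 0;
    auto first = x.begin();  // both iterators are obtained once, up front
    auto last = x.end();
    for (; first != last; ++first) {
        auto i = *first;     // the loop variable is a copy of the element
        total += i.size();
    }
    assert(first == last);   // the loop ends at the `last` captured before it ran
    return total;
}
```

Since first and last are captured before the loop body ever runs, anything in the body that reallocates or resizes the vector leaves them dangling.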
It is true that with certain kinds of modifications to the contents of the vector, certain iterators will not be invalidated, and will remain valid. But, practically, it is safe to assume that if you've got a std::vector::iterator or a std::vector::const_iterator somewhere, modifying a vector means that the iterator is no longer valid. Not always 100% true, as I mentioned, but that's a pretty safe assumption to make.
And since range iteration obtains and uses iterators to the container, for the lifetime of the iteration, that makes, pretty much, doing a range iteration over a vector, and modifying the vector during iteration, a non-starter. Any modifications to the vector will likely result in undefined behavior, for any continuing iteration over the vector.
Note that "modification" means, essentially, insertion or removal of values from the vector; that is, modification of the vector itself. Modifying the values in the vector has no effect on the validity of any existing iterators, and this is safe.
If you want to iterate over a vector, and then safely modify the vector during this process (with the modification consisting of inserting or removing value from the vector), the first question you have to answer yourself is what does the insertion or removal mean for your iteration. That's something that you have to figure out yourself. If, for example, your loop is currently on the 4th element in the vector, and you insert a new 2nd value in the vector, since inserting a value into the vector shifts all the remaining values in the vector up, the 4th element in the vector will become the 5th element in the vector, and if you manage to safely do this correctly, on the next iteration, the 5th element in the vector will be the same one you just iterated over previously. Is this what you want? That's something you will have to answer yourself, first.
But as far as safely modifying a vector during iteration goes, the simplest way is to avoid using iterators entirely and use an index variable:
for (size_t i=0; i<x.size(); ++i)
{
    auto v=x[i];
    // ...
}
Now, modifying the vector is perfectly safe, and all you have to figure out, then, is what has to happen after the vector gets modified, whether the i index variable needs adjusting, in any way.
And that question you'll have to answer yourself.
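As an illustration of the index-based loop with appends happening during iteration, here is a small sketch. The function grow_until and its rule (every string shorter than max_len spawns a longer one) are invented stand-ins for check() and Populate_x, whose real behaviour is not shown in the question:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's check()/Populate_x pair: every
// string shorter than max_len spawns a longer follow-up string.
void grow_until(std::vector<std::string>& x, std::size_t max_len) {
    const std::size_t initial_size = x.size();
    for (std::size_t i = 0; i < x.size(); ++i) {  // size() is re-read every pass
        const std::string current = x[i];  // copy: push_back may reallocate x
        if (current.size() < max_len) {
            x.push_back(current + "!");    // appended items are visited later
        }
    }
    // Only appends happened, so i never needed adjusting.
    assert(x.size() >= initial_size);
}
```

Note that x[i] is copied rather than bound to a reference, because a reference into the vector would be invalidated by a reallocating push_back. With insertions or removals before position i, the index would need adjusting as described above.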

From your description, I'm not sure what you mean by dynamically update x.
Is Populate_x adding new elements to x ?
Assuming your Populate_x function tries to push_back some elements on your vector x, then you cannot do that. See this answer for more details
Modifying the vector inside the loop results in undefined behaviour because the iterators used by the loop are invalidated when the vector is modified.
If so, and you want to add multiple elements at the end of x, a practical way is to use a temporary vector<string> y, push_back/emplace_back elements into y, and then, once you're ready to append all elements of y to x, do something similar to this:
x.insert(x.end(), make_move_iterator(y.begin()), make_move_iterator(y.end()));
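Put together, the staging approach might look like the sketch below. The body of check() is invented for illustration (the question does not show it), and populate_from_existing is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's check(); derives a new string.
std::string check(const std::string& s) { return s + "_checked"; }

// Iterates over x without touching it, then appends the results in one go.
void populate_from_existing(std::vector<std::string>& x) {
    std::vector<std::string> y;  // staging buffer
    y.reserve(x.size());
    for (const auto& item : x) {
        y.push_back(check(item));  // x is not modified while it is iterated
    }
    const std::size_t old_size = x.size();
    x.insert(x.end(), std::make_move_iterator(y.begin()),
             std::make_move_iterator(y.end()));
    assert(x.size() == old_size + y.size());
}
```

Unlike an index-based loop, this version does not visit the newly appended elements, which may or may not be the behaviour you want.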

Are vector elements guaranteed to be in order?

I understand that having pointers to elements of a vector is a bad idea, because upon expanding, the memory addresses involved will change, therefore invalidating the pointer(s). However, what if I were to simply use an integer that holds the index number of the element I want to access? Would that be invalidated as the vector grows in size? What I am thinking of looks something like this:
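#include <iostream>
#include <string>
#include <vector>
using namespace std;

class someClass {
    string name;
public:
    string getName() const { return name; }
};

int main() {
    vector<someClass> vObj;
    int currIdx = -1;  // -1 means "not found"
    string search;
    cout << "Enter name: ";
    cin >> search;
    // Remember the index of the matching element rather than a pointer to it.
    for (int i = 0; i < static_cast<int>(vObj.size()); i++) {
        if (vObj[i].getName() == search)
            currIdx = i;
    }
}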

No, the index numbers are of course not invalidated when the vector expands. They are invalidated (in the sense that you no longer find the same elements at a constant index) if you erase a previous element, though:
vector: 3 5 1 6 7 4
Here, vector[2] == 1. But if you erase vector[1] (the 5), then afterwards, vector[2] == 6.
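The example above can be checked directly. The two helper functions are hypothetical and take the vector by value so the caller's copy stays untouched:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the value found at `idx` after erasing the element at `erased`.
int value_at_after_erase(std::vector<int> v, std::size_t idx, std::size_t erased) {
    assert(erased < v.size());
    v.erase(v.begin() + static_cast<std::ptrdiff_t>(erased));
    return v.at(idx);  // at() throws std::out_of_range if idx fell off the end
}

// Returns the value found at `idx` after the vector has grown considerably.
int value_at_after_growth(std::vector<int> v, std::size_t idx) {
    for (int i = 0; i < 1000; ++i) {
        v.push_back(i);  // appending enough elements to force reallocation
    }
    return v.at(idx);    // the index still names the same element
}
```

With the vector 3 5 1 6 7 4, index 2 yields 1 after growth but 6 after erasing index 1.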

I think the title of your question and what you seem to be asking do not really match. No vector is by definition guaranteed to be sorted, so elements won't be "in order".
Moreover, all iterators and references to elements of a vector will be invalidated upon insertion only if reallocation occurs (i.e. when the size of the vector exceeds its capacity). Otherwise, iterators and references before the point of insertion will not be invalidated (see Paragraph 23.3.6.5/1 of the C++11 Standard).
Storing an index is only subject to a potential logical invalidation: if you insert elements into the vector at a position prior to the one you are indexing, the element you were indexing will be shifted one position to the right, and the same index will now refer to a different element; likewise, if you erase an element prior to the position you were indexing, the element you were indexing will be shifted one position to the left, and your index may now refer to a position which is out of bounds.

No, the index numbers are not invalidated when the vector is expanded. Since you declared the vector as vector<someClass>, which stores the objects themselves, rather than vector<someClass*>, the element at that index is preserved as well.

It shouldn't, as the vector will simply allocate more memory and copy or move the existing elements into it.
Order is preserved by std::vector.
And yes, if you delete elements, the indices of the later elements will change. But if you are going to be doing a lot of deletes, consider a different data structure such as a linked list.

Does vector::erase re-order elements in vector?

I have a vector that contains a, b, c, d, e.
vec[2] is c, but will the vector automatically reorder after I delete/erase c? I mean, will vec[2] be d after the operation?

Logically yes, as a vector is a dynamic array of elements. You delete one, then everything that follows is moved.
In the same manner, the total length of the vector will decrease as you erase elements.
From cplusplus.com
This effectively reduces the vector size by the number of elements
removed, calling each element's destructor before.
Because vectors keep an array format, erasing on positions other than
the vector end also moves all the elements after the segment erased to
their new positions, which may not be a method as efficient as erasing
in other kinds of sequence containers (deque, list).

According to the standard:
iterator erase(const_iterator position);
....
Effects: Invalidate iterators and references at or after the point of the erase
Complexity: The destructor of T is called the number of times equal to the number of the elements erased, but the move assignment operator of T is called the number of times equal to the number of elements in the vector after the erased elements.
As you see the move assignment operator will be called as many times as there are elements after the erased element, and every reference/iterator to the elements after are invalidated.
So when one element is erased, all elements following are moved to fill in the "blank" space where the erased element was.
The point about invalidated references/iterators is very important to remember, especially if you are erasing in a loop. erase returns an iterator to the element following the erased one, which you can use to continue the loop; alternatively, use the erase-remove idiom.
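A minimal sketch of erasing in a loop using the iterator that erase returns. The condition (negative values) and the name erase_negatives are chosen only for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Erases negative values in place and returns how many were removed.
std::size_t erase_negatives(std::vector<int>& v) {
    const std::size_t before = v.size();
    std::size_t removed = 0;
    for (auto it = v.begin(); it != v.end(); /* advanced in the body */) {
        if (*it < 0) {
            it = v.erase(it);  // continue from the element after the erased one
            ++removed;
        } else {
            ++it;              // only step forward when nothing was erased
        }
    }
    assert(v.size() + removed == before);
    return removed;
}
```

This is safe with respect to invalidation, but each erase still shifts the tail of the vector, so for many removals the erase-remove idiom is usually the cheaper option.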

If you are worried about the side effects of this during an iteration, use the Erase-remove idiom
