Valid or Invalid Iterators And Iterator Positions - c++

I have a simple example routine below for erasing vector elements, the positions of which are stored in another vector. I've been using this method for some time now and only recently have experienced an error: Expression: vector iterator + offset out of range.
I seem to have found the problem: within the arguments to the erase() call I wasn't enclosing the second part in parentheses, which occasionally resulted in the above error when erasing elements near the end of the vector.
Now I've identified and corrected the problem, I would be grateful if somebody could just confirm that my simple routine below is in fact valid and without error, and that to call erase() within a for-loop in this way is okay.
I realise this routine only works if erasing element positions in order of first to last. Please see my code below:
vector<int> mynumbers;
mynumbers.push_back(4);
mynumbers.push_back(5);
mynumbers.push_back(6);
mynumbers.push_back(7);
vector<int> delpositions;
delpositions.push_back(1);
delpositions.push_back(2);
delpositions.push_back(3);
for(unsigned int i = 0; i < delpositions.size(); ++i)
    mynumbers.erase(mynumbers.begin() + (delpositions[i] - i));
// Used To Be: delpositions[i] - i Which Caused The Error! Instead of: (delpositions[i] - i)

You do the right thing by adjusting each position in 'delpositions' by the number of elements already erased. Just ensure 'delpositions' is sorted in ascending order.
Erasing in reverse order (last to first) might be a bit more efficient, since fewer elements have to be shifted, and it needs no index adjustment.
I consider
vector<int> result;
result.reserve(mynumbers.size() - delpositions.size());
// copy the elements at positions that are not being erased to result
mynumbers.swap(result);
a better solution.
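The copy step left as a comment above could be filled in along these lines. This is only a sketch: the function name without_positions is made up for illustration, and it assumes the positions are sorted ascending, contain no duplicates, and are stored as std::size_t rather than int.

```cpp
#include <cstddef>
#include <vector>

// Returns a copy of `values` without the elements at the given positions.
// Assumption: `positions` is sorted ascending and has no duplicates.
std::vector<int> without_positions(const std::vector<int>& values,
                                   const std::vector<std::size_t>& positions)
{
    std::vector<int> result;
    if (positions.size() <= values.size())
        result.reserve(values.size() - positions.size());

    std::size_t next = 0; // index into `positions` of the next position to skip
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (next < positions.size() && positions[next] == i)
            ++next;                      // this element is being dropped
        else
            result.push_back(values[i]); // this element is kept
    }
    return result;
}
```

With the data from the question, the result would then be swapped or assigned back into mynumbers. Every kept element is copied exactly once, whereas repeated erase() calls may shift the same elements several times.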

Related

Compare each element of a 2D array with the rest C++

I'm new to C++, and what I have to do is to write a method that checks if a 2D array contains any duplicate items.
So for example, I have a matrix[3][4], I've been able to compare the first element [0][0] with the rest, till the last one [2][3].
The problem is that I don't know how to proceed, what I should do is that the method then compares the element [0][1] with the rest (except with the previous one and itself of course) and etc..
First off, the fact that it's a 2D array is irrelevant in this context; you want to find duplicates across the entire array, so you'd be better off with 1-dimensional indexing anyway. Coincidentally, that's a suggested way of handling two-dimensional arrays in C++.
So assuming you can put the indices in some order, the general idea is to compare every element with all subsequent elements. This takes O(n^2) comparisons. The pseudocode is, unsurprisingly, two nested loops, a common pattern used e.g. for collision detection:
for (iterA = 0; iterA < num - 1; iterA++) {
    for (iterB = iterA + 1; iterB < num; iterB++) { // note how iterB starts one further
        if (*iterA == *iterB)
            return true; // found a duplicate
    }
}
In case of a 2D array, the *iterA dereference can be replaced with a function that breaks up the composite 1-dimensional index into two components, e.g. with i % width, i / width. This is, again, a very common pattern.
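As a rough illustration of that index splitting, the pseudocode might be turned into something like the following. The helper name has_duplicate and the use of a template over the array dimensions are my own choices, not something from the question.

```cpp
#include <cstddef>

// Returns true if any value occurs more than once in a Rows x Cols matrix.
// Hypothetical helper: the flat index is split into row (i / Cols) and
// column (i % Cols) to address the 2D array.
template <std::size_t Rows, std::size_t Cols>
bool has_duplicate(const int (&matrix)[Rows][Cols])
{
    const std::size_t num = Rows * Cols;
    for (std::size_t a = 0; a + 1 < num; ++a) {
        for (std::size_t b = a + 1; b < num; ++b) { // b starts one past a
            if (matrix[a / Cols][a % Cols] == matrix[b / Cols][b % Cols])
                return true; // found a duplicate
        }
    }
    return false;
}
```

Calling has_duplicate(matrix) on an int matrix[3][4] deduces both dimensions automatically.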
That being said, why bother? Make a std::set, insert the elements one by one, and call find before every insert. If find returns anything other than end(), you have found a duplicate and can stop.
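A small sketch of the set-based approach follows. It is a slight variation on the description above: instead of a separate find, it relies on std::set::insert returning a pair whose second member is false when the value was already present. The function name has_duplicate_value is hypothetical, and the input is assumed to be already flattened into a vector.

```cpp
#include <set>
#include <vector>

// Returns true if `values` contains any repeated element.
bool has_duplicate_value(const std::vector<int>& values)
{
    std::set<int> seen;
    for (int v : values) {
        // insert() reports through .second whether v was newly inserted
        if (!seen.insert(v).second)
            return true; // v was already in the set
    }
    return false;
}
```

This trades the O(n^2) comparisons for O(n log n) set operations at the cost of extra memory.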

Erasing from vector of list doesn't work as expected

It's a simple thing that I am doing but it doesn't work as I expected.
int main(){
    vector<list<int>> adjList(3);
    adjList[0].push_back(1);
    adjList[0].push_back(2);
    adjList[1].push_back(3);
    adjList[1].push_back(0);
    adjList[2].push_back(4);
    cout << "Original graph...\n";
    printGraph(adjList);
    cout << "\nAfter deleting the zeroth index...\n";
    adjList.erase(adjList.begin());
    printGraph(adjList);
    return 0;
}
Original graph...
0:1->2->NULL
1:3->0->NULL
2:4->NULL
After deleting the zeroth index...
0:3->0->NULL
1:4->NULL
I expected the zeroth index in my vector of list to be deleted. Instead, something weird happened where the second index got deleted and the elements in the list also got shuffled.
I am sure I am missing something basic here but just not able to figure out what that is.
Any help is much appreciated!
My bad. I realize now what's wrong. I was expecting to see the same indices after deletion but of course that is a wrong expectation. So the output is actually right just the indices moved around.
Original graph...
0:1->2->NULL
1:3->0->NULL
2:4->NULL
After deleting the zeroth index...
0:3->0->NULL (index 1 becomes 0)
1:4->NULL (index 2 becomes 1)
adjList.erase(adjList.begin()+1);
I expected the zeroth index in my vector of list to be deleted.
Your expectation is wrong.
adjList.begin()+1 is an iterator to the element at index 1. Therefore erasing that iterator will cause the element at index 1 to be erased (i.e. the second element).
adjList.begin() is an iterator to the element at index 0, so if your intention is to erase that element, then that is the iterator you need to pass to erase. Note, however, that if you often need to erase the first element of the sequence and need to keep the sequence in its original order, then a vector is an inefficient choice. In that case, you might want to consider using a deque.
adjList.erase(adjList.begin());
I expected to get the below: 1:3->0->NULL 2:4->NULL
Your expectation is wrong.
A vector never skips any indices. If there are n elements in a vector, then those elements are in indices 0...n-1.
When you erase an element of a vector, the elements at greater indices are shifted to the left (this is why erasing from anywhere except at the end of the vector is slow).
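To make the shifting concrete, here is a minimal sketch using the same vector-of-lists shape as the question. The wrapper function erase_first is only there for illustration.

```cpp
#include <list>
#include <vector>

// Erases the first adjacency list and returns what is left.
// The parameter is taken by value so the caller's graph is untouched.
std::vector<std::list<int>> erase_first(std::vector<std::list<int>> adjList)
{
    if (!adjList.empty())
        adjList.erase(adjList.begin()); // later elements move down one index
    return adjList;
}
```

Applied to the graph from the question, the list that was at index 1 ends up at index 0 and the list that was at index 2 ends up at index 1, which matches the printed output.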

How to avoid out of range exception when erasing vector in a loop?

My apologies for the lengthy explanation.
I am working on a C++ application that loads two files into two 2D string vectors, rearranges those vectors, builds another 2D string vector, and outputs it all in a report. The first element of the two vectors is a code that identifies the owner of the item and the item in the vector. I pass the owner's identification to the program on start and loop through the two vectors in a nested while loop to find those that have matching first elements. When I do, I build a third vector with components of the first two, and I then need to capture any that don't match.
I was using the syntax "vector.erase(vector.begin() + i)" to remove elements from the two original vectors when they matched. When the loop completed, I had my new third vector, and I was left with two vectors containing only the elements that didn't match, which is what I needed. This was working fine as I tried the various owners in the files (the program accepts one owner at a time). Then I tried one that generated an out of range error.
I could not figure out how to do the erase inside of the loop without throwing the error (it didn't seem that swap and pop or erase-remove were feasible solutions). I solved my problem for the program with two extra nested while loops after building my third vector in this one.
I'd like to know how to make the erase method work here (as it seems a simpler solution) or at least how to check for my out of range error (and avoid it). There were a lot of "rows" for this particular owner; so debugging was tedious. Before giving up and going on to the nested while solution, I determined that the second erase was throwing the error. How can I make this work, or are my nested whiles after the fact, the best I can do? Here is the code:
i = 0;
while (i < AIvector.size())
{
CHECK:
    j = 0;
    while (j < TRvector.size())
    {
        if (AIvector[i][0] == TRvector[j][0])
        {
            linevector.clear();
            // Add the necessary data from both vectors to Combo_outputvector
            for (x = 0; x < AIvector[i].size(); x++)
            {
                linevector.push_back(AIvector[i][x]); // add AI info
            }
            for (x = 3; x < TRvector[j].size(); x++) // Don't need the first three elements; so start with x=3.
            {
                linevector.push_back(TRvector[j][x]); // add TR info
            }
            Combo_outputvector.push_back(linevector); // build the combo vector
            // then erase these two current rows/elements from their respective vectors, this revises the AI and TR vectors
            AIvector.erase(AIvector.begin() + i);
            TRvector.erase(TRvector.begin() + j);
            goto CHECK; // jump from here because the erase will have changed the two increments
        }
        j++;
    }
    i++;
}
As already discussed, your goto jumps to the wrong position: the CHECK label sits inside the outer while loop, so after an erase the outer condition i < AIvector.size() is skipped before AIvector[i] is used again. Simply moving the label out of the first while loop, to just before it, should solve your problems. But can we do better?
Erasing from a vector can be done cleanly with std::remove and vector::erase (the erase-remove idiom) for cheap-to-move objects, which vector and string both are. After some thought, however, I believe this isn't the best solution for you because you need a function that does more than just check if a certain row exists in both containers, and that is not easily expressed with the erase-remove idiom.
Retaining the current structure, then, we can use iterators for the loop condition. We have a lot to gain from this, because std::vector::erase returns an iterator to the next valid element after the erased one. Not to mention that it takes an iterator anyway. Conditionally erasing elements in a vector becomes as simple as
auto it = vec.begin();
while (it != vec.end()) {
    if (...)
        it = vec.erase(it);
    else
        ++it;
}
Because we assign erase's return value to it we don't have to worry about iterator invalidation. If we erase the last element, it returns vec.end() so that doesn't need special handling.
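A concrete, compilable instance of that pattern might look like this. The condition (dropping even numbers) and the function name erase_even are placeholders chosen purely for illustration.

```cpp
#include <vector>

// Removes every even number from `vec` in place.
void erase_even(std::vector<int>& vec)
{
    auto it = vec.begin();
    while (it != vec.end()) {
        if (*it % 2 == 0)
            it = vec.erase(it); // continue from the element after the erased one
        else
            ++it;               // only advance when nothing was erased
    }
}
```

Consecutive matches are handled correctly because the iterator is not advanced after an erase.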
Your second loop can be removed altogether. The C++ standard library provides algorithms for searching inside containers. std::find_if searches a range for the first element that satisfies a condition and returns an iterator to it, or the end iterator if no such element exists. You haven't declared your types anywhere, so I'm just going to assume the rows are std::vector<std::string>.
using row_t = std::vector<std::string>;
auto AI_it = AIvector.begin();
while (AI_it != AIvector.end()) {
    // Find a row in TRvector with the same first element as *AI_it
    auto TR_it = std::find_if(TRvector.begin(), TRvector.end(), [&AI_it](const row_t& row) {
        return row[0] == (*AI_it)[0];
    });
    // If a matching row was found
    if (TR_it != TRvector.end()) {
        // Copy the line from AIvector
        auto linevector = *AI_it;
        // Do NOT do this unless you can guarantee size >= 3
        assert(TR_it->size() >= 3);
        std::copy(TR_it->begin() + 3, TR_it->end(),
                  std::back_inserter(linevector));
        Combo_outputvector.emplace_back(std::move(linevector));
        AI_it = AIvector.erase(AI_it);
        TRvector.erase(TR_it);
    }
    else
        ++AI_it;
}
As you can see, switching to iterators completely sidesteps your initial problem of figuring out how not to access invalid indices. If you don't understand the syntax of the arguments for find_if, search for the term lambda. It is beyond the scope of this answer to explain what they are.
A few notable changes:
linevector is now encapsulated properly. There is no reason for it to be declared outside this scope and reused.
linevector simply copies the desired row from AIvector rather than calling push_back for every element in it. This works as long as Combo_outputvector (and therefore linevector) holds the same type as AIvector and TRvector.
std::copy is used instead of a for loop. Apart from being slightly shorter, it is also more generic, meaning you could change your container type to anything that supports random access iterators and inserting at the back, and the copy would still work.
linevector is moved into Combo_outputvector. This can be a significant performance optimization if your vectors are large!
It is possible that you used a non-encapsulated linevector because you wanted to keep a copy of the last inserted row outside of the loop. That would prohibit moving it, however. For this reason it is faster and more descriptive to do it as I showed above and then simply do the following after the loop.
auto linevector = Combo_outputvector.back();
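For comparison, the index-based loop from the question can also be kept and made safe without goto. The sketch below is one way to do that, not the only one: the function name combine_matches and the alias table_t are invented for illustration, rows are assumed to be non-empty, and the key idea is that i only advances when row i was kept.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using row_t = std::vector<std::string>;
using table_t = std::vector<row_t>;

// Moves matching rows out of `ai` and `tr` into the returned combined table.
// Assumption: every row has at least one element (the matching key).
table_t combine_matches(table_t& ai, table_t& tr)
{
    table_t combined;
    std::size_t i = 0;
    while (i < ai.size()) {          // re-checked after every erase
        bool matched = false;
        for (std::size_t j = 0; j < tr.size(); ++j) {
            if (ai[i][0] == tr[j][0]) {
                row_t line = ai[i];
                // Skip the first three TR elements, as in the question.
                for (std::size_t x = 3; x < tr[j].size(); ++x)
                    line.push_back(tr[j][x]);
                combined.push_back(std::move(line));
                ai.erase(ai.begin() + i);
                tr.erase(tr.begin() + j);
                matched = true;
                break;               // row i is gone; restart the inner search
            }
        }
        if (!matched)
            ++i;                     // only advance when row i was kept
    }
    return combined;
}
```

After the call, ai and tr hold only the rows that found no partner, which is what the question wanted to capture.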

Vector erase function in for loop is not erasing vector of classes properly

I have a simple for loop:
for (int i = 0; i < c.numparticles; i++)
{
    if ( labs((noncollision[i].getypos())) > 5000 )
    {
        noncollision.erase (noncollision.begin()+i);
    }
}
Where noncollision is a vector of class particle. In this specific example, any element of noncollision whose ypos has an absolute value greater than 5000 should be erased. I have been working with a noncollision size of 6, of which 2 have ypos much greater than 5000. However, this for loop is only erasing one of them, completely ignoring the other. My suspicion is that because noncollision is a vector of classes, the class is somehow protected, or causes the array function to act differently? Here is my declaration for noncollision, and for particle:
vector<particle> noncollision;
class particle{
private:
    int xpos;
    int ypos;
    int xvel;
    int yvel;
    bool jc; // Has the particle just collided?
public:
    etc....
};
Could anyone explain why this is happening, and how to rectify it? Do I somehow need to set up an 'erase function' for the particle class?
If you have two candidate elements next to each other (say, at i=5 and i=6), then you jump over the second, because you just erased the one at i=5... then the second becomes the new i=5 but you increment i to get i=6 on the next loop.
You need to fix your loop to properly support the fact that you're simultaneously removing elements from the same container over which you're iterating.
Typically you'd use actual iterators (rather than a counter i), and vector::erase conveniently returns a new iterator for you to use in the next iteration:
vector<particle>::iterator it = noncollision.begin(), end = noncollision.end();
for ( ; it != end; ) { // NB. no `++it` here!
    if (labs(it->getypos()) > 5000) {
        // erase this element, and get an iterator to the new next one
        it = noncollision.erase(it);
        // the end's moved, too!
        end = noncollision.end();
    }
    else {
        // otherwise, and only otherwise, continue iterating as normal
        it++;
    }
}
However, quoting Joe Z:
Also, since erase can be O(N) in the size of a vector, you might (a) benchmark the loop using reverse iterators too, (b) consider copying the not-erased elements into a fresh vector as opposed to deleting elements out of the middle, or (c) using a list<> instead of a vector<> if deleting from the middle is a common operation.
Or, if you're lazy, you could also just reverse your iteration order, which preserves the sanctity of your counter i in this specific case:
for (int i = c.numparticles-1; i >= 0; i--) {
    if (labs(noncollision[i].getypos()) > 5000) {
        noncollision.erase(noncollision.begin()+i);
    }
}
Just be careful never to change i to an unsigned variable (and your compiler is probably warning you to do just that — i.e. to use size_t instead — if c.numparticles has a sensible type) because if you do, your loop will never end!
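If an unsigned index is wanted anyway, one common way to write the reverse loop is to test and decrement in the loop condition. The sketch below uses a plain vector of int and a made-up function name, erase_greater_than, to keep it self-contained.

```cpp
#include <cstddef>
#include <vector>

// Erases every element greater than `limit`, iterating from back to front.
void erase_greater_than(std::vector<int>& values, int limit)
{
    // The comparison uses the old value of i and the decrement happens
    // afterwards, so the body only ever sees indices size()-1 down to 0.
    for (std::size_t i = values.size(); i-- > 0; ) {
        if (values[i] > limit)
            values.erase(values.begin() + static_cast<std::ptrdiff_t>(i));
    }
}
```

The loop ends when the old value of i is 0, so the body never sees an out-of-range index even though i wraps around on that final decrement.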
However, this for loop is only erasing one of them, completely ignoring the other.
This is because you are going front to back. When your code erases an item at, say, index 6, the item that was previously at index 7 is at the index 6 now. However, the loop is going to skip index 6 after i++, thinking that it has already processed it.
If you go back-to-front, the problem will be fixed:
for (int i = c.numparticles-1; i >= 0; i--)
{
    if ( labs((noncollision[i].getypos())) > 5000 )
    {
        noncollision.erase (noncollision.begin()+i);
    }
}
It looks like you're suffering from "Invalidated Iterator" syndrome, although in this case it's the index that's the problem.
Are the 2 elements you want to erase next to each other?
The problem is that erasing an element from a vector causes the elements after it to be shifted down by one position (unless you erase the last element), and the number of elements in the vector is reduced by one.
Since you're using indexing into the vector, you're not falling foul of the first problem (which is iterators being invalidated), but:
you will never check the element immediately after the one you just erased
your indexing will spill off the end of the vector (undefined behaviour)
Modifying any sequence you're inspecting in the same loop is a bad idea. Have a look at remove_if for a better way. This algorithm moves all the elements you want to keep to the front of the vector and returns an iterator to the new logical end, allowing you to erase everything from there to end() in one go safely.
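A sketch of how remove_if and erase combine for this case is shown below. The particle class here is a minimal stand-in that only assumes the getypos() accessor from the question; the constructor and the function name erase_far_particles are hypothetical.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Minimal stand-in for the question's particle class (assumed interface).
class particle {
public:
    explicit particle(int y) : ypos(y) {}
    int getypos() const { return ypos; }
private:
    int ypos;
};

// Erases every particle whose |ypos| exceeds `limit`.
void erase_far_particles(std::vector<particle>& particles, int limit)
{
    particles.erase(
        // remove_if compacts the kept elements at the front and
        // returns the new logical end of the range
        std::remove_if(particles.begin(), particles.end(),
                       [limit](const particle& p) {
                           return std::abs(p.getypos()) > limit;
                       }),
        particles.end());
}
```

The relative order of the particles that remain is preserved, and each element is moved at most once.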

Adding object to vector with push_back working fine, but adding objects with accessor syntax [ ] , not working

I've implemented a merge function for vectors, which basically combines two sorted vectors into one sorted vector (yes, it is for a merge sort algorithm). I was trying to make my code faster and avoid overhead, so I decided not to use the push_back method on the vector, but to use the array syntax instead, which has less overhead. However, something is going terribly wrong, and the output is messed up when I do this. Here's the code:
while(size1<left.size() && size2 < right.size()) //left and right are the input vectors
{
    //it1 and it2 are iterators on the two sorted input vectors
    if(*it1 <= *it2)
    {
        final.push_back(*it1); //final is the final vector to output
        //final[count] = *it1; // this does not work for some reason
        it1++;
        size1++;
        //cout<<"count ="<<count<<" size1 ="<<size1<<endl;
    }
    else
    {
        final.push_back(*it2);
        //final[count] = left[size2];
        it2++;
        size2++;
    }
    count++;
    //cout<<"count ="<<count<<" size1 ="<<size1<<"size2 = "<<size2<<endl;
}
It seems to me that the two methods should be functionally equivalent.
PS I have already reserved space for the final vector so that shouldn't be a problem.
You can't add new objects to a vector using operator[]. .reserve() doesn't add them either; it only allocates capacity and leaves the size unchanged. You have to use either .resize() or .push_back().
Also, you are not avoiding overhead at all; the call cost of operator[] isn't really much better than that of push_back(), so until you profile your code thoroughly, just use push_back. You can still use reserve to make sure unnecessary allocations won't be made.
In most cases, "optimizations" like this don't really help. If you want to make your code faster, profile it first and look for the hot paths.
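If indexed writes are still preferred, a merge along the following lines would work, because resize() actually creates the elements that operator[] then overwrites. This is just a sketch with int elements and a hypothetical function name, merge_sorted.

```cpp
#include <cstddef>
#include <vector>

// Merges two sorted vectors into one sorted vector using indexed writes.
std::vector<int> merge_sorted(const std::vector<int>& left,
                              const std::vector<int>& right)
{
    std::vector<int> merged;
    // resize (not reserve) so that every index below is a valid element
    merged.resize(left.size() + right.size());

    std::size_t i = 0, j = 0, count = 0;
    while (i < left.size() && j < right.size())
        merged[count++] = (left[i] <= right[j]) ? left[i++] : right[j++];
    while (i < left.size())   // copy whatever is left over
        merged[count++] = left[i++];
    while (j < right.size())
        merged[count++] = right[j++];
    return merged;
}
```

Note that resize() value-initializes every element first, so this is not automatically cheaper than reserve() followed by push_back(); profiling would have to decide.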
There is a huge difference between
vector[i] = item;
and
vector.push_back(item);
Differences:
The first one modifies the element at index i and i must be valid index. That is,
0 <= i < vector.size() must be true
If i is an invalid index, the first one invokes undefined behavior, which means ANYTHING can happen. You could, however, use at(), which throws an exception (std::out_of_range) if i is invalid:
vector.at(i) = item; //throws std::out_of_range if i is invalid
The second one adds an element to the vector at the end, which means the size of the vector increases by one.
Since, semantically, the two do different things, choose the one you need.
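The difference can be observed safely with at(). In the sketch below, try_assign is a made-up helper that reports whether the indexed write was valid instead of letting the exception propagate.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if writing through at(index) succeeded, false if it threw.
bool try_assign(std::vector<int>& v, std::size_t index, int item)
{
    try {
        v.at(index) = item; // bounds-checked against size(), not capacity()
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

On a vector that has only been reserved, try_assign(v, 0, 42) returns false because size() is still 0; after a push_back or resize the same call succeeds.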