C++ iterating with changing vector.size() - c++

I've written some perhaps naive code that is meant to remove elements from a vector that are too similar. The functionality is fine, but I think I may get unexpected results now and then because of the dynamic resizing of the vector.
for (size_t i = 0 ; i < vec.size(); i++) {
for(size_t j = i+1; j < vec.size(); j++) {
if(norm(vec[i]-vec[j]) <= 20 ) {
vec.erase(vec.begin()+j);
}
}
}
Is this safe to do? I'm concerned about i and j correctly adapting as I erase elements.

You need to pay better attention to where your elements are. It might be easier to express this directly in terms of iterators rather than compute iterators via indexes, like this:
for (auto it = vec.begin(); it != vec.end(); ++it)
{
for (auto jt = std::next(it); jt != vec.end(); )
{
if (/* condition */)
{
jt = vec.erase(jt);
}
else
{
++jt;
}
}
}

Yes, you are safe here. Since you are using indexes, not iterators, there is nothing to invalidate by erasing an item in the container except the size, and the size would be updated automatically, so we are good here.
One more thing to consider is what effect erasing an element inside the inner loop has on the stopping condition of the outer loop. There is no problem there either, because j is guaranteed to be strictly greater than i, so the j < vec.size() condition of the inner loop will be hit before the i < vec.size() condition of the outer loop, meaning that there would be no unsafe vec[i] access with an invalid index i.
Of course you should not increment j after erasing an element, to avoid the classic error of skipping the element that slides into position j. An even better approach would be to start walking the vector from the back, but you would need to do so in both loops to make sure that a valid element is never erased from underneath the outer index i.
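For illustration, here is a minimal sketch of the index-based version that advances j only when nothing was erased. It assumes plain doubles and a threshold parameter in place of the question's norm(vec[i]-vec[j]) <= 20 test; the name remove_similar is made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Removes every element that lies within `threshold` of an earlier, kept element.
// Plain doubles and the `threshold` parameter are assumptions for illustration.
void remove_similar(std::vector<double>& vec, double threshold) {
    assert(threshold >= 0.0);
    for (std::size_t i = 0; i < vec.size(); ++i) {
        for (std::size_t j = i + 1; j < vec.size(); /* advanced in the body */) {
            if (std::abs(vec[i] - vec[j]) <= threshold) {
                // erase shifts the later elements left, so index j
                // already names the next candidate: do not increment
                vec.erase(vec.begin() + static_cast<std::ptrdiff_t>(j));
            } else {
                ++j;
            }
        }
    }
}
```

Since only indexes beyond i are ever erased, vec[i] stays valid for the whole inner loop.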

Related

Will elements added into a std::unordered_set (or unordered_map) during iteration be visited during the iterations?

I have code that looks like the below:
std::unordered_set<int> ht{1,2,3};
ht.reserve(10000); // ht will not exceed this size
for(int i = 0; i < n; i++)
{
auto j = i;
for(auto it = ht.begin(); it != ht.end(); ++it)
{
// do some stuff
int v = j++;
ht.emplace(v);
}
}
For the inner loop, I want to loop from the beginning of ht to the end, but I don't want the loop to go over any of the newly added elements within the loop. In other words, is the above equivalent to the below?
std::unordered_set<int> ht{1,2,3};
ht.reserve(10000); // ht will not exceed this size
for(int i = 0; i < n; i++)
{
auto temp = ht;
auto j = i;
for(auto it = ht.begin(); it != ht.end(); ++it)
{
// do some stuff
auto v = j++;
temp.emplace(j);
}
ht = temp;
}
Based on a few runs that I did, it seems to be equivalent, but I don't know if this is undefined behavior, or if they are indeed equivalent. If the unordered_set was changed to a vector, this would not work, but it seems the forward iterators work.
Does the answer change if the ht.reserve(10000); // ht will not exceed this size was not present or if ht did in fact exceed the reserved capacity, and therefore all the forward iterators will be invalidated?
No, it's not safe:
On most cases, all iterators in the container remain valid after the insertion. The only exception being when the growth of the container forces a rehash. In this case, all iterators in the container are invalidated.
Sometimes it works, but I don't think this is enough for you!
No. See cppreference on std::unordered_set.
Cppreference.com has Iterator invalidation sections which describe when an iterator is invalidated. In the case of using std::unordered_set, inserting is not safe when a rehash occurs.
Rehashing occurs only if the new number of elements is greater than max_load_factor()*bucket_count()
And you can not know for certain whether inserting an element will cause that to happen.
In your example, you don't actually dereference the iterator, so why not loop over the size of the set?
size_t limit = ht.size();
for (size_t i = 0; i < limit; ++i) {
...
}
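If the loop body does need the elements themselves, one possible alternative is to collect the new values first and insert them only after the iteration has finished, so a rehash can no longer hurt. This is just a sketch; the function name and the first_new parameter are hypothetical and stand in for the j counter in the question.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Visits only the elements present when the call starts, then inserts new values.
// `add_one_per_existing` and `first_new` are made-up names for this sketch.
void add_one_per_existing(std::unordered_set<int>& ht, int first_new) {
    std::vector<int> pending;
    pending.reserve(ht.size());
    int next = first_new;
    for (int existing : ht) {
        (void)existing;            // "do some stuff" with the element would go here
        pending.push_back(next++); // remember the insertion instead of performing it
    }
    assert(pending.size() == ht.size()); // one pending value per visited element
    // No iterator into ht is in use any more, so a rehash here is harmless.
    ht.insert(pending.begin(), pending.end());
}
```

This costs one temporary vector per pass, which is usually cheaper than copying the whole set as in the second snippet of the question.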

how to see if all elements in vector are the same?

I have a vector of vectors.
How would I check to see if all elements in one of the columns are the same?
I've tried to check it with this nested for loop but I'm getting an out of range error.
void move_bee(vector< vector<insect> > &insects_on_board){
for(int i = 0; i < 10; i++){
for(int j = 0; j < insects_on_board.at(i).size(); j++){
for(int k = insects_on_board.at(i).size(); k > 0; k--){
if(insects_on_board.at(i).at(j) == "B" &&
insects_on_board.at(i).at(k) == "B"){
insects_on_board.at(i-1).push_back(bee());
insects_on_board.at(i).erase(insects_on_board.at(i).begin() + j);
}
}
}
}
}
I have read about the:
if (equal(myVector.begin() + 1, myVector.end(), myVector.begin()) )
method but it would not compile for me, I am assuming it's because it's a vector of vectors.
The initial value of k is off the end: use
for(auto k=v.size(); k--;)
to loop backwards.
The use of at(i-1) must also be wrong, since i starts at 0.
Unless your vector is sorted, checking just two elements can’t tell you if they’re all equal.
Finally, even if you can’t use a range-for (because of index use like i-1 and ...+j), do bind a reference to the element you’re working on, at least at the outer level. The readability improvement is significant.
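To check a whole column rather than just two of its elements, something along these lines might work. It is a sketch that assumes the elements are std::string (the question's insect type is not shown) and that a "column" is one inner vector; column_all_same is a made-up name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Returns true when every element of board[col] compares equal to its neighbour.
// An empty column counts as "all the same" here; that choice is an assumption.
bool column_all_same(const std::vector<std::vector<std::string>>& board,
                     std::size_t col) {
    assert(col < board.size()); // precondition: the column must exist
    const std::vector<std::string>& column = board[col];
    // adjacent_find with not_equal_to locates the first neighbouring pair that differs
    return std::adjacent_find(column.begin(), column.end(),
                              std::not_equal_to<std::string>()) == column.end();
}
```

Binding column as a reference once also avoids repeating board.at(i) in every comparison.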

c++ vector erase function not working for specific words?

I am using a very simple function in c++, vector.erase(), here's what I have (I'm trying to erase all instances of these three keywords from a .txt file):
First I use it in two separate for loops to erase all instances of <event> and </event>, this works perfectly and outputs the edited text file with no more instances of those words.
for (int j = 0; j< N-counter; j++) {
if(myvec[j] == "<event>") {
myvec.erase(myvec.begin()+j);
}
}
for (int j = 0; j< N-counter; j++) {
if(myvec[j] == "</event>") {
myvec.erase(myvec.begin()+j);
}
}
However, when I add a third for loop to do the EXACT same thing, literally just copy and paste with a new keyword as follows:
for (int j = 0; j< N-counter; j++) {
if(myvec[j] == "</LesHouchesEvents>") {
myvec.erase(myvec.begin()+j);
}
}
It compiles and executes, however it completely destroys the .txt file, making it completely un-openable, and when I cat it, I just get a bunch of crazy symbols.
I have tried switching the order of these for loops, even getting rid of the first two for loops entirely, everything I can think of, alas it just will not work for the keyword </LesHouchesEvents> for some strange reason.
Your loops are not taking into account that when you erase() an element from a vector, the indexes of the remaining elements will decrement accordingly. So your loops will eventually exceed the bounds of the vector once you have erased at least 1 element. You need to take that into account:
std::string word = ...;
size_t count = N-counter;
for (size_t j = 0; j < count;) {
if(myvec[j] == word) {
myvec.erase(myvec.begin()+j);
--count;
}
else {
++j;
}
}
With that said, it would be safer to use iterators instead of indexes. erase() returns an iterator to the element that immediately follows the removed element. You can use std::find() for the actual searching:
#include <algorithm>
std::vector<std::string>::iterator iter = std::find(myvec.begin(), myvec.end(), word);
while (iter != myvec.end())
{
iter = myvec.erase(iter);
iter = std::find(iter, myvec.end(), word);
}
Or, you could just use std::remove() instead:
#include <algorithm>
myvec.erase(std::remove(myvec.begin(), myvec.end(), word), myvec.end());
I don't know if this is your specific problem or not, but this loop is almost surely not what you want.
Note the documentation for erase - it "shifts" left the remaining elements. Unfortunately, your code still increments j, meaning you're skipping the next element:
for (int j = 0; j< N-counter; j++) { // <- Don't increment j here
...
myvec.erase(myvec.begin()+j); // <- increment it only if this didn't happen.
}
You'll also need to adjust your loop's halting condition.
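To make the skipping visible, here is a small sketch with both variants side by side. It uses v.size() as the halting condition instead of N-counter, which is an assumption about what the adjusted condition should be; both function names are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Erases every occurrence of `word`, advancing j only when nothing was erased.
void erase_word(std::vector<std::string>& v, const std::string& word) {
    for (std::size_t j = 0; j < v.size(); /* advanced in the body */) {
        if (v[j] == word) {
            v.erase(v.begin() + static_cast<std::ptrdiff_t>(j));
        } else {
            ++j;
        }
    }
    assert(std::find(v.begin(), v.end(), word) == v.end()); // nothing left behind
}

// Same loop with an unconditional ++j, kept only to show the skipped element.
void erase_word_skipping(std::vector<std::string>& v, const std::string& word) {
    for (std::size_t j = 0; j < v.size(); ++j) {
        if (v[j] == word) {
            v.erase(v.begin() + static_cast<std::ptrdiff_t>(j));
        }
    }
}
```

With two matching elements next to each other, the second variant leaves one of them in the vector, because the survivor slides into the slot that j has just moved past.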
Even assuming you got it working, this is nearly the worst possible way to remove items from a vector.
You almost certainly want the remove/erase idiom here, and you probably want to do all the comparisons in a single pass, so it's something like this:
std::vector<std::string> bad = {
"<event>",
"</event>",
"</LesHouchesEvents>"
};
myvec.erase(std::remove_if(myvec.begin(), myvec.end(),
[&](std::string const &s) {
return std::find(bad.begin(), bad.end(), s) != bad.end();
}),
myvec.end());

call to condition on for loop (c++)

Here is a simple question I have been wondering about for a long time :
When I do a loop such as this one :
for (int i = 0; i < myVector.size() ; ++i) {
// my loop
}
As the condition i < myVector.size() is checked each time, should I store the size of the array inside a variable before the loop to prevent the call to size() each iteration ? Or is the compiler smart enough to do it itself ?
mySize = myVector.size();
for (int i = 0; i < mySize ; ++i) {
// my loop
}
And I would extend the question with a more complex condition such as i < myVector.front()/myVector.size()
Edit: I don't use myVector inside the loop, it is just here to give the ending condition. And what about the more complex condition?
The answer depends mainly on the contents of your loop–it may modify the vector during processing, thus modifying its size.
However if the vector is just scanned you can safely store its size in advance:
for (int i = 0, mySize = myVector.size(); i < mySize ; ++i) {
// my loop
}
although in most classes the functions like 'get current size' are just inline getters:
class XXX
{
public:
int size() const { return mSize; }
....
private:
int mSize;
....
};
so the compiler can easily reduce the call to just reading the int variable, consequently prefetching the length gives no gain.
If you are not changing anything in the vector (adding/removing) during the for loop (which is the normal case), I would use a range-based for loop:
for (auto object : myVector)
{
//here some code
}
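or, if you cannot use C++11, I would use iterators with the type spelled out (auto is itself a C++11 feature):
for (std::vector</*vectortype here*/>::iterator it = myVector.begin(); it != myVector.end(); ++it)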
{
//here some code
}
I'd say that
for (int i = 0; i < myVector.size() ; ++i) {
// my loop
}
is a bit safer than
mySize = myVector.size();
for (int i = 0; i < mySize ; ++i) {
// my loop
}
because the value of myVector.size() may change (as a result of, e.g., push_back(value) inside the loop), in which case the cached bound would make you miss some of the elements.
If you are 100% sure that the value of myVector.size() is not going to change, then both are the same thing.
Yet the first one is a bit more flexible than the second (another developer may be unaware that the loop iterates over a fixed size and might change the array size). Don't worry about the compiler, it's smarter than both of us combined.
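A tiny sketch of the difference, under the assumption that the loop body appends one element on its first pass (count_visits and the cache_size flag are made up for the demonstration):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts loop iterations when the body appends one element on the first pass.
// cache_size == true reads the bound once before the loop;
// otherwise size() is re-evaluated on every iteration.
std::size_t count_visits(std::vector<int> v, bool cache_size) {
    assert(!v.empty()); // the demonstration needs at least one pass
    std::size_t visits = 0;
    const std::size_t cached = v.size();
    for (std::size_t i = 0; i < (cache_size ? cached : v.size()); ++i) {
        if (i == 0) {
            v.push_back(42); // the size changes while the loop is running
        }
        ++visits;
    }
    return visits;
}
```

With a three-element vector, the cached bound gives three visits and never reaches the appended element, while the live size() bound gives four.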
The overhead is very small.
vector.size() does not recalculate anything, but simply returns the value of the private size variable.
It is safer than pre-buffering the value, as the vector's internal size variable changes when an element is pushed to or popped from the vector.
Compilers can optimize this out if, and only if, they can prove that the vector is not changed by ANYTHING while the for loop runs.
That is difficult to do if there are threads in there, but if there isn't any threading going on, it's very easy to optimize.
Any smart compiler will probably optimize this out. However just to be sure I usually lay out my for loops like this:
for (int i = myvector.size() -1; i >= 0; --i)
{
}
A couple of things are different:
The iteration is done the other way around. Although this shouldn't be a problem in most cases. If it is I prefer David Haim's method.
The --i is used rather than i--. In theory --i is faster, although on most compilers it won't make a difference.
If you don't care about the index this:
for (int i = myvector.size(); i > 0; --i)
{
}
Would also be an option. Although in general I don't use it because it is a bit more confusing than the first, and it will not gain you any performance.
For a type like std::vector or std::list an iterator is the preferred method:
for (std::vector</*vectortype here*/>::iterator i = myVector.begin(); i != myVector.end(); ++i)
{
}

C++ list iterator arithmetic?

I'm trying to create a set of loops with iterators and I'm having trouble with some iterator arithmetic (that I thought was possible but is not working).
Below is some code:
for (list<Term>::iterator itr = final.begin(); itr != final.end(); itr++) {
for(list<Term>::iterator j = itr + 1; j != final.end(); j++) {
cout << itr->term << " " << j->term;
if(itr->term == j->term) {
//Do stuff
}
}
}
What I am trying to do is have j start at the next place in the queue along from itr. The reason for this is I don't want to check the first item against itself. The error itself comes from the part in the code where I have specified itr + 1. Now I was sure with pointers you could do arithmetic like this, why is it not working with the list iterator (which is essentially the same thing?)
The error I am getting from my IDE is as follows: main.cpp:237:48: error: no match for ‘operator+’ in ‘itr + 1’. Again I thought you could do this sort of arithmetic on iterators so I'm not really sure what to do to make this work, is there an alternate implementation I could try?
list iterators are not random access so you cannot do + with them. They are bidirectional iterators so the only movement operations you can do are -- and ++. You can either make a copy and use ++ on it, or make a copy and std::advance(it, 1).
For C++11 there is also std::next which gives you it + 1, without you having to explicitly make a named copy like you do with the others.
list has bidirectional iterators, which don't support operator+. You can use std::advance, or std::next in C++11.
for (list<Term>::iterator j = next(itr); j != final.end(); ++j)
or
list<Term>::iterator j = itr;
advance(j, 1); // or ++j
for (; j != final.end(); ++j)
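Put together, the nested loop from the question might look roughly like this. The Term struct is a simplified stand-in (the real one is not shown in the question), and count_equal_pairs is a made-up name that replaces the "Do stuff" part with a counter.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>

// Simplified stand-in for the question's Term type (an assumption).
struct Term {
    std::string term;
};

// Counts pairs of list entries (earlier, later) whose term strings are equal.
std::size_t count_equal_pairs(const std::list<Term>& terms) {
    std::size_t pairs = 0;
    for (std::list<Term>::const_iterator itr = terms.begin(); itr != terms.end(); ++itr) {
        // std::next returns an advanced copy; itr itself is left unchanged
        for (std::list<Term>::const_iterator j = std::next(itr); j != terms.end(); ++j) {
            if (itr->term == j->term) {
                ++pairs;
            }
        }
    }
    return pairs;
}
```

When itr points at the last element, std::next(itr) is already end(), so the inner loop simply does not run.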