Is it valid to write *it++ on an input iterator?
As I understand it, the code dereferences the iterator, gives me the value, and then steps one ahead.
According to the C++ reference, the * operator has lower precedence than the postfix ++ operator: http://en.cppreference.com/w/cpp/language/operator_precedence
But I read this style is bad practice. Why?
Is it valid to do this on an input iterator *it++?
Yes, that is valid. The iterator will be incremented, its previous value will be returned, and you will dereference that returned iterator.
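As a minimal sketch of the expression in use, assuming a std::istream_iterator as the input iterator (read_first_two is a hypothetical helper name, not something from the question):

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: reads the first two integers from `text` using *it++.
std::pair<int, int> read_first_two(const std::string& text) {
    std::istringstream input(text);
    std::istream_iterator<int> it(input);  // a genuine input iterator
    std::istream_iterator<int> end;        // default-constructed: end of stream

    assert(it != end);
    int first = *it++;   // parsed as *(it++): yields the current value, then advances
    assert(it != end);
    int second = *it++;
    return {first, second};
}
```

Because postfix ++ binds tighter than unary *, the increment applies to the iterator and the dereference applies to the copy that the increment returns.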
But I read this style is bad practice. Why?
Consider these two implementations I've just pulled out of some graph code I wrote a while back:
// Pre-increment
BidirectionalIterator& operator++()
{
    edge = (edge->*elem).next;
    return *this;
}
// Post-increment
BidirectionalIterator operator++(int)
{
    TargetIterator oldval(list, edge);
    edge = (edge->*elem).next;
    return oldval;
}
Notice that for post-increment, I need to first construct a new iterator to store the previous value which will be returned.
If it's simple and clear to write your program to make use of pre-increment, there may be better performance, less work for the compiler to do, or both.
Just don't go crazy on this (for example, rewriting all your loops)! That would likely be micro-optimization. However, the reason people say it's good practice is that if you get into a habit of using pre-increment as default then you get (potentially) better performance by default.
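To make the extra copy visible, here is a small sketch with a hypothetical CountingCursor type (not the graph iterator above) that records every copy construction. For cheap iterators an optimizer will usually remove the difference, so treat this as an illustration rather than a benchmark.

```cpp
#include <cassert>

// Hypothetical iterator-like type that counts how often it is copied.
struct CountingCursor {
    int position;
    int* copies;  // where copy constructions are recorded

    CountingCursor(int pos, int* counter) : position(pos), copies(counter) {
        assert(counter != nullptr);
    }
    CountingCursor(const CountingCursor& other)
        : position(other.position), copies(other.copies) { ++*copies; }
    CountingCursor(CountingCursor&&) noexcept = default;  // moves are not counted

    // Pre-increment: advance in place, no temporary needed.
    CountingCursor& operator++() { ++position; return *this; }

    // Post-increment: must keep the old state around to return it.
    CountingCursor operator++(int) {
        CountingCursor old(*this);  // the extra copy
        ++position;
        return old;
    }
};

// Advances a cursor five times and reports how many copies were made.
int copies_made(bool use_post_increment) {
    int copies = 0;
    CountingCursor cursor(0, &copies);
    for (int i = 0; i < 5; ++i) {
        if (use_post_increment) cursor++; else ++cursor;
    }
    return copies;
}
```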
I am implementing a container that presents a map-like interface. The physical implementation is a std::vector<std::pair<K*, T>>. A K object remembers its assigned position in the vector. It is possible for a K object to get destroyed. In that case its remembered index is used to zero out its corresponding key pointer within the vector, creating a tombstone.
I would like to expose the full traditional collection of iterators, though I think that they need only claim to be forward_iterators (see next).
I want to be able to use range-based for loop iteration to return only the non-tombstoned elements. Further, I would like the implementation of my iterators to be a single pointer (i.e. no back pointer to the container).
Since the range-based for loop is pretested I think that I can implement tombstone skipping within the inequality predicate.
bool operator != (MyIterator& cursor, MyIterator stop) {
    while (cursor != stop) {
        if (cursor->first)
            return true;
        ++cursor;
    }
    return false;
}
Is this a reasonable approach? If yes, is there a simple way for me to override the inequality operator of std::vector's iterators instead of implementing my iterators from scratch?
If this is not a reasonable approach, what would be better?
Is this a reasonable approach?
No. (Keep in mind that operator!= can be used outside a range-based for loop.)
Your operator does not accept a const object as its first parameter (meaning a const vector::iterator).
You have undefined behavior if the first parameter comes after the second (e.g. if someone tests end != cur instead of cur != end).
You get this weird case where, given iterators a and b, it might be that *a is different than *b, but if you check if (a != b) then you find that the iterators are equal and then *a is the same as *b. This probably wreaks havoc with the multipass guarantee of forward iterators (but the situation is bizarre enough that I would want to check the standard's precise wording before passing judgement). Messing with people's expectations is inadvisable.
There is no simple way to override the inequality operator of std::vector's iterators.
If this is not a reasonable approach, what would be better?
You already know what would be better. You're just shying away from it.
Implement your own iterators from scratch. Wrapping your vector in your own class has the benefit that only the code for that class has to be aware that tombstones exist.
Caveat: Document that the conditions that create a tombstone also invalidate iterators to that element. (Invalid iterators are excluded from most iterator requirements, such as the multipass guarantee.)
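A rough sketch of what such a wrapper might look like. TombstoneMap, push_back and bury are hypothetical names, and the iterator here stores the current and end positions (two pointers, not the single pointer the question hoped for) so that the skipping can live in operator++ instead of operator!=.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

template <class K, class T>
class TombstoneMap {
    using Storage = std::vector<std::pair<K*, T>>;
    Storage slots_;

public:
    class iterator {
        typename Storage::iterator cur_, end_;
        // Move forward until a live element or the end is reached.
        void skip() { while (cur_ != end_ && cur_->first == nullptr) ++cur_; }

    public:
        using iterator_category = std::forward_iterator_tag;
        using value_type = std::pair<K*, T>;
        using difference_type = std::ptrdiff_t;
        using pointer = value_type*;
        using reference = value_type&;

        iterator() = default;
        iterator(typename Storage::iterator cur, typename Storage::iterator end)
            : cur_(cur), end_(end) { skip(); }

        reference operator*() const { return *cur_; }
        pointer operator->() const { return &*cur_; }
        iterator& operator++() { ++cur_; skip(); return *this; }
        iterator operator++(int) { iterator old(*this); ++*this; return old; }

        // Comparison has no side effects and accepts const operands.
        friend bool operator==(const iterator& a, const iterator& b) { return a.cur_ == b.cur_; }
        friend bool operator!=(const iterator& a, const iterator& b) { return !(a == b); }
    };

    void push_back(K* key, T value) { slots_.emplace_back(key, std::move(value)); }

    // Turns the slot at `index` into a tombstone (invalidates iterators to it).
    void bury(std::size_t index) {
        assert(index < slots_.size());
        slots_[index].first = nullptr;
    }

    iterator begin() { return iterator(slots_.begin(), slots_.end()); }
    iterator end() { return iterator(slots_.end(), slots_.end()); }
};
```

With this shape a range-based for loop over the map only ever sees live elements, and nothing outside the class needs to know that tombstones exist.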
OR
While your implementation makes a poor operator!=, it could be a fine update or check function. There's this little-known secret that C++ has more looping structures than just range-based for loops. You could make use of one of these, for example:
for ( auto cur = vec.begin(); skip_tombstones(cur, vec.end()); ++cur ) {
    auto& element = *cur;
    // ... use element ...
}
where skip_tombstones() is basically your operator!= renamed. If not much code needs to iterate over the vector, this might be a reasonable option, even in the long term.
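A possible shape for that function, written as a template so it works with the vector's own iterators. The sum_live helper and the int key and value types are only assumptions for the sake of a complete example.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Advances `cur` past tombstones (null key pointers).
// Returns true if `cur` now refers to a live element, false if it reached `end`.
template <class Iter>
bool skip_tombstones(Iter& cur, Iter end) {
    while (cur != end) {
        if (cur->first)
            return true;
        ++cur;
    }
    return false;
}

// Hypothetical usage: sums the values of all live entries.
int sum_live(std::vector<std::pair<int*, int>>& vec) {
    int total = 0;
    for (auto cur = vec.begin(); skip_tombstones(cur, vec.end()); ++cur) {
        total += cur->second;
    }
    return total;
}
```

Since the skipping happens in an explicitly named function, the vector's operator!= keeps its usual meaning everywhere else.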
I was attempting to compare a const iterator to a non-const iterator but was not certain whether it was okay, so I looked it up. I found out it is OK due to an implicit conversion of non-const iterator to const iterator. However, I was wondering whether you should prefer to not compare these iterators, in order to avoid this conversion. For example,
begin_iterator = buf.begin(); end_iterator = buf.cend();
for (; begin_iterator != end_iterator; ++begin_iterator) { ... }
Normally, I would consider this OK as const means read-only, which is fine. However, I am uncertain about the (unnecessary) conversion.
It's always fine to convert a non const iterator to a const iterator, just as it's fine to convert a non const pointer to a const pointer.
The opposite is not something you want to do: it can lead to a crash (for instance, because the data is stored in read-only memory). So never do it.
While this might work in this case, it's not the standard pattern for creating ranges. In particular, the standard algorithms expect two iterators of the same type, so, for example, std::fill(whatever.begin(), whatever.cend(), 3) would not compile.
As a result, the code in the question creates a maintenance problem. Suppose a maintainer realizes that the for loop is doing something that can be done with a standard algorithm. The obvious transformation is to replace the for loop with the algorithm. So
begin_iterator = buf.begin(); end_iterator = buf.cend();
for (; begin_iterator != end_iterator; ++begin_iterator) { ... }
becomes
begin_iterator = buf.begin(); end_iterator = buf.cend();
std::whatever(begin_iterator, end_iterator);
but that doesn't compile, so the maintainer has to hunt around and discover that the two iterators are not the same type, and then figure out whether it's okay to change the type of the end iterator to match the begin iterator. That means examining all of the places where the end iterator is used, to determine why it's a different type and whether that matters.
So the real issue is, what problem does this solve, and is the cost worth it? In general, the answer to the first is "nothing important" and the answer to the second is "no".
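A short sketch of the difference, assuming buf is a std::vector<int> (the function names are made up for illustration). The hand-written loop accepts the mixed pair; the algorithm version keeps both iterators the same type so that it compiles.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hand-written loop: comparing iterator with const_iterator compiles,
// because the non-const iterator converts implicitly for the comparison.
int count_zeros_by_hand(std::vector<int>& buf) {
    int zeros = 0;
    for (auto it = buf.begin(); it != buf.cend(); ++it) {
        if (*it == 0) ++zeros;
    }
    return zeros;
}

// Algorithm version: std::count needs both arguments to have one type,
// so cbegin() is paired with cend(). Passing begin() and cend() here
// would fail template argument deduction.
int count_zeros_with_algorithm(std::vector<int>& buf) {
    return static_cast<int>(std::count(buf.cbegin(), buf.cend(), 0));
}
```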
I have a std::list<double> foo;
I'm using
if (foo.size() >= 2){
double penultimate = *(--foo.rbegin());
}
but this always gives me an arbitrary value of penultimate.
What am I doing wrong?
Rather than decrementing rbegin, you should increment it, as shown here [1]:
double penultimate = *++foo.rbegin();
This works because rbegin() returns a reverse iterator, so ++ is the operator that moves backwards in the container. Note that I've also dropped the superfluous parentheses: that's not to everyone's taste.
Currently the behaviour of your program is undefined since you are actually moving to end(), and you are not allowed to dereference that. The arbitrary nature of the output is a manifestation of that undefined behaviour.
[1] Do retain the minimum size check that you currently have.
The clearest way, in my mind, is to use the construct designed for this purpose (C++11):
double penultimate = *std::prev(foo.end(), 2);
I would just do *--(--foo.end()); no need for reverse iterators. It's less confusing too.
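For comparison, the three spellings suggested in these answers side by side, wrapped in hypothetical helper functions. Each one keeps the size check as a precondition, since all of them are undefined for lists with fewer than two elements.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Incrementing a reverse iterator steps backwards through the list.
double penultimate_via_rbegin(const std::list<double>& values) {
    assert(values.size() >= 2);
    return *++values.rbegin();
}

// std::prev (C++11) steps a copy of end() back by two.
double penultimate_via_prev(const std::list<double>& values) {
    assert(values.size() >= 2);
    return *std::prev(values.end(), 2);
}

// Decrementing end() twice with plain forward iterators.
double penultimate_via_end(const std::list<double>& values) {
    assert(values.size() >= 2);
    return *--(--values.end());
}
```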
The STL has many functions that return iterators. For example, the STL list function erase returns an iterator (both C++98 and C++11). Nevertheless, I see it being used as a void function. Even the cplusplus.com site has an example that contains the following code:
mylist.erase(it1, it2);
which does not return an iterator. Shouldn't the proper syntax be as follows?
std::list<int>::iterator iterList = mylist.erase(it1, it2)?
You are not forced to use a return value in C++. The returned iterator is normally useful because it gives you a valid iterator into the container after the erasure, but ignoring it is syntactically correct. Just keep in mind that after the erasure, it1 and any other iterators to the erased elements will no longer be valid; for a std::list, it2 itself is not erased and remains valid.
which does not return an iterator. Shouldn't the proper syntax be as follows?
It does return an iterator, but just as with any other function you can ignore the returned value. If you just want to erase an element and don't care about the returned iterator, then you may simply ignore it.
Actually ignoring the return value is more common than you might expect. Maybe the most common place where return values are ignored is when the only purpose of the return value is to enable chaining. For example the assignment operator is usually written as
Foo& operator=(const Foo& other){
    /*...*/
    return *this;
}
so that you can write something like a = b = c. However, when you only write b = c you usually ignore the value returned by that call. Also note, that when you write b = c this does return a value, but if you just want to make this assignment, then you have no other choice than to ignore the return value.
The actual reason is far more banal than you might think. If you couldn't ignore a return value, many of the functions like .erase() would need to be split in two versions: an .erase() version returning void and another .erase_and_return() version which returned the iterator. What would you gain from this?
Return values are things that are sometimes useful. If they are not useful, you don't have to use them.
The iterator-taking container.erase functions generally return the iterator just past the newly erased range (for the associative containers, only since C++11). For non-node-based containers this is often (but not always) useful because the iterators at and after the range are no longer valid.
For node-based containers this is usually useless, as the 2nd iterator passed in remains valid even after the erase operation.
Regardless, both return that iterator. This permits code that works on a generic container to not have to know if the container maintains valid iterators after the erase or not; it can just store the return value and have a valid iterator to after-the-erase.
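A sketch of that kind of generic code: erase_matching and without_evens are hypothetical helpers, and the same loop works for both std::vector (where erase invalidates later iterators) and std::list (where it does not), precisely because it always continues from the returned iterator.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Erases every element for which pred returns true, continuing from the
// iterator that erase() returns instead of relying on old iterators.
template <class Container, class Pred>
void erase_matching(Container& c, Pred pred) {
    for (auto it = c.begin(); it != c.end(); ) {
        if (pred(*it))
            it = c.erase(it);  // valid for both node-based and contiguous containers
        else
            ++it;
    }
}

// The same helper applied to two different container types.
std::vector<int> without_evens(std::vector<int> values) {
    erase_matching(values, [](int x) { return x % 2 == 0; });
    return values;
}

std::list<int> without_evens(std::list<int> values) {
    erase_matching(values, [](int x) { return x % 2 == 0; });
    return values;
}
```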
Iterators in C++ standard containers are all extremely cheap to create and destroy and copy; in fact, they are in practice so cheap that compilers can eliminate them entirely if they aren't used. So returning an iterator that isn't used can have zero run time cost.
A program that doesn't use this return value here can be a correct program, both semantically and syntactically. At the same time, other semantically and syntactically correct programs will require that you use that return value.
Finally,
mylist.erase(it1, it2);
this does return an iterator. The iterator is immediately discarded (the return value only exists as an unnamed temporary), and compilers are likely to optimize it out of existence.
But just because you don't store a return value, doesn't mean it isn't returned.
I have an iterator DataIterator that produces values on-demand, so the dereference operator returns a Data, and not a Data&. I thought it was an OK thing to do, until I tried to reverse the DataIterator by wrapping it in a reverse_iterator.
DataCollection collection;
std::reverse_iterator<DataIterator> rBegin(iter); // iter is a DataIterator that's part-way through the collection
std::reverse_iterator<DataIterator> rEnd(collection.cbegin());
auto Found = std::find_if(
    rBegin,
    rEnd,
    [](const Data& candidate) {
        return candidate.Value() == 0x00;
    });
When I run the above code, it never finds a Data object whose value is equal to 0, even though I know one exists. When I stick a breakpoint inside the predicate, I see weird values that I would never expect to see like 0xCCCC - probably uninitialized memory. What happens is that the reverse_iterator's dereference operator looks like this (from xutility - Visual Studio 2010)
Data& operator*() const
{   // return designated value
    DataIterator _Tmp = current;
    return (*--_Tmp); // Here's the problem - the * operator on DataIterator returns a value instead of a reference
}
The last line is where the problem is: a temporary Data gets created and a reference to that temporary gets returned. The reference is invalid immediately.
If I change my predicate in std::find_if to take (Data candidate) instead of (const Data& candidate) then the predicate works - but I'm pretty sure I'm just getting lucky with undefined behavior there. The reference is invalid, but I'm making a copy of the data before the memory gets clobbered.
What can I do?
Fix my DataIterator so that operator* returns a Data& instead of a Data? I don't really see how this is possible. The whole point of my DataIterator returning a Data instead of a Data& is that I don't have room to hold the entire uncompressed data set in memory, so I create the items that you want to look at on demand. Maybe I could hold onto the 'current' data value, but then that reference is going to become invalid the moment you increment or decrement the DataIterator. Edit: one of the answers suggests a shared_ptr.
Write a specialization of the reverse_iterator and make its dereference operator return a value instead of a reference? This seems like a frustrating amount of work, but understandable since it's my DataIterator that's not playing nice here - not the rest of the STL.
Along the same lines, maybe make a find_if that goes in reverse - probably less work than specializing reverse_iterator.
Something else I haven't thought of
Is there something I can do to DataIterator that will prevent someone else from blowing half a day figuring out what's wrong when they try the same thing 6 months from now?
Not that I'm a great fan of the idea, but if you heap allocated a Data object and then returned a ref to a shared_ptr to it, that would allow the outside world to hold onto it longer if needed, and for you to "forget" about it when you step forward.
On the other hand, implementing your own native reverse_iterator might be a bigger win. That's what I did for my own linked list since I didn't use a sentinel object like gcc does and couldn't use std::reverse_iterator. It really wasn't that difficult.
It's because the reverse_iterator interface was designed before the existence of decltype. Today, that would be written as
auto operator*() const -> decltype(*current)
{   // return designated value
    DataIterator _Tmp = current;
    return (*--_Tmp);
}
and in C++14, even the trailing return type won't be needed, since it can be inferred.
decltype(auto) operator*() const
{   // return designated value
    DataIterator _Tmp = current;
    return (*--_Tmp);
}
I ended up going with Casey's suggestion in the comments. He didn't post it as an answer that I can accept, so I'll write it up myself.
I made a specialization of the reverse_iterator for DataIterator that returns a value instead of a reference. This involved copy/pasting the implementation from xutility, specifying one of the template arguments to be a DataIterator, and changing
reference operator*() const
to
value operator*() const
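For anyone who would rather not copy the library's implementation, here is a stripped-down sketch of the same idea as a standalone adaptor. Everything in it is hypothetical: SquaresIterator stands in for DataIterator (it computes each value on demand and returns it by value), and ValueReverseIterator implements only what std::find_if needs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>

// Hypothetical stand-in for DataIterator: operator* returns by value.
class SquaresIterator {
    int index_;
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = int;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = int;  // not a real reference: values are produced on demand

    explicit SquaresIterator(int index) : index_(index) {}
    int operator*() const { return index_ * index_; }
    SquaresIterator& operator++() { ++index_; return *this; }
    SquaresIterator& operator--() { --index_; return *this; }
    friend bool operator==(const SquaresIterator& a, const SquaresIterator& b) { return a.index_ == b.index_; }
    friend bool operator!=(const SquaresIterator& a, const SquaresIterator& b) { return !(a == b); }
};

// Minimal reverse adaptor whose operator* returns a value, so no reference
// to a destroyed temporary can escape.
template <class Iter>
class ValueReverseIterator {
    Iter current_;
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = typename std::iterator_traits<Iter>::value_type;
    using difference_type = typename std::iterator_traits<Iter>::difference_type;
    using pointer = void;
    using reference = value_type;

    explicit ValueReverseIterator(Iter it) : current_(it) {}
    value_type operator*() const { Iter tmp = current_; return *--tmp; }
    ValueReverseIterator& operator++() { --current_; return *this; }
    ValueReverseIterator operator++(int) { ValueReverseIterator old(*this); --current_; return old; }
    friend bool operator==(const ValueReverseIterator& a, const ValueReverseIterator& b) { return a.current_ == b.current_; }
    friend bool operator!=(const ValueReverseIterator& a, const ValueReverseIterator& b) { return !(a == b); }
};

// Searches the squares of 0..count-1 from the back for the first one below
// `limit`; returns it, or -1 if there is none.
int last_square_below(int count, int limit) {
    assert(count >= 0);
    ValueReverseIterator<SquaresIterator> rbegin{SquaresIterator{count}};
    ValueReverseIterator<SquaresIterator> rend{SquaresIterator{0}};
    auto found = std::find_if(rbegin, rend, [limit](int value) { return value < limit; });
    return found == rend ? -1 : *found;
}
```

Declaring the category as input_iterator_tag is a deliberate choice in this sketch: an iterator whose operator* returns a value does not meet the pre-C++20 forward iterator requirements, so it should not claim to.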