Someone here recently brought up the article from Scott Meyers that says:
Prefer iterators over const_iterators (pdf link).
Someone else was commenting that the article is probably outdated. I'm wondering what your opinions are?
Here is mine: One of the main points of the article is that you cannot erase or insert on a const_iterator, but I think it's funny to use that as an argument against const_iterators. I thought the whole point of const_iterators is that you do not modify the range at all, neither the elements themselves by substituting their values nor the range by inserting or erasing. Or am I missing something?
I totally agree with you.
I think the answer is simple:
Use const_iterators where const values are the right thing to use, and vice versa.
Seems to me that those who are against const_iterators must be against const in general...
Here's a slightly different way to look at it. A const_iterator almost never makes sense when you are passing it as a position into a specific collection and you are passing the collection as well. Meyers was specifically stating that a const_iterator cannot be used with most member functions of a collection instance. In that case, you will need a plain-old iterator. However, if you don't have a handle to the collection, the only difference between the two is that you can modify what is pointed to by an iterator and you can't modify the object referenced by a const_iterator.
So... you want to use iterator whenever you are passing a collection and position into the collection to an algorithm. Basically, signatures like:
void some_operation(std::vector<int>& vec, std::vector<int>::const_iterator pos);
don't make a whole lot of sense. The implicit statement is that some_operation is free to modify the underlying collection but is not allowed to modify what pos references. That doesn't make much sense. If you really want this, then pos should be an offset instead of an iterator.
On the flip side, most of the algorithms in the STL are based on ranges specified by a pair of iterators. The collection itself is never passed so the difference between iterator and const_iterator is whether the value in the collection can be modified through the iterator or not. Without a reference to the collection, the separation is pretty clear.
Hopefully that made things as clear as mud ;)
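To make the two cases concrete, here is a minimal sketch. The names sum_range and insert_at are made up for illustration; the first function only sees a range (so const_iterator cleanly says "read-only"), while the second receives the collection itself and therefore takes an offset rather than a const_iterator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Read-only algorithm: it only sees a range, never the container.
// const_iterator states that elements are not modified through the range.
// (sum_range is a hypothetical name.)
inline int sum_range(std::vector<int>::const_iterator first,
                     std::vector<int>::const_iterator last) {
    int total = 0;
    for (; first != last; ++first) total += *first;
    return total;
}

// Mutating operation: it receives the container, so the position is passed
// as an offset instead of a const_iterator. (insert_at is a hypothetical name.)
inline void insert_at(std::vector<int>& vec, std::size_t pos, int value) {
    assert(pos <= vec.size());
    vec.insert(vec.begin() + static_cast<std::ptrdiff_t>(pos), value);
}
```

Calling sum_range(v.cbegin(), v.cend()) documents that v's elements are only read, whereas insert_at(v, 1, 9) makes no pretence of constness.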
I don't think this particular statement of Meyers' needs to be taken with special concern. When you want a non-modifying operation, it is best to use a const_iterator. Otherwise, use an ordinary iterator. However, do note the one important thing: never mix iterators, i.e. const ones with non-const ones. As long as you are aware of the latter, you should be fine.
I generally prefer constness, but recently came across a conundrum with const_iterators that has confused my "always use const where possible" philosophy:
MyList::const_iterator find( const MyList & list, int identifier )
{
// do some stuff to find identifier
return retConstItor;
}
Since passing in a const list reference required that I only use const iterators, now if I use the find, I cannot do anything with the result but look at it even though all I wanted to do was express that find would not change the list being passed in.
I wonder, then, whether Scott Meyers' advice has to do with issues like this, where it becomes impossible to escape const-ness. From what I understand, you cannot (reliably) un-const const_iterators with a simple cast because of some internal details. This may also (perhaps in conjunction) be the issue.
this is probably relevant: How to remove constness of const_iterator?
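For what it's worth, since C++11 there is a cast-free way out if the caller has non-const access to the container: erasing the empty range [pos, pos) removes nothing and returns a mutable iterator to the same position. The sketch below assumes MyList is a std::vector<int> (the post does not say what it is), and find_id and unconst are hypothetical names.

```cpp
#include <cassert>
#include <vector>

typedef std::vector<int> MyList;  // assumption: stand-in for the MyList in the post

// Non-modifying search: const container in, const_iterator out.
inline MyList::const_iterator find_id(const MyList& list, int identifier) {
    MyList::const_iterator it = list.begin();
    while (it != list.end() && *it != identifier) ++it;
    return it;
}

// Turns a const_iterator into an iterator. This needs non-const access to
// the container, which is exactly the permission the caller must hold anyway.
// Erasing an empty range removes no elements (requires C++11 or later).
inline MyList::iterator unconst(MyList& list, MyList::const_iterator pos) {
    return list.erase(pos, pos);
}
```

So find can keep promising not to touch the list, and the caller who owns a mutable list can still modify the element that was found.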
C++98
I think one needs to take into account that Meyers' statement refers to C++98. It is hard to tell today, but if I remember right:
it simply was not easy to get a const_iterator for a non const container at all
if you got a const_iterator you could have hardly made any use of it since most (all?) position arguments for container member functions were expected to be iterators and not const_iterators
e.g.
std::vector<int> container;
would have required
static_cast<std::vector<int>::const_iterator>(container.begin())
to get a const_iterator, which would have considerably inflated a simple .find
and even if you had your result then after
std::vector<int>::const_iterator i = std::find(static_cast<std::vector<int>::const_iterator>(container.begin()), static_cast<std::vector<int>::const_iterator>(container.end()),42);
there would have been no way to use your std::vector<int>::const_iterator for insertion into the vector or any other member function that expected iterators for a position. And there was no way to get an iterator from a const_iterator. No way of casting existed (exists?) for that.
Because a const_iterator does not mean that the container cannot be changed, but only that the element pointed to cannot be changed (a const_iterator being the equivalent of a pointer to const), that was really a big pile of crap to deal with in such cases.
Today the opposite is true.
Const iterators are easy to get using cbegin etc., even for non-const containers, and all (?) member functions that take positions have const_iterators as their arguments, so there is no need for any conversion.
std::vector<int> container;
auto i = std::find(container.cbegin(), container.cend(), 42);
container.insert(i, 43);
So what once was
Prefer iterators over const_iterators
today really really should be
Prefer const_iterators over iterators
since the first one is simply an artifact of historical implementation deficits.
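As a slightly fuller sketch of the C++11 situation: the helper below (insert_before is a made-up name) searches with cbegin/cend and hands the resulting const_iterator straight to insert, with no cast or conversion anywhere.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Inserts `value` before the first occurrence of `key` (or at the end if
// `key` is absent), using only const_iterators. Requires C++11 or later.
// (insert_before is a hypothetical helper name.)
inline std::vector<int>::iterator insert_before(std::vector<int>& c,
                                                int key, int value) {
    std::vector<int>::const_iterator pos = std::find(c.cbegin(), c.cend(), key);
    return c.insert(pos, value);  // insert accepts a const_iterator since C++11
}
```

Note that insert still returns a plain iterator, so the caller gets mutable access back when it is actually needed.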
By my reading of that link, Meyers appears to be fundamentally saying that iterators are better than const_iterators because you cannot make changes via a const_iterator.
But if that is what he is saying, then Meyers is in fact wrong. This is precisely why const_iterators are better than iterators when that is what you want to express.
Related
My goal would be to have an iterator that iterates over elements of type T, but in my case T is not really usable for my end-users. Instead, end-users should use a wrapper W that has a much more useful interface.
In order to construct a W, we need a T plus a reference or pointer to an additional data structure.
The problem is that I will never store elements as W. Instead, elements are always stored as T and only wrapped on demand. Therefore, my users have to iterate over a data structure holding Ts.
My idea was to write a custom iterator for these data structures that itself iterates over the stored Ts, but upon dereferencing will return W instead. I started looking into how this can be implemented and found various information on this topic, including how to deal with the iterator's reference typedef not actually being a reference. This includes the arrow_proxy trick for implementing operator-> in such cases.
However, I have also attempted to read in the standard to see what it has to say about such iterators. My resource here is this and there it clearly states that as soon as we are dealing with forward iterators, reference is expected to be a (const) reference to the value_type. This is supported by this question.
This makes me wonder whether it is even possible to reasonably implement such a transform_iterator that remains standard conforming if one intends to be using it as a forward_iterator or above?
One way that I could come up with would be to declare the value_type of my iterator to W and then keep a member variable of type W around such that operator* could be implemented like this:
class transform_iterator {
public:
    using value_type = W;
    using reference = W &;
    // ...
    reference operator*() const {
        // Rebuild the cached wrapper from the current T on every dereference.
        m_wrapper = W(< obtain current T >, m_struct);
        return m_wrapper;
    }
private:
    mutable W m_wrapper;
    SeparateDataStructure m_struct;
};
However, this approach seems rather hacky to me. On top of that, it would increase the iterator's size, possibly considerably, which might or might not be an issue in the long run.
Note 1: I know that Boost.iterator provides a transform_iterator, but I can't quite follow through the implementation of what iterator category they actually apply to these types of iterators. However, it does seem like they base the category on the result type of the supplied function (in some way), which would suggest that at least in their implementation it is possible for the category to be different from input_iterator_tag (though maybe the only other option is output_iterator_tag?).
Note 2: The question linked above also suggests the same workaround that I sketched here. Does that indicate that there is no better way?
TL;DR: Is there a better way to achieve e.g. a forward iterator that transforms the iterated type on dereference to a different type than to store a member of that type in the iterator itself and update that member on every dereference?
First, choose the most restrictive iterator category you can get away with:
if you don't need to allow multiple passes, use InputIterator and just return a temporary W by value
This still has all the usual iterator methods (operator*, operator->, both operator++ etc.)
if you do need multiple passes, you need a ForwardIterator with its additional requirement to return an actual reference.
As you say, this can only be done by storing a W somewhere (in the iterator or off to the side).
The big problem is with mutable forward iterators: mutating *i must also affect *j if i == j, and that means your W can't be a standalone value type, but must be some kind of write-through proxy. That's not impossible, but you can't simply take some existing type and use it like this.
If you have C++20 access, you could probably save some effort by just using a transform_view - although this may be outweighed by the work of changing everything else to use ranges rather than raw iterators.
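For the single-pass case, a rough sketch of the "return a temporary W by value" option could look like this. Everything here is a stand-in: T is taken to be int, Context plays the role of the additional data structure, and W, wrapping_iterator and sum_scaled are hypothetical names. operator-> uses the arrow_proxy trick mentioned in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

struct Context { int scale; };  // stand-in for the additional data structure

struct W {                      // stand-in wrapper, built on demand from a T (int)
    int raw;
    const Context* ctx;
    int scaled() const { return raw * ctx->scale; }
};

class wrapping_iterator {
public:
    typedef std::input_iterator_tag iterator_category;
    typedef W value_type;
    typedef std::ptrdiff_t difference_type;
    typedef W reference;        // returned by value, hence only an input iterator

    // Keeps the temporary W alive so that it->member works.
    struct arrow_proxy {
        W value;
        const W* operator->() const { return &value; }
    };
    typedef arrow_proxy pointer;

    wrapping_iterator(std::vector<int>::const_iterator pos, const Context* ctx)
        : pos_(pos), ctx_(ctx) {}

    reference operator*() const { W w = {*pos_, ctx_}; return w; }
    pointer operator->() const { arrow_proxy p = {**this}; return p; }
    wrapping_iterator& operator++() { ++pos_; return *this; }
    wrapping_iterator operator++(int) { wrapping_iterator old(*this); ++pos_; return old; }
    friend bool operator==(const wrapping_iterator& a, const wrapping_iterator& b) { return a.pos_ == b.pos_; }
    friend bool operator!=(const wrapping_iterator& a, const wrapping_iterator& b) { return !(a == b); }

private:
    std::vector<int>::const_iterator pos_;  // iterates over the stored Ts
    const Context* ctx_;
};

// Usage: walk the stored ints but only ever see W.
inline int sum_scaled(const std::vector<int>& data, const Context& ctx) {
    int total = 0;
    for (wrapping_iterator it(data.begin(), &ctx), end(data.end(), &ctx); it != end; ++it)
        total += it->scaled();
    return total;
}
```

The iterator stays as small as the underlying iterator plus one pointer, at the cost of being tagged as an input iterator only.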
Just a little introduction, with simple words.
In C++, iterators are "things" on which you can write at least the dereference operator *it, the increment operator ++it, and for more advanced bidirectional iterators, the decrement --it, and last but not least, for random access iterators we need operator index it[] and possibly addition and subtraction.
Such "things" in C++ are objects of types with the according operator overloads, or plain and simple pointers.
std::vector<> is a container class that wraps a contiguous array, so a pointer as its iterator makes sense. Online and in some literature you can find vector.begin() used as a pointer.
The rationale for using a pointer is less overhead, higher performance, especially if an optimizing compiler detects iteration and does its thing (vector instructions and stuff). Using iterators might be harder for the compiler to optimize.
Knowing this, my question is why modern STL implementations, let's say MSVC++ 2013 or libstdc++ in Mingw 4.7, use a special class for vector iterators?
You're completely correct that vector::iterator could be implemented by a simple pointer (see here) -- in fact the concept of an iterator is based on that of a pointer to an array element. For other containers, such as map, list, or deque, however, a pointer won't work at all. So why is this not done? Here are three reasons why a class implementation is preferable over a raw pointer.
Additional functionality: Implementing an iterator as a separate type allows functionality beyond what is required by the standard, for example (added in edit following Quentin's comment) assertions when dereferencing an iterator in debug mode.
Overload resolution: If the iterator were a pointer T*, it could be passed as a valid argument to a function taking T*, while this would not be possible with an iterator type. Thus making std::vector<>::iterator a pointer in fact changes the behaviour of existing code. Consider, for example,
template<typename It>
void foo(It begin, It end);
void foo(const double*a, const double*b, size_t n=0);
std::vector<double> vec;
foo(vec.begin(), vec.end()); // which foo is called?
Argument-dependent lookup (ADL; pointed out by juanchopanza): If you make an unqualified call, ADL ensures that functions in namespace std will be searched only if the arguments are types defined in namespace std. So,
std::vector<double> vec;
sort(vec.begin(), vec.end()); // calls std::sort
sort(vec.data(), vec.data()+vec.size()); // fails to compile
std::sort would not be found if vector<>::iterator were a mere pointer.
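The overload-resolution and ADL points can be reproduced without relying on any particular standard library. In this sketch the namespace lib, the type lib::iter and the functions named touch are all made up; lib plays the role of std and lib::iter the role of a class-type vector iterator.

```cpp
#include <cassert>

namespace lib {
// A class-type iterator: just a thin wrapper around a pointer.
struct iter { double* p; };

// Only reachable from an unqualified call through ADL on lib::iter.
inline int touch(iter) { return 1; }
}  // namespace lib

// An unrelated global overload that takes raw pointers.
inline int touch(const double*) { return 2; }

inline int call_with_class_iterator(double* p) {
    lib::iter it = {p};
    return touch(it);  // ADL looks into lib and picks lib::touch
}

inline int call_with_raw_pointer(double* p) {
    return touch(p);   // a pointer has no associated namespace: ::touch is called
}
```

The same pointer value ends up in two different functions depending only on whether it is wrapped in a class, which is why switching an iterator's type between T* and a class can change the behaviour of existing code.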
The implementation of the iterator is implementation-defined, so long as it fulfills the requirements of the standard. It could be a pointer for vector; that would work. There are several reasons for not using a pointer:
consistency with other containers.
debug and error checking support
overload resolution, class based iterators allow for overloads to work differentiating them from plain pointers
If all the iterators were pointers, then ++it on a map would not move it to the next element, since the memory is not required to be contiguous. Beyond the contiguous memory of std::vector, most standard containers require "smarter" pointers, hence iterators.
The physical requirements of the iterator dovetail very well with the logical requirement that movement between elements is a well-defined "idiom" of iterating over them, not just moving to the next memory location.
This was one of the original design requirements and goals of the STL; the orthogonal relationship between the containers, the algorithms and connecting the two through the iterators.
Now that they are classes, you can add a whole host of error checking and sanity checks to debug code (and then remove it for more optimised release code).
Given the benefits that class-based iterators bring, why should you (or should you not) just use pointers for std::vector iterators? The answer is consistency. Early implementations of std::vector did indeed use plain pointers, and a pointer still satisfies the requirements for a vector iterator. But once you have to use classes for the other containers' iterators, applying the same approach to vector becomes a good idea, given the benefits it brings.
The rationale for using a pointer is less overhead, higher
performance, especially if an optimizing compiler detects iteration
and does its thing (vector instructions and stuff). Using iterators
might be harder for the compiler to optimize.
It might be, but it isn't. If your implementation is not utter shite, a struct wrapping a pointer will achieve the same speed.
With that in mind, it's simple to see that simple benefits like better diagnostic messages (naming the iterator instead of T*), better overload resolution, ADL, and debug checking make the struct a clear winner over the pointer. The raw pointer has no advantages.
The rationale for using a pointer is less overhead, higher
performance, especially if an optimizing compiler detects iteration
and does its thing (vector instructions and stuff). Using iterators
might be harder for the compiler to optimize.
This is the misunderstanding at the heart of the question. A well formed class implementation will have no overhead, and identical performance all because the compiler can optimize away the abstraction and treat the iterator class as just a pointer in the case of std::vector.
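A small sketch of why there is nothing to pay for: a class that wraps a single pointer has the same size as the pointer, and every operation forwards to one pointer operation that is trivially inlinable. ptr_iterator below is a hypothetical minimal wrapper, not the iterator of any real standard library, and whether the generated code is really identical depends on the compiler and optimization level.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>

// Hypothetical minimal pointer-wrapping iterator (forward operations only).
template <typename T>
class ptr_iterator {
public:
    typedef std::forward_iterator_tag iterator_category;
    typedef T value_type;
    typedef std::ptrdiff_t difference_type;
    typedef T* pointer;
    typedef T& reference;

    ptr_iterator() : p_(nullptr) {}
    explicit ptr_iterator(T* p) : p_(p) {}

    reference operator*() const { return *p_; }   // just *p
    pointer operator->() const { return p_; }
    ptr_iterator& operator++() { ++p_; return *this; }  // just ++p
    ptr_iterator operator++(int) { ptr_iterator old(*this); ++p_; return old; }
    friend bool operator==(ptr_iterator a, ptr_iterator b) { return a.p_ == b.p_; }
    friend bool operator!=(ptr_iterator a, ptr_iterator b) { return a.p_ != b.p_; }

private:
    T* p_;
};

// The wrapper carries no storage beyond the pointer itself.
static_assert(sizeof(ptr_iterator<int>) == sizeof(int*), "wrapper adds no storage");

inline int sum_via_wrapper(std::vector<int>& v) {
    int* first = v.data();
    return std::accumulate(ptr_iterator<int>(first),
                           ptr_iterator<int>(first + v.size()), 0);
}

inline int sum_via_pointer(std::vector<int>& v) {
    return std::accumulate(v.data(), v.data() + v.size(), 0);
}
```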
That said,
MSVC++ 2013 or libstdc++ in Mingw 4.7, use a special class for vector
iterators
because they view that adding a layer of abstraction class iterator to define the concept of iteration over a std::vector is more beneficial than using an ordinary pointer for this purpose.
Abstractions have a different set of costs vs benefits, typically added design complexity (not necessarily related to performance or overhead) in exchange for flexibility, future proofing, hiding implementation details. The above compilers decided this added complexity is an appropriate cost to pay for the benefits of having an abstraction.
Because STL was designed with the idea that you can write something that iterates over an iterator, no matter whether that iterator's just equivalent to a pointer to an element of memory-contiguous arrays (like std::array or std::vector) or something like a linked list, a set of keys, something that gets generated on the fly on access etc.
Also, don't be fooled: In the vector case, dereferencing might (without debug options) just break down to an inlinable pointer dereference, so there wouldn't even be overhead after compilation!
I think the reason is plain and simple: originally std::vector was not required to be implemented over contiguous blocks of memory.
So the interface could not just present a pointer.
source: https://stackoverflow.com/a/849190/225186
This was fixed later and std::vector was required to be in contiguous memory, but it was probably too late to make std::vector<T>::iterator a pointer.
Maybe some code already depended on iterator to be a class/struct.
Interestingly, I found implementations of std::vector<T>::iterator where the following is valid and it = {} generates a "null" iterator (just like a null pointer). This is implementation-specific behaviour, not something the standard guarantees:
std::vector<double>::iterator it = {};
assert( &*it == nullptr );
Also, std::array<T, N>::iterator and std::initializer_list<T>::iterator are pointers T* in the implementations I saw.
A plain pointer as std::vector<T>::iterator would be perfectly fine in my opinion, in theory.
In practice, being a built-in has observable effects for metaprogramming, (e.g. std::vector<T>::iterator::difference_type wouldn't be valid, yes, one should have used iterator_traits).
Not being a raw pointer has the (very) marginal advantage of disallowing nullability (it == nullptr) or default constructibility, if you are into that (an argument that doesn't matter from a generic programming point of view).
At the same time the dedicated class iterators had a steep cost in other metaprogramming aspects, because if ::iterator were a pointer one wouldn't need to have ad hoc methods to detect contiguous memory (see contiguous_iterator_tag in https://en.cppreference.com/w/cpp/iterator/iterator_tags) and generic code over vectors could be directly forwarded to legacy C-functions.
For this reason alone I would argue that iterator-not-being-a-pointer was a costly mistake. It just made it hard to interact with C-code (as you need another layer of functions and type detection to safely forward stuff to C).
Having said this, I think we could still make things better by allowing automatic conversions from iterators to pointers and perhaps explicit (?) conversions from pointer to vector::iterators.
I got around this pesky obstacle by dereferencing and immediately referencing the iterator again. It looks ridiculous, but it satisfies MSVC...
class Thing {
. . .
};
void handleThing(Thing* thing) {
// do stuff
}
vector<Thing> vec;
// put some elements into vec now
for (auto it = vec.begin(); it != vec.end(); ++it)
// handleThing(it); // this doesn't work, would have been elegant ..
handleThing(&*it); // this DOES work
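One caveat with &*it is that it must not be applied to the end iterator, since dereferencing end() is undefined behaviour. When handing a whole vector (or a tail of it) to C-style code, data() plus an offset avoids the problem. In this sketch c_sum stands in for some legacy C function; it and the two helpers are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a legacy C-style function taking pointer + length.
inline double c_sum(const double* p, std::size_t n) {
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}

// Whole vector: data() is fine even when the vector is empty (C++11).
inline double sum_all(const std::vector<double>& v) {
    return c_sum(v.data(), v.size());
}

// Tail starting at `first`: compute the pointer from the offset instead of
// writing &*first, which would be undefined if first == v.end().
inline double sum_from(const std::vector<double>& v,
                       std::vector<double>::const_iterator first) {
    std::size_t offset = static_cast<std::size_t>(first - v.begin());
    return c_sum(v.data() + offset, v.size() - offset);
}
```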
In Herb Sutter's When Is a Container Not a Container?, he shows an example of taking a pointer into a container:
// Example 1: Is this code valid? safe? good?
//
vector<char> v;
// ...
char* p = &v[0];
// ... do something with *p ...
Then follows it up with an "improvement":
// Example 1(b): An improvement
// (when it's possible)
//
vector<char> v;
// ...
vector<char>::iterator i = v.begin();
// ... do something with *i ...
But doesn't really provide a convincing argument:
In general, it's not a bad guideline to prefer using iterators instead
of pointers when you want to point at an object that's inside a
container. After all, iterators are invalidated at mostly the same
times and the same ways as pointers, and one reason that iterators
exist is to provide a way to "point" at a contained object. So, if you
have a choice, prefer to use iterators into containers.
Unfortunately, you can't always get the same effect with iterators
that you can with pointers into a container. There are two main
potential drawbacks to the iterator method, and when either applies we
have to continue to use pointers:
You can't always conveniently use an iterator where you can use a pointer. (See example below.)
Using iterators might incur extra space and performance overhead, in cases where the iterator is an object and not just a bald pointer.
In the case of a vector, the iterator is just a RandomAccessIterator. For all intents and purposes this is a thin wrapper over a pointer. One implementation even acknowledges this:
// This iterator adapter is 'normal' in the sense that it does not
// change the semantics of any of the operators of its iterator
// parameter. Its primary purpose is to convert an iterator that is
// not a class, e.g. a pointer, into an iterator that is a class.
// The _Container parameter exists solely so that different containers
// using this template can instantiate different types, even if the
// _Iterator parameter is the same.
Furthermore, the implementation stores a member value of type _Iterator, which is pointer or T*. In other words, just a pointer. The difference_type for such a type is std::ptrdiff_t, and the operations defined are just thin wrappers (operator++ is ++_pointer, operator* is *_pointer, and so on).
Following Sutter's argument, this iterator class provides no benefits over pointers, only drawbacks. Am I correct?
For vectors, in non-generic code, you're mostly correct.
The benefit is that you can pass a RandomAccessIterator to a whole bunch of algorithms no matter what container the iterator iterates, whether that container has contiguous storage (and thus pointer iterators) or not. It's an abstraction.
(This abstraction, among other things, allows implementations to swap out the basic pointer implementation for something a little more sexy, like range-checked iterators for debug use.)
It's generally considered to be a good habit to use iterators unless you really can't. After all, habit breeds consistency, and consistency leads to maintainability.
Iterators are also self-documenting in a way that pointers are not. What does an int* point to? No idea. What does an std::vector<int>::iterator point to? Aha…
Finally, they provide a measure of type safety: though such iterators may only be thin wrappers around pointers, they needn't be pointers. If an iterator is a distinct type rather than a type alias, then you won't be accidentally passing your iterator into places you didn't want it to go, or setting it to "NULL" accidentally.
I agree that Sutter's argument is about as convincing as most of his other arguments, i.e. not very.
You can't always conveniently use an iterator where you can use a pointer
That is not one of the disadvantages. Sometimes it is just too "convenient" to get the pointer passed to places where you really didn't want them to go. Having a separate type helps in validating parameters.
Some early implementations used T* for vector::iterator, but it caused various problems, like people accidentally passing an unrelated pointer to vector member functions. Or assigning NULL to the iterator.
Using iterators might incur extra space and performance overhead, in cases where the iterator is an object and not just a bald pointer.
This was written in 1999, when we also believed that code in <algorithm> should be optimized for different container types. Not much later everyone was surprised to see that the compilers figured that out themselves. The generic algorithms using iterators worked just fine!
For a std::vector there is absolutely no space or time overhead for using an iterator instead of a pointer. You found out that the iterator class is just a thin wrapper over a pointer. Compilers will also see that, and generate equivalent code.
One real-life reason to prefer iterators over pointers is that they can be implemented as checked iterators in debug builds and help you catch some nasty problems early. For example:
vector<int>::iterator it; // uninitialized iterator
it++;
or
for (it = vec1.begin(); it != vec2.end(); ++it) // different containers
In Qt there are similar classes to list and map. These classes provide a constBegin() method that returns a const_iterator. The documentation says that these const_iterators should be used whenever possible since they are faster.
The STL only gives you a const_iterator if the instance itself is const. Only one begin() method is implemented (overloaded for const).
Is there any difference when read-accessing elements with iterator and const_iterator? (I don't know why there's a difference for them in Qt.)
The documentation says that these const_iterators should be used whenever possible since they are faster.
It sure does. From http://qt-project.org/doc/qt-4.8/containers.html#stl-style-iterators:
For each container class, there are two STL-style iterator types: one that provides read-only access and one that provides read-write access. Read-only iterators should be used wherever possible because they are faster than read-write iterators.
What a stupid thing to say.
Safer? Yes. Faster? Even if this were the case (it apparently isn't with gcc and clang), it is rarely a reason to prefer const iterators over non-const ones. This is premature optimization. The reason to prefer const iterators over non-const ones is safety. If you don't need the pointed-to contents to be modified, use a const iterator. Think of what some maintenance programmer will do to your code.
As far as begin versus cbegin is concerned, that's a C++11 addition. This allows the auto keyword to use a const iterator, even in a non-const setting.
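A short sketch of the cbegin point, assuming C++11 or later (count_even is a made-up name): with auto, begin() on a non-const vector deduces iterator, while cbegin() deduces const_iterator, so accidental writes through the loop variable no longer compile.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Counts even elements of a non-const vector while iterating read-only.
// (count_even is a hypothetical name.)
inline int count_even(std::vector<int>& v) {
    auto it = v.cbegin();  // const_iterator even though v is non-const
    static_assert(std::is_same<decltype(it), std::vector<int>::const_iterator>::value,
                  "cbegin() yields a const_iterator");

    auto mut = v.begin();  // plain iterator, because v is non-const
    static_assert(std::is_same<decltype(mut), std::vector<int>::iterator>::value,
                  "begin() on a non-const vector yields an iterator");
    (void)mut;

    int n = 0;
    for (; it != v.cend(); ++it) {
        if (*it % 2 == 0) ++n;
        // *it = 0;  // would not compile: the element is const through `it`
    }
    return n;
}
```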
The best reason to use const is to avoid bugs and make the intent of the code more clear.
It's conceivable that, in some cases, the compiler could perform some optimizations that would not be possible with a non-const iterator. Aliasing (when multiple variables and parameters may reference the same object) is often an inhibitor of some optimizations. If the compiler could rule out some forms of aliasing by noting that the const-iterator can never change the value, then perhaps it would enable some optimizations.
On the other hand, I'd expect a compiler that's good enough to use constness in that way to be able to reach the same conclusion with flow analysis.
It appears that according to ISO 14882 2003 (aka the Holy Standard of C++) std::set<K, C, A>::erase takes iterator as a parameter (not a const_iterator)
from 23.3.3 [2]
void erase(iterator position);
It might also be noteworthy that on my implementation of the STL, which came with VS2008, erase takes a const_iterator, which led to an unpleasant surprise when I tried to compile my code with another compiler. Now, since my version takes a const_iterator, it is evidently possible to implement erase with a const_iterator (as if it wasn't self-evident).
I suppose the standards committee had some implementation in mind (or an existing implementation at hand) which would require erase to take an iterator.
If you agree that this is the case, can you please describe an implementation of set::erase which would require to modify the element that was going to be removed (I can't).
If you disagree, please tell me why on Earth would they come up with this decision? I mean, erasing an element is just some rearranging of pointers!
It just occurred to me that even in case of iterator you can't modify the element in a set. But the question still holds: why not const_iterator, especially if they're equivalent in some sense?
This was a defect. Since C++11, set<K,C,A>::erase takes a const_iterator:
iterator erase(const_iterator position);
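A minimal usage sketch of the C++11 signature (erase_and_next is a hypothetical helper, and -1 is just an arbitrary "nothing there" marker chosen for the example): the position is held in a const_iterator, erase accepts it directly, and the returned iterator refers to the element after the erased one.

```cpp
#include <cassert>
#include <set>

// Erases `key` via a const_iterator and returns the element that followed it,
// or -1 if `key` was absent or was the last element. Requires C++11 or later.
inline int erase_and_next(std::set<int>& s, int key) {
    std::set<int>::const_iterator pos = s.find(key);
    if (pos == s.end()) return -1;
    std::set<int>::iterator next = s.erase(pos);  // accepts a const_iterator
    return next == s.end() ? -1 : *next;
}
```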
This paper from 2007 illustrated that error and showed implementations to avoid it. I am not sure if this paper is the reason for the change in the standard, but it's probably a good guess.
Can't really think of any reason for it to need an iterator, so I'm leaning toward the arbitrary: any op that modifies the structure would take an iterator to let the user know that what previously worked with the iterator might not afterward:
erase invalidates the iterator.
insert(iter, val) changes the next value.
etc.
My only guess is that it is because insert, upper_bound, lower_bound and find return an iterator (not a const_iterator). I don't see any other explanation.