Different efficiency of iterator and const_iterator (STL) - c++

In Qt there are classes similar to list and map. These classes provide a constBegin() method that returns a const_iterator. The documentation says that these const_iterators should be used whenever possible since they are faster.
The STL only gives you a const_iterator if the instance itself is const. Only one begin() method is implemented (overloaded for const).
Is there any difference when read-accessing elements with iterator and const_iterator? (I don't know why there's a difference for them in Qt.)

The documentation says that these const_iterators should be used whenever possible since they are faster.
It sure does. From http://qt-project.org/doc/qt-4.8/containers.html#stl-style-iterators:
For each container class, there are two STL-style iterator types: one that provides read-only access and one that provides read-write access. Read-only iterators should be used wherever possible because they are faster than read-write iterators.
What a stupid thing to say.
Safer? Yes. Faster? Even if this were the case (it apparently isn't with gcc and clang), it is rarely a reason to prefer const iterators over non-const ones. This is premature optimization. The reason to prefer const iterators over non-const ones is safety. If you don't need the pointed-to contents to be modified, use a const iterator. Think of what some maintenance programmer will do to your code.
As far as begin versus cbegin is concerned, that's a C++11 addition. This allows the auto keyword to use a const iterator, even in a non-const setting.
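To make the begin versus cbegin point concrete, here is a minimal sketch (the function name sum_readonly is made up for illustration). It assumes a C++11 compiler and shows that auto combined with cbegin() yields a const_iterator even though the vector is non-const:

```cpp
#include <type_traits>
#include <vector>

// Sums a non-const vector while only ever holding read-only iterators.
int sum_readonly(std::vector<int>& v) {
    auto first = v.cbegin();  // const_iterator, although v itself is non-const
    auto last = v.cend();
    static_assert(std::is_same<decltype(first), std::vector<int>::const_iterator>::value,
                  "cbegin() yields a const_iterator");
    static_assert(std::is_same<decltype(v.begin()), std::vector<int>::iterator>::value,
                  "begin() on a non-const vector yields a mutable iterator");

    int total = 0;
    for (; first != last; ++first) {
        total += *first;  // reading is fine; `*first = 0;` would not compile
    }
    return total;
}
```

The safety benefit is the commented-out assignment: a later maintainer cannot accidentally write through first.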

The best reason to use const is to avoid bugs and make the intent of the code more clear.
It's conceivable that, in some cases, the compiler could perform some optimizations that would not be possible with a non-const iterator. Aliasing (when multiple variables and parameters may reference the same object) is often an inhibitor of some optimizations. If the compiler could rule out some forms of aliasing by noting that the const-iterator can never change the value, then perhaps it would enable some optimizations.
On the other hand, I'd expect a compiler that's good enough to use constness in that way to be able to reach the same conclusion with flow analysis.

Related

C++ std::vector<>::iterator is not a pointer, why?

Just a little introduction, with simple words.
In C++, iterators are "things" on which you can write at least the dereference operator *it, the increment operator ++it, and for more advanced bidirectional iterators, the decrement --it, and last but not least, for random access iterators we need operator index it[] and possibly addition and subtraction.
Such "things" in C++ are objects of types with the according operator overloads, or plain and simple pointers.
std::vector<> is a container class that wraps a contiguous array, so a pointer as iterator makes sense. On the net, and in some literature, you can find vector.begin() used as a pointer.
The rationale for using a pointer is less overhead, higher performance, especially if an optimizing compiler detects iteration and does its thing (vector instructions and stuff). Using iterators might be harder for the compiler to optimize.
Knowing this, my question is why modern STL implementations, let's say MSVC++ 2013 or libstdc++ in Mingw 4.7, use a special class for vector iterators?
You're completely correct that vector::iterator could be implemented by a simple pointer -- in fact the concept of an iterator is based on that of a pointer to an array element. For other containers, such as map, list, or deque, however, a pointer won't work at all. So why is this not done? Here are three reasons why a class implementation is preferable over a raw pointer.
additional functionality: Implementing an iterator as a separate type allows additional functionality (beyond what is required by the standard), for example (added in an edit following Quentin's comment) the possibility of adding assertions when dereferencing an iterator in debug mode.
overload resolution: If the iterator were a pointer T*, it could be passed as a valid argument to a function taking T*, while this would not be possible with an iterator type. Thus making std::vector<>::iterator a pointer in fact changes the behaviour of existing code. Consider, for example,
template<typename It>
void foo(It begin, It end);
void foo(const double*a, const double*b, size_t n=0);
std::vector<double> vec;
foo(vec.begin(), vec.end()); // which foo is called?
argument-dependent lookup (ADL; pointed out by juanchopanza): If you make an unqualified call, ADL ensures that functions in namespace std will be searched only if the arguments are types defined in namespace std. So,
std::vector<double> vec;
sort(vec.begin(), vec.end()); // calls std::sort
sort(vec.data(), vec.data()+vec.size()); // fails to compile
std::sort would not be found if vector<>::iterator were a mere pointer.
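The overload-resolution point can be checked with a small sketch. The return type Chosen and the two call_with_* helpers are hypothetical additions for illustration; the foo overloads mirror the ones in the answer. Which overload the iterator call selects depends on the standard library: with class-type iterators (as in libstdc++, libc++ and MSVC) it is the template, whereas raw const double* arguments select the pointer overload.

```cpp
#include <cstddef>
#include <vector>

enum class Chosen { Template, Pointer };

// Generic overload: accepts any pair of iterators.
template <typename It>
Chosen foo(It, It) { return Chosen::Template; }

// Pointer overload, mirroring the declaration in the answer.
Chosen foo(const double*, const double*, std::size_t = 0) { return Chosen::Pointer; }

// With a class-type const_iterator only the template is viable.
Chosen call_with_iterators(const std::vector<double>& v) {
    return foo(v.begin(), v.end());
}

// With const double* both overloads match exactly, and the
// non-template function is preferred.
Chosen call_with_pointers(const std::vector<double>& v) {
    return foo(v.data(), v.data() + v.size());
}
```

If vector iterators were plain pointers, the first helper would behave like the second, which is the silent change in behaviour the answer warns about.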
The implementation of the iterator is implementation-defined, so long as it fulfills the requirements of the standard. It could be a pointer for vector; that would work. There are several reasons for not using a pointer:
consistency with other containers.
debug and error checking support
overload resolution, class based iterators allow for overloads to work differentiating them from plain pointers
If all the iterators were pointers, then ++it on a map would not advance it to the next element, since the memory is not required to be contiguous. Beyond the contiguous memory of std::vector, most standard containers require "smarter" pointers - hence iterators.
The physical requirements of the iterator dovetail very well with the logical requirement that movement between elements is a well-defined "idiom" of iterating over them, not just moving to the next memory location.
This was one of the original design requirements and goals of the STL; the orthogonal relationship between the containers, the algorithms and connecting the two through the iterators.
Now that they are classes, you can add a whole host of error checking and sanity checks to debug code (and then remove it for more optimised release code).
Given the benefits that class-based iterators bring, why not just use pointers for std::vector iterators? Consistency. Early implementations of std::vector did indeed use plain pointers, and pointers do work for vector. But once you have to use classes for the other containers' iterators, and given the benefits they bring, applying the same approach to vector becomes a good idea.
The rationale for using a pointer is less overhead, higher
performance, especially if an optimizing compiler detects iteration
and does its thing (vector instructions and stuff). Using iterators
might be harder for the compiler to optimize.
It might be, but it isn't. If your implementation is not utter shite, a struct wrapping a pointer will achieve the same speed.
With that in mind, it's simple to see that simple benefits like better diagnostic messages (naming the iterator instead of T*), better overload resolution, ADL, and debug checking make the struct a clear winner over the pointer. The raw pointer has no advantages.
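As an illustration of "a struct wrapping a pointer", here is a minimal sketch. PtrIter is a made-up class, not the iterator of any real standard library; it only assumes that a class with a single pointer member is no larger than the pointer itself, which the static_assert checks at compile time.

```cpp
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>

// Hypothetical minimal wrapper around T*, for illustration only.
template <typename T>
class PtrIter {
public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = T;
    using difference_type = std::ptrdiff_t;
    using pointer = T*;
    using reference = T&;

    explicit PtrIter(T* p = nullptr) : p_(p) {}

    reference operator*() const { return *p_; }
    pointer operator->() const { return p_; }
    PtrIter& operator++() { ++p_; return *this; }
    PtrIter operator++(int) { PtrIter old(*this); ++p_; return old; }

    friend bool operator==(PtrIter a, PtrIter b) { return a.p_ == b.p_; }
    friend bool operator!=(PtrIter a, PtrIter b) { return a.p_ != b.p_; }

private:
    T* p_;
};

// The wrapper stores nothing but the pointer.
static_assert(sizeof(PtrIter<int>) == sizeof(int*), "wrapper adds no storage");

// Same algorithm, once through the wrapper and once through raw pointers.
int sum_via_wrapper(std::vector<int>& v) {
    return std::accumulate(PtrIter<int>(v.data()), PtrIter<int>(v.data() + v.size()), 0);
}

int sum_via_pointer(std::vector<int>& v) {
    return std::accumulate(v.data(), v.data() + v.size(), 0);
}
```

Every member function is a one-line forward to the pointer operation, so an optimizing compiler can typically inline both versions to the same loop; that part is an expectation about optimizers, not something this sketch proves.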
The rationale for using a pointer is less overhead, higher
performance, especially if an optimizing compiler detects iteration
and does its thing (vector instructions and stuff). Using iterators
might be harder for the compiler to optimize.
This is the misunderstanding at the heart of the question. A well-formed class implementation will have no overhead and identical performance, because the compiler can optimize away the abstraction and treat the iterator class as just a pointer in the case of std::vector.
That said,
MSVC++ 2013 or libstdc++ in Mingw 4.7, use a special class for vector
iterators
because they consider adding a layer of abstraction, an iterator class that defines the concept of iteration over a std::vector, more beneficial than using an ordinary pointer for this purpose.
Abstractions have a different set of costs vs benefits, typically added design complexity (not necessarily related to performance or overhead) in exchange for flexibility, future proofing, and hiding implementation details. The above implementations decided this added complexity is an appropriate cost to pay for the benefits of having an abstraction.
Because STL was designed with the idea that you can write something that iterates over an iterator, no matter whether that iterator's just equivalent to a pointer to an element of memory-contiguous arrays (like std::array or std::vector) or something like a linked list, a set of keys, something that gets generated on the fly on access etc.
Also, don't be fooled: In the vector case, dereferencing might (without debug options) just break down to an inlinable pointer dereference, so there wouldn't even be overhead after compilation!
I think the reason is plain and simple: originally std::vector was not required to be implemented over contiguous blocks of memory.
So the interface could not just present a pointer.
source: https://stackoverflow.com/a/849190/225186
This was fixed later and std::vector was required to be in contiguous memory, but it was probably too late to make std::vector<T>::iterator a pointer.
Maybe some code already depended on iterator to be a class/struct.
Interestingly, I found implementations of std::vector<T>::iterator where it = {}; is valid and generates a "null" iterator (just like a null pointer):
std::vector<double>::iterator it = {};
assert( &*it == nullptr );
Also, std::array<T>::iterator and std::initializer_list<T>::iterator are pointers T* in the implementations I saw.
A plain pointer as std::vector<T>::iterator would be perfectly fine in my opinion, in theory.
In practice, being a built-in has observable effects for metaprogramming, (e.g. std::vector<T>::iterator::difference_type wouldn't be valid, yes, one should have used iterator_traits).
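The iterator_traits remark can be sketched as follows. The helper names count_steps and vector_length are invented for this example; std::iterator_traits itself is standard and is specialised for raw pointers, so the same generic code works whether or not the iterator is a built-in type.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// Works for raw pointers and class-type iterators alike, because the
// nested type is looked up through iterator_traits rather than It::difference_type.
template <typename It>
typename std::iterator_traits<It>::difference_type count_steps(It first, It last) {
    typename std::iterator_traits<It>::difference_type n = 0;
    for (; first != last; ++first) {
        ++n;
    }
    return n;
}

// double* has no nested typedefs, yet iterator_traits still describes it.
static_assert(std::is_same<std::iterator_traits<double*>::difference_type,
                           std::ptrdiff_t>::value,
              "iterator_traits is specialised for pointers");

// The same template instantiated with a vector's own iterator type.
std::ptrdiff_t vector_length(const std::vector<int>& v) {
    return count_steps(v.begin(), v.end());
}
```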
Not being a raw pointer has the (very) marginal advantage of disallowing nullability (it == nullptr) or default constructibility, if you are into that (an argument that doesn't matter from a generic programming point of view).
At the same time the dedicated class iterators had a steep cost in other metaprogramming aspects, because if ::iterator were a pointer one wouldn't need to have ad hoc methods to detect contiguous memory (see contiguous_iterator_tag in https://en.cppreference.com/w/cpp/iterator/iterator_tags) and generic code over vectors could be directly forwarded to legacy C-functions.
For this reason alone I would argue that iterator-not-being-a-pointer was a costly mistake. It just made it hard to interact with C-code (as you need another layer of functions and type detection to safely forward stuff to C).
Having said this, I think we could still make things better by allowing automatic conversions from iterators to pointers and perhaps explicit (?) conversions from pointer to vector::iterators.
I got around this pesky obstacle by dereferencing and immediately referencing the iterator again. It looks ridiculous, but it satisfies MSVC...
class Thing {
. . .
};
void handleThing(Thing* thing) {
// do stuff
}
vector<Thing> vec;
// put some elements into vec now
for (auto it = vec.begin(); it != vec.end(); ++it)
// handleThing(it); // this doesn't work, would have been elegant ..
handleThing(&*it); // this DOES work
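When the goal is to hand the whole buffer to pointer-based code rather than one element, vector::data() (C++11) avoids the &*it dance. A minimal sketch, where c_sum is a hypothetical stand-in for a legacy C-style function:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a legacy C-style function taking pointer + length.
double c_sum(const double* p, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += p[i];
    }
    return total;
}

double sum_vector(const std::vector<double>& v) {
    // data() is valid even when v is empty, whereas &*v.begin() would
    // dereference the end iterator in that case (undefined behaviour).
    return c_sum(v.data(), v.size());
}
```

For a single element inside a loop, &*it remains fine as long as it is dereferenceable, i.e. not equal to end().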

Do stl containers use implicit sharing?

It's known that Qt widgets use implicit sharing. So I am interested in whether STL containers such as std::vector and std::string use implicit sharing too.
If not, why? It seems very useful.
And if the answer is yes, how can we verify it? I need a simple C++ STL program which shows that STL containers use implicit sharing, i.e. that no deep copy is made when a container is copied.
No. They cannot. When you try to modify the contents of the container, or even call a mutable begin() on it, it would imply a potential copy-on-write and thus invalidate all references and iterators to the container. This would be a hard-to-debug situation, and it is prohibited.
Although std::string is technically not a container, it is still prohibited to do copy-on-write since C++11:
References, pointers, and iterators referring to the elements of a basic_string sequence may be invalidated by the following uses of that basic_string object:
...
— Calling non-const member functions, except operator[], at, front, back, begin, rbegin, end, and rend.
[string.require]
... Since it is very useful.
Heh, what for? Passing by reference almost always solves all 'performance problems'. Atomic ref-counts are inherently non-scalable on multiprocessor machines.
Aside from the objections raised by others to CoW behaviour in containers, here are a few more. These all fall into the category of behaviour that defies convention, and will therefore cause bizarre bugs from unsuspecting developers.
Exceptions
Allowing CoW would mean that innocuous mutation operations on a container can fail with exceptions when they wouldn't otherwise. This would be a particular hazard with operator[] on either a std::vector or std::string.
Threading
One might reasonably expect to be able to copy construct a container with the express purpose of handing it off to another thread without worrying about concurrency thereafter. Not so with CoW.
As noted in a similar question:
The C++ standard doesn't prohibit or mandate copy-on-write or any
other implementation details for std::string. So long as the semantics
and complexity requirements are met an implementation may choose
whatever implementation strategy it likes.
I think, same is true for std::vector
Also, you may be interested in this topic: How is std::string implemented
STL containers do not use implicit sharing. They always have plain value semantics.
The reason is runtime performance: In multithreaded programs (potentially but not necessarily running on multicore hosts) the locking overhead of the housekeeping data (e.g. reference counting, locking while copying before writing) by far outweighs the overhead of a plain value copy, which has no special threading implications at all. It is expected that programs which would suffer from copying around huge std::maps implement explicit sharing to avoid the copying.
In fact in the very early days of STL std::string did use implicit sharing. But it was dropped when the first multicore CPUs came up.
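The question asked for a small program that shows the copying behaviour. A minimal sketch for std::vector follows (both function names are made up for illustration). It demonstrates plain value semantics: the copy owns separate storage, and a reference into the original stays valid across a copy followed by a write, which a copy-on-write scheme could not promise.

```cpp
#include <vector>

// A copy owns its own buffer immediately; nothing is shared.
bool copy_has_own_storage() {
    std::vector<int> a{1, 2, 3};
    std::vector<int> b = a;  // deep copy
    return a.data() != b.data();
}

// Writing to the original after copying neither moves its elements
// nor affects the copy.
bool reference_survives_copy_and_write() {
    std::vector<int> a{1, 2, 3};
    int& first = a[0];
    std::vector<int> b = a;
    a[0] = 42;  // under copy-on-write this write would have to detach `a`
    return &first == &a[0] && first == 42 && b[0] == 1;
}
```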

Is there an operational difference between std::set::iterator and std::set::const_iterator?

For most containers, the iterator type provides read-write access to values in the container, and the const_iterator type provides read-only access. However, for std::set<T>, the iterator type cannot provide read-write access, because modifying a value in the set (potentially) breaks the container invariants. Therefore, in std::set<T>, both iterator and const_iterator provide read-only access.
This leads me to my question: Is there any difference between the things you can do with a std::set<T>::iterator and the things you can do with a std::set<T>::const_iterator?
Note that in C++11, the manipulation methods of containers (e.g., erase) can take const_iterator arguments.
No, there's not much functional difference between them. Of course, there used to be back in C++03, when set<T>::iterator didn't return a const T&. But once they changed it, they were stuck with two different kinds of iterators that both do the same thing.
Indeed, the standard is quite clear that they have identical functionality (to the point where they can be the same type, but aren't required to be). From 23.2.4, p. 6:
iterator of an associative container is of the bidirectional iterator category. For associative containers where the value type is the same as the key type, both iterator and const_iterator are constant iterators. It is unspecified whether or not iterator and const_iterator are the same type. [ Note: iterator and const_iterator have identical semantics in this case, and iterator is convertible to const_iterator. Users can avoid violating the One Definition Rule by always using const_iterator in their function parameter lists. —end note ]
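The quoted guarantees can be checked at compile time. The sketch below assumes C++11; IntSet and replace_key are names invented for the example. It also shows the usual way to "modify" a set element, namely erase and re-insert, using the C++11 erase overload that accepts a const_iterator.

```cpp
#include <set>
#include <type_traits>
#include <utility>

using IntSet = std::set<int>;

// iterator is convertible to const_iterator, as the note in the standard says.
static_assert(std::is_convertible<IntSet::iterator, IntSet::const_iterator>::value,
              "set::iterator converts to set::const_iterator");

// Dereferencing even the plain iterator yields a reference to const.
static_assert(std::is_const<std::remove_reference<
                  decltype(*std::declval<IntSet::iterator>())>::type>::value,
              "set elements are read-only through iterator as well");

// Replaces old_key by new_key; returns false if old_key is absent.
bool replace_key(IntSet& s, int old_key, int new_key) {
    IntSet::const_iterator pos = s.find(old_key);  // iterator -> const_iterator
    if (pos == s.cend()) {
        return false;
    }
    s.erase(pos);  // C++11: erase accepts a const_iterator
    s.insert(new_key);
    return true;
}
```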
When we (err, I) ported our large app to VC 10.0, this rule took effect. It broke all sorts of old code where folks manipulated the elements by calling non-const methods through the iterators.
So that leads to the biggest difference I found: You can only call const methods through a const iterator, whereas under the old standard you could willy-nilly call non-const methods and mess up your set. In some cases I ended up replacing some sets with maps where I found the code absolutely required modification of the items stored in the container.
Hope that helps.

Is it a good idea to create an STL iterator which is noncopyable?

Most of the time, STL iterators are CopyConstructible, because several STL algorithms require this to improve performance, such as std::sort.
However, I've been working on a pet project to wrap the FindXFile API (previously asked about), but the problem is it's impossible to implement a copyable iterator around this API. A find handle cannot be duplicated by any means -- DuplicateHandle specifically forbids passing these types of handles to it. And if you just maintain a reference count to the find handle, then a single increment by any copy results in an increment of all copies -- clearly that is not what a copy constructed iterator is supposed to do.
Since I can't satisfy the traditional copy constructible requirement for iterators here, is it even worth trying to create an "STL style" iterator? On one hand, creating some other enumeration method is not going to fall into normal STL conventions, but on the other, following STL conventions is going to confuse users of this iterator if they try to CopyConstruct it later.
Which is the lesser of two evils?
An input iterator which is not a forward iterator is copyable, but you can only "use" one of the copies: incrementing any of them invalidates the others (dereferencing one of them does not invalidate the others). This allows it to be passed to algorithms, but the algorithm must complete with a single pass. You can tell which algorithms are OK by checking their requirements - for example copy requires only an InputIterator, whereas adjacent_find requires a ForwardIterator (first one I found).
It sounds to me as though this describes your situation. Just copy the handle (or something which refcounts the handle), without duplicating it.
The user has to understand that it's only an InputIterator, but in practice this isn't a big deal. istream_iterator is the same, and for the same reason.
With the benefit of C++11 hindsight, it would almost have made sense to require InputIterators to be movable but not to require them to be copyable, since duplicates have limited use anyway. But that's "limited use", not "no use", and anyway it's too late now to remove functionality from InputIterator, considering how much code relies on the existing definition.
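Here is a rough sketch of the "copy the handle, refcount it" suggestion. It does not touch the real FindXFile API: FakeFindHandle is an in-memory stand-in, and FindIterator and list_all are hypothetical names. Copies share one handle through std::shared_ptr, so incrementing any copy advances them all, which is exactly the single-pass behaviour an InputIterator is allowed to have.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an OS find handle that cannot be duplicated.
struct FakeFindHandle {
    std::vector<std::string> names;
    std::size_t pos = 0;
};

// Single-pass iterator: every copy refers to the same shared handle.
class FindIterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = std::string;
    using difference_type = std::ptrdiff_t;
    using pointer = const std::string*;
    using reference = const std::string&;

    // Keeps the old value alive for `*it++`, since the position itself moves on.
    struct Proxy {
        std::string value;
        const std::string& operator*() const { return value; }
    };

    FindIterator() = default;  // a default-constructed iterator marks the end
    explicit FindIterator(std::shared_ptr<FakeFindHandle> h) : h_(std::move(h)) {
        drop_if_exhausted();
    }

    reference operator*() const { return h_->names[h_->pos]; }
    pointer operator->() const { return &h_->names[h_->pos]; }
    FindIterator& operator++() {
        ++h_->pos;
        drop_if_exhausted();
        return *this;
    }
    Proxy operator++(int) {
        Proxy old{**this};
        ++*this;
        return old;
    }

    friend bool operator==(const FindIterator& a, const FindIterator& b) { return a.h_ == b.h_; }
    friend bool operator!=(const FindIterator& a, const FindIterator& b) { return !(a == b); }

private:
    // An exhausted iterator releases the handle and becomes equal to the end iterator.
    void drop_if_exhausted() {
        if (h_ && h_->pos >= h_->names.size()) {
            h_.reset();
        }
    }
    std::shared_ptr<FakeFindHandle> h_;
};

// Collects every name in one pass; std::copy only needs an InputIterator.
std::vector<std::string> list_all(std::vector<std::string> names) {
    auto handle = std::make_shared<FakeFindHandle>();
    handle->names = std::move(names);
    std::vector<std::string> out;
    std::copy(FindIterator(handle), FindIterator(), std::back_inserter(out));
    return out;
}
```

A real wrapper would presumably close the find handle in the shared state's destructor; that part is left out here.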

Should I prefer iterators over const_iterators?

Someone here recently brought up the article from Scott Meyers that says:
Prefer iterators over const_iterators (pdf link).
Someone else was commenting that the article is probably outdated. I'm wondering what your opinions are?
Here is mine: One of the main points of the article is that you cannot erase or insert on a const_iterator, but I think it's funny to use that as an argument against const_iterators. I thought the whole point of const_iterators is that you do not modify the range at all, neither the elements themselves by substituting their values nor the range by inserting or erasing. Or am I missing something?
I totally agree with you.
I think the answer is simple:
Use const_iterators where const values are the right thing to use, and vice versa.
Seems to me that those who are against const_iterators must be against const in general...
Here's a slightly different way to look at it. A const_iterator almost never makes sense when you are passing it as a pointer into a specific collection and you are passing the collection as well. Mr. Meyers was specifically stating that const_iterator cannot be used with most member functions of a collection instance. In that case, you will need a plain-old iterator. However, if you don't have a handle to the collection, the only difference between the two is that you can modify what is pointed to by an iterator and you can't modify the object referenced by a const_iterator.
So... you want to use iterator whenever you are passing a collection and position into the collection to an algorithm. Basically, signatures like:
void some_operation(std::vector<int>& vec, std::vector<int>::const_iterator pos);
don't make a whole lot of sense. The implicit statement is that some_operation is free to modify the underlying collection but is not allowed to modify what pos references. That doesn't make much sense. If you really want this, then pos should be an offset instead of an iterator.
On the flip side, most of the algorithms in the STL are based on ranges specified by a pair of iterators. The collection itself is never passed so the difference between iterator and const_iterator is whether the value in the collection can be modified through the iterator or not. Without a reference to the collection, the separation is pretty clear.
Hopefully that made things as clear as mud ;)
I don't think this particular statement of Meyers's needs to be taken with special concern. When you want a non-modifying operation, it is best to use a const_iterator. Otherwise, use an ordinary iterator. However, do note the one important thing: Never mix iterators, i.e. const ones with non-const ones. As long as you are aware of the latter, you should be fine.
I generally prefer constness, but recently came across a conundrum with const_iterators that has confused my "always use const where possible" philosophy:
MyList::const_iterator find( const MyList & list, int identifier )
{
// do some stuff to find identifier
return retConstItor;
}
Since passing in a const list reference required that I only use const iterators, now if I use the find, I cannot do anything with the result but look at it even though all I wanted to do was express that find would not change the list being passed in.
I wonder, then, if Scott Meyers's advice has to do with issues like this, where it becomes impossible to escape const-ness. From what I understand, you cannot (reliably) un-const const_iterators with a simple cast because of some internal details. This may also (perhaps in conjunction) be the issue.
this is probably relevant: How to remove constness of const_iterator?
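One known way out of this conundrum in C++11, assuming the caller holds a non-const reference to the container, is to erase an empty range: the erase overload takes const_iterators and returns a mutable iterator to the same position. A minimal sketch, where MyList is assumed to be std::list<int> and find_id, to_mutable and overwrite are illustrative names:

```cpp
#include <list>

using MyList = std::list<int>;  // assumption: the question does not define MyList

// Read-only search: promises not to modify the list.
MyList::const_iterator find_id(const MyList& list, int identifier) {
    MyList::const_iterator it = list.begin();
    while (it != list.end() && *it != identifier) {
        ++it;
    }
    return it;
}

// Converts a const_iterator into an iterator, given mutable access to the list.
MyList::iterator to_mutable(MyList& list, MyList::const_iterator pos) {
    return list.erase(pos, pos);  // empty range: removes nothing, returns pos as iterator
}

// Finds an element through the const interface, then modifies it.
bool overwrite(MyList& list, int identifier, int replacement) {
    MyList::const_iterator found = find_id(list, identifier);
    if (found == list.cend()) {
        return false;
    }
    *to_mutable(list, found) = replacement;
    return true;
}
```

No cast is involved: the right to modify comes from the non-const container reference, not from the iterator.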
C++98
I think one needs to take into account that Meyers's statement refers to C++98. Hard to tell today, but if I remember right
it simply was not easy to get a const_iterator for a non const container at all
if you got a const_iterator you could have hardly made any use of it since most (all?) position arguments for container member functions were expected to be iterators and not const_iterators
e.g.
std::vector<int> container;
would have required
static_cast<std::vector<int>::const_iterator>(container.begin())
to get a const_iterator, which would have considerably inflated a simple .find
and even if you had your result then after
std::vector<int>::const_iterator i = std::find(static_cast<std::vector<int>::const_iterator>(container.begin()), static_cast<std::vector<int>::const_iterator>(container.end()),42);
there would have been no way to use your std::vector<int>::const_iterator for insertion into the vector or any other member function that expected iterators for a position. And there was no way to get an iterator from a const iterator. No way of casting existed (exists?) for that.
Because const iterator does not mean that the container could not be changed, but only that the element pointed to could not be changed (const iterator being the equivalent of pointer to const), that was really a big pile of crap to deal with in such cases.
Today the opposite is true.
const iterators are easy to get using cbegin etc. even for non const containers and all (?) member functions that take positions have const iterators as their arguments so there is no need for any conversion.
std::vector<int> container;
auto i = std::find(container.cbegin(), container.cend(), 42);
container.insert(i, 43);
So what once was
Prefer iterators over const_iterators
today really really should be
Prefer const_iterators over iterators
since the first one is simply an artifact of historical implementation deficits.
By my reading of that link, Meyers appears to be fundamentally saying that iterators are better than const_iterators because you cannot make changes via a const_iterator.
But if that is what he is saying then Meyers is in fact wrong. This is precisely why const_iterators are better than iterators when that is what you want to express.