Does incrementing an iterator result in a new iterator? - c++

When we add (or subtract) an integral value to (or from) a pointer, the
result is a new pointer. That new pointer points to the element the
given number ahead of (or behind) the original pointer. (p. 119, C++ Primer, 5th ed.)
I've also learned from the book that pointers are iterators (p. 118, C++ Primer, 5th ed.).
Question
Can I also claim that arithmetic operations on an iterator create a totally new iterator?

The book describes a situation when you write, say, p + n, where p is a pointer and n is an integer. The expression produces a new value of pointer type. It is up to you to decide where to store the value; you can also decide not to store it at all.
Incrementing a pointer, i.e. writing p += n, changes the value of the original pointer to p + n.
The way it works for iterators is the same: it + n produces a new iterator, while it += n changes the existing iterator.
Note: The first expression could be written as std::next(it, n), while the second should be written as std::advance(it, n), for both iterators and pointers.
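The difference between the two forms can be checked with a small sketch. The function names and the sample vector below are made up for illustration; only operator+, operator+=, std::next and std::advance come from the answer above.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// `it + n` and std::next(it, n) produce a new iterator and leave `it` alone.
inline bool plus_leaves_original_unchanged() {
    std::vector<int> v{10, 20, 30, 40};
    auto it = v.begin();
    auto moved = it + 2;                 // new value; `it` still refers to v[0]
    auto also_moved = std::next(it, 2);  // same result, works for any iterator category
    return it == v.begin() && *moved == 30 && moved == also_moved;
}

// `it += n` and std::advance(it, n) modify the existing iterator in place.
inline int advance_changes_original() {
    std::vector<int> v{10, 20, 30, 40};
    auto it = v.begin();
    it += 1;              // `it` itself now refers to v[1]
    assert(*it == 20);
    std::advance(it, 2);  // `it` itself now refers to v[3]
    return *it;
}
```

Calling plus_leaves_original_unchanged() should yield true, and advance_changes_original() should yield the last element of the sample vector.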

Can I also claim that arithmetic operations on an iterator create a totally new iterator?
Iterators in the standard library are designed to mimic pointers in their behavior, to an extent. So if you are talking about iterators that originate from standard library containers, then post-incrementing them (which will also modify the source), or adding numbers to them (for random access iterators), will result in new pure iterator values (or "new iterators" as you phrased it). And addition will not modify the source iterator in the standard library.
But since an iterator in general can be any user-defined class, with its own overloaded set of operators, there's no telling. One can theoretically design an iterator where the result is a reference to the existing iterator we supplied as an operand.
And as a matter of fact, even in the standard library, pre-increment returns a reference to the current iterator (in accordance with the behavior of pointers in C++).
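A short sketch of the pre-increment versus post-increment distinction for a standard container iterator. The function names are invented for this example; the behavior shown is what the answer describes for std::vector iterators.

```cpp
#include <cassert>
#include <vector>

// Pre-increment yields a reference to the very iterator it was applied to.
inline bool pre_increment_returns_same_object() {
    std::vector<int> v{1, 2, 3};
    auto it = v.begin();
    auto& ref = ++it;  // binds to `it` itself, no new iterator object
    return &ref == &it && *it == 2;
}

// Post-increment modifies the source and yields a copy holding the old position.
inline bool post_increment_returns_old_value() {
    std::vector<int> v{1, 2, 3};
    auto it = v.begin();
    auto old = it++;   // `old` is a separate iterator value
    assert(old != it);
    return *old == 1 && *it == 2;
}
```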

Yes, no and maybe. An iterator is what's called a "proxy object", that is an object that contains some metadata tied to some other object. In this case, an iterator attempts to behave like a pointer, at least kind of.
So, your actual iterator object will be exactly the same one (for example, if you set a breakpoint in the constructor for an iterator, it won't be called from operator++ on the iterator), which gives the answer "no".
The iterator will have a different value inside it, and behave just like a different (aka 'new' in the quoted text) pointer. So it gives the answer "yes".
I'm sure we can discuss this a lot further, to conclude that the final answer depends on what you mean by the term "new iterator".

Related

Is it possible for a C++ iterator to have gaps and not be linear?

I wrote a C++ iterator to go over an std::string which is UTF-8.
The idea is for the iterator to return char32_t characters instead of bytes. The iterator can be used to go forward or backward. I can also rewind it, and I suppose provide the equivalent of rbegin().
Since a character can span multiple bytes, my position within the std::string may jump by 2, 3, or 4 bytes (the library throws if an invalid character is encountered).
This also means the distance to a certain character does not always increment one by one. In other words, ++it may increment the position by a number from 1 to 4, and --it subtracts in a similar manner.
Is that an expected/legal behavior for a C++ iterator?
Many algorithms in C++ work equally well with plain pointers in addition to iterators. std::copy will work with plain pointers, just fine. std::find_if will be happy too. And so on.
By a fortunate coincidence std::copy invokes the ++ operator on the pointers you feed to it. Well, guess what? Passing a bunch of int*s to std::copy results in the actual pointer being incremented by sizeof(int), instead of 1.
std::copy won't care.
The properties of iterators and their requirements are defined in terms of the logical results and the logical effects of what the various operators cause to happen (as well as which operators are valid for a given iterator). Whether the internal implementation of an iterator increments the internal value, that represents the iterator in some way, by 1, 2, 4, or 42, is immaterial. Note that reverse iterators result in the actual internal pointer getting decremented by its ++ operator overload.
If your custom iterator's implementation of the ++, --, *, [], +, and - operators (whichever ones are appropriate for your iterator) meets all requirements of their assigned iterator category, then the actual effects of these operators on the actual raw pointer value, that represents your iterator, is irrelevant.
The answer to your question is as follows, assuming that your custom iterator is a random access iterator: if all the required operator overloads meet all requirements of a random access iterator, then the actual effects on the underlying pointer value are irrelevant.
The same holds true for any iterator category, not just random access.
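The claim that the size of the internal step is immaterial can be illustrated with a minimal sketch. The helper names are hypothetical; the first measures how far the raw address moves for one ++ on an int*, the second checks that ++ on a std::reverse_iterator moves the wrapped iterator backwards.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// One logical step on an int* moves the address by sizeof(int) bytes.
inline std::ptrdiff_t bytes_moved_by_increment() {
    int data[2] = {1, 2};
    int* p = data;
    int* q = p;
    ++q;  // one logical step
    return reinterpret_cast<char*>(q) - reinterpret_cast<char*>(p);
}

// ++ on a reverse iterator decrements the underlying (base) iterator.
inline bool reverse_increment_moves_backwards() {
    std::vector<int> v{1, 2, 3};
    auto rit = v.rbegin();
    auto before = rit.base();  // equals v.end()
    assert(before == v.end());
    ++rit;
    return rit.base() == before - 1 && *rit == 2;
}
```

In both cases the algorithm using the iterator only sees "one step to the next element"; how the position is represented internally is not part of the iterator requirements.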

What is std::contiguous_iterator useful for?

For what purposes can I use it?
Why is it better than random_access_iterator?
Is there some advantage if I use it?
For a contiguous iterator you can get a pointer to the element the iterator is "pointing" to, and use it like a pointer to a contiguous array.
That can't be guaranteed with a random access iterator.
Remember that e.g. std::deque is a random-access container, but it's typically not a contiguous container (as opposed to std::vector which is both random access and contiguous).
In C++17, there is no such thing as a std::contiguous_iterator. There is the ContiguousIterator named requirement however. This represents a random access iterator over a sequence of elements where each element is stored contiguously, in exactly the same way as an array. Which means that it is possible, given a pointer to a value_type from an iterator, to perform pointer arithmetic on that pointer, which shall work in exactly the same way as performing the same arithmetic on the corresponding iterators.
The purpose of this is to allow for more efficient implementations of algorithms on iterators that are contiguous. Or to forbid algorithms from being used on iterators that aren't contiguous. One example of where this matters is if you're trying to pass C++ iterators into a C interface which is based on pointers to arrays. You can wrap such interfaces behind generic algorithms, verifying the contiguity of the iterator in the template.
Or at least, you could in theory; in C++17, that wasn't really possible. The reason is that there was not actually a way to test if an iterator was a ContiguousIterator. There's no way to ask a pointer if doing pointer arithmetic on a pointer to an element from the iterator is legal. And there was no std::contiguous_iterator_category one could use for such iterators (as this could cause compatibility problems). So you couldn't use SFINAE tools to verify that an iterator was contiguous.
C++20's std::contiguous_iterator concept resolves this problem. It also resolves the other problem with contiguous iterators. See, the above explanation for ContiguousIterator's behavior starts with us having a pointer to an element from the range. Well, how did you get that? The obvious method would be to do something like std::addressof(*it), but what if it is the end iterator? The end iterator is not dereference-able, so you can't do that. Basically, even if you know that an iterator is contiguous, how do you go about converting it to the equivalent pointer?
The std::contiguous_iterator concept solves both of these problems. std::to_address is available, which will convert any contiguous iterator into its equivalent pointer value. And there is a traits tag that an iterator must provide to denote that it is in fact a contiguous iterator, just in case the default to_address implementation happens to be valid for a non-contiguous iterator.
A random access iterator only requires a constant time (iterator) + (offset), whereas contiguous iterators have the stronger guarantee that std::addressof(*((iterator) + (offset))) == std::addressof(*(iterator)) + (offset) (disregarding overloaded operator&s).
This basically means that the iterator is a pointer or a light wrapper around a pointer, so it is equivalent to a pointer to its elements, whereas a random access iterator can do more, at the cost of possibly being bulkier and being unable to turn it into a simple pointer.
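The address guarantee stated above can be checked for std::vector with a sketch like the following. It sticks to pre-C++20 facilities (std::addressof on dereferenceable iterators only), and c_style_sum is a made-up stand-in for a C interface that takes a pointer and a length.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Checks addressof(*(it + n)) == addressof(*it) + n for every valid n.
inline bool vector_addresses_match_pointer_arithmetic() {
    std::vector<int> v{1, 2, 3, 4, 5};
    auto it = v.begin();
    const int* base = std::addressof(*it);
    for (std::size_t n = 0; n < v.size(); ++n) {
        if (std::addressof(*(it + static_cast<std::ptrdiff_t>(n))) != base + n) {
            return false;
        }
    }
    return true;
}

// Hypothetical C-style interface: pointer to the first element plus a count.
inline int c_style_sum(const int* data, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) total += data[i];
    return total;
}

// Because the storage is contiguous, a pointer to the first element suffices.
inline int sum_through_c_interface() {
    std::vector<int> v{1, 2, 3, 4, 5};
    assert(!v.empty());  // *v.begin() is only valid for a non-empty vector
    return c_style_sum(std::addressof(*v.begin()), v.size());
}
```

The same check is not guaranteed to hold for std::deque iterators, which is the distinction the answers draw between random access and contiguous.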
As a C++20 Concept, I would expect you can use it to specify a different algorithm if the container is contiguous. Perhaps exploiting cache locality.

Are all pointers considered iterators?

This was a question on my exam and the answer is that all pointers are iterators but not all iterators are pointers. Why is this the case?
In a statement such as:
int *p = new int(4);
How can p be considered an iterator at all?
"Iterator" is some abstract concept, describing a certain set of operations a type must support with some specific semantics.
Pointers are iterators because they fulfill the concept iterator (and, even stronger, random access iterator), e.g. the operator++ to move to the next element and operator * to access the underlying element.
In your particular example, you get a standard iterator range with
[p, p+1)
which can be used for example in the standard algorithms, like any iterator pair. (It may not be particularly useful, but it is still valid.) The above holds true for all "valid" pointers, that is pointers that point to some object.
The converse implication however is false: For example, consider the std::list<T>::iterator. That is still an iterator, but it cannot be a pointer because it does not have an operator[].
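As a rough illustration of the [p, p+1) range from the answer, plain pointers can be handed straight to standard algorithms. The function names are invented for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>

// Treats [p, p + 1) as an iterator range over a single heap-allocated int.
inline int sum_single_element_range() {
    int* p = new int(4);
    int total = std::accumulate(p, p + 1, 0);  // p + 1 is the past-the-end pointer
    delete p;
    return total;
}

// Pointers into an array work the same way with std::find.
inline bool find_with_pointers() {
    int a[] = {3, 1, 4};
    const int* hit = std::find(a, a + 3, 4);
    assert(hit != a + 3);  // the value is present, so we did not reach the end
    return hit == a + 2;
}
```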

What are singular and non-singular values in the context of STL iterators?

The section §24.1/5 from the C++ Standard (2003) reads,
Just as a regular pointer to an array
guarantees that there is a pointer
value pointing past the last element
of the array, so for any iterator type
there is an iterator value that points
past the last element of a
corresponding container. These values
are called past-the-end values. Values
of an iterator i for which the
expression *i is defined are called
dereferenceable. The library never
assumes that past-the-end values are
dereferenceable. Iterators can also
have singular values that are not
associated with any container.
[Example: After the declaration of an
uninitialized pointer x (as with int*
x;), x must always be assumed to have
a singular value of a pointer.]
Results of most expressions are
undefined for singular values; the
only exception is an assignment of a
non-singular value to an iterator that
holds a singular value. In this case
the singular value is overwritten the
same way as any other value.
Dereferenceable values are always
nonsingular.
I couldn't really understand the text shown in bold?
What is singular value and nonsingular value? How are they defined? And where?
How and why dereferenceable values are always nonsingular?
If I understand this correctly, a singular value for an iterator is essentially the equivalent of an unassigned pointer. It's an iterator that hasn't been initialized to point anywhere and thus has no well-defined element it's iterating over. Declaring a new iterator that isn't set up to point to an element of a range, for example, creates that iterator as a singular iterator.
As the portion of the spec alludes to, singular iterators are unsafe and none of the standard iterator operations, such as increment, dereference, or comparison, can be used on them. All you can do is assign them a new value, hopefully pointing them at valid data.
I think the reason for having this definition is so that statements like
set<int>::iterator itr;
can be permitted by the spec while having a standardized meaning. The term "singular" here probably refers to the mathematical definition of a singularity, which is also called a "discontinuity" in less formal settings.
Iterators can also have singular values that are not associated with any container.
I suppose that's its definition.
How and why dereferenceable values are always nonsingular?
Because if they weren't, dereferencing them would be undefined behavior.
Have a look at What is an iterator's default value?.
As the quote indicates, singular values are iterator values that are not associated with any container. A singular value is almost useless: you can't advance it, dereference it, etc. One way (the only way?) of getting a singular iterator is by not initializing it, as shown in templatetypedef's answer.
One of the useful things you can do with a singular iterator, is assign it a non-singular value. When you do that you can do whatever else you want with it.
The non-singular values are, almost by definition, iterator values that are associated with a container. This answers why dereferenceable values are always non-singular: iterators that do not point to any container cannot be dereferenced (what element would this return?).
As Matthieu M. correctly noted, non-singular values may still be non-dereferenceable. An example is the past-the-end iterator (obtainable by calling container.end()): it is associated with a container, but still cannot be dereferenced.
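The three kinds of values discussed here (singular, past-the-end, dereferenceable) can be sketched as follows. The function names are hypothetical, and the code only performs the operations the quoted text allows: assigning to the possibly singular iterator, and comparing or subtracting (but never dereferencing) the past-the-end one.

```cpp
#include <cassert>
#include <vector>

// A default-constructed iterator is treated as singular until assigned.
inline int assign_then_use() {
    std::vector<int> v{7, 8, 9};
    std::vector<int>::iterator it;  // not tied to any container yet
    it = v.begin();                 // assigning a non-singular value is allowed
    ++it;                           // now ordinary operations are fine
    return *it;
}

// end() is non-singular (tied to v) but must not be dereferenced.
inline bool end_is_usable_but_not_dereferenceable() {
    std::vector<int> v{7, 8, 9};
    auto last = v.end();
    assert(last != v.begin());      // comparison is fine
    return last - v.begin() == 3;   // so is distance; *last would be undefined
}
```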
I can't say where these terms are defined. However, Google has this to say about "define: singular" (among other definitions):
remarkable: unusual or striking
I suppose this can explain the terminology.
What is singular value and nonsingular value? How are they defined? And where?
Let us use the simplest incarnation of an Iterator: the pointer.
For a pointer:
the singular value alluded to is an uninitialized value.
a non-singular value is an explicitly initialized value; it may still not be dereferenceable (the past-the-end pointer shall not be dereferenced)
I would say that the NULL pointer is a singular value, though not the only one, since it represents the absence of value.
What is the equivalent for regular iterators?
std::vector<int>::iterator it;, the default constructor of most iterators (those linked to a container) creates a singular value. Since it's not tied to a container, any form of navigation (increment, decrement, ...) is meaningless.
How and why dereferenceable values are always nonsingular ?
Singular values, by definition, represent the absence of a real value. They appear in many languages: Python's None, C#'s null, C's NULL, C++'s nullptr. The catch is that in C or C++, they may also be simple garbage... (whatever was there in memory before)
Is a default constructed iterator a singular value ?
Not necessarily, I guess. It is not required by the standard, and one could imagine the use of a sentinel object.

STL-Like range, What could go wrong if I did this?

I am writing (as a self-teaching exercise) a simple STL-like range. It is an immutable random-access "container". My range keeps only its start element, the number of elements, and the step size (the difference between two consecutive elements):
struct range
{
    ...
private:
    value_type m_first_element, m_element_count, m_step;
};
Because my range doesn't hold the elements, it calculates the desired element using the following:
// In the standards, the operator[]
// should return a const reference.
// Because Range doesn't store its elements
// internally, we return a copy of the value.
value_type operator[](size_type index)
{
    return m_first_element + m_step*index;
}
As you can see, I am not returning a const reference as the standards say. Now, can I assume that a const reference and a copy of the element are the same in terms of using the non-mutating algorithms in the standard library?
Any advice about the subject is greatly appreciated.
@Steve Jessop: Good point that you mentioned iterators.
Actually, I used the SGI documentation as my reference. At the end of that page, it says:
Assuming x and y are iterators from the same range:
Invariants
Identity: x == y if and only if &*x == &*y
So, it boils down to the same original question I've asked actually :)
The standard algorithms don't really use operator[], they're all defined in terms of iterators unless I've forgotten something significant. Is the plan to re-implement the standard algorithms on top of operator[] for your "ranges", rather than iterators?
Where non-mutating algorithms do use iterators, they're all defined in terms of *it being assignable to whatever it needs to be assignable to, or otherwise valid for some specified operation or function call. I think all or most such ops are fine with a value.
The one thing I can think of, is that you can't pass a value where a non-const reference is expected. Are there any non-mutating algorithms which require a non-const reference? Probably not, provided that any functor parameters etc. have enough const in them.
So sorry, I can't say definitively that there are no odd corners that go wrong, but it sounds basically OK to me. Even if there are any niggles, you may be able to fix them with very slight differences in the requirements between your versions of the algorithms and the standard ones.
Edit: a second thing that could go wrong is taking pointers/references and keeping them too long. As far as I can remember, standard algorithms don't keep pointers or references to elements - the reason for this is that it's containers which guarantee the validity of pointers to elements, iterator types only tell you when the iterator remains valid (for instance a copy of an input iterator doesn't necessarily remain valid when the original is incremented, whereas forward iterators can be copied in this way for multi-pass algorithms). Since the algorithms don't see containers, only iterators, they have no reason that I can think of to be assuming the elements are persistent.
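To make the "values are fine for non-mutating algorithms" point concrete, here is a minimal sketch of such a range. It is not the asker's class: int_range, its members, and the helper functions are all invented for illustration, the element type is fixed to int, and the iterator is deliberately declared as an input iterator because operator* returns a value rather than a reference.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>

// Hypothetical arithmetic range: first, first + step, ... computed on demand.
struct int_range {
    int first;
    int count;
    int step;

    struct iterator {
        using iterator_category = std::input_iterator_tag;
        using value_type = int;
        using difference_type = std::ptrdiff_t;
        using pointer = const int*;
        using reference = int;  // a value, not a true reference

        int current;
        int step;

        int operator*() const { return current; }  // returns a copy
        iterator& operator++() { current += step; return *this; }
        iterator operator++(int) { iterator old = *this; ++*this; return old; }
        bool operator==(const iterator& other) const { return current == other.current; }
        bool operator!=(const iterator& other) const { return !(*this == other); }
    };

    int operator[](std::size_t index) const { return first + step * static_cast<int>(index); }
    iterator begin() const { return iterator{first, step}; }
    iterator end() const { return iterator{first + step * count, step}; }
};

// Non-mutating standard algorithms only read *it, so a value is enough.
inline int sum_of(const int_range& r) {
    assert(r.step != 0);  // with step == 0, begin() and end() would compare equal
    return std::accumulate(r.begin(), r.end(), 0);
}

inline bool contains(const int_range& r, int x) {
    return std::find(r.begin(), r.end(), x) != r.end();
}
```

For example, int_range{1, 4, 2} stands for 1, 3, 5, 7. Note that the SGI identity invariant quoted in the question (x == y if and only if &*x == &*y) cannot even be expressed here, since *x is a temporary; that is one reason the sketch claims no more than the input iterator category.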
Items in STL containers are expected to be copied around all the time; think about when a vector has to be reallocated, for example. So, your example is fine, except that it only works with random access iterators. But I suspect the latter is probably by design. :-P
Do you want your range to be usable in STL algorithms? Wouldn't it be better off with the first and last elements? (Considering the fact that end() is oft required/used, you will have to pre-calculate it for performance.) Or, are you counting on the contiguous elements (which is my second point)?
Since you put "container" in "quotes" you can do whatever you want.
STL-type things (iterators, various member functions on containers, ...) return references because references are lvalues, so certain constructs (e.g. myvec[i] = otherthing) can compile. Think of operator[] on std::map. For const references, I suppose returning a reference rather than a value is just to avoid the copy.
This rule is violated all the time though, when convenient. It's also common to have iterator classes store the current value in a member variable purely for the purpose of returning a reference or const reference (yes, this reference would be invalid if the iterator is advanced).
If you're interested in this sort of stuff you should check out the boost iterator library.