How Iterator pointers in the STL work - c++

Hey, I'm a little confused about how iterators work, specifically a const_iterator for a class in this case. I understand that they are really just pointers to specific elements of a string or whatever you're working with. But in that case,
I would think of them as memory addresses, so that you could not just add integers to get to the next item in the string.
Here's the code I'm confused about:
int nCharOffset = 0; // running index, assumed to be declared before the loop
string::const_iterator iCharacterLocater;
for ( iCharacterLocater = strSTLString.begin();
      iCharacterLocater != strSTLString.end();
      ++iCharacterLocater )
{
    cout << "Character [" << nCharOffset++ << "] is: ";
    cout << *iCharacterLocater << endl;
}
Thanks!! =)

It's a bit more complicated than that. Iterators refer to the Gang of Four (GoF) design pattern of the same name and, in C++, are implemented in such a way that they look just like pointers. C++ lets you overload operators, which means that custom types can behave in specific ways when certain operators are used on them.
There are several types of iterators in C++, with varying feature levels. The underlying representation of a vector and of a string are both simple enough that iterators behave pretty much exactly like pointers: you can add numbers to them to seek a specific item. For instance, myVector.begin() + 5 returns an iterator to the 6th element of the vector (6th because indices are zero-based, and 0 would be the first). String iterators also let you do this.
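To make the "add a number to seek" point concrete, here is a minimal sketch. The helper name element_at_offset is made up for illustration, and it assumes the offset is in range, since nothing is bounds-checked.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the element `offset` positions past the start.
// Assumes 0 <= offset < v.size(); no bounds checking is performed.
int element_at_offset(const std::vector<int>& v, std::ptrdiff_t offset)
{
    // Vector iterators are random-access, so `+` works as it does on pointers.
    std::vector<int>::const_iterator it = v.begin() + offset;
    return *it;
}
```

With a vector holding 10, 20, 30, 40, 50, 60, 70, an offset of 5 lands on the 6th element.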
String constant iterators behave much like constant char pointers. When you see const char* foo, it doesn't mean you can't change foo–it just means you can't change what it points to. So when you see std::string::const_iterator foo, it doesn't mean you can't change foo. It just means you can't change what it references.
const char* foo = "abcd";
foo = "zyxwvu"; // valid: the pointer itself can be changed
*foo = 't'; // invalid: the pointed data can't be changed
*(foo + 2) == 'x'; // true
std::string myString = "zyxwvu";
std::string::const_iterator foo = myString.begin();
foo = myString.begin() + 2; // valid: the iterator itself can be changed
*foo = 't'; // invalid: the pointed data can't be changed
*foo == 'x'; // true: foo now refers to index 2 of "zyxwvu"
If you don't attempt to modify data, iterators and const iterators behave just the same. Also, except that there is no direct conversion between the two, a pointer to the contents of a string and an iterator to the contents of a string behave exactly the same.

For std::string and std::vector, they effectively are memory addresses. These containers store their elements contiguously: when you store 10 elements in one, those elements are all adjacent in memory. So when you do itr++, it points to the next memory address, which is guaranteed to be the next element. (Node-based containers such as std::list or std::map are not laid out this way; their iterators overload ++ to follow links instead.)
So a std::string just sits on top of a character array. Those characters are all side by side in memory. If you're pointing to one character and you increment the pointer, you are now pointing to the next character. std::string::begin() returns an iterator to the first character, and std::string::end() returns an iterator to the position after the last character.
const std::string::iterator means that you can't modify the iterator. So you can't increment it, or make it point to something else.
std::string::const_iterator means you can't modify the value that the iterator points to. So you can change which value the iterator is pointing to (itr++), but you can't change the actual value at that memory address.
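A small sketch of the difference between the two spellings. The function names are invented for illustration, and the commented-out lines are the ones a compiler would reject.

```cpp
#include <cstddef>
#include <string>

// Walks the string with a const_iterator: the iterator moves,
// but the characters it refers to are read-only.
std::size_t count_char(const std::string& s, char c)
{
    std::size_t n = 0;
    for (std::string::const_iterator it = s.begin(); it != s.end(); ++it) {
        // *it = 'x';  // would not compile: const_iterator gives read-only access
        if (*it == c)
            ++n;
    }
    return n;
}

// Uses a `const iterator`: the iterator cannot be moved,
// but the character it refers to can still be changed.
void replace_first(std::string& s, char c)
{
    if (s.empty())
        return;
    const std::string::iterator it = s.begin();
    // ++it;           // would not compile: the iterator object itself is const
    *it = c;
}
```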

An iterator is basically a pointer, correct! But what really matters is what it points to.
The way I think of it, an iterator behaves like a pointer to the container's element type.
For example:
for a vector of string, the iterator acts as a pointer to a string object
(string is a type defined by the standard library);
for a vector of classA, the iterator acts as a pointer to a classA object
(a class is basically a user-defined structure).
The beauty of defining iterators this way is that anything can be stored as an object of some type, so we can reach any element through an iterator to it.

Related

Why don't we use an asterisk (*) when declaring an iterator in the STL?

When we use an iterator, we declare the iterator type and then itr as an object, without the * that we write every time we declare a pointer variable. But when we print the values of a vector through the iterator, we write *itr. How did itr become *itr
when we never declared it as a pointer?
Is the pointer hidden, or does it work in the background?
Example like:
iterator itr;
*itr
How does this work? Does * mean something else for an iterator, or does *itr act like a normal pointer variable?
If it works like a pointer variable, why don't we write * when declaring itr?
An iterator is an object that lets you travel (or iterate) over each object in a collection or stream. It is a sort of generalization of pointers. That is, pointers are one example of an iterator.
Iterators implement concepts required by various algorithms such as forward iteration (meaning it can be incremented to move forward in the collection), bi-directional iteration (meaning it can go forward and backward), and random access (meaning you can use an index to reach an arbitrary item in the collection).
For instance, moving backward can't typically happen in a stream, so stream's iterators are typically forward iterators only because once you access a value, you can't go back in the stream. A linked list's iterators are bi-directional because you can move forward or backward, but you cannot access them by indexing because the nodes are not typically in contiguous memory, so you can't calculate with an index where an arbitrary element is. A vector's iterators are random access and very much like pointers. (C++20 made these categories more precise, so the old categories are now called "Legacy".)
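As a rough illustration of how the category limits what you can write, the sketch below reads the third element of a vector and of a list. The function names are made up, and both assume the container holds at least three elements.

```cpp
#include <iterator>
#include <list>
#include <vector>

// Random-access iterator: `+` jumps straight to the element.
int third_of_vector(const std::vector<int>& v)
{
    return *(v.begin() + 2);
}

// Bidirectional iterator: there is no operator+, so the iterator
// has to be stepped forward one node at a time.
int third_of_list(const std::list<int>& l)
{
    // return *(l.begin() + 2);      // would not compile for std::list
    return *std::next(l.begin(), 2);
}
```

std::next works for both containers; it is just constant-time for the vector and linear for the list.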
Iterators can also do special things. For example, std::back_inserter creates an iterator that appends an item to the end of a container whenever a value is assigned to its referent.
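For example, here is a minimal sketch of an inserting iterator in use. The function name evens is invented; std::copy_if and std::back_inserter are the standard library pieces.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Copies the even values from `in` into a new vector.
// Each assignment through the back_insert_iterator calls out.push_back(value).
std::vector<int> evens(const std::vector<int>& in)
{
    std::vector<int> out;
    std::copy_if(in.begin(), in.end(), std::back_inserter(out),
                 [](int x) { return x % 2 == 0; });
    return out;
}
```

Note that out never has to be sized in advance, because writing through the iterator grows the container.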
So, you can see that iterators allow you to be more precise in defining what your consumer of iterators expects. If your algorithm requires bi-directional iteration, you can communicate that and limit it so it won't work with forward-only iterators.
As for the * operator, it is similar to the * operator for a pointer. In both cases, it means, "give me the value referred to by this handle". It is implemented via operator overloading. You do not need the * when declaring an iterator because it is not a pointer, which is a lower-level construct in the language. Rather, it is an object with pointer-like semantics.
To answer your questions below:
No, the * is not automatically created. When you declare an iterator you are declaring an object. When the class for that object is defined, it may or may not have an operator overload for the * operator (or the == or the + or any other operators).
When you go to use the object, such as passing it to a function, the types will need to match up. If a function you were passing it to requires an iterator (e.g. std::sort()), then no dereferencing * is needed. If the function was expecting a value of the type the iterator refers to, then you would need to dereference it first. In that case the compiler calls the overloaded operator* and returns the value.
That is the nature of overloaded operators -- they look like ordinary operators but ultimately are resolved to a function call defined by the creator of the class. It works the same as if you defined a matrix class that has plus and minus operators overloaded.
How does this work? Does * mean something else for an iterator, or does *itr act like a normal pointer variable?
It depends on what type stands behind iterator. It can be an alias for a pointer:
using iterator = int *;
iterator itr;
*itr; // it is pointer dereferencing in this case.
Or it can be a user defined type:
struct iterator {
int &operator*();
};
iterator itr;
*itr; // it means itr.operator*() here
So without knowing what type iterator is, it is impossible to say what * actually does here. In practice you should not need to care, because the library's developers should implement it so that the difference does not matter to you.
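To see that the caller really doesn't need to care, here is a sketch with a made-up iterator type (CountingIterator) and a made-up function template (sum_range) that only relies on *, ++ and !=. The same template accepts raw pointers too.

```cpp
// Hypothetical user-defined iterator: "points at" successive integers.
struct CountingIterator {
    int value;

    int operator*() const { return value; }                  // *it
    CountingIterator& operator++() { ++value; return *this; } // ++it
    bool operator!=(const CountingIterator& other) const
    {
        return value != other.value;
    }
};

// Sums the half-open range [first, last) using only *, ++ and !=,
// so it works with any type that provides those three operations.
template <typename Iter>
int sum_range(Iter first, Iter last)
{
    int total = 0;
    for (; first != last; ++first)
        total += *first;
    return total;
}
```

Calling sum_range(CountingIterator{1}, CountingIterator{5}) goes through the overloaded operators, while sum_range(a, a + 3) on an int array uses the built-in pointer ones.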

Getting a Raw Pointer to the end of a Container

If I have the end iterator to a container, but I want to get a raw pointer to that position, is there a way to accomplish this?
Say I have a container: foo. I cannot for example do this: &*foo.end() because it yields the runtime error:
Vector iterator not dereferencable
I can do this but I was hoping for a cleaner way to get there: &*foo.begin() + foo.size().
EDIT:
This is not a question about how to convert an iterator to a pointer in general (obviously that's in the question), but how to specifically convert the end iterator to a pointer. The answers in the "duplicate" question actually suggest dereferencing the iterator. The end iterator cannot be dereferenced without seg-faulting.
The correct way to access the end of storage is:
v.data() + v.size()
This is because *v.begin() is invalid when v is empty.
The member function data is provided for all contiguous containers (vector, string and array).
Since C++17 you can also use the non-member functions:
std::data(v) + std::size(v)
This works on raw arrays as well.
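A minimal sketch of the data() + size() idiom in use; the function name is invented. The end pointer is only ever compared against, never dereferenced, so an empty vector is handled too.

```cpp
#include <numeric>
#include <vector>

// Sums a vector through a raw pointer range [first, last).
int sum_via_pointers(const std::vector<int>& v)
{
    const int* first = v.data();
    const int* last  = v.data() + v.size(); // one past the end; never dereferenced
    return std::accumulate(first, last, 0);
}
```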
In general? No.
And the fact that you're asking indicates that something is wrong with your overall design.
For vectors, arrays, strings? Sure… but why?
Just get a pointer to the start of the storage and advance it:
std::vector<T> foo;
const T* ptr = foo.data() + foo.size();
As long as you don't dereference such a pointer (which is almost equivalent to dereferencing the iterator, as you did in your attempt) it is valid to obtain and hold such a pointer, because it points to the special one-past-the-end location.
Note that &foo[0] + foo.size() has undefined behaviour if the vector is empty, because &foo[0] is &*(foo.data() + 0) is &*foo.data(), and (just like in your attempt) *foo.data() is disallowed if there's nothing there. So we avoid all dereferencing and simply advance foo.data() itself.
Anyway, this only works for vectors [1], arrays and strings. Other containers do not guarantee (and cannot reasonably be expected to provide) storage contiguity; their end pointers could be almost anything, e.g. a "sentinel" null pointer, which is unlikely to be of any use to you.
That is why the iterator abstraction is there in the first place. Stick to it if you can, instead of delving into raw pointer usage.
[1] Excepting std::vector<bool>.

Need clarification about C++ std::iterator

Reading a C++ book I encountered the following example on using iterators:
vector<string::iterator> find_all(string& s, char c)
{
vector<string::iterator> res;
for(auto p = s.begin(); p != s.end(); ++p)
if(*p == c)
res.push_back(p);
return res;
}
void test()
{
string m {"Mary had a little lamb"};
for(auto p : find_all(m, 'a'))
if(*p != 'a')
cerr << "a bug!\n";
}
I'm a little confused about what the vector returned by find_all() contains. Is it essentially "pointers" to the elements of the string m created above it?
Thanks.
I'm a little confused about what the vector returned by find_all() contains. Is it essentially "pointers" to the elements of the string m created above it?
Mostly; iterators aren't (necessarily) pointers, they are somewhat a generalization of the pointer concept. They are used to point to specific objects stored inside containers (in this case, characters inside a string), you can use them to move between the elements of the string (via the usual arithmetic operators - when they are supported) and you "dereference" them with * to get a reference to the pointed object.
Notice that, depending on the container, they are implemented differently and provide different features; an iterator to a std::list, for example, will allow ++, -- and *, but not moving to arbitrary locations, and an iterator to a singly-linked list (std::forward_list) won't even support --, while typically iterators to array-like data structures (like vector or string) will allow completely free movement.
To refer to elements in array-like structures often one just stores indexes, since they are cheap to store and use; for other structures, instead, storing iterators may be more convenient.
For example, just yesterday I had some code which walked an unordered_map<string, int> (a hash table that mapped some words to their occurrence counts) to "take note" of some of the (string, int) pairs to use them later.
The equivalent of storing vector indexes here would have been storing the hashtable's keys, but (1) they are strings (so they are moderately costly to allocate and handle), and (2) to use them to reach the corresponding object I had to do another hashtable lookup later. Instead, storing iterators in a vector guarantees no hassle for storing strings (iterators are intended to be cheap to handle) and no need to perform a lookup again.
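Roughly, that pattern could look like the sketch below. The alias WordCounts, the function name and the min_count threshold are all invented for illustration, not taken from the actual code. The stored iterators are only usable while the map is not rehashed and the entries are not erased.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

using WordCounts = std::unordered_map<std::string, int>;

// Collects iterators to the entries whose count is at least `min_count`.
// Storing the iterator avoids copying the key string and avoids a second lookup.
std::vector<WordCounts::const_iterator>
frequent_entries(const WordCounts& counts, int min_count)
{
    std::vector<WordCounts::const_iterator> picked;
    for (auto it = counts.begin(); it != counts.end(); ++it)
        if (it->second >= min_count)
            picked.push_back(it);
    return picked;
}
```

Later on, it->first and it->second give direct access to the word and its count.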
Yes, iterators are like pointers. std::string::iterator can even be an alias for char *, although it's usually not.
In general, iterators provide a subset of pointer functionality. Which subset depends on the iterator. Your book probably covers this, but all iterators can be dereferenced (*, but there is never a reference & operation) and incremented (++), then some additionally provide --, and some add + and - on top of that.
In this case, the function seems to assume you will only be querying the values of the iterators without modifying the string. Because the allocation block used for string storage may change as the string grows, iterators (like pointers) into the string may be invalidated. This is why std::string member functions like string::find return index numbers, not iterators.
A vector of indexes could be a better design choice, but this is good enough for an example.
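For comparison, an index-based variant might look like this. The name find_all_indexes is made up; it is a sketch of the alternative design, not the book's code. The returned positions stay meaningful even if the string later reallocates, as long as the characters before them are not inserted or erased.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the positions of every occurrence of `c` in `s`.
std::vector<std::size_t> find_all_indexes(const std::string& s, char c)
{
    std::vector<std::size_t> res;
    for (std::size_t pos = s.find(c);
         pos != std::string::npos;
         pos = s.find(c, pos + 1))
        res.push_back(pos);
    return res;
}
```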

why can't I dereference an iterator?

If I have a vector of objects (plain objects, not pointers or references), why can't I do this?
Object* object;
object = vectorOfObjects.end();
or
object = &vectorOfObjects.end();
or
object = &(*vectorOfObjects.end());
also the same question if 'object' was a reference.
They are three separate errors:
object = vectorOfObjects.end();
won't work because end() returns an iterator, and object is a pointer. Those are generally distinct types (a vector can use raw pointers as iterators, but not all implementations do it; other containers require special iterator types).
object = &vectorOfObjects.end();
doesn't work because you take the address of the returned iterator. That is, you get a pointer to an iterator, rather than a pointer to an Object.
object = &(*vectorOfObjects.end());
doesn't work because the end iterator doesn't point to a valid element. It points one past the end of the sequence, and so it can't be dereferenced. You can dereference an iterator to the last element of a non-empty sequence (which would be std::prev(vectorOfObjects.end()), or simply use vectorOfObjects.back()), but not an iterator pointing past the end.
Finally, the underlying problem/confusion might be that you think an iterator can be converted to a pointer. In general, it can't. If your container is a vector, you're guaranteed that elements are allocated contiguously, like in an array, and so a pointer will work. But for, say, a list, a pointer to an element is useless. It doesn't give you any way to reach the next element.
Because .end() returns an iterator, not an Object, and that iterator doesn't even refer to a valid element of the vector.
Even with begin(), which does refer to a valid element when the vector is non-empty, you still need to dereference the iterator to get at the Object:
std::vector<Object> vec;
std::vector<Object>::iterator it;
//....
it = vec.begin();
Object o = *it;
This is because by convention end does not point to a valid location in the container: it sits at a location one past the end of the container, so dereferencing it is illegal.
The situation with begin() is different: you can dereference it when v.begin() != v.end() (i.e. when the container is not empty):
object = &*vectorOfObjects.begin(); // dereference the iterator, then take the element's address
If you need to access the last element, you can do this:
vector<Object>::const_iterator i = vectorOfObjects.end();
--i;
cout << *i << endl; // Prints the last element of the container
Again, this works only when your vector contains at least one element.
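One way to package that emptiness check is sketched below. The helper name last_element is invented, and it assumes the element type does not overload unary &.

```cpp
#include <iterator>
#include <vector>

// Returns a pointer to the last element, or nullptr if the vector is empty.
template <typename T>
const T* last_element(const std::vector<T>& v)
{
    if (v.empty())
        return nullptr;            // end() must not be stepped back or dereferenced
    return &*std::prev(v.end());   // same element as &v.back()
}
```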
Object* means a pointer to an Object, where a pointer is a memory address. But the .end() method returns an iterator, which is itself an object, not a memory address.

How do you use a C++ iterator?

I have a vector like so:
vector<MyType*> _types;
And I want to iterate over the vector and call a function on each of MyTypes in the vector, but I'm getting invalid return errors from the compiler. It appears the pos iterator isn't a pointer to MyType, it's something else. What am I not understanding?
Edit: Some code..
for (pos = _types.begin(); pos < _types.end(); pos++)
{
InternalType* inst = *pos->GetInternalType();
}
The compiler errors are:
invalid return type 'InternalType**' for overloaded 'operator ->'
'GetInternalType' : is not a member of 'std::_Vector_iterator<_Ty,_Alloc>'
Edit pt2
Should my vector contain pointers or objects? What are the pros and cons? If I am using new to create an instance, I am guessing I can only use a vector of pointers to MyType is that correct?
If the vector contained objects, not pointers, you could do pos->foo(). The iterator "acts like" a pointer. But your vector contains pointers, so an iterator will act like a pointer to a pointer, so needs to be dereferenced twice.
MyType *pMyType = *pos; // first dereference
if (pMyType) { // make sure the pointer is not null
pMyType->foo(); // second dereference
}
If you are sure the pointer is not null, you could do this:
(*pos)->foo();
The parentheses around *pos are needed so the dereference applies to pos, not to the result of pos->foo(). This is operator precedence: -> binds more tightly than unary *.
If your vector needs to contain items from a class hierarchy (e.g., subclasses of MyType), then you have to make it a vector of pointers. Otherwise a vector of objects is probably simpler.
You've defined _types as a vector of pointers. Assuming your pos is an iterator into that vector, it's going to be an iterator to a pointer, so you'll need to dereference it twice to get to an instance of MyType.
Edit: based on what you've added to the question: You have something like *pos->whatever. Try (*pos)->whatever instead. As it stands, you're trying to use whatever as a member of the iterator, then dereference the result...
Iterators are not pointers. Iterators are types that behave like pointers. In some implementations of some containers, the iterator type may in fact be a pointer (all pointers are iterators), but this cannot be used as a general rule.
If you need to generate a pointer from an iterator, you can use &*pos, which will dereference the iterator and then get the address of the result (of course, this doesn't work if unary operator & is overloaded, but that's a whole other can of worms).
for (vector<MyType*>::iterator i = _types.begin(); i != _types.end(); ++i) {
// do something with i
}
for (vector<MyType*>::iterator it = _types.begin();
it != _types.end();
++it)
{
// `*it` is this iteration's `MyType*` from your vector
}
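Putting the double dereference together, a compilable sketch could look like this. MyType and InternalType here are hypothetical stand-ins for the asker's classes (the id member and the function sum_internal_ids are invented), so treat it as an illustration only.

```cpp
#include <vector>

// Hypothetical stand-ins for the asker's types.
struct InternalType { int id; };

struct MyType {
    InternalType internal;
    InternalType* GetInternalType() { return &internal; }
};

// Sums the ids reachable through a vector of MyType pointers, skipping nulls.
int sum_internal_ids(const std::vector<MyType*>& types)
{
    int total = 0;
    for (std::vector<MyType*>::const_iterator pos = types.begin();
         pos != types.end(); ++pos) {
        MyType* p = *pos;                       // first dereference: iterator -> MyType*
        if (p)
            total += p->GetInternalType()->id;  // second dereference: MyType* -> MyType
    }
    return total;
}
```

Written in one step, the body would be (*pos)->GetInternalType(), with the parentheses forcing the iterator to be dereferenced first.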