implementing an iterator of certain class - c++

When creating an iterator of a vector, the iterator itself is a pointer to the values held by the vector. Therefore *iterator is actually the value held by the vector.
So I have two questions:
When using an iterator on a map, what is the iterator actually? I mean, what is its inner implementation? Is it like a struct that holds different data members?
If I want to implement my own iterator, which holds several data members, what am I actually returning when creating an iterator?

Depends on implementation. Usually, std::map is implemented as a balanced binary search tree. In that case, the iterator would likely point to a node in the tree.

The iterator of a std::map is a structure that references the key-value pairs stored in your map. The standard iterator that you get from e.g. begin() or end() is a bidirectional iterator. This means that you can call the ++i and --i operators on the iterator object to move forward or backward between the items stored in your map.
Why do you want to implement your own iterator? Maybe creating a class or struct and storing it in a std::vector<T> will do what you want. You can then use the iterator type std::vector<T>::iterator. If you really want to implement your own iterator, ask yourself whether it only needs to work with your own data structures, or whether it should be compatible with e.g. the std data structures. If it's the latter, you should derive from an existing iterator implementation and modify it to fit your needs. Look at this answer as well.
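A minimal sketch of an iterator that the standard algorithms accept, assuming C++11 or later. Instead of deriving from std::iterator (deprecated since C++17), it declares the five member types that std::iterator_traits looks for. IntRange, its nested iterator and sum_range are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>

// Hypothetical container: the integers in the half-open range [first, last).
class IntRange {
public:
    class iterator {
    public:
        // Member types that std::iterator_traits and the algorithms look for.
        using iterator_category = std::input_iterator_tag;
        using value_type        = int;
        using difference_type   = std::ptrdiff_t;
        using pointer           = const int*;
        using reference         = int;

        explicit iterator(int current) : current_(current) {}

        reference operator*() const { return current_; }
        iterator& operator++() { ++current_; return *this; }
        iterator operator++(int) { iterator old = *this; ++current_; return old; }
        friend bool operator==(const iterator& a, const iterator& b) { return a.current_ == b.current_; }
        friend bool operator!=(const iterator& a, const iterator& b) { return !(a == b); }

    private:
        int current_;  // all the state this particular iterator needs
    };

    IntRange(int first, int last) : first_(first), last_(last) { assert(first <= last); }
    iterator begin() const { return iterator(first_); }
    iterator end() const { return iterator(last_); }

private:
    int first_;
    int last_;
};

// Sums the range with a standard algorithm to show the iterator is compatible.
inline int sum_range(int first, int last) {
    IntRange range(first, last);
    return std::accumulate(range.begin(), range.end(), 0);
}
```

So what begin() returns is simply an object of the iterator class. An iterator with several data members works the same way, as long as the operators above are defined in terms of them.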

The iterator itself is a pointer to the values held by the vector.
A vector iterator is not necessarily a raw pointer to the value. In general it is a type with operator* implemented, which returns the value held by the container at the position the iterator refers to. In the case of a map you can access the key and value through the first and second members:
map<string, int> wheelMap;
wheelMap["Car"] = 4;
map<string, int>::iterator itWheel = wheelMap.begin();
cout << itWheel->first << ":" << itWheel->second << endl; // This will print: Car:4
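A map iterator also implements other operators: ++, --, ->, == and !=. It is a bidirectional iterator, so it does not provide + or -. Only random-access iterators, such as a vector's, have those.
Additionally, vector implements operator[] to get values by index.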

I implemented a std::map-like container as a red-black tree (often cited as the usual std::map implementation), and the only thing the iterator implementation needs (both const and non-const versions) is a pointer to the tree node. Each tree node contains pointers to its two children and its parent (plus a color bit), which is enough to traverse the tree in either direction. In general, though, the data needed depends on the container and iterator types (and on the implementation). E.g. my deque iterators hold a pointer to the container and the index of the element, but how iterators are implemented and what data they need is really implementation specific.
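The idea of an iterator that is nothing more than a node pointer can be sketched as follows. This is a simplified illustration, not any library's actual code: Node, TreeIterator and walk_example are hypothetical names, the tree is wired by hand without balancing or a color bit, and a null pointer stands in for end().

```cpp
#include <cassert>
#include <vector>

// Hypothetical binary search tree node with a parent link for traversal.
struct Node {
    int value;
    Node* left;
    Node* right;
    Node* parent;
    explicit Node(int v) : value(v), left(nullptr), right(nullptr), parent(nullptr) {}
};

// The iterator's whole state is one node pointer.
class TreeIterator {
public:
    explicit TreeIterator(Node* node = nullptr) : node_(node) {}

    int& operator*() const { return node_->value; }

    // Moves to the in-order successor.
    TreeIterator& operator++() {
        assert(node_ != nullptr);
        if (node_->right) {
            // Successor is the leftmost node of the right subtree.
            node_ = node_->right;
            while (node_->left) node_ = node_->left;
        } else {
            // Climb until we leave a left subtree (or run out of parents).
            Node* child = node_;
            node_ = node_->parent;
            while (node_ && child == node_->right) {
                child = node_;
                node_ = node_->parent;
            }
        }
        return *this;
    }

    bool operator==(const TreeIterator& other) const { return node_ == other.node_; }
    bool operator!=(const TreeIterator& other) const { return node_ != other.node_; }

private:
    Node* node_;
};

// Wires up a four-node tree by hand and walks it from the smallest element.
inline std::vector<int> walk_example() {
    Node n4(4), n2(2), n3(3), n6(6);
    n4.left = &n2;  n2.parent = &n4;
    n4.right = &n6; n6.parent = &n4;
    n2.right = &n3; n3.parent = &n2;

    std::vector<int> visited;
    for (TreeIterator it(&n2); it != TreeIterator(); ++it) {
        visited.push_back(*it);
    }
    return visited;
}
```

The parent pointers are what make it possible to step forward without keeping a stack inside the iterator.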

Related

Why don't we write an asterisk (*) when declaring an iterator in the STL?

When we use an iterator, we declare it as iterator itr, like an ordinary object. We don't write a * as we do every time we declare a pointer variable. Yet when we print the values of a vector through the iterator, itr becomes *itr. How can that be, when we never declared a pointer?
Is the pointer hidden, or does it work in the background?
For example:
iterator itr;
*itr
How does it work? Does * mean something else for an iterator, or does *itr act like a normal pointer variable?
If it works like a pointer variable, then why do we not write * when declaring itr?
An iterator is an object that lets you travel (or iterate) over each object in a collection or stream. It is a sort of generalization of pointers. That is, pointers are one example of an iterator.
Iterators implement concepts required by various algorithms, such as forward iteration (meaning the iterator can be incremented to move forward in the collection), bi-directional iteration (meaning it can go forward and backward), and random access (meaning you can index an arbitrary item in the collection).
For instance, moving backward can't typically happen in a stream, so stream's iterators are typically forward iterators only because once you access a value, you can't go back in the stream. A linked list's iterators are bi-directional because you can move forward or backward, but you cannot access them by indexing because the nodes are not typically in contiguous memory, so you can't calculate with an index where an arbitrary element is. A vector's iterators are random access and very much like pointers. (C++20 made these categories more precise, so the old categories are now called "Legacy".)
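As a rough illustration of these categories, the sketch below queries std::iterator_traits at compile time. The helper names is_random_access and third_element are made up for this example.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// True when the iterator type It advertises (at least) random access.
template <typename It>
constexpr bool is_random_access() {
    return std::is_base_of<std::random_access_iterator_tag,
                           typename std::iterator_traits<It>::iterator_category>::value;
}

static_assert(is_random_access<std::vector<int>::iterator>(),
              "vector iterators are random access");
static_assert(!is_random_access<std::list<int>::iterator>(),
              "list iterators are only bi-directional");
static_assert(is_random_access<int*>(),
              "raw pointers are random access iterators too");

// std::advance works for every category; for a list it steps one node at a time.
inline int third_element(const std::list<int>& items) {
    assert(items.size() >= 3);
    std::list<int>::const_iterator it = items.begin();
    std::advance(it, 2);
    return *it;
}
```

An algorithm that needs indexing can use the same kind of check to reject iterators that cannot provide it.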
Iterators can also have special behavior, such as the iterator returned by std::back_inserter, which appends an item to the end of a container whenever a value is assigned through it.
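A short sketch of std::back_inserter in use. The function name keep_even is hypothetical; std::copy_if and std::back_inserter are the standard facilities.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Copies the even values of `input` into a new vector.
inline std::vector<int> keep_even(const std::vector<int>& input) {
    std::vector<int> output;
    // Each assignment through the inserter calls output.push_back(value),
    // so output does not need to be sized in advance.
    std::copy_if(input.begin(), input.end(), std::back_inserter(output),
                 [](int value) { return value % 2 == 0; });
    assert(output.size() <= input.size());
    return output;
}
```

Without the inserter, the destination would need enough existing elements for std::copy_if to overwrite.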
So, you can see that iterators allow you to be more precise in defining what your consumer of iterators expects. If your algorithm requires bi-directional iteration, you can communicate that and limit it so it won't work with forward-only iterators.
As for the * operator, it is similar to the * operator for a pointer. In both cases, it means, "give me the value referred to by this handle". It is implemented via operator overloading. You do not need the * when declaring an iterator because it is not a pointer, which is a lower-level construct in the language. Rather, it is an object with pointer-like semantics.
To answer your questions below:
No, the * is not automatically created. When you declare an iterator you are declaring an object. When the class for that object is defined, it may or may not have an operator overload for the * operator (or the == or the + or any other operators).
When you go to use the object, such as passing it to a function, the types will need to match up. If a function you were passing it to requires an iterator (e.g. std::sort()), then no dereferencing * is needed. If the function was expecting a value of the type the iterator refers to, then you would need to dereference it first. In that case the compiler calls the overloaded operator* and returns the value.
That is the nature of overloaded operators -- they look like ordinary operators but ultimately are resolved to a function call defined by the creator of the class. It works the same as if you defined a matrix class that has plus and minus operators overloaded.
How does it work? Does * mean something else for an iterator, or does *itr act like a normal pointer variable?
It depends on what type stands behind iterator. It can be an alias for a pointer:
using iterator = int *;
iterator itr;
*itr; // it is pointer dereferencing in this case.
Or it can be a user defined type:
struct iterator {
int &operator*();
};
iterator itr;
*itr; // it means itr.operator*() here
So without knowing what type iterator is, it is impossible to say what * actually does here. In practice you should not need to care, because the developers of the library should implement it in such a way that the difference does not matter to you.

Dereferencing iterator-value type of iterator

If I want to iterate through stl map, normally I use
for (it=my_map.begin();it!=my_map.end();it++)
{
}
I know that typeid(map<int,char>::iterator::value_type) == typeid(pair<const int,char>) is true, i.e.
the value type of the iterator of std::map<key_type,value_type> is std::pair<const key_type,value_type>.
But if I want to do
std::pair<const key_type,value_type> b=it;
the compiler gives an error.
This assignment std::pair<const key_type,value_type> b=*it; works.
My question: what is the type of the iterator? Is it a pointer to a pair, or a pair?
Iterators are iterators. They are neither pointers, nor pairs. They are iterators.
"Dereferencing" [1] an iterator from a std::map<K, V> will give you a std::pair<const K, V>& [2], yes, but the iterator itself has its own type.
[1] The * and -> operators are overloaded to perform this indirection since, again, an iterator is not actually a pointer in the C++ sense.
[2] Or a const std::pair<const K, V>&, if you started off with a std::map<K, V>::const_iterator.
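These types can be checked at compile time. The sketch below assumes C++11 and uses a hypothetical alias Map and helper describe_first; the static_asserts restate the two cases above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>
#include <utility>

using Map = std::map<int, std::string>;

// Dereferencing an iterator yields a reference to the stored pair.
static_assert(std::is_same<decltype(*std::declval<Map::iterator>()),
                           std::pair<const int, std::string>&>::value,
              "*iterator is pair<const K, V>&");
static_assert(std::is_same<decltype(*std::declval<Map::const_iterator>()),
                           const std::pair<const int, std::string>&>::value,
              "*const_iterator is const pair<const K, V>&");

// Modifies the first entry through the reference and formats it.
inline std::string describe_first(Map& m) {
    assert(!m.empty());
    Map::iterator it = m.begin();
    std::pair<const int, std::string>& entry = *it;  // refers to the element, no copy
    entry.second += "!";                               // the mapped value is mutable, the key is not
    return std::to_string(it->first) + ":" + it->second;
}
```

Writing std::pair<const int, std::string> b = it; fails because the iterator type is unrelated to the pair type. Only the result of *it converts.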
The answer to your question depends on how the iterator is implemented. As long as the type supports the operations required to be an iterator, the implementation is not something you should be concerned about. In the case of std::map's iterators, chances are very good that they're in fact structs, not simply pointers. The reason is that typically maps are implemented as balanced trees, and so the iterators would need to know how to traverse the tree in-order. This requires a bit of book-keeping (parent nodes, maybe?)

Tree-container iterator interface

I am making my own STL-like container - tree with any number of children
template<typename Type>
class Node
{
Type value;
Iterator AddChild(const Type & value);
void Remove(const Iterator & where);
...
};
I decided that the iterator's operator* should return the value of the current node, but what should operator-> return? Currently it returns Node<Type>*, which is very useful in situations like this:
Node<int>::Iterator it = tree.begin();
it->AddChild(4);
But my mentor told me that operator-> should return Type*. What is the STL-like way to access Node's methods? Something like it.Ref().MyMethod() doesn't look good.
Your mentor is right, the return type of operator->() should be Type*.
The big idea behind iterators is that they are just smart pointers to some locations inside the container. You can alter the value stored at that location (by assigning to *it) or access its members, but to do more drastic (i.e. structural) changes to container contents, you have to have direct access to the container itself.
That's why in the STL there are no node methods. Instead, there are container methods (and also algorithms) that accept iterators as arguments.
In other words, the STL way of doing what you're trying to do is:
Node<int>::Iterator it = tree.begin();
tree.AddChild(it, 4);
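A rough sketch of that layout, under the assumption that the tree owns its nodes through std::unique_ptr. Tree, its nested Iterator, ChildCount and add_child_example are hypothetical names for this illustration. The iterator only exposes the stored value, and structural changes go through the container.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

template <typename Type>
class Tree {
    struct Node {
        Type value;
        std::vector<std::unique_ptr<Node>> children;
        explicit Node(const Type& v) : value(v) {}
    };

public:
    class Iterator {
    public:
        Type& operator*() const { return node_->value; }
        Type* operator->() const { return &node_->value; }  // pointer to the same object

    private:
        friend class Tree;  // only the container may look at the node
        explicit Iterator(Node* node) : node_(node) {}
        Node* node_;
    };

    explicit Tree(const Type& root_value) : root_(new Node(root_value)) {}

    Iterator root() const { return Iterator(root_.get()); }

    // Structural change: a container method that takes the position as an argument.
    Iterator AddChild(Iterator where, const Type& value) {
        where.node_->children.push_back(std::unique_ptr<Node>(new Node(value)));
        return Iterator(where.node_->children.back().get());
    }

    std::size_t ChildCount(Iterator where) const { return where.node_->children.size(); }

private:
    std::unique_ptr<Node> root_;
};

// Adds one child below the root and returns the value stored in it.
inline int add_child_example() {
    Tree<int> tree(1);
    Tree<int>::Iterator it = tree.root();
    Tree<int>::Iterator child = tree.AddChild(it, 4);
    assert(tree.ChildCount(it) == 1);
    return *child;
}
```

Because Node is private, user code cannot call node methods through the iterator at all, which keeps operator* and operator-> consistent with each other.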
operator-> should return YourTree::value_type* and operator* should return YourTree::value_type&. (Actually a YourTree::pointer and YourTree::reference, but these are normally just aliases for * and & of the value type). Note the consistency. Without it, the standard algorithms will not work.
It is up to you to decide what the value_type is. It could well be Node if you want. This however can be confusing and hard to implement consistently. I would leave it as Type.
The programmer expects it->method to be equivalent to (*it).method, so operator-> should return a pointer to the same thing that operator* returns a reference to. Normally that should be the value of the iterator, because that's the expected way to get at the value.
You can expose methods of the node as methods of the iterator, i.e. called as it.method, but it's somewhat confusing, and in most cases it would require extra data in the iterator compared to methods of the container taking an iterator as argument. That's why the STL always uses methods of the container taking an iterator. E.g. container.insert(iterator, value) inserts value before iterator.
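A small sketch of the container-takes-iterator style with std::vector. Note that for the standard containers insert places the new element in front of the given position. The function name insert_example is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Inserts 15 in front of the second element of {10, 20, 30}.
inline std::vector<int> insert_example() {
    std::vector<int> v = {10, 20, 30};
    std::vector<int>::iterator pos = v.begin() + 1;           // refers to 20
    std::vector<int>::iterator inserted = v.insert(pos, 15);  // 15 lands in front of 20
    // pos may have been invalidated by the insertion; use the returned iterator instead.
    assert(*inserted == 15);
    return v;
}
```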

Strange usage of pointer and method

if (mySharedState -> liveIPs -> find(flowStats -> destinationIP) != mySharedState -> liveIPs -> end() ){
//do something
}
unordered_map <uint32_t, std::string> *liveIPs;
I have never seen such a usage (of find(...) and end()). Could somebody explain what it returns?
(this is c++ code by the way)
You use this technique to check if the container contains that value.
find() returns an iterator to the element with that key; end() returns an iterator one past the end of the container, which find() also returns to signal "not found".
Functions find(value) and end() are member functions of classes called "containers" used to store elements of various types (list, set, vector, map...). There is more info on containers here.
Both member functions return an iterator (kind of a pointer) to the container element. You can read about the iterators here.
Abstractly speaking, find(value) will give you the position of the element in a container that is equal to the value. And end() will return an iterator pointing to the end of the container (the position behind the last element).
So in your case:
// from mySharedState get liveIPs (a container storing IPs)
// and find the element whose key is destinationIP
mySharedState->liveIPs->find(flowStats->destinationIP)
// check whether the iterator returned by find(flowStats->destinationIP) is different
// from the end of the liveIPs container
!= mySharedState->liveIPs->end()
So, "//do something" will be executed if the container liveIPs holds an element whose key is destinationIP.
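The same idiom as a self-contained sketch. The function label_for and its fallback parameter are hypothetical; the pointer-to-unordered_map parameter mirrors the liveIPs declaration from the question.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Returns the string stored for `ip`, or `fallback` when the key is absent.
inline std::string label_for(const std::unordered_map<std::uint32_t, std::string>* liveIPs,
                             std::uint32_t ip, const std::string& fallback) {
    assert(liveIPs != nullptr);
    auto it = liveIPs->find(ip);  // iterator to the entry, or end() if there is none
    if (it != liveIPs->end()) {
        return it->second;        // safe: it refers to a real element
    }
    return fallback;
}
```

Keeping the iterator returned by find() avoids a second lookup when the stored value is needed after the check.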
Since find(value) and end() are usually member functions of a container, I think that the code snippet you are showing is a part of the definition of a member function of a STL conformant container (maybe some user defined container that conforms to the STL container interface, providing find(value) and end() as member functions).

vector::erase and reverse_iterator

I have a collection of elements in a std::vector that are sorted in a descending order starting from the first element. I have to use a vector because I need to have the elements in a contiguous chunk of memory. And I have a collection holding many instances of vectors with the described characteristics (always sorted in a descending order).
Now, sometimes, when I find out that I have too many elements in the greater collection (the one that holds these vectors), I discard the smallest elements from these vectors some way similar to this pseudo-code:
grand_collection: collection that holds these vectors
T: type argument of my vector
C: the type that is a member of T, that participates in the < comparison (this is what sorts data before they hit any of the vectors).
std::map<C, std::pair<T::const_reverse_iterator, std::vector<T>&>> what_to_delete;
iterate(it = grand_collection.begin() -> grand_collection.end())
{
iterate(vect_rit = it->rbegin() -> it->rend())
{
// ...
what_to_delete <- (vect_rit->C, pair(vect_rit, *it))
if (what_to_delete.size() > threshold)
what_to_delete.erase(what_to_delete.begin());
// ...
}
}
Now, after running this code, in what_to_delete I have a collection of iterators pointing to the original vectors that I want to remove from these vectors (overall smallest values). Remember, the original vectors are sorted before they hit this code, which means that for any what_to_delete[0 - n] there is no way that an iterator on position n - m would point to an element further from the beginning of the same vector than n, where m > 0.
When erasing elements from the original vectors, I have to convert a reverse_iterator to iterator. To do this, I rely on C++11's §24.4.1/1:
The relationship between reverse_iterator and iterator is
&*(reverse_iterator(i)) == &*(i - 1)
Which means that to delete a vect_rit, I use:
vector.erase(--vect_rit.base());
Now, according to C++11 standard §23.3.6.5/3:
iterator erase(const_iterator position); Effects: Invalidates
iterators and references at or after the point of the erase.
How does this work with reverse_iterators? Are reverse_iterators internally implemented with a reference to a vector's real beginning (vector[0]) and transforming that vect_rit to a classic iterator so then erasing would be safe? Or does reverse_iterator use rbegin() (which is vector[vector.size()]) as a reference point and deleting anything that is further from vector's 0-index would still invalidate my reverse iterator?
Edit:
Looks like reverse_iterator uses rbegin() as its reference point. Erasing elements the way I described was giving me errors about a non-dereferenceable iterator after the first element was deleted. Whereas storing classic iterators (converting to const_iterator) while inserting to what_to_delete worked correctly.
Now, for future reference, does The Standard specify what should be treated as a reference point in case of a random-access reverse_iterator? Or is this an implementation detail?
Thanks!
In the question you have already quoted exactly what the standard says a reverse_iterator is:
The relationship between reverse_iterator and iterator is &*(reverse_iterator(i)) == &*(i - 1)
Remember that a reverse_iterator is just an 'adaptor' on top of the underlying iterator (reverse_iterator::current). The 'reference point', as you put it, for a reverse_iterator is that wrapped iterator, current. All operations on the reverse_iterator really occur on that underlying iterator. You can obtain that iterator using the reverse_iterator::base() function.
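The relationship can be checked directly on a small vector. This is only an illustration; the function name base_relation_holds is made up.

```cpp
#include <cassert>
#include <vector>

// Verifies that a reverse_iterator refers to the element just before its base().
inline bool base_relation_holds() {
    std::vector<int> v = {5, 4, 3, 2, 1};
    std::vector<int>::reverse_iterator rit = v.rbegin() + 1;  // refers to 2
    assert(rit != v.rend());
    std::vector<int>::iterator fwd = rit.base();              // refers to 1, one position later
    return *rit == 2 && *fwd == 1 && &*rit == &*(fwd - 1);
}
```

So the forward iterator for the element that rit refers to is rit.base() - 1, not rit.base().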
If you erase --vect_rit.base(), you are in effect erasing --current, so current will be invalidated.
As a side note, the expression --vect_rit.base() might not always compile. If the iterator is actually just a raw pointer (as might be the case for a vector), then vect_rit.base() returns an rvalue (a prvalue in C++11 terms), so the pre-decrement operator won't work on it since that operator needs a modifiable lvalue. See "Item 28: Understand how to use a reverse_iterator's base iterator" in "Effective STL" by Scott Meyers. (an early version of the item can be found online in "Guideline 3" of http://www.drdobbs.com/three-guidelines-for-effective-iterator/184401406).
You can use the even uglier expression, (++vect_rit).base(), to avoid that problem. Or since you're dealing with a vector and random access iterators: vect_rit.base() - 1
Either way, vect_rit is invalidated by the erase because vect_rit.current is invalidated.
However, remember that vector::erase() returns a valid iterator to the new location of the element that followed the one that was just erased. You can use that to 're-synchronize' vect_rit:
vect_rit = vector_type::reverse_iterator( vector.erase(vect_rit.base() - 1));
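A sketch of that re-synchronization inside a loop, assuming a std::vector<int> and a made-up erase condition (odd values). The function name erase_odd_in_reverse is hypothetical.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Walks the vector from back to front and erases every odd value.
inline std::vector<int> erase_odd_in_reverse(std::vector<int> v) {
    std::vector<int>::reverse_iterator rit = v.rbegin();
    while (rit != v.rend()) {
        if (*rit % 2 != 0) {
            assert(rit.base() != v.begin());
            // base() - 1 is the forward iterator to the element rit refers to.
            // erase() returns an iterator to the element that followed it; wrapping
            // that in a reverse_iterator makes rit refer to the element that
            // preceded the erased one, which is the next one to visit.
            rit = std::vector<int>::reverse_iterator(v.erase(rit.base() - 1));
        } else {
            ++rit;
        }
    }
    return v;
}
```

The old rit is never used after the erase, so it does not matter that the erase invalidated it.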
From a standardese point of view (and I'll admit, I'm not an expert on the standard): From §24.5.1.1:
namespace std {
template <class Iterator>
class reverse_iterator ...
{
...
Iterator base() const; // explicit
...
protected:
Iterator current;
...
};
}
And from §24.5.1.3.3:
Iterator base() const; // explicit
Returns: current.
Thus it seems to me that so long as you don't erase anything in the vector before what one of your reverse_iterators points to, said reverse_iterator should remain valid.
Of course, given your description, there is one catch: if you have two contiguous elements in your vector that you end up wanting to delete, the fact that you call vector.erase(--vect_rit.base()) means that you've invalidated the reverse_iterator "pointing" to the immediately preceding element, and so your next vector.erase(...) is undefined behavior.
Just in case that's clear as mud, let me say that differently:
std::vector<T> v=...;
...
// it_1 and it_2 are contiguous: it_1 refers to the last element,
// it_2 to the element just before it (v needs at least two elements)
std::vector<T>::reverse_iterator it_1=v.rbegin();
std::vector<T>::reverse_iterator it_2=it_1;
++it_2;
// Erase it_1's pointee:
// convert from reverse_iterator to iterator
std::vector<T>::iterator tmp_it=it_1.base();
// but that points one too far in, so decrement;
--tmp_it;
// of course, now tmp_it points at it_2's base:
assert(tmp_it == it_2.base());
// perform erasure
v.erase(tmp_it); // invalidates all iterators pointing at or past *tmp_it
// (like, say it_2.base()...)
// now delete it_2's pointee:
std::vector<T>::iterator tmp_it_2=it_2.base(); // note, invalid iterator!
// undefined behavior:
--tmp_it_2;
v.erase(tmp_it_2);
In practice, I suspect that you'll run into two possible implementations: more commonly, the underlying iterator will be little more than a (suitably wrapped) raw pointer, and so everything will work perfectly happily. Less commonly, the iterator might actually try to track invalidations/perform bounds checking (didn't Dinkumware STL do such things when compiled in debug mode at one point?), and just might yell at you.
The reverse_iterator, just like the normal iterator, points to a certain position in the vector. Implementation details are irrelevant, but if you must know, they both are (in a typical implementation) just plain old pointers inside. The difference is the direction. The reverse iterator has its + and - reversed w.r.t. the regular iterator (and also ++ and --, > and < etc).
This is interesting to know, but doesn't really imply an answer to the main question.
If you read the language carefully, it says:
Invalidates iterators and references at or after the point of the erase.
References do not have a built-in sense of direction. Hence, the language clearly refers to the container's own sense of direction. Positions after the point of the erase are those with higher indices. Hence, the iterator's direction is irrelevant here.