Retrieving data from vectors and unordered_sets - c++

I've started to put data into vectors and unordered_sets. I've worked out fairly easily how to put the data in, and I know how to get the data out if I need to unload all the data, for example:
for (auto i : vehicles)
    MakeSpawnInfoVehicle(i.AddedInformation);
However, I've reached a point where I want just one element from either the unordered_set or the vector, such as whatever the 10th entry is in the vector or the unordered_set.
If someone could provide a basic example of both, I'm sure I'll understand it.

In situations like this, you can consult a good reference.
std::vector provides operator[] with constant-time performance:
auto tenth_element = vehicle_vector[9];
std::unordered_set is optimised for element-based lookup (that is, testing whether an element is present or not). It is rare you would need the 10th element (especially since the set is, by definition, unordered), but if you do, you take an iterator, increment it and dereference it:
auto tenth_element = *std::next(vehicle_set.begin(), 9);
In general, you access the elements of a container either through the container's member functions (such as the vector's operator [] above), or through an iterator.
An iterator is a generalisation of a pointer - it's some unspecified type which points to an element in the container. You can then work with iterators using their member functions like operator ++ or free functions like std::next(). Random-access iterators also support []. To get the element to which the iterator "points," dereference the iterator like *it or it->whatever.
You obtain an iterator to the beginning of the container using begin() and an iterator to one past the last element using end(). Containers can also provide other member functions to get an iterator to an element - such as vehicle_set.find(aVehicle), which returns an iterator to aVehicle if it's present in the set, or the end() iterator if it's not.
Which member functions a container offers depends on which container it is and in particular how efficient the operations are. std::vector doesn't provide find(), because it would be no better than std::find() - there is no structure in std::vector for fast lookup. std::unordered_set does provide find(), but doesn't have operator [], because its iterators are not random-access: accessing the nth element of an unordered set requires time proportional to n.
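The three access styles described in this answer can be put side by side. This is only a minimal sketch: the Vehicle alias and the function names are made up for illustration, since the question does not show its element type.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical element type; the question's vehicle type is not shown.
using Vehicle = std::string;

// Element at zero-based position n in a vector (constant time).
// at() throws std::out_of_range for a bad index; operator[] does no check.
Vehicle nth_in_vector(const std::vector<Vehicle>& v, std::size_t n)
{
    return v.at(n);
}

// Element at zero-based position n in iteration order (linear time).
// The order is unspecified, so "nth" only means "the nth one visited".
Vehicle nth_in_set(const std::unordered_set<Vehicle>& s, std::size_t n)
{
    assert(n < s.size()); // stepping past end() would be undefined
    return *std::next(s.begin(), static_cast<std::ptrdiff_t>(n));
}

// The question a set is built to answer: is this element present?
bool contains(const std::unordered_set<Vehicle>& s, const Vehicle& v)
{
    return s.find(v) != s.end();
}
```

Note that nth_in_set returns some element of the set, but which one depends on the implementation's hashing, so code should not rely on a particular answer.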

Related

Understanding Iterators in the STL

What exactly are iterators in the C++ STL?
In my case, I'm using a list, and I don't understand why you have to make an iterator std::list<int>::const_iterator iElementLocator; to display the contents of the list with the dereference operator:
cout << *iElementLocator; after assigning it to maybe list.begin().
Please explain what exactly an iterator is and why I have to dereference or use it.
There are three building blocks in the STL:
Containers
Algorithms
Iterators
At the conceptual level containers hold data. That by itself isn't very useful, because you want to do something with the data; you want to operate on it, manipulate it, query it, play with it. Algorithms do exactly that. But algorithms don't hold data, they have no data -- they need a container for this task. Give a container to an algorithm and you have an action going on.
The only problem left to solve is how does an algorithm traverse a container, from a technical point of view. Technically a container can be a linked list, or it can be an array, or a binary tree, or any other data structure that can hold data. But traversing an array is done differently than traversing a binary tree. Even though conceptually all an algorithm wants is to "get" one element at a time from a container, and then work on that element, the operation of getting the next element from a container is technically very container-specific.
It appears as if you'd need to write the same algorithm for each container, so that each version of the algorithm has the correct code for traversing the container. But there's a better solution: ask the container to return an object that can traverse over the container. The object would have an interface algorithms know. When an algorithm asks the object to "get the next element" the object would comply. Because the object came directly from the container it knows how to access the container's data. And because the object has an interface the algorithm knows, we need not duplicate an algorithm for each container.
This is the iterator.
The iterator here glues the algorithm to the container, without coupling the two. An iterator is coupled to a container, and an algorithm is coupled to the iterator's interface. The source of the magic here is really template programming. Consider the standard copy() algorithm:
template<class In, class Out>
Out copy(In first, In last, Out res)
{
    while( first != last ) {
        *res = *first;
        ++first;
        ++res;
    }
    return res;
}
The copy() algorithm takes as parameters two iterators templated on the type In and one iterator of type Out. It copies the elements starting at position first and ending just before position last, into res. The algorithm knows that to get the next element it needs to say ++first or ++res. It knows that to read an element it needs to say x = *first and to write an element it needs to say *res = x. That's part of the interface algorithms assume and iterators commit to. If by mistake an iterator doesn't comply with the interface then the compiler would emit an error for calling a function over type In or Out, when the type doesn't define the function.
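To see the decoupling in practice, here is a small sketch that feeds two different kinds of iterator to the standard std::copy. The function names are invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Copies a linked list into a vector with the same std::copy that
// would be used for any other pair of iterator types.
std::vector<int> list_to_vector(const std::list<int>& src)
{
    std::vector<int> dst;
    // std::back_inserter makes an output iterator whose "*res = x" appends to dst.
    std::copy(src.begin(), src.end(), std::back_inserter(dst));
    assert(dst.size() == src.size());
    return dst;
}

// Plain pointers satisfy the same interface, so they work as iterators too.
std::list<int> array_to_list(const int* first, const int* last)
{
    std::list<int> dst;
    std::copy(first, last, std::back_inserter(dst));
    return dst;
}
```

Neither function contains any list-specific or array-specific traversal code; the iterators carry that knowledge.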
I'm being lazy, so I won't type out a description of what an iterator is and how it's used, especially when there are already lots of articles online that you can read yourself.
Here are a few that I can quote for a start, providing links to the complete articles:
MSDN says,
Iterators are a generalization of
pointers, abstracting from their
requirements in a way that allows a
C++ program to work with different
data structures in a uniform manner.
Iterators act as intermediaries
between containers and generic
algorithms. Instead of operating on
specific data types, algorithms are
defined to operate on a range
specified by a type of iterator. Any
data structure that satisfies the
requirements of the iterator may then
be operated on by the algorithm. There
are five types or categories of
iterator [...]
By the way, it seems the MSDN has taken the text in bold from C++ Standard itself, specifically from the section §24.1/1 which says
Iterators are a generalization of
pointers that allow a C++ program to
work with different data structures
(containers) in a uniform manner. To
be able to construct template
algorithms that work correctly and
efficiently on different types of data
structures, the library formalizes not
just the interfaces but also the
semantics and complexity assumptions
of iterators. All iterators i support
the expression *i, resulting in a
value of some class, enumeration, or
built-in type T, called the value type
of the iterator. All iterators i for
which the expression (*i).m is
well-defined, support the expression
i->m with the same semantics as
(*i).m. For every iterator type X for
which equality is defined, there is a
corresponding signed integral type
called the difference type of the
iterator.
cplusplus says,
In C++, an iterator is any object
that, pointing to some element in a
range of elements (such as an array or
a container), has the ability to
iterate through the elements of that
range using a set of operators (at
least, the increment (++) and
dereference (*) operators).
The most obvious form of iterator is a
pointer [...]
And you can also read these:
What Is an Iterator?
Iterators in the Standard C++ Library
Iterator (at wiki entry)
Have patience and read all these. Hopefully, you will have some idea what an iterator is, in C++. Learning C++ requires patience and time.
An iterator is not the same as the container itself. The iterator refers to a single item in the container, as well as providing ways to reach other items.
Consider designing your own container without iterators. It could have a size function to obtain the number of items it contains, and could overload the [] operator to allow you to get or set an item by its position.
But "random access" of that kind is not easy to implement efficiently on some kinds of container. If you obtain the millionth item: c[1000000] and the container internally uses a linked list, it will have to scan through a million items to find the one you want.
You might instead decide to allow the collection to remember a "current" item. It could have functions like start and more and next to allow you to loop through the contents:
c.start();
while (c.more())
{
    item_t item = c.next();
    // use the item somehow
}
But this puts the "iteration state" inside the container. This is a serious limitation. What if you wanted to compare each item in the container with every other item? That requires two nested loops, both iterating through all the items. If the container itself stores the position of the iteration, you have no way to nest two such iterations - the inner loop will destroy the working of the outer loop.
So iterators are an independent copy of an iteration state. You can begin an iteration:
container_t::iterator i = c.begin();
That iterator, i, is a separate object that represents a position within the container. You can fetch whatever is stored at that position:
item_t item = *i;
You can move to the next item:
i++;
With some iterators you can skip forward several items:
i += 1000;
Or obtain an item at some position relative to the position identified by the iterator:
item_t item = i[1000];
And with some iterators you can move backwards.
And you can discover if you've reached beyond the contents of the container by comparing the iterator to end:
while (i != c.end())
You can think of end as returning an iterator that represents a position that is one beyond the last position in the container.
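Putting these pieces together, the nested comparison that was impossible with container-held state might look like the following sketch. The function name and the choice of std::list<int> are assumptions made for the example.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Counts pairs of positions (i, j), with i before j, that hold equal values.
// Each loop owns its iterator, so the inner loop cannot disturb the outer one.
int count_equal_pairs(const std::list<int>& c)
{
    int pairs = 0;
    for (std::list<int>::const_iterator i = c.begin(); i != c.end(); ++i) {
        // Start the inner iterator one position after the outer one.
        for (std::list<int>::const_iterator j = std::next(i); j != c.end(); ++j) {
            if (*i == *j)
                ++pairs;
        }
    }
    return pairs;
}

// Usage: the three 1s form three pairs.
void example_nested_loops()
{
    std::list<int> c{1, 2, 1, 1};
    assert(count_equal_pairs(c) == 3);
}
```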
An important point to be aware of with iterators (and in C++ generally) is that they can become invalid. This happens, for example, if you empty a container: any iterators pointing to positions in that container have now become invalid. In that state, most operations on them are undefined - anything could happen!
An iterator is to an STL container what a pointer is to an array. You can think of them as pointer objects to STL containers. As pointers, you will be able to use them with the pointer notation (e.g. *iElementLocator, iElementLocator++). As objects, they will have their own attributes and methods (http://www.cplusplus.com/reference/std/iterator).
There already exists a lot of good explanations of iterators. Just google it.
One example.
If there is something specific you don't understand come back and ask.
I'd suggest reading about operator overloading in C++ first. This will explain why * and -> can mean essentially anything. Only then should you read about iterators. Otherwise they might appear very confusing.
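As a rough illustration of that point, here is a toy class that overloads *, ++ and != so that a generic algorithm can treat it like a pointer. CountingIterator and sum are hypothetical names, not part of the standard library.

```cpp
#include <cassert>

// A hypothetical iterator over the integers value, value+1, ...
// There is no container behind it: * and ++ are ordinary functions
// that the class defines to mean whatever it likes.
class CountingIterator {
public:
    explicit CountingIterator(int value) : value_(value) {}

    int operator*() const { return value_; }                    // "dereference"
    CountingIterator& operator++() { ++value_; return *this; }  // "advance"
    bool operator==(const CountingIterator& o) const { return value_ == o.value_; }
    bool operator!=(const CountingIterator& o) const { return value_ != o.value_; }

private:
    int value_;
};

// A generic algorithm in the style of the STL: it relies only on
// !=, ++ and *, so it accepts pointers and CountingIterator alike.
template <class In>
int sum(In first, In last)
{
    int total = 0;
    for (; first != last; ++first)
        total += *first;
    return total;
}

void example_counting()
{
    assert(sum(CountingIterator(1), CountingIterator(5)) == 10); // 1+2+3+4
    int raw[] = {1, 2, 3};
    assert(sum(raw, raw + 3) == 6);
}
```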

Iterator equivalent to null pointer?

In an algorithm I'm currently implementing, I need to manipulate a std::list of struct T.
T holds a reference to another instance of T, but this reference can also be "unassigned".
At first, I wanted to use a pointer to hold this reference, but using an iterator instead makes it easier to remove from the list.
My question is : how to represent the equivalent to null pointer with my iterator?
I read that the general solution is to use myList.end(), but in my case, I need to test whether the iterator is "null" or not, and I may add or remove elements to the list between the moment when I store the iterator and the moment I remove it from the list... Should I make the iterator point to a known list containing the "null" element? Or is there a more elegant solution?
According to this (emphasis by me):
Compared to the other base sequence
containers (vector and deque), lists
are the most efficient container doing
insertions at some position other than
the beginning or the end of the
sequence, and, unlike in these, all of
the previously obtained iterators and
references remain valid after the
insertion and refer to the same
elements they were referring before.
The same applies to erasure (with the obvious exception of iterators referring to a deleted element becoming invalidated). So yes, obtaining end() will always point to the same "invalid" element and should be safe to use.
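A small sketch of using end() as the "null" value follows. It uses std::list<int> instead of the question's struct T, and the helper name is_unassigned is made up for this example.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Returns true if `it` is the "null" value for `lst`, i.e. its end().
bool is_unassigned(const std::list<int>& lst, std::list<int>::const_iterator it)
{
    return it == lst.end();
}

void example_null_link()
{
    std::list<int> items{10, 20, 30};
    std::list<int>::iterator null_link = items.end();          // "unassigned"
    std::list<int>::iterator link = std::next(items.begin());  // refers to 20

    // Modify the list between storing the iterators and testing them.
    items.push_front(5);
    items.push_back(40);
    items.erase(items.begin()); // erases 5

    assert(is_unassigned(items, null_link)); // still the end position
    assert(!is_unassigned(items, link));
    assert(*link == 20);                      // still the same element

    // After erasing the linked element the old iterator is invalid,
    // so reset it to the "null" value straight away.
    items.erase(link);
    link = items.end();
    assert(is_unassigned(items, link));
}
```

The one thing the caller has to manage is the last step: an iterator to an erased element must be reset before it is tested again.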

C++ reverse_iterator Alternatives

I am attempting to write a two-pass algorithm incorporating some legacy code. I want to move through a particular container twice, once in order and once in reverse order. Obviously, my first thought was to use an iterator and a reverse_iterator, but strangely the designers of the container class I'm using did not see fit to define a working reverse_iterator for the container (reverse_iterators here cannot be dereferenced like iterators can be). I already have an algorithm that requires a reverse_iterator.
My thought is to use the first pass iterator for the first part of the algorithm, and as I perform the algorithm push_front the items into a new container, then iterate through the new container. This will take up memory, which isn't critical in my application, but made me wonder: are there any cleaner alternatives to reverse_iterators in C++, or should I take the time to rework my algorithm using only forward iterators?
If you need to iterate over the elements of a container in reverse order, you don't necessarily need to use a reverse iterator.
If the container has bidirectional iterators, then you can use ordinary iterators and use --it to iterate from end() to begin() instead of using ++it to iterate from begin() to end().
Since this is a bit tricky, you can use the std::reverse_iterator wrapper to convert an ordinary iterator into a reverse iterator (this basically swaps ++ and -- and encapsulates the trickery required to get this to work).
If the container doesn't have bidirectional iterators then that means it's impossible to iterate over the elements of the container in reverse order, in which case you would need either to rewrite your algorithm or to use a different container.
Any container that has bidirectional iterators should provide reverse iterator functionality; this is part of the STL and C++ Standard Library "Container" concept.
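Both approaches can be sketched as follows, assuming the legacy container's ordinary iterators are bidirectional. std::list<int> stands in for that container here, and the function names are invented for the example.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Visits [first, last) back to front using only -- on ordinary
// bidirectional iterators. Decrement before dereferencing, because
// last is one past the final element.
template <class BidirIt>
std::vector<int> backwards_by_decrement(BidirIt first, BidirIt last)
{
    std::vector<int> out;
    while (last != first) {
        --last;
        out.push_back(*last);
    }
    return out;
}

// The same traversal with the std::reverse_iterator adaptor wrapped
// around the ordinary iterators.
template <class BidirIt>
std::vector<int> backwards_by_adaptor(BidirIt first, BidirIt last)
{
    std::reverse_iterator<BidirIt> rfirst(last);
    std::reverse_iterator<BidirIt> rlast(first);
    return std::vector<int>(rfirst, rlast);
}

void example_reverse()
{
    std::list<int> c{1, 2, 3};
    std::vector<int> expected{3, 2, 1};
    assert(backwards_by_decrement(c.begin(), c.end()) == expected);
    assert(backwards_by_adaptor(c.begin(), c.end()) == expected);
}
```

The adaptor version produces iterators that can be handed to an existing algorithm that expects reverse iterators, which may avoid reworking that algorithm.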

Does a vector sort invalidate iterators?

std::vector<string> names;
std::vector<string>::iterator start = names.begin();
std::vector<string>::iterator end = names.end();
sort (start,end);
//are my start and end valid at this point?
//or they do not point to front and tail resp?
According to the C++ Standard §23.1/11:
Unless otherwise specified (either explicitly or by defining a function in terms of other functions), invoking
a container member function or passing a container as an argument to a library function shall not invalidate
iterators to, or change the values of, objects within that container.
§25.3 "Sorting and related operations" doesn't specify that iterators will be invalidated, so iterators in the question should stay valid.
They still point to the beginning and end. The values in those slots of the vector have probably changed, but the storage location in which each resides remains the same.
std::sort will not invalidate iterators to a vector. The sort template uses the * operator on the iterators to access and modify the contents of the vector, and modifying a vector element through an iterator to an element already in the vector will not invalidate any iterators.
In summary,
your existing iterators will not be invalidated
however, the elements they point to may have been modified
In addition to the support for the standard provided by Kirill V. Lyadvinsky (Does a vector sort invalidate iterators?):
25/5 "Algorithms library"
If an algorithm’s Effects section says
that a value pointed to by any
iterator passed as an argument is
modified, then that algorithm has an
additional type requirement: The type
of that argument shall satisfy the
requirements of a mutable iterator
(24.1).
24.1/4 "Iterator requirements"
Besides its category, a forward,
bidirectional, or random access
iterator can also be mutable or
constant depending on whether the
result of the expression *i behaves as
a reference or as a reference to a
constant.
std::vector keeps its elements in contiguous memory. std::sort takes arguments (iterators) by value and re-arranges the sequence between them. The net result is your local variables start and end are still pointing to first and one-past-the-last elements of the vector.
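The conclusion of these answers can be checked with a short sketch built around the question's own snippet. The function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Sorts names in place and reports whether iterators taken before the
// sort still denote the first and one-past-the-last positions afterwards.
bool iterators_survive_sort(std::vector<std::string>& names)
{
    std::vector<std::string>::iterator start = names.begin();
    std::vector<std::string>::iterator end = names.end();

    std::sort(start, end); // rearranges values; does not resize or reallocate

    // The old iterators are still usable as a range over the sorted data.
    assert(std::is_sorted(start, end));
    return start == names.begin() && end == names.end();
}
```

What changes is the value found through start: after the sort it is the smallest string, not whatever was first before.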

Is end() required to be constant in an STL map/set?

§23.1.2.8 in the standard states that insertion/deletion operations on a set/map will not invalidate any iterators to those objects (except iterators pointing to a deleted element).
Now, consider the following situation: you want to implement a graph with uniquely numbered nodes, where every node has a fixed number (let's say 4) of neighbors. Taking advantage of the above rule, you do it like this:
class Node {
private:
    // iterators to neighboring nodes
    std::map<int, Node>::iterator neighbors[4];
    friend class Graph;
};
class Graph {
private:
    std::map<int, Node> nodes;
};
(EDIT: Not literally like this due to the incompleteness of Node in line 4 (see responses/comments), but along these lines anyway)
This is good, because this way you can insert and delete nodes without invalidating the consistency of the structure (assuming you keep track of deletions and remove the deleted iterator from every node's array).
But let's say you also want to be able to store an "invalid" or "nonexistent" neighbor value. Not to worry, we can just use nodes.end()... or can we? Is there some sort of guarantee that nodes.end() at 8 AM will be the same as nodes.end() at 10 PM after a zillion insertions/deletions? That is, can I safely == compare an iterator received as a parameter to nodes.end() in some method of Graph?
And if not, would this work?
class Graph {
private:
    std::map<int, Node> nodes;
    std::map<int, Node>::iterator _INVALID;
public:
    Graph() { _INVALID = nodes.end(); }
};
That is, can I store nodes.end() in a variable upon construction, and then use this variable whenever I want to set a neighbor to invalid state, or to compare it against a parameter in a method? Or is it possible that somewhere down the line a valid iterator pointing to an existing object will compare equal to _INVALID?
And if this doesn't work either, what can I do to leave room for an invalid neighbor value?
You write (emphasis by me):
§23.1.2.8 in the standard states that insertion/deletion operations on a set/map will not invalidate any iterators to those objects (except iterators pointing to a deleted element).
Actually, the text of 23.1.2/8 is a bit different (again, emphasis by me):
The insert members shall not affect the validity of iterators and references to the container, and the erase members shall invalidate only iterators and references to the erased elements.
I read this as: If you have a map, and somehow obtain an iterator into this map (again: it doesn't say to an object in the map), this iterator will stay valid despite insertion and removal of elements. Assuming std::map<K,V>::end() obtains an "iterator into the map", it should not be invalidated by insertion/removal.
This, of course, leaves the question whether "not invalidated" means it will always have the same value. My personal assumption is that this is not specified. However, in order for the "not invalidated" phrase to make sense, all results of std::map<K,V>::end() for the same map must always compare equal even in the face of insertions/removal:
my_map_t::iterator old_end = my_map.end();
// wildly change my_map
assert( old_end == my_map.end() );
My interpretation is that, if old_end remains "valid" throughout changes to the map (as the standard promises), then that assertion should pass.
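A runnable version of that check might look like the sketch below. It shows what one implementation does and is not a proof of what the standard guarantees; the function name and the std::map<int, int> element type are assumptions for the example.

```cpp
#include <cassert>
#include <map>

// Stores end() first, then "wildly changes" the map, and reports whether
// the stored iterator still compares equal to a freshly obtained end().
bool end_is_stable(std::map<int, int>& m)
{
    std::map<int, int>::iterator old_end = m.end();

    for (int i = 0; i < 1000; ++i)
        m[i] = i * i;          // insertions (or overwrites of existing keys)
    for (int i = 0; i < 1000; i += 2)
        m.erase(i);            // erasures of existing elements

    // A failed lookup is reported with the current end().
    assert(m.find(0) == m.end());
    return old_end == m.end();
}
```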
Disclaimer: I am not a native speaker and have a very hard time digesting that dreaded legalese of the Holy PDF. In fact, in general I avoid it like the plague.
Oh, and my first thought also was: The question is interesting from an academic POV, but why doesn't he simply store keys instead of iterators?
23.1/7 says that end() returns an iterator that
is the past-the-end value for the container.
First, it confirms that what end() returns is the iterator. Second, it says that the iterator doesn't point to a particular element. Since deletion can only invalidate iterators that point somewhere (to the element being deleted), deletions can't invalidate end().
Well, there's nothing preventing particular collection implementation from having end() depend on the instance of collection and time of day, as long as comparisons and such work. Which means, that, perhaps, end() value may change, but old_end == end() comparison should still yield true. (edit: although after reading the comment from j_random_hacker, I doubt this paragraph itself evaluates to true ;-), not universally — see the discussion below )
I also doubt you can use std::map<int,Node>::iterator in the Node class due to the type being incomplete, yet (not sure, though).
Also, since your nodes are uniquely numbered, you can use int for keying them and reserve some value for invalid.
Iterators in (multi)sets and (multi)maps won't be invalidated in insertions and deletions and thus comparing .end() against previous stored values of .end() will always yield true.
Take as an example the GNU libstdc++ implementation, where .end() in maps returns the default-initialized value of Rb_tree_node
From stl_tree.h:
_M_initialize()
{
    this->_M_header._M_color = _S_red;
    this->_M_header._M_parent = 0;
    this->_M_header._M_left = &this->_M_header;
    this->_M_header._M_right = &this->_M_header;
}
Assuming that (1) the map is implemented with a red-black tree and (2) you use the same instance "after a zillion insertions/deletions", the answer is "Yes".
As for implementations, I can tell you that every incarnation of the STL I know of uses the tree algorithm.
A couple points:
1) end() references an element that is past the end of the container. It doesn't change when inserts or deletes change the container because it's not pointing to an element.
2) I think perhaps your idea of storing an array of 4 iterators in the Node could be changed to make the entire problem make more sense. What you want is to add a new iterator type to the Graph object that is capable of iterating over a single node's neighbours. The implementation of this iterator will need to access the members of the map, which possibly leads you down the path of making the Graph class extend the map collection. With the Graph class being an extended std::map, then the language changes, and you no longer need to store an invalid iterator, but instead simply need to write the algorithm to determine who is the 'next neighbour' in the map.
I think it is clear:
end() returns an iterator to the element one past the end.
Insertion/Deletion do not affect existing iterators so the returned values are always valid (unless you try to delete the element one past the end (but that would result in undefined behavior anyway)).
Thus any new iterator generated by end() (would be different but) when compared with the original using operator== would return true.
Also any intermediate values generated using the assignment operator= have a post condition that they compare equal with operator== and operator== is transitive for iterators.
So yes, it is valid to store the iterator returned by end() (but only because of the guarantees with associative containers; therefore it would not be valid for vector etc).
Remember the iterator is not necessarily a pointer. It can potentially be an object where the designer of the container has defined all the operations on the class.
I believe that this depends entirely on what type of iterator is being used.
In a vector, end() is the one past the end pointer and it will obviously change as elements are inserted and removed.
In another kind of container, the end() iterator might be a special value like NULL or a default constructed element. In this case it doesn't change because it doesn't point at anything. Instead of being a pointer-like thing, end() is just a value to compare against.
I believe that set and map iterators are the second kind, but I don't know of anything that requires them to be implemented in that way.
The C++ Standard states that the iterators stay valid, and they do. The Standard clearly states this in 23.1.2/8:
The insert members shall not affect the validity of iterators and references to the container, and the erase members shall invalidate only iterators and references to the erased elements.
And in 23.1/7:
end() returns an iterator which is the past-the-end value for the container.
So the iterators old_end and new_end will both be valid. That means we can compute --old_end (call it it1) and --new_end (call it it2), and both will refer to the last element of the container (from the definition of what end() returns), since the iterator of an associative container is of the bidirectional iterator category (according to 23.1.2/6) and by the definition of the --r operation (Table 75).
Now it1 should be equal to it2, since both refer to the last element, of which there is only one (23.1.2/9). Then from 24.1.3 it follows that a == b implies ++a == ++b. And ++it1 and ++it2 give back old_end and new_end (from the definition of the ++r operation in Table 74). So old_end and new_end must be equal.
I had a similar question recently, but I was wondering if calling end() to retrieve an iterator for comparison purposes could possibly have race conditions.
According to the standard, two iterators are considered equivalent if both can be dereferenced and &*a == &*b or if neither can be dereferenced. Finding the bolded statement took a while and is very relevant here.
Because a std::map::iterator cannot be invalidated unless the element it points to has been removed, you're guaranteed that two iterators returned by end(), regardless of what the state of the map was when they were obtained, will always compare equal to each other.