Check if iterator points to element in container - c++

I am working on some C++ program right now and cannot resolve the above question. I see in forms this has been asked before, but neither answer was satisfying to me.
So, I am working with containers (e.g. a list), additionally I have an array with iterators to this list. I have been doing the following: Initially, I set all iterators in the array as pointing to a dummy list (DummyList.begin()).
To check, if the iterator was initialized I used to check (it != DummyList.begin()), in case needed I would then set the iterator to some element of the real list.
This seems to work.
However, when trying this for other contains (i.e. boost::circular_buffer), I got invalid iterator errors, and upon googling I found e.g. this comparing iterators from different containers, which says that comparing iterators from different containers produces undefined behavior.
That would be scary. Is this still the case?
If yes, then why did my program work so far?
And how would I do this otherwise then?
Edit:
The code looks something like this:
Initialization:
std::list<int> DummyList;
M = new std::list<int>::iterator[n + 1];
for (int i=0;i<=n;++i) {
M[i] = DummyList.begin();
}
Later than the check:
if (M[i] == DummyList.begin())
....

Yes, iterators can only be compared if they belong to the same container. Otherwise, the result is Undefined. Note that "appearing to work" is a valid form of Undefined Behaviour (and usually the worst one, as it hides errors).
What you can do in your case is use something like boost::optional to only store valid iterators in your array:
std::array<boost::optional<MyList::iterator>>, 10> iterators;
if (!iterators[i]) // iterator is not initialised
iterators[i] = myList.begin(); // store valid iterator
else
{
MyList::iterator it = *iterators[i]; // access valid iterator
iterators[i] = boost::none; // make iterator in array invalid again
}

Related

What is the advantage of using (it != vector.end()) instead of (it < vector.end()) in for loops? [duplicate]

I'm used to writing loops like this:
for (std::size_t index = 0; index < foo.size(); index++)
{
// Do stuff with foo[index].
}
But when I see iterator loops in others' code, they look like this:
for (Foo::Iterator iterator = foo.begin(); iterator != foo.end(); iterator++)
{
// Do stuff with *Iterator.
}
I find the iterator != foo.end() to be offputting. It can also be dangerous if iterator is incremented by more than one.
It seems more "correct" to use iterator < foo.end(), but I never see that in real code. Why not?
All iterators are equality comparable. Only random access iterators are relationally comparable. Input iterators, forward iterators, and bidirectional iterators are not relationally comparable.
Thus, the comparison using != is more generic and flexible than the comparison using <.
There are different categories of iterators because not all ranges of elements have the same access properties. For example,
if you have an iterators into an array (a contiguous sequence of elements), it's trivial to relationally compare them; you just have to compare the indices of the pointed to elements (or the pointers to them, since the iterators likely just contain pointers to the elements);
if you have iterators into a linked list and you want to test whether one iterator is "less than" another iterator, you have to walk the nodes of the linked list from the one iterator until either you reach the other iterator or you reach the end of the list.
The rule is that all operations on an iterator should have constant time complexity (or, at a minimum, sublinear time complexity). You can always perform an equality comparison in constant time since you just have to compare whether the iterators point to the same object. So, all iterators are equality comparable.
Further, you aren't allowed to increment an iterator past the end of the range into which it points. So, if you end up in a scenario where it != foo.end() does not do the same thing as it < foo.end(), you already have undefined behavior because you've iterated past the end of the range.
The same is true for pointers into an array: you aren't allowed to increment a pointer beyond one-past-the-end of the array; a program that does so exhibits undefined behavior. (The same is obviously not true for indices, since indices are just integers.)
Some Standard Library implementations (like the Visual C++ Standard Library implementation) have helpful debug code that will raise an assertion when you do something illegal with an iterator like this.
Short answer: Because Iterator is not a number, it's an object.
Longer answer: There are more collections than linear arrays. Trees and hashes, for example, don't really lend themselves to "this index is before this other index". For a tree, two indices that live on separate branches, for example. Or, any two indices in a hash -- they have no order at all, so any order you impose on them is arbitrary.
You don't have to worry about "missing" End(). It is also not a number, it is an object that represents the end of the collection. It doesn't make sense to have an iterator that goes past it, and indeed it cannot.

container iterators and operations on containers

I'm studying C++ and I'm reading about STL containers,iterators and the operations that can be performed on them.
I know that every container type (or better, the corresponding template of which each type is an instance) defines a companio type that acts like a pointer-like type and it's called iterator. What I understand is that once you get an iterator to a container,performing operations like adding an element may invalidate that iterator,so I tried to test this statement with an example:
#include <vector>
#include <iostream>
using namespace std;
int main()
{
vector<int> ivec={1,2,3,4,5,6,7,8,9,0};
auto beg=ivec.begin();
auto mid=ivec.begin()+ivec.size()/2;
while (beg != mid) {
if (*beg==2)
ivec.insert(beg,0);
++beg;
}
for (auto i:ivec)
cout<<i<<" ";
}
here,I'm simply contructing a vector of ints, brace initialize it,and performing a condition based operation,inserting an element in the first half of the container.
The code is flawed I think, because I'm initializing two iterator objects beg
and end and then I use them in the while statement as a condition.
BUT, if the code should change the contents of the container (and it sure does) what happens to the iterators?
The code seems to run just fine,it add a 0 in the ivec[1] position and prints the result.
What I thought is that the beg iterator would point to the newly added element and that the mid iterator would have pointed to the element before the formerly pointed to by mid (it's like the iterators point to the same memory locations while the underlying array,"slides" under.. unless it's reallocated that is)
Can someone explain me this behaviour??
When the standard says iterators are invalidated, this does not guarantee that they will be invalid in the sense of preventing your program from working. Using an invalid iterator is undefined behavior which is a huge and important topic in C++. It doesn't mean your program will crash, but it might. Your program might also do something else--the behavior is completely undefined.

How does sorting in c++ work?

I'm a Java developer. I'm currently learning C++. I've been looking at code samples for sorting. In Java, one normally gives a sorting method the container it needs to sort e.g
sort(Object[] someArray)
I've noticed in C++ that you pass two args, the start and end of the container. My question is that how is the actual container accessed then?
Here's sample code taken from Wikipedia illustrating the the sort method
#include <iostream>
#include <algorithm>
#include <vector>
int main() {
std::vector<int> vec;
vec.push_back(10); vec.push_back(5); vec.push_back(100);
std::sort(vec.begin(), vec.end());
for (int i = 0; i < vec.size(); ++i)
std::cout << vec[i] << ' ';
}
vec.begin() and vec.end() are returning iterators iterators. The iterators are kind of pointers on the elements, you can read them and modify them using iterators. That is what sort is doing using the iterators.
If it is an iterator, you can directly modify the object the iterator is referring to:
*it = X;
The sort function does not have to know about the containers, which is the power of the iterators. By manipulating the pointers, it can sort the complete container without even knowing exactly what container it is.
You should learn about iterators (http://www.cprogramming.com/tutorial/stl/iterators.html)
vec.begin() and vec.end() do not return the first and last elements of the vector. They actually return what is known as an iterator. An iterator behaves very much like a pointer to the elements. If you have an iterator i that you initialised with vec.begin(), you can get a pointer to the second element in the vector just by doing i++ - the same as you would if you had a point to the first element in an array. Likewise you can do i-- to go backwards. For some iterators (known as random access iterators), you can even do i + 5 to get an iterator to the 5th element after i.
This is how the algorithm accesses the container. It knows that all of the elements that it should be sorting are between begin() and end(). It navigates around the elements by doing simple iterator operations. It can then modify the elements by doing *i, which gives the algorithm a reference to the element that i is pointing at. For example, if i is set to vec.begin(), and you do *i = 5;, you will change the value of the first element of vec.
This approach allows you to pass only part of a vector to be sorted. Let's say you only wanted to sort the first 5 elements of your vector. You could do:
std::sort(vec.begin(), vec.begin() + 5);
This is very powerful. Since iterators behave very much like pointers, you can actually pass plain old pointers too. Let's say you have an array int array[] = {4, 3, 2, 5, 1};, you could easily call std::sort(array, array + 5) (because the name of an array will decay to a pointer to its first element).
The container doesn't have to be accessed. That's the whole point of the design behind the Standard Template Library (which became part of the C++ standard library): The algorithms don't know anything about containers, just iterators.
This means they can work with anything that provides a pair of iterators. Of course all STL containers provide begin() and end() methods, but you can also use a regular old C array, or an MFC or glib container, or anything else, just by writing your own iterators for it. (And for C arrays, it's as simple as a and a+a_len for the begin and end iterators.)
As for how it works under the covers: Iterators follow an implicit protocol: you can do things like ++it to advance an iterator to the next element, or *it to get the value of the current element, or *it = 3 to set the value of the current element. (It's a bit more complicated than this, because there are a few different protocols—iterators can be random-access or forward-only, const or writable, etc. But that's the basic idea.) So, if `sort is coded to restrict itself to the iterator protocol (and, of course, it is), it works with anything that conforms to that protocol.
To learn more, there are many tutorials on the internet (and in the bookstore); there's only so much an SO answer can explain.
begin() and end() return iterators. See e.g. http://www.cprogramming.com/tutorial/stl/iterators.html
Iterators act like references into part of a container. That is, *iter = z; actually changes one of the elements in the container.
std::sort actually uses a swap function on references to the contained objects, so that any iterators you have already initialized remain in the same order but the values those iterators refer to are changed.
Note that std::list also has member functions called sort. It works the other way around: any iterators you have already initialized keep the same values, but the order of those iterators changes.

Erase by iterator on a C++ STL map

I'm curious about the rationale behind the following code. For a given map, I can delete a range up to, but not including, end() (obviously,) using the following code:
map<string, int> myMap;
myMap["one"] = 1;
myMap["two"] = 2;
myMap["three"] = 3;
map<string, int>::iterator it = myMap.find("two");
myMap.erase( it, myMap.end() );
This erases the last two items using the range. However, if I used the single iterator version of erase, I half expected passing myMap.end() to result in no action as the iterator was clearly at the end of the collection. This is as distinct from a corrupt or invalid iterator which would clearly lead to undefined behaviour.
However, when I do this:
myMap.erase( myMap.end() );
I simply get a segmentation fault. I wouldn't have thought it difficult for map to check whether the iterator equalled end() and not take action in that case. Is there some subtle reason for this that I'm missing? I noticed that even this works:
myMap.erase( myMap.end(), myMap.end() );
(i.e. does nothing)
The reason I ask is that I have some code which receives a valid iterator to the collection (but which could be end()) and I wanted to simply pass this into erase rather than having to check first like this:
if ( it != myMap.end() )
myMap.erase( it );
which seems a bit clunky to me. The alternative is to re code so I can use the by-key-type erase overload but I'd rather not re-write too much if I can help it.
The key is that in the standard library ranges determined by two iterators are half-opened ranges. In math notation [a,b) They include the first but not the last iterator (if both are the same, the range is empty). At the same time, end() returns an iterator that is one beyond the last element, which perfectly matches the half-open range notation.
When you use the range version of erase it will never try to delete the element referenced by the last iterator. Consider a modified example:
map<int,int> m;
for (int i = 0; i < 5; ++i)
m[i] = i;
m.erase( m.find(1), m.find(4) );
At the end of the execution the map will hold two keys 0 and 4. Note that the element referred by the second iterator was not erased from the container.
On the other hand, the single iterator operation will erase the element referenced by the iterator. If the code above was changed to:
for (int i = 1; i <= 4; ++i )
m.erase( m.find(i) );
The element with key 4 will be deleted. In your case you will attempt to delete the end iterator that does not refer to a valid object.
I wouldn't have thought it difficult for map to check whether the iterator equalled end() and not take action in that case.
No, it is not hard to do, but the function was designed with a different contract in mind: the caller must pass in an iterator into an element in the container. Part of the reason for this is that in C++ most of the features are designed so that the incur the minimum cost possible, allowing the user to balance the safety/performance on their side. The user can test the iterator before calling erase, but if that test was inside the library then the user would not be able to opt out of testing when she knows that the iterator is valid.
n3337 23.2.4 Table 102
a.erase( q1, q2)
erases all the elements in the range [q1,q2). Returns q2.
So, iterator returning from map::end() is not in range in case of myMap.erase(myMap.end(), myMap.end());
a.erase(q)
erases the element pointed to by q. Returns an iterator pointing to the element immediately following q prior to the element being erased. If no such element exists, returns a.end().
I wouldn't have thought it difficult for map to check whether the
iterator equalled end() and not take action in that case. Is there
some subtle reason for this that I'm missing?
Reason is same, that std::vector::operator[] can don't check, that index is in range, of course.
When you use two iterators to specify a range, the range consists of the elements from the element that the first iterator points to up to but not including the element that the second iterator points to. So erase(it, myMap.end()) says to erase everything from it up to but not including end(). You could equally well pass an iterator that points to a "real" element as the second one, and the element that that iterator points to would not be erased.
When you use erase(it) it says to erase the element that it points to. The end() iterator does not point to a valid element, so erase(end()) doesn't do anything sensible. It would be possible for the library to diagnose this situation, and a debugging library will do that, but it imposes a cost on every call to erase to check what the iterator points to. The standard library doesn't impose that cost on users. You're on your own. <g>

Checking if an iterator is valid

Is there any way to check if an iterator (whether it is from a vector, a list, a deque...) is (still) dereferenceable, i.e. has not been invalidated?
I have been using try-catch, but is there a more direct way to do this?
Example: (which doesn't work)
list<int> l;
for (i = 1; i<10; i++) {
l.push_back(i * 10);
}
itd = l.begin();
itd++;
if (something) {
l.erase(itd);
}
/* now, in other place.. check if it points to somewhere meaningful */
if (itd != l.end())
{
// blablabla
}
I assume you mean "is an iterator valid," that it hasn't been invalidated due to changes to the container (e.g., inserting/erasing to/from a vector). In that case, no, you cannot determine if an iterator is (safely) dereferencable.
As jdehaan said, if the iterator wasn't invalidated and points into a container, you can check by comparing it to container.end().
Note, however, that if the iterator is singular -- because it wasn't initialized or it became invalid after a mutating operation on the container (vector's iterators are invalidated when you increase the vector's capacity, for example) -- the only operation that you are allowed to perform on it is assignment. In other words, you can't check whether an iterator is singular or not.
std::vector<int>::iterator iter = vec.begin();
vec.resize(vec.capacity() + 1);
// iter is now singular, you may only perform assignment on it,
// there is no way in general to determine whether it is singular or not
Non-portable answer: Yes - in Visual Studio
Visual Studio's STL iterators have a "debugging" mode which do exactly this. You wouldn't want to enable this in ship builds (there is overhead) but useful in checked builds.
Read about it on VC10 here (this system can and in fact does change every release, so find the docs specific to your version).
Edit Also, I should add: debug iterators in visual studio are designed to immediately explode when you use them (instead undefined behavior); not to allow "querying" of their state.
Usually you test it by checking if it is different from the end(), like
if (it != container.end())
{
// then dereference
}
Moreover using exception handling for replacing logic is bad in terms of design and performance. Your question is very good and it is definitively worth a replacement in your code. Exception handling like the names says shall only be used for rare unexpected issues.
Is there any way to check if a iterator (whether it is from a vector, a list, a deque...) is (still) dereferencable, i.e has not been invalidated ?
No, there isn't. Instead you need to control access to the container while your iterator exists, for example:
Your thread should not modify the container (invalidating the iterator) while it is still using an instantiated iterator for that container
If there's a risk that other threads might modify the container while your thread is iterating, then in order to make this scenario thread-safe your thread must acquire some kind of lock on the container (so that it prevents other threads from modifying the container while it's using an iterator)
Work-arounds like catching an exception won't work.
This is a specific instance of the more general problem, "can I test/detect whether a pointer is valid?", the answer to which is typically "no, you can't test for it: instead you have to manage all memory allocations and deletions in order to know whether any given pointer is still valid".
Trying and catching is not safe, you will not, or at least seldom throw if your iterator is "out of bounds".
what alemjerus say, an iterator can always be dereferenced. No matter what uglyness lies beneath. It is quite possible to iterate into other areas of memory and write to other areas that might keep other objects. I have been looking at code, watching variables change for no particular reason. That is a bug that is really hard to detect.
Also it is wise to remember that inserting and removing elements might potentially invalidate all references, pointers and iterators.
My best advice would be to keep you iterators under control, and always keep an "end" iterator at hand to be able to test if you are at the "end of the line" so to speak.
In some of the STL containers, the current iterator becomes invalid when you erase the current value of the iterator. This happens because the erase operation changes the internal memory structure of the container and increment operator on existing iterator points to an undefined locations.
When you do the following, iterator is incementented before it is passed to erase function.
if (something) l.erase(itd++);
Is there any way to check if an iterator is dereferencable
Yes, with gcc debugging containers available as GNU extensions. For std::list you can use __gnu_debug::list instead. The following code will abort as soon as invalid iterator is attempted to be used. As debugging containers impose extra overhead they are intended only when debugging.
#include <debug/list>
int main() {
__gnu_debug::list<int> l;
for (int i = 1; i < 10; i++) {
l.push_back(i * 10);
}
auto itd = l.begin();
itd++;
l.erase(itd);
/* now, in other place.. check if itd points to somewhere meaningful */
if (itd != l.end()) {
// blablabla
}
}
$ ./a.out
/usr/include/c++/7/debug/safe_iterator.h:552:
Error: attempt to compare a singular iterator to a past-the-end iterator.
Objects involved in the operation:
iterator "lhs" # 0x0x7ffda4c57fc0 {
type = __gnu_debug::_Safe_iterator<std::_List_iterator<int>, std::__debug::list<int, std::allocator<int> > > (mutable iterator);
state = singular;
references sequence with type 'std::__debug::list<int, std::allocator<int> >' # 0x0x7ffda4c57ff0
}
iterator "rhs" # 0x0x7ffda4c580c0 {
type = __gnu_debug::_Safe_iterator<std::_List_iterator<int>, std::__debug::list<int, std::allocator<int> > > (mutable iterator);
state = past-the-end;
references sequence with type 'std::__debug::list<int, std::allocator<int> >' # 0x0x7ffda4c57ff0
}
Aborted (core dumped)
The type of the parameters of the erase function of any std container (as you have listed in your question, i.e. whether it is from a vector, a list, a deque...) is always iterator of this container only.
This function uses the first given iterator to exclude from the container the element that this iterator points at and even those that follow. Some containers erase only one element for one iterator, and some other containers erase all elements followed by one iterator (including the element pointed by this iterator) to the end of the container. If the erase function receives two iterators, then the two elements, pointed by each iterator, are erased from the container and all the rest between them are erased from the container as well, but the point is that every iterator that is passed to the erase function of any std container becomes invalid! Also:
Each iterator that was pointing at some element that has been erased from the container becomes invalid, but it doesn't pass the end of the container!
This means that an iterator that was pointing at some element that has been erased from the container cannot be compared to container.end().
This iterator is invalid, and so it is not dereferencable, i.e. you cannot use neither the * nor -> operators, it is also not incrementable, i.e. you cannot use the ++ operator, and it is also not decrementable, i.e. you cannot use the -- operator.
It is also not comparable!!! I.E. you cannot even use neither == nor != operators
Actually you cannot use any operator that is declared and defined in the std iterator.
You cannot do anything with this iterator, like null pointer.
Doing something with an invalid iterator immediately stops the program and even causes the program to crash and an assertion dialog window appears. There is no way to continue program no matter what options you choose, what buttons you click. You just can terminate the program and the process by clicking the Abort button.
You don't do anything else with an invalid iterator, unless you can either set it to the begin of the container, or just ignore it.
But before you decide what to do with an iterator, first you must know if this iterator is either invalid or not, if you call the erase function of the container you are using.
I have made by myself a function that checks, tests, knows and returns true whether a given iterator is either invalid or not. You can use the memcpy function to get the state of any object, item, structure, class and etc, and of course we always use the memset function at first to either clear or empty a new buffer, structure, class or any object or item:
bool IsNull(list<int>::iterator& i) //In your example, you have used list<int>, but if your container is not list, then you have to change this parameter to the type of the container you are using, if it is either a vector or deque, and also the type of the element inside the container if necessary.
{
byte buffer[sizeof(i)];
memset(buffer, 0, sizeof(i));
memcpy(buffer, &i, sizeof(i));
return *buffer == 0; //I found that the size of any iterator is 12 bytes long. I also found that if the first byte of the iterator that I copy to the buffer is zero, then the iterator is invalid. Otherwise it is valid. I like to call invalid iterators also as "null iterators".
}
I have already tested this function before I posted it there and found that this function is working for me.
I very hope that I have fully answered your question and also helped you very much!
There is a way, but is ugly... you can use the std::distance function
#include <algorithms>
using namespace std
auto distance_to_iter = distance(container.begin(), your_iter);
auto distance_to_end = distance(container.begin(),container.end());
bool is_your_iter_still_valid = distance_to_iter != distance_to_end;
use erase with increment :
if (something) l.erase(itd++);
so you can test the validity of the iterator.
if (iterator != container.end()) {
iterator is dereferencable !
}
If your iterator doesnt equal container.end(), and is not dereferencable, youre doing something wrong.