I have a list of sets:
std::list<std::set<int>> nn = {{1,2},{4,5,6}};
and I want to print out the element which end() refers to:
for (auto el : nn){
std::cout << *el.end() << std::endl;
}
What I get as a result is:
2 and 3.
I do not know where these values come from. Can someone help me, please?
Question 1
What does end() refer to?
Answer
end() is a public member function of std::set that returns an iterator to the past-the-end element in the set container.
Question 2
I do not know where do these values come from.
Answer
When you wrote:
std::cout << *el.end() << std::endl; // this is undefined behavior
In the above statement you are dereferencing the iterator that was returned by the end() member function.
But note that if we dereference the iterator that was returned by this member function then we get undefined behavior.
Undefined behavior means anything [1] can happen, including but not limited to the program giving your expected output. But never rely on (or draw conclusions from) the output of a program that has undefined behavior.
[1] For a more technically accurate definition of undefined behavior: there are no restrictions on the behavior of the program.
It is not allowed to de-reference the end() iterator. Doing so causes undefined behavior. It doesn't refer to any element of the container, but one past the last element.
The reason that end() "points" after the last element, is that it is necessary to distinguish empty containers. If end() was referring to the last element and begin() to the first, then if begin() == end() that would mean that there is one element in the container and we can't distinguish the case of an empty container.
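A minimal sketch of the half-open range idea described above: the loop only ever compares against end() and never dereferences it, so an empty set (where begin() == end()) falls out naturally. The function name count_by_walking is made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical helper: counts elements by walking the half-open range
// [begin(), end()). end() is only compared against, never dereferenced.
std::size_t count_by_walking(const std::set<int>& s) {
    std::size_t n = 0;
    for (auto it = s.begin(); it != s.end(); ++it) {
        ++n;  // *it would be valid here, because it != s.end()
    }
    assert(n == s.size());  // the walk visits every element exactly once
    return n;
}
```

For an empty set the loop body never runs and the function returns 0, which is exactly the case that could not be told apart if end() referred to the last element.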
For containers that support it, to access the last element of the container you can use .back(), which will return a reference, not an iterator. But this is only allowed if there is a last element, i.e. if the container is not empty. Otherwise you have again undefined behavior. So check .empty() first if necessary.
std::set does not have the back() member and is not really intended to be used this way, but if you really want to access the last element, which is not the last element in the constructor initializer list, but the last element in the < order of the elements, then you can use std::prev(el.end()) or el.rbegin() ("reverse begin") which will give you an iterator to the last element. Again, dereferencing this iterator is only allowed if the container is not empty. (For std::prev forming the iterator itself isn't even allowed if the container is empty.)
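As a sketch of what the loop from the question could look like without undefined behavior, assuming the goal is the largest element of each set: the helper names last_of and last_elements are invented here, and empty sets are simply skipped.

```cpp
#include <cassert>
#include <list>
#include <set>
#include <vector>

// Hypothetical helper: returns the last element in the set's < order.
int last_of(const std::set<int>& s) {
    assert(!s.empty());  // dereferencing rbegin() of an empty set is undefined
    return *s.rbegin();  // same element as *std::prev(s.end())
}

// Collects the last element of every non-empty set in the list.
std::vector<int> last_elements(const std::list<std::set<int>>& sets) {
    std::vector<int> out;
    for (const auto& s : sets) {  // const reference avoids copying each set
        if (!s.empty()) {
            out.push_back(last_of(s));
        }
    }
    return out;
}
```

With the data from the question, {{1,2},{4,5,6}}, this yields 2 and 6. Note that the result depends on the ordering of the set, not on the order in the initializer list.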
Undefined behavior means that you will have no guarantees on the program behavior. It could output something in one run and something else in another. It could also output nothing, etc.
There is no requirement that you will get the output you see. For example, current x86_64 Clang with libc++ as standard library implementation, compiles (with or without optimization) a program that prints 0 twice. https://godbolt.org/z/nWMss1fqe
Practically speaking, assuming the compiler didn't take advantage of the undefined behavior for an optimization that drastically changes the program from the "intended" program flow, you will likely, depending on the implementation of the standard library, read some internal memory of the std::set implementation in the standard library, get a segmentation fault if the indirection points to inaccessible memory or incidentally (with no guarantees) refer to other values in the container.
Related
In the C++ STL, set::end() returns an iterator pointing past the last element of the set container. Since it does not refer to a valid element, it cannot be dereferenced. The end() function returns a bidirectional iterator.
But when I execute the following code:
set<int> s;
s.insert(1);
s.insert(4);
s.insert(2);
// iterator pointing to the end
auto pos2 = s.end();
cout<<*pos2;
it prints 3 as output. The output increases as I insert more elements to the set and is always equal to the total number of elements in the set.
Why does this happen?
Dereferencing the end() iterator is undefined behavior, so anything is allowed to happen. Ideally you'd get a crash, but unfortunately that doesn't seem to be the case here and everything "seems" to work.
Since it does not refer to a valid element, it cannot be dereferenced
It can, as your test code demonstrated. However, it shouldn't be dereferenced.
Although it is undefined behaviour, in this particular case the observed behaviour could be due to implementation details of the standard library in use.
std::set::size() has O(1) complexity, but std::set is a node-based container (internally a binary search tree). So the size needs to be stored somewhere within the data structure. It could be that the end() iterator points at a location that doubles as storage for the size, and by pure chance, you're able to access it.
I wrote some code that takes iterators but have to do comparison in reversed order,
template<class ConstBiIter>
bool func(ConstBiIter seq_begin, ConstBiIter seq_end)
{
ConstBiIter last = std::prev(seq_end);
while (--last != std::prev(seq_begin)) // --> I need to compare the beginning data
{
......
}
return true;
}
In VS2013, when running in Debug mode, --last != std::prev(seq_begin) will cause debugger assertion fail with the error message
Expression:string iterator + offset out of range.
but it is perfectly OK when running in Release mode and gives the correct result, because there is no bounds checking in Release mode.
My questions are:
Is it safe to use std::prev(some_container.begin()) as sentry like some_container.rend()?
How can I directly compare a reverse_iterator with an iterator? If I write the code:
std::cout << (std::prev(some_container.begin())==some_container.rend()) << std::endl; it won't compile, even if you reinterpret_cast them.
I am curious if prev(some_container.begin()) equals some_container.rend() physically?
No, it's not safe to try and decrement the begin iterator.
std::reverse_iterator (which is what is returned by std::rend) does not actually, underneath, contain an iterator before the begin iterator. It stores an underlying iterator to the next element from the one it conceptually points to. Therefore, when the reverse iterator is "one past the end" (i.e. "before the beginning") its underlying iterator (that you get by calling base()) is the begin iterator.
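The relation described above can be checked directly. This is only a sketch on std::vector<int>; the function name base_relation_holds is made up for the illustration.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical check: every reverse iterator refers to the element just
// before its underlying base() iterator.
bool base_relation_holds(const std::vector<int>& v) {
    // rend() wraps begin() and rbegin() wraps end(); no iterator is ever
    // positioned before begin().
    assert(v.rend().base() == v.begin());
    assert(v.rbegin().base() == v.end());
    for (auto rit = v.rbegin(); rit != v.rend(); ++rit) {
        // rit != rend() here, so rit.base() != begin() and std::prev is fine
        if (&*rit != &*std::prev(rit.base())) {
            return false;
        }
    }
    return true;
}
```

This is why rend() is a safe sentinel while std::prev(begin()) is not: the reverse iterator never has to form an iterator before the start of the range.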
Undefined behavior is not safe, even if it works today in your test. In C++, "it worked when I tried it" is not good evidence that you are doing it correctly: one of the most common types of undefined behavior is "it seems to work".
The problem is that undefined behavior working is fundamentally fragile. It can break if you breathe on it hard.
The compiler is free to optimize branches and code reached only via undefined behavior away, and in many cases does just that. It is even free to do so after a service patch, compiler upgrade, seemingly irrelevant change in flags passed to compiler, or the length of the executable path name. It is free to work fine 99.9% of the time, then format your hard drive the other 0.1% of the time.
Some of these are more likely than others.
While iterators to std::string and std::vector elements are basically pointers in release, and the compiler can even typedef a pointer to be said iterators, even that assumption can fail when the next compiler version uses wrapped pointers.
Undefined behavior is left in the C++ standard to allow freedom for compiler writers to generate more optimal code. If you invoke it, you can step on their toes.
That being said, there are reasons to use behavior undefined by the C++ standard. When you do so, document it heavily, isolate it, and make sure the payoff (say, delegates twice as fast as std::function) is worth it.
The above is non-isolated and not worth doing undefined behavior, especially because you can solve it without the undefined behavior.
The easiest solution if you want to iterate backwards is to make some reverse iterators.
template<class ConstBiIter>
bool func(ConstBiIter seq_begin, ConstBiIter seq_end)
{
std::reverse_iterator<ConstBiIter> const rend(seq_begin);
for (std::reverse_iterator<ConstBiIter> rit(seq_end); rit != rend; ++rit)
{
......
}
return true;
}
Now rit iterates over the range backwards.
If you need to get back to a forward iterator that refers to the same element for whatever reason, and you are not at rend, you can use std::prev(rit.base()). If rit == rend at that point, that is undefined behavior.
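To illustrate going from a reverse iterator back to a forward iterator, here is a small sketch that searches a vector from the back and reports the position of the last match. The name last_index_of and the choice of -1 for "not found" are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: index of the last occurrence of value, or -1.
std::ptrdiff_t last_index_of(const std::vector<int>& v, int value) {
    auto rit = std::find(v.rbegin(), v.rend(), value);
    if (rit == v.rend()) {
        return -1;  // not found: std::prev(rit.base()) would be undefined here
    }
    auto it = std::prev(rit.base());  // forward iterator to the same element
    assert(&*it == &*rit);
    return std::distance(v.begin(), it);
}
```

The check against rend() comes first on purpose, since that is the one case where std::prev(rit.base()) would step before begin().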
24.5.1 Reverse iterators
Class template reverse_iterator is an iterator adaptor that iterates from the end of the sequence defined by its underlying iterator to the beginning of that sequence. The fundamental relation between a reverse iterator and its corresponding iterator i is established by the identity: &*(reverse_iterator(i)) == &*(i - 1).
The default iterator loop that you find on the Internet looks like this:
for(vector<int>::iterator it=myVector.begin(); it!=myVector.end(); it++)
Now I want to try some more fancy stuff with iterators. I was thinking about running through e.g. every third element of a vector (it + 3 as the incrementor), but I fear that this behaviour explodes if I use a different compiler or a different data set, as it + 3 might not be equal to vector::end(), but at the same time not point to something valid either.
So I wanted to know if it is always true that if
it >= myVector.end()
then it is not pointing to an element in my vector? Can I use < instead of != and be safe that I won't run into compiler-implementation-specific problems?
Thank you very much!
The only iterators that are defined on a vector are iterators pointing to one of its elements and the end iterator. Any iterator "greater than" end is undefined anyway. Thus, although your comparison might work, it does not add any value, as the added > only makes a difference when comparing with an undefined iterator. As using an undefined iterator implies undefined behaviour, you never ever want to have such an iterator anyway. In addition, this also answers your question: comparing with an undefined iterator is of course also undefined and therefore not guaranteed to yield a meaningful result.
If you want to check whether an element is in this vector or in another, then >= won't help you either, as comparing these iterators is undefined and will usually boil down to a pointer comparison, so the result depends on which vector has a lower address.
So all in all, using >= here simply makes no sense and should therefore be avoided.
As your comment states, you want to iterate through every n-th element of a vector. There are various ways to do this in a defined manner, e.g.:
(it - vec.begin()) + n < vec.size()
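One way to turn that check into a complete loop is sketched below. It measures the distance to end() before stepping, so an iterator beyond end() is never formed. The function name every_nth is made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns elements 0, n, 2n, ... of v.
std::vector<int> every_nth(const std::vector<int>& v, std::size_t n) {
    assert(n > 0);  // a step of 0 would never advance
    std::vector<int> out;
    auto it = v.begin();
    while (it != v.end()) {
        out.push_back(*it);
        // Number of positions between it and end(); at least 1 here.
        const auto remaining = static_cast<std::size_t>(v.end() - it);
        if (remaining <= n) {
            break;  // it + n would be end() or beyond, so stop here
        }
        it += static_cast<std::ptrdiff_t>(n);  // still a dereferenceable position
    }
    return out;
}
```

The loop condition still uses !=, as in the usual idiom; the safety comes from checking the remaining distance before advancing, not from comparing with <.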
Why >=, instead of ==? The end iterator is guaranteed to compare greater than any valid iterator it is not equal to.
std::unordered_set::erase() has 3 overloads: In the one taking a reference, passing an "invalid" value, i.e. one that doesn't exist in the set, simply makes erase() return 0. But what about the other two overloads?
Does the C++11 standard say what erase() should do in this case, or it's compiler dependent? Is it supposed to return end() or undefined behavior?
I couldn't find an answer in the specification, cppreference.com, cplusplus.com. On IBM site they say it returns end() if no element remains after the operation, but what happens if the operation itself fails due to an invalid iterator?
And in general, do erase() methods for STL containers simply have undefined behavior in these cases?
(so I need to check my iterators before I pass any to erase(), or use the unordered_set::erase() overload which takes a value_type reference, which would simply return 0 if it fails)
There is a big semantic difference between trying to remove a value that doesn't occur in the set and trying to erase through an invalid iterator.
Trying to use an invalid iterator is undefined behaviour and will end badly.
Do you have a specific use-case you are thinking of when you might want to erase an invalid iterator?
These are two completely different cases. There is no "invalid value"; values that don't exist in the set are still valid. So you pass a valid value that is not contained in the set and thus get 0 returned: no elements have been erased.
The other overloads are completely different. The standard requires the iterators passed to the erase methods to be "valid and dereferencable" and "a valid iterator range", respectively. Otherwise the behavior is undefined.
So yes, iterators have to be valid. But you cannot check if an iterator is valid programmatically - you have to make sure from your program logic, that they are.
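A short sketch of the two cases side by side. The wrapper names erase_by_key and erase_by_iterator are invented for the example; the iterator version obtains its iterator from find() and checks it against end(), which is one way of making sure from program logic that it is valid and dereferenceable.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Erase by key: a key that is not in the set is fine, the result is just 0.
std::size_t erase_by_key(std::unordered_set<int>& s, int key) {
    return s.erase(key);
}

// Erase by iterator: only ever pass an iterator known to be dereferenceable.
bool erase_by_iterator(std::unordered_set<int>& s, int key) {
    auto it = s.find(key);
    if (it == s.end()) {
        return false;  // end() is valid but not dereferenceable: do not erase it
    }
    s.erase(it);
    assert(s.find(key) == s.end());  // the element is gone afterwards
    return true;
}
```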
I have a problem with assigning an uninitialized iterator to an initialized one. The following code excerpt produces an access violation when built with Visual Studio 2010. In previous versions of Visual Studio the code should work.
#include <list>
int main() {
std::list<int> list;
std::list<int>::iterator it = list.begin();
std::list<int>::iterator jt;
it = jt; // crashes in VS 2010
}
Wouldn't this be considered valid C++?
I need this code to implement a "cursor" class that either points nowhere or to a specific element in a list. What else could I use as a value for an uninitialized iterator if I don't have a reference to my container yet?
it = jt; // crashes in VS 2010
This invokes undefined behaviour (UB). According to the C++ Standard, jt is a singular iterator which is not associated with any container, and the results of most expressions are undefined for singular iterators.
Section §24.1/5 of the C++ Standard (2003) reads:
Just as a regular pointer to an array guarantees that there is a pointer value pointing past the last element of the array, so for any iterator type there is an iterator value that points past the last element of a corresponding container. These values are called past-the-end values. Values of an iterator i for which the expression *i is defined are called dereferenceable. The library never assumes that past-the-end values are dereferenceable. Iterators can also have singular values that are not associated with any container. [Example: After the declaration of an uninitialized pointer x (as with int* x;), x must always be assumed to have a singular value of a pointer.] Results of most expressions are undefined for singular values; the only exception is an assignment of a non-singular value to an iterator that holds a singular value. In this case the singular value is overwritten the same way as any other value. Dereferenceable values are always nonsingular.
If MSVS 2010 crashes on this, it is one of the infinite possibilities of UB, for UB means anything could happen; the Standard doesn't prescribe any behavior.
C++11, 24.2.1/3:
Results of most expressions are undefined for singular values; the
only exceptions are destroying an iterator that holds a singular
value, the assignment of a non-singular value to an iterator that
holds a singular value, and, for iterators that satisfy the
DefaultConstructible requirements, using a value-initialized iterator
as the source of a copy or move operation.
The list is exhaustive, and your example isn't listed in the allowed exceptions. jt is singular and default-initialized. Therefore it may not be used as the source of a copy operation.
You need a KNOWN value to use as a signal. You don't have that unless you have a container to get .end() from, which you think is your problem.
What you really need to do is get away from thinking that you can use 'special' iterator values for oddball cases that don't involve a container. Iterators, while they work a lot like pointers, are NOT pointers. They don't have the equivalent of 'NULL'.
Instead, use a boolean flag value to see if the container is set or not, and make sure the iterators (all of them, if you have more than one) get set to some valid value when the container becomes known, and the flag gets set back to false when you lose the container. Then you can check the flag before any iterator operations.
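A rough sketch of such a cursor, under a few assumptions: the class name Cursor and its members are invented, a null container pointer plays the role of the boolean flag, and the list is assumed to outlive the cursor with no elements erased while it is attached.

```cpp
#include <cassert>
#include <list>

// Hypothetical cursor that either points nowhere or at a list element.
class Cursor {
public:
    // Starts detached; the iterator is value-initialized and never read
    // until attach() has assigned it a real value.
    Cursor() : list_(nullptr), it_() {}

    void attach(std::list<int>& l) {
        list_ = &l;
        it_ = l.begin();
    }

    void detach() { list_ = nullptr; }

    // True only if a container is known and the iterator is not end().
    bool points_to_element() const {
        return list_ != nullptr && it_ != list_->end();
    }

    int value() const {
        assert(points_to_element());  // the flag is checked before any iterator use
        return *it_;
    }

    void advance() {
        if (points_to_element()) {
            ++it_;
        }
    }

private:
    std::list<int>* list_;         // nullptr means "no container yet"
    std::list<int>::iterator it_;  // only meaningful while list_ != nullptr
};
```

Every operation consults the flag first, so the iterator member is never compared or dereferenced while the cursor is detached.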
list.end() points anywhere beyond the container, so we can consider it like pointing nowhere.
Also, accessing an uninitialized variable causes undefined behavior.