Is there any way to convert a vector::iterator into a pointer without having access to the vector itself? This works only if the vector is available:
typedef vector<int> V;
int* to_pointer1( V& v, V::iterator t )
{
return v.data() + (t-v.begin() );
}
But how can this be done without having the vector around? This wouldn't work:
int* to_pointer2( V::iterator t )
{
// the obvious approach doesn't work for the end iterator
return &*t;
}
In fact I would have expected this to be a very basic and popular question but for some reason I couldn't find it.
In general you are not guaranteed that there is a pointer corresponding to an iterator that refers to an ordinary vector item.
In particular std::vector<bool> is specialized so that &*it won't even compile.
However, for other vectors it's only¹ the formally-pedantic that stops you from doing &*it also for the end iterator. In C99 it is, as I recall, formally valid to do &*p when p is an end pointer, and a goal of std::vector is to support all ordinary raw array notation. If I needed this I would just ignore the formal-pedantic, and write code with a view towards some future revision of the standard that would remove that imagined obstacle.
So, in practice, just do &*it. :)
#include <iostream>
#include <vector>
using namespace std;
auto main() -> int
{
vector<int> x( 5 );
cout << &*x.begin() << " " << &*x.end() << endl;
cout << &*x.end() - &*x.begin() << endl; // Works nicely IN PRACTICE.
}
But do remember that you can't do this for vector<bool>.
1) In a comment elsewhere user pentadecagon remarks: “Try -D_GLIBCXX_DEBUG, as documented here: gcc.gnu.org/onlinedocs/libstdc++/manual/debug_mode_using.html.” g++ often provides some means of bringing any formal UB to the front, e.g. “optimization” of formally UB loops. A solution in this particular case is simply to not ask it to do this, but more generally one may have to explicitly ask it to not do it.
No, this is not currently possible. n3884 Contiguous Iterators: A Refinement of Random Access Iterators proposes a free function std::pointer_from which would satisfy your requirement:
Expression: std::pointer_from(a)
Return type: reference
Operational semantics: if a is dereferenceable, std::address_of(*a); otherwise, if a is reachable from a dereferenceable iterator b, std::pointer_from(b) + (a - b); otherwise a valid pointer value ([basic.compound]).
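The second clause of those semantics can be approximated today if you happen to have a dereferenceable iterator into the same sequence. The following is only a sketch: pointer_via and end_pointer are made-up names, not part of N3884 or the standard library, and the preconditions in the comments are assumed rather than checked.

```cpp
#include <cassert>
#include <memory>   // std::addressof
#include <vector>

// Hypothetical helper (not a standard or proposed API).
// Precondition: `anchor` is dereferenceable, and `it` is reachable from it
// within the same contiguous sequence.
template <class It>
auto pointer_via(It it, It anchor) -> decltype(std::addressof(*anchor))
{
    // Only `anchor` is dereferenced; `it` may be the end iterator.
    return std::addressof(*anchor) + (it - anchor);
}

// Usage: the end iterator of a non-empty vector maps to data() + size().
inline int* end_pointer(std::vector<int>& v)
{
    assert(!v.empty());   // begin() must be dereferenceable
    return pointer_via(v.end(), v.begin());
}
```

This still needs a second iterator, so it does not answer the question as asked; it only shows what the proposed semantics amount to.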
What you are asking is not possible for the end iterator. By definition, the end iterator points to a hypothetical element past the end of the array, so dereferencing it is always UB.
You said:
@Nick I want it to work for any valid iterator, including the end. It should be possible, because to_pointer1 also covers this case.
to_pointer1 doesn't cover this case. to_pointer1 returns an invalid memory address (Actually the address is probably valid, but there is no data there).
I use this template:
template<typename It>
inline auto ptr(It const&it) -> decltype(std::addressof(*it))
{ return std::addressof(*it); }
In practice, this will work for most iterators. Exceptions are iterators for which &*it does not yield an ordinary pointer (vector<bool>::iterator) or which point nowhere (rather than to a past-the-last element). For your particular purpose it should be okay, except for vector<bool> (where the concept of a pointer is not sensible).
Related
I'm studying C++ and I'm reading about STL containers,iterators and the operations that can be performed on them.
I know that every container type (or better, the corresponding template of which each type is an instance) defines a companion pointer-like type called an iterator. What I understand is that once you get an iterator into a container, performing operations like adding an element may invalidate that iterator, so I tried to test this statement with an example:
#include <vector>
#include <iostream>
using namespace std;
int main()
{
vector<int> ivec={1,2,3,4,5,6,7,8,9,0};
auto beg=ivec.begin();
auto mid=ivec.begin()+ivec.size()/2;
while (beg != mid) {
if (*beg==2)
ivec.insert(beg,0);
++beg;
}
for (auto i:ivec)
cout<<i<<" ";
}
Here, I'm simply constructing a vector of ints, brace-initializing it, and performing a condition-based operation that inserts an element in the first half of the container.
The code is flawed, I think, because I'm initializing two iterator objects, beg
and mid, and then using them in the while statement as a condition.
BUT, if the code changes the contents of the container (and it sure does), what happens to the iterators?
The code seems to run just fine: it adds a 0 at the ivec[1] position and prints the result.
What I thought is that the beg iterator would point to the newly added element and that the mid iterator would point to the element before the one it formerly pointed to (it's as if the iterators keep pointing to the same memory locations while the underlying array "slides" under them, unless it's reallocated, that is).
Can someone explain this behaviour to me?
When the standard says iterators are invalidated, this does not guarantee that they will be invalid in the sense of preventing your program from working. Using an invalid iterator is undefined behavior which is a huge and important topic in C++. It doesn't mean your program will crash, but it might. Your program might also do something else--the behavior is completely undefined.
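For vector, insert returns a valid iterator to the newly inserted element, so the usual way to stay out of undefined behaviour is to continue from that return value instead of the old iterator. A minimal sketch (zero_before_twos is a name made up for this example, and it walks the whole vector rather than just the first half):

```cpp
#include <cassert>
#include <vector>

// Inserts a 0 in front of every 2. The vector is taken by value so the
// caller's copy is left untouched.
std::vector<int> zero_before_twos(std::vector<int> v)
{
    // v.end() is re-evaluated on every pass, so it is never a stale iterator.
    for (auto it = v.begin(); it != v.end(); ++it) {
        if (*it == 2) {
            it = v.insert(it, 0);  // valid iterator to the new 0
            ++it;                  // step onto the original 2 so it is not matched again
        }
    }
    return v;
}
```

Calling it with {1, 2, 3, 2} yields {1, 0, 2, 3, 0, 2}, whether or not the insertions cause a reallocation.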
Take a look at the following code:
#include <algorithm>
#include <deque>
#include <iostream>
using namespace std;
int main()
{
deque<int> in {1,2,3};
deque<int> out;
// line in question
move(in.begin(), in.end(), out.begin());
for(auto i : out)
cout << i << endl;
return 0;
}
This will not move anything. Looking at the example here, one must write the line in question like this:
move(in.begin(), in.end(), std::back_inserter(out));
This makes sense in a way, as std::move expects its first two arguments to be InputIterators (which is satisfied here) and the third one to be an OutputIterator (which out.begin() is not).
What does actually happen if the original code is executed and move is passed an iterator that is not an OutputIterator? Why does C++'s type-safety not work here? And why is the construction of an output-iterator delegated to an external function, i.e. why does out.backInserter() not exist?
The original code tries to dereference and increment out.begin(). Since out is empty, that's a past-the-end iterator, and it can't be dereferenced or incremented. Doing so gives undefined behaviour.
std::move expects [...] the third one to be an OutputIterator (which out.begin() is not).
Yes it is. Specifically, it's a mutable random access iterator, which supports all the operations required of an output iterator, and more.
What does actually happen if the original code is executed and move is passed an iterator that is not an OutputIterator?
That would cause a compile error if the iterator didn't support the operations required of an output iterator needed by the function; or undefined behaviour if the operations existed but did something other than that required of an output iterator.
Why does C++'s type-safety not work here?
Because the type is correct. The incorrect runtime state (being a past-the-end iterator, not the start of a sequence with at least as many elements as the input range) can't be detected through the static type system.
why does out.backInserter() not exist?
That would have to be written separately for all sequence containers: both the standard ones, and any others you might define yourself. The generic function only has to be implemented once, in the standard library, to be usable for any container that supports push_back.
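To illustrate the last point, here is a small sketch of one generic helper built on std::back_inserter. The name append_moved is invented for this example; the only assumption is that the destination type has push_back.

```cpp
#include <algorithm>   // std::move (the algorithm)
#include <cassert>
#include <deque>       // containers used in the usage note below
#include <iterator>    // std::back_inserter
#include <list>

// Hypothetical helper: moves every element of `src` onto the back of `dst`.
// Works for any destination with push_back, because std::back_inserter
// is written once, outside the containers.
template <class Src, class Dst>
void append_moved(Src& src, Dst& dst)
{
    std::move(src.begin(), src.end(), std::back_inserter(dst));
}
```

With std::deque<int> in{1, 2, 3}, calling append_moved(in, out) grows an empty std::deque<int> or std::list<int> to three elements, with no out.begin() on an empty container involved.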
There are so many alternative ways of addressing elements of a vector.
I could use a pointer like so:
vector<int> v = {10, 11, 12};
int *p = &v[0];
cout << *p; //Outputs "10"
I could use a pointer this way too:
vector<int> v = {10, 11, 12};
vector<int>::pointer p = v.data();
cout << *p; //Outputs "10"
I could also use the iterator type:
vector<int> v = {10, 11, 12};
vector<int>::iterator i = v.begin();
cout << *i; //Outputs "10"
Are there any significant differences that I'm missing here?
As far as being able to perform the task at hand, they all work equally well. After all, they all provide an object which meets the requirements of an iterator and you are using them to point at the same element of the vector. However, I would pick the vector<int>::iterator option because the type is more expressive about how we intend to use it.
The raw pointer type, int*, tells you very little about what p is, except that it stores the address of an int. If you think about p in isolation, its type doesn't tell you very much about how you can use it. The vector<int>::pointer option has the same issue - it just expresses the type of the objects it points at as being the element type of a vector. There's no reason it actually needs to point into a vector.
On the other hand vector<int>::iterator tells you everything you need to know. It explicitly states that the object is an iterator and that iterator is used to point at elements in a vector<int>.
This also has the benefit of being more easily maintainable if you ever happen to change the container type. If you changed to a std::list, for example, the pointer type just wouldn't work any more because the elements are not stored as a contiguous array. The iterator type of a container always provides you with a type you can use to iterate over its elements.
When we have Concepts, I'd expect the best practise to be something like:
ForwardIteratorOf<int> it = std::begin(v);
where ForwardIteratorOf<int> (which I am imagining exists) is changed to whatever concept best describes your intentions for it. If the type of the elements doesn't matter, then just ForwardIterator (or BidirectionalIterator, RandomAccessIterator, or whatever).
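The maintainability point can be seen in a short sketch: code written against iterators keeps working when the container changes, and raw pointers into a vector are accepted too. sum_range is a hypothetical helper, not a standard function (std::accumulate does the same job).

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical helper: sums a range of ints given any pair of iterators
// that support !=, ++ and *.
template <class It>
int sum_range(It first, It last)
{
    int total = 0;
    for (; first != last; ++first)
        total += *first;
    return total;
}
```

sum_range(v.begin(), v.end()) works for both std::vector<int> and std::list<int>, and sum_range(v.data(), v.data() + v.size()) works for the vector only, since a list has no contiguous storage to point into.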
If you add the check:
if ( !v.empty() )
Then all the examples you've shown are equally valid.
If you are about to iterate over the elements of the vector, I would go with:
vector<int>::iterator i = v.begin();
It's easier to check whether the iterator has reached the end of the vector with an iterator than with the other forms.
if ( i != v.end() )
{
// Do stuff.
}
All these ways have their advantages, but at the core they are very similar. Some of them don't work though (they cause so-called "undefined behaviour") when the vector is empty.
According to cppreference:
A pointer to an element of an array satisfies all requirements of LegacyContiguousIterator
which is the most powerful iterator category, as it encompasses the functionality of all the other iterator categories. So they can be one and the same; an iterator is just a means of making our code clear, concise and portable.
For example we could have some container "C"...
//template <typename T, int N> class C { //for static allocation
template <typename T> class C {
//T _data[N]; //for static allocation
T* _data; //need to dynamically allocate _data
public:
typedef T* iterator;
};
where C<int>::iterator would be an int* and there would be no difference.
Maybe we don't want/need the full power of a LegacyContiguousIterator, so we could redefine C<int>::iterator as another class that follows the outline for, say, LegacyForwardIterator. This new iterator class may redefine operator*. In this case the layout is implementation-dependent, and an int* may cause undefined behaviour when trying to access the elements.
This is why iterators should be preferred but in most cases they are going to be the same thing.
In both cases our container "C" will work just like other STL containers so long as we define all the other necessary member functions and typedefs.
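As a rough sketch of what "the other necessary member functions" might look like, here is the statically allocated variant with just enough added for range-for and the standard algorithms. FixedC and its members are illustrative names, and this is far from a complete container.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative only: a fixed-size container whose iterator is a plain T*.
template <typename T, std::size_t N>
class FixedC {
    T data_[N];   // static allocation, elements are contiguous
public:
    typedef T*       iterator;
    typedef const T* const_iterator;

    iterator       begin()       { return data_; }
    iterator       end()         { return data_ + N; }
    const_iterator begin() const { return data_; }
    const_iterator end()   const { return data_ + N; }

    T& operator[](std::size_t i) { assert(i < N); return data_[i]; }
    std::size_t size() const     { return N; }
};
```

Because FixedC<int, 3>::iterator is int*, an int* obtained from begin() and the iterator are literally the same object type here; that would stop being true the moment iterator became a class of its own.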
Consider this hypothetical implementation of vector:
template<class T> // ignore the allocator
struct vector
{
typedef T* iterator;
typedef const T* const_iterator;
template<class It>
void insert(iterator where, It begin, It end)
{
...
}
...
}
Problem
There is a subtle problem we face here:
There is the possibility that begin and end refer to items in the same vector, after where.
For example, if the user says:
vector<int> items;
for (int i = 0; i < 1000; i++)
items.push_back(i);
items.insert(items.begin(), items.end() - 2, items.end() - 1);
If It is not a pointer type, then we're fine.
But we don't know, so we must check that [begin, end) does not refer to a range already inside the vector.
But how do we do this? According to C++, if they don't refer to the same array, then pointer comparisons would be undefined!
So the compiler could falsely tell us that the items don't alias, when in fact they do, giving us unnecessary O(n) slowdown.
Potential solution & caveat
One solution is to copy the entire vector every time, to include the new items, and then throw away the old copy.
But that's very slow in scenarios such as in the example above, where we'd be copying 1000 items just to insert 1 item, even though we might clearly already have enough capacity.
Is there a generic way to (correctly) solve this problem efficiently, i.e. without suffering from O(n) slowdown in cases where nothing is aliasing?
You can use the predicates std::less etc, which are guaranteed to give a total order, even when the raw pointer comparisons do not.
From the standard [comparisons]/8:
For templates greater, less, greater_equal, and less_equal, the specializations for any pointer type yield a total order, even if the built-in operators <, >, <=, >= do not.
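A sketch of how that could be used for the aliasing check follows. points_into is a made-up helper, and note the assumption: the standard only promises a total order that agrees with the built-in operators where those are defined, so treating it as "address order" for unrelated pointers is something that holds on ordinary flat-memory implementations rather than a strict guarantee.

```cpp
#include <cassert>
#include <functional>   // std::less

// Hypothetical helper: does `p` point into the array range [first, last)?
// Uses std::less so the comparison is well-ordered even when `p` belongs
// to a different array.
template <class T>
bool points_into(const T* p, const T* first, const T* last)
{
    std::less<const T*> before;
    assert(!before(last, first));   // [first, last) must be a valid range
    return !before(p, first) && before(p, last);
}
```

Inside insert, one could test points_into(&*begin, data, data + size) for pointer-like It and take the slow copying path only when it returns true.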
But how do we do this? According to C++, if they don't refer to the same array, then pointer comparisons would be undefined!
Wrong. The pointer comparisons are unspecified, not undefined. From C++03 §5.9/2 [expr.rel]:
[...] Pointers to objects or functions of the same type (after pointer conversions) can be compared, with a result defined as follows:
[...]
- Other pointer comparisons are unspecified.
So it's safe to test if there is an overlap before doing the expensive-but-correct copy.
Interestingly, C99 differs from C++ in this, in that pointer comparisons between unrelated objects are undefined behavior. From C99 §6.5.8/5:
When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. [...] In all other cases, the behavior is undefined.
Actually, this would be true even if they were regular iterators. There's nothing stopping anyone doing
std::vector<int> v;
// fill v
v.insert(v.end() - 3, v.begin(), v.end());
Determining if they alias is a problem for any implementation of iterators.
However, the thing you're missing is that you're the implementation, so you don't have to use portable code. As the implementation, you can do whatever you want. You could say "Well, in my implementation, I follow x86 and < and > are fine to use for any pointers." And that would be fine.
For example, the following is possible:
std::set<int> s;
std::set<int>::iterator it = s.begin();
I wonder if the opposite is possible, say,
std::set<int>* pSet = it->getContainer(); // something like this...
No, there is no portable way to do this.
An iterator may not even have a reference to the container. For example, an implementation could use T* as the iterator type for both std::array<T, N> and std::vector<T>, since both store their elements as arrays.
In addition, iterators are far more general than containers, and not all iterators point into containers (for example, there are input and output iterators that read to and write from streams).
No. You must remember the container that an iterator came from, at the time that you find the iterator.
A possible reason for this restriction is that pointers were meant to be valid iterators and there's no way to ask a pointer to figure out where it came from (e.g. if you point 4 elements into an array, how from that pointer alone can you tell where the beginning of the array is?).
It is possible with at least one of the std iterators and some trickery.
The std::back_insert_iterator needs a pointer to the container to call its push_back method. Moreover, this pointer is only a protected member, so a derived class can reach it.
#include <iostream>
#include <iterator>
#include <vector>
template <typename Container>
struct get_a_pointer_iterator : std::back_insert_iterator<Container> {
typedef std::back_insert_iterator<Container> base;
get_a_pointer_iterator(Container& c) : base(c) {}
// container is the protected Container* member of std::back_insert_iterator
Container* getPointer(){ return base::container;}
};
int main() {
std::vector<int> x{1};
auto p = get_a_pointer_iterator<std::vector<int>>(x);
std::cout << (*p.getPointer()).at(0);
}
This is of course of no practical use, but merely an example of a std iterator that does carry a pointer to its container, though a rather special one (e.g. incrementing a std::back_insert_iterator is a no-op). The whole point of using iterators is not to know where the elements are coming from. On the other hand, if you ever wanted an iterator that lets you get a pointer to the container, you could write one.
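Such a hand-written iterator might look roughly like this. It is a bare-bones sketch: tracking_iterator is an invented name, it assumes the container has operator[] and size(), and it omits the typedefs and operators a real iterator would need to satisfy the standard iterator requirements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical iterator that remembers which container it came from.
template <class Container>
class tracking_iterator {
    Container*  c_;   // the owning container
    std::size_t i_;   // current position within it
public:
    tracking_iterator(Container& c, std::size_t i) : c_(&c), i_(i) {}

    typename Container::reference operator*() const
    {
        assert(i_ < c_->size());   // must not dereference past the end
        return (*c_)[i_];
    }
    tracking_iterator& operator++() { ++i_; return *this; }
    bool operator!=(const tracking_iterator& o) const
    {
        return c_ != o.c_ || i_ != o.i_;
    }

    Container* container() const { return c_; }   // the "getContainer" asked for
};
```

The price is an extra pointer per iterator, which is presumably one reason the standard iterators are not required to do this.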