Get a pointer to STL container an iterator is referencing?


For example, the following is possible:
std::set<int> s;
std::set<int>::iterator it = s.begin();
I wonder if the opposite is possible, say,
std::set<int>* pSet = it->getContainer(); // something like this...

No, there is no portable way to do this.
An iterator may not even have a reference to the container. For example, an implementation could use T* as the iterator type for both std::array<T, N> and std::vector<T>, since both store their elements as arrays.
In addition, iterators are far more general than containers, and not all iterators point into containers (for example, there are input and output iterators that read to and write from streams).

No. You must remember the container that an iterator came from, at the time that you find the iterator.
A possible reason for this restriction is that pointers were meant to be valid iterators and there's no way to ask a pointer to figure out where it came from (e.g. if you point 4 elements into an array, how from that pointer alone can you tell where the beginning of the array is?).
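A minimal sketch of "remembering the container" at the time the iterator is found. The names tracked_iterator and tracked_find are hypothetical helpers invented for this illustration, not standard library facilities:

```cpp
#include <cassert>
#include <set>

// Hypothetical helper (not part of the standard library): bundles an
// iterator with a pointer to the container it was obtained from.
template <typename Container>
struct tracked_iterator {
    Container* owner;                   // recorded when the iterator is found
    typename Container::iterator pos;   // the iterator itself
};

// Hypothetical factory: looks up a key and records which container was searched.
template <typename Container, typename Key>
tracked_iterator<Container> tracked_find(Container& c, const Key& key) {
    return tracked_iterator<Container>{&c, c.find(key)};
}

// Usage sketch: the container is recovered from the stored pointer.
inline void example_tracked_find() {
    std::set<int> s{1, 2, 3};
    tracked_iterator<std::set<int>> t = tracked_find(s, 2);
    assert(t.owner == &s);
    assert(t.pos != t.owner->end() && *t.pos == 2);
}
```

The cost is one extra pointer per iterator, which is exactly the overhead a plain iterator is allowed to avoid.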

It is possible with at least one of the std iterators and some trickery.
std::back_insert_iterator needs a pointer to the container in order to call its push_back method. Moreover, this pointer is a protected member (named container), so a derived class can access it.
#include <iostream>
#include <iterator>
#include <vector>
// Derives from back_insert_iterator to expose its protected `container` member.
template <typename Container>
struct get_a_pointer_iterator : std::back_insert_iterator<Container> {
    typedef std::back_insert_iterator<Container> base;
    get_a_pointer_iterator(Container& c) : base(c) {}
    Container* getPointer() { return base::container; }
};
int main() {
    std::vector<int> x{1};
    auto p = get_a_pointer_iterator<std::vector<int>>(x);
    std::cout << (*p.getPointer()).at(0);
}
This is of course of no practical use, but merely an example of a std iterator that does carry a pointer to its container, though a quite special one (e.g. incrementing a std::back_insert_iterator is a no-op). The whole point of using iterators is not to know where the elements are coming from. On the other hand, if you ever wanted an iterator that lets you get a pointer to the container, you could write one.

Related

Can std::vector<T>::iterator simply be T*?

Simple theoretical question: would a simple pointer be a valid iterator type for std::vector?
For other containers (e.g. list, map), that would not be possible, but for std::vector the held data is guaranteed to be contiguous, so I see no reason why not.
As far as I know, some implementations (e.g. Visual Studio) do some safety checks in debug builds. But that is in UB territory, so for well-defined behavior I think there is no difference.
Apart from some checks ("modifying" undefined behavior), are there any advantages of using a class instead of a simple pointer for vector iterators?

would a simple pointer be a valid iterator type for std::vector?
Yes. And also for std::basic_string and std::array.
are there any advantages of using a class instead of a simple pointer for vector iterators?
It offers some additional type safety, so that logic errors like the following don't compile:
std::vector<int> v;
int i=0;
int* p = &i;
v.insert(p, 1); // oops, not an iterator!
delete v.begin(); // oops!
std::string s;
std::vector<char> v;
// compiles if string and vector both use pointers for iterators:
v.insert(s.begin(), '?');
std::array<char, 2> a;
// compiles if array and vector both use pointers for iterators:
v.erase(a.begin());

Yes, it can be T*, but that has the slightly annoying property that the ADL-associated namespace of std::vector<int>::iterator is not std:: ! So swap(iter1, iter2) may fail to find std::swap.
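One common way around the ADL issue is the "using std::swap" idiom, sketched below. The function swap_iterators is a hypothetical name used only for this illustration:

```cpp
#include <cassert>
#include <utility>   // std::swap (C++11 and later)
#include <vector>

// Swaps two iterators of any kind. The using-declaration makes std::swap
// visible to unqualified lookup, so the call works even when Iter is a raw
// pointer (which gives ADL no namespace to search).
template <typename Iter>
void swap_iterators(Iter& a, Iter& b) {
    using std::swap;
    swap(a, b);
}

inline void example_swap_iterators() {
    int data[] = {10, 20};
    int* p = &data[0];
    int* q = &data[1];
    swap_iterators(p, q);            // raw pointers: found via using std::swap
    assert(*p == 20 && *q == 10);

    std::vector<int> v{1, 2};
    std::vector<int>::iterator i = v.begin();
    std::vector<int>::iterator j = v.begin() + 1;
    swap_iterators(i, j);            // works whether or not iterator is int*
    assert(*i == 2 && *j == 1);
}
```

Generic code written this way does not need to know whether the implementation chose T* or a class type for its iterators.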

Food for thought: an iterator class can also be implemented in terms of indexes instead of pointers.
Of course, when a vector reallocates, all pointers, references and iterators into it are invalidated.
But for iterators, at least, that doesn't always have to be the case. If the iterator holds an index plus a pointer to the vector, you can create iterators that survive reallocation and simply return (*m_vector)[m_index]. Such an iterator is invalid only when the vector has been destroyed or the index is out of range. In other words, it is invalid only if the expression vec[i] is invalid, regardless of reallocations.
This is a strictly non-standard implementation of a vector iterator, but it is nonetheless an advantage of a class-based iterator over a raw pointer.
Also, the standard doesn't say what should happen when an invalid iterator is dereferenced; it is UB, so throwing an exception or logging an error are both permitted.
An iterator that does bounds checking is slower, but in cases where speed is not important, a "safe but slow" iterator may have more advantages than an "unsafe but fast" one.

How to make idempotent taking a reference to a dereference of an iterator

The code below (-std=c++11) should work according to a "naive" view.
It doesn't (and it should be known and understood why it doesn't).
What is the shortest way of modifying the code (overloading &) to make it behave according to the "naive" view?
Shouldn't that be available as an option when creating an STL object (without writing too much)?
#include <iostream>
#include <vector>
int main(int argc, char **argv)
{ std::vector<int> A{10,20,30};
auto i=A.begin();
auto j=&*i;
std::cout<<"i==j gives "<<(i==j)<<std::endl;
return 0;
}

The problem cannot be solved. There are three reasons it cannot be solved.
First problem
The operator & you need to overload is the operator & for the element type of the vector. You cannot overload operator & for arbitrary types, and in particular you can't overload it for built-in types (like int in your example).
Second problem
Presumably you want this to work for std::vector, std::array, and built-in arrays? Also probably std::list, std::deque, etc.? You can't. The iterators for each of those containers will be different (in practice; in theory, some of them could share iterators, but I am not aware of any standard library where they do).
Third problem
If you were prepared to accept that this would only work for std::vector<MyType>, then you could overload MyType::operator & - but you still couldn't work out which std::vector<MyType> the MyType object lives in (and you need that to obtain the iterator).

First off, in your code snippet i is deduced as std::vector<int>::iterator and j is deduced as int*. The compiler doesn't know how to compare std::vector<int>::iterator against int*.
For this to work out, you could provide an overloaded operator== that would compare vector iterators against vector value type pointers in the following manner:
template<typename T>
bool operator==(typename std::vector<T>::iterator it, T *i) {
return &(*it) == i;
}
template<typename T>
bool operator==(T *i, typename std::vector<T>::iterator it) {
return it == i;
}

This shouldn't work, not even according to a "naive" view. Even though every pointer is an iterator, the reverse is not necessarily true. Why would you expect that to work?
It would work under two scenarios:
The iterator of the std::vector<T> implementation is actually a T*. Then your code would work, since decltype(i) and decltype(j) would both be int*. This MAY be the case for some implementations, but you shouldn't rely on it even if it is true for yours.
The dereference operator does not return an object of type T but rather something that is convertible to T and has an overloaded operator& which gives the iterator back. This is not the case, for very good reasons.
You could, as others have suggested, overload operator== to check whether both indirections (pointer and iterator) reference the same object. But I suspect that you want the address-of operator to give you back the iterator, which cannot be accomplished if the iterator is not a pointer, because the object type stored in the vector has no notion of the vector or its iterators.
The problem isn't in the equality operator, what I need is to define the dereference operator to give an iterator
You can't. The dereference operator in question belongs to std::vector<int>::iterator, which is part of the standard library, and you cannot (and should not) manipulate it.
Note that since C++11 in a std::vector<T, A>,
value_type is T and
reference is T&.
Furthermore, the following is true:
All input iterators i support *i which gives a value of type T which is the value type of that iterator.
The iterator of std::vector<T> is required to have T as its value type.
An iterator of std::vector<T> is an input iterator.
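These guarantees can be checked at compile time. The sketch below also shows a portable way to express the comparison from the question by comparing addresses; vec_iter and same_element are names invented for this illustration:

```cpp
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

typedef std::vector<int>::iterator vec_iter;

// These hold on any conforming implementation, whether or not vec_iter
// happens to be int*.
static_assert(std::is_same<std::iterator_traits<vec_iter>::value_type, int>::value,
              "value_type of vector<int>::iterator is int");
static_assert(std::is_same<std::vector<int>::reference, int&>::value,
              "vector<int>::reference is int&");
static_assert(std::is_same<decltype(*std::declval<vec_iter>()), int&>::value,
              "dereferencing the iterator yields int&");

// Portable replacement for the question's i == j: compare the address of
// the element the iterator refers to with the pointer.
inline bool same_element(vec_iter it, const int* p) {
    return &*it == p;
}
```

Because *i is a real int&, &*i is always a plain int*, which is why the address-of operator cannot hand the iterator back.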

C++ vector iterators vs. pointers

There are so many alternative ways of addressing elements of a vector.
I could use a pointer like so:
vector<int> v = {10, 11, 12};
int *p = &v[0];
cout << *p; //Outputs "10"
I could use a pointer this way too:
vector<int> v = {10, 11, 12};
vector<int>::pointer p = v.data();
cout << *p; //Outputs "10"
I could also use the iterator type:
vector<int> v = {10, 11, 12};
vector<int>::iterator i = v.begin();
cout << *i; //Outputs "10"
Are there any significant differences that I'm missing here?

As far as being able to perform the task at hand, they all work equally well. After all, they all provide an object which meets the requirements of an iterator and you are using them to point at the same element of the vector. However, I would pick the vector<int>::iterator option because the type is more expressive about how we intend to use it.
The raw pointer type, int*, tells you very little about what p is, except that it stores the address of an int. If you think about p in isolation, its type doesn't tell you very much about how you can use it. The vector<int>::pointer option has the same issue - it just expresses the type of the objects it points at as being the element type of a vector. There's no reason it actually needs to point into a vector.
On the other hand vector<int>::iterator tells you everything you need to know. It explicitly states that the object is an iterator and that iterator is used to point at elements in a vector<int>.
This also has the benefit of being more easily maintainable if you ever happen to change the container type. If you changed to a std::list, for example, the pointer type just wouldn't work any more because the elements are not stored as a contiguous array. The iterator type of a container always provides you with a type you can use to iterate over its elements.
When we have Concepts, I'd expect the best practice to be something like:
ForwardIteratorOf<int> it = std::begin(v);
where ForwardIteratorOf<int> (which I am imagining exists) is changed to whatever concept best describes your intentions for it. If the type of the elements doesn't matter, then just ForwardIterator (or BidirectionalIterator, RandomAccessIterator, or whatever).

If you add the check:
if ( !v.empty() )
Then all the examples you've shown are equally valid.
If you are about to iterate over the elements of the vector, I would go with:
vector<int>::iterator i = v.begin();
It's easier to check whether the iterator has reached the end of the vector with an iterator than with the other forms.
if ( i != v.end() )
{
// Do stuff.
}

All these ways have their advantages, but at the core they are very similar. Some of them don't work though (they cause so-called "undefined behaviour") when the vector is empty.
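To make the empty-vector caveat concrete, here is a small sketch of a guard around the pointer form. The helper first_or_null is hypothetical, introduced only for this example:

```cpp
#include <vector>

// Hypothetical helper: returns a pointer to the first element, or nullptr
// when the vector is empty, instead of evaluating v[0] on an empty vector
// (which would be undefined behaviour).
template <typename T>
T* first_or_null(std::vector<T>& v) {
    return v.empty() ? nullptr : &v[0];
}
```

With iterators the equivalent guard is the comparison against v.end(), so no extra helper is needed.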

According to cppreference:
A pointer to an element of an array satisfies all requirements of LegacyContiguousIterator
which is the most powerful iterator category, as it encompasses the functionality of all the other iterator categories. So they can be one and the same; an iterator is just a means of making our code clear, concise and portable.
For example we could have some container "C"...
//template <typename T, int N> class C { //for static allocation
template <typename T> class C {
    //T _data[N]; //for static allocation
    T* _data; //need to dynamically allocate _data
public:
    typedef T* iterator;
};
where C<int>::iterator would be an int* and there would be no difference.
Maybe we don't want/need the full power of a LegacyContiguousIterator, so we could redefine C<int>::iterator
as another class that follows the outline of, say, LegacyForwardIterator. This new iterator class may define its own operator*. In this case the behaviour is implementation dependent, and an int* may cause undefined behaviour when trying to access the elements.
This is why iterators should be preferred but in most cases they are going to be the same thing.
In both cases our container "C" will work just like other STL containers so long as we define all the other necessary member functions and typedefs.
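As a sketch of the statically allocated variant, here is a minimal container whose iterator is a plain pointer and which works with standard algorithms. The name fixed_block and its exact set of members are assumptions for illustration; a full container would need more typedefs and member functions:

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>

// Hypothetical fixed-size container whose iterator is a plain pointer.
template <typename T, std::size_t N>
class fixed_block {
    T data_[N];
public:
    typedef T value_type;
    typedef T* iterator;
    typedef const T* const_iterator;

    fixed_block() : data_() {}              // value-initialise every element
    iterator begin() { return data_; }
    iterator end() { return data_ + N; }
    const_iterator begin() const { return data_; }
    const_iterator end() const { return data_ + N; }
    std::size_t size() const { return N; }
};

inline int example_fixed_block() {
    fixed_block<int, 4> c;
    std::iota(c.begin(), c.end(), 1);               // 1 2 3 4
    std::reverse(c.begin(), c.end());               // 4 3 2 1
    return std::accumulate(c.begin(), c.end(), 0);  // sum is 10
}
```

Callers that only use fixed_block<int, 4>::iterator keep compiling if the typedef is later changed to a class type.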

Making a hash table of iterators in C++

I'm trying to accelerate a specific linked-list operation by hashing some of the node pointers. This is the code I'm using:
unordered_set< typename list< int >::iterator > myhashset;
In Visual Studio 2012, I get an "error C2338: The C++ Standard doesn't provide a hash for this type", since the compiler doesn't know how to hash iterators. Therefore, I need to implement my own hash function for list iterators like so:
struct X{int i,j,k;};
struct hash_X{
size_t operator()(const X &x) const{
return hash<int>()(x.i) ^ hash<int>()(x.j) ^ hash<int>()(x.k);
}
};
(wikipedia reference)
I'm having trouble figuring out what members of the iterator guarantee uniqueness (and, therefore, the members that I want to hash). Another concern is that those members may be private.
One solution that comes to mind is re-implementing list and list::iterator, but that seems like a hack and introduces more code to maintain.

Use the address of the element that the iterator refers to.
struct list_iterator_hash {
size_t operator()(const list<int>::iterator &i) const {
return hash<int*>()(&*i);
}
};
But this will only work for dereferenceable iterators, not end() or list<int>::iterator().
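A self-contained sketch of how this hash plugs into std::unordered_set. The typedef iterator_set and the example function are names made up for the illustration; equality comes from the iterators' own operator==:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <unordered_set>

// Hashes a list iterator by the address of the element it refers to.
// Only valid for dereferenceable iterators (never end()).
struct list_iterator_hash {
    std::size_t operator()(const std::list<int>::iterator& it) const {
        return std::hash<const int*>()(&*it);
    }
};

typedef std::unordered_set<std::list<int>::iterator, list_iterator_hash> iterator_set;

inline void example_iterator_set() {
    std::list<int> values{5, 5, 7};
    iterator_set seen;
    std::list<int>::iterator first = values.begin();
    std::list<int>::iterator second = first;
    ++second;
    seen.insert(first);
    seen.insert(first);     // duplicate, ignored
    seen.insert(second);    // equal value, different node
    assert(seen.size() == 2);
    assert(seen.count(first) == 1);
}
```

An iterator must be removed from the set before its node is erased from the list, since hashing it afterwards would dereference an invalid iterator.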

You can use pointers to the element in place of iterators. Let's say you had a list of structs MyStruct. You can use
unordered_set<MyStruct*> myhashset;
and the C++ standard library already implements std::hash for any pointer.
So if you ever need to insert or search listIt then use &(*listIt) which will get the pointer of type MyStruct*.

Why does std::vector transfer its constness to the contained objects?

A const int * and an int *const are very different. Similarly with const std::auto_ptr<int> vs. std::auto_ptr<const int>. However, there appears to be no such distinction with const std::vector<int> vs. std::vector<const int> (actually I'm not sure the second is even allowed). Why is this?
Sometimes I have a function which I want to pass a reference to a vector. The function shouldn't modify the vector itself (eg. no push_back()), but it wants to modify each of the contained values (say, increment them). Similarly, I might want a function to only change the vector structure but not modify any of its existing contents (though this would be odd). This kind of thing is possible with std::auto_ptr (for example), but because std::vector::front() (for example) is defined as
const T &front() const;
T &front();
rather than just
T &front() const;
There's no way to express this.
Examples of what I want to do:
//create a (non-modifiable) auto_ptr containing a (modifiable) int
const std::auto_ptr<int> a(new int(3));
//this works and makes sense - changing the value pointed to, not the pointer itself
*a = 4;
//this is an error, as it should be
a.reset();
//create a (non-modifiable) vector containing a (modifiable) int
const std::vector<int> v(1, 3);
//this makes sense to me but doesn't work - trying to change the value in the vector, not the vector itself
v.front() = 4;
//this is an error, as it should be
v.clear();

It's a design decision.
If you have a const container, it usually stands to reason that you don't want anybody to modify the elements that it contains, which are an intrinsic part of it. That the container completely "owns" these elements "solidifies the bond", if you will.
This is in contrast to the historic, lower-level "container" implementations (i.e. raw arrays), which are more hands-off. As you quite rightly say, there is a big difference between int const* and int * const. But standard containers simply choose to pass the constness on.

The difference is that pointers to int do not own the ints that they point to, whereas a vector<int> does own the contained ints. A vector<int> can be conceptualised as a struct with int members, where the number of members just happens to be variable.
If you want to create a function that can modify the values contained in the vector but not the vector itself then you should design the function to accept iterator arguments.
Example:
void setAllToOne(std::vector<int>::iterator begin, std::vector<int>::iterator end)
{
std::for_each(begin, end, [](int& elem) { elem = 1; });
}
If you can afford to put the desired functionality in a header, then it can be made generic as:
template<typename OutputIterator>
void setAllToOne(OutputIterator begin, OutputIterator end)
{
typedef typename std::iterator_traits<OutputIterator>::reference ref;
std::for_each(begin, end, [](ref elem) { elem = 1; });
}
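A usage sketch of the same idea, written with std::fill instead of for_each and a lambda. The name set_all_to_one is a variant invented here; the point is that a function given only a range can change values but has no way to call push_back or clear:

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <vector>

// The function sees only a range, so it can change element values but
// cannot alter the structure of whatever container owns them.
template <typename ForwardIterator>
void set_all_to_one(ForwardIterator first, ForwardIterator last) {
    std::fill(first, last, 1);
}

inline void example_set_all_to_one() {
    std::vector<int> v{3, 4, 5};
    set_all_to_one(v.begin(), v.end());
    assert(v.size() == 3);                       // structure untouched
    assert(v[0] == 1 && v[1] == 1 && v[2] == 1); // values modified

    std::list<int> l{9, 9};
    set_all_to_one(l.begin(), l.end());          // same code, other container
    assert(l.front() == 1 && l.back() == 1);
}
```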

One big problem syntactically with what you suggest is this: a std::vector<const T> is not the same type as a std::vector<T>. Therefore, you could not pass a vector<T> to a function that expects a vector<const T> without some kind of conversion. Not a simple cast, but the creation of a new vector<const T>. And that new one could not simply share data with the old; it would have to either copy or move the data from the old one to the new one.
You can get away with this with std::shared_ptr, but that's because those are shared pointers. You can have two objects that reference the same pointer, so the conversion from a std::shared_ptr<T> to shared_ptr<const T> doesn't hurt (beyond bumping the reference count). There is no such thing as a shared_vector.
std::unique_ptr works too because they can only be moved from, not copied. Therefore, only one of them will ever have the pointer.
So what you're asking for is simply not possible.

You are correct, it is not possible to have a vector of const int, primarily because the elements would not be assignable (a requirement on the element type of a vector).
If you want a function that only modifies the elements of a vector but does not add elements to the vector itself, this is primarily what the STL does for you: it has functions that are agnostic about which container a sequence of elements is contained in. The function simply takes a pair of iterators and does its thing for that sequence, completely oblivious to the fact that they are contained in a vector.
Look up "insert iterators" for getting to know about how to insert something into a container without needing to know what the elements are. E.g., back_inserter takes a container and all that it cares for is to know that the container has a member function called "push_back".
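A short sketch of an insert iterator in use. The function append_squares is a hypothetical example; it writes through an output iterator and never names the destination container, which here happens to be a std::deque:

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <iterator>
#include <vector>

// Writes the squares of [first, last) through an output iterator. The
// function never names the destination container.
template <typename InputIt, typename OutputIt>
OutputIt append_squares(InputIt first, InputIt last, OutputIt out) {
    return std::transform(first, last, out,
                          [](int x) { return x * x; });
}

inline void example_append_squares() {
    std::vector<int> src{1, 2, 3};
    std::deque<int> dst{0};
    // back_inserter only requires that dst has a push_back member.
    append_squares(src.begin(), src.end(), std::back_inserter(dst));
    assert(dst.size() == 4);
    assert(dst[1] == 1 && dst[2] == 4 && dst[3] == 9);
}
```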
