std::map, pointer to map key value, is this possible? - c++

std::map<std::string, std::string> myMap;
std::map<std::string, std::string>::iterator i = m_myMap.find(some_key_string);
if(i == m_imagesMap.end())
return NULL;
string *p = &i->first;
Is the last line valid?
I want to store this pointer p somewhere else, will it be valid for the whole program life?
But what will happen if I add some more elements to this map (with other unique keys) or remove some other keys, won’t it reallocate this string (key-value pair), so the p will become invalid?

Section 23.1.2#8 (associative container requirements):
The insert members shall not affect the validity of iterators and references to the container, and the erase members shall invalidate only iterators and references to the erased elements.
So yes storing pointers to data members of a map element is guaranteed to be valid, unless you remove that element.

First, maps are guaranteed to be stable; i.e. the iterators are not invalidated by element insertion or deletion (except the element being deleted of course).
However, stability of iterator does not guarantee stability of pointers! Although it usually happens that most implementations use pointers - at least at some level - to implement iterators (which means it is quite safe to assume your solution will work), what you should really store is the iterator itself.
What you could do is create a small object like:
struct StringPtrInMap
{
typedef std::map<string,string>::iterator iterator;
StringPtrInMap(iterator i) : it(i) {}
const string& operator*() const { return it->first; }
const string* operator->() const { return &it->first; }
iterator it;
}
And then store that instead of a string pointer.

If you're not sure which operations will invalidate your iterators, you can look it up pretty easily in the reference. For instance for vector::insert it says:
This effectively increases the vector size, which causes an automatic reallocation of the allocated storage space if, and only if, the new vector size surpases the current vector capacity. Reallocations in vector containers invalidate all previously obtained iterators, references and pointers.
map::insert on the other hand doesn't mention anything of the sort.
As Pierre said, you should store the iterator rather than the pointer, though.

Why are you wanting to do this?
You can't change the value of *p, since it's const std::string. If you did change it, then you might break the invariants of the container by changing the sort order of the elements.
Unless you have other requirements that you haven't given here, then you should just take a copy of the string.

Related

Is it safe to reference a value in unordered_map

For example:
#include <unordered_map>
class A{};
std::unordered_map<unsigned int, A> map {{0,{}},{1, {}},{2, {}}};
int main() {
A& a1 = map[1];
// some insert and remove operations ( key 1 never removed)
// ....
}
Is it safe to still use a1 to reference the value which key is "1", after a lot of insert operations?
In other word:
since std::vector will move elements if the capacity changed, a reference of it's element is not guarantee to be valid. Is this fact also fits for unordered_map?
Yes, it is safe. From the standard:
22.2.7 Unordered associative containers [unord.req]
The insert and emplace members shall not affect the validity of
references to container elements, but may invalidate all iterators to
the container. The erase members shall invalidate only iterators and
references to the erased elements, and preserve the relative order of
the elements that are not erased.
References are safe, but iterators are not!
Perhaps a "less authoritative" source, but easier to read:
References to elements in the unordered_map container remain valid in
all cases, even after a rehash
https://www.cplusplus.com/reference/unordered_map/unordered_map/operator[]/
after a lot of insert operations?
If you meant insert or emplace, then it's fine.
(emphasis mine)
If rehashing occurs due to the insertion, all iterators are invalidated. Otherwise iterators are not affected. References are not invalidated. Rehashing occurs only if the new number of elements is greater than max_load_factor()*bucket_count(). If the insertion is successful, pointers and references to the element obtained while it is held in the node handle are invalidated, and pointers and references obtained to that element before it was extracted become valid. (since C++17)

Containers and iterators (multiple choice)

STL iterators are used with container classes and are conceptually similar to pointers to specific elements stored in the container.
One of the statements below is true. Which one?
An iterator typically holds an address (pointer), and operator++ applied to the iterator always increases that address.
When iterator it goes out of scope in a program, it gets destructed, which automatically invokes delete it;.
For a valid STL container myC, when the expression myC.end()-myC.begin() is well-defined, it returns the same value as myC.size().
When a container goes out of scope, all iterators that point to it are automatically modified.
For a valid STL container myC, the iterator returned by myC.end() refers to the last valid element in myC.
Apparently the solution is 3. but I don't understand why. Can someone elaborate on why this is the case, and possibly show why the others are false as well?
Think of the requirements of the addresses of items in a linked-list (list). They don't need to be sequential in memory.
delete is something that's manually done on pointers, it wouldn't happen automatically (even if the pointer goes out of scope) (unless done in some API). Iterators are (generally) classes, so delete would not even apply. The iterator would get destructed though.
You can also probably classify a pointer as an iterator. But delete will still not be called automatically.
Note that this only applies to random access iterators. You can calculate the number of items in a container as follows:
int count = 0;
for (iterator it = begin(); it != end(); ++it, ++count) { }
so you increment begin() count times to get to end(),
so begin() + count = end(),
so end() - begin() = count, and count = size(),
so end() - begin() = size()
This is not the way C++ works. Although there are design patterns to achieve this behaviour, usually when modifying a class, it's your responsibility to ensure any dependent classes are updated if invalidated. When you'd try to use an iterator of a container that went out of scope, this would result in undefined behaviour.
end() is past the last element, probably with something like this in mind: (I'm sure among other reasons)
for (iterator it = begin(); it != end(); ++it)

vector::erase and reverse_iterator

I have a collection of elements in a std::vector that are sorted in a descending order starting from the first element. I have to use a vector because I need to have the elements in a contiguous chunk of memory. And I have a collection holding many instances of vectors with the described characteristics (always sorted in a descending order).
Now, sometimes, when I find out that I have too many elements in the greater collection (the one that holds these vectors), I discard the smallest elements from these vectors some way similar to this pseudo-code:
grand_collection: collection that holds these vectors
T: type argument of my vector
C: the type that is a member of T, that participates in the < comparison (this is what sorts data before they hit any of the vectors).
std::map<C, std::pair<T::const_reverse_iterator, std::vector<T>&>> what_to_delete;
iterate(it = grand_collection.begin() -> grand_collection.end())
{
iterate(vect_rit = it->rbegin() -> it->rend())
{
// ...
what_to_delete <- (vect_rit->C, pair(vect_rit, *it))
if (what_to_delete.size() > threshold)
what_to_delete.erase(what_to_delete.begin());
// ...
}
}
Now, after running this code, in what_to_delete I have a collection of iterators pointing to the original vectors that I want to remove from these vectors (overall smallest values). Remember, the original vectors are sorted before they hit this code, which means that for any what_to_delete[0 - n] there is no way that an iterator on position n - m would point to an element further from the beginning of the same vector than n, where m > 0.
When erasing elements from the original vectors, I have to convert a reverse_iterator to iterator. To do this, I rely on C++11's §24.4.1/1:
The relationship between reverse_iterator and iterator is
&*(reverse_iterator(i)) == &*(i- 1)
Which means that to delete a vect_rit, I use:
vector.erase(--vect_rit.base());
Now, according to C++11 standard §23.3.6.5/3:
iterator erase(const_iterator position); Effects: Invalidates
iterators and references at or after the point of the erase.
How does this work with reverse_iterators? Are reverse_iterators internally implemented with a reference to a vector's real beginning (vector[0]) and transforming that vect_rit to a classic iterator so then erasing would be safe? Or does reverse_iterator use rbegin() (which is vector[vector.size()]) as a reference point and deleting anything that is further from vector's 0-index would still invalidate my reverse iterator?
Edit:
Looks like reverse_iterator uses rbegin() as its reference point. Erasing elements the way I described was giving me errors about a non-deferenceable iterator after the first element was deleted. Whereas when storing classic iterators (converting to const_iterator) while inserting to what_to_delete worked correctly.
Now, for future reference, does The Standard specify what should be treated as a reference point in case of a random-access reverse_iterator? Or this is an implementation detail?
Thanks!
In the question you have already quoted exactly what the standard says a reverse_iterator is:
The relationship between reverse_iterator and iterator is &*(reverse_iterator(i)) == &*(i- 1)
Remember that a reverse_iterator is just an 'adaptor' on top of the underlying iterator (reverse_iterator::current). The 'reference point', as you put it, for a reverse_iterator is that wrapped iterator, current. All operations on the reverse_iterator really occur on that underlying iterator. You can obtain that iterator using the reverse_iterator::base() function.
If you erase --vect_rit.base(), you are in effect erasing --current, so current will be invalidated.
As a side note, the expression --vect_rit.base() might not always compile. If the iterator is actually just a raw pointer (as might be the case for a vector), then vect_rit.base() returns an rvalue (a prvalue in C++11 terms), so the pre-decrement operator won't work on it since that operator needs a modifiable lvalue. See "Item 28: Understand how to use a reverse_iterator's base iterator" in "Effective STL" by Scott Meyers. (an early version of the item can be found online in "Guideline 3" of http://www.drdobbs.com/three-guidelines-for-effective-iterator/184401406).
You can use the even uglier expression, (++vect_rit).base(), to avoid that problem. Or since you're dealing with a vector and random access iterators: vect_rit.base() - 1
Either way, vect_rit is invalidated by the erase because vect_rit.current is invalidated.
However, remember that vector::erase() returns a valid iterator to the new location of the element that followed the one that was just erased. You can use that to 're-synchronize' vect_rit:
vect_rit = vector_type::reverse_iterator( vector.erase(vect_rit.base() - 1));
From a standardese point of view (and I'll admit, I'm not an expert on the standard): From §24.5.1.1:
namespace std {
template <class Iterator>
class reverse_iterator ...
{
...
Iterator base() const; // explicit
...
protected:
Iterator current;
...
};
}
And from §24.5.1.3.3:
Iterator base() const; // explicit
Returns: current.
Thus it seems to me that so long as you don't erase anything in the vector before what one of your reverse_iterators points to, said reverse_iterator should remain valid.
Of course, given your description, there is one catch: if you have two contiguous elements in your vector that you end up wanting to delete, the fact that you vector.erase(--vector_rit.base()) means that you've invalidated the reverse_iterator "pointing" to the immediately preceeding element, and so your next vector.erase(...) is undefined behavior.
Just in case that's clear as mud, let me say that differently:
std::vector<T> v=...;
...
// it_1 and it_2 are contiguous
std::vector<T>::reverse_iterator it_1=v.rend();
std::vector<T>::reverse_iterator it_2=it_1;
--it_2;
// Erase everything after it_1's pointee:
// convert from reverse_iterator to iterator
std::vector<T>::iterator tmp_it=it_1.base();
// but that points one too far in, so decrement;
--tmp_it;
// of course, now tmp_it points at it_2's base:
assert(tmp_it == it_2.base());
// perform erasure
v.erase(tmp_it); // invalidates all iterators pointing at or past *tmp_it
// (like, say it_2.base()...)
// now delete it_2's pointee:
std::vector<T>::iterator tmp_it_2=it_2.base(); // note, invalid iterator!
// undefined behavior:
--tmp_it_2;
v.erase(tmp_it_2);
In practice, I suspect that you'll run into two possible implementations: more commonly, the underlying iterator will be little more than a (suitably wrapped) raw pointer, and so everything will work perfectly happily. Less commonly, the iterator might actually try to track invalidations/perform bounds checking (didn't Dinkumware STL do such things when compiled in debug mode at one point?), and just might yell at you.
The reverse_iterator, just like the normal iterator, points to a certain position in the vector. Implementation details are irrelevant, but if you must know, they both are (in a typical implementation) just plain old pointers inside. The difference is the direction. The reverse iterator has its + and - reversed w.r.t. the regular iterator (and also ++ and --, > and < etc).
This is interesting to know, but doesn't really imply an answer to the main question.
If you read the language carefully, it says:
Invalidates iterators and references at or after the point of the erase.
References do not have a built-in sense of direction. Hence, the language clearly refers to the container's own sense of direction. Positions after the point of the erase are those with higher indices. Hence, the iterator's direction is irrelevant here.

Does std::vector::swap invalidate iterators?

If I swap two vectors, will their iterators remain valid, now just pointing to the "other" container, or will the iterator be invalidated?
That is, given:
using namespace std;
vector<int> x(42, 42);
vector<int> y;
vector<int>::iterator a = x.begin();
vector<int>::iterator b = x.end();
x.swap(y);
// a and b still valid? Pointing to x or y?
It seems the std mentions nothing about this:
[n3092 - 23.3.6.2]
void swap(vector<T,Allocator>& x);
Effects:
Exchanges the contents and capacity()
of *this with that of x.
Note that since I'm on VS 2005 I'm also interested in the effects of iterator debug checks etc. (_SECURE_SCL)
The behavior of swap has been clarified considerably in C++11, in large part to permit the Standard Library algorithms to use argument dependent lookup (ADL) to find swap functions for user-defined types. C++11 adds a swappable concept (C++11 §17.6.3.2[swappable.requirements]) to make this legal (and required).
The text in the C++11 language standard that addresses your question is the following text from the container requirements (§23.2.1[container.requirements.general]/8), which defines the behavior of the swap member function of a container:
Every iterator referring to an element in one container before the swap shall refer to the same element in the other container after the swap.
It is unspecified whether an iterator with value a.end() before the swap will have value b.end() after the swap.
In your example, a is guaranteed to be valid after the swap, but b is not because it is an end iterator. The reason end iterators are not guaranteed to be valid is explained in a note at §23.2.1/10:
[Note: the end() iterator does not refer to any element, so it may be
invalidated. --end note]
This is the same behavior that is defined in C++03, just substantially clarified. The original language from C++03 is at C++03 §23.1/10:
no swap() function invalidates any references, pointers, or iterators referring to the elements of the containers being swapped.
It's not immediately obvious in the original text, but the phrase "to the elements of the containers" is extremely important, because end() iterators do not point to elements.
Swapping two vectors does not invalidate the iterators, pointers, and references to its elements (C++03, 23.1.11).
Typically the iterator would contain knowledge of its container, and the swap operation maintains this for a given iterator.
In VC++ 10 the vector container is managed using this structure in <xutility>, for example:
struct _Container_proxy
{ // store head of iterator chain and back pointer
_Container_proxy()
: _Mycont(0), _Myfirstiter(0)
{ // construct from pointers
}
const _Container_base12 *_Mycont;
_Iterator_base12 *_Myfirstiter;
};
All iterators that refer to the elements of the containers remain valid
As for Visual Studio 2005, I have just tested it.
I think it should always work, as the vector::swap function even contains an explicit step to swap everything:
// vector-header
void swap(_Myt& _Right)
{ // exchange contents with _Right
if (this->_Alval == _Right._Alval)
{ // same allocator, swap control information
#if _HAS_ITERATOR_DEBUGGING
this->_Swap_all(_Right);
#endif /* _HAS_ITERATOR_DEBUGGING */
...
The iterators point to their original elements in the now-swapped vector object. (I.e. w/rg to the OP, they first pointed to elements in x, after the swap they point to elements in y.)
Note that in the n3092 draft the requirement is laid out in §23.2.1/9 :
Every iterator referring to an
element in one container before the
swap shall refer to the same element
in the other container after the swap.

Persistant references in STL Containers

When using C++ STL containers, under what conditions must reference values be accessed?
For example are any references invalidated after the next function call to the container?
{
std::vector<int> vector;
vector.push_back (1);
vector.push_back (2);
vector.push_back (3);
vector[0] = 10; //modifies 0'th element
int& ref = vector[0];
ref = 10; //modifies 0'th element
vector.push_back (4);
ref = 20; //modifies 0'th element???
vector.clear ();
ref = 30; //clearly obsurd
}
I understand that in most implementations of the stl this would work, but I'm interested in what the standard declaration requires.
--edit:
Im interested becuase I wanted to try out the STXXL (http://stxxl.sourceforge.net/) library for c++, but I realised that the references returned by the containers were not persistent over multiple reads, and hence not compatible without making changes (however superficial) to my existing stl code. An example:
{
std::vector<int> vector;
vector.push_back (1);
vector.push_back (2);
int& refA = vector[0];
int& refB = vector[1]; //refA is not gaurenteed to be valid anymore
}
I just wanted to know if this meant that STXXL containers where not 100% compatible, or indeed if I had been using STL containers in an unsafe/implementation dependant way the whole time.
About inserting into vectors, the standard says in 23.2.4.3/1:
[insert()] causes reallocation if the
new size is greater than the old
capacity. If no reallocation happens,
all the iterators and references
before the insertion point remain
valid.
(Although this in fact this talks about insert(), Table 68 indicates that a.push_back(x) must be equivalent to a.insert(a.end(), x) for any vector a and value x.) This means that if you reserve() enough memory beforehand, then (and only then) iterators and references are guaranteed not to be invalidated when you insert() or push_back() more items.
Regarding removing items, 23.2.4.3/3 says:
[erase()] invalidates all the
iterators and references after the
point of the erase.
According to Table 68 and Table 67 respectively, pop_back() and clear() are equivalent to appropriate calls to erase().
Some basic rules for vector:
Reallocation invalidates all
references, pointers, and iterators
for elements of the vector.
Insertions may invalidate references,
pointers, and iterators.
Inserting or removing elements
invalidates references, pointers, and
iterators that refer to the following
elements.
If an insertion causes reallocation,
it invalidates all references,
iterators, and pointers.
I expect that references would be invalidated only by any explicit or implicit resize() (see also the max_size, capacity, and reserve methods).
Vector will invalidate its iterator and references when it reallocates, which depends upon its current capacity. Although the above code might work in some cases, you shouldn't rely on this as the reference might be invalidated after the push_back(4) call.