Iterator vs. Reference vs. Pointer [closed]


Closed. This question is opinion-based. It is not currently accepting answers.
I have a class that spawns an arbitrary number of worker objects that compute their results into a std::vector. I'm going to remove some of the worker objects at certain points, but I'd like to keep their results in a certain ordering known only to the class that spawned them. Thus the output vectors are provided by class A.
I have (IMO) three options: I could have pointers to the vectors, references, or iterators as members. While the iterator option has certain drawbacks (the iterator could be incremented), I'm unsure whether pointers or references are clearer.
I feel references are better because they can't be NULL and a cruncher would require the presence of a vector.
What I'm most unsure about is the validity of the references. Will they be invalidated by some operations on the std::list< std::vector<int> >? Are those the same operations that invalidate the iterators of std::list? Is there another approach I don't see right now? Also, the coupling to a container doesn't feel right: I force a specific container on the Cruncher class.
Code provided for clarity:
#include <list>
#include <vector>
#include <boost/ptr_container/ptr_list.hpp>
class Cruncher {
    // The three candidate members; only one of them would be kept.
    std::vector<int>* numPointer;
    std::vector<int>& numRef;
    std::list< std::vector<int> >::iterator numIterator;
public:
    Cruncher(std::vector<int>*);
    Cruncher(std::vector<int>&);
    Cruncher(std::list< std::vector<int> >::iterator);
};
class A {
    std::list< std::vector<int> > container;
    boost::ptr_list< std::vector<int> > container2;
    std::vector<Cruncher> cruncherList;
};

If an iterator is invalidated, it would also invalidate a pointer/reference that the iterator was converted into. If you have this:
std::vector<T>::iterator it = ...;
T *p = &(*it);
T &r = *p;
if the iterator is invalidated (for example a call to push_back can invalidate all existing vector iterators), the pointer and the reference will also be invalidated.
From the standard 23.2.4.2/5 (vector capacity):
Notes: Reallocation invalidates all the references, pointers, and iterators referring to the elements in the sequence.
The same general principle holds for std::list. If an iterator is invalidated, the pointers and references the iterator is converted into are also invalidated.
The difference between std::list and std::vector is what causes iterator invalidation. A std::list iterator is valid as long as you don't remove the element it is referring to. So whereas std::vector<>::push_back can invalidate an iterator, std::list<>::push_back cannot.
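The difference can be checked with a small experiment. The following is only a sketch with hypothetical function names: it records the address of the first element, appends more elements, and compares. For the vector it only watches capacity() to detect a reallocation, because reading through the old pointer afterwards would be undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <vector>

// Returns true if the first element of a std::list keeps its address
// while `extra` more elements are appended.
bool list_element_stays_put(std::size_t extra) {
    std::list<int> l{42};
    const auto before = reinterpret_cast<std::uintptr_t>(&l.front());
    for (std::size_t i = 0; i < extra; ++i) l.push_back(static_cast<int>(i));
    return before == reinterpret_cast<std::uintptr_t>(&l.front());
}

// Returns true if appending `extra` elements forced the vector to
// reallocate, which is the event that invalidates iterators, pointers
// and references to its elements.
bool vector_reallocates(std::size_t extra) {
    std::vector<int> v{42};
    const std::size_t old_capacity = v.capacity();
    for (std::size_t i = 0; i < extra; ++i) v.push_back(static_cast<int>(i));
    return v.capacity() != old_capacity;
}
```

Calling v.reserve() up front with a large enough capacity avoids the reallocation, but only as long as the reserved capacity is never exceeded.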

If the parent vector's contents are reallocated after spawning your worker threads, then their pointers, references, iterators, or whatever are almost certainly invalid. A list MAY be different (given how its nodes are allocated), but I don't know; it may even be platform-dependent.
Basically, if you have multiple worker threads, it's probably safest to have a method on the parent class to dump the results back into, as long as the copy isn't that taxing. Sure, it's not as good as writing directly into the parent, but otherwise you need to ensure the container you're dumping into doesn't get "lost" on reallocation.
If the list you're using is guaranteed not to re-allocate its "other" space when members are added (or deleted), then that would achieve what you're looking for, but the vector is definitely unsafe. But either way, the way in which you access it (pointer, reference, or iterator) probably doesn't matter that much as long as your "root container" isn't going to move around its contents.
Edit:
As mentioned in the comments below, here's a block about the list from SGI's website (emphasis mine) :
Lists have the important property that insertion and splicing do not invalidate iterators to list elements, and that even removal invalidates only the iterators that point to the elements that are removed. The ordering of iterators may be changed (that is, list::iterator might have a different predecessor or successor after a list operation than it did before), but the iterators themselves will not be invalidated or made to point to different elements unless that invalidation or mutation is explicit.
So this basically says "use a list as your master store" and then each worker can dump into its own, and knows it won't get invalidated when another worker is completely done and their vector is deleted from the list.
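A minimal sketch of that arrangement, assuming a hypothetical Worker class that holds a reference to its output vector (the names are illustrative and not from the question):

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical worker: holds a reference to the vector it writes into.
class Worker {
    std::vector<int>& out_;
public:
    explicit Worker(std::vector<int>& out) : out_(out) {}
    void compute(int value) { out_.push_back(value); }
};

// Creates three output slots, erases the middle one, and keeps using
// the other two. Returns the total number of stored results.
int demo_master_store() {
    std::list<std::vector<int>> store;
    store.emplace_back();
    Worker first(store.back());
    store.emplace_back();
    auto middle = std::prev(store.end());  // iterator to the second slot
    store.emplace_back();
    Worker third(store.back());

    store.erase(middle);  // invalidates only references to the erased slot
    first.compute(1);
    third.compute(2);
    third.compute(3);
    return static_cast<int>(store.front().size() + store.back().size());
}
```

The remaining slots also keep their relative order in the list, which is the ordering the spawning class wanted to preserve.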

In the current version of C++ (i.e. no move constructors), pointers to items embedded in a std::list are invalidated along with the list iterators.
If however you used a std::list<std::vector<int>*>, then the vector* could move around but the vector would not, so your pointer to the vector would remain valid.
With the addition of move constructors in C++0x the vector content is likely to stay put unless the vector itself is resized, but any such assumption would be inherently non-portable.

I like the pointer parameter. It is a matter of style. I prefer this parameter type style:
Const reference: Large object is being passed for reading. Reference avoids wasteful copying. Looks just like pass-by-value at the point of call.
Pointer: Object is being passed for reading and writing. The call will have an "&" to get the pointer, so the writing is made obvious during code review.
Non-const reference: banned, because code review can't tell which parameters might get changed as a side effect.
As you say, an iterator creates a pointless dependency on the parent container type. (std::list is implemented as a doubly linked list, so only erasing an entry invalidates references to the vector stored in it. So it would work.)
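The parameter style described above might look like this in practice. This is only an illustration; the function names sum and append_squares are made up for the example.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Const reference: read-only access to a large object, no copy made.
int sum(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}

// Pointer: the callee writes to the object, so the caller has to
// spell out "&results" at the point of call.
void append_squares(const std::vector<int>& input, std::vector<int>* results) {
    for (int v : input) results->push_back(v * v);
}
```

At the call site, append_squares(in, &out) makes the written-to argument visible, while sum(out) reads like pass-by-value.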

Related

Sharing a `std::list` without adding a (redundant) reference to it

I have a container, let's say a std::list<int>, which I would like to share between objects. One of the objects is known to live longer than the others, so it will hold the container. In order to be able to access the list, the other objects may have a pointer to the list.
Since the holder object might get moved, I'll need to wrap the list with a unique_ptr:
class LongLiveHolder { std::unique_ptr<std::list<int>> list; };
class ShortLiveObject { std::list<int>& list; };
However, I don't really need the unique_ptr wrapper. Since the list probably just contains a [unique_ptr] pointer to the first node (and a pointer to the last node), I could, theoretically, have those pointers at the other objects:
class LongLiveHolder { std::unique_ptr<NonExistentListNode<int>> back; };
class ShortLiveObject { NonExistentListNode<int>& back; };
, which would save me a redundant dereference when accessing the list, except that I would no longer have the full std::list interface to use with the shorter-lived object- just the node pointers.
Can I somehow get rid of this extra layer of indirection, while still having the std::list interface in the shorter-lived object?

Preface
You may be overthinking the cost of the extra indirection from the std::unique_ptr (unless you have a lot of these lists and you know that usages of them will be frequent and intermixed with other procedures). In general, I'd first trust my compiler to do smart things. If you want to know the cost, do performance profiling.
The main purpose of the std::unique_ptr in your use-case is just to have shared data with a stable address when other data that reference it gets moved. If you use the list member of the long-lived object multiple times in a single procedure, you can possibly help your compiler to help you (and also get some nicer-to-read code) when you use the list through the long-lived object by making a variable in the scope of the procedure that stores a reference to the std::list pointed to by the std::unique_ptr like:
void fn(LongLiveHolder& holder) {
    // Dereference the unique_ptr once; get() would return a pointer, not a reference.
    auto& list {*holder.list};
    list.<some_operation_1>(...);
    list.<some_operation_2>(...);
    list.<some_operation_3>(...);
}
But again, you should inspect the generated machine code and do performance profiling if you really want to know what kind of difference it makes.
If Context Permits, Write your own List
You said:
However, I don't really need the unique_ptr wrapper. Since the list probably just contains a [unique_ptr] pointer to the first node (and a pointer to the last node), I could, theoretically, have those pointers at the other objects: [...]
Considering Changes in what is the First Node
What if the first node of the list is allowed to be deleted? What if a new node is allowed to be inserted at the beginning of the list? You'd need a very specific context for those not to be requirements. What you want in your short-lived object is a view abstraction that supports the same interface as the actual list but doesn't manage the lifetime of the list contents. If you implement the view abstraction as a pointer to the list's first node, then how will the view object know about changes to what the "real"/lifetime-managing list considers to be the first node? It can't, unless the lifetime-managing list keeps an internal list of all live views of itself and also updates those (which is itself a performance and space overhead). And even then, what about the reverse? If the view abstraction was used to change what's considered the first node, how would the lifetime-managing list know about that change? The simplest sane solution is to have an extra level of indirection: make the view point to the list instead of to what was the list's first node when the view was created.
Considering Requirements on Time Complexity of getting the list size
I'm pretty sure a std::list can't just hold pointers to the front and back nodes. For one thing, since C++11 requires std::list::size() to be O(1), std::list probably has to keep track of its size at all times in a counter member, either storing it in itself, or doing some kind of size-tracking in each node struct, or through some other implementation-defined mechanism. I'm pretty sure the simplest and most performant way to have multiple moveable references (non-const pointers) to something that needs to do this kind of bookkeeping is to just add another level of indirection.
You could try to "skip" the indirection layer required by the bookkeeping for specific cases that don't require that information, which is the iterators/node-pointers approach, which I'll comment on later. I can't think of a better place or way to store that bookkeeping other than with the collection itself. Ie. If the list interface has requirements that require such bookkeeping, an extra layer of indirection for each user of the list implementation has a very strong design rationale.
If Context Permits
If you don't care about having O(1) access to the size of your list, and you know that what is considered the first node will not change for the lifetime of the short-lived object, then you can write your own list class and list-view class and make your own context-specific optimizations. That's one of the big selling points of languages like C++: you get a nice standard library that does commonly useful things, and when you have a specific scenario where some features of those tools aren't required and are resulting in unnecessary overhead, you can build your own tool/abstraction (or possibly use someone else's library).
Commentary on std::unique_ptr + reference
Your first snippet works, but ShortLiveObject will probably be easier to work with if you use std::reference_wrapper, since the implicitly-declared copy-assignment operator is defined as deleted when there's a reference member. (Neither a reference member nor a std::reference_wrapper member gives you a default constructor.)
class LongLiveHolder { std::unique_ptr<std::list<int>> list; };
class ShortLiveObject { std::reference_wrapper<std::list<int>> list; };
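To make the point concrete, here is a rough sketch of the std::unique_ptr + std::reference_wrapper combination. The members are public and the constructor is added purely for the sake of the example; the check function is hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <list>
#include <memory>
#include <utility>

struct LongLiveHolder {
    std::unique_ptr<std::list<int>> list = std::make_unique<std::list<int>>();
};

struct ShortLiveObject {
    std::reference_wrapper<std::list<int>> list;
    explicit ShortLiveObject(std::list<int>& l) : list(l) {}
};

// Moves the holder and checks that the short-lived view still refers
// to the same list, and that views can be copy-assigned.
bool view_survives_holder_move() {
    LongLiveHolder holder;
    holder.list->push_back(1);
    ShortLiveObject view(*holder.list);

    LongLiveHolder moved(std::move(holder));  // the heap-allocated list stays put
    moved.list->push_back(2);

    ShortLiveObject other(*moved.list);
    other = view;  // compiles because reference_wrapper is assignable
    return &view.list.get() == moved.list.get()
        && other.list.get().size() == 2;
}
```

With a plain std::list<int>& member instead, the line other = view would not compile.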
Commentary on std::shared_ptr + std::weak_ptr
As @Adrian Maire suggested, a std::shared_ptr in the longer-lived object (which might move while the shorter-lived object exists) and a std::weak_ptr in the shorter-lived object is a working approach, but it probably has more overhead (at least from the ref-count) than using std::unique_ptr + a reference, and I can't think of any generalized pros, so I wouldn't suggest it unless you already had some other reason to use a std::shared_ptr. In the scenario you gave, I'm pretty sure you do not.
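For comparison, a sketch of the std::shared_ptr + std::weak_ptr variant. SharedHolder, WeakUser and try_push are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <list>
#include <memory>

struct SharedHolder {  // hypothetical long-lived owner
    std::shared_ptr<std::list<int>> list = std::make_shared<std::list<int>>();
};

struct WeakUser {  // hypothetical short-lived user
    std::weak_ptr<std::list<int>> list;
    // Appends if the list is still alive; reports whether it was.
    bool try_push(int value) {
        if (auto locked = list.lock()) {
            locked->push_back(value);
            return true;
        }
        return false;
    }
};

bool weak_user_demo() {
    WeakUser user;
    bool pushed_while_alive = false;
    {
        SharedHolder holder;
        user.list = holder.list;
        pushed_while_alive = user.try_push(7);
    }  // holder destroyed here, and the list with it
    return pushed_while_alive && !user.try_push(8);
}
```

Every access goes through lock(), which is where the extra ref-count traffic comes from. What it adds is that the user can detect that the list is gone, which only matters if the stated lifetime assumption does not hold.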
Commentary on Storing iterators/node-pointers in the short-lived object
@Daniel Langr already commented about this, but I'll try to expand.
Specifically for std::list, there is a possible standard-compliant solution (with several caveats) that doesn't have the extra indirection of the smart pointer. Caveats:
You must be okay with only having an iterator interface for the shorter-lived object (which you indicated that you are not).
The front and back iterators must be stable for the lifetime of the shorter-lived object. (The elements they point to must not be erased from the list, and the shorter-lived object won't see new list entries that are pushed to the front or back by someone using the longer-lived object.)
From cppreference.com's page for std::list's constructors:
After container move construction (overload (8)), references, pointers, and iterators (other than the end iterator) to other remain valid, but refer to elements that are now in *this. The current standard makes this guarantee via the blanket statement in [container.requirements.general]/12, and a more direct guarantee is under consideration via LWG 2321.
From cppreference.com's page for std::list:
Adding, removing and moving the elements within the list or across several lists does not invalidate the iterators or references. An iterator is invalidated only when the corresponding element is deleted.
But I am not a language lawyer. I could be missing something important.
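The guarantee quoted above can be exercised with a short check. This is a sketch with a hypothetical function name; it relies only on the move-construction behaviour described in the quotes.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <utility>

// Takes iterators to the first and last elements, move-constructs a
// new list from the old one, and checks that the iterators now refer
// to elements of the new list.
bool iterators_survive_list_move() {
    std::list<int> source{1, 2, 3};
    auto first = source.begin();
    auto last = std::prev(source.end());

    std::list<int> target(std::move(source));

    return first == target.begin() && *first == 1 && *last == 3;
}
```

Note that the end iterator is excluded from the guarantee, so a stored source.end() should not be reused after the move.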
Also, you replied to Daniel saying:
Some iterators get invalid when moving the container (e.g. insert_iterator) @DanielLangr
Yes, so if you want to be able to make std::insert_iterators, use the std::unique_ptr + reference approach and construct short-lived std::insert_iterators when needed instead of trying to store long-lived ones.

If the list owner will be moved, then you need some memory address to share somehow.
You already indicated the unique_ptr. It's a decent solution if the non-owners don't need to save it internally.
The std::shared_ptr is an obvious alternative.
Finally, you can have a std::shared_ptr in the owner object, and pass std::weak_ptr to non-owners.

Deleting STL-container element via iterator only

Is there a way to delete an element from an STL container (be it list, vector, ...) only via an iterator pointing to the element to be deleted, but without providing the container object it resides in (i.e. without directly using the container member function container<T>::iterator container<T>.erase(container<T>::iterator))?
(Follow-up to this question)

No, this is not possible.
Imagine what would have to happen in order for the standard to provide a way to do this. std::vector<T>::iterator could not be a simple T*. Instead it would have to contain enough information for the library to be able to "find" the vector to which it belongs, such as a pointer to the vector itself. Thus, if the standard imposed a requirement to make it possible to delete an element given only an iterator, it would force the standard library to add overhead that would slow down all users of the container.
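If that overhead is acceptable for a particular use, the extra information can be carried explicitly. The following is a sketch of a hypothetical ErasableHandle that bundles an iterator with a pointer to its container; it is not something the standard library provides.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Hypothetical handle: an iterator plus a pointer to its container.
template <typename Container>
struct ErasableHandle {
    Container* owner;
    typename Container::iterator pos;
    // Erases the element and returns the iterator following it.
    typename Container::iterator erase() { return owner->erase(pos); }
};

// Builds {10, 20, 30}, erases the middle element through a handle,
// and returns what is left.
std::list<int> erase_via_handle() {
    std::list<int> values{10, 20, 30};
    ErasableHandle<std::list<int>> handle{&values, std::next(values.begin())};
    handle.erase();  // removes 20
    return values;
}
```

The handle is twice the size of a bare iterator here, which is exactly the cost the answer describes.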

Constness of STL containers and their elements - when to use const?

I have been overthinking (some may say underthinking, let's see what happens) the const-ness of STL containers and their elements.
I have been looking for a discussion of this, but the results have been surprisingly sparse. So I'm not necessarily looking for a definite answer here, I'd be just as happy with a discussion that gets the gears in my head moving again.
Let's say I have a class that keeps std::strings in a std::vector. My class is a dictionary that reads words from a dictionary file. They will never be changed. So it seems prudent to declare it as
std::vector<const std::string> m_myStrings;
However, I've read scattered comments that you shouldn't use const elements in a std::vector, since the elements need to be assignable.
Question:
Are there cases when const elements are used in std::vector (excluding hacks etc)?
Are const elements used in other containers? If so, which ones, and when?
I'm primarily talking about value types as elements here, not pointers.

My class is a dictionary that reads words from a dictionary file. They will never be changed.
Encapsulation can help here.
Have your class keep a vector<string>, but make it private.
Then add an accessor to your class that returns a const vector<string> &, and make the callers go through that.
The callers cannot change the vector, and operator [] on the vector will hand them const string &, which is exactly what you want.
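A minimal sketch of that encapsulation, assuming a hypothetical Dictionary class (loading from a file is left out):

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

class Dictionary {
    std::vector<std::string> words_;  // mutable inside the class only
public:
    explicit Dictionary(std::vector<std::string> words)
        : words_(std::move(words)) {}

    // Callers only ever see the vector, and therefore its elements, as const.
    const std::vector<std::string>& words() const { return words_; }
};
```

With this, an expression such as d.words()[0] = "x" fails to compile, while the class itself can still fill or reorder words_ freely.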

No, for the reason you state.

In the context of std::vector, I don't think it makes sense to use a const qualifier with its template parameter because a std::vector is dynamic by nature and may be required to "move" in memory in order to "resize" itself.
In the C++03 standard, std::vector is guaranteed to be stored in contiguous memory. This almost requires that std::vector be implemented with some form of an array. But how can we create a dynamic size-changing array? We cannot simply "append" memory to the end of it; that would require either an additional node (and a linked list) or physically putting our additional entries at the end of the array, which would be either out of bounds or require us to reserve more memory in the first place.
Thus, I would assume that std::vector would need to allocate an additional array, copy or move its members over to the new array, and then delete the old one.
It is not guaranteed that a move or copy assignment for every template-able object for a std::vector would not change the underlying object being moved or copied. It is considered good form to add the const qualifier, but it is not required. Therefore, we cannot allow a std::vector<const T>.
Related: How is C++ std::vector implemented?

consider using
std::vector<std::shared_ptr<const std::string>>
instead?

QMap/QHash operator[] returned reference validity

I was wondering for how long the reference to a value inside a Qt container, especially a QHash or a QMap is valid. By valid I mean if it is guaranteed to still point to the correct location inside the map/hash after inserting or removing other elements.
Let's take the following code:
QHash<char,int> dict; // or QMap<char,int> dict;
dict.insert('a', 1);
int& val(dict['a']);
dict.insert('b', 2);
val = 3; // < will this work or lead to a segfault
Will setting the value at the last line correctly update the value associated with a to 3 or will it lead to a segfault or will it be undefined (so work sometimes, segfault other times, depending on whether the data structure had to be reorganized internally, like resizing of the hash-table array). Is the behavior the same for QMap and QHash, or will one work and the other not?

This is fully covered in the documentation — you must have missed it!
Iterators of both types are invalidated when the data in the container
is modified or detached from implicitly shared copies due to a call to
a non-const member function.
So, although I would expect iterators/references to remain valid in practice in the scenario you described above, you shall not rely on this. Using them in this way shall invoke Undefined Behaviour.
This holds for QHashIterator and QMutableHashIterator, as well as bare references. Beware of non-authoritative references claiming the opposite, relying on implementation details that may change at any time.

There is nothing wrong with using references to QMap/QHash elements unless you delete the node you are referring to. The elements of Qt containers do not get reallocated every time a new element is inserted. However, I cannot see any good reason for using references to container elements.
For more details check this excellent article about qt containers internal implementation

Pointers to elements of STL containers

Given an STL container (you may also take boost::unordered_map and boost::multi_index_container into account) that is non-contiguous, is it guaranteed that the memory addresses of the elements inside the container never change if no element is removed (but new ones can be added)?
e.g.
class ABC { };
//
//...
//
std::list<ABC> abclist;
ABC abc;
abclist.push_back(abc); // std::list::insert would also need a position iterator
ABC * abc_ptr = &(*abclist.begin());
In other words, will abc_ptr keep pointing to the copy of abc stored in the list throughout the execution, as long as I do not remove it from abclist?
I am asking this because I am going to wrap the class ABC in C++/Cli, so I need pointers to the ABC instances in the wrapper class. ABC is a simple class and I want the container to handle the memory. If the answer is no then I will use std::list<ABC*>.

std::list, std::set, and std::map guarantee that iterators (and plain pointers) to an element will not be invalidated when a new element is added, or even when other elements are removed.

As Armen mentioned, std::list, std::set, and std::map are guaranteed to invalidate only the iterator to the removed element. In the case of boost::unordered_map, the modifiers may indeed invalidate iterators.
http://www.boost.org/doc/libs/1_38_0/doc/html/boost/unordered_map.html

The C++ Standard places stringent rules on the validity of references / iterators. For each container, each method documents which elements may be moved (invalidating references and iterators).
The Node Based Containers: list, map, set, multimap and multiset guarantee that references and iterators to elements will remain valid as long as the element is not removed from the container.
Your use case is therefore one of the corner cases where using a list for storage is good, because of the invalidation guarantees that list offer.
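The same guarantee can be illustrated for std::map. This is a sketch with a hypothetical function name: it takes a pointer to one mapped value, inserts many other elements, erases an unrelated one, and checks that the pointer still refers to the same element.

```cpp
#include <cassert>
#include <map>
#include <string>

bool map_pointer_stays_valid() {
    std::map<int, std::string> m;
    m[0] = "first";
    std::string* p = &m[0];  // pointer to an element stored in a map node

    for (int i = 1; i <= 1000; ++i) m[i] = "x";  // insertions
    m.erase(500);                                // removal of another element

    return p == &m[0] && *p == "first";
}
```

Only erasing the element with key 0 itself would leave p dangling.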

I think it's better to use std::list<shared_ptr<ABC>> instead of passing a pointer.
It's good practice to delegate memory management (see Scott Meyers, Effective C++).
This has multiple advantages:
you can share them and pass them without the headache of freeing them
garbage collection of your pointers
you don't pass a pointer in the first place
