The standard C++ containers offer only one version of operator[] for containers like vector<T> and deque<T>. It returns a T& (other than for vector<bool>, which I'm going to ignore), which is an lvalue. That means that in code like this,
vector<BigObject> makeVector(); // factory function
auto copyOfObject = makeVector()[0]; // copy BigObject
copyOfObject will be copy constructed. Given that makeVector() returns an rvalue vector, it seems reasonable to expect copyOfObject to be move constructed.
If operator[] for such containers was overloaded for rvalue and lvalue objects, then operator[] for rvalue containers could return an rvalue reference, i.e., an rvalue:
template<typename T>
container {
public:
T& operator[](int index) &; // for lvalue objects
T&& operator[](int index) &&; // for rvalue objects
...
};
In that case, copyOfObject would be move constructed.
Is there a reason this kind of overloading would be a bad idea in general? Is there a reason why it's not done for the standard containers in C++14?
Converting comment into answer:
There's nothing inherently wrong with this approach; class member access follows a similar rule (E1.E2 is an xvalue if E1 is an rvalue and E2 names a non-static data member and is not a reference, see [expr.ref]/4.2), and elements inside a container are logically similar to non-static data members.
A significant problem with doing it for std::vector or other standard containers is that it will likely break some legacy code. Consider:
void foo(int &);
std::vector<int> bar();
foo(bar()[0]);
That last line will stop compiling if operator[] on an rvalue vector returned an xvalue. Alternatively - and arguably worse - if there is a foo(const int &) overload, it will silently start calling that function instead.
Also, returning a bunch of elements in a container and only using one element is already rather inefficient. It's arguable that code that does this probably doesn't care much about speed anyway, and so the small performance improvement is not worth introducing a potentially breaking change.
I think you will leave the container in an invalid state if you move out one of the elements, I would argue the need to allow that state at all. Second, if you ever need that, can't you just call the new object's move constructor like this:
T copyObj = std::move(makeVector()[0]);
Update:
Most important point is, again in my opinion, that containers are containers by their nature, so they should not anyhow modify the elements inside them. They just provide a storage, iteration mechanism, etc.
Related
In the C++ standard vector.capacity section, it defines two overloads for resize(). (see also https://en.cppreference.com/w/cpp/container/vector/resize)
This overload requires that the type T is MoveInsertable and DefaultInsertable:
constexpr void resize(size_type sz);
The other overload requires that the type T is CopyInsertable:
constexpr void resize(size_type sz, const T& c);
My question is that why doesn't the second overload try to move from the existing values from the old vector, and further copy-insert new values from the supplied c argument. Wouldn't that be more efficient?
I agree that the second parameter could be an rvalue reference - this saves one copy, when it is subsequently copied from the first newly appended item. I assume what you have in mind is something like this:
std::vector<LargeObject> v;
// ...
LargeObject obj = setupLotsOfResources();
// Now do 1 move and 9 copies instead of 10 copies
v.resize(10, std::move(obj));
However, I would consider this an edge case, and working with rvalue references that are used for 1 move construction and N-1 copies is quite a confusing API. As you are free to use the std::vector API with what is already there such that for the above example, you will have 1 move and N-1 copies, I believe the rationale behind the existing function signature is ease of use and a straightforward signature that doesn't require much studying of the specs to understand what it does.
I have trouble finding the semantics of the reference type trait of an iterator. Let's say I want to implement a chunk iterator, that, given a position into a range, will give me chunks of that range:
template<class T, int N>
class chunk_iterator {
public:
using reference = std::span<T,N>;
chunk_iterator(T* ptr): ptr(ptr) {}
chunk_iterator operator++() { ptr += N; return *this; }
reference operator*() const { return {ptr,N}; }
private:
T* ptr;
};
The problem that I see here is that std::span is a view-like thing, but it does not behave like a reference (say a std::array<T,N>& in this case). In particular, if I assign to a span, the assignement is shallow, it will not copy the value.
Is std::span a valid iterator::reference type? Are view and reference semantics explained in detail somewhere?
What should I do to solve my problem? Implement a span_ref with proper reference semantics? It it already implemented in some library? Is a non-native reference type even allowed?
(note: solving the problem by storing a std::array<T,N> and returning a std::array<T,N>& in operator* is doable, but ugly, and if N is not known at compile time, storing instead a std::vector<T> with dynamic memory allocation is just plain wrong)
When talking about standard-compliant iterators, it depends on several things.
For conforming Iterators, it almost doesn't matter what the reference type is because the standard does not require any usage semantics for the reference type. But that also means nobody except you knows how to use your iterator.
For conforming Input Iterators, the reference type must meet the semantics specified. Notice that for LegacyInputIterator, the expression *it must be a reference that is usable as a reference with all the normal semantics, otherwise code that uses your iterator will not behave as expected. This means reading from a reference is akin to reading from a built-in reference. In particular, the following should do "normal" things:
auto value = *itr; // this should read a value
In this situation, a view type like span wouldn't work because span is more like a pointer than a reference: in the above snippet value would be a span, not whatever the span refers to.
For conforming Output Iterators, the reference type has no requirements. In fact, standard LegacyOutputIterators like std::back_insert_iterator have void as a reference type.
For conforming Forward Iterators and above, the standard actually requires the reference be a built-in reference. This is to support uses like below:
auto& ref = *itr;
auto ptr = &ref; // this must create a pointer pointing to the original object
auto ref2 = *ptr; // this must create a second, equivalent reference
auto other = std::move( ref ); // this must do a "move", which may be the same as a copy
ref = other; // this must assign "other"'s value back into the referred-to object
If the above didn't work correctly, many of the standard algorithms wouldn't be possible to write generically.
Speaking to span specifically, it acts more like a pointer than a reference logically. It can be re-assigned to point to something else. Taking its address creates a pointer to the span, not a pointer to the container being spanned over. Calling std::move on a span copies the span, and doesn't move the contents of the spanned range. A built-in reference T& will only refer to one thing ever once it's been created.
Creating a non-conforming reference that actually works with standard algorithms would involve a family of types overloading operator*, operator->, and operator&, operator=, and std::move, and modeling pointers, lvalue references, and rvalue references.
The meaning of an iterator's reference type cannot be understood without comprehending its relationship to the iterator's value_type. An iterator is a construct that represents a position within a sequence of value_types. A reference is a mediator within this paradigm; it is a thing that acts like a value_type (const) &. Until you figure out what your value_type is going to be, you can't decide what your reference will need to look like.
What "acts like" means depends on what kind of iterator we're talking about.
For C++11, the InputIterator category requires that reference be a type which is implicitly convertible to a value_type. For the OutputIterator category, reference is required to be a type which is assignable from a value_type.
For all of the more restricted iterator categories (ForwardIterator and above), reference is required to be exactly one of value_type & (if you can write to the sequence) or value_type const & (if you can only read from the sequence).
Iterators where reference is not a value_type (const) & are often called proxy iterators, as the reference type typically acts as a "proxy" for the actual data stored in the sequence (assuming the iterator isn't just inventing values to begin with). Proxy iterators are often used for cases where the iterator doesn't iterate over a range of actual value_types, but simply pretends to. This could be the bitwise iterators of vector<bool> or an iterator that iterates over the sequence of integers on some half-open range [0, N).
But proxy iterator references have to act like language references to one degree or another. InputIterator references have to be implicitly convertible to the value_type. span<T, N> is not implicitly convertible to array<T, N> or any other container type that would be appropriate for a value_type. OutputIterator references have to be assignable from value_type. And while span<T, N> may be assignable from an array<T, N>, the assignment operation doesn't have the same meaning. To assign to an OutputIterator's reference ought to change the values stored within the sequence. And this doesn't.
In any case, you first need to invent a value_type that does what you need it to do. Then you need to build a proper reference type that acts like a reference. Lastly... well, you can't make your iterator a ForwardIterator or higher, because C++11 doesn't support proxy iterators of the most useful iterator categories. C++20's new formulation of iterators allows proxy iterators for anything that isn't a contiguous_iterator.
The standard C++ containers offer only one version of operator[] for containers like vector<T> and deque<T>. It returns a T& (other than for vector<bool>, which I'm going to ignore), which is an lvalue. That means that in code like this,
vector<BigObject> makeVector(); // factory function
auto copyOfObject = makeVector()[0]; // copy BigObject
copyOfObject will be copy constructed. Given that makeVector() returns an rvalue vector, it seems reasonable to expect copyOfObject to be move constructed.
If operator[] for such containers was overloaded for rvalue and lvalue objects, then operator[] for rvalue containers could return an rvalue reference, i.e., an rvalue:
template<typename T>
container {
public:
T& operator[](int index) &; // for lvalue objects
T&& operator[](int index) &&; // for rvalue objects
...
};
In that case, copyOfObject would be move constructed.
Is there a reason this kind of overloading would be a bad idea in general? Is there a reason why it's not done for the standard containers in C++14?
Converting comment into answer:
There's nothing inherently wrong with this approach; class member access follows a similar rule (E1.E2 is an xvalue if E1 is an rvalue and E2 names a non-static data member and is not a reference, see [expr.ref]/4.2), and elements inside a container are logically similar to non-static data members.
A significant problem with doing it for std::vector or other standard containers is that it will likely break some legacy code. Consider:
void foo(int &);
std::vector<int> bar();
foo(bar()[0]);
That last line will stop compiling if operator[] on an rvalue vector returned an xvalue. Alternatively - and arguably worse - if there is a foo(const int &) overload, it will silently start calling that function instead.
Also, returning a bunch of elements in a container and only using one element is already rather inefficient. It's arguable that code that does this probably doesn't care much about speed anyway, and so the small performance improvement is not worth introducing a potentially breaking change.
I think you will leave the container in an invalid state if you move out one of the elements, I would argue the need to allow that state at all. Second, if you ever need that, can't you just call the new object's move constructor like this:
T copyObj = std::move(makeVector()[0]);
Update:
Most important point is, again in my opinion, that containers are containers by their nature, so they should not anyhow modify the elements inside them. They just provide a storage, iteration mechanism, etc.
As of GCC 4.9.2, it's now possible to compile C++11 code that inserts or erases container elements via a const_iterator.
I can see how it makes sense for insert to accept a const_iterator, but I'm struggling to understand why it makes sense to allow erasure via a const_iterator.
This issue has been discussed previously, but I haven't seen an explanation of the rationale behind the change in behaviour.
The best answer I can think of is that the purpose of the change was to make the behaviour of const_iterator analogous to that of a const T*.
Clearly a major reason for allowing delete with const T* is to enable declarations such as:
const T* x = new T;
....
delete x;
However, it also permits the following less desirable behaviour:
int** x = new int*[2];
x[0] = new int[2];
const int* y = x[0];
delete[] y; // deletes x[0] via a pointer-to-const
and I'm struggling to see why it's a good thing for const_iterator to emulate this behaviour.
erase and insert are non-const member functions of the collection. Non-const member functions are the right way to expose mutating operations.
The constness of the arguments are irrelevant; they aren't being used to modify anything. The collection can be modified because the collection is non-const (held in the hidden this argument).
Compare:
template <typename T>
std::vector<T>::iterator std::vector<T>::erase( std::vector<T>::const_iterator pos );
^^^^^ ok
to a similar overload that is NOT allowed
template <typename T>
std::vector<T>::iterator std::vector<T>::erase( std::vector<T>::iterator pos ) const;
wrong ^^^^^
The first question here is whether constness of an iterator is supposed to imply constnes of the entire container or just constness of the elements being iterated over.
Apparently, it has been decided that the latter is the right approach. Constness of an iterator does not imply constness of the container, it only implies constness of container's elements.
Since the beginning of times, construction of an object and its symmetrical counterpart - destruction of an object - were considered meta-operations with regards to the object itself. The object has always been supposed to behave as mutable with regard to these operations, even if it was declared as const. This is the reason you can legally modify const objects in their constructors and destructors. This is the reason you can delete an object of type T through a const T * pointer.
Given the resolution referred to by 1, it becomes clear that 2 should be extended to iterators as well.
I'd like to use the following idiom, that I think is non-standard. I have functions which return vectors taking advantage of Return Value Optimization:
vector<T> some_func()
{
...
return vector<T>( /* something */ );
}
Then, I would like to use
vector<T>& some_reference;
std::swap(some_reference, some_func());
but some_func doesn't return a LValue. The above code makes sense, and I found this idiom very useful. However, it is non-standard. VC8 only emits a warning at the highest warning level, but I suspect other compilers may reject it.
My question is: Is there some way to achieve the very same thing I want to do (ie. construct a vector, assign to another, and destroy the old one) which is compliant (and does not use the assignment operator, see below) ?
For classes I write, I usually implement assignment as
class T
{
T(T const&);
void swap(T&);
T& operator=(T x) { this->swap(x); return *this; }
};
which takes advantage of copy elision, and solves my problem. For standard types however, I really would like to use swap since I don't want an useless copy of the temporary.
And since I must use VC8 and produce standard C++, I don't want to hear about C++0x and its rvalue references.
EDIT: Finally, I came up with
typedef <typename T>
void assign(T &x, T y)
{
std::swap(x, y);
}
when I use lvalues, since the compiler is free to optimize the call to the copy constructor if y is temporary, and go with std::swap when I have lvalues. All the classes I use are "required" to implement a non-stupid version of std::swap.
Since std::vector is a class type and member functions can be called on rvalues:
some_func().swap(some_reference);
If you don't want useless copies of temporaries, don't return by value.
Use (shared) pointers, pass function arguments by reference to be filled in, insert iterators, ....
Is there a specific reason why you want to return by value?
The only way I know - within the constraints of the standard - to achieve what you want are to apply the expression templates metaprogramming technique: http://en.wikipedia.org/wiki/Expression_templates Which might or not be easy in your case.