I've got a class which holds a container and an iterator into that container. How can I correctly implement the move constructor? I seem to recall that, per the Standard, you can't rely on iterators remaining valid after a move (which is so silly). Is there some means by which I can "update" the iterator if it was invalidated or something? Or will I have to dynamically allocate the container, move it, and then have the iterators remain valid that way?
Update: Using a std::unique_ptr as a holder for the container is the canonical generic solution - simply don't move the container, just transfer the ownership and swap the iterators. As you already said you can special-case this as an optimization, although I'd expect the generic solution to be also quite efficient and I'd only accept more complexity (aka bug-potential) to the code after proving that it's a real performance win for your use-case.
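For concreteness, here is a minimal sketch of that idea (the member names and the factory constructor are mine, not from your code): the container lives behind the std::unique_ptr, so moving the wrapper only transfers the pointer and the stored iterator stays valid.

#include <memory>
#include <utility>

template <class Container>
struct foo
{
    std::unique_ptr<Container>   c;
    typename Container::iterator i;

    explicit foo(Container init = Container())
        : c(new Container(std::move(init)))
        , i(c->begin())
    {}

    // The defaulted move operations transfer ownership of the pointer;
    // the Container object itself never moves, so 'i' remains valid.
    foo(foo&&)            = default;
    foo& operator=(foo&&) = default;
};

Moving a foo is then just a couple of pointer-sized moves, and no iterator ever needs to be recomputed.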
I'll leave the former answer below for future readers: Read it and the comments to see why other solutions are not really working and in which cases they cause trouble.
The obvious way to update the iterator would be:
Container c = ...;
Container::iterator it = ...;
const auto d = std::distance( c.begin(), it );
Container n = std::move(c);
it = n.begin();
std::advance( it, d );
which is generally linear, but constant when the iterator is a random access iterator.
Since you probably don't want to do that, you have two options which should help: either default-construct the new container and use swap (which does not invalidate iterators), or put the container into a std::unique_ptr and move that instead.
The first approach (swap) requires both objects to hold a full container instance, which might be a bit larger than the single pointer stored inside a std::unique_ptr. If you move your instances around very often, the std::unique_ptr-based approach seems preferable to me, although each access then requires one more pointer indirection. Judge (and measure) for yourself what fits best in your case.
I think the implicit guarantee on iterator invalidation holds for the move ctor. That is, the following should work for all containers but std::array:
template<class Container>
struct foo_base
{
    Container c;
    typename Container::iterator i;

    foo_base(foo_base&& rhs, bool is_end)
        : c( std::move(rhs.c) )
        , i( get_init(is_end, rhs.i) )
    {}

    typename Container::iterator get_init(bool is_end, typename Container::iterator ri)
    {
        using std::end; // enable ADL
        return is_end ? end(c) : ri;
    }
};
template<class Container>
struct foo : private foo_base<Container>
{
    foo(foo&& rhs)
        : foo_base(std::move(rhs), rhs.i == end(rhs.c))
    {}
};
The complicated initialization via a base class is necessary as move assignment isn't required to move if the allocator doesn't propagate for move-assignment. The check for the iterator is required as the end() iterator may be invalidated; this check has to be performed before the container is moved. If you can ensure however that the allocator propagates (or otherwise the move-assignment doesn't invalidate iterators for your cases), you can use the simpler version below, replacing the swap with a move-assignment.
N.B. The sole purpose of the get_init function is to enable ADL. If foo_base happened to have a member function named end, unqualified lookup would find it and ADL would be disabled. The using-declaration prevents unqualified lookup from stopping at such a member function, so ADL is always performed. You could just as well use std::end(c) and get rid of get_init, if you're comfortable with losing ADL here.
If it should turn out that there is no such implicit guarantee for the move ctor, there's still the explicit guarantee for swap. For this, you can use:
template<class Container>
struct foo
{
    Container c;
    typename Container::iterator i;

    foo(foo&& rhs)
    {
        using std::end; // enable ADL
        bool const is_end = (rhs.i == end(rhs.c));
        c.swap( rhs.c );
        i = get_init(is_end, rhs.i);
    }

    typename Container::iterator get_init(bool is_end, typename Container::iterator ri)
    {
        using std::end; // enable ADL
        return is_end ? end(c) : ri;
    }
};
However, a swap has some requirements, defined in [container.requirements.general]/7+8:
The behavior of a call to a container's swap function is undefined unless the objects being swapped have allocators that compare equal or allocator_traits<allocator_type>::propagate_on_container_swap::value is true
[...]
Any Compare, Pred, or Hash objects belonging to a and b shall be swappable and shall be exchanged by unqualified calls to non-member swap. If allocator_traits<allocator_type>::propagate_on_container_swap::value is true, then the allocators of a and b shall also be exchanged using an unqualified call to non-member swap. Otherwise, they shall not be swapped, and the behavior is undefined unless
a.get_allocator() == b.get_allocator().
I.e. the two containers must have equal allocators, unless the allocator propagates on container swap.
Move construction OTOH only requires that no exception is thrown (for allocator-aware containers); the allocator is always moved.
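As a small guard for the swap-based approach, one could spell out that precondition before calling swap; this is my own sketch, not part of the code above:

#include <cassert>
#include <memory>

template <class Container>
void checked_swap(Container& a, Container& b)
{
    using alloc_traits = std::allocator_traits<typename Container::allocator_type>;
    // Per [container.requirements.general]/7-8 quoted above, swapping is only
    // well-defined if the allocator propagates on swap or the allocators compare equal.
    assert(alloc_traits::propagate_on_container_swap::value ||
           a.get_allocator() == b.get_allocator());
    a.swap(b);
}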
Related
There's a similar question: check if elements of a range can be moved?
I don't think the answer in it is a nice solution. Actually, it requires partial specialization for all containers.
I made an attempt, but I'm not sure whether checking operator*() is enough.
// RangeType
using IteratorType = std::ranges::iterator_t<RangeType>;
using Type = decltype(*std::declval<IteratorType>());
constexpr bool canMove = std::is_rvalue_reference_v<Type>;
Update
The question could be split into 2 parts:
Could algorithms in the STL like std::copy/std::uninitialized_copy actually avoid an unnecessary deep copy when receiving r-value elements?
When receiving an r-value range, how can one check whether it is a range adaptor like std::ranges::subrange, or a container which holds ownership of its elements like std::vector?
template <typename InRange, typename OutRange>
void func(InRange&& inRange, OutRange&& outRange) {
    using std::begin;
    using std::end;
    std::copy(begin(inRange), end(inRange), begin(outRange));
    // Q1: if `*begin(inRange)` returns an r-value,
    // would move-assignment of the elements be called instead of a deep copy?
}
std::vector<int> vi;
std::list<int> li;
/* ... */
func(std::move(vi), li);
// Q2: Would the elements be shallow-copied (i.e. moved) from vi?
// And if not, how could I implement just a limited number of overloads, without one for every container?
// (define a concept (C++20) to describe those which take ownership of their elements)
Q1 is not a problem, as @Nicol Bolas, @eerorika and @Davis Herring pointed out, and it's not what puzzled me.
(But I do think the API naming is confusing; std::assign/std::uninitialized_construct might be more fitting names.)
@alfC has written a great answer to my question (Q2) and gives a fresh perspective (a move idiom for ranges that own their elements).
To sum up: for most of the current containers (especially those from the STL), and indeed for every range adaptor, a partial specialization/overload for each of them is the only solution, e.g.:
template <typename Range>
void func(Range&& range) { /*...*/ }

template <typename T>
void func(std::vector<T>&& movableRange) {
    auto movedRange = std::ranges::subrange{
        std::make_move_iterator(movableRange.begin()),
        std::make_move_iterator(movableRange.end())
    };
    func(movedRange);
}
// and also for `std::list`, `std::array`, etc...
I understand your point.
I do think that this is a real problem.
My answer is that the community has to agree on exactly what it means to move nested objects (such as containers).
In any case this needs the cooperation of the container implementors.
And, in the case of standard containers, good specifications.
I am pessimistic that standard containers can be changed to "generalize" the meaning of "move", but that can't prevent new user defined containers from taking advantage of move-idioms.
The problem is that nobody has studied this in depth as far as I know.
As it is now, std::move seems to imply a "shallow" move (one level of moving of the top "value type").
In the sense that you can move the whole thing but not necessarily individual parts.
This, in turn, makes it useless to try to "std::move" non-owning ranges or ranges that offer pointer/iterator stability.
Some libraries, e.g. those related to std::ranges, simply reject r-value references to ranges, which I think is only kicking the can down the road.
Suppose you have a container Bag.
What should std::move(bag)[0] and std::move(bag).begin() return? It is really up to the implementation of the container to decide what to return.
It is hard to reason about general data structures, but if the data structure is simple (e.g. a dynamic array), then for consistency with structs (std::move(s).field), std::move(bag)[0] should be the same as std::move(bag[0]); however, the standard already strongly disagrees with me here: https://en.cppreference.com/w/cpp/container/vector/operator_at
And it is possible that it is too late to change.
The same goes for std::move(bag).begin() which, by my logic, should return a move_iterator (or something of that sort).
To make things worse, std::array<T, N> works how I would expect (std::move(arr[0]) equivalent to std::move(arr)[0]).
However std::move(arr).begin() is a simple pointer, so it loses the "forwarding/move" information! It is a mess.
So, yes, to answer your question, you can check whether using Type = decltype(*std::forward<Bag>(bag).begin()); is an r-value, but more often than not it will not be implemented as an r-value.
That is, you have to hope for the best and trust that .begin and * are implemented in a very specific way.
You are in better shape by inspecting (somehow) the category of the range itself.
That is, currently you are left to your own devices: if you know that bag is bound to an r-value and the type is conceptually an "owning" value, you currently have to do the dance of using std::make_move_iterator.
I am currently experimenting a lot with custom containers that I have. https://gitlab.com/correaa/boost-multi
However, by trying to allow for this, I break behavior expected for standard containers regarding move.
Also once you are in the realm of non-owning ranges, you have to make iterators movable by "hand".
I found it empirically useful to distinguish top-level move (std::move) from element-wise move (e.g. bag.mbegin() or bag.moved().begin()).
Otherwise I find myself overloading std::move, which should be a last resort, if used at all.
In other words, in
template<class MyRange>
void f(MyRange&& r) {
    std::copy(std::forward<MyRange>(r).begin(), ..., ...);
}
the fact that r is bound to an r-value doesn't necessarily mean that the elements can be moved, because MyRange can simply be a non-owning view of a larger container that was "just" generated.
Therefore in general you need an external mechanism to detect if MyRange owns the values or not, and not just detecting the "value category" of *std::forward<MyRange>(r).begin() as you propose.
I guess with ranges one can hope in the future to indicate deep moves with some kind of adaptor-like thing "std::ranges::moved_range" or use the 3-argument std::move.
If the question is whether to use std::move or std::copy (or the ranges:: equivalents), the answer is simple: always use copy. If the range given to you has rvalue elements (i.e., its ranges::range_reference_t is either kind(!) of rvalue), you will move from them anyway (so long as the destination supports move assignment).
move is a convenience for when you own the range and decide to move from its elements.
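A minimal illustration of that point (my own example): once the source range's reference type is an rvalue, here obtained via std::make_move_iterator, a plain copy already move-assigns into the destination.

#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

int main()
{
    std::vector<std::string> src{"one", "two"};
    std::vector<std::string> dst(2);

    // The move_iterator range dereferences to std::string&&, so std::copy
    // performs move assignments; no separate "move" algorithm is needed.
    std::copy(std::make_move_iterator(src.begin()),
              std::make_move_iterator(src.end()),
              dst.begin());
    // dst now owns the strings; src's elements are left in a moved-from state.
}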
The answer to the question is: IMPOSSIBLE. At least for the current STL containers.
But what if we could add some restrictions to the Container requirements?
Add a static constant isContainer and make a RangeTraits. This may work well, but it is not the elegant solution I want.
Inspired by @alfC, I'm considering the proper behaviour of an r-value container itself, which may help in making a concept (C++20).
There actually is an approach to distinguish a container from a range adaptor, though it currently cannot be detected because of a defect in the existing implementations rather than in the syntax design.
First of all, the lifetime of elements cannot exceed that of their container, whereas it is unrelated to the lifetime of a range adaptor.
That means retrieving an element's address (by iterator or reference) from an r-value container is wrong behaviour.
One thing that is often neglected in the post-C++11 era is the ref-qualifier.
Lots of existing member functions, like std::vector::swap, should be marked as l-value qualified:
auto getVec() -> std::vector<int>;
//
std::vector<int> vi1;
//getVec().swap(vi1); // pre-11 grammar, should be deprecated now
vi1 = getVec(); // move-assignment since C++11
For compatibility reasons, however, this hasn't been adopted. (It's even more confusing that the ref-qualifier hasn't been widely applied to the newer additions like std::array and std::forward_list.)
E.g., it's easy to implement the subscript operator behaving as we would expect:
template <typename T>
class MyArray {
    T* _items;
    size_t _size;
    /* ... */
public:
    T& operator [](size_t index) & {
        return _items[index];
    }
    const T& operator [](size_t index) const& {
        return _items[index];
    }
    T operator [](size_t index) && {
        // not return by `T&&` !!!
        return std::move(_items[index]);
    }
    // or use `deducing this` since C++23
};
OK, then std::move(container)[index] returns the same result as std::move(container[index]) (well, not exactly: it may cost one additional move operation), which is convenient when we try to forward a container.
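A hypothetical usage sketch of the ref-qualified operator[] above (makeArray is an assumed factory, and the elided parts of MyArray are assumed to make it constructible; this is a compile-only fragment):

#include <string>
#include <utility>

MyArray<std::string> makeArray();   // assumed, not defined here

void demo()
{
    MyArray<std::string> arr = makeArray();
    std::string a = arr[0];             // l-value overload: copies the element
    std::string b = std::move(arr)[0];  // r-value overload: moves the element out
    std::string c = makeArray()[0];     // also the r-value overload, on a temporary
}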
However, how about begin and end?
template <typename T>
class MyArray {
    T* _items;
    size_t _size;
    /* ... */
    class iterator;
    class const_iterator;
    using move_iterator = std::move_iterator<iterator>;
public:
    iterator begin() & { /*...*/ }
    const_iterator begin() const& { /*...*/ }
    // may work well for an x-value, but what about a pr-value?
    move_iterator begin() && {
        return std::make_move_iterator(begin());
    }
    // or more directly, using ADL
};
So simple, like that?
No! Iterators are invalidated by the destruction of their container, so dereferencing an iterator obtained from a temporary (pr-value) is undefined behaviour!!
auto getVec() -> std::vector<int>;
///
auto it = getVec().begin(); // Noooo
auto item = *it; // undefined behaviour
Since there's no way (for the programmer) to recognize whether an object is a pr-value or an x-value (both will be deduced as T), retrieving an iterator from an r-value container should be forbidden.
If we could regulate the behaviour of Container and explicitly delete the functions that obtain iterators from an r-value container, then it would be possible to detect this.
A simple demo is here:
https://godbolt.org/z/4zeMG745f
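In case the link goes stale, here is a minimal sketch of the same idea (OwningBag and owning_range are hypothetical names of mine): an owning container deletes its r-value qualified begin()/end(), so taking an iterator from a temporary no longer compiles, and ownership becomes detectable with a concept. For the current standard containers the concept still reports "not owning", which is exactly the implementation defect mentioned above.

#include <cstddef>
#include <ranges>
#include <utility>

template <typename T>
class OwningBag {
    T*          _items = nullptr;
    std::size_t _size  = 0;
public:
    T*       begin() &      { return _items; }
    T*       end()   &      { return _items + _size; }
    const T* begin() const& { return _items; }
    const T* end()   const& { return _items + _size; }
    T*       begin() &&     = delete;  // no iterators out of an r-value container
    T*       end()   &&     = delete;
};

// Detectable only for containers that follow the convention above:
template <typename R>
concept owning_range =
    std::ranges::range<R&> &&
    !requires (R&& r) { std::forward<R>(r).begin(); };

static_assert( owning_range<OwningBag<int>>);
static_assert(!owning_range<std::ranges::subrange<int*>>);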
From my perspective, banning such obviously wrong behaviour should not be so destructive as to make well-implemented old projects fail to compile.
Actually, it would just require a few lines of modification for each container, plus proper constraints or overloads for the range access utilities like std::begin/std::ranges::begin.
Every standard container has a begin and end method for returning iterators for that container. However, C++11 has apparently introduced free functions called std::begin and std::end which call the begin and end member functions. So, instead of writing
auto i = v.begin();
auto e = v.end();
you'd write
auto i = std::begin(v);
auto e = std::end(v);
In his talk, Writing Modern C++, Herb Sutter says that you should always use the free functions now when you want the begin or end iterator for a container. However, he does not go into detail as to why you would want to. Looking at the code, it saves you all of one character. So, as far as the standard containers go, the free functions seem to be completely useless. Herb Sutter indicated that there were benefits for non-standard containers, but again, he didn't go into detail.
So, the question is what exactly do the free function versions of std::begin and std::end do beyond calling their corresponding member function versions, and why would you want to use them?
How do you call .begin() and .end() on a C array?
Free-functions allow for more generic programming because they can be added afterwards, on a data-structure you cannot alter.
Using the begin and end free functions adds one layer of indirection. Usually that is done to allow more flexibility.
In this case I can think of a few uses.
The most obvious use is for C-arrays (not C pointers).
Another is when trying to use a standard algorithm on a non-conforming container (i.e. the container is missing a .begin() method). Assuming you can't just fix the container, the next best option is to overload the begin function. Herb is suggesting you always use the begin function to promote uniformity and consistency in your code, instead of having to remember which containers provide a begin member function and which need a free begin function.
As an aside, the next C++ rev should copy D's pseudo-member notation. If a.foo(b,c,d) is not defined it instead tries foo(a,b,c,d). It's just a little syntactic sugar to help us poor humans who prefer subject then verb ordering.
Consider the case when you have a library that contains a class:
class SpecialArray;
it has 2 methods:
int SpecialArray::arraySize();
int SpecialArray::valueAt(int);
To iterate over its values you would need to inherit from this class and define begin() and end() methods for the case when you write
auto i = v.begin();
auto e = v.end();
But if you always use
auto i = begin(v);
auto e = end(v);
you can do this:
SpecialArrayIterator begin(SpecialArray & arr)
{
    return SpecialArrayIterator(&arr, 0);
}

SpecialArrayIterator end(SpecialArray & arr)
{
    return SpecialArrayIterator(&arr, arr.arraySize());
}
where SpecialArrayIterator is something like:
class SpecialArrayIterator
{
public:
    SpecialArrayIterator(SpecialArray * p, int i)
        : parray(p), index(i)
    {
    }
    SpecialArrayIterator operator ++();
    SpecialArrayIterator operator --();
    SpecialArrayIterator operator ++(int);
    SpecialArrayIterator operator --(int);
    int operator *()
    {
        return parray->valueAt(index);
    }
    bool operator ==(const SpecialArrayIterator &) const;
    // etc
private:
    SpecialArray *parray;
    int index;
    // etc
};
Now i and e can legally be used to iterate over and access the values of SpecialArray.
To answer your question, the free functions begin() and end() by default do nothing more than call the container's member .begin() and .end() functions. From <iterator>, included automatically when you use any of the standard containers like <vector>, <list>, etc., you get:
template< class C >
auto begin( C& c ) -> decltype(c.begin());
template< class C >
auto begin( const C& c ) -> decltype(c.begin());
The second part of your question is why prefer the free functions if all they do is call the member functions anyway. That really depends on what kind of object v is in your example code. If the type of v is a standard container type, like vector<T> v;, then it doesn't matter whether you use the free or the member functions: they do the same thing. If your object v is more generic, like in the following code:
template <class T>
void foo(T& v) {
    auto i = v.begin();
    auto e = v.end();
    for(; i != e; i++) { /* .. do something with i .. */ }
}
Then using the member functions breaks your code for T = C arrays, C strings, enums, etc. By using the non-member functions, you advertise a more generic interface that people can easily extend. By using the free function interface:
template <class T>
void foo(T& v) {
    using std::begin;
    using std::end;
    auto i = begin(v);
    auto e = end(v);
    for(; i != e; i++) { /* .. do something with i .. */ }
}
The code now works with T = C arrays and C strings. Now writing a small amount of adapter code:
enum class color { RED, GREEN, BLUE };
static color colors[] = { color::RED, color::GREEN, color::BLUE };
color* begin(const color& c) { return std::begin(colors); }
color* end(const color& c)   { return std::end(colors); }
We can get your code to be compatible with iterable enums too. I think Herb's main point is that using the free functions is just as easy as using the member functions, and it gives your code backward compatibility with C sequence types and forward compatibility with non-stl sequence types (and future-stl types!), with low cost to other developers.
One benefit of std::begin and std::end is that they serve as extension points for implementing a standard interface for external classes.
If you'd like to use a CustomContainer class with a range-based for loop or a template function which expects .begin() and .end() methods, you'd obviously have to implement those methods.
If the class provides those methods, that's not a problem. When it doesn't, you'd have to modify it.
This is not always feasible, for example when using an external library, especially a commercial, closed-source one.
In such situations, std::begin and std::end come in handy, since one can provide an iterator API without modifying the class itself, by overloading the free functions instead.
Example: suppose you'd like to implement a count_if function that takes a container instead of a pair of iterators. Such code might look like this:
template<typename ContainerType, typename PredicateType>
std::size_t count_if(const ContainerType& container, PredicateType&& predicate)
{
    using std::begin;
    using std::end;

    return std::count_if(begin(container), end(container),
                         std::forward<PredicateType>(predicate));
}
Now, for any class you'd like to use with this custom count_if, you only have
to add two free functions, instead of modifying those classes.
Now, C++ has a mechanism called Argument Dependent Lookup (ADL), which makes this approach even more flexible.
In short, ADL means that when the compiler resolves an unqualified function call (i.e. one without a namespace qualifier, like begin instead of std::begin), it will also consider functions declared in the namespaces of its arguments. For example:
namespace some_lib
{
    // let's assume that CustomContainer stores elements sequentially,
    // and has data() and size() methods, but not begin() and end() methods:
    class CustomContainer
    {
        ...
    };
}

namespace some_lib
{
    const Element* begin(const CustomContainer& c)
    {
        return c.data();
    }

    const Element* end(const CustomContainer& c)
    {
        return c.data() + c.size();
    }
}

// somewhere else:
some_lib::CustomContainer c;
std::size_t n = count_if(c, somePredicate);
In this case, it doesn't matter that the qualified names are some_lib::begin and some_lib::end - since CustomContainer is in some_lib too, the compiler will use those overloads in count_if.
That's also the reason for having using std::begin; and using std::end; in count_if.
This allows us to use unqualified begin and end, therefore allowing for ADL and letting the compiler pick std::begin and std::end when no other alternatives are found.
We can have our cake and eat it too - i.e. have a way to provide a custom implementation of begin/end while the compiler can fall back to the standard ones.
Some notes:
For the same reason, there are other similar functions: std::rbegin/rend, std::size and std::data (a tiny example follows after these notes).
As other answers mention, the std:: versions have overloads for naked arrays. That's useful, but it is simply a special case of what I've described above.
Using std::begin and friends is a particularly good idea when writing template code, because it makes those templates more generic. For non-template code you might just as well use the member functions, when applicable.
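Here is that tiny example (my own addition): the same free-function pattern applied to std::size and std::data, which a raw array obviously cannot provide as members.

#include <iterator>

void sizes()
{
    int raw[4] = {1, 2, 3, 4};
    auto n = std::size(raw);   // 4 -- there is no raw.size() to call
    auto p = std::data(raw);   // int* to the first element
    (void)n; (void)p;          // silence unused-variable warnings
}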
P. S. I'm aware that this post is nearly 7 years old. I came across it because I wanted to
answer a question which was marked as a duplicate and discovered that no answer here mentions ADL.
While the non-member functions don't provide any benefit for the standard containers, using them enforces a more consistent and flexible style. If at some point you want to extend an existing non-std container class, you'd rather define overloads of the free functions than alter the existing class's definition. So for non-std containers they are very useful, and always using the free functions makes your code more flexible: you can substitute a std container by a non-std container more easily, and the underlying container type becomes more transparent to your code because a much wider variety of container implementations is supported.
But of course this always has to be weighed properly, and over-abstraction is not good either. Although using the free functions is not that much of an over-abstraction, it nevertheless breaks compatibility with C++03 code, which at this young age of C++11 might still be an issue for you.
Ultimately the benefit is in code that is generalized such that it's container agnostic. It can operate on a std::vector, an array, or a range without changes to the code itself.
Additionally, containers, even ones you don't own and cannot modify, can be retrofitted so that they too can be used agnostically by code that uses the non-member range-based accessors.
See here for more detail.
Recently I was trying to fix a pretty difficult const-correctness compiler error. It initially manifested as a multi-paragraph template vomit error deep within Boost.Python.
But that's irrelevant: it all boiled down to the following fact: the C++11 std::begin and std::end iterator functions are not overloaded to take R-values.
The definition(s) of std::begin are:
template< class C >
auto begin( C& c ) -> decltype(c.begin());
template< class C >
auto begin( const C& c ) -> decltype(c.begin());
So since there is no R-value/Universal Reference overload, if you pass it an R-value you get a const iterator.
So why do I care? Well, if you ever have some kind of "range" container type, i.e. like a "view", "proxy" or a "slice" or some container type that presents a sub iterator range of another container, it is often very convenient to use R-value semantics and get non-const iterators from temporary slice/range objects. But with std::begin, you're out of luck because std::begin will always return a const-iterator for R-values. This is an old problem which C++03 programmers were often frustrated with back in the day before C++11 gave us R-values - i.e. the problem of temporaries always binding as const.
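Here is a sketch of the kind of type I mean (slice is a hypothetical non-owning view): its iterators point into the underlying vector, so they outlive the temporary slice object itself, yet std::begin on the temporary only ever hands back a const_iterator.

#include <iterator>
#include <vector>

struct slice {
    std::vector<int>* v;
    std::vector<int>::iterator       begin()       { return v->begin(); }
    std::vector<int>::const_iterator begin() const { return v->begin(); }
};

std::vector<int> data{1, 2, 3};
slice make_slice() { return slice{&data}; }

void demo()
{
    auto a = make_slice().begin();     // member call on the r-value: mutable iterator
    auto b = std::begin(make_slice()); // binds to the const C& overload: const_iterator
    *a = 42;                           // fine, writes into data
    // *b = 42;                        // does not compile, even though it would be just as safe
}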
So, why isn't std::begin defined as:
template <class C>
auto begin(C&& c) -> decltype(c.begin());
This way, if c is constant we get a C::const_iterator and a C::iterator otherwise.
At first, I thought the reason was for safety. If you passed a temporary to std::begin, like so:
auto it = std::begin(std::string("temporary string")); // never do this
...you'd get an invalid iterator. But then I realized this problem still exists with the current implementation. The above code would simply return an invalid const-iterator, which would probably segfault when dereferenced.
So, why is std::begin not defined to take an R-value (or more accurately, a Universal Reference)? Why have two overloads (one for const and one for non-const)?
The above code would simply return an invalid const-iterator
Not quite. The iterator remains valid until the end of the full-expression in which the temporary it refers to was created. So something like
std::copy_n( std::begin(std::string("Hallo")), 2,
std::ostreambuf_iterator<char>(std::cout) );
is still valid code. Of course, in your example, it is invalidated at the end of the statement.
What point would there be in modifying a temporary or xvalue? That is probably one of the questions the designers of the range accessors had in mind when proposing the declarations. They didn't consider "proxy" ranges for which the iterators returned by .begin() and .end() remain valid past the range object's lifetime, perhaps for the very reason that, in template code, they cannot be distinguished from normal ranges - and we certainly don't want to modify temporary non-proxy ranges, since that is pointless and might lead to confusion.
However, you don't need to use std::begin in the first place but could rather declare them with a using-declaration:
using std::begin;
using std::end;
and use ADL. This way you declare namespace-scope begin and end overloads for the types that Boost.Python (or similar) uses and circumvent the restrictions of std::begin. E.g.
boost_slice::iterator begin(boost_slice&& s) { return s.begin(); }
boost_slice::iterator end  (boost_slice&& s) { return s.end();   }
// […]
begin(std::move(some_slice)) // calls the namespace-scope overload, returns a non-const iterator
Why have two overloads (one for const and one for non-const)?
Because we still want rvalue objects to be supported (and they cannot be bound to a function parameter of the form T&).
Say I'm making a function to copy a value:
template<class ItInput, class ItOutput>
void copy(ItInput i, ItOutput o) { *o = *i; }
and I would like to avoid the assignment if i and o point to the same object, since then the assignment is pointless.
Obviously, I can't say if (i != o) { ... }, both because i and o might be of different types and because they might point into different containers (and would thus be incomparable). Less obviously, I can't use overloaded function templates either, because the iterators might belong to different containers even though they have the same type.
My initial solution to this was:
template<class ItInput, class ItOutput>
void copy(ItInput i, ItOutput o)
{
    if (&*o != static_cast<void const *>(&*i))
        *o = *i;
}
but I'm not sure if this works. What if *o or *i actually returns an object instead of a reference?
Is there a way to do this generally?
I don't think that this is really necessary: if assignment is expensive, the type should define an assignment operator that performs the (relatively cheap) self assignment check to prevent doing unnecessary work. But, it's an interesting question, with many pitfalls, so I'll take a stab at answering it.
If we are to assemble a general solution that works for input and output iterators, there are several pitfalls that we must watch out for:
An input iterator is a single-pass iterator: you can only perform indirection via the iterator once per element, so, we can't perform indirection via the iterator once to get the address of the pointed-to value and a second time to perform the copy.
An input iterator may be a proxy iterator. A proxy iterator is an iterator whose operator* returns an object, not a reference. With a proxy iterator, the expression &*it is ill-formed, because *it is an rvalue (it's possible to overload the unary-&, but doing so is usually considered evil and horrible, and most types do not do this).
An output iterator can only be used for output; you cannot perform indirection via it and use the result as an rvalue. You can write to the "pointed to element" but you can't read from it.
So, if we're going to make your "optimization," we'll need to make it only for the case where both iterators are forward iterators (this includes bidirectional iterators and random access iterators: they're forward iterators too).
Because we're nice, we also need to be mindful that, even though it violates the concept requirements, many proxy iterators misrepresent their category, because it is very useful to have a proxy iterator that supports random access over a sequence of proxied objects. (I'm not even sure how one could implement an efficient iterator for std::vector<bool> without doing this.)
We'll use the following Standard Library headers:
#include <iterator>
#include <type_traits>
#include <utility>
We define a metafunction, is_forward_iterator, that tests whether a type is a "real" forward iterator (i.e., is not a proxy iterator):
template <typename T>
struct is_forward_iterator :
    std::integral_constant<
        bool,
        std::is_base_of<
            std::forward_iterator_tag,
            typename std::iterator_traits<T>::iterator_category
        >::value &&
        std::is_lvalue_reference<
            decltype(*std::declval<T>())
        >::value>
{ };
For brevity, we also define a metafunction, can_compare, that tests whether two types are both forward iterators:
template <typename T, typename U>
struct can_compare :
    std::integral_constant<
        bool,
        is_forward_iterator<T>::value &&
        is_forward_iterator<U>::value
    >
{ };
Then, we'll write two overloads of the copy function and use SFINAE to select the right overload based on the iterator types: if both iterators are forward iterators, we'll include the check, otherwise we'll exclude the check and always perform the assignment:
template <typename InputIt, typename OutputIt>
auto copy(InputIt const in, OutputIt const out)
    -> typename std::enable_if<can_compare<InputIt, OutputIt>::value>::type
{
    if (static_cast<void const volatile*>(std::addressof(*in)) !=
        static_cast<void const volatile*>(std::addressof(*out)))
        *out = *in;
}

template <typename InputIt, typename OutputIt>
auto copy(InputIt const in, OutputIt const out)
    -> typename std::enable_if<!can_compare<InputIt, OutputIt>::value>::type
{
    *out = *in;
}
As easy as pie!
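For completeness, a quick usage sketch of the two overloads above (my addition, using the names exactly as defined there):

#include <vector>

int main()
{
    std::vector<int> v{1, 2, 3};
    // Both arguments are genuine forward iterators: the first overload is selected
    // and the address check suppresses the self-assignment.
    copy(v.begin(), v.begin());

    std::vector<bool> b{true, false};
    // std::vector<bool>::iterator's operator* returns a proxy object, not an lvalue,
    // so is_forward_iterator is false and the second overload assigns unconditionally.
    copy(b.begin(), b.begin());
}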
I think this may be a case where you may have to document some assumptions about the types you expect in the function and be content with not being completely generic.
Like operator*, operator& could be overloaded to do all sorts of things. If you're guarding against operator*, then you should consider operator& and operator!=, etc.
I would say that a good prerequisite to enforce (either through comments in the code or a concept/static_assert) is that operator* returns a reference to the object pointed to by the iterator and that it doesn't (or shouldn't) perform a copy. In that case, your code as it stands seems fine.
Your code, as is, is definitely not okay, or at least not okay for all iterator categories.
Input iterators and output iterators are not required to be dereferenceable after the first time (they're expected to be single-pass) and input iterators are allowed to dereference to anything "convertible to T" (§24.2.3/2).
So, if you want to handle all kinds of iterators, I don't think you can enforce this "optimization", i.e. you can't generically check if two iterators point to the same object. If you're willing to forego input and output iterators, what you have should be fine. Otherwise, I'd stick with doing the copy in any case (I really don't think you have another option on this).
Write a helper template function equals that automatically returns false if the iterators are different types. Either that or do a specialization or overload of your copy function itself.
If they're the same type then you can use your trick of comparing the pointers of the objects they resolve to, no casting required:
if (&*i != &*o)
*o = *i;
If *i or *o doesn't return a reference, no problem - the copy will occur even if it didn't have to, but no harm will be done.
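A sketch of the helper suggested here (same_element is my name for it; like the snippet above, it assumes *a yields an lvalue when the iterator types match):

#include <memory>

// Different iterator types can never designate the same element: just return false.
template <class ItA, class ItB>
bool same_element(ItA const&, ItB const&) { return false; }

// Same iterator type: compare the addresses of the referenced objects,
// using std::addressof in case operator& is overloaded.
template <class It>
bool same_element(It const& a, It const& b)
{
    return std::addressof(*a) == std::addressof(*b);
}

template <class ItInput, class ItOutput>
void copy(ItInput i, ItOutput o)
{
    if (!same_element(i, o))
        *o = *i;
}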