How to specialize std::begin? - c++

I'm trying to specialize std::begin for a custom container. I'm doing this because I want to use range-based for with the container. This is what I have:
class stackiterator { … };
class stack { … };
#include <iterator>
template <> stackiterator std::begin(stack& S)
{
return S.GetBottom();
}
I get the following error at the definition of my begin specialization:
No function template matches function template specialization 'begin'
What am I doing wrong?

I'm trying to specialize std::begin for a custom container. I'm
doing this because I want to use range-based for with the container.
You are barking up the wrong tree. Range-based for does not use std::begin at all. For class types, the compiler looks up member begin and end directly, and if neither is found, does an ADL-only lookup for free begin and end in the associated namespaces. Ordinary unqualified lookup is not performed; there's no way for std::begin to be picked up if your class is not in the std namespace.
Even if the specialization you want to do is possible (it isn't unless you introduce a member begin() - an explicit specialization for a function template can't change the return type, and the overload at issue returns "whatever member begin() returns"; and if you do introduce a member begin(), why are you specializing std::begin to do what it would have done anyway?), you still won't be able to use it with a range-based for.

Leaving aside the policy and semantic issues of whether you should specialize a function template from the std namespace,
The following snippet does not work:
class stackiterator {};
struct stack { stackiterator Begin() { return stackiterator{};} };
#include <iterator>
namespace std
{
template <> stackiterator begin<stack>(stack& S)
{
return S.Begin();
}
}
However, the following snippet works just fine:
class stackiterator {};
struct stack { stackiterator begin() { return stackiterator{};} };
#include <iterator>
namespace std
{
template <> stackiterator begin<stack>(stack& S)
{
return S.begin();
}
}
The key difference is the presence of Begin() vs begin() as a member function of stack. std::begin() is defined as:
template <class C> auto begin(C& c) -> decltype(c.begin());
template <class C> auto begin(const C& c) -> decltype(c.begin());
When you specialize a function template, you must still keep the return type the same. When you don't have begin() as a member of Stack, the compiler does not know how to determine the return type.
That is the reason for error produced by the compiler.
BTW, there is another SO post that partially answers what can be specialized and what can't be specialized.
Looking at the part of the standard that deals with std::begin(), Section 24.3, I don't see anything about not being able to specialize std::begin().

The right way to add a free function begin that enables for(:) loops is to add, in the namespace of stack, a begin(stack&) and begin(stack const&) function that returns a iterator and const_iterator respectively (and ditto for end)
The other way is to add a member begin() and end() to stack.
Specializing std::begin is bad practice for a number of reasons, not the least of which is that not all for(:) loops will work with it (the lookup rules where changed in the resolution of this defect report). Overloading std::begin is undefined behavior (you may not overload functions in namespace std under the standard: doing so makes your program ill-formed).
This is how it has to be done, even if it violates the naming convention of your project.

Related

Difference between std::vector::empty and std::empty

To check if a vector v is empty, I can use std::empty(v) or v.empty(). I looked at the signatures on cppreference, but am lacking the knowledge to make sense of them. How do they relate to each other? Does one implementation call the other?
I know that one comes from the containers library and the other from the iterators library, but that is about it.
Difference between std::vector::empty and std::empty
The difference between Container::empty member function and std::empty free function (template) is the same as difference between Container::size,std::size, Container::data,std::data, Container::begin,std::begin and Container::end,std::end.
In all of these cases for any standard container, the free function (such as std::empty) simply calls the corresponding member function. The purpose for the existence of the free function is to provide a uniform interface between containers (and also std::initializer_list) and arrays. Arrays cannot have member functions like the class templates can have, so they have specialised overload for these free functions.
If you are writing code with a templated container type, then you should be using the free function in order to be able to support array as the templated type. If the type isn't templated, then there is no difference between the choice of using member function or the free function other than the convenience of potential refactoring into a template (or just plain array).
There are three overloads of std::empty, but the one used by std::empty(v) for a vector v is the first:
template <class C>
constexpr auto empty(const C& c) -> decltype(c.empty()); // (since c++17, until c++20)
template <class C>
[[nodiscard]] constexpr auto empty(const C& c) -> decltype(c.empty());
(since C++20) // (since c++20)
This overload has the following effect:
returns c.empty()
So, std::empty(v) and v.empty() have the same effect in this case.
std::empty returns the result of calling std::vector::empty.
std::empty is useful for scenarios where a container may or may not provide a member function empty() for types providing a member function empty, std::empty provides a default implementation, but for a custom type not providing this function you can provide a function empty at namespace scope for use in templates; thanks to argument dependent lookup the function in the same namespace as the parameter will be used as fallback:
namespace Custom
{
struct Container
{
bool m_empty;
};
constexpr bool empty(Container const& c) // custom implementation for our own type
{
return c.m_empty;
}
}
template<class T>
void PrintEmpty(char const* containerName, T&& container)
{
using namespace std;
std::cout << containerName << " is " << (empty(container) ? "empty" : "not empty") << '\n';
}
int main()
{
PrintEmpty("std::vector<int>()", std::vector<int>());
PrintEmpty("Container{}", Custom::Container{});
PrintEmpty("Container{ true }", Custom::Container{ true });
}
To language-lawyer a bit, the C++20 standard says that a container a has its a.empty() member function return true if and only if a.begin() == a.end() ([container.requirements.general]).
The std::empty() non-member function is specified in [iterator.range]. The overload constexpr auto empty(const C& c) “Returns: c.empty().”
So, in the Standard, the non-member function is specified in terms of the member functions, not vice versa. In general, that doesn’t mean that the implementation needs to have the non-member function actually call the member function, so long as it “behaves as if.” In this case, the relationship must hold for any allowed specialization of the STL container classes, or for custom container classes, so that constrains what the writers of the library are allowed to do.

Can I use the iterator Libraries' Access Functions on Nonstandard Containers?

The iterator library has been introducing a lot of access functions over the course of C++11, C++14, and C++17:
begin/end
cbegin/cend
crbegin/crend
data
empty
rbegin/rend
size
Can I use these on any container, even nonstandard containers (provided they supply an accessible corresponding method?) For example given a QVector foo can I do this:
const auto bar = begin(foo);
The declarations for std::begin are as follow (from §24.7):
template <class C> auto begin(C& c) -> decltype(c.begin());
template <class C> auto begin(const C& c) -> decltype(c.begin());
So these functions will be defined for any class C such that c.begin() is "valid" (does exist). The standard also guarantees that these will:
Returns: c.begin().
So yes you can use begin(c) on any container of type C as long as either:
The member function C::begin() is provided.
There exist a begin(C const&) or begin(C &) function.
The standalone begin function should not be in the std:: namespace but rather in the same namespace as your class C so name-lookup can find it.
As I understand it, your question seem to be: if the container provides a member function begin, can I then call it as a free function? The answer to your question is yes, because a templated free function begin is provided by the standard, that simply tries to call the member function begin; from http://en.cppreference.com/w/cpp/iterator/begin: "Returns an iterator to the beginning of the given container c or array array. These templates rely on C::begin() having a reasonable implementation.".

Why is ADL not working with Boost.Range?

Considering:
#include <cassert>
#include <boost/range/irange.hpp>
#include <boost/range/algorithm.hpp>
int main() {
auto range = boost::irange(1, 4);
assert(boost::find(range, 4) == end(range));
}
Live Clang demo
Live GCC demo
this gives:
main.cpp:8:37: error: use of undeclared identifier 'end'
Considering that if you write using boost::end; it works just fine, which implies that boost::end is visible:
Why is ADL not working and finding boost::end in the expression end(range)? And if it's intentional, what's the rationale behind it?
To be clear, the expected result would be similar to what happens in this example using std::find_if and unqualified end(vec).
Historical background
The underlying reason is discussed in this closed Boost ticket
With the following code, compiler will complain that no begin/end is
found for "range_2" which is integer range. I guess that integer range
is missing ADL compatibility ?
#include <vector>
#include <boost/range/iterator_range.hpp>
#include <boost/range/irange.hpp>
int main() {
std::vector<int> v;
auto range_1 = boost::make_iterator_range(v);
auto range_2 = boost::irange(0, 1);
begin(range_1); // found by ADL
end(range_1); // found by ADL
begin(range_2); // not found by ADL
end(range_2); // not found by ADL
return 0;
}
boost::begin() and boost::end() are not meant to be found by ADL. In
fact, Boost.Range specifically takes precautions to prevent
boost::begin() and boost::end() from being found by ADL, by declaring
them in the namespace boost::range_adl_barrier and then exporting them
into the namespace boost from there. (This technique is called an "ADL
barrier").
In the case of your range_1, the reason unqualified begin() and end()
calls work is because ADL looks not only at the namespace a template
was declared in, but the namespaces the template arguments were
declared in as well. In this case, the type of range_1 is
boost::iterator_range<std::vector<int>::iterator>. The template
argument is in namespace std (on most implementations), so ADL finds
std::begin() and std::end() (which, unlike boost::begin() and
boost::end(), do not use an ADL barrier to prevent being found by
ADL).
To get your code to compile, simply add "using boost::begin;" and
"using boost::end;", or explicitly qualify your begin()/end() calls
with "boost::".
Extended code example illustrating the dangers of ADL
The danger of ADL from unqualified calls to begin and end is two-fold:
the set of associated namespaces can be much larger than one expects. E.g. in begin(x), if x has (possibly defaulted!) template parameters, or hidden base classes in its implementation, the associated namespaces of the template parameters and of its base classes are also considered by ADL. Each of those associated namespace can lead to many overloads of begin and end being pulled in during argument dependent lookup.
unconstrained templates cannot be distinguished during overload resolution. E.g. in namespace std, the begin and end function templates are not separately overloaded for each container, or otherwise constrained on the signature of the container being supplied. When another namespace (such as boost) also supplies similarly unconstrained function templates, overload resolution will consider both an equal match, and an error occurs.
The following code samples illustrate the above points.
A small container library
The first ingredient is to have a container class template, nicely wrapped in its own namespace, with an iterator that derives from std::iterator, and with generic and unconstrained function templates begin and end.
#include <iostream>
#include <iterator>
namespace C {
template<class T, int N>
struct Container
{
T data[N];
using value_type = T;
struct Iterator : public std::iterator<std::forward_iterator_tag, T>
{
T* value;
Iterator(T* v) : value{v} {}
operator T*() { return value; }
auto& operator++() { ++value; return *this; }
};
auto begin() { return Iterator{data}; }
auto end() { return Iterator{data+N}; }
};
template<class Cont>
auto begin(Cont& c) -> decltype(c.begin()) { return c.begin(); }
template<class Cont>
auto end(Cont& c) -> decltype(c.end()) { return c.end(); }
} // C
A small range library
The second ingredient is to have a range library, also wrapped in its own namespace, with another set of unconstrained function templates begin and end.
namespace R {
template<class It>
struct IteratorRange
{
It first, second;
auto begin() { return first; }
auto end() { return second; }
};
template<class It>
auto make_range(It first, It last)
-> IteratorRange<It>
{
return { first, last };
}
template<class Rng>
auto begin(Rng& rng) -> decltype(rng.begin()) { return rng.begin(); }
template<class Rng>
auto end(Rng& rng) -> decltype(rng.end()) { return rng.end(); }
} // R
Overload resolution ambiguity through ADL
Trouble begins when one tries to make an iterator range into a container, while iterating with unqualified begin and end:
int main()
{
C::Container<int, 4> arr = {{ 1, 2, 3, 4 }};
auto rng = R::make_range(arr.begin(), arr.end());
for (auto it = begin(rng), e = end(rng); it != e; ++it)
std::cout << *it;
}
Live Example
Argument-dependent name lookup on rng will find 3 overloads for both begin and end: from namespace R (because rng lives there), from namespace C (because the rng template parameter Container<int, 4>::Iterator lives there), and from namespace std (because the iterator is derived from std::iterator). Overload resolution will then consider all 3 overloads an equal match and this results in a hard error.
Boost solves this by putting boost::begin and boost::end in an inner namespace and pulling them into the enclosing boost namespace by using directives. An alternative, and IMO more direct way, would be to ADL-protect the types (not the functions), so in this case, the Container and IteratorRange class templates.
Live Example With ADL barriers
Protecting your own code may not be enough
Funny enough, ADL-protecting Container and IteratorRange would -in this particular case- be enough to let the above code run without error because std::begin and std::end would be called because std::iterator is not ADL-protected. This is very surprising and fragile. E.g. if the implementation of C::Container::Iterator no longer derives from std::iterator, the code would stop compiling. It is therefore preferable to use qualified calls R::begin and R::end on any range from namespace R in order to be protected from such underhanded name-hijacking.
Note also that the range-for used to have the above semantics (doing ADL with at least std as an associated namespace). This was discussed in N3257 which led to semantic changes in range-for. The current range-for first looks for member functions begin and end, so that std::begin and std::end will not be considered, regardless of ADL-barriers and inheritance from std::iterator.
int main()
{
C::Container<int, 4> arr = {{ 1, 2, 3, 4 }};
auto rng = R::make_range(arr.begin(), arr.end());
for (auto e : rng)
std::cout << e;
}
Live Example
In boost/range/end.hpp they explicitly block ADL by putting end in a range_adl_barrier namespace, then using namespace range_adl_barrier; to bring it into the boost namespace.
As end is not actually from ::boost, but rather from ::boost::range_adl_barrier, it is not found by ADL.
Their reasoning is described in boost/range/begin.hpp:
// Use a ADL namespace barrier to avoid ambiguity with other unqualified
// calls. This is particularly important with C++0x encouraging
// unqualified calls to begin/end.
no examples are given of where this causes a problem, so I can only theorize what they are talking about.
Here is an example I have invented of how ADL can cause ambiguity:
namespace foo {
template<class T>
void begin(T const&) {}
}
namespace bar {
template<class T>
void begin(T const&) {}
struct bar_type {};
}
int main() {
using foo::begin;
begin( bar::bar_type{} );
}
live example. Both foo::begin and bar::begin are equally valid functions to call for the begin( bar::bar_type{} ) in that context.
This could be what they are talking about. Their boost::begin and std::begin might be equally valid in a context where you have using std::begin on a type from boost. By putting it in a sub-namespace of boost, std::begin gets called (and works on ranges, naturally).
If the begin in the namespace boost had been less generic, it would be preferred, but that isn't how they wrote it.
That's because boost::end is inside an ADL barrier, which is then pulled in boost at the end of the file.
However, from cppreference's page on ADL (sorry, I don't have a C++ draft handy):
1) using-directives in the associated namespaces are ignored
That prevents it from being included in ADL.

Can I specialize std::begin and std::end for the return value of equal_range()?

The <algorithm> header provides std::equal_range(), as well as some containers having it as a member function. What bothers me with this function is that it returns a pair of iterators, making it tedious to iterate from the begin iterator to the end iterator. I'd like to be able to use std::begin() and std::end() so that I can use the C++11 range-based for-loop.
Now, I've heard contradictory information in regards to specializing std::begin() and std::end() - I've been told that adding anything to the std namespace results in undefined behavior, whereas I have also been told that you can provide your own specializations of std::begin() and std::end().
This is what I am doing right now:
namespace std
{
template<typename Iter, typename = typename iterator_traits<Iter>::iterator_category>
Iter begin(pair<Iter, Iter> const &p)
{
return p.first;
}
template<typename Iter, typename = typename iterator_traits<Iter>::iterator_category>
Iter end(pair<Iter, Iter> const &p)
{
return p.second;
}
}
And this does work: http://ideone.com/wHVfkh
But I am wondering, what are the downsides to doing this? Is there a better way to do this?
17.6.4.2.1/1 The behavior of a C++ program is undefined if it adds declarations or definitions to namespace std or to a namespace
within namespace std unless otherwise specified. A program may add a
template specialization for any standard library template to namespace
std only if the declaration depends on a user-defined type and the
specialization meets the standard library requirements for the
original template and is not explicitly prohibited.
So yes, I believe that, technically, your code exhibits undefined behavior. Perhaps you can write a simple class that takes a pair of iterators in its constructor and implements begin() and end() methods. Then you can write something like
for (const auto& elem: as_range(equal_range(...))) {}

Can iter_swap be specialised?

If I have a container std::vector<T*> items, I can create an IndirectIterator which wraps std::vector<T*>::iterator and allows iterating over T's rather than T*'s.
Can I specialise iter_swap for IndirectIterator to make standard algorithms (such as std::sort) swap items by pointer?
i.e., if I write the following, will it have any effect on standard algorithms?
namespace some_namespace
{
template <typename IterT>
class IndirectIterator
{
IterT m_base;
public:
typedef IterT base_iterator;
typedef /* ... */ reference;
/* ... */
reference operator*() const { **m_base; }
const base_iterator& base() const { return m_base; }
base_iterator& base() { return m_base; }
};
template <typename T>
void iter_swap(IndirectIterator<T>& a, IndirectIterator<T>& b)
{
using std::iter_swap;
iter_swap(a.base(), b.base());
}
}
The benefit of this specialisation is that it swaps pointers rather than full T instances, so it's faster (potentially).
As far as I can see, iter_swap is only used in std::reverse, and it does not mention any kind of argument-dependent lookup: it always uses std::iter_swap. And since you are not allowed to overload functions in the std namespace, you're out of luck.
Can I specialise iter_swap for IndirectIterator to make standard
algorithms (such as std::sort) swap items by pointer?
You can always make your overload/specialization. However your question is whether you can specialize iter_swap inside the namespace std.
I think the answer is unclear from the standard.
I have found that in some cases, I had to define a special iter_swap inside std so that std::sort uses it. In gcc std lib std::sort uses the qualified std::iter_swap.
This is probably a defect in std::sort.
IMO std::sort should call an unqualified swap_iter.
i.e., if I write the following, will it have any effect on standard
algorithms?
Not in GCC (at least) because standard algorithms use qualified std::iter_swap (bug?)
I don't think the standard is clear about it.
You are allowed to reopen std namespace and specialize templates inside std as long as you specialize them for user-defined types. In your case you can actually specialize std::iter_swap for your purposes, just make sure you are doing it in std namespace, not in your own namespace (as in your example). It is not very elegant, but it is allowed.
Starting in C++20, the answer is unambiguously "yes, you can customize your own ADL iter_swap." STL algorithms should use it, but aren't required to, and there is implementation divergence in practice.
For customizing a customization point, you should use the "hidden friend idiom":
namespace some_namespace {
template <class IterT>
class IndirectIterator {
IterT m_base;
public:
/* ... */
reference operator*() const { return **m_base; }
const base_iterator& base() const { return m_base; }
// "It's not a member, it's just a friend"
friend void iter_swap(IndirectIterator a, IndirectIterator b) {
using std::iter_swap;
iter_swap(a.base(), b.base());
}
};
}
Subtle change here: your iter_swap had been taking its arguments by mutable reference! That wouldn't work, e.g. if it were passed an IndirectIterator that was const, or that was an rvalue. Iterator parameters should always be taken by value. Similarly, there's no sense in providing a non-const version of base(). Your base() is a getter, not a setter; it should be a const member function for the same reason vector::size() is a const member function.
Finally, the big caveat here is that you must never customize iter_swap to do anything observably different from swapping the targets of the iterators. If you do that, then it's undefined behavior — all bets are off — the program could do anything. Formally, this requirement is given in [iterator.cust.swap]:
If the [iter_swap] function selected by overload resolution does not exchange the values denoted by E1 and E2, the program is ill-formed, no diagnostic required.
I believe your idea of having iter_swap swap the "first-level" targets rather than the "second-level" targets is perfectly kosher, as long as you're never going to rely on observing the difference. The standard library is still allowed to bypass iter_swap and just do swap(*it, *kt) if it feels like it; your code must be able to cope with that, or else "the program is ill-formed, no diagnostic required." (I.e., you must fully commit to the idea that swapping those first-level targets is tantamount to "exchanging the values of E1 and E2," and thus interchangeable with any other method of exchanging the values. You can't rely on your own method getting used.)
Here is a complete C++20 example on Godbolt. As of February 2023, here's where I see the customized iter_swap actually getting used:
Algorithm
libstdc++
libc++
Microsoft
std::ranges::partition
Yes
Yes
Yes
std::ranges::sort
No
No
Yes
std::partition
No
No
No
std::sort
No
No
No