ADL and container functions (begin, end, etc) - c++

C++11 and later define the free functions begin and end in namespace std, and C++17 adds empty, size, and data. For most containers these functions invoke the corresponding member function. But for some types the free functions are overloaded instead: valarray, for example, does not have a member begin(). So to iterate over any container, the free functions should be used, and to find functions for containers from namespaces other than std, ADL should be used:
template<typename C>
void foo(C c)
{
    using std::begin;
    using std::end;
    using std::empty;
    if (empty(c)) throw empty_container();
    for (auto i = begin(c); i != end(c); ++i) { /* do something */ }
}
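A minimal sketch of the idiom in the question, assuming a hypothetical helper named total (it is not part of the original code). It shows the same two using-declarations working for std::valarray, which has no member begin(), for std::vector, and for a built-in array.

```cpp
#include <cassert>
#include <iterator>
#include <valarray>
#include <vector>

// Hypothetical helper (not from the question): sums any range that is
// reachable through free begin/end.
template <typename C>
double total(const C& c)
{
    using std::begin;   // fallback for built-in arrays and std types
    using std::end;
    double sum = 0.0;
    for (auto it = begin(c); it != end(c); ++it) sum += *it;
    return sum;
}

inline void total_example()
{
    std::valarray<double> va{1.0, 2.0, 3.0};   // no member begin()
    std::vector<int> v{4, 5};                  // member begin()
    int raw[] = {6, 7, 8};                     // built-in array
    assert(total(va) == 6.0);
    assert(total(v) == 9.0);
    assert(total(raw) == 21.0);
}
```

Writing va.begin() instead would not compile for the valarray or the array, which is the practical argument for the free-function form in generic code.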
Question 1: Am I correct? Are begin and end expected to be found via ADL?
But the ADL rules specify that if the type of an argument is a class template specialization, ADL also includes the namespaces of all template type arguments. This is where the Boost.Range library comes into play: it defines boost::begin, boost::end, etc. These functions are defined like this:
template< class T >
inline BOOST_DEDUCED_TYPENAME range_iterator<T>::type begin( T& r )
{
    return range_begin( r );
}
If I use std::vector<boost::any> together with Boost.Range, I run into trouble: the std::begin and boost::begin overloads are ambiguous. That is, I cannot write template code that finds a free begin via ADL. If I explicitly use std::begin, I rely on every non-std container having a member begin.
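The ADL rule behind this problem can be sketched in isolation. The namespace lib, the type payload, and the function where are hypothetical stand-ins (for boost, boost::any, and boost::begin respectively); the point is only that a function in the namespace of a template argument is found without any qualification.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace lib {                 // hypothetical third-party namespace
struct payload {};              // stands in for a type like boost::any

// An unconstrained template: it accepts any argument at all.
template <typename T>
std::string where(const T&) { return "lib::where"; }
}  // namespace lib

// No using-declaration and no qualification: the call still compiles,
// because lib is an associated namespace of std::vector<lib::payload>.
inline std::string call_through_adl()
{
    std::vector<lib::payload> v;
    return where(v);
}

inline void adl_example()
{
    assert(call_through_adl() == "lib::where");
}
```

If a second, equally unconstrained where were also visible at the call site (as std::begin is after using std::begin;), the call would be ambiguous, which is the situation the question describes.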
Question 2: What shall I do in this case?
Rely on the presence of member function? Simplest way.
Ban Boost.Range? Well, algorithms that take a container instead of a pair of iterators are convenient. Boost.Range adaptors (containers that lazily apply an algorithm to a container) are also convenient. But even if I do not use Boost.Range in my code, it can still be used inside another Boost library (other than Range). This makes template code really fragile.
Ban Boost?

A few years ago, I had a similar problem where my code suddenly started getting ambiguities between std::begin and boost::begin. I found it was due to using Boost.Operator to aid in defining a class, even though it was not even a public base class or apparent to the user of the types involved. A random change somewhere caused #include <boost/range/begin.hpp> to be present in the nested include files somewhere, thus making boost::begin visible to the compiler.
I complained to the mailing list about classes being put directly in the boost namespace, rather than in a nested namespace and exposed via using declarations; everything defined directly in the boost namespace could potentially step on everything else via accidental ADL.
I just tried to reproduce this today, and it seems quite resilient against such ambiguities now! Looking at the definition, boost::begin is itself in an inner namespace, so an unqualified call can never find it unless you supply your own using boost::begin; in your own scope.
I don’t know how long ago this fix took place. (If you can still reproduce it, please post a complete program with version and platform details.)
So your answer is:
For Boost, don’t worry about it anymore (upgrade Boost if necessary).
For new code, never define a free function named begin in the same namespace as any of its defined types.
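The inner-namespace fix described above can be sketched as follows. This is an illustration of the general technique, not Boost's actual source: lib, adl_barrier, widget, and describe are hypothetical names. It relies on the rule that ADL ignores using-directives in an associated namespace.

```cpp
#include <cassert>
#include <string>

namespace lib {
struct widget {};

namespace adl_barrier {         // hypothetical name for the inner namespace
template <typename T>
std::string describe(const T&) { return "lib"; }
}  // namespace adl_barrier

// Makes lib::describe usable when qualified. ADL ignores using-directives,
// so an unqualified describe(lib::widget{}) elsewhere does not find it.
using namespace adl_barrier;
}  // namespace lib

namespace app {
template <typename T>
std::string describe(const T&) { return "app"; }

// Only app::describe is a candidate here, so there is no ambiguity.
inline std::string unqualified() { return describe(lib::widget{}); }
// Qualified lookup follows the using-directive and finds the lib version.
inline std::string qualified()   { return lib::describe(lib::widget{}); }
}  // namespace app

inline void barrier_example()
{
    assert(app::unqualified() == "app");
    assert(app::qualified() == "lib");
}
```

Had describe been declared directly in namespace lib, the unqualified call in app would see two equally good templates and fail to compile.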

Related

what does type alias do in classes?

I got curious and looked at the implementation details of std::vector. I don't understand a lot of the code, but I was confused by the "using" declarations. I see that a lot of classes also include those using statements or typedefs within their classes. What is the point of a using declaration within a class? Does it introduce a new member variable or something? I am confused about how it works.
using iterator = _Vector_iterator<_Scary_val>;
using const_iterator = _Vector_const_iterator<_Scary_val>;
using reverse_iterator = _STD reverse_iterator<iterator>;
using const_reverse_iterator = _STD reverse_iterator<const_iterator>;
The using alias syntax was introduced in C++11 as a more flexible replacement for typedef, particularly in template metaprogramming (unlike a typedef, an alias declaration can itself be a template).
Whether typedef or using is used, standard containers define a common set of inner type aliases, so that code can be written in a more generic manner, particularly in template metaprogramming. For instance, you can write a function that accepts any standard container as input, and then use its inner aliases without having to know the type of the container itself. This allows you to easily switch between containers without having to re-write the code that is using the containers. For example [1]:
template<typename Container>
void doSomething(const Container &c) {
    for(typename Container::const_iterator iter = c.cbegin(); iter != c.cend(); ++iter) {
        // use *iter as needed ...
    }
}
[1]: obviously, there are better ways to write this nowadays, like for(const auto &elem : c) { ... }.
So, maybe one day you start out using std::vector, eg:
std::vector<int> v;
...
doSomething(v);
And then later on you decide to use std::list instead, eg:
std::list<int> l;
...
doSomething(l);
You can change the type of container used, without having to change the function itself.
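As a further sketch of the same idea, the hypothetical function sum below (not part of the original answer) uses Container::value_type for its return type and Container::const_iterator for its loop, so it works unchanged for std::vector and std::list.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical example: relies only on the container's inner aliases.
template <typename Container>
typename Container::value_type sum(const Container& c)
{
    typename Container::value_type result{};   // value-initialized (0 for arithmetic types)
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it)
        result += *it;
    return result;
}

inline void sum_example()
{
    std::vector<int> v{1, 2, 3};
    std::list<double> l{0.5, 1.5};
    assert(sum(v) == 6);       // returns int
    assert(sum(l) == 2.0);     // returns double
}
```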
These declarations declare type members of the class. Just as a class may have data members, it can also have type members (they take up no space, as they are compile-time constructs). Some of the type members are prescribed by the C++ standard. Some are just for convenience, so that you can use abbreviated type names instead of a type name that may not even fit on one line. Others are provided to interoperate with the C++ library, i.e. the library may place requirements on user types passed to it, requiring that certain member types are present within those types. Sometimes type aliases are used to expose an internal implementation detail in a form that is thus made public, i.e. a class may have some private template member types, a particular instantiation of which can be exposed through such an alias without exposing the generic (templated) type.
Finally, using type aliases helps with compiler performance in template-heavy code: if you have a choice between making a new type (one that e.g. derives from another type, or contains a type) and using a convenient alias, the alias will have significantly lower memory and performance impact during compilation. This only applies to template metaprogramming and is less of a concern in the less template-heavy code that modern C++ facilitates.
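The interoperability point can be illustrated with a small sketch. IntBag is a hypothetical user type, not a standard container; it works with std::back_inserter because it supplies the value_type member and the push_back function that std::back_insert_iterator relies on.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical user type; it only tracks a running total and a count.
class IntBag {
public:
    using value_type = int;   // type member required by std::back_insert_iterator
    void push_back(const value_type& v) { total_ += v; ++count_; }
    int total() const { return total_; }
    int count() const { return count_; }
private:
    int total_ = 0;
    int count_ = 0;
};

inline IntBag fill_from(const std::vector<int>& src)
{
    IntBag bag;
    std::copy(src.begin(), src.end(), std::back_inserter(bag));
    return bag;
}

inline void bag_example()
{
    IntBag bag = fill_from({1, 2, 3});
    assert(bag.total() == 6);
    assert(bag.count() == 3);
}
```

Removing the value_type alias makes the std::back_inserter line fail to compile, even though the alias adds no data to the class.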

What are customization point objects and how to use them?

The latest draft of the C++ standard introduces the so-called "customization point objects" ([customization.point.object]),
which are widely used by the ranges library.
I seem to understand that they provide a way to write custom versions of begin, swap, data, and the like, which are
found by the standard library via ADL. Is that correct?
How is this different from the previous practice, where a user defines an overload of e.g. begin for her type in her own
namespace? In particular, why are they objects?
What are customization point objects?
They are function object instances in namespace std that fulfill two objectives: first unconditionally trigger (conceptified) type requirements on the argument(s), then dispatch to the correct function in namespace std or via ADL.
In particular, why are they objects?
That's necessary to circumvent a second lookup phase that would directly bring in the user provided function via ADL (this should be postponed by design). See below for more details.
... and how to use them?
When developing an application: you mainly don't. This is a standard library feature, it will add concept checking to future customization points, hopefully resulting e.g. in clear error messages when you mess up template instantiations. However, with a qualified call to such a customization point, you can directly use it. Here's an example with an imaginary std::customization_point object that adheres to the design:
namespace a {
    struct A {};
    // Knows what to do with the argument, but doesn't check type requirements:
    void customization_point(const A&);
}
// Does concept checking, then calls a::customization_point via ADL:
std::customization_point(a::A{});
This is currently not possible with e.g. std::swap, std::begin and the like.
Explanation (a summary of N4381)
Let me try to digest the proposal behind this section in the standard. There are two issues with "classical" customization points used by the standard library.
They are easy to get wrong. As an example, swapping objects in generic code is supposed to look like this
template<class T> void f(T& t1, T& t2)
{
    using std::swap;
    swap(t1, t2);
}
but making a qualified call to std::swap(t1, t2) instead is all too easy, and then the user-provided
swap would never be called (see
N4381, Motivation and Scope)
More severely, there is no way to centralize (conceptified) constraints on types passed to such user provided functions (this is also why this topic gained importance with C++20). Again
from N4381:
Suppose that a future version of std::begin requires that its argument model a Range concept.
Adding such a constraint would have no effect on code that uses std::begin idiomatically:
using std::begin;
begin(a);
If the call to begin dispatches to a user-defined overload, then the constraint on std::begin
has been bypassed.
The solution that is described in the proposal mitigates both issues
by an approach like the following, imaginary implementation of std::begin.
namespace std {
namespace __detail {
    /* Classical definitions of function templates "begin" for
       raw arrays and ranges... */
    struct __begin_fn {
        /* Call operator template that performs concept checking and
         * invokes begin(arg). This is the heart of the technique.
         * Everything from above is already in the __detail scope, but
         * ADL is triggered, too. */
    };
}
/* Thanks to @cpplearner for pointing out that the global
   function object will be an inline variable: */
inline constexpr __detail::__begin_fn begin{};
}
First, a qualified call to e.g. std::begin(someObject) always detours via std::__detail::__begin_fn,
which is desired. For what happens with an unqualified call, I again refer to the original paper:
In the case that begin is called unqualified after bringing std::begin into scope, the situation
is different. In the first phase of lookup, the name begin will resolve to the global object
std::begin. Since lookup has found an object and not a function, the second phase of lookup is not
performed. In other words, if std::begin is an object, then using std::begin; begin(a); is
equivalent to std::begin(a); which, as we’ve already seen, does argument-dependent lookup on the
users’ behalf.
This way, concept checking can be performed within the function object in the std namespace,
before the ADL call to a user provided function is performed. There is no way to circumvent this.
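The mechanism can be tried out with a small, compilable sketch, assuming C++17 for inline variables. The namespace cpo stands in for std, and the calls counter, a::Buffer, and first_element are hypothetical additions made only so that the detour through the function object is observable; a real implementation would perform constraint checks where the counter is incremented.

```cpp
#include <cassert>
#include <iterator>
#include <utility>

namespace cpo {                       // hypothetical stand-in for namespace std
namespace detail {
using std::begin;                     // the "classical" overloads

inline int calls = 0;                 // only here to make the detour observable

struct begin_fn {
    template <typename R>
    auto operator()(R&& r) const -> decltype(begin(std::forward<R>(r)))
    {
        ++calls;                              // constraint checks would go here
        return begin(std::forward<R>(r));     // unqualified call: ADL is triggered
    }
};
}  // namespace detail

inline constexpr detail::begin_fn begin{};    // an object, not a function
}  // namespace cpo

namespace a {
struct Buffer { int data[3] = {1, 2, 3}; };
inline int* begin(Buffer& b) { return b.data; }   // user customization
}  // namespace a

inline int first_element(a::Buffer& b)
{
    using cpo::begin;
    // Lookup finds the object cpo::begin, so ADL is not performed here and
    // a::begin cannot be reached without going through begin_fn.
    return *begin(b);
}

inline void cpo_example()
{
    a::Buffer b;
    const int before = cpo::detail::calls;
    assert(first_element(b) == 1);
    assert(cpo::detail::calls == before + 1);
}
```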
"Customization point object" is a bit of a misnomer. Many - probably a majority - aren't actually customization points.
Things like ranges::begin, ranges::end, and ranges::swap are "true" CPOs. Calling one of those causes some complex metaprogramming to take place to figure out if there is a valid customized begin or end or swap to call, or if the default implementation should be used, or if the call should instead be ill-formed (in a SFINAE-friendly manner). Because a number of library concepts are defined in terms of CPO calls being valid (like Range and Swappable), correctly constrained generic code must use such CPOs. Of course, if you know the concrete type and another way to get an iterator out of it, feel free.
Things like ranges::cbegin are CPOs without the "CP" part. They always do the default thing, so it's not much of a customization point. Similarly, range adaptor objects are CPOs but there's nothing customizable about them. Classifying them as CPOs is more of a matter of consistency (for cbegin) or specification convenience (adaptors).
Finally, things like ranges::all_of are quasi-CPOs or niebloids. They are specified as function templates with special magical ADL-blocking properties and weasel wording to allow them to be implemented as function objects instead. This is primarily to prevent ADL picking up the unconstrained overload in namespace std when a constrained algorithm in std::ranges is called unqualified. Because the std::ranges algorithm accepts iterator-sentinel pairs, it's usually less specialized than its std counterpart and loses overload resolution as a result.

STL Extension/Modification Best Practice

I have been writing in C++ for a few months, and I am comfortable enough with it now to begin implementing my own library, consisting of things that I have found myself reusing again and again. One thing that nagged me was the fact that you always had to provide a beginning and end iterator for functions like std::accumulate, std::fill, etc.
The option to pass the container itself was completely absent, and it was simply an annoyance to write begin and end over and over. So I decided to add this functionality to my library, but I came across a problem: I couldn't figure out the best approach. Here were my general solutions:
1. Macros
- A macro that encapsulates an entire function call
ex. QUICK_STL(FCall)
- A macro that takes the container, function name, and optional args
ex. QUICK_STL(C,F,Args...)
2. Wrapper Function/Functor
- A class that takes the container, function name, and optional args
ex. quick_stl(F, C, Args...)
3. Overload Functions
- Overload every function in namespace std OR my library namespace
ex
namespace std { // or my library root namespace 'cherry'
    template <typename C, typename T>
    decltype(auto) count(const C& container, const T& value);
}
I usually steer clear of macros, but in this case they could certainly save a lot
of lines of code. With regard to function overloading, I must overload every single function that I want to use, which wouldn't really scale. The upside of that approach, though, is that you retain the names of the functions. With perfect forwarding and decltype(auto), overloading becomes a lot easier, but it will still take time to implement and would have to be modified whenever another function was added. As to whether I should add overloads to namespace std, I am rather skeptical that it would be appropriate in this case.
What would be the most appropriate way of going about overloading functions in namespace std (note that these functions will only serve as proxies to the original functions)?
You need to read this: Why do all functions take only ranges, not containers?
And This: STL algorithms: Why no additional interface for containers (additional to iterator pairs)?
I have been writing in C++ for a few months, and I am comfortable
enough with it now to begin implementing my own library...
Let me look on the brighter side and just say... Some of us have been there before.... :-)
One thing that nagged me was the fact that you always had to provide a
beginning and end iterator for functions like
std::accumulate, std::fill, etc.
That's why you have Boost.Range and Eric Niebler's proposed ranges, which seem unlikely to make it into C++17.
Macros
See Macros
Wrapper Function/Functor
Not too bad... Provided you do it correctly, you can do that; that's essentially what ranges do for containers. See the aforementioned implementations.
Overload Functions
Overload every function in namespace std ...
Don't do that... The C++ standard doesn't like it.
See what the standard has to say
§17.6.4.2.1 The behavior of a C++ program is undefined if it adds declarations or definitions to namespace std or to a namespace within
namespace std unless otherwise specified. A program may add a template
specialization for any standard library template to namespace std only
if the declaration depends on a user-defined type and the
specialization meets the standard library requirements for the
original template and is not explicitly prohibited.
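A minimal sketch of the wrapper approach that stays within these rules: the container-taking functions live in the asker's own namespace (cherry, as named in the question) and forward to the standard algorithms. The two wrappers shown are illustrative choices, not a complete library.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <numeric>
#include <vector>

namespace cherry {   // the user's own namespace; nothing is added to namespace std

template <typename C, typename T>
auto count(const C& container, const T& value)
{
    using std::begin;
    using std::end;
    return std::count(begin(container), end(container), value);
}

template <typename C, typename T>
T accumulate(const C& container, T init)
{
    using std::begin;
    using std::end;
    return std::accumulate(begin(container), end(container), init);
}

}  // namespace cherry

inline void cherry_example()
{
    std::vector<int> v{1, 2, 2, 3};
    int raw[] = {1, 1};
    assert(cherry::count(v, 2) == 2);
    assert(cherry::accumulate(v, 0) == 8);
    assert(cherry::count(raw, 1) == 2);   // built-in arrays work too
}
```

The function names are retained, as the question wanted, but each wrapper still has to be written by hand, so the scaling concern raised in the question remains.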

Is "using std::begin;" a good practice?

As I have read, begin(some_vector) is more standard than some_vector.begin() because of array support... and as far as I know, the use of using keyword is not really desirable behavior. However, I also see a lot of code that contains just these two usings:
using std::begin;
using std::end;
Is that considered good or bad practice? Especially when many begin and end are needed?
As I have read, begin(some_vector) is more standard than some_vector.begin() because of array support
It's not "more standard", both are 100% standard. It is more generic, because it works for arrays, but it's not actually very common to have an object and not know whether it's an array or a container type. If you have something that you know is an array then use std::begin(a) but if you have something that you know is not an array then there is no advantage to using the form that also works with arrays. If you're in a generic context where you might have either, then std::begin works for both cases.
the use of using keyword is not really desirable behavior.
That's debatable. There are very good reasons to avoid using-directives in most contexts (i.e. using namespace foo) but the same arguments don't apply to using-declarations that introduce a single name (i.e. using foo::bar). For example the recommended way to use std::swap is via using std::swap so it's certainly not true that the using keyword is undesirable in general.
Is that considered good or bad practice? Especially when many begin and end are needed?
I would say that in general it's a bad idea. As the other answers explain, it allows begin and end to be found by ADL, but that's not necessarily a good thing. Enabling ADL for begin and end can cause problems, that's why we changed range-based for at the last minute to not use using std::begin; using std::end; to avoid ambiguities, see N3257 for the details.
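The std::swap idiom mentioned in this answer can be sketched as follows. The namespace shop, the type Cart, and its custom_swaps counter are hypothetical; the counter exists only to show which swap was actually called.

```cpp
#include <cassert>
#include <utility>

namespace shop {                 // hypothetical user namespace
struct Cart {
    int items = 0;
    int custom_swaps = 0;        // counts calls to the user-provided swap
};

inline void swap(Cart& a, Cart& b) noexcept
{
    std::swap(a.items, b.items);
    ++a.custom_swaps;
    ++b.custom_swaps;
}
}  // namespace shop

template <typename T>
void swap_generic(T& a, T& b)
{
    using std::swap;             // fallback for types without their own swap
    swap(a, b);                  // ADL picks shop::swap for shop::Cart
}

template <typename T>
void swap_qualified(T& a, T& b)
{
    std::swap(a, b);             // never considers shop::swap
}

inline void swap_example()
{
    shop::Cart a, b;
    a.items = 1;
    b.items = 2;
    swap_generic(a, b);
    assert(a.items == 2 && b.items == 1);
    assert(a.custom_swaps == 1 && b.custom_swaps == 1);   // user swap ran
    swap_qualified(a, b);
    assert(a.items == 1 && b.items == 2);
    assert(a.custom_swaps == 1 && b.custom_swaps == 1);   // user swap did not run
}
```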
It can be helpful for enabling ADL. Something like:
template<typename T, typename F>
void apply(T& v, const F& f)
{
    using std::begin;
    using std::end;
    std::for_each(begin(v), end(v), f);
}
So then we can call apply with types that have begin/end functions, and with classic C arrays.
Otherwise, a using directive is acceptable in a source file or in a small scope, but it is a bad idea in a header.
Is that considered good or bad practice?
This depends a lot on the context; I think the answer is equally applicable to the more general question of introducing names via using. Limiting the scope of a using (to function level, for example) makes for more readable code; use it as required, and use it with care.
A particularly interesting case here revolves around ADL.
template <class Container>
void func(Container& c)
{
    std::for_each(begin(c), end(c), /* some functor*/);
}
ADL is already in play, since if the container is from the std namespace, the std::begin and std::end will be found. If the container is from a custom namespace, begin and end functions can be found in that namespace (for this discussion, I assume the container provider has also provided these).
If the container is a normal C-style array, the array form std::begin(T (&array)[N]) will not be found, since a C-style array is not in the std namespace (an array of a fundamental type has no associated namespaces at all). Introducing using std::begin here allows the code to be used for both arrays and containers.
template <class Container>
void func(Container& c)
{
    using std::begin; // if c is an array, the function will still compile
    using std::end;
    std::for_each(begin(c), end(c), /* some functor*/);
}
// ...
int a[100];
func(a); // works as expected
It depends. Using declarations (and, even more so, using directives) in header files are heavily discouraged. However, if the body of a function calls one or several functions from another namespace many times, a using declaration or directive placed inside that body (and so limited to its scope) can make the code more readable.

How to use begin() free function

I am currently writing a function template that deals with a generic container. I want to use std::begin() and std::end(), because of the reasons mentioned in this question. My question is whether I should use:
std::begin( myContainer )
Or:
using namespace std; // Better use: "using std::begin"
begin( myContainer )
Or, in other words, is it okay to overload begin() within the std namespace? Should I allow users of my function to overload the begin() function in the global namespace somewhere else as well? How does the STL deal with it?
There's no need for a using directive, so let's assume the second snippet contains a using declaration instead.
using std::begin;
If you're creating your own container to go with this function template, provide Container::begin() and Container::end() member functions, and then it doesn't make a difference whether you use the first or the second. std::begin() and std::end() will call the respective member functions when available (§24.7 [iterator.range]).
On the other hand, if you're creating a function template that should work with any container, whether one from the standard library or a custom container, I'd recommend the second approach.
using std::begin;
begin( myContainer );
Note that that will enable ADL to find user defined overloads for free functions begin() and end() within the same namespace as the container definition. The overloads should not be added to namespace std or the global namespace (unless the container definition is also in the global namespace). In the absence of these free function overloads, std::begin will be called (because of the using declaration) and this in turn will call Container::begin().
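Both paths described above can be sketched with two hypothetical types: geo::Polyline, which supplies free begin/end overloads in its own namespace, and Ring, which supplies only member functions. The function sum is likewise a made-up example of a generic function using the idiom.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

namespace geo {                                   // hypothetical user namespace
struct Polyline { int pts[4] = {1, 2, 3, 4}; };   // no member begin()/end()

// Free overloads in the same namespace as the type, found by ADL.
inline const int* begin(const Polyline& p) { return p.pts; }
inline const int* end(const Polyline& p)   { return p.pts + 4; }
}  // namespace geo

struct Ring {                                     // member functions only
    std::vector<int> v{10, 20};
    std::vector<int>::const_iterator begin() const { return v.begin(); }
    std::vector<int>::const_iterator end() const   { return v.end(); }
};

template <typename C>
int sum(const C& c)
{
    using std::begin;   // used when no better free overload is found
    using std::end;
    int s = 0;
    for (auto it = begin(c); it != end(c); ++it) s += *it;
    return s;
}

inline void lookup_example()
{
    assert(sum(geo::Polyline{}) == 10);   // geo::begin / geo::end via ADL
    assert(sum(Ring{}) == 30);            // std::begin -> Ring::begin()
}
```

A qualified std::begin(c) inside sum would still work for Ring but would fail to compile for geo::Polyline.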
It's not okay to overload something in std namespace, only specializations are allowed. If you want to enable ADL you can use
using std::begin;
begin(myContainer)
For a custom container, std::begin actually can call begin in your container. So if you have MyContainerClass::begin that will be enough. Same with std::end and the constant-iterator versions std::cbegin and std::cend.