STL Extension/Modification Best Practice - c++

I have been writing in c++ for a few months, and i am comfortable enough with it now to begin implementing my own library, consisting of things that i have found myself reusing again and again. One thing that nagged me was the fact that you always had to provide a beginning and end iterator for functions like std::accumulate,std::fill etc...
The option to provide a qualified container was completely absent and it was simply an annoyance to write begin and end over and over. So, I decided to add this functionality to my library, but i came across problem, i couldn't figure out the best approach of doing so. Here were my general solutions:
1. Macros
- A macro that encapsulates an entire function call
ex. QUICK_STL(FCall)
- A macro that takes the container, function name, and optional args
ex. QUICK_STL(C,F,Args...)
2. Wrapper Function/Functor
- A class that takes the container, function name, and optional args
ex. quick_stl(F, C, Args...)
3. Overload Functions
- Overload every function in namespace std OR my library namespace
ex
namespace std { // or my library root namespace 'cherry'
template <typename C, typename T>
decltype(auto) count(const C& container, const T& value);
}
I usually steer clear of macros, but in this case it could certainty save alot
of lines of code from being written. With regards to function overloading, every single function that i want to use i must overload, which wouldn't really scale. The upside to that approach though is that you retain the names of the functions. With perfect forwarding and decltype(auto) overloading becomes alot easier, but still will take time to implement, and would have to be modified if ever another function was added. As to whether or not i should overload the std namespace i am rather skeptical on whether or not it would be appropriate in this case.
What would be the most appropriate way of going about overloading functions in the STD namespace (note these functions will only serve as proxy's to the original functions)?

You need to read this: Why do all functions take only ranges, not containers?
And This: STL algorithms: Why no additional interface for containers (additional to iterator pairs)?
I have been writing in c++ for a few months, and i am comfortable
enough with it now to begin implementing my own library...
Let me look on the brighter side and just say... Some of us have been there before.... :-)
One thing that nagged me was the fact that you always had to provide a
beginning and end iterator for functions like
std::accumulate,std::fill etc...
That's why you have Boost.Ranges and the Eric's proposed ranges that seems like it isn't gonna make it to C++17.
Macros
See Macros
Wrapper Function/Functor
Not too bad...Provided you do it correctly, You can do that, that's what essentially Ranges do for Containers... See the aforementioned implementations
Overload Functions
Overload every function in namespace std ...
Don't do that... The C++ standard doesn't like it.
See what the standard has to say
$17.6.4.2.1 The behavior of a C++ program is undefined if it adds declarations or definitions to namespace std or to a namespace within
namespace std unless otherwise specified. A program may add a template
specialization for any standard library template to namespace std only
if the declaration depends on a user-defined type and the
specialization meets the standard library requirements for the
original template and is not explicitly prohibited.

Related

Which functions in standard C++ library should not be prefixed with std:: when used?

When I program in C++, instead of writing using namespace std;, I generally tend to use std:: prefixed components like std::cout, std::cin etc. But then I came across ADL and why you should use using std::swap;.
Many components of the standard library (within std) call swap in an unqualified manner to allow custom overloads for non-fundamental types to be called instead of this generic version: Custom overloads of swap declared in the same namespace as the type for which they are provided get selected through argument-dependent lookup over this generic version.
But in all sources about ADL, they only mention std::swap. Are there any other functions like this that I have to be beware of when using? Or for all other functions should I use fully qualified name? Or should I use unqualified name for every function in std::?
EDIT: (I wasn't clear when phrasing my initial question. Here is what I exactly intended when I was writing the question.)
Is there any other function in C++ standard libraries that is a popular candidate for ADL based customization much like std::swap, so that when I use them, I have to be cautious to use using std::foo; foo(); form instead of invoking std::foo(); directly?
Suppose you write a library that works with user supplied types. Then you might want to provide a default implementation to foo a bar. But at the same time you want to enable the user to provide their custom implementation because they might know better how to foo their custom bar.
Thats exactly what happens in the standard library with std::swap. The code that relies on ADL is inside libraries, it works with custom types, types it doesn't know until instantiated. And it can provide a default std::swap but at the same time allows the user to provide their own implementation.
In your user-code, when you know the type, then typically you want to know what function is called. Typically you do know if you want to call my::swap or std::swap and you do not need ADL to choose.
That being said, if you have to ask this quesiton then the answer is: Do not make use of ADL. (I know "If you have to ask.." is somewhat silly, but sometimes is applies just too well). Even in library code you do not always want to enable ADL, or allow it as customization. Even if you do write generic code, there might be situations where you need your way to foo a bar without allowing the user to customize it. It really depends.
Short answer: If you want to call std::swap then call std::swap. If you want to call my::swap then call my::swap. If you want to make use of ADL you will know what to do.
PS: There are more examples of ADL that you know but might not be aware of. You almost never use the std:: prefix when you call an operator overload for example. For example std::string operator+(const std::string&,const std::string&). You could write
std::string a,b;
auto c = std::operator+(a,b);
but you don't. You are used to write
auto c = a + b;
And this only works due to ADL.
PPS: Even for std::swap its not right that it "should not be prefixed with std::". As outlined above, sometimes you do want to make use of ADL and sometimes you don't. That decision is not per function, but per use case. If you write code that swaps two objects and for some reason you want to call std::swap and nothing else, then you call std::swap. If you want to make use of ADL then you make use of ADL.

What are customization point objects and how to use them?

The last draft of the c++ standard introduces the so-called "customization point objects" ([customization.point.object]),
which are widely used by the ranges library.
I seem to understand that they provide a way to write custom version of begin, swap, data, and the like, which are
found by the standard library by ADL. Is that correct?
How is this different from previous practice where a user defines an overload for e.g. begin for her type in her own
namespace? In particular, why are they objects?
What are customization point objects?
They are function object instances in namespace std that fulfill two objectives: first unconditionally trigger (conceptified) type requirements on the argument(s), then dispatch to the correct function in namespace std or via ADL.
In particular, why are they objects?
That's necessary to circumvent a second lookup phase that would directly bring in the user provided function via ADL (this should be postponed by design). See below for more details.
... and how to use them?
When developing an application: you mainly don't. This is a standard library feature, it will add concept checking to future customization points, hopefully resulting e.g. in clear error messages when you mess up template instantiations. However, with a qualified call to such a customization point, you can directly use it. Here's an example with an imaginary std::customization_point object that adheres to the design:
namespace a {
struct A {};
// Knows what to do with the argument, but doesn't check type requirements:
void customization_point(const A&);
}
// Does concept checking, then calls a::customization_point via ADL:
std::customization_point(a::A{});
This is currently not possible with e.g. std::swap, std::begin and the like.
Explanation (a summary of N4381)
Let me try to digest the proposal behind this section in the standard. There are two issues with "classical" customization points used by the standard library.
They are easy to get wrong. As an example, swapping objects in generic code is supposed to look like this
template<class T> void f(T& t1, T& t2)
{
using std::swap;
swap(t1, t2);
}
but making a qualified call to std::swap(t1, t2) instead is too simple - the user-provided
swap would never be called (see
N4381, Motivation and Scope)
More severely, there is no way to centralize (conceptified) constraints on types passed to such user provided functions (this is also why this topic gained importance with C++20). Again
from N4381:
Suppose that a future version of std::begin requires that its argument model a Range concept.
Adding such a constraint would have no effect on code that uses std::begin idiomatically:
using std::begin;
begin(a);
If the call to begin dispatches to a user-defined overload, then the constraint on std::begin
has been bypassed.
The solution that is described in the proposal mitigates both issues
by an approach like the following, imaginary implementation of std::begin.
namespace std {
namespace __detail {
/* Classical definitions of function templates "begin" for
raw arrays and ranges... */
struct __begin_fn {
/* Call operator template that performs concept checking and
* invokes begin(arg). This is the heart of the technique.
* Everyting from above is already in the __detail scope, but
* ADL is triggered, too. */
};
}
/* Thanks to #cpplearner for pointing out that the global
function object will be an inline variable: */
inline constexpr __detail::__begin_fn begin{};
}
First, a qualified call to e.g. std::begin(someObject) always detours via std::__detail::__begin_fn,
which is desired. For what happens with an unqualified call, I again refer to the original paper:
In the case that begin is called unqualified after bringing std::begin into scope, the situation
is different. In the first phase of lookup, the name begin will resolve to the global object
std::begin. Since lookup has found an object and not a function, the second phase of lookup is not
performed. In other words, if std::begin is an object, then using std::begin; begin(a); is
equivalent to std::begin(a); which, as we’ve already seen, does argument-dependent lookup on the
users’ behalf.
This way, concept checking can be performed within the function object in the std namespace,
before the ADL call to a user provided function is performed. There is no way to circumvent this.
"Customization point object" is a bit of a misnomer. Many - probably a majority - aren't actually customization points.
Things like ranges::begin, ranges::end, and ranges::swap are "true" CPOs. Calling one of those causes some complex metaprogramming to take place to figure out if there is a valid customized begin or end or swap to call, or if the default implementation should be used, or if the call should instead be ill-formed (in a SFINAE-friendly manner). Because a number of library concepts are defined in terms of CPO calls being valid (like Range and Swappable), correctly constrained generic code must use such CPOs. Of course, if you know the concrete type and another way to get an iterator out of it, feel free.
Things like ranges::cbegin are CPOs without the "CP" part. They always do the default thing, so it's not much of a customization point. Similarly, range adaptor objects are CPOs but there's nothing customizable about them. Classifying them as CPOs is more of a matter of consistency (for cbegin) or specification convenience (adaptors).
Finally, things like ranges::all_of are quasi-CPOs or niebloids. They are specified as function templates with special magical ADL-blocking properties and weasel wording to allow them to be implemented as function objects instead. This is primarily to prevent ADL picking up the unconstrained overload in namespace std when a constrained algorithm in std::ranges is called unqualified. Because the std::ranges algorithm accepts iterator-sentinel pairs, it's usually less specialized than its std counterpart and loses overload resolution as a result.

ADL and container functions (begin, end, etc)

C++11 and later define free functions begin, end, empty, etc in namespace std. For most containers these functions invoke the corresponding member function. But for some containers (like valarray) these free functions are overloaded (initializer_list does not have a member begin()). So to iterate over any container free functions should be used and to find functions for container from namespaces other than std ADL should be used:
template<typename C>
void foo(C c)
{
using std::begin;
using std::end;
using std::empty;
if (empty(c)) throw empty_container();
for (auto i = begin(c); i != end(c); ++i) { /* do something */ }
}
Question 1: Am I correct? Are begin and end expected to be found via ADL?
But ADL rules specify that if type of an argument is a class template specialization ADL includes namespaces of all template arguments. And then Boost.Range library comes into play, it defines boost::begin, boost::end, etc. These functions are defined like this:
template< class T >
inline BOOST_DEDUCED_TYPENAME range_iterator<T>::type begin( T& r )
{
return range_begin( r );
}
If I use std::vector<boost::any> and a Boost.Range I run into trouble. std::begin and boost::begin overloads are ambiguous. That it, I can not write template code that will find a free begin via ADL. If I explicitly use std::begin I expect that any non-std:: container has a member begin.
Question 2: What shall I do in this case?
Rely on the presence of member function? Simplest way.
Ban Boost.Range? Well, algorithms that take container instead of a pair of iterators are convinient. Boost.Range adaptors (containers that lazily apply an algorithm to a container) are also convinient. But if I do not use Boost.Range in my code it still can be used in a boost library (other than Range). This make template code really fragile.
Ban Boost?
A few years ago, I had a similar problem where my code suddenly started getting ambiguities between std::begin and boost::begin. I found it was due to using Boost.Operator to aid in defining a class, even though it was not even a public base class or apparent to the user of the types involved. A random change somewhere caused #include <boost/range/begin.hpp> to be present in the nested include files somewhere, thus making boost::begin visible to the compiler.
I complained to the mailing list about the putting of classes directly in the Boost namespace, rather than in a nested class and exposing them via using declarations; all stuff defined directly in the Boost namespace could potentially step on each other via accidental ADL.
I just tried to reproduce this today, and it seems quite resilient against such ambiguities now! Looking at the definition, boost::begin is itself in an inner namespace so it can never be found via unqualified lookup if you had not supplied your own using boost::begin; in your own scope.
I don’t know how long ago this fix took place. (If you can still reproduce it, please post a complete program with version and platform details.)
So your answer is:
For Boost, don’t worry about it anymore (upgrade Boost if necessary).
For new code, never define a free function named begin in the same namespace as any of its defined types.

Is "using std::begin;" a good practice?

As I have read, begin(some_vector) is more standard than some_vector.begin() because of array support... and as I know also, the use of using keyword is not really desirable behavior. However, I also see lot of code that contains just these two usings:
using std::begin;
using std::end;
Is that considered good or bad practice? Especially when many begin and end are needed?
As I have read, begin(some_vector) is more standard than some_vector.begin() because of array support
It's not "more standard", both are 100% standard. It is more generic, because it works for arrays, but it's not actually very common to have an object and not know whether it's an array or a container type. If you have something that you know is an array then use std::begin(a) but if you have something that you know is not an array then there is no advantage to using the form that also works with arrays. If you're in a generic context where you might have either, then std::begin works for both cases.
the use of using keyword is not really desirable behavior.
That's debatable. There are very good reasons to avoid using-directives in most contexts (i.e. using namespace foo) but the same arguments don't apply to using-declarations that introduce a single name (i.e. using foo::bar). For example the recommended way to use std::swap is via using std::swap so it's certainly not true that the using keyword is undesirable in general.
Is that considered good or bad practice? Especially when many begin and end are needed?
I would say that in general it's a bad idea. As the other answers explain, it allows begin and end to be found by ADL, but that's not necessarily a good thing. Enabling ADL for begin and end can cause problems, that's why we changed range-based for at the last minute to not use using std::begin; using std::end; to avoid ambiguities, see N3257 for the details.
It can be helpful for ADL help. Something like:
template<typename T, typename F>
void apply(T& v, const F& f)
{
using std::begin;
using std::end;
std::for_each(begin(v), end(v), f);
}
So, than we just can call apply with types, that have begin/end functions and classic C arrays.
In other case it's actually okay to use using directive in source file, in little scope and so on, but it's bad to use in header.
Is that considered good or bad practice?
This depends a lot on the context; I think the answer is equally applicable to the more general question of introducing names via a using. Limit the scope of the use of using makes for more readable code (such as function level scope); use it as required, use it with care.
A particularly interesting case here revolves around ADL.
template <class Container>
void func(Container& c)
{
std::for_each(begin(c), end(c), /* some functor*/);
}
ADL is already in play, since if the container is from the std namespace, the std::begin and std::end will be found. If the container is from a custom namespace, begin and end functions can be found in that namespace (for this discussion, I assume the container provider has also provided these).
If the container is a normal C-style array, the array form std::begin(T (&array)[N]) will not be found since the C-style array is not the std namespace. Introducing a using std::begin here allows the code to be used for arrays and containers.
template <class Container>
void func(Container& c)
{
using std::begin; // if c is an array, the function will still compile
using std::end;
std::for_each(begin(c), end(c), /* some functor*/);
}
// ...
int a[100];
func(a); // works as expected
Demo.
It depends. Using declarations (and, far more, using directives) in header files are heavily discouraged. However, if inside the body of a function you use a lot of times one or several functions from another namespace, a using declaration / directives (put inside the body of the function, so limited to its scope) can make the code more readable.

Should every class have its own namespace?

Something that has been troubling me for a while:
The current wisdom is that types should be kept in a namespace that only
contains functions which are part of the type's non-member interface (see C++ Coding Standards Sutter and Alexandrescu or here) to prevent ADL pulling in unrelated definitions.
Does this imply that all classes must have a namespace of their own? If
we assume that a class may be augmented in the future by the addition of
non-member functions, then it can never be safe to put two types in the
same namespace as either one of them may introduce non-member functions
that could interfere with the other.
The reason I ask is that namespaces are becoming cumbersome for me. I'm
writing a header-only library and I find myself using classes names such as
project::component::class_name::class_name. Their implementations call
helper functions but as these can't be in the same namespace they also have
to be fully qualified!
Edit:
Several answers have suggested that C++ namespaces are simply a mechanism for avoiding name clashes. This is not so. In C++ functions that take a parameter are resolved using Argument Dependent Lookup. This means that when the compiler tries to find a function definition that matches the function name it will look at every function in the same namespace(s) as the type(s) of its parameter(s) when finding candidates.
This can have unintended, unpleasant consequences as detailed in A Modest Proposal: Fixing ADL. Sutter and Alexandrescu's rule states never put a function in the same namespace as a class unless it is meant to be part of the interface of that class. I don't see how I can obey that rule unless I'm prepared to give every class its own namespace.
More suggestions very welcome!
No. I have never heard that convention. Usually each library has its own namespace, and if that library has multiple different modules (e.g. different logical units that differ in functionality), then those might have their own namespace, although one namespace per library is sufficient. Within the library or module namespace, you might use namespace detail or an anonymous namespace to store implementation details. Using one namespace per class is, IMHO, complete overkill. I would definitely shy away from that. At the same time, I would strongly to urge you to have at least one namespace for your library and put everything within that one namespace or a sub-namespace thereof to avoid name clashes with other libraries.
To make this more concrete, allow me to use the venerable Boost C++ Libraries as an example. All of the elements within boost reside in boost::. There are some modules within Boost, such as the interprocess library that have its own namespace such as boost::interprocess::, but for the most part, elements of boost (especially those used very frequently and across modules) simply reside in boost::. If you look within boost, it frequently uses boost::detail or boost::name_of_module::detail for storing implementation details for the given namespace. I suggest you model your namespaces in that way.
No, no and a thousand times no! Namespaces in C++ are not architectural or design elements. They are simply a mechanism for preventing name clashes. If in practice you don't have name clashes, you don't need namespaces.
To avoid ADL, you need only two namespaces: one with all your classes, and the other with all your loose functions. ADL is definitely not a good reason for every class to have its own namespace.
Now, if you want some functions to be found via ADL, you might want to make a namespace for that purpose. But it's still quite unlikely that you'd actually need a separate namespace per class to avoid ADL collisions.
Probably not. See Eric Lippert's post on the subject.
Couple things here:
Eric Lippert is a C# designer, but what he's saying about bad hierarchical design applies here.
A lot of what is being described in that article has to do with naming your class the same thing as a namespace in C#, but many of the same pitfalls apply to C++.
You can save on some of the typedef pain by using typedefs but that's of course only a band-aid.
It's quite an interesting paper, but then given the authors there was a good chance it would be. However, I note that the problem concerns mostly:
typedef, because they only introduce an alias and not a new type
templates
If I do:
namespace foo
{
class Bar;
void copy(const Bar&, Bar&, std::string);
}
And invoke it:
#include <algorithms>
#include "foo/bar.h"
int main(int argc, char* argv[])
{
Bar source; Bar dest;
std::string parameter;
copy(source, dest, parameter);
}
Then it should pick foo::copy. In fact it will consider both foo::copy and std::copy but foo::copy not being template will be given priority.