Should I make my functions as general as possible? - c++

template<class T>
void swap(T &a, T &b)
{
    T t;
    t = a;
    a = b;
    b = t;
}
to replace
void swap(int &a, int &b)
{
    int t;
    t = a;
    a = b;
    b = t;
}
This is the simplest example I could come up with, but there are many other, more complicated functions. Should I make every function I write a template whenever possible?
Are there any disadvantages to doing this?
Thanks.

Genericity has the advantage of reusability. However, write things generically only if:
It doesn't take much more time than doing it non-generically
It doesn't complicate the code more than a non-generic solution would
You know you will benefit from it later
However, know your standard library. The case you presented already exists in the STL as std::swap.
Also, remember that when writing generic code with templates, you can optimize special cases using template specialization. However, only do that when it's actually needed for performance, not as you first write the code.
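For instance, a minimal sketch of that idea (the BigBuffer type and its swap_fast member are invented here purely for illustration):
#include <utility>

// Generic version: works for any movable type.
template<class T>
void my_swap(T &a, T &b)
{
    T t = std::move(a);
    a = std::move(b);
    b = std::move(t);
}

// Hypothetical heavyweight type that knows how to swap itself cheaply.
struct BigBuffer {
    int *data = nullptr;  // imagine this owns a large allocation
    void swap_fast(BigBuffer &other) { std::swap(data, other.data); }  // swaps pointers only
};

// Full specialization: callers still write my_swap(x, y) and get the fast path.
template<>
void my_swap<BigBuffer>(BigBuffer &a, BigBuffer &b)
{
    a.swap_fast(b);
}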
Also, note that there is a question of run-time versus compile-time performance here. Template-based solutions increase compile time. Inlined solutions can, but do not necessarily, decrease run time.
'Cause "premature optimization and genericity is the root of all evil". And you can quote me on that -_-.

Reusable code is reusable only if you actually reuse it. So write the function naturally in the first instance. If a bit later you come across a situation where the code could be reused with a little tweak, go back and refactor it. The refactoring stage is when you should consider writing template functions.

The simplest answer to your question is what many people smarter than myself have been saying for years:
Never write more than the minimum you can get away with.

Make them as generic as you can trivially make them. If it's truly trivial (as in the above example), then it takes no extra work and might save you some work in the future.

The first time you write swap, you shouldn't.
The second time it might be tempting, but sometimes you can get away without turning the whole thing into a mess.
The third time it should be clear that you must. However, depending on how many places you've used the first and second versions, generalizing at that point might be time-consuming, so the second time is often the right moment to decide.

There are disadvantages to using templates all the time. They can greatly increase the compilation time of your program and can make compilation errors harder to understand.
As taldor said, don't make your functions more generic than they need to be.

Take a look at the function's parameters and the way they are used. If all operations are done through overloaded operators, the function may be very generic and a good candidate to become a template. Otherwise, the presence of very specialized class types and function calls may make generic reuse very problematic, and any flexibility you need is probably better achieved through polymorphism.
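As a rough illustration (all names here are invented), a function that relies only on operators templates cleanly, while one tied to a concrete class gains little from being generic:
// Uses only operator< and copying: works for int, double, std::string, ...
template<class T>
const T& smaller(const T& a, const T& b)
{
    return (b < a) ? b : a;
}

// Tied to one concrete type and its member functions: little to gain from templating.
struct Logger { void flush_to_disk() {} };
void finalize(Logger& log) { log.flush_to_disk(); }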

A few thoughts:
Know the STL. There is std::swap already. Instead of spending your time making everything as generic as possible, spend your time becoming more familiar with the STL.
Don't do it till you need it: "Always implement things when you actually need them, never when you just foresee that you need them."---Ron Jeffries. If you don't actually reuse the code, you didn't write reusable code; you wrote unnecessary code. Unnecessary code is expensive to develop, expensive to test, and expensive to maintain. Don't forget opportunity cost!
Keep things simple: "Make everything as simple as possible, but not simpler."---Albert Einstein. This is KISS.

Related

Efficiency of std::bind vs lambda

I have searched around a bit and found many examples and discussions of cases where you would use std::bind instead of a lambda, but the burning question I have is whether or not there is any performance benefit to one over the other. I will describe my use case:
I have a generic A* implementation, to which I pass successor, heuristic distance, and move cost functions.
Here is an example of my heuristic function ready to be passed off for a search (in both forms):
std::function<float(const Location*, const Location*)> hdist =
    std::bind(&TerrainMap::straightLineDist, this, std::placeholders::_1, std::placeholders::_2);
std::function<float(const Location*, const Location*)> hdist2 =
    [this](const Location* a, const Location* b) {
        return straightLineDist(a, b);
    };
Is there any difference in the performance of these approaches? I realize the difference is probably negligible but I am curious enough to want to know.
Is there any difference in the performance of these approaches?
Perhaps, perhaps not; as the commenters suggest, profile to check, or look at the assembly code you get (e.g. using the GodBolt Compiler Explorer). But you're asking the wrong question, for two main reasons:
You should probably not be passing lambdas, nor bind() results, around in the performance-critical parts of your code.
You should definitely avoid invoking arbitrary functions via function pointers or std::function variables in performance-critical areas of your code (unless the compiler can de-virtualize and inline the calls).
and one minor reason:
Lambdas (and std::bind() results) are usable, and useful, without being wrapped in std::function; that wrapper has its own performance penalty, so you would only be comparing one way of using these constructs.
Bottom line recommendation: just use lambdas. They're cleaner, easier to understand, cheaper to compile, and more flexible syntactically. So don't worry and be happy :-) . And in performance-critical code, either use lambdas without std::function, or don't use either of the two.
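To make the last point concrete, here is a minimal sketch (the function names are invented) of passing the same lambda through std::function versus through a template parameter, where the compiler keeps the exact type and can usually inline the call:
#include <functional>

// Hypothetical distance function used below.
float straight_line_dist(float ax, float ay, float bx, float by)
{
    float dx = ax - bx, dy = ay - by;
    return dx * dx + dy * dy;
}

// Type-erased: accepts anything callable, but each call may go through an indirect jump.
float eval_erased(const std::function<float(float, float, float, float)>& h)
{
    return h(0.f, 0.f, 3.f, 4.f);
}

// Templated: the callable's exact type is known, so the call can usually be inlined.
template<class H>
float eval_inlined(H&& h)
{
    return h(0.f, 0.f, 3.f, 4.f);
}

int main()
{
    auto h = [](float ax, float ay, float bx, float by) {
        return straight_line_dist(ax, ay, bx, by);
    };
    eval_erased(h);   // wraps the lambda in a std::function first
    eval_inlined(h);  // no wrapper, no type erasure
}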

Explicit function template specialization - Why?

I keep reading and researching different posts, C++ books, and articles, and so far nobody has explained the rationale for this construct to me. It makes no sense and it's really bugging me. The whole point of a template is to parameterize types to functions (or classes, but I'm talking specifically about function templates, not class templates). Why use funny template syntax without the type parameter???
//this seems ridiculous. why would anybody ever use this?
template<> void Swap(int& a, int& b) {}
//I would always use this if I needed to take care of a special case, no?
void Swap(int& a, int& b) {}
What am I missing? I would really appreciate some insight, and I do understand that function template specialization is not all that useful in practice anyway, but I still want to understand why it was invented in the first place. Whoever came up with it must have had a reason which seemed compelling enough at the time.
Thanks.
Great question! Function template specialisation is a bit niche and not generally worth it. You might be a bit confused as to the reason though.
You ask why anyone would use the funny template syntax without a type parameter. There's plenty of use for it! Specialising templates is very important and useful, and the good old swap example is a classic reason to consider it. A template embodies the idea of a generic algorithm that works with any type, but often, if you know a bit about the type, you can drop in a much better algorithm without the calling code needing to know that anything different is happening under the hood. Only the compiler knows, and it pulls in the best implementation for the real types at the point where the algorithm is instantiated with specific types, so your fast swap happens without the sorting algorithm needing special cases. Specialisation is a key part of making generic programming useful in the real world (otherwise we'd have to fall back to un-generic versions to get the performance we need).
Function template specialisation though is a bit niche, for more obscure reasons. I guess you've read Herb Sutter's summary? So, if you don't want to be caught out, it's a good idea to avoid specialising function templates. (std::swap is an example though of something you have to specialise rather than overload if you want to be ultra-conformant to the standard. We do this widely in our codebase here and it works well in practice, though overloading would probably work well enough too.)
So, please, specialise away all you like. Having class template specialisations, far from being "ridiculous", is often vital; function template specialisation just isn't as useful.
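To see why function template specialisation can catch you out, here is roughly the classic example from Sutter's write-up (a sketch; the comments describe what I'd expect from the rules):
#include <iostream>

template<class T> void f(T)      { std::cout << "primary template (1)\n"; }
template<>        void f<>(int*) { std::cout << "specialization of (1)\n"; }
template<class T> void f(T*)     { std::cout << "primary template (2)\n"; }

int main()
{
    int x = 0;
    f(&x);  // prints "primary template (2)": only the primary templates take part
            // in overload resolution, (2) is the better match, and the explicit
            // specialization belongs to (1), so it is never considered
}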

Compile time vs run time polymorphism in C++ advantages/disadvantages

In C++ when it is possible to implement the same functionality using either run time (sub classes, virtual functions) or compile time (templates, function overloading) polymorphism, why would you choose one over the other?
I would think that the compiled code would be larger for compile time polymorphism (more method/class definitions created for template types), and that compile time would give you more flexibility, while run time would give you "safer" polymorphism (i.e. harder to be used incorrectly by accident).
Are my assumptions correct? Are there any other advantages/disadvantages to either? Can anyone give a specific example where both would be viable options but one or the other would be a clearly better choice?
Also, does compile time polymorphism produce faster code, since it is not necessary to call functions through vtable, or does this get optimized away by the compiler anyway?
Example:
class Base
{
public:
    virtual void print() = 0;
    virtual ~Base() = default;
};

class Derived1 : public Base
{
public:
    void print() override
    {
        //do something different
    }
};

class Derived2 : public Base
{
public:
    void print() override
    {
        //do something different
    }
};

//Run time
void print(Base &o)
{
    o.print();
}

//Compile time
template<typename T>
void print(T &o)
{
    o.print();
}
Static polymorphism produces faster code, mostly because of the possibility of aggressive inlining. Virtual functions can rarely be inlined, and then mostly in "non-polymorphic" scenarios. See this item in the C++ FAQ. If speed is your goal, you basically have no choice.
On the other hand, not only compile times but also the readability and debuggability of the code are much worse when using static polymorphism. For instance: abstract methods are a clean way of enforcing the implementation of certain interface methods. To achieve the same goal using static polymorphism, you need to resort to concept checking or the curiously recurring template pattern (CRTP).
The only situation when you really have to use dynamic polymorphism is when the implementation is not available at compile time; for instance, when it's loaded from a dynamic library. In practice though, you may want to exchange performance for cleaner code and faster compilation.
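As a minimal sketch of the CRTP route mentioned above (class names invented), this is roughly what enforcing an "abstract" print at compile time looks like:
// Compile-time analogue of an abstract print() method, via CRTP.
template<class Derived>
struct Printable
{
    void print() { static_cast<Derived&>(*this).do_print(); }
    // If Derived forgets to define do_print(), the error only appears
    // when print() is actually instantiated, and it is far less readable
    // than "unimplemented pure virtual function".
};

struct Report : Printable<Report>
{
    void do_print() { /* do something different */ }
};

int main()
{
    Report r;
    r.print();  // statically dispatched, easily inlined
}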
After you filter out the obviously bad and suboptimal cases, I believe you're left with almost nothing. IMO it is pretty rare to actually face that kind of choice. You could improve the question by stating an example, and for that a real comparison could be provided.
Assuming we do have that realistic choice, I'd go for the compile-time solution: why waste run time on something that isn't absolutely necessary? Also, if something is decided at compile time, it is easier to reason about, follow in your head, and evaluate.
Virtual functions, just like function pointers, make it impossible to build accurate call graphs. You can review things from the bottom, but not easily from the top. Virtual functions are supposed to follow certain rules, but if they don't, you have to look at all of them to find the offender.
There are also some performance losses, probably not a big deal in the majority of cases, but if there's nothing on the other side of the balance, why pay them?
In C++ when it is possible to implement the same functionality using either run time (sub classes, virtual functions) or compile time (templates, function overloading) polymorphism, why would you choose one over the other?
I would think that the compiled code would be larger for compile time polymorphism (more method/class definitions created for template types)...
Often yes - due to multiple instantiations for different combinations of template parameters, but consider:
with templates, only the functions actually called are instantiated
dead code elimination
constant array dimensions allowing member variables such as T mydata[12]; to be allocated with the object, automatic storage for local variables etc., whereas a runtime polymorphic implementation might need to use dynamic allocation (i.e. new[]) - this can dramatically impact cache efficiency in some cases
inlining of function calls, which makes trivial things like small-object get/set operations about an order of magnitude faster on the implementations I've benchmarked
avoiding virtual dispatch, which amounts to following a pointer to a table of function pointers, then making an out-of-line call to one of them (it's normally the out-of-line aspect that hurts performance most)
...and that compile time would give you more flexibility...
Templates certainly do:
given the same template instantiated for different types, the same code can mean different things: for example, T::f(1) might call a void f(int) noexcept function in one instantiation, a virtual void f(double) in another, a T::f functor object's operator()(float) in yet another; looking at it from another perspective, different parameter types can provide what the templated code needs in whatever way suits them best
SFINAE lets your code adjust at compile time to use the most efficient interfaces the objects support, without the objects actively having to make a recommendation (see the sketch after this list)
due to the instantiate-only-functions-called aspect mentioned above, you can "get away" with instantiating a class template with a type for which only some of the class template's functions would compile: in some ways that's bad because programmers may expect that their seemingly working Template<MyType> will support all the operations that the Template<> supports for other types, only to have it fail when they try a specific operation; in other ways it's good because you can still use Template<> if you're not interested in all the operations
if Concepts [Lite] make it into a future C++ Standard, programmers will have the option of putting stronger up-front constraints on the semantic operations that types used as template parameters must support, which will avoid nasty surprises when a user finds their Template<MyType>::operationX broken, and generally give simpler error messages earlier in the compile
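Here is a rough sketch of the SFINAE point above (the container type and its bulk_erase member are invented; C++17 is used for brevity):
#include <type_traits>
#include <utility>

// Detect whether T has a member function bulk_erase().
template<class T, class = void>
struct has_bulk_erase : std::false_type {};

template<class T>
struct has_bulk_erase<T, std::void_t<decltype(std::declval<T&>().bulk_erase())>>
    : std::true_type {};

// Pick the more efficient interface when the type offers one.
template<class Container>
void clear_all(Container& c)
{
    if constexpr (has_bulk_erase<Container>::value)
        c.bulk_erase();       // fast path, taken only if the member exists
    else
        c = Container{};      // generic fallback
}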
...while run time would give you "safer" polymorphism (i.e. harder to be used incorrectly by accident).
Arguably, as they're more rigid given the template flexibility above. The main "safety" problems with runtime polymorphism are:
some problems end up encouraging "fat" interfaces (in the sense Stroustrup mentions in The C++ Programming Language): APIs with functions that only work for some of the derived types, and algorithmic code needs to keep "asking" the derived types "should I do this for you", "can you do this", "did that work" etc..
you need virtual destructors: some classes don't have them (e.g. std::vector), which makes it harder to derive from them safely; also, the in-object pointers to virtual dispatch tables aren't valid across processes, making it hard to put runtime-polymorphic objects in shared memory for access by multiple processes
Can anyone give a specific example where both would be viable options but one or the other would be a clearly better choice?
Sure. Say you're writing a quick-sort function: you could only support data types that derive from some Sortable base class with a virtual comparison function and a virtual swap function, or you could write a sort template that uses a Less policy parameter defaulting to std::less<T>, and std::swap<>. Given the performance of a sort is overwhelmingly dominated by the performance of these comparison and swap operations, a template is massively better suited to this. That's why C++ std::sort clearly outperforms the C library's generic qsort function, which uses function pointers for what's effectively a C implementation of virtual dispatch. See here for more about that.
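A minimal sketch of that contrast (the values are arbitrary): the qsort comparison goes through a function pointer, while the std::sort comparator's type is part of the instantiation and is typically inlined:
#include <algorithm>
#include <cstdlib>
#include <vector>

// C-style: every comparison is an indirect call through this pointer.
int cmp_int(const void* a, const void* b)
{
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);   // avoids overflow of x - y
}

int main()
{
    std::vector<int> v = {3, 1, 2};

    std::qsort(v.data(), v.size(), sizeof(int), cmp_int);  // function-pointer dispatch

    std::sort(v.begin(), v.end(),
              [](int a, int b) { return a < b; });          // comparator known at compile time
}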
Also, does compile time polymorphism produce faster code, since it is not necessary to call functions through vtable, or does this get optimized away by the compiler anyway?
It's very often faster, but very occasionally the sum impact of template code bloat may overwhelm the myriad ways compile time polymorphism is normally faster, such that on balance it's worse.

How can I make switching between arithmetics easy in C++?

I am making a project that will use mathematical computations a lot. I also want to be able to easily change the implementation of real numbers: say, between float, double, my own implementation, and gmplib float types.
So far I have thought of two ways:
I create a class "Number" which will interface with the rest of the program.
I typedef the arithmetic type and write global functions to interface with the rest of the program.
The first choice seems more elegant, but the second seems to have less overhead. Is there a third, better choice? I am also worried about the elementary mathematical functions such as sine, cosine, exp... I figured that to make switching easy I should implement them as templates, but my implementations are hopelessly slow.
I am generally new to programming in C++. I was brought up in the comfortable Matlab and Mathematica environments, where I did not have to worry about such things.
You'll want to use templates with constraints to avoid re-implementing things.
For instance, say you want to use sin in your program differently for float and double. You can overload based on type and create specialized templates.
#include <cmath>

// Fallback for types without a dedicated routine; genericSin stands in for
// your own generic implementation (e.g. a series expansion).
template<class T> T MySin(const T& f) {
    return genericSin(f);
}

template<> float MySin<float>(const float& f) {
    return std::sin(f);
}

template<> double MySin<double>(const double& d) {
    return std::sin(d);
}
That covers free functions. The syntax is similar when partially specializing a Math class, if you want to go the OO route. This will enable you to call your routines with any type and have the most specialized, most efficient routine chosen.
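For what it's worth, a rough sketch of that class-based variant might look like this (the Math name and genericSin are placeholders):
#include <cmath>

// Primary template: generic routines, used for any type without a specialization.
template<class T>
struct Math {
    static T Sin(const T& x) { return genericSin(x); }  // genericSin assumed to exist
};

// Full specialization for double: forwards to the standard library.
template<>
struct Math<double> {
    static double Sin(const double& x) { return std::sin(x); }
};

// Usage: Math<MyReal>::Sin(x) picks the best available implementation at compile time.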
Templates are the way I have done this. They make it easy to specialize what must be specialized, and they provide a good way to reuse implementations that apply to multiple types.
The Number-class approach can be made to work, but it's not simple to do right and it introduces some restrictions compared to templates.
Handling multiple types by hand is just hopelessly complex if you want something even close to fast, accurate, and simple to maintain. You'd likely end up using templates anyway to implement a global typedef correctly.
Templates provide all the power, control, and flexibility you need, and they will be faster than the alternatives posted (technically, #2 could be as fast if you resorted to... templates).
A template class for real numbers should work for you; with it, you can overload the required functions and, if needed, use template specializations.
To improve efficiency, use STL algorithms instead of hand-written loops.
Good luck.
Both alternatives are equivalent in terms of encapsulation: There will be a single point in your program where you'll have to change the number type, and this one change will affect your whole program. If presented with those two alternatives, choose the typedef; it is less elegant (=> simpler, and simpler is better) and has the same power.
When you get more comfortable with C++, templating your functions will be a better fit, since the determination of the number type can be made locally instead of globally. With templates, you determine the number type at the instantiation point (most likely the call site), giving much greater flexibility. However, there are a number of pitfalls with templates, and I'd recommend that you get a little more experience with C++ first and only then start templating.
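For reference, a minimal sketch of the typedef route (the alias name real and the functions are arbitrary): one line changes when you switch the arithmetic type, and the rest of the program keeps using the alias:
#include <cmath>

// One line to change when switching the arithmetic type
// (e.g. to float, long double, or a custom/GMP-backed type).
using real = double;

// Free functions forward to whatever the current type needs.
inline real my_sin(real x) { return std::sin(x); }

real wave(real t) { return my_sin(t) * real(0.5); }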

Large scale usage of Meyer's advice to prefer Non-member,non-friend functions?

For some time I've been designing my class interfaces to be minimal, preferring namespace-wrapped non-member functions over member functions, essentially following Scott Meyers's advice in the article How Non-Member Functions Improve Encapsulation.
I've been doing this with good effect in a few small scale projects, but I'm wondering how well it works on a larger scale. Are there any large, well regarded open-source C++ projects that I can take a look at and perhaps reference where this advice is strongly followed?
Update: Thanks for all the input, but I'm not really interested in opinion so much as finding out how well it works in practice on a larger scale. Nick's answer is closest in this regard, but I'd like to be able to see the code. Any sort of detailed description of practical experiences (positives, negatives, practical considerations, etc) would be acceptable as well.
I do this quite a bit on the projects I work on; the largest of them at my current company is around 2M lines, but it's not open source, so I can't provide it as a reference. However, I will say that I agree with the advice, generally speaking. The more you can separate the functionality which is not strictly contained to just one object from that object, the better your design will be.
By way of an example, consider the classic polymorphism example: a Shape base class with subclasses, and a virtual Draw() function. In the real world, Draw() would need to take some drawing context, and potentially be aware of the state of other things being drawn, or the application in general. Once you put all that into each subclass implementation of Draw(), you're likely to have some code overlap, or most of your actual Draw() logic will be in the base class, or somewhere else. Then consider that if you want to re-use some of that code, you'll need to provide more entry points into the interface, and possibly pollute the functions with other code not related to drawing shapes (eg: multi-shape drawing correlation logic). Before long, it'll be a mess, and you'll wish you had a draw function which took a Shape (and context, and other data) instead, and Shape just had functions/data which were entirely encapsulated and not using or referencing external objects.
Anyway, that's my experience/advice, for what it's worth.
I'd argue that the benefit of non-member functions increases as the size of the project increases. The standard library containers, iterators, and algorithms library are proof of this.
If you can decouple algorithms from data structures (or, to phrase it another way, if you can decouple what you do with objects from how their internal state is manipulated), you can decrease coupling between your classes and take greater advantage of generic code.
Scott Meyers isn't the only author who has argued in favor of this principle; Herb Sutter has too, especially in Monoliths Unstrung, which ends with the guideline:
Where possible, prefer writing functions as nonmember nonfriends.
I think one of the best examples of an unnecessary member function from that article is std::basic_string::find; there is no reason for it to exist, really, as std::find provides exactly the same functionality.
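A small sketch of that equivalence (noting the member returns an index while the algorithm returns an iterator, so a little glue is needed to get the same value):
#include <algorithm>
#include <string>

int main()
{
    std::string s = "hello world";

    // Member function: returns an index (or std::string::npos).
    std::string::size_type i = s.find('w');

    // Non-member algorithm: returns an iterator, convertible to the same index.
    auto it = std::find(s.begin(), s.end(), 'w');
    std::string::size_type j = (it == s.end())
        ? std::string::npos
        : static_cast<std::string::size_type>(it - s.begin());

    (void)i; (void)j;  // both name the same position
}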
OpenCV library does this. They have a cv::Mat class that presents a 3D matrix (or images). Then they have all the other functions in the cv namespace.
OpenCV library is huge and is widely regarded in its field.
One practical advantage of writing functions as nonmember nonfriends is that doing so can significantly reduce the time it takes to thoroughly test and verify the code.
Consider, for example, the sequence container member functions insert and push_back. There are at least two approaches to implementing push_back:
It can simply call insert (its behavior is defined in terms of insert anyway)
It can do all the work that insert would do (possibly calling private helper functions) without actually calling insert
Obviously, when implementing a sequence container, you probably want to use the first approach. push_back is just a special form of insert and (to the best of my knowledge) you can't really get any performance benefit by implementing push_back some other way (at least not for list, deque, or vector).
However, to thoroughly test such a container, you have to test push_back separately: since push_back is a member function, it can modify any and all of the internal state of the container. From a testing standpoint, you should (must?) assume that push_back is implemented using the second approach because it is possible that it could be implemented using the second approach. There is no guarantee that it is implemented in terms of insert.
If push_back is implemented as a nonmember nonfriend, it can't touch any of the internal state of the container; it must use the first approach. When you write tests for it, you know that it can't break the internal state of the container (assuming the actual container member functions are implemented correctly). You can use that knowledge to significantly reduce the number of tests that you need to write to fully exercise the code.
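A tiny sketch of that idea (a free-standing push_back; the template here is illustrative, not a real library function): because it is a non-member non-friend, it has no choice but to go through the public insert interface, which is exactly what makes it cheap to test:
#include <vector>

// A non-member, non-friend push_back can only use the container's public
// interface, so it is necessarily implemented in terms of insert.
template<class Container, class Value>
void push_back(Container& c, const Value& v)
{
    c.insert(c.end(), v);
}

int main()
{
    std::vector<int> v;
    push_back(v, 42);  // testing this only requires trusting insert()
}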
(I don't have time to write this up nicely; the following is a five-minute brain dump which can doubtless be ripped apart at various trivial levels, but please address the concepts and general thrust.)
I have considerable sympathy for the position taken by Jonathan Grynspan, but want to say a bit more about it than can reasonably be done in comments.
First - a "well said" to Alf Steinbach, who chipped in with "It's only over-simplified caricatures of their viewpoints that might seem to be in conflict. For what it's worth I don't agree with Scott Meyers on this matter; as I see it he's over-generalizing here, or he was."
Scott, Herb etc. were making these points when few people understood the trade-offs or alternatives, and they did so with disproportionate strength. Some nagging hassles people had during the evolution of code were analysed, and a new design approach addressing those issues was rationally derived. Let's return to the question of whether there were downsides later, but first it's worth saying that the pain in question was typically small and infrequent: non-member functions are just one small aspect of designing reusable code, and in the enterprise-scale systems I've worked on, simply writing the same kind of code you'd have put into a member function as a non-member is rarely enough to make the non-member reusable. It's pretty rare for such functions to even express algorithms that are both complex enough to be worth reusing and yet not tightly bound to the specifics of the class they were designed for; it's practically inconceivable that some other class will happen along supporting the same operations and semantics. Often, you also need to turn arguments into template parameters, or introduce a base class to abstract the set of operations required. Both have significant implications in terms of performance, inline versus out-of-line calls, and client-code recompilation.
That said, there are often fewer code changes and less impact study required when changing an implementation if operations have been implemented in terms of a public interface, and being a non-friend non-member systematically enforces that. Occasionally, though, it makes the initial implementation more verbose or in some other way less desirable and maintainable.
But, as a litmus test: how many of these non-member functions sit in the same header as the only class for which they're currently applicable? How many want to abstract their arguments via templates (which means inlining and compilation dependencies) or base classes (virtual function overheads) to allow reuse? Both discourage people from seeing them as reusable, and even when that's not the case, the operations available on a class are delocalised, which can frustrate developers' perception of a system: the developer often has to work out for themselves the rather disappointing fact that "oh, that will only work for class X".
Bottom line: most member functions aren't potentially reusable. Much corporate code isn't broken into clean algorithm versus data with potential for reuse of the former. That kind of division just isn't required or useful or conceivably useful 20 years down the road. It's much the same as get/set methods - they're needed at certain API boundaries, but can constitute needless verbosity when ownership and use of the code is localised.
Personally, I don't have an all or nothing approach to this, but decide what to make a member function or non-member based on whether there's any likely benefit to either, potential reusability versus locality of interface.
I also do this a lot, where it seems to make sense, and it causes absolutely no problems with scaling (although my current project is only 40,000 LOC). In fact, I think it makes the code more scalable: it slims down classes and reduces dependencies.
It sometimes requires you to refactor your functions to make them independent of members of the class, and thereby you often end up creating a library of more general helper functions which you can easily reuse elsewhere. I'd also mention that one of the common problems with many large projects is the bloating of classes, and I think preferring non-member, non-friend functions helps here too.
Prefer non-member non-friend functions for encapsulation UNLESS you want implicit conversions to work for a class template's non-member functions (in which case you'd better make them friend functions):
That is, if you have a class template type<T>:
template<class T>
struct type {
    friend void foo(type<T> a) {}
};
and a type implicitly convertible to type<T>, e.g.:
template<class T>
struct convertible_to_type {
    operator type<T>() { return {}; }
};
The following works as expected:
auto t = convertible_to_type<int>{};
foo(t); // t is converted to type<int>
However, if you make foo a non-friend function:
template<class T>
void foo(type<T> a) {}
then the following doesn't work:
auto t = convertible_to_type<int>{};
foo(t); // FAILS: cannot deduce type T for type
Since you cannot deduce T, the function foo is removed from the overload resolution set; that is, no matching function is found, which means the implicit conversion never gets a chance to trigger.