Interface paradigm performance (dynamic binding vs. generic programming) - C++

While at their core dynamic binding and templates are fundamentally different things, they can be used to implement the same functionality.
Code example (only for reference)
A) dynamic binding
#include <iostream>

namespace DB {
    // interface
    class CustomCode {
    public:
        virtual void operator()(char) const = 0;
    };

    class Lib {
    public:
        void feature(CustomCode const& c) {
            c('d');
        }
    };

    // user code
    class MyCode1 : public CustomCode {
    public:
        void operator()(char i) const {
            std::cout << "1: " << i << std::endl;
        }
    };

    class MyCode2 : public CustomCode {
    public:
        void operator()(char i) const {
            std::cout << "2: " << i << std::endl;
        }
    };

    void use() {
        Lib lib;
        lib.feature(MyCode1());
        lib.feature(MyCode2());
    }
}
B) generic programming
#include <iostream>

namespace GP {
    // interface
    template <typename CustomCode> class Lib {
    public:
        void feature(CustomCode const& c) {
            c('g');
        }
    };

    // user code
    class MyCode1 {
    public:
        void operator()(char i) const {
            std::cout << "1: " << i << std::endl;
        }
    };

    class MyCode2 {
    public:
        void operator()(char i) const {
            std::cout << "2: " << i << std::endl;
        }
    };

    void use() {
        Lib<MyCode1> lib;
        lib.feature(MyCode1());
        //lib.feature(MyCode2()); <-- illegal
    }
}
Question
Some thoughts
While these paradigms are not identical and each has its advantages and disadvantages (A is a bit more powerful, see MyCode2, while B is more flexible for the user), they both allow the same functionality to be implemented, subject to the limitations hinted at above.
Anyway, in theory (TM) A is a bit slower at runtime because of the indirection of the virtual function, while B offers some great optimisation opportunities as methods can be inlined (and of course you don't have the indirection).
However, I often feel that A is a bit more self-documenting because you have a clear interface you have to implement (which usually consists of more than one method) while B is a bit more anarchistic (which implies its flexibility).
Core
Are there any general results / comparative studies of these paradigms?
Is the speed-up significant?
What about compilation time?
What are the design implications of either for interfaces in larger systems (I mainly used A for my inter-module interfaces and I haven't done really really big projects so far)?
Edit
Note: Saying "dynamic binding is better because it is more powerful" is not at all an answer, because the precondition is that you have a case where both approaches are applicable (otherwise there is no freedom to choose -- at least not reasonably).

Are there any general results / comparative studies of these paradigms?
From what I have seen, many examples and proofs can be found in articles and publications. Your favorite C++ books should provide several demonstrations; if you have no such resource, you may want to read Modern C++ Design: Generic Programming and Design Patterns Applied by A. Alexandrescu. There is, however, no specific resource that comes to mind that directly answers your question. Also, the result will vary by implementation and compiler - even compiler settings can greatly affect the outcome of such a test. (I am responding to each of your questions, although this does not qualify as an answer to this specific one.)
Is the speed-up significant?
Short answer: it depends.
In your example, the compiler could in fact use static dispatch or even inline the virtual function calls (enough information is visible to the compiler). I am now going to move the responses away from a trivial example (specifically, the OP's) towards larger, more complex programs.
Expanding on 'it depends': yes, the speed-up can range from unmeasurable to huge. You have to (and likely already) realize that the compiler can be provided an incredible amount of information at compilation via generics. It can then use this information to optimize your program much more accurately. One good example of this is the use of std::array vs. std::vector. The vector adds flexibility at runtime, but the cost can be quite significant: the vector needs to implement more logic for resizing, and the need for dynamic allocations can be costly. There are other differences: the backing allocation of the array will not change (++optimization), the element count is fixed (++optimization), and again - there's no need to call through new in many cases.
You may now be thinking this example has significantly deviated from the original question. In many ways, it's really not so different: the compiler knows more and more about your program as its complexity expands. This information can remove several portions of your program (dead code), and, using std::array as an example, the information the type provides is enough that a compiler can easily say "oh, I see that this array's size is seven elements, I will unroll the loop accordingly", so you will have fewer instructions and will have eliminated mispredictions. There's a lot more to it, but in the array/vector case, I have seen the executable size of optimized programs reduce to 20% when converting from vector to an interface similar to array. The code can also perform several times faster; in fact, some expressions can be calculated entirely at compilation.
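As a rough illustration of the array/vector point, a minimal sketch (not taken from the answer above; the function names are made up):

#include <array>
#include <numeric>
#include <vector>

// With std::array the element count is a compile-time constant: no heap
// allocation, and the loop bound is visible to the optimizer (unrolling,
// constant folding).
int sum_fixed(const std::array<int, 7>& a) {
    return std::accumulate(a.begin(), a.end(), 0);
}

// With std::vector the size is a runtime value and the storage is heap
// allocated; the optimizer has strictly less information to work with.
int sum_dynamic(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}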
Dynamic dispatch still has its virtues, and using it correctly can also improve your program's speed - what you really need to learn comes down to deciding when to favor one over the other. Similar to how a huge function with many variables cannot be optimized very effectively (the result of all that template expansion in a real program), a virtual function call can in fact be a faster, cleaner approach in many situations. As such, they are two separate features; you will need some practice to determine what is right (and many programmers don't take the time to learn this well enough).
In conclusion, they should be regarded as separate features, applicable to different scenarios. These should (IMHO) have much less practical overlap than they actually do in the real world.
What about compilation time?
With templates, the compilation and link times during development can be quite high. Each time a header/template changes, you will require a recompilation of all dependencies -- this is often a significant argument in favor of dynamic dispatch. You can of course reduce this if you plan ahead and structure your builds appropriately - understanding how is a much more difficult subject to master with templates. With templates, you not only increase the frequency of large builds, you often increase the time and complexity of large builds. (More notes follow.)
What are the design implications of either for interfaces in larger systems (I mainly used A for my inter-module interfaces and I haven't done really really big projects so far)?
It really depends on your program's expectations. I write virtual less every year (as do many others). Among other approaches, templates are becoming more and more common. Honestly, I don't understand how B is 'anarchistic'; to me, A is a bit anachronistic, as there are plenty of suitable alternatives. It's ultimately a design choice, and architecting large systems well takes a lot of consideration. A good system will use a healthy combination of the language's features. History proves that no feature in this discussion is necessary to write a nontrivial program, but all of them were added because somebody saw better alternatives for some specific uses. You should also expect lambdas to replace virtuals in more than 50% of their current uses in some (not all) teams/codebases.
Generalizations:
Templates can be significantly faster to execute, if used correctly.
Either can produce larger executables. If used correctly and executable size is important, the writer will use multiple approaches to reduce executable size while still providing nice, usable interfaces.
Templates can grow very, very complex. Learning to crawl through and interpret their error messages takes time.
Templates push several errors into the compilation domain. Personally, I favor a compilation error over a runtime error.
Reducing compilation times via virtuals is often straightforward (virtuals belong in the .cpp; see the sketch after this list). If your program is huge, then huge template systems that change often can quickly send your rebuild times and counts through the roof, because there will be a lot of inter-module visibility and dependency.
Deferred and/or selective instantiation with fewer compiled files can be used to reduce compile times.
In larger systems, you'll have to be much more considerate about forcing nontrivial recompiles on your team/client. Using virtuals is one approach to minimizing this; a higher percentage of your methods will then be defined in the .cpp. The alternative, of course, is that you can hide more of the implementation or provide the client with more expressive ways to use your interfaces.
In larger systems, templates, lambdas/functors, etc. can actually be used to significantly reduce coupling and dependencies.
Virtuals increase dependency, often become difficult to maintain, bloat interfaces, and become structurally awkward beasts. Template-centric libraries tend to invert that tendency.
All approaches can be used for the wrong reasons.
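For the point about virtuals and rebuild times, a minimal sketch of what "virtuals belong in the .cpp" means in practice (Logger and FileLogger are hypothetical names used only for illustration):

// logger.h -- clients only see this declaration, so changing the
// implementation does not force them to recompile.
class Logger {
public:
    virtual ~Logger();
    virtual void log(const char* msg) = 0;
};

// file_logger.cpp -- the only file that has to be rebuilt when the
// implementation changes.
#include "logger.h"
#include <cstdio>

Logger::~Logger() = default;

class FileLogger : public Logger {
public:
    void log(const char* msg) override { std::printf("%s\n", msg); }
};

A templated equivalent would have to live entirely in the header, so every change to it would ripple into a rebuild of every client.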
Bottom Line
A large, well-designed modern system will use many paradigms effectively and simultaneously. If you use virtuals most of the time at present, you are (IMO) doing it wrong -- especially if that is still your approach once you've had time to absorb C++11. If speed, performance, and/or parallelism are also significant concerns, then templates and lambdas deserve to be your closer friends.

Which is better? It depends. You've focused on the overlap. Better is to focus on where the approaches diverge. You also missed where you need to use both approaches simultaneously.
The biggest advantage of templates is that they offer the ability to cut down, sometimes immensely, on cookie-cutter code. Another advantage of templates is metaprogramming. There are some truly bizarre things you can do thanks to SFINAE.
One disadvantage of templates is that the syntax is a bit clunky. There is no way around this. It is what it is. Another disadvantage of templates is that each instantiation is a different class, completely unrelated to the other classes instantiated from the same template class. There is a way around this: Combine both approaches. Make your template class derive from some non-template base class. Of course, now you have lost some of the run time advantages.
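A minimal sketch of that combination (ShapeBase, Shape and Square are illustrative names, not from the answer above):

#include <iostream>
#include <memory>
#include <vector>

// Non-template base: gives every instantiation a common runtime interface.
class ShapeBase {
public:
    virtual ~ShapeBase() = default;
    virtual double area() const = 0;
};

// Template derived class: the per-type boilerplate is generated for us.
template <typename Geometry>
class Shape : public ShapeBase {
public:
    explicit Shape(Geometry g) : geom_(g) {}
    double area() const override { return geom_.area(); }
private:
    Geometry geom_;
};

struct Square { double side; double area() const { return side * side; } };

int main() {
    // Different instantiations can now live in one container...
    std::vector<std::unique_ptr<ShapeBase>> shapes;
    shapes.push_back(std::make_unique<Shape<Square>>(Square{2.0}));
    // ...at the cost of a virtual call per use.
    for (const auto& s : shapes) std::cout << s->area() << '\n';
}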
The biggest advantage of polymorphism is that it is dynamic. This can be a huge, huge win. Don't discount it. There is a performance penalty for such polymorphism, but you are going to pay that penalty one way or another if you want to have a collection of objects that obey a common interface but different objects have different implementations for that interface.

Related

When do you prefer virtual functions over templates in C++?

I was interviewed by a financial company and was asked this question:
"List the case(s) when you prefer virtual functions over templates?"
It sounds weird to me, because usually we are aiming for the opposite, right?
All the books, articles, and talks are there encouraging us to use static polymorphism instead of dynamic.
Are there any known cases I am not aware of where you should use virtual functions and avoid templates?
GUI / visualization widget toolkits are an obvious case. Re-implementing a draw method, for example, is certainly less cumbersome with virtual methods and dynamic dispatch. And since modern C++ tends to discourage raw pointer management, std::unique_ptr can manage the resource for you.
I'm sure there are plenty of other hierarchical examples you can come up with... a base enemy class for a game, with virtual methods handling the behaviour of various baddies:)
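A minimal sketch of the widget case (hypothetical names, not a real toolkit's API):

#include <iostream>
#include <memory>
#include <vector>

class Widget {
public:
    virtual ~Widget() = default;
    virtual void draw() const = 0;   // re-implemented by each widget type
};

class Button : public Widget {
public:
    void draw() const override { std::cout << "[Button]\n"; }
};

class Slider : public Widget {
public:
    void draw() const override { std::cout << "[Slider]\n"; }
};

int main() {
    std::vector<std::unique_ptr<Widget>> ui;   // unique_ptr owns each widget
    ui.push_back(std::make_unique<Button>());
    ui.push_back(std::make_unique<Slider>());
    for (const auto& w : ui) w->draw();        // dynamic dispatch per widget
}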
The whole 'overhead' argument for dynamic dispatch is completely without merit today. I'd argue that the vtable indirection implementation hasn't been a significant overhead for serious workloads in decades. There's the more interesting question that if C++ was designed today, would polymorphism be part of the language? But that's neither here nor there now.
When the type of the object is not known at compile time, use virtual methods.
e.g.
void Accept (Fruit* pFruit) // supplied from external factors at runtime
{
    pFruit->eat(); // `Fruit` can be anyone among `Apple/Blackberry/Chickoo/`...
}
Based on what the user enters, a fruit will be supplied to the function. Hence, there is no way to figure out at compile time which eat() is going to be called. So it's a candidate for runtime polymorphism:
class Fruit
{
public:
    virtual void eat () = 0;
};
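For completeness, a concrete fruit and a call site might look like this (Apple and the Run function are hypothetical additions, not part of the original answer):

class Apple : public Fruit
{
public:
    void eat () { /* apple-specific behaviour */ }
};

void Run ()
{
    Apple apple;      // in practice chosen by a factory based on user input
    Accept(&apple);   // virtual dispatch selects Apple::eat() at runtime
}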
In all other cases, always prefer static polymorphism (which includes templates). It's more deterministic and easier to maintain.
The interview question seems to be asking for a preference (and your reasoning).
I would prefer interfaces (virtual methods) for the ease of creating mocks for unit testing. We can use templates for this, but it is cumbersome (the consumer has to be templated). If profiling shows no speed degradation from vtable lookups, prefer that over mocking/testing non-virtual methods.
Also, type erasure. I don't think it can be implemented at all using templates. Type erasure can be roughly thought of as void ptr and function pointers, which can be implemented easily using interfaces + virtual methods.
Templates need their implementation to be made available to the code that uses them. I believe virtual methods of an interface can be invoked on a pointer to a derived class with just its binary (object) file available.
Not sure if code bloat with templates is still an issue with modern compilers, but if the executable size increases significantly, that could be a problem in embedded systems with limited memory, and those would not prefer to use templates.

Is it a good idea to change pre-C++11 setter functions to modern templated ones with forwarding?

In my classes I have lots of setXXX functions which look something like the following:
void setName(const std::string& newName) {
    name = newName;
}
This is what most programmers would do before C++11.
I learned (Item 25 from Effective Modern C++) that using forwarding would make those functions more efficient, so they would look as follows:
template<typename T>
void setName(T&& newName) {
    name = std::forward<T>(newName);
}
My question is whether it is a good idea to convert all such pre-C++11 setter functions to templates with a forwarding reference parameter. My current understanding is that it will always be beneficial; at least I don't see drawbacks.
Perfect forwarding might sound interesting, though I haven't used it for setters (and I do work on performance-critical software).
From my experience, I would say that it makes your code more complex to understand without much benefit. Let me explain:
Complexity
Every caller of this set-method now needs to check the type of the member before knowing what to pass. Let's assume a method 'setFile':
template<typename T>
void setFile(T &&file) { m_file = std::forward<T>(file); }
Should we call this method with a string (the filename), a c-style file-handle, a file-stream, your library specific file-wrapper ...? From seeing this method, it is not clear which to use. Especially not if you are new to the codebase.
Even though your class might be so small that it fits on a single terminal screen, code completion can't provide you with the information that you need.
Please note that compilation errors also become more complex if you use the wrong type.
Gain
This brings us to the question: what do you gain? From now on, you use perfect forwarding. In other words, you can assign a const char * to a std::string member without the overhead of an intermediate temporary. However, how much does this gain you? If your code is performance critical, the relevant overload most likely already exists. If it is not, it won't matter that you lose a few CPU cycles.
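As a rough illustration of where that saving comes from (Person is an illustrative name; this is a sketch, not the questioner's class):

#include <string>
#include <utility>

class Person {
public:
    // Pre-C++11 style: a const char* argument first materializes a
    // temporary std::string, which is then copied into the member.
    void setNameRef(const std::string& newName) { name = newName; }

    // Forwarding version: the const char* is handed straight to
    // std::string::operator=, so no temporary is created.
    template <typename T>
    void setNameFwd(T&& newName) { name = std::forward<T>(newName); }

private:
    std::string name;
};

int main() {
    Person p;
    p.setNameRef("temporary, then copy");
    p.setNameFwd("assigned directly");
}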
On top of that, your compiler (Clang, GCC, MSVC ...) is an optimizing compiler. In other words, it allows you to write the code you want without having to worry about performance. (Don't get me wrong, if the code is critical, please worry, though trust the many compiler writers and researchers to handle the obvious)
So, if you have this setName(const std::string &), make sure it is implemented in the header so that the compiler can reason about it and can try to optimize it away as much as possible. (For your own classes, make sure all constructors, destructor and assign/cast functions are visible to the compiler)
Conclusion
I doubt it is useful to spend your time 'upgrading' your code. If it would bring that much gain, a clang-based utility would already exist. I would even bet that you would gain more performance by spending the same time profiling your application and fixing the low-hanging fruit.

Inheritance & virtual functions Vs Generic Programming

I need to understand whether inheritance and virtual functions are really unnecessary in C++, and whether one can achieve everything using generic programming. This came from Alexander Stepanov; the lecture I was watching is Alexander Stepanov: STL and Its Design Principles.
I always like to think of templates and inheritance as two orthogonal concepts, in the very literal sense: to me, inheritance goes "vertically", starting with a base class at the top and going "down" to more and more derived classes. Every (publicly) derived class is a base class in terms of its interface: a poodle is a dog is an animal.
On the other hand, templates go "horizontal": Each instance of a template has the same formal code content, but two distinct instances are entirely separate, unrelated pieces that run in "parallel" and don't see each other. Sorting an array of integers is formally the same as sorting an array of floats, but an array of integers is not at all related to an array of floats.
Since these two concepts are entirely orthogonal, their application is, too. Sure, you can contrive situations in which you could replace one by another, but when done idiomatically, both template (generic) programming and inheritance (polymorphic) programming are independent techniques that both have their place.
Inheritance is about making an abstract concept more and more concrete by adding details. Generic programming is essentially code generation.
As my favourite example, let me mention how the two technologies come together beautifully in a popular implementation of type erasure: a single handler class holds a private polymorphic pointer-to-base of an abstract container class, and the concrete, derived container class is determined by a templated type-deducing constructor. We use template code generation to create an arbitrary family of derived classes:
// internal helper base
class TEBase { /* ... */ };

// internal helper derived TEMPLATE class (unbounded family!)
template <typename T> class TEImpl : public TEBase { /* ... */ };

// single public interface class
class TE
{
    TEBase * impl;
public:
    // "infinitely many" constructors:
    template <typename T> TE(const T & x) : impl(new TEImpl<T>(x)) { }
    // ...
};
They serve different purposes. Generic programming (at least in C++) is about compile-time polymorphism, and virtual functions are about run-time polymorphism.
If the choice of the concrete type depends on the user's input, you really need runtime polymorphism - templates won't help you.
Polymorphism (i.e. dynamic binding) is crucial for decisions that are based on runtime data. Generic data structures are great but they are limited.
Example: Consider an event handler for a discrete event simulator: It is very cheap (in terms of programming effort) to implement this with a pure virtual function, but is verbose and quite inflexible if done purely with templated classes.
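A minimal sketch of such a handler (the names are illustrative, not taken from any particular simulator):

#include <memory>
#include <vector>

struct Event { double time; /* payload */ };

// Each simulation component supplies its own reaction; the concrete type is
// only known at runtime, when the simulation is assembled.
class EventHandler {
public:
    virtual ~EventHandler() = default;
    virtual void handle(const Event& e) = 0;
};

// The simulator core only knows the abstract interface.
void dispatch(const std::vector<std::unique_ptr<EventHandler>>& handlers,
              const Event& e) {
    for (const auto& h : handlers) h->handle(e);
}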
As a rule of thumb: if you find yourself switching (or if-else-ing) on the value of some input object and performing different actions depending on its value, there might exist a better (in the sense of maintainability) solution with dynamic binding.
Some time ago I thought about a similar question, and I can only dream of giving you an answer as great as the one I received. Perhaps this is helpful: interface paradigm performance (dynamic binding vs. generic programming)
It seems like a very academic question. As with most things in life there are lots of ways to do things, and in the case of C++ you have a number of ways to solve a problem. There is no need to have an XOR attitude about it.
In the ideal world, you would use templates for static polymorphism to give you the best possible performance in instances where the type is not determined by user input.
The reality is that templates force most of your code into headers and this has the consequence of exploding your compile times.
I have done some heavy generic programming leveraging static polymorphism to implement a generic RPC library (https://github.com/bytemaster/mace (rpc_static_poly branch)). In this instance the protocol (JSON-RPC), the transport (TCP/UDP/Stream/etc.), and the types are all known at compile time, so there is no reason to do a vtable dispatch... or is there?
When I run the code through the pre-processor for a single .cpp, it results in 250,000 lines and takes 30+ seconds to compile a single object file. I implemented 'identical' functionality in Java and C# and it compiles in about a second.
Almost every STL or Boost header you include adds thousands or tens of thousands of lines of code that must be processed per object file, most of it redundant.
Do compile times matter? In most cases they have a more significant impact on the final product than 'maximally optimized vtable elimination'. The reason is that every 'bug' requires a 'try fix, compile, test' cycle, and if each cycle takes 30+ seconds, development slows to a crawl (note the motivation for Google's Go language).
After spending a few days with java and C# I decided that I needed to 're-think' my approach to C++. There is no reason a C++ program should compile much slower than the underlying C that would implement the same function.
I now opt for runtime polymorphism unless profiling shows that the bottleneck is in vtable dispatches. I now use templates to provide 'just-in-time' polymorphism and type-safe interface on top of the underlying object which deals with 'void*' or an abstract base class. In this way users need not derive from my 'interfaces' and still have the 'feel' of generic programming, but they get the benefit of fast compile times. If performance becomes an issue then the generic code can be replaced with static polymorphism.
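One way to picture that layering (a sketch under assumptions; ConnectionBase and Connection are not the library's actual classes):

#include <memory>
#include <string>

// Non-template core, compiled once in a .cpp: rebuilds stay cheap.
class ConnectionBase {
public:
    virtual ~ConnectionBase() = default;
    virtual void send(const std::string& bytes) = 0;
};

// Thin templated facade: gives callers a type-safe, generic-feeling API,
// while the heavy lifting stays behind the virtual interface.
template <typename Message>
class Connection {
public:
    explicit Connection(std::unique_ptr<ConnectionBase> impl)
        : impl_(std::move(impl)) {}

    void send(const Message& m) { impl_->send(m.serialize()); }

private:
    std::unique_ptr<ConnectionBase> impl_;
};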
The results are dramatic, compile times have fallen from 30+ seconds to about a second. The post-preprocessor source code is now a couple thousand lines instead of 250,000 lines.
On the other side of the discussion, I was developing a library of 'drivers' for a set of similar but slightly different embedded devices. In this instance the embedded device had little room for 'extra code' and no use for 'vtable' dispatch. With C our only option was 'separate object files' or runtime 'polymorphism' via function pointers. Using generic programming and static polymorphism we were able to create maintainable software that ran faster than anything we could produce in C.

A C++ based variants chess engine design question

I have a chess variants engine that plays suicide chess and losers chess along with normal chess. I might, over time, add more variants to my engine. The engine is implemented completely in C++ with proper usage of OOP. My question is related to design of such a variant engine.
Initially the project started as a suicide-only engine while over time I added other flavors. For adding new variants, I experimented using polymorphism in C++ first. For instance, a MoveGenerator abstract class had two subclasses SuicideMoveGenerator and NormalMoveGenerator and depending on the type of game chosen by user, a factory would instantiate the right subclass. But I found this to be much slower - obviously because instantiating classes containing virtual functions and calling virtual functions in tight loops are both quite inefficient.
But then it occurred to me to use C++ templates with template specialization for separating logic for different variants with maximum reuse of code. This also seemed very logical because dynamic linking is not really necessary in the context as once you choose the type of game, you basically stick with it until the end of the game. C++ template specialization provides exactly this - static polymorphism. The template parameter is either SUICIDE or LOSERS or NORMAL.
enum GameType { LOSERS, NORMAL, SUICIDE };
So once the user selects a game type, the appropriate game object is instantiated, and everything called from there will be appropriately templatized. For instance, if the user selects suicide chess, let's say:
ComputerPlayer<SUICIDE>
an object is instantiated and that instantiation basically links the whole control flow statically. Functions in ComputerPlayer<SUICIDE> would work with MoveGenerator<SUICIDE>, Board<SUICIDE> and so on, while the corresponding NORMAL ones work analogously.
On the whole, this lets me instantiate the right template-specialized class at the beginning, and without any other if conditions anywhere, the whole thing works perfectly. The best thing is there is no performance penalty at all!
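Roughly, the structure described might look like this (a sketch, not the engine's actual code):

enum GameType { LOSERS, NORMAL, SUICIDE };

template <GameType G> class Board         { /* shared board logic */ };
template <GameType G> class MoveGenerator { /* shared move logic  */ };

// Variant-specific rules live in specializations of what actually differs.
template <> class MoveGenerator<SUICIDE>  { /* captures are forced, etc. */ };

template <GameType G>
class ComputerPlayer {
    Board<G>         board;
    MoveGenerator<G> gen;   // resolved statically: no vtable in the search loop
};

// Chosen once, when the user picks the variant:
// ComputerPlayer<SUICIDE> player;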
The main downside of this approach, however, is that using templates makes your code a bit harder to read. Also, template specialization, if not handled appropriately, can lead to major bugs.
I wonder what other variant engine authors normally do to separate logic (with good reuse of code)? I found C++ template programming quite suitable, but if there's anything better out there, I would be glad to embrace it. In particular, I checked the Fairymax engine by Dr. H. G. Muller, but that uses config files for defining game rules. I don't want to do that, because many of my variants have different extensions, and by making it generic to the level of config files the engine might not grow strong. Another popular engine, Sjeng, is littered with if conditions everywhere, and I personally feel that's not a good design.
Any new design insights would be very useful.
"Calling virtual functions in tight loops are inefficient"
I would actually be pretty surprised if this caused any real slowdown. If all the objects in the loop have the same dynamic type, then I would expect the processor to fetch the corresponding instructions from its L1 cache and thus not suffer much.
However there is one part that worries me:
"obviously because instantiating classes containing virtual functions [is] quite inefficient"
Now... I am really surprised.
The cost of instantiating a class with virtual functions is nearly indistinguishable from the cost of instantiating a class without any virtual functions: it's one more pointer, and that's all (on popular compilers, this corresponds to the _vptr).
I surmise that your problem lies elsewhere. So I am going to take a wild guess:
Do you, by any chance, have a lot of dynamic instantiation going on (i.e. calling new)?
If that is the case, you would gain much by removing them.
There is a Design Pattern called Strategy which would be eminently suitable for your precise situation. The idea of this pattern is akin, in fact, to the use of virtual functions, but it actually externalizes those functions.
Here is a simple example:
class StrategyInterface
{
public:
    Move GenerateMove(Player const& player) const
    {
        return GenerateMoveImpl(player); // non-virtual interface, delegates to the virtual
    }
private:
    virtual Move GenerateMoveImpl(Player const& player) const = 0;
};

class SuicideChessStrategy: public StrategyInterface
{
    virtual Move GenerateMoveImpl(Player const& player) const; // defined for this variant
};

// Others
// Others
Once implemented, you need a function to get the right strategy:
StrategyInterface& GetStrategy(GameType gt)
{
    static std::array<StrategyInterface*, 3> strategies
        = { new SuicideChessStrategy(), .... };
    return *(strategies[gt]);
}
And finally, you can delegate the work without using inheritance for the other structures:
class Player
{
public:
    Move GenerateMove() const { return GetStrategy(gt).GenerateMove(*this); }
private:
    GameType gt;
};
The cost is pretty similar to using virtual functions; however, you no longer need dynamically allocated memory for the basic objects of your game, and stack allocation is a LOT faster.
I'm not quite sure if this is a fit but you may be able to achieve static polymorphism via the CRTP with some slight modifications to your original design.
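A minimal CRTP sketch along those lines (illustrative names, not the engine's code):

// The base is parameterized on the derived class, so the call is resolved
// at compile time and can be inlined - no vtable involved.
template <typename Derived>
class MoveGeneratorBase {
public:
    void generate() { static_cast<Derived*>(this)->generateImpl(); }
};

class SuicideMoveGenerator : public MoveGeneratorBase<SuicideMoveGenerator> {
public:
    void generateImpl() { /* suicide-chess move rules */ }
};

class NormalMoveGenerator : public MoveGeneratorBase<NormalMoveGenerator> {
public:
    void generateImpl() { /* standard chess move rules */ }
};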

How best to switch from template mess to clean classes architecture (C++)?

Assuming a largish template library with around 100 files containing around 100 templates with overall more than 200,000 lines of code. Some of the templates use multiple inheritance to make the usage of the library itself rather simple (i.e. inherit from some base templates and only having to implement certain business rules).
All that exists (grown over several years), "works" and is used for projects.
However, compilation of projects using that library consumes a growing amount of time and it takes quite some time to locate the source for certain bugs. Fixing often causes unexpected side effects or is quite difficult, because some interdependent templates need changing. Testing is nearly impossible due to the sheer amount of functions.
Now, I would really like to simplify the architecture to use less templates and more specialized smaller classes.
Is there any proven way to go about that task? What would be a good place to start?
I'm not sure I see how/why templates are the problem, and why plain non-templated classes would be an improvement. Wouldn't that just mean even more classes, less type safety and so larger potential for bugs?
I can understand simplifying the architecture, refactoring and removing dependencies between the various classes and templates, but automatically assuming that "fewer templates will make the architecture better" is flawed imo.
I'd say that templates potentially allow you to build a much cleaner architecture than you'd get without them, simply because you can make separate classes totally independent. Without templates, a class's functions which call into another class must know about that class, or an interface it inherits, in advance. With templates, this coupling isn't necessary.
Removing templates would only lead to more dependencies, not fewer.
The added type-safety of templates can be used to detect a lot of bugs at compile-time (Sprinkle your code liberally with static_assert's for this purpose)
Of course, the added compile-time may be a valid reason to avoid templates in some cases, and if you only have a bunch of Java programmers, who are used to thinking in "traditional" OOP terms, templates might confuse them, which can be another valid reason to avoid templates.
But from an architecture point of view, I think avoiding templates is a step in the wrong direction.
Refactor the application, sure, it sounds like that's needed. But don't throw away one of the most useful tools for producing extensible and robust code just because the original version of the app misused it. Especially if you're already concerned with the amount of code, removing templates will most likely lead to more lines of code.
You need automated tests. That way, in ten years' time when your successor has the same problem, he can refactor the code (probably to add more templates because he thinks it will simplify usage of the library) and know it still meets all test cases. Similarly, the side effects of any minor bug fixes will be immediately visible (assuming your test cases are good).
Other than that, "divide and conquer".
Write unit tests, where the new code must do the same as the old code.
That's one tip at least.
Edit:
If you deprecate old code that you have replaced with the new functionality, you can phase over to the new code little by little.
Well, the problem is that the template way of thinking is very different from the object-oriented, inheritance-based way. It's hard to answer anything other than "redesign the whole thing and start from scratch".
Of course, there may be a simple way for a particular case. We can't tell without knowing more about what you have.
The fact that the template solution is so difficult to maintain is an indication of a poor design anyway.
Some points (note that these are not inherently evil; if you want to change to non-template code, though, they can help):
Look at your static interfaces. Where do templates depend on what functions exist? Where do they need typedefs?
Put the common parts in an abstract base class. A good example is when you happen to stumble over the CRTP idiom: you can just replace it with an abstract base class having virtual functions (see the sketch after this list).
Look for integer lists. If you find your code uses compile-time integer lists like list<1, 3, 3, 1, 3>, you can replace them with std::vector, if all the code using them can live with runtime values instead of constant expressions.
Look for type traits. Typical templated code contains a lot of checks for whether some typedef exists, or whether some method exists. Abstract base classes solve these two issues by using pure virtual methods and by putting the typedefs in the base. Often, typedefs are only needed to trigger hideous features like SFINAE, which would then be superfluous too.
Look for expression templates. If your code uses expression templates to avoid creating temporaries, you will have to eliminate them and use the traditional way of returning / passing temporaries to the operators involved.
Look for function objects. If your code uses function objects, you can change them to use abstract base classes too, and have something like void run(); to call them (or, if you want to keep using operator(), even better: it can be virtual too).
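For the CRTP point above, the mechanical replacement might look like this (illustrative names; a sketch only):

// Before: static dispatch through CRTP.
template <typename Derived>
struct SerializerCrtp {
    void save() { static_cast<Derived*>(this)->saveImpl(); }
};

// After: the same relationship expressed with virtual functions; callers
// now work through SerializerBase& and the template parameter disappears.
struct SerializerBase {
    virtual ~SerializerBase() = default;
    void save() { saveImpl(); }
protected:
    virtual void saveImpl() = 0;
};

struct XmlSerializer : SerializerBase {
protected:
    void saveImpl() override { /* write XML */ }
};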
As I understand, you are most concerned with build times, and the maintainability of your library?
First, don't try to "fix" all at once.
Second, understand what you fix. Template complexity is often there for a reason, e.g. to enforce certain use and make the compiler help you not make a mistake. That reason might sometimes be taken too far, but throwing out 100 lines because "no one really knows what they do" shouldn't be taken lightly. Everything I suggest here can introduce really nasty bugs; you have been warned.
Third, consider cheaper fixes first: e.g. faster machines or distributed build tools. At the very least, throw in all the RAM the boards will take, and throw out old disks. It does make a difference. One drive for the OS and one drive for the build is a cheap man's RAID.
Is the library well documented? That's your best chance at making it. Look into tools such as doxygen that help you create such documentation.
All considered? OK, now some suggestions for the build times ;)
Understand the C++ build model: every .cpp is compiled individually. That means many .cpp files with many headers = huge build. This is NOT advice to put everything into one .cpp file, though! However, one trick (!) that can speed up a build immensely is to create a single .cpp file that includes a bunch of .cpp files, and only feed that "master" file to the compiler. You can't do that blindly, though - you need to understand the types of errors this could introduce.
If you don't have one yet, get a separate build machine that you can remote into. You'll have to do a lot of almost-full builds to check whether you broke some include, and you will want to run these on another machine that doesn't block you from working on something else. Long term, you'll need it for daily integration builds anyway ;)
Use precompiled headers. (scales better with fast machines, see above)
Check your header inclusion policy. While every file should be "independent" (i.e. include everything it needs so that it can be included by someone else), don't include liberally. Unfortunately, I haven't yet found a tool to find unnecessary #include statements, but it might help to spend some time removing unused headers in "hotspot" files.
Create and use forward declarations for the templates you use. Often, you can include a header with forward declarations in many places, and use the full header only in a few specific ones. This can greatly help compile time. Check the <iosfwd> header to see how the standard library does that for I/O streams.
Overloads for templates used with few types: if you have a complex function template that is useful only for a very few types, like this:
// .h
template <typename FLOAT> // float or double only
FLOAT CalcIt(int len, FLOAT * values) { ... }
You can declare the overloads in the header, and move the template to the body:
// .h
float CalcIt(int len, float * values);
double CalcIt(int len, double * values);
// .cpp
template <typename FLOAT> // float or double only
FLOAT CalcItT(int len, FLOAT * values) { ... }
float CalcIt(int len, float * values) { return CalcItT(len, values); }
double CalcIt(int len, double * values) { return CalcItT(len, values); }
This moves the lengthy template into a single compilation unit.
Unfortunately, this is only of limited use for classes.
Check if the PIMPL idiom can move code from the headers into .cpp files.
The general rule behind all of this is: separate the interface of your library from the implementation. Use comments, detail namespaces and separate .impl.h headers to mentally and physically isolate what should be known to the outside from how it is accomplished. This exposes the real value of your library (does it actually encapsulate complexity?), and gives you a chance to replace "easy targets" first.
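A bare-bones PIMPL sketch of that separation (Parser and the file names are illustrative):

// parser.h -- no implementation details, so client rebuilds are rare.
#include <memory>

class Parser {
public:
    Parser();
    ~Parser();
    void parse(const char* text);
private:
    struct Impl;                 // defined only in parser.cpp
    std::unique_ptr<Impl> impl;
};

// parser.cpp
#include "parser.h"

struct Parser::Impl { /* heavy headers and data members live here */ };

Parser::Parser() : impl(std::make_unique<Impl>()) {}
Parser::~Parser() = default;
void Parser::parse(const char* /*text*/) { /* use impl->... */ }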
More specific advice - and how useful the advice given here is - depends largely on the actual library.
Good luck!
As mentioned, unit tests are a good idea. Indeed, rather than breaking your code by introducing "simple" changes that are likely to ripple out, just focus on creating a suite of tests, and fixing non-compliance with the tests. Have an activity to update the tests when bugs come to light.
Beyond that, I would suggest upgrading your tools, if possible, to help with debugging template-related problems.
I've often come across legacy templates that were huge and required a lot of time and memory to instantiate, but didn't need to be. In those cases, the easiest way to cut out the fat was to take all of the code that didn't rely on any of the template arguments and hide it in separate functions defined in a normal translation unit. This also had the positive side-effect of triggering fewer recompiles when this code had to be slightly modified or documentation changed. It sounds rather obvious, but it's really surprising how often people write a class template and think that EVERYTHING it does has to be defined in the header, rather than just the code that needs the templated information.
Another thing you might want to consider is how often you clean up the inheritance hierarchies by making the templates "mixin" style instead of aggregations of multiple inheritance. See how many places you can get away with making one of the template arguments the name of the base class that it should derive from (the way boost::enable_shared_from_this works). Of course this typically only works well if the constructors take no arguments, as you don't have to worry about initializing anything correctly.