Header file best practices for typedefs - c++

I'm using shared_ptr and STL extensively in a project, and this is leading to over-long, error-prone types like shared_ptr< vector< shared_ptr<const Foo> > >. (I'm an ObjC programmer by preference, where long names are the norm, and still this is way too much.) It would be much clearer, I believe, to consistently call this FooListPtr and document the naming convention that "Ptr" means shared_ptr and "List" means vector of shared_ptr.
This is easy to typedef, but it's causing headaches with the headers. I seem to have several options of where to define FooListPtr:
Foo.h. That entwines all the headers and creates serious build problems, so it's a non-starter.
FooFwd.h ("forward header"). This is what Effective C++ suggests, based on iosfwd.h. It's very consistent, but the overhead of maintaining twice the number of headers seems annoying at best.
Common.h (put all of them together into one file). This kills reusability by entwining a lot of unrelated types. You now can't just pick up one object and move it to another project. That's a non-starter.
Some kind of fancy #define magic that typedef's if it hasn't already been typedefed. I have an abiding dislike for the preprocessor because I think it makes it hard for new people to grok the code, but maybe....
Use a vector subclass rather than a typedef. This seems dangerous...
Are there best practices here? How do they turn out in real code, when reusability, readability and consistency are paramount?
I've marked this community wiki if others want to add additional options for discussion.

I'm programming on a project which sounds like it uses the common.h method. It works very well for that project.
There is a file called ForwardsDecl.h which is in the pre-compiled header and simply forward-declares all the important classes and necessary typedefs. In this case unique_ptr is used instead of shared_ptr, but the usage should be similar. It looks like this:
// Forward declarations
class ObjectA;
class ObjectB;
class ObjectC;
// List typedefs
typedef std::vector<std::unique_ptr<ObjectA>> ObjectAList;
typedef std::vector<std::unique_ptr<ObjectB>> ObjectBList;
typedef std::vector<std::unique_ptr<ObjectC>> ObjectCList;
This code is accepted by Visual C++ 2010 even though the classes are only forward-declared (the full class definitions are not necessary so there's no need to include each class' header file). I don't know if that's standard and other compilers will require the full class definition, but it's useful that it doesn't: another class (ObjectD) can have an ObjectAList as a member, without needing to include ObjectA.h - this can really help reduce header file dependencies!
Maintenance is not particularly an issue, because the forwards declarations only need to be written once, and any subsequent changes only need to happen in the full declaration in the class' header file (and this will trigger fewer source files to be recompiled due to reduced dependencies).
Finally it appears this can be shared between projects (I haven't tried myself) because even if a project does not actually declare an ObjectA, it doesn't matter because it was only forwards declared and if you don't use it the compiler doesn't care. Therefore the file can contain the names of classes across all projects it's used in, and it doesn't matter if some are missing for a particular project. All that is required is the necessary full declaration header (e.g. ObjectA.h) is included in any source (.cpp) files that actually use them.
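A minimal single-file sketch of the arrangement described above, with the separate files marked by comments. The contents of ObjectA and the members of ObjectD are hypothetical. As far as I know the typedef itself never needs the full class, and a member of type ObjectAList should be fine as long as ObjectD's constructor and destructor are defined where ObjectA is complete, but treat that as an assumption to check on your own compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// --- ForwardsDecl.h ---
class ObjectA;
typedef std::vector<std::unique_ptr<ObjectA>> ObjectAList;

// --- ObjectD.h: includes only ForwardsDecl.h, so ObjectA is incomplete here ---
class ObjectD {
public:
    ObjectD();
    ~ObjectD();  // defined in ObjectD.cpp, where ObjectA is complete
    void add(int value);
    std::size_t size() const;
private:
    ObjectAList items_;
};

// --- ObjectA.h (hypothetical contents) ---
class ObjectA {
public:
    explicit ObjectA(int value) : value_(value) {}
    int value() const { return value_; }
private:
    int value_;
};

// --- ObjectD.cpp: includes ObjectA.h ---
ObjectD::ObjectD() {}
ObjectD::~ObjectD() {}
void ObjectD::add(int value) {
    items_.push_back(std::unique_ptr<ObjectA>(new ObjectA(value)));
}
std::size_t ObjectD::size() const { return items_.size(); }

// Usage: only code that creates or destroys ObjectA needs ObjectA.h.
void exampleUsage() {
    ObjectD d;
    d.add(1);
    d.add(2);
    assert(d.size() == 2);
}
```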

I would go with a combined approach of forward headers and a kind of common.h header that is specific to your project and just includes all the forward declaration headers and any other stuff that is common and lightweight.
You complain about the overhead of maintaining twice the number of headers but I don’t think this should be too much of a problem: the forward headers usually only need to know a very limited number of types (one?), and sometimes not even the full type.
You could even try auto-generating the headers using a script (this is done e.g. in SeqAn) if there are really that many headers.
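For illustration, a forward header for the Foo from the question could look roughly like this. It is a sketch only: the intermediate names FooPtr and FooList, the countFoos function and the Report files are made up, and std::shared_ptr stands in for whichever shared_ptr the project uses.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// --- FooFwd.h: one forward declaration plus the associated typedefs ---
class Foo;
typedef std::shared_ptr<const Foo> FooPtr;
typedef std::vector<FooPtr> FooList;
typedef std::shared_ptr<FooList> FooListPtr;

// --- Report.h (hypothetical client): needs only FooFwd.h ---
std::size_t countFoos(const FooListPtr& foos);

// --- Foo.h ---
class Foo {
public:
    explicit Foo(int id) : id_(id) {}
    int id() const { return id_; }
private:
    int id_;
};

// --- Report.cpp ---
std::size_t countFoos(const FooListPtr& foos) {
    return foos ? foos->size() : 0;
}

// Usage: only code that constructs a Foo has to include Foo.h.
void exampleUsage() {
    FooListPtr foos = std::make_shared<FooList>();
    foos->push_back(std::make_shared<Foo>(1));
    assert(countFoos(foos) == 1);
}
```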

+1 for documenting the typedef conventions.
Foo.h - can you detail the problems you have with that?
FooFwd.h - I'd not use them generally, only on "obvious hotspots". (Yes, "hotspots" are hard to determine).
It doesn't change the rules IMO because when you do introduce a fwd header, the associated typedefs from foo.h move there.
Common.h - cool for small projects, but doesn't scale, I do agree.
Some kind of fancy #define... PLEASE NO!...
Use a vector subclass - doesn't make it better.
You might use containment, though.
So here are the preliminary suggestions (revised from that other question):
Standard type headers <boost/shared_ptr.hpp>, <vector> etc. can go into a precompiled header / shared include file for the project. This is not bad. (I personally still include them where needed, but that works in addition to putting them into the PCH.)
If the container is an implementation detail, the typedefs go where the container is declared (e.g. private class members if the container is a private class member)
Associated types (like FooListPtr) go to where Foo is declared, if the associated type is the primary use of the type. That's almost always true for some types - e.g. shared_ptr.
If Foo gets a separate forward declaration header, and the associated type is ok with that, it moves to the FooFwd.h, too.
If the type is only associated with a particular interface (e.g. parameter for a public method), it goes there.
If the type is shared (and does not meet any of the previous criteria), it gets its own header. Note that this also means to pull in all dependencies.
It feels "obvious" for me, but I agree it's not good as a coding standard.
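To make two of these rules concrete, here is a small sketch in which a typedef that is part of an interface lives next to the method that uses it, and the container typedef stays private. FooRegistry and its members are hypothetical names, not something from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

class Foo {};

class FooRegistry {
public:
    // Used by the public interface, so it is declared with that interface.
    typedef std::shared_ptr<const Foo> FooPtr;

    void add(const FooPtr& foo) { foos_.push_back(foo); }
    std::size_t size() const { return foos_.size(); }

private:
    // The container is an implementation detail, so its typedef is private.
    typedef std::vector<FooPtr> FooList;
    FooList foos_;
};

void exampleUsage() {
    FooRegistry registry;
    FooRegistry::FooPtr foo = std::make_shared<Foo>();
    registry.add(foo);
    assert(registry.size() == 1);
}
```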

I'm using shared_ptr and STL extensively in a project, and this is leading to over-long, error-prone types like shared_ptr< vector< shared_ptr<const Foo> > >. (I'm an ObjC programmer by preference, where long names are the norm, and still this is way too much.) It would be much clearer, I believe, to consistently call this FooListPtr and document the naming convention that "Ptr" means shared_ptr and "List" means vector of shared_ptr.
for starters, i recommend using good design structures for scoping (e.g., namespaces) as well as descriptive, non-abbreviated names for typedefs. FooListPtr is terribly short, imo. nobody wants to guess what an abbreviation means (or be surprised to find Foo is const, shared, etc.), and nobody wants to alter their code simply because of scope collisions.
it may also help to choose a prefix for typedefs in your libraries (as well as other common categories).
it's also a bad idea to drag types out of their declared scope:
namespace MON {
namespace Diddy {
class Foo;
} /* << Diddy */
/*...*/
typedef Diddy::Foo Diddy_Foo;
} /* << MON */
there are exceptions to this:
an entirely encapsulated private type
a contained type within a new scope
while we're at it, using in namespace scopes and namespace aliases should be avoided - qualify the scope if you want to minimize future maintenance.
This is easy to typedef, but it's causing headaches with the headers. I seem to have several options of where to define FooListPtr:
Foo.h. That entwines all the headers and creates serious build problems, so it's a non-starter.
it may be an option for declarations which really depend on other declarations. implying that you need to divide packages, or there is a common, localized interface for subsystems.
FooFwd.h ("forward header"). This is what Effective C++ suggests, based on iosfwd.h. It's very consistent, but the overhead of maintaining twice the number of headers seems annoying at best.
don't worry about the maintenance of this, really. it is a good practice. the compiler uses forward declarations and typedefs with very little effort. it's not annoying because it helps reduce your dependencies, and helps ensure that they are all correct and visible. there really isn't more to maintain since the other files refer to the 'package types' header.
Common.h (put all of them together into one file). This kills reusability by entwining a lot of unrelated types. You now can't just pick up one object and move it to another project. That's a non-starter.
package based dependencies and inclusions are excellent (ideal, really) - do not rule this out. you'll obviously have to create package interfaces (or libraries) which are designed and structured well, and represent related classes of components. you're making an unnecessary issue out of object/component reuse. minimize the static data of a library, and let the link and strip phases do their jobs. again, keep your packages small and reusable and this will not be an issue (assuming your libraries/packages are well designed).
Some kind of fancy #define magic that typedef's if it hasn't already been typedefed. I have an abiding dislike for the preprocessor because I think it makes it hard for new people to grok the code, but maybe....
actually, you may declare a typedef in the same scope multiple times (e.g., in two separate headers) - that is not an error.
declaring a typedef in the same scope with different underlying types is an error. obviously. you must avoid this, and fortunately the compiler enforces that.
to avoid this, create a 'translation build' which includes the world - the compiler will flag declarations of typedeffed types which don't match.
trying to sneak by with minimal typedefs and/or forwards (which are close enough to free at compilation) is not worth the effort. sometimes you'll need a bunch of conditional support for forward declarations - once that is defined, it is easy (stl libraries are a good example of this -- in the event you are also forward declaring template<typename,typename>class vector;).
it's best to just have all these declarations visible to catch any errors immediately, and you can avoid the preprocessor in this case as a bonus.
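a small sketch of the redeclaration rule at namespace scope, using std::shared_ptr as a stand-in for the project's shared_ptr. the two typedefs would normally sit in two different headers; the commented-out line is what the compiler rejects.

```cpp
#include <memory>
#include <type_traits>
#include <vector>

class Foo;

// As it might appear in one header.
typedef std::shared_ptr<std::vector<std::shared_ptr<const Foo> > > FooListPtr;

// Same name, same underlying type, as it might appear in a second header: accepted.
typedef std::shared_ptr<std::vector<std::shared_ptr<const Foo> > > FooListPtr;

// Uncommenting this is a compile error, because it names a different type
// (Foo instead of const Foo):
// typedef std::shared_ptr<std::vector<std::shared_ptr<Foo> > > FooListPtr;

static_assert(
    std::is_same<FooListPtr,
                 std::shared_ptr<std::vector<std::shared_ptr<const Foo> > > >::value,
    "FooListPtr names the expected type");
```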
Use a vector subclass rather than a typedef. This seems dangerous...
a subclass of std::vector is often flagged as a "beginner's mistake". this container was not meant to be subclassed. don't resort to bad practices simply to reduce your compile times/dependencies. if the dependency really is that significant, you should probably be using PIMPL, anyways:
// <package>.types.hpp
namespace MON {
class FooListPtr;
}
// FooListPtr.hpp
namespace MON {
class FooListPtr {
/* ... */
private:
shared_ptr< vector< shared_ptr<const Foo> > > d_data;
};
}
Are there best practices here? How do they turn out in real code, when reusability, readability and consistency are paramount?
ultimately, i've found a small concise package based approach the best for reuse, for reducing compile times, and minimizing dependence.

Unfortunately with typedefs you have to choose between not ideal options for your header files. There are special cases where option one (right in the class header) works well, but it sounds like it won't work for you. There are also cases where the last option works well, but it's usually where you are using the subclass to replace a pattern involving a class with a single member of type std::vector. For your situation, I'd use the forward declaring header solution. There's extra typing and overhead, but it wouldn't be C++ otherwise, right? It keeps things separate, clean and fast.

Related

I've done a shady thing

Are (seemingly) shady things ever acceptable for practical reasons?
First, a bit of background on my code. I'm writing the graphics module of my 2D game. My module contains more than two classes, but I'll only mention two in here: Font and GraphicsRenderer.
Font provides an interface through which to load (and release) files and nothing much more. In my Font header I don't want any implementation details to leak, and that includes the data types of the third-party library I'm using. The way I prevent the third-party lib from being visible in the header is through an incomplete type (I understand this is standard practice):
class Font
{
private:
struct FontData;
boost::shared_ptr<FontData> data_;
};
GraphicsRenderer is the (read: singleton) device that initializes and finalizes the third-party graphics library and also is used to render graphical objects (such as Fonts, Images, etc). The reason it's a singleton is because, as I've said, the class initializes the third-party library automatically; it does this when the singleton object is created and exits the library when the singleton is destroyed.
Anyway, in order for GR to be able to render Font it must obviously have access to its FontData object. One option would be to have a public getter, but that would expose the implementation of Font (no other class other than Font and GR should care about FontData). Instead I considered it's better to make GR a friend of Font.
Note: Until now I've done two things that some may consider shady (singleton and friend), but these are not the things I want to ask you about. Nevertheless, if you think my rationale for making GR a singleton and a friend of Font is wrong please do criticize me and maybe offer better solutions.
The shady thing. So GR has access to Font::data_ through friendship, but how does it know exactly what a FontData is (since it's not defined in the header, it's an incomplete type)? I'll just show the code and the comment that includes the rationale...
// =============================================================================
// graphics/font.cpp
// -----------------------------------------------------------------------------
struct Font::FontData
: public sf::Font
{
// Just a synonym of sf::Font
};
// A redefinition of FontData exists in GraphicsRenderer::printText(),
// which will have to be modified as well if this definition is modified.
// (The redefinition is called FontDataSurogate.)
// Why not have FontData defined only once in a separate header:
// If the definition of FontData changes, most likely printText() will
// have to be altered also regardless. Considering that and also that FontData
// has (and should have) a very simple definition, a separate header was
// considered too much of an overhead and of little practical advantage.
// =============================================================================
// graphics/graphics_renderer.cpp
// -----------------------------------------------------------------------------
void GraphicsRenderer::printText(const Font& fnt /* ... */)
{
struct FontDataSurogate
: public sf::Font {
};
FontDataSurogate* suro = (FontDataSurogate*)fnt.data_.get();
sf::Font& font = (sf::Font)(*suro);
// ...
}
So that's the shady thing I'm trying to do. Basically what I want is a review of my rationale, so please tell me if you think I've done something horrendous or if not confirm my rationale so I can be a bit surer I'm doing the right thing. :) (This is my biggest project yet and I'm only at the beginning so I'm kinda feeling things in the dark atm.)
In general, if something looks sketchy, I've found that it's often worth going back a few times and trying to figure out exactly why that's necessary. In most cases, some kind of fix pops up (maybe not as "nice", but not relying on any kind of trick).
Now, the first issue I see in your example is this bit of code:
struct FontDataSurogate
: public sf::Font {
};
occurs twice, in different files (neither being a header). That may come back and be a bother when you change one but not the other in the future, and making sure both are identical will very likely be a pain.
To solve that, I would suggest putting the definition to FontDataSurogate and the appropriate includes (whatever library/header defines sf::Font) in a separate header. From the two files that need to use FontDataSurogate, include that definition header (not from any other code files or headers, just those two).
If you have a main class declaration header for your library, place the forward declaration for the class there, and use pointers in your objects and parameters (regular pointers or shared pointers).
You can then use friend or add a get method to retrieve the data, but by moving the class definition to its own header, you've created a single copy of that code and have a single object/file that's interfacing with the other library.
Edit:
You commented on the question while I was writing this, so I'll add on a reply to your comment.
"Too much overhead" - more to document, one more thing to include, the complexity of the code grows, etc.
Not so. You will have one copy of the code, compared to the two that must remain identical now. The code exists either way, so it needs to be documented, but complexity is reduced and maintenance in particular is simplified. You do gain two #include statements, but is that such a high cost?
"Little practical advantage" - printText() would have to be modified every time FontData is modified regardless of whether or not it's defined in a separate header or not.
The advantage is less duplicate code, making it easier to maintain for you (and others). Modifying the function when the input data changes is not surprising or unusual really. Moving it to another header doesn't cost you anything but the mentioned includes.
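A rough single-file sketch of the separate-header idea, with the files marked by comments. thirdparty::Font is a made-up stand-in for sf::Font so the example compiles on its own, and the font_data.h file name, the Font constructor and the string return value of printText() are assumptions for illustration only.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for the third-party type (sf::Font in the question).
namespace thirdparty { struct Font { std::string family; }; }

// --- graphics/font.h (public header, no third-party types visible) ---
class GraphicsRenderer;
class Font {
public:
    explicit Font(const std::string& family);
private:
    friend class GraphicsRenderer;
    struct FontData;
    std::shared_ptr<FontData> data_;
};

// --- graphics/font_data.h (private, included only by the two .cpp files) ---
struct Font::FontData : public thirdparty::Font {};

// --- graphics/font.cpp ---
Font::Font(const std::string& family) : data_(std::make_shared<FontData>()) {
    data_->family = family;
}

// --- graphics/graphics_renderer.cpp ---
class GraphicsRenderer {
public:
    std::string printText(const Font& fnt) const {
        // One shared definition, so no surrogate struct and no cast.
        const thirdparty::Font& font = *fnt.data_;
        return font.family;
    }
};

void exampleUsage() {
    Font f("Arial");
    GraphicsRenderer gr;
    assert(gr.printText(f) == "Arial");
}
```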
friend is fine, and encouraged. See C++ FAQ Lite's rationale for more info: Do friends violate encapsulation?
This line is indeed horrendous, as it invokes undefined behavior: FontDataSurogate* suro = (FontDataSurogate*)fnt.data_.get();
You forward declare the existence of the FontData struct, and then go on to fully declare it in two locations: Font, and GraphicsRenderer. Ew. Now you have to manually keep these exactly binary compatible.
I'm sure it works, but you're right, it is kind of shady. But whenever we say such-and-such is eeevil, we mean to avoid a certain practice, with the caveat that sometimes it can be useful. That being said, I don't think this is one of those times.
One technique is to invert your handling. Instead of putting all of the logic inside GraphicsRenderer, put some of it inside Font. Like so:
class Font
{
public:
void do_something_with_fontdata(GraphicsRenderer& gr) const; // const, because printText() receives a const Font&
private:
struct FontData;
boost::shared_ptr<FontData> data_;
};
void GraphicsRenderer::printText(const Font& fnt /* ... */)
{
fnt.do_something_with_fontdata(*this);
}
This way, the Font details are kept within the Font class, and even GraphicsRenderer doesn't need to know the specifics of the implementation. This solves the friend issue too (although I don't think friend is all that bad to use).
Depending on how your code is laid out, and what it's doing, attempting to invert it like this may be quite difficult. If that is the case, simply move the real declaration of FontData to its own header file, and use it in both Font and GraphicsRenderer.
You've spent a lot more effort asking this question than you've supposedly saved by duplicating that code.
You state three reasons you didn't want to add the file:
Extra include
Extra Documentation
Extra Complexity
But I would have to say that 2 and 3 are increased by duplicating that code. Now you document what it's doing in the original place and what the fried monkey it's doing defined again in another random place in the code base. And duplicating code can only increase the complexity of a project.
The only thing you are saving is an include file. But files are cheap. You should not be afraid of creating them. There is almost zero cost (or at least there should be) to add a new header file.
The advantages of doing this properly:
You no longer rely on the compiler treating two separately written definitions as compatible.
Someday, somebody is going to modify the FontData class without modifying printText(). Maybe they should modify printText(), but they either haven't done it yet or don't know that they need to. Or perhaps additional data on FontData will make sense in a way that simply hasn't occurred to you. Regardless, the different pieces of code will operate on different assumptions and will explode in a very hard to trace bug.

C++ cyclical header dependency

I had a question: Suppose I have a header/source file set and a header set as follows:
BaseCharacter.h and BaseCharacter.cpp and EventTypes.h
BaseCharacter.h makes use of structs and typedefs defined in EventTypes.h, but EventTypes.h has to be aware of the BaseCharacter class defined in BaseCharacter.h. This creates a cyclical dependency and I'm pretty sure that's what's stopping my program from compiling. If I take out EventTypes.h and all the methods that rely on the stuff in EventTypes.h, my program compiles fine. But if I add EventTypes.h, it and every file referencing BaseCharacter.h will complain that they can't find the BaseCharacter class.
Is there a way around this dependency or would this not be what's causing my problem?
I'm using MSVC 2010 as my compiler
Forward declaration.
In EventTypes.h, remove the include and add:
class BaseCharacter;
Note that you can only use references and pointers to BaseCharacter within EventTypes.h, you cannot e.g. have a struct with a BaseCharacter myChar; member variable.
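Roughly, the two headers could end up looking like this (shown in one file, with the file boundaries as comments). DamageEvent and the members of BaseCharacter are hypothetical, since the question doesn't show the real contents.

```cpp
#include <cassert>

// --- EventTypes.h ---
class BaseCharacter;  // forward declaration instead of #include "BaseCharacter.h"

struct DamageEvent {
    BaseCharacter* target;  // OK: pointer to an incomplete type
    int amount;
    // BaseCharacter copy;  // error here: member of incomplete type
};

// --- BaseCharacter.h: includes EventTypes.h, not the other way round ---
class BaseCharacter {
public:
    explicit BaseCharacter(int hp) : hp_(hp) {}
    void onDamage(const DamageEvent& e) { hp_ -= e.amount; }
    int hp() const { return hp_; }
private:
    int hp_;
};

void exampleUsage() {
    BaseCharacter hero(10);
    DamageEvent hit = {&hero, 3};
    hit.target->onDamage(hit);
    assert(hero.hp() == 7);
}
```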
You should probably have a good look at your design and make sure it makes sense; often cyclical dependencies indicate a sub-optimal design (though it's possible that it's the best solution for your needs).
In any case, you can predeclare the classes upfront in each header file, thus avoiding having circular includes. This is called using forward declarations.
Another good option would be to extract the stuff that both BaseCharacter.h and EventTypes.h depend on into a third header file that the first two include; then you'd only have a one-way dependency of EventTypes.h on BaseCharacter.h.
A third alternative is to simply merge everything into one header file; this may or may not make sense based on your design, but if the inter-dependencies are strong enough, then surely a unified model makes sense?
You want to learn how to forward declare classes and structs.
See here: cyclic dependency between header files
or here: C++ error: 'Line2' has not been declared
Just to add to the answers mentioning forward declaration, there is another alternative which is slightly different, and it is called PIMPL, which stands for Pointer to IMPlementation. It is often used in conjunction with forward declaration, but can be used without. Not only does it help to solve cyclic dependency problems, but it can also dramatically speed up build time and reduce code dependency.
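A minimal PIMPL sketch, assuming a C++11 compiler for std::unique_ptr. The name member and accessor are invented for illustration. The point is that BaseCharacter.h no longer needs to include the headers its implementation depends on.

```cpp
#include <cassert>
#include <memory>
#include <string>

// --- BaseCharacter.h: no implementation headers needed here ---
class BaseCharacter {
public:
    explicit BaseCharacter(const std::string& name);
    ~BaseCharacter();  // defined in the .cpp, where Impl is complete
    const std::string& name() const;
private:
    struct Impl;                  // declared only; defined in the .cpp
    std::unique_ptr<Impl> impl_;
};

// --- BaseCharacter.cpp: would include EventTypes.h and whatever else Impl uses ---
struct BaseCharacter::Impl {
    std::string name;
};

BaseCharacter::BaseCharacter(const std::string& name) : impl_(new Impl) {
    impl_->name = name;
}
BaseCharacter::~BaseCharacter() {}
const std::string& BaseCharacter::name() const { return impl_->name; }

void exampleUsage() {
    BaseCharacter hero("hero");
    assert(hero.name() == "hero");
}
```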
Once I had similar problem, solved it with templates. Can't you define the EventTypes and/or BaseCharacter as template?

Is there a good way to avoid duplication of method prototypes in C++?

Most C++ class method signatures are duplicated between the declaration normally in a header files and the definition in the source files in the code I have read. I find this repetition undesirable and code written this way suffers from poor locality of reference. For instance, the methods in source files often reference instance variables declared in the header file; you end up having to constantly switch between header files and source files when reading code.
Would anyone recommend a way to avoid doing so? Or, am I mainly going to confuse experienced C++ programmers by not doing things in the usual way?
See also Question 538255 C++ code in header files where someone is told that everything should go in the header.
There is an alternative, but the cure is worse than the illness — define all the function bodies in the header, or even inline in the class, like C#. The downsides are that this will bloat compile times significantly, and it'll annoy veteran C++ programmers. It can also get you into some annoying situations of circular dependency that, while solvable, are a nuisance to deal with.
Personally, I just set my IDE to have a vertical split, and put the header file on the right side and the source file on the left.
I assume you're talking about member function declarations in a header file and definitions in source files?
If you're used to the Java/Python/etc. model, it may well seem redundant. In fact, if you were so inclined, you could define all functions inline in the class definition (in the header file). But, you'd definitely be breaking with convention and paying the price of additional coupling and compilation time every time you changed anything minor in the implementation.
C++, Ada, and other languages originally designed for large scale systems kept definitions hidden for a reason--there's no good reason that the users of a class should have to be concerned with its implementation, nor any reason they should have to repeatedly pay to compile it. Less of an issue nowadays with faster systems, but still relevant for really large systems. Additionally, TDD, stubbing and other testing strategies are facilitated by the isolation and quicker compilation.
Don't break with convention. In the end, you will make a ball of worms that doesn't work very well. Plus, compilers will hate you. C/C++ are setup that way for a reason.
The C++ language supports function overloading, which means that the entire function signature is basically a way to identify a specific function. For this reason, as long as you declare and define a function separately, there's really no redundancy in having to list the parameters again. More precisely, having to list the parameter types is not redundant. Parameter names, on the other hand, play no role in this process and you are free to omit them in the declaration (i.e. in the header file), although I believe this limits readability.
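For example, with a hypothetical Logger class, the declarations can leave out the parameter names, while the types are what tie each definition to its declaration:

```cpp
#include <cassert>
#include <string>

// --- Logger.h ---
class Logger {
public:
    std::string format(int);                 // parameter name omitted
    std::string format(const std::string&);  // distinct overload, selected by type
};

// --- Logger.cpp ---
std::string Logger::format(int value) {
    return "int:" + std::to_string(value);
}
std::string Logger::format(const std::string& text) {
    return "str:" + text;
}

void exampleUsage() {
    Logger log;
    assert(log.format(42) == "int:42");
    assert(log.format(std::string("hi")) == "str:hi");
}
```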
You "can" get around the problem. You define an abstract interface class that only contains the pure virtual functions that an outside application will call. Then in the CPP file you provide the actual class that derives from the interface and contains all the class variables. You implement as normal now. The only thing this requires is a way to instantiate the derived implementation class from the interface class. You could do that by providing a static "Create" function that has its implementation in the CPP file.
ie
InterfaceClass* InterfaceClass::Create()
{
return new ImplementationClass;
}
This way you effectively hide the implementation from any outside user. You can't, however, create the class on the stack, only on the heap ... but it does solve your problem AND provides a better layer of abstraction. In the end though, if you aren't prepared to do this you need to stick with what you are doing.
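A fuller sketch of the same idea, with the header and source file shown in one block. The value() method and the stored number are made up for illustration. Putting ImplementationClass in an unnamed namespace is my own addition to keep it invisible outside the .cpp file.

```cpp
#include <cassert>
#include <memory>

// --- InterfaceClass.h: only pure virtual functions and the factory ---
class InterfaceClass {
public:
    virtual ~InterfaceClass() {}
    virtual int value() const = 0;
    static InterfaceClass* Create();
};

// --- InterfaceClass.cpp: the real class and its data members live here ---
namespace {
class ImplementationClass : public InterfaceClass {
public:
    ImplementationClass() : value_(42) {}
    virtual int value() const { return value_; }
private:
    int value_;
};
}  // unnamed namespace

InterfaceClass* InterfaceClass::Create() {
    return new ImplementationClass;
}

void exampleUsage() {
    // The caller owns the object, so wrap it to avoid a leak.
    std::unique_ptr<InterfaceClass> object(InterfaceClass::Create());
    assert(object->value() == 42);
}
```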

When to use Header files that do not declare a class but have function definitions

I am fairly new to C++ and I have seen a bunch of code that has method definitions in the header files and they do not declare the header file as a class. Can someone explain to me why and when you would do something like this. Is this a bad practice?
Thanks in advance!
Is this a bad practice?
Not in general. There are a lot of libraries that are header only, meaning they only ship header files. This can be seen as a lightweight alternative to compiled libraries.
More importantly, though, there is a case where you cannot use separately compiled compilation units: a template can only be instantiated in a compilation unit that sees its full definition. This may sound arcane but it has a simple consequence:
Function (and class) templates cannot be defined inside cpp files and used elsewhere; instead, they have to be defined inside header files directly (with a few notable exceptions).
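As a small illustration, a header containing nothing but a function template, with no class involved, could look like this. clampTo and the header name are hypothetical.

```cpp
#include <cassert>

// --- clamp_to.h: the full definition sits in the header, so every file
// --- that includes it can instantiate the template for its own types.
template <typename T>
T clampTo(const T& value, const T& lo, const T& hi) {
    return value < lo ? lo : (hi < value ? hi : value);
}

void exampleUsage() {
    assert(clampTo(5, 0, 3) == 3);          // instantiated for int
    assert(clampTo(2.5, 0.0, 3.0) == 2.5);  // instantiated for double
}
```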
Additionally, classes in C++ are purely optional – while you can program object oriented in C++, a lot of good code doesn't. Classes supplement algorithms in C++, not the other way round.
It's not bad practice. The great thing about C++ is that it lets you program in many styles. This gives the language great flexibility and utility, but possibly makes it trickier to learn than other languages that force you to write code in a particular style.
If you had a small program, you could write it in one function - possibly using a couple of goto's for code flow.
When you get bigger, splitting the code into functions helps organize things.
Bigger still, and classes are generally a good way of grouping related functions that work on a certain set of data.
Bigger still, namespaces help out.
Sometimes though, it's just easiest to write a function to do something. This is often the case where you write a function that only works on primitive types (like int). int doesn't have a class, so if you wanted to write a printInt() function, you might make it standalone. Also, if a function works on objects from multiple classes, but doesn't really belong to one class and not the other, that might make sense as a standalone function. This happens a lot when you write operators, such as defining less-than so that it can compare objects of two different classes. Or, if a function can be written in terms of a class's public methods, and doesn't need to access data of the class directly, some people prefer to write that as a standalone function.
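Here is a sketch of the operator case, using two made-up classes, Meters and Feet. The comparison belongs to neither class and only needs their public methods, so it is written as a pair of standalone functions.

```cpp
#include <cassert>

class Meters {
public:
    explicit Meters(double v) : v_(v) {}
    double value() const { return v_; }
private:
    double v_;
};

class Feet {
public:
    explicit Feet(double v) : v_(v) {}
    double value() const { return v_; }
private:
    double v_;
};

// Standalone operators, written purely in terms of the public interfaces.
// inline because they would normally live in a header.
inline bool operator<(const Meters& m, const Feet& f) {
    return m.value() < f.value() * 0.3048;
}
inline bool operator<(const Feet& f, const Meters& m) {
    return f.value() * 0.3048 < m.value();
}

void exampleUsage() {
    assert(Meters(1.0) < Feet(4.0));  // 4 ft is about 1.22 m
    assert(Feet(1.0) < Meters(1.0));
}
```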
But, really, the choice is yours. Whatever is the most simple thing to do to solve your problem is best.
You might start a program off as just a few functions, and then later decide some are related and refactor them into a class. But, if the other standalone functions don't naturally fit into a class, you don't have to force them into one.
An H file is simply a way of including a bunch of declarations. Many things in C++ are useful declarations, including classes, types, constants, global functions, etc.
C++ has a strong object oriented facet. Most OO languages tackle the question of where to deal with operations that don't rely on object state and don't actually need the object.
In some languages, like Java, language restrictions force everything to be in a class, so everything becomes a static member function (e.g., classes with math utilities or algorithms).
In C++, to maintain compatibility with C, you are allowed to declare standalone C-style functions or use the Java style of static members. My personal view is that it is better, when possible, to use the OO style and organize operations around a central concept.
However, C++ does provide the namespaces facilities and often it is used in the same way that a class would be used in those situations - to group a bunch of standalone items where each item is prefixed by the "namespace" name. As others point out, many C++ standard library functions are located this way. My view is that this is much like using a class in Java. However, others would argue that Java uses classes because it doesn't have namespaces.
As long as you use one or the other (rather than a floating standalone non-namespaced function) you're generally going to be ok.
I am fairly new to C++ and I have seen a bunch of code that has method definitions in the header files and they do not declare the header file as a class.
Let's clarify things.
method definitions in the header files
This means something like this:
file "A.h":
class A {
void method(){/*blah blah*/} //definition of a method
};
Is this what you meant?
Later you are saying "declare the header file". There is no mechanism for DECLARING a file in C++. A file can be INCLUDED by writing #include "filename.h". If you do this, the contents of the header file will be copied and pasted to wherever you have the above line before anything gets compiled.
So you mean that all the definitions are in the class definition (not anywhere in A.h FILE, but specifically in the class A, which is limited by 'class A{' and '};' ).
The implication of having a method definition in the class definition is that the method is implicitly 'inline' (this is a C++ keyword). That asks the compiler to paste the method body wherever there is a call to it, although the compiler makes the final decision. This is:
good, because the function call mechanism no longer slows down the execution
bad if the function is longer than a short statement, because the size of executable code grows badly
Things are different for templates as someone above stated, but for them there is a way of defining methods such that they are not inline, but still in the header file (they must be in headers). These definitions have to be outside the class definition anyway.
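A sketch of both styles in one hypothetical class template, Stack: empty() is defined inside the class definition, while push() and pop() are defined outside it but still in the header.

```cpp
#include <cassert>
#include <vector>

// --- Stack.h ---
template <typename T>
class Stack {
public:
    bool empty() const { return items_.empty(); }  // defined in the class
    void push(const T& item);                      // declared only
    T pop();                                       // declared only
private:
    std::vector<T> items_;
};

// Still in the header, but outside the class definition.
template <typename T>
void Stack<T>::push(const T& item) {
    items_.push_back(item);
}

template <typename T>
T Stack<T>::pop() {
    T top = items_.back();
    items_.pop_back();
    return top;
}

void exampleUsage() {
    Stack<int> s;
    s.push(1);
    s.push(2);
    assert(s.pop() == 2);
    assert(!s.empty());
}
```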
In C++, functions do not have to be members of classes.

How best to switch from template mess to clean classes architecture (C++)?

Assuming a largish template library with around 100 files containing around 100 templates with overall more than 200,000 lines of code. Some of the templates use multiple inheritance to make the usage of the library itself rather simple (i.e. inherit from some base templates and only having to implement certain business rules).
All that exists (grown over several years), "works" and is used for projects.
However, compilation of projects using that library consumes a growing amount of time and it takes quite some time to locate the source for certain bugs. Fixing often causes unexpected side effects or is quite difficult, because some interdependent templates need changing. Testing is nearly impossible due to the sheer amount of functions.
Now, I would really like to simplify the architecture to use fewer templates and more specialized smaller classes.
Is there any proven way to go about that task? What would be a good place to start?
I'm not sure I see how/why templates are the problem, and why plain non-templated classes would be an improvement. Wouldn't that just mean even more classes, less type safety and so larger potential for bugs?
I can understand simplifying the architecture, refactoring and removing dependencies between the various classes and templates, but automatically assuming that "fewer templates will make the architecture better" is flawed imo.
I'd say that templates potentially allow you to build a much cleaner architecture than you'd get without them, simply because you can make separate classes totally independent. Without templates, classes or functions which call into another class must know about that class, or an interface it inherits, in advance. With templates, this coupling isn't necessary.
Removing templates would only lead to more dependencies, not fewer.
The added type-safety of templates can be used to detect a lot of bugs at compile-time (Sprinkle your code liberally with static_assert's for this purpose)
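For instance, a check like the following turns a misuse into a compile error instead of a runtime surprise. This is only a sketch: average() is a hypothetical function, and the floating-point restriction is an assumption chosen for the example.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical function template. The static_assert documents and
// enforces the requirement on T at compile time.
template <typename T>
T average(T a, T b) {
    static_assert(std::is_floating_point<T>::value,
                  "average() requires a floating-point type");
    return (a + b) / 2;
}
```

Calling average(1.0, 3.0) compiles, while average(1, 3) is rejected with the message given in the static_assert.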
Of course, the added compile-time may be a valid reason to avoid templates in some cases, and if you only have a bunch of Java programmers, who are used to thinking in "traditional" OOP terms, templates might confuse them, which can be another valid reason to avoid templates.
But from an architecture point of view, I think avoiding templates is a step in the wrong direction.
Refactor the application, sure, it sounds like that's needed. But don't throw away one of the most useful tools for producing extensible and robust code just because the original version of the app misused it. Especially if you're already concerned with the amount of code, removing templates will most likely lead to more lines of code.
You need automated tests, that way in ten years' time when your successor has the same problem he can refactor the code (probably to add more templates because he thinks it will simplify usage of the library) and know it still meets all test cases. Similarly the side effects of any minor bug fixes will be immediately visible (assuming your test cases are good).
Other than that, "divide and conquer"
Write unit tests.
Where the new code must do the same as the old code.
That's one tip at least.
Edit:
If you deprecate old code that you have replaced with the new functionality, you can phase over to the new code little by little.
Well, the problem is that the template way of thinking is very different from the object-oriented, inheritance-based way. It's hard to answer anything other than "redesign the whole thing and start from scratch".
Of course, there may be a simple way for a particular case. We can't tell without knowing more about what you have.
The fact that the template solution is so difficult to maintain is an indication of a poor design anyway.
Some points (but note: none of these techniques is evil in itself. If you want to change to non-template code, though, this can help out):
Look up your static interfaces. Where do templates depend on what functions exist? Where do they need typedefs?
Put the common parts in an abstract base class. A good example is when you happen to stumble over the CRTP idiom. You can just replace it with an abstract base class having virtual functions.
Look up integer lists. If you find your code uses integral lists like list<1, 3, 3, 1, 3>, you can replace them with std::vector, if all the code using them can live with working with runtime values instead of constant expressions.
Look up type traits. Typical templated code spends a lot of effort checking whether some typedef exists, or whether some method exists. Abstract base classes solve these two issues by using pure virtual methods, and by inheriting typedefs from the base. Often, typedefs are only needed to trigger hideous features like SFINAE, which would then be superfluous too.
Look up expression templates. If your code uses expression templates to avoid creating temporaries, you will have to eliminate them and use the traditional way of returning / passing temporaries to the operators involved.
Look up function objects. If you find your code uses function objects, you can change them to use abstract base classes too, and have something like void run(); to call them (or if you want to keep using operator(), even better! It can be virtual too).
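To make the CRTP point above concrete, here is a minimal before/after sketch. All class names are invented for the example, and it assumes the cost of a virtual call is acceptable in the code being converted.

```cpp
#include <cassert>
#include <string>

// Before: CRTP. The base template calls into the derived class, and the
// call is resolved at compile time.
template <typename Derived>
class ShapeCrtp {
public:
    std::string describe() const {
        return "shape:" + static_cast<const Derived*>(this)->name();
    }
};

class CircleCrtp : public ShapeCrtp<CircleCrtp> {
public:
    std::string name() const { return "circle"; }
};

// After: an abstract base class. The same call is now resolved at run
// time through a virtual function, and no template is involved.
class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
    std::string describe() const { return "shape:" + name(); }
};

class Circle : public Shape {
public:
    std::string name() const override { return "circle"; }
};

// Non-template code can accept any shape through the base class, so it
// can be compiled once in a .cpp file.
inline std::string describeAny(const Shape& s) { return s.describe(); }
```

The trade-off is the usual one: the virtual version compiles once and decouples callers from the concrete type, while the CRTP version avoids the indirect call.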
As I understand, you are most concerned with build times, and the maintainability of your library?
First, don't try to "fix" all at once.
Second, understand what you fix. Template complexity is there often for a reason, e.g. to enforce certain use, and make the compiler help you not make a mistake. That reason might sometimes be taken too far, but throwing out 100 lines because "no one really knows what they do" shouldn't be taken lightly. Everything I suggest here can introduce really nasty bugs, you have been warned.
Third, consider cheaper fixes first: e.g. faster machines or distributed build tools. At least, throw in all the RAM the boards will take, and throw out old disks. It does make a difference. One drive for OS, one drive for build is a cheap man's RAID.
Is the library well documented? That's your best chance. Look into tools such as doxygen that help you create such documentation.
All considered? OK, now some suggestions for the build times ;)
Understand the C++ build model: every .cpp is compiled individually. That means many .cpp files with many headers = huge build. This is NOT advice to put everything into one .cpp file, though! However, one trick (!) that can speed up a build immensely is to create a single .cpp file that includes a bunch of .cpp files, and only feed that "master" file to the compiler. You can't do that blindly, though - you need to understand the types of errors this could introduce.
If you don't have one yet, get a separate build machine that you can remote into. You'll have to do a lot of almost-full builds to check if you broke some include. You will want to run this in another machine, that doesn't block you from working on something else. Long term, you'll need it for daily integration builds anyway ;)
Use precompiled headers. (scales better with fast machines, see above)
Check your header inclusion policy. While every file should be "independent" (i.e. include everything it needs to be included by someone else), don't include liberally. Unfortunately, I haven't yet found a tool to find unnecessary #include statements, but it might help to spend some time removing unused headers in "hotspot" files.
Create and use forward declarations for the templates you use. Often, you can include a header with forward declarations in many places, and use the full header only in a few specific ones. This can greatly help compile time. Check the <iosfwd> header to see how the standard library does that for i/o streams.
Overloads for templates for few types: If you have a complex function template that is useful only for a very few types like this:
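A minimal sketch of the forward-declaration idea, with the would-be files marked by comments. The file names and the Matrix and Report classes are assumptions made up for this example; in a real project each section would live in its own file.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ---- MatrixFwd.h (hypothetical forward header, cheap to include) ----
template <typename T> class Matrix;  // declaration only

// ---- Report.h: only needs the name, so MatrixFwd.h is enough ----
class Report {
public:
    // Pointers to an incomplete type are fine in declarations.
    void attach(const Matrix<double>* m) { source_ = m; }
    std::size_t rows() const;  // defined where Matrix is complete

private:
    const Matrix<double>* source_ = nullptr;
};

// ---- Matrix.h: the full (expensive) definition ----
template <typename T>
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), data_(rows * cols) {}
    std::size_t rows() const { return rows_; }

private:
    std::size_t rows_;
    std::vector<T> data_;
};

// ---- Report.cpp: the only place that includes Matrix.h ----
// ('inline' is used here only because this sketch is a single file.)
inline std::size_t Report::rows() const {
    return source_ ? source_->rows() : 0;
}
```

Every file that includes only Report.h avoids parsing the full Matrix template; only Report.cpp pays for it.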
// .h
template <typename FLOAT> // float or double only
FLOAT CalcIt(int len, FLOAT * values) { ... }
You can declare the overloads in the header, and move the template to the body:
// .h
float CalcIt(int len, float * values);
double CalcIt(int len, double * values);
// .cpp
template <typename FLOAT> // float or double only
FLOAT CalcItT(int len, FLOAT * values) { ... }
float CalcIt(int len, float * values) { return CalcItT(len, values); }
double CalcIt(int len, double * values) { return CalcItT(len, values); }
This moves the lengthy template to a single compilation unit.
Unfortunately, this is only of limited use for classes.
Check if the PIMPL idiom can move code from the headers into .cpp files.
The general rule that hides behind that is: separate the interface of your library from the implementation. Use comments, detail namespaces and separate .impl.h headers to mentally and physically isolate what should be known to the outside from how it is accomplished. This exposes the real value of your library (does it actually encapsulate complexity?), and gives you a chance to replace "easy targets" first.
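The PIMPL idiom mentioned above could look roughly like this. Logger is a hypothetical class, and the split into Logger.h and Logger.cpp is indicated by comments only, so that the sketch stays in one piece.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// ---- Logger.h (hypothetical public header) ----
// In a real header only <cstddef>, <memory> and <string> would be
// needed; <vector> is an implementation detail of the .cpp file.
class Logger {
public:
    Logger();
    ~Logger();  // must be defined where Impl is a complete type
    void log(const std::string& msg);
    std::size_t count() const;

private:
    struct Impl;                  // declared, not defined, in the header
    std::unique_ptr<Impl> impl_;  // the "pointer to implementation"
};

// ---- Logger.cpp ----
struct Logger::Impl {
    std::vector<std::string> lines;
};

Logger::Logger() : impl_(new Impl) {}
Logger::~Logger() = default;

void Logger::log(const std::string& msg) { impl_->lines.push_back(msg); }

std::size_t Logger::count() const { return impl_->lines.size(); }
```

Changing Impl (for example, swapping the vector for another container) then only recompiles Logger.cpp, not every file that includes Logger.h.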
More specific advice - and how useful the advice given here is - depends largely on the actual library.
Good luck!
As mentioned, unit tests are a good idea. Indeed, rather than breaking your code by introducing "simple" changes that are likely to ripple out, just focus on creating a suite of tests, and fixing non-compliance with the tests. Have an activity to update the tests when bugs come to light.
Beyond that, I would suggest upgrading your tools, if possible, to help with debugging template-related problems.
I've often come across legacy templates that were huge and required a lot of time and memory to instantiate, but didn't need to be. In those cases, the easiest way to cut out the fat was to take all of the code that didn't rely on any of the template arguments and hide it in separate functions defined in a normal translation unit. This also had the positive side-effect of triggering fewer recompiles when this code had to be slightly modified or documentation changed. It sounds rather obvious, but it's really surprising how often people write a class template and think that EVERYTHING it does has to be defined in the header, rather than just the code that needs the templated information.
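A small sketch of that idea, assuming a hypothetical Stack template whose growth policy does not depend on the element type. The names detail::nextCapacity and Stack are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ---- Stack.h ----
namespace detail {
// Does not depend on any template argument, so only the declaration
// needs to be visible in the header.
std::size_t nextCapacity(std::size_t current, std::size_t needed);
}

template <typename T>
class Stack {
public:
    void push(const T& v) {
        if (items_.size() == items_.capacity()) {
            items_.reserve(
                detail::nextCapacity(items_.capacity(), items_.size() + 1));
        }
        items_.push_back(v);
    }
    const T& top() const { return items_.back(); }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<T> items_;
};

// ---- Stack.cpp: compiled once, no matter how many Stack<T> exist ----
std::size_t detail::nextCapacity(std::size_t current, std::size_t needed) {
    std::size_t cap = current ? current : 4;  // start at 4, then double
    while (cap < needed) cap *= 2;
    return cap;
}
```

Tweaking the growth policy now touches only the .cpp file, so clients of Stack.h do not need to be recompiled.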
Another thing you might want to consider is whether you can clean up the inheritance hierarchies by making the templates "mixin" style instead of aggregations of multiple inheritance. See how many places you can get away with making one of the template arguments the name of the base class that it should derive from (boost::enable_shared_from_this is a related pattern, although there the template argument names the derived class rather than the base). Of course this typically only works well if the constructors take no arguments, as you then don't have to worry about initializing anything correctly.
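As an illustration of the mixin style, here is a minimal sketch in which each layer takes its base class as a template argument. Document, Counted and Signed are hypothetical names, and all constructors are deliberately default ones, in line with the caveat above.

```cpp
#include <cassert>
#include <string>

// Plain class at the bottom of the chain.
class Document {
public:
    std::string text() const { return "report"; }
};

// Mixin: adds a counter on top of whatever Base provides.
template <typename Base>
class Counted : public Base {
public:
    int touch() { return ++touches_; }

private:
    int touches_ = 0;
};

// Mixin: relies on Base providing text(). 'this->' is required because
// text() is found in a dependent base class.
template <typename Base>
class Signed : public Base {
public:
    std::string signedText() const { return this->text() + " [signed]"; }
};

// Layers are stacked linearly instead of through multiple inheritance.
using SignedCountedDocument = Signed<Counted<Document>>;
```

Each mixin has a single base, so the resulting hierarchy is a straight line and the layers can be reordered or dropped independently.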