I have a question that bothers me for a long time.
I have a mathematical library, implemented as DLL.
The library implements many types of mathematical functions.
The mathematical functions are implemented in different classes based on their functionality. For example, all functions that implements polynomial equations are under CPolynom, all functions that implements differential equations are under CDifferential.
The DLL has to expose an interface.
Now my question:
Option 1
I can implement in one header file, one interface class which includes all the “sub” interface methods of all practical classes => On one hand, I expose to the outside world only one header file with one class, on the other hand this class might be huge, including interface methods which are not related (i.e. someone wants the CPolynom functionality, he/she will have to include huge header file and construct huge interface class)
Option 2
I can implement interface class for each class in different header file.
=> On one hand I split classes to different files based on functionality. If someone wants a specific functionality, he/she will have to include only the relevant file.
On the other hand, I can end up with many header files introduced to the outside world.
Which approach counts as more professional, more correct in software design discipline?
Thank you,
David
I generally implement one header per class. If you find yourself always needing a shotgun blast of headers because you need "one from column A and two from column B", that generally implies some sort of design code smell. For instance, you have a single responsibility spread across several classes instead of concentrated in a single class.
Well, you could always go for option 2 and create a header, that will include all other headers, so that way, if someone doesn't care, he will just include 1 big header, otherwise, he will be able to include only what he needs.
Personally, I don't see anything wrong with one big header, when I try to use small headers as an user, I always have problems like "I need only 1 function from header X, 2 from header Y, ...." and after all, I end up with 15 includes to one library in each *.cpp file, which is annoying (so I actually create one big header myself :) ).
I think that you should implement interface for each class and put them in one header file. No need to produce lot of library headers in the world of precompiled headers.
One header (or few, if there is a lot of interfaces and you can group them logically), but separate classes is an option too. Don't make one huge class with bunch of methods like you were going to do in option 1.
The size of the header file is relevant only in terms of how long it takes to compile. Unused declarations will not impact performance, and your code size is already impacted by shipping all the functions in a DLL instead of statically linking only the needed functions.
Have you considered putting a COM interface on the library? Instead of shipping a separate .h file your users can simply:
#import "mathFunctions.dll" named_guids
Automatically declared smart pointers then make the functions usable with no further manual declarations or even cleanup. You can define your documentation in the IDL, which will then appear in tooltips in the IDE when your clients go to use your code. Your code is also reusable in all Microsoft languages from VB3-6 and wscript and .Net.
double myFunc(double x)
{
double a=1, b=2, c=3, y=0;
MathLib::PolynomPtr polynom(CLSID_CPolynom, 0, CLSCTX_INPROC_SERVER);
y = polynom->someFunction(a, b, c, x); // use it somehow
return y;
}
Used in-proc, COM adds only about 16 cycles of overhead to the execution of each function. Because it's a smart pointer, you do not have to clean it up. And I didn't include it above, but as with anything external you should add basic try/catch functionality wrappers:
catch (_com_error &e) { printf("COM error: %08x %s\n", e.Error(), e.Description()) };
Option 2 would look natural and professional to everybody.
This is the way the large majority of libraries are done.
However, grouping small related classes in the same header (but still having several headers for the whole API) sounds very reasonable.
The biggest disadvantages I see of option 1 are:
- Everything in a big class might be difficult to handle: when a developer is busy on using CPolynom, having its declaration lost between CDifferential and other class would be painful.
- Using a namespace is more appropriate to aggregate classes, than a bigger, "mother" class.
In short, I would choose option 2, with all classes under the library namespace.
Related
Most C++ class method signatures are duplicated between the declaration normally in a header files and the definition in the source files in the code I have read. I find this repetition undesirable and code written this way suffers from poor locality of reference. For instance, the methods in source files often reference instance variables declared in the header file; you end up having to constantly switch between header files and source files when reading code.
Would anyone recommend a way to avoid doing so? Or, am I mainly going to confuse experienced C++ programmers by not doing things in the usual way?
See also Question 538255 C++ code in header files where someone is told that everything should go in the header.
There is an alternative, but the cure is worse than the illness — define all the function bodies in the header, or even inline in the class, like C#. The downsides are that this will bloat compile times significantly, and it'll annoy veteran C++ programmers. It can also get you into some annoying situations of circular dependency that, while solvable, are a nuisance to deal with.
Personally, I just set my IDE to have a vertical split, and put the header file on the right side and the source file on the left.
I assume you're talking about member function declarations in a header file and definitions in source files?
If you're used to the Java/Python/etc. model, it may well seem redundant. In fact, if you were so inclined, you could define all functions inline in the class definition (in the header file). But, you'd definitely be breaking with convention and paying the price of additional coupling and compilation time every time you changed anything minor in the implementation.
C++, Ada, and other languages originally designed for large scale systems kept definitions hidden for a reason--there's no good reason that the users of a class should have to be concerned with its implementation, nor any reason they should have to repeatedly pay to compile it. Less of an issue nowadays with faster systems, but still relevant for really large systems. Additionally, TDD, stubbing and other testing strategies are facilitated by the isolation and quicker compilation.
Don't break with convention. In the end, you will make a ball of worms that doesn't work very well. Plus, compilers will hate you. C/C++ are setup that way for a reason.
C++ language supports function overloading, which means that the entire function signature is basically a way to identify a specific function. For this reason, as long as you declare and define function separately, there's really no redundancy in having to list the parameters again. More precisely, having to list the parameter types is not redundant. Parameters names, on the other hand, play no role in this process and you are free to omit them in the declaration (i.e in the header file), although I belive this limits readability.
You "can" get around the problem. You define an abstract interface class that only contains the pure virtual functions that an outside application will call. Then in the CPP file you provide the actual class that derives from the interface and contains all the class variables. You implement as normal now. The only thing this requires is a way to instantiate the derived implementation class from the interface class. You could do that by providing a static "Create" function that has its implementation in the CPP file.
ie
InterfaceClass* InterfaceClass::Create()
{
return new ImplementationClass;
}
This way you effectively hide the implementation from any outside user. You can't, however, create the class on the stack only on the heap ... but it does solve your problem AND provides a better layer of abstraction. In the end though if you aren't prepared to do this you need to stick with what you are doing.
I'm currently in the process of trying to organize my code in better way.
To do that I used namespaces, grouping classes by components, each having a defined role and a few interfaces (actually Abstract classes).
I found it to be pretty good, especially when I had to rewrite an entire component and I did with almost no impact on the others. (I believe it would have been a lot more difficult with a bunch of mixed-up classes and methods)
Yet I'm not 100% happy with it. Especially I'd like to do a better separation between interfaces, the public face of the components, and their implementations in behind.
I think the 'interface' of the component itself should be clearer, I mean a new comer should understand easily what interfaces he must implement, what interfaces he can use and what's part of the implementation.
Soon I'll start a bigger project involving up to 5 devs, and I'd like to be clear in my mind on that point.
So what about you? how do you do it? how do you organize your code?
Especially I'd like to do a better
separation between interfaces, the
public face of the components, and
their implementations in behind.
I think what you're looking for is the Facade pattern:
A facade is an object that provides a simplified interface to a larger body of code, such as a class library. -- Wikipedia
You may also want to look at the Mediator and Builder patterns if you have complex interactions in your classes.
The Pimpl idiom (aka compiler firewall) is also useful for hiding implementation details and reducing build times. I prefer to use Pimpl over interface classes + factories when I don't need polymorphism. Be careful not to over-use it though. Don't use Pimpl for lightweight types that are normally allocated on the stack (like a 3D point or complex number). Use it for the bigger, longer-lived classes that have dependencies on other classes/libraries that you'd wish to hide from the user.
In large-scale projects, it's important to not use an #include directives in a header file when a simple forward declaration will do. Only put an #include directives in a header file if absolutely necessary (prefer to put #includes in the implementation files). If done right, proper #include discipline will reduce your compile times significantly. The Pimpl idiom can help to move #includes from header files to their corresponding implementation files.
A coherent collection of classes / functions can be grouped together in it's own namespace and put in a subdirectory of your source tree (the subdirectory should have the same name as the library namespace). You can then create a static library subproject/makefile for that package and link it with your main application. This is what I'd consider a "package" in UML jargon. In an ideal package, classes are closely related to each other, but loosely related with classes outside the package. It is helpful to draw dependency diagrams of your packages to make sure there are no cyclical dependencies.
There are two common approaches:
If, apart from the public interface, you only need some helper functions, just put them in an unnamed namespace in the implementation file:
// header:
class MyClass {
// interface etc.
};
// source file:
namespace {
void someHelper() {}
}
// ... MyClass implementation
If you find yourself wanting to hide member functions, consider using the PIMPL idiom:
class MyClassImpl; // forward declaration
class MyClass {
public:
// public interface
private:
MyClassImpl* pimpl;
};
MyClassImpl implements the functionality, while MyClass forwards calls to the public interface to the private implementation.
You might find some of the suggestions in Large Scale C++ Software Design useful. It's a bit dated (published in 1996) but still valuable, with pointers on structuring code to minimize the "recompiling the world when a single header file changes" problem.
Herb Sutter's article on "What's In a Class? - The Interface Principle" presents some ideas that many don't seem to think of when designing interfaces. It's a bit dated (1998) but there's some useful stuff in there, nonetheless.
First declare you variables you may use them in one string declaration also like so.
char Name[100],Name2[100],Name3[100];
using namespace std;
int main(){
}
and if you have a long peice of code that could be used out of the program make it a new function.
likeso.
void Return(char Word[100]){
cout<<Word;
cin.ignore();
}
int main(){
Return("Hello");
}
And I suggest any outside functions and declarations you put into a header file and link it
likeso
Dev-C++ #include "Resource.h"
What is best practice for C++ Public API?
I am working on a C++ project that has multiple namespaces, each with multiple objects. Some objects have the same names, but are in different namespaces. Currently, each object has its own .cpp file and .h file. I am not sure how to word this... Would it be appropriate to create a second .h file to expose only the public API? Should their be a .h file per namespace or per object or some other scope? What might be a best practice for creating Public APIs for C++ libraries?
Thanks For Any Help,
Chenz
It is sometimes convenient to have a single class in every .cpp and .h pair of files and to have the namespace hierarchy as the directory hierarchy.
For instance if you have this class:
namespace stuff {
namespace important {
class SecretPassword
{
...
};
}
}
then it will be in two files:
/stuff/important/SecretPassword.cpp
/stuff/important/SecretPassword.h
another possible layout might be:
/src/stuff/important/SecretPassword.cpp
/include/stuff/important/SecretPassword.h
G'day,
One suggestion is to take a look at the C++ idiom of Handle-Body, sometimes known as Cheshire Cat. Here's James Coplien's original paper containing the idiom.
This is a well known method for decoupling public API's from implementations.
HTH
I'd say it's best decided by you, and the type of 'library' this is.
Is your API provides one "Action"? or handles only one abstract "Data type"? examples for this would be zlib and libpng. Both have only one header that gives everything that is needed to perform what the libraries are for.
If your library is a collection of unrelated (or even related) classes that do, or not, the same goal, then provide each subset with it's own header. Major example for this will be boost.
Here is what I'm used to do:
"Some objects have the same names, but are in different namespaces"
That's why namespaces exist.
"Would it be appropriate to create a second .h file to expose only the public API? "
You always should expose only the public API. But what means to expose public API? If it would be only to headers then, since public API relies on private API, the private API would be included by public API anyway. To expose a public API mark public functions/classes with a macro (which in case of Windows exports public functions to the symbol table; and probably it will be soon adopted by Unix systems). So you should define a macro like MYLIB_API or MYLIB_DECLSPEC, just check some existing libraries and MS declspec documentation. It is sufficient, usually non-public API will be kept in subdirectories so it doesn't attend library's user.
"Should their be a .h file per namespace or per object or some other scope?"
I prefer Java-style, one public class per header. I found that libs written in this way are far more clean and readable than those which are mixing file and structure names. But there are cases when I brake this rule, especially when it comes to templates. In such cases I give #warning message to not include header directly and carefully explain in comments what is going on.
Great response LiraNuna.
Are you providing an API to an application or a library?
An application API will generally only provide methods that either query the state of an application or attempt to alter that state. In this case you'd generally have separate interface declarations in a separate header file. Your objects would then implement this interface to handle API requests.
A library will generally expose objects that can be re-used. In this case, generally speaking, your API is simply the public methods in the header file.
I agree with what doc said - one class per file. 99.9% of the time!
Also, consider what filenames to use. It's generally a bad idea to have multiple headers of the same name in different directories, even though the classes they contain may well be in different namespaces.
Especially if this is a public API, you probably cannot control what include paths are defined by the user of your library, so a build may find the wrong one. Tweaking the order of include paths would definitely not be a solution I'd recommend!
We use a naming convention of Namespace-Class.h to explicitly identify classes in files.
I am fairly new to C++ and I have seen a bunch of code that has method definitions in the header files and they do not declare the header file as a class. Can someone explain to me why and when you would do something like this. Is this a bad practice?
Thanks in advance!
Is this a bad practice?
Not in general. There are a lot of libraries that are header only, meaning they only ship header files. This can be seen as a lightweight alternative to compiled libraries.
More importantly, though, there is a case where you cannot use separate precompiled compilation units: templates must be specialized in the same compilation unit in which they get declared. This may sound arcane but it has a simple consequence:
Function (and class) templates cannot be defined inside cpp files and used elsewhere; instead, they have to be defined inside header files directly (with a few notable exceptions).
Additionally, classes in C++ are purely optional – while you can program object oriented in C++, a lot of good code doesn't. Classes supplement algorithms in C++, not the other way round.
It's not bad practice. The great thing about C++ is that it lets you program in many styles. This gives the language great flexibility and utility, but possibly makes it trickier to learn than other languages that force you to write code in a particular style.
If you had a small program, you could write it in one function - possibly using a couple of goto's for code flow.
When you get bigger, splitting the code into functions helps organize things.
Bigger still, and classes are generally a good way of grouping related functions that work on a certain set of data.
Bigger still, namespaces help out.
Sometimes though, it's just easiest to write a function to do something. This is often the case where you write a function that only works on primitive types (like int). int doesn't have a class, so if you wanted to write a printInt() function, you might make it standalone. Also, if a function works on objects from multiple classes, but doesn't really belong to one class and not the other, that might make sense as a standalone function. This happens a lot when you write operators such as define less than so that it can compare objects of two different classes. Or, if a function can be written in terms of a classes public methods, and doesn't need to access data of the class directly, some people prefer to write that as a standalone function.
But, really, the choice is yours. Whatever is the most simple thing to do to solve your problem is best.
You might start a program off as just a few functions, and then later decide some are related and refactor them into a class. But, if the other standalone functions don't naturally fit into a class, you don't have to force them into one.
An H file is simply a way of including a bunch of declarations. Many things in C++ are useful declarations, including classes, types, constants, global functions, etc.
C++ has a strong object oriented facet. Most OO languages tackle the question of where to deal with operations that don't rely on object state and don't actually need the object.
In some languages, like Java, language restrictions force everything to be in a class, so everything becomes a static member function (e.g., classes with math utilities or algorithms).
In C++, to maintain compatibility with C, you are allowed to declare standalone C-style functions or use the Java style of static members. My personal view is that it is better, when possible, to use the OO style and organize operations around a central concept.
However, C++ does provide the namespaces facilities and often it is used in the same way that a class would be used in those situations - to group a bunch of standalone items where each item is prefixed by the "namespace" name. As others point out, many C++ standard library functions are located this way. My view is that this is much like using a class in Java. However, others would argue that Java uses classes because it doesn't have namespaces.
As long as you use one or the other (rather than a floating standalone non-namespaced function) you're generally going to be ok.
I am fairly new to C++ and I have seen a bunch of code that has method definitions in the header files and they do not declare the header file as a class.
Lets clarify things.
method definitions in the header files
This means something like this:
file "A.h":
class A {
void method(){/*blah blah*/} //definition of a method
};
Is this what you meant?
Later you are saying "declare the header file". There is no mechanism for DECLARING a file in C++. A file can be INCLUDED by witing #include "filename.h". If you do this, the contents of the header file will be copied and pasted to wherever you have the above line before anything gets compiled.
So you mean that all the definitions are in the class definition (not anywhere in A.h FILE, but specifically in the class A, which is limited by 'class A{' and '};' ).
The implication of having method definition in the class definition is that the method will be 'inline' (this is C++ keyword), which means that the method body will be pasted whenever there is a call to it. This is:
good, because the function call mechanism no longer slows down the execution
bad if the function is longer than a short statement, because the size of executable code grows badly
Things are different for templates as someone above stated, but for them there is a way of defining methods such that they are not inline, but still in the header file (they must be in headers). This definitions have to be outside the class definition anyway.
In C++, functions do not have to be members of classes.
Assuming a largish template library with around 100 files containing around 100 templates with overall more than 200,000 lines of code. Some of the templates use multiple inheritance to make the usage of the library itself rather simple (i.e. inherit from some base templates and only having to implement certain business rules).
All that exists (grown over several years), "works" and is used for projects.
However, compilation of projects using that library consumes a growing amount of time and it takes quite some time to locate the source for certain bugs. Fixing often causes unexpected side effects or is quite difficult, because some interdependent templates need changing. Testing is nearly impossible due to the sheer amount of functions.
Now, I would really like to simplify the architecture to use less templates and more specialized smaller classes.
Is there any proven way to go about that task? What would be a good place to start?
I'm not sure I see how/why templates are the problem, and why plain non-templated classes would be an improvement. Wouldn't that just mean even more classes, less type safety and so larger potential for bugs?
I can understand simplifying the architecture, refactoring and removing dependencies between the various classes and templates, but automatically assuming that "fewer templates will make the architecture better" is flawed imo.
I'd say that templates potentially allow you to build a much cleaner architecture than you'd get without them. Simply because you can make separate classes totally independent. Without templates, classes functions which call into another class must know about the class, or an interface it inherits, in advance. With templates, this coupling isn't necessary.
Removing templates would only lead to more dependencies, not fewer.
The added type-safety of templates can be used to detect a lot of bugs at compile-time (Sprinkle your code liberally with static_assert's for this purpose)
Of course, the added compile-time may be a valid reason to avoid templates in some cases, and if you only have a bunch of Java programmers, who are used to thinking in "traditional" OOP terms, templates might confuse them, which can be another valid reason to avoid templates.
But from an architecture point of view, I think avoiding templates is a step in the wrong direction.
Refactor the application, sure, it sounds like that's needed. But don't throw away one of the most useful tools for producing extensible and robust code just because the original version of the app misused it. Especially if you're already concerned with the amount of code, removing templates will most likely lead to more lines of code.
You need automated tests, that way in ten years time when your succesor has the same problem he can refactor the code (probably to add more templates because he thinks it will simplify usage of the library) and know it still meets all test cases. Similarly the side effects of any minor bug fixes will be immediately visible (assuming your test cases are good).
Other than that, "divide and conqueor"
Write unit tests.
Where the new code must do the same as the old code.
That's one tip at least.
Edit:
If you deprecate old code that you have replaced with the new functionality you
can phase over to the new code little by little.
Well, the problem is that template way of thinking is very different from object-oriented inheritance-based way. It's hard to answer anything else than "redesign the whole thing and start from scratch".
Of course, there may be a simple way for a particular case. We can't tell without knowing more about what you have.
The fact that the template solution is so difficult to maintain is an indication of a poor design anyway.
Some points (but note: these are not evil indeed. If you want to change to non-template code, though, this can help out):
Lookup your static interfaces. Where do templates depend on what functions exist? Where do they need typedefs?
Put the common parts in an abstract base class. A good example is when you happen to stumble over the CRTP idiom. You can just replace it with an abstract base class having virtual functions.
Lookup integer lists. If you find your code uses integral lists like list<1, 3, 3, 1, 3>, you can replace them with std::vector, if all the codes using them can live with working with runtime values instead of constant expressions.
Lookup type traits. There is much code involved checking whether some typedef exists, or whether some method exists in typical templated code. Abstract baseclasses solve these two issues by using pure virtual methods, and by inheriting typedefs to the base. Often, typedefs are only needed to trigger hideous features like SFINAE, which would then be superfluous too.
Lookup expression templates. If your code uses expression templates to avoid creating temporaries, you will have to eliminate them and use the traditional way of returning / passing temporaries to the operators involved.
Lookup function objects. If you find your code uses function objects, you can change them to use abstract base classes too, and have something like void run(); to call them (or if you want to keep using operator(), better so! It can be virtual too).
As I understand, you are most concerned with build times, and the maintainability of your library?
First, don't try to "fix" all at once.
Second, understand what you fix. Template complexity is there often for a reason, e.g. to enforce certain use, and make the compiler help you not make a mistake. That reason might sometimes be taken to far, but throwing out 100 lines because "noone really knows what they do" shouldn't be taken lightly. Everything I suggest here can introduce really nasty bugs, you have been warned.
Third, consider cheaper fixes first: e.g. faster machines or distributed build tools. At least, throw in all the RAM the boards will take, and throw out old disks. It does maike a difference. One drive for OS, one drive for build is a cheap mans RAID.
Is the library well documented? That's your best chance at making it Look into tools such as doxygen that help you create such a documentation.
All considered? OK, now some suggestions for the build times ;)
Understand the C++ build model: every .cpp is compiled individually. That means many .cpp files with many headers = huge build. This is NOT an advise to put everything into one .cpp file, though! However, one trick (!) that can speed up a build immensely is to create a single .cpp file that includes a bunch of .cpp files, and only feed that "master" file to the compiler. You can't do that blindly, though - you need to understand the types of errors this could introduce.
If you don't have one yet, get a separate build machine that you can remote into. You'll have to do a lot of almost-full builds to check if you broke some include. You will want to run this in another machine, that doesn't block you from working on something else. Long term, you'll need it for daily integration builds anyway ;)
Use precompiled headers. (scales better with fast machines, see above)
Check your header inclusion policy. While every file should be "independent" (i.e. include everything it needs to be included by someone else), don't include liberally. Unfortunately, I haven't yet found a tool to find unnecessary #incldue statements, but it might help to spend some time removing unused headers in "hotspot" files.
Create and use forward declarations for the templates you use. Often, you can incldue a header with forwad declarations in many places, and use the full header only in a few specific ones. This can greatly help compile time. Check the <iosfwd> header how the standard library does that for i/o streams.
overloads for templates for few types: If you have a complex function template that is useful only for a very few types like this:
// .h
template <typename FLOAT> // float or double only
FLOAT CalcIt(int len, FLOAT * values) { ... }
You can declare the overloads in the header, and move the template to the body:
// .h
float CalcIt(int len, float * values);
double CalcIt(int len, double * values);
// .cpp
template <typename FLOAT> // float or double only
FLOAT CalcItT(int len, FLOAT * values) { ... }
float CalcIt(int len, float * values) { return CalcItT(len, values); }
double CalcIt(int len, double * values) { return CalcItT(len, values); }
this moves the lengthy template to a single compilation unit.
Unfortunately, this is only of limited use for classes.
Check if the PIMPL idiom can move code from the headers into .cpp files.
The general rule that hides behind that is separate the interface of your library from the implementation. Use comments, detail namesapces and separate .impl.h headers to mentally and physically isolate what should be known to the outside from how it is accomplished. This exposes the real value of your library (does it actually encapsulate complexity?), and gives you a chance to replace "easy targets" first.
More specific advise - and how useful the one given is - depends largely on the actual library.
Good luck!
As mentioned, unit tests are a good idea. Indeed, rather than breaking your code by introducing "simple" changes that are likely to ripple out, just focus on creating a suite of tests, and fixing non-compliance with the tests. Have an activity to update the tests when bugs come to light.
Beyond that, I would suggest upgrading your tools, if possible, to help with debugging template-related problems.
I've often come across legacy templates that were huge and required a lot of time and memory to instantiate, but didn't need to be. In those cases, the easiest way to cut out the fat was to take all of the code that didn't rely on any of the template arguments and hide it in separate functions defined in a normal translation unit. This also had the positive side-effect of triggering fewer recompiles when this code had to be slightly modified or documentation changed. It sounds rather obvious, but it's really surprising how often people write a class template and think that EVERYTHING it does has to be defined in the header, rather than just the code that needs the templated information.
Another thing you might want to consider is how often you clean up the inheritance hierarchies by making the templates "mixin" style instead of aggregations of multiple inheritance. See how many places you can get away with making one of the template arguments the name of the base class that it should derive from (the way boost::enable_shared_from_this works). Of course this typically only works well if the constructors take no arguments, as you don't have to worry about initializing anything correctly.