What is the best header structure to use in a library? - c++

Concerning headers in a library, I see two options, and I'm not sure if the choice really matters. Say I created a library, let's call it foobar. Please help me choose the most appropriate option:
Option 1: Have one include in the very root of the library project, let's call it foobar.h, which includes all of the headers in the library, such as "src/some_namespace/SomeClass.h" and so on. Then from outside the library, in any file that wants to use anything from the foobar library, just #include <foobar.h>.
Option 2: Don't have a main include, and instead include only the headers needed in the places where they are used, so I may have a whole bunch of includes in a source file. Since I sometimes use namespaces as deep as three levels, including the individual headers feels like a bit of a chore.
I've opted for option 1 because of how easy it is to implement. OpenGL and many other libraries seem to do this, so it seemed sensible. However, the standard C++ library can require me to include several headers in any given file, so why didn't they just have one header file? Unless it's me being an idiot, and they're separate libraries...
Update:
Further to answers, I think it makes sense to provide both options, correct? I'd be pretty annoyed if I wanted to use a std::string but had to include a mass of header files; that would be silly. On the other hand, I'd be irritated if I had to type a mass of #include lines when I wanted to use most of a library anyway.
Forward headers:
Thanks to all that advised me of forward headers, this has helped me make the header jungle less complicated! :)
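For reference, a forward header is simply a header containing only forward declarations, in the spirit of the standard library's <iosfwd>. A minimal sketch, with hypothetical names:
// foobar_fwd.h - forward declarations only, cheap to include
#ifndef FOOBAR_FWD_H
#define FOOBAR_FWD_H
namespace foobar {
    class Widget;    // full definition lives in foobar/Widget.h
    class Renderer;  // full definition lives in foobar/Renderer.h
}
#endif // FOOBAR_FWD_H
Code that only passes foobar::Widget* or foobar::Widget& around can include just this; code that actually uses the classes includes the full headers.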

The STL, Boost, and other libraries that have a lot of header files provide you with independent tools, and you can use them independently.
So if your library is a set of loosely coupled tools, you should give users the choice to include them as separate parts as well as to include the whole library as one file.

Think a bit about how your library will be used, and organize it that way. If someone is unlikely to use one small part without using the whole thing, structure it as one big include. If a small part is independent and useful on its own, make sure you can include just enough for that part. If there's some logical grouping that makes sense, create include files for each group.
As with most programming questions, there's no one-size-fits-all answer.

All #included headers have to be processed. This isn't as bad as it could be, since modern compilers provide some sort of option for not processing them repeatedly (perhaps with something like #pragma once, or an ifndef guard). Still, every #included header has to be processed once for each translation unit, and that can add up fast.
The usual practice is for header files to #include only those header files they need, and to use forward declarations (class foo;) as much as possible. That way, you don't get the overhead.
If you want to #include everything and its brother, you can provide your own header file that #includes everything. You don't have to explicitly write everything out in every header and source file. That option is something you can provide, but if everything in std came as one monolithic header, you wouldn't have an option.
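As an illustration, such a convenience header is nothing more than a list of includes. A minimal sketch, assuming hypothetical component headers:
// foobar.h - umbrella header pulling in the whole public API.
// Users who want everything include this one file; users who care about
// compile times include only the component headers they actually use.
#ifndef FOOBAR_H
#define FOOBAR_H
#include "foobar/string_utils.h"   // hypothetical component headers
#include "foobar/geometry.h"
#include "foobar/io.h"
#endif // FOOBAR_H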

Every time you #include a header file you make the compiler do some pretty hard work. The fewer headers you #include, the less work it has to do and the faster your compilations will be.

Every include file should make sense on its own. And you should choose the header structure from the library user's position: how will users use my library? What structure will be best for them?
Examples:
If your library provides string algorithms, it is better to make one header with everything: string_algorithms.h.
If your library provides a single facade object, it is better to use one header file (maybe a few other files with extensions or helpers).
If you provide a collection of objects that will be used independently, make separate header files (container libraries provide different containers).

Forward declare instead of including all those header files at once, then include them as and when you need them.

However you decide on the header file(s) that you make available (one, several or some combination thereof) for the library's public API, it's always a good idea to have at least one separate header for the private API. (No need to expose the prototypes of the non-exported functions and classes or the definitions that are only intended to be used internally.)
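A minimal sketch of that split, with hypothetical file and function names: the public header exposes only what users need, while an internal header stays inside the source tree and is never installed.
// include/foobar.h - public API header, shipped with the library
#ifndef FOOBAR_H
#define FOOBAR_H
namespace foobar {
    bool load(const char* path);   // exported entry point
}
#endif // FOOBAR_H

// src/foobar_internal.h - private API header, used only inside the library
#ifndef FOOBAR_INTERNAL_H
#define FOOBAR_INTERNAL_H
namespace foobar { namespace detail {
    bool parse_header_bytes(const unsigned char* data, unsigned long size);
} }
#endif // FOOBAR_INTERNAL_H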


Include directives in header file? [duplicate]

Obviously, there are two "schools of thought" as to whether to put #include directives into C++ header files (or, as an alternative, put #include only into cpp files). Some people say it's ok, others say it only causes problems. Does anybody know whether this discussion has reached a conclusion what is to be preferred?
I am not aware of any schools of thought concerning this. Put them in the header when they are needed there; otherwise forward declare and put them in the .cpp files that require them. There is no benefit in including headers where they are not needed.
What I found effective is following a few simple rules:
Headers shall be self-sufficient, i.e., they shall declare classes they need names for and include headers for any definition they use.
Headers should minimize dependencies as much as possible without violating the previous point.
Getting the first point right is fairly easy: include the header first thing from the source file implementing what it declares. Getting the second point exactly right isn't trivial, though, and I think it requires tool support to get it exactly right. However, a few unnecessary dependencies generally aren't that bad.
As a rule of thumb, you don't include headers in a header unless their full definition is necessary there. Most of the time you only deal with pointers (or references) to classes in a header file, so it's just fine to forward declare them there.
I think the issue was settled a long time ago: headers should be self-contained (that is, they should not depend on the user having included other headers before; that aspect has been settled for so long that some aren't even aware there was a debate about it, but your "put includes only in .cpp" option seems to hint at it) but minimal (i.e. they should not include definitions when a declaration would be enough for self-containment).
The reason for self-containment is maintenance: should a header be modified and now depend on something new, you'd have to track down every place it is used to add the new dependency. BTW, the standard trick to ensure self-containment is to include the header that provides the declarations for the things defined in a .cpp first in that .cpp.
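To illustrate both rules, a minimal sketch with hypothetical names: the header is self-contained (it includes <string>, which it needs for a by-value member) yet minimal (it only forward declares Logger, which it uses through a pointer), and the .cpp includes its own header first to prove self-containment.
// report.h
#ifndef REPORT_H
#define REPORT_H
#include <string>   // needed: a std::string member requires the full definition

class Logger;       // enough: only a pointer to Logger is used in this header

class Report {
public:
    explicit Report(Logger* log);
    std::string summary() const;
private:
    Logger* log_;
    std::string title_;
};
#endif // REPORT_H

// report.cpp
#include "report.h"   // included first: proves report.h is self-contained
#include "logger.h"   // hypothetical header with the full Logger definition

Report::Report(Logger* log) : log_(log), title_("untitled") {}
std::string Report::summary() const { return "Report: " + title_; }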
These are not schools of thought so much as religions. In reality, both approaches have their advantages and disadvantages, and there are certain practices to be followed for either approach to be successful. But only one of these approaches will "scale" to large projects.
The advantage of not including headers inside headers is faster compilation. However, this advantage does not come from headers being read only once, because even if you include headers inside headers, smart compilers can work that out. The speed advantage comes from the fact that you include only those headers which are strictly necessary for a given source file. Another advantage is that if we look at a source file, we can see exactly what its dependencies are: the flat list of header files gives that to us plainly.
However, this practice is hard to maintain, especially in large projects with many programmers. It's quite an inconvenience when you want to use module foo, but you cannot just #include "foo.h": you need to include 35 other headers.
What ends up happening is this: programmers are not going to waste their time discovering the exact, minimal set of headers that they need just to add module foo. To save time, they will go to some example source file similar to the one they are working on, and cut and paste all of the #include directives. Then they will try compiling it, and if it doesn't build, then they will cut and paste more #include directives from yet elsewhere, and repeat that until it works.
The net result is that, little by little, you lose the advantage of faster compiling, because your files are now including unnecessary headers. Moreover, the list of #include directives no longer shows the true dependencies. Moreover, when you do incremental compiles now, you compile more than is necessary due to these false dependencies.
Once every source file includes nearly every header, you might as well have a big everything.h which includes all the headers, and then #include "everything.h" in every source file.
So this practice of including just specific headers is best left to small projects that are carefully maintained by a handful of developers who have plenty of time to maintain the ethic of minimal include dependencies by hand, or write tools to hunt down unnecessary #include directives.

Header and cpp or just cpp files - best practice?

Looking around at different code bases I see a variety of styles:
Class "interfaces" defined in a header file and the actual implementation in a cpp file. In this approach the headers look well defined and easy to read, but the cpp files look confusing, as they're just a list of methods.
The second approach I see is just to put everything in a single class cpp file. These class files contain the definition and the actual method implementations in the body of the class definition. This approach looks better to me (more like Java and C#).
Which style should I be using?
For all but the simplest programs, style #2 is simply impossible. If you #include a .cpp file with function definitions from multiple other .cpp files, the definitions get added to multiple object files (.o / .obj) and the linker will complain about clashing symbols.
Use style #1 and learn to live with the confusion.
The former - interfaces in header files and class bodies in implementation files. You'll find this causes you fewer problems when working on large systems.
See also: In C++ why have header files and cpp files?
C++ doesn't have "interfaces"; it uses classes (base/derived classes). I use one file to define a class and its implementation methods if the project is small, and separate files if the project is large.
In Java, I pack them up into one package, then import it when needed.
Since you tagged this c++, go for the first style. I don't find it confusing; for a Java programmer it may seem different, but in C++ you are always going to use this approach.
In fact, in my favorite IDE (MSVS), I open the header file and the cpp file side by side. It makes looking up prototypes and class declarations easy.
And when you have a dozen classes, a dozen .h files and another dozen .cpp files will make your work simpler. When you just want to see what a class does, you open the relevant .h file and take a look at the class members and maybe a few short comments. You don't need to wade through pages of implementation code.
Conclusion: the style options you gave are an option only for small code, typically a single file with very few methods. Otherwise it is not even an option. (@Thomas has given the reason why #2 is not even an option.)
Header (HPP):
The header includes the declarations of your code, particularly function declarations. Technically speaking, classes are defined in header files, but again, the member functions are just declared.
Code in other files will include just this header and obtain all necessary information from there.
Implementation (CPP):
The implementation includes the definition of functions, member-functions and variables.
Rationale:
Header files give a developer (an external user of your code) a plain overview and offer just the externally available code (i.e. easy to read, only the information necessary for users).
Header files allow the compiler to check the implementation for correctness.
Header files allow the compiler to check external code for correctness.
Header files allow for separate compilation. Keep in mind that in former times computers didn't have enough resources to keep everything in main memory during a compilation process. Header files are small, while implementation files are big.
Use style #1, even for simple programs, so you can easily learn to work with it. It may look outdated today, especially against the background of modern multi-pass compilers, but separate header files are beneficial even today. There are rumours about the next C++ standard; as far as I know, something like symbol export (as in Java or C#) will be possible. But don't nail me down on this!
Notes:
- member functions which are defined inside a class are implicitly inline; normally you don't want this
- always use include guards
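A minimal sketch of that header/implementation division, with hypothetical names, which also illustrates the two notes above: the include guard protects against double inclusion, and the small accessor defined inside the class body is implicitly inline, while the larger member function is only declared in the header and defined in the .cpp.
// counter.hpp
#ifndef COUNTER_HPP
#define COUNTER_HPP

class Counter {
public:
    int value() const { return value_; } // defined in the class body: implicitly inline
    void addClamped(int delta);          // declared only; defined in counter.cpp
private:
    int value_ = 0;
};

#endif // COUNTER_HPP

// counter.cpp
#include "counter.hpp"

void Counter::addClamped(int delta) {
    // keep the counter non-negative
    value_ = (value_ + delta < 0) ? 0 : value_ + delta;
}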
If you are developing large project, you'll find the first approach helps you a lot. The second approach may help you in small project. As your project becomes larger, management of complexity is a big issue of software development, and the first approach turns out to be a better choice.
What I do is:
write .cpp files, with the method names prefixed with the class name
in the .h file, create an empty class, with the appropriate name, then use a cogapp generator script, cog_addheaders.py, to insert the declarations, eg:
.cpp file: WeightsPersister.cpp
.h file: WeightsPersister.h
This way I get:
fast compilation (just needs to recompile the .cpp file, unless I change the class interface)
few issues with circular declarations
acceptably low tedious mindless manual work :-)

Include everything, Separate with "using"

I'm developing a C++ library. It got me thinking of the ways Java and C# handle including different components of the libraries. For example, Java uses "import" to allow use of classes from other packages, while C# simply uses "using" to import entire modules.
My questions is, would it be a good idea to #include everything in the library in one massive include and then just use the using directive to import specific classes and modules? Or would this just be down right crazy?
EDIT:
Good responses so far, here are a few mitigating factors which I feel add to this idea:
1) Internal #includes are kept as normal (short and to the point)
2) The file which includes everything is optionally supplied with the library to those who wish to use it
3) You could optionally make the big include file part of the pre-compiled header
You're confusing the purpose of #include statements in C++. They do not behave like import statements in Java or using statements in C#. #include does what it says; namely, loads and parses the entire indicated file as part of the current translation unit. The reason for the separate includes is to not have to spend compilation time parsing the entire standard library in every file. In contrast, the statements you're trying to make #include behave like are merely for programmer organization purposes.
#include is for management of the compilation process, not for separating uses. (In fact, you cannot use separate headers to enforce separate uses, because doing so would violate the one definition rule.)
tl;dr -> No, you shouldn't do that. #include as little as possible. When your project becomes large, you'll thank yourself when you're not waiting many hours to compile your project.
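To make the distinction concrete, a small sketch: the #include is what actually brings the declarations into the translation unit; the using-declaration only shortens the name and would not compile on its own.
#include <vector>   // required: textually pulls std::vector's declaration into this TU

using std::vector;  // optional: name management only; it does not "import" any code

int main() {
    vector<int> v{1, 2, 3};   // works because of the #include, not because of the using
    return static_cast<int>(v.size());
}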
I would personally recommend only including the headers when you need them, to explicitly show which functionality your file requires. At the same time, doing so will prevent you from gaining access to functionality you might not necessarily want, e.g. functions unrelated to the goal of the file. Sure, this is no big deal, but I think it's easier to maintain and change code when you don't have access to unnecessary functions/classes; it just makes it more straightforward.
I might be downvoted for this, but I think you bring up an interesting idea. It would probably slow down compilation a bit, but I think the concept is neat.
As long as you used using sparingly — only for the namespaces you need — other developers would be able to get an idea of what classes were used in a file by glancing at the top. It wouldn't be as granular as seeing a list of #included files, but is seeing a list of included header files really very useful? I don't think so.
Just make sure that all of the header files all use inclusion guards, of course. :)
As said by @Billy ONeal, the main thing is that #include is a preprocessor directive that causes a textual copy-paste of code, which increases compile times.
The best policy in C++ is considered to be to forward declare all possible classes in ".h" files and include them only in the ".cpp" file. It isolates dependencies, since a C/C++ project will be rebuilt in cascade if a dependent include file is changed.
Of course, Microsoft compilers and their precompiled headers tend to do the opposite, close to what you suggest. But anyone who has tried to port code across those compilers is well aware of how badly that can go.
Some libraries like Qt make extensive use of forward declarations. Take a look at it to see if you like its taste.
I think it will be confusing. When you write C++ you should avoid making it look like Java or C# (or C :-). I for one would really wonder why you did that.
Supplying an include-all file isn't really that helpful either, as a user could easily create one herself, with just the parts of the library actually used. It could then be added to a precompiled header, if one is used.

Include File Ordering Strategy

I've seen fairly consistent advice that an implementation file (.cc / .cpp) should include its corresponding class definition file first, before including other header files. But when the topic shifts to header files themselves, and the order of includes they contain, the advice seems to vary.
Google coding standards suggest:
1. dir2/foo2.h (preferred location — see details below).
2. C system files.
3. C++ system files.
4. Other libraries' .h files.
5. Your project's .h files.
It is unclear what the difference is between entry 1 and 5 above, and why one or the other location would be chosen. That said, another online guide suggests this order (found in the "Class Layout" section of that doc):
1. system includes
2. project includes
3. local includes
Once again there is an ambiguity, this time between items 2 and 3. What is the distinction? Do those represent inter-project and intra-project includes?
But more to the point, it looks as if both proposed coding standards are suggesting "your" header files are included last. Such advice, being backwards from what is recommended for include-ordering in implementation files, is not intuitive. Would it not make sense to have "your" header files consistently listed first - ahead of system and 3rd party headers?
The order you list your includes shouldn't matter from a technical point of view. If you designed it right, you should be able to put them in any order you want and it will still work. For example, if your foo.h needs <string>, it should be included inside your foo.h so you don't have to remember that dependency everywhere you use foo.
That being said, if you do have order dependencies, most of the time putting your definition file last will fix it. That's because foo.h depends on <string>, but not the other way around.
You might think that makes a good case for putting your definition file last, but it's actually quite the opposite. If your coding standards require the definition first, your compiler is more likely to catch incorrect order dependencies when they are first written.
I'm not aware of any verbatim standard, but as a general rule of thumb, include as few headers as possible, especially within other header files, to reduce compile times, conflicts, and dependencies. I'm a fan of using forward declarations of classes in header files and only including the header with the full definition on the .cpp side whenever I can afford to do so.
That said my personal preference is below:
For Headers:
C++ headers
3rd party headers
other project headers
this project's headers
For Source:
precompiled header file
this source file's header
C++ headers
3rd party headers
other project headers
this project's headers
Pointers or suggestions are usually about avoiding conflicts and circular references; otherwise it's all personal preference or whatever policy you prefer to adhere to for collaborative projects.
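A sketch of a source file following that order; every file name here is hypothetical and only the ordering is the point:
// widget.cpp
#include "pch.h"                 // precompiled header file (if the project uses one)
#include "widget.h"              // this source file's header
#include <algorithm>             // C++ headers
#include <vector>
#include <boost/optional.hpp>    // 3rd party headers
#include "common/logging.h"      // other project headers
#include "widget_style.h"        // this project's headers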
Regarding Google's style:
There is no ambiguity, at all.
The first header included should be the header related to this source file, thus in position 1. This way you make sure that it includes anything it needs and that there is no "hidden" dependency: if there is, it'll be exposed right away and prevent compilation.
The other headers are ordered from those you are least likely to be able to change if an issue occurs to those you are most likely to be able to change. An issue could be an identifier clash, a leaking macro, etc.
By definition the C and C++ system headers are very rarely altered, simply because there are so many people using them, thus they come second.
3rd party code can be changed, but it's generally cumbersome and takes time, thus it comes third.
The "project includes" refer to project-wide includes, generally home-grown libraries (middleware) that are used by several projects. They can be changed, but this would impact the other projects as well, so they come fourth.
And finally the "local includes", that is, those files which are specific to this project and can be changed without affecting anyone else. In case of an issue, those are the prime candidates, so they come last.
Note that you can in fact have many more layers (especially in a software shop), the key idea is to order the dependencies starting from the bottom layer (system libs) to the top layer.
Within a given layer, I tend to organize them by alphabetical order, because it's easier to check them.
For Headers:
this project's headers
other project headers
3rd party headers
C++ headers
For Source:
this source file's header
this project's headers
other project headers
3rd party headers
C++ headers
This order minimizes the chance of missing a required header inside an .hpp file. It also minimizes unintended interactions between 3rd party headers, and every .hpp module compiles with the minimum required dependencies.
For example:
test.hpp:
// BUG: missing #include <string>
void test(std::string& s);
test.cpp:
#include <string>
#include "test.hpp"
// compiles: including <string> first hides the missing include in test.hpp
test2.cpp:
#include "test.hpp"
#include <string>
// compilation error: the missing <string> include in test.hpp is exposed

Benefits of splitting interface and implementation in C++

I'm using C++ and I'm considering putting my function implementations into the .h. I know that the .h file is for declarations and the .cpp is for implementations, but how will splitting all files into headers and sources benefit me? Well, if my aim were to create a static or dynamic library then of course that would make a difference, but I am creating this code for myself and not planning to make a library out of it. So is there any other benefit from splitting source from definition?
The obvious goal is to reduce coupling: as soon as you change a header file, anything that includes it must be recompiled. This can rapidly have a strong impact on compilation times (even in a small project).
You can put almost all code into the .h file; it will be a header-only library. But if you want faster partial recompilation, or if you want to put some code into a shared library, you should create .cpp files.
Depending on the size of your project, it will save you compile time and make it possible to keep track of all resources etc. (unless you put everything into one single file).
The better your header files are organized, the less work your compiler has to do to apply changes. Also, looking in a small header file for some forgotten parameter information is a lot easier than scrolling through a whole cpp file.
One other obvious improvement is in avoiding re-compiling the code for your function in each file that uses it, instead compiling it once and using it where needed.
Another is that it follows convention (and the standard's one definition rule), so others will find it much easier to deal with and understand.
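A small sketch of that point, with hypothetical names: a non-inline function defined in a header that is included from two .cpp files produces two definitions, and the linker rejects it; declaring it in the header and defining it once in a .cpp (or marking it inline) avoids the problem.
// util.h
#ifndef UTIL_H
#define UTIL_H

// OK: declaration only; the single definition lives in util.cpp
int clamp_percent(int x);

// Would break: a non-inline definition here would be duplicated in every .cpp
// that includes util.h, and linking a.cpp and b.cpp together would then fail
// with a "multiple definition of clamp_percent" style error.
// int clamp_percent(int x) { return x < 0 ? 0 : (x > 100 ? 100 : x); }

#endif // UTIL_H

// util.cpp
#include "util.h"
int clamp_percent(int x) { return x < 0 ? 0 : (x > 100 ? 100 : x); }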
It depends on the size of the project. Up to about 500 LOC, I tend to put everything in a single file, with the function definitions in the class. Except that up to about 500 LOC, I generally use a simpler language than C++; something like AWK.
As soon as the code gets big enough to warrant several source files, it's definitely an advantage to put as little as possible in the header, and that means putting all of the function definitions in the source files. And as soon as the classes become non-trivial, you probably don't want the function definitions in the class itself, for readability reasons.
-- James Kanze