When should I consider making a library header-only? - c++

Obviously template libraries need to be header only, but for non-templates, when should you make things header-only?

If you think your non-template library could be header-only, consider dividing it into two files anyway, then providing a third file that includes both the .h and the .cpp (with an include guard).
Then anyone who uses your library in a lot of different TUs, and suspects that this might be costing a lot of compile time, can easily make the change to test it.
Once you know users have the option which way to use the library, the answer probably becomes "offer that option whenever you possibly can". So pretty much any time that including it from multiple TUs wouldn't violate the ODR. For instance, if your non-static free functions refer to static globals, then you're out of luck, since the different definitions of that function in different TUs would refer to different objects by the same name, which is an ODR-violation.

You could follow Boost.Asio's lead.
They simply provide the two versions of the libraries: header-only and header + library.
They do it with a single macro to be defined (or not) before including their headers. I think the default (if not defined) is to use the header-only version.
See Optional Separate Compilation.
Note how they neatly provide a single source file to be compiled that defines everything, or the option to link against a dynamically loaded library.
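A minimal sketch of such a single-macro scheme, collapsed into one listing for readability. The names mylib, MYLIB_SEPARATE_COMPILATION and MYLIB_DECL are hypothetical and are not Boost.Asio's actual macros; in a real library the three marked parts would live in separate files.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// ---- mylib/config.hpp (hypothetical) ---------------------------------
// Default: header-only, so every definition must be inline.
// If the user defines MYLIB_SEPARATE_COMPILATION before including the
// headers, definitions become ordinary external functions that are
// compiled once, through a single source file such as mylib/src.cpp.
#if defined(MYLIB_SEPARATE_COMPILATION)
#  define MYLIB_DECL
#else
#  define MYLIB_DECL inline
#endif

// ---- mylib/checksum.hpp: declarations only ---------------------------
namespace mylib {
MYLIB_DECL std::uint32_t checksum(const unsigned char* data, std::size_t size);
}

// ---- mylib/impl/checksum.ipp: definitions ----------------------------
// Header-only mode: checksum.hpp would include this file at its end.
// Separate-compilation mode: only mylib/src.cpp would include it.
namespace mylib {
MYLIB_DECL std::uint32_t checksum(const unsigned char* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < size; ++i) {
        sum = sum * 31u + data[i];
    }
    return sum;
}
}
```

With this layout a user who suspects the header-only form is costing compile time only has to define the macro and add one source file to the build to test it.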

Template libraries need not be header-only: an implementation may well contain pieces that are independent of the template parameters and that, for some reason (e.g. smaller code size), are separated into a compiled binary.
I cannot imagine a case where a non-template library really must be header-only. However sometimes it might be reasonable performance-wise to allow inlining of all the code. An example can be a library of wrappers around platform-specific interfaces, e.g. for things like synchronization primitives, thread-local storage, platform- and compiler-specific implementation of atomic operations etc.
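One way to picture the first point is a thin template that forwards to a non-template core. The sketch below uses hypothetical names (detail::find_bytes, index_of) and assumes a bytewise comparison is acceptable for the element type; only the wrapper would have to stay in the header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

namespace detail {
// Independent of any template parameter: in a real library this body
// could live in a .cpp file and be compiled once into the library binary.
// It is marked inline here only so that the single listing is valid.
// Returns the index of the first element bytewise equal to *value,
// or count if there is none.
inline std::size_t find_bytes(const void* data, std::size_t count,
                              std::size_t elem_size, const void* value) {
    assert(elem_size > 0);
    const unsigned char* p = static_cast<const unsigned char*>(data);
    for (std::size_t i = 0; i < count; ++i) {
        if (std::memcmp(p + i * elem_size, value, elem_size) == 0) return i;
    }
    return count;
}
}  // namespace detail

// Thin template wrapper: the only part that must be visible to users.
template <typename T>
std::size_t index_of(const T* data, std::size_t count, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "bytewise comparison requires a trivially copyable type");
    return detail::find_bytes(data, count, sizeof(T), &value);
}
```

Each instantiation of index_of is a one-line forwarder, so the amount of code generated per type stays small.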

Without templates, you'd have actual definitions in the headers. That means that if two source files include your header, each one produces its own definition, and unless the functions are declared inline the program will fail to link because of multiple definitions.
In other words, putting definitions in headers is a very bad idea. You should stick to declarations only, and templates.
As for templates, the toolchain knows that the same header may be included in several translation units, so duplicate instantiations are merged at link time instead of causing errors.
EDIT: If you mean "keep everything inlined", I think this is a very bad approach. The header files become completely unreadable, and any change in implementation forces any user of your library to recompile everything.

Related

Why are function bodies in C/C++ placed in separate source code files instead of headers?

For instance, when I define a class in C++ I've always put the function bodies in the class header file (.h) along with the class definition. The source code file (.cpp) is the one with the main() function. Is this commonly done among professional C++ programmers, or do they follow the convention of separate header/source code files?
As for native C, I do notice this done in GCC (and of course in the headers in Visual Studio for Windows).
So is this just a convention? Or is there a reason for this?
Function bodies are placed into .cpp files to achieve the following:
To make the compiler parse and compile them only once, as opposed to forcing it to compile them again and again everywhere the header file is included. Additionally, when the implementation is in the header, the linker later has to detect and eliminate identical external-linkage functions arriving in different object files.
Header pre-compilation facilities implemented by many modern compilers might significantly reduce the wasted effort required for repetitive recompilation of the same header file, but they don't entirely eliminate the issue.
To hide the implementations of these functions from the future users of the module or library. Implementation hiding techniques help to enforce certain programming discipline, which reduces parasitic inter-dependencies between modules and thus leads to cleaner code and faster compilation times.
I'd even say that even if users have access to full source code of the library (i.e. nothing is really "hidden" from them), clean separation between what is supposed to be visible through header files and what is not supposed to be visible is beneficial to library's self-documenting properties (although such separation is achievable in header-only libraries as well).
To make some functions "invisible" to the outside world (i.e. internal linkage, not immediately relevant to your example with class methods).
Non-inline functions residing in a specific translation unit can be subjected to certain context-dependent optimizations. For example, two different functions with identical tail portions can end up "sharing" the machine code implementing these identical tails.
Functions declared as inline in header files are compiled multiple times in different translation units (i.e. in different contexts) and have to be eliminated by the linker later, which makes it more difficult (if at all possible) to take advantage of such optimization opportunities.
Other reasons I might have missed.
It is a convention but it also depends on the specific needs. For example if you are writing a library that you want the functionality to be fast (inline) and you are designing the library for others to use to be a simple header only library, then you can write all of your code within the header file(s) itself.
On the other hand, if you are writing a library that will be linked either statically or dynamically and you want to encapsulate internal object data from the user, your functions (class member functions, etc.) would be written so that they do what they are supposed to do without the user of your library having to worry about the implementation details, since that part is hidden. All users need to know about your functions and classes are their interfaces. In this case you would have both header files and implementation files.
If you place your function definitions in the header files along with their declarations (inside the class definition, or marked inline), they can be inlined and may run faster; however, your executable may be larger and they will have to be compiled every time the header is included. The implementation details are also exposed to the user.
If you place your function definitions in the header's related code file they will not be inline, your code will be smaller, it may run a little slower, but you should only have to compile them once. The implementation details are hidden and abstracted away from the user.
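A small sketch of the two placements, shown in one listing. Counter is a hypothetical class, and the comments mark which part would go in which file.

```cpp
#include <cassert>

// ---- counter.h -------------------------------------------------------
class Counter {
public:
    // Defined inside the class: implicitly inline, so every translation
    // unit that includes the header sees (and compiles) the body.
    int value() const { return value_; }

    // Declared only: the body is compiled once, in counter.cpp, and is
    // not visible to users of the header.
    void add(int amount);

private:
    int value_ = 0;
};

// ---- counter.cpp -----------------------------------------------------
// #include "counter.h"
void Counter::add(int amount) {
    assert(amount >= 0);  // precondition of this sketch
    value_ += amount;
}
```

Note that inline only permits the definition to appear in several translation units; whether a call is actually inlined remains the optimizer's decision.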
There is absolutely no reason to put function bodies in header files in 'c'. If the header file is included in multiple 'c' files, this would force the compiler to define the function multiple times. If the function is 'static', there will be multiple copies of it in the program, if it is global, the linker will complain.
Similar reasoning is for c++. The exception is for 'inline' members of the class and some template implementations.
If you define a temporary class in your 'cpp' file, it is perfectly ok to define it there and have function bodies defined inside the class.

Why are the definitions of functions separated from their declarations?

Why are the definitions of functions not written in the same "some.h" file together with their declarations? What will happen if we do not separate the "some.h" file from the "some.c" file?
So that one knows only the minimum one needs to know. (This makes compilation faster, name collisions less likely, code easier to manage, etc.) As mentioned in this comment by Kerrek SB: If you mix source and code, dividing big projects into modules is difficult.
For example, you can compile a library (containing the definitions) and give it to your clients (who care only about the interface) along with the declarations (headers), and they would be able to use it without needing to see the implementation source. (This way you can also hide the implementation details.)
Without headers, they wouldn't know how to use the functions available in the library. So, though not mandatory, it is recommended to keep the declarations and definitions separate.
In fact, separating declarations and definitions is not mandatory, but in practice the language and its tools were designed around separating usage from implementation, in order to promote abstraction, reusability, modularity, etc.
If you don't follow these rules then you will very rapidly face compilation problems with multiple definitions, etc.

How to publish header files with template implementations?

We are creating a set of libraries with a public API which is to be used by different third parties. Some of the libraries are pure C so obviously they have a C styled header with functions and struct definitions and the corresponding library. They are ok.
Some of the libraries are written with the usage of a moderately complex C++ (targeting older compilers), so there we have implemented some form of the famous pimpl idiom. This is ok too.
On the other end a significant part of the header files is C++ using heavily templated code. Knowing Why can templates only be implemented in the header file? but also not willing to disclose too much implementation details to eyes who are not supposed to see them we have heavily refactored them to exclude as much internal details as possible and having only the really necessary bits... and there is still a significant amount of code left.
So it puzzles me: Is there a preferred way of distributing header files which largely contain templates? What good practices, best approaches and tips and tricks are there?
Look at your C++ compiler's header files, for an inspiration. The standard C++ library is full of templates, and you will generally find all the template code in the headers.
Having said that, if particular templates are meant to be used with a small number of possible classes (or values) as template parameters, you do have an option of explicitly instantiating templates inside the library itself, leaving just the bare template declarations visible in the header files.
Using a simpler pre-C++11 scenario as an example, a C++ library will typically provide a std::basic_string implementation for only a std::basic_string<char> and std::basic_string<wchar_t>; and leave a bunch of template code inside the library itself, with just a bare std::basic_string template declaration visible in the header files.
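The explicit-instantiation approach could look roughly like the sketch below. Average is a hypothetical template, the two parts are shown in one listing, and the extern template declarations assume C++11 or later.

```cpp
#include <cassert>

// ---- average.h: what users of the library see --------------------------
template <typename T>
class Average {
public:
    void add(T sample);
    T mean() const;  // precondition: at least one sample was added

private:
    T sum_ = T();
    int count_ = 0;
};

// Explicit instantiation declarations: users' translation units do not
// instantiate these implicitly; the library binary provides them.
extern template class Average<int>;
extern template class Average<double>;

// ---- average.cpp: compiled into the library, not shipped as source -----
template <typename T>
void Average<T>::add(T sample) {
    sum_ += sample;
    ++count_;
}

template <typename T>
T Average<T>::mean() const {
    assert(count_ > 0);
    return sum_ / static_cast<T>(count_);
}

// Explicit instantiation definitions for the supported types only.
template class Average<int>;
template class Average<double>;
```

The trade-off is that users cannot instantiate Average with any other type, because the member definitions are not visible to them.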

Condensing Declaration and Implementation into an HPP file

I've read a few of the articles about the need / applicability / practicality of keeping headers around in C++ but I can't seem to find anywhere a solid reason why / when the above should or should not be done. I'm aware that boost uses .hpp files to deliver template functions to end users without the need for an associated .cpp file, and this thought is partially sourced off browsing through that code. It seems like this would be a convenient way to deliver single file modules of say a new Wt or Qt widget (still sticking to the one class per .h convention).
However, are there any negative technical implications of giving somebody a single .hpp file with both the declaration and the implementation, assuming you have no problem with them having access to the implementation (say in the context of OSS)? Does it, for instance, have any negative implications from the compiler's / linker's perspective?
Any opinions or perspectives on this would be appreciated.
"I'm aware that boost uses .hpp files to deliver template functions to end users without the need for an associated .cpp file"
Wrong verb: it’s not “without the need”, it’s “without the ability”.
If Boost could, they would separate their libraries into headers and implementation files. In fact, they do so wherever possible.
The reason for a clean separation is simple: compilation time for header-only projects increases tremendously because associated header files have to be read, parsed and compiled every time you recompile the tiniest part of your application.
Implementation files only need to be compiled if you happen to recompile that particular object file.
Large C and/or C++ projects take hours to compile, and these use a clean separation into header and object files. If they used only header files, I’m betting the compilation time would be measured in days instead of hours.
But for many of Boost’s libraries, the fact is that template definitions may not reside in a different compilation unit from their declarations, so this is simply not possible.
The major negative aspect of .hpp-only libraries is that they cannot refer to a precompiled module. All of the code present in the .hpp and hence all of the code in the library must be added to your application. This increases the size of the binary and makes for redundant binaries on such a system that uses the library more than once.
With templates you have no real choice. In theory, export allows you to separate the interface from the implementation, but only one compiler (Comeau) really supports this [1], and it's being dropped from C++0x.
In any case, trying to put the implementations of non-template functions into headers leads to one obvious problem: the One Definition Rule remains in effect, so if you define the same function in more than one translation unit, you have a problem. The linker will typically give an error saying the same symbol has been defined more than once.
[1] Though it's mostly the EDG compiler front-end that really supports it, so other EDG-based compilers, such as Intel's, also support export to some degree, though they don't document it, so you can't depend on much with them.
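For non-template functions, the usual way to stay within the One Definition Rule in a header is the inline specifier. A minimal sketch with hypothetical function names:

```cpp
#include <cassert>

// ---- util.h, included from several .cpp files --------------------------

// Without `inline`, each translation unit that includes this header would
// emit its own external definition of percent_of, and the link step would
// typically fail with a duplicate-symbol error.
inline int percent_of(int part, int whole) {
    assert(whole > 0);
    return part * 100 / whole;
}

// State shared by all translation units: a static local inside an inline
// function is a single object program-wide. A `static` global in the
// header would instead give every translation unit its own copy.
inline int next_id() {
    static int counter = 0;
    return ++counter;
}
```

All definitions of an inline function must be identical in every translation unit, which holds automatically when they come from the same header.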

What is the best header structure to use in a library?

Concerning headers in a library, I see two options, and I'm not sure if the choice really matters. Say I created a library, let's call it foobar. Please help me choose the most appropriate option:
Have one include in the very root of the library project, let's call it foobar.h, which includes all of the headers in the library, such as "src/some_namespace/SomeClass.h" and so on. Then from outside the library, in the file that I want to use anything to do with the foobar library, just #include <foobar.h>.
Don't have a main include, and instead include only the headers we need in the places that I am to use them, so I may have a whole bunch of includes in a source file. Since I'm using namespaces sometimes as deep as 3, including the headers seems like a bit of a chore.
I've opted for option 1 because of how easy it is to implement. OpenGL and many other libraries seem to do this, so it seemed sensible. However, the standard C++ library can require me to include several headers in any given file; why didn't they just have one header file? Unless it's me being an idiot, and they're separate libraries...
Update:
Further to answers, I think it makes sense to provide both options, correct? I'd be pretty annoyed if I wanted to use a std::string but had to include a mass of header files; that would be silly. On the other hand, I'd be irritated if I had to type a mass of #include lines when I wanted to use most of a library anyway.
Forward headers:
Thanks to all that advised me of forward headers, this has helped me make the header jungle less complicated! :)
STL, Boost and other libraries that have a lot of header files provide you with independent tools, and you can use them independently.
So if your library is a set of uncoupled tools, you should give users the choice of including them as separate parts as well as including the whole library as one file.
Think a bit about how your library will be used, and organize it that way. If someone is unlikely to use one small part without using the whole thing, structure it as one big include. If a small part is independent and useful on its own, make sure you can include just enough for that part. If there's some logical grouping that makes sense, create include files for each group.
As with most programming questions, there's no one-size-fits-all answer.
All #included headers have to be processed. This isn't as bad as it could be, since modern compilers provide some sort of option for not processing them repeatedly (perhaps with something like #pragma once, or an ifndef guard). Still, every #included header has to be processed once for each translation unit, and that can add up fast.
The usual practice is for header files to #include only those header files they need, and to use forward declarations (class foo;) as much as possible. That way, you don't get the overhead.
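A sketch of that practice, with hypothetical Widget and Renderer classes shown in one listing; the comments mark the file boundaries.

```cpp
#include <cassert>
#include <string>
#include <utility>

// ---- widget.h ----------------------------------------------------------
class Renderer;  // forward declaration instead of #include "renderer.h"

class Widget {
public:
    explicit Widget(std::string name) : name_(std::move(name)) {}

    // A reference (or pointer) to an incomplete type is enough for a
    // declaration, so this header does not need Renderer's definition.
    void draw(Renderer& renderer) const;

private:
    std::string name_;
};

// ---- renderer.h --------------------------------------------------------
class Renderer {
public:
    void text(const std::string& s) { output_ += s; }
    const std::string& output() const { return output_; }

private:
    std::string output_;
};

// ---- widget.cpp: the only place that needs the full Renderer type ------
// #include "widget.h"
// #include "renderer.h"
void Widget::draw(Renderer& renderer) const {
    assert(!name_.empty());
    renderer.text("[" + name_ + "]");
}
```

Files that include widget.h without calling into Renderer never pay for parsing renderer.h, and a change to renderer.h does not force them to recompile.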
If you want to #include everything and its brother, you can provide your own header file that #includes everything. You don't have to explicitly write everything out in every header and source file. That option is something you can provide, but if everything in std came as one monolithic header, you wouldn't have an option.
Every time you #include a header file you make the compiler do some pretty hard work. The fewer headers you #include, the less work it has to do and the faster your compilations will be.
Every include file should make sense on its own. And you should choose the header structure from the library user's position: how will users use my library? What structure will be best for them?
Examples:
if your library provides string algorithms, it is better to make one header with everything: string_algorithms.h;
if your library provides a single facade object, it is better to use one header file (maybe with a few other files for extensions or helpers);
if you provide a set of objects which will be used independently, make separate header files (a containers library provides different containers in different headers);
Forward declare instead of including all those header files at once, then include as and when you need.
However you decide on the header file(s) that you make available (one, several or some combination thereof) for the library's public API, it's always a good idea to have at least one separate header for the private API. (No need to expose the prototypes of the non-exported functions and classes or the definitions that are only intended to be used internally.)