When writing a header library (like Boost), can one define free-floating (non-method) functions without (1) bloating the generated binary and (2) incurring "unused" warnings?
When I define a function in a header that's included by multiple source files which in turn is linked into the same binary, the linker complains about redefinitions. One way around this is to make the functions static, but this reproduces the code in each translation unit (BTW, can linkers safely dereplicate these?). Furthermore, this triggers compiler warnings about the function being unused.
I was trying to look for an example of a free-floating function in Boost, but I couldn't find one. Is the trick to contain everything in a class (or template)?
If you really want to define the function (as opposed to declaring it), you'll need to use inline to prevent linker errors.
Otherwise, you can declare the function in the header file and provide its implementation separately in your source file.
You can use the inline keyword:
inline void wont_give_linker_errors(void)
{
// ...
}
Er... The answer to your question is simply don't. You just don't define functions in header files, unless they are inline.
'static' function can also be defined in headers, but it is only useful for very specific rare purposes. Using 'static' just to work around a multiple-definition problem is utter nonsense.
Again, header files are for non-defining function declarations. Why on Earth would you want to define functions there?
You said you are writing "header library". What's a "header library"? Please note, that Boost defines its "functions" in header files because their "functions" are not really functions, they are function templates. Function templates have to be defined in header files (well, almost). If that's wasn't the case, Boost wouldn't be doing something as strange as defining anything in header files.
Besides the already mentioned inline, with most compilers templates have to be defined in headers (and with all compilers it's allowed). Since boost is mostly templates, that explains why it is almost all headers.
People have suggested inline but that violates the very first part of your question i.e. it bloats the code as the full definition is inserted into the code at each call of the function. The answer to your overall question is therefore "No".
If you mark them as static then they are still defined in each source file as you rightly pointed out but only once and so that's a better option than inline if code size is the only issue. I don't know if linkers can, or are allowed to, spot the duplicates and merge them. I suspect not.
Edit:
Just to clear up any confusion as to whether I support the notion of using static and/or defining functions within headers files generally then rest assured I don't. This was simply meant as a technical response as to the differences between functions marked inline and static defined in header files. Nothing more.
Related
I am writing a header-only template library in C++. I want to able to write some helper functions inside that header file that will not be visible from a cpp file that includes this header library.
Any tips on how to do this?
I know static keyword can be used in cpp files to limit visibility to that one translation unit. Is there something similar for header files?
There isn't really a way.
The convention is to use a namespace for definitions that are not meant to be public. Typical names for this namespace are detail, meaning implementation details, or internal meaning internal to your library.
And as mentioned in comments, C++20 modules changes this situation.
The easy answer is no.
Headers do not exist to the linker, so all functions in the headers are actually in the module that included them. Technically static (or anonymous namespace) functions in a header, are static to the module that included them. This might work, but you will end up with multiple functions, and bloated code-sizes.
Due to this you should always inline functions in header files, or use something that implies inline - like constexpr; If possible...
Function in headers usually rely on either being inline, or templated. A templated function is "weak", meaning the linker assumes that they are all the same, and just uses a random one, and discards the others.
Background: The C++ inline keyword does not determine if a function should be inlined.
Instead, inline permits you to provide multiple definitions of a single function or variable, so long as each definition occurs in a different translation unit.
Basically, this allows definitions of global variables and functions in header files.
Are there some examples of why I might want to write a definition in a header file?
I've heard that there might be templating examples where it's impossible to write the definition in a separate cpp file.
I've heard other claims about performance. But is that really true? Since, to my knowledge, the use of the inline keyword doesn't guarantee that the function call is inlined (and vice versa).
I have a sense that this feature is probably primarily used by library writers trying to write wacky and highly optimized implementations. But are there some examples?
It's actually simple: you need inline when you want to write a definition (of a function or variable (since c++17)) in a header. Otherwise you would violate odr as soon as your header is included in more than 1 tu. That's it. That's all there is to it.
Of note is that some entities are implicitly declared inline like:
methods defined inside the body of the class
template functions and variables
constexpr functions and variables
Now the question becomes why and when would someone want to write definitions in the header instead of separating declarations in headers and definitions in source code files. There are advantages and disadvantages to this approach. Here are some to consider:
optimization
Having the definition in a source file means that the code of the function is baked into the tu binary. It cannot be inlined at the calling site outside of the tu that defines it. Having it in a header means that the compiler can inline it everywhere it sees fit. Or it can generate different code for the function depending on the context where it is called. The same can be achieved with lto within an executable or library, but for libraries the only option for enabling this optimization is having the definitions in the header.
library distribution
Besides enabling more optimizations in a library, having a header only library (when it's possible) means an easier way to distribute that library. All the user has to do is download the headers folder and add it to the include path of his/her project. In the case of non header only library things become more complicated. Because you can't mix and match binaries compiled by different compiler and even by the same compiler but with different flags. So you either have to distribute your library with the full source code along with a build tool or have the library compiled in many formats (cpu architecture/OS/compiler/compiler flags combinations)
human preference
Having to write the code once is considered by some (me included) an advantage: both from code documentation perspective and from a maintenance perspective. Others consider separating declaration from definitions is better. One argument is that it achieves separation of interface vs implementation but that is just not the case: in a header you need to have private member declarations even if those aren't part of the interface.
compile time performance
Having all the code in header means duplicating it in every tu. This is a real problem when it comes to compilation time. Heavy header C++ projects are notorious for slow compilation times. It also means that a modification of a function definition would trigger the recompilation of all the tu that include it, as opposed to just 1 tu in the case of definition in source code. Precompiled headers try to solve this problem but the solutions are not portable and have problems of their own.
If the same function definition appears in multiple compilation units then it needs to be inline otherwise you get a linking error.
You need the inline keyword e.g. for function templates if you want to make them available using a header because then their definition also has to be in the header.
The below statement might be a bit oversimplified because compilers and linkers are really complex nowadays, but to get a basic idea it is still valid.
A cpp file and the headers included by that cpp file form a compilation unit and each compilation unit is compiled individually. Within that compilation unit, the compiler can do many optimizations like potentially inlining any function call (no matter if it is a member or a free function) as long as the code still behaves according to the specification.
So if you place the function definition in the header you allow the compiler to know the code of that function and potentially do more optimizations.
If the definition is in another compilation unit the compiler can't do much and optimizations then can only be done at linking time. Link time optimizations are also possible and are indeed also done. And while link-time optimizations became better they potentially can't do as much as the compiler can do.
Header only libraries have the big advantage that you do not need to provide project files with them, the one how wants to use that library just copies the headers to their projects and includes them.
In short:
You're writing a library and you want it to be header-only, to make its use more convenient.
Even if it's not a library, in some cases you may want to keep some of the definitions in a header to make it easier to maintain (whether or not this makes things easier is subjective).
to my knowledge, the use of the inline keyword doesn't guarantee that the function call is inlined
Yes, defining it in a header (as inline) doesn't guarantee inlining. But if you don't define it in a header, it will never be inlined (unless you're using link-time optimizations). So:
You want the compiler to be able to inline the functions, if it decides to.
Also it may the compiler more knowledge about a function:
maybe it never throws, but is not marked noexcept;
maybe several consecutive calls can be merged into one (there's no side effects, etc), but __attribute__((const)) is missing;
maybe it never returns, but [[noreturn]] is missing;
...
there might be templating examples where it's impossible to write the definition in a separate cpp file.
That's true for most templates. They automatically behave as if they were inline, so you don't need to specify it explicitly. See Why can templates only be implemented in the header file? for details.
I have a .cpp file which has some static free functions. I know how that would help in a header file, but since the cpp is not included anywhere, what's the point? Are there any advantages to it?
Declaring free functions as static gives them internal linkage, which allows the compiler more aggressive optimizations, as it is now guaranteed that nobody outside the TU can see that function. For example, the function might disappear entirely from the assembly and get inlined everywhere, as there is no need to provide a linkable version.
Note of course that this also changes the semantics slightly, since you are allowed to have different static functions of the same name in different TUs, while having multiple definitions of non-static functions is an error.
Since comment boxes are too small to explain why you have a serious error in your reasoning, I'm putting this as a community wiki answer. For header-only functions, static is pretty much useless because anyone who includes their header will get a different function. That means you will duplicate code the compiler creates for each of the functions (unless the linker can merge the code, but by all I know, that's very unlikely), and worse, if the function would have local statics, each of those locals would be different, resulting in potentially multiple initializations for each call to a definition from a different inclusion. Not good.
What you need for header-only functions is inline (non-static inline), which means each header inclusion will define the same function and modern linkers are capable of not duplicating the code of each definition like done for static (for many cases, the C++ Standard even requires them to do so), but emitting only one copy of the code out of all definitions created by all inclusions.
A bit out of order response because the first item addressed raises some huge questions in my head.
but since the cpp is not included anywhere
I strongly hope that you never #include a source file anywhere. The preprocessor doesn't care about the distinction between source versus header. This is distinction exists largely to benefit humans, not the compilers. There are many reasons you should never #include a source file anywhere.
I have a .cpp file which has some static free functions. I know how that would help in a header file ...
How would that help?
You declare non-static free functions in a header if those free functions have external linkage. Declaring (but not defining) static free functions in a header doesn't help. It is a hindrance. You want to put stuff in a header that helps you and other programmers understand the exported content of something. Those static free functions are not exported content. You can define free functions in a header and thus make them exported content, but the standard practice is to use the inline keyword rather than static.
As far as your static free functions in your source file, you might want to consider putting the declarations of those functions near the top of the source file (but not in a header). This can help improve understandability. Without those declarations, the organization of the source file will look Pascalish, with the low-level functions defined first. Most people like a top-down presentation. By declaring the functions first you can employ a top-down strategy. Or an inside out strategy, or whatever strategy makes the functionality easiest to understand.
I'm working on a very tiny piece of C/C++ source code. The program reads input values from stdin, processes them with an algorithm and writes the results to stdout.
I would just implement all that in a single file, but I also want test cases for the algorithm (not the input/output reading), so I have the following files in my project:
main.cpp
sort.hpp
sort_test.cpp
I implement the algorithm in sort.hpp right away, no sort.cpp. It's rather short and doesn't have any dependencies.
Would you say that, in some cases, functions defined in headers are okay, even if they are sophisticated algorithms and not just simple accessors/mutators? Or is there a reason I should avoid this? When should I move code from header to source file?
There is nothing wrong with having functions in header files, as long as you understand the tradeoff. Putting them in a header file means they'll have to be compiled (and recompiled) in any translation unit that includes the header. (and they have to be declared inline, or you will get linker errors.)
In projects with many translation units, that may add up to a noticeable slowdown in compile times, if you do it a lot.
On the other hand, it ensures that the function definition is visible everywhere the function is called -- and that means that it can be trivially inlined, so the resulting program may run faster.
And finally, with function templates, you typically have no realistic alternative. The definition must be visible at the call site, and the only practical way to achieve that is to put it in a header.
A final consideration is that header-only libraries are easier to deploy and use. You don't need to link against anything, you don't have to worry about ABI's or anything else. You just add the headers to your project, include them and off you go.
Quite a few popular libraries use a header-only strategy.
When you put functions in headers you have to make sure to declare them inline. This is required to avoid a duplicate definition warning when more than one .cpp file include that header file. Generally you should only put small functions inside header files because it will be compiled for each cpp file that includes the header which will slow down compilation time and also results in code bloat; a larger executable file.
It's OK to put any function in the header as long as it's inline. Things such as functions defined inside class { } and templates are implicitly inline.
If the resulting application becomes too large, then optimize the code size. Optimizing before there is a problem is an anti-pattern, especially when there is a benefit to doing it "your way," and the fix is as simple as moving from one file to another and erasing inline.
Of course, if you want to distribute the code as a library, then deciding between a header, static library, or dynamic library binary is an important decision affecting the users.
The vast majority of the boost libraries are header-only, so I'd say: Yes, this is an established and accepted practice. Just don't forget to inline.
That really is a stile choice. But putting it in the header does mean that it will be inline code rather than a function. If you wanted that same functionality, you could use the inline keyword:
inline int max(int a, int b)
{
return (a > b) ? a : b;
}
http://en.wikipedia.org/wiki/Inline_function
The reason you should avoid this in general (for non inline functions) is because multiple source files will be including your header, creating linker errors.
It doesn't matter if you have a pramga once or similar trick - the duplication will show up if you have more than one compilation unit (e.g. cpp files) including the same header.
If you wish to inline the function, it MUST be in the header else it can't get inlined.
If you publish a header with your libraries and the header has some sort of implementation in it, you can be sure that after a few years if you change the implementation and it doesn't work exactly the same way as it did before, some peoples code will break since thay will have come to rely on the implementation they saw in the header. Yeah i know one should not do it but many people do look in header for the implementation and other behaviour they can exploit/use in a not intended way to overcome some problem they are having.
If you are planning to use templates then you have no choice but to put it all in header. (this might not be necessary if you compiler supports export templates but there is only 1 i know of).
Its ok to have the implementation in the header. It depends on what you need. If you separate the definition to a different file then the compiler will create symbols with external linkage if you dont want that you can define the functions inside the header itself. But you would be wasting some amount of memory for the code segment. If you include this header file in two different files then both files codes segment will have this function definition.
If other header file is going to have a function with similar name then its going to be a problem. Then you have to use inline.
I want to write a library that to use, you only need to include one header file. However, if you have multiple source files and include the header in both, you'll get multiple definition errors, because the library is both declared and defined in the header. I have seen header-only libraries, in Boost I think. How did they do that?
Declare your functions inline, and put them in a namespace so you don't collide:
namespace fancy_schmancy
{
inline void my_fn()
{
// magic happens
}
};
The main reason why Boost is largely header-only is because it's heavily template oriented. Templates generally get a pass from the one definition rule. In fact to effectively use templates, you must have the definition visible in any translation unit that uses the template.
Another way around the one definition rule (ODR) is to use inline functions. Actually, getting a free-pass from the ODR is what inline really does - the fact that it might inline the function is really more of an optional side-effect.
A final option (but probably not as good) is to make your functions static. This may lead to code bloat if the linker isn't able to figure out that all those function instances are really the same. But I mention it for completeness. Note that compilers will often inline static functions even if they aren't marked as inline.
Boost uses header-only libraries a lot because like the STL, it's mostly built using class and function templates, which are almost always header-only.
If you are not writing templates I would avoid including code in your header files - it's more trouble than it's worth. Make this a plain old static library.
There are many truly header-only Boost libraries, but they tend to be very simple (and/or only templates). The bigger libraries accomplish the same effect through some trickery: they have "automatic linking" (you'll see this term used here). They essentially have a bunch of preprocessor directives in the headers that figure out the appropriate lib file for your platform and use a #pragma to instruct the linker to link it in. So you don't have to explicitly link it, but it is still being linked.