How to structure a "library" of C++ source? - c++

I'm developing a collection of C++ classes and am struggling with how to share the code in a way that maintains organization without compromising ease of compilation for a user of the collection.
Options that I have seen include:
Distribute compiled library file
Put the source in the header file (with implicit inline as discussed in this answer)
Use symbolic links to allow the compiler to find the files.
I'm currently using the third option where, for each class the I want to include I symbolic link each classess headers and source files (e.g. ln -s <path_to_class folder>/myclass.cpp) This works well except that I can't move the project folder location (it breaks all the symlinks) and I have to have all those symlinked files hanging around.
I like the second option (it has the appearance of Java), but I'm worried about code size bloat if everything is declared inline.
A user of the collection will create a project folder somewhere, and somehow include the collection into their compilation process.
I'd like a few things to be possible:
Easy compilation (something like gcc *.cpp from the project folder)
Easy distribution of library in uncompiled form.
Library organization by module.
Compiled code size is not bloated.
I'm not worried about documentation (Doxygen takes care of that) or compile time: the overall modules are small and even the largest projects on the slowest machines won't take more than a few seconds to compile.
I'm using the GCC compiler, if it makes any difference.

A library is the best option (in my opinion) of the three you raised. Then provide the header file(s) in the include path and the library in the linker path.
Since you also want to distribute the library in source code form, I would be inclined to provide a compressed archive (gzip, 7-zip, tarball, or other preferred format) in a central repository.

If I understand correctly, you do not want users to have to include the .cpp files in their build, but instead just want them to use either: (i) the headers directly, (ii) use a compiled form of the lib.
Your requirements are a bit unusual, but they can be achieved. It seems to me like you could organize your code in the following manner. First, have a global define that dictates whether or not you are compiling the library:
// global.h
// ...
#define LIB_SOURCE
// ...
Then in every header file, you check whether that define is set: if the library is distributed as a static/shared lib, the definitions are not included, otherwise, the '.cpp' file is included from the header file.
// A.h
#ifndef _A_H
#include "global.h"
#ifdef LIB_SOURCE
#include "A.cpp"
#endif
// ...
#endif
where 'A.cpp' would contain the actual implementation.
Again, this is a very strange way of doing things and I would actually advise against such practice. A better way (but one which requires more work) is to always distribute a shared library. But to keep things independent of the compiler, write a C layer around it. This way, you have a portable, maintainable library.
As for some of the other requirements:
Keep the build process simple by providing a Makefile
If you worry about the code size of the compiled library, look into gcc's optimization options (-Os). If you worry about the code size of the library when distributed in source-form in the headers, this is more tricky. Since the (inlined) code will actually be in the headers, the code will obviously grow with each inclusion in a .cpp file by the user.

I ended up using inline headers for all of the code. You can see the library here:
https://github.com/libpropeller/libpropeller/tree/master/libpropeller
The library is structured as:
library folder
class A
classA.h
classA.test.h
class B
classB.h
classB.test.h
class C
...
With this structure I can distribute the library as source, and all the user has to do is include -I/path/to/library in their makefile, and #include "library/classA/classA.h" in their source files.
And, as it turns out, having inline headers actually reduces the code size. I've done a full analysis of this, and it turns out that inline code in the headers allows the compiler to make the final binary roughly 5% smaller.

Related

Why create an include/ directory in C and C++ projects?

When I work on my personal C and C++ projects I usually put file.h and file.cpp in the same directory and then file.cpp can reference file.h with a #include "file.h" directive.
However, it is common to find out libraries and other kinds of projects (like the linux kernel and freeRTOS) where all .h files are placed inside an include/ directory, while .cpp files remain in another directory. In those projects, .h files are also included with #include "file.h" instead of #include "include/file.h" as I was hoping.
I have some questions about all of this:
What are the advantages of this file structure organization?
Why are .h files inside include/ included with #include "file.h" instead of #include "include/file.h"? I know the real trick is inside some Makefile, but is it really better to do that way instead of making clear (in code) that the file we want to include is actually in the include/ directory?
The main reason to do this is that compiled libraries need headers in order to be consumed by the eventual user. By convention, the contents of the include directory are the headers exposed for public consumption. The source directory may have headers for internal use, but those are not meant to be distributed with the compiled library.
So when using the library, you link to the binary and add the library's include directory to your build system's header paths. Similarly, if you install your compiled library to a centralized location, you can tell which files need to be copied to the central location (the compiled binaries and the include directory) and which files don't (the source directory and so forth).
It used to be that <header> style includes were of the implicit path type, that is, to be found on the includes environment variable path or a build macro, and the "header" style includes were of the explicit form, as-in, exactly relative to where-ever the source file is that included it. While some build tool chains still allow for this distinction, they often default to a configuration that effectively nullifies it.
Your question is interesting because it brings up the question of which really is better, implicit or explicit? The implicit form is certainly easier because:
Convenient groupings of related headers in hierarchies of directories.
You only need include a few directories in the includes path and need not be aware of every detail with regard to exact locations of files. You can change versions of libraries and their related headers without changing code.
DRY.
Flexible! Your build environment doesn't have to match mine, but we can often get nearly exact same results.
Explicit on the other hand has:
Repeatable builds. A reordering of paths in an includes macro/environment variable, doesn't change resulting header files found during the build.
Portable builds. Just package everything from the root of the build and ship it off to another dev.
Proximity of information. You know exactly where the header is with #include "\X\Y\Z". In the implicit form, you may have to go searching along multiple paths and might even find multiple versions of the same file, how do you know which one is used in the build?
Builders have been arguing over these two approaches for many decades, but a hybrid form of the two, mostly wins out because of the effort required to maintain builds based purely of the explicit form, and the obvious difficulty one might have familiarizing one's self with code of a purely implicit nature. We all generally understand that our various tool chains put certain common libraries and headers in particular locations, such that they can be shared across users and projects, so we expect to find standard C/C++ headers in one place, but we don't initially know anything about the specific structure of any arbitrary project, lacking a locally well documented convention, so we expect the code in those projects to be explicit with regard to the non-standard bits that are unique to them and implicit regarding the standard bits.
It is a good practice to always use the <header> form of include for all the standard headers and other libraries that are not project specific and to use the "header" form for everything else. Should you have an include directory in your project for your local includes? That depends to some extent on whether those headers will be shipped as interfaces to your libraries or merely consumed by your code, and also on your preferences. How large and complex is your project? If you have a mix of internal and external interfaces or lots of different components, you might want to group things into separate directories.
Keep in mind that the directory structure your finished product unpacks to, need not look anything like the directory structure under which you develop and build that product in. If you have only a few .c/.cpp files and headers, it's ok to put them all in one directory, but eventually, you're going to work on something non-trivial and will have to think through the consequences of your build environment choices, and hopefully document it for others to understand it.
1 . .hpp and .cpp doesn't necessary have 1 to 1 relationship, there may have multiple .cpp using same .hpp according to different conditions (eg:different environments), for example: a multi-platform library, imagine there is a class to get the version of the app, and the header is like that:
Utilities.h
#include <string.h>
class Utilities{
static std::string getAppVersion();
}
main.cpp
#include Utilities.h
int main(){
std::cout << Utilities::getAppVersion() << std::ends;
return 0;
}
there may have one .cpp for each platform, and the .cpp may be placed at different locations so that they are easily be selected by the corresponding platform, eg:
.cpp for iOS (path:DemoProject/ios/Utilities.cpp):
#include "Utilities.h"
std::string Utilities::getAppVersion(){
//some objective C code
}
.cpp for Android (path:DemoProject/android/Utilities.cpp):
#include "Utilities.h"
std::string Utilities::getAppVersion(){
//some jni code
}
and of course 2 .cpp would not be used at the same time normally.
2.
#include "file.h"
instead of
#include "include/file.h"
allows you to keep the source code unchanged when your headers are not placed in the "include" folder anymore.

Single header file with all the necessary #include statements

I am currently working on program with a lot of source files. Sometimes it is difficult to keep track of what libraries I have already #included. Theoretically, I could make a single header file called Headers.h that just contains all the #include statements I need, then make all other header files #include "Headers.h".
Why is this a good/bad idea?
Pros:
Slightly less maintenance as you don't have to keep track of which of your files are including headers from which libraries or other compoenents.
Cons:
Definitions in included files might conflict with each other. Especially in C where you don't have namespaces (you tagged with C and C++)
Macros in particular can cause hard to debug problems, where a macro definition unexpectedly conflicts with some name in your file or one of the other included files
Depending on which compiler you use, compilation times might blow out. If using a compiler that pre-compiles headers it might actually reduce compilation time, but if not the opposite will happen
You will often unnecessarily trigger rebuilds of files. If you have your build system set up correctly, then each source file will get rebuilt if any of the included files gets modified. If you always include all headers in your project, then a change to any of your headers will force recompilation of all your source files. Not likely to be an issue for system headers but it will be if you include your own headers in the master file as well.
On the whole I would not recommend that approach. The last con listed above it particularly important.
Best practice would be to include only headers that are needed for the code in each file.
In complement of Harmic's answer, indeed the main issue is the build system (most builders work on file timestamp, not on file contents. omake is a notable exception).
Notice that if you only care about many dependencies, GNU make can be used with autodependencies, together with -M* options passed to GCC (i.e. to g++ and actually to the preprocessor).
However, many libraries are offering to their user a single header (e.g. <gtk/gtk.h>)
Also, a single header file is more friendly to precompiled headers technology. In particular, GCC wants a single header for precompilation.
See also ccache.
Tracking all the required includes would be more difficult as they are abstracted from their c source files and not really supporting modularisation pus all the cons from #harmic

Library design: allow user to decide between "header-only" and dynamically linked?

I have created several C++ libraries that currently are header-only. Both the interface and the implementation of my classes are written in the same .hpp file.
I've recently started thinking that this kind of design is not very good:
If the user wants to compile the library and link it dynamically, he/she can't.
Changing a single line of code requires full recompilation of existing projects that depend on the library.
I really enjoy the aspects of header-only libraries though: all functions get potentially inlined and they're very very easy to include in your projects - no need to compile/link anything, just a simple #include directive.
Is it possible to get the best of both worlds? I mean - allowing the user to choose how he/she wants to use the library. It would also speed up development, as I'd work on the library in "dynamically-linking mode" to avoid absurd compilation times, and release my finished products in "header-only mode" to maximize performance.
The first logical step is dividing interface and implementation in .hpp and .inl files.
I'm not sure how to go forward, though. I've seen many libraries prepend LIBRARY_API macros to their function/class declarations - maybe something similar would be needed to allow the user to choose?
All of my library functions are prefixed with the inline keyword, to avoid "multiple definition of..." errors. I assume the keyword would be replaced by a LIBRARY_INLINE macro in the .inl files? The macro would resolve to inline for "header-only mode", and to nothing for the "dynamically-linking mode".
Preliminary note: I am assuming a Windows environment, but this should be easily transferable to other environments.
Your library has to be prepared for four situations:
Used as header-only library
Used as static library
Used as dynamic library (functions are imported)
Built as dynamic library (functions are exported)
So let's make up four preprocessor defines for those cases: INLINE_LIBRARY, STATIC_LIBRARY, IMPORT_LIBRARY, and EXPORT_LIBRARY (it is just an example; you may want to use some sophisticated naming scheme).
The user has to define one of them, depending on what he/she wants.
Then you can write your headers like this:
// foo.hpp
#if defined(INLINE_LIBRARY)
#define LIBRARY_API inline
#elif defined(STATIC_LIBRARY)
#define LIBRARY_API
#elif defined(EXPORT_LIBRARY)
#define LIBRARY_API __declspec(dllexport)
#elif defined(IMPORT_LIBRARY)
#define LIBRARY_API __declspec(dllimport)
#endif
LIBRARY_API void foo();
#ifdef INLINE_LIBRARY
#include "foo.cpp"
#endif
Your implementation file looks just like usual:
// foo.cpp
#include "foo.hpp"
#include <iostream>
void foo()
{
std::cout << "foo";
}
If INLINE_LIBRARY is defined, the functions are declared inline and the implementation gets included like a .inl file.
If STATIC_LIBRARY is defined, the functions are declared without any specifier, and the user has to include the .cpp file into his/her build process.
If IMPORT_LIBRARY is defined, the functions are imported, and there isn't a need for any implementation.
If EXPORT_LIBRARY is defined, the functions are exported and the user has to compile those .cpp files.
Switching between static / import / export is a really common thing, but I'm not sure if adding header-only to the equation is a good thing. Normally, there are good reasons for defining something inline or not to do so.
Personally, I like to put everything into .cpp files unless it really has to be inlined (like templates) or it makes sense performance-wise (very small functions, usually one-liners). This reduces both compile time and - way more important - dependencies.
But if I choose to define something inline, I always put it in separate .inl files, just to keep the header files clean and easy to understand.
It is operating system and compiler specific. On Linux with a very recent GCC compiler (version 4.9) you might produce a static library using interprocedural linktime optimization.
This means that you build your library with g++ -O2 -flto both at compile and at library link time, and that you use your library with g++ -O2 -flto both at compile and link time of the invoking program.
This is to complement #Horstling's answer.
You can either create a static or a dynamic library. When you create statically-linked libraries, compiled code for all functions/objects will be saved to a file (with .lib extension in Windows). At main project (the project using the library) 's link time, these codes will be linked into your final executable together with the main project codes. So the final executable wouldn't have any runtime dependency.
Dynamically linked libraries will be merged into the main project at run time (and not link time). When you compile the library you get a .dll file (which contains actual compiled code) and a .lib file (which contains enough data for the compiler/runtime to find functions/objects in the .dll file). At link time, the executable will be configured to load the .dll and use compiled code from that .dll as needed. You will need to distribute the .dll file with your executable to be able to run it.
There is no need to choose between static or dynamic linking (or header-only) when designing your library, you create multiple project/makefiles, one to create a static .lib, another to create a .lib/.dll pair, and distribute both versions, for the user to choose between. (You'll need to use preprocessor macros like the ones #Horstling suggested).
You cannot put any templates in a pre-compiled library, unless you use a technique called Explicit Instantiation, which limits template parameters.
Also note that modern compiler/linkers usually do not respect the inline modifier. They may inline a function even if it's not designated as inline, or may dynamically call another that has inline modifier, as they see fit. (Regardless, I'll advise explicitly putting inline where applicable for maximum compatibility). So, there won't be any runtime performance penalty if you use a statically linked library instead of a header-only library (and enable compiler/linker optimizations, of course). As others have suggested, for really small functions that are sure to benefit from being called inline, it is best practice to put them in header files, so dynamically linked libraries will also not suffer any significant performance loss. (In any case, inlining functions will only affect performance for functions that are being called very often, inside loops that are going to be called thousands/millions of times).
Instead of putting inline functions in header files (with an #include "foo.cpp" in your header), you can change makefile/project settings and add foo.cpp to the list of source files to be compiled. This way, if you change any function implementation there will be no need to re-compile the whole project and only foo.cpp will be re-compiled. As I mentioned earlier, your small functions will still be inlined by the optimizing compiler, and you don't need to worry about that.
If you use/design a pre-compiled library, you should consider the case where the library is compiled with a different version of compiler to the main project. Each different compiler version (even different configurations, like Debug or Release) uses a different C runtime (things like memcpy, printf, fopen, ...) and C++ standard library runtime (things like std::vector<>, std::string, ...). These different library implementations may complicate linking, or even create runtime errors.
As a general rule, always avoid sharing compiler runtime objects (data structures that are not defined by standards, like FILE*) across libraries, because incompatible data structures will lead to runtime errors.
When linking your project, C/C++ runtime functions must be linked into your library .lib or .lib/.dll, or your executable .exe. C/C++ runtime itself can be linked as static or dynamic library (you can set this in makefile/project settings).
You will find that dynamically linking to C/C++ runtime in both the library and the main project (even when you compile the library itself as a static library) avoids most linking problems (with duplicate function implementations in multiple runtime versions). Of course you would need to distribute runtime DLLs for all used versions with your executable and library.
There are scenarios that statically linking to C/C++ runtime is needed, and the best approach in these cases would be to compile the library with the same compiler setting as the main project to avoid linking problems.
Rationale
Put as little as necessary in header files and as much as possible in library modules, because of the very reasons that you mentioned: compile-time dependency and long compilation time. The only good reasons for header-only modules are:
generic templates for user-defined template parameter;
very short convenience functions when inlining gives significant
performance.
In case 1, it is often possible to hide some functionality that does not depend on the user-defined type in a .cpp file.
Conclusion
If you stick to this rationale, then there is no choice: templated functionality that must allow user-defined types cannot be pre-compiled, but requires a header-only implementation. Other functionality should be hidden from the user in a library to avoid exposing them to the implementation details.
Rather than a dynamic library, you could have a precompiled static library and thin header file. In an interactive quick build, you get the benefit of not having to recompile the world if implementation details changes. But a fully optimized release build can do global optimization and still figure out it can inline functions. Basically, with "link-time code generation" the toolset does the trick you were thinking about.
I'm familiar with Microsoft's compiler, which I know for sure does this as of Visual Studio 2010 (if not earlier).
Templated code will necessarily be header-only: for instantiating this code, the type parameters must be known at compilation time. There is no way to embed template code in shared libraries. Only .NET and Java support JIT instantiation from byte-code.
Re: non-template code, for short one-liners I suggest keeping it header-only. Inline functions give the compiler a lot more opportunities to optimize the final code.
To avoid "insane compilation time", Microsoft Visual C++ has a "precompiled headers" feature. I do not think GCC has a similar feature.
Long functions should not be inlined in any case.
I had one project which had header-only bits, compiled library bits and some bits I could not decide where belonged. I ended up having .inc files, conditionally included in either .hpp or .cxx depending on #ifdef. Truth to be told, the project was always compiled in "max inline" mode, so after a while I got rid of the .inc files and simply moved the contents to .hpp files.
Is it possible to get the best of both worlds?
In terms; limitations arise because tools aren't smart enough. This answer gives the current best effort that is still portable enough to be used effectively.
I've recently started thinking that this kind of design is not very good.
It ought to be. Header-only libraries are ideal because they simplify deployment: makes the reusing mechanism of the language similar to almost all others', which is just the sane thing to do. But this is C++. Current C++ tools still rely on half-a-century-old linking models that remove important degrees of flexibility, such as choosing which entry points to import or export on an individual level without being forced to change the library's original source code. Also, C++ lacks a proper module system and still relies on glorified copy-paste operations to work (although this is just a side factor to the problem in question).
In fact, MSVC is a little better in this regard. It is the only major implementation trying to achieve some degree of modularity in C++ (by attempting e.g. C++ modules). And it is the only compiler that actually allows e.g. the following:
//// Module.c++
#pragma once
inline void Func() { /* ... */ }
//// Program1.c++
#include <Module.c++>
// Inlines or "vague" links Func(), whatever is better.
int main() { Func(); }
//// Program2.c++
// This forces Func() to be imported.
// The declaration must come *BEFORE* the definition.
__declspec(dllimport) __declspec(noinline) void Func();
#include <Module.c++>
int main() { Func(); }
//// Program3.c++
// This forces Func() to be exported.
__declspec(dllexport) __declspec(noinline) void Func();
#include <Module.c++>
Note that this can be used to selectively import and export individual symbols from the library, although still cumbersomely.
GCC also accepts this (but the order of the declarations must be changed) and Clang does not have any way to achieve the same effect without changing the library's source.

How Iostream file is located in computer by c++ code during execution

i want to know that in a c++ code during execution how iostream file is founded. we write #include in c++ program and i know about #include which is a preprocessor directive
to load files and is a file name but i don't know that how that file is located.
i have some questions in my mind...
Is Standard library present in compiler which we are using?
Is that file is present in standard library or in our computer?
Can we give directory path to locate the file through c++ code if yes then how?
You seem to be confused about the compilation and execution model of C++. C++ is generally not interpreted (though it could) but instead a native binary is produced during the compilation phase, which is then executed... So let's take a detour.
In order to go from a handful of text files to a program being executed, there are several steps:
compilation
link
load
execution proper
I will only describe what traditional compilers do (such as gcc or clang), potential variations will be indicated later on.
Compiling
During compilation, each source file (generally .cpp or .cxx though the compiler could care less) is processed to produced an object file (generally .o on Linux):
The source file is preprocessed: this means resolving the #include (copy/pasting the included file in the current file), navigating the #if and #else to remove unneeded sources and expanding macros.
The preprocessed file is fed to the compiler which will produce native code for each function, static or global variable, etc... the format depends on the target system in general
The compiler outputs the object file (it's a binary format, in general)
Linking
In this phase, multiple object files are assembled together into a library or an executable. In the case of a static library or a statically-linked executable, the libraries it depend on are also assembled in the produced file.
Traditionally, the linker job is relatively simple: it just concatenates all object files, which already contain the binary format the target machine can execute. However it often actually does more: in C and C++ inline functions are duplicated across object files, so the linker need to keep only one of the definitions, for example.
At this point, the program is "compiled", and we live the realm of the compiler.
Loading
When you ask to execute a program, the OS will load its code into memory (thanks to a loader) and execute it.
In the case of a statically linked executable, this is easy: it's just a single big blob of code that need be loaded. In the case of a dynamically linked executable, it implies finding the dependencies and "resolving the symbols", I'll describe this below:
First of all, your dynamically linked executable and libraries have a section describing which other libraries they depend on. They only have the name of the library, not its exact location, so the loader will search among a list of paths (LD_LIBRARY_PATH for example on Linux) for the libraries and actually load them.
When loading a library, the loader will perform replacements. Your executable had placeholders saying "Here should be the address of the printf function", and the loader will replace that placeholder with the actual address.
Once everything is loaded properly, all symbols should be resolved. If some symbols are missing, ie the code for them is not found in either the present library or any of its dependencies, then you generally get an error (either immediately, or only when the symbol is actually needed if you use lazy loading).
Executing
The code (assembler instruction in binary format) is now executed. In C++, this starts with building the global and static objects (at file scope, not function-static), and then goes on to calling your main function.
Caveats: this is a simplified view, nowadays Link-Time Optimizations mean that the linker will do more and more, the loader could actually perform optimizations too and as mentioned, using lazy loading, the loader might be invoked after the execution started... still, you've got to start somewhere, don't you ?
So, what does it mean about your question ?
The #include <iostream> in your source code is a pre-processor directive. It is thus fully resolved early in the compilation phase, and only depends on finding the appropriate header (no library code is actually needed yet). Note: a compiler would be allowed not to have a header file sitting around, and just magically inject the necessary code as if the header file existed, because this is a Standard Library header (thus special). For regular headers (yours) the pre-processor is invoked.
Then, at link-time:
if you use static linking, the linker will search for the Standard Library and include it in the executable it produces
if you use dynamic linking, the linker will note that it depends on the Standard Library file (libc++.so for example) and that the produced code is missing an implementation of printf (for example)
Then, at load time:
if you used dynamic linking, the loader will search for the Standard Library and load its code
Then, at execution time, the code (yours and its dependencies) is finally executed.
Several Standard Library implementations exist, off the top of my head:
MSVC ships with a modified version of Dirkumware
gcc ships with libstdc++ (which depends on libc)
clang ships with libc++ (which depends on libc), but may use libstdc++ instead (with compiler flags)
And of course, you could provide others... probably... though setting it up might not be easy.
Which is ultimately used depends on the compiler options you use. By default the most common compiler ship with their own implementation and use it without any intervention on your part.
And finally yes, you can indeed specify paths in a #include directive. For example, using boost:
#include <boost/optional.hpp>
#include <boost/algorithm/string/trim.hpp>
you can even use relative paths:
#include <../myotherproject/x.hpp>
though this is considered poor form by some (since it breaks as soon as your reorganize your files).
What matters is that the pre-processor will look through a list of directories, and for each directory append / and the path you specified. If this creates the path of an existing file, it picks it, otherwise it continues to the next directory... until it runs out (and complain).
The <iostream> file is just not needed during execution. But that's just the header. You do need the standard library, but that's generally named differently if not included outright into your executable.
The C++ Standard Library doesn't ship with your OS, although on many Linux systems the line between OS and common libraries such as the C++ Standard Library is a bit thin.
How you locate that library is very much OS dependent.
There can be 2 ways to load a header file (like iostream.h) in C++
if you write the code as:
# include <iostream>
It will look up the header file in include directory of your C++ compiler and will load it
Other way you can give the full path of the header file as:
# include "path_of_file.h"
And loading up the file is OS dependent as answered by MSalters
You definitely required the standard library header files so that pre-processor directive can locate them.
Yes those files are present in the library and on include copied to our code.
if we had defined the our own header file then we have to give path of that file. In that way we can include also *.c or *.cpp along with the header files in which we had defined various methods and those had to include at pre-processing time.

Combining C++ header files

Is there an automated way to take a large amount of C++ header files and combine them in a single one?
This operation must, of course, concatenate the files in the right order so that no types, etc. are defined before they are used in upcoming classes and functions.
Basically, I'm looking for something that allows me to distribute my library in two files (libfoo.h, libfoo.a), instead of the current bunch of include files + the binary library.
As your comment says:
.. I want to make it easier for library users, so they can just do one single #include and have it all.
Then you could just spend some time, including all your headers in a "wrapper" header, in the right order. 50 headers are not that much. Just do something like:
// libfoo.h
#include "header1.h"
#include "header2.h"
// ..
#include "headerN.h"
This will not take that much time, if you do this manually.
Also, adding new headers later - a matter of seconds, to add them in this "wrapper header".
In my opinion, this is the most simple, clean and working solution.
A little bit late, but here it is. I just recently stumbled into this same problem myself and coded this solution: https://github.com/rpvelloso/oneheader
How does it works?
Your project's folder is scanned for C/C++ headers and a list of headers found is created;
For every header in the list it analyzes its #include directives and assemble a dependency graph in the following way:
If the included header is not located inside the project's folder then it is ignored (e.g., if it is a system header);
If the included header is located inside the project's folder then an edge is create in the dependency graph, linking the included header to the current header being analyzed;
The dependency graph is topologically sorted to determine the correct order to concatenate the headers into a single file. If a cycle is found in the graph, the process is interrupted (i.e., if it is not a DAG);
Limitations:
It currently only detects single line #include directives (e.g., #include );
It does not handles headers with the same name in different paths;
It only gives you a correct order to combine all the headers, you still need to concatenate them (maybe you want remove or modify some of them prior to merging).
Compiling:
g++ -Wall -ggdb -std=c++1y -lstdc++fs oneheader.cpp -o oneheader[.exe]
Usage:
./oneheader[.exe] project_folder/ > file_sequence.txt
(Adapting an answer to my dupe question:)
There are several other libraries which aim for a single-header form of distribution, but are developed using multiple files; and they too need such a mechanism. For some (most?) it is opaque and not part of the distributed code. Luckily, there is at least one exception: Lyra, a command-line argument parsing library; it uses a Python-based include file fuser/joiner script, which you can find here.
The script is not well-documented, but they way you use it is with 3 command-line arguments:
--src-include - The include file to convert, i.e. to merge its include directives into its body. In your case it's libfoo.h which includes the other files.
--dst-include - The output file to write - the result of the merging.
--src-include-dir - The directory relative to which include files are specified (i.e. an "include search path" of one directory; the script doesn't support the complex mechanism of multiple include paths and search priorities which the C++ compiler offers)
The script acts recursively, so if file1.h includes another file under the --src-include-dir, that should be merged in as well.
Now, I could nitpick at the code of that script, but - hey, it works and it's FOSS - distributed with the Boost license.
If your library is so big that you cannot build and maintain a single wrapping header file like Kiril suggested, this may mean that it is not architectured well enough.
So if your library is really huge (above a million lines of source code), you might consider automating that, with tools like
GCC make dependency generator preprocessor options like -M -MD -MF etc, with another hand made script sorting them
expensive commercial static analysis tools like coverity
customizing a compiler thru plugins or (for GCC 4.6) MELT extensions
But I don't understand why you want an automated way of doing this. If the library is of reasonable size, you should understand it and be able to write and maintain a wrapping header by hand. Automating that task will take you some efforts (probably weeks, not minutes) so is worthwhile only for very large libraries.
If you have a master include file that includes all others available, you could simply hack a C preprocessor re-implementation in Perl. Process only ""-style includes and recursively paste the contents of these files. Should be a twenty-liner.
If not, you have to write one up yourself or try at random. Automatic dependency tracking in C++ is hard. Like in "let's see if this template instantiation causes an implicit instantiation of the argument class" hard. The only automated way I see is to shuffle your include files into a random order, see if the whole bunch compiles, and re-shuffle them until it compiles. Which will take n! time, you might be better off writing that include file by hand.
While the first variant is easy enough to hack, I doubt the sensibility of this hack, because you want to distribute on a package level (source tarball, deb package, Windows installer) instead of a file level.
You really need a build script to generate this as you work, and a preprocessor flag to disable use of the amalgamate (that could be for your uses).
To simplify this script/program, it helps to have your header structures and include hygiene in top form.
Your program/script will need to know your discovery paths (hint: minimise the count of search paths to one if possible).
Run the script or program (which you create) to replace include directives with header file contents.
Assuming your headers are all guarded as is typical, you can keep track of what files you have already physically included and perform no action if there is another request to include them. If a header is not found, leave it as-is (as an include directive) -- this is required for system/third party headers -- unless you use a separate header for external includes (which is not at all a bad idea).
It's good to have a build phase/translation that includes header alone and produces zero warnings or errors (warnings as errors).
Alternatively, you can create a special distribution repository so they never need to do more than pull from it occasionally.
What you want to do sounds "javascriptish" to me :-) . But if you insist, there is always "cat" (or the equivalent in Windows):
$ cat file1.h file2.h file3.h > my_big_file.h
Or if you are using gcc, create a file my_decent_lib_header.h with the following contents:
#include "file1.h"
#include "file2.h"
#include "file3.h"
and then use
$ gcc -C -E my_decent_lib_header.h -o my_big_file.h
and this way you even get file/line directives that will refer to the original files (although that can be disabled, if you wish).
As for how automatic is this for your file order, well, it is not at all; you have to decide the order yourself. In fact, I would be surprised to hear that a tool that orders header dependencies correctly in all cases for C/C++ can be built.
usually you don't want to include every bit of information from all your headers into the special header that enables the potential user to actually use your library. The non-trivial removal of type definitions, further includes or defines, that are not necessary for the user of your interface to know can not be automatedly done. As far as I know.
Short answer to your main question:
No.
My suggestions:
manually make a new header, that contains all relevant information (nothing more, nothing less) for the user of your library interface. Add nice documentation comments for each component it contains.
use forward declarations where possible, instead of full-fledged included definitions. Put the actual includes in your implementation files. The less include statements you have in your headers, the better.
don't build a deeply nested hierarchy of includes. This makes it extremely hard to keep an overview on the contents of every bit you include. The user of your library will look into the header to learn how to use it. And he will probably not be able to distinguish relevant code from irrelevant on the first sight. You want to maximize the ratio of relevant code per total code in the main header for your library.
EDIT
If you really do have a toolkit library, and the order of inclusion really does not matter, and you have a bunch of independent headers, that you want to enumerate just for convenience into a single header, then you can use a simple script. Like the following Python (untested):
import glob
with open("convenience_header.h", 'w') as f:
for header in glob.glob("*.h"):
f.write("#include \"%s\"\n" % header)