I know that it is common in C/C++ projects to place header files in a directory such as include and implementation in a separate directory such as src. I have been toying with different project structures and am wondering whether there are any objective reasons for this, or whether it is simply convention.
Convention is one of the reasons: most of the time, with effective abstraction, you only care about the interface, and you want to be able to find it easily by looking at just the headers.
It's not the only reason, though. If your project is organised in modules, you most likely have to include some headers across different modules, and you want your include directory to be free of other "noise" files.
Also, if you plan on redistributing your module, you probably want to hide implementation details. So you only supply headers and binaries - and distributing headers from a single folder with nothing else in it is simpler.
There's also an alternative which I actually prefer - public headers go in a separate folder (these contain the minimum interface - no implementation details are visible whatsoever), and private headers and implementation files are separate (possibly, but not necessarily, in separate folders).
I prefer putting them into the same directory. Reason:
The interface specification file(s) and the source file(s) implementing that interface belong to the same part of the project. Say you have subsystemx. Then, if you put the subsystemx files in the subsystemx directory, subsystemx is self-contained.
If there are many include files, sure, you could do subsystemx/include and subsystemx/source, but I would argue that if you put the definition of class Foo in foo.hpp and foo.cpp, you certainly want to see both of them together in a directory listing (or at least be able to do so easily). Finding all files related to foo:
ls foo*
Finding all implementation files:
ls *.cpp
Finding all declaration files:
ls *.hpp
Simple and clean.
It keeps your folder structure cleaner. Headers and source files are distinctly different, and are used for different things, so it makes sense to separate them. From this point-of-view the question is basically the same as "why do source files and documentation go in different folders"? The computer is highly agnostic about what you put in folders and what you don't, folders are -- for the most part -- just a handy abstraction because of the way that we humans parse, store, and recall information.
There's also the fact that header files remain useful even after you've built, i.e. if you're building a library and someone wants to use that library, they'll need the header files -- not the source files -- so it makes bundling those header files up -- grabbing the stuff in bin and the stuff in include and not having to sift through src -- much easier.
Besides (arguable?) usefulness for keeping things orderly, useful in other projects etc, there is one very neutral and objective advantage: compile time.
In particular, in a big project with a whole bunch of files, depending on search paths for the headers (.c/.cpp files using #include "headername.h" rather than #include "../../gfx/misc/something/headername.h" and the compiler passed the right parameters to be able to swallow that) you drastically reduce the number of entries that need to be scanned by the compiler in search of the right header. Since most compilers start separately for each file compiled, they need to read in the list of files on the include path and seek the right headers for each compiled file. If there is a bunch of .c, .o and other irrelevant files on the include path, finding the includes among them takes proportionally longer.
In short, a few reasons:
Maintainable code.
Code is well-designed and neat.
Faster compile time (at times, for minor changes done).
Easier segregation of the Interfaces for documentation etc.
Cyclic dependency at compile time can be avoided.
Easy to review.
Have a look at the article Organizing Code Files in C and C++ which explains it well.
I have a C++ repository for a header-only library (built via CMake, although that's not critical). Its structure is roughly:
include/
mylib.hpp
mylib/
foo.hpp
bar.hpp
Now, I know some popular C++ libraries are maintained as single-header files. I don't like dumping everything into a kitchen-sink file; but at the same time I can well appreciate the convenience of being able to utilize a library by just downloading a single file.
So, I was thinking - maybe I can just generate the single header file as part of the installation process?
Supposedly this is a "simple matter of preprocessing"; but - it's not actually quite that simple:
I don't want to fully preprocess the C++ files, just #include directives.
Not all include files are relevant - only the files under a certain source tree.
During actual compilation, the same file is included multiple times (ignoring potential compiler optimizations against doing so), with the second-and-later copies typically removed later using include guards or #pragma once; in my case one would need to watch out and prevent double includes.
So, my question: How do I go about doing this?
Note:
A CMake-based method would be nice, but anything reasonable goes.
You could just write yourself a parser. You give it the starting file, and it replaces every #include with the contents of the included file. If the file to be included was already included, it skips it (similar to how the compiler behaves with include guards).
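A minimal sketch of such a parser, in Python. It assumes all project headers are reachable from a single include root and that includes are written on one line with quotes; the names amalgamate and include_root are made up for illustration. Anything it cannot find under the root (system or third-party headers) is left as an #include line.

```python
import re
from pathlib import Path

# Matches only quoted, single-line includes such as: #include "mylib/foo.hpp"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"')


def amalgamate(header, include_root, seen=None):
    """Return the text of `header` with project-local includes pasted in once."""
    seen = set() if seen is None else seen
    resolved = Path(header).resolve()
    if resolved in seen:
        return ""  # already pasted earlier; behaves like an include guard
    seen.add(resolved)

    out = []
    for line in resolved.read_text().splitlines():
        match = INCLUDE_RE.match(line)
        target = Path(include_root) / match.group(1) if match else None
        if target is not None and target.is_file():
            # Project header: paste its (recursively expanded) contents here.
            out.append(amalgamate(target, include_root, seen))
        else:
            # Ordinary code, or an include we cannot resolve: keep it as-is.
            out.append(line)
    return "\n".join(out)
```

For the layout in the question, something like amalgamate("include/mylib.hpp", "include") would return the single-header text. Include guards and #pragma once lines are copied into the output untouched.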
Building on #uIM7AI9S's suggestion - such a mechanism must surely exist in other libraries which want development to happen with multiple files but still offer a single-include-file convenience. One example is Lyra, a command-line argument parsing library; it uses a Python-based include file fuser/joiner, which you can find here.
I could nitpick at the code of that thing, but - hey, it works and it's FOSS - distributed with the Boost license.
Unfortunately, it seems the Lyra developers pre-generate the single header, and that process is not part of a CMake-based build (despite there being a CMakeLists.txt file in the repository's root)
When I work on my personal C and C++ projects I usually put file.h and file.cpp in the same directory and then file.cpp can reference file.h with a #include "file.h" directive.
However, it is common to find libraries and other kinds of projects (like the Linux kernel and FreeRTOS) where all .h files are placed inside an include/ directory, while .cpp files remain in another directory. In those projects, .h files are also included with #include "file.h" instead of #include "include/file.h" as I would have expected.
I have some questions about all of this:
What are the advantages of this file structure organization?
Why are .h files inside include/ included with #include "file.h" instead of #include "include/file.h"? I know the real trick is inside some Makefile, but is it really better to do it that way instead of making clear (in code) that the file we want to include is actually in the include/ directory?
The main reason to do this is that compiled libraries need headers in order to be consumed by the eventual user. By convention, the contents of the include directory are the headers exposed for public consumption. The source directory may have headers for internal use, but those are not meant to be distributed with the compiled library.
So when using the library, you link to the binary and add the library's include directory to your build system's header paths. Similarly, if you install your compiled library to a centralized location, you can tell which files need to be copied to the central location (the compiled binaries and the include directory) and which files don't (the source directory and so forth).
It used to be that <header>-style includes were of the implicit path type, that is, to be found on the includes environment variable path or a build macro, while "header"-style includes were of the explicit form, as in, exactly relative to wherever the including source file is. While some build tool chains still allow for this distinction, they often default to a configuration that effectively nullifies it.
Your question is interesting because it brings up the question of which really is better, implicit or explicit? The implicit form is certainly easier because:
Convenient groupings of related headers in hierarchies of directories.
You only need include a few directories in the includes path and need not be aware of every detail with regard to exact locations of files. You can change versions of libraries and their related headers without changing code.
DRY.
Flexible! Your build environment doesn't have to match mine, but we can often get nearly exact same results.
Explicit on the other hand has:
Repeatable builds. A reordering of paths in an includes macro/environment variable, doesn't change resulting header files found during the build.
Portable builds. Just package everything from the root of the build and ship it off to another dev.
Proximity of information. You know exactly where the header is with #include "\X\Y\Z". In the implicit form, you may have to go searching along multiple paths and might even find multiple versions of the same file, how do you know which one is used in the build?
Builders have been arguing over these two approaches for many decades, but a hybrid of the two mostly wins out, because of the effort required to maintain builds based purely on the explicit form and the obvious difficulty of familiarizing oneself with code of a purely implicit nature. We all generally understand that our various tool chains put certain common libraries and headers in particular locations, such that they can be shared across users and projects, so we expect to find standard C/C++ headers in one place. But we don't initially know anything about the specific structure of an arbitrary project that lacks a locally well-documented convention, so we expect the code in those projects to be explicit with regard to the non-standard bits that are unique to them and implicit regarding the standard bits.
It is a good practice to always use the <header> form of include for all the standard headers and other libraries that are not project specific and to use the "header" form for everything else. Should you have an include directory in your project for your local includes? That depends to some extent on whether those headers will be shipped as interfaces to your libraries or merely consumed by your code, and also on your preferences. How large and complex is your project? If you have a mix of internal and external interfaces or lots of different components, you might want to group things into separate directories.
Keep in mind that the directory structure your finished product unpacks to need not look anything like the directory structure under which you develop and build that product. If you have only a few .c/.cpp files and headers, it's OK to put them all in one directory, but eventually you're going to work on something non-trivial and will have to think through the consequences of your build environment choices, and hopefully document them for others to understand.
1. A .hpp and a .cpp don't necessarily have a one-to-one relationship; several .cpp files may implement the same .hpp under different conditions (e.g. different environments). Take a multi-platform library with a class that returns the version of the app. The header looks like this:
Utilities.h
#include <string>
class Utilities {
public:
    static std::string getAppVersion();
};
main.cpp
#include <iostream>
#include "Utilities.h"
int main() {
    std::cout << Utilities::getAppVersion() << std::endl;
    return 0;
}
There may be one .cpp for each platform, and the .cpp files may be placed in different locations so that the build for each platform can easily select the right one, e.g.:
.cpp for iOS (path:DemoProject/ios/Utilities.cpp):
#include "Utilities.h"
std::string Utilities::getAppVersion(){
//some objective C code
}
.cpp for Android (path:DemoProject/android/Utilities.cpp):
#include "Utilities.h"
std::string Utilities::getAppVersion(){
//some jni code
}
Of course, the two .cpp files would normally never be compiled into the same build.
2.
#include "file.h"
instead of
#include "include/file.h"
allows you to keep the source code unchanged when your headers are not placed in the "include" folder anymore.
I am currently working on a program with a lot of source files. Sometimes it is difficult to keep track of what libraries I have already #included. Theoretically, I could make a single header file called Headers.h that just contains all the #include statements I need, then make all other header files #include "Headers.h".
Why is this a good/bad idea?
Pros:
Slightly less maintenance, as you don't have to keep track of which of your files are including headers from which libraries or other components.
Cons:
Definitions in included files might conflict with each other. Especially in C where you don't have namespaces (you tagged with C and C++)
Macros in particular can cause hard to debug problems, where a macro definition unexpectedly conflicts with some name in your file or one of the other included files
Depending on which compiler you use, compilation times might blow out. If using a compiler that pre-compiles headers it might actually reduce compilation time, but if not the opposite will happen
You will often unnecessarily trigger rebuilds of files. If you have your build system set up correctly, then each source file will get rebuilt if any of the included files gets modified. If you always include all headers in your project, then a change to any of your headers will force recompilation of all your source files. Not likely to be an issue for system headers but it will be if you include your own headers in the master file as well.
On the whole I would not recommend that approach. The last con listed above is particularly important.
Best practice would be to include only headers that are needed for the code in each file.
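To make the rebuild con concrete, here is a small Python sketch that computes which source files a timestamp-based build would recompile after one header changes. The two include maps are a made-up two-file project, not anything from the question; sources_to_rebuild is likewise just an illustrative name.

```python
def sources_to_rebuild(includes, changed):
    """Return the .cpp files that transitively include the header `changed`.

    `includes` maps each file name to the list of files it includes directly.
    """
    def reaches(name, visiting):
        if name == changed:
            return True
        if name in visiting:
            return False  # already explored (also breaks include cycles)
        visiting.add(name)
        return any(reaches(dep, visiting) for dep in includes.get(name, []))

    return sorted(f for f in includes if f.endswith(".cpp") and reaches(f, set()))


# Hypothetical project: each source includes only the header it needs.
targeted = {"a.cpp": ["a.h"], "b.cpp": ["b.h"]}
# Same project, but every source includes a catch-all Headers.h instead.
catch_all = {
    "a.cpp": ["Headers.h"],
    "b.cpp": ["Headers.h"],
    "Headers.h": ["a.h", "b.h"],
}

print(sources_to_rebuild(targeted, "a.h"))   # ['a.cpp']
print(sources_to_rebuild(catch_all, "a.h"))  # ['a.cpp', 'b.cpp']
```

With targeted includes, touching a.h rebuilds one file; with the catch-all header, it rebuilds every file that includes Headers.h.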
To complement harmic's answer: the main issue is indeed the build system (most build tools work on file timestamps, not on file contents; omake is a notable exception).
Notice that if you only care about many dependencies, GNU make can be used with autodependencies, together with -M* options passed to GCC (i.e. to g++ and actually to the preprocessor).
However, many libraries are offering to their user a single header (e.g. <gtk/gtk.h>)
Also, a single header file is more friendly to precompiled headers technology. In particular, GCC wants a single header for precompilation.
See also ccache.
Tracking all the required includes would be more difficult, as they are abstracted away from the C source files that need them, and the approach does not really support modularisation, plus all the cons from harmic's answer.
I'm starting to write a data processing library of mine and am quite confused about how to build a proper structure for the project and its libraries.
Say, I'd like to have a set of functions stored in myfunclib library. My current set up (taken from multiple recommendations online) looks like this:
myproj/include/myfunclib.h - class declaration
myproj/include/myfunclib.cpp - class functionality
myproj/src/functest.cpp - test file to check functions
Firstly, it feels like this is a proper set up in case I use myfunc only for the myproj project, but say I want to reuse it - then I'd need to specify its path in each of the cpp files using it, or store multiple copies of it.
Secondly, compilation is a bit bulky in such case:
g++ -I include include/myfunclib.cpp src/functest.cpp
Is it a normal practice to type all that stuff every time? What if I have many custom libraries I need? Is there a way to store them all separately, simply include as 'myfunclib.h' and not worry about recompiling etc?
Use a makefile to handle all of your dependencies and build your code. Google the syntax; it's pretty simple. Then you can just say "make" on the command line and it will build everything for you.
here's a good tutorial
http://mrbook.org/tutorials/make/
some things that bit me originally,
remember that templated classes should only be included: what would normally be the source implementation should not be compiled into object files like an ordinary class implementation, so i generally put my whole template implementation in the include directory
i keep include and source files separate, by source files i mean code (definitions) that needs to be compiled into object files for linking, and includes are all the declarations, inline functions, etc it just seems to make more sense to me
sometimes i'll have a header file that includes all relevant headers for a specific module, and in turn perhaps a header file higher up that includes all main headers for modules i am using
also as said in the comments, you need to introduce yourself to some build tools, and get comfortable with them, these will help you track dependencies within your project, and in most cases avoid rebuilding an entire project when only a subset of dependencies have changed (this can be a pain to get right in the beginning but is worthwhile learning, if you use make and g++ there is a way to get this working with g++ -MM ... not sure how well it works for all cases ), i know that the way i organized my projects changed drastically the more i learnt about the build process, and the more complex my projects became (and the more flaws i had to fix )
this is how i generally lay out my project directory structure when starting
build - where all the built files will be stored
app - these are the main apps (can also be split into include/src)
include - includes files
src - src files (compiled into objects and then linked with main compiled app)
lib - any libs (usually 3rdparty libs; if any of my src is compiled into a library it usually ends up in build/lib/target/... )
hope some of this helps
Is there an automated way to take a large amount of C++ header files and combine them in a single one?
This operation must, of course, concatenate the files in the right order so that no types, etc. are defined before they are used in upcoming classes and functions.
Basically, I'm looking for something that allows me to distribute my library in two files (libfoo.h, libfoo.a), instead of the current bunch of include files + the binary library.
As your comment says:
.. I want to make it easier for library users, so they can just do one single #include and have it all.
Then you could just spend some time, including all your headers in a "wrapper" header, in the right order. 50 headers are not that much. Just do something like:
// libfoo.h
#include "header1.h"
#include "header2.h"
// ..
#include "headerN.h"
This will not take that much time, if you do this manually.
Also, adding new headers later - a matter of seconds, to add them in this "wrapper header".
In my opinion, this is the most simple, clean and working solution.
A little bit late, but here it is. I just recently stumbled into this same problem myself and coded this solution: https://github.com/rpvelloso/oneheader
How does it work?
Your project's folder is scanned for C/C++ headers and a list of the headers found is created;
For every header in the list, it analyzes the header's #include directives and assembles a dependency graph in the following way:
If the included header is not located inside the project's folder then it is ignored (e.g., if it is a system header);
If the included header is located inside the project's folder then an edge is created in the dependency graph, linking the included header to the current header being analyzed;
The dependency graph is topologically sorted to determine the correct order to concatenate the headers into a single file. If a cycle is found in the graph, the process is interrupted (i.e., if it is not a DAG);
Limitations:
It currently only detects single line #include directives (e.g., #include );
It does not handle headers with the same name in different paths;
It only gives you a correct order in which to combine all the headers; you still need to concatenate them (maybe you want to remove or modify some of them prior to merging).
Compiling:
g++ -Wall -ggdb -std=c++1y -lstdc++fs oneheader.cpp -o oneheader[.exe]
Usage:
./oneheader[.exe] project_folder/ > file_sequence.txt
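The graph-and-sort procedure described above can be sketched in a few lines of Python using the standard graphlib module (Python 3.9+). This is not the oneheader implementation, just an illustration of the same idea; header_order is a made-up name, and like the tool it keys headers by file name, so it shares the same-name-in-different-paths limitation.

```python
import re
from graphlib import CycleError, TopologicalSorter
from pathlib import Path

# Single-line includes only, in either the "..." or the <...> form.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def header_order(project_dir):
    """Return project header names so each one follows the headers it includes."""
    root = Path(project_dir)
    headers = {p.name: p for p in root.rglob("*") if p.suffix in {".h", ".hpp"}}

    graph = {}
    for name, path in headers.items():
        included = (Path(inc).name for inc in INCLUDE_RE.findall(path.read_text()))
        # Keep only edges to headers inside the project; system headers are ignored.
        graph[name] = {inc for inc in included if inc in headers}

    try:
        # The graph maps each header to its prerequisites, so prerequisites come first.
        return list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        raise ValueError(f"include cycle detected: {err.args[1]}") from err
```

Concatenating the files in the returned order (after stripping the project-local #include lines) would then give the single header.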
(Adapting an answer to my dupe question:)
There are several other libraries which aim for a single-header form of distribution, but are developed using multiple files; and they too need such a mechanism. For some (most?) it is opaque and not part of the distributed code. Luckily, there is at least one exception: Lyra, a command-line argument parsing library; it uses a Python-based include file fuser/joiner script, which you can find here.
The script is not well-documented, but the way you use it is with 3 command-line arguments:
--src-include - The include file to convert, i.e. to merge its include directives into its body. In your case it's libfoo.h which includes the other files.
--dst-include - The output file to write - the result of the merging.
--src-include-dir - The directory relative to which include files are specified (i.e. an "include search path" of one directory; the script doesn't support the complex mechanism of multiple include paths and search priorities which the C++ compiler offers)
The script acts recursively, so if file1.h includes another file under the --src-include-dir, that should be merged in as well.
Now, I could nitpick at the code of that script, but - hey, it works and it's FOSS - distributed with the Boost license.
If your library is so big that you cannot build and maintain a single wrapping header file like Kiril suggested, this may mean that it is not architected well enough.
So if your library is really huge (above a million lines of source code), you might consider automating that, with tools like
GCC make dependency generator preprocessor options like -M -MD -MF etc, with another hand made script sorting them
expensive commercial static analysis tools like coverity
customizing a compiler thru plugins or (for GCC 4.6) MELT extensions
But I don't understand why you want an automated way of doing this. If the library is of reasonable size, you should understand it and be able to write and maintain a wrapping header by hand. Automating that task will take you some effort (probably weeks, not minutes), so it is worthwhile only for very large libraries.
If you have a master include file that includes all others available, you could simply hack a C preprocessor re-implementation in Perl. Process only ""-style includes and recursively paste the contents of these files. Should be a twenty-liner.
If not, you have to write one up yourself or try at random. Automatic dependency tracking in C++ is hard. Like in "let's see if this template instantiation causes an implicit instantiation of the argument class" hard. The only automated way I see is to shuffle your include files into a random order, see if the whole bunch compiles, and re-shuffle them until it compiles. That will take n! time, so you might be better off writing that include file by hand.
While the first variant is easy enough to hack, I doubt the sensibility of this hack, because you want to distribute on a package level (source tarball, deb package, Windows installer) instead of a file level.
You really need a build script to generate this as you work, and a preprocessor flag to disable use of the amalgamate (that could be for your uses).
To simplify this script/program, it helps to have your header structures and include hygiene in top form.
Your program/script will need to know your discovery paths (hint: minimise the count of search paths to one if possible).
Run the script or program (which you create) to replace include directives with header file contents.
Assuming your headers are all guarded as is typical, you can keep track of what files you have already physically included and perform no action if there is another request to include them. If a header is not found, leave it as-is (as an include directive) -- this is required for system/third party headers -- unless you use a separate header for external includes (which is not at all a bad idea).
It's good to have a build phase/translation that includes the header alone and produces zero warnings or errors (warnings as errors).
Alternatively, you can create a special distribution repository so they never need to do more than pull from it occasionally.
What you want to do sounds "javascriptish" to me :-) . But if you insist, there is always "cat" (or the equivalent in Windows):
$ cat file1.h file2.h file3.h > my_big_file.h
Or if you are using gcc, create a file my_decent_lib_header.h with the following contents:
#include "file1.h"
#include "file2.h"
#include "file3.h"
and then use
$ gcc -C -E my_decent_lib_header.h -o my_big_file.h
and this way you even get file/line directives that will refer to the original files (although that can be disabled, if you wish).
As for how automatic is this for your file order, well, it is not at all; you have to decide the order yourself. In fact, I would be surprised to hear that a tool that orders header dependencies correctly in all cases for C/C++ can be built.
Usually you don't want to include every bit of information from all your headers in the special header that enables the potential user to actually use your library. As far as I know, the non-trivial removal of type definitions, further includes, or defines that the user of your interface does not need to know about cannot be automated.
Short answer to your main question:
No.
My suggestions:
manually make a new header, that contains all relevant information (nothing more, nothing less) for the user of your library interface. Add nice documentation comments for each component it contains.
use forward declarations where possible, instead of full-fledged included definitions. Put the actual includes in your implementation files. The less include statements you have in your headers, the better.
don't build a deeply nested hierarchy of includes. This makes it extremely hard to keep an overview of the contents of every bit you include. The user of your library will look into the header to learn how to use it, and will probably not be able to distinguish relevant code from irrelevant code at first sight. You want to maximize the ratio of relevant code to total code in the main header for your library.
EDIT
If you really do have a toolkit library, and the order of inclusion really does not matter, and you have a bunch of independent headers, that you want to enumerate just for convenience into a single header, then you can use a simple script. Like the following Python (untested):
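import glob

OUTPUT = "convenience_header.h"
# Collect headers first and skip the output file so it never includes itself.
headers = sorted(h for h in glob.glob("*.h") if h != OUTPUT)
with open(OUTPUT, "w") as f:
    for header in headers:
        f.write('#include "%s"\n' % header)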