Related
I'm developing a collection of C++ classes and am struggling with how to share the code in a way that maintains organization without compromising ease of compilation for a user of the collection.
Options that I have seen include:
Distribute compiled library file
Put the source in the header file (with implicit inline as discussed in this answer)
Use symbolic links to allow the compiler to find the files.
I'm currently using the third option where, for each class the I want to include I symbolic link each classess headers and source files (e.g. ln -s <path_to_class folder>/myclass.cpp) This works well except that I can't move the project folder location (it breaks all the symlinks) and I have to have all those symlinked files hanging around.
I like the second option (it has the appearance of Java), but I'm worried about code size bloat if everything is declared inline.
A user of the collection will create a project folder somewhere, and somehow include the collection into their compilation process.
I'd like a few things to be possible:
Easy compilation (something like gcc *.cpp from the project folder)
Easy distribution of library in uncompiled form.
Library organization by module.
Compiled code size is not bloated.
I'm not worried about documentation (Doxygen takes care of that) or compile time: the overall modules are small and even the largest projects on the slowest machines won't take more than a few seconds to compile.
I'm using the GCC compiler, if it makes any difference.
A library is the best option (in my opinion) of the three you raised. Then provide the header file(s) in the include path and the library in the linker path.
Since you also want to distribute the library in source code form, I would be inclined to provide a compressed archive (gzip, 7-zip, tarball, or other preferred format) in a central repository.
If I understand correctly, you do not want users to have to include the .cpp files in their build, but instead just want them to use either: (i) the headers directly, (ii) use a compiled form of the lib.
Your requirements are a bit unusual, but they can be achieved. It seems to me like you could organize your code in the following manner. First, have a global define that dictates whether or not you are compiling the library:
// global.h
// ...
#define LIB_SOURCE
// ...
Then in every header file, you check whether that define is set: if the library is distributed as a static/shared lib, the definitions are not included, otherwise, the '.cpp' file is included from the header file.
// A.h
#ifndef _A_H
#include "global.h"
#ifdef LIB_SOURCE
#include "A.cpp"
#endif
// ...
#endif
where 'A.cpp' would contain the actual implementation.
Again, this is a very strange way of doing things and I would actually advise against such practice. A better way (but one which requires more work) is to always distribute a shared library. But to keep things independent of the compiler, write a C layer around it. This way, you have a portable, maintainable library.
As for some of the other requirements:
Keep the build process simple by providing a Makefile
If you worry about the code size of the compiled library, look into gcc's optimization options (-Os). If you worry about the code size of the library when distributed in source-form in the headers, this is more tricky. Since the (inlined) code will actually be in the headers, the code will obviously grow with each inclusion in a .cpp file by the user.
I ended up using inline headers for all of the code. You can see the library here:
https://github.com/libpropeller/libpropeller/tree/master/libpropeller
The library is structured as:
library folder
class A
classA.h
classA.test.h
class B
classB.h
classB.test.h
class C
...
With this structure I can distribute the library as source, and all the user has to do is include -I/path/to/library in their makefile, and #include "library/classA/classA.h" in their source files.
And, as it turns out, having inline headers actually reduces the code size. I've done a full analysis of this, and it turns out that inline code in the headers allows the compiler to make the final binary roughly 5% smaller.
how to include certain header files by default so that i don't have to type them in every programs:
In dev c++ and code::blocks
Make a global header file that in turn includes whatever files you need in every project, and then you only have to include that single file.
However I would recommend against it, unless all your different project are very similar. Different projects have different needs and also need different header files.
You could issue a compiler directive in your project file or make script to do "per project" includes, but in general I would avoid that.
Source code should be as clear as possible to any reader just by its content. Whenever I have source code that dramatically changes its semantics, eg. by headers that are unknown to me, this can be quite confusing.
On top of that, if you "inject" those headers for certain compilation units that don't need them, that will negatively impact compile time.
As a substitution, what about introducing a common.h/hpp header that includes those certain header files? You can then include your common header in all files that need them and change this common set of headers for all depending files at once. It also opens the door to use precompiled header files, which may be worth a look for you.
From GCC documentation (AFAIK GCC is default compiler used by the development environment you are citing)
-include file
Process file as if #include "file" appeared as the first line of the primary source file. However, the first directory searched for
file is the preprocessor's working directory instead of the directory
containing the main source file. If not found there, it is searched
for in the remainder of the #include "..." search chain as normal.
If multiple -include options are given, the files are included in the order they appear on the command line.
-imacros file
Exactly like -include, except that any output produced by scanning file is thrown away. Macros it defines remain defined. This allows you
to acquire all the macros from a header without also processing its
declarations.
All files specified by -imacros are processed before all files specified by -include.
But it is usually a bad idea to use these.
Dev c++ works with MingW compiler, which is gcc compiler for Windows. Gcc supports precompiled headers, so you can try that. Precompiled headers are header files that you want compiled and added to every object file in a project. Try searching for that in Google for some information.
Code::blocks supports them too, when used with gcc, even better, so there it may even be easier.
If your editor of choice supports macros, make one that adds your preferred set of include files. Once made, all you have to do is invoke your macro to save yourself the repetitive typing and you're golden.
Hope this helps.
Is there an automated way to take a large amount of C++ header files and combine them in a single one?
This operation must, of course, concatenate the files in the right order so that no types, etc. are defined before they are used in upcoming classes and functions.
Basically, I'm looking for something that allows me to distribute my library in two files (libfoo.h, libfoo.a), instead of the current bunch of include files + the binary library.
As your comment says:
.. I want to make it easier for library users, so they can just do one single #include and have it all.
Then you could just spend some time, including all your headers in a "wrapper" header, in the right order. 50 headers are not that much. Just do something like:
// libfoo.h
#include "header1.h"
#include "header2.h"
// ..
#include "headerN.h"
This will not take that much time, if you do this manually.
Also, adding new headers later - a matter of seconds, to add them in this "wrapper header".
In my opinion, this is the most simple, clean and working solution.
A little bit late, but here it is. I just recently stumbled into this same problem myself and coded this solution: https://github.com/rpvelloso/oneheader
How does it works?
Your project's folder is scanned for C/C++ headers and a list of headers found is created;
For every header in the list it analyzes its #include directives and assemble a dependency graph in the following way:
If the included header is not located inside the project's folder then it is ignored (e.g., if it is a system header);
If the included header is located inside the project's folder then an edge is create in the dependency graph, linking the included header to the current header being analyzed;
The dependency graph is topologically sorted to determine the correct order to concatenate the headers into a single file. If a cycle is found in the graph, the process is interrupted (i.e., if it is not a DAG);
Limitations:
It currently only detects single line #include directives (e.g., #include );
It does not handles headers with the same name in different paths;
It only gives you a correct order to combine all the headers, you still need to concatenate them (maybe you want remove or modify some of them prior to merging).
Compiling:
g++ -Wall -ggdb -std=c++1y -lstdc++fs oneheader.cpp -o oneheader[.exe]
Usage:
./oneheader[.exe] project_folder/ > file_sequence.txt
(Adapting an answer to my dupe question:)
There are several other libraries which aim for a single-header form of distribution, but are developed using multiple files; and they too need such a mechanism. For some (most?) it is opaque and not part of the distributed code. Luckily, there is at least one exception: Lyra, a command-line argument parsing library; it uses a Python-based include file fuser/joiner script, which you can find here.
The script is not well-documented, but they way you use it is with 3 command-line arguments:
--src-include - The include file to convert, i.e. to merge its include directives into its body. In your case it's libfoo.h which includes the other files.
--dst-include - The output file to write - the result of the merging.
--src-include-dir - The directory relative to which include files are specified (i.e. an "include search path" of one directory; the script doesn't support the complex mechanism of multiple include paths and search priorities which the C++ compiler offers)
The script acts recursively, so if file1.h includes another file under the --src-include-dir, that should be merged in as well.
Now, I could nitpick at the code of that script, but - hey, it works and it's FOSS - distributed with the Boost license.
If your library is so big that you cannot build and maintain a single wrapping header file like Kiril suggested, this may mean that it is not architectured well enough.
So if your library is really huge (above a million lines of source code), you might consider automating that, with tools like
GCC make dependency generator preprocessor options like -M -MD -MF etc, with another hand made script sorting them
expensive commercial static analysis tools like coverity
customizing a compiler thru plugins or (for GCC 4.6) MELT extensions
But I don't understand why you want an automated way of doing this. If the library is of reasonable size, you should understand it and be able to write and maintain a wrapping header by hand. Automating that task will take you some efforts (probably weeks, not minutes) so is worthwhile only for very large libraries.
If you have a master include file that includes all others available, you could simply hack a C preprocessor re-implementation in Perl. Process only ""-style includes and recursively paste the contents of these files. Should be a twenty-liner.
If not, you have to write one up yourself or try at random. Automatic dependency tracking in C++ is hard. Like in "let's see if this template instantiation causes an implicit instantiation of the argument class" hard. The only automated way I see is to shuffle your include files into a random order, see if the whole bunch compiles, and re-shuffle them until it compiles. Which will take n! time, you might be better off writing that include file by hand.
While the first variant is easy enough to hack, I doubt the sensibility of this hack, because you want to distribute on a package level (source tarball, deb package, Windows installer) instead of a file level.
You really need a build script to generate this as you work, and a preprocessor flag to disable use of the amalgamate (that could be for your uses).
To simplify this script/program, it helps to have your header structures and include hygiene in top form.
Your program/script will need to know your discovery paths (hint: minimise the count of search paths to one if possible).
Run the script or program (which you create) to replace include directives with header file contents.
Assuming your headers are all guarded as is typical, you can keep track of what files you have already physically included and perform no action if there is another request to include them. If a header is not found, leave it as-is (as an include directive) -- this is required for system/third party headers -- unless you use a separate header for external includes (which is not at all a bad idea).
It's good to have a build phase/translation that includes header alone and produces zero warnings or errors (warnings as errors).
Alternatively, you can create a special distribution repository so they never need to do more than pull from it occasionally.
What you want to do sounds "javascriptish" to me :-) . But if you insist, there is always "cat" (or the equivalent in Windows):
$ cat file1.h file2.h file3.h > my_big_file.h
Or if you are using gcc, create a file my_decent_lib_header.h with the following contents:
#include "file1.h"
#include "file2.h"
#include "file3.h"
and then use
$ gcc -C -E my_decent_lib_header.h -o my_big_file.h
and this way you even get file/line directives that will refer to the original files (although that can be disabled, if you wish).
As for how automatic is this for your file order, well, it is not at all; you have to decide the order yourself. In fact, I would be surprised to hear that a tool that orders header dependencies correctly in all cases for C/C++ can be built.
usually you don't want to include every bit of information from all your headers into the special header that enables the potential user to actually use your library. The non-trivial removal of type definitions, further includes or defines, that are not necessary for the user of your interface to know can not be automatedly done. As far as I know.
Short answer to your main question:
No.
My suggestions:
manually make a new header, that contains all relevant information (nothing more, nothing less) for the user of your library interface. Add nice documentation comments for each component it contains.
use forward declarations where possible, instead of full-fledged included definitions. Put the actual includes in your implementation files. The less include statements you have in your headers, the better.
don't build a deeply nested hierarchy of includes. This makes it extremely hard to keep an overview on the contents of every bit you include. The user of your library will look into the header to learn how to use it. And he will probably not be able to distinguish relevant code from irrelevant on the first sight. You want to maximize the ratio of relevant code per total code in the main header for your library.
EDIT
If you really do have a toolkit library, and the order of inclusion really does not matter, and you have a bunch of independent headers, that you want to enumerate just for convenience into a single header, then you can use a simple script. Like the following Python (untested):
import glob
with open("convenience_header.h", 'w') as f:
for header in glob.glob("*.h"):
f.write("#include \"%s\"\n" % header)
Question: what is the best way to convert a .c/.h based project (which is forcefully compiled as C++ via the makefiles) to a .cpp/.hpp based project?
Obviously, this is a triple-step process. The first would be to rename everything with *.c at the end to *.cpp; the second would be to rename everything with *.h at the end to *.hpp. What I'm getting caught up on is the third step- somehow building a list of what the files /were/ named (ie, myfile.c), then iterating through every single affected file and replacing every instance of the old filename with the new (myfile.c -> myfile.cpp). Obviously this would have to be done so the source files can still find everything that they need.
The source code in question consists of around 2700 individual source files.
The reason why I'm doing this is mostly because I'm porting said software package to Mac OS X, and that involves Xcode. Things are getting bloody messy trying to keep track of precisely what is C, C++, and the associated headers for either (then overriding the compiler for C++ compilation). It would be much simpler if everything C++ was *.cpp (with the associated headers being *.hpp), since then I can just leave Xcode at the default compiler setting as per the file extension and everything should work without any fancy intervention on my end.
I should probably also note that I know precisely what files need to be converted, because they already compile properly and in a sane fashion if I'm overriding Xcode to compile as C++. That's not a problem- my issue is trying to figure out how to batch rename everything then run through all the files and update the #includes.
Thank you in advance!
-Keven Tipping
You don't need to mess with the headers. filename.h is a perfectly good name for a C++ header.
If you're not using the old makefile, but creating a new XCode project, then you have only one step:
Rename *.c to *.cpp
If the makefile was written right (using rule patterns and not specific per-file rules), there shouldn't be any changes needed there either.
There's no reason to rename those C language header and source files to C++ and there are many reasons not to. Just three of the many:
Reason #1: C and C++ are diverging, different languages. Force-compiling a C file as if it were C++ risks introducing a bug.
Reason #2: Xcode can handle C, C++, and C and C++ mixed together.
Reason #3: C++ can easily call C routines. All you need to do is wrap the declarations of those C functions inside an extern "C" { /* C declarations here */ } construct.
I am working on a large C++ project in Visual Studio 2008, and there are a lot of files with unnecessary #include directives. Sometimes the #includes are just artifacts and everything will compile fine with them removed, and in other cases classes could be forward declared and the #include could be moved to the .cpp file. Are there any good tools for detecting both of these cases?
While it won't reveal unneeded include files, Visual studio has a setting /showIncludes (right click on a .cpp file, Properties->C/C++->Advanced) that will output a tree of all included files at compile time. This can help in identifying files that shouldn't need to be included.
You can also take a look at the pimpl idiom to let you get away with fewer header file dependencies to make it easier to see the cruft that you can remove.
PC Lint works quite well for this, and it finds all sorts of other goofy problems for you too. It has command line options that can be used to create External Tools in Visual Studio, but I've found that the Visual Lint addin is easier to work with. Even the free version of Visual Lint helps. But give PC-Lint a shot. Configuring it so it doesn't give you too many warnings takes a bit of time, but you'll be amazed at what it turns up.
There's a new Clang-based tool, include-what-you-use, that aims to do this.
!!DISCLAIMER!! I work on a commercial static analysis tool (not PC Lint). !!DISCLAIMER!!
There are several issues with a simple non parsing approach:
1) Overload Sets:
It's possible that an overloaded function has declarations that come from different files. It might be that removing one header file results in a different overload being chosen rather than a compile error! The result will be a silent change in semantics that may be very difficult to track down afterwards.
2) Template specializations:
Similar to the overload example, if you have partial or explicit specializations for a template you want them all to be visible when the template is used. It might be that specializations for the primary template are in different header files. Removing the header with the specialization will not cause a compile error, but may result in undefined behaviour if that specialization would have been selected. (See: Visibility of template specialization of C++ function)
As pointed out by 'msalters', performing a full analysis of the code also allows for analysis of class usage. By checking how a class is used though a specific path of files, it is possible that the definition of the class (and therefore all of its dependnecies) can be removed completely or at least moved to a level closer to the main source in the include tree.
I don't know of any such tools, and I have thought about writing one in the past, but it turns out that this is a difficult problem to solve.
Say your source file includes a.h and b.h; a.h contains #define USE_FEATURE_X and b.h uses #ifdef USE_FEATURE_X. If #include "a.h" is commented out, your file may still compile, but may not do what you expect. Detecting this programatically is non-trivial.
Whatever tool does this would need to know your build environment as well. If a.h looks like:
#if defined( WINNT )
#define USE_FEATURE_X
#endif
Then USE_FEATURE_X is only defined if WINNT is defined, so the tool would need to know what directives are generated by the compiler itself as well as which ones are specified in the compile command rather than in a header file.
Like Timmermans, I'm not familiar with any tools for this. But I have known programmers who wrote a Perl (or Python) script to try commenting out each include line one at a time and then compile each file.
It appears that now Eric Raymond has a tool for this.
Google's cpplint.py has an "include what you use" rule (among many others), but as far as I can tell, no "include only what you use." Even so, it can be useful.
If you're interested in this topic in general, you might want to check out Lakos' Large Scale C++ Software Design. It's a bit dated, but goes into lots of "physical design" issues like finding the absolute minimum of headers that need to be included. I haven't really seen this sort of thing discussed anywhere else.
Give Include Manager a try. It integrates easily in Visual Studio and visualizes your include paths which helps you to find unnecessary stuff.
Internally it uses Graphviz but there are many more cool features. And although it is a commercial product it has a very low price.
You can build an include graph using C/C++ Include File Dependencies Watcher, and find unneeded includes visually.
If your header files generally start with
#ifndef __SOMEHEADER_H__
#define __SOMEHEADER_H__
// header contents
#endif
(as opposed to using #pragma once) you could change that to:
#ifndef __SOMEHEADER_H__
#define __SOMEHEADER_H__
// header contents
#else
#pragma message("Someheader.h superfluously included")
#endif
And since the compiler outputs the name of the cpp file being compiled, that would let you know at least which cpp file is causing the header to be brought in multiple times.
PC-Lint can indeed do this. One easy way to do this is to configure it to detect just unused include files and ignore all other issues. This is pretty straightforward - to enable just message 766 ("Header file not used in module"), just include the options -w0 +e766 on the command line.
The same approach can also be used with related messages such as 964 ("Header file not directly used in module") and 966 ("Indirectly included header file not used in module").
FWIW I wrote about this in more detail in a blog post last week at http://www.riverblade.co.uk/blog.php?archive=2008_09_01_archive.xml#3575027665614976318.
Adding one or both of the following #defines
will exclude often unnecessary header files and
may substantially improve
compile times especially if the code that is not using Windows API functions.
#define WIN32_LEAN_AND_MEAN
#define VC_EXTRALEAN
See http://support.microsoft.com/kb/166474
If you are looking to remove unnecessary #include files in order to decrease build times, your time and money might be better spent parallelizing your build process using cl.exe /MP, make -j, Xoreax IncrediBuild, distcc/icecream, etc.
Of course, if you already have a parallel build process and you're still trying to speed it up, then by all means clean up your #include directives and remove those unnecessary dependencies.
Start with each include file, and ensure that each include file only includes what is necessary to compile itself. Any include files that are then missing for the C++ files, can be added to the C++ files themselves.
For each include and source file, comment out each include file one at a time and see if it compiles.
It is also a good idea to sort the include files alphabetically, and where this is not possible, add a comment.
If you aren't already, using a precompiled header to include everything that you're not going to change (platform headers, external SDK headers, or static already completed pieces of your project) will make a huge difference in build times.
http://msdn.microsoft.com/en-us/library/szfdksca(VS.71).aspx
Also, although it may be too late for your project, organizing your project into sections and not lumping all local headers to one big main header is a good practice, although it takes a little extra work.
If you would work with Eclipse CDT you could try out http://includator.com to optimize your include structure. However, Includator might not know enough about VC++'s predefined includes and setting up CDT to use VC++ with correct includes is not built into CDT yet.
The latest Jetbrains IDE, CLion, automatically shows (in gray) the includes that are not used in the current file.
It is also possible to have the list of all the unused includes (and also functions, methods, etc...) from the IDE.
Some of the existing answers state that it's hard. That's indeed true, because you need a full compiler to detect the cases in which a forward declaration would be appropriate. You cant parse C++ without knowing what the symbols mean; the grammar is simply too ambiguous for that. You must know whether a certain name names a class (could be forward-declared) or a variable (can't). Also, you need to be namespace-aware.
Maybe a little late, but I once found a WebKit perl script that did just what you wanted. It'll need some adapting I believe (I'm not well versed in perl), but it should do the trick:
http://trac.webkit.org/browser/branches/old/safari-3-2-branch/WebKitTools/Scripts/find-extra-includes
(this is an old branch because trunk doesn't have the file anymore)
If there's a particular header that you think isn't needed anymore (say
string.h), you can comment out that include then put this below all the
includes:
#ifdef _STRING_H_
# error string.h is included indirectly
#endif
Of course your interface headers might use a different #define convention
to record their inclusion in CPP memory. Or no convention, in which case
this approach won't work.
Then rebuild. There are three possibilities:
It builds ok. string.h wasn't compile-critical, and the include for it
can be removed.
The #error trips. string.g was included indirectly somehow
You still don't know if string.h is required. If it is required, you
should directly #include it (see below).
You get some other compilation error. string.h was needed and isn't being
included indirectly, so the include was correct to begin with.
Note that depending on indirect inclusion when your .h or .c directly uses
another .h is almost certainly a bug: you are in effect promising that your
code will only require that header as long as some other header you're using
requires it, which probably isn't what you meant.
The caveats mentioned in other answers about headers that modify behavior
rather that declaring things which cause build failures apply here as well.