A colleague recently revealed to me that a single source file of ours includes over 3,400 headers during compile time. We have over 1,000 translation units that get compiled in a build, resulting in a huge performance penalty over headers that surely aren't all used.
Are there any static analysis tools that would be able to shed light on the trees in such a forest, specifically giving us the ability to decide which ones we should work on paring out?
UPDATE
Found some interesting information on the cost of including a header file (and the types of include guards to optimize its inclusion) here, originating from this question.
The output of gcc -w -H <file> might be useful (If you parse it and put some counts in) the -w is there to suppress all warnings, which might be awkward to deal with.
From the gcc docs:
-H
Print the name of each header file used, in addition to other normal activities. Each name is indented to show how deep in the
#include stack it is. Precompiled header files are also printed,
even if they are found to be invalid; an invalid precompiled header
file is printed with ...x and a valid one with ...!.
The output looks like this:
. /usr/include/unistd.h
.. /usr/include/features.h
... /usr/include/bits/predefs.h
... /usr/include/sys/cdefs.h
.... /usr/include/bits/wordsize.h
... /usr/include/gnu/stubs.h
.... /usr/include/bits/wordsize.h
.... /usr/include/gnu/stubs-64.h
.. /usr/include/bits/posix_opt.h
.. /usr/include/bits/environments.h
... /usr/include/bits/wordsize.h
.. /usr/include/bits/types.h
... /usr/include/bits/wordsize.h
... /usr/include/bits/typesizes.h
.. /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/include/stddef.h
.. /usr/include/bits/confname.h
.. /usr/include/getopt.h
. /usr/include/stdio.h
.. /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/include/stddef.h
.. /usr/include/libio.h
... /usr/include/_G_config.h
.... /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/include/stddef.h
.... /usr/include/wchar.h
... /usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/include/stdarg.h
.. /usr/include/bits/stdio_lim.h
.. /usr/include/bits/sys_errlist.h
Multiple include guards may be useful for:
/usr/include/bits/confname.h
/usr/include/bits/environments.h
/usr/include/bits/predefs.h
/usr/include/bits/stdio_lim.h
/usr/include/bits/sys_errlist.h
/usr/include/bits/typesizes.h
/usr/include/gnu/stubs-64.h
/usr/include/gnu/stubs.h
/usr/include/wchar.h
If you are using gcc/g++, the -M or -MM option will output a line with the information you seek. (The former will include system headers while the latter will not. There are other variants; see the manual.)
$ gcc -M -c foo.c
foo.o: foo.c /usr/include/stdint.h /usr/include/features.h \
/usr/include/sys/cdefs.h /usr/include/bits/wordsize.h \
/usr/include/gnu/stubs.h /usr/include/gnu/stubs-64.h \
/usr/include/bits/wchar.h
You would need to remove the foo.o: foo.c at the beginning, but the rest is a list of all headers that the file depends on, so it would not be too hard to write a script to gather these and summarize them.
Of course this suggestion is only useful on Unix and only if nobody else has a better idea. :-)
a few things-
use "preprocess only" to look at your preprocessor output. gcc -E option, other compilers have the function too
use precompiled headers.
gcc has -verbose and --trace options which also display the full include tree, MSVC has the /showIncludes option found under Advanced C++ property page
Also, Displaying the #include hierarchy for a C++ file in Visual Studio
"Large Scale C++ Software Design" by John Lakos had tools that extracted the compile-time dependencies among source files.
Unfortunately, their repository on Addison-Wesley's site is gone (along with AW's site itself), but I found a tarball here:
http://prdownloads.sourceforge.net/introspector/LSC-rpkg-0.1.tgz?download
I found it useful several jobs ago, and it has the virtue of being free.
BTW, if you haven't read Lakos's book, it sounds like your project would benefit. (The current edition is a bit dated, but I hear that Lakos has another book coming out in 2012.)
GCC has a -M flag that will output a list of dependencies for a given source file. You could use that information to figure out which of your files have the most dependencies, which files are most depended on, etc.
Check out the man page for more information. There are several variants of -M.
Personally I don't know if there is a tool that will say "Remove this file". It's really a complex matter that depends on a lot of things. Looking at a tree of include statements is surely going to drive you nuts.... It would drive me crazy, as well as ruin my eyes. There are better ways to do things to reduce your compile times.
De-inline your class methods.
After deinlining them, re-examine your include statements and attempt to remove them. Usually helpful to delete them, and start over.
Prefer to use forward declarations are much as possible. If you de-inline methods in your header files you can do this alot.
Break up large header files into smaller files. If a class in a file is used more often than most, then put it in a header file all by itself.
1000 translational units is not very much actually. We have between 10-20 thousand. :)
Get Incredibuild if your compile times are still too long.
I heard there are some tools do it, but I don't use them.
I created some tool https://sourceforge.net/p/headerfinder may be this is useful. Unfortunately it is "HOME MADE" tool with following issues,
Developed in Vb.Net
Source code need to compiled
Very slow and consumes memory.
No help available.
GCC Has a flag (-save-temps) with which you can save intermediate files. This includes .ii files, which are the results of the preprocessor (so before compilation). You can write a script to parse this and determine the weight/cost/size of what is included, as well as the dependency tree.
I wrote a Python script to do just this (publicly available here: https://gitlab.com/p_b_omta/gcc-include-analyzer).
Related
Is there an automated way to take a large amount of C++ header files and combine them in a single one?
This operation must, of course, concatenate the files in the right order so that no types, etc. are defined before they are used in upcoming classes and functions.
Basically, I'm looking for something that allows me to distribute my library in two files (libfoo.h, libfoo.a), instead of the current bunch of include files + the binary library.
As your comment says:
.. I want to make it easier for library users, so they can just do one single #include and have it all.
Then you could just spend some time, including all your headers in a "wrapper" header, in the right order. 50 headers are not that much. Just do something like:
// libfoo.h
#include "header1.h"
#include "header2.h"
// ..
#include "headerN.h"
This will not take that much time, if you do this manually.
Also, adding new headers later - a matter of seconds, to add them in this "wrapper header".
In my opinion, this is the most simple, clean and working solution.
A little bit late, but here it is. I just recently stumbled into this same problem myself and coded this solution: https://github.com/rpvelloso/oneheader
How does it works?
Your project's folder is scanned for C/C++ headers and a list of headers found is created;
For every header in the list it analyzes its #include directives and assemble a dependency graph in the following way:
If the included header is not located inside the project's folder then it is ignored (e.g., if it is a system header);
If the included header is located inside the project's folder then an edge is create in the dependency graph, linking the included header to the current header being analyzed;
The dependency graph is topologically sorted to determine the correct order to concatenate the headers into a single file. If a cycle is found in the graph, the process is interrupted (i.e., if it is not a DAG);
Limitations:
It currently only detects single line #include directives (e.g., #include );
It does not handles headers with the same name in different paths;
It only gives you a correct order to combine all the headers, you still need to concatenate them (maybe you want remove or modify some of them prior to merging).
Compiling:
g++ -Wall -ggdb -std=c++1y -lstdc++fs oneheader.cpp -o oneheader[.exe]
Usage:
./oneheader[.exe] project_folder/ > file_sequence.txt
(Adapting an answer to my dupe question:)
There are several other libraries which aim for a single-header form of distribution, but are developed using multiple files; and they too need such a mechanism. For some (most?) it is opaque and not part of the distributed code. Luckily, there is at least one exception: Lyra, a command-line argument parsing library; it uses a Python-based include file fuser/joiner script, which you can find here.
The script is not well-documented, but they way you use it is with 3 command-line arguments:
--src-include - The include file to convert, i.e. to merge its include directives into its body. In your case it's libfoo.h which includes the other files.
--dst-include - The output file to write - the result of the merging.
--src-include-dir - The directory relative to which include files are specified (i.e. an "include search path" of one directory; the script doesn't support the complex mechanism of multiple include paths and search priorities which the C++ compiler offers)
The script acts recursively, so if file1.h includes another file under the --src-include-dir, that should be merged in as well.
Now, I could nitpick at the code of that script, but - hey, it works and it's FOSS - distributed with the Boost license.
If your library is so big that you cannot build and maintain a single wrapping header file like Kiril suggested, this may mean that it is not architectured well enough.
So if your library is really huge (above a million lines of source code), you might consider automating that, with tools like
GCC make dependency generator preprocessor options like -M -MD -MF etc, with another hand made script sorting them
expensive commercial static analysis tools like coverity
customizing a compiler thru plugins or (for GCC 4.6) MELT extensions
But I don't understand why you want an automated way of doing this. If the library is of reasonable size, you should understand it and be able to write and maintain a wrapping header by hand. Automating that task will take you some efforts (probably weeks, not minutes) so is worthwhile only for very large libraries.
If you have a master include file that includes all others available, you could simply hack a C preprocessor re-implementation in Perl. Process only ""-style includes and recursively paste the contents of these files. Should be a twenty-liner.
If not, you have to write one up yourself or try at random. Automatic dependency tracking in C++ is hard. Like in "let's see if this template instantiation causes an implicit instantiation of the argument class" hard. The only automated way I see is to shuffle your include files into a random order, see if the whole bunch compiles, and re-shuffle them until it compiles. Which will take n! time, you might be better off writing that include file by hand.
While the first variant is easy enough to hack, I doubt the sensibility of this hack, because you want to distribute on a package level (source tarball, deb package, Windows installer) instead of a file level.
You really need a build script to generate this as you work, and a preprocessor flag to disable use of the amalgamate (that could be for your uses).
To simplify this script/program, it helps to have your header structures and include hygiene in top form.
Your program/script will need to know your discovery paths (hint: minimise the count of search paths to one if possible).
Run the script or program (which you create) to replace include directives with header file contents.
Assuming your headers are all guarded as is typical, you can keep track of what files you have already physically included and perform no action if there is another request to include them. If a header is not found, leave it as-is (as an include directive) -- this is required for system/third party headers -- unless you use a separate header for external includes (which is not at all a bad idea).
It's good to have a build phase/translation that includes header alone and produces zero warnings or errors (warnings as errors).
Alternatively, you can create a special distribution repository so they never need to do more than pull from it occasionally.
What you want to do sounds "javascriptish" to me :-) . But if you insist, there is always "cat" (or the equivalent in Windows):
$ cat file1.h file2.h file3.h > my_big_file.h
Or if you are using gcc, create a file my_decent_lib_header.h with the following contents:
#include "file1.h"
#include "file2.h"
#include "file3.h"
and then use
$ gcc -C -E my_decent_lib_header.h -o my_big_file.h
and this way you even get file/line directives that will refer to the original files (although that can be disabled, if you wish).
As for how automatic is this for your file order, well, it is not at all; you have to decide the order yourself. In fact, I would be surprised to hear that a tool that orders header dependencies correctly in all cases for C/C++ can be built.
usually you don't want to include every bit of information from all your headers into the special header that enables the potential user to actually use your library. The non-trivial removal of type definitions, further includes or defines, that are not necessary for the user of your interface to know can not be automatedly done. As far as I know.
Short answer to your main question:
No.
My suggestions:
manually make a new header, that contains all relevant information (nothing more, nothing less) for the user of your library interface. Add nice documentation comments for each component it contains.
use forward declarations where possible, instead of full-fledged included definitions. Put the actual includes in your implementation files. The less include statements you have in your headers, the better.
don't build a deeply nested hierarchy of includes. This makes it extremely hard to keep an overview on the contents of every bit you include. The user of your library will look into the header to learn how to use it. And he will probably not be able to distinguish relevant code from irrelevant on the first sight. You want to maximize the ratio of relevant code per total code in the main header for your library.
EDIT
If you really do have a toolkit library, and the order of inclusion really does not matter, and you have a bunch of independent headers, that you want to enumerate just for convenience into a single header, then you can use a simple script. Like the following Python (untested):
import glob
with open("convenience_header.h", 'w') as f:
for header in glob.glob("*.h"):
f.write("#include \"%s\"\n" % header)
I have a c++ project with multiple source files and multiple header files. I want to submit my project for a programming contest which requires a single source file. Is there an automated way of collapsing all the files into single .cpp file?
For example if I had a.cpp, a.h, b.cpp, b.h etc., I want to get a main.cpp which will compile and run successfully. If I did this manually, could I simply merge the header files and append the source files to each other? Are there gotchas with externs, include dependencies and forward declarations?
I also needed this for a coding contest. Codingame to be precise. So I wrote a quick JavaScript script to do the trick. You can find it here:
https://www.npmjs.com/package/codingame-cpp-merge
I used it in 1 live contest and one offline game and it never produced bad results. Feel free to suggest changes or make pull requests for it on github!
In general, you cannot do this. Whilst you can happily paste the contents of header files to the locations of the corresponding #includes, you cannot, in general, simply concatenate source files. For starters, you may end up with naming clashes between things with file scope. And given that you will have copy-pasted header files (with class definitions, etc.) into each source file, you'll end up with classes defined multiple times.
There are much better solutions. As has been mentioned, why not simply zip up your entire project directory (after you've cleaned out auto-generated object files, etc.)? And if you really must have a single source file, then just write a single source file!
well, this is possiable, I have seen many project combine source files to single .h and .c/.cpp, such as sqlite
but the code must have some limits, such as you should not have static global variable in one of your source codes.
there may not have a generic tool for combine sources.you should write one base on your code.
here is some examples
gaclib source pack tool
The CIL utility is able to do this:
$TIGRESS_HOME/cilly --merge -c x1.c -o x1.o
$TIGRESS_HOME/cilly --merge -c x2.c -o x2.o
$TIGRESS_HOME/cilly --merge -c x3.c -o x4.o
$TIGRESS_HOME/cilly --merge --keepmerged x1.o x2.o x3.o -o merged --mergedout=merged.c
Usage example taken from here. Read the documentation about CIL and its shortcomings here. A binary distribution for Mac OS X and Linux is provided with Tigress.
I think the project with headers, sources files must be must nicer than the one with only one main file. Not only easier to work and read with but also they know you do good job at separating program's modules.
Due to your solution, I provide this format and I think you have to do hand-work:
// STL headers
// --- prototype
// monster.h
// prince.h
// --- implementation
int main() {
// your main function
return 0;
}
I just found an npm package that works perfectly for me, for exactly that porpouse: cpp-merge
I provide the link:
https://www.npmjs.com/package/cpp-merge
To install:
npm install -g cpp-merge
To use:
cpp-merge file.cpp
That will create a merged file with all the files specified in #include "library.cpp", within file.cpp.
Also, Is worth mentioning that it goes to standard output. For directing it to a file, you can do:
cpp-merge --output output.cpp
Checkout the documentation for more details!
I don't know of a tool that combines .cpp files together but I would just zip all of the files up together and send them over as a gzip file.
If you choose to send an individual file rather than a compressed archive, such as a tarball or a zip file, there are probably a few things you should consider.
First, concatenate the files together as Thomas Matthews already mentioned. With a few changes, you can typically compile the one file. Remove the non-existent #include statements, such as the headers that have now been included.
You will also have to concatenate these files in their respective dependency order. That is, if a.cpp needs a class declared in b.hpp, then you will most likely need to concatenate in the order Thomas Matthews listed.
That being said, I think the best way to share code is via a public repository, such as GitHub.com or compressed archive.
I need to submit an assignment, but I only want to include the header files from boost that I actually used (I made use of boost::shared_ptr and boost::function). I tried doing so manually, but I'm missing some header files and everytime i go to add them, it turns out I'm missing more. Is there a quick easy way to find out exactly what headers I actually need?
Thanks
The bcp command is made for this:
NAME
bcp - extract subsets of Boost
SYNOPSIS
bcp --list [options] module-list
bcp [options] module-list output-path
bcp --report [options] module-list html-file
bcp --help
DESCRIPTION
Copies all the files, including dependencies, found in module-list to
output-path. output-path must be an existing path.
But you will probably be surprised to see just how interdependent these Boost headers are.
There is a tool called bcp to do exactly that -- copy out the parts of Boost you need and no more.
There is actually another solution to your issue: the preprocessor.
The compiler you use should have a switch to only run the preprocessor: -E on gcc and clang. Given this, you can preprocess the two files you include, and stash the result of this run into a header file (each) of your own.
Add header guards, include the already preprocessed headers in lieu of the regular boost headers, and you're done.
Of course there might be some repetition between the two headers, a diff tool could potentially help you spotting it and factoring it in another common header... but for an assignment I would certainly not bother.
You might also consider telling your teacher that as he does not ask you to provide the standard library headers you compiled with, he should not be asking for the boost headers you used.
My code is linking against several other libraries that are also developed at my company, one of these libraries is redefining several values from errno.h, I would like to be able to fix this, however I am having trouble finding the exact file that is redefining these values, I am want to know if there is a way to make the compiler tell me when a file has defined a particular value.
You can probably do it by adding -include errno.h to the command line that builds the library in question. Here's a quick example. I have a C program called "file.c":
#define ESRCH 8
That's it - then I compile with:
cc -c -include errno.h file.c
And presto, a compiler warning:
file.c:1:1: warning: "ESRCH" redefined
In file included from /usr/include/errno.h:23,
from <command-line>:0:
/usr/include/sys/errno.h:84:1: warning: this is the location of the previous definition
That will tell you where your bad definitions are.
Have you tried searching with grep?
If you don't want to search through all your headers for the particular #define, you could use
#undef YOUR_MANIFEST_CONSTANT
after each #include in your source module and then start removing them from the bottom up and see where your definitions come from.
Also, your compiler may tell you that a #define has been redefined. Turn all your warnings on.
With GCC I did something similar with:
g++ input.cc -dD -E > cpp.out
-dD tells cpp to print all defines where they were defined. And in the cpp output there are also markers for the include file names and the line numbers.
It is possible that some environments, I'm thinking IDE's here, have configuration options tied into the "project settings" rather than using a configuration header. If you work with a lot of other developers in a place where this behavior is NOT frowned on then you might also check your tool settings.
Most compilers will tell you where the problem is, you have to look and think about what the diagnostic notification is telling you.
Short of that, grep/findstr on *nix/Windows is your friend.
If that yields nothing then check for tool settings in your build system.
Some IDE's will jump to the correct location if you right click on the usage and select 'go to definition'.
Another option if you're really stuck is a command line option on the compiler. Most compilers have an option to output the assembler they generate when compiling C++ code.
You can view this assembler (which has comments letting you know the relative line number in the C++ source file). You don't have to understand the assembler but you can see what value was used and what files and definitions were included when the compiler ran. Check your compiler's documentation for the exact option to use
I am working on a large C++ project in Visual Studio 2008, and there are a lot of files with unnecessary #include directives. Sometimes the #includes are just artifacts and everything will compile fine with them removed, and in other cases classes could be forward declared and the #include could be moved to the .cpp file. Are there any good tools for detecting both of these cases?
While it won't reveal unneeded include files, Visual studio has a setting /showIncludes (right click on a .cpp file, Properties->C/C++->Advanced) that will output a tree of all included files at compile time. This can help in identifying files that shouldn't need to be included.
You can also take a look at the pimpl idiom to let you get away with fewer header file dependencies to make it easier to see the cruft that you can remove.
PC Lint works quite well for this, and it finds all sorts of other goofy problems for you too. It has command line options that can be used to create External Tools in Visual Studio, but I've found that the Visual Lint addin is easier to work with. Even the free version of Visual Lint helps. But give PC-Lint a shot. Configuring it so it doesn't give you too many warnings takes a bit of time, but you'll be amazed at what it turns up.
There's a new Clang-based tool, include-what-you-use, that aims to do this.
!!DISCLAIMER!! I work on a commercial static analysis tool (not PC Lint). !!DISCLAIMER!!
There are several issues with a simple non parsing approach:
1) Overload Sets:
It's possible that an overloaded function has declarations that come from different files. It might be that removing one header file results in a different overload being chosen rather than a compile error! The result will be a silent change in semantics that may be very difficult to track down afterwards.
2) Template specializations:
Similar to the overload example, if you have partial or explicit specializations for a template you want them all to be visible when the template is used. It might be that specializations for the primary template are in different header files. Removing the header with the specialization will not cause a compile error, but may result in undefined behaviour if that specialization would have been selected. (See: Visibility of template specialization of C++ function)
As pointed out by 'msalters', performing a full analysis of the code also allows for analysis of class usage. By checking how a class is used though a specific path of files, it is possible that the definition of the class (and therefore all of its dependnecies) can be removed completely or at least moved to a level closer to the main source in the include tree.
I don't know of any such tools, and I have thought about writing one in the past, but it turns out that this is a difficult problem to solve.
Say your source file includes a.h and b.h; a.h contains #define USE_FEATURE_X and b.h uses #ifdef USE_FEATURE_X. If #include "a.h" is commented out, your file may still compile, but may not do what you expect. Detecting this programatically is non-trivial.
Whatever tool does this would need to know your build environment as well. If a.h looks like:
#if defined( WINNT )
#define USE_FEATURE_X
#endif
Then USE_FEATURE_X is only defined if WINNT is defined, so the tool would need to know what directives are generated by the compiler itself as well as which ones are specified in the compile command rather than in a header file.
Like Timmermans, I'm not familiar with any tools for this. But I have known programmers who wrote a Perl (or Python) script to try commenting out each include line one at a time and then compile each file.
It appears that now Eric Raymond has a tool for this.
Google's cpplint.py has an "include what you use" rule (among many others), but as far as I can tell, no "include only what you use." Even so, it can be useful.
If you're interested in this topic in general, you might want to check out Lakos' Large Scale C++ Software Design. It's a bit dated, but goes into lots of "physical design" issues like finding the absolute minimum of headers that need to be included. I haven't really seen this sort of thing discussed anywhere else.
Give Include Manager a try. It integrates easily in Visual Studio and visualizes your include paths which helps you to find unnecessary stuff.
Internally it uses Graphviz but there are many more cool features. And although it is a commercial product it has a very low price.
You can build an include graph using C/C++ Include File Dependencies Watcher, and find unneeded includes visually.
If your header files generally start with
#ifndef __SOMEHEADER_H__
#define __SOMEHEADER_H__
// header contents
#endif
(as opposed to using #pragma once) you could change that to:
#ifndef __SOMEHEADER_H__
#define __SOMEHEADER_H__
// header contents
#else
#pragma message("Someheader.h superfluously included")
#endif
And since the compiler outputs the name of the cpp file being compiled, that would let you know at least which cpp file is causing the header to be brought in multiple times.
PC-Lint can indeed do this. One easy way to do this is to configure it to detect just unused include files and ignore all other issues. This is pretty straightforward - to enable just message 766 ("Header file not used in module"), just include the options -w0 +e766 on the command line.
The same approach can also be used with related messages such as 964 ("Header file not directly used in module") and 966 ("Indirectly included header file not used in module").
FWIW I wrote about this in more detail in a blog post last week at http://www.riverblade.co.uk/blog.php?archive=2008_09_01_archive.xml#3575027665614976318.
Adding one or both of the following #defines
will exclude often unnecessary header files and
may substantially improve
compile times especially if the code that is not using Windows API functions.
#define WIN32_LEAN_AND_MEAN
#define VC_EXTRALEAN
See http://support.microsoft.com/kb/166474
If you are looking to remove unnecessary #include files in order to decrease build times, your time and money might be better spent parallelizing your build process using cl.exe /MP, make -j, Xoreax IncrediBuild, distcc/icecream, etc.
Of course, if you already have a parallel build process and you're still trying to speed it up, then by all means clean up your #include directives and remove those unnecessary dependencies.
Start with each include file, and ensure that each include file only includes what is necessary to compile itself. Any include files that are then missing for the C++ files, can be added to the C++ files themselves.
For each include and source file, comment out each include file one at a time and see if it compiles.
It is also a good idea to sort the include files alphabetically, and where this is not possible, add a comment.
If you aren't already, using a precompiled header to include everything that you're not going to change (platform headers, external SDK headers, or static already completed pieces of your project) will make a huge difference in build times.
http://msdn.microsoft.com/en-us/library/szfdksca(VS.71).aspx
Also, although it may be too late for your project, organizing your project into sections and not lumping all local headers to one big main header is a good practice, although it takes a little extra work.
If you would work with Eclipse CDT you could try out http://includator.com to optimize your include structure. However, Includator might not know enough about VC++'s predefined includes and setting up CDT to use VC++ with correct includes is not built into CDT yet.
The latest Jetbrains IDE, CLion, automatically shows (in gray) the includes that are not used in the current file.
It is also possible to have the list of all the unused includes (and also functions, methods, etc...) from the IDE.
Some of the existing answers state that it's hard. That's indeed true, because you need a full compiler to detect the cases in which a forward declaration would be appropriate. You cant parse C++ without knowing what the symbols mean; the grammar is simply too ambiguous for that. You must know whether a certain name names a class (could be forward-declared) or a variable (can't). Also, you need to be namespace-aware.
Maybe a little late, but I once found a WebKit perl script that did just what you wanted. It'll need some adapting I believe (I'm not well versed in perl), but it should do the trick:
http://trac.webkit.org/browser/branches/old/safari-3-2-branch/WebKitTools/Scripts/find-extra-includes
(this is an old branch because trunk doesn't have the file anymore)
If there's a particular header that you think isn't needed anymore (say
string.h), you can comment out that include then put this below all the
includes:
#ifdef _STRING_H_
# error string.h is included indirectly
#endif
Of course your interface headers might use a different #define convention
to record their inclusion in CPP memory. Or no convention, in which case
this approach won't work.
Then rebuild. There are three possibilities:
It builds ok. string.h wasn't compile-critical, and the include for it
can be removed.
The #error trips. string.g was included indirectly somehow
You still don't know if string.h is required. If it is required, you
should directly #include it (see below).
You get some other compilation error. string.h was needed and isn't being
included indirectly, so the include was correct to begin with.
Note that depending on indirect inclusion when your .h or .c directly uses
another .h is almost certainly a bug: you are in effect promising that your
code will only require that header as long as some other header you're using
requires it, which probably isn't what you meant.
The caveats mentioned in other answers about headers that modify behavior
rather that declaring things which cause build failures apply here as well.