protocol buffers - generate NON-inline accessors - c++

We're using protocol buffers (2.4.1) in a medium size embedded system w/ c# an c++ code. We use protobufs to isolate our managed and native layers w/ an easy to maintain serialization layer (For the curious, we would have just used Pinvoke, but we also have to run native code in a separate process on test/simulators).
Our system has a number of DLL's, and I have the generated native protobuf code in it's own DLL so that the other parts of the system can don't have to link in the generated code directly.
The problem i'm having is that all the generated accessors are inline, eg:
inline const ::MyProtoClassName::MyField& MyProtoClassName::myfield() const
{
return myfield_ != NULL ? *myfield_ : *default_instance_->myfield_;
}
use the generated API in size (the 'default_instance_' is dereferenced and accessed if this particular field not set). This means that I can't link (lnk2001) any clients using the accessors because there is not symbol default_instance_
I think the typical use case for ProtoBufs would be to have each component link in the generated protobuf code itself (after all, this is primarily a serialization layer for distributed systems), but
I'm wondering if there is a compile switch to change the inlining behavior that I missed. (Have all accessors defined in the CC file, not H)
Thanks!
Thanks #g-makulik! Looks like the answer was about 30 lines up in the ProtoC code, i just didn't see it :)
< see his answer below for the bulk of the solution > This should also help you, though.
As noted in some of Kenton's changelogs, adding this causes several warnings (C4251, c4275) relating to base classes that are not also DLLEXPORT'd
with the way ProtoBufs is implemented, and the protobuf classes being all templates, these warnings are benign. To cleanly ignore them (eg WITHOUT having to disable the warnings for all clients) I used this somewhat hacky approach:
-wrapper for the protobuf.h file that eveyone includes. (no one include the real generated H file)
#pragma once
#pragma warning(push)
#pragma warning(disable:4251)
#pragma warning(disable:4275)
// include the protobuf generated code; but exclude the warn c4251, c4275
// these relate to the dll exported
#include "yourProtoFile.h"
#pragma warning(pop)
and a wrapper C file (the real protobuf CC file is not in my project -> not built directly)
#include "MyProtoFile_WRAPPER.h"
#include "MyProtoFile.cc"

I think this link could answer your question: Protobuf with MSVC: how to export generated Message
Quote from Kenon Varga (Project leader of google protobuf):
If you invoke protoc like:
protoc --cpp_out=dllexport_decl=MY_EXPORT_MACRO:path/to/output/dir myproto.proto
then it will generate code with MY_EXPORT_MACRO in all the right places. However, this option is incomplete -- currently there is no way to force the generated .pb.h to #include a header which defines MY_EXPORT_MACRO. I'm open to patches to fix this. Or, you could use a hack to work around it, such as adding the #include via some sort of text processing after protoc finishes, or perhaps moving the .pb.h to .pb2.h and replacing the .pb.h file with one that first includes your header then includes the .pb2.h...
And additonally to cover the mentioned incompleteness (Quote from Aron Bierbaum):
We currently get around this limitation by forcing the the header to
be included using compiler command line flags:
Windows: /FIproject/Config.h
Linux: -include project/Config.h
NOTE:
Using the inline keyword shouldn't be relevant in this case (that an additional __declspec dllexport/__declspec dllimport attribute is put on the accessor method). Per definition it's up to the compiler to inline functions or not. Of course seeing a __declspec dllexport or __declspec dllimportattribute is contradictionary for inlining.

Related

Show #include hierarchy in C++Builder

I'm having trouble with someone else's code, what seems to be header files included out of order. (E.g., I'm getting redefinition errors, some of which are even in the same file!) It would be useful to see the #include tree the C++Builder compiler is using, similar to Visual Studio's -showIncludes flag. Is there any such functionality; if so, how do I access it? I am specifically using C++Builder 2007.
This usually happens if you including multiple times files which contains global constants, variables and sometimes even #defines. This is very common for MDI apps where the master Form contains include of the child Forms and some of them use the same libs ...
The include hierarchy would not help for this unless you are planning to edit all source files #include order which can lead to problems later on (especially compatibility)...
To remedy this you should encapsulate all such files with
#ifndef _file_name_h
#define _file_name_h
// here your source and includes
#endif
statements. Like in this example:
OpenGL drawing a cube
That will prevent multiple definitions and compilations on pre-compiler level as the source will be processed only the first time (while #define _file_name_h is still not defined).
Sadly, there are no Borland C Compiler options for displaying the hierarchy of #included files. See Embarcadero's BCC32 CLI docs.
However, an alternative (granted, not as clean) is to use the Borland C Compiler Preprocessor, e.g.
CPP32 -Sr source.cpp # outputs source.i with comments and indentation retained

Library design: allow user to decide between "header-only" and dynamically linked?

I have created several C++ libraries that currently are header-only. Both the interface and the implementation of my classes are written in the same .hpp file.
I've recently started thinking that this kind of design is not very good:
If the user wants to compile the library and link it dynamically, he/she can't.
Changing a single line of code requires full recompilation of existing projects that depend on the library.
I really enjoy the aspects of header-only libraries though: all functions get potentially inlined and they're very very easy to include in your projects - no need to compile/link anything, just a simple #include directive.
Is it possible to get the best of both worlds? I mean - allowing the user to choose how he/she wants to use the library. It would also speed up development, as I'd work on the library in "dynamically-linking mode" to avoid absurd compilation times, and release my finished products in "header-only mode" to maximize performance.
The first logical step is dividing interface and implementation in .hpp and .inl files.
I'm not sure how to go forward, though. I've seen many libraries prepend LIBRARY_API macros to their function/class declarations - maybe something similar would be needed to allow the user to choose?
All of my library functions are prefixed with the inline keyword, to avoid "multiple definition of..." errors. I assume the keyword would be replaced by a LIBRARY_INLINE macro in the .inl files? The macro would resolve to inline for "header-only mode", and to nothing for the "dynamically-linking mode".
Preliminary note: I am assuming a Windows environment, but this should be easily transferable to other environments.
Your library has to be prepared for four situations:
Used as header-only library
Used as static library
Used as dynamic library (functions are imported)
Built as dynamic library (functions are exported)
So let's make up four preprocessor defines for those cases: INLINE_LIBRARY, STATIC_LIBRARY, IMPORT_LIBRARY, and EXPORT_LIBRARY (it is just an example; you may want to use some sophisticated naming scheme).
The user has to define one of them, depending on what he/she wants.
Then you can write your headers like this:
// foo.hpp
#if defined(INLINE_LIBRARY)
#define LIBRARY_API inline
#elif defined(STATIC_LIBRARY)
#define LIBRARY_API
#elif defined(EXPORT_LIBRARY)
#define LIBRARY_API __declspec(dllexport)
#elif defined(IMPORT_LIBRARY)
#define LIBRARY_API __declspec(dllimport)
#endif
LIBRARY_API void foo();
#ifdef INLINE_LIBRARY
#include "foo.cpp"
#endif
Your implementation file looks just like usual:
// foo.cpp
#include "foo.hpp"
#include <iostream>
void foo()
{
std::cout << "foo";
}
If INLINE_LIBRARY is defined, the functions are declared inline and the implementation gets included like a .inl file.
If STATIC_LIBRARY is defined, the functions are declared without any specifier, and the user has to include the .cpp file into his/her build process.
If IMPORT_LIBRARY is defined, the functions are imported, and there isn't a need for any implementation.
If EXPORT_LIBRARY is defined, the functions are exported and the user has to compile those .cpp files.
Switching between static / import / export is a really common thing, but I'm not sure if adding header-only to the equation is a good thing. Normally, there are good reasons for defining something inline or not to do so.
Personally, I like to put everything into .cpp files unless it really has to be inlined (like templates) or it makes sense performance-wise (very small functions, usually one-liners). This reduces both compile time and - way more important - dependencies.
But if I choose to define something inline, I always put it in separate .inl files, just to keep the header files clean and easy to understand.
It is operating system and compiler specific. On Linux with a very recent GCC compiler (version 4.9) you might produce a static library using interprocedural linktime optimization.
This means that you build your library with g++ -O2 -flto both at compile and at library link time, and that you use your library with g++ -O2 -flto both at compile and link time of the invoking program.
This is to complement #Horstling's answer.
You can either create a static or a dynamic library. When you create statically-linked libraries, compiled code for all functions/objects will be saved to a file (with .lib extension in Windows). At main project (the project using the library) 's link time, these codes will be linked into your final executable together with the main project codes. So the final executable wouldn't have any runtime dependency.
Dynamically linked libraries will be merged into the main project at run time (and not link time). When you compile the library you get a .dll file (which contains actual compiled code) and a .lib file (which contains enough data for the compiler/runtime to find functions/objects in the .dll file). At link time, the executable will be configured to load the .dll and use compiled code from that .dll as needed. You will need to distribute the .dll file with your executable to be able to run it.
There is no need to choose between static or dynamic linking (or header-only) when designing your library, you create multiple project/makefiles, one to create a static .lib, another to create a .lib/.dll pair, and distribute both versions, for the user to choose between. (You'll need to use preprocessor macros like the ones #Horstling suggested).
You cannot put any templates in a pre-compiled library, unless you use a technique called Explicit Instantiation, which limits template parameters.
Also note that modern compiler/linkers usually do not respect the inline modifier. They may inline a function even if it's not designated as inline, or may dynamically call another that has inline modifier, as they see fit. (Regardless, I'll advise explicitly putting inline where applicable for maximum compatibility). So, there won't be any runtime performance penalty if you use a statically linked library instead of a header-only library (and enable compiler/linker optimizations, of course). As others have suggested, for really small functions that are sure to benefit from being called inline, it is best practice to put them in header files, so dynamically linked libraries will also not suffer any significant performance loss. (In any case, inlining functions will only affect performance for functions that are being called very often, inside loops that are going to be called thousands/millions of times).
Instead of putting inline functions in header files (with an #include "foo.cpp" in your header), you can change makefile/project settings and add foo.cpp to the list of source files to be compiled. This way, if you change any function implementation there will be no need to re-compile the whole project and only foo.cpp will be re-compiled. As I mentioned earlier, your small functions will still be inlined by the optimizing compiler, and you don't need to worry about that.
If you use/design a pre-compiled library, you should consider the case where the library is compiled with a different version of compiler to the main project. Each different compiler version (even different configurations, like Debug or Release) uses a different C runtime (things like memcpy, printf, fopen, ...) and C++ standard library runtime (things like std::vector<>, std::string, ...). These different library implementations may complicate linking, or even create runtime errors.
As a general rule, always avoid sharing compiler runtime objects (data structures that are not defined by standards, like FILE*) across libraries, because incompatible data structures will lead to runtime errors.
When linking your project, C/C++ runtime functions must be linked into your library .lib or .lib/.dll, or your executable .exe. C/C++ runtime itself can be linked as static or dynamic library (you can set this in makefile/project settings).
You will find that dynamically linking to C/C++ runtime in both the library and the main project (even when you compile the library itself as a static library) avoids most linking problems (with duplicate function implementations in multiple runtime versions). Of course you would need to distribute runtime DLLs for all used versions with your executable and library.
There are scenarios that statically linking to C/C++ runtime is needed, and the best approach in these cases would be to compile the library with the same compiler setting as the main project to avoid linking problems.
Rationale
Put as little as necessary in header files and as much as possible in library modules, because of the very reasons that you mentioned: compile-time dependency and long compilation time. The only good reasons for header-only modules are:
generic templates for user-defined template parameter;
very short convenience functions when inlining gives significant
performance.
In case 1, it is often possible to hide some functionality that does not depend on the user-defined type in a .cpp file.
Conclusion
If you stick to this rationale, then there is no choice: templated functionality that must allow user-defined types cannot be pre-compiled, but requires a header-only implementation. Other functionality should be hidden from the user in a library to avoid exposing them to the implementation details.
Rather than a dynamic library, you could have a precompiled static library and thin header file. In an interactive quick build, you get the benefit of not having to recompile the world if implementation details changes. But a fully optimized release build can do global optimization and still figure out it can inline functions. Basically, with "link-time code generation" the toolset does the trick you were thinking about.
I'm familiar with Microsoft's compiler, which I know for sure does this as of Visual Studio 2010 (if not earlier).
Templated code will necessarily be header-only: for instantiating this code, the type parameters must be known at compilation time. There is no way to embed template code in shared libraries. Only .NET and Java support JIT instantiation from byte-code.
Re: non-template code, for short one-liners I suggest keeping it header-only. Inline functions give the compiler a lot more opportunities to optimize the final code.
To avoid "insane compilation time", Microsoft Visual C++ has a "precompiled headers" feature. I do not think GCC has a similar feature.
Long functions should not be inlined in any case.
I had one project which had header-only bits, compiled library bits and some bits I could not decide where belonged. I ended up having .inc files, conditionally included in either .hpp or .cxx depending on #ifdef. Truth to be told, the project was always compiled in "max inline" mode, so after a while I got rid of the .inc files and simply moved the contents to .hpp files.
Is it possible to get the best of both worlds?
In terms; limitations arise because tools aren't smart enough. This answer gives the current best effort that is still portable enough to be used effectively.
I've recently started thinking that this kind of design is not very good.
It ought to be. Header-only libraries are ideal because they simplify deployment: makes the reusing mechanism of the language similar to almost all others', which is just the sane thing to do. But this is C++. Current C++ tools still rely on half-a-century-old linking models that remove important degrees of flexibility, such as choosing which entry points to import or export on an individual level without being forced to change the library's original source code. Also, C++ lacks a proper module system and still relies on glorified copy-paste operations to work (although this is just a side factor to the problem in question).
In fact, MSVC is a little better in this regard. It is the only major implementation trying to achieve some degree of modularity in C++ (by attempting e.g. C++ modules). And it is the only compiler that actually allows e.g. the following:
//// Module.c++
#pragma once
inline void Func() { /* ... */ }
//// Program1.c++
#include <Module.c++>
// Inlines or "vague" links Func(), whatever is better.
int main() { Func(); }
//// Program2.c++
// This forces Func() to be imported.
// The declaration must come *BEFORE* the definition.
__declspec(dllimport) __declspec(noinline) void Func();
#include <Module.c++>
int main() { Func(); }
//// Program3.c++
// This forces Func() to be exported.
__declspec(dllexport) __declspec(noinline) void Func();
#include <Module.c++>
Note that this can be used to selectively import and export individual symbols from the library, although still cumbersomely.
GCC also accepts this (but the order of the declarations must be changed) and Clang does not have any way to achieve the same effect without changing the library's source.

Making libraries and avoiding linker issues

I have about a dozen functions all related to the same purpose, and I am trying to create a library. A library that I can store away in a directory somewhere that I can then include as an include directory and just go about my business including said library in my projects that require its use.
Side Note:
First I'd like to say, I have given considerable time into finding an answer to this. I had thought I found a solution to it when I realized the linker would throw errors when I included the library in multiple source files. Now that I know that, I have again searched for an answer to my woes. I started looking at other libraries I know to do the same thing, or that I believe to. I looked at conio.h as I include it all the time for its kbhit() and getch() functions. Though I didn't understand most of what I had in it, I searched a few keywords and found that it may in fact be including the function definitions through a dll.
I've also done a handful of google searches.
To explain what I am doing a little bit. I have create 2 or 3 data structures that allow me to create chunks of data with headers that specify what the data is and how it is to be treated. Then a couple more for reading and writing those chunks to files.
To allow easy creation of these structures I have made stand-alone functions. To allow easy manipulation of these structures I have made even more stand-alone functions.
I, simply put, need a way to include a library that somehow directly or indirectly defines all these functions and structures. How can I do so?
(without creating a myriad of inlines)
You can include a .lib dependency in your code by using the #pragma comment
#pragma comment (lib,"LibraryFileName.lib")
You can read more about #pragma comment here.
I normally like to make a a Linker.h file in my library soultion and include that file in every header found in my library. Here is an example of it.
#ifndef __GRAPHICCOMMUNICATOR_GUARD_linklib__
#define __GRAPHICCOMMUNICATOR_GUARD_linklib__
#if defined(_DEBUG)
#pragma comment (lib,"GraphicCommunicator-mt-d.lib")
#elif !defined(_DEBUG)
#pragma comment (lib,"GraphicCommunicator-mt.lib")
#else
#error link: no suitable library
#endif
#endif // __GRAPHICCOMMUNICATOR_GUARD_linklib__
Putting everything in one header is causing the problem.
You can leave class member funnctions defined inline or other inline function in headers and include them more than once.
Otherwise you need to move implementations into cpp files and build a static (or dynamic) library. If you are learning, static may be easier. Make a new project and select "Static library" from the application settings.
To use the library you could use the #pragma answer from Caeser, or just add the .lib to the linker settings in the projects that wish to use it, along with #including the headers you need.

How to structure a "library" of C++ source?

I'm developing a collection of C++ classes and am struggling with how to share the code in a way that maintains organization without compromising ease of compilation for a user of the collection.
Options that I have seen include:
Distribute compiled library file
Put the source in the header file (with implicit inline as discussed in this answer)
Use symbolic links to allow the compiler to find the files.
I'm currently using the third option where, for each class the I want to include I symbolic link each classess headers and source files (e.g. ln -s <path_to_class folder>/myclass.cpp) This works well except that I can't move the project folder location (it breaks all the symlinks) and I have to have all those symlinked files hanging around.
I like the second option (it has the appearance of Java), but I'm worried about code size bloat if everything is declared inline.
A user of the collection will create a project folder somewhere, and somehow include the collection into their compilation process.
I'd like a few things to be possible:
Easy compilation (something like gcc *.cpp from the project folder)
Easy distribution of library in uncompiled form.
Library organization by module.
Compiled code size is not bloated.
I'm not worried about documentation (Doxygen takes care of that) or compile time: the overall modules are small and even the largest projects on the slowest machines won't take more than a few seconds to compile.
I'm using the GCC compiler, if it makes any difference.
A library is the best option (in my opinion) of the three you raised. Then provide the header file(s) in the include path and the library in the linker path.
Since you also want to distribute the library in source code form, I would be inclined to provide a compressed archive (gzip, 7-zip, tarball, or other preferred format) in a central repository.
If I understand correctly, you do not want users to have to include the .cpp files in their build, but instead just want them to use either: (i) the headers directly, (ii) use a compiled form of the lib.
Your requirements are a bit unusual, but they can be achieved. It seems to me like you could organize your code in the following manner. First, have a global define that dictates whether or not you are compiling the library:
// global.h
// ...
#define LIB_SOURCE
// ...
Then in every header file, you check whether that define is set: if the library is distributed as a static/shared lib, the definitions are not included, otherwise, the '.cpp' file is included from the header file.
// A.h
#ifndef _A_H
#include "global.h"
#ifdef LIB_SOURCE
#include "A.cpp"
#endif
// ...
#endif
where 'A.cpp' would contain the actual implementation.
Again, this is a very strange way of doing things and I would actually advise against such practice. A better way (but one which requires more work) is to always distribute a shared library. But to keep things independent of the compiler, write a C layer around it. This way, you have a portable, maintainable library.
As for some of the other requirements:
Keep the build process simple by providing a Makefile
If you worry about the code size of the compiled library, look into gcc's optimization options (-Os). If you worry about the code size of the library when distributed in source-form in the headers, this is more tricky. Since the (inlined) code will actually be in the headers, the code will obviously grow with each inclusion in a .cpp file by the user.
I ended up using inline headers for all of the code. You can see the library here:
https://github.com/libpropeller/libpropeller/tree/master/libpropeller
The library is structured as:
library folder
class A
classA.h
classA.test.h
class B
classB.h
classB.test.h
class C
...
With this structure I can distribute the library as source, and all the user has to do is include -I/path/to/library in their makefile, and #include "library/classA/classA.h" in their source files.
And, as it turns out, having inline headers actually reduces the code size. I've done a full analysis of this, and it turns out that inline code in the headers allows the compiler to make the final binary roughly 5% smaller.

How should I detect unnecessary #include files in a large C++ project?

I am working on a large C++ project in Visual Studio 2008, and there are a lot of files with unnecessary #include directives. Sometimes the #includes are just artifacts and everything will compile fine with them removed, and in other cases classes could be forward declared and the #include could be moved to the .cpp file. Are there any good tools for detecting both of these cases?
While it won't reveal unneeded include files, Visual studio has a setting /showIncludes (right click on a .cpp file, Properties->C/C++->Advanced) that will output a tree of all included files at compile time. This can help in identifying files that shouldn't need to be included.
You can also take a look at the pimpl idiom to let you get away with fewer header file dependencies to make it easier to see the cruft that you can remove.
PC Lint works quite well for this, and it finds all sorts of other goofy problems for you too. It has command line options that can be used to create External Tools in Visual Studio, but I've found that the Visual Lint addin is easier to work with. Even the free version of Visual Lint helps. But give PC-Lint a shot. Configuring it so it doesn't give you too many warnings takes a bit of time, but you'll be amazed at what it turns up.
There's a new Clang-based tool, include-what-you-use, that aims to do this.
!!DISCLAIMER!! I work on a commercial static analysis tool (not PC Lint). !!DISCLAIMER!!
There are several issues with a simple non parsing approach:
1) Overload Sets:
It's possible that an overloaded function has declarations that come from different files. It might be that removing one header file results in a different overload being chosen rather than a compile error! The result will be a silent change in semantics that may be very difficult to track down afterwards.
2) Template specializations:
Similar to the overload example, if you have partial or explicit specializations for a template you want them all to be visible when the template is used. It might be that specializations for the primary template are in different header files. Removing the header with the specialization will not cause a compile error, but may result in undefined behaviour if that specialization would have been selected. (See: Visibility of template specialization of C++ function)
As pointed out by 'msalters', performing a full analysis of the code also allows for analysis of class usage. By checking how a class is used though a specific path of files, it is possible that the definition of the class (and therefore all of its dependnecies) can be removed completely or at least moved to a level closer to the main source in the include tree.
I don't know of any such tools, and I have thought about writing one in the past, but it turns out that this is a difficult problem to solve.
Say your source file includes a.h and b.h; a.h contains #define USE_FEATURE_X and b.h uses #ifdef USE_FEATURE_X. If #include "a.h" is commented out, your file may still compile, but may not do what you expect. Detecting this programatically is non-trivial.
Whatever tool does this would need to know your build environment as well. If a.h looks like:
#if defined( WINNT )
#define USE_FEATURE_X
#endif
Then USE_FEATURE_X is only defined if WINNT is defined, so the tool would need to know what directives are generated by the compiler itself as well as which ones are specified in the compile command rather than in a header file.
Like Timmermans, I'm not familiar with any tools for this. But I have known programmers who wrote a Perl (or Python) script to try commenting out each include line one at a time and then compile each file.
It appears that now Eric Raymond has a tool for this.
Google's cpplint.py has an "include what you use" rule (among many others), but as far as I can tell, no "include only what you use." Even so, it can be useful.
If you're interested in this topic in general, you might want to check out Lakos' Large Scale C++ Software Design. It's a bit dated, but goes into lots of "physical design" issues like finding the absolute minimum of headers that need to be included. I haven't really seen this sort of thing discussed anywhere else.
Give Include Manager a try. It integrates easily in Visual Studio and visualizes your include paths which helps you to find unnecessary stuff.
Internally it uses Graphviz but there are many more cool features. And although it is a commercial product it has a very low price.
You can build an include graph using C/C++ Include File Dependencies Watcher, and find unneeded includes visually.
If your header files generally start with
#ifndef __SOMEHEADER_H__
#define __SOMEHEADER_H__
// header contents
#endif
(as opposed to using #pragma once) you could change that to:
#ifndef __SOMEHEADER_H__
#define __SOMEHEADER_H__
// header contents
#else
#pragma message("Someheader.h superfluously included")
#endif
And since the compiler outputs the name of the cpp file being compiled, that would let you know at least which cpp file is causing the header to be brought in multiple times.
PC-Lint can indeed do this. One easy way to do this is to configure it to detect just unused include files and ignore all other issues. This is pretty straightforward - to enable just message 766 ("Header file not used in module"), just include the options -w0 +e766 on the command line.
The same approach can also be used with related messages such as 964 ("Header file not directly used in module") and 966 ("Indirectly included header file not used in module").
FWIW I wrote about this in more detail in a blog post last week at http://www.riverblade.co.uk/blog.php?archive=2008_09_01_archive.xml#3575027665614976318.
Adding one or both of the following #defines
will exclude often unnecessary header files and
may substantially improve
compile times especially if the code that is not using Windows API functions.
#define WIN32_LEAN_AND_MEAN
#define VC_EXTRALEAN
See http://support.microsoft.com/kb/166474
If you are looking to remove unnecessary #include files in order to decrease build times, your time and money might be better spent parallelizing your build process using cl.exe /MP, make -j, Xoreax IncrediBuild, distcc/icecream, etc.
Of course, if you already have a parallel build process and you're still trying to speed it up, then by all means clean up your #include directives and remove those unnecessary dependencies.
Start with each include file, and ensure that each include file only includes what is necessary to compile itself. Any include files that are then missing for the C++ files, can be added to the C++ files themselves.
For each include and source file, comment out each include file one at a time and see if it compiles.
It is also a good idea to sort the include files alphabetically, and where this is not possible, add a comment.
If you aren't already, using a precompiled header to include everything that you're not going to change (platform headers, external SDK headers, or static already completed pieces of your project) will make a huge difference in build times.
http://msdn.microsoft.com/en-us/library/szfdksca(VS.71).aspx
Also, although it may be too late for your project, organizing your project into sections and not lumping all local headers to one big main header is a good practice, although it takes a little extra work.
If you would work with Eclipse CDT you could try out http://includator.com to optimize your include structure. However, Includator might not know enough about VC++'s predefined includes and setting up CDT to use VC++ with correct includes is not built into CDT yet.
The latest Jetbrains IDE, CLion, automatically shows (in gray) the includes that are not used in the current file.
It is also possible to have the list of all the unused includes (and also functions, methods, etc...) from the IDE.
Some of the existing answers state that it's hard. That's indeed true, because you need a full compiler to detect the cases in which a forward declaration would be appropriate. You cant parse C++ without knowing what the symbols mean; the grammar is simply too ambiguous for that. You must know whether a certain name names a class (could be forward-declared) or a variable (can't). Also, you need to be namespace-aware.
Maybe a little late, but I once found a WebKit perl script that did just what you wanted. It'll need some adapting I believe (I'm not well versed in perl), but it should do the trick:
http://trac.webkit.org/browser/branches/old/safari-3-2-branch/WebKitTools/Scripts/find-extra-includes
(this is an old branch because trunk doesn't have the file anymore)
If there's a particular header that you think isn't needed anymore (say
string.h), you can comment out that include then put this below all the
includes:
#ifdef _STRING_H_
# error string.h is included indirectly
#endif
Of course your interface headers might use a different #define convention
to record their inclusion in CPP memory. Or no convention, in which case
this approach won't work.
Then rebuild. There are three possibilities:
It builds ok. string.h wasn't compile-critical, and the include for it
can be removed.
The #error trips. string.g was included indirectly somehow
You still don't know if string.h is required. If it is required, you
should directly #include it (see below).
You get some other compilation error. string.h was needed and isn't being
included indirectly, so the include was correct to begin with.
Note that depending on indirect inclusion when your .h or .c directly uses
another .h is almost certainly a bug: you are in effect promising that your
code will only require that header as long as some other header you're using
requires it, which probably isn't what you meant.
The caveats mentioned in other answers about headers that modify behavior
rather that declaring things which cause build failures apply here as well.