Tool that will check for redundant includes in C++ project - c++

I need a tool that will scan my C++ project to see if there are any includes that are not being referenced or are being referenced redundantly. Thanks.

You don't want this. You want to include any header that declares/defines anything used by the cpp file you're writing. If you remove "redundant" headers that are already included by something you're including then when something minor changes you'll be editing files all over the damn place. Just use proper header guards to make sure you don't break the one-definition rule.

As for the tool - hard to imagine how should it work (at least from my limited point of view).
What I do is commenting out each include line and rebuilding the file (only one file - not the complete project). If it stil compiles - the inclusion was not needed.
Should not take too much time.
Off-topic: I can see the need of this procedure before delivering the code to the customer. I would appreciate the deliverables more knowing that somebody has taken care even of that tiny detail. But as a customer, I wouldn't be so picky to insist on it.

Related

Is there a tool to help reduce #include statements [duplicate]

Are there any tools that help organizing the #includes that belong at the top of a .c or .h file?
I was just wondering because I am reorganizing my code, moving various small function definitions/declarations from one long file into different smaller files. Now each of the smaller files needs a subset of the #includes that were at the top of the long file.
It's just annoying and error-prone to figure out all #includes by hand. Often the code compiles even though not all #includes are there. Example: File A uses std::vector extensively but doesn't include vector; but it currently includes some obscure other header which happens to include vector (maybe through some recursive includes).
VisualAssistX can help you jumping to the definition of a type. E.g. if you use a class MyClass in your source, you can click it, choose goto definition, and VisualAssistX opens the include file that contains the definition of this class (possibly Visual Studio can also do this, but at this point am so getting used to VisualAssistX, that I contribute every wonderful feature to VisualAssistX :-)). You can use this to find the include file necessary for your source code.
PC-Lint can do exactly the opposite. If you have an include file in your source that is not being used, PC-Lint can warn you about it, so that you know that the include file can be removed from the source (which will have a positive impact on your compilation time).
makedepend and gccmakedep
I have been dealing with this problem lately.
In our project we use C++ and have one X.h and one X.cpp file for each class X.
My strategy is the following:
(1) If A.h, where class A is declared, refers to a class B, then
I have to include header B.h.
If the declaration of class A contains only the type *B, then I only need a
forward declaration class B; in A.h. I might need to include B.h in A.cpp
(2) Using the above procedure, I move as many includes as possible from A.h to A.cpp. Then I try removing one include at a time and see if the .cpp file still compiles. This should allow to minimize the set of included files in the .cpp file, but I am not 100% sure if the result is minimal. I also think one can write a tool to do this automatically. I had started to write one for Visual Studio. I would be glad to know that there are such tools.
Note: maybe adding a proper module construct with well-defined import / export relations could be a desired addition for C++0x. It would make the task of reorganizing imports much easier and speed up compilation a lot.
Since this question has been asked a new tool has been created: include-what-you-use, it is based on clang, provides mappings to fake the existence of certain symbols (unique_ptr in memory, but actually defined in bits/unique_ptr.h in some standard headers and lets you provide your own mappings.
It provides nice diagnostics and automatic rewriting.
Now each of the smaller files needs a subset of the #includes that were at the top of the long file.
I do this using VisualAssistX. Firstly compile the file to see what is missing. Then you can use VisualAssistX's Add include function. So I just go and right-click on the functions or classes that I know need an include and hit Add Include. Recompile a couple of times to filter out new missing includes and it is done. I hope I wrote that understandably :)
Not perfect, but faster than doing it by hand.
I use the doxygen/dot-graphviz generated graphs to see how files are linked. Very handy, but not automatic, you have to visually check the graphs, then edit you code to remove unnecessary "#include" lines. Of course, not really suited for very-large projects (say > 100 files), where the graphs become unusable.

Include everything, Separate with "using"

I'm developing a C++ library. It got me thinking of the ways Java and C# handle including different components of the libraries. For example, Java uses "import" to allow use of classes from other packages, while C# simply uses "using" to import entire modules.
My questions is, would it be a good idea to #include everything in the library in one massive include and then just use the using directive to import specific classes and modules? Or would this just be down right crazy?
EDIT:
Good responses so far, here are a few mitigating factors which I feel add to this idea:
1) Internal #includes are kept as normal (short and to the point)
2) The file which includes everything is optionally supplied with the library to those who wish to use it3) You could optionally make the big include file part of the pre-compiled header
You're confusing the purpose of #include statements in C++. They do not behave like import statements in Java or using statements in C#. #include does what it says; namely, loads and parses the entire indicated file as part of the current translation unit. The reason for the separate includes is to not have to spend compilation time parsing the entire standard library in every file. In contrast, the statements you're trying to make #include behave like are merely for programmer organization purposes.
#include is for management of the compilation process; not for separating uses. (In fact, you cannot use seperate headers to enforce seperate uses because to do so would violate the one definition rule)
tl;dr -> No, you shouldn't do that. #include as little as possible. When your project becomes large, you'll thank yourself when you're not waiting many hours to compile your project.
I would personally recommend only including the headers when you need them to explicitly show which functionalities your file requires. At the same time, doing so will prevent you from gaining access to functionalities you might no necessarily want, e.g functions unrelated to the goal of the file. Sure, this is no big deal, but I think that it's easier to maintain and change code when you don't have access to unnecessary functions/classes; it just makes it more straightforward.
I might be downvoted for this, but I think you bring up an interesting idea. It would probably slow down compilation a bit, but I think the concept is neat.
As long as you used using sparingly — only for the namespaces you need — other developers would be able to get an idea of what classes were used in a file by glancing at the top. It wouldn't be as granular as seeing a list of #included files, but is seeing a list of included header files really very useful? I don't think so.
Just make sure that all of the header files all use inclusion guards, of course. :)
As said by #Billy ONeal, the main thing is that #include is a preprocessor directive that causes a "^C, ^V" (copy-paste) of code that leads to a compile time increase.
The best considered policy in C++ is to forward declare all possible classes in ".h" files and just include them in the ".cpp" file. It isolates dependencies, as a C/C++ project will be cascadingly rebuilt if a dependent include file is changed.
Of course M$ compilers and its precompiled headers tend to do the opposite, enclosing to what you suggest. But anyone that tried to port code across those compilers is well aware of how smelly it can go.
Some libraries like Qt make extensive use of forward declarations. Take a look on it to see if you like its taste.
I think it will be confusing. When you write C++ you should avoid making it look like Java or C# (or C :-). I for one would really wonder why you did that.
Supplying an include-all file isn't really that helpful either, as a user could easily create one herself, with the parts of the library actually used. Could then be added to a precompiled header, if one is used.

Optimizing Headers

I'm looking for a tool which can do at least one of two things.
Guess what headers might be unused and can be removed.
Guess which headers should be included in a file, but are included indirectly through inclusion of other files. Thus allows proper compilation of the file.
Is there such a tool?
You could use GCC warnings "-Wmissing-declarations" and "-Wredundant-decls". It's not precisely what you want, but might help a lot.
A colleague of mine wrote a very simple script to achieve part of this (and slow too...).
Basically the idea is to try to comment each include in turn and then try to compile the object, it doesn't deal with include within headers but already remove a substantial number of unused files :)
EDIT:
Pseudo code of the algorithm
for s in sourceFiles:
while t := commentNextInclude(s):
if compilationOk(): s := t
As I said, comment each #include in turn, and each time check if the program still compiles, if it does, validate the commenting, and move on to the next one.
I don't have the rights to disclose the script source though.

Overview look of header inclusions

When a project grows it becomes hard to get an overview of header inclusion. I've noticed our object files have grown rather large and so I'm thinking there's a lot to be won by rearranging dependencies. This is where the problem begin, I know of no convenient way to actually get an overview on what headers actually get included for a specific source file. There's the possibility of outputting the pre-processed source files, that however creates huge files with loads of irrelevant information. I'm thinking there must be a tool for this, but I can't seem to find any. I'm on windows, so in case anyone know of a good tool / way to actually do this for windows I'd be eternally grateful.
Visual C++ has the /showIncludes switch, which causes the compiler to output a message when an include is encountered.
This isn't a tool, but lays out some basic principles for good design of header file #includes:
http://www.eventhelix.com/realtimemantra/headerfileincludepatterns.htm
doxygen makes nice #include file dependency graphs too, if you want to see how all your headers fit together.

Detecting superfluous #includes in C/C++?

I often find that the headers section of a file get larger and larger all the time but it never gets smaller. Throughout the life of a source file classes may have moved and been refactored and it's very possible that there are quite a few #includes that don't need to be there and anymore. Leaving them there only prolong the compile time and adds unnecessary compilation dependencies. Trying to figure out which are still needed can be quite tedious.
Is there some kind of tool that can detect superfluous #include directives and suggest which ones I can safely remove?
Does lint do this maybe?
Google's cppclean (links to: download, documentation) can find several categories of C++ problems, and it can now find superfluous #includes.
There's also a Clang-based tool, include-what-you-use, that can do this. include-what-you-use can even suggest forward declarations (so you don't have to #include so much) and optionally clean up your #includes for you.
Current versions of Eclipse CDT also have this functionality built in: going under the Source menu and clicking Organize Includes will alphabetize your #include's, add any headers that Eclipse thinks you're using without directly including them, and comments out any headers that it doesn't think you need. This feature isn't 100% reliable, however.
Also check out include-what-you-use, which solves a similar problem.
It's not automatic, but doxygen will produce dependency diagrams for #included files. You will have to go through them visually, but they can be very useful for getting a picture of what is using what.
The problem with detecting superfluous includes is that it can't be just a type dependency checker. A superfluous include is a file which provides nothing of value to the compilation and does not alter another item which other files depend. There are many ways a header file can alter a compile, say by defining a constant, redefining and/or deleting a used macro, adding a namespace which alters the lookup of a name some way down the line. In order to detect items like the namespace you need much more than a preprocessor, you in fact almost need a full compiler.
Lint is more of a style checker and certainly won't have this full capability.
I think you'll find the only way to detect a superfluous include is to remove, compile and run suites.
I thought that PCLint would do this, but it has been a few years since I've looked at it. You might check it out.
I looked at this blog and the author talked a bit about configuring PCLint to find unused includes. Might be worth a look.
The CScout refactoring browser can detect superfluous include directives in C (unfortunately not C++) code. You can find a description of how it works in this journal article.
Sorry to (re-)post here, people often don't expand comments.
Check my comment to crashmstr, FlexeLint / PC-Lint will do this for you. Informational message 766. Section 11.8.1 of my manual (version 8.0) discusses this.
Also, and this is important, keep iterating until the message goes away. In other words, after removing unused headers, re-run lint, more header files might have become "unneeded" once you remove some unneeded headers. (That might sound silly, read it slowly & parse it, it makes sense.)
I've never found a full-fledged tool that accomplishes what you're asking. The closest thing I've used is IncludeManager, which graphs your header inclusion tree so you can visually spot things like headers included in only one file and circular header inclusions.
You can write a quick script that erases a single #include directive, compiles the projects, and logs the name in the #include and the file it was removed from in the case that no compilation errors occurred.
Let it run during the night, and the next day you will have a 100% correct list of include files you can remove.
Sometimes brute-force just works :-)
edit: and sometimes it doesn't :-). Here's a bit of information from the comments:
Sometimes you can remove two header files separately, but not both together. A solution is to remove the header files during the run and not bring them back. This will find a list of files you can safely remove, although there might a solution with more files to remove which this algorithm won't find. (it's a greedy search over the space of include files to remove. It will only find a local maximum)
There may be subtle changes in behavior if you have some macros redefined differently depending on some #ifdefs. I think these are very rare cases, and the Unit Tests which are part of the build should catch these changes.
I've tried using Flexelint (the unix version of PC-Lint) and had somewhat mixed results. This is likely because I'm working on a very large and knotty code base. I recommend carefully examining each file that is reported as unused.
The main worry is false positives. Multiple includes of the same header are reported as an unneeded header. This is bad since Flexelint does not tell you what line the header is included on or where it was included before.
One of the ways automated tools can get this wrong:
In A.hpp:
class A {
// ...
};
In B.hpp:
#include "A.hpp
class B {
public:
A foo;
};
In C.cpp:
#include "C.hpp"
#include "B.hpp" // <-- Unneeded, but lint reports it as needed
#include "A.hpp" // <-- Needed, but lint reports it as unneeded
If you blindly follow the messages from Flexelint you'll muck up your #include dependencies. There are more pathological cases, but basically you're going to need to inspect the headers yourself for best results.
I highly recommend this article on Physical Structure and C++ from the blog Games from within. They recommend a comprehensive approach to cleaning up the #include mess:
Guidelines
Here’s a distilled set of guidelines from Lakos’ book that minimize the number of physical dependencies between files. I’ve been using them for years and I’ve always been really happy with the results.
Every cpp file includes its own header file first. [snip]
A header file must include all the header files necessary to parse it. [snip]
A header file should have the bare minimum number of header files necessary to parse it. [snip]
If you are using Eclipse CDT you can try http://includator.com which is free for beta testers (at the time of this writing) and automatically removes superfluous #includes or adds missing ones. For those users who have FlexeLint or PC-Lint and are using Elicpse CDT, http://linticator.com might be an option (also free for beta test). While it uses Lint's analysis, it provides quick-fixes for automatically remove the superfluous #include statements.
This article explains a technique of #include removing by using the parsing of Doxygen. That's just a perl script, so it's quite easy to use.
CLion, the C/C++ IDE from JetBrains, detects redundant includes out-of-the-box. These are grayed-out in the editor, but there are also functions to optimise includes in the current file or whole project.
I've found that you pay for this functionality though; CLion takes a while to scan and analyse your project when first loaded.
Here is a simple brute force way of identifying superfluous header includes. It's not perfect but eliminates the "obvious" unnecessary includes. Getting rid of these goes a long way in cleaning up the code.
The scripts can be accessed directly on GitHub.
Maybe a little late, but I once found a WebKit perl script that did just what you wanted. It'll need some adapting I believe (I'm not well versed in perl), but it should do the trick:
http://trac.webkit.org/browser/branches/old/safari-3-2-branch/WebKitTools/Scripts/find-extra-includes
(this is an old branch because trunk doesn't have the file anymore)
There is a free tool Include File Dependencies Watcher which can be integrated in the visual studio. It shows superfluous #includes in red.
There's two types of superfluous #include files:
A header file actually not needed by
the module(.c, .cpp) at all
A header file is need by the module
but being included more than once, directly, or indirectly.
There's 2 ways in my experience that works well to detecting it:
gcc -H or cl.exe /showincludes (resolve problem 2)
In real world,
you can export CFLAGS=-H before make,
if all the Makefile's not override
CFLAGS options. Or as I used, you
can create a cc/g++ wrapper to add -H
options forcibly to each invoke of
$(CC) and $(CXX). and prepend the
wrapper's directory to $PATH
variable, then your make will all
uses you wrapper command instead. Of
course your wrapper should invoke the
real gcc compiler. This tricks
need to change if your Makefile uses
gcc directly. instead of $(CC) or
$(CXX) or by implied rules.
You can also compile a single file by tweaking with the command line. But if you want to clean headers for the whole project. You can capture all the output by:
make clean
make 2>&1 | tee result.txt
PC-Lint/FlexeLint(resolve problem
both 1 and 2)
make sure add the +e766 options, this warning is about:
unused header files.
pclint/flint -vf ...
This will cause pclint output included header files, nested header files will be indented appropriately.
clangd is doing that for you now. Possibly clang-tidy will soon be able to do that as well.
To end this discussion: the c++ preprocessor is turing complete. It is a semantic property, whether an include is superfluous. Hence, it follows from Rice's theorem that it is undecidable whether an include is superfluous or not. There CAN'T be a program, that (always correctly) detects whether an include is superfluous.