#import in header file or implementation file

#import in header file or implementation file - c++

Some have the habit of adding header file imports/includes to the header file. Some on the other hand, write a forward declaration in the header file and write the actual #include or #import lines in the implementation file.
Is there a standard practice for this? Which is better and why?

Given X.h and X.c, if you #include everything from X.h then clients of "X" that #include <X.h> will also include all those headers, even though some may only be needed in X.c.
X.h should include only what's needed to parse X.h. It should assume no other headers will have been included by the translation unit to ensure that reordering inclusions won't break a client. X.c should include any extras needed for implementation.
This minimises recompilation dependencies. You don't want a change only to the implementation to affect the header and hence trigger a client recompilation. You should simply include directly from X.c.

Including forward references instead of headers is necessary when classes have shallow dependencies. For example:
A.h
#include "B.h"
class A {
B* pointer;
};
B.h
#include "A.h"
class B {
A* pointer;
};
Will break at compilation.
A.h
class B;
class A {
B* pointer;
};
B.h
class A;
class B {
A* pointer;
};
Will work as each class only needs to know the other class exists at the declaration.

I write my imports in header files, so that every implementation file has only one inclusion directive. This also has the advantage of hiding dependencies from the user of the module's code.
However, that same hiding has a disadvantage: your module's user may be importing all kinds of other headers included in your header that he may not need at all. From that point of view, it's better to have the inclusion directives in the implementation file, even if it means manually resolving dependencies, because it leads to lighter code.
I don't think there's a single answer. Considering the reasons I gave, I prefer the first approach, I think it leads to cleaner code (albeit heavier and possibly with unnecessary imports).
I don't remember who I'm quoting (and thus the phrase is not exact), but I always remember reading: "programs are written for human beings to read, and ocasionally for computers to execute". I don't particularly care if there are a few kilobytes of code the user of my module won't need, as long as he can cleanly, easily import it and use it with a single directive.
Again, I think it's a matter of taste, unless there's something I failed to consider. Comments are more than welcome in that case!
Cheers.

Related

Header Include Creep in C++

When I started learning C++ I learned that header files should typically be #included in other header files. Just now I had someone tell me that I should include a specific header file in my .cpp file to avoid header include creep.
Could someone tell me what exactly that is, why it is a problem and maybe point me to some documentation on when I would want to include a file in another header and when I should include one in a .cpp file?

The "creep" refers to the inclusion of one header including many others. This has some unwanted consequences:
a tendency towards every source file indirectly including every header, so that a change to any header requires everything to be recompiled;
more likelihood of circular dependencies causing heartache.
You can often avoid including one header from another by just declaring the classes you need, rather than including the header that gives the complete definition. Such an incomplete type can be used in various ways:
// Don't need this
//#include "thingy.h"
// just this
class thingy;
class whatsit {
// You can declare pointers and references
thingy * p;
thingy & r;
// You can declare static member variables (and external globals)
// They will need the header in the source file that defines them
static thingy s;
// You can declare (but not define) functions with incomplete
// parameter or return types
thingy f(thingy);
};
Some things do require the complete definition:
// Do need this
#include "thingy.h"
// Needed for inheritance
class whatsit : thingy {
// Needed for non-static member variables
thingy t;
// Needed to do many things like copying, accessing members, etc
thingy f() {return t;}
};

Could someone tell me what exactly [include creep] is
It's not a programming term, but interpreting it in an Engish-language context would imply that it's the introduction of #include statements that are not necessary.
why it is a problem
Because compiling code takes time. So compiling code that's not necessary takes unnecessary time.
and maybe point me to some documentation on when I would want to include a file in another header and when I should include one in a .cpp file?
If your header requires the definition of a type, you will need to include the header that defines that type.
#include "type.h"
struct TypeHolder
{
Type t;
// ^^^^ this value type requires the definition to know its size.
}
If your header only requires the declaration of a type, the definition is unnecessary, and so is the include.
class Type;
struct TypeHolder
{
Type * t;
// ^^^^^^ this pointer type has the already-known size of a pointer,
// so the definition is not required.
}
As an anecdote that supports the value of this practice, I was once put on a project whose codebase required ONE HOUR to fully compile. And changing a header often incurred most or all of that hour for the next compilation.
Adding forward-declarations to the headers where applicable instead of includes immediately reduced full compilation time to 16 minutes, and changing a header often didn't require a full rebuild.

Well they were probably referring to a couple of things ("include creep" is not a term I've ever heard before). If you, as an extreme rule, only include headers from headers (aside from one matching header per source file):
Compilation time increases as many headers must be compiled for all source files.
Unnecessary dependencies increase, e.g. with a dependency based build system, modifying one header may cause unnecessary recompilation of many other files, increasing compile time.
Increased possibility of namespace pollution.
Circular dependencies become an issue (two headers that need to include each other).
Other more advanced dependency-related topics (for example, this defeats the purpose of opaque pointers, which causes many design issues in certain types of applications).
As a general rule of thumb, include the bare minimum number of things from a header required to compile only the code in that header (that is, enough to allow the header to compile if it is included by itself, but no more than that). This will combat all of the above issues.
For example, if you have a source file for a class that uses std::list in the code, but the class itself has no members or function parameters that use std::list, there's no reason to #include <list> from the class's header, as nothing in the class's header uses std::list. Do it from the source file, where you actually need it, instead of in the header, where everything that uses the class also has to compile <list> unnecessarily.
Another common technique is forward declaring pointer types. This is the only real way to combat circular dependency issues, and has some of the compilation time benefits listed above as well. For example, if two classes refer to each other but only via pointers:
// A.h
class B; // forward declaration instead of #include "B.h"
class A {
B *b;
}
// B.h
class A; // forward declaration instead of #include "A.h"
class B {
A *a;
}
Then include the headers for each in the source files, where the definitions are actually needed.

Your header files should include all headers they need to compile when included alone (or first) in a source file. You may in many cases allow the header to compile by using forward declarations, but you should never rely on someone including a specific header before they include yours.

theres compie time to think of, but there's also dependancies. if A.h includes B.h and viceversa, code won't compile. You can get around this by forward referancing your class, then including the header in the cpp
A.h
class B;
class A
{
public:
void CreateNewB():
private:
B* m_b;
};
A.cpp
#include "B.h"
A::CreateNewB()
{
m_b = new B;
}

C++ Pre-compiled header and included file organization

I have a very large Native C++ project with hundreds of classes each defined in their own .cpp and .h file. Nothing weird here. I have all of the classes declared in my pre-compiled header stdafx.h file and also included in the stdafx.h file as shown below (to allow Foo to reference Bar AND Bar to reference Foo).
Is this organization bad practice?
Essentially, all classes are declared and included in every object. I assume the compiler is not going to generate overhead binary by having things included which are not necessary to compiling the particular .cpp object file. As of now I'm not having any problem with compile time either. I'm just wondering if this is a bad organization for the code.
I like this layout because the stdafx.h file is essenetially a code map since all classes are declared there and it reveals the nested namespaces is an easy to read format. It also means that my individual class .h files don't require several include statements which means less code for me to maintain when making changes to filenames or directories.
stdafx.h:
#pragma once
namespace MyProject
{
class Foo;
class Bar;
}
#include "Foo.h"
#include "Bar.h"
foo.h:
#pragma once
namespace MyProject
{
class Foo
{
// Declare class members, methods, etc
}
}
foo.cpp:
#include "stdafx.h"
namespace MyProject
{
class Foo
{
// Define class members, methods, etc
}
}

In my mind there are 4 key factors here:
Compile Time: Keep in mind that a #include "x.h" embeds all the code in x.h at that line. If the project is going to grow substantially have a care for how you're going to impact future compile times.
Foo and Bar Stability: If Foo and Bar are changing you're going to have to keep recompiling stdafx.h
Circular Definitions: You will no longer be able to break circular definitions with forward declares.
Explicit Dependencies: You won't be able to look at your includes and know what is used in a particular file.
If you feel your pros outweigh the cons listed here, there's nothing illegal about what you're doing so go for it!

A header should include everything it needs, no more, no less.
That's a basic requirement for ease-of-use and simple encapsulation.
Your example explicitly fail that point.
To detect violations, always include the corresponding header first in the translation-unit.
This is slightly modified if you use precompiled headers, in which case they have to be included first.
Pre-compiled headers should only include rarely changing elements.
Any change to the pre-compiled header or its dependencies will force re-compilation of everything, which runs counter to using one at all.
Your example fails that point too.
All in all, much to improve.

Forward declaration / when best to include headers?

I'm pretty clear on when I can/can't use forward declaration but I'm still not sure about one thing.
Let's say I know that I have to include a header sooner or later to de-reference an object of class A.
I'm not clear on whether it's more efficient to do something like..
class A;
class B
{
A* a;
void DoSomethingWithA();
};
and then in the cpp have something like..
#include "A.hpp"
void B::DoSomethingWithA()
{
a->FunctionOfA();
}
Or might I as well just include A's header in B's header file in the first place?
If the former is more efficient then I'd appreciate it if someone clearly explained why as I suspect it has something to do with the compilation process which I could always do with learning more about.

Use forward declarations (as in your example) whenever possible. This reduces compile times, but more importantly minimizes header and library dependencies for code that doesn't need to know and doesn't care for implementation details. In general, no code other than the actual implementation should care about implementation details.
Here is Google's rationale on this: Header File Dependencies

When you use forward declaration, you explicitly say with it "class B doesn't need to know anything about internal implementation of class A, it only needs to know that class named A exists". If you can avoid including that header, then avoid it. - it's good practice to use forward declaration instead because you eliminate redundant dependencies by using it.
Also note, that when you change the header file, it causes all files that include it to be recompiled.
These questions will also help you:
What are the drawbacks of forward declaration?
What is the purpose of forward declaration?

Don't try to make your compilation efficient. There be dragons. Just include A.hpp in B.hpp.
The standard practice for C and C++ header files is to wrap all of the header file in an #ifndef to make sure it is compiled only once:
#ifndef _A_HPP_
#define _A_HPP_
// all your definitions
#endif
That way, if you #include "A.hpp" in B.hpp, you can have a program that includes both, and it won't break because it won't try to define anything twice.

Purpose of #ifndef FILENAME....#endif in header file

I know it to prevent multiple inclusion of header file. But suppose I ensure that I will include this file in only one .cpp file only once. Are there still scenarios in which I would require this safe-guard?

No, that's the only purpose of the include guards, but using them should be a no-brainer: doing it requires little time and potentially saves a lot.

You can guarantee that your code only includes it once, but can you guarantee that anyone's code will include it once?
Furthermore, imagine this:
// a.h
typedef struct { int x; int y; } type1;
// b.h
#include "a.h"
typedef struct { type1 old; int z; } type2;
// main.c
#include "a.h"
#include "b.h"
Oh, no! Our main.c only included each once, but b.h includes a.h, so we got a.h twice, despite our best efforts.
Now imagine this hidden behind three or more layers of #includes and it's a minor internal-use-only header that gets included twice and it's a problem because one of the headers #undefed a macro that it defined but the second header #defined it again and broke some code and it takes a couple hours to figure out why there are conflicting definitions of things.

That's its sole raison d'etre. It's still a good idea even if you think you have that covered; it doesn't slow your code down or anything, and it never hurts to have an extra guard.

The purpose of the guard is to prevent the file from being re included in the same .cpp file more than once. It does not protect against including the file in more than one .cpp file.
If you are sure that a header file isn't included in another header file, then the guard is not required. but it's still good form.
even better form is to use
#pragma once
if your compiler supports it.

Ensuring your code is included only once is the sole purpose of a so-called "header guard".
This can be useful as if there's somewhere a circular dependency between your header files, you don't get caught in an endless loop of including files.

The additional scenario I can think of (and we did it) is to create C++ mocks.
You explicitly in your build define the file GUARD value and then you are able to add your own mock realization via -include my_mock.h as the additional compiler option (we used g++).
my_mock.h
#define THAT_FILE_GUARD
class ThatClass
{
void connect()
{
std::cout << "mock connect" << std::endl;
}
}

Using a header guard like this speeds up the compilation process, imagine having three source files using the header (minus the header guard), that in turn would mean the compiler would have to include the header (parsing and lexing the syntax) multiple times over.
With a header guard, the compiler will say 'Ha! I have seen this one before, and no I will not parse/lex the syntax' thereby speeding up the compilation process.
Hope this helps,
Best regards,
Tom.

#include header style

I have a question regarding "best-practice" when including headers.
Obviously include guards protect us from having multiple includes in a header or source file, so my question is whether you find it beneficial to #include all of the needed headers in a header or source file, even if one of the headers included already contains one of the other includes. The reasoning for this would be so that the reader could see everything needed for the file, rather than hunting through other headers.
Ex: Assume include guards are used:
// Header titled foo.h
#include "blah.h"
//....
.
// Header titled bar.h that needs blah.h and foo.h
#include "foo.h"
#include "blah.h" // Unnecessary, but tells reader that bar needs blah
Also, if a header is not needed in the header file, but is needed in it's related source file, do you put it in the header or the source?

In your example, yes, bar.h should #include blah.h. That way if someone modifies foo so that it doesn't need blah, the change won't break bar.
If blah.h is needed in foo.c but not in foo.h, then it should not be #included in foo.h. Many other files may #include foo.h, and more files may #include them. If you #include blah.h in foo.h, then you make all those files needlessly dependent on blah.h. Needless dependencies cause lots of headaches:
If you modify blah.h, all those files must be recompiled.
If you want to isolate one of them (say, to carry it over to another project or build a unit test around it) you have to take blah.h along.
If there's a bug in one of them, you can't rule out blah.h as the cause until you check.
If you are foolish enough to have something like a macro in blah.h... well, never mind, in that case there's no hope for you.

The basic rule is, #include any headers that you actually use in your code. So, if we're talking:
// foo.h
#include "utilities.h"
using util::foobar;
void func() {
foobar();
}
// bar.h
#include "foo.h"
#include "utilities.h"
using util::florg;
int main() {
florg();
func();
}
Where bar.h uses tools from the header included twice, then you should #include it, even if you don't necessarily have to. On the other hand, if bar.h doesn't need any functions from utilities.h, then even though foo.h includes it, don't #include it.

The header for a source file should define the interface that the users of the code need to use it accurately. It should contain all that they need to use the interface, but nothing extra. If they need the facility provided by xyz.cpp, then all that is required by the user is #include "xyz.h".
How 'xyz.h' provides that functionality is largely up to the implementer of 'xyz.h'. If it requires facilities that can only be specified by including a specific header, then 'xyz.h' should include that other header. If it can avoid including a specific header (by forward definition or any other clean means), it should do so.
In the example, my coding would probably depend on whether the 'foo.h' header was under the control of the same project as the 'blah.h' header. If so, then I probably would not make the explicit second include; if not, I might include it. However, the statements above should be forcing me to say "yes, include 'foo.h' just in case".
In my defense, I believe the C++ standard allows the inclusion of any one of the C++ headers to include others - as required by the implementation; this could be regarded as similar. The problem is that if you include just 'bar.h' and yet use features from 'blah.h', then when 'bar.h' is modified because its code no longer needs 'blah.h', then the user's code that used to compile (by accident) now fails.
However, if the user was accessing 'blah.h' facilities directly, then the user should have included 'blah.h' directly. The revised interface to the code in 'bar.h' does not need 'blah.h' any more, so any code that was using just the interface to 'bar.h' should be fine still. But if the code was using 'blah.h' too, then it should have been including it directly.
I suspect the Law of Demeter also should be considered - or could be viewed as influencing this. Basically, 'bar.h' should include the headers that are needed to make it work, whether directly or indirectly - and the consumers of 'bar.h' should not need to worry much about it.
To answer the last question: clearly, headers needed by the implementation but not needed by the interface should only be included in the implementation source code and absolutely not in the header. What the implementation uses is irrelevant to the user and compilation efficiency and information hiding both demand that the header only expose the minimum necessary information to the users of the header.

Including everything upfront in headers in C++ can cause compile times to explode
Better to encapsulate and forward declare as much as possible. The forward declarations provide enough hints to what is required to use the class. Its quite acceptable to have standard includes in there though (especially templates as they cannot be forward declared).

My comments might not be a direct answer to your question but useful.
IOD/IOP encourages that put less headers in INTERFACE headers as possible, the main benefits to do so:
less dependencies;
smaller link-time symbols scope;
faster compiling;
smaller size of final executables if header contains static C-style function definitions etc.
per IOD/IOP, should interfaces only be put in .h/.hxx headers. include headers in your .c/.cpp instead.

My Rules for header files are:
Rule #1
In the header file only #include class's that are members or base classes of your class.
If your class has pointers or references used forward declarations.
--Plop.h
#include "Base.h"
#include "Stop.h"
#include <string>
class Plat;
class Clone;
class Plop: public Base
{
int x;
Stop stop;
Plat& plat;
Clone* clone;
std::string name;
};
Caviat: If your define members in the header file (example template) then you may need to include Plat and Clone (but only do so if absolutely required).
Rule #2
In the source put header files from most specific to least specific order.
But don't include anything you do not explicitly need too.
So in this case you will inlcude:
Plop.h (The most specific).
Clone.h/Plat.h (Used directly in the class)
C++ header files (Clone and Plat may depend on these)
C header files
The argument here is that if Clone.h needs map (and Plop needs map) and you put the C++ header files closer to the top of the list then you hide the fact that Clone.h needs map thus you may not add it inside Clone.h.
Rule #3
Always use header guards
#ifndef <NAMESPACE1>_<NAMESPACE2>_<CLASSNAME>_H
#define <NAMESPACE1>_<NAMESPACE2>_<CLASSNAME>_H
// Stuff
#endif
PS: I am not suggesting using multiple nested namespaces. I am just demonstrating how I do if I do it. I normal put everything (apart from main) in a namespace. Nesting would depend on situation.
Rule #4
Avoid the using declaration.
Except for the current scope of the class I am working on:
-- Stop.h
#ifndef THORSANVIL_XXXXX_STOP_H
#define THORSANVIL_XXXXX_STOP_H
namespace ThorsAnvil
{
namespace XXXXX
{
class Stop
{
};
} // end namespace XXXX
} // end namespace ThorsAnvil
#endif
-- Stop.cpp
#include "Stop.h"
using namespace ThorsAnvil:XXXXX;

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js