C++: When is a forward declaration worth it?

I'm new to the C++ world, and I've learned that it is good practice to use forward declarations. But is it always good?
I'm looking for guidelines on when a forward declaration is good practice and when it is not, but I couldn't really find any. Are there objective cases for when it should or shouldn't be used, or is it entirely up to the developer?

There can be cases where you actually start complicating your code base with otherwise unnecessary applications of the PIMPL idiom (which involves forward declarations) to reduce compilation time when in fact you do not have any compilation speed problems in the first place, at least with that particular class or header file.
This can then be considered a case of premature optimisation and a misuse of forward declarations.

As mentioned by juanchopanza, in general you can use forward declarations when you don't need a full class definition - basically when you don't need to know the size or members of the class in question. It allows you to remove compile-time dependencies, which can simplify and speed up compilation.
For example, if you only use pointers to a class, then you don't need to know any more about it. However, if you call a method of that class, or store an actual instance of it, then you need to know the full details of the class.
class A; // forward declaration

class B {
    // ...
    void func1(A*);   // fine: only a pointer to A is used
};

class C {
    // ...
    A m_inst;         // error: the full definition of A is required here
};
In the above, C will not compile, as it needs to know the size of A at the point of declaration. However, func1 is fine, as it only uses a pointer to A, so doesn't need to know size or implementation.
Herb Sutter has a bit of information along with explanations, although be warned that some of it is a little advanced if you're a beginner.

Related

Pertinence of void pointers

Looking through a colleague's code, I see that some of its handles are stored as void pointers.
// Class header
void* hSomeSdk;
// Class implementation
hSomeSdk = new SomeSDK(...);
((SomeSDK*)hSomeSdk)->DoSomeWork();
Now I know that sometimes handles are void pointers because it may be unknown before runtime what the actual type of the handle will be. Or it can help when we need to share the pointer without revealing its actual structure. But this does not seem to be the case in my situation: it will always be SomeSDK and it is not shared outside the class where it is created. Also, the author of this code has since left the company.
Are there other reasons why it would make sense to have it be a void pointer?
Since this is a member variable, I'm gonna go out on a limb and say your colleague wanted to minimize dependencies. Including the header for SomeSDK is probably undesirable just to define a pointer. The colleague may have had one of two reasons as far as I can see from the code you show:
They just didn't know they could add a forward declaration like class SomeSDK; to allow defining pointers. Some programmers simply aren't aware of it.
They couldn't forward declare it. If SomeSDK is not a class but a type alias (a typedef), then it cannot be forward declared as such. One can only forward declare the class it aliases, but that in turn may be an implementation detail that's hard to keep track of. Even the standard library has this problem, which is why it provides <iosfwd> to make forward declaring the standard stream types easier (a minimal sketch follows below).
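A minimal sketch of that second reason, with made-up names: if the library's SomeSDK is a typedef, your own forward declaration has to name the class behind it, which is an implementation detail.

// some_sdk.h (hypothetical library header)
namespace sdk {
    class SomeSDKImpl { /* ... */ };
    typedef SomeSDKImpl SomeSDK;            // SomeSDK is an alias, not a class
}

// your header, trying to avoid including some_sdk.h:
// class sdk::SomeSDK;                      // ill-formed: an alias cannot be forward declared
namespace sdk { class SomeSDKImpl; typedef SomeSDKImpl SomeSDK; }
// legal, but it hard-codes the library's implementation detail (SomeSDKImpl),
// which is why libraries ship forwarding headers such as <iosfwd> instead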
If the code is peppered with casts of this handle, then the design should have been reworked ages ago. Otherwise, if it's in one place (or a few at most) only, I can see why the people maintaining it could live with it peacefully.
Nope.
If I had to guess, the ex-colleague was unfamiliar with forward declarations and thus didn't know they could still do SomeSDK* in the header without including the entire SomeSDK definition.
Given the constraints you've mentioned, the only thing this pattern achieves is to eliminate some type safety, make the code harder to read/maintain, and generate a Stack Overflow question.
void* pointers were popular and often necessary back in C. They are convenient in the sense that they can be converted to and from any object pointer type. In C++, if you want to static_cast a double* to a char*, you have to go through an intermediate cast to void* (or use reinterpret_cast); a short illustration follows.
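A short illustration of that casting rule (just a sketch):

int main() {
    double d = 1.0;
    double* pd = &d;
    // char* pc = static_cast<char*>(pd);                         // error: not allowed directly
    char* via_void = static_cast<char*>(static_cast<void*>(pd));  // OK: through void*
    char* direct   = reinterpret_cast<char*>(pd);                 // OK: reinterpret_cast
    return via_void == direct ? 0 : 1;                            // both point at the same bytes
}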
The problem with void* is that they are too flexible: they do not convey intentions of the writer, making them very unsafe especially in big projects.
In Object Oriented Design it is popular to create abstract interface classes (all members are pure virtual and not implemented), hold pointers to such classes, and then instantiate various possible implementations depending on the usage.
However, nowadays it is more common to recommend templates (a main advantage of C++ over other languages), as they are faster and enable more compile-time optimization than OOD allows. Unfortunately, working with templates is still a huge hassle: the syntax is more complicated, and it is difficult to convey the writer's intentions about the restrictions and requirements of the template parameters (Concepts, which address this problem decently, will be available in C++20; currently there is only SFINAE, a horrible stopgap from 20 years ago. The Reflection TS, which will greatly enhance generic programming in C++, is unlikely to be available even in C++23).
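As a minimal sketch of the abstract-interface approach described above (the names are made up):

#include <cstdio>
#include <memory>

class Codec {                                    // abstract interface: only pure virtuals
public:
    virtual ~Codec() = default;
    virtual void encode(const char* data) = 0;
};

class FastCodec : public Codec {                 // one of several possible implementations
public:
    void encode(const char* data) override { std::printf("fast: %s\n", data); }
};

void run(Codec& codec) {                         // callers depend only on the interface
    codec.encode("payload");
}

int main() {
    std::unique_ptr<Codec> codec(new FastCodec);
    run(*codec);
}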

Why does Google Style Guide discourage forward declaration?

Not to say that the Google Style Guide is the holy bible but as a newbie programmer, it seems like a good reference.
The Google Style Guide lists the following disadvantages of forward declaration
Forward declarations can hide a dependency, allowing user code to skip necessary recompilation when headers change.
A forward declaration may be broken by subsequent changes to the library. Forward declarations of functions and templates can prevent the header owners from making otherwise-compatible changes to their APIs, such as widening a parameter type, adding a template parameter with a default value, or migrating to a new namespace.
Forward declaring symbols from namespace std:: yields undefined behavior.
It can be difficult to determine whether a forward declaration or a full #include is needed. Replacing an #include with a forward declaration can silently change the meaning of code:
Code:
// b.h:
struct B {};
struct D : B {};
// good_user.cc:
#include "b.h"
void f(B*);
void f(void*);
void test(D* x) { f(x); } // calls f(B*)
If the #include were replaced with forward declarations for B and D, test() would call f(void*) (a sketch of that forward-declared variant follows after this list).
Forward declaring multiple symbols from a header can be more verbose than simply #includeing the header.
Structuring code to enable forward declarations (e.g. using pointer members instead of object members) can make the code slower and more complex.
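To make that fourth point concrete, here is a hedged sketch of what the forward-declared variant of good_user.cc could look like; because the compiler no longer sees that D derives from B, the D* argument can only convert to void*.

// bad_user.cc (hypothetical): forward declarations instead of #include "b.h"
struct B;
struct D;
void f(B*);
void f(void*);
void test(D* x) { f(x); } // now calls f(void*): the D -> B conversion is invisible here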
However, some search on SO seemed to suggest that forward declaration is universally a better solution.
So given these seemingly non-trivial disadvantages, can someone explain this discrepancy?
And when is it safe to ignore some or all of these disadvantages?
some search on SO seemed to suggest that forward declaration is universally a better solution.
I don't think that's what SO says. The text you quote is comparing a "guerilla" forward declaration against including the proper include file. I don't think you'll find a lot of support on SO for the approach Google is criticising here. That bad approach is, "no, don't #include include files, just write declarations for the few functions and types you want to use".
The proper include file will still contain forward declarations of its own, and a search on SO will suggest that this is the right thing to do, so I see where you got the idea that SO is in favour of declarations. But Google isn't saying that a library's own include file shouldn't contain forward declarations, it's saying that you shouldn't go rogue and write your own forward declaration for each function or type you want to use.
If you #include the right include file, and your build chain works, then the dependency isn't hidden and the rest of the problems mostly don't apply, despite the fact that the include file contains declarations. There are still some difficulties, but that's not what Google is talking about here.
Looking in particular at forward declarations of types as compared with class definitions for them, (4) gives an example of that going wrong (since a forward declaration of D cannot express that it's derived from B, for that you need the class definition). There's also a technique called "Pimpl" that does make careful use of a forward type declaration for a particular purpose. So again you'll see some support on SO for that, but this isn't the same as supporting the idea that everyone should in general run around forward-declaring classes instead of #includeing their header files.
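For reference, here is a minimal Pimpl sketch (all names are illustrative, not from the question): the header forward declares an Impl struct and stores it behind a pointer, so the implementation details and their #includes stay in the source file.

// widget.h
#include <memory>

class Widget {
public:
    Widget();
    ~Widget();                        // declared here, defined where Impl is complete
    void doWork();
private:
    struct Impl;                      // forward declaration only
    std::unique_ptr<Impl> pImpl;
};

// widget.cc
#include "widget.h"
// heavy or frequently changing headers get included here, not in widget.h

struct Widget::Impl {
    int state = 0;
};

Widget::Widget() : pImpl(new Impl) {}
Widget::~Widget() = default;          // Impl is complete here, so unique_ptr can delete it
void Widget::doWork() { ++pImpl->state; }

The trade-off, as noted elsewhere in this thread, is an extra indirection and heap allocation.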
From Titus Winters' CppCon 2014 talk:
The big one that we've learned lately is: Forward declaring anything with a template in it is a really bad idea. This led to maintenance problems like you would not believe. Forward declaration may be okay, in some cases? My suspicion is the rule is actually going to change to: Library owners are encouraged to provide a header that specifically forward declares the things that they think are worth it (emphasis added), and you probably shouldn't forward declare yourself, and nobody should ever forward declare a templated type. We'll see. We're still working out the [inaudible] details from what we've learned.
So maybe issues with trying to directly forward declare templated types might be one of their motives for discouraging forward declaration wholesale...?
Also, providing "a header that specifically forward declares the things that they think are worth it" sounds similar to the way <iosfwd> is used, as described here (Solution 2.2), here, and here.
EDIT:
To be clear, I wasn't saying I agree or disagree with Google's discouragement of forward declaration. I was just trying to understand their rationale, and made an aside/observation about <iosfwd>.
Personally, I use forward declarations whenever I can, following the guideline stated later in that same GOTW linked above (Solution 3):
Guideline: Never #include a header when a forward declaration will suffice.
but Winters' reasoning seems to have some merit as well. I've worked on code where I forward declared templated types from a third party library, and the syntax does get messy (I haven't yet run into the maintenance problems Winters alluded to). OTOH, I'm not so sure about discouraging all forward declarations as stated in the Google C++ Style Guide, but I guess that's what works for Google?
Disclaimer: I'm no expert, still learning.

What are the drawbacks of forward declaration?

I am wondering if there are any drawbacks to using forward declarations everywhere possible, that is, if my headers contain only declarations.
As far as I understand, using forward declarations speeds up compile time, but I don't know of any drawbacks as such.
Example:
a.h:
class A
{
};
b.h:
// forward declaration; "a.h" is then #included only in the cpp file (e.g., b.cpp)
class A;
class B
{
    void doSomething(A *a);
    A *myA;
};
Or is it better to use
b.h:
#include "a.h"
class B
{
    void doSomething(A *a);
    A *myA;
};
Using forward declarations improves decoupling. If you can avoid including "a.h" by using a forward declaration, it is a good idea to do so. It is better not only because your builds run faster (after all, precompiled headers can handle compiler efficiency pretty well) but because it tells the readers of your declaration that the structure of your class B does not depend on knowing anything about your class A, other than that it exists*.
EDIT (to answer your question) The only downside to forward declarations that I know is that you cannot use them in all situations: for example, a declaration similar to this:
class B
{
    A myA[10];   // error: the compiler must know sizeof(A) here
};
would not compile, because the compiler needs to know the size of A. However, the compiler finds such issues very reliably, and informs you about them in unambiguous terms.
* The implementation of class B could very well depend on knowing the details of class A. However, this dependency becomes an implementation detail of B hidden from the users of your class; you can change it at any time without breaking the code dependent upon class B.
using forward declarations speeds up compile time
This is partially true: the compiler (and preprocessor) does not need to parse the included headers again in every file that includes yours.
The real improvement shows up when you change a header and fewer files need to be recompiled.
Forward declaration is the only way to break the cyclic inclusion.
Forward declaration is the only way to break the cyclic inclusion.
This is the main drawback when not used carefully, I think. I worked in a large project where forward declarations are made whenever possible. Cyclic dependencies were a real problem in the end.
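As a minimal sketch of what breaking such a cycle looks like (the class names are made up): if Node refers to Graph and Graph stores Node objects, at least one of the two headers has to settle for a forward declaration.

// node.h
#ifndef NODE_H
#define NODE_H

class Graph;                  // forward declaration: node.h must not include graph.h

class Node {
public:
    explicit Node(Graph* owner) : m_owner(owner) {}
private:
    Graph* m_owner;           // pointer only, so an incomplete type is enough
};
#endif

// graph.h
#ifndef GRAPH_H
#define GRAPH_H
#include <vector>
#include "node.h"             // Graph stores Node by value, so it needs the full definition

class Graph {
public:
    std::vector<Node> nodes;
};
#endif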
I'll speak in practical terms. Pros:
Avoids circular compiler dependencies. The way you wrote the code above would not even compile otherwise unless you put A and B in the same header.
It avoids compile-time dependencies. You're allowed to change a.h without recompiling units that include b.h. For the same reason, it speeds up builds in general. To find out more on this subject, I recommend looking up the Pimpl idiom.
Cons:
Applied as heavily as you have above, your source files will probably need to include more headers (we cannot instantiate or work with A simply by including b.h). To me, that's a worthwhile exchange for faster builds.
The biggest con is that it can come with some runtime overhead, depending on what you are doing. In the example you gave, B cannot directly store A by value. It involves a level of indirection, which may also imply an extra heap allocation/deallocation if B manages the memory of A (the same is true of a pimpl). Whether this overhead is trivial or not is where you have to draw the line, and it's worth remembering that maintainability and developer productivity are definitely more important than a micro-optimization that won't even be noticeable to the user. I wouldn't use this as a reason to rule out the practice unless it is definitely proving to be a bottleneck, or you know well in advance that the cost of a heap allocation/deallocation or pointer indirection will be non-trivial.
The only drawback that comes to mind is that forward declarations require pointers (or references). Pointers may be left uninitialized and can therefore cause a null-pointer dereference. The coding standard I currently use requires a null check for every pointer, which can add a lot of code. I started to get around this with Design by Contract invariants; then I can assert that anything initialized in the constructor is never null.
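A small sketch of that approach, with assumed names: the forward declaration forces a pointer member, and the null check is asserted once as a constructor invariant rather than at every use.

#include <cassert>

class Engine;                         // forward declaration: only Engine* can be stored

class Car {
public:
    explicit Car(Engine* engine) : m_engine(engine) {
        assert(m_engine != nullptr);  // Design by Contract: invariant established here
    }
private:
    Engine* m_engine;                 // forward declaration forces a pointer member
};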

Class member function declaration doubt

I'm reading a C++ tutorial and I've run into this sentence:
The only difference between defining a class member function completely within its class or to include only the prototype and later its definition, is that in the first case the function will automatically be considered an inline member function by the compiler, while in the second it will be a normal (not-inline) class member function, which in fact supposes no difference in behavior.
I know what an inline function is, my doubt is about which style to choose. Should I define every function inside its class or outside? Perhaps the simplest functions inside and the other outside?
I fear that defining every function inside a class (i.e. having complex inline functions) might mess up the resulting code and introduce debugging problems or weird behaviors during execution. And, finally, there's the "coding style" issue. So,
which approach is better?
Thank you :)
My style: I sometimes put extremely short (one- or two-liner) functions in the class itself. Anything longer that I still want inlined goes as an inline-qualified implementation after the class definition, often in a separate file that the header #includes at the end of the class definition.
The rationale for putting an inlined function outside the class is that the implementation of some function usually just gets in the way of a human reader's overall understanding of the class. A twenty line function can usually be summarized in a one line comment -- and that comment is all that is needed when you are reading the class definition. If you need more, go to the function definition, or better yet, Read The Fine Documentation. (Expecting someone to Read The F*** Code is a poor substitute for Fine Documentation.)
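A brief sketch of that layout (the class and file names are just illustrative):

// point.h
#include <cmath>

class Point {
public:
    int x() const { return m_x; }          // one-liners: defined inside the class
    int y() const { return m_y; }
    double length() const;                 // longer function: only declared here
private:
    int m_x = 0, m_y = 0;
};

// inline-qualified definition after the class (or in a separate file that the
// header #includes at the end of the class definition):
inline double Point::length() const {
    return std::sqrt(double(m_x) * m_x + double(m_y) * m_y);
}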
The best solution is to separate interface and implementation. Interface is your h-file. Put only prototypes there. Implementation goes to cpp-file. This approach has the following advantages:
Compilation goes faster because there is no need to compile the function bodies several times.
Header dependencies are simpler because there is no need to include every header in the h-file. Some headers are needed only in the cpp-file, and you can use forward declarations in the h-file. You can also avoid circular dependencies.
Last but not least, it is easier for human beings to understand the interface of your class: there is no code clutter.
To answer the part of "Which approach is better?" - From C++ FAQ -
There are no simple answers: You have to play with it to see what is best. Do not settle for simplistic answers like, "Never use inline functions" or "Always use inline functions" or "Use inline functions if and only if the function is less than N lines of code." These one-size-fits-all rules may be easy to write down, but they will produce sub-optimal results.
Neither approach is better per se; it's a matter of preference and style. Personally, I always think that defining the functions explicitly in a separate .inline file is the best way. This way you are very explicit about what you do, and you keep the header file clean.
Furthermore if you use a macro such as INLINE which is defined as follows:
#ifdef DEBUG
#define INLINE
#else
#define INLINE inline
#endif
You can then include the inline file from the header in release builds and from the cpp in debug builds. This means the functions are not inlined in debug, so you won't have any difficulties when debugging. Admittedly, this isn't much of a problem for compilers these days, so you may want to skip it unless you are using an old compiler.
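Here is a hedged sketch of how the pieces described above could be wired together; the file names and the DEBUG macro are assumptions, not a fixed convention.

// point.h
#ifdef DEBUG
#define INLINE
#else
#define INLINE inline
#endif

class Point {
public:
    INLINE double length() const;
private:
    double m_x = 0, m_y = 0;
};

#ifndef DEBUG
#include "point.inl"      // release: definitions are inline in every translation unit
#endif

// point.inl
#include <cmath>
INLINE double Point::length() const { return std::sqrt(m_x * m_x + m_y * m_y); }

// point.cpp
#include "point.h"
#ifdef DEBUG
#include "point.inl"      // debug: definitions compiled once as ordinary functions
#endif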
Generally speaking, a member function which has only one or two statements is probably best with its body written in the class declaration—especially if there are many of them. A member function with more than 20-50 statements is probably best not in the class declaration. For lengths and complexities between, it depends on many factors.
For example, having the function body in a class module helps prevent unnecessary recompiling of dependent modules when the class declaration does not change—only a member function body. This can greatly increase productivity when developing the class. Once the class is stable, this becomes much less important.

Internal typedefs in C++ - good style or bad style?

Something I have found myself doing often lately is declaring typedefs relevant to a particular class inside that class, i.e.
class Lorem
{
public:
    typedef boost::shared_ptr<Lorem> ptr;
    typedef std::vector<Lorem::ptr> vector;

    // ...
};
These types are then used elsewhere in the code:
Lorem::vector lorems;
Lorem::ptr lorem( new Lorem() );
lorems.push_back( lorem );
Reasons I like it:
It reduces the noise introduced by the class templates: std::vector<Lorem::ptr> becomes Lorem::vector, etc.
It serves as a statement of intent - in the example above, the Lorem class is intended to be reference counted via boost::shared_ptr and stored in a vector.
It allows the implementation to change - i.e. if Lorem needed to be changed to be intrusively reference counted (via boost::intrusive_ptr) at a later stage then this would have minimal impact to the code.
I think it looks 'prettier' and is arguably easier to read.
Reasons I don't like it:
There are sometimes issues with dependencies - if you want to embed, say, a Lorem::vector within another class but only need (or want) to forward declare Lorem (as opposed to introducing a dependency on its header file) then you end up having to use the explicit types (e.g. boost::shared_ptr<Lorem> rather than Lorem::ptr), which is a little inconsistent.
It may not be very common, and hence harder to understand?
I try to be objective with my coding style, so it would be good to get some other opinions on it so I can dissect my thinking a little bit.
I think it is excellent style, and I use it myself. It is always best to limit the scope of names as much as possible, and use of classes is the best way to do this in C++. For example, the C++ Standard library makes heavy use of typedefs within classes.
It serves as a statement of intent - in the example above, the Lorem class is intended to be reference counted via boost::shared_ptr and stored in a vector.
This is exactly what it does not do.
If I see 'Foo::Ptr' in the code, I have absolutely no idea whether it's a shared_ptr or a Foo* (the STL has ::pointer typedefs that are T*, remember) or whatever. Especially if it's a shared pointer, I don't provide a typedef at all but keep the shared_ptr usage explicit in the code.
Actually, I hardly ever use typedefs outside Template Metaprogramming.
The STL does this type of thing all the time
The STL design, with concepts defined in terms of member functions and nested typedefs, is a historical cul-de-sac; modern template libraries use free functions and traits classes (cf. Boost.Graph), because these do not exclude built-in types from modelling the concept, and because they make it easier to adapt types that were not designed with the given library's concepts in mind.
Don't use the STL as a reason to make the same mistakes.
Typedefs are what policy-based design and traits are built upon in C++, so much of the power of generic programming in C++ stems from typedefs themselves.
Typedefs are definitely good style, and all your "reasons I like it" are good and correct.
About the problems you have with that: well, forward declaration is not a holy grail. You can simply design your code to avoid multi-level dependencies.
You can move the typedef outside the class, but Class::ptr is so much prettier than ClassPtr that I don't do this. It's like namespaces, for me: things stay connected within their scope.
Sometimes I did
Trait<Lorem>::ptr
Trait<Lorem>::collection
Trait<Lorem>::map
This can be the default for all domain classes, with specializations for certain ones (a sketch follows below).
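A hedged sketch of what such a trait could look like (std::shared_ptr is used here so the example is self-contained; the boost::shared_ptr from the question would work the same way, and Ipsum is a made-up class showing a specialization):

#include <map>
#include <memory>
#include <string>
#include <vector>

// Default trait for all domain classes.
template <typename T>
struct Trait {
    typedef std::shared_ptr<T> ptr;
    typedef std::vector<ptr> collection;
    typedef std::map<std::string, ptr> map;
};

class Lorem {};

// A class with different requirements can specialize the trait:
class Ipsum {};
template <>
struct Trait<Ipsum> {
    typedef Ipsum* ptr;                    // e.g. raw or intrusive pointers instead
    typedef std::vector<ptr> collection;
    typedef std::map<std::string, ptr> map;
};

int main() {
    Trait<Lorem>::ptr lorem = std::make_shared<Lorem>();
    Trait<Lorem>::collection lorems;
    lorems.push_back(lorem);
}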
The STL does this type of thing all the time - the typedefs are part of the interface for many classes in the STL.
reference
iterator
size_type
value_type
etc...
are all typedefs that are part of the interface for various STL template classes.
Another vote for this being a good idea. I started doing this when writing a simulation that had to be efficient, both in time and space. All of the value types had a Ptr typedef that started out as a boost shared pointer. I then did some profiling and changed some of them to a boost intrusive pointer without having to change any of the code where these objects were used.
Note that this only works when you know where the classes are going to be used, and that all the uses have the same requirements. I wouldn't use this in library code, for example, because you can't know when writing the library the context in which it will be used.
Currently I'm working on code that makes intensive use of these kinds of typedefs. So far that is fine.
But I noticed there are quite often iterative typedefs, the definitions are split among several classes, and you never really know what type you are dealing with. My task is to summarize the size of some complex data structures hidden behind these typedefs, so I can't rely on existing interfaces. Combined with three to six levels of nested namespaces, it becomes confusing.
So before using them, there are some points to be considered
Does anyone else need these typedefs? Is the class used a lot by other classes?
Do I shorten the usage or hide the class? (In case of hiding you also could think of interfaces.)
Are other people working with the code? How do they do it? Will they think it is easier or will they become confused?
When the typedef is used only within the class itself (i.e. declared as private) I think it's a good idea. However, for exactly the reasons you give, I would not use it if the typedefs need to be known outside the class. In that case I recommend moving them outside the class.
I recommend moving those typedefs outside the class. This way, you remove the direct dependency on the shared pointer and vector classes, and you can include them only when needed. Unless you are using those types in your class implementation, I don't think they should be inner typedefs.
The reasons you like it still hold, since they are achieved by the type aliasing through the typedef itself, not by declaring it inside your class.