Related
This question already has answers here:
Is the PIMPL idiom really used in practice?
(12 answers)
Closed 8 years ago.
Backgrounder:
The PIMPL Idiom (Pointer to IMPLementation) is a technique for implementation hiding in which a public class wraps a structure or class that cannot be seen outside the library the public class is part of.
This hides internal implementation details and data from the user of the library.
When implementing this idiom why would you place the public methods on the pimpl class and not the public class since the public classes method implementations would be compiled into the library and the user only has the header file?
To illustrate, this code puts the Purr() implementation on the impl class and wraps it as well.
Why not implement Purr directly on the public class?
// header file:
class Cat {
private:
class CatImpl; // Not defined here
CatImpl *cat_; // Handle
public:
Cat(); // Constructor
~Cat(); // Destructor
// Other operations...
Purr();
};
// CPP file:
#include "cat.h"
class Cat::CatImpl {
Purr();
... // The actual implementation can be anything
};
Cat::Cat() {
cat_ = new CatImpl;
}
Cat::~Cat() {
delete cat_;
}
Cat::Purr(){ cat_->Purr(); }
CatImpl::Purr(){
printf("purrrrrr");
}
I think most people refer to this as the Handle Body idiom. See James Coplien's book Advanced C++ Programming Styles and Idioms. It's also known as the Cheshire Cat because of Lewis Caroll's character that fades away until only the grin remains.
The example code should be distributed across two sets of source files. Then only Cat.h is the file that is shipped with the product.
CatImpl.h is included by Cat.cpp and CatImpl.cpp contains the implementation for CatImpl::Purr(). This won't be visible to the public using your product.
Basically the idea is to hide as much as possible of the implementation from prying eyes.
This is most useful where you have a commercial product that is shipped as a series of libraries that are accessed via an API that the customer's code is compiled against and linked to.
We did this with the rewrite of IONA's Orbix 3.3 product in 2000.
As mentioned by others, using his technique completely decouples the implementation from the interface of the object. Then you won't have to recompile everything that uses Cat if you just want to change the implementation of Purr().
This technique is used in a methodology called design by contract.
Because you want Purr() to be able to use private members of CatImpl. Cat::Purr() would not be allowed such an access without a friend declaration.
Because you then don't mix responsibilities: one class implements, one class forwards.
For what is worth, it separates the implementation from the interface. This is usually not very important in small size projects. But, in large projects and libraries, it can be used to reduce the build times significantly.
Consider that the implementation of Cat may include many headers, may involve template meta-programming which takes time to compile on its own. Why should a user, who just wants to use the Cat have to include all that? Hence, all the necessary files are hidden using the pimpl idiom (hence the forward declaration of CatImpl), and using the interface does not force the user to include them.
I'm developing a library for nonlinear optimization (read "lots of nasty math"), which is implemented in templates, so most of the code is in headers. It takes about five minutes to compile (on a decent multi-core CPU), and just parsing the headers in an otherwise empty .cpp takes about a minute. So anyone using the library has to wait a couple of minutes every time they compile their code, which makes the development quite tedious. However, by hiding the implementation and the headers, one just includes a simple interface file, which compiles instantly.
It does not necessarily have anything to do with protecting the implementation from being copied by other companies - which wouldn't probably happen anyway, unless the inner workings of your algorithm can be guessed from the definitions of the member variables (if so, it is probably not very complicated and not worth protecting in the first place).
If your class uses the PIMPL idiom, you can avoid changing the header file on the public class.
This allows you to add/remove methods to the PIMPL class, without modifying the external class's header file. You can also add/remove #includes to the PIMPL too.
When you change the external class's header file, you have to recompile everything that #includes it (and if any of those are header files, you have to recompile everything that #includes them, and so on).
Typically, the only reference to a PIMPL class in the header for the owner class (Cat in this case) would be a forward declaration, as you have done here, because that can greatly reduce the dependencies.
For example, if your PIMPL class has ComplicatedClass as a member (and not just a pointer or reference to it) then you would need to have ComplicatedClass fully defined before its use. In practice, this means including file "ComplicatedClass.h" (which will also indirectly include anything ComplicatedClass depends on). This can lead to a single header fill pulling in lots and lots of stuff, which is bad for managing your dependencies (and your compile times).
When you use the PIMPL idiom, you only need to #include the stuff used in the public interface of your owner type (which would be Cat here). Which makes things better for people using your library, and means you don't need to worry about people depending on some internal part of your library - either by mistake, or because they want to do something you don't allow, so they #define private public before including your files.
If it's a simple class, there's usually isn't any reason to use a PIMPL, but for times when the types are quite big, it can be a big help (especially in avoiding long build times).
Well, I wouldn't use it. I have a better alternative:
File foo.h
class Foo {
public:
virtual ~Foo() { }
virtual void someMethod() = 0;
// This "replaces" the constructor
static Foo *create();
}
File foo.cpp
namespace {
class FooImpl: virtual public Foo {
public:
void someMethod() {
//....
}
};
}
Foo *Foo::create() {
return new FooImpl;
}
Does this pattern have a name?
As someone who is also a Python and Java programmer, I like this a lot more than the PIMPL idiom.
Placing the call to the impl->Purr inside the .cpp file means that in the future you could do something completely different without having to change the header file.
Maybe next year they discover a helper method they could have called instead and so they can change the code to call that directly and not use impl->Purr at all. (Yes, they could achieve the same thing by updating the actual impl::Purr method as well, but in that case you are stuck with an extra function call that achieves nothing but calling the next function in turn.)
It also means the header only has definitions and does not have any implementation which makes for a cleaner separation, which is the whole point of the idiom.
We use the PIMPL idiom in order to emulate aspect-oriented programming where pre, post and error aspects are called before and after the execution of a member function.
struct Omg{
void purr(){ cout<< "purr\n"; }
};
struct Lol{
Omg* omg;
/*...*/
void purr(){ try{ pre(); omg-> purr(); post(); }catch(...){ error(); } }
};
We also use a pointer-to-base class to share different aspects between many classes.
The drawback of this approach is that the library user has to take into account all the aspects that are going to be executed, but only sees his/her class. It requires browsing the documentation for any side effects.
I just implemented my first PIMPL class over the last couple of days. I used it to eliminate problems I was having, including file *winsock2.*h in Borland Builder. It seemed to be screwing up struct alignment and since I had socket things in the class private data, those problems were spreading to any .cpp file that included the header.
By using PIMPL, winsock2.h was included in only one .cpp file where I could put a lid on the problem and not worry that it would come back to bite me.
To answer the original question, the advantage I found in forwarding the calls to the PIMPL class was that the PIMPL class is the same as what your original class would have been before you pimpl'd it, plus your implementations aren't spread over two classes in some weird fashion. It's much clearer to implement the public members to simply forward to the PIMPL class.
Like Mr Nodet said, one class, one responsibility.
I don't know if this is a difference worth mentioning but...
Would it be possible to have the implementation in its own namespace and have a public wrapper / library namespace for the code the user sees:
catlib::Cat::Purr(){ cat_->Purr(); }
cat::Cat::Purr(){
printf("purrrrrr");
}
This way all library code can make use of the cat namespace and as the need to expose a class to the user arises a wrapper could be created in the catlib namespace.
I find it telling that, in spite of how well-known the PIMPL idiom is, I don't see it crop up very often in real life (e.g., in open source projects).
I often wonder if the "benefits" are overblown; yes, you can make some of your implementation details even more hidden, and yes, you can change your implementation without changing the header, but it's not obvious that these are big advantages in reality.
That is to say, it's not clear that there's any need for your implementation to be that well hidden, and perhaps it's quite rare that people really do change only the implementation; as soon as you need to add new methods, say, you need to change the header anyway.
Consider I'm writting a static library. Let it has a class Foo
// mylib.h
#include <dependency_header_from_other_static_library.h>
class Foo {
// ...
private:
type_from_dependent_library x;
}
As you can see this library (let call it mylib) depends on another library. It compiles well. But when user compile it's code (that uses Foo and includes mylib.h) and linking with my lib the compilation fails, because user need to have dependency_header_from_other_static_library.h header file to compile code as well.
I want to hide this dependency from the user. How this can be done? The one thing that comes to mind is a PIMPL idiom. Like:
// mylib.h
#include <dependency_header_from_other_static_library.h>
class Foo {
// ...
private:
class FooImpl;
boost::shared_ptr<FooImpl> impl_;
}
// mylib_priv.h
class FooImpl {
// ...
private:
type_from_dependent_library x;
}
But it requires me to duplicate the interface of the class Foo in FooImpl. And, is it an overkill to use PIMPL in my case?
Thanks.
When decoupling a header from other headers, there are a few approaches you might be able to use:
If the used library makes a promise about how it declares its types, you may be able to forward declare the needed types in your header. Of course, this still means you can only refer to these types as pointers or in function signatures in the header but this may be good enough. For example, if the used library promises to have a class LibraryType that you need to use, you can do something like this:
// Foo.h
class LibraryType;
class Foo {
// ...
LibraryType* data;
};
This may cut you the necessary slack to use the type without including its header and without jumping through a PImpl approach.
If the library doesn't make a promise about how it declares it types you may use void* to refer to the corresponding types. Of course, this means that whenever you access the data in your implementation, you'll need to cast the void* to the appropriate type. Since the type is statically known, using static_cast<LibraryType*> is perfectly fine, i.e., there isn't any overhead due to the cast, but it is still relatively painful to do.
The other alternative is, of course, to use the PImpl idiom. If you type provides any reasonably service, it will probably change the interface quite a bit and it shouldn't amount much to replicating the interface between the class itself and the privately declared type. Also, note that the private type is just a data container, i.e., it is reasonably to just make it a struct and have no protection to its accesses. The only real issue is that you need to make sure that the type's definition is visible at the point where the destructor is called. Using std::shared_ptr<T>(new T(/*...*)) arranges for this.
Effectively, all three approaches do the same thing although with slightly different techniques: they provide you an opaque handle to be used in the header file whose definition is only known to the implementation. This way, the client of the library doesn't need to include the corresponding header files. However, unless the symbols are resolved when building the library, it would still be necessary for the client to have access to the used library.
I'm using shared_ptr and STL extensively in a project, and this is leading to over-long, error-prone types like shared_ptr< vector< shared_ptr<const Foo> > > (I'm an ObjC programmer by preference, where long names are the norm, and still this is way too much.) It would be much clearer, I believe, to consistently call this FooListPtr and documenting the naming convention that "Ptr" means shared_ptr and "List" means vector of shared_ptr.
This is easy to typedef, but it's causing headaches with the headers. I seem to have several options of where to define FooListPtr:
Foo.h. That entwines all the headers and creates serious build problems, so it's a non-starter.
FooFwd.h ("forward header"). This is what Effective C++ suggests, based on iosfwd.h. It's very consistent, but the overhead of maintaining twice the number of headers seems annoying at best.
Common.h (put all of them together into one file). This kills reusability by entwining a lot of unrelated types. You now can't just pick up one object and move it to another project. That's a non-starter.
Some kind of fancy #define magic that typedef's if it hasn't already been typedefed. I have an abiding dislike for the preprocessor because I think it makes it hard for new people to grok the code, but maybe....
Use a vector subclass rather than a typedef. This seems dangerous...
Are there best practices here? How do they turn out in real code, when reusability, readability and consistency are paramount?
I've marked this community wiki if others want to add additional options for discussion.
I'm programming on a project which sounds like it uses the common.h method. It works very well for that project.
There is a file called ForwardsDecl.h which is in the pre-compiled header and simply forward-declares all the important classes and necessary typedefs. In this case unique_ptr is used instead of shared_ptr, but the usage should be similar. It looks like this:
// Forward declarations
class ObjectA;
class ObjectB;
class ObjectC;
// List typedefs
typedef std::vector<std::unique_ptr<ObjectA>> ObjectAList;
typedef std::vector<std::unique_ptr<ObjectB>> ObjectBList;
typedef std::vector<std::unique_ptr<ObjectC>> ObjectCList;
This code is accepted by Visual C++ 2010 even though the classes are only forward-declared (the full class definitions are not necessary so there's no need to include each class' header file). I don't know if that's standard and other compilers will require the full class definition, but it's useful that it doesn't: another class (ObjectD) can have an ObjectAList as a member, without needing to include ObjectA.h - this can really help reduce header file dependencies!
Maintenance is not particularly an issue, because the forwards declarations only need to be written once, and any subsequent changes only need to happen in the full declaration in the class' header file (and this will trigger fewer source files to be recompiled due to reduced dependencies).
Finally it appears this can be shared between projects (I haven't tried myself) because even if a project does not actually declare an ObjectA, it doesn't matter because it was only forwards declared and if you don't use it the compiler doesn't care. Therefore the file can contain the names of classes across all projects it's used in, and it doesn't matter if some are missing for a particular project. All that is required is the necessary full declaration header (e.g. ObjectA.h) is included in any source (.cpp) files that actually use them.
I would go with a combined approach of forward headers and a kind of common.h header that is specific to your project and just includes all the forward declaration headers and any other stuff that is common and lightweight.
You complain about the overhead of maintaining twice the number of headers but I don’t think this should be too much of a problem: the forward headers usually only need to know a very limited number of types (one?), and sometimes not even the full type.
You could even try auto-generating the headers using a script (this is done e.g. in SeqAn) if there are really that many headers.
+1 for documenting the typedef conventions.
Foo.h - can you detail the problems you have with that?
FooFwd.h - I'd not use them generally, only on "obvious hotspots". (Yes, "hotspots" are hard to determine).
It doesn't change the rules IMO because when you do introduce a fwd header, the associated typedefs from foo.h move there.
Common.h - cool for small projects, but doesn't scale, I do agree.
Some kind of fancy #define... PLEASE NO!...
Use a vector subclass - doesn't make it better.
You might use containment, though.
So here the prelimenary suggestions (revised from that other question..)
Standard type headers <boost/shared_ptr.hpp>, <vector> etc. can go into a precompiled header / shared include file for the project. This is not bad. (I personally still include them where needed, but that works in addition to putting them into the PCH.)
If the container is an implementation detail, the typedefs go where the container is declared (e.g. private class members if the container is a private class member)
Associated types (like FooListPtr) go to where Foo is declarated, if the associated type is the primary use of the type. That's almost always true for some types - e.g. shared_ptr.
If Foo gets a separate forward declaration header, and the associated type is ok with that, it moves to the FooFwd.h, too.
If the type is only associated with a particular interface (e.g. parameter for a public method), it goes there.
If the type is shared (and does not meet any of the previous criteria), it gets its own header. Note that this also means to pull in all dependencies.
It feels "obvious" for me, but I agree it's not good as a coding standard.
I'm using shared_ptr and STL extensively in a project, and this is leading to over-long, error-prone types like shared_ptr< vector< shared_ptr > > (I'm an ObjC programmer by preference, where long names are the norm, and still this is way too much.) It would be much clearer, I believe, to consistently call this FooListPtr and documenting the naming convention that "Ptr" means shared_ptr and "List" means vector of shared_ptr.
for starters, i recommend using good design structures for scoping (e.g., namespaces) as well as descriptive, non-abbreviated names for typedefs. FooListPtr is terribly short, imo. nobody wants to guess what an abbreviation means (or be surprised to find Foo is const, shared, etc.), and nobody wants to alter their code simply because of scope collisions.
it may also help to choose a prefix for typedefs in your libraries (as well as other common categories).
it's also a bad idea to drag types out of their declared scope:
namespace MON {
namespace Diddy {
class Foo;
} /* << Diddy */
/*...*/
typedef Diddy::Foo Diddy_Foo;
} /* << MON */
there are exceptions to this:
an entirely ecapsualted private type
a contained type within a new scope
while we're at it, using in namespace scopes and namespace aliases should be avoided - qualify the scope if you want to minimize future maintentance.
This is easy to typedef, but it's causing headaches with the headers. I seem to have several options of where to define FooListPtr:
Foo.h. That entwines all the headers and creates serious build problems, so it's a non-starter.
it may be an option for declarations which really depend on other declarations. implying that you need to divide packages, or there is a common, localized interface for subsystems.
FooFwd.h ("forward header"). This is what Effective C++ suggests, based on iosfwd.h. It's very consistent, but the overhead of maintaining twice the number of headers seems annoying at best.
don't worry about the maintenance of this, really. it is a good practice. the compiler uses forward declarations and typedefs with very little effort. it's not annoying because it helps reduce your dependencies, and helps ensure that they are all correct and visible. there really isn't more to maintain since the other files refer to the 'package types' header.
Common.h (put all of them together into one file). This kills reusability by entwining a lot of unrelated types. You now can't just pick up one object and move it to another project. That's a non-starter.
package based dependencies and inclusions are excellent (ideal, really) - do not rule this out. you'll obviously have to create package interfaces (or libraries) which are designed and structured well, and represent related classes of components. you're making an unnecessary issue out of object/component reuse. minimize the static data of a library, and let the link and strip phases do their jobs. again, keep your packages small and reusable and this will not be an issue (assuming your libraries/packages are well designed).
Some kind of fancy #define magic that typedef's if it hasn't already been typedefed. I have an abiding dislike for the preprocessor because I think it makes it hard for new people to grok the code, but maybe....
actually, you may declare a typedef in the same scope multiple times (e.g., in two separate headers) - that is not an error.
declaring a typedef in the same scope with different underlying types is an error. obviously. you must avoid this, and fortunately the compiler enforces that.
to avoid this, create a 'translation build' which includes the world - the compiler will flag declarations of typedeffed types which don't match.
trying to sneak by with minimal typedefs and/or forwards (which are close enough to free at compilation) is not worth the effort. sometimes you'll need a bunch of conditional support for forward declarations - once that is defined, it is easy (stl libraries are a good example of this -- in the event you are also forward declaring template<typename,typename>class vector;).
it's best to just have all these declarations visible to catch any errors immediately, and you can avoid the preprocessor in this case as a bonus.
Use a vector subclass rather than a typedef. This seems dangerous...
a subclass of std::vector is often flagged as a "beginner's mistake". this container was not meant to be subclassed. don't resort to bad practices simply to reduce your compile times/dependencies. if the dependency really is that significant, you should probably be using PIMPL, anyways:
// <package>.types.hpp
namespace MON {
class FooListPtr;
}
// FooListPtr.hpp
namespace MON {
class FooListPtr {
/* ... */
private:
shared_ptr< vector< shared_ptr<const Foo> > > d_data;
};
}
Are there best practices here? How do they turn out in real code, when reusability, readability and consistency are paramount?
ultimately, i've found a small concise package based approach the best for reuse, for reducing compile times, and minimizing dependence.
Unfortunately with typedefs you have to choose between not ideal options for your header files. There are special cases where option one (right in the class header) works well, but it sounds like it won't work for you. There are also cases where the last option works well, but it's usually where you are using the subclass to replace a pattern involving a class with a single member of type std::vector. For your situation, I'd use the forward declaring header solution. There's extra typing and overhead, but it wouldn't be C++ otherwise, right? It keeps things separate, clean and fast.
I want to remove, if possible, the includes of both <vector> and <string> from my class header file. Both string and vector are return types of functions declared in the header file.
I was hoping I could do something like:
namespace std {
template <class T>
class vector;
}
And, declare the vector in the header and include it in the source file.
Is there a reference covering situations where you must include in the header, and situations where you can pull the includes into the source file?
You cannot safely forward declare STL templates, at least if you want to do it portably and safely. The standard is clear about the minimum requirements for each of the STL element, but leaves room for implemtation extensions that might add extra template parameters as long as those have default values. That is: the standard states that std::vector is a template that takes at least 2 parameters (type and allocator) but can have any number of extra arguments in a standard compliant implementation.
What is the point of not including string and vector headers? Surely whoever is going to use your class must have already included it since it is on your interface.
When you ask about a reference to decide when to include and when to forward declare, my advice would be: include everything that is part of your interface, forward declare internal details.
There are more issues here that plain compilation performance. If you push the include of a type that is in your public (or protected) interface outside of the header you will be creating dependencies on the order of includes. Users must know that they must include string before including your header, so you are giving them one more thing to worry about.
What things should be included in the implementation file: implementation details, loggers, elements that don't affect the interface (the database connectors, file headers), internal implementation details (i.e. using STL algorithms for your implementation does not affect your interface, functors that are created for a simple purpose, utilities...)
With a very few exceptions, you are not allowed to add things to the std:; namespace. For classes like vector and string, you therefore have no option but to #include the relevant Standard header files.
Also, notice that string is not a class, but a typedef for basic_string<char>.
This won't help for vector or string, but it might be worth mentioning that there is a forward reference header for iostream, called iosfwd.
Standard containers often have additional default template parameters (allocators, etc.) so this will not work. For example, here's a snippet from GNU implementation:
template<typename _Tp, typename _Alloc = std::allocator<_Tp> >
class vector : protected _Vector_base<_Tp, _Alloc>
{ ... };
This was something I was trying to do earlier too, but this is not possible due to templates.
Please see my question: Forward Declaration of a Base Class
As such headers don't change during your development it's not worth optimizing that anyway...
There is no simple obvious way to do it (as others have explained it very well).
However these headers should be seen as being part of the language (really!), so you can let them in your own headers without any problem, nobody will ever complain.
If your concern is compilation speed, I encourage you to use pre-compiled header instead and put these std headers in it (among other things). It will significantly increase your compilation speed.
Sorry the for the "real winner is the one who avoid the fight" kind of answer.
Just include the header in any file where you reference an STL collection.
As others have mentioned, there's not a way to reliably forward declare the STL classes, and even if you find one for your particular implementation, it will probably break if you use a different STL implementation.
If the compilation units don't instantiate the classes, it won't make your object files any bigger.
If string and vector are used only in signatures of non-public members of you class, you could use the PImpl idiom:
// MyClass.h
class MyClassImpl;
class MyClass{
public:
MyClass();
void MyMethod();
private:
MyClassImpl* m_impl;
};
// MyClassImpl.h
#include <vector>
#include <string>
#include <MyClass.h>
class MyClassImpl{
public:
MyClassImpl();
void MyMethod();
protected:
std::vector<std::string> StdMethod();
};
// MyClass.cpp
#include <MyClass.h>
#include <MyClassImpl.h>
void MyClass::MyMethod(){
m_impl->MyMethod();
}
You are always including vector and string in the header file, but only in the implementation part of your class; files including only MyClass.h will not be pulling in string and vector.
WARNING
Expect that doing this will cause uproar.
The language allows you to derive your own classes:
// MyKludges.h
#include <vector>
#include <string>
class KludgeIntVector : public std::vector<int> {
// ...
};
class KludgeDoubleVector : public std::vector<double> {
// ...
};
class KludgeString : public std::string {
// ...
};
Change your functions to return KludgeString and KludgeIntVector. Since these are no longer templates, you can forward declare them in your header files, and include MyKludges.h in your implementation files.
Strictly speaking, derived classes do not inherit base class constructors, destructors, assignment operators, and friends. You will need to provide (trivial) implementations of any that you're using.
// LotsOfFunctions.h
// Look, no includes! All forward declared!
class KludgeString;
// 10,000 functions that use neither strings nor vectors
// ...
void someFunction(KludgeString &);
// ...
// Another 10,000 functions that use neither strings nor vectors
// someFunction.cpp
// Implement someFunction in its own compilation unit
// <string> and <vector> arrive on the next line
#include "MyKludges.h"
#include "LotsOfFunctions.h"
void someFunction(KludgeString &k) { k.clear(); }
Maybe you would better use the pimpl idiom: it appears to me that you don't want to expose the implementation of your class to client code. If the vector and string objects are aggregated by value, the compiler needs to see their full declarations.
With the exception of adding overloads to std::swap (the only exception I can think of right now), you are generally not allowed to add anything to the std namespace. Even if it were allowed, the actual declaration for std::vector is a lot more complicated than the code in the OP. See Nikolai N Fetissov's answer for an example.
All that aside, you have the additional problem of what your class users are going to do with functions that return a std::vector or std::string. The C++ Standard section 3.10 says that functions returning such objects are returning rvalues, and rvalues must be of a complete type. In English, if your users want to do anything with those functions, they'll have to #include <vector> or <string> anyway. I think it would be easier to #include the headers for them in your .h file and be done with it.
I assume your objective here is to speed up compile times? Otherwise I'm not sure why you would want to remove them.
Another approach (not pretty but practical) is to use macro guards around the include itself.
e.g.
#ifndef STDLIB_STRING
#include <string>
#define STDLIB_STRING
#endif
Although this looks messy, on large codebases it does indeed increase the compile times. What we did is create a Visual Studio macro that will automatically generate the guards. We bind the macro to a key for easy coding. Then it just becomes a company coding standard (or habit).
We also do it for our own includes as well.
e.g.
#ifndef UTILITY_DATE_TIME_H
#include "Utility/DateTime.h"
#endif
Since we have Visual Studio helpers to auto-generate the guards when we create our own header files, we don't need the #define. The macro knows it's a internal include because we always use the
#include ""
format for our own includes and
#include <>
for external includes.
I know it doesn't look pretty but it did speed up our compile times on a largish codebase by over 1/2 hour (from memory).
This question already has answers here:
Is the PIMPL idiom really used in practice?
(12 answers)
Closed 8 years ago.
Backgrounder:
The PIMPL Idiom (Pointer to IMPLementation) is a technique for implementation hiding in which a public class wraps a structure or class that cannot be seen outside the library the public class is part of.
This hides internal implementation details and data from the user of the library.
When implementing this idiom why would you place the public methods on the pimpl class and not the public class since the public classes method implementations would be compiled into the library and the user only has the header file?
To illustrate, this code puts the Purr() implementation on the impl class and wraps it as well.
Why not implement Purr directly on the public class?
// header file:
class Cat {
private:
class CatImpl; // Not defined here
CatImpl *cat_; // Handle
public:
Cat(); // Constructor
~Cat(); // Destructor
// Other operations...
Purr();
};
// CPP file:
#include "cat.h"
class Cat::CatImpl {
Purr();
... // The actual implementation can be anything
};
Cat::Cat() {
cat_ = new CatImpl;
}
Cat::~Cat() {
delete cat_;
}
Cat::Purr(){ cat_->Purr(); }
CatImpl::Purr(){
printf("purrrrrr");
}
I think most people refer to this as the Handle Body idiom. See James Coplien's book Advanced C++ Programming Styles and Idioms. It's also known as the Cheshire Cat because of Lewis Caroll's character that fades away until only the grin remains.
The example code should be distributed across two sets of source files. Then only Cat.h is the file that is shipped with the product.
CatImpl.h is included by Cat.cpp and CatImpl.cpp contains the implementation for CatImpl::Purr(). This won't be visible to the public using your product.
Basically the idea is to hide as much as possible of the implementation from prying eyes.
This is most useful where you have a commercial product that is shipped as a series of libraries that are accessed via an API that the customer's code is compiled against and linked to.
We did this with the rewrite of IONA's Orbix 3.3 product in 2000.
As mentioned by others, using his technique completely decouples the implementation from the interface of the object. Then you won't have to recompile everything that uses Cat if you just want to change the implementation of Purr().
This technique is used in a methodology called design by contract.
Because you want Purr() to be able to use private members of CatImpl. Cat::Purr() would not be allowed such an access without a friend declaration.
Because you then don't mix responsibilities: one class implements, one class forwards.
For what is worth, it separates the implementation from the interface. This is usually not very important in small size projects. But, in large projects and libraries, it can be used to reduce the build times significantly.
Consider that the implementation of Cat may include many headers, may involve template meta-programming which takes time to compile on its own. Why should a user, who just wants to use the Cat have to include all that? Hence, all the necessary files are hidden using the pimpl idiom (hence the forward declaration of CatImpl), and using the interface does not force the user to include them.
I'm developing a library for nonlinear optimization (read "lots of nasty math"), which is implemented in templates, so most of the code is in headers. It takes about five minutes to compile (on a decent multi-core CPU), and just parsing the headers in an otherwise empty .cpp takes about a minute. So anyone using the library has to wait a couple of minutes every time they compile their code, which makes the development quite tedious. However, by hiding the implementation and the headers, one just includes a simple interface file, which compiles instantly.
It does not necessarily have anything to do with protecting the implementation from being copied by other companies - which wouldn't probably happen anyway, unless the inner workings of your algorithm can be guessed from the definitions of the member variables (if so, it is probably not very complicated and not worth protecting in the first place).
If your class uses the PIMPL idiom, you can avoid changing the header file on the public class.
This allows you to add/remove methods to the PIMPL class, without modifying the external class's header file. You can also add/remove #includes to the PIMPL too.
When you change the external class's header file, you have to recompile everything that #includes it (and if any of those are header files, you have to recompile everything that #includes them, and so on).
Typically, the only reference to a PIMPL class in the header for the owner class (Cat in this case) would be a forward declaration, as you have done here, because that can greatly reduce the dependencies.
For example, if your PIMPL class has ComplicatedClass as a member (and not just a pointer or reference to it) then you would need to have ComplicatedClass fully defined before its use. In practice, this means including file "ComplicatedClass.h" (which will also indirectly include anything ComplicatedClass depends on). This can lead to a single header fill pulling in lots and lots of stuff, which is bad for managing your dependencies (and your compile times).
When you use the PIMPL idiom, you only need to #include the stuff used in the public interface of your owner type (which would be Cat here). Which makes things better for people using your library, and means you don't need to worry about people depending on some internal part of your library - either by mistake, or because they want to do something you don't allow, so they #define private public before including your files.
If it's a simple class, there's usually isn't any reason to use a PIMPL, but for times when the types are quite big, it can be a big help (especially in avoiding long build times).
Well, I wouldn't use it. I have a better alternative:
File foo.h
class Foo {
public:
virtual ~Foo() { }
virtual void someMethod() = 0;
// This "replaces" the constructor
static Foo *create();
}
File foo.cpp
namespace {
class FooImpl: virtual public Foo {
public:
void someMethod() {
//....
}
};
}
Foo *Foo::create() {
return new FooImpl;
}
Does this pattern have a name?
As someone who is also a Python and Java programmer, I like this a lot more than the PIMPL idiom.
Placing the call to the impl->Purr inside the .cpp file means that in the future you could do something completely different without having to change the header file.
Maybe next year they discover a helper method they could have called instead and so they can change the code to call that directly and not use impl->Purr at all. (Yes, they could achieve the same thing by updating the actual impl::Purr method as well, but in that case you are stuck with an extra function call that achieves nothing but calling the next function in turn.)
It also means the header only has definitions and does not have any implementation which makes for a cleaner separation, which is the whole point of the idiom.
We use the PIMPL idiom in order to emulate aspect-oriented programming where pre, post and error aspects are called before and after the execution of a member function.
struct Omg{
void purr(){ cout<< "purr\n"; }
};
struct Lol{
Omg* omg;
/*...*/
void purr(){ try{ pre(); omg-> purr(); post(); }catch(...){ error(); } }
};
We also use a pointer-to-base class to share different aspects between many classes.
The drawback of this approach is that the library user has to take into account all the aspects that are going to be executed, but only sees his/her class. It requires browsing the documentation for any side effects.
I just implemented my first PIMPL class over the last couple of days. I used it to eliminate problems I was having, including file *winsock2.*h in Borland Builder. It seemed to be screwing up struct alignment and since I had socket things in the class private data, those problems were spreading to any .cpp file that included the header.
By using PIMPL, winsock2.h was included in only one .cpp file where I could put a lid on the problem and not worry that it would come back to bite me.
To answer the original question, the advantage I found in forwarding the calls to the PIMPL class was that the PIMPL class is the same as what your original class would have been before you pimpl'd it, plus your implementations aren't spread over two classes in some weird fashion. It's much clearer to implement the public members to simply forward to the PIMPL class.
Like Mr Nodet said, one class, one responsibility.
I don't know if this is a difference worth mentioning but...
Would it be possible to have the implementation in its own namespace and have a public wrapper / library namespace for the code the user sees:
catlib::Cat::Purr(){ cat_->Purr(); }
cat::Cat::Purr(){
printf("purrrrrr");
}
This way all library code can make use of the cat namespace and as the need to expose a class to the user arises a wrapper could be created in the catlib namespace.
I find it telling that, in spite of how well-known the PIMPL idiom is, I don't see it crop up very often in real life (e.g., in open source projects).
I often wonder if the "benefits" are overblown; yes, you can make some of your implementation details even more hidden, and yes, you can change your implementation without changing the header, but it's not obvious that these are big advantages in reality.
That is to say, it's not clear that there's any need for your implementation to be that well hidden, and perhaps it's quite rare that people really do change only the implementation; as soon as you need to add new methods, say, you need to change the header anyway.