I've been playing with the Pimpl idiom and reaping all sorts of benefits from it. The only thing I haven't been too keen on is the feeling I get when I define the functions.
Once in the header (P def)
Once at the top of the .cpp (Impl def)
Once in the middle of the .cpp (Impl Impl)
Once at the lower end of the .cpp (P Impl)
I really enjoy cutting down code disparity and redundancy, and I feel like my code is less than well oiled when I have to add or change functions in even relatively complex Impls in my current project.
My question is, what effective ways are there to imply or template my classes in such a way that if I were to define a new function, I'd only have to write one explicit definition and implementation, and have the rest remain spatially close to the explicits in code; and if I were to change a function, the changes necessary would be as few as possible?
You might consider something along these lines:
An Interface class to minimize repeating declarations. The client will use the PublicImplementation class in their code.
Pimpl.h
#ifndef PIMPL_H_
#define PIMPL_H_
#include <memory> // std::unique_ptr
class Interface
{
public:
virtual ~Interface() {}
virtual void func_a() = 0;
virtual void func_b() = 0;
};
class PublicImplementation
{
// smart pointer provides exception safety
std::unique_ptr<Interface> impl;
public:
PublicImplementation();
// pass-through invoker
Interface* operator->() { return impl.get(); }
};
#endif // PIMPL_H_
Pimpl.cpp
#include "Pimpl.h"
#include <iostream>
class PrivateImplementation
: public Interface
{
public:
void func_a() override { std::cout << "a" << '\n'; }
void func_b() override { std::cout << "b" << '\n'; }
};
PublicImplementation::PublicImplementation()
: impl(new PrivateImplementation)
{
}
And finally this is what the client code does:
Main.cpp
#include "Pimpl.h"
int main()
{
PublicImplementation pi; // not a pointer
pi->func_a(); // pointer semantics
pi->func_b();
}
Let's postulate your header starts something like this:
class X
{
public:
...la de dah...
private:
struct Impl;
Impl* p_impl_;
};
Then when you add functions you have a choice to make:
1. do you have the X member function definition implement the logic, referring to p_impl_-> things all over the place, or
2. return p_impl_->same_fn(all_the_args); and keep the logic inside the Impl class?
If you choose 1. then you end up with a function declaration in the header, and a (slightly messier than usual) definition in the matching implementation file.
If you choose 2. then you end up with a function declaration in the header file, a wrapping/forwarding definition in the matching implementation file, and at a minimum a definition in the Impl structure (I tend not to define the functions outside the Impl class definition - it's an implementation detail and the interface is not public anyway).
There is no generally desirable way to improve on this situation (i.e. macro hackery and extra code-generation scripts in your build process may occasionally be warranted, but very rarely).
It may not matter a whole heap, but a variation on the second approach may be of interest: first implement a class that doesn't use the pimpl idiom (complete with a proper header and optionally inline functions), then wrap it with a pimpl management object and forward functions to it. That way you keep the freedom to have some code, some day, use the functionality without the pimpl wrapper, perhaps for improved performance or reduced memory usage at the cost of the recompilation dependency. You can also do this to make use of a specific instantiation of a template without exposing the template's code.
To illustrate this option (as requested in a comment), let's start with a silly non-pimpl class X in its own files, then create a Pimpl::X wrapper (the use of namespace and the same class name is entirely optional but facilitates flipping client code to use either, and a reminder - this isn't meant to be concise, the point here is to let a non-pImpl version be usable too):
// x.h
class X
{
public:
int get() const { return n_; } // inline
void operator=(int); // out-of-line definition
private:
int n_;
};
// x.c++
#include "x.h"
void X::operator=(int n) { n_ = n * 2; }
// x_pimpl.h
namespace Pimpl
{
class X
{
public:
X();
X(const X&);
~X();
X& operator=(const X&);
int get() const;
void operator=(int);
private:
struct Impl;
Impl* p_impl_;
};
}
// x_pimpl.c++
#include "x_pimpl.h"
#include "x.h"
namespace Pimpl
{
struct X::Impl
{
::X x_;
};
// the usual handling...
X::X() : p_impl_(new Impl()) { } // Impl() value-initializes x_
X::X(const X& rhs) : p_impl_(new Impl(*rhs.p_impl_)) { }
X::~X() { delete p_impl_; }
X& X::operator=(const X& rhs) { p_impl_->x_ = rhs.p_impl_->x_; return *this; }
// the wrapping...
int X::get() const { return p_impl_->x_.get(); }
void X::operator=(int n) { p_impl_->x_ = n; }
}
If you opt for the above variation on 2, which makes the "implementation" a usable entity in its own right, then yes: you may end up with 2 declarations and 2 definitions related to a single function, but one of the definitions will be a simple wrapper/forwarding function, which is only significantly repetitive and tedious if the functions are very short and numerous but have lots of parameters.
There's no requirement to hold the IMPL object to the same rules & standards as an object declared in the .h file. By allowing member variables to be public (via a struct declaration), you don't need to implement an unnecessary wrapper layer. This is generally safe, since only the .cpp file has access to IMPL anyway.
Consider the following code that achieves the benefits of the PIMPL idiom without unnecessary code duplication:
// x.h
class X {
public:
X();
~X();
X(const X&) = delete;
X& operator =(const X&) = delete;
void set(int val);
int get() const;
private:
struct IMPL;
IMPL* impl;
};
// x.cpp
#include "x.h"
struct X::IMPL {
int val;
};
X::X() : impl(new IMPL) {}
X::~X() { delete impl; }
void X::set(int val)
{
impl->val = val;
}
int X::get() const
{
return impl->val;
}
// main.cpp
#include <iostream>
#include "x.h"
int main (int, char *[])
{
X x;
x.set(10);
std::cout << x.get() << std::endl;
return 0;
}
I'm just going to start by summarizing to make sure I understand: You like the benefits of using pimpl, but dislike the amount of boilerplate code when adding or modifying functions?
In a nutshell, there is no template magic you can use to eliminate this boilerplate, but there are things to consider here as well:
You write code only once but read it many times, and you have at your disposal a variety of copy-paste capabilities. Initially creating the function isn't the majority of the time you will spend on this class. Compiling and maintaining is where your time will be spent.
Be sure to keep the public class API as simple as possible. The fewer functions you have in the public API, the less boilerplate you have to write. You can make as many functions as you like in the impl and you only have to modify them there.
If you find yourself changing the public class API many many times, you might wish to slightly adjust your design process. Spend ten more minutes up front looking at/writing down use cases and you may reduce your API changes by 90%.
I have a embedded C++03 codebase that needs to support different vendors of gadgets, but only ever one at a time. Most of the functions overlap between the several gadgets, but there are a few exclusives, and these exclusive functions are creating a problem that I need to solve.
Here is an example of clumsy code that works using pre-processor conditionals:
#define HW_TYPE1 0
#define HW_TYPE2 1
#define HW_TYPE HW_TYPE1
struct GadgetBase {
void FncA();
// Many common methods and functions
void FncZ();
};
#if HW_TYPE==HW_TYPE2
struct Gadget : public GadgetBase {
bool Bar() {return(true);}
};
#else
struct Gadget : public GadgetBase {
bool Foo() {return(false);}
};
#endif
Gadget A;
#if HW_TYPE==HW_TYPE2
bool Test() {return(A.Bar());}
#else
bool Test() {return(A.Foo());}
#endif
Here is my attempt at converting the above code to C++ templates without pre-processor directives.
The following code does not compile due to an error in the definition of Test() on my particular platform, because either Foo() or Bar() is undefined depending on the value of Type.
enum TypeE {
eType1,
eType2
};
const TypeE Type= eType1; // Set Global Type
// Common functions for both Gadgets
struct GadgetBase {
void FncA();
// Many common methods and functions
void FncZ();
};
// Unique functions for each gadget
template<TypeE E= eType1>
struct Gadget : public GadgetBase {
bool Foo() {return(false);}
};
template<>
struct Gadget<eType2> : public GadgetBase {
bool Bar() {return(true);}
};
Gadget<Type> A;
template<TypeE E= eType1>
bool Test() {return(A.Foo());}
template<>
bool Test() {return(A.Bar());}
I want to do this with templates to keep the number of code changes down when a new type or additional functions are added. There are currently five types with at least two more expected soon. The pre-processor implementation code reeks, I want to clean this up before it gets unwieldy.
The gadget code is a small amount of the total code base, so breaking up the entire project per gadget may not be ideal either.
Even though only one type will ever be used for each project, the unused types still have to compile. How do I best design this using C++03 (no constexpr, no if constexpr, etc.)? Am I approaching this completely wrongly? I am willing to do a complete overhaul.
EDIT:
Tomek's solution below makes me wonder if it violates LSP. Effectively, another way to look at this is having Test() be part of an interface that requires implementation. So, the example can be reconsidered like the following:
struct GadgetI {
virtual bool Test()=0;
};
template<TypeE E= eType1>
struct Gadget : public GadgetBase, public GadgetI {
bool Foo() {return(false);}
bool Test() {return Foo();}
};
template<>
struct Gadget<eType2> : public GadgetBase, public GadgetI {
bool Bar() {return(true);}
bool Test() {return Bar();}
};
template<>
struct Gadget<eType3> : public GadgetBase, public GadgetI {
bool Test() {} // Violation of LSP?
};
Or similarly with the edited example:
template<typename T>
bool Test(T& o) {} // Violation?
template<>
bool Test(Gadget<eType1> &o) {return(o.Foo());}
template<>
bool Test(Gadget<eType2> &o) {return(o.Bar());}
Test(A);
I might be over-thinking this, I just don't want a poor design now to bite me later.
I agree the code looks convoluted, I'm with you there. But I believe you are going in the wrong direction. Templates seem cool but they are not the right tool in this case. With templates you WILL always compile all the options every time, even if they are not used.
You want the opposite. You want to ONLY compile one source at a time. The proper way to have the best of both worlds is to separate each implementation in a different file and then pick which file to include/compile using external methods.
Build systems usually have plenty of tools for this. For example, when compiling natively, we can rely on CMake's own CMAKE_SYSTEM_PROCESSOR variable to identify the current processor.
If you want to cross compile you need to specify which platform you want to compile to.
Case in point: I have software that needs to be compiled on many operating systems, such as Redhat, CentOS, Ubuntu, Suse and Windows/Mingw. I have one bash script that checks the environment and loads a toolchain CMake file specific to that operating system.
Your case seems to be even simpler. You could just indicate which platform you'd like to use and instruct the build system to compile just the file specific to that platform.
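As a rough sketch of what "compile just the file specific to that platform" could look like on the C++ side (the file names gadget.h, gadget_type1.cpp and gadget_type2.cpp are made up for illustration, and Test() is moved into Gadget as an assumption): the shared header declares the function once, and each vendor file supplies its own definition. Only one vendor file is handed to the compiler per build. Everything is shown in one block here so it can be compiled as a single file.

```cpp
#include <cassert>

// gadget.h (hypothetical name): the one header every client includes.
struct GadgetBase {
    void FncA() {}
    // ... many common methods and functions ...
    void FncZ() {}
};

struct Gadget : public GadgetBase {
    bool Test();  // declared once; defined by whichever vendor file is built
};

// gadget_type1.cpp (hypothetical name): added to the build for vendor 1 only.
namespace {
bool Foo() { return false; }  // vendor-1 detail, invisible to other files
}
bool Gadget::Test() { return Foo(); }

// gadget_type2.cpp would be compiled instead for vendor 2, for example:
//   namespace { bool Bar() { return true; } }
//   bool Gadget::Test() { return Bar(); }

// Client code is identical for every vendor.
void check_vendor1_build() {
    Gadget a;
    assert(!a.Test());  // vendor 1 forwards to Foo(), which returns false
}
```

The trade-off is that the unused vendor files are not compiled at all in a given build, so they can rot unnoticed unless every variant is built regularly.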
You are getting there :).
Rewrite your Test function so it doesn't rely on global Gadget object but instead takes one as a templated parameter:
template<class T>
bool Test(T &o, std::integral_constant<bool (T::*)(), &T::Foo> * = 0)
{
return(o.Foo());
}
template<class T>
bool Test(T &o, std::integral_constant<bool (T::*)(), &T::Bar> * = 0)
{
return(o.Bar());
}
And call it as:
Test(A);
This relies on the SFINAE (Substitution Failure Is Not An Error) idiom. From the call, the compiler deduces T to be a Gadget type. Then, depending on whether Foo or Bar is available, it picks one of the overloads.
Please note this code WILL BREAK if both Foo and Bar are defined in Gadget, as both overloads will match and the call becomes ambiguous.
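One caveat for a C++03 codebase: std::integral_constant only arrived with C++11 (TR1 has std::tr1::integral_constant). A minimal sketch of the same trick with a hand-rolled helper is below. The names member_check, GadgetType1 and GadgetType2 are made up for this illustration, and the two structs merely stand in for the Gadget specializations.

```cpp
#include <cassert>

// Hypothetical helper: a type parameterised on a value of type T.
// Its only job is to force "&U::Foo" to be well-formed during substitution.
template <typename T, T>
struct member_check {};

// Stand-ins for the two gadget variants from the question.
struct GadgetType1 { bool Foo() { return false; } };
struct GadgetType2 { bool Bar() { return true; } };

// Viable only when T has a member function "bool Foo()".
template <class T>
bool Test(T& o, member_check<bool (T::*)(), &T::Foo>* = 0)
{
    return o.Foo();
}

// Viable only when T has a member function "bool Bar()".
template <class T>
bool Test(T& o, member_check<bool (T::*)(), &T::Bar>* = 0)
{
    return o.Bar();
}

void check_dispatch()
{
    GadgetType1 a;
    GadgetType2 b;
    assert(!Test(a));  // picks the Foo overload
    assert(Test(b));   // picks the Bar overload
}
```

As written, the check only matches a member declared in T itself with exactly that signature; a Foo inherited from a base class would have a different pointer-to-member type and would not be picked up.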
This raises the question of whether you could simply wrap the calls to Foo and Bar inside the Gadget class:
template<TypeE E= eType1>
struct Gadget : public GadgetBase {
bool Foo() {return(false);}
bool Test() {return Foo();}
};
template<>
struct Gadget<eType2> : public GadgetBase {
bool Bar() {return(true);}
bool Test() {return Bar();}
};
and consistently call A.Test() instead?
EDIT:
I might have over-complicated it. The following overload may be an easier approach to this:
bool Test(Gadget<eType1> &o)
{
return(o.Foo());
}
bool Test(Gadget<eType2> &o)
{
return(o.Bar());
}
Test(A);
Currently I am writing a class that supports data processing on the CPU or GPU, using preprocessor definitions to determine which header file to include.
I.e.
#ifdef CPU_work
#include "cpu_backend.h"
#endif
#ifdef GPU_work
#include "gpu_backend.h"
#endif
class Work {
//Implementation dependent upon included header
};
However, there may be instances where I would need both variants. Is there any way I could do something like...
namespace CPU {
#define CPU_work
//Generate implementation of WorkClass with cpu_backend.h
}
namespace GPU {
#define GPU_work
//Generate implementation of WorkClass with gpu_backend.h
}
and thereby determine which implementation I want via something like...
CPU::Work cpuObject;
GPU::Work gpuObject;
Would be happy with any work-arounds also.
Much thanks JJ.
This might be the place to use the Template Method design pattern. Your base class implements everything that is common to both CPU and GPU, and you use pure virtual functions where there are differences.
class Work {
public:
virtual ~Work() {} // lets a derived object be deleted through a Work*
void execute() {
// Do some initializing
foo();
// Do some middle stuff
bar();
// Do some final stuff
}
private:
virtual void foo() = 0;
virtual void bar() = 0;
};
class CpuWork: public Work {
virtual void foo() {
// Do some CPU stuff
}
virtual void bar() {
// Do some more CPU stuff
}
};
class GpuWork: public Work {
virtual void foo() {
// Do some GPU stuff
}
virtual void bar() {
// Do some more GPU stuff
}
};
You now can't use your base class Work by accident, since it's abstract, and you can't accidentally invoke your derived classes' foo or bar, since they are private members.
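To tie this back to the CPU::Work / GPU::Work spelling from the question, here is a small sketch. It assumes the base class is renamed WorkBase and that each step returns a string so the order of calls can be observed; both are inventions for the demo, not part of the original code.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class: fixed skeleton in execute(), variable steps below.
class WorkBase {
public:
    virtual ~WorkBase() {}
    std::string execute() {
        std::string log = "init;";
        log += foo();
        log += "middle;";
        log += bar();
        log += "final;";
        return log;
    }
private:
    virtual std::string foo() = 0;
    virtual std::string bar() = 0;
};

namespace CPU {
class Work : public WorkBase {
    virtual std::string foo() { return "cpu-foo;"; }
    virtual std::string bar() { return "cpu-bar;"; }
};
}

namespace GPU {
class Work : public WorkBase {
    virtual std::string foo() { return "gpu-foo;"; }
    virtual std::string bar() { return "gpu-bar;"; }
};
}

// Both variants live side by side and are driven through the same interface.
std::string run(WorkBase& work) { return work.execute(); }

void check_both_variants() {
    CPU::Work cpuObject;
    GPU::Work gpuObject;
    assert(run(cpuObject) == "init;cpu-foo;middle;cpu-bar;final;");
    assert(run(gpuObject) == "init;gpu-foo;middle;gpu-bar;final;");
}
```

No preprocessor switch is involved, so both backends can be used in the same translation unit. The price is a virtual call per step.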
Interesting question :) If I understood your goals correctly, I can suggest a few solutions.
The first uses template specialization, template default arguments and (of course) some macros.
Check this out:
// cpu_backend.h
#define CPU_BACKEND
class UseCPU;
#ifndef GPU_BACKEND
template<class Method = UseCPU>
struct Backend;
#endif
template<>
struct Backend<UseCPU>
{
const char* Info() { return "CPU"; }
};
// gpu_backend.h
#define GPU_BACKEND
class UseGPU;
#ifndef CPU_BACKEND
template<class Method = UseGPU>
struct Backend;
#endif
template<>
struct Backend<UseGPU>
{
const char* Info() { return "GPU"; }
};
// main.cpp
// Try to swap comments on headers
// and see how output changes
#include "cpu_backend.h"
//#include "gpu_backend.h"
#include <iostream>
template<class ... Method>
struct Work
{
Work()
{
std::cout << "I use " << backend.Info() << std::endl;
}
private:
Backend<Method ...> backend;
};
int main()
{
Work<> work;
// Uncomment these two while including both headers
//Work<UseCPU> cpuWork;
//Work<UseGPU> gpuWork;
return 0;
}
If you use MSVC, you can simplify the example above by eliminating the #define and #ifndef.
Trick: MSVC (2017 and maybe earlier versions) lets you omit that macro trash; it just ignores the second declaration if both appear in the same compilation unit, like this:
template<class Method = UseCPU>
struct Backend;
template<class Method = UseGPU>
struct Backend;
BUT this is not standard: the standard does not allow specifying default template arguments twice.
Meanwhile, this solution has a few drawbacks:
When you include both headers, someone can still write Work<>, which will use the backend specified by the first header you included. It would be better if the compiler forced the user to specify a backend type explicitly in this circumstance, because otherwise the result depends on header inclusion order, which is bad (say hello to macros).
It also assumes that both backends have the same API (like Info() in my case).
Possible fixes for those:
I am sure it is possible to make the compiler give an error when both headers are included and no explicit backend is specified, but it probably involves more preprocessor work or some SFINAE...
If your backends do have different APIs, then you can insert a few #ifdef where needed or (preferably) use C++17 if constexpr (std::is_same<Method, UseCPU>::value), if you have access to such cool features :)
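For the last point, a minimal sketch of what the if constexpr route might look like, assuming a C++17 compiler. CpuBackend, GpuBackend, CpuInfo(), GpuName() and Describe() are invented here purely to give the two backends different APIs; they are not part of the headers above.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct UseCPU {};
struct UseGPU {};

// Hypothetical backends whose member names differ on purpose.
struct CpuBackend { std::string CpuInfo() const { return "CPU"; } };
struct GpuBackend { std::string GpuName() const { return "GPU"; } };

template <class Method>
struct Work {
    std::string Describe() const {
        if constexpr (std::is_same<Method, UseCPU>::value) {
            // Only instantiated for Work<UseCPU>.
            CpuBackend backend;
            return "I use " + backend.CpuInfo();
        } else {
            // Only instantiated for other tags; reject anything unexpected.
            static_assert(std::is_same<Method, UseGPU>::value,
                          "unknown backend tag");
            GpuBackend backend;
            return "I use " + backend.GpuName();
        }
    }
};

void check_backends() {
    assert(Work<UseCPU>().Describe() == "I use CPU");
    assert(Work<UseGPU>().Describe() == "I use GPU");
}
```

The discarded branch is never instantiated, which is why each branch can call a member that the other backend does not have.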
There is no feature that controls the visibility/accessibility of a class in C++.
Is there any way to fake it?
Is there any macro/template/magic in C++ that can simulate the closest behavior?
Here is the situation
Util.h (library)
class Util{
//note: by design, this Util is useful only for B and C
//Other classes should not even see "Util"
public: static void calculate(); //implementation in Util.cpp
};
B.h (library)
#include "Util.h"
class B{ /* ... complex thing */ };
C.h (library)
#include "Util.h"
class C{ /* ... complex thing */ };
D.h (user)
#include "B.h" //<--- Purpose of #include is to access "B", but not "Util"
class D{
public: static void a(){
Util::calculate(); //<--- should compile error
//When ctrl+space, I should not see "Util" as a choice.
}
};
My poor solution
Make all member of Util to be private, then declare :-
friend class B;
friend class C;
(Edit: Thank A.S.H for "no forward declaration needed here".)
Disadvantages:
It requires modifying Util so that it somehow recognizes B and C, which doesn't make sense in my opinion.
Now B and C can access every member of Util, which breaks any private access guard. There is a way to enable friend for only some members, but it is not so cute, and it is unusable for this case.
D just can't use Util, but it can still see it. Util is still a choice when using auto-complete (e.g. ctrl+space) in D.h.
(Edit) Note: It is all about convenience for coding; to prevent some bug or bad usage / better auto-completion / better encapsulation. This is not about anti-hacking, or prevent unauthorized access to the function.
(Edit, accepted):
Sadly, I can accept only one solution, so I subjectively picked the one that requires less work and provides much flexibility.
To future readers: Preet Kukreti (& texasbruce in a comment) and Shmuel H. (& A.S.H in a comment) have also provided good solutions that are worth reading.
I think that the best way is not to include Util.h in a public header at all.
To do that, #include "Util.h" only in the implementation cpp file:
Lib.cpp:
#include "Util.h"
void A::publicFunction()
{
Util::calculate();
}
By doing that, you make sure that changing Util.h would make a difference only in your library files and not in the library's users.
The problem with this approach is that you would not be able to use Util in your public headers (A.h, B.h). A forward declaration might be a partial solution for this problem:
// Forward declare Util:
class Util;
class A {
private:
// OK: a pointer to an incomplete type
Util *mUtil;
// ill-formed if uncommented: Util is an incomplete type
// Util mUtilByValue;
};
One possible solution would be to shove Util into a namespace, and typedef it inside the B and C classes:
namespace util_namespace {
class Util{
public:
static void calculate(); //implementation in Util.cpp
};
};
class B {
typedef util_namespace::Util Util;
public:
void foo()
{
Util::calculate(); // Works
}
};
class C {
typedef util_namespace::Util Util;
public:
void foo()
{
Util::calculate(); // Works
}
};
class D {
public:
void foo()
{
Util::calculate(); // This will fail.
}
};
If the Util class is implemented in util.cpp, this would require wrapping it inside a namespace util_namespace { ... }. As far as B and C are concerned, their implementation can refer to a class named Util, and nobody would be the wiser. Without the enabling typedef, D will not find a class by that name.
One way to do this is by friending a single intermediary class whose sole purpose is to provide an access interface to the underlying functionality. This requires a bit of boilerplate. A and B then derive from it, so they can use the access interface but nothing directly in Util:
class Util
{
private:
// private everything.
static int utilFunc1(int arg) { return arg + 1; }
static int utilFunc2(int arg) { return arg + 2; }
friend class UtilAccess;
};
class UtilAccess
{
protected:
int doUtilFunc1(int arg) { return Util::utilFunc1(arg); }
int doUtilFunc2(int arg) { return Util::utilFunc2(arg); }
};
class A : private UtilAccess
{
public:
int doA(int arg) { return doUtilFunc1(arg); }
};
class B : private UtilAccess
{
public:
int doB(int arg) { return doUtilFunc2(arg); }
};
int main()
{
A a;
const int x = a.doA(0); // 1
B b;
const int y = b.doB(0); // 2
return 0;
}
Neither A nor B has direct access to Util. Client code cannot call UtilAccess members via A or B instances either. Adding an extra class C that uses the current Util functionality will not require modification to the Util or UtilAccess code.
It means that you have tighter control of Util (especially if it is stateful), keeping the code easier to reason about since all access is via a prescribed interface, instead of giving direct/accidental access to anonymous code (e.g. A and B).
This requires boilerplate and doesn't automatically propagate changes from Util, however it is a safer pattern than direct friendship.
If you do not want to have to subclass, and you are happy to have UtilAccess change for every using class, you could make the following modifications:
class UtilAccess
{
protected:
static int doUtilFunc1(int arg) { return Util::utilFunc1(arg); }
static int doUtilFunc2(int arg) { return Util::utilFunc2(arg); }
friend class A;
friend class B;
};
class A
{
public:
int doA(int arg) { return UtilAccess::doUtilFunc1(arg); }
};
class B
{
public:
int doB(int arg) { return UtilAccess::doUtilFunc2(arg); }
};
There are also some related solutions (for tighter access control to parts of a class), one called Attorney-Client and the other called PassKey, both are discussed in this answer: clean C++ granular friend equivalent? (Answer: Attorney-Client Idiom) . In retrospect, I think the solution I have presented is a variation of the Attorney-Client idiom.
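For comparison, a minimal sketch of the PassKey idea mentioned above, written so it also compiles as C++03. UtilKey, run() and the body of calculate() are names and behaviour invented for this illustration. Util stays public, but calling it requires a key object that only B and C are able to construct.

```cpp
#include <cassert>

// The key: its constructor is private, so only the listed friends can make one.
// The user-provided constructor also rules out "UtilKey{}" style initialization.
class UtilKey {
    UtilKey() {}
    friend class B;
    friend class C;
};

class Util {
public:
    // Public on paper, but uncallable without a UtilKey.
    static int calculate(UtilKey, int x) { return x * 2; }
};

class B {
public:
    int run(int x) { return Util::calculate(UtilKey(), x); }
};

class C {
public:
    int run(int x) { return Util::calculate(UtilKey(), x) + 1; }
};

// class D {
// public:
//     int run(int x) { return Util::calculate(UtilKey(), x); }
//     // error: UtilKey::UtilKey() is private within this context
// };

void check_passkey() {
    B b;
    C c;
    assert(b.run(3) == 6);
    assert(c.run(3) == 7);
}
```

Unlike plain friendship, B and C gain no access to Util's private members. D can still see the name Util (so auto-completion will still offer it); it just cannot call anything that asks for a key.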
An aspect of C++ that periodically frustrates me is deciding where templates fit between header files (traditionally describing the interface) and implementation (.cpp) files. Templates often need to go in the header, exposing the implementation and sometimes pulling in extra headers which previously only needed to be included in the .cpp file. I encountered this problem yet again recently, and a simplified example of it is shown below.
#include <iostream> // for ~Counter() and countAndPrint()
class Counter
{
unsigned int count_;
public:
Counter() : count_(0) {}
virtual ~Counter();
template<class T>
void
countAndPrint(const T&a);
};
Counter::~Counter() {
std::cout << "total count=" << count_ << "\n";
}
template<class T>
void
Counter::countAndPrint(const T&a) {
++count_;
std::cout << "counted: "<< a << "\n";
}
// Simple example class to use with Counter::countAndPrint
class IntPair {
int a_;
int b_;
public:
IntPair(int a, int b) : a_(a), b_(b) {}
friend std::ostream &
operator<<(std::ostream &o, const IntPair &ip) {
return o << "(" << ip.a_ << "," << ip.b_ << ")";
}
};
int main() {
Counter ex;
int i = 5;
ex.countAndPrint(i);
double d=3.2;
ex.countAndPrint(d);
IntPair ip(2,4);
ex.countAndPrint(ip);
}
Note that I intend to use my actual class as a base class, hence the virtual destructor; I doubt it matters, but I've left it in Counter just in case. The resulting output from the above is
counted: 5
counted: 3.2
counted: (2,4)
total count=3
Now Counter's class declaration could all go in a header file (e.g., counter.h). I can put the implementation of the dtor, which requires iostream, into counter.cpp. But what to do for the member function template countAndPrint(), which also uses iostream? It's no use in counter.cpp since it needs to be instantiated outside of the compiled counter.o. But putting it in counter.h means that anything including counter.h also in turn includes iostream, which just seems wrong (and I accept that I may just have to get over this aversion). I could also put the template code into a separate file (counter.t?), but that would be a bit surprising to other users of the code. Lakos doesn't really go into this as much as I'd like, and the C++ FAQ doesn't go into best practice. So what I'm after is:
are there any alternatives for dividing the code to those I've suggested?
in practice, what works best?
A rule of thumb (the reason of which should be clear).
Private member templates should be defined in the .cpp file (unless they need to be callable by friends of your class template).
Non-private member templates should be defined in headers, unless they are explicitly instantiated.
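To make the "unless they are explicitly instantiated" case concrete, here is a sketch based on the Counter class from the question. The header/source split is indicated with comments, and the count() accessor is an addition made only so the effect can be checked.

```cpp
#include <cassert>
#include <iostream>

// counter.h: declaration only. <iostream> is not needed by the header itself.
class Counter {
    unsigned int count_;
public:
    Counter() : count_(0) {}
    unsigned int count() const { return count_; }  // hypothetical accessor

    template <class T>
    void countAndPrint(const T& a);  // defined in counter.cpp
};

// counter.cpp: the definition, plus one explicit instantiation per supported type.
template <class T>
void Counter::countAndPrint(const T& a) {
    ++count_;
    std::cout << "counted: " << a << "\n";
}

template void Counter::countAndPrint<int>(const int&);
template void Counter::countAndPrint<double>(const double&);

void check_counter() {
    Counter ex;
    ex.countAndPrint(5);
    ex.countAndPrint(3.2);
    assert(ex.count() == 2);
}
```

With the real file split, a call such as ex.countAndPrint(ip) for IntPair would compile but fail to link, because no instantiation for that type exists in counter.o. That makes this approach workable only when the set of argument types is known up front.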
You can often avoid having to include lots of headers by making names be dependent, thus delaying lookup and/or determination of their meaning. This way, you need the complete set of headers only at the point of instantiation. As an example
#include <iosfwd> // suffices
class Counter
{
unsigned int count_;
public:
Counter() : count_(0) {}
virtual ~Counter();
// in the .cpp file, this returns std::cout
std::ostream &getcout();
// makes a type artificially dependent
template<typename T, typename> struct ignore { typedef T type; };
template<class T>
void countAndPrint(const T&a) {
typename ignore<std::ostream, T>::type &cout = getcout();
cout << count_;
}
};
This is what I used for implementing a visitor pattern that uses CRTP. It looked like this initially
template<typename Derived>
struct Visitor {
Derived *getd() { return static_cast<Derived*>(this); }
void visit(Stmt *s) {
switch(s->getKind()) {
case IfStmtKind: {
getd()->visitStmt(static_cast<IfStmt*>(s));
break;
}
case WhileStmtKind: {
getd()->visitStmt(static_cast<WhileStmt*>(s));
break;
}
// ...
}
}
};
This will need the headers of all statement classes because of those static casts. So I have made the types be dependent, and then I only need forward declarations
template<typename T, typename> struct ignore { typedef T type; };
template<typename Derived>
struct Visitor {
Derived *getd() { return static_cast<Derived*>(this); }
void visit(Stmt *s) {
typename ignore<Stmt, Derived>::type *sd = s;
switch(s->getKind()) {
case IfStmtKind: {
getd()->visitStmt(static_cast<IfStmt*>(sd));
break;
}
case WhileStmtKind: {
getd()->visitStmt(static_cast<WhileStmt*>(sd));
break;
}
// ...
}
}
};
The Google Style Guide suggests putting the template code in a "counter-inl.h" file. If you want to be very careful about your includes, that might be the best way.
However, clients getting an included iostream header by "accident" is probably a small price to pay for having all your class's code in a single logical place, at least if you only have a single member function template.
Practically your only options are to place all template code in a header, or to place template code in a .tcc file and include that file at the end of your header.
Also, if possible you should try to avoid #includeing <iostream> in headers, because this has a significant toll on compile-time. Headers are often #included by multiple implementation files, after all. The only code you need in your header is template and inline code. The destructor doesn't need to be in the header.
Consider the following:
PImpl.hpp
class Impl;
class PImpl
{
Impl* pimpl;
public:
PImpl(); // defined in PImpl.cpp, where Impl is a complete type
~PImpl();
void DoSomething();
};
PImpl.cpp
#include "PImpl.hpp"
#include "Impl.hpp"
PImpl::PImpl() : pimpl(new Impl) { }
PImpl::~PImpl() { delete pimpl; }
void PImpl::DoSomething() { pimpl->DoSomething(); }
Impl.hpp
class Impl
{
int data;
public:
void DoSomething() {}
};
client.cpp
#include "PImpl.hpp"
int main()
{
PImpl unitUnderTest;
unitUnderTest.DoSomething();
}
The idea behind this pattern is that Impl's interface can change, yet clients do not have to be recompiled. Yet, I fail to see how this can truly be the case. Let's say I wanted to add a method to this class -- clients would still have to recompile.
Basically, the only kinds of changes like this that I can see ever needing to change the header file for a class for are things for which the interface of the class changes. And when that happens, pimpl or no pimpl, clients have to recompile.
What kinds of editing here give us benefits in terms of not recompiling client code?
The main advantage is that the clients of the interface aren't forced to include the headers for all your class's internal dependencies. So any changes to those headers don't cascade into a recompile of most of your project. Plus general idealism about implementation-hiding.
Also, you wouldn't necessarily put your impl class in its own header. Just make it a struct inside the single cpp and make your outer class reference its data members directly.
Edit: Example
SomeClass.h
struct SomeClassImpl;
class SomeClass {
SomeClassImpl * pImpl;
public:
SomeClass();
~SomeClass();
int DoSomething();
};
SomeClass.cpp
#include "SomeClass.h"
#include "OtherClass.h"
#include <vector>
struct SomeClassImpl {
int foo;
std::vector<OtherClass> otherClassVec; //users of SomeClass don't need to know anything about OtherClass, or include its header.
};
SomeClass::SomeClass() { pImpl = new SomeClassImpl; }
SomeClass::~SomeClass() { delete pImpl; }
int SomeClass::DoSomething() {
pImpl->otherClassVec.push_back(OtherClass()); // assumes OtherClass is default-constructible
return static_cast<int>(pImpl->otherClassVec.size());
}
There have been a number of answers... but no correct implementation so far. I am somewhat saddened that the examples are incorrect, since people are likely to use them...
The "Pimpl" idiom is short for "Pointer to Implementation" and is also referred to as "Compilation Firewall". And now, let's dive in.
1. When is an include necessary?
When you use a class, you need its full definition only if:
you need its size (for example, it is an attribute of your class)
you need to access one of its members
If you only reference it or hold a pointer to it, then, since the size of a reference or pointer does not depend on the type referenced or pointed to, you need only declare the identifier (forward declaration).
Example:
#include "a.h"
#include "b.h"
#include "c.h"
#include "d.h"
#include "e.h"
#include "f.h"
struct Foo
{
Foo();
A a;
B* b;
C& c;
static D d;
friend class E;
void bar(F f);
};
In the above example, which includes are "convenience" includes that could be removed without affecting correctness? Most surprisingly: all but "a.h".
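A quick way to convince yourself is to compile the header with forward declarations in place of five of the six includes. In this sketch A is given a made-up one-member definition to stand in for "a.h"; nothing else about the classes is assumed.

```cpp
#include <cassert>

// Stand-in for "a.h": Foo stores an A by value, so A's size must be known.
struct A { int value; };

// Forward declarations replace "b.h" through "f.h".
class B;
class C;
class D;
class E;
class F;

struct Foo
{
    Foo();
    A a;             // needs the full definition of A
    B* b;            // pointer: declaration is enough
    C& c;            // reference: declaration is enough
    static D d;      // static member declaration: may be an incomplete type
    friend class E;  // friend declaration: declaration is enough
    void bar(F f);   // function declaration: parameter type may be incomplete
};

void check_layout()
{
    // Foo is a complete type even though B..F are not.
    assert(sizeof(Foo) >= sizeof(A));
}
```

The source file that defines Foo::bar or Foo::d still needs the full definitions of F and D; the point is only that clients of the header do not.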
2. Implementing Pimpl
Therefore, the idea of Pimpl is to use a pointer to the implementation class, so as not to need to include any header:
thus isolating the client from the dependencies
thus preventing compilation ripple effect
An additional benefit: the ABI of the library is preserved.
For ease of use, the Pimpl idiom can be used with a "smart pointer" management style:
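#include <algorithm> // std::swap before C++11
#include <cassert> // assert
#include <utility> // std::swap since C++11
// From Ben Voigt's remark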
// information at:
// http://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Checked_delete
template<class T>
inline void checked_delete(T * x)
{
typedef char type_must_be_complete[ sizeof(T)? 1: -1 ];
(void) sizeof(type_must_be_complete);
delete x;
}
template <typename T>
class pimpl
{
public:
pimpl(): m(new T()) {}
pimpl(T* t): m(t) { assert(t && "Null Pointer Unauthorized"); }
pimpl(pimpl const& rhs): m(new T(*rhs.m)) {}
pimpl& operator=(pimpl const& rhs)
{
T* tmp = new T(*rhs.m); // copy may throw before *this is touched: Strong Guarantee (no std::auto_ptr, which C++17 removed)
checked_delete(m);
m = tmp;
return *this;
}
~pimpl() { checked_delete(m); }
void swap(pimpl& rhs) { std::swap(m, rhs.m); }
T* operator->() { return m; }
T const* operator->() const { return m; }
T& operator*() { return *m; }
T const& operator*() const { return *m; }
T* get() { return m; }
T const* get() const { return m; }
private:
T* m;
};
template <typename T> class pimpl<T*> {};
template <typename T> class pimpl<T&> {};
template <typename T>
void swap(pimpl<T>& lhs, pimpl<T>& rhs) { lhs.swap(rhs); }
What does it have that the others didn't?
It simply obeys the Rule of Three: defining the Copy Constructor, Copy Assignment Operator and Destructor.
It does so implementing the Strong Guarantee: if the copy throws during an assignment, then the object is left unchanged. Note that the destructor of T should not throw... but then, that is a very common requirement ;)
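As a rough illustration of that guarantee, the sketch below reduces the helper to its copy-then-commit assignment and feeds it a payload whose copy constructor can be made to throw. Payload, Holder and value_after_failed_assignment are hypothetical names invented for this sketch, and it assumes C++11.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical payload whose copy constructor can be told to fail.
struct Payload
{
    int value;
    bool throw_on_copy;
    Payload(int v, bool t) : value(v), throw_on_copy(t) {}
    Payload(Payload const& rhs) : value(rhs.value), throw_on_copy(rhs.throw_on_copy)
    {
        if (throw_on_copy) throw std::runtime_error("copy failed");
    }
};

// Reduced holder using the same order of operations as pimpl::operator=.
class Holder
{
public:
    explicit Holder(Payload* p) : m(p) { assert(p && "Null Pointer Unauthorized"); }
    ~Holder() { delete m; }
    Holder(Holder const&) = delete; // copy construction left out of the sketch
    Holder& operator=(Holder const& rhs)
    {
        std::unique_ptr<Payload> tmp(new Payload(*rhs.m)); // may throw; *this is untouched so far
        delete m;                                          // nothing below this line throws
        m = tmp.release();
        return *this;
    }
    int value() const { return m->value; }
private:
    Payload* m;
};

// Returns the value held by the target after an assignment that throws.
int value_after_failed_assignment()
{
    Holder target(new Payload(1, false));
    Holder source(new Payload(2, true)); // copying this payload throws
    try { target = source; }
    catch (std::runtime_error const&) { /* expected: the copy failed */ }
    return target.value();
}
```

Because the copy is made before the old object is deleted, the exception leaves `target` holding its original payload.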
Building on this, we can now define Pimpl'ed classes somewhat easily:
class Foo
{
public:
private:
struct Impl;
pimpl<Impl> mImpl;
}; // class Foo
Note: the compiler cannot generate a correct constructor, copy constructor, copy assignment operator or destructor here, because doing so would require access to the definition of Impl. Therefore, despite the pimpl helper, you will need to define those 4 manually in the source file, where Impl is complete. However, thanks to the pimpl helper, forgetting to do so makes the compilation fail instead of dragging you into the land of undefined behavior.
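Here is one way those hand-written members might look, squeezed into a single file. To keep the sketch self-contained it uses std::unique_ptr<Impl> as a stand-in for pimpl<Impl>; the set/get accessors, the data member and the demo function are hypothetical.

```cpp
#include <cassert>
#include <memory>

// ---- what would be foo.h ----
class Foo
{
public:
    Foo();
    Foo(Foo const& rhs);
    Foo& operator=(Foo const& rhs);
    ~Foo();
    void set(int v); // hypothetical accessors, only for the demo
    int get() const;
private:
    struct Impl;                 // incomplete in the header
    std::unique_ptr<Impl> mImpl; // stand-in for pimpl<Impl>
};

// ---- what would be foo.cpp ----
struct Foo::Impl { int data = 0; };

Foo::Foo() : mImpl(new Impl()) {}
Foo::Foo(Foo const& rhs) : mImpl(new Impl(*rhs.mImpl)) {}
Foo& Foo::operator=(Foo const& rhs)
{
    std::unique_ptr<Impl> tmp(new Impl(*rhs.mImpl)); // copy first, may throw
    mImpl.swap(tmp);                                 // commit, cannot throw
    return *this;
}
Foo::~Foo() {} // Impl is complete here, so the deletion is well defined
void Foo::set(int v) { mImpl->data = v; }
int Foo::get() const { return mImpl->data; }

// Hypothetical demo: a copy must not follow later changes to the source.
int copied_value_after_source_changes()
{
    Foo a;
    a.set(5);
    Foo b(a); // deep copy of the Impl
    a.set(9);
    assert(a.get() == 9);
    return b.get();
}
```

The point is only where the definitions live: all four sit after the definition of Impl, so the header never needs it.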
3. Going Further
The presence of virtual functions is often seen as an implementation detail. One of the advantages of Pimpl is that we have the correct framework in place to leverage the power of the Strategy Pattern.
Doing so requires that the "copy" of pimpl be changed:
// pimpl.h
template <typename T>
pimpl<T>::pimpl(pimpl<T> const& rhs): m(rhs.m->clone()) {}
template <typename T>
pimpl<T>& pimpl<T>::operator=(pimpl<T> const& rhs)
{
std::unique_ptr<T> tmp(rhs.m->clone()); // copy may throw: Strong Guarantee (std::auto_ptr was removed in C++17)
checked_delete(m);
m = tmp.release();
return *this;
}
And then we can define our Foo like so:
// foo.h
#include "pimpl.h"
namespace detail { class FooBase; }
class Foo
{
public:
enum Mode {
Easy,
Normal,
Hard,
God
};
Foo(Mode mode);
// Others
private:
pimpl<detail::FooBase> mImpl;
};
// Foo.cpp
#include "foo.h"
#include "detail/fooEasy.h"
#include "detail/fooNormal.h"
#include "detail/fooHard.h"
#include "detail/fooGod.h"
Foo::Foo(Mode m): mImpl(FooFactory::Get(m)) {}
Note that the ABI of Foo is completely unaffected by the various changes that may occur:
there is no virtual method in Foo
the size of mImpl is that of a simple pointer, whatever it points to
Therefore your client need not worry about a particular patch that adds either a method or an attribute, and you need not worry about the memory layout: it just naturally works.
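A compressed sketch of the clone-based variant might look like the following. Everything is inlined into one file for brevity, whereas in a real layout the derived classes and the factory would stay in the .cpp. The clone() and name() members, the two strategy classes and the make() helper (standing in for FooFactory::Get) are assumptions of this sketch, and std::unique_ptr replaces the pimpl helper.

```cpp
#include <cassert>
#include <memory>
#include <string>

namespace detail {
// Hypothetical strategy base; clone() is what the modified copy relies on.
class FooBase
{
public:
    virtual ~FooBase() {}
    virtual FooBase* clone() const = 0;
    virtual std::string name() const = 0;
};
class FooEasy : public FooBase
{
public:
    FooEasy* clone() const override { return new FooEasy(*this); }
    std::string name() const override { return "easy"; }
};
class FooHard : public FooBase
{
public:
    FooHard* clone() const override { return new FooHard(*this); }
    std::string name() const override { return "hard"; }
};
} // namespace detail

class Foo
{
public:
    enum Mode { Easy, Hard };
    explicit Foo(Mode mode) : mImpl(make(mode)) {}
    Foo(Foo const& rhs) : mImpl(rhs.mImpl->clone()) {}
    Foo& operator=(Foo const& rhs)
    {
        std::unique_ptr<detail::FooBase> tmp(rhs.mImpl->clone()); // copy first
        mImpl.swap(tmp);                                          // then commit
        return *this;
    }
    std::string name() const { return mImpl->name(); }
private:
    // Hypothetical stand-in for FooFactory::Get.
    static detail::FooBase* make(Mode mode)
    {
        if (mode == Easy) return new detail::FooEasy();
        return new detail::FooHard();
    }
    std::unique_ptr<detail::FooBase> mImpl;
};

// Hypothetical demo: assignment goes through clone(), so the dynamic type survives.
std::string copy_keeps_dynamic_type()
{
    Foo hard(Foo::Hard);
    Foo copy(Foo::Easy);
    copy = hard;
    assert(hard.name() == "hard");
    return copy.name();
}
```

Foo itself has no virtual functions; the polymorphism is entirely behind the pointer.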
With the PIMPL idiom, if the internal implementation details of the IMPL class change, the clients do not have to be rebuilt. Any change in the interface of the IMPL class (and hence its header file) would obviously require the PIMPL class to change.
BTW, in the code shown, there is a strong coupling between IMPL and PIMPL, so any change in the implementation of IMPL would also cause a need to rebuild.
Consider something more realistic and the benefits become more notable. Most of the time that I have used this for compilation firewalling and implementation hiding, I define the implementation class within the same compilation unit as the visible class. In your example, I wouldn't have Impl.h or Impl.cpp and Pimpl.cpp would look something like:
#include "Pimpl.h" // assumed name of the header that declares Pimpl
#include <iostream>
#include <boost/thread.hpp>
class Impl {
public:
Impl(): data(0) {}
void setData(int d) {
boost::lock_guard<boost::mutex> l(lock); // lock_guard needs the mutex type
data = d;
}
int getData() {
boost::lock_guard<boost::mutex> l(lock);
return data;
}
void doSomething() {
int d = getData(); // read once, under the lock
std::cout << d << std::endl;
}
private:
int data;
boost::mutex lock;
};
Pimpl::Pimpl(): pimpl(new Impl) {
}
void Pimpl::doSomething() {
pimpl->doSomething();
}
Now no one needs to know about our dependency on boost. This gets more powerful when mixed together with policies. Details like threading policies (e.g., single vs multi) can be hidden by using variant implementations of Impl behind the scenes. Also notice that there are a number of additional methods available in Impl that aren't exposed. This also makes this technique good for layering your implementation.
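To give an idea of what hiding a threading policy could look like, here is a sketch where the constructor picks one of two Impl variants. It uses std::mutex in place of boost::mutex so that it depends only on the standard library; the Threading enum, the two derived classes and the round_trip helper are hypothetical names, not part of the code above.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// ---- what would be the public header ----
class Impl; // incomplete: clients never see the variants below

class Pimpl
{
public:
    enum Threading { SingleThreaded, MultiThreaded }; // hypothetical policy switch
    explicit Pimpl(Threading t);
    ~Pimpl();
    void setData(int d);
    int getData() const;
private:
    std::unique_ptr<Impl> pimpl;
};

// ---- what would be the .cpp ----
class Impl
{
public:
    virtual ~Impl() {}
    virtual void setData(int d) = 0;
    virtual int getData() = 0;
};

namespace {
// No locking at all: cheapest variant.
class SingleThreadedImpl : public Impl
{
public:
    void setData(int d) override { data = d; }
    int getData() override { return data; }
private:
    int data = 0;
};
// Every access takes the mutex.
class MultiThreadedImpl : public Impl
{
public:
    void setData(int d) override { std::lock_guard<std::mutex> l(lock); data = d; }
    int getData() override { std::lock_guard<std::mutex> l(lock); return data; }
private:
    int data = 0;
    std::mutex lock;
};
} // namespace

Pimpl::Pimpl(Threading t)
    : pimpl(t == MultiThreaded ? static_cast<Impl*>(new MultiThreadedImpl)
                               : static_cast<Impl*>(new SingleThreadedImpl))
{
    assert(pimpl);
}
Pimpl::~Pimpl() {} // Impl is complete here
void Pimpl::setData(int d) { pimpl->setData(d); }
int Pimpl::getData() const { return pimpl->getData(); }

// Hypothetical helper: stores a value and reads it back through the public class.
int round_trip(Pimpl::Threading t, int v)
{
    Pimpl p(t);
    p.setData(v);
    return p.getData();
}
```

Clients only ever include the top part, so neither the mutex header nor the existence of two variants leaks out.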
In your example, you can change the implementation of data without having to recompile the clients. This would not be the case without the PImpl intermediary. Likewise, you could change the signature or name of Impl::doSomething (to a point), and the clients wouldn't have to know.
In general, anything that can be declared private (the default) or protected in Impl can be changed without recompiling the clients.
In non-Pimpl class headers the .hpp file defines the public and private components of your class all in one big bucket.
Privates are closely coupled to your implementation, so this means your .hpp file really can give away a lot about your internal implementation.
Consider something like the threading library you choose to use privately inside the class. Without using Pimpl, the threading classes and types might be encountered as private members or parameters on private methods. Ok, a thread library might be a bad example but you get the idea: The private parts of your class definition should be hidden away from those who include your header.
That's where Pimpl comes in. Since the public class header no longer defines the "private parts" but instead has a Pointer to Implementation, your private world remains hidden from logic which "#include"s your public class header.
When you change your private methods (the implementation), you are changing the stuff hidden beneath the Pimpl and therefore clients of your class don't need to recompile because from their perspective nothing has changed: They no longer see the private implementation members.
http://www.gotw.ca/gotw/028.htm
Not all classes benefit from p-impl. Your example has only primitive types in its internal state which explains why there's no obvious benefit.
If any of the members had complex types declared in another header, you can see that p-impl moves the inclusion of that header from your class's public header to the implementation file, since you form a raw pointer to an incomplete type (but not an embedded field nor a smart pointer). You could just use raw pointers to all your member variables individually, but using a single pointer to all the state makes memory management easier and improves data locality (well, there's not much locality if all those types use p-impl in turn).