The Pimpl Idiom in practice - C++

There have been a few questions on SO about the pimpl idiom, but I'm more curious about how often it is leveraged in practice.
I understand there are some trade-offs between performance and encapsulation, plus some debugging annoyances due to the extra redirection.
With that said, is this something that should be adopted on a per-class basis, or all-or-nothing? Is this a best practice or a matter of personal preference?
I realize that's somewhat subjective, so let me list my top priorities:
Code clarity
Code maintainability
Performance
I always assume that I will need to expose my code as a library at some point, so that's also a consideration.
EDIT: Any other options to accomplish the same thing would be welcome suggestions.

I'd say that whether you do it per-class or on an all-or-nothing basis depends on why you go for the pimpl idiom in the first place. My reasons, when building a library, have been one of the following:
Wanted to hide implementation in order to avoid disclosing information (yes, it was not a FOSS project :)
Wanted to hide implementation in order to make client code less dependent. If you build a shared library (DLL), you can change your pimpl class without even recompiling the application.
Wanted to reduce the time it takes to compile the classes using the library.
Wanted to fix a namespace clash (or similar).
None of these reasons calls for the all-or-nothing approach. For the first, you only pimplize what you want to hide, whereas in the second case it's probably enough to do so for classes you expect to change. For the third and fourth reasons, too, there is only a benefit in hiding non-trivial members that in turn require extra headers (e.g., from a third-party library, or even the STL).
In any case, my point is that I wouldn't typically find something like this too useful:
class Point {
public:
    Point(double x, double y);
    Point(const Point& src);
    ~Point();
    Point& operator=(const Point& rhs);

    void setX(double x);
    void setY(double y);
    double getX() const;
    double getY() const;

private:
    class PointImpl;
    PointImpl* pimpl; // opaque pointer to the hidden implementation
};
In a case like this, the trade-off starts to hit you: the pointer must be dereferenced on every access, and the methods cannot be inlined. However, if you do it only for non-trivial classes, the slight overhead can typically be tolerated without any problems.
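For illustration, here is roughly what the matching implementation file could look like (the file name and PointImpl's two-double layout are assumptions; the header above only forward-declares it):

// point.cpp -- a minimal sketch of the implementation side.
#include "point.h"

class Point::PointImpl {
public:
    PointImpl(double x, double y) : x(x), y(y) {}
    double x;
    double y;
};

Point::Point(double x, double y) : pimpl(new PointImpl(x, y)) {}

// Deep-copy the impl so each Point owns its own state.
Point::Point(const Point& src) : pimpl(new PointImpl(*src.pimpl)) {}

Point::~Point() { delete pimpl; }

Point& Point::operator=(const Point& rhs) {
    *pimpl = *rhs.pimpl; // copy the state, keep our own allocation
    return *this;
}

void Point::setX(double x) { pimpl->x = x; }
void Point::setY(double y) { pimpl->y = y; }
double Point::getX() const { return pimpl->x; }
double Point::getY() const { return pimpl->y; }

Every one of those one-line forwarders is a function the compiler can no longer trivially inline across the library boundary, which is exactly the overhead discussed above.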

One of the biggest uses of the pimpl idiom is the creation of a stable C++ ABI. Almost every Qt class uses a "d-pointer", which is a kind of pimpl. This allows making much easier changes without breaking the ABI.

Code Clarity
Code clarity is very subjective, but in my opinion a header that has a single data-member is much more readable than a header with many data-members. The implementation file however is noisier, so clarity is reduced there. That might not be an issue if the class is a base class, mostly used by derived classes rather than maintained.
Maintainability
For maintainability of the pimpl'd class I personally find the extra dereference in each access of a data-member tedious. Accessors can't help if the data is purely private because then you shouldn't expose an accessor or mutator for it anyway, and you're stuck with constantly dereferencing the pimpl.
For maintainability of derived classes I find the idiom is a pure win in all cases, because the header file lists fewer irrelevant details. Compile time is also improved for all client compilation units.
Performance
Performance loss is small in many cases and significant in a few. In the long run it is on the order of magnitude of the performance loss from virtual functions. We're talking about an extra dereference per access per data-member, plus a dynamic memory allocation for the pimpl, plus release of the memory on destruction. If the pimpl'd class doesn't access its data-members often, and its objects are created often and are short-lived, then the dynamic allocation can outweigh the extra dereferences.
Decision
I think classes in which performance is crucial, such that one extra dereference or memory allocation makes a significant difference, shouldn't use the pimpl no matter what. Base classes in which this reduction in performance is insignificant, and whose header file is widely #include'd, probably should use the pimpl if compilation time is improved significantly. If compilation time isn't reduced, it's down to your code-clarity taste.
For all other cases it's purely a matter of taste. Try it and measure runtime performance and compile-time performance before you make a decision.

pImpl is very useful when you come to implement std::swap and operator= with the strong exception guarantee. I'm inclined to say that if your class supports either of those, and has more than one non-trivial field, then it's usually no longer down to preference.
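To sketch that point (Widget and its members are hypothetical): with a pimpl, swap is a single pointer exchange, which makes a strongly exception-safe operator= almost trivial.

#include <string>
#include <utility> // std::swap
#include <vector>

class Widget {
public:
    Widget();
    Widget(const Widget& other);
    ~Widget();

    // Never throws: swapping two Widgets is just a pointer exchange.
    void swap(Widget& other) noexcept { std::swap(pimpl, other.pimpl); }

    // Strong exception guarantee: if copying rhs throws, *this is untouched.
    Widget& operator=(const Widget& rhs) {
        Widget tmp(rhs); // all the work (and any exception) happens here
        swap(tmp);       // commit; cannot throw
        return *this;
    }

private:
    class Impl;
    Impl* pimpl;
};

// Implementation file: Impl's members are illustrative assumptions.
class Widget::Impl {
public:
    std::string name;
    std::vector<int> data;
};

Widget::Widget() : pimpl(new Impl) {}
Widget::Widget(const Widget& other) : pimpl(new Impl(*other.pimpl)) {}
Widget::~Widget() { delete pimpl; }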
Otherwise, it's about how tightly you want clients to be bound to the implementation via the header file. If binary-incompatible changes aren't a problem, then you might not benefit much in maintainability, although if compile speed becomes an issue there are usually savings there.
The performance costs probably have more to do with loss of inlining than they do with indirection, but that's a wild guess.
You can always add pImpl later, and declare that from this day forth clients will not have to recompile just because you added a private field.
So none of this suggests an all-or-nothing approach. You can selectively do it for the classes where it gives you benefit, not for the ones it doesn't, and change your mind later. Implementing for example iterators as pImpl sounds like Too Much Design...

This idiom helps greatly with compile time on large projects.

I generally use it when I want to avoid a header file polluting my codebase. Windows.h is the perfect example. It is so badly behaved, I'd rather kill myself than have it visible everywhere. So assuming you want a class-based API, hiding it behind a pimpl class neatly solves the problem. (If you're content to just expose individual functions, those can simply be forward-declared, of course, without putting them into a pimpl class.)
I wouldn't use pimpl everywhere, partly because of the performance hit, and partly just because it's a lot of extra work for a usually small benefit. The main thing it gives you is isolation between implementation and interface. Usually, that's just not a very high priority.

I use the idiom in a couple of places in my own libraries, in both cases to cleanly split the interface from the implementation. I have, for example, an XML reader class fully declared in a .h file, which has a PIMPL to a RealXMLReader class which is declared & defined in non-public .h and .cpp files. The RealXMLReader in turn is a convenience wrapper for the XML parser I use (currently Expat).
This arrangement allows me to change from Expat in the future to another XML parser without having to recompile all the client code (I still need to re-link of course).
Note that I don't do this for compile-time performance reasons, only for convenience. There are a few PIMPL fanatics who insist that any project containing more than three files will be uncompilable unless you use PIMPLs throughout. It's noticeable that these people never produce any actual evidence, but only make vague references to "Lakos" and "exponential time".

pImpl will work best when we have r-value semantics.
The "alternative" to pImpl, that will also achieve hiding the implementation detail, is to use an abstract base class and put the implementation in a derived class. Users call some kind of "factory" method to create the instance and will generally use a pointer (probably a shared one) to the abstract class.
The rationale behind pImpl instead can be:
Saving on a v-table. Yes, but will your compiler inline all the forwarding, and will you really save anything?
If your module contains multiple classes that know about each other in detail, although to the outside world you hide that.
Semantics of the container class for the pImpl could be:
- Non-copyable, non-assignable: you "new" your pImpl on construction and "delete" it on destruction.
- Shared: you have a shared_ptr rather than an Impl*. With shared_ptr you can use a forward declaration as long as the class is complete at the point of the destructor. Your destructor should be defined even if it is default (which it probably will be); see the sketch below.
- Swappable: you can implement "may be empty" and "swap". Users can create an instance of one and pass a non-const reference to it to get it populated, with a "swap".
- 2-stage construction: you construct an empty one, then call "load()" on it to populate it.
Shared is the only one I have even a remote liking for without r-value semantics. With them we can also implement non-copyable, non-assignable properly. I like to be able to call a function that gives me one.
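A minimal sketch of the "shared" flavour (all names hypothetical). Because shared_ptr type-erases its deleter, Impl only needs to be complete in the implementation file:

// widget.h
#include <memory>

class Widget {
public:
    Widget();
    ~Widget(); // declared here, defined (even if default) where Impl is complete
private:
    class Impl;                 // forward declaration is enough in the header
    std::shared_ptr<Impl> impl; // shared semantics: copies share one Impl
};

// widget.cpp
class Widget::Impl {
public:
    int value = 0; // illustrative member; the real contents are assumptions
};

Widget::Widget() : impl(std::make_shared<Impl>()) {}
Widget::~Widget() = default; // Impl is complete here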
I have, however, found I tend more now to use abstract base classes more than pImpl, even when there is only one implementation.

Related

Why aren't all functions virtual in C++? [duplicate]

Is there any real reason not to make a member function virtual in C++? Of course, there's always the performance argument, but that doesn't seem to stick in most situations since the overhead of virtual functions is fairly low.
On the other hand, I've been bitten a couple of times with forgetting to make a function virtual that should be virtual. And that seems to be a bigger argument than the performance one. So is there any reason not to make member functions virtual by default?
One way to read your question is "Why doesn't C++ make every function virtual by default, unless the programmer overrides that default?" Without consulting my copy of "Design and Evolution of C++": this would add extra storage to every class unless every member function is made non-virtual. Seems to me this would have required more effort in the compiler implementation, and slowed down the adoption of C++ by providing fodder to the performance-obsessed (I count myself in that group.)
Another way to read your question is "Why do C++ programmers not make every function virtual unless they have very good reasons not to?" The performance excuse is probably the reason. Depending on your application and domain, this might be a good reason or not. For example, part of my team works on market data ticker plants. At 100,000+ messages/second on a single stream, the virtual function overhead would be unacceptable. Other parts of my team work on complex trading infrastructure. Making most functions virtual is probably a good idea in that context, as the extra flexibility beats the micro-optimization.
Stroustrup, the designer of the language, says:
Because many classes are not designed to be used as base classes. For example, see class complex.
Also, objects of a class with a virtual function require space needed by the virtual function call mechanism - typically one word per object. This overhead can be significant, and can get in the way of layout compatibility with data from other languages (e.g. C and Fortran).
See The Design and Evolution of C++ for more design rationale.
There are several reasons.
First, performance: Yes, the overhead of a virtual function is relatively low seen in isolation. But it also prevents the compiler from inlining, and that is a huge source of optimization in C++. The C++ standard library performs as well as it does because it can inline the dozens and dozens of small one-liners it consists of. Additionally, a class with virtual methods is not a POD datatype, and so a lot of restrictions apply to it. It can't be copied just by memcpy'ing, it becomes more expensive to construct, and takes up more space. There are a lot of things that suddenly become illegal or less efficient once a non-POD type is involved.
And second, good OOP practice. The point in a class is that it makes some kind of abstraction, hides its internal details, and provides a guarantee that "this class will behave so and so, and will always maintain these invariants. It will never end up in an invalid state".
That is pretty hard to live up to if you allow others to override any member function. The member functions you defined in the class are there to ensure that the invariant is maintained. If we didn't care about that, we could just make the internal data members public, and let people manipulate them at will. But we want our class to be consistent. And that means we have to specify the behavior of its public interface. That may involve specific customizability points, by making individual functions virtual, but it almost always also involves making most methods non-virtual, so that they can do the job of ensuring that the invariant is maintained. The non-virtual interface idiom is a good example of this:
http://www.gotw.ca/publications/mill18.htm
Third, inheritance isn't often needed, especially not in C++. Templates and generic programming (static polymorphism) can in many cases do a better job than inheritance (runtime polymorphism). Yes, you sometimes still need virtual methods and inheritance, but it is certainly not the default. If it is, you're Doing It Wrong. Work with the language, rather than try to pretend it was something else. It's not Java, and unlike Java, in C++ inheritance is the exception, not the rule.
I'll ignore performance and memory cost, because I have no way to measure them for the "in general" case...
Classes with virtual member functions are non-POD. So if you want to use your class in low-level code which relies on it being POD, then (among other restrictions) any member functions must be non-virtual.
Examples of things you can portably do with an instance of a POD class:
copy it with memcpy (provided the target address has sufficient alignment).
access fields with offsetof()
in general, treat it as a sequence of char
... um
that's about it. I'm sure I've forgotten something.
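A small sketch of the memcpy point (hypothetical types; the discussion above says POD, and the C++11 trait shown checks the closely related trivially-copyable property):

#include <cstring>
#include <type_traits>

struct Plain {      // no virtual functions: layout is just the two fields
    int id;
    double value;
};

struct WithVtable { // adding 'virtual' inserts a hidden vtable pointer
    virtual void f() {}
    int id;
};

static_assert(std::is_trivially_copyable<Plain>::value,
              "safe to copy with memcpy");
static_assert(!std::is_trivially_copyable<WithVtable>::value,
              "memcpy would also clobber the vtable pointer");

int main() {
    Plain a{1, 2.0};
    Plain b;
    std::memcpy(&b, &a, sizeof a); // portable for trivially copyable types
}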
Other things people have mentioned that I agree with:
Many classes are not designed for inheritance. Making their methods virtual would be misleading, since it implies child classes might want to override the method, and there shouldn't be any child classes.
Many methods are not designed to be overridden: same thing.
Also, even when things are intended to be subclassed / overridden, they aren't necessarily intended for run-time polymorphism. Very occasionally, despite what OO best practice says, what you want inheritance for is code reuse. For example if you're using CRTP for simulated dynamic binding. So again you don't want to imply your class will play nicely with runtime polymorphism by making its methods virtual, when they should never be called that way.
In summary, things which are intended to be overridden for runtime polymorphism should be marked virtual, and things which aren't, shouldn't. If you find that almost all your member functions are intended to be virtual, then mark them virtual unless there's a reason not to. If you find that most of your member functions are not intended to be virtual, then don't mark them virtual unless there's a reason to do so.
It's a tricky issue when designing a public API, because flipping a method from one to the other is a breaking change, so you have to get it right first time. But you don't necessarily know before you have any users, whether your users are going to want to "polymorph" your classes. Ho hum. The STL container approach, of defining abstract interfaces and banning inheritance entirely, is safe but sometimes requires users to do more typing.
The following post is mostly opinion, but here goes:
Object oriented design is three things, and encapsulation (information hiding) is the first of these things. If a class design is not solid on this, then the rest doesn't really matter very much.
It has been said before that "inheritance breaks encapsulation" (Alan Snyder, 1986). A good discussion of this is present in the Gang of Four design patterns book. A class should be designed to support inheritance in a very specific manner. Otherwise, you open the possibility of misuse by inheritors.
I would make the analogy that making all of your methods virtual is akin to making all your members public. A bit of a stretch, I know, but that's why I used the word 'analogy'
As you are designing your class hierarchy, it may make sense to write a function that should not be overridden. One example is if you are doing the "template method" pattern, where you have a public method that calls several private virtual methods. You would not want derived classes to override that; everyone should use the base definition.
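A small sketch of that shape (hypothetical names): the public non-virtual method fixes the algorithm, and the private virtuals are the only customization points.

#include <iostream>

class Report {
public:
    // Non-virtual: fixes the overall algorithm, cannot be overridden.
    void generate() {
        writeHeader();                          // customization point
        writeBody();                            // customization point
        std::cout << "-- end of report --\n";   // invariant part
    }
    virtual ~Report() = default;

private:
    // Private virtuals: derived classes may override these, but they
    // cannot bypass the structure that generate() enforces.
    virtual void writeHeader() { std::cout << "Report\n"; }
    virtual void writeBody() = 0;
};

class SalesReport : public Report {
private:
    void writeBody() override { std::cout << "sales figures...\n"; }
};

int main() {
    SalesReport r;
    r.generate();
}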
Before C++11 there was no "final" keyword, so the best way to communicate to other developers that a method should not be overridden is to make it non-virtual (other than easily ignored comments).
At the class level, making the destructor non-virtual communicates that the class should not be used as a base class, such as the STL containers.
Making a method non-virtual communicates how it should be used.
The Non-Virtual Interface idiom makes use of non-virtual methods. For more information, please refer to Herb Sutter's "Virtuality" article.
http://www.gotw.ca/publications/mill18.htm
And comments on the NVI idiom:
http://www.parashift.com/c++-faq-lite/strange-inheritance.html#faq-23.3
http://accu.org/index.php/journals/269 [See sub-section]

Is it better to use pointers for member class/object variables?

When I am trying to keep a clean header file with regard to #includes, I find that I can do a better job by making all my members pointers to classes/objects that would require the inclusion of other header files, because this way I can use forward declarations instead of #includes.
But I am starting to wonder if this is a good rule-of-thumb or not.
So for example take the following class - option 1:
class QAudioDeviceInfo;
class QAudioInput;
class QIODevice;

class QAudioRx
{
public:
    QAudioRx();

private:
    QAudioDeviceInfo *mp_audioDeviceInfo;
    QAudioInput *mp_audioInput;
    QIODevice *mp_inputDevice;
};
However this can be re-written like this - option 2:
#include "QAudioDeviceInfo.h"
#include "QAudioInput.h"
#include "QIODevice.h"
class QAudioRx
{
public:
QAudioRx();
private:
QAudioDeviceInfo m_audioDeviceInfo;
QAudioInput m_audioInput;
QIODevice m_inputDevice;
};
Option 1 is generally my preference because it's a nicer, cleaner header to include. However, option 2 does not require dynamic allocation/de-allocation of resources.
Maybe this could be seen as a subjective/opinion-based question (and I know the opinion-police will be all over this), so let's avoid "I like" and "I prefer" and stick to facts/key-points.
So my question is, in general, what is the best practice (if there is one) and why? Or if there is not one best practice (normally the case) when is it good to use each one?
edit
Sorry, this was really meant for private variables - not public ones.
Use pointers if you need pointers. Pointers are not for avoiding includes. Includes are not evil. When you need to use them, use them. Simple rules equal happy programming!
Using pointers in your case will bring a lot of unnecessary overhead. You may have to allocate your objects dynamically, which is not a good choice for your case, and you also have to deal with memory management manually. Simply put, there is no reason to use pointers here unless your case demands it (for example, when you want the object to outlive the pointer that refers to it, as in a non-ownership pattern).
There is a trade-off between compilation time and compile-time dependencies (which usually correlates with the number of include files needed) and simplicity and efficiency.
Option 1 will compile faster, but almost certainly run slower. Every time you access a member variable there is an indirection through a pointer, and the objects that the pointers refer to must be allocated and managed separately. If the allocation is dynamic then that has an extra cost, and the additional logic to manage the separate objects risks introducing bugs. For a start you have raw pointers, so you must define (or delete) a copy constructor, copy assignment, and probably destructor, and maybe move constructor and move assignment (your example is missing all of those, and so is probably horribly broken).
Option 2 requires some additional includes, and so everything that uses QAudioRx needs to be recompiled when any of those headers change. That tightly couples QAudioRx to those other types. In a large project that can be significant (but in a small one it probably doesn't matter - if the time to build your whole project is measured in seconds then it's certainly not worth it!)
However I would say your option 1 is almost never a good idea. If you care about reducing dependencies and build times then use the pImpl idiom (see also pimpl-idiom for related questions) instead:
#include <memory>

class QAudioRx
{
public:
    QAudioRx();
    ~QAudioRx();                   // defined in the .cpp (see note below)
    QAudioRx(QAudioRx&&) noexcept; // re-declared because a user-declared
    QAudioRx& operator=(QAudioRx&&) noexcept; // destructor suppresses moves

private:
    class QAudioRxImpl;
    std::unique_ptr<QAudioRxImpl> m_impl;
};
(N.B. My use of std::unique_ptr<QAudioRxImpl> assumes the impl object will be dynamically allocated, because that's the most common approach; the solution can be adjusted if the object is managed differently.)
Now all the member variables are part of the "impl" object, which is not defined in the header. This means that access to the members from within the impl class is direct, not through pointers, but access from the QAudioRx must perform one indirection to call a function on the impl class.
This still reduces dependencies, but makes it simpler to manage the extra complexity because there are no raw pointers so the special member functions will do something sensible automatically. By default it will be movable, but not copyable, and will clean up in the destructor.
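One detail worth making explicit (a property of std::unique_ptr, not something specific to this answer): the special members must be defined in the implementation file, where QAudioRxImpl is a complete type. A sketch, with the impl's members assumed from the question's option 2:

// QAudioRx.cpp -- sketch; the impl's members are assumed from option 2 above.
#include "QAudioRx.h"
#include "QAudioDeviceInfo.h"
#include "QAudioInput.h"

class QAudioRx::QAudioRxImpl
{
public:
    QAudioDeviceInfo audioDeviceInfo;
    QAudioInput audioInput;
    // ... further members as needed
};

QAudioRx::QAudioRx() : m_impl(new QAudioRxImpl) {}

// Defined here so std::unique_ptr's deleter sees the complete QAudioRxImpl.
QAudioRx::~QAudioRx() = default;
QAudioRx::QAudioRx(QAudioRx&&) noexcept = default;
QAudioRx& QAudioRx::operator=(QAudioRx&&) noexcept = default;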
When you are working on a small project where the number of classes is small, you should go with option 2.
But when you are working on a very large project, in which the number of classes is very large and the compilation time of the project is one of the criteria for its architecture, go with option 1.
In most cases option 1 is just fine. In a real-world application you need to combine both options, choosing the trade-off between compilation time and cleaner-looking code (without pointers and naked new/dynamic allocation for objects).

Why would you want to put a class in an implementation file?

While looking over some code, I ran into the following:
.h file
class ExampleClass
{
public:
    // methods, etc.
private:
    class AnotherExampleClass* ptrToClass;
};
.cpp file
class AnotherExampleClass
{
    // methods, etc.
};

// AnotherExampleClass and ExampleClass implemented
Is this a pattern, or something beneficial when working in C++? Since the class is not broken out into another file, does this workflow promote faster compilation times?
Or is this just the style of this developer?
This is variously known as the pImpl Idiom, Cheshire cat technique, or Compilation firewall.
Benefits:
Changing private member variables of a class does not require recompiling the classes that depend on it, thus make times are faster, and the fragile binary interface problem is reduced.
The header file does not need to #include classes that are used 'by value' in private member variables, thus compile times are faster.
This is sorta like the way SmallTalk automatically handles classes... more pure encapsulation.
Drawbacks:
More work for the implementor.
Doesn't work for 'protected' members where access by subclasses is required.
Somewhat harder to read code, since some information is no longer in the header file.
Run-time performance is slightly compromised due to the pointer indirection, especially if function calls are virtual (branch prediction for indirect branches is generally poor).
Herb Sutter's "Exceptional C++" books also go into useful detail on the appropriate usage of this technique.
The most common example would be when using the PIMPL pattern or similar techniques. Still, there are other uses as well. Typically, the distinction .hpp/.cpp in C++ is rather (or, at least can be) one of public interface versus private implementation. If a type is only used as part of the implementation, then that's a good reason not to export it in the header file.
Apart from possibly being an implementation of the PIMPL idiom, here are two more possible reason to do this:
Objects in C++ cannot modify their this pointer. As a consequence, they cannot change type in mid-usage. However, ptrToClass can change, allowing an implementation to delete itself and to replace itself with another instance of another subclass of AnotherExampleClass.
If the implementation of AnotherExampleClass depends on some template parameters, but the interface of ExampleClass does not, it is possible to use a template derived from AnotherExampleClass to provide the implementation. This hides part of the necessary, yet internal type information from the user of the interface class.
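A sketch of that second point (all names beyond those in the question are hypothetical): the public class's interface mentions no template parameters, while the stored pointer can refer to a template-generated implementation.

#include <iostream>

// Non-template base: this is all the public header needs to know about.
class AnotherExampleClass {
public:
    virtual ~AnotherExampleClass() = default;
    virtual void doWork() = 0;
};

// Template implementation, visible only in the implementation file.
template <typename Strategy>
class TemplatedImpl : public AnotherExampleClass {
public:
    void doWork() override { Strategy{}.run(); }
};

class ExampleClass {
public:
    explicit ExampleClass(AnotherExampleClass* impl) : ptrToClass(impl) {}
    ~ExampleClass() { delete ptrToClass; }
    ExampleClass(const ExampleClass&) = delete;            // keep ownership simple
    ExampleClass& operator=(const ExampleClass&) = delete;
    void doWork() { ptrToClass->doWork(); }
private:
    AnotherExampleClass* ptrToClass;
};

struct PrintStrategy {
    void run() { std::cout << "working\n"; }
};

int main() {
    ExampleClass e(new TemplatedImpl<PrintStrategy>);
    e.doWork();
}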

How to deal with the idea of "many small functions" for classes, without passing lots of parameters?

Over time I have come to appreciate the mindset of many small functions, and I really do like it a lot, but I'm having a hard time losing my shyness about applying it to classes, especially ones with more than a handful of non-public member variables.
Every additional helper function clutters up the interface, since often the code is class specific and I can't just use some generic piece of code.
(To my limited knowledge, anyway, still a beginner, don't know every library out there, etc.)
So in extreme cases, I usually create a helper class which becomes the friend of the class that needs to be operated on, so it has access to all the nonpublic guts.
An alternative are free functions that need parameters, but even though premature optimization is evil, and I haven't actually profiled or disassembled it...
I still DREAD the mere thought of passing all the stuff I need sometimes, even just as reference, even though that should be a simple address per argument.
Is all this a matter of preference, or is there a widely used way of dealing with that kind of stuff?
I know that trying to force stuff into patterns is a kind of anti pattern, but I am concerned about code sharing and standards, and I want to get stuff at least fairly non painful for other people to read.
So, how do you guys deal with that?
Edit:
Some examples that motivated me to ask this question:
About the free functions:
DeadMG was confused about making free functions work...without arguments.
My issue with those functions is that, unlike member functions, free functions only know about data if you give it to them, unless global variables and the like are used.
Sometimes, however, I have a huge, complicated procedure that I want to break down for readability and understanding's sake, but there are so many different variables used all over the place that passing all the data to free functions, which are agnostic to every bit of member data, looks simply nightmarish.
Click for an example
That is a snippet of a function that converts data into a format that my mesh class accepts.
It would take all of those parameters to refactor this into a "finalizeMesh" function, for example.
At this point it's part of a huge compute-mesh-data function, and bits of dimension info, sizes, and scaling info are used all over the place, interwoven.
That's what I mean with "free functions need too many parameters sometimes".
I think it shows bad style, and is not necessarily a symptom of being irrational per se, I hope :P.
I'll try to clear things up more along the way, if necessary.
Every additional helper function clutters up the interface
A private helper function doesn't.
I usually create a helper class which becomes the friend of the class that needs to be operated on
Don't do this unless it's absolutely unavoidable. You might want to break up your class's data into smaller nested classes (or plain old structs), then pass those around between methods.
I still DREAD the mere thought of passing all the stuff I need sometimes, even just as reference
That's not premature optimization, that's a perfectly acceptable way of preventing/reducing cognitive load. You don't want functions taking more than three parameters. If there are more than three, consider packaging your data in a struct or class.
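For instance, a minimal sketch of that suggestion (the names are hypothetical, loosely echoing the question's "finalizeMesh" example):

#include <cstddef>
#include <vector>

// Instead of: finalizeMesh(scaleX, scaleY, scaleZ, rows, cols, ...)
// bundle the values that travel together into one parameter object.
struct MeshParams {
    double scaleX = 1.0, scaleY = 1.0, scaleZ = 1.0;
    std::size_t rows = 0, cols = 0;
};

class Mesh {
public:
    void finalize(const MeshParams& p) {
        vertices.reserve(p.rows * p.cols); // one argument instead of five
        // ... use p.scaleX, p.scaleY, p.scaleZ while filling 'vertices'
    }
private:
    std::vector<double> vertices;
};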
I sometimes have the same problems as you have described: increasingly large classes that need too many helper functions to be accessed in a civilized manner.
When this occurs I try to separate the class into multiple smaller classes, if that is possible and convenient.
Scott Meyers states in Effective C++ that friend classes or functions are mostly not the best option, since the client code might do anything with the object.
Maybe you can try nested classes that deal with the internals of your object. Another option is helper functions that use the public interface of your class; put them into a namespace related to your class.
Another way to keep your classes free of cruft is to use the pimpl idiom. Hide your private implementation behind a pointer to a class that actually implements whatever it is that you're doing, and then expose a limited subset of features to whoever is the consumer of your class.
// Your public API in foo.h (note: only foo.cpp should #include foo_impl.h)
class FooImpl;

class Foo {
public:
    bool func(int i); // defined in foo.cpp as { return impl_->func(i); }
private:
    FooImpl* impl_;
};
There are many ways to implement this. The Boost pimpl template in the Vault is pretty good. Using smart pointers is another useful way of handling this, too.
http://www.boost.org/doc/libs/1_46_1/libs/smart_ptr/sp_techniques.html#pimpl
An alternative are free functions that need parameters, but even though premature optimization is evil, and I haven't actually profiled or disassembled it... I still DREAD the mere thought of passing all the stuff I need sometimes, even just as reference, even though that should be a simple address per argument.
So, let me get this entirely straight. You haven't profiled or disassembled. But somehow, you intend on ... making functions work ... without arguments? How, exactly, do you propose to program without using function arguments? Member functions are no more or less efficient than free functions.
More importantly, you come up with lots of logical reasons why you know you're wrong. I think the problem here is in your head, which possibly stems from you being completely irrational, and nothing that any answer from any of us can help you with.
Generic algorithms that take parameters are the basis of modern object-oriented programming; that's the entire point of both templates and inheritance.

Pimpl idiom vs Pure virtual class interface

I was wondering what would make a programmer to choose either Pimpl idiom or pure virtual class and inheritance.
I understand that the pimpl idiom comes with one explicit extra indirection for each public method, plus the object-creation overhead.
The pure virtual class, on the other hand, comes with an implicit indirection (vtable) for the inheriting implementation, and I understand that there is no object-creation overhead.
EDIT: But you'd need a factory if you create the object from the outside
What makes the pure virtual class less desirable than the pimpl idiom?
When writing a C++ class, it's appropriate to think about whether it's going to be
A Value Type
Copy by value, identity is never important. It's appropriate for it to be a key in a std::map. Example, a "string" class, or a "date" class, or a "complex number" class. To "copy" instances of such a class makes sense.
An Entity type
Identity is important. Always passed by reference, never by "value". Often, doesn't make sense to "copy" instances of the class at all. When it does make sense, a polymorphic "Clone" method is usually more appropriate. Examples: A Socket class, a Database class, a "policy" class, anything that would be a "closure" in a functional language.
Both pImpl and pure abstract base class are techniques to reduce compile time dependencies.
However, I only ever use pImpl to implement Value types (type 1), and only sometimes when I really want to minimize coupling and compile-time dependencies. Often, it's not worth the bother. As you rightly point out, there's more syntactic overhead because you have to write forwarding methods for all of the public methods. For type 2 classes, I always use a pure abstract base class with associated factory method(s).
Pointer to implementation is usually about hiding structural implementation details. Interfaces are about instancing different implementations. They really serve two different purposes.
The pimpl idiom helps you reduce build dependencies and times, especially in large applications, and minimizes header exposure of the implementation details of your class to one compilation unit. The users of your class should not even need to be aware of the existence of a pimpl (except as a cryptic pointer to which they are not privy!).
Abstract classes (pure virtuals) are something of which your clients must be aware: if you try to use them to reduce coupling and circular references, you need to add some way of allowing them to create your objects (e.g. through factory methods or classes, dependency injection or other mechanisms).
I was searching for an answer to the same question.
After reading some articles and getting some practice I prefer using "pure virtual class interfaces".
They are more straightforward (this is a subjective opinion). The pimpl idiom makes me feel I'm writing code "for the compiler", not for the "next developer" who will read my code.
Some testing frameworks have direct support for mocking pure virtual classes.
It's true that you need a factory to be accessible from the outside.
But if you want to leverage polymorphism, that's also a "pro", not a "con" ...and a simple factory method does not really hurt that much.
The only drawback (I'm trying to investigate this) is that the pimpl idiom could be faster:
- the proxy calls can be inlined, while inheriting necessarily needs an extra access to the object's vtable at runtime;
- the memory footprint of the pimpl public proxy class is smaller (you can easily do optimizations for faster swaps and other similar optimizations).
I hate pimpls! They make the class ugly and unreadable. All methods are redirected to the pimpl. You never see in the header what functionality the class has, so you cannot refactor it (e.g., simply change the visibility of a method). The class feels "pregnant". I think using interfaces is better, and that is really enough to hide the implementation from the client. You can even let one class implement several interfaces to keep them thin. One should prefer interfaces!
Note: you do not necessarily need a factory class. What is relevant is that the class's clients communicate with its instances via the appropriate interface.
The hiding of private methods strikes me as strange paranoia, and I do not see the reason for it, since we have interfaces.
There's a very real problem with shared libraries that the pimpl idiom circumvents neatly and that pure virtuals can't: you cannot safely modify or remove data members of a class without forcing users of the class to recompile their code. That may be acceptable under some circumstances, but not e.g. for system libraries.
To explain the problem in detail, consider the following code in your shared library/header:
// header
struct A
{
public:
    A();
    // more public interface, some of which uses the int below
private:
    int a;
};

// library
A::A()
    : a(0)
{}
The compiler emits code in the shared library that calculates the address of the integer to be initialized to be a certain offset (probably zero in this case, because it's the only member) from the pointer to the A object it knows to be this.
On the user side of the code, a new A will first allocate sizeof(A) bytes of memory, then hand a pointer to that memory to the A::A() constructor as this.
If in a later revision of your library you decide to drop the integer, make it larger or smaller, or add members, there will be a mismatch between the amount of memory the user's code allocates and the offsets the constructor code expects. The likely result is a crash, if you're lucky; if you're less lucky, your software behaves oddly.
By pimpl'ing, you can safely add and remove data members to the inner class, as the memory allocation and constructor call happen in the shared library:
// header
struct A
{
public:
    A();
    // more public interface, all of which delegates to the impl
private:
    void* impl;
};

// library
A::A()
    : impl(new A_impl())
{}
All you need to do now is keep your public interface free of data members other than the pointer to the implementation object, and you're safe from this class of errors.
Edit: I should maybe add that the only reason I'm talking about the constructor here is that I didn't want to provide more code - the same argumentation applies to all functions that access data members.
We must not forget that inheritance is a stronger, closer coupling than delegation. I would also take into account all the issues raised in the answers given when deciding what design idioms to employ in solving a particular problem.
Although this is broadly covered in the other answers, maybe I can be a bit more explicit about one benefit of pimpl over virtual base classes:
A pimpl approach is transparent from the user's point of view, meaning you can e.g. create objects of the class on the stack and use them directly in containers. If you try to hide the implementation using an abstract virtual base class, you will need to return a shared pointer to the base class from a factory, complicating its use. Consider the following equivalent client code:
// Pimpl
Object pi_obj(10);
std::cout << pi_obj.SomeFun1();

std::vector<Object> objs;
objs.emplace_back(3);
objs.emplace_back(4);
objs.emplace_back(5);
for (auto& o : objs)
    std::cout << o.SomeFun1();

// Abstract Base Class
auto abc_obj = ObjectABC::CreateObject(20);
std::cout << abc_obj->SomeFun1();

std::vector<std::shared_ptr<ObjectABC>> objs2;
objs2.push_back(ObjectABC::CreateObject(13));
objs2.push_back(ObjectABC::CreateObject(14));
objs2.push_back(ObjectABC::CreateObject(15));
for (auto& o : objs2)
    std::cout << o->SomeFun1();
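For reference, a sketch of the declarations the abstract-base half of the snippet assumes (the names follow the snippet above; the pimpl-style Object would look like the earlier pimpl examples on this page):

#include <memory>

// Public header: clients see only the interface and the factory.
class ObjectABC {
public:
    virtual ~ObjectABC() = default;
    virtual int SomeFun1() = 0;
    static std::shared_ptr<ObjectABC> CreateObject(int value);
};

// Implementation file: the concrete type never appears in a public header.
class ObjectImpl : public ObjectABC {
public:
    explicit ObjectImpl(int v) : value(v) {}
    int SomeFun1() override { return value; }
private:
    int value;
};

std::shared_ptr<ObjectABC> ObjectABC::CreateObject(int value) {
    return std::make_shared<ObjectImpl>(value);
}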
In my understanding these two things serve completely different purposes. The purpose of the pimpl idiom is basically to give you a handle to your implementation so you can do things like fast swaps for a sort.
The purpose of virtual classes is more along the lines of allowing polymorphism, i.e. you have a pointer to an object of an unknown derived type, and when you call function x you always get the right function for whatever class the base pointer actually points to.
Apples and oranges really.
The most annoying problem with the pimpl idiom is that it makes existing code extremely hard to maintain and analyse. So using pimpl you pay with developer time and frustration only to "reduce build dependencies and times and minimize header exposure of the implementation details". Decide for yourself whether it is really worth it.
Especially "build times" is a problem you can solve by better hardware or using tools like Incredibuild ( www.incredibuild.com, also already included in Visual Studio 2017 ), thus not affecting your software design. Software design should be generally independent of the way the software is built.