Best way to use a C++ Interface - c++

I have an interface class similar to:
class IInterface
{
public:
virtual ~IInterface() {}
virtual void methodA() = 0;
virtual void methodB() = 0;
};
I then implement the interface:
class AImplementation : public IInterface
{
// etc... implementation here
};
When I use the interface in an application, is it better to create an instance of the concrete class AImplementation? E.g.
int main()
{
AImplementation* ai = new AImplementation();
}
Or is it better to put a factory "create" member function in the Interface like the following:
class IInterface
{
public:
virtual ~IInterface() {}
static std::tr1::shared_ptr<IInterface> create(); // implementation in .cpp
virtual void methodA() = 0;
virtual void methodB() = 0;
};
Then I would be able to use the interface in main like so:
int main()
{
std::tr1::shared_ptr<IInterface> test(IInterface::create());
}
The 1st option seems to be common practice (not to say it's right). However, the 2nd option was sourced from "Effective C++".

One of the most common reasons for using an interface is so that you can "program against an abstraction" rather than a concrete implementation.
The biggest benefit of this is that it allows changing of parts of your code while minimising the change on the remaining code.
Therefore although we don't know the full background of what you're building, I would go for the Interface / factory approach.
Having said this, in smaller applications or prototypes I often start with concrete classes until I get a feel for where/if an interface would be desirable. Interfaces can introduce a level of indirection that may just not be necessary for the scale of app you're building.
As a result in smaller apps, I find I don't actually need my own custom interfaces. Like so many things, you need to weigh up the costs and benefits specific to your situation.

There is yet another alternative which you haven't mentioned:
int main(int argc, char* argv[])
{
//...
boost::shared_ptr<IInterface> test(new AImplementation);
//...
return 0;
}
In other words, one can use a smart pointer without using a static "create" function. I prefer this method, because a "create" function adds nothing but code bloat, while the benefits of smart pointers are obvious.
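A minimal sketch of the same idea with std::shared_ptr, the C++11 standard counterpart of boost::shared_ptr and std::tr1::shared_ptr. The methodA body and the liveCount counter are assumptions, added only to make the object's lifetime observable:

```cpp
#include <cassert>
#include <memory>

class IInterface {
public:
    virtual ~IInterface() {}
    virtual int methodA() = 0;
};

class AImplementation : public IInterface {
public:
    static int liveCount;               // hypothetical counter, for illustration only
    AImplementation()  { ++liveCount; }
    ~AImplementation() { --liveCount; }
    int methodA() { return 1; }
};
int AImplementation::liveCount = 0;

int useInterface() {
    // The caller names the concrete class once, then works through the interface.
    std::shared_ptr<IInterface> test = std::make_shared<AImplementation>();
    return test->methodA();
    // No delete needed: the object is destroyed when the last shared_ptr goes away.
}
```

After useInterface() returns, liveCount is back at zero, with no explicit delete anywhere.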

There are two separate issues in your question:
1. How to manage the storage of the created object.
2. How to create the object.
Part 1 is simple - you should use a smart pointer like std::tr1::shared_ptr to prevent memory leaks that otherwise require fancy try/catch logic.
Part 2 is more complicated.
You can't just write create() in main() like you want to - you'd have to write IInterface::create(), because otherwise the compiler will be looking for a global function called create, which isn't what you want. It might seem like having the 'std::tr1::shared_ptr test' initialized with the value returned by create() would do what you want, but that's not how C++ compilers work.
As to whether using a factory method on the interface is a better way to do this than just using new AImplementation(), it's possible it'd be helpful in your situation, but beware of speculative complexity - if you're writing the interface so that it always creates an AImplementation and never a BImplementation or a CImplementation, it's hard to see what the extra complexity buys you.

"Better" in what sense?
The factory method doesn't buy you much if you only plan to have, say, one concrete class. (But then again, if you only plan to have one concrete class, do you really need the interface class at all? Maybe yes, if you're using COM.) In any case, if you can foresee a small, fixed limit on the number of concrete classes, then the simpler implementation may be the "better" one, on the whole.
But if there may be many concrete classes, and if you don't want to have the base class be tightly coupled to them, then the factory pattern may be useful.
And yes, this can help reduce coupling -- if the base class provides some means for the derived classes to register themselves with the base class. This would allow the factory to know which derived classes exist, and how to create them, without needing compile-time information about them.
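One way the self-registration idea could look is sketched below. Registry, its add/create functions, the name() method and the "A" key are all hypothetical names chosen for this illustration; this is not a standard or library API:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

class IInterface {
public:
    virtual ~IInterface() {}
    virtual std::string name() const = 0;  // hypothetical method for the demo
};

// Hypothetical registry: maps a key to a function that builds an object.
class Registry {
public:
    typedef std::function<std::shared_ptr<IInterface>()> Creator;

    static Registry& instance() {
        static Registry r;  // constructed on first use
        return r;
    }
    // Returns false if the key was already taken.
    bool add(const std::string& key, Creator c) {
        return creators_.insert(std::make_pair(key, c)).second;
    }
    // Returns an empty pointer for an unknown key.
    std::shared_ptr<IInterface> create(const std::string& key) const {
        std::map<std::string, Creator>::const_iterator it = creators_.find(key);
        return it == creators_.end() ? std::shared_ptr<IInterface>() : it->second();
    }
private:
    std::map<std::string, Creator> creators_;
};

// A concrete class registers itself; the registry never names it.
class AImplementation : public IInterface {
public:
    std::string name() const { return "A"; }
};

namespace {
std::shared_ptr<IInterface> makeA() { return std::make_shared<AImplementation>(); }
// Runs during static initialization of this translation unit.
const bool registeredA = Registry::instance().add("A", makeA);
}
```

In a multi-file project each derived class would carry its own registration in its own .cpp, so adding a class means adding a file rather than editing the base class or the registry.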

Use the 1st method. Your factory method in the 2nd option would have to be implemented per-concrete class and this is not possible to do in the interface. I.e., IInterface::create() has no idea exactly which concrete class you actually wish to instantiate.
A static method cannot be virtual, and implementing a non-static create() method in your concrete classes has not really won you anything in this case.
Factory methods are certainly useful, but this is not the correct use.
Which item in Effective C++ recommends the 2nd option? I don't see it in mine (though I don't have the second book either). That may clear up a misunderstanding.

I would go with the first option just because it's more common and more understandable. It's really up to you, but if you're working on a commercial app then I would ask my peers what they use.

I do have a very simple question there:
Are you sure you want to use a pointer ?
This question might seem illogical, but people coming from a Java background use new much more often than required. In your example, creating the variable on the stack would be amply sufficient.

Related

Is there any way to avoid declaring virtual methods when storing (children) pointers?

I have run into an annoying problem lately, and I am not satisfied with my own workaround: I have a program that maintains a vector of pointers to a base class, and I am storing there all kind of children object-pointers. Now, each child class has methods of their own, and the main program may or not may call these methods, depending on the type of object (note though that they all heavily use common methods of the base class, so this justify inheritance).
I have found useful to have an "object identifier" to check the class type (and then either call the method or not), which is already not very beautiful, but this is not the main inconvenience. The main inconvenience is that, if I want to actually be able to call a derived class method using the base class pointer (or even just store the pointer in the pointer array), then one need to declare the derived methods as virtual in the base class.
Makes sense from the C++ coding point of view... but this is not practical in my case (from the development point of view), because I am planning to create many different children classes in different files, perhaps made by different people, and I don't want to tweak/maintain the base class each time, to add virtual methods!
How to do this? Essentially, what I am asking (I guess) is how to implement something like Objective-C NSArrays - if you send a message to an object that does not implement the method, well, nothing happens.
regards
Instead of this:
// variant A: declare everything in the base class
void DoStuff_A(Base* b) {
if (b->TypeId() == DERIVED_1)
b->DoDerived1Stuff();
else if (b->TypeId() == DERIVED_2)
b->DoDerived2Stuff();
}
or this:
// variant B: declare nothing in the base class
void DoStuff_B(Base* b) {
if (b->TypeId() == DERIVED_1)
(dynamic_cast<Derived1*>(b))->DoDerived1Stuff();
else if (b->TypeId() == DERIVED_2)
(dynamic_cast<Derived2*>(b))->DoDerived2Stuff();
}
do this:
// variant C: declare the right thing in the base class
b->DoStuff();
Note there's a single virtual function in the base per stuff that has to be done.
If you find yourself in a situation where you are more comfortable with variants A or B than with variant C, stop and rethink your design. You are coupling components too tightly and in the end it will backfire.
I am planning to create many different children classes in different
files, perhaps made by different people, and I don't want to
tweak/maintain the base class each time, to add virtual methods!
You are OK with tweaking DoStuff each time a derived class is added, but tweaking Base is a no-no. May I ask why?
If your design does not fit in either A, B or C pattern, show what you have, for clairvoyance is a rare feat these days.
You can do what you describe in C++, but not using functions. It is, by the way, kind of horrible but I suppose there might be cases in which it's a legitimate approach.
First way of doing this:
Define a function with a signature something like boost::variant parseMessage(std::string, std::vector<boost::variant>); and perhaps a string of convenience functions with common signatures on the base class and include a message lookup table on the base class which takes functors. In each class constructor add its messages to the message table and the parseMessage function then parcels off each message to the right function on the class.
It's ugly and slow but it should work.
Second way of doing this:
Define the virtual functions further down the hierarchy, so if you want to add int foo(bar*); you first add a class that defines it as virtual and then ensure every class that wants to define int foo(bar*); inherits from it. You can then use dynamic_cast to ensure that the pointer you are looking at inherits from this class before trying to call int foo(bar*);. Possibly these interface-adding classes could be pure virtual so they can be mixed in at various points using multiple inheritance, but that may have its own problems.
This is less flexible than the first way and requires the classes that implement a function to be linked to each other. Oh, and it's still ugly.
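A rough sketch of this second way, assuming a hypothetical mix-in interface called Fooable and a simplified foo() that takes no arguments. Objects that do not implement the interface are skipped, which is close to the "nothing happens" behaviour asked for:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Base {
public:
    virtual ~Base() {}          // makes Base polymorphic, so dynamic_cast works
};

// Hypothetical capability interface, declared separately from Base.
class Fooable {
public:
    virtual ~Fooable() {}
    virtual int foo() = 0;
};

class Plain : public Base {};   // does not support foo()

class Fancy : public Base, public Fooable {
public:
    int foo() { return 42; }
};

// Calls foo() on every object that supports it and skips the rest.
int sumFoo(const std::vector<Base*>& objects) {
    int total = 0;
    for (std::size_t i = 0; i < objects.size(); ++i) {
        // Cross-cast: yields a null pointer when the object is not Fooable.
        if (Fooable* f = dynamic_cast<Fooable*>(objects[i])) {
            total += f->foo();
        }
    }
    return total;
}
```

Base itself never has to change when a new capability is added; only the classes that implement it and the code that calls it need to see the new interface.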
But mostly I suggest you try and write C++ code like C++ code not Objective-C code.
This can be solved by adding some sort of introspection capabilities and meta object system. This talk Metadata and reflection in C++ — Jeff Tucker demonstrates how to do this using c++'s template meta programming.
If you don't want to go to the trouble of implementing one yourself, then it would be easier to use an existing one such as Qt's meta object system. Note that this solution does not work with multiple inheritance due to limitations in the meta object compiler: QObject Multiple Inheritance.
With that installed, you can query for the presence of methods and call them. This is quite tedious to do by hand, so the easiest way to call such methods is using the signal and slot mechanism.
There is also GObject, which is quite similar, and there are others.
If you are planning to create many different child classes in different files, perhaps made by different people, I would guess you also don't want to change your main code for every child class. Then I think what you need to do in your base class is define several (not too many) virtual functions with empty implementations, BUT those functions should mark a point in the logic where they are called, like "AfterInsert" or "BeforeSorting", etc.
Usually there are not too many places in the logic where you want derived classes to perform their own logic.
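A small sketch of such hooks, under the assumption that the base class owns a list of integers and a sortItems() routine. The class and member names are made up for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

class Base {
public:
    virtual ~Base() {}
    // The main logic is fixed; derived classes only plug into the hooks.
    void sortItems() {
        beforeSorting();
        std::sort(items.begin(), items.end());
        afterSorting();
    }
    std::vector<int> items;
protected:
    virtual void beforeSorting() {}  // empty by default: nothing happens
    virtual void afterSorting() {}
};

// A child class that only cares about one hook.
class DropNegatives : public Base {
protected:
    void beforeSorting() {
        items.erase(std::remove_if(items.begin(), items.end(), isNegative),
                    items.end());
    }
private:
    static bool isNegative(int x) { return x < 0; }
};
```

The calling code only ever invokes sortItems(), so new child classes can be added without touching either the caller or the base class.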

Why bother with virtual functions in c++?

This is not a question about how they work and are declared; that, I think, is pretty much clear to me. The question is about why to use them.
I suppose the practical reason is to simplify bunch of other code to relate and declare their variables of base type, to handle objects and their specific methods from many other subclasses?
Could this be done by templating and typechecking, like I do it in Objective C? If so, what is more efficient? I find it confusing to declare object as one class and instantiate it as another, even if it is its child.
Sorry for the stupid questions, but I haven't done any real projects in C++ yet, and since I am an active Objective-C developer (it is a much smaller language, thus relying heavily on the SDK's functionality, like OS X and iOS), I need a clear view of the parallels between the two cousins.
Yes, this can be done with templates, but then the caller must know what the actual type of the object is (the concrete class) and this increases coupling.
With virtual functions the caller doesn't need to know the actual class - it operates through a pointer to a base class, so you can compile the client once and the implementor can change the actual implementation as much as it wants and the client doesn't have to know about that as long as the interface is unchanged.
Virtual functions implement polymorphism. I don't know Obj-C, so I cannot compare both, but the motivating use case is that you can use derived objects in place of base objects and the code will work. If you have a compiled and working function foo that operates on a reference to base you need not modify it to have it work with an instance of derived.
You could do that (assuming that you had runtime type information) by obtaining the real type of the argument and then dispatching directly to the appropriate function with a switch of sorts, but that would require either manually modifying the switch for each new type (high maintenance cost) or having reflection (unavailable in C++) to obtain the method pointer. Even then, after obtaining a method pointer you would have to call it, which is as expensive as the virtual call.
As to the cost associated to a virtual call, basically (in all implementations with a virtual method table) a call to a virtual function foo applied on object o: o.foo() is translated to o.vptr[ 3 ](), where 3 is the position of foo in the virtual table, and that is a compile time constant. This basically is a double indirection:
From the object o obtain the pointer to the vtable, index that table to obtain the pointer to the function and then call. The extra cost compared with a direct non-polymorphic call is just the table lookup. (In fact there can be other hidden costs when using multiple inheritance, as the implicit this pointer might have to be shifted), but the cost of the virtual dispatch is very small.
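The double indirection can be imitated by hand. The sketch below is only a model of what a typical vtable-based implementation does; the names Object, vptr, baseTable and callFoo are invented for the illustration, and real compilers are free to lay things out differently:

```cpp
#include <cassert>

struct Object;                         // forward declaration
typedef int (*Method)(const Object*);  // every "virtual" receives the object itself

struct Object {
    const Method* vptr;  // pointer to the table of function pointers
    int value;
};

int baseFoo(const Object* o)    { return o->value; }
int derivedFoo(const Object* o) { return o->value * 2; }

// One table per "class"; slot 0 plays the role of foo.
const Method baseTable[]    = { baseFoo };
const Method derivedTable[] = { derivedFoo };

// Equivalent of o.foo(): load vptr, index the table, call through the pointer.
int callFoo(const Object* o) {
    assert(o && o->vptr);
    return o->vptr[0](o);
}
```

callFoo never inspects which table it was given, which mirrors how a caller compiled against the base class keeps working with derived objects.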
I don't know the first thing about Objective-C, but here's why you want to "declare an object as one class and instantiate it as another": the Liskov Substitution Principle.
Since a PDF is a document, and an OpenOffice.org document is a document, and a Word Document is a document, it's quite natural to write
Document *d;
if (ends_with(filename, ".pdf"))
d = new PdfDocument(filename);
else if (ends_with(filename, ".doc"))
d = new WordDocument(filename);
else
d = new GenericDocument(filename); // GenericDocument is a hypothetical fallback; you get the point
d->print();
Now, for this to work, print would have to be virtual, or be implemented using virtual functions, or be implemented using a crude hack that reinvents the virtual wheel. The program needs to know at runtime which of various print methods to apply.
Templating solves a different problem, where you determine at compile time which of the various containers you're going to use (for example) when you want to store a bunch of elements. If you operate on those containers with template functions, then you don't need to rewrite them when you switch containers, or add another container to your program.
A virtual function is important in inheritance. Think of an example where you have a CMonster class and then a CRaidBoss and CBoss class that inherit from CMonster.
Both need to be drawn. A CMonster has a Draw() function, but the way a CRaidBoss and a CBoss are drawn is different. Thus, the implementation is left to them by utilizing the virtual function Draw.
Well, the idea is simply to allow the compiler to perform checks for you.
It's like a lot of features : ways to hide what you don't want to have to do yourself. That's abstraction.
Inheritance, interfaces, etc. allow you to provide an interface to the compiler for the implementation code to match.
If you didn't have the virtual function mechanism, you would have to write:
class A
{
public:
virtual ~A() {} // only here so that the dynamic_cast below compiles
void do_something();
};
class B : public A
{
public:
void do_something(); // this one "hides" A::do_something(); it does not override it.
};
void DoSomething( A* object )
{
// calling object->do_something() will ALWAYS call A::do_something()
// that's not what you want if object is a B...
// so we have to check manually:
B* b_object = dynamic_cast<B*>( object );
if( b_object != NULL ) // ok, it's a B object, call B::do_something()
{
b_object->do_something();
}
else
{
object->do_something(); // that's an A, call A::do_something()
}
}
Here there are several problems :
you have to write this for each function redefined in a class hierarchy.
you have one additional if for each child class.
you have to touch this function again each time you add a definition to the whole hierarchy.
it's visible code, you can get it wrong easily, each time
So, marking functions virtual does this correctly in an implicit way, rerouting automatically, in a dynamic way, the function call to the correct implementation, depending on the final type of the object.
You don't have to write any of this logic yourself, so you can't introduce errors in it, and you have one less thing to worry about.
It's the kind of thing you don't want to bother with as it can be done by the compiler/runtime.
The use of templates is also technically known as polymorphism (parametric polymorphism) among theorists. Yep, both are valid approaches to the problem. The implementation techniques employed explain the better or worse performance of each.
For example, Java implements generics, but through type erasure. This means that it only appears to use templates; under the surface it is plain old polymorphism.
C++ has very powerful templates. Using templates can make code quicker, because calls are resolved at compile time, though each use of a template instantiates it for the given type. This means that, if you use std::vector for ints, doubles and strings, you'll have three different vector classes, so the size of the executable will suffer.

Factory Pattern in C++ -- doing this correctly?

I am relatively new to "design patterns" as they are referred to in a formal sense. I've not been a professional for very long, so I'm pretty new to this.
We've got a pure virtual interface base class. This interface class is obviously to provide the definition of what functionality its derived children are supposed to do. The current use and situation in the software dictates what type of derived child we want to use, so I recommended creating a wrapper that will communicate which type of derived child we want and return a Base pointer that points to a new derived object. This wrapper, to my understanding, is a factory.
Well, a colleague of mine created a static function in the Base class to act as the factory. This causes me trouble for two reasons. First, it seems to break the interface nature of the Base class. It feels wrong to me that the interface would itself need to have knowledge of the children derived from it.
Secondly, it causes more problems when I try to re-use the Base class across two different Qt projects. One project is where I am implementing the first (and probably only real implementation for this one class... though i want to use the same method for two other features that will have several different derived classes) derived class and the second is the actual application where my code will eventually be used. My colleague has created a derived class to act as a tester for the real application while I code my part. This means that I've got to add his headers and cpp files to my project, and that just seems wrong since I'm not even using his code for the project while I implement my part (but he will use mine when it is finished).
Am I correct in thinking that the factory really needs to be a wrapper around the Base class rather than the Base acting as the factory?
You do NOT want to use your interface class as the factory class. For one, if it is a true interface class, there is no implementation. Second, if the interface class does have some implementation defined (in addition to the pure virtual functions), making a static factory method now forces the base class to be recompiled every time you add a child class implementation.
The best way to implement the factory pattern is to have your interface class separate from your factory.
A very simple (and incomplete) example is below:
class MyInterface
{
public:
virtual ~MyInterface() {} // lets callers delete through a MyInterface*
virtual void MyFunc() = 0;
};
class MyImplementation : public MyInterface
{
public:
virtual void MyFunc() {}
};
class MyFactory
{
public:
static MyInterface* CreateImplementation(...);
};
I'd have to agree with you. Probably one of the most important principles of object oriented programming is to have a single responsibility for the scope of a piece of code (whether it's a method, class or namespace). In your case, your base class serves the purpose of defining an interface. Adding a factory method to that class, violates that principle, opening the door to a world of shi... trouble.
Yes, a static factory method in the interface (base class) requires it to have knowledge of all possible instantiations. That way, you don't get any of the flexibility the Factory Method pattern is intended to bring.
The Factory should be an independent piece of code, used by client code to create instances. You have to decide somewhere in your program what concrete instance to create. Factory Method allows you to avoid having the same decision spread out through your client code. If later you want to change the implementation (or e.g. for testing), you have just one place to edit: this may be e.g. a simple global change, through conditional compilation (usually for tests), or even via a dependency injection configuration file.
Be careful about how client code communicates what kind of implementation it wants: that's not an uncommon way of reintroducing the dependencies factories are meant to hide.
It's not uncommon to see factory member functions in a class, but it makes my eyes bleed. Often their use has been mixed up with the functionality of the named constructor idiom. Moving the creation function(s) to a separate factory class will also buy you more flexibility to swap factories during testing.
When the interface is just for hiding the implementation details and there will be only one implementation of the Base interface ever, it could be ok to couple them. In that case, the factory function is just a new name for the constructor of the actual implementation.
However, that case is rare. Except when explicitly designed to have only one implementation ever, you are better off assuming that multiple implementations will exist at some point in time, if only for testing (as you discovered).
So usually it is better to split the Factory part into a separate class.
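A compact sketch of that split, assuming a hypothetical describe() method, two made-up implementations (RealImpl and TestImpl) and an ImplKind selector. In a real project the interface, each implementation and the factory would live in separate files, and only the factory's .cpp would include the implementation headers:

```cpp
#include <cassert>
#include <memory>
#include <string>

// The interface knows nothing about its implementations or the factory.
class Base {
public:
    virtual ~Base() {}
    virtual std::string describe() const = 0;  // hypothetical method
};

// Hypothetical implementations.
class RealImpl : public Base {
public:
    std::string describe() const { return "real"; }
};
class TestImpl : public Base {
public:
    std::string describe() const { return "test"; }
};

enum ImplKind { UseReal, UseTest };

// The factory is the only code that names the concrete classes.
std::unique_ptr<Base> createBase(ImplKind kind) {
    switch (kind) {
        case UseReal: return std::unique_ptr<Base>(new RealImpl);
        case UseTest: return std::unique_ptr<Base>(new TestImpl);
    }
    return std::unique_ptr<Base>();  // unknown kind: no object
}
```

A project that only needs RealImpl could link a different factory .cpp that never mentions TestImpl, which addresses the header-dragging problem described in the question.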

Template or abstract base class?

If I want to make a class adaptable, and make it possible to select different algorithms from the outside -- what is the best implementation in C++?
I see mainly two possibilities:
Use an abstract base class and pass concrete object in
Use a template
Here is a little example, implemented in the various versions:
Version 1: Abstract base class
class Brake {
public: virtual void stopCar() = 0;
};
class BrakeWithABS : public Brake {
public: void stopCar() { ... }
};
class Car {
Brake* _brake;
public:
Car(Brake* brake) : _brake(brake) { brake->stopCar(); }
};
Version 2a: Template
template<class Brake>
class Car {
Brake brake;
public:
Car(){ brake.stopCar(); }
};
Version 2b: Template and private inheritance
template<class Brake>
class Car : private Brake {
using Brake::stopCar;
public:
Car(){ stopCar(); }
};
Coming from Java, I am naturally inclined to always use version 1, but the templates versions seem to be preferred often, e.g. in STL code? If that's true, is it just because of memory efficiency etc (no inheritance, no virtual function calls)?
I realize there is not a big difference between version 2a and 2b, see C++ FAQ.
Can you comment on these possibilities?
This depends on your goals. You can use version 1 if you
Intend to replace brakes of a car (at runtime)
Intend to pass Car around to non-template functions
I would generally prefer version 1 using the runtime polymorphism, because it is still flexible and allows you to have the Car still have the same type: Car<Opel> is another type than Car<Nissan>. If your goal is great performance while using the brakes frequently, I recommend the templated approach. By the way, this is called policy-based design. You provide a brake policy. Since you said you programmed in Java, you are possibly not yet too experienced with C++, so here is one way of doing it:
template<typename Accelerator, typename Brakes>
class Car {
Accelerator accelerator;
Brakes brakes;
public:
void brake() {
brakes.brake();
}
};
If you have lots of policies you can group them together into their own struct, and pass that one, for example as a SpeedConfiguration collecting Accelerator, Brakes and some more. In my projects I try to keep a good deal of code template-free, allowing it to be compiled once into its own object files, without needing the code in headers, but still allowing polymorphism (via virtual functions). For example, you might want to keep common data and functions that non-template code will probably call on many occasions in a base class:
class VehicleBase {
protected:
std::string model;
std::string manufacturer;
// ...
public:
virtual ~VehicleBase() { } // virtual, so deleting through a VehicleBase* is safe
virtual bool checkHealth() = 0;
};
template<typename Accelerator, typename Brakes>
class Car : public VehicleBase {
Accelerator accelerator;
Brakes brakes;
// ...
virtual bool checkHealth() { ... }
};
Incidentally, that is also the approach that C++ streams use: std::ios_base contains flags and stuff that do not depend on the char type or traits like openmode, format flags and stuff, while std::basic_ios then is a class template that inherits it. This also reduces code bloat by sharing the code that is common to all instantiations of a class template.
Private Inheritance?
Private inheritance should be avoided in general. It is only very rarely useful, and containment is a better idea in most cases. A common case where the opposite is true is when size is really crucial (a policy-based string class, for example): the Empty Base Class Optimization can apply when deriving from an empty policy class (one just containing functions).
Read Uses and abuses of Inheritance by Herb Sutter.
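The size effect of the Empty Base Class Optimization can be seen in a small sketch. EmptyPolicy and the two wrapper structs are hypothetical names; the exact sizes depend on the compiler and platform, so treat the comments as what mainstream compilers typically do rather than a guarantee:

```cpp
#include <cassert>

struct EmptyPolicy {     // hypothetical policy: functions only, no data
    int twice(int x) const { return 2 * x; }
};

struct ByContainment {
    EmptyPolicy policy;  // a member must occupy at least one byte, plus padding
    int value;
};

struct ByPrivateInheritance : private EmptyPolicy {
    int value;           // the empty base can share storage with this member
    int doubled() const { return twice(value); }
};
```

On common compilers sizeof(ByPrivateInheritance) equals sizeof(int), while sizeof(ByContainment) is larger because of the extra member and its padding.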
The rule of thumb is:
1) If the choice of the concrete type is made at compile time, prefer a template. It will be safer (compile time errors vs run time errors) and probably better optimized.
2) If the choice is made at run-time (i.e. as a result of a user's action) there is really no choice - use inheritance and virtual functions.
Other options:
Use the Visitor Pattern (let external code work on your class).
Externalize some part of your class, for example via iterators, that generic iterator-based code can work on them. This works best if your object is a container of other objects.
See also the Strategy Pattern (there are c++ examples inside)
Templates are a way to let a class use a variable of which you don't really care about the type. Inheritance is a way to define what a class is based on its attributes. It's the "is-a" versus "has-a" question.
Most of your question has already been answered, but I wanted to elaborate on this bit:
Coming from Java, I am naturally
inclined to always use version 1, but
the templates versions seem to be
preferred often, e.g. in STL code? If
that's true, is it just because of
memory efficiency etc (no inheritance,
no virtual function calls)?
That's part of it. But another factor is the added type safety. When you treat a BrakeWithABS as a Brake, you lose type information. You no longer know that the object is actually a BrakeWithABS. If it is a template parameter, you have the exact type available, which in some cases may enable the compiler to perform better typechecking. Or it may be useful in ensuring that the correct overload of a function gets called. (If stopCar() passes the Brake object to a second function, which may have a separate overload for BrakeWithABS, that overload won't be called if you'd used inheritance and your BrakeWithABS had been cast to a Brake.)
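The overload point can be made concrete with a short sketch. The inspect() overloads and the two wrapper functions are hypothetical, introduced only to show which overload is picked:

```cpp
#include <cassert>
#include <string>

struct Brake { virtual ~Brake() {} };
struct BrakeWithABS : Brake {};

// Two overloads of a hypothetical helper.
std::string inspect(const Brake&)        { return "generic brake"; }
std::string inspect(const BrakeWithABS&) { return "ABS brake"; }

// Inheritance version: inside, the static type is Brake, so the
// Brake overload is chosen no matter what was passed in.
std::string inspectViaBase(const Brake& b) { return inspect(b); }

// Template version: B is deduced as the exact argument type, so
// overload resolution sees BrakeWithABS.
template <class B>
std::string inspectViaTemplate(const B& b) { return inspect(b); }
```

Passing the same BrakeWithABS object to both wrappers yields "generic brake" from the first and "ABS brake" from the second, because overloads are resolved on the static type at compile time.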
Another factor is that it allows more flexibility. Why do all Brake implementations have to inherit from the same base class? Does the base class actually have anything to bring to the table? If I write a class which exposes the expected member functions, isn't that good enough to act as a brake? Often, explicitly using interfaces or abstract base classes constrain your code more than necessary.
(Note, I'm not saying templates should always be the preferred solution. There are other concerns that might affect this, ranging from compilation speed to "what programmers on my team are familiar with" or just "what I prefer". And sometimes, you need runtime polymorphism, in which case the template solution simply isn't possible)
This answer is more or less correct. When you want something parametrized at compile time, you should prefer templates. When you want something parametrized at runtime, you should prefer overriding virtual functions.
However, using templates does not preclude you from doing both (making the template version more flexible):
struct Brake {
virtual void stopCar() = 0;
};
struct BrakeChooser {
BrakeChooser(Brake *brake) : brake(brake) {}
void stopCar() { brake->stopCar(); }
Brake *brake;
};
template<class Brake>
struct Car
{
Car(Brake brake = Brake()) : brake(brake) {}
void slamTheBrakePedal() { brake.stopCar(); }
Brake brake;
};
// instantiation
Car<BrakeChooser> car(BrakeChooser(new AntiLockBrakes()));
That being said, I would probably NOT use templates for this... But it's really just personal taste.
An abstract base class has the overhead of virtual calls, but it has the advantage that all derived classes are really base classes. Not so when you use templates: Car<Brake> and Car<BrakeWithABS> are unrelated to each other and you'll have to either dynamic_cast and check for null or have templates for all the code that deals with Car.
Use an interface if you need to support different Brake classes and their hierarchy at once.
Car( new Brake() )
Car( new BrakeABC() )
Car( new CoolBrake() )
And you don't know this information at compile time.
If you know which Brake you are going to use, 2b is the right choice for specifying different Car classes. Brake in this case will be your car's "Strategy", and you can set a default one.
I wouldn't use 2a. Instead you can add static methods to Brake and call them without an instance.
Personally I would always prefer interfaces over templates, for several reasons:
Template compile and link errors are sometimes cryptic.
It is hard to debug code that is based on templates (at least in the Visual Studio IDE).
Templates can make your binaries bigger.
Templates require you to put all their code in the header file, which makes the template class a bit harder to understand.
Templates are hard for novice programmers to maintain.
I only use templates when the virtual tables create some kind of overhead.
Of course, this is only my own opinion.

Pimpl idiom vs Pure virtual class interface

I was wondering what would make a programmer to choose either Pimpl idiom or pure virtual class and inheritance.
I understand that the pimpl idiom comes with one explicit extra indirection for each public method, plus the object creation overhead.
The pure virtual class, on the other hand, comes with an implicit indirection (the vtable) for the inheriting implementation, and as I understand it no object creation overhead.
EDIT: But you'd need a factory if you create the object from the outside
What makes the pure virtual class less desirable than the pimpl idiom?
When writing a C++ class, it's appropriate to think about whether it's going to be
A Value Type
Copy by value, identity is never important. It's appropriate for it to be a key in a std::map. Example, a "string" class, or a "date" class, or a "complex number" class. To "copy" instances of such a class makes sense.
An Entity type
Identity is important. Always passed by reference, never by "value". Often, doesn't make sense to "copy" instances of the class at all. When it does make sense, a polymorphic "Clone" method is usually more appropriate. Examples: A Socket class, a Database class, a "policy" class, anything that would be a "closure" in a functional language.
Both pImpl and pure abstract base class are techniques to reduce compile time dependencies.
However, I only ever use pImpl to implement Value types (type 1), and only sometimes when I really want to minimize coupling and compile-time dependencies. Often, it's not worth the bother. As you rightly point out, there's more syntactic overhead because you have to write forwarding methods for all of the public methods. For type 2 classes, I always use a pure abstract base class with associated factory method(s).
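As a rough illustration of the two categories, the sketch below shows a value type used as a std::map key and an entity type that forbids copying, offers a polymorphic clone() and is created through a factory method. Date, Connection and TcpConnection are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Value type: freely copyable, comparable, usable as a std::map key.
struct Date {
    int year, month, day;
};
inline bool operator<(const Date& a, const Date& b) {
    if (a.year != b.year) return a.year < b.year;
    if (a.month != b.month) return a.month < b.month;
    return a.day < b.day;
}

// Entity type: identity matters, so copying is disabled and a polymorphic
// clone() is offered instead. Clients only see this abstract base class.
class Connection {
public:
    virtual ~Connection() {}
    virtual std::string name() const = 0;
    virtual std::unique_ptr<Connection> clone() const = 0;

    // Factory method, so clients never name the concrete class.
    static std::unique_ptr<Connection> create(const std::string& name);

    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;
protected:
    Connection() {}
};

namespace {
// Concrete class, invisible outside this translation unit.
class TcpConnection : public Connection {
public:
    explicit TcpConnection(const std::string& name) : name_(name) {}
    std::string name() const override { return name_; }
    std::unique_ptr<Connection> clone() const override {
        return std::unique_ptr<Connection>(new TcpConnection(name_));
    }
private:
    std::string name_;
};
}  // namespace

std::unique_ptr<Connection> Connection::create(const std::string& name) {
    return std::unique_ptr<Connection>(new TcpConnection(name));
}

// Small self-check showing both kinds of type in use.
inline void demo() {
    std::map<Date, std::string> events;
    Date key = {2020, 1, 2};
    events[key] = "release";
    Date copy = key;  // copying a value type is natural
    assert(events.count(copy) == 1);

    std::unique_ptr<Connection> original = Connection::create("db");
    std::unique_ptr<Connection> duplicate = original->clone();
    assert(duplicate->name() == "db");
    assert(duplicate.get() != original.get());  // a distinct entity
}
```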
Pointer to implementation is usually about hiding structural implementation details. Interfaces are about instancing different implementations. They really serve two different purposes.
The pimpl idiom helps you reduce build dependencies and times, especially in large applications, and confines the implementation details of your class to one compilation unit instead of exposing them in the header. The users of your class should not even need to be aware of the existence of a pimpl (except as a cryptic pointer to which they are not privy!).
Abstract classes (pure virtuals) are something of which your clients must be aware: if you try to use them to reduce coupling and circular references, you need to add some way of allowing clients to create your objects (e.g. through factory methods or classes, dependency injection or other mechanisms).
I was searching for an answer to the same question.
After reading some articles and some practice, I prefer using "pure virtual class interfaces".
They are more straightforward (this is a subjective opinion). The pimpl idiom makes me feel I'm writing code "for the compiler", not for the "next developer" who will read my code.
Some testing frameworks have direct support for mocking pure virtual classes.
It's true that you need a factory to be accessible from the outside.
But if you want to leverage polymorphism, that's also a "pro", not a "con" ... and a simple factory method doesn't really hurt that much.
The only drawback (I'm still investigating this) is that the pimpl idiom could be faster:
when the proxy calls are inlined, while inheritance necessarily needs an extra access to the object's vtable at runtime;
the memory footprint of the pimpl public proxy class is smaller (you can easily implement faster swaps and other similar optimizations).
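The testability point can be sketched without any framework. Below, a hand-written mock stands in for a pure virtual interface; ILogger, processOrder and MockLogger are hypothetical names, and a mocking framework would normally generate the recording code for you.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical interface that production code depends on.
class ILogger {
public:
    virtual ~ILogger() {}
    virtual void log(const std::string& message) = 0;
};

// Code under test only knows the interface.
inline void processOrder(int quantity, ILogger& logger) {
    if (quantity <= 0)
        logger.log("rejected");
    else
        logger.log("accepted");
}

// Hand-written mock: records calls instead of doing real logging.
class MockLogger : public ILogger {
public:
    void log(const std::string& message) override { messages.push_back(message); }
    std::vector<std::string> messages;
};

// A tiny test that inspects what the code under test sent to the logger.
inline void demo() {
    MockLogger mock;
    processOrder(3, mock);
    processOrder(0, mock);
    assert(mock.messages.size() == 2);
    assert(mock.messages[0] == "accepted");
    assert(mock.messages[1] == "rejected");
}
```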
I hate pimpls! They make the class ugly and unreadable. All methods are redirected to the pimpl. You never see in the headers what functionality the class has, so you cannot refactor it (e.g. simply change the visibility of a method). The class feels "pregnant". I think using interfaces is better and really enough to hide the implementation from the client. You can even let one class implement several interfaces to keep them thin. One should prefer interfaces!
Note: You do not necessarily need a factory class. What matters is that the class's clients communicate with its instances via the appropriate interface.
Hiding private methods strikes me as a strange paranoia, and I see no reason for it since we have interfaces.
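A minimal sketch of "one class, several thin interfaces" could look like the following. IReadable, IWritable and Buffer are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>

// Two thin interfaces, each describing one role.
class IReadable {
public:
    virtual ~IReadable() {}
    virtual std::string read() const = 0;
};

class IWritable {
public:
    virtual ~IWritable() {}
    virtual void write(const std::string& text) = 0;
};

// One concrete class implements both roles.
class Buffer : public IReadable, public IWritable {
public:
    std::string read() const override { return data_; }
    void write(const std::string& text) override { data_ += text; }
private:
    std::string data_;
};

// Each client sees only the interface it needs.
inline void fill(IWritable& out) { out.write("hello"); }
inline std::string dump(const IReadable& in) { return in.read(); }

inline void demo() {
    Buffer buffer;
    fill(buffer);
    assert(dump(buffer) == "hello");
}
```

Here the clients fill and dump never mention Buffer, so its private members can change without touching them, as long as Buffer's header is not included where only the interfaces are needed.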
There's a very real problem with shared libraries that the pimpl idiom circumvents neatly that pure virtuals can't: you cannot safely modify/remove data members of a class without forcing users of the class to recompile their code. That may be acceptable under some circumstances, but not e.g. for system libraries.
To explain the problem in detail, consider the following code in your shared library/header:
// header
struct A
{
public:
A();
// more public interface, some of which uses the int below
private:
int a;
};
// library
A::A()
: a(0)
{}
The compiler emits code in the shared library that calculates the address of the integer to be initialized to be a certain offset (probably zero in this case, because it's the only member) from the pointer to the A object it knows to be this.
On the user side of the code, a new A will first allocate sizeof(A) bytes of memory, then hand a pointer to that memory to the A::A() constructor as this.
If in a later revision of your library you decide to drop the integer, make it larger, smaller, or add members, there'll be a mismatch between the amount of memory user's code allocates, and the offsets the constructor code expects. The likely result is a crash, if you're lucky - if you're less lucky, your software behaves oddly.
By pimpl'ing, you can safely add and remove data members to the inner class, as the memory allocation and constructor call happen in the shared library:
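// header
struct A_impl; // forward declaration only; the layout stays inside the library
struct A
{
public:
A();
~A();
// more public interface, all of which delegates to the impl
// (copy operations omitted for brevity)
private:
A_impl * impl;
};
// library
struct A_impl
{
int a;
};
A::A()
: impl(new A_impl())
{}
A::~A()
{
delete impl;
}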
All you need to do now is keep your public interface free of data members other than the pointer to the implementation object, and you're safe from this class of errors.
Edit: I should maybe add that the only reason I'm talking about the constructor here is that I didn't want to provide more code. The same argument applies to all functions that access data members.
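The size mismatch can be shown in a single file by modelling two revisions of the class side by side. This is only a sketch: the v1/v2 namespaces and the added_later member are hypothetical stand-ins for "before" and "after" versions of the library header.

```cpp
#include <cassert>
#include <cstddef>

// Two hypothetical revisions of a class that keeps its data members inline.
namespace v1 { struct A { int a; }; }
namespace v2 { struct A { int a; double added_later; }; }

// The same two revisions with the data moved behind a pointer.
namespace pimpl_v1 {
struct A_impl { int a; };
struct A { A_impl* impl; };
}
namespace pimpl_v2 {
struct A_impl { int a; double added_later; };
struct A { A_impl* impl; };
}

// Inline members: the size that client code was compiled against changes.
static_assert(sizeof(v1::A) != sizeof(v2::A),
              "layout visible to clients changed");
// Pimpl: the public object is a single pointer in both revisions.
static_assert(sizeof(pimpl_v1::A) == sizeof(pimpl_v2::A),
              "layout visible to clients is stable");

// How many extra bytes the new library code expects behind `this`.
inline std::size_t clientVisibleGrowth() {
    return sizeof(v2::A) - sizeof(v1::A);
}

inline void demo() {
    assert(clientVisibleGrowth() > 0);
    assert(sizeof(pimpl_v1::A) == sizeof(pimpl_v2::A));
}
```

A client built against v1 would allocate sizeof(v1::A) bytes, while library code built from v2 would write past that. With the pimpl variants the allocation of A_impl happens inside the library, so only the library needs rebuilding.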
We must not forget that inheritance is a stronger, closer coupling than delegation. I would also take into account all the issues raised in the answers given when deciding what design idioms to employ in solving a particular problem.
Although broadly covered in the other answers maybe I can be a bit more explicit about one benefit of pimpl over virtual base classes:
A pimpl approach is transparent from the user's point of view, meaning you can e.g. create objects of the class on the stack and use them directly in containers. If you try to hide the implementation using an abstract virtual base class, you will need to return a shared pointer to the base class from a factory, complicating its use. Consider the following equivalent client code:
// Pimpl
Object pi_obj(10);
std::cout << pi_obj.SomeFun1();
std::vector<Object> objs;
objs.emplace_back(3);
objs.emplace_back(4);
objs.emplace_back(5);
for (auto& o : objs)
std::cout << o.SomeFun1();
// Abstract Base Class
auto abc_obj = ObjectABC::CreateObject(20);
std::cout << abc_obj->SomeFun1();
std::vector<std::shared_ptr<ObjectABC>> objs2;
objs2.push_back(ObjectABC::CreateObject(13));
objs2.push_back(ObjectABC::CreateObject(14));
objs2.push_back(ObjectABC::CreateObject(15));
for (auto& o : objs2)
std::cout << o->SomeFun1();
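For reference, here is one possible set of definitions under which the client code above would compile. It is a sketch under assumptions: the original does not show Object or ObjectABC, so the int return type of SomeFun1, the ObjectImpl class and the stored value are all invented for illustration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// --- Pimpl: a value-like class; Impl would normally live in the .cpp file.
class Object {
public:
    explicit Object(int value);
    ~Object();
    Object(Object&&) noexcept;             // movable, so it can sit in a vector
    Object& operator=(Object&&) noexcept;
    int SomeFun1() const;                  // forwards to the hidden Impl
private:
    struct Impl;
    std::unique_ptr<Impl> impl_;
};

struct Object::Impl { int value; };
Object::Object(int value) : impl_(new Impl{value}) {}
Object::~Object() = default;
Object::Object(Object&&) noexcept = default;
Object& Object::operator=(Object&&) noexcept = default;
int Object::SomeFun1() const { return impl_->value; }

// --- Abstract base class: clients get a shared_ptr from a factory.
class ObjectABC {
public:
    virtual ~ObjectABC() {}
    virtual int SomeFun1() const = 0;
    static std::shared_ptr<ObjectABC> CreateObject(int value);
};

namespace {
class ObjectImpl : public ObjectABC {
public:
    explicit ObjectImpl(int value) : value_(value) {}
    int SomeFun1() const override { return value_; }
private:
    int value_;
};
}  // namespace

std::shared_ptr<ObjectABC> ObjectABC::CreateObject(int value) {
    return std::make_shared<ObjectImpl>(value);
}

// Mirrors the client code: values in a vector vs. pointers in a vector.
inline void demo() {
    std::vector<Object> objs;
    objs.emplace_back(3);
    objs.emplace_back(4);
    assert(objs[0].SomeFun1() == 3 && objs[1].SomeFun1() == 4);

    std::vector<std::shared_ptr<ObjectABC>> objs2;
    objs2.push_back(ObjectABC::CreateObject(13));
    assert(objs2[0]->SomeFun1() == 13);
}
```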
In my understanding these two things serve completely different purposes. The purpose of the pimpl idiom is basically to give you a handle to your implementation so you can do things like fast swaps for a sort.
The purpose of virtual classes is more along the lines of allowing polymorphism, i.e. you have a base pointer to an object of some unknown derived type, and when you call function x you always get the right function for whatever class the base pointer actually points to.
Apples and oranges really.
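The "fast swap" remark can be sketched as follows: swapping two handles only exchanges pointers, however large the hidden payload is. BigRecord and its contents are hypothetical and exist only to illustrate the point.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

class BigRecord {
private:
    // Hidden payload; in a real pimpl this would be defined in the .cpp file.
    struct Impl { std::vector<std::string> lines; };
    std::unique_ptr<Impl> impl_;

public:
    explicit BigRecord(const std::string& text) : impl_(new Impl) {
        impl_->lines.assign(1000, text);  // deliberately bulky payload
    }

    // Swapping exchanges two pointers; the payload itself is never copied.
    void swap(BigRecord& other) noexcept { impl_.swap(other.impl_); }

    const std::string& first() const { return impl_->lines.front(); }
};

inline void demo() {
    BigRecord a("a");
    BigRecord b("b");
    const std::string* payloadOfA = &a.first();

    a.swap(b);

    assert(a.first() == "b");
    assert(b.first() == "a");
    assert(&b.first() == payloadOfA);  // same storage, now owned by b
}
```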
The most annoying problem with the pimpl idiom is that it makes existing code extremely hard to maintain and analyse. So with pimpl you pay in developer time and frustration only to "reduce build dependencies and times and minimize header exposure of the implementation details". Decide for yourself whether it is really worth it.
"Build times" in particular are a problem you can solve with better hardware or with tools like Incredibuild ( www.incredibuild.com, also already included in Visual Studio 2017 ), without affecting your software design. Software design should generally be independent of the way the software is built.