Design with (pure) virtual functions in C++ - c++

First of all, I have to mention that I have read many C++ virtual-function questions on Stack Overflow. I have some knowledge of how they work, but when I start a project and try to design something, I never consider or use virtual or pure virtual implementations. Maybe it is because I lack knowledge of how they work, or I don't know how to realize some things with them. I think it's bad because I don't fully use object-oriented development.
Maybe someone can advise me how to get used to them?

Check out abstract base classes and interfaces in Java or C# to get ideas on when pure virtuals are useful.
Virtual functions are pretty basic to OO. There are plenty of books out there to help you. Myself, I like Larman's Applying UML and Patterns.

but when I start the project and try to design something I never consider/use virtual or pure virtual implementations.
Here's something you can try:
Figure out the set of classes you use
Do you see some class hierarchies? A Circle is-a Shape sort of relationship?
Isolate behavior
Bubble up/down behavior to form interfaces (base classes) (Code to interfaces and not implementations)
Implement these as virtual functions
The responsibility of defining the exact semantics of the operation(s) rests with the sub-classes
Create your sub-classes
Implement (override) the virtual functions
But don't force a hierarchy just for the sake of using them. An example from real code I have been working on recently:
class Codec {
public:
virtual GUID Guid() { return GUID_NULL; }
};
class JpegEncoder : public Codec {
public:
virtual GUID Guid() { return GUID_JpegEncoder; }
};
class PngDecoder : public Codec {
public:
virtual GUID Guid() { return GUID_PngDecoder; }
};

I don't have a ton of time ATM, but here is a simple example.
In my job I maintain an application which talks to various hardware devices. Of these devices, many motors are used for various purposes. Now, I don't know if you have done any development with motors and drives, but they are all a bit different, even if they claim to follow a standard like CANopen. Anyway, you need to create some new code when you switch vendors, perhaps because your motor or drive was end-of-life'd, etc. On top of that, this code has to maintain compatibility with older devices, and we also have various models of similar devices. So, all in all, you have to deal with many different motors and interfaces.
Now, in the code I use an abstract class, named "iMotor", which contains only pure virtual functions. In the implementation code only the iMotor class is referenced. I create a dll for different types of motors with different implementations, but they all implement the iMotor interface. So, all that I need to do to add/change a motor is create a new implementation and drop that dll in place of the old one. Because the code which uses these motor implementations deals only with the iMotor interface it never needs to change, only the implementation of how each motor does what it does needs to change.
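The actual iMotor interface isn't shown here, but a minimal sketch of the idea might look like this (the member functions and names are invented for illustration only):
class iMotor {
public:
    virtual ~iMotor() {}                         // virtual destructor so motors can be deleted through the interface
    virtual bool MoveTo(double position) = 0;    // hypothetical operations every motor must support
    virtual bool Stop() = 0;
    virtual double CurrentPosition() const = 0;
};
// One of several per-vendor implementations, each living in its own dll:
class VendorXMotor : public iMotor {
public:
    virtual bool MoveTo(double position) { /* talk to vendor X's drive */ return true; }
    virtual bool Stop() { return true; }
    virtual double CurrentPosition() const { return 0.0; }
};
// Each dll would also export some factory function (e.g. iMotor* CreateMotor()),
// so the application code never names a concrete motor class.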

If you google for design patterns like the "strategy pattern" and "command pattern" you will find some good uses of interfaces and polymorphism. Besides that, design patterns are always very useful to know.
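For instance, a minimal Strategy-pattern sketch (class and function names invented for illustration) shows how the calling code depends only on an interface while the concrete behaviour is swapped at runtime:
#include <string>

class CompressionStrategy {                       // the interface the caller codes against
public:
    virtual ~CompressionStrategy() {}
    virtual void Compress(const std::string& file) = 0;
};

class ZipCompression : public CompressionStrategy {
public:
    virtual void Compress(const std::string& file) { /* zip it */ }
};

class RarCompression : public CompressionStrategy {
public:
    virtual void Compress(const std::string& file) { /* rar it */ }
};

void Archive(CompressionStrategy& strategy, const std::string& file)
{
    strategy.Compress(file);                      // behaviour is chosen by whoever passed the strategy in
}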

You don't HAVE to use them but they have their advantages.
Generally they are used as an "interface" between two different types of functionality that, code-wise, aren't very related.
An example would be handling file loading. A simple file handling class would seem to be perfect. However, at a later stage you are asked to shift all your files into a single packaged file while maintaining support for individual files for debug purposes. How do you handle loading here? Obviously things will be handled rather differently, because suddenly you can't just open a file. Instead you need to be able to look up the file's location and then seek to that location before loading, pretty much, as normal.
The obvious thing to do is implement an abstract base class. Perhaps call it BaseFile. The OpenFile function handling will differ depending on whether you are using the PackageFile or the DiskFile class. So make that a pure virtual.
Then when you derive the PackageFile and DiskFile classes you provide the appropriate implementation for Opening a file.
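A rough sketch of that hierarchy (the exact signatures here are assumptions for illustration):
#include <string>

class BaseFile {
public:
    virtual ~BaseFile() {}
    virtual bool OpenFile(const std::string& name) = 0;  // behaviour differs per storage backend
    // common Read/Seek/Close helpers could live here
};

class DiskFile : public BaseFile {
public:
    virtual bool OpenFile(const std::string& name) { /* open the individual file on disk */ return true; }
};

class PackageFile : public BaseFile {
public:
    virtual bool OpenFile(const std::string& name) { /* look up the offset in the package, then seek */ return true; }
};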
You can then add something such as
#if !defined( DISK_FILE ) && defined ( _DEBUG )
#define DISK_FILE 1
#elif !defined( DISK_FILE )
#define DISK_FILE 0
#endif
#if DISK_FILE
typedef DiskFile File;
#else
typedef PackageFile File;
#endif
Now you would just use the "File" typedef to do all file handling. Equally, if you don't pre-define DISK_FILE as 0 or 1 and debug is set, it will automatically load from disk; otherwise it will load from the package file.
Of course such a construct still allows you to load from the package file in debug simply by defining DISK_FILE to be 0 in advance, and it also allows you to use disk access in a release build by setting DISK_FILE to 1.

Related

Runtime interfaces and object composition in C++

I am searching for a simple, light-weight solution for interface-based runtime object composition in C++. I want to be able to specify interfaces (method declarations), and objects (creatable through the factory pattern) implementing these. At runtime I want mechanisms to instantiate these objects and interconnect them based on interface-connectors. The method calls at runtime should remain fairly cheap, i.e. only several more instructions per call, comparable to functor patterns.
The whole thing needs to be platform independent (at least MS Windows and Linux). And the solution needs to be licensed liberally, like open-source LGPL or (even better) BSD or something, especially allowing use in commercial products.
What I do not want are heavy things like networking, inter-process-communication, extra compiler steps (one-time code generation is ok though), or dependencies to some heavy libraries (like Qt).
The concrete scenario is: I have such a mechanism in a larger piece of software, but the mechanism is not very well implemented. Interfaces are realized by base classes exported by DLLs. These DLLs also export factory functions to instantiate the implementing objects, based on hand-written class ids.
Before I now start to redesign and implement something better myself, I want to know if there is something out there which would be even better.
Edit: The solution also needs to support multi-threading environments. Additionally, as everything will happen inside the same process, I do not need data serialization mechanisms of any kind.
Edit: I know how such mechanisms work, and I know that several teaching books contain corresponding examples. I do not want to write it myself. The aim of my question is: Is there some sort of "industry standard" lib for this? It is a small problem (within a single process) and I am really only searching for a small solution.
Edit: I got the suggestion to add a pseudo-code example of what I really want to do. So here it is:
Somewhere I want to define interfaces. I do not care if it's C-Headers or some language and code generation.
class interface1 {
public:
virtual void do_stuff(void) = 0;
};
class interface2 {
public:
virtual void do_more_stuff(void) = 0;
};
Then I want to provide (multiple) implementations. These may even be placed in DLL-based plugins. In particular, these two classes may be implemented in two different DLLs that do not know each other at compile time.
class A : public interface1 {
public:
virtual void do_stuff(void) {
// I even need to call further interfaces here
// This call should, however, not require anything heavy, like data serialization or something.
this->con->do_more_stuff();
}
// Interface connectors of some kind. Here I use something like a template
some_connector<interface2> con;
};
class B : public interface2 {
public:
virtual void do_more_stuff() {
// finally doing some stuff
}
};
Finally, in my application's main code I want to be able to compose my application logic at runtime (e.g. based on user input):
void main(void) {
// first I create my objects through a factory
some_object a = some_factory::create(some_guid<A>);
some_object b = some_factory::create(some_guid<B>);
// Then I want to connect the interface-connector 'con' of object 'a' to the instance of object 'b'
some_thing::connect(a, some_guid<A::con>, b);
// finally I want to call an interface-method.
interface1 *ia = a.some_cast<interface1>();
ia->do_stuff();
}
I am perfectly able to write such a solution myself (including all pitfalls). What I am searching for is a solution (e.g. a library) which is used and maintained by a wide user base.
While not widely used, I wrote a library several years ago that does this.
You can see it on GitHub (the zen-core library), and it's also available on Google Code.
The GitHub version only contains the core libraries, which is really all that you need. The Google Code version contains a LOT of extra libraries, primarily for game development, but it does provide a lot of good examples on how to use it.
The implementation was inspired by Eclipse's plugin system, using a plugin.xml file that indicates a list of available plugins, and a config.xml file that indicates which plugins you would like to load. I'd also like to change it so that it doesn't depend on libxml2 and to allow you to specify plugins using other methods.
The documentation has been destroyed thanks to some hackers, but if you think this would be useful then I can write enough documentation to get you started.
A co-worker gave me two further tips:
The Loki library (originating from the Modern C++ Design book):
http://loki-lib.sourceforge.net/
A boost-like library:
http://kifri.fri.uniza.sk/~chochlik/mirror-lib/html/
I still have not looked at all the ideas I got.

Friendship not inherited - what are the alternatives?

I have written/am writing a piece of physics analysis code, initially for myself, that will now hopefully be used and extended by a small group of physicists. None of us are C++ gurus. I have put together a small framework that abstracts the "physics event" data into objects acted on by a chain of tools that can easily be swapped in and out depending on the analysis requirements.
This has created two halves to the code: the "physics analysis" code that manipulates the event objects and produces our results via derivatives of a base "Tool"; and the "structural" code that attaches input files, splits the job into parallel runs, links tools into a chain according to some script, etc.
The problem is this: for others to make use of the code it is essential that every user should be able to follow every single step that modifies the event data in any way. The (many) extra lines of difficult structural code could therefore be daunting, unless it is obviously and demonstrably peripheral to the physics. Worse, looking at it in too much detail might give people ideas - and I'd rather they didn't edit the structural code without very good reason - and most importantly they must not introduce anything that affects the physics.
I would like to be able to:
A) demonstrate in an obvious way that the structural code does not edit the event data in any way
B) enforce this once other users begin extending the code themselves (none of us are experts, and the physics always comes first - translation: anything not bolted down is fair game for a nasty hack)
In my ideal scenario the event data would be private, with the derived physics tools inheriting access from the Tool base class. Of course in reality this is not allowed. I hear there are good reasons for this, but that's not the issue.
Unfortunately, in this case the method of calling getters/setters from the base (which is a friend) would create more problems than it solves - the code should be as clean, as easy to follow, and as connected to the physics as possible in the implementation of the tool itself (a user should not need to be an expert in either C++ or the inner workings of the program to create a tool).
Given that I have a trusted base class and any derivatives will be subject to close scrutiny, is there any other roundabout but well tested way of allowing access to only these derivatives? Or any way of denying access to the derivatives of some other base?
To clarify the situation I have something like
class Event
{
// The event data (particle collections etc)
};
class Tool
{
public:
virtual bool apply(Event* ev) = 0;
};
class ExampleTool : public Tool
{
public:
bool apply(Event* ev)
{
// do something like loop over the electron collection
// and throw away those with low energy
}
};
The ideal would be to limit access to the contents of Event to only these tools for the two reasons (A and B) above.
Thanks everyone for the solutions proposed. I think, as I suspected, the perfect solution I was wishing for is impossible. dribeas' solution would be perfect in any other setting, but it's precisely in the apply() function that the code needs to be as clear and succinct as possible, as we will basically spend all day writing/editing apply() functions, and will also need to understand every line of these written by each of the others. It's not so much about capability as readability and effort. I do like the preprocessor solution from "Useless". It doesn't really enforce the separation, but someone would need to be genuinely malicious to break it. To those who suggested a library, I think this will definitely be a good first step, but it doesn't really address the two main issues (as I'll still need to provide the source anyway).
There are three access qualifiers in C++: public, protected and private. The sentence "with the derived physics tools inheriting access from the Tool base class" seems to indicate that you want protected access, but it is not clear whether the actual data that is private is in Tool (and thus protected suffices) or is currently private in a class that befriends Tool.
In the first case, just make the data protected:
class Tool {
protected:
type data;
};
In the second case, you can try to play nasty tricks on the language, like for example, providing an accessor at the Tool level:
class Data {
type this_is_private;
friend class Tool;
};
class Tool {
protected:
static type& gain_access_to_data( Data& d ) {
return d.this_is_private;
}
};
class OneTool : public Tool {
public:
void foo( Data& d ) {
operate_on( gain_access_to_data(d) );
}
};
But I would avoid it altogether. There is a point where access specifiers stop making sense. They are tools to avoid mistakes, not to police your co-workers, and the fact is that as long as you want them to write code that will need access to that data (Tool extensions) you might as well forget about having absolute protection: you cannot.
A user that wants to gain access to the data might as well just use the newly created backdoor to do so:
struct Evil : Tool {
static type& break_rule( Data & d ) {
return gain_access_to_data( d );
}
};
And now everyone can simply use Evil as a door to Data. I recommend that you read the C++FAQ-lite for more insight on C++.
Provide the code as a library with headers to be used by whoever wants to create tools. This nicely encapsulates the stuff you want to keep intact. It's impossible to prevent hacks if everyone has access to the source and are keen to make changes to anything.
There is also the C-style approach, of restricting visibility rather than access rights. It is enforced more by convention and (to some extent) your build system, rather than the language - although you could use a sort of include guard to prevent "accidental" leakage of the Tool implementation details into the structural code.
-- ToolInterface.hpp --
#include <list>
class Event; // just forward declare it
class ToolStructuralInterface
{
public:
// only what the structural code needs to invoke tools
virtual void invoke(std::list<Event*> &) = 0;
};
-- ToolImplementation.hpp --
class Event
{
// only the tool code sees this header
};
// if you really want to prevent accidental inclusion in the structural code
#define TOOL_PRIVATE_VISIBILITY
-- StructuralImplementation.hpp --
...
#ifdef TOOL_PRIVATE_VISIBILITY
#error "someone leaked tool implementation details into the structural code"
#endif
...
Note that this kind of partitioning lends itself to putting the tool and structural code in separate libraries - you might even be able to restrict access to the structural code separately from the tool code, and just share headers and the compiled library.

Module and classes handling (dynamic linking)

I've run into a bit of an issue, and I'm looking for the best solution concept/theory.
I have a system that needs to use objects. Each object that the system uses has a known interface, likely implemented as an abstract class. The interfaces are known at build time, and will not change. The exact implementation to be used will vary and I have no idea ahead of time what module will be providing it. The only guarantee is that they will provide the interface. The class name and module (DLL) come from a config file or may be changed programmatically.
Now, I have all that set up at the moment using a relatively simple system, set up something like so (rewritten pseudo-code, just to show the basics):
struct ClassID
{
Module * module;
int number;
};
class Module
{
HMODULE module;
function<void * (int)> * createfunc;
static Module * Load(String filename);
IObject * CreateClass(int number)
{
return createfunc(number);
}
};
class ModuleManager
{
bool LoadModule(String filename);
IObject * CreateClass(String classname)
{
ClassID id = AvailableClasses.find(classname);
return id.module->CreateClass(id.number);
}
vector<Module*> LoadedModules;
map<String, ClassID> AvailableClasses;
};
Modules have a few exported functions to give the number of classes they provide and the names/IDs of those, which are then stored. All classes derive from IObject, which has a virtual destructor, stores the source module and has some methods to get the class' ID, what interface it implements and such.
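For completeness, IObject looks roughly like this (a sketch only; the exact methods and types are assumptions, not the real declaration):
class IObject
{
public:
    virtual ~IObject() {}                        // virtual destructor, as described above
    virtual Module * SourceModule() const = 0;   // which module created this object
    virtual ClassID GetClassID() const = 0;      // the class' ID
    virtual int GetInterfaceID() const = 0;      // what interface it implements
};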
The only issue with this is each module has to be manually loaded somewhere (listed in the config file, at the moment). I would like to avoid doing this explicitly (outside of the ModuleManager, inside that I'm not really concerned as to how it's implemented).
I would like to have a similar system without having to handle loading the modules, just create an object and (once it's all set up) it magically appears.
I believe this is similar to what COM is intended to do, in some ways. I looked into the COM system briefly, but it appears to be overkill beyond belief. I only need the classes known within my system and don't need all the other features it handles, just implementations of interfaces coming from somewhere.
My other idea is to use the registry and keep a key with all the known/registered classes and their source modules and numbers, so I can just look them up and it will appear that Manager::CreateClass finds and makes the object magically. This seems like a viable solution, but I'm not sure if it's optimal or if I'm reinventing something.
So, after all that, my question is: How to handle this? Is there an existing technology, if not, how best to set it up myself? Are there any gotchas that I should be looking out for?
COM very likely is what you want. It is very broad but you don't need to use all the functionality. For example, you don't need to require participants to register GUIDs, you can define your own mechanism for creating instances of interfaces. There are a number of templates and other mechanisms to make it easy to create COM interfaces. What's more, since it is a standard, it is easy to document the requirements.
One very important thing to bear in mind is that importing/exporting C++ objects requires all participants to be using the same compiler. If you think that ever could be a problem to you then you should use COM. If you are happy to accept that restriction then you can carry on as you are.
I don't know if any technology exists to do this.
I do know that I worked with a system very similar to this. We used XML files to describe the various classes that different modules made available. Our equivalent of ModuleManager would parse the xml files to determine what to create for the user at run time based on the class name they provided and the configuration of the system. (Requesting an object that implemented interface 'I' could give back any of objects 'A', 'B' or 'C' depending on how the system was configured.)
The big gotcha we found was that the system was very brittle and at times hard to debug/understand. Just reading through the code, it was often near impossible to see what concrete class was being instantiated. We also found that maintaining the XML created more bugs and overhead than expected.
If I was to do this again, I would keep the design pattern of exposing classes from DLL's through interfaces, but I would not try to build a central registry of classes, nor would I derive everything from a base class such as IObject.
I would instead make each module responsible for exposing its own factory function(s) to instantiate objects.
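A minimal sketch of that approach (the function names and the concrete class are made up for illustration): each module exports a plain C factory, and the host looks it up with the normal dynamic-loading API instead of keeping a central registry:
#include <cstring>

// In each module (DLL / shared object), assuming IObject is declared in a shared header:
extern "C" IObject * CreateObject(const char * classname)
{
    if (std::strcmp(classname, "JpegEncoder") == 0)
        return new JpegEncoderImpl();   // hypothetical concrete class provided by this module
    return 0;                           // this module does not provide the requested class
}

// In the host, after LoadLibrary()/dlopen() on the module, something like:
//   CreateObjectFn create = (CreateObjectFn)GetProcAddress(module, "CreateObject");
//   IObject * obj = create("JpegEncoder");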

OOP vs macro problem

I came across this problem via a colleague today. He had a design for a front end system which goes like this:
class LWindow
{
//Interface for common methods to Windows
};
class LListBox : public LWindow
{
//Do not override methods in LWindow.
//Interface for List specific stuff
};
class LComboBox : public LWindow {}; //So on
The Window system should work on multiple platforms. Suppose for the moment we target Windows and Linux. For Windows we have an implementation for the interface in LWindow. And we have multiple implementations for all the LListBoxes, LComboBoxes, etc. My reaction was to pass an LWindow*(Implementation object) to the base LWindow class so it can do this:
void LWindow::Move(int x, int y)
{
p_Impl->Move(x, y); //Impl is an LWindow*
}
And, do the same thing for implementation of LListBox and so on
The solution originally given was much different. It boiled down to this:
#define WindowsCommonImpl {//Set of overrides for LWindow methods}
class WinListBox : public LListBox
{
WindowsCommonImpl //The overrides for methods in LWindow will get pasted here.
//LListBox overrides
}
//So on
Now, having read all about macros being evil and good design practices, I immediately was against this scheme. After all, it is code duplication in disguise. But I couldn't convince my colleague of that. And I was surprised that that was the case. So, I pose this question to you. What are the possible problems of the latter method? I'd like practical answers please. I need to convince someone who is very practical (and used to doing this sort of stuff. He mentioned that there's lots of macros in MFC!) that this is bad (and myself). Not teach him aesthetics. Further, is there anything wrong with what I proposed? If so, how do I improve it? Thanks.
EDIT: Please give me some reasons so I can feel good about myself supporting oop :(
Going for bounty. Please ask if you need any clarifications. I want to know arguments for and vs OOP against the macro :)
Your colleague is probably thinking of the MFC message map macros; these are used in important-looking places in every MFC derived class, so I can see where your colleague is coming from. However these are not for implementing interfaces, but rather for details with interacting with the rest of the Windows OS.
Specifically, these macros implement part of Windows' message pump system, where "messages" representing requests for MFC classes to do stuff get directed to the correct handler functions (e.g. mapping the messages to the handlers). If you have access to Visual Studio, you'll see that these macros wrap the message map entries in a somewhat-complicated array of structs (that the calling OS code knows how to read), and provide functions to access this map.
As MFC users, the macro system makes this look clean to us. But this works mostly because underlying Windows API is well-specified and won't change much, and most of the macro code is generated by the IDE to avoid typos. If you need to implement something that involves messy declarations then macros might make sense, but so far this doesn't seem to be the case.
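For reference, an MFC message map looks roughly like this (a typical wizard-generated fragment; CMyView and its handlers are placeholders, shown only to illustrate what those macros wrap):
BEGIN_MESSAGE_MAP(CMyView, CView)
    ON_WM_PAINT()
    ON_COMMAND(ID_FILE_OPEN, OnFileOpen)
END_MESSAGE_MAP()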
Practical concerns that your colleague may be interested in:
duplicated macro calls. Looks like you're going to need to copy the line "WindowsCommonImpl" into each class declaration - assuming the macro expands to some inline functions. If they're only declarations and the implementations go in a separate macro, you'll need to do this in every .cpp file too - and change the class name passed into the macro every time.
longer recompile time. For your solution, if you change something in the LWindow implementation, you probably only need to recompile LWindow.cpp. If you change something in the macro, everything that includes the macro header file needs to be recompiled, which is probably your whole project.
harder to debug. If the error has to do with the logic within the macro, the debugger will probably break to the caller, where you don't see the error right away. You may not even think to check the macro definition because you thought you knew exactly what it did.
So basically your LWindow solution is a better solution, to minimize headaches down the road.
Doesn't answer your question directly maybe, but I can't help telling you to read up on the Bridge design pattern in GoF. It's meant exactly for that.
Decouple an abstraction from its implementation so that the two can vary independently.
From what I can understand, you are already on the right path, other than the MACRO stuff.
My reaction was to pass an LWindow* (implementation object) to the base LWindow class so it can do this:
LListBox and LComboBox should receive an instance of WindowsCommonImpl.
In the first solution, inheritance is used so that LListBox and LComboBox can use some common methods. However, inheritance is not meant for this.
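A rough Bridge-style sketch of that idea (the names here are assumptions; only the shape of the design matters): the abstraction holds a pointer to a platform implementation, and both sides can grow independently:
class LWindowImpl {                        // implementor: one per platform
public:
    virtual ~LWindowImpl() {}
    virtual void Move(int x, int y) = 0;
};

class Win32WindowImpl : public LWindowImpl {
public:
    virtual void Move(int x, int y) { /* call the Win32 API */ }
};

class LWindow {                            // abstraction the rest of the code uses
public:
    explicit LWindow(LWindowImpl* impl) : p_Impl(impl) {}
    void Move(int x, int y) { p_Impl->Move(x, y); }
private:
    LWindowImpl* p_Impl;
};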
I would agree with you. Solution with WindowsCommonImpl macro is really bad. It is error-prone, hard to extend and very hard to debug. MFC is a good example of how you should not design your windows library. If it looks like MFC, you are really on a wrong way.
So, your solution obviously better than macro-based one. Anyway, I wouldn't agree it is good enough. The most significant drawback to me is that you mix interface and implementation. Most practical value of separating interface and implementation is ability to easily write mock objects for testing purposes.
Anyway, it seems the problem you are trying to solve is how to combine interface inheritance with implementation inheritance in C++. I would suggest using a template class for the window implementation.
// Window interface
class LWindow
{
};
// ListBox interface (inherits Window interface)
class LListBox : public LWindow
{
};
// Window implementation template
template<class Interface>
class WindowImpl : public Interface
{
};
// Window implementation
typedef WindowImpl<LWindow> Window;
// ListBox implementation
// (inherits both Window implementation and Window interface)
class ListBox : public WindowImpl<LListBox>
{
};
As I remember, the WTL windows library is based on a similar pattern of combining interfaces and implementations. I hope it helps.
Oh man this is confusing.
OK, so L*** is a hierarchy of interfaces; that's fine. Now what are you using the p_Impl for? If you have an interface, why would you include implementation in it?
The macro stuff is of course ugly, plus it's usually impossible to do. The whole point is that you will have different implementations, if you don't, then why create several classes in the first place?
OP seems confused. Here's what to do; it is very complex but it works.
Rule 1: Design the abstractions. If you have an "is-A" relation you must use public virtual inheritance.
struct Window { .. };
struct ListBox : virtual Window { .. };
Rule 2: Make implementations, if you're implementing an abstraction you must use virtual inheritance. You are free to use inheritance to save on duplication.
class WindowImpl : virtual Window { .. };
class BasicListBoxImpl : virtual ListBox, public WindowImpl { .. };
class FancyListBoxImpl : public BasicListBoxImpl { };
Therefore you should read "virtual" to mean "is-a", and other inheritance is just saving on rewriting methods.
Rule 3: Try to make sure there is only one useful function in a concrete type: the constructor. This is sometimes hard; you may need some default and some set methods to fiddle things. Once the object is set up, cast away the implementation. Ideally you'd do this on construction:
ListBox *p = new FancyListBoxImpl (.....);
Notes: you are not going to call any abstract methods directly on or in an implementation, so private inheritance of an abstract base is just fine. Your task is exclusively to define these methods, not to use them: that's for the clients of the abstractions only. Implementations of virtual methods from the bases might also just as well be private, for the same reason. Inheritance for reuse will probably be public, since you might want to use those methods in the derived class or from outside of it after construction, to configure your object before casting away the implementation details.
Rule 4: There is a standard implementation for many abstractions, known as delegation, which is the one you were talking about:
struct Abstract { virtual void method()=0; };
struct AbstractImpl_Delegate: virtual Abstract {
Abstract *p;
AbstractImpl_Delegate (Abstract *q) : p(q) {}
void method () { p->method(); }
};
This is a cute implementation since it doesn't require you to know anything about the abstraction or how to implement it... :)
I found that using the preprocessor #define directive to define constants is not as precise. [src]
Macros are apparently not as precise; I did not even know that...
The classic hidden dangers of the preprocessor, like:
#define PI_PLUS_ONE (3.14 + 1)
By doing so, you avoid the possibility that an order of operations issue will destroy the meaning of your constant:
x = PI_PLUS_ONE * 5;
Without parentheses, the above would be converted to
x = 3.14 + 1 * 5;
[src]

How to design a C++ API for binary compatible extensibility

I am designing an API for a C++ library which will be distributed in a dll / shared object. The library contains polymorphic classes with virtual functions. I am concerned that if I expose these virtual functions on the DLL API, I cut myself off from the possibility of extending the same classes with more virtual functions without breaking binary compatibility with applications built for the previous version of the library.
One option would be to use the PImpl idiom to hide all the classes having virtual functions, but that also seems to have its limitations: this way applications lose the possibility of subclassing the classes of the library and overriding the virtual methods.
How would you design an API class which can be subclassed in an application, without losing the possibility to extend the API with (non-abstract) virtual methods in a new version of the dll while staying backward binary compatible?
Update: the target platforms for the library are windows/msvc and linux/gcc.
Several months ago I wrote an article called "Binary Compatibility of Shared Libraries Implemented in C++ on GNU/Linux Systems" [pdf]. While concepts are similar on Windows systems, I'm sure they're not exactly the same. But having read the article you can get a notion of what's going on at the C++ binary level that has anything to do with compatibility.
By the way, GCC application binary interface is summarized in a standard document draft "Itanium ABI", so you'll have a formal ground for a coding standard you choose.
Just for a quick example: in GCC you can extend a class with more virtual functions, if no other class inherits it. Read the article for better set of rules.
But anyway, rules are sometimes way too complex to understand. So you might be interested in a tool that verifies compatibility of two given versions: abi-compliance-checker for Linux.
There is an interesting article on the KDE knowledge base that describes the do's and don'ts when aiming at binary compatibility when writing a library: Policies/Binary Compatibility Issues With C++
C++ binary compat is generally difficult, even without inheritance. Look at GCC for example. In the last 10 years, I'm not sure how many breaking ABI changes they've had. Then MSVC has a different set of conventions, so linking to that with GCC and vice versa can't be done... If you compare this to the C world, compiler inter-op seems a bit better there.
If you're on Windows you should look at COM. As you introduce new functionality you can add interfaces. Then callers can QueryInterface() for the new one to expose that new functionality, and even if you end up changing things a lot, you can either leave the old implementation there or you can write shims for the old interfaces.
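A rough sketch of that versioning idea (the interface names and IID_IWidget2 are invented for illustration, not real COM interfaces): old clients keep using IWidget, new clients QueryInterface() for IWidget2:
// Version 1 of the published interface -- never changes once shipped.
struct IWidget : public IUnknown
{
    virtual HRESULT STDMETHODCALLTYPE Draw() = 0;
};

// Version 2 adds functionality in a *new* interface instead of editing the old one.
struct IWidget2 : public IWidget
{
    virtual HRESULT STDMETHODCALLTYPE DrawScaled(double factor) = 0;
};

// Client code:
//   IWidget2 *w2 = 0;
//   if (SUCCEEDED(widget->QueryInterface(IID_IWidget2, (void**)&w2))) {
//       w2->DrawScaled(2.0);   // new functionality
//       w2->Release();
//   } else {
//       widget->Draw();        // fall back to the old interface
//   }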
I think you misunderstand the problem of subclassing.
Here is your Pimpl:
// .h
class Derived
{
public:
virtual void test1();
virtual void test2();
private:
Impl* m_impl;
};
// .cpp
struct Impl: public Base
{
virtual void test1(); // override Base::test1()
virtual void test2(); // override Base::test2()
// data members
};
void Derived::test1() { m_impl->test1(); }
void Derived::test2() { m_impl->test2(); }
See? No problem with overriding the virtual methods of Base; you just need to make sure to redeclare them virtual in Derived so that those deriving from Derived know they may override them too (only if you wish so, which by the way is a great way of providing a "final" for those who lack it), and you may still redefine them for yourself in Impl, which may even call the Base version.
There is no problem with Pimpl there.
On the other hand, you lose polymorphism, which may be troublesome. It's up to you to decide whether you want polymorphism or just composition.
If you expose the PImpl class in a header file, then you can inherit from it. You can still maintain backward compatibility since the external class contains a pointer to the PImpl object. Of course, if the client code of the library isn't very wise, it could misuse this exposed PImpl object and ruin binary backward compatibility. You may add some notes to warn the user in the PImpl's header file.
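A minimal sketch of that arrangement (all names hypothetical), with the warning note placed in the impl header as suggested:
// widget_impl.h -- shipped with the library, but NOT part of the stable ABI.
// WARNING: this class may change between library versions; subclass at your own risk.
class WidgetImpl
{
public:
    virtual ~WidgetImpl() {}
    virtual void Draw() {}
    // data members and new virtuals may be added in future versions
};

// widget.h -- the stable public class.
class Widget
{
public:
    explicit Widget(WidgetImpl* impl) : m_impl(impl) {}  // clients may pass in their own WidgetImpl subclass
    void Draw() { m_impl->Draw(); }
private:
    WidgetImpl* m_impl;
};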