Dependency inversion (from S.O.L.I.D principles) in C++

After reading and watching a lot about the SOLID principles, I was very keen to apply them in my work (mostly C++ development), since I do think they are good principles that will bring much benefit to the quality, readability, testability, reuse and maintainability of my code.
But I have a really hard time with the 'D' (Dependency inversion).
This principle states that:
A. High-level modules should not depend on low-level modules. Both should depend on abstractions.
B. Abstractions should not depend on details. Details should depend on abstractions.
Let me explain by example:
Lets say I am writing the following interface:
#include <string>

class SOLIDInterface {
    // usual stuff with constructor, destructor, don't copy etc.
public:
    virtual void setSomeString(const std::string &someString) = 0;
};
(For the sake of simplicity, please ignore the other things needed for a "correct interface", such as non-virtual publics, private virtuals etc.; that's not part of the problem.)
Notice that setSomeString() takes an std::string.
But that breaks the above principle, since std::string is an implementation.
Java and C# don't have that problem, since those languages offer interfaces to all the complex common types, such as strings and containers.
C++ does not offer that.
Now, C++ does offer the possibility to write this interface in such a way: I could write an 'IString' interface that would accept any implementation supporting an std::string-like interface, using type erasure.
(Very good article: http://www.artima.com/cppsource/type_erasure.html)
So the implementation could use STL (std::string) or Qt (QString), or my own string implementation or something else.
Like it should be.
But this means that if I (and not only I, but all C++ developers) want to write a C++ API which obeys the SOLID design principles ('D' included), I will have to implement a LOT of code to accommodate all the common non-primitive types.
Beyond being unrealistic in terms of effort, this solution has other problems, such as: what if the STL changes? (for this example)
And it's not really a solution, since the STL does not implement IString; rather, IString abstracts the STL. So even if I were to create such an interface, the principal problem remains.
(I am not even getting into issues such as the polymorphic overhead this adds, which for some systems, depending on size and hardware requirements, may not be acceptable.)
So my question is:
Am I missing something here (which I guess is the true answer - but what?)? Is there a way to use dependency inversion in C++ without writing a whole new interface layer for the common types, in a realistic way - or are we doomed to write APIs that always depend on some implementation?
Thanks for your time!
From the first few comments I received so far I think a clarification is needed:
The selection of std::string was just an example.
It could be QString for that matter - I just took the STL since it is the standard.
It's not even important that it's a string type; it could be any common type.
I have selected the answer by Corristo, not because he explicitly answered my question, but because that extensive post (coupled with the other answers) allowed me to extract my answer from it implicitly, realizing that the discussion tends to drift from the actual question, which is:
Can you implement dependency inversion in C++, when you use basic complex types like strings and containers (and basically any of the STL), with an effort that makes sense? (And the last part is a very important element of the question.)
Maybe I should have explicitly noted that I am after run-time polymorphism, not compile-time.
The clear answer is NO, it's not possible.
It might have been possible if the STL exposed abstract interfaces to its implementations. Even if there are indeed reasons that prevent the STL implementations from deriving from such interfaces (say, performance), the STL could still have simply maintained abstract interfaces that match the implementations.
For types that I have full control over, yes, there is no technical problem implementing the DIP.
But most likely any such interface (of my own) will still use a string or a container, forcing it to use either the STL implementation or another.
All the suggested solutions below are either not polymorphic at run time, and/or force quite a lot of coding around the interface - when you consider that you would have to do this for all these common types, the practicality is simply not there.
If you think you know better, and you say it is possible to have what I described above then simply post the code proving it.
I dare you! :-)

Note that C++ is not an object-oriented programming language, but rather lets the programmer choose between many different paradigms. One of the key principles of C++ is that of zero-cost abstractions, which in particular entails building abstractions in such a way that users don't pay for what they don't use.
The C#/Java style of defining interfaces with virtual methods that are then implemented by derived classes doesn't fall into that category, though: even if you don't need the polymorphic behavior, were std::string implementing a virtual interface, every call to one of its methods would incur a vtable lookup. This is unacceptable for classes in the C++ standard library that are supposed to be used in all kinds of settings.
Defining interfaces without inheriting from an abstract interface class
Another problem with the C#/Java approach is that in most cases you don't actually care that something inherits from a particular abstract interface class; you only need the type you pass to a function to support the operations you use. Restricting accepted parameters to those inheriting from a particular interface class thus actually hinders reuse of existing components, and you often end up writing wrappers to make the classes of one library conform to the interfaces of another - even when they already have exactly the same member functions.
Together with the fact that inheritance-based polymorphism typically also entails heap allocations and reference semantics with all its problems regarding lifetime management, it is best to avoid inheriting from an abstract interface class in C++.
Generic templates for implicit interfaces
In C++ you can get compile-time polymorphism through templates.
In its simplest form, the interface that an object used in a templated function or class needs to conform to is not actually specified in C++ code, but implied by what functions are called on it.
This is the approach used in the STL, and it is really flexible. Take std::vector, for example: the requirements on the value type T of the objects you store in it depend on what operations you perform on the vector. This allows you, e.g., to store move-only types, as long as you don't use any of the operations that need to make a copy. In such a case, defining a fixed interface that the value type needs to conform to would greatly reduce the usefulness of std::vector, because you'd either need to remove the methods that require copies or exclude move-only types from being stored in it.
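To illustrate, here is a minimal sketch of that point (the choice of std::unique_ptr as the element type is mine, not from the original post):

#include <memory>
#include <vector>

int main() {
    std::vector<std::unique_ptr<int>> v;   // std::unique_ptr is move-only
    v.push_back(std::make_unique<int>(1)); // fine: the element is moved in
    // v.push_back(v.front());             // would not compile: needs a copy
}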
That doesn't mean you can't use dependency inversion, though: The common Button-Lamp example for dependency inversion implemented with templates would look like this:
#include <iostream>

class Lamp {
public:
    void activate()   { std::cout << "Lamp on\n"; }
    void deactivate() { std::cout << "Lamp off\n"; }
};

template <typename T>
class Button {
public:
    explicit Button(T& switchable)
        : _switchable(&switchable) {}

    void toggle() {
        if (_buttonIsInOnPosition) {
            _switchable->deactivate();
            _buttonIsInOnPosition = false;
        } else {
            _switchable->activate();
            _buttonIsInOnPosition = true;
        }
    }

private:
    bool _buttonIsInOnPosition{false};
    T* _switchable;
};

int main() {
    Lamp l;
    Button<Lamp> b(l);
    b.toggle();
}
Here Button<T>::toggle implicitly relies on a Switchable interface, requiring T to have member functions T::activate and T::deactivate. Since Lamp happens to implement that interface, it can be used with the Button class. Of course, in real code you would also state these requirements on T in the documentation of the Button class, so that users don't need to look up the implementation.
Similarly, you could also declare your setSomeString method as
template <typename String>
void setSomeString(String const& string);
and then this will work with all types that implement all the methods you use in the implementation of setSomeString, hence relying only on an abstract - although implicit - interface.
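For instance, a minimal sketch of what that might look like (the Widget class and its concrete string member are hypothetical, not from the original post):

#include <string>

class Widget {
public:
    // String only needs to provide begin()/end() yielding chars;
    // it does not need to inherit from anything.
    template <typename String>
    void setSomeString(String const& s) {
        _data.assign(s.begin(), s.end());
    }
private:
    std::string _data; // the concrete storage stays an implementation detail
};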
As always, there are some downsides to consider:
In the string example, assuming you only make use of .begin() and .end() member functions returning iterators that yield a char when dereferenced (e.g. to copy the characters into the class's local, concrete string data member), you could also accidentally pass a std::vector<char> to it, even though it isn't technically a string. Whether you consider this a problem is arguable; in a way, it can also be seen as the epitome of relying only on abstractions.
If you pass an object of a type that doesn't have the required (member) functions, then you can end up with horrible compiler error messages that make it very hard to find the source of the error.
Only in very limited cases is it possible to separate the interface of a templated class or function from its implementation, as is typically done with separate .h and .cpp files. This can thus lead to longer compile times.
Defining interfaces with the Concepts TS
If you really care about the types used in templated functions and classes conforming to a fixed interface, regardless of what you actually use, there are ways to restrict the template parameters to types conforming to a certain interface with std::enable_if, but these are very verbose and unreadable (see the sketch below). In order to make this kind of generic programming easier, the Concepts TS allows you to actually define interfaces that are checked by the compiler, which greatly improves diagnostics. With the Concepts TS, the Button-Lamp example from above translates to
template <typename T>
concept bool Switchable = requires(T t) {
t.activate();
t.deactivate();
};
// Lamp as before
template <Switchable T>
class Button {
public:
Button(T&); // implementation as before
void toggle(); // implementation as before
private:
T* _switchable;
bool _buttonIsInOnPosition{false};
};
If you can't use the Concepts TS (it is only implemented in GCC right now), the closest you can get is the Boost.ConceptCheck library.
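For comparison, here is a sketch of what the std::enable_if-based restriction might look like, using the C++17 detection idiom (the trait name is_switchable is made up for this example):

#include <type_traits>
#include <utility>

// Detection trait: true if T has activate() and deactivate().
template <typename T, typename = void>
struct is_switchable : std::false_type {};

template <typename T>
struct is_switchable<T, std::void_t<decltype(std::declval<T&>().activate()),
                                    decltype(std::declval<T&>().deactivate())>>
    : std::true_type {};

// Restrict Button's template parameter via enable_if:
template <typename T,
          typename = typename std::enable_if<is_switchable<T>::value>::type>
class Button { /* implementation as before */ };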
Type erasure for runtime polymorphism
There is one case where compile-time polymorphism doesn't suffice, and that is when the types you pass to or get from a particular function aren't fully determined at compile-time but depend on runtime parameters (e.g. from a config file, command-line arguments passed to the executable or even the value of a parameter passed to the function itself).
If you need to store objects (even in a variable) of a type dependent on runtime parameters, the traditional approach is to store pointers to a common base class instead and to use dynamic dispatch via virtual member functions to get the behavior you need. But this still suffers from the problem described before: You can't use types that effectively do what you need but were defined in an external library, and thus don't inherit from the base class you defined. So you have to write a wrapper class.
Or you do what you described in your question and create a type-erasure class.
An example from the standard library is std::function. You declare only the interface of the function and it can store arbitrary function pointers and callables that have that interface. In general, writing a type erasure class can be quite tedious, so I refrain from giving an example of a type-erasing Switchable here, but I can highly recommend Sean Parent's talk Inheritance is the base class of evil, where he demonstrates the technique for "Drawable" objects and explores what you can build on top of it in just over 20 minutes.
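Still, to make the idea concrete, here is a minimal sketch of what a type-erased Switchable might look like in the style of that talk (all names are illustrative, and copy support is omitted for brevity):

#include <memory>
#include <utility>

class Switchable {
public:
    // Stores any object that has activate()/deactivate(), without
    // requiring it to inherit from anything.
    template <typename T>
    Switchable(T x) : _self(std::make_unique<Model<T>>(std::move(x))) {}

    void activate()   { _self->activate(); }
    void deactivate() { _self->deactivate(); }

private:
    struct Concept {
        virtual ~Concept() = default;
        virtual void activate() = 0;
        virtual void deactivate() = 0;
    };
    template <typename T>
    struct Model final : Concept {
        explicit Model(T x) : data(std::move(x)) {}
        void activate() override   { data.activate(); }
        void deactivate() override { data.deactivate(); }
        T data;
    };
    std::unique_ptr<Concept> _self;
};

// Usage: Switchable s = Lamp{}; s.activate();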
There are libraries that help with writing type-erasure classes, though, e.g. Louis Dionne's experimental dyno, where you define the interface via what he calls "concept maps" directly in C++ code, or Zach Laine's emtypen, which uses a Python tool to create the type-erasure classes from a C++ header file you provide. The latter also comes with a CppCon talk describing the features as well as the general idea and how to use it.
Conclusion
Inheriting from a common base class just to define interfaces, while easy, leads to many problems that can be avoided using different approaches:
(Constrained) templates allow for compile-time polymorphism, which is sufficient for the majority of cases, but can lead to hard-to-understand compiler errors when used with types that don't conform to the interface.
If you need runtime polymorphism (which actually is rather rare in my experience), you can use type-erasure classes.
So even though the classes in the STL and other C++ libraries rarely derive from an abstract interface, you can still apply dependency inversion with one of the two methods described above if you really want to.
But as always, use good judgment on a case-by-case basis as to whether you really need the abstraction or whether it is better to simply use a concrete type. The string example you brought up is one where I'd go with concrete types, simply because the different string classes don't share a common interface (e.g. std::string has .find(), but QString's version of the same function is called .contains()). It might be just as much effort to write wrapper classes for both as it is to write a conversion function and use it at well-defined boundaries within the project.
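A sketch of such boundary conversion functions, assuming Qt (QString::fromStdString and QString::toStdString are real Qt APIs; the function names toQt/fromQt are my own):

#include <QString>
#include <string>

// Conversions used only at well-defined module boundaries:
inline QString toQt(const std::string& s) { return QString::fromStdString(s); }
inline std::string fromQt(const QString& s) { return s.toStdString(); }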

Ahh, but C++ lets you write code that is independent of a particular implementation without actually using inheritance.
std::string itself is a good example... it's actually a typedef for std::basic_string<char, std::char_traits<char>, std::allocator<char>>. This allows you to create strings using other allocators if you choose (or to mock the allocator object in order to measure the number of calls, if you like). There just isn't an explicit interface like an IAllocator, because C++ templates use duck typing.
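For example, a sketch of a call-counting allocator plugged into basic_string (CountingAllocator is a hypothetical name; the inline static member requires C++17):

#include <cstddef>
#include <memory>
#include <string>

template <class T>
struct CountingAllocator {
    using value_type = T;
    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        ++allocations;                          // count every allocation
        return std::allocator<T>{}.allocate(n); // delegate the real work
    }
    void deallocate(T* p, std::size_t n) {
        std::allocator<T>{}.deallocate(p, n);
    }
    static inline std::size_t allocations = 0;
};
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

// Same interface as std::string, different allocation policy:
using CountedString =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;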
A future version of C++ will support explicit description of the interface a template parameter must adhere to -- this feature is called concepts -- but just using duck-typing enables decoupling without requiring redundant interface definitions.
And because C++ performs optimization after instantiation of templates, there's no polymorphic overhead.
Now, when you do have virtual functions, you'll need to commit to a particular type, because the virtual-table layout doesn't accommodate templates, each of which generates an arbitrary number of instantiations that would each require separate dispatch. But when using templates you won't need virtual functions nearly as much as, e.g., Java does, so in practice this isn't a big problem.

Related

C++ stdlib container class hierarchy

I've been wondering: is there any reason for the design decision in C++ not to have a pure abstract class for any of the standard library containers?
I appreciate that hash_map came later from the stdext namespace but shares a very similar interface. In the event that I later decide I would like to implement my own map for a particular piece of software, I would have preferred to have some kind of interface to work with.
Example
std::base_map *foo = new std::map<std::string, std::string>;
delete foo;
foo = new stdext::hash_map<std::string, std::string>;
Obviously the above example is not possible, as far as I am aware, however this is similar for list and other std lib containers.
I appreciate that this is not C# or Java, but there are obviously no constraints in C++ to stop this design, so why was it designed like this, with no coupling between similar containers?
Because virtual functions add overhead.
Because the containers don't all have the same interface: there are common functions, but also important differences regarding iterator invalidation and memory allocation (and so exception behaviour) which you need to understand. If you were using an abstract base, you wouldn't know the specifics of how the concrete container would behave.
If you want to write code that is agnostic about the type of container it is passed, then in C++ you write a template instead of relying on abstract interfaces, i.e. you use static polymorphism, not dynamic polymorphism. That avoids the overhead of dynamic dispatch and also allows specialization based on the concrete type, because the concrete type is known at compile time.
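A minimal sketch of that style (printAll is a hypothetical function name):

#include <iostream>
#include <map>
#include <string>
#include <unordered_map>

// Works with any map-like container supporting range-based iteration:
template <typename Map>
void printAll(const Map& m) {
    for (const auto& entry : m)
        std::cout << entry.first << " -> " << entry.second << '\n';
}

int main() {
    std::map<std::string, std::string> a{{"k1", "v1"}};
    std::unordered_map<std::string, std::string> b{{"k2", "v2"}};
    printAll(a); // same template, two unrelated container types
    printAll(b);
}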
Finally, it wouldn't have any advantage IMHO. It's better the way it is. As you say, this isn't C# or Java, thankfully.
(P.S. the stdext namespace is not part of C++, it appears to be Microsoft's namespace for non-standard types, a better example would use std::tr1::unordered_map or std::unordered_map instead of stdext::hash_map)

Inheritance & virtual functions Vs Generic Programming

I need to understand whether inheritance and virtual functions are really unnecessary in C++, and whether one can achieve everything using generic programming. This came from Alexander Stepanov, and the lecture I was watching is Alexander Stepanov: STL and Its Design Principles.
I always like to think of templates and inheritance as two orthogonal concepts, in the very literal sense: To me, inheritance goes "vertically", starting with a base class at the top and going "down" to more and more derived classes. Every (publicly) derived class is a base class in terms of its interface: A poodle is a dog is an animal.
On the other hand, templates go "horizontal": Each instance of a template has the same formal code content, but two distinct instances are entirely separate, unrelated pieces that run in "parallel" and don't see each other. Sorting an array of integers is formally the same as sorting an array of floats, but an array of integers is not at all related to an array of floats.
Since these two concepts are entirely orthogonal, their application is, too. Sure, you can contrive situations in which you could replace one by another, but when done idiomatically, both template (generic) programming and inheritance (polymorphic) programming are independent techniques that both have their place.
Inheritance is about making an abstract concept more and more concrete by adding details. Generic programming is essentially code generation.
As my favourite example, let me mention how the two technologies come together beautifully in a popular implementation of type erasure: a single handler class holds a private polymorphic pointer-to-base of an abstract container class, and the concrete, derived container class is determined by a templated, type-deducing constructor. We use template code generation to create an arbitrary family of derived classes:
// internal helper base
class TEBase { /* ... */ };

// internal helper derived TEMPLATE class (unbounded family!)
template <typename T> class TEImpl : public TEBase { /* ... */ };

// single public interface class
class TE
{
    TEBase* impl;
public:
    // "infinitely many" constructors:
    template <typename T> TE(const T& x) : impl(new TEImpl<T>(x)) { }
    // ...
};
They serve different purposes. Generic programming (at least in C++) is about compile-time polymorphism, and virtual functions are about run-time polymorphism.
If the choice of the concrete type depends on the user's input, you really need runtime polymorphism - templates won't help you.
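A minimal sketch of that situation (the Handler hierarchy and the makeHandler factory are hypothetical):

#include <iostream>
#include <memory>
#include <string>

struct Handler {
    virtual ~Handler() = default;
    virtual void handle() = 0;
};
struct FileHandler : Handler { void handle() override { std::cout << "file\n"; } };
struct NetHandler  : Handler { void handle() override { std::cout << "net\n"; } };

// The concrete type is chosen from a runtime value, so a template
// parameter cannot express it:
std::unique_ptr<Handler> makeHandler(const std::string& kind) {
    if (kind == "file") return std::make_unique<FileHandler>();
    return std::make_unique<NetHandler>();
}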
Polymorphism (i.e. dynamic binding) is crucial for decisions that are based on runtime data. Generic data structures are great but they are limited.
Example: consider an event handler for a discrete event simulator: it is very cheap (in terms of programming effort) to implement this with a pure virtual function, but it is verbose and quite inflexible if done purely with templated classes.
As a rule of thumb: if you find yourself switching (or if-else-ing) on the value of some input object and performing different actions depending on its value, there might exist a better (in the sense of maintainability) solution with dynamic binding.
Some time ago I thought about a similar question and I can only dream about giving you such a great answer I received. Perhaps this is helpful: interface paradigm performance (dynamic binding vs. generic programming)
It seems like a very academic question. As with most things in life, there are lots of ways to do things, and in the case of C++ you have a number of ways to solve things. There is no need to have an XOR attitude about it.
In the ideal world, you would use templates for static polymorphism to give you the best possible performance in instances where the type is not determined by user input.
The reality is that templates force most of your code into headers and this has the consequence of exploding your compile times.
I have done some heavy generic programming leveraging static polymorphism to implement a generic RPC library (https://github.com/bytemaster/mace (rpc_static_poly branch)). In this instance the protocol (JSON-RPC), the transport (TCP/UDP/Stream/etc), and the types are all known at compile time, so there is no reason to do a vtable dispatch... or is there?
When I run the code through the pre-processor, a single .cpp results in 250,000 lines and takes 30+ seconds to compile into a single object file. I implemented 'identical' functionality in Java and C#, and it compiles in about a second.
Almost every STL or Boost header you include adds thousands or tens of thousands of lines of code that must be processed per object file, most of it redundant.
Do compile times matter? In most cases they have a more significant impact on the final product than 'maximally optimized vtable elimination'. The reason is that every 'bug' requires a 'try fix, compile, test' cycle, and if each cycle takes 30+ seconds, development slows to a crawl (note the motivation for Google's Go language).
After spending a few days with java and C# I decided that I needed to 're-think' my approach to C++. There is no reason a C++ program should compile much slower than the underlying C that would implement the same function.
I now opt for runtime polymorphism unless profiling shows that the bottleneck is in vtable dispatches. I now use templates to provide 'just-in-time' polymorphism and a type-safe interface on top of the underlying object, which deals with 'void*' or an abstract base class. In this way users need not derive from my 'interfaces' and still have the 'feel' of generic programming, yet they get the benefit of fast compile times. If performance becomes an issue, the generic code can be replaced with static polymorphism.
The results are dramatic, compile times have fallen from 30+ seconds to about a second. The post-preprocessor source code is now a couple thousand lines instead of 250,000 lines.
On the other side of the discussion, I was developing a library of 'drivers' for a set of similar but slightly different embedded devices. In this instance the embedded devices had little room for 'extra code' and no use for 'vtable' dispatch. With C, our only options were 'separate object files' or runtime 'polymorphism' via function pointers. Using generic programming and static polymorphism we were able to create maintainable software that ran faster than anything we could produce in C.

How do Concepts differ from Interfaces?

How do Concepts (i.e. those recently dropped from the C++0x standard) differ from interfaces in languages such as Java?
Concepts are for compile-time polymorphism; that means parametric generic code. Interfaces are for run-time polymorphism.
You have to implement an interface just as you implement a concept. The difference is that you don't have to explicitly say that you are implementing a concept: if the required interface is matched, there is no problem. In the case of interfaces, even if you implement all the required functions, you still have to explicitly say that you are implementing the interface!
I will try to clarify my answer :)
Imagine that you are designing a container that accepts any type that has a size member function. We formalize the concept and call it HasSize; of course we should define it elsewhere, but this is just an example.
template <class HasSize>
class Container
{
public:
    void add(const HasSize& element); // hypothetical; used in the example below
private:
    HasSize elements[10]; // just an example, don't take it seriously :)
                          // elements MUST have a size member function!
};
Then imagine we are creating an instance of our Container and calling it myShapes. Shape is a base class that defines the size member function; Square and Circle are just children of it. If Shape didn't define size, an error would be produced.
Container<Shape> myShapes;
if(/* some condition*/)
myShapes.add(Square());
else
myShapes.add(Circle());
I hope you see that Shape can be checked against HasSize at compile time; there is no reason to do the checking at run time. That is unlike the concrete types of the elements of myShapes themselves: consider a function that manipulates them:
void doSomething(Shape* shape)
{
    if (auto* circle = dynamic_cast<Circle*>(shape))
    {
        // do something with the circle
    }
    else if (auto* square = dynamic_cast<Square*>(shape))
    {
        // do something with the square
    }
}
In this function, you can't know until run time whether a Circle or a Square will be passed!
They are two tools for a similar job, though interfaces (or whatever you call them) can do almost the same job as concepts at run time - but then you lose all the benefits of compile-time checking and optimization!
Concepts are like types (classes) for templates: they are for the generic programming side of the language only.
In that way, they are not meant to replace interface classes (assuming you mean abstract classes or other C++ equivalents of C# or Java interfaces), as they are only meant to check that the types used as template parameters match specific requirements.
The type check is done only at compile time, like all template code generation, whereas interface classes have an impact on runtime execution.
Concepts are implicit interfaces. In C# or Java a class must explicitly implement an interface, whereas in C++ a class is part of a concept merely as long as it meets the concept's constraints.
The reason you will see concepts in C++ and not in Java or C# is because C++ doesn't really have "interfaces". Instead, you can simulate an interface by using multiple inheritance and abstract, memberless base classes. These are somewhat of a hack and can be a headache to work with (e.g. virtual inheritance and The Diamond Problem). Interfaces play a critical role in OOP and polymorphism, and that role has not been adequately fulfilled in C++ so far. Concepts are the answer to this problem.
It's more or less a difference in point of view. While an interface (as in C#) is specified similarly to a base class, a concept can also be matched automatically (similar to duck typing in Python). It is still unclear to what level C++ is going to support automatic concept matching, which is one of the reasons why concepts were dropped.
To keep it simple, as per my understanding:
A concept is a constraint on the template parameter of a type (i.e., class or struct) or a method.
An interface is a contract that a type (i.e., class or struct) has to implement.

Achieving Interface functionality in C++

A big reason why I use OOP is to create code that is easily reusable. For that purpose, Java-style interfaces are perfect. However, when dealing with C++ I really can't achieve any sort of interface-like functionality... at least not with ease.
I know about pure virtual base classes, but what really ticks me off is that they force me into really awkward code with pointers. E.g. map<int, Node*> nodes; (where Node is the virtual base class).
This is sometimes OK, but sometimes pointers to base classes are just not a possible solution. E.g. if you want to return an object packaged as an interface, you would have to return a base-class-casted pointer to the object... but that object is on the stack and won't be there after the pointer is returned. Of course you could start using the heap extensively to avoid this, but that adds so much more work than there should be (avoiding memory leaks).
Is there any way to achieve interface-like functionality in C++ without having to awkwardly deal with pointers and the heap? (Honestly, for all that trouble and awkwardness I'd rather just stick with C.)
You can use boost::shared_ptr<T> to avoid the raw pointers. As a side note, the reason why you don't see a pointer in the Java syntax has nothing to do with how C++ implements interfaces vs. how Java implements interfaces; rather, it is a result of the fact that all objects in Java are implicit pointers (the * is hidden).
Template metaprogramming is a pretty cool thing. The basic idea? "Compile-time polymorphism and implicit interfaces" (Effective C++). Basically, you can get the interfaces you want via templated classes. A VERY simple example:
// _someStupidObject is assumed to be an object visible in this scope
// that objects of type T can be compared against.
template <class T>
bool foo( const T& _object )
{
    return _object != _someStupidObject && _object > 0;
}
So in the above code, what can we say about the object T? Well, it must be comparable with _someStupidObject, or it must be convertible to a type which is compatible. It must be comparable with an integral value, or again convertible to a type which is. So we have now defined an interface for the class T. The book Effective C++ offers a much better and more detailed explanation. Hopefully the above code gives you some idea of the "interface" capability of templates. Also have a look at pretty much any of the Boost libraries; they are almost all chock-full of templates.
Considering C++ doesn't have generic parameter constraints like C#, if you can get away with it you can use boost::concept_check. Of course, this only works in limited situations, but if you can use it as your solution then you'll certainly have faster code with smaller objects (less vtable overhead).
Dynamic dispatch that uses vtables (for example, pure virtual bases) will make your objects grow in size as they implement more interfaces. Managed languages do not suffer from this problem (this is a .NET link, but Java is similar).
I think the answer to your question is no - there is no easier way. If you want pure interfaces (well, as pure as you can get in C++), you're going to have to put up with all the heap management. (Or try using a garbage collector; there are other questions on that topic, but my opinion is that if you want a garbage collector, use a language designed with one, like Java.)
One big way to ease your heap management pain somewhat is auto pointers. Boost has a nice automatic pointer that does a lot of heap management work for you. The std::auto_ptr works, but it's quite quirky in my opinion.
You might also evaluate whether you really need those pure interfaces or not. Sometimes you do, but sometimes (like some of the code I work with), the pure interfaces are only ever instantiated by one class, and thus just become extra work, with no benefit to the end product.
While auto_ptr has some weird rules of use that you must know*, it exists to make this kind of thing work easily.
auto_ptr<Base> getMeAThing() {
    return auto_ptr<Base>(new Derived()); // auto_ptr's constructor is explicit
}

void something() {
    auto_ptr<Base> myThing = getMeAThing();
    myThing->foo(); // Calls Derived::foo, if virtual
    // The Derived object will be deleted on exit from this function.
}
*Never put auto_ptrs in containers, for one. Understanding what they do on assignment is another.
This is actually one of the cases in which C++ shines. The fact that C++ provides templates and functions that are not bound to a class makes reuse much easier than in pure object-oriented languages. The reality, though, is that you will have to adjust the manner in which you write your code in order to make use of these benefits. People that come from pure OO languages often have difficulty with this, but in C++ an object's interface includes non-member functions. In fact, it is considered good practice in C++ to use non-member functions to implement an object's interface whenever possible. Once you get the hang of using non-member function templates to implement interfaces, well, it is a somewhat life-changing experience.
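A small sketch of that style (the Temperature class and the fahrenheit function are hypothetical examples):

#include <iostream>

class Temperature {
public:
    explicit Temperature(double celsius) : _celsius(celsius) {}
    double celsius() const { return _celsius; }
private:
    double _celsius;
};

// Non-member, non-friend: still part of Temperature's interface,
// but not coupled to the class's internals.
double fahrenheit(const Temperature& t) { return t.celsius() * 9.0 / 5.0 + 32.0; }

int main() {
    std::cout << fahrenheit(Temperature(100.0)) << '\n'; // prints 212
}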

Could C++ have not obviated the pimpl idiom?

As I understand it, the pimpl idiom exists only because C++ forces you to place all the private class members in the header. If the header were to contain only the public interface, then, theoretically, any change in the class's implementation would not have necessitated a recompile of the rest of the program.
What I want to know is why C++ is not designed to allow such a convenience. Why does it demand at all for the private parts of a class to be openly displayed in the header (no pun intended)?
This has to do with the size of the object. The h file is used, among other things, to determine the size of the object. If the private members are not given in it, then you would not know how large an object to new.
You can simulate, however, your desired behavior by the following:
class MyClass
{
public:
    // public stuff
private:
    #include "MyClassPrivate.h"
};
This does not enforce the behavior, but it gets the private stuff out of the .h file.
On the down side, this adds another file to maintain.
Also, in Visual Studio, IntelliSense does not work for the private members - this could be a plus or a minus.
I think there is a confusion here. The problem is not about headers. Headers don't do anything (they are just ways to include common bits of source text among several source-code files).
The problem, as much as there is one, is that class declarations in C++ have to define everything, public and private, that an instance needs in order to work. (The same is true of Java, but the way references to externally compiled classes work makes the use of anything like shared headers unnecessary.)
It is in the nature of common object-oriented technologies (not just C++'s) that someone needs to know the concrete class that is used and how to use its constructor to deliver an implementation, even if you are using only the public parts. The device in (3) below hides it. The practice in (1) below separates the concerns, whether you do (3) or not.
1. Use abstract classes that define only the public parts, mainly methods, and let the implementation class inherit from that abstract class. So, using the usual convention for headers, there is an abstract.hpp that is shared around. There is also an implementation.hpp that declares the inherited class and that is only passed around to the modules that implement methods of the implementation. The implementation.hpp file will #include "abstract.hpp" for use in the class declaration it makes, so that there is a single maintenance point for the declaration of the abstracted interface.
2. Now, if you want to enforce hiding of the implementation class declaration, you need to have some way of requesting construction of a concrete instance without possessing the specific, complete class declaration: you can't use new and you can't use local instances. (You can delete, though.) The introduction of helper functions (including methods on other classes that deliver references to class instances) is the substitute.
3. Along with, or as part of, the header file that is used as the shared definition for the abstract class/interface, include function signatures for external helper functions. These functions should be implemented in modules that are part of the specific class implementations (so they see the full class declaration and can exercise the constructor). The signature of the helper function is probably much like that of the constructor, but it returns an instance reference as a result. (This constructor proxy can return a NULL pointer, and it can even throw exceptions if you like that sort of thing.) The helper function constructs a particular implementation instance and returns it, cast as a reference to an instance of the abstract class.
Mission accomplished.
Oh, and recompilation and relinking should work the way you want, avoiding recompilation of calling modules when only the implementation changes (since the calling module no longer does any storage allocations for the implementations).
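A minimal sketch of the approach just described (all names are hypothetical, and std::unique_ptr stands in for the raw instance reference for brevity):

// --- widget_abstract.hpp (shared with all callers) ---
#include <memory>

class Widget {
public:
    virtual ~Widget() = default;
    virtual void draw() = 0;
};

// Constructor proxy: builds a concrete instance without exposing
// the implementation class declaration to callers.
std::unique_ptr<Widget> makeWidget();

// --- widget_impl.cpp (sees the full implementation) ---
#include <iostream>

class WidgetImpl : public Widget {
public:
    void draw() override { std::cout << "drawing\n"; }
private:
    int _privateDetail = 42; // never visible to calling modules
};

std::unique_ptr<Widget> makeWidget() { return std::make_unique<WidgetImpl>(); }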
You're all ignoring the point of the question:
Why must the developer type out the PIMPL code?
For me, the best answer I can come up with is that we don't have a good way to express C++ code that allows you to operate on it - for instance, compile-time (or pre-processor, or whatever) reflection, or a code DOM.
C++ badly needs one or both of these to be available to a developer to do meta-programming.
Then you could write something like this in your public MyClass.h:
#pragma pimpl(MyClass_private.hpp)
And then write your own, really quite trivial wrapper generator.
Someone will have a much more verbose answer than I, but the quick response is two-fold: the compiler needs to know all the members of a struct to determine the storage space requirements, and the compiler needs to know the ordering of those members to generate offsets in a deterministic way.
The language is already fairly complicated; I think a mechanism to split the definitions of structured data across the code would be a bit of a calamity.
Typically, I've always seen policy classes used to define implementation behavior in a pimpl manner. I think there are some added benefits of using a policy pattern: it is easier to interchange implementations, you can easily combine multiple partial implementations into a single unit, and it allows you to break up the implementation code into functional, reusable units, etc.
Maybe because the size of the class is required when passing its instance by value, aggregating it in other classes, etc.?
If C++ did not support value semantics, it would have been fine, but it does.
Yes, but...
You need to read Stroustrup's "Design and Evolution of C++" book. It would have inhibited the uptake of C++.