I'm currently refactoring a Tcl plugin library written in C++. Originally the code was hand-written. A second library exists that does the same thing for Java.
The refactored library will be a single C++ library that can be used to create bindings to different languages.
My first tests with SWIG are promising. However, a lot of junk is generated as well. Various base classes and utilities are all exported. These don't make sense from the scripting point of view and only increase clutter.
Possible solutions that I can think of are:
Use #ifndef SWIG in the original codebase to filter out unneeded code
Create a SWIG-compatible wrapper around the API classes.
Differentiate between public and private headers. Public headers are pure abstract base classes that contain no implementation. The private headers inherit and implement them. Only SWIG the public headers.
The opposite of the above solution: inherit a SWIG-compatible class for each API class.
I'm currently leaning towards solution 3 at the moment. However, I'm not really certain so I'd like to know the SO community's opinion on this. Feel free to share your thoughts.
Update
I forgot to list one solution:
Code that should not exported by SWIG should probably not be in the public section of your class.
Perhaps this is the answer. I'll have another look on Monday.
Update
I settled with a solution. See my answer.
Any approach that means that the C++ library becomes less useful to the C++ user is not the ideal solution.
#ifdef SWIG in the middle of .hpp files: Muddies up your C++ with unnecessary cruft, so it's not ideal
SWIG Specific Interface: This is a viable option, but only makes sense if the code you want to expose to SWIG is significantly higher level then the base C++ API.
Public vs Private interface: Might make sense, but again you have to ask at what cost to the C++ user of the API? Are you limiting the public interface too much? Who has access to the private interface? Should the pImpl idiom be considered instead?
SWIG Compatible Class for each interface: Probably more work than necessary.
First and foremost, to keep your SWIG related code separate from the rest of the API.
You probably don't want to import the .hpp files directly into SWIG (if SWIG wasn't considered during the initial design of the library), but if you do, you want to use a SWIG .i file to help you clean up the mess. There are three basic approaches we use, each with different use cases.
First, direct inclusion. This is useful if you know your API is nice and clean and well suited for parsing by SWIG:
// MyClass.i
%{
#include "MyClass.hpp" // included for the generated .cxx file
%}
%include "MyClass.hpp" // included and parsed directly by SWIG
The second case is for code that is most of the way there. This is code that had SWIG taken into consideration, but really needed some stuff for the C++ user that we didn't want to expose to SWIG:
// MyClass.i
%{
#include "MyClass.hpp" // included for the generated .cxx file
%}
%ignore MyClass::someFunction(); // This function really just causes us problems
%include "MyClass.hpp" // included and parsed directly by SWIG
The third case, and probably the one you want to use, is to directly choose which functions you want to expose to SWIG.
// MyClass.i
%{
#include "MyClass.hpp" // included for the generated .cxx file
%}
// With this example we provide exactly as much information to SWIG as we want
// it to have. Want it to know the base class? Add it. Don't want it to know about
// a function? Leave it out. want to add a new function? %extend it.
class MyClass
{
void importantFunction();
void importantFunction2();
}
I'd use apprach #3 too. I'm using a similar approach in my projects, and it is used by COM too (interfaces inherithed by private implementation class).
It is really easy to detect errors and maintain code in that way! Unfortunately you will end implementing all functions as virtual, but it should not be a big issue...
Separating the interface will keep it really clean and understandable!
My final solution: simply SWIG the original code base. In order to avoid generation of non-relevant code I use the following techniques. In order of preference:
Make non-swig code private or protected. If it doesn't need to be swigged, then it probably doesn't need to be public.
If possible, change the original code to make it more compatible with SWIG. I replaced a curiously recurring template pattern with abstract base classes. I was willing to make that sacrifice for SWIG :)
Add %ignore statements to the interface file.
Use #ifndef SWIG to filter it out. I don't like to pollute my original code so I only use this as a last resort.
Concerning my previous ideas:
Create a SWIG-compatible wrapper around the API classes.
Differentiate between public and private headers. Public headers are
pure abstract base classes that
contain no implementation. The private
headers inherit and implement them.
Only SWIG the public headers.
The opposite of the above solution: inherit a SWIG-compatible class for
each API class.
All these solutions require writing SWIG-compatible wrapper code. This is a bit silly because you are ditching SWIG's the strongest selling point: automatic generation of wrapper code. If I write my own wrapper code for SWIG then I might just as well write regular JNI code.
That said, I realize that for some projects writing wrapper code may the most cost-efficient solution. However, I this was not the case in my situation.
Related
This question already has answers here:
Is the PIMPL idiom really used in practice?
(12 answers)
Closed 8 years ago.
Backgrounder:
The PIMPL Idiom (Pointer to IMPLementation) is a technique for implementation hiding in which a public class wraps a structure or class that cannot be seen outside the library the public class is part of.
This hides internal implementation details and data from the user of the library.
When implementing this idiom why would you place the public methods on the pimpl class and not the public class since the public classes method implementations would be compiled into the library and the user only has the header file?
To illustrate, this code puts the Purr() implementation on the impl class and wraps it as well.
Why not implement Purr directly on the public class?
// header file:
class Cat {
private:
class CatImpl; // Not defined here
CatImpl *cat_; // Handle
public:
Cat(); // Constructor
~Cat(); // Destructor
// Other operations...
Purr();
};
// CPP file:
#include "cat.h"
class Cat::CatImpl {
Purr();
... // The actual implementation can be anything
};
Cat::Cat() {
cat_ = new CatImpl;
}
Cat::~Cat() {
delete cat_;
}
Cat::Purr(){ cat_->Purr(); }
CatImpl::Purr(){
printf("purrrrrr");
}
I think most people refer to this as the Handle Body idiom. See James Coplien's book Advanced C++ Programming Styles and Idioms. It's also known as the Cheshire Cat because of Lewis Caroll's character that fades away until only the grin remains.
The example code should be distributed across two sets of source files. Then only Cat.h is the file that is shipped with the product.
CatImpl.h is included by Cat.cpp and CatImpl.cpp contains the implementation for CatImpl::Purr(). This won't be visible to the public using your product.
Basically the idea is to hide as much as possible of the implementation from prying eyes.
This is most useful where you have a commercial product that is shipped as a series of libraries that are accessed via an API that the customer's code is compiled against and linked to.
We did this with the rewrite of IONA's Orbix 3.3 product in 2000.
As mentioned by others, using his technique completely decouples the implementation from the interface of the object. Then you won't have to recompile everything that uses Cat if you just want to change the implementation of Purr().
This technique is used in a methodology called design by contract.
Because you want Purr() to be able to use private members of CatImpl. Cat::Purr() would not be allowed such an access without a friend declaration.
Because you then don't mix responsibilities: one class implements, one class forwards.
For what is worth, it separates the implementation from the interface. This is usually not very important in small size projects. But, in large projects and libraries, it can be used to reduce the build times significantly.
Consider that the implementation of Cat may include many headers, may involve template meta-programming which takes time to compile on its own. Why should a user, who just wants to use the Cat have to include all that? Hence, all the necessary files are hidden using the pimpl idiom (hence the forward declaration of CatImpl), and using the interface does not force the user to include them.
I'm developing a library for nonlinear optimization (read "lots of nasty math"), which is implemented in templates, so most of the code is in headers. It takes about five minutes to compile (on a decent multi-core CPU), and just parsing the headers in an otherwise empty .cpp takes about a minute. So anyone using the library has to wait a couple of minutes every time they compile their code, which makes the development quite tedious. However, by hiding the implementation and the headers, one just includes a simple interface file, which compiles instantly.
It does not necessarily have anything to do with protecting the implementation from being copied by other companies - which wouldn't probably happen anyway, unless the inner workings of your algorithm can be guessed from the definitions of the member variables (if so, it is probably not very complicated and not worth protecting in the first place).
If your class uses the PIMPL idiom, you can avoid changing the header file on the public class.
This allows you to add/remove methods to the PIMPL class, without modifying the external class's header file. You can also add/remove #includes to the PIMPL too.
When you change the external class's header file, you have to recompile everything that #includes it (and if any of those are header files, you have to recompile everything that #includes them, and so on).
Typically, the only reference to a PIMPL class in the header for the owner class (Cat in this case) would be a forward declaration, as you have done here, because that can greatly reduce the dependencies.
For example, if your PIMPL class has ComplicatedClass as a member (and not just a pointer or reference to it) then you would need to have ComplicatedClass fully defined before its use. In practice, this means including file "ComplicatedClass.h" (which will also indirectly include anything ComplicatedClass depends on). This can lead to a single header fill pulling in lots and lots of stuff, which is bad for managing your dependencies (and your compile times).
When you use the PIMPL idiom, you only need to #include the stuff used in the public interface of your owner type (which would be Cat here). Which makes things better for people using your library, and means you don't need to worry about people depending on some internal part of your library - either by mistake, or because they want to do something you don't allow, so they #define private public before including your files.
If it's a simple class, there's usually isn't any reason to use a PIMPL, but for times when the types are quite big, it can be a big help (especially in avoiding long build times).
Well, I wouldn't use it. I have a better alternative:
File foo.h
class Foo {
public:
virtual ~Foo() { }
virtual void someMethod() = 0;
// This "replaces" the constructor
static Foo *create();
}
File foo.cpp
namespace {
class FooImpl: virtual public Foo {
public:
void someMethod() {
//....
}
};
}
Foo *Foo::create() {
return new FooImpl;
}
Does this pattern have a name?
As someone who is also a Python and Java programmer, I like this a lot more than the PIMPL idiom.
Placing the call to the impl->Purr inside the .cpp file means that in the future you could do something completely different without having to change the header file.
Maybe next year they discover a helper method they could have called instead and so they can change the code to call that directly and not use impl->Purr at all. (Yes, they could achieve the same thing by updating the actual impl::Purr method as well, but in that case you are stuck with an extra function call that achieves nothing but calling the next function in turn.)
It also means the header only has definitions and does not have any implementation which makes for a cleaner separation, which is the whole point of the idiom.
We use the PIMPL idiom in order to emulate aspect-oriented programming where pre, post and error aspects are called before and after the execution of a member function.
struct Omg{
void purr(){ cout<< "purr\n"; }
};
struct Lol{
Omg* omg;
/*...*/
void purr(){ try{ pre(); omg-> purr(); post(); }catch(...){ error(); } }
};
We also use a pointer-to-base class to share different aspects between many classes.
The drawback of this approach is that the library user has to take into account all the aspects that are going to be executed, but only sees his/her class. It requires browsing the documentation for any side effects.
I just implemented my first PIMPL class over the last couple of days. I used it to eliminate problems I was having, including file *winsock2.*h in Borland Builder. It seemed to be screwing up struct alignment and since I had socket things in the class private data, those problems were spreading to any .cpp file that included the header.
By using PIMPL, winsock2.h was included in only one .cpp file where I could put a lid on the problem and not worry that it would come back to bite me.
To answer the original question, the advantage I found in forwarding the calls to the PIMPL class was that the PIMPL class is the same as what your original class would have been before you pimpl'd it, plus your implementations aren't spread over two classes in some weird fashion. It's much clearer to implement the public members to simply forward to the PIMPL class.
Like Mr Nodet said, one class, one responsibility.
I don't know if this is a difference worth mentioning but...
Would it be possible to have the implementation in its own namespace and have a public wrapper / library namespace for the code the user sees:
catlib::Cat::Purr(){ cat_->Purr(); }
cat::Cat::Purr(){
printf("purrrrrr");
}
This way all library code can make use of the cat namespace and as the need to expose a class to the user arises a wrapper could be created in the catlib namespace.
I find it telling that, in spite of how well-known the PIMPL idiom is, I don't see it crop up very often in real life (e.g., in open source projects).
I often wonder if the "benefits" are overblown; yes, you can make some of your implementation details even more hidden, and yes, you can change your implementation without changing the header, but it's not obvious that these are big advantages in reality.
That is to say, it's not clear that there's any need for your implementation to be that well hidden, and perhaps it's quite rare that people really do change only the implementation; as soon as you need to add new methods, say, you need to change the header anyway.
I'm currently in the process of trying to organize my code in better way.
To do that I used namespaces, grouping classes by components, each having a defined role and a few interfaces (actually Abstract classes).
I found it to be pretty good, especially when I had to rewrite an entire component and I did with almost no impact on the others. (I believe it would have been a lot more difficult with a bunch of mixed-up classes and methods)
Yet I'm not 100% happy with it. Especially I'd like to do a better separation between interfaces, the public face of the components, and their implementations in behind.
I think the 'interface' of the component itself should be clearer, I mean a new comer should understand easily what interfaces he must implement, what interfaces he can use and what's part of the implementation.
Soon I'll start a bigger project involving up to 5 devs, and I'd like to be clear in my mind on that point.
So what about you? how do you do it? how do you organize your code?
Especially I'd like to do a better
separation between interfaces, the
public face of the components, and
their implementations in behind.
I think what you're looking for is the Facade pattern:
A facade is an object that provides a simplified interface to a larger body of code, such as a class library. -- Wikipedia
You may also want to look at the Mediator and Builder patterns if you have complex interactions in your classes.
The Pimpl idiom (aka compiler firewall) is also useful for hiding implementation details and reducing build times. I prefer to use Pimpl over interface classes + factories when I don't need polymorphism. Be careful not to over-use it though. Don't use Pimpl for lightweight types that are normally allocated on the stack (like a 3D point or complex number). Use it for the bigger, longer-lived classes that have dependencies on other classes/libraries that you'd wish to hide from the user.
In large-scale projects, it's important to not use an #include directives in a header file when a simple forward declaration will do. Only put an #include directives in a header file if absolutely necessary (prefer to put #includes in the implementation files). If done right, proper #include discipline will reduce your compile times significantly. The Pimpl idiom can help to move #includes from header files to their corresponding implementation files.
A coherent collection of classes / functions can be grouped together in it's own namespace and put in a subdirectory of your source tree (the subdirectory should have the same name as the library namespace). You can then create a static library subproject/makefile for that package and link it with your main application. This is what I'd consider a "package" in UML jargon. In an ideal package, classes are closely related to each other, but loosely related with classes outside the package. It is helpful to draw dependency diagrams of your packages to make sure there are no cyclical dependencies.
There are two common approaches:
If, apart from the public interface, you only need some helper functions, just put them in an unnamed namespace in the implementation file:
// header:
class MyClass {
// interface etc.
};
// source file:
namespace {
void someHelper() {}
}
// ... MyClass implementation
If you find yourself wanting to hide member functions, consider using the PIMPL idiom:
class MyClassImpl; // forward declaration
class MyClass {
public:
// public interface
private:
MyClassImpl* pimpl;
};
MyClassImpl implements the functionality, while MyClass forwards calls to the public interface to the private implementation.
You might find some of the suggestions in Large Scale C++ Software Design useful. It's a bit dated (published in 1996) but still valuable, with pointers on structuring code to minimize the "recompiling the world when a single header file changes" problem.
Herb Sutter's article on "What's In a Class? - The Interface Principle" presents some ideas that many don't seem to think of when designing interfaces. It's a bit dated (1998) but there's some useful stuff in there, nonetheless.
First declare you variables you may use them in one string declaration also like so.
char Name[100],Name2[100],Name3[100];
using namespace std;
int main(){
}
and if you have a long peice of code that could be used out of the program make it a new function.
likeso.
void Return(char Word[100]){
cout<<Word;
cin.ignore();
}
int main(){
Return("Hello");
}
And I suggest any outside functions and declarations you put into a header file and link it
likeso
Dev-C++ #include "Resource.h"
What is best practice for C++ Public API?
I am working on a C++ project that has multiple namespaces, each with multiple objects. Some objects have the same names, but are in different namespaces. Currently, each object has its own .cpp file and .h file. I am not sure how to word this... Would it be appropriate to create a second .h file to expose only the public API? Should their be a .h file per namespace or per object or some other scope? What might be a best practice for creating Public APIs for C++ libraries?
Thanks For Any Help,
Chenz
It is sometimes convenient to have a single class in every .cpp and .h pair of files and to have the namespace hierarchy as the directory hierarchy.
For instance if you have this class:
namespace stuff {
namespace important {
class SecretPassword
{
...
};
}
}
then it will be in two files:
/stuff/important/SecretPassword.cpp
/stuff/important/SecretPassword.h
another possible layout might be:
/src/stuff/important/SecretPassword.cpp
/include/stuff/important/SecretPassword.h
G'day,
One suggestion is to take a look at the C++ idiom of Handle-Body, sometimes known as Cheshire Cat. Here's James Coplien's original paper containing the idiom.
This is a well known method for decoupling public API's from implementations.
HTH
I'd say it's best decided by you, and the type of 'library' this is.
Is your API provides one "Action"? or handles only one abstract "Data type"? examples for this would be zlib and libpng. Both have only one header that gives everything that is needed to perform what the libraries are for.
If your library is a collection of unrelated (or even related) classes that do, or not, the same goal, then provide each subset with it's own header. Major example for this will be boost.
Here is what I'm used to do:
"Some objects have the same names, but are in different namespaces"
That's why namespaces exist.
"Would it be appropriate to create a second .h file to expose only the public API? "
You always should expose only the public API. But what means to expose public API? If it would be only to headers then, since public API relies on private API, the private API would be included by public API anyway. To expose a public API mark public functions/classes with a macro (which in case of Windows exports public functions to the symbol table; and probably it will be soon adopted by Unix systems). So you should define a macro like MYLIB_API or MYLIB_DECLSPEC, just check some existing libraries and MS declspec documentation. It is sufficient, usually non-public API will be kept in subdirectories so it doesn't attend library's user.
"Should their be a .h file per namespace or per object or some other scope?"
I prefer Java-style, one public class per header. I found that libs written in this way are far more clean and readable than those which are mixing file and structure names. But there are cases when I brake this rule, especially when it comes to templates. In such cases I give #warning message to not include header directly and carefully explain in comments what is going on.
Great response LiraNuna.
Are you providing an API to an application or a library?
An application API will generally only provide methods that either query the state of an application or attempt to alter that state. In this case you'd generally have separate interface declarations in a separate header file. Your objects would then implement this interface to handle API requests.
A library will generally expose objects that can be re-used. In this case, generally speaking, your API is simply the public methods in the header file.
I agree with what doc said - one class per file. 99.9% of the time!
Also, consider what filenames to use. It's generally a bad idea to have multiple headers of the same name in different directories, even though the classes they contain may well be in different namespaces.
Especially if this is a public API, you probably cannot control what include paths are defined by the user of your library, so a build may find the wrong one. Tweaking the order of include paths would definitely not be a solution I'd recommend!
We use a naming convention of Namespace-Class.h to explicitly identify classes in files.
I have a code base where many of the classes I implement derive from classes that are provided by other divisions of my company. Working with these other devisions often have the working relationship as though they are third party middle ware vendors.
I'm trying to write test code without modifying these base classes. However, there are issues with creating meaningful test
objects due to the lack of interfaces:
//ACommonClass.h
#include "globalthermonuclearwar.h" //which contains deep #include dependencies...
#include "tictactoe.h" //...and need to exist at compile time to get into test...
class Something //which may or may not inherit from another class similar to this...
{
public:
virtual void fxn1(void); //which often calls into many other classes, similar to this
//...
int data1; //will be the only thing I can test against, but is often meaningless without fxn1 implemented
//...
};
I'd normally extract an interface and work from there, but as these are "Third Party", I can't commit these changes.
Currently, I've created a separate file that holds fake implementations for functions that are defined in the third-party supplied base class headers on a need to know basis, as has been described in the book "Working with Legacy Code".
My plan was to continue to use these definitions and provide alternative test implementations for each third party class that I needed:
//SomethingRequiredImplementations.cpp
#include "ACommonClass.h"
void CGlobalThermoNuclearWar::Simulate(void) {}; // fake this and all other required functions...
// fake implementations for otherwise undefined functions in globalthermonuclearwar.h's #include files...
void Something::fxn1(void) { data1 = blah(); } //test specific functionality.
But before I start doing that I was wondering if any one has tried providing actual objects on a code base similar to mine, which would allow creating new test specific classes to use in place of actual third-party classes.
Note all code bases in question are written in C++.
Mock objects are suitable for this kind of task. They allow you to simulate the existence of other components without needing them to be present. You simply define the expected input and output in your tests.
Google have a good mocking framework for C++.
I'm running into a very similar problem at the moment. I don't want to add a bunch of interfaces that are only there for the purpose of testing, so I can't use any of the existing mock object libraries. To get around this I do the same thing, creating a different file with fake implementations, and having my tests link the fake behaviour, and production code links the real behaviour.
What I wish I could do at this point, is take the internals of another mock framework, and use it inside my fake objects. It would look a little something like this:
Production.h
class ConcreteProductionClass { // regular everyday class
protected:
ConcreteProductionClass(); // I've found the 0 arg constructor useful
public:
void regularFunction(); // regular function that I want to mock
}
Mock.h
class MockProductionClass
: public ConcreteProductionClass
, public ClassThatLetsMeSetExpectations
{
friend class ConcreteProductionClass;
MockTypes membersNeededToSetExpectations;
public:
MockClass() : ConcreteProductionClass() {}
}
ConcreteProductionClass::regularFunction() {
membersNeededToSetExpectations.PassOrFailTheTest();
}
ProductionCode.cpp
void doSomething(ConcreteProductionClass c) {
c.regularFunction();
}
Test.cpp
TEST(myTest) {
MockProductionClass m;
m.SetExpectationsAndReturnValues();
doSomething(m);
ASSERT(m.verify());
}
The most painful part of all this is that the other mock frameworks are so close to this, but don't do it exactly, and the macros are so convoluted that it's not trivial to adapt them. I've begun looking into this on my spare time, but it's not moving along very quickly. Even if I got my method working the way I want, and had the expectation setting code in place, this method still has a couple drawbacks, one of them being that your build commands can get to be kind of long if you have to link against a lot of .o files rather than one .a, but that's manageable. It's also impossible to fall through to the default implementation, since we're not linking it. Anyway, I know this doesn't answer the question, or really even tell you anything you don't already know, but it shows how close the C++ community is to being able to mock classes that don't have a pure virtual interface.
You might want to consider mocking instead of faking as a potential solution. In some cases you may need to write wrapper classes that are mockable if the original classes aren't. I've done this with framework classes in C#/.Net, but not C++ so YMMV.
If I have a class that I need under test that derives from something I can't (or don't want to) run under test I'll:
Make a new logic-only class.
Move the code-i-wanna-test to the logic class.
Use an interface to talk back to the real class to interact with the base class and/or things I can't or won't put in the logic.
Define a test class using that same interface. This test class could have nothing but noops or fancy code that simulates the real classes.
If I have a class that I just need to use in testing, but using the real class is a problem (dependencies or unwanted behaviors):
I'll define a new interface that looks like all of the public methods I need to call.
I'll create a mock version of the object that supports that interface for testing.
I'll create another class that is constructed with a "real" version of that class. It also supports that interface. All interface calls a forwarded to the real object methods.
I'll only do this for methods I actually call - not ALL the public methods. I'll add to these classes as I write more tests.
For example, I wrap MFC's GDI classes like this to test Windows GDI drawing code. Templates can make some of this easier - but we often end up not doing that for various technical reasons (stuff with Windows DLL class exporting...).
I'm sure all this is in Feather's Working with Legacy Code book - and what I'm describing has actual terms. Just don't make me pull the book off the shelf...
One thing you did not indicate in your question is the reason why your classes derive from base classes from the other division. Is the relationship really a IS-A relationshiop ?
Unless your classes needs to be used by a framework, you could consider favoring delegation over inheritance. Then you can use dependency injection to provide your class with a mock of their class in the unit tests.
Otherwise, an idea would be to write a script to extract and create the interface your need from the header they provide, and integrate this to the compilation process so your unit test can ve checked in.
This question already has answers here:
Is the PIMPL idiom really used in practice?
(12 answers)
Closed 8 years ago.
Backgrounder:
The PIMPL Idiom (Pointer to IMPLementation) is a technique for implementation hiding in which a public class wraps a structure or class that cannot be seen outside the library the public class is part of.
This hides internal implementation details and data from the user of the library.
When implementing this idiom why would you place the public methods on the pimpl class and not the public class since the public classes method implementations would be compiled into the library and the user only has the header file?
To illustrate, this code puts the Purr() implementation on the impl class and wraps it as well.
Why not implement Purr directly on the public class?
// header file:
class Cat {
private:
class CatImpl; // Not defined here
CatImpl *cat_; // Handle
public:
Cat(); // Constructor
~Cat(); // Destructor
// Other operations...
Purr();
};
// CPP file:
#include "cat.h"
class Cat::CatImpl {
Purr();
... // The actual implementation can be anything
};
Cat::Cat() {
cat_ = new CatImpl;
}
Cat::~Cat() {
delete cat_;
}
Cat::Purr(){ cat_->Purr(); }
CatImpl::Purr(){
printf("purrrrrr");
}
I think most people refer to this as the Handle Body idiom. See James Coplien's book Advanced C++ Programming Styles and Idioms. It's also known as the Cheshire Cat because of Lewis Caroll's character that fades away until only the grin remains.
The example code should be distributed across two sets of source files. Then only Cat.h is the file that is shipped with the product.
CatImpl.h is included by Cat.cpp and CatImpl.cpp contains the implementation for CatImpl::Purr(). This won't be visible to the public using your product.
Basically the idea is to hide as much as possible of the implementation from prying eyes.
This is most useful where you have a commercial product that is shipped as a series of libraries that are accessed via an API that the customer's code is compiled against and linked to.
We did this with the rewrite of IONA's Orbix 3.3 product in 2000.
As mentioned by others, using his technique completely decouples the implementation from the interface of the object. Then you won't have to recompile everything that uses Cat if you just want to change the implementation of Purr().
This technique is used in a methodology called design by contract.
Because you want Purr() to be able to use private members of CatImpl. Cat::Purr() would not be allowed such an access without a friend declaration.
Because you then don't mix responsibilities: one class implements, one class forwards.
For what is worth, it separates the implementation from the interface. This is usually not very important in small size projects. But, in large projects and libraries, it can be used to reduce the build times significantly.
Consider that the implementation of Cat may include many headers, may involve template meta-programming which takes time to compile on its own. Why should a user, who just wants to use the Cat have to include all that? Hence, all the necessary files are hidden using the pimpl idiom (hence the forward declaration of CatImpl), and using the interface does not force the user to include them.
I'm developing a library for nonlinear optimization (read "lots of nasty math"), which is implemented in templates, so most of the code is in headers. It takes about five minutes to compile (on a decent multi-core CPU), and just parsing the headers in an otherwise empty .cpp takes about a minute. So anyone using the library has to wait a couple of minutes every time they compile their code, which makes the development quite tedious. However, by hiding the implementation and the headers, one just includes a simple interface file, which compiles instantly.
It does not necessarily have anything to do with protecting the implementation from being copied by other companies - which wouldn't probably happen anyway, unless the inner workings of your algorithm can be guessed from the definitions of the member variables (if so, it is probably not very complicated and not worth protecting in the first place).
If your class uses the PIMPL idiom, you can avoid changing the header file on the public class.
This allows you to add/remove methods to the PIMPL class, without modifying the external class's header file. You can also add/remove #includes to the PIMPL too.
When you change the external class's header file, you have to recompile everything that #includes it (and if any of those are header files, you have to recompile everything that #includes them, and so on).
Typically, the only reference to a PIMPL class in the header for the owner class (Cat in this case) would be a forward declaration, as you have done here, because that can greatly reduce the dependencies.
For example, if your PIMPL class has ComplicatedClass as a member (and not just a pointer or reference to it) then you would need to have ComplicatedClass fully defined before its use. In practice, this means including file "ComplicatedClass.h" (which will also indirectly include anything ComplicatedClass depends on). This can lead to a single header fill pulling in lots and lots of stuff, which is bad for managing your dependencies (and your compile times).
When you use the PIMPL idiom, you only need to #include the stuff used in the public interface of your owner type (which would be Cat here). Which makes things better for people using your library, and means you don't need to worry about people depending on some internal part of your library - either by mistake, or because they want to do something you don't allow, so they #define private public before including your files.
If it's a simple class, there's usually isn't any reason to use a PIMPL, but for times when the types are quite big, it can be a big help (especially in avoiding long build times).
Well, I wouldn't use it. I have a better alternative:
File foo.h
class Foo {
public:
virtual ~Foo() { }
virtual void someMethod() = 0;
// This "replaces" the constructor
static Foo *create();
}
File foo.cpp
namespace {
class FooImpl: virtual public Foo {
public:
void someMethod() {
//....
}
};
}
Foo *Foo::create() {
return new FooImpl;
}
Does this pattern have a name?
As someone who is also a Python and Java programmer, I like this a lot more than the PIMPL idiom.
Placing the call to the impl->Purr inside the .cpp file means that in the future you could do something completely different without having to change the header file.
Maybe next year they discover a helper method they could have called instead and so they can change the code to call that directly and not use impl->Purr at all. (Yes, they could achieve the same thing by updating the actual impl::Purr method as well, but in that case you are stuck with an extra function call that achieves nothing but calling the next function in turn.)
It also means the header only has definitions and does not have any implementation which makes for a cleaner separation, which is the whole point of the idiom.
We use the PIMPL idiom in order to emulate aspect-oriented programming where pre, post and error aspects are called before and after the execution of a member function.
struct Omg{
void purr(){ cout<< "purr\n"; }
};
struct Lol{
Omg* omg;
/*...*/
void purr(){ try{ pre(); omg-> purr(); post(); }catch(...){ error(); } }
};
We also use a pointer-to-base class to share different aspects between many classes.
The drawback of this approach is that the library user has to take into account all the aspects that are going to be executed, but only sees his/her class. It requires browsing the documentation for any side effects.
I just implemented my first PIMPL class over the last couple of days. I used it to eliminate problems I was having, including file *winsock2.*h in Borland Builder. It seemed to be screwing up struct alignment and since I had socket things in the class private data, those problems were spreading to any .cpp file that included the header.
By using PIMPL, winsock2.h was included in only one .cpp file where I could put a lid on the problem and not worry that it would come back to bite me.
To answer the original question, the advantage I found in forwarding the calls to the PIMPL class was that the PIMPL class is the same as what your original class would have been before you pimpl'd it, plus your implementations aren't spread over two classes in some weird fashion. It's much clearer to implement the public members to simply forward to the PIMPL class.
Like Mr Nodet said, one class, one responsibility.
I don't know if this is a difference worth mentioning but...
Would it be possible to have the implementation in its own namespace and have a public wrapper / library namespace for the code the user sees:
catlib::Cat::Purr(){ cat_->Purr(); }
cat::Cat::Purr(){
printf("purrrrrr");
}
This way all library code can make use of the cat namespace and as the need to expose a class to the user arises a wrapper could be created in the catlib namespace.
I find it telling that, in spite of how well-known the PIMPL idiom is, I don't see it crop up very often in real life (e.g., in open source projects).
I often wonder if the "benefits" are overblown; yes, you can make some of your implementation details even more hidden, and yes, you can change your implementation without changing the header, but it's not obvious that these are big advantages in reality.
That is to say, it's not clear that there's any need for your implementation to be that well hidden, and perhaps it's quite rare that people really do change only the implementation; as soon as you need to add new methods, say, you need to change the header anyway.