Preventing library internal stuff from beeing visible in .h - c++

First things first: I know thats it's quite probable that my question is based on a general missunderstanding, but I hope I can make my point clear to you.
I know that when putting code into diffrent files in C++ we have do create a .h (for Definitions) and a .cpp (for implementations).
When writing a class-library for example I have to put a definitioj of both, private and public members in the header. But what to do if - for some reason - I want to hide the private things from the user?
Thanks!

You use the Pimpl idiom. Pimpl stands for "pointer to implementation". Basically, you don't store the implementation details in the class. Instead you store an opaque pointer to implementation as such:
class MyClass
{
public:
//public functions here
private:
class Impl; //no definition, opaque
Impl* pimpl;
};
And then you go on to define your MyClass::Impl class in the .cpp file only, because only the function definitions need it.
The PIMPL idiom has the additional benefit that as long as the interface of the class does not change, any changes made to the implementation do not require the recompilation of class clients.

You can't. This is one of the drawbacks of C++: things marked private are still part of the class declaration and so will be present in the header.
If you really want to hide the functionality, then using the pImpl Idiom can help. Personally though I prefer to live with the drawback as using the idiom can cause other problems such as your class no longer being trivially copyable.
See Why should the "PIMPL" idiom be used?

Related

Is forward declaring a class a correct way to hide the implementation? [duplicate]

This question already has answers here:
Is the PIMPL idiom really used in practice?
(12 answers)
Closed 8 years ago.
Backgrounder:
The PIMPL Idiom (Pointer to IMPLementation) is a technique for implementation hiding in which a public class wraps a structure or class that cannot be seen outside the library the public class is part of.
This hides internal implementation details and data from the user of the library.
When implementing this idiom why would you place the public methods on the pimpl class and not the public class since the public classes method implementations would be compiled into the library and the user only has the header file?
To illustrate, this code puts the Purr() implementation on the impl class and wraps it as well.
Why not implement Purr directly on the public class?
// header file:
class Cat {
private:
class CatImpl; // Not defined here
CatImpl *cat_; // Handle
public:
Cat(); // Constructor
~Cat(); // Destructor
// Other operations...
Purr();
};
// CPP file:
#include "cat.h"
class Cat::CatImpl {
Purr();
... // The actual implementation can be anything
};
Cat::Cat() {
cat_ = new CatImpl;
}
Cat::~Cat() {
delete cat_;
}
Cat::Purr(){ cat_->Purr(); }
CatImpl::Purr(){
printf("purrrrrr");
}
I think most people refer to this as the Handle Body idiom. See James Coplien's book Advanced C++ Programming Styles and Idioms. It's also known as the Cheshire Cat because of Lewis Caroll's character that fades away until only the grin remains.
The example code should be distributed across two sets of source files. Then only Cat.h is the file that is shipped with the product.
CatImpl.h is included by Cat.cpp and CatImpl.cpp contains the implementation for CatImpl::Purr(). This won't be visible to the public using your product.
Basically the idea is to hide as much as possible of the implementation from prying eyes.
This is most useful where you have a commercial product that is shipped as a series of libraries that are accessed via an API that the customer's code is compiled against and linked to.
We did this with the rewrite of IONA's Orbix 3.3 product in 2000.
As mentioned by others, using his technique completely decouples the implementation from the interface of the object. Then you won't have to recompile everything that uses Cat if you just want to change the implementation of Purr().
This technique is used in a methodology called design by contract.
Because you want Purr() to be able to use private members of CatImpl. Cat::Purr() would not be allowed such an access without a friend declaration.
Because you then don't mix responsibilities: one class implements, one class forwards.
For what is worth, it separates the implementation from the interface. This is usually not very important in small size projects. But, in large projects and libraries, it can be used to reduce the build times significantly.
Consider that the implementation of Cat may include many headers, may involve template meta-programming which takes time to compile on its own. Why should a user, who just wants to use the Cat have to include all that? Hence, all the necessary files are hidden using the pimpl idiom (hence the forward declaration of CatImpl), and using the interface does not force the user to include them.
I'm developing a library for nonlinear optimization (read "lots of nasty math"), which is implemented in templates, so most of the code is in headers. It takes about five minutes to compile (on a decent multi-core CPU), and just parsing the headers in an otherwise empty .cpp takes about a minute. So anyone using the library has to wait a couple of minutes every time they compile their code, which makes the development quite tedious. However, by hiding the implementation and the headers, one just includes a simple interface file, which compiles instantly.
It does not necessarily have anything to do with protecting the implementation from being copied by other companies - which wouldn't probably happen anyway, unless the inner workings of your algorithm can be guessed from the definitions of the member variables (if so, it is probably not very complicated and not worth protecting in the first place).
If your class uses the PIMPL idiom, you can avoid changing the header file on the public class.
This allows you to add/remove methods to the PIMPL class, without modifying the external class's header file. You can also add/remove #includes to the PIMPL too.
When you change the external class's header file, you have to recompile everything that #includes it (and if any of those are header files, you have to recompile everything that #includes them, and so on).
Typically, the only reference to a PIMPL class in the header for the owner class (Cat in this case) would be a forward declaration, as you have done here, because that can greatly reduce the dependencies.
For example, if your PIMPL class has ComplicatedClass as a member (and not just a pointer or reference to it) then you would need to have ComplicatedClass fully defined before its use. In practice, this means including file "ComplicatedClass.h" (which will also indirectly include anything ComplicatedClass depends on). This can lead to a single header fill pulling in lots and lots of stuff, which is bad for managing your dependencies (and your compile times).
When you use the PIMPL idiom, you only need to #include the stuff used in the public interface of your owner type (which would be Cat here). Which makes things better for people using your library, and means you don't need to worry about people depending on some internal part of your library - either by mistake, or because they want to do something you don't allow, so they #define private public before including your files.
If it's a simple class, there's usually isn't any reason to use a PIMPL, but for times when the types are quite big, it can be a big help (especially in avoiding long build times).
Well, I wouldn't use it. I have a better alternative:
File foo.h
class Foo {
public:
virtual ~Foo() { }
virtual void someMethod() = 0;
// This "replaces" the constructor
static Foo *create();
}
File foo.cpp
namespace {
class FooImpl: virtual public Foo {
public:
void someMethod() {
//....
}
};
}
Foo *Foo::create() {
return new FooImpl;
}
Does this pattern have a name?
As someone who is also a Python and Java programmer, I like this a lot more than the PIMPL idiom.
Placing the call to the impl->Purr inside the .cpp file means that in the future you could do something completely different without having to change the header file.
Maybe next year they discover a helper method they could have called instead and so they can change the code to call that directly and not use impl->Purr at all. (Yes, they could achieve the same thing by updating the actual impl::Purr method as well, but in that case you are stuck with an extra function call that achieves nothing but calling the next function in turn.)
It also means the header only has definitions and does not have any implementation which makes for a cleaner separation, which is the whole point of the idiom.
We use the PIMPL idiom in order to emulate aspect-oriented programming where pre, post and error aspects are called before and after the execution of a member function.
struct Omg{
void purr(){ cout<< "purr\n"; }
};
struct Lol{
Omg* omg;
/*...*/
void purr(){ try{ pre(); omg-> purr(); post(); }catch(...){ error(); } }
};
We also use a pointer-to-base class to share different aspects between many classes.
The drawback of this approach is that the library user has to take into account all the aspects that are going to be executed, but only sees his/her class. It requires browsing the documentation for any side effects.
I just implemented my first PIMPL class over the last couple of days. I used it to eliminate problems I was having, including file *winsock2.*h in Borland Builder. It seemed to be screwing up struct alignment and since I had socket things in the class private data, those problems were spreading to any .cpp file that included the header.
By using PIMPL, winsock2.h was included in only one .cpp file where I could put a lid on the problem and not worry that it would come back to bite me.
To answer the original question, the advantage I found in forwarding the calls to the PIMPL class was that the PIMPL class is the same as what your original class would have been before you pimpl'd it, plus your implementations aren't spread over two classes in some weird fashion. It's much clearer to implement the public members to simply forward to the PIMPL class.
Like Mr Nodet said, one class, one responsibility.
I don't know if this is a difference worth mentioning but...
Would it be possible to have the implementation in its own namespace and have a public wrapper / library namespace for the code the user sees:
catlib::Cat::Purr(){ cat_->Purr(); }
cat::Cat::Purr(){
printf("purrrrrr");
}
This way all library code can make use of the cat namespace and as the need to expose a class to the user arises a wrapper could be created in the catlib namespace.
I find it telling that, in spite of how well-known the PIMPL idiom is, I don't see it crop up very often in real life (e.g., in open source projects).
I often wonder if the "benefits" are overblown; yes, you can make some of your implementation details even more hidden, and yes, you can change your implementation without changing the header, but it's not obvious that these are big advantages in reality.
That is to say, it's not clear that there's any need for your implementation to be that well hidden, and perhaps it's quite rare that people really do change only the implementation; as soon as you need to add new methods, say, you need to change the header anyway.

Why would you want to put a class in an implementation file?

While looking over some code, I ran into the following:
.h file
class ExampleClass
{
public:
// methods, etc
private:
class AnotherExampleClass* ptrToClass;
}
.cpp file
class AnotherExampleClass
{
// methods, etc
}
// AnotherExampleClass and ExampleClass implemented
Is this a pattern or something beneficial when working in c++? Since the class is not broken out into another file, does this work flow promote faster compilation times?
or is this just the style this developer?
This is variously known as the pImpl Idiom, Cheshire cat technique, or Compilation firewall.
Benefits:
Changing private member variables of a class does not require recompiling classes that depend on it, thus make times are faster, and
the FragileBinaryInterfaceProblem is reduced.
The header file does not need to #include classes that are used 'by value' in private member variables, thus compile times are faster.
This is sorta like the way SmallTalk automatically handles classes... more pure encapsulation.
Drawbacks:
More work for the implementor.
Doesn't work for 'protected' members where access by subclasses is required.
Somewhat harder to read code, since some information is no longer in the header file.
Run-time performance is slightly compromised due to the pointer indirection, especially if function calls are virtual (branch prediction for indirect branches is generally poor).
Herb Sutter's "Exceptional C++" books also go into useful detail on the appropriate usage of this technique.
The most common example would be when using the PIMPL pattern or similar techniques. Still, there are other uses as well. Typically, the distinction .hpp/.cpp in C++ is rather (or, at least can be) one of public interface versus private implementation. If a type is only used as part of the implementation, then that's a good reason not to export it in the header file.
Apart from possibly being an implementation of the PIMPL idiom, here are two more possible reason to do this:
Objects in C++ cannot modify their this pointer. As a consequence, they cannot change type in mid-usage. However, ptrToClass can change, allowing an implementation to delete itself and to replace itself with another instance of another subclass of AnotherExampleClass.
If the implementation of AnotherExampleClass depends on some template parameters, but the interface of ExampleClass does not, it is possible to use a template derived from AnotherExampleClass to provide the implementation. This hides part of the necessary, yet internal type information from the user of the interface class.

What is this design pattern? How to use it?

Suppose I have class like this (simplified):
class Foo_p;
class Foo
{
private:
Foo_p *p;
public:
Foo();
/* methods, etc... */
};
This class is a part of an API.
The Foo_p is all the private parts of the class, which are not
declared in the class Foo itself as usual, but rather in a separate forward-declared class that is only used by the underlying implementation not visible on the outside.
I've seen this pattern used in a couple of projects, is there a name for it?
Also, how do I use it properly (e.g. exception safety, etc.)? Where should the actual implementation go? In class Foo, as usual, only using Foo_p for storage of data, or in the Foo_p class with Foo being just a wrapper?
That is the pimpl idiom
See
Handle/Body Idiom - close cousin and the seminal idea for pimpl EDIT Found an online copy of this original James Coplien talk
GotW #100: Compilation Firewalls (Difficulty: 6/10)
GotW #101: Compilation Firewalls, Part 2 (Difficulty: 8/10) for the most recent on this
This is known is PIMPL. private/pointer-to-private implementation. The class, Foo_p, your class would have been is implemented privately and accessed through a pointer to it so that rather than displaying the true class to clients, they only get to see the public interface you chose to expose. It essentially abstracts away from the header the vestiges of implementation detail present in the protected and private members.
I've found it unwieldy in VC++- it breaks code completion. It is useful if you are very sure of your implementation and don't want the private and protected members on display in the header.
I put the actual implementation of class Foo_p in the cpp file for class Foo, although this may have been the cause of the code-completion breaking, at least I don't have to run the risk of the class being reused by inclusion of its header.
It's a d-pointer which is a type of opaque-pointer. Similar to the PIMPL idiom.
One type of opaque pointer commonly used in C++ class declarations is
the d-pointer. The d-pointer is the only private data member of the
class and points to an instance of a struct. Named by Arnt Gulbrandsen
of Trolltech, this method allows class declarations to omit private
data members, except for the d-pointer itself.[6] The result is that
more of the class' implementation is hidden from view, that adding new
data members to the private struct does not affect binary
compatibility, and that the header file containing the class
declaration only has to #include those other files that are needed for
the class interface, rather than for its implementation. As a side
benefit, compiles are faster because the header file changes less
often. The d-pointer is heavily used in the Qt and KDE libraries.
https://en.wikipedia.org/wiki/Opaque_pointer#C.2B.2B

Can I create functions without defining them in the header file?

Can I create a function inside a class without defining it in the header file of that class?
Why don't you try and see?
[˙ʇ,uɐɔ noʎ 'oᴎ]
Update: Just to reflect on the comments below, with the emphasis of the C++ language on smart compiling, the compiler needs to know the size of the class (thus requiring declaration of all member data) and the class interface (thus requiring all functions and types declaration).
If you want the flexibility of adding functions to the class without the need to change the class header then consider using the pimpl idiom. This will, however, cost you the extra dereference for each call or use of the function or data you added. There are various common reasons for implementing the pimpl:
to reduce compilation time, as this allows you to change the class without changing all the compilation units that depend on it (#include it)
to reduce coupling between a class' dependents and some often-changing implementation details of the class.
as Noah Roberts mentioned below the pimpl can also solve exception safety issues.
No. However, you can mimic such:
struct X
{
void f();
X();
~X();
private:
struct impl;
impl * pimpl;
};
// X.cpp
struct X::impl
{
void f()
{
private_function();
...
}
void private_function() { ...access private variables... }
};
//todo: constructor/destructor...
void X::f() { pimpl->f(); }
Short answer: No, you can't.
However, if you're trying to inject a private function into the class that will only be used in that class's implementation, you can create a function in an anonymous namespace within that class's .cpp file that takes an object of that type by reference or pointer.
Note that you won't be able to muck with the passed objects internal state directly (since there's no way for the class to declare friendship with that anonymous function), but if the function just aggregates operations from the public interface of the class it should work just fine.
No, you can't. It wouldn't make much sense anyway. How should users of your class that include the header file know about those functions and use them?
I have no idea what You are trying to do, but I have a strange gut feeling, that Pointer To Implementation (pImpl) idiom might be helpful. If you want to add a public method in a cpp file and that method is not declared in the class definition in a header, You can't do that and it doesn't make sense.

Why should the "PIMPL" idiom be used? [duplicate]

This question already has answers here:
Is the PIMPL idiom really used in practice?
(12 answers)
Closed 8 years ago.
Backgrounder:
The PIMPL Idiom (Pointer to IMPLementation) is a technique for implementation hiding in which a public class wraps a structure or class that cannot be seen outside the library the public class is part of.
This hides internal implementation details and data from the user of the library.
When implementing this idiom why would you place the public methods on the pimpl class and not the public class since the public classes method implementations would be compiled into the library and the user only has the header file?
To illustrate, this code puts the Purr() implementation on the impl class and wraps it as well.
Why not implement Purr directly on the public class?
// header file:
class Cat {
private:
class CatImpl; // Not defined here
CatImpl *cat_; // Handle
public:
Cat(); // Constructor
~Cat(); // Destructor
// Other operations...
Purr();
};
// CPP file:
#include "cat.h"
class Cat::CatImpl {
Purr();
... // The actual implementation can be anything
};
Cat::Cat() {
cat_ = new CatImpl;
}
Cat::~Cat() {
delete cat_;
}
Cat::Purr(){ cat_->Purr(); }
CatImpl::Purr(){
printf("purrrrrr");
}
I think most people refer to this as the Handle Body idiom. See James Coplien's book Advanced C++ Programming Styles and Idioms. It's also known as the Cheshire Cat because of Lewis Caroll's character that fades away until only the grin remains.
The example code should be distributed across two sets of source files. Then only Cat.h is the file that is shipped with the product.
CatImpl.h is included by Cat.cpp and CatImpl.cpp contains the implementation for CatImpl::Purr(). This won't be visible to the public using your product.
Basically the idea is to hide as much as possible of the implementation from prying eyes.
This is most useful where you have a commercial product that is shipped as a series of libraries that are accessed via an API that the customer's code is compiled against and linked to.
We did this with the rewrite of IONA's Orbix 3.3 product in 2000.
As mentioned by others, using his technique completely decouples the implementation from the interface of the object. Then you won't have to recompile everything that uses Cat if you just want to change the implementation of Purr().
This technique is used in a methodology called design by contract.
Because you want Purr() to be able to use private members of CatImpl. Cat::Purr() would not be allowed such an access without a friend declaration.
Because you then don't mix responsibilities: one class implements, one class forwards.
For what is worth, it separates the implementation from the interface. This is usually not very important in small size projects. But, in large projects and libraries, it can be used to reduce the build times significantly.
Consider that the implementation of Cat may include many headers, may involve template meta-programming which takes time to compile on its own. Why should a user, who just wants to use the Cat have to include all that? Hence, all the necessary files are hidden using the pimpl idiom (hence the forward declaration of CatImpl), and using the interface does not force the user to include them.
I'm developing a library for nonlinear optimization (read "lots of nasty math"), which is implemented in templates, so most of the code is in headers. It takes about five minutes to compile (on a decent multi-core CPU), and just parsing the headers in an otherwise empty .cpp takes about a minute. So anyone using the library has to wait a couple of minutes every time they compile their code, which makes the development quite tedious. However, by hiding the implementation and the headers, one just includes a simple interface file, which compiles instantly.
It does not necessarily have anything to do with protecting the implementation from being copied by other companies - which wouldn't probably happen anyway, unless the inner workings of your algorithm can be guessed from the definitions of the member variables (if so, it is probably not very complicated and not worth protecting in the first place).
If your class uses the PIMPL idiom, you can avoid changing the header file on the public class.
This allows you to add/remove methods to the PIMPL class, without modifying the external class's header file. You can also add/remove #includes to the PIMPL too.
When you change the external class's header file, you have to recompile everything that #includes it (and if any of those are header files, you have to recompile everything that #includes them, and so on).
Typically, the only reference to a PIMPL class in the header for the owner class (Cat in this case) would be a forward declaration, as you have done here, because that can greatly reduce the dependencies.
For example, if your PIMPL class has ComplicatedClass as a member (and not just a pointer or reference to it) then you would need to have ComplicatedClass fully defined before its use. In practice, this means including file "ComplicatedClass.h" (which will also indirectly include anything ComplicatedClass depends on). This can lead to a single header fill pulling in lots and lots of stuff, which is bad for managing your dependencies (and your compile times).
When you use the PIMPL idiom, you only need to #include the stuff used in the public interface of your owner type (which would be Cat here). Which makes things better for people using your library, and means you don't need to worry about people depending on some internal part of your library - either by mistake, or because they want to do something you don't allow, so they #define private public before including your files.
If it's a simple class, there's usually isn't any reason to use a PIMPL, but for times when the types are quite big, it can be a big help (especially in avoiding long build times).
Well, I wouldn't use it. I have a better alternative:
File foo.h
class Foo {
public:
virtual ~Foo() { }
virtual void someMethod() = 0;
// This "replaces" the constructor
static Foo *create();
}
File foo.cpp
namespace {
class FooImpl: virtual public Foo {
public:
void someMethod() {
//....
}
};
}
Foo *Foo::create() {
return new FooImpl;
}
Does this pattern have a name?
As someone who is also a Python and Java programmer, I like this a lot more than the PIMPL idiom.
Placing the call to the impl->Purr inside the .cpp file means that in the future you could do something completely different without having to change the header file.
Maybe next year they discover a helper method they could have called instead and so they can change the code to call that directly and not use impl->Purr at all. (Yes, they could achieve the same thing by updating the actual impl::Purr method as well, but in that case you are stuck with an extra function call that achieves nothing but calling the next function in turn.)
It also means the header only has definitions and does not have any implementation which makes for a cleaner separation, which is the whole point of the idiom.
We use the PIMPL idiom in order to emulate aspect-oriented programming where pre, post and error aspects are called before and after the execution of a member function.
struct Omg{
void purr(){ cout<< "purr\n"; }
};
struct Lol{
Omg* omg;
/*...*/
void purr(){ try{ pre(); omg-> purr(); post(); }catch(...){ error(); } }
};
We also use a pointer-to-base class to share different aspects between many classes.
The drawback of this approach is that the library user has to take into account all the aspects that are going to be executed, but only sees his/her class. It requires browsing the documentation for any side effects.
I just implemented my first PIMPL class over the last couple of days. I used it to eliminate problems I was having, including file *winsock2.*h in Borland Builder. It seemed to be screwing up struct alignment and since I had socket things in the class private data, those problems were spreading to any .cpp file that included the header.
By using PIMPL, winsock2.h was included in only one .cpp file where I could put a lid on the problem and not worry that it would come back to bite me.
To answer the original question, the advantage I found in forwarding the calls to the PIMPL class was that the PIMPL class is the same as what your original class would have been before you pimpl'd it, plus your implementations aren't spread over two classes in some weird fashion. It's much clearer to implement the public members to simply forward to the PIMPL class.
Like Mr Nodet said, one class, one responsibility.
I don't know if this is a difference worth mentioning but...
Would it be possible to have the implementation in its own namespace and have a public wrapper / library namespace for the code the user sees:
catlib::Cat::Purr(){ cat_->Purr(); }
cat::Cat::Purr(){
printf("purrrrrr");
}
This way all library code can make use of the cat namespace and as the need to expose a class to the user arises a wrapper could be created in the catlib namespace.
I find it telling that, in spite of how well-known the PIMPL idiom is, I don't see it crop up very often in real life (e.g., in open source projects).
I often wonder if the "benefits" are overblown; yes, you can make some of your implementation details even more hidden, and yes, you can change your implementation without changing the header, but it's not obvious that these are big advantages in reality.
That is to say, it's not clear that there's any need for your implementation to be that well hidden, and perhaps it's quite rare that people really do change only the implementation; as soon as you need to add new methods, say, you need to change the header anyway.