Library design: Hiding dependencies - c++

I'm attempting to build a library that uses a third-party library internally, but I do not want to expose any of this third-party library to the user of my library. This way, when the static library is built, the user will only need my header and the compiled library.
How do I deal with private members in my class definitions that are defined in the 3rd party library?
For example:
header:
#include "ThirdPartyLib.h"
class DummyClass
{
TypeFromThirdParty tftp;
public:
bool checkStuff(const float) const;
};
implementation:
#include "ThirdPartyLib.h"
#include "dummy.h"
bool DummyClass::checkStuff(const float t) const // must repeat the const from the declaration
{
return tftp.isOk(t);
}
The offending portion is the #include "ThirdPartyLib.h" in the header, as then the user of my library will need more than my library.
One way of getting around this might be to forward declare all third party types used in the header and then replace the value types with references, but I'm wondering if there is another method or design that I am completely overlooking?

The "private implementation class" or "pimpl" idiom is one approach. This keeps all mention of the third-party library (and other implementation details) out of the header, at the cost of an extra level of indirection:
// header
#include <memory>
class DummyClass {
public:
DummyClass();
~DummyClass();
bool checkStuff(float t);
private:
struct Impl;
std::unique_ptr<Impl> impl;
};
// source
#include "DummyClass.h"
#include "ThirdPartyLib.h"
struct DummyClass::Impl {
TypeFromThirdParty tftp;
};
DummyClass::DummyClass() : impl(new Impl) {}
// This must be defined here, since ~unique_ptr requires Impl to be complete
DummyClass::~DummyClass() {}
bool DummyClass::checkStuff(float t) {return impl->tftp.isOk(t);}
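One consequence of the header above is worth spelling out: because a destructor is declared, the class gets no implicit move operations, and the unique_ptr member makes it non-copyable. A minimal single-file sketch of one way to make it movable follows. TypeFromThirdParty here is a hypothetical stand-in defined only so the example compiles, and the "non-negative is ok" rule is an assumption, not the real library's behaviour.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// ---- what would live in the public header ----
class DummyClass {
public:
    DummyClass();
    ~DummyClass();
    DummyClass(DummyClass&&) noexcept;             // declared here, defined in the source
    DummyClass& operator=(DummyClass&&) noexcept;
    bool checkStuff(float t) const;
private:
    struct Impl;
    std::unique_ptr<Impl> impl;
};

// ---- what would live in the source file ----
// Hypothetical stand-in for the third-party type (assumption for this sketch).
struct TypeFromThirdParty {
    bool isOk(float t) const { return t >= 0.0f; }
};

struct DummyClass::Impl {
    TypeFromThirdParty tftp;
};

DummyClass::DummyClass() : impl(new Impl) {}
// All of these are defined after Impl is complete, which unique_ptr needs.
DummyClass::~DummyClass() = default;
DummyClass::DummyClass(DummyClass&&) noexcept = default;
DummyClass& DummyClass::operator=(DummyClass&&) noexcept = default;
bool DummyClass::checkStuff(float t) const { return impl->tftp.isOk(t); }

// Example use: move the object, then keep using the new owner.
inline void movable_pimpl_demo() {
    DummyClass a;
    DummyClass b(std::move(a));
    assert(b.checkStuff(1.0f));
    assert(!b.checkStuff(-1.0f));
}
```

After a move, the moved-from object holds a null impl, so it should only be destroyed or assigned to.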
Another approach is to define an abstract interface, and a factory to create the concrete implementation class. Again, this removes all implementation details from the header, at the cost of an extra indirection:
// header
#include <memory>
struct DummyInterface {
virtual ~DummyInterface() {}
virtual bool checkStuff(float t) = 0;
static std::unique_ptr<DummyInterface> create();
};
// source
#include "DummyClass.h"
#include "ThirdPartyLib.h"
struct DummyClass : DummyInterface {
TypeFromThirdParty tftp;
bool checkStuff(float t) {return tftp.isOk(t);}
};
std::unique_ptr<DummyInterface> DummyInterface::create() {
return std::unique_ptr<DummyInterface>(new DummyClass);
}
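For completeness, here is a rough single-file sketch of how client code might use the factory version. TypeFromThirdParty is again a hypothetical stand-in (the real one comes from the third-party header), and client_check is a name invented for this example.

```cpp
#include <cassert>
#include <memory>

// ---- public header: nothing from the third-party library appears here ----
struct DummyInterface {
    virtual ~DummyInterface() {}
    virtual bool checkStuff(float t) = 0;
    static std::unique_ptr<DummyInterface> create();
};

// ---- source file ----
namespace {
// Hypothetical stand-in for the third-party type (assumption for this sketch).
struct TypeFromThirdParty {
    bool isOk(float t) { return t >= 0.0f; }
};

// Concrete class, invisible to clients.
struct DummyClass : DummyInterface {
    TypeFromThirdParty tftp;
    bool checkStuff(float t) override { return tftp.isOk(t); }
};
}  // namespace

std::unique_ptr<DummyInterface> DummyInterface::create() {
    return std::unique_ptr<DummyInterface>(new DummyClass);
}

// ---- client code: only ever names DummyInterface ----
inline bool client_check(float t) {
    std::unique_ptr<DummyInterface> d = DummyInterface::create();
    assert(d && "create() is expected to return an object");
    return d->checkStuff(t);
}
```

The virtual destructor in the interface is what makes it safe for the client's unique_ptr to destroy the hidden concrete object.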

Related

Private variable declaration in implementation file

Consider this header file:
#ifndef __FOLDER_H__
#define __FOLDER_H__
#include <boost/filesystem.hpp>
class Folder
{
public:
Folder(char* arg);
private:
std::vector<boost::filesystem::path> files;
};
#endif
Everybody including Folder.h will also include boost/filesystem.hpp. However, there are no boost/filesystem types in the public interface of Folder. boost/filesystem.hpp kind of leaks out of Folder.h for the technical reason of declaring a private variable.
I would like to avoid this. Would it be best to declare private variables in the implementation file Folder.cc? Is there some syntax to declare a block of private variables in the implementation file?
There are quite a few idioms to hide the implementation details of a given class. Two of the ones I tend to use are PIMPL and interfaces.
PIMPL
PIMPL is an idiom where you declare a private structure in the header file without defining it there, and store all of your private implementation details in that structure. You then reference the structure through a pointer to the implementation, traditionally called pImpl (hence the name).
With the PIMPL idiom, Folder.h becomes this:
//this replaces the include guards and is available in almost all modern compilers.
#pragma once
class Folder
{
public:
Folder(char* arg);
private:
struct FolderImpl* pImpl;
};
And in Folder.cc, you can define FolderImpl as follows:
#include <vector>
#include <boost/filesystem.hpp>
struct FolderImpl
{
std::vector<boost::filesystem::path> files;
};
From there, any operations that work with the files member reference it by pImpl->files.
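The snippets above leave out who allocates and frees pImpl. A minimal sketch of the whole thing might look like this, assuming std::string as a stand-in for boost::filesystem::path so it builds without Boost. addFile and fileCount are hypothetical members added only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// ---- Folder.h ----
struct FolderImpl;  // declared only; the definition stays in Folder.cc

class Folder {
public:
    explicit Folder(const char* arg);
    ~Folder();
    Folder(const Folder&) = delete;             // raw owning pointer: forbid copies
    Folder& operator=(const Folder&) = delete;
    void addFile(const char* name);             // hypothetical member
    std::size_t fileCount() const;              // hypothetical member
private:
    FolderImpl* pImpl;
};

// ---- Folder.cc ----
struct FolderImpl {
    std::string root;
    std::vector<std::string> files;  // stand-in for std::vector<boost::filesystem::path>
};

Folder::Folder(const char* arg) : pImpl(new FolderImpl) { pImpl->root = arg; }
Folder::~Folder() { delete pImpl; }  // FolderImpl is complete here
void Folder::addFile(const char* name) { pImpl->files.push_back(name); }
std::size_t Folder::fileCount() const { return pImpl->files.size(); }

// Example use
inline void folder_demo() {
    Folder f("/tmp");
    f.addFile("a.txt");
    f.addFile("b.txt");
    assert(f.fileCount() == 2);
}
```

Copying is disabled here because the default copy would duplicate the raw pointer and lead to a double delete.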
Interfaces
Interfaces are actually something I "stole" from Microsoft COM. The basic idea is you declare an abstract class, one without any member variables, and inherit from this class in a private header file compiled into your library.
In the Interface idiom, Folder.h becomes this:
class Folder
{
public:
virtual ~Folder() {} // lets callers delete the implementation through a Folder*
virtual bool DoesFileExist(char* file) = 0;
virtual File* OpenFile(char* file) = 0;
...
static Folder* Create(char* arg);
};
Folder.cc looks like this:
#include "Folder.h"
#include "FolderImpl.h"
Folder* Folder::Create(char* arg)
{
return new FolderImpl(arg);
}
And FolderImpl.h is:
#include "Folder.h"
#include <vector>
#include <boost/filesystem.hpp>
class FolderImpl : public Folder
{
public:
FolderImpl(char* arg);
bool DoesFileExist(char* file) override;
File* OpenFile(char* file) override;
...
private:
std::vector<boost::filesystem::path> files;
};
At the cost of one level of indirection, you could consider doing something like this:
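#ifndef FOLDER_H
#define FOLDER_H
#include <memory>
struct FolderPrivateVars;
class Folder
{
public:
Folder(char* arg);
~Folder(); // defined in folder.cc, where FolderPrivateVars is a complete type
private:
std::unique_ptr <FolderPrivateVars> private_vars;
};
#endif
And then in folder.cc
#include "Folder.h"
#include <vector>
#include <boost/filesystem.hpp>
struct FolderPrivateVars
{
std::vector<boost::filesystem::path> files;
};
Folder::Folder(char* arg) : private_vars (std::make_unique <FolderPrivateVars> ())
{
...
}
// std::unique_ptr needs the complete type to destroy it, so the destructor lives here
Folder::~Folder() = default;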
Note that this approach hides all of Folder's private variables from prying eyes, which would (for example) mean that modules using it would not need to be recompiled if these change. It might, however, have implications if you want to inherit from Folder.

Better way of using an opaque pointer for Pimpl

I'm writing a C++ wrapper library around a number of different hardware libraries for embedded systems (firmware level), using various libraries from different vendors (C or C++). The API exposed by the header files should be vendor agnostic: no vendor headers are included in any of my header files.
A common pattern I have is making the vendor member data opaque, by only using a pointer to some "unknown" vendor struct/class/typedef/pod type.
// myclass.h
class MyClass
{
...
private:
VendorThing* vendorData;
};
and implementation (note: each implementation is vendor specific; all have the same *.h file)
// myclass_for_vendor_X.cpp
#include "vendor.h"
... {
vendorData->doSomething();
or
VendorAPICall(vendorData,...);
or whatever
The problem I have is that VendorThing can be lots of different things. It could be a class, struct, type or pod. I don't know, and I don't want to care in the header file. But if you pick the wrong one, then it doesn't compile if the vendor header file is included as well as my header file. For example, if this is the actual declaration of VendorThing in "vendor.h":
typedef struct { int a; int b; } VendorThing;
Then you can't just forward-declare VendorThing as class VendorThing;. I don't care about what the type of VendorThing is at all, all I want is the public interface to think of it as void * (i.e allocate space for a pointer and that is it), and the implementation think of it using the correct pointer type.
I have come across two solutions. The first is the "d-pointer" method found in Qt, where you add a level of indirection by replacing VendorThing with a new struct VendorThingWrapper
// myclass.h
struct VendorThingWrapper;
class MyClass
{
...
private:
VendorThingWrapper* vendorDataWrapper;
};
and in your cpp file
// myclass.cpp
#include "vendor.h"
struct VendorThingWrapper {
VendorThing* vendorData;
};
... {
vendorDataWrapper->vendorData->doSomething();
}
but this adds a second pointer dereference, which is not a huge deal, but as this is targeting embedded systems, I don't want to add that overhead just because the language can't do what I want.
The other option is to just declare it void*
// myclass.h
class MyClass
{
...
private:
void* vendorDataUntyped;
};
and in the implementation
//myclass.cpp
#include "vendor.h"
#define vendorData ((VendorThing*)vendorDataUntyped)
... {
vendorData->doSomething();
}
but #define's always leave a bad taste in my mouth. There must be something better.
You can avoid the additional pointer dereference by using:
#include "vendor.h"
struct VendorThingWrapper : public VendorThing {};
Of course, at that point, it makes more sense to use the name MyClassData instead of VendorThingWrapper.
MyClass.h:
struct MyClassData;
class MyClass
{
public:
MyClass();
~MyClass();
private:
MyClassData* myClassData;
};
MyClass.cpp:
#include "MyClass.h"
#include "vendor.h"
struct MyClassData : public VendorThing {};
MyClass::MyClass() : myClassData(new MyClassData())
{
}
MyClass::~MyClass()
{
delete myClassData;
}
Update
I was able to compile and build the following program. The unnamed struct is not a problem.
struct MyClassData;
class MyClass
{
public:
MyClass();
~MyClass();
private:
MyClassData* myClassData;
};
typedef struct { int a; int b; } VendorThing;
struct MyClassData : public VendorThing
{
};
MyClass::MyClass() : myClassData(new MyClassData())
{
myClassData->a = 10;
myClassData->b = 20;
}
MyClass::~MyClass()
{
delete myClassData;
}
int main() {}
If you are willing to go the route of the VendorThingWrapper, then you simply allow the wrapper to contain the data itself, rather than a pointer to it. This gives you the abstraction layer and avoids the extra dereference.
// myclass.cpp
#include "vendor.h"
struct VendorThingWrapper {
VendorThing vendorData;
};
... {
vendorDataWrapper->vendorData.doSomething();
}
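Putting the by-value wrapper together, a rough single-file sketch could look like the following. The VendorThing typedef is the unnamed-struct example from the question, standing in for "vendor.h"; sum() is a hypothetical operation added just to show the access path.

```cpp
#include <cassert>

// ---- myclass.h: no vendor header, only an incomplete wrapper type ----
struct VendorThingWrapper;

class MyClass {
public:
    MyClass();
    ~MyClass();
    MyClass(const MyClass&) = delete;
    MyClass& operator=(const MyClass&) = delete;
    int sum() const;  // hypothetical operation, for illustration only
private:
    VendorThingWrapper* vendorDataWrapper;
};

// ---- stand-in for "vendor.h" (the unnamed-struct case from the question) ----
typedef struct { int a; int b; } VendorThing;

// ---- myclass_for_vendor_X.cpp ----
struct VendorThingWrapper {
    VendorThing vendorData;  // held by value: one pointer hop from MyClass
};

MyClass::MyClass() : vendorDataWrapper(new VendorThingWrapper()) {
    vendorDataWrapper->vendorData.a = 10;
    vendorDataWrapper->vendorData.b = 20;
}
MyClass::~MyClass() { delete vendorDataWrapper; }
int MyClass::sum() const {
    const VendorThing& v = vendorDataWrapper->vendorData;
    return v.a + v.b;
}

// Example use
inline void vendor_wrapper_demo() {
    MyClass m;
    assert(m.sum() == 30);
}
```

Because the header only ever names VendorThingWrapper, it does not matter whether the vendor declares VendorThing as a class, a struct or a typedef.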

C++ Method declaration using another class

I'm starting to learn C++ (coming from Java), so bear with me.
I can't seem to get my method declaration to accept a class I've made.
'Context' has not been declared
I think I'm not understanding a fundamental concept, but I don't know what.
Expression.h
#include "Context.h"
class Expression {
public:
void interpret(Context *); // This line has the error
Expression();
virtual ~Expression();
};
Context.h
#include <stack>
#include <vector>
#include "Expression.h"
class Context {
private:
std::stack<Expression*,std::vector<Expression*> > theStack;
public:
Context();
virtual ~Context();
};
You have to forward declare Expression in Context or vice versa (or both), otherwise you have a cyclic dependency. For example,
Expression.h:
class Context; // no include, we only have Context*.
class Expression {
public:
void interpret(Context *); // OK: Context is forward-declared above
Expression();
virtual ~Expression();
};
Context.h:
#include <stack>
#include <vector>
class Expression; // No include, we only have Expression*
class Context {
private:
std::stack<Expression*,std::vector<Expression*> > theStack;
public:
Context();
virtual ~Context();
};
You can perform the forward declarations because the full definition of the classes isn't needed, since you are only referring to pointers to the other class in each case. It is likely that you will need the includes in the implementation files (that is, #include "Context.h" in Expression.cpp and #include "Expression.h" in Context.cpp).
Finally, remember to put include guards in your header files.
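As a rough illustration of the above, here is a single-file sketch with the three files marked by comments. push and depth are hypothetical helpers added so there is something to call; they are not part of the question's classes.

```cpp
#include <cassert>
#include <cstddef>
#include <stack>
#include <vector>

// ---- Expression.h ----
class Context;  // forward declaration: we only use Context*

class Expression {
public:
    Expression() {}
    virtual ~Expression() {}
    void interpret(Context* ctx);
};

// ---- Context.h ----
class Expression;  // redundant in a single file, needed when this is its own header

class Context {
public:
    void push(Expression* e) { theStack.push(e); }         // hypothetical helper
    std::size_t depth() const { return theStack.size(); }  // hypothetical helper
private:
    std::stack<Expression*, std::vector<Expression*> > theStack;
};

// ---- Expression.cpp: would #include "Context.h", so Context is complete here ----
void Expression::interpret(Context* ctx) { ctx->push(this); }

// Example use
inline void forward_decl_demo() {
    Context ctx;
    Expression e;
    e.interpret(&ctx);
    assert(ctx.depth() == 1);
}
```

Only the .cpp file needs the full definition of Context, because that is the only place a member of Context is actually called.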
In C++, class definitions always have to end with a semicolon (;), for example:
class foo {};
Java and C# don't require that, so I can see your confusion.
Also, it looks like both your header files include each other. It's kind of like a snake eating its tail: where does it start? In your Expression.h you can replace the 'include' with a forward declaration instead:
class Context;
class Expression {
public:
void interpret(Context *); // OK: Context is forward-declared above
Expression();
virtual ~Expression();
};
And last but not least, you should add an include guard to prevent the header from being included more than once into a .cpp file. You can put a #pragma once at the top of the header file; it is not part of the C++ standard, but Visual Studio, GCC and Clang all support it. Or you can wrap your header file like this:
#ifndef EXPRESSION_H_
#define EXPRESSION_H_
class Context;
class Expression {
public:
void interpret(Context *); // OK: Context is forward-declared above
Expression();
virtual ~Expression();
};
#endif
You might need to forward declare the classes Context and Expression in the header files before the #include, e.g.
#include <stack>
#include <vector>
// forward declaration
class Context;
class Expression;
#include "Expression.h"
class Context {
private:
std::stack<Expression*,std::vector<Expression*> > theStack;
public:
Context();
virtual ~Context();
};

How to avoid #include dependency to external library

If I'm creating a static library with a header file such as this:
// Myfile.h
#include "SomeHeaderFile.h" // External library
class MyClass
{
// My code
};
Within my own project I can tell the compiler (in my case, Visual Studio) where to look for SomeHeaderFile.h. However, I don't want my users to be concerned with this - they should be able to include my header without having to inform their compiler about the location of SomeHeaderFile.h.
How is this type of situation normally handled?
This is a classic "compilation firewall" scenario. There are two simple solutions:
1. Forward-declare any classes or functions that you need from the external library, then include the external library's header file only within your cpp file (where you actually need to use the classes or functions that you forward-declared in your header).
2. Use the PImpl idiom (or Cheshire Cat), where you forward-declare an "implementation" class that you declare and define only privately (in the cpp file). You put all the external-library-dependent code in that private class to avoid having any traces of it in your public class (the one declared in your header file).
Here is an example using the first option:
#ifndef MY_LIB_MY_HEADER_H
#define MY_LIB_MY_HEADER_H
class some_external_class; // forward-declare external dependency.
class my_class {
public:
// ...
void someFunction(some_external_class& aRef); // declare members using the forward-declared incomplete type.
};
#endif
// in the cpp file:
#include "my_header.h"
#include "some_external_header.h"
void my_class::someFunction(some_external_class& aRef) {
// here, you can use all that you want from some_external_class.
}
Here is an example of option 2:
#ifndef MY_LIB_MY_HEADER_H
#define MY_LIB_MY_HEADER_H
#include <memory>
class my_class_impl; // forward-declare private "implementation" class.
class my_class {
private:
std::unique_ptr<my_class_impl> pimpl; // a vanishing facade...
public:
my_class();
~my_class(); // defined in the cpp file, where my_class_impl is complete.
// ...
};
#endif
// in the cpp file:
#include "my_header.h"
#include "some_external_header.h"
class my_class_impl {
private:
some_external_class obj;
// ...
public:
// some functions ...
};
my_class::my_class() : pimpl(new my_class_impl()) { }
my_class::~my_class() = default; // unique_ptr can destroy my_class_impl here.
Say the external header file contains the following:
external.h
class foo
{
public:
foo();
};
And in your library you use foo:
myheader.h:
#include "external.h"
class bar
{
...
private:
foo* _x;
};
To get your code to compile, all you have to do is to forward declare the foo class (after that you can remove the include):
class foo;
class bar
{
...
private:
foo* _x;
};
You would then have to include external.h in your source file.
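One detail to watch with this approach: whatever deletes the foo must also be compiled where foo is complete, since deleting through a pointer to an incomplete type is undefined behaviour if the class has a non-trivial destructor. A small single-file sketch, where the member n and the accessor value() are hypothetical additions for demonstration:

```cpp
#include <cassert>

// ---- myheader.h ----
class foo;  // forward declaration instead of #include "external.h"

class bar {
public:
    bar();
    ~bar();                        // only declared here
    bar(const bar&) = delete;
    bar& operator=(const bar&) = delete;
    int value() const;             // hypothetical accessor
private:
    foo* _x;
};

// ---- stand-in for external.h ----
class foo {
public:
    foo() : n(42) {}
    int n;  // hypothetical member, not in the original external.h
};

// ---- mysource.cpp: would #include "myheader.h" and "external.h" ----
bar::bar() : _x(new foo()) {}
bar::~bar() { delete _x; }         // foo is complete here, so the delete is well-defined
int bar::value() const { return _x->n; }

// Example use
inline void bar_demo() {
    bar b;
    assert(b.value() == 42);
}
```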

How does the pimpl idiom reduce dependencies?

Consider the following:
PImpl.hpp
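class Impl;
class PImpl
{
Impl* pimpl;
public:
PImpl();
~PImpl();
void DoSomething();
};
PImpl.cpp
#include "PImpl.hpp"
#include "Impl.hpp"
// new and delete need the complete type, so these are defined here rather than in the header
PImpl::PImpl() : pimpl(new Impl) { }
PImpl::~PImpl() { delete pimpl; }
void PImpl::DoSomething() { pimpl->DoSomething(); }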
Impl.hpp
class Impl
{
int data;
public:
void DoSomething() {}
};
client.cpp
#include "PImpl.hpp"
int main()
{
PImpl unitUnderTest;
unitUnderTest.DoSomething();
}
The idea behind this pattern is that Impl's interface can change, yet clients do not have to be recompiled. Yet, I fail to see how this can truly be the case. Let's say I wanted to add a method to this class -- clients would still have to recompile.
Basically, the only kinds of changes like this that I can see ever needing to change the header file for a class for are things for which the interface of the class changes. And when that happens, pimpl or no pimpl, clients have to recompile.
What kinds of editing here give us benefits in terms of not recompiling client code?
The main advantage is that the clients of the interface aren't forced to include the headers for all your class's internal dependencies. So any changes to those headers don't cascade into a recompile of most of your project. Plus general idealism about implementation-hiding.
Also, you wouldn't necessarily put your impl class in its own header. Just make it a struct inside the single cpp and make your outer class reference its data members directly.
Edit: Example
SomeClass.h
struct SomeClassImpl;
class SomeClass {
SomeClassImpl * pImpl;
public:
SomeClass();
~SomeClass();
int DoSomething();
};
SomeClass.cpp
#include "SomeClass.h"
#include "OtherClass.h"
#include <vector>
struct SomeClassImpl {
int foo;
std::vector<OtherClass> otherClassVec; //users of SomeClass don't need to know anything about OtherClass, or include its header.
};
SomeClass::SomeClass() { pImpl = new SomeClassImpl; }
SomeClass::~SomeClass() { delete pImpl; }
int SomeClass::DoSomething() {
pImpl->otherClassVec.push_back(0);
return pImpl->otherClassVec.size();
}
There have been a number of answers... but no correct implementation so far. I am somewhat saddened that the examples are incorrect, since people are likely to use them...
The "Pimpl" idiom is short for "Pointer to Implementation" and is also referred to as "Compilation Firewall". And now, let's dive in.
1. When is an include necessary?
When you use a class, you need its full definition only if:
- you need its size (e.g. it is an attribute of your class)
- you need to access one of its methods
If you only reference it or have a pointer to it, then, since the size of a reference or pointer does not depend on the type referenced / pointed to, you need only declare the identifier (forward declaration).
Example:
#include "a.h"
#include "b.h"
#include "c.h"
#include "d.h"
#include "e.h"
#include "f.h"
struct Foo
{
Foo();
A a;
B* b;
C& c;
static D d;
friend class E;
void bar(F f);
};
In the above example, which includes are "convenience" includes and could be removed without affecting the correctness? Most surprisingly: all but "a.h".
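To make that concrete, here is a small sketch that compiles with B through F only forward-declared at the point where Foo is defined (the A member is left out, since that is the one that really needs its header). The struct bodies and the numbers are made up for the example.

```cpp
#include <cassert>

// ---- header part: forward declarations only, none of these types is complete ----
struct B; struct C; struct D; struct E; struct F;

struct Foo {
    explicit Foo(C& cref);
    B* b;             // pointer to an incomplete type: fine
    C& c;             // reference to an incomplete type: fine
    static D d;       // declaration (not definition) of a static member: fine
    friend struct E;  // friend declaration: fine
    int bar(F f);     // by-value parameter in a declaration: fine
};

// ---- source part: full definitions are needed only where members are used ----
struct C { int value; };
struct D { int value; };
struct F { int value; };
// B and E are never defined at all in this sketch.

D Foo::d = {7};  // defining the static member needs D to be complete
Foo::Foo(C& cref) : b(nullptr), c(cref) {}
int Foo::bar(F f) { return f.value + c.value + d.value; }

// Example use
inline int incomplete_types_demo() {
    C c = {1};
    Foo foo(c);
    F f = {2};
    int result = foo.bar(f);  // 2 + 1 + 7
    assert(result == 10);
    return result;
}
```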
2. Implementing Pimpl
Therefore, the idea of Pimpl is to use a pointer to the implementation class, so as not to need to include any header:
- thus isolating the client from the dependencies
- thus preventing compilation ripple effect
An additional benefit: the ABI of the library is preserved.
For ease of use, the Pimpl idiom can be used with a "smart pointer" management style:
#include <cassert> // assert
#include <utility> // std::swap
// From Ben Voigt's remark
// information at:
// http://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Checked_delete
template<class T>
inline void checked_delete(T * x)
{
typedef char type_must_be_complete[ sizeof(T)? 1: -1 ];
(void) sizeof(type_must_be_complete);
delete x;
}
template <typename T>
class pimpl
{
public:
pimpl(): m(new T()) {}
pimpl(T* t): m(t) { assert(t && "Null Pointer Unauthorized"); }
pimpl(pimpl const& rhs): m(new T(*rhs.m)) {}
pimpl& operator=(pimpl const& rhs)
{
T* tmp = new T(*rhs.m); // copy may throw: Strong Guarantee (m is untouched if it does)
checked_delete(m);
m = tmp;
return *this;
}
~pimpl() { checked_delete(m); }
void swap(pimpl& rhs) { std::swap(m, rhs.m); }
T* operator->() { return m; }
T const* operator->() const { return m; }
T& operator*() { return *m; }
T const& operator*() const { return *m; }
T* get() { return m; }
T const* get() const { return m; }
private:
T* m;
};
template <typename T> class pimpl<T*> {};
template <typename T> class pimpl<T&> {};
template <typename T>
void swap(pimpl<T>& lhs, pimpl<T>& rhs) { lhs.swap(rhs); }
What does it have that the others didn't ?
It simply obeys the Rule of Three: defining the Copy Constructor, Copy Assignment Operator and Destructor.
It does so implementing the Strong Guarantee: if the copy throws during an assignment, then the object is left unchanged. Note that the destructor of T should not throw... but then, that is a very common requirement ;)
Building on this, we can now define Pimpl'ed classes somewhat easily:
class Foo
{
public:
private:
struct Impl;
pimpl<Impl> mImpl;
}; // class Foo
Note: the compiler cannot generate a correct constructor, copy constructor, copy assignment operator or destructor here, because doing so would require access to the definition of Impl. Therefore, despite the pimpl helper, you will need to define those four manually. However, thanks to the pimpl helper, compilation will fail instead of dragging you into the land of undefined behavior.
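As an illustration of those four members, here is a hand-written sketch that uses a plain pointer instead of the pimpl helper, so it stays self-contained. set and get are hypothetical members, and the assignment uses copy-and-swap, which is one common way (not the only one) to get the Strong Guarantee.

```cpp
#include <cassert>
#include <utility>

// ---- foo.h ----
class Foo {
public:
    Foo();
    Foo(const Foo& rhs);
    Foo& operator=(const Foo& rhs);
    ~Foo();
    void set(int v);  // hypothetical member
    int get() const;  // hypothetical member
private:
    struct Impl;      // defined in foo.cpp only
    Impl* mImpl;
};

// ---- foo.cpp ----
struct Foo::Impl {
    int value;
    Impl() : value(0) {}
};

Foo::Foo() : mImpl(new Impl()) {}
Foo::Foo(const Foo& rhs) : mImpl(new Impl(*rhs.mImpl)) {}  // deep copy
Foo& Foo::operator=(const Foo& rhs) {
    Foo tmp(rhs);                 // may throw; *this is untouched if it does
    std::swap(mImpl, tmp.mImpl);  // cannot throw; tmp now owns the old Impl
    return *this;
}
Foo::~Foo() { delete mImpl; }     // Impl is complete here
void Foo::set(int v) { mImpl->value = v; }
int Foo::get() const { return mImpl->value; }

// Example use: copies are independent of the original
inline void rule_of_three_demo() {
    Foo a;
    a.set(5);
    Foo b(a);
    b.set(9);
    assert(a.get() == 5);
    Foo c;
    c = b;
    assert(c.get() == 9);
}
```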
3. Going Further
It should be noted that the presence of virtual functions is often seen as an implementation detail. One of the advantages of Pimpl is that we have the correct framework in place to leverage the power of the Strategy Pattern.
Doing so requires that the "copy" of pimpl be changed:
// pimpl.h
template <typename T>
pimpl<T>::pimpl(pimpl<T> const& rhs): m(rhs.m->clone()) {}
template <typename T>
pimpl<T>& pimpl<T>::operator=(pimpl<T> const& rhs)
{
T* tmp = rhs.m->clone(); // copy may throw: Strong Guarantee (m is untouched if it does)
checked_delete(m);
m = tmp;
return *this;
}
And then we can define our Foo like so
// foo.h
#include "pimpl.h"
namespace detail { class FooBase; }
class Foo
{
public:
enum Mode {
Easy,
Normal,
Hard,
God
};
Foo(Mode mode);
// Others
private:
pimpl<detail::FooBase> mImpl;
};
// Foo.cpp
#include "foo.h"
#include "detail/fooEasy.h"
#include "detail/fooNormal.h"
#include "detail/fooHard.h"
#include "detail/fooGod.h"
Foo::Foo(Mode m): mImpl(FooFactory::Get(m)) {}
Note that the ABI of Foo is completely unconcerned by the various changes that may occur:
there is no virtual method in Foo
the size of mImpl is that of a simple pointer, whatever it points to
Therefore your client need not worry about a particular patch that would add either a method or an attribute and you need not worry about the memory layout etc... it just naturally works.
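A compact, self-contained sketch of the Strategy variant is shown below. It uses a plain pointer rather than the pimpl helper, only two modes, and a hypothetical makeImpl function standing in for FooFactory::Get; damage() and the values it returns are likewise invented for the example.

```cpp
#include <cassert>

// ---- foo.h ----
namespace detail { class FooBase; }

class Foo {
public:
    enum Mode { Easy, Hard };
    explicit Foo(Mode mode);
    Foo(const Foo& rhs);
    Foo& operator=(const Foo& rhs);
    ~Foo();
    int damage() const;  // hypothetical operation forwarded to the strategy
private:
    detail::FooBase* mImpl;  // same size whatever it points to
};

// ---- foo.cpp ----
namespace detail {
class FooBase {
public:
    virtual ~FooBase() {}
    virtual FooBase* clone() const = 0;  // polymorphic copy
    virtual int damage() const = 0;
};
class FooEasy : public FooBase {
public:
    FooBase* clone() const override { return new FooEasy(*this); }
    int damage() const override { return 1; }
};
class FooHard : public FooBase {
public:
    FooBase* clone() const override { return new FooHard(*this); }
    int damage() const override { return 10; }
};
// Hypothetical stand-in for FooFactory::Get.
FooBase* makeImpl(Foo::Mode mode) {
    if (mode == Foo::Easy) return new FooEasy();
    return new FooHard();
}
}  // namespace detail

Foo::Foo(Mode mode) : mImpl(detail::makeImpl(mode)) {}
Foo::Foo(const Foo& rhs) : mImpl(rhs.mImpl->clone()) {}
Foo& Foo::operator=(const Foo& rhs) {
    detail::FooBase* tmp = rhs.mImpl->clone();  // may throw; *this unchanged if so
    delete mImpl;
    mImpl = tmp;
    return *this;
}
Foo::~Foo() { delete mImpl; }
int Foo::damage() const { return mImpl->damage(); }

// Example use: copying keeps the dynamic type of the strategy
inline void strategy_demo() {
    Foo easy(Foo::Easy);
    Foo hard(Foo::Hard);
    Foo copy(hard);
    assert(easy.damage() == 1);
    assert(copy.damage() == 10);
    copy = easy;
    assert(copy.damage() == 1);
}
```

Note that Foo itself has no virtual functions; all the polymorphism sits behind the pointer, in the source file.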
With the PIMPL idiom, if the internal implementation details of the IMPL class change, the clients do not have to be rebuilt. Any change in the interface of the IMPL class (and hence its header file) would obviously require the PIMPL class to change.
BTW,
In the code shown, there is a strong coupling between IMPL and PIMPL. So any change in class implementation of IMPL also would cause a need to rebuild.
Consider something more realistic and the benefits become more notable. Most of the time that I have used this for compiler firewalling and implementation hiding, I define the implementation class within the same compilation unit that the visible class is in. In your example, I wouldn't have Impl.h or Impl.cpp and Pimpl.cpp would look something like:
#include <iostream>
#include <boost/thread.hpp>
class Impl {
public:
Impl(): data(0) {}
void setData(int d) {
boost::lock_guard<boost::mutex> l(lock);
data = d;
}
int getData() {
boost::lock_guard<boost::mutex> l(lock);
return data;
}
void doSomething() {
int d = getData(); // read once, under the lock
std::cout << d << std::endl;
}
private:
int data;
boost::mutex lock;
};
Pimpl::Pimpl(): pimpl(new Impl) {
}
void Pimpl::doSomething() {
pimpl->doSomething();
}
Now no one needs to know about our dependency on boost. This gets more powerful when mixed together with policies. Details like threading policies (e.g., single vs multi) can be hidden by using variant implementations of Impl behind the scenes. Also notice that there are a number of additional methods available in Impl that aren't exposed. This also makes this technique good for layering your implementation.
In your example, you can change the implementation of data without having to recompile the clients. This would not be the case without the PImpl intermediary. Likewise, you could change the signature or name of Impl::DoSomething (to a point), and the clients wouldn't have to know.
In general, anything that can be declared private (the default) or protected in Impl can be changed without recompiling the clients.
In non-Pimpl class headers the .hpp file defines the public and private components of your class all in one big bucket.
Privates are closely coupled to your implementation, so this means your .hpp file really can give away a lot about your internal implementation.
Consider something like the threading library you choose to use privately inside the class. Without using Pimpl, the threading classes and types might be encountered as private members or parameters on private methods. Ok, a thread library might be a bad example but you get the idea: The private parts of your class definition should be hidden away from those who include your header.
That's where Pimpl comes in. Since the public class header no longer defines the "private parts" but instead has a Pointer to Implementation, your private world remains hidden from logic which "#include"s your public class header.
When you change your private methods (the implementation), you are changing the stuff hidden beneath the Pimpl and therefore clients of your class don't need to recompile because from their perspective nothing has changed: They no longer see the private implementation members.
http://www.gotw.ca/gotw/028.htm
Not all classes benefit from p-impl. Your example has only primitive types in its internal state which explains why there's no obvious benefit.
If any of the members had complex types declared in another header, you can see that p-impl moves the inclusion of that header from your class's public header to the implementation file, since you form a raw pointer to an incomplete type (but not an embedded field nor a smart pointer). You could just use raw pointers to all your member variables individually, but using a single pointer to all the state makes memory management easier and improves data locality (well, there's not much locality if all those types use p-impl in turn).