How to share classes between DLLs - c++

I have an unmanaged Win32 C++ application that uses multiple C++ DLLs. The DLLs each need to use class Foo - definition and implementation.
Where do Foo.h and Foo.cpp live so that the DLLs link and don't end up duplicating code in memory?
Is this a reasonable thing to do?
[Edit]
There is a lot of good info in all the answers and comments below - not just the one I've marked as the answer. Thanks for everyone's input.

Providing functionality in the form of classes via a DLL is itself fine. You need to be careful to separate the interface from the implementation, however. How careful depends on how your DLL will be used. For toy projects or utilities that remain internal, you may not even need to think about it. For DLLs that will be used by multiple clients under who-knows-which compiler, you need to be very careful.
Consider:
class MyGizmo
{
public:
    std::string get_name() const;

private:
    std::string name_;
};
If MyGizmo is going to be used by 3rd parties, this class will cause you no end of headaches. Obviously, the private member variables are a problem, but the return type for get_name() is just as much of a problem. The reason is that std::string's implementation details are part of its definition. The Standard dictates a minimum set of functionality for std::string, but compiler writers are free to implement it however they choose. One might have a function named realloc() to handle internal reallocation, while another may have a function named buy_node() or something. The same is true of data members. One implementation may use three size_t's and a char*, while another might use a std::vector. The point is that your compiler might think std::string is n bytes with such-and-so members, while another compiler (or even another patch level of the same compiler) might think it looks totally different.
One solution to this is to use interfaces. In your DLL's public header, you declare an abstract class representing the useful facilities your DLL provides, and a means to create the class, such as:
DLL.H :
class MyGizmo
{
public:
    static MyGizmo* Create();
    virtual void get_name(char* buffer_alloc_by_caller, size_t size_of_buffer) const = 0;
    virtual ~MyGizmo();

protected:
    MyGizmo(); // only implementations inside the DLL can construct this
};
...and then in your DLL's internals, you define a class that actually implements MyGizmo:
mygizmo.cpp :
class MyConcreteGizmo : public MyGizmo
{
public:
    void get_name(char* buf, size_t sz) const { /*...*/ }
    ~MyConcreteGizmo() { /*...*/ }

private:
    std::string name_;
};

MyGizmo* MyGizmo::Create()
{
    return new MyConcreteGizmo;
}
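For illustration (this client code is not part of the original answer), usage from the EXE side might look like the sketch below. Whether a plain delete is acceptable across the boundary relies on the virtual destructor dispatching into the DLL, which is why many interfaces expose an explicit Destroy() function instead:
#include "DLL.H"
#include <cstdio>

int main()
{
    MyGizmo* gizmo = MyGizmo::Create();  // the object is allocated inside the DLL

    char name[64] = {};
    gizmo->get_name(name, sizeof(name)); // the caller owns the buffer; the DLL only fills it
    std::printf("%s\n", name);

    delete gizmo;                        // dispatches through the virtual destructor into the DLL
    return 0;
}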
This might seem like a pain and, well, it is. If your DLL is only ever going to be built and consumed internally with a single compiler, there may be no reason to go to the trouble. But if your DLL is going to be used by multiple compilers internally, or by external clients, doing this saves major headaches down the road.

Use __declspec(dllexport) to export the class to the DLL's export table, then include the header file in your other projects and link against the main DLL's import library (.lib). That way the implementation is common.
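A minimal sketch of the usual setup (the FOO_API and FOO_EXPORTS macro names are placeholders, not part of the original answer):
// foo_api.h - shared by the DLL project and its clients
#ifdef FOO_EXPORTS                      // defined only in the DLL project settings
#  define FOO_API __declspec(dllexport)
#else
#  define FOO_API __declspec(dllimport)
#endif

// foo.h
#include "foo_api.h"

class FOO_API Foo
{
public:
    void bar();                         // compiled once, into the main DLL
};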

Where does Foo live? In another DLL.
Is it reasonable? Not really.
If you declare a class like this:
class __declspec(dllexport) Foo { ...
then MSVC will export every member function of the class. However, the resulting DLL is very fragile: any small change to the class definition without a corresponding rebuild of every consuming DLL means the consuming code will allocate the wrong number of bytes for any stack and heap allocations not performed by factory functions. Likewise, inline methods compiled into consuming DLLs will reference the old layout of the class.
If all the DLLs are always rebuilt together, then go ahead. If not - don't :P

Related

Correctly defining DLL-interfaces with C++11/14

I've read several times that passing STL objects like vector and string outside of a DLL boundary is bad practice because different compiler versions can generate different code for STL objects. Therefore, you should design a C-style interface and not pass STL objects at all. However, there are still some things unclear to me:
1. What is the 'boundary' of a DLL?
Is it right to say that the boundary is where code is being compiled on the DLL side? What if I define a .h file inside a DLL (e.g. to write a factory class) and use that header file in a different project? Is that .h file inside or outside the boundary of the DLL, and why?
2. What is contained in a DLL?
Let's say I have a class Foo:
class Foo
{
public:
    __declspec(dllexport) void f1(); // instantiates v1 inside the function
private:
    std::unique_ptr<std::vector<int>> v1 = nullptr;
};
If I only mark the function f1() with __declspec(dllexport), only this function should be contained in the DLL. How does the code inside f1() know what v1 is if v1 isn't contained in the DLL?
3. Passing objects out of a DLL-boundary using unique_ptr
I'm using unique_ptr almost everywhere in my project. From what I understand, returning a unique_ptr from a DLL would be bad practice because unique_ptr is an STL object. How can I instantiate an object inside the DLL and return a unique_ptr to it?
4. Why does defining interfaces or using PIMPL help to define a DLL interface?
I still have to convert my STL classes to C-style objects. And in the project using the DLL, I would have to somehow wrap the C-style objects inside STL classes again. I don't see any advantage of using interfaces or PIMPL in this case.
Also, if I define an interface (class with pure virtual functions), wouldn't this have the same effect as just declaring the functions in my class with __declspec(dllexport)?
class IFoo
{
public:
    virtual ~IFoo() {}
    virtual void f1() = 0;
};

class Foo : public IFoo
{
public:
    void f1();
    //__declspec(dllexport) void f1(); // why use an interface if I can just declare the functions like this?
};
How is the DLL-STL problematic solved in modern C++ 11/14 libraries? Are there any modern open-source libraries that I can have a look at?
Unfortunately STL types aren't consistent across compilers. Even different versions of Visual Studio have differences.
The boundary is where the code is compiled. If you have an implementation in a header file in your library, then the compiler used to compile the EXE will compile that code. This is potentially very bad, because what the code in the EXE thinks the data looks like can differ from what the code in the DLL thinks it looks like. (You need to look out for differences like this especially if you have #ifs in a struct definition, and you need to be explicit about packing.)
The only way to be sure is to define all your own types (being careful of packing) and not use STL. This is what DLL libraries usually do.
Interfaces enable the user to link to the library dynamically at runtime. Using __declspec(dllexport) requires implicit (load-time) linking: the EXE has to link against the .lib import library generated when you compiled the DLL in order to access all the functions. Among other things this means you can't update the DLL without the EXE having to be recompiled (probably - you can get away with it in some circumstances, but it's not a good idea).
By linking dynamically you can update the DLL or add functionality to it without relinking the EXE, as long as you don't change your interfaces. The EXE might call LoadLibrary() on the DLL and GetProcAddress() to access one function that returns an interface. Everything else, including the data types passed as parameters, is either an interface (i.e. contains only pure virtual functions) or a simple struct. This is how the basic level of COM works.
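A minimal sketch of that runtime-linking pattern (the DLL name, the CreateFoo entry point and the Release() method are placeholders invented for the example, not APIs defined by the answer):
// shared header: only a pure virtual interface crosses the boundary
struct IFoo
{
    virtual void f1() = 0;
    virtual void Release() = 0;        // destruction happens inside the DLL
};

// DLL side (exported with extern "C" so the name is not mangled):
//   extern "C" __declspec(dllexport) IFoo* CreateFoo();

// EXE side: bind to the factory at runtime
#include <windows.h>

typedef IFoo* (*CreateFooFn)();

int main()
{
    HMODULE dll = ::LoadLibraryA("foo.dll");
    if (!dll)
        return 1;

    CreateFooFn createFoo =
        reinterpret_cast<CreateFooFn>(::GetProcAddress(dll, "CreateFoo"));
    if (!createFoo)
        return 1;

    IFoo* foo = createFoo();
    foo->f1();
    foo->Release();                    // hand the object back to the DLL for deletion

    ::FreeLibrary(dll);
    return 0;
}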
To answer question 2, when you declare something as __declspec(dllexport) you are stating that this is part of the interface to the DLL - something that is accessible to the component that loads the DLL. Anything declared without __declspec(dllexport) should be present within the DLL but will not be available to be called/used by an external component.

Is it safe to use strings as private data members in a class used across a DLL boundary?

My understanding is that exposing functions that take or return STL containers (such as std::string) across DLL boundaries can cause problems due to differences in the STL implementations of those containers in the two binaries. But is it safe to export a class like:
class Customer
{
public:
    wchar_t * getName() const;

private:
    std::wstring mName;
};
Without some sort of hack, mName is not going to be usable by the executable, so it won't be able to execute methods on mName, nor construct/destruct this object.
My gut feeling is "don't do this, it's unsafe", but I can't figure out a good reason.
It is not a problem, because it is trumped by a bigger problem: you cannot create an object of that class in code that lives in a module other than the one that contains the code for the class. Code in another module cannot accurately know the required object size; its implementation of the std::string class may well be different, which, as declared, also affects the size of the Customer object. Even the same compiler cannot guarantee this - mixing optimized and debugging builds of these modules, for example. Albeit that this is usually pretty easy to avoid.
So you must create a class factory for Customer objects, a factory that lives in that same module. Which then automatically implies that any code that touches the "mName" member also lives in the same module. And is therefore safe.
The next step then is to not expose Customer at all, but to expose a pure abstract base class (aka interface) instead. Now you can prevent the client code from creating an instance of Customer and shooting itself in the foot. And you'll trivially hide the std::string as well. Interface-based programming techniques are common in module interop scenarios; it is also the approach taken by COM.
As long as the allocator and deallocator of instances of the class are built with the same settings, you should be OK, but you are right to avoid this.
Differences between the .exe and .dll in debug/release settings or code generation (Multi-threaded DLL vs. single-threaded) could cause problems in some scenarios.
I would recommend using abstract classes in the DLL interface with creation and deletion done solely inside the DLL.
Interfaces like:
class A {
protected:
    virtual ~A() {}
public:
    virtual void func() = 0;
};

// exported create/delete functions
A* create_A();
void destroy_A(A*);
DLL Implementation like:
class A_Impl : public A {
public:
    ~A_Impl() {}
    void func() { do_something(); }
};

A* create_A() { return new A_Impl; }

void destroy_A(A* a) {
    A_Impl* ai = static_cast<A_Impl*>(a);
    delete ai;
}
Should be ok.
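As a usage sketch from the EXE side (not part of the original answer): wrapping the raw pointer in a std::unique_ptr with a function-pointer deleter keeps destruction inside the DLL, which also addresses the unique_ptr concern raised in the earlier question:
#include <memory>

void client_code()
{
    // the deleter is a plain function pointer, so no DLL-internal type leaks out
    std::unique_ptr<A, void(*)(A*)> a(create_A(), destroy_A);
    a->func();
}   // destroy_A is called here, so the delete happens inside the DLL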
Even if your class has no data members, you cannot expect it to be usable from code compiled with a different compiler. There is no common ABI for C++ classes. You can expect differences in name mangling just for starters.
If you are prepared to constrain clients to use the same compiler as you, or provide source to allow clients to compile your code with their compiler, then you can do pretty much anything across your interface. Otherwise you should stick to C style interfaces.
If you want to provide an object oriented interface in a DLL that is truly safe, I would suggest building it on top of the COM object model. That's what it was designed for.
Any other attempt to share classes between code that is compiled by different compilers has the potential to fail. You may be able to get something that seems to work most of the time, but it can't be guaranteed to work.
The chances are that at some point you're going to be relying on undefined behaviour in terms of calling conventions or class structure or memory allocation.
The C++ standard does not say anything about the ABI provided by implementations. Even on a single platform changing the compiler options may change binary layout or function interfaces.
Thus to ensure that standard types can be used across DLL boundaries it is your responsibility to ensure that either:
Resource acquisition/release for standard types is done by the same DLL. (Note: you can have multiple CRTs in a process, but a resource acquired by crt1.dll must be released by crt1.dll.)
This is not specific to C++. In C for example malloc/free, fopen/fclose call pairs must each go to a single C runtime.
This can be done by either of the below:
By explicitly exporting acquisition/release functions (Photon's answer). In this case you are forced to use a factory pattern and abstract types - basically COM, or a COM clone.
Forcing a group of DLL's to link against the same dynamic CRT. In this case you can safely export any kind of functions/classes.
There are also two "potential bugs" (among others) you must take care of, since they relate to what is "under" the language.
The first is that std::string is a template, and hence it is instantiated in every translation unit. If they are all linked into the same module (EXE or DLL), the linker will resolve identical functions to the same code, and any inconsistent code (the same function with a different body) is treated as an error.
But if they are linked into different modules (an EXE and a DLL), the compiler and linker have nothing in common. So - depending on how the modules were compiled - you may have two different implementations of the same class with different members and memory layout (for example, one may have some debugging or profiling features added that the other has not). Accessing an object created on one side with methods compiled on the other side, if you have no other way to guarantee implementation consistency, may end in tears.
The second problem (more subtle) relates to allocation/deallocation of memory: because of the way Windows works, every module can have a distinct heap, and standard C++ does not specify how new and delete keep track of which heap an object came from. If a string buffer is allocated in one module and then moved into a string instance in another module, you risk (upon destruction) giving the memory back to the wrong heap. (It depends on how new/delete and malloc/free are implemented with respect to HeapAlloc/HeapFree - essentially, on how "aware" the STL implementation is of the underlying OS. The operation is not itself destructive - it just fails - but it leaks the origin heap's memory.)
All that said, it is not impossible to pass a container. It is just up to you to guarantee a consistent implementation on both sides, since the compiler and linker have no way to cross-check.
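To illustrate the matched acquisition/release point with a hedged, C-style sketch (foo_copy_name, foo_free and get_internal_name are invented names for the example; get_internal_name stands in for whatever state the DLL holds):
#include <cstdlib>
#include <cstring>
#include <string>

// exported from the DLL: acquisition and release both live on the same side
extern "C" __declspec(dllexport) char* foo_copy_name()
{
    const std::string name = get_internal_name();   // hypothetical internal state
    char* buffer = static_cast<char*>(std::malloc(name.size() + 1));
    if (buffer)
        std::memcpy(buffer, name.c_str(), name.size() + 1);
    return buffer;
}

extern "C" __declspec(dllexport) void foo_free(char* p)
{
    std::free(p);   // freed by the same CRT that allocated it
}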

Strategy for wrapping multiple libraries in C++

I have a class Foo which I do not implement directly, but which wraps external libraries (e.g. FooXternal1 or FooXternal2).
One way that I have seen to do this is using preprocessor directives, as in:
#include "config.h"
#include "foo.h"
#ifdef _FOOXTERNAL1_WRAPPER_
//implementation of class Foo using FooXternal1
#endif
#ifdef _FOOXTERNAL2_WRAPPER_
//implementation of class Foo using FooXternal2
#endif
and a config.h is used to define these preprocessor flags (_FOOXTERNAL1_WRAPPER_ and _FOOXTERNAL2_WRAPPER_).
I have the impression this is frowned upon by the C++ programmer community because it uses preprocessor directives, is hard to debug, etc. Further, it does not allow for the parallel existence of both implementations.
I thought about making Foo a base class and inheriting from it to allow for both implementations to exist in parallel with each other. But I ran into two problems:
Pure virtual functions: I cannot instantiate an object of type 'Foo', which I need during use.
Virtual functions run the risk of running an object with no (proper) implementation.
Am I missing something? Is there a cleaner way to do this?
EDIT: To summarize, there are 3(.5?!) ways of doing the wrapping - 2(.5) are given by icepack, and the last by Sergey:
1- Use factory methods
2- Use preprocessor directives
2.5- Use makefile or IDE to effectively do the work of the preprocessor directives
3.5- Use templates, suggested by Sergey
I am working on an embedded system where resources are limited, so I decided to use template<enum = default_library> with template specialization. It is easy to understand for later users; at least that's what I think.
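A minimal sketch of what that enum-specialization approach could look like (the Library enum, its values and the default are invented for illustration; the asker's actual code is not shown):
enum Library { FOOX1, FOOX2 };
const Library default_library = FOOX1;

template<Library = default_library>
class Foo;                      // primary template left undefined

template<>
class Foo<FOOX1>                // specialization wrapping FooXternal1
{
public:
    void doIt() { /* call into FooXternal1 */ }
};

template<>
class Foo<FOOX2>                // specialization wrapping FooXternal2
{
public:
    void doIt() { /* call into FooXternal2 */ }
};

Foo<> foo;                      // picks up the default library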
If all method names of external implementations are similar, you can use templates. Let external implementations look like:
class FooX1
{
public:
    void Met1()
    {
        std::cout << "X1\n";
    }
};

class FooX2
{
public:
    void Met1()
    {
        std::cout << "X2\n";
    }
};
Then you can use several variants.
Variant 1. You can declare a member of the template type and wrap all calls to the external implementation, even with some preparation before the call. Don't forget to delete impl in the destructor.
template<typename FooX>
class FooVariant1
{
public:
    FooVariant1()
    {
        impl = new FooX();
    }
    ~FooVariant1()
    {
        delete impl; // as noted above, impl must be deleted in the destructor
    }
    void Met1Wrapper()
    {
        impl->Met1();
    }
private:
    FooX *impl;
};
Usage:
FooVariant1<FooX1> bar;
bar.Met1Wrapper();
Variant 2. You can inherit from a template parameter. In this case you don't declare any members, but just call the implementation's methods by their names.
template<typename FooX>
class FooVariant2 : public FooX
{
};
Usage:
FooVariant2<FooX1> bar;
bar.Met1();
A disadvantage of using templates is that there is no easy way to change implementations at runtime. But in return you get more efficient code, because the types are generated at compile time and there is no virtual function table, which can slow the program down.
If you want the two implementations to coexist at runtime, an interface is the way to go (for example, you can use the factory method design pattern to instantiate the concrete object, as #n.m. has suggested).
If you can decide at compile time which implementation you need, you have several options:
Still use an interface. This allows an easy transition if in the future you need both implementations at runtime.
Use preprocessor directives. There is nothing wrong here as far as C++ is concerned. It's a pure design issue.
Put the implementations in different files and configure your compiler to compile either one of them according to settings - this is actually similar to using preprocessor directives, but it's cleaner and doesn't add garbage to your code (since the flags are in the solution/makefile/whatever your compiler uses).
The only thing I'd frown upon is including both implementations in the same source file. That might get confusing. Otherwise, this is one of the things preprocessor flags are good at, especially if you're not linking both libraries at the same time. It's just like supporting multiple operating systems. Provide a consistent interface in all cases and hide the implementation details somewhere else.
Does type Foo need to hold any information specific to each library? If not, you might be able to get away with this:
#include "Foo.h"
#if defined _FOOXTERNAL1_WRAPPER_
#include "Foo_impl1.cpp"
#elif defined _FOOXTERNAL2_WRAPPER_
#include "Foo_impl2.cpp"
#else
#error "Warn about a missing define here"
#endif
This way you don't have to bother with virtual functions or inheritance and you still prevent any member functions from going unimplemented.
Keep Foo abstract. Provide a factory method
Foo* MakeFoo();
that allocates a new object of either type FooImpl1 or FooImpl2, and returns its address.
Wikipedia on Factory Method pattern.
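A hedged sketch of such a factory (the answer does not show its body; the selection via the question's config.h flags is only one possible way to choose - the factory could equally decide at runtime):
#include "config.h"
#include "foo.h"

Foo* MakeFoo()
{
#ifdef _FOOXTERNAL1_WRAPPER_
    return new FooImpl1(); // concrete Foo backed by FooXternal1
#else
    return new FooImpl2(); // concrete Foo backed by FooXternal2
#endif
}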

Splitting long method maintaining class interface

In my library there's a class like this:
class Foo {
public:
    void doSomething();
};
Now, the implementation of doSomething() has grown a lot and I want to split it into two methods:
class Foo {
public:
    void doSomething();
private:
    void doSomething1();
    void doSomething2();
};
Where doSomething() implementation is this:
void Foo::doSomething() {
    this->doSomething1();
    this->doSomething2();
}
But now the class interface has changed. If I compile this library, all existing applications using this library won't work; the external linkage has changed.
How can I avoid breaking of binary compatibility?
I guess inlining solves this problem. Is that right? And is it portable? What happens if compiler optimization un-inlines these methods?
class Foo {
public:
    void doSomething();
private:
    inline void doSomething1();
    inline void doSomething2();
};

void Foo::doSomething1() {
    /* some code here */
}

void Foo::doSomething2() {
    /* some code here */
}

void Foo::doSomething() {
    this->doSomething1();
    this->doSomething2();
}
EDIT:
I tested this code before and after the method splitting and it seems to maintain binary compatibility. But I'm not sure this would work on every OS, with every compiler, and with more complex classes (with virtual methods, inheritance...). Sometimes I have had binary compatibility break after adding private methods like these, but now I don't remember the particular situation. Maybe it was due to the symbol table being looked up by index (as Steve Jessop notes in his answer).
Strictly speaking, changing the class definition at all (in either of the ways you show) is a violation of the One Definition Rule and leads to undefined behavior.
In practice, adding non-virtual member functions to a class maintains binary compatibility in every implementation out there, because if it didn't then you'd lose most of the benefits of dynamic libraries. But the C++ standard doesn't say much (anything?) about dynamic libraries or binary compatibility, so it doesn't guarantee what changes you can make.
So in practice, changing the symbol table doesn't matter provided that the dynamic linker looks up entries in the symbol table by name. There are more entries in the symbol table than before, but that's OK because all the old ones still have the same mangled names. It may be that with your implementation, private and/or inline functions (or any functions you specify) aren't dll-exported, but you don't need to rely on that.
I have used one system (Symbian) where entries in the symbol table were not looked up by name, they were looked up by index. On that system, when you added anything to a dynamic library you had to ensure that any new functions were added to the end of the symbol table, which you did by listing the required order in a special config file. You could ensure that binary compatibility wasn't broken, but it was fairly tedious.
So, you could check your C++ ABI or compiler/linker documentation to be absolutely sure, or just take my word for it and go ahead.
There is no problem here. The name mangling of Foo::doSomething() is always the same regardless of its implementation.
I think the ABI of the class won't change if you add non-virtual methods because non-virtual methods are not stored in the class object, but rather as functions with mangled names. You can add as many functions as you like as long as you don't add class members.

Hide class type in header

I'm not sure if this is even possible, but here goes:
I have a library whose interface is, at best, complex. Unfortunately, not only is it a 3rd-party library (and far too big to rewrite), I'm using a few other libraries that are dependent on it. So that interface has to stay how it is.
To solve that, I'm trying to essentially wrap the interface and bundle all the dependencies' interfaces into fewer, more logical classes. That part is going fine and works great. Most of the wrapper classes hold a pointer to an object of one of the original classes. Like so:
class Node
{
public:
    String GetName()
    {
        return this->llNode->getNodeName();
    }

private:
    OverlyComplicatedNodeClass * llNode; // low-level node
};
My only problem is the secondary point of this. Besides simplifying the interface, I'd like to remove the requirement of linking against the original headers/libraries.
That's the first difficulty. How can I wrap the classes in such a way that there's no need to include the original headers? The wrapper will be built as a shared-library (dll/so), if that makes it simpler.
The original classes are pointers and not used in any exported functions (although they are used in a few constructors).
I've toyed with a few ideas, including preprocessor stuff like:
#ifdef ACCESSLOWLEVEL
# define LLPtr(n) n *
#else
# define LLPtr(n) void *
#endif
Which is ugly, at best. It does what I need, basically, but I'd rather have a real solution than that kind of mess.
Some kind of pointer-type magic works, until I ran into a few functions that use shared pointers (some kind of custom SharedPtr<> class providing reference count) and worse yet, a few class-specific shared pointers derived from the basic SharedPtr class (NodePtr, for example).
Is it at all possible to wrap the original library in such a way as to require only my headers to be included in order to link to my dynamic library? No need to link to the original library or call functions from it, just mine. Only problem I'm running into are the types/classes that are used.
The question might not be terribly clear. I can try to clean it up and add more code samples if it helps. I'm not really worried about any performance overhead or anything of this method, just trying to make it work first (premature optimization and all that).
Use the Pimpl (pointer to implementation) idiom. As described, OverlyComplicatedNodeClass is an implementation detail as far as the users of your library are concerned. They should not have to know the structure of this class, or even its name.
When you use the Pimpl idiom, you replace the OverlyComplicatedNodeClass pointer in your class with a pointer to void. Only you, the library writer, need to know that the void* is actually an OverlyComplicatedNodeClass*. So your class declaration becomes:
class Node
{
public:
    String GetName();

private:
    void * impl;
};
In your library's implementation, initialize impl with a pointer to the class that does the real work:
my_lib.cpp
Node::Node()
    : impl(new OverlyComplicatedNodeClass)
{
    // ...
}
...and users of your library need never know that OverlyComplicatedNodeClass exists.
There's one potential drawback to this approach. All the code which uses the impl class must be implemented in your library. None of it can be inline. Whether this is a drawback depends very much on your application, so judge for yourself.
In the case of your class, you did have GetName()'s implementation in the header. That must be moved to the library, as with all other code that uses the impl pointer.
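For illustration, a hedged sketch of what the moved implementation might look like inside the library (the destructor is added here for completeness; it is not shown in the original answer):
// my_lib.cpp
String Node::GetName()
{
    // only the library knows what the void* really points to
    OverlyComplicatedNodeClass* node = static_cast<OverlyComplicatedNodeClass*>(impl);
    return node->getNodeName();
}

Node::~Node()
{
    delete static_cast<OverlyComplicatedNodeClass*>(impl);
}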
Essentially, you need a separate set of headers for each use. One that you use to build your DLL and one with only the exported interfaces, and no mention at all of the encapsulated objects. Your example would look like:
class Node
{
public:
    String GetName();
};
You can use preprocessor statements to get both versions in the same physical file if you don't mind the mess.
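A hedged sketch of that single-header variant (the BUILDING_MYLIB macro name is invented for the example; note that external code must then treat Node as an opaque type created and destroyed only by the library, since its size differs between the two views):
// node.h - one physical header, two views
class Node
{
public:
    String GetName();

#ifdef BUILDING_MYLIB                      // defined only while compiling the DLL itself
private:
    OverlyComplicatedNodeClass * llNode;   // invisible to external users of the header
#endif
};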