Dynamic libraries, plugin frameworks, and function pointer casting in C++

I am trying to create a very open plugin framework in c++, and it seems to me that I have come up with a way to do so, but a nagging thought keeps telling me that there is something very, very wrong with what I am doing, and it either won't work or it will cause problems.
The design I have for my framework consists of a Kernel that calls each plugin's init function. The init function then calls the Kernel's registerPlugin to obtain a unique id, and registerFunction to register, under that id, each function the plugin wants to make accessible.
The function registerPlugin returns the unique id. The function registerFunction takes that id, the function name, and a generic function pointer, like so:
bool registerFunction(int plugin_id, string function_name, plugin_function func){}
where plugin_function is
typedef void (*plugin_function)();
The kernel then takes the function pointer and puts it in a map with the function_name and plugin_id. All plugins registering their functions must cast them to type plugin_function.
In order to retrieve the function, a different plugin calls the Kernel's
plugin_function getFunction(string plugin_name, string function_name);
Then that plugin must cast the plugin_function to its original type so it can be used. It knows (in theory) what the correct type is by having access to a .h file outlining all the functions the plugin makes available. Plugins, by the by, are implemented as dynamic libraries.
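To make it concrete, here is a stripped-down sketch of what I have in mind (names and details are placeholders, not the real code):

// Illustrative sketch only -- the real kernel also keys the map on plugin_id
#include <map>
#include <string>

typedef void (*plugin_function)();

static std::map<std::string, plugin_function> g_functions;

bool registerFunction(int plugin_id, std::string function_name, plugin_function func)
{
    g_functions[function_name] = func; // the real map also records plugin_id
    return true;
}

plugin_function getFunction(std::string plugin_name, std::string function_name)
{
    return g_functions[function_name];
}

// A plugin registers one of its functions by casting it to the generic type...
int add(int a, int b) { return a + b; }

void plugin_init()
{
    registerFunction(1, "add", reinterpret_cast<plugin_function>(&add));
}

// ...and another plugin casts it back to its original type before calling it.
typedef int (*add_function)(int, int);

int useMathPlugin()
{
    add_function f = reinterpret_cast<add_function>(getFunction("math", "add"));
    return f(2, 3);
}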
Is this a smart way to accomplish the task of allowing different plugins to connect with each other? Or is this a crazy and really terrible programming technique? If it is, please point me in the direction of the correct way to accomplish this.
EDIT: If any clarification is needed, ask and it will be provided.

Function pointers are strange creatures. They're not necessarily the same size as data pointers, and hence cannot be safely cast to void* and back. But, the C++ (and C) specifications allow any function pointer to be safely cast to another function pointer type (though you have to later cast it back to the earlier type before calling it if you want defined behaviour). This is akin to the ability to safely cast any data pointer to void* and back.
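A minimal illustration of that round trip (storing a pointer under a generic function pointer type and casting it back before the call):

#include <iostream>

typedef void (*generic_fn)();   // generic "storage" type

int square(int x) { return x * x; }

int main()
{
    // Casting to an unrelated function pointer type is allowed...
    generic_fn stored = reinterpret_cast<generic_fn>(&square);

    // ...but it must be cast back to the original type before being called.
    typedef int (*int_fn)(int);
    int_fn back = reinterpret_cast<int_fn>(stored);
    std::cout << back(5) << std::endl; // prints 25

    // Calling through the wrong type, e.g. stored(), would be undefined behaviour.
    return 0;
}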
Pointers to methods are where it gets really hairy: a method pointer might be larger than a normal function pointer, depending on the compiler, whether the application is 32- or 64-bit, etc. But even more interesting is that, even on the same compiler/platform, not all method pointers are the same size: Method pointers to virtual functions may be bigger than normal method pointers; if multiple inheritance (with e.g. virtual inheritance in the diamond pattern) is involved, the method pointers can be even bigger. This varies with compiler and platform too. This is also the reason that it's difficult to create function objects (that wrap arbitrary methods as well as free functions) especially without allocating memory on the heap (it's just possible using template sorcery).
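You can see this for yourself by printing a few sizes; the exact numbers depend entirely on the compiler and platform, which is the point:

#include <iostream>

struct Simple { void f() {} };
struct Base1 { virtual ~Base1() {} int a; };
struct Base2 { virtual ~Base2() {} int b; };
struct Multi : Base1, Base2 { void g() {} };

int main()
{
    std::cout << "free function pointer:     " << sizeof(void (*)()) << '\n'
              << "simple method pointer:     " << sizeof(void (Simple::*)()) << '\n'
              << "multi-base method pointer: " << sizeof(void (Multi::*)()) << '\n';
    // On MSVC the method pointer sizes vary with the inheritance model;
    // on GCC/Clang they are typically twice the size of an ordinary function pointer.
    return 0;
}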
So, by using function pointers in your interface, it becomes impractical for the plugin authors to pass back method pointers to your framework, even if they're using the same compiler. This might be an acceptable constraint; more on this later.
Since there's no guarantee that function pointers will be the same size from one compiler to the next, by registering function pointers you're limiting the plugin authors to compilers that implement function pointers having the same size as your compiler does. This wouldn't necessarily be so bad in practice, since function pointer sizes tend to be stable across compiler versions (and may even be the same for multiple compilers).
The real problems start to arise when you want to call the functions pointed to by the function pointers; you can't safely call the function at all if you don't know its true signature (you will get poor results ranging from "not working" to segmentation faults). So, the plugin authors would be further limited to registering only void functions that take no parameters.
It gets worse: the way a function call actually works at the assembler level depends on more than just the signature and function pointer size. There's also the calling convention, the way exceptions are handled (the stack needs to be properly unwound when an exception is thrown), and the actual interpretation of the bytes of function pointer (if it's larger than a data pointer, what do the extra bytes signify? In what order?). At this point, the plugin author is pretty much limited to using the same compiler (and version!) that you are, and needs to be careful to match the calling convention and exception handling options (with the MSVC++ compiler, for example, exception handling is only explicitly enabled with the /EHsc option), as well as use only normal function pointers with the exact signature you define.
All the restrictions so far can be considered reasonable, if a bit limiting. But we're not done yet.
If you throw in std::string (or almost any part of the STL), things get even worse though, because even with the same compiler (and version), there are several different flags/macros that control the STL; these flags can affect the size and meaning of the bytes representing string objects. It is, in effect, like having two different struct declarations in separate files, each with the same name, and hoping they'll be interchangeable; obviously, this doesn't work. An example flag is _HAS_ITERATOR_DEBUGGING. Note that these options can even change between debug and release mode! These types of errors don't always manifest themselves immediately/consistently and can be very difficult to track down.
You also have to be very careful with dynamic memory management across modules, since new in one project may be defined differently from new in another project (e.g. it may be overloaded). When deleting, you might have a pointer to an interface with a virtual destructor, meaning the vtable is needed to properly delete the object, and different compilers all implement the vtable stuff differently. In general, you want the module that allocates an object to be the one to deallocate it; more specifically, you want the code that deallocates an object to have been compiled under the exact same conditions as the code that allocated it. This is one reason std::shared_ptr can take a "deleter" argument when it is constructed -- because even with the same compiler and flags (the only guaranteed safe way to share shared_ptrs between modules), new and delete may not be the same everywhere the shared_ptr can get destroyed. With the deleter, the code that creates the shared pointer controls how it is eventually destroyed too. (I just threw this paragraph in for good measure; you don't seem to be sharing objects across module boundaries.)
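For illustration, a sketch of the deleter idea (assuming, as noted, that the modules sharing the shared_ptr really are built identically; Widget and destroyWidget are made-up names):

#include <memory>

struct Widget { /* ... */ };

// Exported by the module that allocates Widgets, so they are always
// freed by the same runtime that allocated them.
extern "C" void destroyWidget(Widget* w) { delete w; }

std::shared_ptr<Widget> makeSharedWidget()
{
    // The deleter travels with the pointer: whichever module releases the
    // last reference still ends up calling back into the allocating module.
    return std::shared_ptr<Widget>(new Widget, &destroyWidget);
}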
All of this is a consequence of C++ having no standard binary interface (ABI); it's a free-for-all, where it is very easy to shoot yourself in the foot (sometimes without realising it).
So, is there any hope? You betcha! You can expose a C API to your plugins instead, and have your plugins also expose a C API. This is quite nice because a C API can be interoperated with from virtually any language. You don't have to worry about exceptions, apart from making sure they can't bubble up above the plugin functions (that's the authors' concern), and it's stable no matter the compiler/options (assuming you don't pass STL containers and the like). There's only one standard calling convention (cdecl), which is the default for functions declared extern "C". void*, in practice, will be the same across all compilers on the same platform (e.g. 8 bytes on x64).
You (and the plugin authors) can still write your code in C++, as long as all the external communication between the two uses a C API (i.e. pretends to be a C module for the purposes of interop).
C function pointers are also likely compatible between compilers in practice, though if you'd rather not depend on this you could have the plugin register a function name (const char*) instead of address, and then you could extract the address yourself using, e.g., LoadLibrary with GetProcAddress for Windows (similarly, Linux and Mac OS X have dlopen and dlsym). This works because name-mangling is disabled for functions declared with extern "C".
Note that there's no direct way around restricting the registered functions to be of a single prototype type (otherwise, as I've said, you can't call them properly). If you need to give a particular parameter to a plugin function (or get a value back), you'll need to register and call the different functions with different prototypes separately (though you could collapse all the function pointers down to a common function pointer type internally, and only cast back at the last minute).
Finally, while you cannot directly support method pointers (which don't even exist in a C API, but are of variable size even with a C++ API and thus cannot be easily stored), you can allow the plugins to supply a "user-data" opaque pointer when registering their function, which is passed to the function whenever it's called; this gives the plugin authors an easy way to write function wrappers around methods and store the object to apply the method to in the user-data parameter. The user-data parameter can also be used for anything else the plugin author wants, which makes your plugin system much easier to interface with and extend. Another example use is to adapt between different function prototypes using a wrapper and extra arguments stored in the user-data.
These suggestions lead to code something like this (for Windows -- the code is very similar for other platforms):
// Shared header
extern "C" {
typedef void (*plugin_function)(void*);
bool registerFunction(int plugin_id, const char* function_name, void* user_data);
}
// Your plugin registration code
hModule = LoadLibrary(pluginDLLPath);
// Your plugin function registration code
auto pluginFunc = (plugin_function)GetProcAddress(hModule, function_name);
// Store pluginFunc and user_data in a map keyed to function_name
// Calling a plugin function
pluginFunc(user_data);
// Declaring a plugin function
extern "C" void aPluginFunction(void*);
class Foo { public: void doSomething() { } };
// Defining a plugin function
void aPluginFunction(void* user_data)
{
static_cast<Foo*>(user_data)->doSomething();
}
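On Linux and Mac OS X the lookup side of the sketch above would use dlopen/dlsym instead (roughly; error handling omitted):

// POSIX counterpart of the Windows lookup above (link with -ldl on Linux)
#include <dlfcn.h>

typedef void (*plugin_function)(void*);

void callPluginFunction(const char* soPath, const char* functionName, void* userData)
{
    void* handle = dlopen(soPath, RTLD_NOW);
    if (!handle)
        return;
    plugin_function fn =
        reinterpret_cast<plugin_function>(dlsym(handle, functionName));
    if (fn)
        fn(userData);
    // dlclose(handle) once the plugin is no longer needed
}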
Sorry for the length of this reply; most of it can be summed up with "the C++ standard doesn't extend to interoperation; use C instead since it at least has de facto standards."
Note: Sometimes it's simplest just to design a normal C++ API (with function pointers or interfaces or whatever you like best) under the assumption that the plugins will be compiled under exactly the same circumstances; this is reasonable if you expect all the plugins to be developed by yourself (i.e. the DLLs are part of the project core). This could also work if your project is open-source, in which case everybody can independently choose a cohesive environment under which the project and the plugins are compiled -- but then this makes it hard to distribute plugins except as source code.
Update: As pointed out by ern0 in the comments, it's possible to abstract the details of the module interoperation (via a C API) so that both the main project and the plugins deal with a simpler C++ API. What follows is an outline of such an implementation:
// iplugin.h -- shared between the project and all the plugins
class IPlugin {
public:
virtual void registerSelf() { } // "register" is a C++ keyword, so the hook needs a different name
virtual void initialize() = 0;
// Your application-specific functionality here:
virtual void onCheeseBurgerEatenEvent() { }
};
// C API:
extern "C" {
// Returns the number of plugins in this module
int getPluginCount();
// Called to register the nth plugin of this module.
// A user-data pointer is expected in return (may be null).
void* registerPlugin(int pluginIndex);
// Called to initialize the nth plugin of this module
void initializePlugin(int pluginIndex, void* userData);
void onCheeseBurgerEatenEvent(int pluginIndex, void* userData);
}
// pluginimplementation.h -- plugin authors inherit from this abstract base class
#include "iplugin.h"
class PluginImplementation : public IPlugin {
public:
PluginImplementation();
};
// pluginimplementation.cpp -- implements C API of plugin too
#include "pluginimplementation.h"
#include <vector>
struct LocalPluginRegistry {
static std::vector<PluginImplementation*> plugins;
};
std::vector<PluginImplementation*> LocalPluginRegistry::plugins; // definition of the static member
PluginImplementation::PluginImplementation() {
LocalPluginRegistry::plugins.push_back(this);
}
extern "C" {
int getPluginCount() {
return static_cast<int>(LocalPluginRegistry::plugins.size());
}
void* registerPlugin(int pluginIndex) {
auto plugin = LocalPluginRegistry::plugins[pluginIndex];
plugin->registerSelf();
return (void*)plugin;
}
void initializePlugin(int pluginIndex, void* userData) {
auto plugin = static_cast<PluginImplementation*>(userData);
plugin->initialize();
}
void onCheeseBurgerEatenEvent(int pluginIndex, void* userData) {
auto plugin = static_cast<PluginImplementation*>(userData);
plugin->onCheeseBurgerEatenEvent();
}
}
// To declare a plugin in the DLL, just make a static instance:
class SomePlugin : public PluginImplementation {
virtual void initialize() { }
};
SomePlugin plugin; // Will be created when the DLL is first loaded by a process
// plugin.h -- part of the main project source only
#include "iplugin.h"
#include <string>
#include <vector>
#include <windows.h>
class PluginRegistry;
class Plugin : public IPlugin {
public:
Plugin(PluginRegistry* registry, int index, int moduleIndex)
: registry(registry), index(index), moduleIndex(moduleIndex), userData(0)
{
}
virtual void registerSelf();
virtual void initialize();
virtual void onCheeseBurgerEatenEvent();
private:
PluginRegistry* registry;
int index;
int moduleIndex;
void* userData;
};
class PluginRegistry {
public:
void registerPluginsInModule(std::string const& modulePath);
~PluginRegistry();
public:
std::vector<Plugin*> plugins;
private:
extern "C" {
typedef int (*getPluginCountFunc)();
typedef void* (*registerPluginFunc)(int);
typedef void (*initializePluginFunc)(int, void*);
typedef void (*onCheeseBurgerEatenEventFunc)(int, void*);
}
struct Module {
getPluginCountFunc getPluginCount;
registerPluginFunc registerPlugin;
initializePluginFunc initializePlugin;
onCheeseBurgerEatenEventFunc onCheeseBurgerEatenEvent;
HMODULE handle;
};
friend class Plugin;
std::vector<Module> registeredModules;
};
// plugin.cpp
void Plugin::registerSelf() {
auto func = registry->registeredModules[moduleIndex].registerPlugin;
userData = func(index);
}
void Plugin::initialize() {
auto func = registry->registeredModules[moduleIndex].initializePlugin;
func(index, userData);
}
void Plugin::onCheeseBurgerEatenEvent() {
auto func = registry->registeredModules[moduleIndex].onCheeseBurgerEatenEvent;
func(index, userData);
}
void PluginRegistry::registerPluginsInModule(std::string const& modulePath) {
// For Windows:
HMODULE handle = LoadLibrary(modulePath.c_str());
Module module;
module.handle = handle;
module.getPluginCount = (getPluginCountFunc)GetProcAddress(handle, "getPluginCount");
module.registerPlugin = (registerPluginFunc)GetProcAddress(handle, "registerPlugin");
module.initializePlugin = (initializePluginFunc)GetProcAddress(handle, "initializePlugin");
module.onCheeseBurgerEatenEvent = (onCheeseBurgerEatenEventFunc)GetProcAddress(handle, "onCheeseBurgerEatenEvent");
int moduleIndex = registeredModules.size();
registeredModules.push_back(module);
int pluginCount = module.getPluginCount();
for (int i = 0; i < pluginCount; ++i) {
auto plugin = new Plugin(this, i, moduleIndex);
plugins.push_back(plugin);
}
}
PluginRegistry::~PluginRegistry() {
for (auto it = plugins.begin(); it != plugins.end(); ++it) {
delete *it;
}
for (auto it = registeredModules.begin(); it != registeredModules.end(); ++it) {
FreeLibrary(it->handle);
}
}
// When discovering plugins (e.g. by loading all DLLs in a "plugins" folder):
PluginRegistry registry;
registry.registerPluginsInModule("plugins/cheeseburgerwatcher.dll");
for (auto it = registry.plugins.begin(); it != registry.plugins.end(); ++it) {
(*it)->registerSelf();
}
for (auto it = registry.plugins.begin(); it != registry.plugins.end(); ++it) {
(*it)->initialize();
}
// And then, when a cheeseburger is actually eaten:
for (auto it = registry.plugins.begin(); it != registry.plugins.end(); ++it) {
auto plugin = *it;
plugin->onCheeseBurgerEatenEvent();
}
This has the benefit of using a C API for compatibility, but also offering a higher level of abstraction for plugins written in C++ (and for the main project code, which is C++). Note that it lets multiple plugins be defined in a single DLL. You could also eliminate some of the duplication of function names by using macros, but I chose not to for this simple example.
All of this, by the way, assumes plugins that have no interdependencies -- if plugin A affects (or is required by) plugin B, you need to devise a safe method for injecting/constructing dependencies as needed, since there's no way of guaranteeing what order the plugins will be loaded in (or initialized). A two-step process would work well in that case: Load and register all plugins; during registration of each plugin, let them register any services they provide. During initialization, construct requested services as needed by looking at the registered service table. This ensures that all services offered by all plugins are registered before any of them are attempted to be used, no matter what order plugins get registered or initialized in.
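A rough sketch of that two-phase idea (all names here are made up): during registration plugins only advertise services; during initialization they look up whatever they need, by which time everything is in the table.

#include <map>
#include <string>

typedef void (*service_function)(void* user_data);

struct Service { service_function fn; void* user_data; };

static std::map<std::string, Service> g_services;

// Phase 1: called from each plugin's registration step.
extern "C" void registerService(const char* name, service_function fn, void* user_data)
{
    Service s = { fn, user_data };
    g_services[name] = s;
}

// Phase 2: called from each plugin's initialization step; by now every
// plugin has registered, regardless of load order.
extern "C" bool getService(const char* name, Service* out)
{
    std::map<std::string, Service>::iterator it = g_services.find(name);
    if (it == g_services.end())
        return false;
    *out = it->second;
    return true;
}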

The approach you took is sane in general, but I see a few possible improvements.
Your kernel should export C functions with a conventional calling convention (cdecl, or maybe stdcall if you are on Windows) for the registration of plugins and functions. If you use a C++ function then you are forcing all plugin authors to use the same compiler and compiler version that you use, since many things like C++ function name mangling, STL implementation and calling conventions are compiler specific.
Plugins should only export C functions like the kernel.
From the definition of getFunction it seems each plugin has a name, which other plugins can use to obtain its functions. This is not a safe practice: two developers can create two different plugins with the same name, so when a plugin asks for some other plugin by name it may get a different plugin than the expected one. A better solution would be for plugins to have a public GUID. This GUID can appear in each plugin's header file, so that other plugins can refer to it.
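For example, the GUID could simply be a constant in the plugin's public header, so other plugins refer to it rather than to a display name (the GUID below is obviously just an example):

// fooplugin.h -- shipped with the Foo plugin so other plugins can refer to it.
// Example GUID only; generate a real one with uuidgen or your IDE.
static const char* const FOO_PLUGIN_GUID = "6f1a2c34-9b1e-4d58-8f8a-0c2d8f9e1b7a";

// A consumer would then ask the kernel by GUID instead of by name, e.g.:
// plugin_function f = getFunction(FOO_PLUGIN_GUID, "doSomething");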
You have not implemented versioning. Ideally you want your kernel to be versioned, because you will invariably change it in the future. When a plugin registers with the kernel, it passes the version of the kernel API it was compiled against, and the kernel can then decide whether the plugin can be loaded. For example, if kernel version 1 receives a registration request from a plugin that requires kernel version 2, you have a problem; the best way to address it is to refuse to load the plugin, since it may need kernel features that are not present in the older version. The reverse case is also possible: kernel v2 may or may not want to load plugins that were created for kernel v1, and if it does allow it, it may need to adapt itself to the older API.
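A minimal version handshake might look something like this (purely illustrative):

#define KERNEL_API_VERSION 2

static int nextPluginId = 0;

// Returns a plugin id, or -1 if the plugin targets an incompatible kernel API.
extern "C" int registerPlugin(int pluginApiVersion)
{
    if (pluginApiVersion > KERNEL_API_VERSION)
        return -1; // plugin needs a newer kernel: refuse to load it
    if (pluginApiVersion < KERNEL_API_VERSION) {
        // older plugin: either refuse it here, or adapt to the old API
    }
    return nextPluginId++;
}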
I'm not sure I like the idea of a plugin being able to locate another plugin and call its functions directly, as this breaks encapsulation. It seems better to me if plugins advertise their capabilities to the kernel, so that other plugins can find services they need by capability instead of by addressing other plugins by name or GUID.
Be aware that any plugin that allocates memory needs to provide a deallocation function for that memory. Each plugin could be using a different run-time library, so memory allocated by a plugin may be unknown to other plugins or the kernel. Having allocation and deallocation in the same module avoids problems.
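In practice that means every allocating export gets a matching freeing export, along these lines (names invented for the example):

extern "C" {

// The plugin allocates a buffer with its own runtime...
char* plugin_createBuffer(int size)
{
    return new char[size];
}

// ...and only the plugin's matching export gives it back to that runtime.
void plugin_destroyBuffer(char* buffer)
{
    delete[] buffer;
}

}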

C++ has no ABI, so what you want to do comes with a restriction: the plugins and your framework must be compiled and linked with the same compiler and linker, using the same settings, on the same OS. That defeats the purpose if the goal is interoperation via binary distribution, because each plugin developed for the framework would have to be shipped in many versions targeting different compilers on different operating systems. Distributing source code will be more practical than that, and that's the GNU way (download the source, configure and make).
COM is a choice, but it is complex and rather dated. Another option is managed C++ on the .NET runtime, but both are Microsoft-only. If you want a universal solution, I suggest you switch to another language.

As jean mentions, since there is no standard C++ ABI or standard name-mangling convention, you are stuck compiling everything with the same compiler and linker. If you want shared library/DLL style plugins you have to use something C-ish.
If everything will be compiled with the same compiler and linker, you may also want to consider std::function.
#include <functional>
#include <iostream>
#include <map>
#include <string>
typedef std::function<void ()> plugin_function;
std::map<std::string, plugin_function> fncMap;
void register_func(std::string name, plugin_function fnc)
{
fncMap[name] = fnc;
}
void call(std::string name)
{
auto it = fncMap.find(name);
if (it != fncMap.end())
(it->second)(); // it->second is a function object
}
///////////////
void func()
{
std::cout << "plain" << std::endl;
}
class T
{
public:
void method()
{
std::cout << "method" << std::endl;
}
void method2(int i)
{
std::cout << "method2 : " << i << std::endl;
}
};
int main()
{
T t; // of course "t" needs to outlive the map entries that point at it; you could just as well use a shared_ptr
register_func("plain", func);
register_func("method", std::bind(&T::method, &t));
register_func("method2_5", std::bind(&T::method2, &t, 5));
register_func("method2_15", std::bind(&T::method2, &t, 15));
call("plain");
call("method");
call("method2_5");
call("method2_15");
return 0;
}
You can also have plugin functions that take arguments. This uses the placeholders for std::bind, though you may soon find that std::bind lags somewhat behind boost::bind. Boost bind has nice documentation and examples.
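For example, a plugin function taking an argument can be stored as a std::function with that signature and bound with a placeholder (a small sketch along the lines of the code above):

#include <functional>
#include <iostream>
#include <map>
#include <string>

typedef std::function<void (int)> plugin_function_int; // functions taking one int

std::map<std::string, plugin_function_int> fncMapInt;

class Scaler
{
public:
    void scale(int factor, int value) { std::cout << factor * value << std::endl; }
};

int main()
{
    Scaler s;
    // _1 forwards the call-time argument into the bound method
    fncMapInt["scale_by_10"] = std::bind(&Scaler::scale, &s, 10, std::placeholders::_1);
    fncMapInt["scale_by_10"](4); // prints 40
    return 0;
}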

There is no reason why you should not do this. In C++ using this style of pointer is the best since it's just a plain pointer. I know of no popular compiler that would do anything as brain-dead as not making a function pointer like a normal pointer. It is beyond the bounds of reason that someone would do something so horrible.
The VST plugin standard operates in a similar way. It just uses function pointers in the .dll and has no way of calling directly into classes. VST is a very popular standard, and on Windows people use just about any compiler to write VST plugins, including Delphi, which is Pascal-based and has nothing to do with C++.
So I would do exactly what you suggest personally. For the common well-known plugins I would not use a string name but an integer index which can be looked up much faster.
The alternative is to use interfaces but I see no reason to if your thinking is already based around function pointers.
If you use interfaces then it is not so easy to call the functions from other languages. You can do it from Delphi, but what about .NET?
With your function pointer style suggestion you can use .NET to make one of the plugins for example. Obviously you would need to host Mono in your program to load it but just for hypothetical purposes it illustrates the simplicity of it.
Besides, when you use interfaces you have to get into reference counting which is nasty. Stick your logic in function pointers like you suggest and then wrap the control in some C++ classes to do the calling and stuff for you. Then other people can make the plugins with other languages such as Delphi Pascal, Free Pascal, C, Other C++ compilers etc...
But as always, regardless of what you do, exception handling between compilers will remain an issue, so you have to think about error handling. The best approach is for each plugin's methods to catch the plugin's own exceptions and return an error code to the kernel.

With all the excellent answers above, I'll just add that this practice is actually pretty widely used. In my own experience, I've seen it both in commercial projects and in freeware/open-source ones.
So - yes, it's good and proven architecture.

You don't need to register functions manually. Really? Really.
What you could use is a proxy implementation of your plugin interface, where each function transparently loads its original from the shared library on demand and calls it. Whoever holds a proxy object for that interface can simply call the functions; they will be loaded on demand.
If plugins are singletons, then there is no need for manual binding at all (otherwise the correct instance has to be chosen first).
The idea for the developer of a new plugin would be to describe the interface first, then have a generator which produces a stub implementation for the shared library, plus a plugin proxy class with the same signature but with the on-demand autoloading, which is then used in the client software. Both should fulfill the same interface (in C++, a pure abstract class).
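A rough sketch of such a proxy (POSIX loading is used purely as an example, and IGreeter, GreeterProxy and greeter_greet are invented names; a generator would emit one forwarding method like this per interface function):

#include <dlfcn.h>

// The shared interface (the pure abstract class both sides agree on).
class IGreeter {
public:
    virtual ~IGreeter() {}
    virtual void greet() = 0;
};

// Generated proxy: same interface, but each call resolves the real
// extern "C" entry point from the shared library on first use.
class GreeterProxy : public IGreeter {
public:
    explicit GreeterProxy(const char* soPath) : path(soPath), handle(0), fn(0) {}
    virtual void greet()
    {
        if (!fn) {
            handle = dlopen(path, RTLD_NOW);
            if (handle)
                fn = reinterpret_cast<void (*)()>(dlsym(handle, "greeter_greet"));
        }
        if (fn)
            fn();
    }
private:
    const char* path;
    void* handle;
    void (*fn)();
};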

Related

c++ plugin : Is it ok to pass polymorphic objects?

When using dynamic libraries, I understand that we should only pass Plain Old Data structures across boundaries. So can we pass a pointer to a base class?
My idea is that the application and the library could both be aware of a common Interface (pure virtual method, = 0).
The library could instantiate a subtype of that Interface,
And the application could use it.
For instance, is the following snippet safe ?
// file interface.h
class IPrinter{
public:
virtual void print(std::string str) = 0;
};
-
// file main.cpp
int main(){
//load plugin...
IPrinter* printer = plugin_get_printer();
printer->print( std::string{"hello"} );
}
-
// file plugin.cpp (compiled by another compiler)
IPrinter* plugin_get_printer(){
return new PrinterImpl{};
}
This snippet is not safe:
- The two sides of your DLL boundary do not use the same compiler. This means that the name mangling (for function names) and the vtable layout (for virtual functions) might not be the same (both are implementation specific).
- The heap on each side may also be managed differently, so deleting your object is risky if it isn't done inside the DLL that created it.
This article presents very well the main challenges with binary compatible interfaces.
You may, however, pass a pointer to the other side of the mirror as part of a POD, as long as the other side doesn't use it by itself (for example: your app passes a pointer to a configuration object to the DLL; later another DLL function returns that pointer to your app; your app can then use it as expected, at least if it wasn't a pointer to a local object that no longer exists).
The presence of virtual functions in your class means that your class is going to have a vtable, and different compilers implement vtables differently.
So, if you use classes with virtual methods across DLL calls where the compiler used on the other side is different from the compiler that you are using, the result is likely to be spectacular crashes.
In your case, the PrinterImpl created by the DLL will have a vtable constructed in a certain way, but the printer->print() call in your main() will attempt to interpret the vtable of IPrinter in a different way in order to resolve the print() method call.

Is it safe to use strings as private data members in a class used across a DLL boundary?

My understanding is that exposing functions that take or return stl containers (such as std::string) across DLL boundaries can cause problems due to differences in STL implementations of those containers in the 2 binaries. But is it safe to export a class like:
class Customer
{
public:
wchar_t * getName() const;
private:
wstring mName;
};
Without some sort of hack, mName is not going to be usable by the executable, so it won't be able to execute methods on mName, nor construct/destruct this object.
My gut feeling is "don't do this, it's unsafe", but I can't figure out a good reason.
It is not a problem, because it is trumped by a bigger one: you cannot create an object of that class in code that lives in a module other than the one containing the code for the class. Code in another module cannot accurately know the required object size, because its implementation of the std::string class may well be different, which, as declared, also affects the size of the Customer object. Even the same compiler cannot guarantee this, for example when mixing optimized and debugging builds of these modules, albeit that this is usually pretty easy to avoid.
So you must create a class factory for Customer objects, a factory that lives in that same module. Which then automatically implies that any code that touches the "mName" member also lives in the same module. And is therefore safe.
The next step then is to not expose Customer at all but to expose a pure abstract base class (aka an interface). Now you can prevent the client code from creating an instance of Customer and shooting their leg off. And you'll trivially hide the std::string as well. Interface-based programming techniques are common in module interop scenarios; this is also the approach taken by COM.
As long as the code that allocates instances of the class and the code that deallocates them are built with the same settings, you should be OK, but you are right to avoid this.
Differences between the .exe and .dll as far as debug/release, code generation (Multi-threaded DLL vs. Single threaded) could cause problems in some scenarios.
I would recommend using abstract classes in the DLL interface with creation and deletion done solely inside the DLL.
Interfaces like:
class A {
protected:
virtual ~A() {}
public:
virtual void func() = 0;
};
//exported create/delete functions
A* create_A();
void destroy_A(A*);
DLL Implementation like:
class A_Impl : public A{
public:
~A_Impl() {}
void func() { do_something(); }
};
A* create_A() { return new A_Impl; }
void destroy_A(A* a) {
A_Impl* ai=static_cast<A_Impl*>(a);
delete ai;
}
Should be ok.
Even if your class has no data members, you cannot expect it to be usable from code compiled with a different compiler. There is no common ABI for C++ classes. You can expect differences in name mangling just for starters.
If you are prepared to constrain clients to use the same compiler as you, or provide source to allow clients to compile your code with their compiler, then you can do pretty much anything across your interface. Otherwise you should stick to C style interfaces.
If you want to provide an object oriented interface in a DLL that is truly safe, I would suggest building it on top of the COM object model. That's what it was designed for.
Any other attempt to share classes between code that is compiled by different compilers has the potential to fail. You may be able to get something that seems to work most of the time, but it can't be guaranteed to work.
The chances are that at some point you're going to be relying on undefined behaviour in terms of calling conventions or class structure or memory allocation.
The C++ standard does not say anything about the ABI provided by implementations. Even on a single platform changing the compiler options may change binary layout or function interfaces.
Thus to ensure that standard types can be used across DLL boundaries it is your responsibility to ensure that either:
Resource Acquisition/Release for standard types is done by the same DLL. (Note: you can have multiple crt's in a process but a resource acquired by crt1.DLL must be released by crt1.DLL.)
This is not specific to C++. In C for example malloc/free, fopen/fclose call pairs must each go to a single C runtime.
This can be done by either of the below:
By explicitly exporting acquisition/release functions (Photon's answer). In this case you are forced to use a factory pattern and abstract types. Basically COM, or a COM clone.
Forcing a group of DLL's to link against the same dynamic CRT. In this case you can safely export any kind of functions/classes.
There are also two "potential bug" (among others) you must take care, since they are related to what is "under" the language.
The first is that std::strng is a template, and hence it is instantiated in every translation unit. If they are all linked to a same module (exe or dll) the linker will resolve same functions as same code, and eventually inconsistent code (same function with different body) is treated as error.
But if they are linked to different module (and exe and a dll) there is nothing (compiler and linker) in common. So -depending on how the module where compiled- you may have different implementation of a same class with different member and memory layout (for example one may have some debugging or profiling added features the other has not). Accessing an object created on one side with methods compiled on the other side, if you have no other way to grant implementation consistency, may end in tears.
The second problem (more subtle) relates to allocation/deallocaion of memory: because of the way windows works, every module can have a distinct heap. But the standard C++ does not specify how new and delete take care about which heap an object comes from. And if the string buffer is allocated on one module, than moved to a string instance on another module, you risk (upon destruction) to give the memory back to the wrong heap (it depends on how new/delete and malloc/free are implemented respect to HeapAlloc/HeapFree: this merely relates to the level of "awarness" the STL implementation have respect to the underlying OS. The operation is not itself destructive -the operation just fails- but it leaks the origin's heap).
All that said, it is not impossible to pass a container. It is just up to you to grant a consistent implementation between the sides, since the compiler and linker have no way to cross check.

What vb6 type is ABI-compatible with std::vector?

I've been writing a DLL in C++, now I must call this DLL from a VB6 application.
Here's a code sample from this DLL :
#include <vector>
#include <string>
#include <windows.h> // for LPSTR
using namespace std;
struct Object
{
long CoordX;
long CoordY;
long Width;
long Height;
LPSTR Id;
};
void __stdcall DLLFunction (vector<Object>*)
{
// performs a few operations on the Objects contained in the vector.
}
I also defined the "Object struct" in VB6
Private Type Object
CoordX As Integer
CoordY As Integer
Width As Integer
Height As Integer
Id As String
End Type
The issue is I don't know what vb6 type could stand for std::vector in order to call the DLL's function.
Notes :
- I use a vector for the DLL to be able to add objects.
- I use a pointer in order to use as little memory as possible.
- Sorry for my english, it ain't my home language at all.
- Thank you for reading and trying to help me.
Edit :
- I fixed the typing issues (Ids are definitely ended by NullChar, so LPSTR should do the trick).
- I read your answers, and I'd like to thank both of you; your answers are close to one another and a major issue remains: my DLL definitely needs to add elements to the container, and I'm wondering how to pull that off. Maybe I could add a return type to my function so that it returns the items it created (instead of putting them directly into the container), and the VB6 application would then get these items and process them, but I can't figure out how to do this.
Edit bis :
#Rook : I feel like I could achieve this by using a new struct.
struct ObjectArrayPointer
{
Object* Pointer;
size_t Counter;
};
And then call my function this way :
void __stdcall DLLFunction (ObjectArrayPointer*);
I would then be able to add objects and edit the size parameter for my VB6 application to find these new objects. Was that what you meant?
You should not be trying to export template containers from a DLL anyway. They're likely to break when faced with newer compilers and libraries (eg. a library built under C++03 will not play well with code built using C++11).
The least painful thing to do is to accept a pointer to a buffer and a length parameter,
void __stdcall DLLFunction (Object* buffer, size_t nObjects);
if the size of the container will not change during execution. This interface is about as simple as it gets, and is easily accessible by any language that understand C calling conventions (eg. almost every single one.)
You've already thrown away most of the use of a std::vector because you've already specialised it to Object; you could consider going all the way and creating your own ObjectCollection class which uses a std::vector internally but presents a non-templated interface. Here's a simple example :
// In your public API header file:
typedef struct object_collection_t *object_collection;
object_collection CreateObjectCollection();
void DestroyObjectCollect(object_collection collection);
void AddObjectToCollection(object_collection collection, Object* object);
// etc
No template types are exposed in any form in the header. This is good.
// And the corresponding code file:
struct object_collection_t
{
std::vector<Object*> objects;
};
object_collection CreateObjectCollection() { return new object_collection_t; }
void DestroyObjectCollect(object_collection collection) { delete collection; }
void AddObjectToCollection(object_collection collection, Object* object)
{
collection->objects.push_back(object);
}
// etc
All of the templating code is hidden away, leaving you with a fairly clean and simple interface that presents an opaque pointer type which can be passed around by external code but only queried and modified by your own, etc.
EDIT: Incidentally, I've used Object* throughout the above code. It may well be safer and simpler to use just plain old Object and avoid all of the issues associated with memory management and pointer manipulation by client code. If Object is sufficiently small and simple, passing by value may be a better approach.
(NB: not checked for compilability or functionality. E&OE. Caveat Implementor!)
You can't do that as it's a C++ class/template. Internally, it's an array but not in a way that can be created from VB6.
Your best bet is to change the function to accept a pointer to an array with a count parameter.
You'll also need to be very careful as to how the type is structured.
C++ ints are Longs in VB6.
Also, the Id string won't be compatible. VB6 will have a pointer to a Unicode BString (unless you make it fixed length), whereas the C++ side has an array of ANSI chars. VB6 MAY marshal this if you pass an array of the objects (rather than a pointer).
The VB6 ABI is the COM Automation ABI.
Therefore, if you need an array which is VB6 ABI compatible, you should probably use SAFEARRAY. I suggest you should also be using the Compiler COM Support classes:
http://msdn.microsoft.com/en-US/library/5yb2sfxk(v=vs.80).aspx
This question appears to do exactly what you want, using ATL's CComSafeArray class:
conversion between std::vector and _variant_t
You may also want to look at these:
https://stackoverflow.com/search?q=safearray+_variant_t
Alternatives to SAFEARRAY
The alternative to SAFEARRAY is to supply a COM Collection object. This is simply a COM object with a Dispinterface or Dual interface with the methods Count and Item. Item should have dispid=0 to be the default method. You may also want to supply _NewEnum with DISPID_NEWENUM to support the For Each syntax.

Instantiate class from name?

imagine I have a bunch of C++ related classes (all extending the same base class and providing the same constructor) that I declared in a common header file (which I include), and their implementations in some other files (which I compile and link statically as part of the build of my program).
I would like to be able to instantiate one of them passing the name, which is a parameter that has to be passed to my program (either as command line or as a compilation macro).
The only possible solution I see is to use a macro:
#ifndef CLASS_NAME
#define CLASS_NAME MyDefaultClassToUse
#endif
BaseClass* o = new CLASS_NAME(param1, param2, ..);
Is it the only valuable approach?
This is a problem which is commonly solved using the Registry Pattern:
This is the situation that the Registry Pattern describes: objects need to contact another object, knowing only the object's name or the name of the service it provides, but not how to contact it. Provide a service that takes the name of an object, service or role and returns a remote proxy that encapsulates the knowledge of how to contact the named object. It's the same basic publish/find model that forms the basis of a Service Oriented Architecture (SOA) and of the services layer in OSGi.
You normally implement a registry using a singleton object; the singleton is told, at compile time or at startup, the names of the objects and the way to construct them. Then you can use it to create the objects on demand.
For example:
#include <map>
#include <string>
#include <boost/function.hpp>
template<class T>
class Registry
{
typedef boost::function0<T *> Creator;
typedef std::map<std::string, Creator> Creators;
Creators _creators;
public:
// "register" is a C++ keyword, so the method needs another name
void registerCreator(const std::string &className, const Creator &creator);
T *create(const std::string &className);
};
You register the names of the objects and the creation functions like so:
Registry<I> registry;
registry.registerCreator("MyClass", &MyClass::Creator);
std::auto_ptr<I> myT(registry.create("MyClass"));
We might then simplify this with clever macros to enable it to be done at compile time. ATL uses the Registry Pattern for CoClasses which can be created at runtime by name - the registration is as simple as using something like the following code:
OBJECT_ENTRY_AUTO(someClassID, SomeClassName);
This macro is placed in your header file somewhere, magic causes it to be registered with the singleton at the time the COM server is started.
A way to implement this is hard-coding a mapping from class 'names' to a factory function. Templates may make the code shorter. The STL may make the coding easier.
#include "BaseObject.h"
#include "CommonClasses.h"
template< typename T > BaseObject* fCreate( int param1, bool param2 ) {
return new T( param1, param2 );
}
typedef BaseObject* (*tConstructor)( int param1, bool param2 );
struct Mapping { string classname; tConstructor constructor;
pair<string,tConstructor> makepair()const {
return make_pair( classname, constructor );
}
} mapping[] =
{ { "class1", &fCreate<Class1> }
, { "class2", &fCreate<Class2> }
// , ...
};
map< string, tConstructor > constructors;
transform( mapping, mapping+_countof(mapping),
inserter( constructors, constructors.begin() ),
mem_fun_ref( &Mapping::makepair ) );
EDIT -- upon general request :) a little rework to make things look smoother (credits to Stone Free, who probably didn't want to add an answer himself)
typedef BaseObject* (*tConstructor)( int param1, bool param2 );
struct Mapping {
string classname;
tConstructor constructor;
operator pair<string,tConstructor> () const {
return make_pair( classname, constructor );
}
} mapping[] =
{ { "class1", &fCreate<Class1> }
, { "class2", &fCreate<Class2> }
// , ...
};
static const map< string, tConstructor > constructors(
begin(mapping), end(mapping) ); // added a flavor of C++0x, too.
Why not use an object factory?
In its simplest form:
BaseClass* myFactory(std::string const& classname, params...)
{
if(classname == "Class1"){
return new Class1(params...);
}else if(...){
return new ...;
}else{
//Throw or return null
}
return NULL;
}
In C++, this decision must be made at compile time.
During compile time you could use a typedef rather than a macro:
typedef DefaultClass MyDefaultClassToUse;
this is equivalent and avoids a macro (macros bad ;-)).
If the decision is to be made during run time, you need to write your own code to support it. The simplest solution is a function that tests the string and instantiates the respective class.
An extended version of that (allowing independent code sections to register their classes) would be a map<name, factory function pointer>.
You mention two possibilities - command line and compilation macro - but the solution for each one is vastly different.
If the choice is made by a compilation macro then it's a simple problem which can be solved with #defines and #ifdefs and the like. The solution you propose is as good as any.
But if the choice is made in run-time using a command line argument then you need to have some Factory framework that is able to receive a string and create the appropriate object. This can be done using a simple, static if().. else if()... else if()... chain that has all the possibilities or can be a fully dynamic framework where objects register and are being cloned to provide new instances of themselves.
Though the question is now more than four years old, it is still useful, because calling code that was unknown when the main program was compiled and linked is a very common scenario these days. One solution to this question isn't mentioned at all, so I'd like to point the audience to a different kind of solution, one not built into C++. C++ itself has no capability equivalent to Class.forName() known from Java or Activator.CreateInstance(type) known from .NET, for the reasons already mentioned: there is no VM supervising things and JIT-compiling code on the fly. But LLVM, the low-level virtual machine, gives you the tools and libraries needed to read in a compiled library. Basically, you need to execute two steps:
Compile the C/C++ source code you would like to instantiate dynamically. You need to compile it to bitcode, so you end up with, let's say, foo.bc. You can do this with clang by providing a compiler switch: clang -emit-llvm -o foo.bc -c foo.c
Then use the ParseIRFile() method from llvm/IRReader/IRReader.h to parse the foo.bc file and get at the relevant functions (LLVM itself only knows about functions, since bitcode is a direct abstraction of CPU opcodes and quite dissimilar to more high-level intermediate representations like Java bytecode). Refer for instance to this article for a more complete code description.
After setting up the steps sketched above, you can also dynamically call, from C++, functions and methods that were previously unknown.
In the past, I've implemented the Factory pattern in such a way that classes can self-register at runtime without the factory itself having to know specifically about them. The key is to use a non-standard compiler feature called (IIRC) "attachment by initialisation", wherein you declare a dummy static variable in the implementation file for each class (e.g. a bool), and initialise it with a call to the registration routine.
In this scheme, each class has to #include the header containing its factory, but the factory knows about nothing except the interface class. You can literally add or remove implementation classes from your build, and recompile with no code changes.
The catch is that only some compilers support attachment by initialisation - IIRC others initialise file-scope variables on first use (the same way function-local statics work), which is no help here since the dummy variable is never accessed and the factory map will always be found empty.
The compilers I'm interested in (MSVC and GCC) do support this, though, so it's not a problem for me. You'll have to decide for yourself whether this solution suits you.
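For what it's worth, the registration trick looks roughly like this (a simplified sketch; it assumes, as discussed, that the compiler actually runs the file-scope initializer):

// factory.h -- the factory knows only about Base
#include <map>
#include <string>

class Base { public: virtual ~Base() {} };

typedef Base* (*CreatorFn)();

inline std::map<std::string, CreatorFn>& factoryMap()
{
    static std::map<std::string, CreatorFn> m; // constructed on first use
    return m;
}

inline bool registerClass(const std::string& name, CreatorFn fn)
{
    factoryMap()[name] = fn;
    return true;
}

// widget.cpp -- the implementation file registers itself via a dummy static
class Widget : public Base { };
static Base* createWidget() { return new Widget; }
static bool widgetRegistered = registerClass("Widget", &createWidget);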

Using C++ DLLs with different compiler versions

This question is related to "How to make consistent dll binaries across VS versions ?"
We have applications and DLLs built with VC6 and a new application built with VC9. The VC9 app has to use DLLs compiled with VC6, most of which are written in C and one in C++.
The C++ lib is problematic due to name decoration/mangling issues.
Compiling everything with VC9 is currently not an option as there appear to be some side effects. Resolving these would be quite time consuming.
I can modify the C++ library, however it must be compiled with VC6.
The C++ lib is essentially an OO-wrapper for another C library. The VC9-app uses some static functions as well as some non-static.
While the static functions can be handled with something like
// Header file
class DLL_API Foo
{
public:
static int init();
};
extern "C"
{
int DLL_API Foo_init();
}
// Implementation file
int Foo_init()
{
return Foo::init();
}
it's not that easy with the non-static methods.
As I understand it, Chris Becke's suggestion of using a COM-like interface won't help me because the interface member names will still be decorated and thus inaccessible from a binary created with a different compiler. Am I right there?
Would the only solution be to write a C-style DLL interface using handlers to the objects or am I missing something?
In that case, I guess, I would probably have less effort with directly using the wrapped C-library.
The biggest problem to consider when using a DLL compiled with a different C++ compiler than the calling EXE is memory allocation and object lifetime.
I'm assuming that you can get past the name mangling (and calling convention), which isn't difficult if you use a compiler with compatible mangling (I think VC6 is broadly compatible with VS2008), or if you use extern "C".
Where you'll run into problems is when you allocate something using new (or malloc) from the DLL, and then you return this to the caller. The caller's delete (or free) will attempt to free the object from a different heap. This will go horribly wrong.
You can either do a COM-style IFoo::Release thing, or a MyDllFree() thing. Both of these, because they call back into the DLL, will use the correct implementation of delete (or free()), so they'll delete the correct object.
Or, you can make sure that you use LocalAlloc (for example), so that the EXE and the DLL are using the same heap.
Interface member names will not be decorated -- they're just offsets in a vtable. You can define an interface (using a C struct, rather than a COM "interface") in a header file, thusly:
struct IFoo {
virtual int Init() = 0;
};
Then, you can export a function from the DLL, with no mangling:
class CFoo : public IFoo { /* ... */ };
extern "C" IFoo * __stdcall GetFoo() { return new CFoo(); }
This will work fine, provided that you're using a compiler that generates compatible vtables. Microsoft C++ has generated the same format vtable since (at least, I think) MSVC6.1 for DOS, where the vtable is a simple list of pointers to functions (with thunking in the multiple-inheritance case). GNU C++ (if I recall correctly) generates vtables with function pointers and relative offsets. These are not compatible with each other.
Well, I think Chris Becke's suggestion is just fine. I would not use Roger's first solution, which uses an interface in name only and, as he mentions, can run into problems of incompatible compiler-handling of abstract classes and virtual methods. Roger points to the attractive COM-consistent case in his follow-on.
The pain point: You need to learn to make COM interface requests and deal properly with IUnknown, relying on at least IUnknown::AddRef and IUnknown::Release. If the implementations of interfaces can support more than one interface or if methods can also return interfaces, you may also need to become comfortable with IUnknown::QueryInterface.
Here's the key idea. All of the programs that use the implementation of the interface (but don't implement it) use a common #include "*.h" file that defines the interface as a struct (C) or a C/C++ class (VC++) or struct (non-VC++ but C++). The *.h file automatically adapts appropriately depending on whether you are compiling a C language program or a C++ language program. You don't have to know about that part simply to use the *.h file. What the *.h file does is define the interface struct or type, let's say IFoo, with its virtual member functions (and only functions, no direct visibility to data members in this approach).
The header file is constructed to honor the COM binary standard in a way that works for C and that works for C++ regardless of the C++ compiler that is used. (The Java JNI folk figured this one out.) This means that it works between separately-compiled modules of any origin so long as a struct consisting entirely of function-entry pointers (a vtable) is mapped to memory the same by all of them (so they have to be all x86 32-bit, or all x64, for example).
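The usual trick in such a header is to spell the interface both ways; heavily simplified (no IUnknown, no helper macros, Windows calling convention assumed), it looks something like this:

/* ifoo.h -- usable from both C and C++ */
#ifdef __cplusplus

struct IFoo {
    virtual int  __stdcall Init() = 0;
    virtual void __stdcall Release() = 0;
};

#else

/* In C the same binary layout is spelled out as an explicit vtable struct. */
typedef struct IFoo IFoo;
typedef struct IFooVtbl {
    int  (__stdcall *Init)(IFoo* self);
    void (__stdcall *Release)(IFoo* self);
} IFooVtbl;
struct IFoo { const IFooVtbl* lpVtbl; };

#endif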
In the DLL that implements the COM interface via a wrapper class of some sort, you only need a factory entry point. Something like an
extern "C" HRESULT MkIFooImplementation(void **ppv);
which returns an HRESULT (you'll need to learn about those too) and will also return an IFoo interface pointer via the *ppv location you provide for receiving it. (I am skimming and there are more careful details that you'll need here. Don't trust my syntax.) The actual function prototype that you use for this is also declared in the *.h file.
The point is that the factory entry, which is always an undecorated extern "C" function, does all of the necessary wrapper class creation and then delivers an IFoo interface pointer to the location that you specify. This means that all memory management for creation of the class, and all memory management for finalizing it, etc., will happen in the DLL where you build the wrapper. This is the only place where you have to deal with those details.
When you get an OK result from the factory function, you have been issued an interface pointer and it has already been reserved for you (there is an implicit IFoo::AddRef operation already performed on behalf of the interface pointer you were delivered).
When you are done with the interface, you release it with a call on the IFoo::Release method of the interface. It is the final release implementation (in case you made more AddRef'd copies) that will tear down the class and its interface support in the factory DLL. This is what gets you correct reliance on consistent dynamic storage allocation and release behind the interface, whether or not the DLL containing the factory function uses the same libraries as the calling code.
You should probably implement IUnknown::QueryInterface (as method IFoo::QueryInterface) too, even if it always fails. If you want to be more sophisticated with using the COM binary interface model as you gain more experience, you can learn to provide full QueryInterface implementations.
This is probably too much information, but I wanted to point out that a lot of the problems you are facing about heterogeneous implementations of DLLs are resolved in the definition of the COM binary interface and even if you don't need all of it, the fact that it provides worked solutions is valuable. In my experience, once you get the hang of this, you will never forget how powerful this can be in C++ and C++ interop situations.
I haven't sketched the resources you might need to consult for examples and what you have to learn in order to make *.h files and to actually implement factory-function wrappers of the libraries you want to share. If you want to dig deeper, holler.
There are other things you need to consider too, such as which run-times are being used by the various libraries. If no objects are being shared that's fine, but that seems quite unlikely at first glance.
Chris Becke's suggestions are pretty accurate - using an actual COM interface may help you get the binary compatibility you need. Your mileage may vary :)
Not fun, man. You are in for a lot of frustration. You should probably give this:
"Would the only solution be to write a C-style DLL interface using handlers to the objects or am I missing something? In that case, I guess, I would probably have less effort with directly using the wrapped C-library."
a really close look. Good luck.