Instantiate class from name? - c++

Imagine I have a bunch of related C++ classes (all extending the same base class and providing the same constructor) that I declared in a common header file (which I include), with their implementations in some other files (which I compile and link statically as part of the build of my program).
I would like to be able to instantiate one of them passing the name, which is a parameter that has to be passed to my program (either as command line or as a compilation macro).
The only possible solution I see is to use a macro:
#ifndef CLASS_NAME
#define CLASS_NAME MyDefaultClassToUse
#endif
BaseClass* o = new CLASS_NAME(param1, param2, ..);
Is this the only viable approach?

This is a problem which is commonly solved using the Registry Pattern:
This is the situation that the Registry Pattern describes:
Objects need to contact another object, knowing only the object’s name or the name of the service it provides, but not how to contact it. Provide a service that takes the name of an object, service or role and returns a remote proxy that encapsulates the knowledge of how to contact the named object.
It’s the same basic publish/find model that forms the basis of a Service Oriented Architecture (SOA) and for the services layer in OSGi.
You normally implement a registry as a singleton object. The singleton is told, at compile time or at startup time, the names of the objects and the way to construct them. You can then use it to create objects on demand.
For example:
#include <map>
#include <memory>
#include <string>
#include <boost/function.hpp>

template<class T>
class Registry
{
    typedef boost::function0<T *> Creator;
    typedef std::map<std::string, Creator> Creators;
    Creators _creators;
public:
    // "register" is a reserved keyword in C++, so the method needs another name
    void registerClass(const std::string &className, const Creator &creator);
    T *create(const std::string &className);
};
You register the names of the objects and the creation functions like so:
Registry<I> registry;
registry.registerClass("MyClass", &MyClass::Creator);
std::auto_ptr<I> myI(registry.create("MyClass"));
We might then simplify this with clever macros to enable it to be done at compile time. ATL uses the Registry Pattern for CoClasses which can be created at runtime by name - the registration is as simple as using something like the following code:
OBJECT_ENTRY_AUTO(someClassID, SomeClassName);
This macro is placed in your header file somewhere; magic causes the class to be registered with the singleton at the time the COM server is started.
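For readers on C++11 or later, the same idea can be sketched without Boost or std::auto_ptr (which was removed in C++17). The names used here (Animal, Dog, AnimalRegistry, makeRegistry) are made up for illustration and do not come from the answer above, so treat this as a minimal sketch rather than a definitive implementation.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical base class and one implementation, for illustration only.
struct Animal {
    virtual ~Animal() = default;
    virtual std::string speak() const = 0;
};
struct Dog : Animal {
    std::string speak() const override { return "Woof"; }
};

// Maps a class name to a function that knows how to build the object.
class AnimalRegistry {
public:
    using Creator = std::function<std::unique_ptr<Animal>()>;

    void add(const std::string& name, Creator creator) {
        creators_[name] = std::move(creator);
    }

    // Returns nullptr when the name was never registered.
    std::unique_ptr<Animal> create(const std::string& name) const {
        auto it = creators_.find(name);
        if (it == creators_.end()) return nullptr;
        return it->second();
    }

private:
    std::map<std::string, Creator> creators_;
};

// Builds a registry that knows about Dog; a real program would add every class here.
inline AnimalRegistry makeRegistry() {
    AnimalRegistry registry;
    registry.add("Dog", [] { return std::unique_ptr<Animal>(new Dog()); });
    return registry;
}
```

Calling create("Dog") on the registry returned by makeRegistry() yields a Dog; an unknown name yields a null pointer, which the caller has to check.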

A way to implement this is hard-coding a mapping from class 'names' to a factory function. Templates may make the code shorter. The STL may make the coding easier.
#include "BaseObject.h"
#include "CommonClasses.h"
template< typename T > BaseObject* fCreate( int param1, bool param2 ) {
return new T( param1, param2 );
}
typedef BaseObject* (*tConstructor)( int param1, bool param2 );
struct Mapping { string classname; tConstructor constructor;
pair<string,tConstructor> makepair()const {
return make_pair( classname, constructor );
}
} mapping[] =
{ { "class1", &fCreate<Class1> }
, { "class2", &fCreate<Class2> }
// , ...
};
map< string, tConstructor > constructors;
transform( mapping, mapping+_countof(mapping),
inserter( constructors, constructors.begin() ),
mem_fun_ref( &Mapping::makepair ) );
EDIT -- upon general request :) a little rework to make things look smoother (credits to Stone Free who probably didn't want to add an answer himself)
typedef BaseObject* (*tConstructor)( int param1, bool param2 );
struct Mapping {
string classname;
tConstructor constructor;
operator pair<string,tConstructor> () const {
return make_pair( classname, constructor );
}
} mapping[] =
{ { "class1", &fCreate<Class1> }
, { "class2", &fCreate<Class2> }
// , ...
};
static const map< string, tConstructor > constructors(
begin(mapping), end(mapping) ); // added a flavor of C++0x, too.
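With a C++11 compiler the table can be written directly as a brace-initialised map, which removes the need for the Mapping helper struct. The sketch below is self-contained, so it invents small stand-ins for BaseObject, Class1 and Class2 (the real ones live in headers not shown here) and adds a hypothetical createByName helper.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-ins for the classes from BaseObject.h / CommonClasses.h (assumed shapes).
struct BaseObject {
    virtual ~BaseObject() {}
    virtual int value() const = 0;
};
struct Class1 : BaseObject {
    Class1(int p1, bool) : v(p1) {}
    int value() const override { return v; }
    int v;
};
struct Class2 : BaseObject {
    Class2(int p1, bool p2) : v(p2 ? -p1 : p1) {}
    int value() const override { return v; }
    int v;
};

template <typename T>
BaseObject* fCreate(int param1, bool param2) { return new T(param1, param2); }

typedef BaseObject* (*tConstructor)(int param1, bool param2);

// The name-to-factory table, built once on first use.
const std::map<std::string, tConstructor>& constructors() {
    static const std::map<std::string, tConstructor> table = {
        {"class1", &fCreate<Class1>},
        {"class2", &fCreate<Class2>},
    };
    return table;
}

// Hypothetical helper: returns nullptr for an unknown name; the caller owns the result.
BaseObject* createByName(const std::string& name, int p1, bool p2) {
    auto it = constructors().find(name);
    return it == constructors().end() ? nullptr : it->second(p1, p2);
}
```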

Why not use an object factory?
In its simplest form:
BaseClass* myFactory(std::string const& classname, params...)
{
if(classname == "Class1"){
return new Class1(params...);
}else if(...){
return new ...;
}else{
//Throw or return null
}
return NULL;
}

In C++, this decision must be made at compile time.
At compile time you could use a typedef rather than a macro:
typedef DefaultClass MyDefaultClassToUse;
This is equivalent and avoids a macro (macros bad ;-)).
If the decision is to be made at run time, you need to write your own code to support it. The simplest solution is a function that tests the string and instantiates the respective class.
An extended version of that (allowing independent code sections to register their classes) would be a map<name, factory function pointer>.

You mention two possibilities, command line and compilation macro, but the solution for each one is vastly different.
If the choice is made by a compilation macro then it's a simple problem which can be solved with #defines and #ifdefs and the like. The solution you propose is as good as any.
But if the choice is made in run-time using a command line argument then you need to have some Factory framework that is able to receive a string and create the appropriate object. This can be done using a simple, static if().. else if()... else if()... chain that has all the possibilities or can be a fully dynamic framework where objects register and are being cloned to provide new instances of themselves.
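The two cases can also be combined: take the name from the command line when one is given and fall back to a compile-time default otherwise. This is only a sketch; Fast, Safe, DEFAULT_CLASS_NAME, makeByName and makeFromArgs are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical class family, for illustration only.
struct BaseClass {
    virtual ~BaseClass() = default;
    virtual std::string id() const = 0;
};
struct Fast : BaseClass { std::string id() const override { return "Fast"; } };
struct Safe : BaseClass { std::string id() const override { return "Safe"; } };

// Compile-time default; can be overridden by defining the macro on the compiler command line.
#ifndef DEFAULT_CLASS_NAME
#define DEFAULT_CLASS_NAME "Safe"
#endif

// The static if / else if chain: returns nullptr for an unknown name.
std::unique_ptr<BaseClass> makeByName(const std::string& name) {
    if (name == "Fast") return std::unique_ptr<BaseClass>(new Fast());
    if (name == "Safe") return std::unique_ptr<BaseClass>(new Safe());
    return nullptr;
}

// Uses argv[1] when present, otherwise the compile-time default.
std::unique_ptr<BaseClass> makeFromArgs(int argc, const char* const* argv) {
    const std::string name = (argc > 1) ? argv[1] : DEFAULT_CLASS_NAME;
    return makeByName(name);
}
```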

Although this question is now more than four years old, it is still useful, because calling code that was unknown when the main program was compiled and linked is a very common scenario these days. One solution hasn't been mentioned at all, so I'd like to point to a different kind of solution that is not built into C++. C++ itself has nothing like Class.forName() from Java or Activator.CreateInstance(type) from .NET, because there is no VM supervising execution that could JIT code on the fly. LLVM, the low level virtual machine, nevertheless gives you the tools and libraries needed to read in a compiled library. Basically, you need two steps:
Compile the C/C++ source code that you want to instantiate dynamically to bitcode, so that you end up with, let's say, foo.bc. You can do this with clang by passing a compiler switch: clang -emit-llvm -o foo.bc -c foo.c
Then use the ParseIRFile() function from llvm/IRReader/IRReader.h to parse the foo.bc file and get at the relevant functions (LLVM itself only knows functions, as the bitcode is a direct abstraction of CPU opcodes and quite dissimilar to higher-level intermediate representations like Java bytecode). Refer for instance to this article for a more complete code description.
Once the steps sketched above are set up, you can dynamically call previously unknown functions and methods from C++ as well.

In the past, I've implemented the Factory pattern in such a way that classes can self-register at runtime without the factory itself having to know specifically about them. The key is to use a non-standard compiler feature called (IIRC) "attachment by initialisation", wherein you declare a dummy static variable in the implementation file for each class (e.g. a bool), and initialise it with a call to the registration routine.
In this scheme, each class has to #include the header containing its factory, but the factory knows about nothing except the interface class. You can literally add or remove implementation classes from your build, and recompile with no code changes.
The catch is that only some compilers support attachment by initialisation - IIRC others initialise file-scope variables on first use (the same way function-local statics work), which is no help here since the dummy variable is never accessed and the factory map will always be found empty.
The compilers I'm interested in (MSVC and GCC) do support this, though, so it's not a problem for me. You'll have to decide for yourself whether this solution suits you.
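A minimal single-file sketch of this self-registration trick follows. All names (Interface, Widget, factoryMap, registerClass, createInstance) are invented for the example. In a real project the Widget part would sit in its own .cpp file, and whether its dummy variable is initialised before main() is exactly the compiler-dependent point described above.

```cpp
#include <cassert>
#include <map>
#include <string>

struct Interface {
    virtual ~Interface() {}
    virtual std::string name() const = 0;
};

typedef Interface* (*FactoryFn)();

// Function-local static: the map is constructed on first use, so it is
// ready no matter which file's registration happens to run first.
std::map<std::string, FactoryFn>& factoryMap() {
    static std::map<std::string, FactoryFn> m;
    return m;
}

// Returns a bool only so that it can initialise a dummy variable.
bool registerClass(const std::string& name, FactoryFn fn) {
    factoryMap()[name] = fn;
    return true;
}

// Returns nullptr for an unknown name; the caller owns the result.
Interface* createInstance(const std::string& name) {
    std::map<std::string, FactoryFn>::const_iterator it = factoryMap().find(name);
    return it == factoryMap().end() ? nullptr : it->second();
}

// --- What would normally live in widget.cpp ---
struct Widget : Interface {
    std::string name() const { return "Widget"; }
};
static Interface* makeWidget() { return new Widget; }
// The dummy variable whose initialiser performs the registration.
static const bool widgetRegistered = registerClass("Widget", &makeWidget);
```

The factory code never mentions Widget; removing the last three definitions removes the class from the factory without touching anything else.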

Related

Dynamic Libraries, plugin frameworks, and function pointer casting in c++

I am trying to create a very open plugin framework in c++, and it seems to me that I have come up with a way to do so, but a nagging thought keeps telling me that there is something very, very wrong with what I am doing, and it either won't work or it will cause problems.
The design I have for my framework consists of a Kernel that calls each plugin's init function. The init function then turns around and uses the Kernel's registerPlugin and registerFunction to get a unique id and then register each function the plugin wants to be accessible using that id, respectively.
The function registerPlugin returns the unique id. The function registerFunction takes that id, the function name, and a generic function pointer, like so:
bool registerFunction(int plugin_id, string function_name, plugin_function func){}
where plugin_function is
typedef void (*plugin_function)();
The kernel then takes the function pointer and puts it in a map with the function_name and plugin_id. All plugins registering their function must cast the function to type plugin_function.
In order to retrieve the function, a different plugin calls the Kernel's
plugin_function getFunction(string plugin_name, string function_name);
Then that plugin must cast the plugin_function to its original type so it can be used. It knows (in theory) what the correct type is by having access to a .h file outlining all the functions the plugin makes available. Plugins, by the by, are implemented as dynamic libraries.
Is this a smart way to accomplish the task of allowing different plugins to connect with each other? Or is this a crazy and really terrible programming technique? If it is, please point me in the direction of the correct way to accomplish this.
EDIT: If any clarification is needed, ask and it will be provided.
Function pointers are strange creatures. They're not necessarily the same size as data pointers, and hence cannot be safely cast to void* and back. But, the C++ (and C) specifications allow any function pointer to be safely cast to another function pointer type (though you have to later cast it back to the earlier type before calling it if you want defined behaviour). This is akin to the ability to safely cast any data pointer to void* and back.
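As a small illustration of that round trip, here is a sketch that stores a function under a generic function pointer type and casts it back before calling. The helper names (add, binary_op, eraseType, restoreType) are made up for the example.

```cpp
#include <cassert>

typedef void (*plugin_function)();      // generic type, used only for storage
typedef int (*binary_op)(int, int);     // the real type of the stored function

int add(int a, int b) { return a + b; }

// Erase the real signature so the pointer can sit in a generic table.
plugin_function eraseType(binary_op fn) {
    return reinterpret_cast<plugin_function>(fn);
}

// Restore the original signature. Calling through any type other than
// the original one is undefined behaviour.
binary_op restoreType(plugin_function fn) {
    return reinterpret_cast<binary_op>(fn);
}
```

The stored plugin_function must never be called directly; it is only safe to call the pointer returned by restoreType, and only if binary_op really is the type the function was defined with.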
Pointers to methods are where it gets really hairy: a method pointer might be larger than a normal function pointer, depending on the compiler, whether the application is 32- or 64-bit, etc. But even more interesting is that, even on the same compiler/platform, not all method pointers are the same size: Method pointers to virtual functions may be bigger than normal method pointers; if multiple inheritance (with e.g. virtual inheritance in the diamond pattern) is involved, the method pointers can be even bigger. This varies with compiler and platform too. This is also the reason that it's difficult to create function objects (that wrap arbitrary methods as well as free functions) especially without allocating memory on the heap (it's just possible using template sorcery).
So, by using function pointers in your interface, it becomes impractical for the plugin authors to pass back method pointers to your framework, even if they're using the same compiler. This might be an acceptable constraint; more on this later.
Since there's no guarantee that function pointers will be the same size from one compiler to the next, by registering function pointers you're limiting the plugin authors to compilers that implement function pointers having the same size as your compiler does. This wouldn't necessarily be so bad in practice, since function pointer sizes tend to be stable across compiler versions (and may even be the same for multiple compilers).
The real problems start to arise when you want to call the functions pointed to by the function pointers; you can't safely call the function at all if you don't know its true signature (you will get poor results ranging from "not working" to segmentation faults). So, the plugin authors would be further limited to registering only void functions that take no parameters.
It gets worse: the way a function call actually works at the assembler level depends on more than just the signature and function pointer size. There's also the calling convention, the way exceptions are handled (the stack needs to be properly unwound when an exception is thrown), and the actual interpretation of the bytes of function pointer (if it's larger than a data pointer, what do the extra bytes signify? In what order?). At this point, the plugin author is pretty much limited to using the same compiler (and version!) that you are, and needs to be careful to match the calling convention and exception handling options (with the MSVC++ compiler, for example, exception handling is only explicitly enabled with the /EHsc option), as well as use only normal function pointers with the exact signature you define.
All the restrictions so far can be considered reasonable, if a bit limiting. But we're not done yet.
If you throw in std::string (or almost any part of the STL), things get even worse though, because even with the same compiler (and version), there are several different flags/macros that control the STL; these flags can affect the size and meaning of the bytes representing string objects. It is, in effect, like having two different struct declarations in separate files, each with the same name, and hoping they'll be interchangeable; obviously, this doesn't work. An example flag is _HAS_ITERATOR_DEBUGGING. Note that these options can even change between debug and release mode! These types of errors don't always manifest themselves immediately/consistently and can be very difficult to track down.
You also have to be very careful with dynamic memory management across modules, since new in one project may be defined differently from new in another project (e.g. it may be overloaded). When deleting, you might have a pointer to an interface with a virtual destructor, meaning the vtable is needed to properly delete the object, and different compilers all implement the vtable stuff differently. In general, you want the module that allocates an object to be the one to deallocate it; more specifically, you want the code that deallocates an object to have been compiled under the exact same conditions as the code that allocated it. This is one reason std::shared_ptr can take a "deleter" argument when it is constructed -- because even with the same compiler and flags (the only guaranteed safe way to share shared_ptrs between modules), new and delete may not be the same everywhere the shared_ptr can get destroyed. With the deleter, the code that creates the shared pointer controls how it is eventually destroyed too. (I just threw this paragraph in for good measure; you don't seem to be sharing objects across module boundaries.)
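The deleter idea can be sketched as follows. Widget, createWidget, destroyWidget and makeSharedWidget are hypothetical names, and the counter exists only to make the deleter call observable; in a real setup createWidget and destroyWidget would both be exported from the same module.

```cpp
#include <cassert>
#include <memory>

struct Widget { int id; };

int g_destroyed = 0;   // counts calls to destroyWidget, for demonstration only

// Allocation and deallocation live side by side, as they would inside one module.
Widget* createWidget(int id) { return new Widget{id}; }
void destroyWidget(Widget* w) { delete w; ++g_destroyed; }

std::shared_ptr<Widget> makeSharedWidget(int id) {
    // The deleter travels with the pointer, so destruction goes back through
    // destroyWidget wherever the last reference happens to be dropped.
    return std::shared_ptr<Widget>(createWidget(id), &destroyWidget);
}
```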
All of this is a consequence of C++ having no standard binary interface (ABI); it's a free-for-all, where it is very easy to shoot yourself in the foot (sometimes without realising it).
So, is there any hope? You betcha! You can expose a C API to your plugins instead, and have your plugins also expose a C API. This is quite nice because a C API can be interoperated with from virtually any language. You don't have to worry about exceptions, apart from making sure they can't bubble up above the plugin functions (that's the authors' concern), and it's stable no matter the compiler/options (assuming you don't pass STL containers and the like). There's only one standard calling convention (cdecl), which is the default for functions declared extern "C". void*, in practice, will be the same across all compilers on the same platform (e.g. 8 bytes on x64).
You (and the plugin authors) can still write your code in C++, as long as all the external communication between the two uses a C API (i.e. pretends to be a C module for the purposes of interop).
C function pointers are also likely compatible between compilers in practice, though if you'd rather not depend on this you could have the plugin register a function name (const char*) instead of address, and then you could extract the address yourself using, e.g., LoadLibrary with GetProcAddress for Windows (similarly, Linux and Mac OS X have dlopen and dlsym). This works because name-mangling is disabled for functions declared with extern "C".
Note that there's no direct way around restricting the registered functions to be of a single prototype type (otherwise, as I've said, you can't call them properly). If you need to give a particular parameter to a plugin function (or get a value back), you'll need to register and call the different functions with different prototypes separately (though you could collapse all the function pointers down to a common function pointer type internally, and only cast back at the last minute).
Finally, while you cannot directly support method pointers (which don't even exist in a C API, but are of variable size even with a C++ API and thus cannot be easily stored), you can allow the plugins to supply a "user-data" opaque pointer when registering their function, which is passed to the function whenever it's called; this gives the plugin authors an easy way to write function wrappers around methods and store the object to apply the method to in the user-data parameter. The user-data parameter can also be used for anything else the plugin author wants, which makes your plugin system much easier to interface with and extend. Another example use is to adapt between different function prototypes using a wrapper and extra arguments stored in the user-data.
These suggestions lead to code something like this (for Windows -- the code is very similar for other platforms):
// Shared header
extern "C" {
typedef void (*plugin_function)(void*);
bool registerFunction(int plugin_id, const char* function_name, void* user_data);
}
// Your plugin registration code
hModule = LoadLibrary(pluginDLLPath);
// Your plugin function registration code
auto pluginFunc = (plugin_function)GetProcAddress(hModule, function_name);
// Store pluginFunc and user_data in a map keyed to function_name
// Calling a plugin function
pluginFunc(user_data);
// Declaring a plugin function
extern "C" void aPluginFunction(void*);
class Foo { public: void doSomething() { } };
// Defining a plugin function
void aPluginFunction(void* user_data)
{
static_cast<Foo*>(user_data)->doSomething();
}
Sorry for the length of this reply; most of it can be summed up with "the C++ standard doesn't extend to interoperation; use C instead since it at least has de facto standards."
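To try the user-data idea without any DLL loading, here is a platform-neutral sketch in which the "kernel" table and the "plugin" live in one file. It assumes a simplified registerFunction that takes the function pointer directly rather than a name to look up, and g_functions, callFunction, Counter and counterIncrement are invented for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

extern "C" {
    typedef void (*plugin_function)(void* user_data);
}

// Kernel side: each entry keeps the function together with its user data.
struct Entry { plugin_function fn; void* user_data; };
std::map<std::string, Entry> g_functions;

// Returns false if the name is already taken.
bool registerFunction(const std::string& name, plugin_function fn, void* user_data) {
    return g_functions.insert(std::make_pair(name, Entry{fn, user_data})).second;
}

// Returns false if nothing is registered under the name.
bool callFunction(const std::string& name) {
    auto it = g_functions.find(name);
    if (it == g_functions.end()) return false;
    it->second.fn(it->second.user_data);
    return true;
}

// Plugin side: a plain function that forwards to a method on the object
// passed through the opaque user-data pointer.
class Counter {
public:
    void increment() { ++count; }
    int count = 0;
};
extern "C" void counterIncrement(void* user_data) {
    static_cast<Counter*>(user_data)->increment();
}
```

The kernel never learns what a Counter is; it only stores and passes back the pointer it was given, so the object must outlive its registration.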
Note: Sometimes it's simplest just to design a normal C++ API (with function pointers or interfaces or whatever you like best) under the assumption that the plugins will be compiled under exactly the same circumstances; this is reasonable if you expect all the plugins to be developed by yourself (i.e. the DLLs are part of the project core). This could also work if your project is open-source, in which case everybody can independently choose a cohesive environment under which the project and the plugins are compiled -- but then this makes it hard to distribute plugins except as source code.
Update: As pointed out by ern0 in the comments, it's possible to abstract the details of the module interoperation (via a C API) so that both the main project and the plugins deal with a simpler C++ API. What follows is an outline of such an implementation:
// iplugin.h -- shared between the project and all the plugins
class IPlugin {
public:
    // "register" is a reserved keyword in C++, hence the name onRegister
    virtual void onRegister() { }
    virtual void initialize() = 0;
    // Your application-specific functionality here:
    virtual void onCheeseBurgerEatenEvent() { }
};
// C API (on Windows these must also be exported from the DLL,
// e.g. with __declspec(dllexport) or a .def file):
extern "C" {
    // Returns the number of plugins in this module
    int getPluginCount();
    // Called to register the nth plugin of this module.
    // A user-data pointer is expected in return (may be null).
    void* registerPlugin(int pluginIndex);
    // Called to initialize the nth plugin of this module
    void initializePlugin(int pluginIndex, void* userData);
    void onCheeseBurgerEatenEvent(int pluginIndex, void* userData);
}
// pluginimplementation.h -- plugin authors inherit from this abstract base class
#include "iplugin.h"
class PluginImplementation : public IPlugin {
public:
    PluginImplementation();
};
// pluginimplementation.cpp -- implements C API of plugin too
#include "pluginimplementation.h"
#include <vector>
struct LocalPluginRegistry {
    // Function-local static, so the vector exists before any static plugin
    // instance in another source file tries to add itself to it
    static std::vector<PluginImplementation*>& plugins() {
        static std::vector<PluginImplementation*> instances;
        return instances;
    }
};
PluginImplementation::PluginImplementation() {
    LocalPluginRegistry::plugins().push_back(this);
}
extern "C" {
    int getPluginCount() {
        return static_cast<int>(LocalPluginRegistry::plugins().size());
    }
    void* registerPlugin(int pluginIndex) {
        auto plugin = LocalPluginRegistry::plugins()[pluginIndex];
        plugin->onRegister();
        return (void*)plugin;
    }
    void initializePlugin(int pluginIndex, void* userData) {
        auto plugin = static_cast<PluginImplementation*>(userData);
        plugin->initialize();
    }
    void onCheeseBurgerEatenEvent(int pluginIndex, void* userData) {
        auto plugin = static_cast<PluginImplementation*>(userData);
        plugin->onCheeseBurgerEatenEvent();
    }
}
// To declare a plugin in the DLL, just make a static instance:
class SomePlugin : public PluginImplementation {
    virtual void initialize() { }
};
SomePlugin plugin; // Will be created when the DLL is first loaded by a process
// plugin.h -- part of the main project source only
#include "iplugin.h"
#include <string>
#include <vector>
#include <windows.h>
class PluginRegistry;
class Plugin : public IPlugin {
public:
    Plugin(PluginRegistry* registry, int index, int moduleIndex)
        : registry(registry), index(index), moduleIndex(moduleIndex), userData(nullptr)
    {
    }
    virtual void onRegister();
    virtual void initialize();
    virtual void onCheeseBurgerEatenEvent();
private:
    PluginRegistry* registry;
    int index;
    int moduleIndex;
    void* userData;
};
class PluginRegistry {
public:
    void registerPluginsInModule(std::string const& modulePath);
    ~PluginRegistry();
public:
    std::vector<Plugin*> plugins;
private:
    // A linkage specification (extern "C") is not allowed at class scope,
    // so these are plain function-pointer typedefs
    typedef int (*getPluginCountFunc)();
    typedef void* (*registerPluginFunc)(int);
    typedef void (*initializePluginFunc)(int, void*);
    typedef void (*onCheeseBurgerEatenEventFunc)(int, void*);
    struct Module {
        getPluginCountFunc getPluginCount;
        registerPluginFunc registerPlugin;
        initializePluginFunc initializePlugin;
        onCheeseBurgerEatenEventFunc onCheeseBurgerEatenEvent;
        HMODULE handle;
    };
    friend class Plugin;
    std::vector<Module> registeredModules;
};
// plugin.cpp
#include "plugin.h"
void Plugin::onRegister() {
    auto func = registry->registeredModules[moduleIndex].registerPlugin;
    userData = func(index);
}
void Plugin::initialize() {
    auto func = registry->registeredModules[moduleIndex].initializePlugin;
    func(index, userData);
}
void Plugin::onCheeseBurgerEatenEvent() {
    auto func = registry->registeredModules[moduleIndex].onCheeseBurgerEatenEvent;
    func(index, userData);
}
void PluginRegistry::registerPluginsInModule(std::string const& modulePath) {
    // For Windows (LoadLibraryA takes a narrow string regardless of the UNICODE setting):
    HMODULE handle = LoadLibraryA(modulePath.c_str());
    if (!handle) return; // the DLL could not be loaded
    Module module;
    module.handle = handle;
    module.getPluginCount = (getPluginCountFunc)GetProcAddress(handle, "getPluginCount");
    module.registerPlugin = (registerPluginFunc)GetProcAddress(handle, "registerPlugin");
    module.initializePlugin = (initializePluginFunc)GetProcAddress(handle, "initializePlugin");
    module.onCheeseBurgerEatenEvent = (onCheeseBurgerEatenEventFunc)GetProcAddress(handle, "onCheeseBurgerEatenEvent");
    int moduleIndex = static_cast<int>(registeredModules.size());
    registeredModules.push_back(module);
    int pluginCount = module.getPluginCount();
    for (int i = 0; i < pluginCount; ++i) {
        auto plugin = new Plugin(this, i, moduleIndex);
        plugins.push_back(plugin);
    }
}
PluginRegistry::~PluginRegistry() {
    for (auto it = plugins.begin(); it != plugins.end(); ++it) {
        delete *it;
    }
    for (auto it = registeredModules.begin(); it != registeredModules.end(); ++it) {
        FreeLibrary(it->handle);
    }
}
// When discovering plugins (e.g. by loading all DLLs in a "plugins" folder):
PluginRegistry registry;
registry.registerPluginsInModule("plugins/cheeseburgerwatcher.dll");
for (auto it = registry.plugins.begin(); it != registry.plugins.end(); ++it) {
    (*it)->onRegister();
}
for (auto it = registry.plugins.begin(); it != registry.plugins.end(); ++it) {
    (*it)->initialize();
}
// And then, when a cheeseburger is actually eaten:
for (auto it = registry.plugins.begin(); it != registry.plugins.end(); ++it) {
    auto plugin = *it;
    plugin->onCheeseBurgerEatenEvent();
}
This has the benefit of using a C API for compatibility, but also offering a higher level of abstraction for plugins written in C++ (and for the main project code, which is C++). Note that it lets multiple plugins be defined in a single DLL. You could also eliminate some of the duplication of function names by using macros, but I chose not to for this simple example.
All of this, by the way, assumes plugins that have no interdependencies -- if plugin A affects (or is required by) plugin B, you need to devise a safe method for injecting/constructing dependencies as needed, since there's no way of guaranteeing what order the plugins will be loaded in (or initialized). A two-step process would work well in that case: Load and register all plugins; during registration of each plugin, let them register any services they provide. During initialization, construct requested services as needed by looking at the registered service table. This ensures that all services offered by all plugins are registered before any of them are attempted to be used, no matter what order plugins get registered or initialized in.
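The two-step process can be sketched like this. The service table is reduced to a map from a name to a plain function pointer, and Plugin, Provider, Consumer and loadAll are invented for the example (this Plugin is unrelated to the class in the outline above). The point is only that every registration pass finishes before any initialization pass starts.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical service table: service name -> function providing it.
typedef std::map<std::string, int (*)()> ServiceTable;

struct Plugin {
    virtual ~Plugin() {}
    virtual void registerServices(ServiceTable& table) = 0;    // step 1
    virtual bool initialize(const ServiceTable& table) = 0;    // step 2
};

int answer() { return 42; }

// Offers the "answer" service and needs nothing itself.
struct Provider : Plugin {
    void registerServices(ServiceTable& t) override { t["answer"] = &answer; }
    bool initialize(const ServiceTable&) override { return true; }
};

// Offers nothing and depends on the "answer" service.
struct Consumer : Plugin {
    int seen = 0;
    void registerServices(ServiceTable&) override {}
    bool initialize(const ServiceTable& t) override {
        auto it = t.find("answer");
        if (it == t.end()) return false;   // dependency missing
        seen = it->second();
        return true;
    }
};

// Registers every plugin first, then initializes every plugin.
bool loadAll(const std::vector<Plugin*>& plugins) {
    ServiceTable table;
    for (Plugin* p : plugins) p->registerServices(table);
    bool ok = true;
    for (Plugin* p : plugins) ok = p->initialize(table) && ok;
    return ok;
}
```

Even when the consumer is listed before the provider, its initialize() still finds the service, because registration of all plugins has already completed.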
The approach you took is sane in general, but I see a few possible improvements.
Your kernel should export C functions with a conventional calling convention (cdecl, or maybe stdcall if you are on Windows) for the registration of plugins and functions. If you use a C++ function then you are forcing all plugin authors to use the same compiler and compiler version that you use, since many things like C++ function name mangling, STL implementation and calling conventions are compiler specific.
Plugins should only export C functions like the kernel.
From the definition of getFunction it seems each plugin has a name, which other plugins can use to obtain its functions. This is not a safe practice, two developers can create two different plugins with the same name, so when a plugin asks for some other plugin by name it may get a different plugin than the expected one. A better solution would be for plugins to have a public GUID. This GUID can appear in each plugin's header file, so that other plugins can refer to it.
You have not implemented versioning. Ideally you want your kernel to be versioned because invariably you will change it in the future. When a plugin registers with the kernel it passes the version of the kernel API it was compiled against. The kernel then can decide if the plugin can be loaded. For example, if kernel version 1 receives a registration request for a plugin that requires kernel version 2 you have a problem, the best way to address that is to not allow the plugin to load since it may need kernel features that are not present in the older version. The reverse case is also possible, kernel v2 may or may not want to load plugins that were created for kernel v1, and if it does allow it it may need to adapt itself to the older API.
I'm not sure I like the idea of a plugin being able to locate another plugin and call its functions directly, as this breaks encapsulation. It seems better to me if plugins advertise their capabilities to the kernel, so that other plugins can find services they need by capability instead of by addressing other plugins by name or GUID.
Be aware that any plugin that allocates memory needs to provide a deallocation function for that memory. Each plugin could be using a different run-time library, so memory allocated by a plugin may be unknown to other plugins or the kernel. Having allocation and deallocation in the same module avoids problems.
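Two of the points above, versioning and paired allocation/deallocation, can be sketched with a small C-compatible descriptor that a plugin hands to the kernel. Everything here (PluginInfo, KERNEL_API_VERSION, canLoad, makeGreeting, freeGreeting) is hypothetical, and the acceptance policy is just one possible choice.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

extern "C" {
    struct PluginInfo {
        int required_api_version;        // kernel API version the plugin was built against
        char* (*make_greeting)();        // allocates inside the plugin
        void (*free_greeting)(char*);    // releases with the plugin's own allocator
    };
}

const int KERNEL_API_VERSION = 2;   // assumed value for the example

// Policy used here: accept plugins built for this or an older API version,
// refuse plugins that need a newer kernel.
bool canLoad(const PluginInfo& info) {
    return info.required_api_version >= 1 &&
           info.required_api_version <= KERNEL_API_VERSION;
}

// Plugin side: allocation and deallocation are provided as a pair,
// so both use the same run-time library.
extern "C" char* makeGreeting() {
    char* s = static_cast<char*>(std::malloc(6));
    if (s) std::strcpy(s, "hello");
    return s;
}
extern "C" void freeGreeting(char* s) { std::free(s); }
```

The kernel never calls free() or delete on plugin memory itself; it always hands the pointer back through free_greeting.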
C++ has no standard ABI, so what you want to do comes with a restriction: the plugins and your framework must be compiled and linked with the same compiler and linker, with the same parameters, on the same OS. That defeats the purpose if the goal is interoperation through binary distribution, because every plugin developed for the framework has to ship many versions targeting different compilers on different OSes. Distributing source code is more practical than this, and that's the GNU way (download the source, configure and make).
COM is a choice, but it is too complex and out of date. Managed C++ on the .NET runtime is another. But both are only available on Microsoft operating systems. If you want a universal solution, I suggest you change to another language.
As jean mentions, since there is no standard C++ ABI and no standard name-mangling convention, you are stuck compiling everything with the same compiler and linker. If you want shared library/DLL style plugins, you have to use something C-ish.
If all will be compiled with same compiler and linker, you may want to also consider std::function.
typedef std::function<void ()> plugin_function;
std::map<std::string, plugin_function> fncMap;
void register_func(std::string name, plugin_function fnc)
{
fncMap[name] = fnc;
}
void call(std::string name)
{
auto it = fncMap.find(name);
if (it != fncMap.end())
(it->second)(); // it->second is a function object
}
///////////////
void func()
{
std::cout << "plain" << std::endl;
}
class T
{
public:
void method()
{
std::cout << "method" << std::endl;
}
void method2(int i)
{
std::cout << "method2 : " << i << std::endl;
}
};
T t; // of course "t" needs to outlive the map, you could just as well use shared_ptr
register_func("plain", func);
register_func("method", std::bind(&T::method, &t));
register_func("method2_5", std::bind(&T::method2, &t, 5));
register_func("method2_15", std::bind(&T::method2, &t, 15));
call("plain");
call("method");
call("method2_5");
call("method2_15");
You can also have plugin functions that take arguments. This uses the placeholders for std::bind, but you may soon find that it lags somewhat behind boost::bind. Boost.Bind has nice documentation and examples.
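A sketch of the placeholder variant, assuming a one-argument signature. The names plugin_function1, fncMap1, Scaler, registerScalers and call1 are made up here so that they do not clash with the code above.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

typedef std::function<int(int)> plugin_function1;   // hypothetical one-argument signature
std::map<std::string, plugin_function1> fncMap1;

struct Scaler {
    int scale(int factor, int value) const { return factor * value; }
};

// The Scaler must outlive the map entries, since only its address is stored.
void registerScalers(const Scaler& s) {
    using std::placeholders::_1;
    // _1 stands for the argument supplied at call time;
    // the factor is fixed when the function is bound.
    fncMap1["double"] = std::bind(&Scaler::scale, &s, 2, _1);
    fncMap1["triple"] = std::bind(&Scaler::scale, &s, 3, _1);
}

// Returns 0 when no function is registered under the name.
int call1(const std::string& name, int arg) {
    auto it = fncMap1.find(name);
    return it != fncMap1.end() ? it->second(arg) : 0;
}
```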
There is no reason why you should not do this. In C++ using this style of pointer is the best since it's just a plain pointer. I know of no popular compiler that would do anything as brain-dead as not making a function pointer like a normal pointer. It is beyond the bounds of reason that someone would do something so horrible.
The Vst plugin standard operates in a similar way. It just uses function pointers in the .dll and does not have ways of calling directly to classes. Vst is a very popular standard and on windows people use just about any compiler to do Vst plugins, including Delphi which is pascal based and has nothing to do with C++.
So I would do exactly what you suggest personally. For the common well-known plugins I would not use a string name but an integer index which can be looked up much faster.
The alternative is to use interfaces but I see no reason to if your thinking is already based around function pointers.
If you use interfaces then it is not so easy to call the functions from other languages. You can do it from Delphi but what about .NET.
With your function pointer style suggestion you can use .NET to make one of the plugins for example. Obviously you would need to host Mono in your program to load it but just for hypothetical purposes it illustrates the simplicity of it.
Besides, when you use interfaces you have to get into reference counting which is nasty. Stick your logic in function pointers like you suggest and then wrap the control in some C++ classes to do the calling and stuff for you. Then other people can make the plugins with other languages such as Delphi Pascal, Free Pascal, C, Other C++ compilers etc...
But as always, regardless of what you do, exception handling between compilers will remain an issue, so you have to think about error handling. The best way is for each plugin's own functions to catch the plugin's own exceptions and return an error code to the kernel.
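A sketch of such a boundary function: the C++ code inside the plugin may throw, but the exported function converts every exception into an error code. The names (PluginStatus, halve, plugin_halve) and the error codes are invented for the example.

```cpp
#include <cassert>
#include <stdexcept>

enum PluginStatus { PLUGIN_OK = 0, PLUGIN_BAD_ARGUMENT = 1, PLUGIN_UNKNOWN_ERROR = 2 };

// Internal C++ implementation; free to throw.
static int halve(int value) {
    if (value % 2 != 0) throw std::invalid_argument("value must be even");
    return value / 2;
}

// Exported entry point: no exception leaves this function.
// On failure *result is left untouched.
extern "C" int plugin_halve(int value, int* result) {
    try {
        if (!result) return PLUGIN_BAD_ARGUMENT;
        *result = halve(value);
        return PLUGIN_OK;
    } catch (const std::invalid_argument&) {
        return PLUGIN_BAD_ARGUMENT;
    } catch (...) {
        return PLUGIN_UNKNOWN_ERROR;
    }
}
```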
With all the excellent answers above, I'll just add that this practice is actually pretty widespread. In my experience, I've seen it both in commercial projects and in freeware/open-source ones.
So - yes, it's good and proven architecture.
You don't need to register functions manually. Really? Really.
What you could use is a proxy implementation for your plugin interface, where each function loads its original from the shared library on demand, transparently, and calls it. Whoever reaches a proxy object of that interface definition just can call the functions. They will be loaded on demand.
If plugins are singletons, then there is no need for manual binding at all (otherwise the correct instance has to be chosen first).
The idea for the developer of a new plugin would be to describe the interface first, then have a generator which generates a stub for the implementation for the shared library, and additionally a plugin proxy class with the same signature but with the autoloading on demand which then is used in the client software. Both should fulfill the same interface (in C++ a pure abstract class).
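A rough sketch of such a proxy, assuming the generator has already produced the interface. Greeter, RealGreeter and GreeterProxy are invented names, and the loader callback stands in for whatever would really open the shared library (dlopen/dlsym or LoadLibrary/GetProcAddress).

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// The interface that both the real plugin and the proxy fulfil.
struct Greeter {
    virtual ~Greeter() = default;
    virtual std::string greet(const std::string& who) = 0;
};

// Stand-in for the implementation that would live in the shared library.
struct RealGreeter : Greeter {
    std::string greet(const std::string& who) override { return "hello " + who; }
};

// Proxy with the same signature; it loads the real object on first use.
class GreeterProxy : public Greeter {
    std::function<std::unique_ptr<Greeter>()> loader_;  // assumed loading hook
    std::unique_ptr<Greeter> target_;
    int loads_ = 0;

public:
    explicit GreeterProxy(std::function<std::unique_ptr<Greeter>()> loader)
        : loader_(std::move(loader)) {}

    std::string greet(const std::string& who) override {
        if (!target_) {          // first call: load transparently
            target_ = loader_();
            ++loads_;
            assert(target_ != nullptr);
        }
        return target_->greet(who);
    }

    int load_count() const { return loads_; }
};
```

Client code only ever holds a Greeter, so it cannot tell whether the library has been loaded yet.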

Serializing function objects

Is it possible to serialize and deserialize a std::function, a function object, or a closure in general in C++? How? Does C++11 facilitate this? Is there any library support available for such a task (e.g., in Boost)?
For example, suppose a C++ program has a std::function which is needed to be communicated (say via a TCP/IP socket) to another C++ program residing on another machine. What do you suggest in such a scenario?
Edit:
To clarify, the functions which are to be moved are supposed to be pure and side-effect-free. So I do not have security or state-mismatch problems.
A solution to the problem is to build a small embedded domain specific language and serialize its abstract syntax tree.
I was hoping that I could find some language/library support for moving a machine-independent representation of functions instead.
Yes for function pointers and closures. Not for std::function.
A function pointer is the simplest — it is just a pointer like any other so you can just read it as bytes:
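// Requires <cstring> and <string>.
template <typename Res, typename... Args>
std::string serialize(Res (*fn_ptr)(Args...)) {
// Copy the raw bytes of the pointer value into a string.
return std::string(reinterpret_cast<const char*>(&fn_ptr), sizeof(fn_ptr));
}
template <typename Res, typename... Args>
Res (*deserialize(const std::string& str))(Args...) {
// memcpy avoids the misaligned read that casting str.c_str() could cause.
Res (*fn_ptr)(Args...) = nullptr;
std::memcpy(&fn_ptr, str.data(), sizeof(fn_ptr));
return fn_ptr;
}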
But I was surprised to find that even without recompilation the address of a function will change on every invocation of the program. Not very useful if you want to transmit the address. This is due to ASLR, which you can turn off on Linux by starting your_program with setarch $(uname -m) -LR your_program.
Now you can send the function pointer to a different machine running the same program, and call it! (This does not involve transmitting executable code. But unless you are generating executable code at run-time, I don't think you are looking for that.)
A lambda function is quite different.
std::function<int(int)> addN(int N) {
auto f = [=](int x){ return x + N; };
return f;
}
The value of f will be the captured int N. Its representation in memory is the same as an int! The compiler generates an unnamed class for the lambda, of which f is an instance. This class has operator() overloaded with our code.
The class being unnamed presents a problem for serialization. It also presents a problem for returning lambda functions from functions. The latter problem is solved by std::function.
std::function as far as I understand is implemented by creating a templated wrapper class which effectively holds a reference to the unnamed class behind the lambda function through the template type parameter. (This is _Function_handler in functional.) std::function takes a function pointer to a static method (_M_invoke) of this wrapper class and stores that plus the closure value.
Unfortunately, everything is buried in private members and the size of the closure value is not stored. (It does not need to, because the lambda function knows its size.)
So std::function does not lend itself to serialization, but works well as a blueprint. I followed what it does, simplified it a lot (I only wanted to serialize lambdas, not the myriad other callable things), saved the size of the closure value in a size_t, and added methods for (de)serialization. It works!
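The answer does not show its code, so the following is only a guess at the general shape: a wrapper that stores a trampoline function pointer plus the closure bytes and their size. ByteClosure and all of its members are hypothetical names. The sketch assumes the closure is trivially copyable and relies on how mainstream compilers lay out closures, not on anything the standard guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

template <typename Sig> class ByteClosure;  // hypothetical name

template <typename R, typename... Args>
class ByteClosure<R(Args...)> {
    using Invoker = R (*)(const char*, Args...);
    Invoker invoke_ = nullptr;  // trampoline that knows the closure type
    std::string state_;         // raw bytes of the captured values

    // Rebuilds the closure from its bytes in aligned storage and calls it.
    template <typename F>
    static R invoke_as(const char* bytes, Args... args) {
        alignas(F) unsigned char buf[sizeof(F)];
        std::memcpy(buf, bytes, sizeof(F));
        F* f = static_cast<F*>(static_cast<void*>(buf));
        return (*f)(args...);
    }

public:
    template <typename F>
    static ByteClosure capture(const F& f) {
        static_assert(std::is_trivially_copyable<F>::value,
                      "captures must be plain data");
        ByteClosure c;
        c.invoke_ = &invoke_as<F>;
        c.state_.assign(reinterpret_cast<const char*>(&f), sizeof(F));
        return c;
    }

    // Wire format: trampoline pointer, closure size, closure bytes.
    std::string serialize() const {
        const std::size_t n = state_.size();
        std::string out(sizeof(Invoker) + sizeof(n), '\0');
        std::memcpy(&out[0], &invoke_, sizeof(Invoker));
        std::memcpy(&out[sizeof(Invoker)], &n, sizeof(n));
        return out + state_;
    }

    static ByteClosure deserialize(const std::string& wire) {
        assert(wire.size() >= sizeof(Invoker) + sizeof(std::size_t));
        ByteClosure c;
        std::size_t n = 0;
        std::memcpy(&c.invoke_, wire.data(), sizeof(Invoker));
        std::memcpy(&n, wire.data() + sizeof(Invoker), sizeof(n));
        c.state_ = wire.substr(sizeof(Invoker) + sizeof(n), n);
        return c;
    }

    R operator()(Args... args) const {
        assert(invoke_ != nullptr);
        return invoke_(state_.data(), args...);
    }
};
```

As with the raw function pointer, the serialized bytes only make sense to a process running the same binary at the same load address, so the ASLR caveat applies here too.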
No.
C++ has no built-in support for serialization and was never conceived with the idea of transmitting code from one process to another, let alone from one machine to another. Languages that can do so generally feature both an IR (an intermediate representation of the code that is machine independent) and reflection.
So you are left with writing yourself a protocol for transmitting the actions you want, and the DSL approach is certainly workable... depending on the variety of tasks you wish to perform and the need for performance.
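To give an idea of what the DSL approach can look like, here is a small sketch that serializes an expression tree over one variable x as prefix tokens and rebuilds it on the other side. The Expr type, the token format and the function names are all invented for this example.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical mini-language: arithmetic expressions over one variable x.
struct Expr {
    enum Kind { Const, Var, Add, Mul } kind;
    double value;                    // used by Const only
    std::unique_ptr<Expr> lhs, rhs;  // used by Add and Mul only
};

inline std::unique_ptr<Expr> node(Expr::Kind k, double v = 0,
                                  std::unique_ptr<Expr> l = nullptr,
                                  std::unique_ptr<Expr> r = nullptr) {
    return std::unique_ptr<Expr>(new Expr{k, v, std::move(l), std::move(r)});
}

// Writes the tree in prefix order, e.g. "* + x c 3 c 2 ".
inline void serialize_expr(const Expr& e, std::ostream& out) {
    switch (e.kind) {
        case Expr::Const: out << "c " << e.value << ' '; break;
        case Expr::Var:   out << "x "; break;
        case Expr::Add:
        case Expr::Mul:
            out << (e.kind == Expr::Add ? "+ " : "* ");
            serialize_expr(*e.lhs, out);
            serialize_expr(*e.rhs, out);
            break;
    }
}

// Reads one expression back from the prefix token stream.
inline std::unique_ptr<Expr> parse_expr(std::istream& in) {
    std::string tok;
    in >> tok;
    if (tok == "c") { double v = 0; in >> v; return node(Expr::Const, v); }
    if (tok == "x") return node(Expr::Var);
    assert(tok == "+" || tok == "*");
    auto l = parse_expr(in);
    auto r = parse_expr(in);
    return node(tok == "+" ? Expr::Add : Expr::Mul, 0, std::move(l), std::move(r));
}

inline double eval_expr(const Expr& e, double x) {
    switch (e.kind) {
        case Expr::Const: return e.value;
        case Expr::Var:   return x;
        case Expr::Add:   return eval_expr(*e.lhs, x) + eval_expr(*e.rhs, x);
        case Expr::Mul:   return eval_expr(*e.lhs, x) * eval_expr(*e.rhs, x);
    }
    return 0;  // not reached for well-formed trees
}
```

The text form is machine independent, which is the point. Constants are written with the stream's default precision, so a real protocol would need a lossless number format.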
Another solution would be to go with an existing language. For example, the Redis NoSQL database embeds a Lua engine and can execute Lua scripts; you could do the same and transmit Lua scripts over the network.
No, but there are some restricted solutions.
The most you can hope for is to register functions in some sort of global map (e.g. with key strings) that is common to the sending code and the receiving code (either in different computers or before and after serialization).
You can then serialize the string associated with the function and get it on the other side.
As a concrete example the library HPX implements something like this, in something called HPX_ACTION.
This requires a lot of protocol and it is fragile with respect to changes in code.
But after all this is no different from something that tries to serialize a class with private data. In some sense the code of the function is its private part (the arguments and return interface is the public part).
What leaves a sliver of hope is that, depending on how you organize the code, these "objects" can be global or common, and if all goes right they are available during serialization and deserialization through some kind of predefined runtime indirection.
This is a crude example:
serializer code:
// common:
#include <iostream>
#include <map>
#include <string>
class C{
double d;
public:
C(double d) : d(d){}
double operator()(double x) const{return d*x;}
};
C c1{1.};
C c2{2.};
std::map<std::string, C*> const m{{"c1", &c1}, {"c2", &c2}};
// :common
int main(int argc, char** argv){
C* f = (argc == 2)?&c1:&c2;
std::cout << (*f)(5.) << '\n'; // prints 5 or 10 depending on the runtime args
serialize(f); // placeholder: somehow write "c1" or "c2" to a file
}
deserializer code:
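// common (must be identical to the serializer's):
#include <iostream>
#include <map>
#include <string>
class C{
double d;
public:
C(double d) : d(d){}
double operator()(double x) const{return d*x;}
};
C c1{1.};
C c2{2.};
std::map<std::string, C*> const m{{"c1", &c1}, {"c2", &c2}};
// :common
int main(){
C* f = nullptr;
deserialize(f); // placeholder: somehow read "c1" or "c2" and assign the pointer from the translation "map"
std::cout << (*f)(3.) << '\n'; // prints 3 or 6 depending on the choice made in the **other** run
}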
(code not tested).
Note that this forces a lot of common and consistent code, but depending on the environment you might be able to guarantee this.
The slightest change in the code can produce a hard to detect logical bug.
Also, I played here with global objects (which can be used on free functions) but the same can be done with scoped objects, what becomes trickier is how to establish the map locally (#include common code inside a local scope?)

How does this code create an instance of a class which has only a private constructor?

I'm working on a sound library (with OpenAL), and taking inspiration from the interface provided by FMOD, you can see the interface at this link.
I've provided some concepts like Sound, Channel and ChannelGroup. As you can see in the FMOD interface, all of those classes have a private constructor; for example, to create a Sound you must use the function createSound() provided by the System class (the same applies if you want to create a Channel or a ChannelGroup).
I'd like to provide a similar mechanism, but I don't understand how it works behind the scenes. For example, how can the function createSound() create a new instance of a Sound? The constructor is private, and the Sound interface has no static methods or friend declarations. Is some pattern being used?
EDIT: Just to make the OP's question clear: s/he is not asking how to create an instance of a class with a private constructor. The question is how, in the linked code, instances are created of classes that have a private constructor and NO static methods or friend functions.
Thanks.
Hard to say without seeing the source code. Seems however that FMOD is 100% C with global variables and with a bad "OOP" C++ wrapper around it.
Given the absence of source code and a few of the bad tricks played in the .h files, maybe the code is compiled using a different header file and then just happens to work (even if it's clearly non-standard) with the compilers they are using.
My guess is that the real (unpublished) source code for the C++ wrapper defines a static method. Alternatively, if everything is indeed just global, then the object is not really even created, and tricks are being played to fool the C++ object system into thinking there is an object. Apparently all dispatching is static, so this (while not formally legal) can happen to work anyway with the C++ implementations I know.
Whatever they did it's quite ugly and non-conforming from a C++ point of view.
They never create any instances! The factory function is right there in the header
/*
FMOD System factory functions.
*/
inline FMOD_RESULT System_Create(System **system)
{ return FMOD_System_Create((FMOD_SYSTEM **)system); }
The pointer you pass in to get a System object is immediately cast to a pointer to a C struct declared in the fmod.h header.
As it is a class without any data members who can tell the difference?
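Since FMOD's source is not published, the following is only a sketch of how such a facade might be wired, with a made-up C core (CSound and its functions) standing in for the real one. Calling member functions through a pointer that never referred to a real Sound object is not conforming C++; it merely tends to work on mainstream compilers because the class has no data members and no virtual functions.

```cpp
#include <cassert>

// Stand-in for the C core (hypothetical; not FMOD's actual code).
struct CSound { int length_ms; };
inline int CSound_Create(CSound** out, int length_ms) {
    *out = new CSound{length_ms};
    return 0;
}
inline int CSound_GetLength(CSound* s, int* out) {
    *out = s->length_ms;
    return 0;
}
inline int CSound_Release(CSound* s) {
    delete s;
    return 0;
}

// C++ facade: no data members, no virtuals, constructors never called.
class Sound {
    Sound();               // private and never defined
    Sound(const Sound&);   // likewise
public:
    // "this" is really a CSound* in disguise; it is only ever cast back.
    int getLength(int* out) {
        return CSound_GetLength(reinterpret_cast<CSound*>(this), out);
    }
    int release() {
        return CSound_Release(reinterpret_cast<CSound*>(this));
    }
};

// Factory: creates the C object and hands it out under the C++ type.
inline int createSound(Sound** out, int length_ms) {
    CSound* raw = nullptr;
    int rc = CSound_Create(&raw, length_ms);
    *out = reinterpret_cast<Sound*>(raw);
    return rc;
}
```

No Sound constructor ever runs, which is why the header can get away with a private constructor and no friends or static members.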
struct Foo {
enum Type {
ALPHA,
BETA_X,
BETA_Y
};
Type type () const;
static Foo alpha (int i) {return Foo (ALPHA, i);}
static Foo beta (int i) {return Foo (i<0 ? BETA_X : BETA_Y, i);}
private:
Foo (Type, int);
};
alpha could have been a free function (say, create_alpha) declared friend, but that's just polluting the namespace.
I'm afraid I can't access that link but another way could be a factory pattern. I'm guessing a bit, now.
It is the factory pattern - as their comment says.
/*
FMOD System factory functions.
*/
inline FMOD_RESULT System_Create(System **system) { return FMOD_System_Create((FMOD_SYSTEM **)system); }
It's difficult to say exactly what is happening as they don't publish the source for the FMOD_System_Create method.
The factory pattern is a mechanism for creating an object but the (sub)class produced depends on the parameters of the factory call. http://en.wikipedia.org/wiki/Factory_method_pattern

How to perform type scanning in C++?

I have an ESB. Any serialized message transports its own fully qualified name (that is, namespace + class name). I have a concrete type for each message that encapsulates a specific logic to be executed.
Every time I receive a message, I need to deserialize it at first, so I can perform its operations --once more, depending on its concrete type--.
I need a way to register every single class at compile time or during my application initialization.
With .net I would use reflection to scan assemblies and discover the message types during initialization, but how would you do it in C++?
C++ has no reflection capability. I suppose you could try to scan object files, etc., but there's no reliable way to do this (AFAIK); the compiler may entirely eliminate or mangle certain things.
Essentially, for serialization, you will have to do the registration (semi-)manually. But you may be interested in a serialization library that will help out with the chores, such as Boost Serialization.
Since there is no reflection in C++, I would suggest using an external script to scan your source code for all relevant classes (which is easy if you use empty dummy #defines to annotate them in the source code) and have it generate the registration code.
I personally go the manual registration route. If you forget to register... then the tests don't work anyway.
You just have to use a factory, and implement some tag dispatching. For example:
typedef void (*ActOnMessageType)(Message const&);
typedef std::map<std::string, ActOnMessageType> MessageDispatcherType;
static MessageDispatcherType& GetDispatcher() {
static MessageDispatcherType D; return D;
}
static bool RegisterMessageHandler(std::string name, ActOnMessageType func) {
return GetDispatcher().insert(std::make_pair(name, func)).second;
}
Then you just prepare your functions:
void ActOnFoo(Message const& m);
void ActOnBar(Message const& m);
And register them:
bool const gRegisteredFoo = RegisterMessageHandler("Foo", ActOnFoo);
bool const gRegisteredBar = RegisterMessageHandler("Bar", ActOnBar);
Note: I effectively use a lazily initialized Singleton, in order to allow decoupling. That is the registration is done during the library load and thus each Register... call is placed in the file where the function is defined. The one difference with a global variable is that here the dispatching map is actually constant once the initialization ends.
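The same registration trick can map the fully qualified name carried by each message to a factory for its concrete type. This is a sketch under that assumption. The Message base class here, the orders::OrderPlaced type, and the functions GetFactories, RegisterMessageType, MakeMessage and CreateByName are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical base class for all messages on the bus.
struct Message {
    virtual ~Message() = default;
    virtual std::string Handle() const = 0;
};

using MessageFactory = std::unique_ptr<Message> (*)();

// Lazily initialized registry, the same idea as GetDispatcher().
inline std::map<std::string, MessageFactory>& GetFactories() {
    static std::map<std::string, MessageFactory> factories;
    return factories;
}

inline bool RegisterMessageType(const std::string& name, MessageFactory f) {
    return GetFactories().insert(std::make_pair(name, f)).second;
}

// Generic factory function, instantiated once per message type.
template <typename T>
std::unique_ptr<Message> MakeMessage() {
    return std::unique_ptr<Message>(new T);
}

// Returns a null pointer when the name was never registered.
inline std::unique_ptr<Message> CreateByName(const std::string& name) {
    auto it = GetFactories().find(name);
    if (it == GetFactories().end()) return nullptr;
    return it->second();
}

// This part would live in the message's own .cpp file.
namespace orders {
struct OrderPlaced : Message {
    std::string Handle() const override { return "order placed"; }
};
const bool gRegisteredOrderPlaced =
    RegisterMessageType("orders::OrderPlaced", &MakeMessage<OrderPlaced>);
}  // namespace orders
```

On receipt you would read the type name from the wire, call CreateByName, and then let the returned object deserialize its own payload.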

C++ Command line action abstraction using interface

I'm building an application whose usage is going to look something like this:
application --command --option1=? --option2=2?
Basically, there can be any number of options, but only one command per instance of the application. Similar to the way git works.
Now, I thought I'd write it in C++ to get some boost and stl experience and have a go with a few of those design patterns I keep reading about. So, I implemented this:
class Action
{
public:
void AddParameter(std::string key, boost::any p);
virtual unsigned int ExecuteAction();
protected:
std::map<std::string, boost::any> parameters;
};
I'll explain my logic anyway, just to check it - this is an abstract-ish action. All actions need option adding, hence the parameters map, so that we can implement at this level, but we expect ExecuteAction to be implemented by derived classes, such as my simple example DisplayHelpAction, which does pretty much what it says on the tin.
So now I've written a factory, like so:
class DetermineAction
{
public:
DetermineAction();
vx::modero::Action getAction(std::string ActionString);
private:
std::map<std::string, vx::modero::Action> cmdmap;
};
The logic being that the constructor will create a map of possible strings you can ask for and getAction will do what it says - give it a command string and it'll give you a class that is derived from Action which implements the desired functionality.
I'm having trouble with that constructor. I am trying this:
this->cmdmap = std::map<std::string, Action>();
this->cmdmap.insert(pair<string, Action>("help", DisplayHelpAction()));
this->cmdmap.insert(pair<string, Action>("license", DisplayLicenseAction()));
Which is causing a lot of errors. Now, I'm used to the Java Way of interfaces, so you use:
Interface I = new ConcreteClass();
and Java likes it. So that's the sort of idea I'm trying to achieve here, because what I want to have for the implementation of getAction is this:
return this->cmdmap[ActionString];
Which should return a class derived from Action, on which I can then start adding parameters and call execute.
So, to summarise, I have two questions which are closely related:
Soundboard. I'm deliberately practising abstracting things, so there's some additional complexity there, but in principle, is my approach sound? Is there an insanely obvious shortcut I've missed? Is there a better method I should be using?
How can I set up my class mapping solution so that I can return the correct class? The specific complaint is link-time and is:
Linking CXX executable myapp
CMakeFiles/myapp.dir/abstractcmd.cpp.o: In function `nf::Action::Action()':
abstractcmd.cpp:(.text._ZN2vx6modero6ActionC2Ev[_ZN2vx6modero6ActionC5Ev]+0x13): undefined reference to `vtable for nf::Action'
Just because it might be relevant, I'm using boost::program_options for command line parsing.
Edit 1: Ok, I have now replaced Action with Action* as per Eugen's answer and am trying to add new SomethingThatSubclassesAction to the map. I'm still getting the vtable error.
One thing that needs to be said right off the bat is that runtime polymorphism in C++ works via pointers (or references) to the base class, not by value. So your std::map<std::string, Action> needs to be std::map<std::string, Action*>, or your derived Actions (e.g. DisplayHelpAction) will be sliced when copied into the map. Storing Action* also means that you'll need to explicitly take care of freeing the map values when you're done. Note: you can use a boost::ptr_map (boost::ptr_map<std::string,Action>) (as @Fred Nurk pointed out) or a boost::shared_ptr (std::map<std::string,boost::shared_ptr<Action> >) so that you don't have to worry about explicitly freeing the allocated Action*.
The same applies to Action getAction(std::string ActionString); it needs to become Action* getAction(std::string ActionString);.
The linker error is (most likely) caused by not providing an implementation for virtual unsigned int ExecuteAction();. Also I'd say it makes sense to make it pure virtual (virtual unsigned int ExecuteAction() = 0;), in which case you don't need to provide an implementation for it. It will also give the Action class the closest semantics to a Java interface.
Unless you have a very good reason for the Action-derived objects not to know about boost::program_options, I'd pass it down and let each of them access it directly instead of constructing a std::map<std::string, boost::any>.
I'd rename DetermineAction to something like ActionManager or ActionHandler.
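Putting those points together, here is a minimal sketch. It uses std::shared_ptr instead of the boost::shared_ptr mentioned above, it leaves out the parameter map, and the return values of ExecuteAction are made up for the example.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

class Action {
public:
    virtual ~Action() = default;
    // Pure virtual, so the base class needs no out-of-line definition.
    virtual unsigned int ExecuteAction() = 0;
};

class DisplayHelpAction : public Action {
public:
    unsigned int ExecuteAction() override { return 0; }  // placeholder result
};

class DisplayLicenseAction : public Action {
public:
    unsigned int ExecuteAction() override { return 1; }  // placeholder result
};

class ActionManager {
public:
    ActionManager() {
        // Pointers to the base class are stored, so nothing is sliced.
        cmdmap["help"] = std::make_shared<DisplayHelpAction>();
        cmdmap["license"] = std::make_shared<DisplayLicenseAction>();
    }

    // Uses find() so that an unknown command does not insert a null entry.
    std::shared_ptr<Action> getAction(const std::string& name) const {
        auto it = cmdmap.find(name);
        if (it == cmdmap.end()) return nullptr;
        return it->second;
    }

private:
    std::map<std::string, std::shared_ptr<Action>> cmdmap;
};
```

Note that cmdmap[ActionString] as in the question would silently add an empty entry for a mistyped command, which is why the sketch looks the name up with find().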