And now for a complete change of direction from C++ function pointers

I am building a part of a simulator. We are building off of a legacy simulator, but going in a different direction, incorporating live bits alongside the simulated bits. The piece I am working on has to, effectively, route commands from the central controller to the various bits.
In the legacy code, there is a const array populated with an enumerated type. A command comes in, it is looked up in the table, then shipped off to a switch statement keyed by the enumerated type.
The type enumeration has a choice VALID_BUT_NOT_SIMULATED, which is effectively a no-op from the point of view of the sim. I need to turn those no-ops into commands to actual other things [new simulated bits | live bits]. The new stuff and the live stuff have different interfaces from the old stuff [which makes me laugh about the shill job that it took to make it all happen, but that is a topic for a different discussion].
I like the array because it is a very apt description of the live thing this chunk is simulating [latching circuits by row and column]. I thought that I would try to replace the enumerated types in the array with pointers to functions and call them directly. This would be in lieu of the lookup+switch.

Can't be done. However, you could do something sort of like it with a functor. I'd put example code here, but as I was writing it I realized such a construct would necessarily be quite complicated. You might look at boost::bind for some ideas.

One way to do it, though ugly, is to use a generic table of pointers and cast your function pointers to that generic type (losing information about the argument types):
// Assuming, for illustration, signatures such as:
//   void myFirstFunction(int x);
//   void mySecondFunction(double d);
void (*myFunctions[])() = {
    (void (*)())myFirstFunction,
    (void (*)())mySecondFunction
};
But then you'll have to know, for each of the pointers, what arguments to pass to the corresponding functions. You can extend your table of pointers and make a table of more sophisticated objects which hold some enumeration variable informing about the arguments of a function to which a particular pointer points.
Unfortunately, each time you want to use a function from the array, you will need to cast the pointer back to its real type (and take care not to cast it incorrectly), like so:
((void (*)(int))myFunctions[0])(1);
in order to call myFirstFunction with x = 1.
As I think about it now (after you changed the question), I come to the conclusion that if you have to call the functions differently, there is really no point complicating the whole thing with a lookup table, unless there are just a few signatures and many functions. You need a very consistent calling policy and very few possible signatures to achieve a good-looking solution with a lookup table. Needless to mention what will happen if you need to store pointers to member functions or, even worse, virtuals.

Based on your updated question, I'm still not sure how you're going to invoke the functions via the pointer if the functions need different parameter lists.
However, if one parameter list is a subset of the other, could you write thunks to adapt one interface to look like the other? (i.e. discarding irrelevant parameters or synthesising fake parameters).
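To make the thunk idea concrete, here is a minimal sketch (all names are invented for illustration): handlers with the new live interface are wrapped behind the legacy signature so that everything can live in the same row/column table and be called directly.

#include <iostream>

// Common signature stored in the table.
typedef void (*command_handler)(int row, int col);

// Old-style simulated handler: already matches the table signature.
void simulatedLatch(int row, int col) {
    std::cout << "simulated latch at (" << row << ", " << col << ")\n";
}

// New/live handler with a different interface (extra channel argument).
void liveLatch(int row, int col, int channel) {
    std::cout << "live latch at (" << row << ", " << col
              << ") on channel " << channel << "\n";
}

// Thunk: presents the common signature, synthesises the missing argument.
void liveLatchThunk(int row, int col) {
    liveLatch(row, col, /*channel=*/0);
}

// The row/column table now stores directly callable handlers.
command_handler commandTable[2][2] = {
    { simulatedLatch, liveLatchThunk },
    { liveLatchThunk, simulatedLatch }
};

int main() {
    commandTable[0][1](3, 7);  // dispatch without a lookup+switch
    return 0;
}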

In your original question, you were describing a scenario that is very common when working with Javascript libraries. In Javascript, libraries often provide a way for "interested parties" to be notified of events that are published by the library, and all that the interested parties need to do is register their own Function object callbacks. In one version of the library, the documentation might say that the callbacks will be passed n arguments (a, b, c, ... in that order), but a future version might want to provide n + m arguments. This change does not have to break existing code because the library can just append the extra m arguments to the argument list, and this works because Javascript uses a caller-cleans-up calling convention (essentially).
In C++, you could do something similar (provide additional arguments to callbacks) as long as you can guarantee that the calling convention that is used by the callbacks and to call the callbacks is a caller-cleans-up calling convention such as the C calling convention for the x86 architecture:
#include <cstdlib>
#include <iostream>
#include <vector>

extern "C" void old_api_callback_in_old_archive(int x) {
    std::cout << "`old_api_callback_in_old_archive` was called with x = " << x << std::endl;
}

extern "C" void new_api_callback(int x, int otherInfo) {
    std::cout << "`new_api_callback` was called with x = " << x
              << ", otherInfo = " << otherInfo << std::endl;
}

extern "C" {
    typedef void (*callback_type)(int, int);
}

int main()
{
    std::vector<callback_type> callbacks;
    callbacks.push_back(&new_api_callback);
    // Relies on the caller-cleans-up convention: the extra argument is simply ignored.
    callbacks.push_back(reinterpret_cast<callback_type>(&old_api_callback_in_old_archive));

    std::vector<callback_type>::iterator it;
    for (it = callbacks.begin(); it != callbacks.end(); ++it) {
        (*it)(7, -8);
    }
    return EXIT_SUCCESS;
}

Related

Good practice in C++ function/method design

I am confused about C++ function/method design, as below:
1.
class ArithmeticCalculation
{
private:
    float num1_;
    float num2_;
    float sum_;
    void addTwoNumbers();
};
2.
class ArithmeticCalculation
{
private:
    float addTwoNumbers(float num1, float num2);
};
In 1., one basically declares a class variable, and void addTwoNumbers() computes the result and assigns it to that class variable (sum_). I find 1. cleaner, but 2. looks more intuitive to use as a function.
Which one is actually the best option, considering the function/method is not restricted to only this basic addition functionality? In general, how do I decide between returning a value and simply using void?
The major difference between the two functions is that the second one is stateless*, while the first one has state. Other things being equal, the stateless approach is preferred, because it gives the users of your class more flexibility in how they use it in their systems. For example, stateless functions are re-entrant, while functions that rely on state may require the code that uses them to take additional measures to prevent incorrect use.
Re-entrancy alone is a big reason to prefer stateless functions whenever possible. However, there are situations when keeping state becomes more economical - for example, when you are using Builder Design Pattern.
Another important advantage of keeping your functions stateless whenever it is possible is that the call sequence becomes more readable. A call of a method that relies on the state consists of these parts:
Set up the object before the call
Make the call
Harvest the result of the call (optional)
Human readers of your code will have a much easier time reading a call that passes parameters to a function than the three-part set-up/call/get-result sequence.
There are situations when you have to have state, for example, when you want to defer the action. In this case the parameters are supplied by one part of the code, while the computation is initiated by some other part of the code. In terms of your example, one function would call set_num1 and set_num2, while another function would call addTwoNumbers at some later time. In situations like this you could save the parameters on the object itself, or create a separate object with deferred parameters.
* This is only an assumption based on the signature of your member function. Your second function gets all the data that it needs as parameters and returns the value to the caller. Obviously, implementations may choose to add some state, e.g. by saving the last result, but that is uncommon for addTwoNumbers functions, so I assume that your code does not do it.
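For illustration, a minimal sketch of that deferred-action case, with hypothetical names: the parameters are supplied in one place and the computation is initiated later by some other part of the code.

#include <iostream>

// One part of the code supplies the parameters, another part triggers the
// computation later. Names are illustrative only.
class DeferredAdder {
public:
    void setNum1(float n) { num1_ = n; }
    void setNum2(float n) { num2_ = n; }
    float addTwoNumbers() const { return num1_ + num2_; }  // computed on demand
private:
    float num1_ = 0.0f;
    float num2_ = 0.0f;
};

int main() {
    DeferredAdder adder;
    adder.setNum1(1.5f);   // parameters supplied here...
    adder.setNum2(2.5f);
    // ...and the computation initiated elsewhere, later:
    std::cout << adder.addTwoNumbers() << "\n";  // prints 4
    return 0;
}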
The first function doesn't really make a lot of sense. What numbers? Where does the result go? The name doesn't describe the expected side-effects, nor the origin of the numbers in question.
The second function makes it abundantly clear what's going on, where the result is, and how that function might be used.
Your functions should strive to communicate their intent based on the function signature. If that's not sufficient you'll need to add comments or documentation, but no amount of commenting or documentation can pave over a misleading or confusing signature.
Think about what your function's responsibility is as well as whatever expectations it has when naming things. For example:
void whatever(const int);
What does that function do? Could you even guess without looking at code or documentation?
Compare with the same function given a much more meaningful name:
void detonateReactor(const int countdownTimeInSeconds);
It seems pretty clear what that does now, as well as what side-effects it will have.
You probably had in mind something like this for the first option:
struct Adder {
    float sum;
    float a;
    float b;
    void addNumbers() { sum = a + b; }
};
that would be used like this:
Adder adder;
adder.a = 1.0;
adder.b = 2.0;
adder.addNumbers();
std::cout << adder.sum << "\n";
There is no single good argument to do this when you actually wanted this:
float addTwoNumbers(float a,float b) { return a+b; }
std::cout << addTwoNumbers(1.0,2.0) << "\n";
Not everything has to be inside a class. Actually, not everything should be inside a class (C++ isn't Java). If you need a function that adds two numbers, then write a function that adds two numbers and don't overthink it.

Is it idiomatically ok to put algorithm into class?

I have a complex algorithm. This uses many variables, calculates helper arrays at initialization and also calculates arrays along the way. Since the algorithm is complex, I break it down into several functions.
Now, I actually do not see how this might be a class from an idiomatic point of view; I mean, I am just used to having algorithms as functions. The usage would simply be:
Calculation calc(/* several parameters */);
calc.calculate();
// get the heterogenous results via getters
On the other hand, putting this into a class has the following advantages:
I do not have to pass all the variables to the other functions/methods
arrays initialized at the beginning of the algorithm are accessible throughout the class in each function
my code is shorter and (imo) clearer
A hybrid way would be to put the algorithm class into a source file and access it via a function that uses it. The user of the algorithm would not see the class.
Does anyone have valuable thoughts that might help me out?
Thank you very much in advance!
I have a complex algorithm. This uses many variables, calculates helper arrays at initialization and also calculates arrays along the way.[...]
Now, I actually do not see how this might be a class from an idiomatic way
It is not, but many people do the same thing you do (so did I a few times).
Instead of creating a class for your algorithm, consider transforming your inputs and outputs into classes/structures.
That is, instead of:
Calculation calc(a, b, c, d, e, f, g);
calc.calculate();
// use getters on calc from here on
you could write:
CalcInputs inputs(a, b, c, d, e, f, g);
CalcResult output = calculate(inputs); // calculate is now free function
// use getters on output from here on
This doesn't create any problems and performs the same (actually better) grouping of data.
I'd say it is very idiomatic to represent an algorithm (or perhaps better, a computation) as a class. One of the definitions of a class in OOP is "data and the functions that operate on that data." A complex algorithm with its inputs, outputs and intermediary data matches this definition perfectly.
I've done this myself several times, and it simplifies (human) code flow analysis significantly, making the whole thing easier to reason about, to debug and to test.
If the abstraction for the client code is an algorithm, you probably want to keep a pure functional interface, and not introduce additional types there. It's quite common, on the other hand, for such a function to be implemented in a source file which defines a common data structure or class for its internal use, so you might have:
double calculation( /* input parameters */ )
{
    SupportClass calc( /* input parameters */ );
    calc.part1();
    calc.part2();
    // etc...
    return calc.results();
}
Depending on how your code is organized, SupportClass will be in an unnamed namespace in the source file (probably the most common case), or in a "private" header, included only by the sources involved in the algorithm.
It really depends of what kind of algorithm you want to encapsulate. Generally I agree with John Carmack : "Sometimes, the elegant implementation is just a function. Not a method. Not a class. Not a framework. Just a function."
It really boils down to: does the algorithm need access to parts of the class that are not supposed to be public? If the answer is yes (unless you are willing to refactor your class interface, depending on the specific case) you should go with a member function; if not, then a free function is good enough.
Take for example the standard library. Most of the algorithms are provided as free functions because they only access the public interface of the class (with iterators for standard containers, for example).
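As a tiny illustration (not from the question), std::find needs nothing from std::vector beyond its public iterator interface, which is exactly why it can stay a free function:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    std::vector<int> values = {3, 1, 4, 1, 5};
    // std::find only uses the container's public iterator interface.
    auto it = std::find(values.begin(), values.end(), 4);
    if (it != values.end())
        std::cout << "found 4 at index " << (it - values.begin()) << "\n";
    return 0;
}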
Do you need to call the exact same functions in the exact same order each time? Then you shouldn't be requiring calling code to do this. Splitting your algorithm into multiple functions is fine, but I'd still have one call the next and then the next and so on, with a struct of results/parameters being passed along the way. A class doesn't feel right for a one-off invocation of some procedure.
The only way I'd do this with a class is if the class encapsulates all the input data itself, and you then call myClass.nameOfMyAlgorithm() on it, among other potential operations. Then you have data+manipulators. But just manipulators? Yeah, I'm not so sure.
In modern C++ the distinction has been eroded quite a bit. Even from the operator overloading of the pre-ANSI language, you could create a class whose instances are syntactically like functions:
struct Multiplier
{
    int factor_;
    Multiplier(int f) : factor_(f) { }
    int operator()(int v) const
    {
        return v * factor_;
    }
};

Multiplier doubler(2);
std::cout << doubler(3) << std::endl; // prints 6
Such a class/struct is called a functor, and can capture "contextual" values in its constructor. This allows you to effectively pass the parameters to a function in two stages: some in the constructor call, some later each time you call it for real. This is called partial function application.
To relate this to your example, your calculate member function could be turned into operator(), and then the Calculation instance would be a function! (or near enough.)
To unify these ideas, you can try thinking of a plain function as a functor of which there is only one instance (and hence no need for a constructor - although this is no guarantee that the function only depends on its formal parameters: it might depend on global variables...)
Rather than asking "Should I put this algorithm in a function or a class?", ask yourself "Would it be useful to be able to pass the parameters to this algorithm in two or more stages?" In your example, all the parameters go into the constructor and none into the later call to calculate, so it makes little sense to ask users of your class to make two calls.
In C++11 the distinction breaks down further (and things get a lot more convenient), in recognition of the fluidity of these ideas:
auto doubler = [] (int val) { return val * 2; };
std::cout << doubler(3) << std::endl; // prints 6
Here, doubler is a lambda, which is essentially a nifty way to declare an instance of a compiler-generated class that implements the () operator.
Reproducing the original example more exactly, we would want a function-like thing called multiplier that accepts a factor, and returns another function-like thing that accepts a value v and returns v * factor.
auto multiplier = [] (int factor)
{
    return [=] (int v) { return v * factor; };
};
auto doubler = multiplier(2);
std::cout << doubler(3) << std::endl; // prints 6
Note the pattern: ultimately we're multiplying two numbers, but we specify the numbers in two steps. The functor we get back from calling multiplier acts like a "package" containing the first number.
Although lambdas are relatively new, they are likely to become a very common part of C++ style (as they have in every other language they've been added to).
But sadly at this point we've reached the "cutting edge", as the above example works in GCC but not in MSVC 12 (I haven't tried it in MSVC 13). It does pass the IntelliSense checking of MSVC 12, though (they use two completely different compilers)! And you can fix it by wrapping the inner lambda with std::function<int(int)>( ... ).
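For reference, here is a minimal sketch of that std::function wrapping (assuming a C++11 compiler):

#include <functional>
#include <iostream>

int main() {
    // Wrapping the inner lambda in std::function<int(int)> gives the outer
    // lambda a concrete, nameable return type.
    auto multiplier = [](int factor) {
        return std::function<int(int)>([=](int v) { return v * factor; });
    };
    auto doubler = multiplier(2);
    std::cout << doubler(3) << std::endl; // prints 6
    return 0;
}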
Even so, you can use these ideas in old-school C++ when writing functors by hand.
Looking further ahead, resumable functions may make it into some future version of the language (Microsoft is pushing hard for them as they are practically identical to async/await in C#) and that is yet another blurring of the distinction between functions and classes (a resumable function acts like a constructor for a state machine class).

Dynamic Libraries, plugin frameworks, and function pointer casting in C++

I am trying to create a very open plugin framework in c++, and it seems to me that I have come up with a way to do so, but a nagging thought keeps telling me that there is something very, very wrong with what I am doing, and it either won't work or it will cause problems.
The design I have for my framework consists of a Kernel that calls each plugin's init function. The init function then turns around and uses the Kernel's registerPlugin and registerFunction to get a unique id and then register each function the plugin wants to be accessible using that id, respectively.
The function registerPlugin returns the unique id. The function registerFunction takes that id, the function name, and a generic function pointer, like so:
bool registerFunction(int plugin_id, string function_name, plugin_function func){}
where plugin_function is
typedef void (*plugin_function)();
The kernel then takes the function pointer and puts it in a map with the function_name and plugin_id. All plugins registering their functions must cast them to the plugin_function type.
In order to retrieve the function, a different plugin calls the Kernel's
plugin_function getFunction(string plugin_name, string function_name);
Then that plugin must cast the plugin_function to its original type so it can be used. It knows (in theory) what the correct type is by having access to a .h file outlining all the functions the plugin makes available. Plugins, by the by, are implemented as dynamic libraries.
Is this a smart way to accomplish the task of allowing different plugins to connect with each other? Or is this a crazy and really terrible programming technique? If it is, please point me in the direction of the correct way to accomplish this.
EDIT: If any clarification is needed, ask and it will be provided.
Function pointers are strange creatures. They're not necessarily the same size as data pointers, and hence cannot be safely cast to void* and back. But, the C++ (and C) specifications allow any function pointer to be safely cast to another function pointer type (though you have to later cast it back to the earlier type before calling it if you want defined behaviour). This is akin to the ability to safely cast any data pointer to void* and back.
Pointers to methods are where it gets really hairy: a method pointer might be larger than a normal function pointer, depending on the compiler, whether the application is 32- or 64-bit, etc. But even more interesting is that, even on the same compiler/platform, not all method pointers are the same size: Method pointers to virtual functions may be bigger than normal method pointers; if multiple inheritance (with e.g. virtual inheritance in the diamond pattern) is involved, the method pointers can be even bigger. This varies with compiler and platform too. This is also the reason that it's difficult to create function objects (that wrap arbitrary methods as well as free functions) especially without allocating memory on the heap (it's just possible using template sorcery).
So, by using function pointers in your interface, it becomes impractical for the plugin authors to pass back method pointers to your framework, even if they're using the same compiler. This might be an acceptable constraint; more on this later.
Since there's no guarantee that function pointers will be the same size from one compiler to the next, by registering function pointers you're limiting the plugin authors to compilers that implement function pointers having the same size as your compiler does. This wouldn't necessarily be so bad in practice, since function pointer sizes tend to be stable across compiler versions (and may even be the same for multiple compilers).
The real problems start to arise when you want to call the functions pointed to by the function pointers; you can't safely call the function at all if you don't know its true signature (you will get poor results ranging from "not working" to segmentation faults). So, the plugin authors would be further limited to registering only void functions that take no parameters.
It gets worse: the way a function call actually works at the assembler level depends on more than just the signature and function pointer size. There's also the calling convention, the way exceptions are handled (the stack needs to be properly unwound when an exception is thrown), and the actual interpretation of the bytes of function pointer (if it's larger than a data pointer, what do the extra bytes signify? In what order?). At this point, the plugin author is pretty much limited to using the same compiler (and version!) that you are, and needs to be careful to match the calling convention and exception handling options (with the MSVC++ compiler, for example, exception handling is only explicitly enabled with the /EHsc option), as well as use only normal function pointers with the exact signature you define.
All the restrictions so far can be considered reasonable, if a bit limiting. But we're not done yet.
If you throw in std::string (or almost any part of the STL), things get even worse though, because even with the same compiler (and version), there are several different flags/macros that control the STL; these flags can affect the size and meaning of the bytes representing string objects. It is, in effect, like having two different struct declarations in separate files, each with the same name, and hoping they'll be interchangeable; obviously, this doesn't work. An example flag is _HAS_ITERATOR_DEBUGGING. Note that these options can even change between debug and release mode! These types of errors don't always manifest themselves immediately/consistently and can be very difficult to track down.
You also have to be very careful with dynamic memory management across modules, since new in one project may be defined differently from new in another project (e.g. it may be overloaded). When deleting, you might have a pointer to an interface with a virtual destructor, meaning the vtable is needed to properly delete the object, and different compilers all implement the vtable stuff differently. In general, you want the module that allocates an object to be the one to deallocate it; more specifically, you want the code that deallocates an object to have been compiled under the exact same conditions as the code that allocated it. This is one reason std::shared_ptr can take a "deleter" argument when it is constructed -- because even with the same compiler and flags (the only guaranteed safe way to share shared_ptrs between modules), new and delete may not be the same everywhere the shared_ptr can get destroyed. With the deleter, the code that creates the shared pointer controls how it is eventually destroyed too. (I just threw this paragraph in for good measure; you don't seem to be sharing objects across module boundaries.)
All of this is a consequence of C++ having no standard binary interface (ABI); it's a free-for-all, where it is very easy to shoot yourself in the foot (sometimes without realising it).
So, is there any hope? You betcha! You can expose a C API to your plugins instead, and have your plugins also expose a C API. This is quite nice because a C API can be interoperated with from virtually any language. You don't have to worry about exceptions, apart from making sure they can't bubble up above the plugin functions (that's the authors' concern), and it's stable no matter the compiler/options (assuming you don't pass STL containers and the like). There's only one standard calling convention (cdecl), which is the default for functions declared extern "C". void*, in practice, will be the same across all compilers on the same platform (e.g. 8 bytes on x64).
You (and the plugin authors) can still write your code in C++, as long as all the external communication between the two uses a C API (i.e. pretends to be a C module for the purposes of interop).
C function pointers are also likely compatible between compilers in practice, though if you'd rather not depend on this you could have the plugin register a function name (const char*) instead of address, and then you could extract the address yourself using, e.g., LoadLibrary with GetProcAddress for Windows (similarly, Linux and Mac OS X have dlopen and dlsym). This works because name-mangling is disabled for functions declared with extern "C".
Note that there's no direct way around restricting the registered functions to be of a single prototype type (otherwise, as I've said, you can't call them properly). If you need to give a particular parameter to a plugin function (or get a value back), you'll need to register and call the different functions with different prototypes separately (though you could collapse all the function pointers down to a common function pointer type internally, and only cast back at the last minute).
Finally, while you cannot directly support method pointers (which don't even exist in a C API, but are of variable size even with a C++ API and thus cannot be easily stored), you can allow the plugins to supply a "user-data" opaque pointer when registering their function, which is passed to the function whenever it's called; this gives the plugin authors an easy way to write function wrappers around methods and store the object to apply the method to in the user-data parameter. The user-data parameter can also be used for anything else the plugin author wants, which makes your plugin system much easier to interface with and extend. Another example use is to adapt between different function prototypes using a wrapper and extra arguments stored in the user-data.
These suggestions lead to code something like this (for Windows -- the code is very similar for other platforms):
// Shared header
extern "C" {
    typedef void (*plugin_function)(void*);
    bool registerFunction(int plugin_id, const char* function_name, void* user_data);
}

// Your plugin registration code
hModule = LoadLibrary(pluginDLLPath);

// Your plugin function registration code
auto pluginFunc = (plugin_function)GetProcAddress(hModule, function_name);
// Store pluginFunc and user_data in a map keyed to function_name

// Calling a plugin function
pluginFunc(user_data);

// Declaring a plugin function
extern "C" void aPluginFunction(void*);
struct Foo { void doSomething() { } };  // public so the free function below can call it

// Defining a plugin function
void aPluginFunction(void* user_data)
{
    static_cast<Foo*>(user_data)->doSomething();
}
Sorry for the length of this reply; most of it can be summed up with "the C++ standard doesn't extend to interoperation; use C instead since it at least has de facto standards."
Note: Sometimes it's simplest just to design a normal C++ API (with function pointers or interfaces or whatever you like best) under the assumption that the plugins will be compiled under exactly the same circumstances; this is reasonable if you expect all the plugins to be developed by yourself (i.e. the DLLs are part of the project core). This could also work if your project is open-source, in which case everybody can independently choose a cohesive environment under which the project and the plugins are compiled -- but then this makes it hard to distribute plugins except as source code.
Update: As pointed out by ern0 in the comments, it's possible to abstract the details of the module interoperation (via a C API) so that both the main project and the plugins deal with a simpler C++ API. What follows is an outline of such an implementation:
// iplugin.h -- shared between the project and all the plugins
class IPlugin {
public:
    virtual ~IPlugin() { }
    virtual void registerSelf() { }   // "register" is a reserved keyword, so use another name
    virtual void initialize() = 0;
    // Your application-specific functionality here:
    virtual void onCheeseBurgerEatenEvent() { }
};
// C API:
extern "C" {
    // Returns the number of plugins in this module
    int getPluginCount();
    // Called to register the nth plugin of this module.
    // A user-data pointer is expected in return (may be null).
    void* registerPlugin(int pluginIndex);
    // Called to initialize the nth plugin of this module
    void initializePlugin(int pluginIndex, void* userData);
    void onCheeseBurgerEatenEvent(int pluginIndex, void* userData);
}
// pluginimplementation.h -- plugin authors inherit from this abstract base class
#include "iplugin.h"
class PluginImplementation : public IPlugin {
public:
    PluginImplementation();
};

// pluginimplementation.cpp -- implements C API of plugin too
#include "pluginimplementation.h"
#include <vector>

struct LocalPluginRegistry {
    static std::vector<PluginImplementation*> plugins;
};
std::vector<PluginImplementation*> LocalPluginRegistry::plugins;

PluginImplementation::PluginImplementation() {
    LocalPluginRegistry::plugins.push_back(this);
}
extern "C" {
int getPluginCount() {
return static_cast<int>(LocalPluginRegistry::plugins.size());
}
void* registerPlugin(int pluginIndex) {
auto plugin = LocalPluginRegistry::plugins[pluginIndex];
plugin->register();
return (void*)plugin;
}
void initializePlugin(int pluginIndex, void* userData) {
auto plugin = static_cast<PluginImplementation*>(userData);
plugin->initialize();
}
void onCheeseBurgerEatenEvent(int pluginIndex, void* userData) {
auto plugin = static_cast<PluginImplementation*>(userData);
plugin->onCheeseBurgerEatenEvent();
}
}
// To declare a plugin in the DLL, just make a static instance:
class SomePlugin : public PluginImplementation {
    virtual void initialize() { }
};
SomePlugin plugin; // Will be created when the DLL is first loaded by a process
// plugin.h -- part of the main project source only
#include "iplugin.h"
#include <string>
#include <vector>
#include <windows.h>

class PluginRegistry;

class Plugin : public IPlugin {
public:
    Plugin(PluginRegistry* registry, int index, int moduleIndex)
        : registry(registry), index(index), moduleIndex(moduleIndex), userData(nullptr)
    {
    }
    virtual void registerSelf();
    virtual void initialize();
    virtual void onCheeseBurgerEatenEvent();
private:
    PluginRegistry* registry;
    int index;
    int moduleIndex;
    void* userData;
};
class PluginRegistry {
public:
    void registerPluginsInModule(std::string const& modulePath);
    ~PluginRegistry();
public:
    std::vector<Plugin*> plugins;
private:
    // Function pointer types matching the module's exported C API
    typedef int (*getPluginCountFunc)();
    typedef void* (*registerPluginFunc)(int);
    typedef void (*initializePluginFunc)(int, void*);
    typedef void (*onCheeseBurgerEatenEventFunc)(int, void*);

    struct Module {
        getPluginCountFunc getPluginCount;
        registerPluginFunc registerPlugin;
        initializePluginFunc initializePlugin;
        onCheeseBurgerEatenEventFunc onCheeseBurgerEatenEvent;
        HMODULE handle;
    };

    friend class Plugin;
    std::vector<Module> registeredModules;
};
// plugin.cpp
void Plugin::registerSelf() {
    auto func = registry->registeredModules[moduleIndex].registerPlugin;
    userData = func(index);
}
void Plugin::initialize() {
    auto func = registry->registeredModules[moduleIndex].initializePlugin;
    func(index, userData);
}
void Plugin::onCheeseBurgerEatenEvent() {
    auto func = registry->registeredModules[moduleIndex].onCheeseBurgerEatenEvent;
    func(index, userData);
}
void PluginRegistry::registerPluginsInModule(std::string const& modulePath) {
    // For Windows:
    HMODULE handle = LoadLibrary(modulePath.c_str());
    Module module;
    module.handle = handle;
    module.getPluginCount = (getPluginCountFunc)GetProcAddress(handle, "getPluginCount");
    module.registerPlugin = (registerPluginFunc)GetProcAddress(handle, "registerPlugin");
    module.initializePlugin = (initializePluginFunc)GetProcAddress(handle, "initializePlugin");
    module.onCheeseBurgerEatenEvent = (onCheeseBurgerEatenEventFunc)GetProcAddress(handle, "onCheeseBurgerEatenEvent");
    int moduleIndex = static_cast<int>(registeredModules.size());
    registeredModules.push_back(module);
    int pluginCount = module.getPluginCount();
    for (int i = 0; i < pluginCount; ++i) {
        auto plugin = new Plugin(this, i, moduleIndex);
        plugins.push_back(plugin);
    }
}
PluginRegistry::~PluginRegistry() {
    for (auto it = plugins.begin(); it != plugins.end(); ++it) {
        delete *it;
    }
    for (auto it = registeredModules.begin(); it != registeredModules.end(); ++it) {
        FreeLibrary(it->handle);
    }
}

// When discovering plugins (e.g. by loading all DLLs in a "plugins" folder):
PluginRegistry registry;
registry.registerPluginsInModule("plugins/cheeseburgerwatcher.dll");
for (auto it = registry.plugins.begin(); it != registry.plugins.end(); ++it) {
    (*it)->registerSelf();
}
for (auto it = registry.plugins.begin(); it != registry.plugins.end(); ++it) {
    (*it)->initialize();
}

// And then, when a cheeseburger is actually eaten:
for (auto it = registry.plugins.begin(); it != registry.plugins.end(); ++it) {
    auto plugin = *it;
    plugin->onCheeseBurgerEatenEvent();
}
This has the benefit of using a C API for compatibility, but also offering a higher level of abstraction for plugins written in C++ (and for the main project code, which is C++). Note that it lets multiple plugins be defined in a single DLL. You could also eliminate some of the duplication of function names by using macros, but I chose not to for this simple example.
All of this, by the way, assumes plugins that have no interdependencies -- if plugin A affects (or is required by) plugin B, you need to devise a safe method for injecting/constructing dependencies as needed, since there's no way of guaranteeing what order the plugins will be loaded in (or initialized). A two-step process would work well in that case: Load and register all plugins; during registration of each plugin, let them register any services they provide. During initialization, construct requested services as needed by looking at the registered service table. This ensures that all services offered by all plugins are registered before any of them are attempted to be used, no matter what order plugins get registered or initialized in.
The approach you took is sane in general, but I see a few possible improvements.
Your kernel should export C functions with a conventional calling convention (cdecl, or maybe stdcall if you are on Windows) for the registration of plugins and functions. If you use a C++ function then you are forcing all plugin authors to use the same compiler and compiler version that you use, since many things like C++ function name mangling, STL implementation and calling conventions are compiler specific.
Plugins should only export C functions like the kernel.
From the definition of getFunction it seems each plugin has a name, which other plugins can use to obtain its functions. This is not a safe practice: two developers can create two different plugins with the same name, so when a plugin asks for some other plugin by name, it may get a different plugin than the expected one. A better solution would be for plugins to have a public GUID. This GUID can appear in each plugin's header file, so that other plugins can refer to it.
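A hypothetical sketch of what publishing such a GUID in the plugin's public header might look like (the file name, symbol and GUID value are all made up):

// foo_plugin.h -- hypothetical public header of the "Foo" plugin.
// Other plugins include this header and ask the kernel for the plugin
// by GUID rather than by a display name that might collide.
#ifndef FOO_PLUGIN_H
#define FOO_PLUGIN_H

// A GUID generated once by the plugin author and never changed.
static const char* const FOO_PLUGIN_GUID = "6f1a2c3e-9b7d-4e21-8a5c-0d4f6b2e9c11";

#endif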
You have not implemented versioning. Ideally you want your kernel to be versioned because invariably you will change it in the future. When a plugin registers with the kernel it passes the version of the kernel API it was compiled against. The kernel then can decide if the plugin can be loaded. For example, if kernel version 1 receives a registration request for a plugin that requires kernel version 2 you have a problem, the best way to address that is to not allow the plugin to load since it may need kernel features that are not present in the older version. The reverse case is also possible, kernel v2 may or may not want to load plugins that were created for kernel v1, and if it does allow it it may need to adapt itself to the older API.
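A minimal sketch of that handshake, with invented names and under the assumption that registration goes through a C entry point in the kernel:

// Hypothetical version check. KERNEL_API_VERSION is baked into the header
// the plugin was compiled against, and the plugin passes it back here.
#define KERNEL_API_VERSION 2

extern "C" bool kernelRegisterPlugin(int plugin_api_version, int* plugin_id_out)
{
    if (plugin_api_version > KERNEL_API_VERSION) {
        return false;  // plugin needs a newer kernel; refuse to load it
    }
    if (plugin_api_version < KERNEL_API_VERSION) {
        // Older plugin: adapt to the old API here, or refuse, depending on policy.
    }
    static int next_id = 0;
    *plugin_id_out = next_id++;  // hand out a unique id as in the original design
    return true;
}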
I'm not sure I like the idea of a plugin being able to locate another plugin and call its functions directly, as this breaks encapsulation. It seems better to me if plugins advertise their capabilities to the kernel, so that other plugins can find services they need by capability instead of by addressing other plugins by name or GUID.
Be aware that any plugin that allocates memory needs to provide a deallocation function for that memory. Each plugin could be using a different run-time library, so memory allocated by a plugin may be unknown to other plugins or the kernel. Having allocation and deallocation in the same module avoids problems.
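As an illustration of that rule (names invented), the plugin can export a matching create/destroy pair so that the same runtime that allocates a block also frees it:

// Hypothetical plugin-side API: anything the plugin allocates, the plugin
// also frees, so mismatched runtimes never exchange raw new/delete.
extern "C" {

struct ResultBuffer {
    char* data;
    int   size;
};

ResultBuffer* plugin_create_result(int size) {
    ResultBuffer* r = new ResultBuffer;
    r->data = new char[size];
    r->size = size;
    return r;
}

void plugin_destroy_result(ResultBuffer* r) {
    delete[] r->data;   // freed by the same runtime that allocated it
    delete r;
}

} // extern "C"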
C++ has no standard ABI. So what you want to do comes with a restriction: the plugins and your framework must be compiled and linked with the same compiler and linker, with the same options, on the same OS. That defeats the purpose if the goal is interoperation via binary distribution, because each plugin developed for the framework would have to be shipped in many versions, each targeting a different compiler on a different OS. So distributing source code will be more practical than this, and that's the GNU way (download the source, configure, and make).
COM is a choice, but it is complex and dated. Or managed C++ on the .NET runtime, but those are Microsoft-only. If you want a universal solution, I suggest you change to another language.
As jean mentions, since there is no standard C++ ABI and standard name mangling conventions you are stuck to compile things with same compiler and linker. If you want a shared library/dll kind of plugins you have to use something C-ish.
If everything will be compiled with the same compiler and linker, you may also want to consider std::function.
#include <functional>
#include <iostream>
#include <map>
#include <string>

typedef std::function<void ()> plugin_function;
std::map<std::string, plugin_function> fncMap;

void register_func(std::string name, plugin_function fnc)
{
    fncMap[name] = fnc;
}

void call(std::string name)
{
    auto it = fncMap.find(name);
    if (it != fncMap.end())
        (it->second)(); // it->second is a function object
}
///////////////
void func()
{
    std::cout << "plain" << std::endl;
}

class T
{
public:
    void method()
    {
        std::cout << "method" << std::endl;
    }
    void method2(int i)
    {
        std::cout << "method2 : " << i << std::endl;
    }
};

int main()
{
    T t; // of course "t" needs to outlive the map, you could just as well use shared_ptr

    register_func("plain", func);
    register_func("method", std::bind(&T::method, &t));
    register_func("method2_5", std::bind(&T::method2, &t, 5));
    register_func("method2_15", std::bind(&T::method2, &t, 15));

    call("plain");
    call("method");
    call("method2_5");
    call("method2_15");
    return 0;
}
You can also have plugin functions that take arguments. This will use the placeholders for std::bind, though you may soon find that it lags somewhat behind boost::bind. Boost bind has nice documentation and examples.
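A small sketch of that placeholder usage, extending the map idea above to functions taking one argument (names are illustrative):

#include <functional>
#include <iostream>
#include <map>
#include <string>

// A second map for plugin functions taking one int argument; std::placeholders
// lets the argument be supplied at call time.
std::map<std::string, std::function<void(int)>> fncMap1;

void scale(int factor) { std::cout << "scale by " << factor << std::endl; }

struct T2 {
    void method2(int i) { std::cout << "method2 : " << i << std::endl; }
};

int main() {
    T2 t;
    fncMap1["scale"] = scale;
    fncMap1["method2"] = std::bind(&T2::method2, &t, std::placeholders::_1);

    fncMap1["scale"](3);      // argument bound at the call site
    fncMap1["method2"](42);
    return 0;
}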
There is no reason why you should not do this. In C++ using this style of pointer is the best since it's just a plain pointer. I know of no popular compiler that would do anything as brain-dead as not making a function pointer like a normal pointer. It is beyond the bounds of reason that someone would do something so horrible.
The Vst plugin standard operates in a similar way. It just uses function pointers in the .dll and does not have ways of calling directly to classes. Vst is a very popular standard and on windows people use just about any compiler to do Vst plugins, including Delphi which is pascal based and has nothing to do with C++.
So I would do exactly what you suggest personally. For the common well-known plugins I would not use a string name but an integer index which can be looked up much faster.
The alternative is to use interfaces but I see no reason to if your thinking is already based around function pointers.
If you use interfaces then it is not so easy to call the functions from other languages. You can do it from Delphi, but what about .NET?
With your function pointer style suggestion you can use .NET to make one of the plugins for example. Obviously you would need to host Mono in your program to load it but just for hypothetical purposes it illustrates the simplicity of it.
Besides, when you use interfaces you have to get into reference counting which is nasty. Stick your logic in function pointers like you suggest and then wrap the control in some C++ classes to do the calling and stuff for you. Then other people can make the plugins with other languages such as Delphi Pascal, Free Pascal, C, Other C++ compilers etc...
But as always, regardless of what you do, exception handling between compilers will remain an issue, so you have to think about error handling. The best way is for the plugin's own methods to catch the plugin's own exceptions and return an error code to the kernel, etc...
With all the excellent answers above, I'll just add that this practice is actually pretty widespread. In my practice, I've seen it both in commercial projects and in freeware/open-source ones.
So - yes, it's good and proven architecture.
You don't need to register functions manually. Really? Really.
What you could use is a proxy implementation of your plugin interface, where each function loads its original from the shared library on demand, transparently, and calls it. Whoever gets hold of a proxy object for that interface can just call the functions; they will be loaded on demand.
If plugins are singletons, then there is no need for manual binding at all (otherwise the correct instance has to be chosen first).
The idea for the developer of a new plugin would be to describe the interface first, then have a generator which generates a stub for the implementation in the shared library, and additionally a plugin proxy class with the same signature but with on-demand autoloading, which is then used in the client software. Both should fulfil the same interface (in C++, a pure abstract class).
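A rough sketch of such a proxy, Windows-flavoured and with invented names: each method resolves its exported C function from the library on first use and then forwards the call.

#include <stdexcept>
#include <windows.h>

// Proxy that fulfils the plugin interface; the exported C symbol is resolved
// lazily, the first time the method is called.
class RendererProxy {
public:
    explicit RendererProxy(const char* dllPath)
        : module_(LoadLibraryA(dllPath)), renderFunc_(nullptr) {
        if (!module_) throw std::runtime_error("cannot load plugin");
    }
    ~RendererProxy() { FreeLibrary(module_); }

    void render(int frame) {
        if (!renderFunc_) {  // load on demand, transparently to the caller
            renderFunc_ = reinterpret_cast<RenderFunc>(
                GetProcAddress(module_, "plugin_render"));
            if (!renderFunc_) throw std::runtime_error("symbol not found");
        }
        renderFunc_(frame);
    }

private:
    typedef void (*RenderFunc)(int);
    HMODULE module_;
    RenderFunc renderFunc_;
};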

Pattern to share data between objects in C++

I have started a migration of a high energy physics algorithm written in FORTRAN to an object oriented approach in C++. The FORTRAN code uses a lot of global variables all across a lot of functions.
I have simplified the global variables into a set of input variables, and a set of invariants (variables calculated once at the beginning of the algorithm and then used by all the functions).
Also, I have divided the full algorithm into three logical steps, represented by three different classes. So, in a very simple way, I have something like this:
double calculateFactor(double x, double y, double z)
{
    InvariantsTypeA invA;   // note: "InvariantsTypeA invA();" would declare a function, not an object
    InvariantsTypeB invB;
    // they need x, y and z
    invA.CalculateValues();
    invB.CalculateValues();

    Step1 s1;
    Step2 s2;
    Step3 s3;
    // they need x, y, z, invA and invB

    return s1.Eval() + s2.Eval() + s3.Eval();
}
My problem is:
for doing the calculations all the InvariantsTypeX and StepX objects need the input parameters (and these are not just three).
the three objects s1, s2 and s3 need the data of the invA and invB objects.
all the classes use several other classes through composition to do their job, and all those classes also need the input and the invariants (by example, s1 has a member object theta of class ThetaMatrix that needs x, z and invB to get constructed).
I cannot rewrite the algorithm to reduce the global values, because it follows several high energy physics formulas, and those formulas are just like that.
Is there a good pattern to share the input parameters and the invariants to all the objects used to calculate the result?
Should I use singletons? (But the calculateFactor function is evaluated around a million times.)
Or should I pass all the required data as arguments to the objects when they are created? (But if I do that, then the data will be passed everywhere, in every member object of every class, creating a mess.)
Thanks.
Well, in C++ the most suitable solution, given your constraints and conditions, is to use pointers. Many developers told you to use boost::shared_ptr. It is not strictly necessary, although it helps, especially when considering portability and robustness to system faults.
It is not necessary for you to bind yourself to Boost. It is true that much of it is header-only and that standardization is bringing parts of Boost directly into the C++ standard library, but if you do not want to use an external library you obviously don't have to.
So let's try to solve your problem using just C++ and what it actually provides.
You'll probably have a main function, and there, as you said before, you initialize all the invariant elements... so you basically have constants, and they can be of any type (no need to make them const if you don't want to). In main you instantiate your invariant elements and hand a pointer to them to every component requiring their use. First, in a separate file called "common_components.hpp", consider the following (I assume that you need some types for your invariant variables):
typedef struct {
    Type1 invariant_var1;
    Type2 invariant_var2;
    ...
    TypeN invariant_varN;
} InvariantType; // Contains the variables I need; instantiating it yields the set of "globals".

typedef InvariantType* InvariantPtr; // Will point to a set of invariants
In your "main.cpp" file you'll have:
#include "common_components.hpp"
// Functions declaration
int main(int, char**);
MyType1 CalculateValues1(InvariantPtr); /* Your functions have as imput param the pointer to globals */
MyType2 CalculateValues2(InvariantPtr); /* Your functions have as imput param the pointer to globals */
...
MyType3 CalculateValuesN(InvariantPtr); /* Your functions have as imput param the pointer to globals */
// Main implementation
int main(int argc, char** argv) {
InvariantType invariants = {
value1,
value2,
...
valueN
}; // Instantiating all invariants I need.
InvariantPtr global = &invariants;
// Now I have my variable global being a pointer to global.
// Here I have to call the functions
CalculateValue1(global);
CalculateValue2(global);
...
CalculateValueN(global);
}
If you have functions returning or using the global variables, change your methods' interfaces to take the pointer to the struct. By doing so, all changes will propagate to all code using those variables.
Why not pass the invariants as a function parameter, or to the constructor of the class that has the calculateFactor method?
Also try to gather parameters together if you have too many params for a single function (for instance, instead of (x, y, z) pass a 3D point; you then have only 1 parameter instead of 3).
three logical steps, represented by three different classes
This may not have been the best approach.
A single class can have a large number of "global" variables, shared by all methods of the class.
What I've done when converting old code (C or Fortran) to a new OO structure is to try to create a single class which represents a more complete "thing".
In some cases, well-structured FORTRAN would use "Named COMMON Blocks" to cluster things into meaningful groups. This is a hint as to what the "thing" really was.
Also, FORTRAN will have lots of parallel arrays which aren't really separate things; they're separate attributes of a common thing.
DOUBLE X(200)
DOUBLE Y(200)
Is really a small class with two attributes that you would put into a collection.
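For illustration, the C++ equivalent might be nothing more than:

#include <vector>

// The two parallel FORTRAN arrays above collapse into one small class
// (two attributes of a single thing) held in a collection.
struct Point {
    double x;
    double y;
};

std::vector<Point> points(200);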
Finally, you can easily create large classes with nothing but data, separate from the class that contains the functions that do the work. This is kind of creepy, but it allows you to finesse the common issue by translating a COMMON block into a class and simply passing an instance of that class to every function that uses the COMMON.
There is a very simple template class to share data between objects in C++, and it is called shared_ptr. It is in the standard library (since C++11) and in Boost.
If two objects both have a shared_ptr to the same object they get shared access to whatever data it holds.
In your particular case you probably don't want this but want a simple class that holds the data.
class FactorCalculator
{
    InvariantsType invA;
    InvariantsType invB;
public:
    FactorCalculator() // calculate the invariants once per calculator
    {
        invA.CalculateValues();
        invB.CalculateValues();
    }

    // call multiple times with different values of x, y, z
    double calculateFactor( double x, double y, double z ) /*const*/
    {
        // calculate using pre-calculated values in invA and invB
    }
};
Instead of passing each parameter individually, create another class to store them all and pass an instance of that class:
// Before
void f1(int a, int b, int c) {
    cout << a << b << c << endl;
}

// After
void f2(const HighEnergyParams& x) {
    cout << x.a << x.b << x.c << endl;
}
First point: globals aren't nearly as bad (in themselves) as many (most?) programmers claim. In fact, in themselves, they aren't really bad at all. They're primarily a symptom of other problems, primarily 1) logically separate pieces of code that have been unnecessarily intermixed, and 2) code that has unnecessary data dependencies.
In your case, it sounds like you've already eliminated (or at least minimized) the real problems (being invariants, not really variables, eliminates one major source of problems all by itself). You've already stated that you can't eliminate the data dependencies, and you've apparently un-mingled the code to the point that you have at least two distinct sets of invariants. Without seeing the code, that may be coarser granularity than really needed, and maybe upon closer inspection some of those dependencies can be eliminated completely.
If you can reduce or eliminate the dependencies, that's a worthwhile pursuit -- but eliminating the globals, in itself, is rarely worthwhile or useful. In fact, I'd say within the last decade or so, I've seen fewer problems caused by globals, than by people who didn't really understand their problems attempting to eliminate what were (or should have been) perfectly fine as globals.
Given that they are intended to be invariant, what you probably should do is enforce that explicitly. For example, have a factory class (or function) that creates an invariant class. The invariant class makes the factory its friend, but that's the only way members of the invariant class can change. The factory class, in turn, has (for example) a static bool, and executes an assert if you attempt to run it more than once. This gives (a reasonable level of) assurance that the invariants really are invariant (yes, a reinterpret_cast will let you modify the data anyway, but not by accident).
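A minimal sketch of that factory/friend arrangement, under the assumption that the invariants are built exactly once (names are illustrative):

#include <cassert>

// Only the factory can fill in the invariants, and the factory asserts that
// it is run at most once.
class Invariants {
public:
    double value() const { return value_; }
private:
    friend class InvariantFactory;   // the only code allowed to set members
    double value_ = 0.0;
};

class InvariantFactory {
public:
    static Invariants create(double v) {
        static bool alreadyRun = false;
        assert(!alreadyRun && "invariants must be created exactly once");
        alreadyRun = true;
        Invariants inv;
        inv.value_ = v;
        return inv;
    }
};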
The one real question I'd have is whether there's a real point in separating your invariants into two "chunks" if all the calculations really depend on both. If there's a clear, logical separation between the two, that's great (even if they do get used together). If you have what's logically a single block of data, however, trying to break it into pieces may be counterproductive.
Bottom line: globals are (at worst) a symptom, not a disease. Insisting that you're going to get the patient's temperature down to 98.6 degrees may be counterproductive -- especially if the patient is an animal whose normal body temperature is actually 102 degrees.
Uhm. C++ is not necessarily object-oriented. It is the GTA of programming! You are free to be an object-obsessed freak, a relaxed C programmer, a functional programmer, whatever; a mixed martial artist.
My point: if global variables worked in your FORTRAN build, just copy and paste them into C++. No need to avoid global variables. It follows the principle of "don't touch legacy code".
Let's understand why global variables may cause problems. As you know, variables are the program's state, and state is the soul of the program. Bad or invalid state causes runtime and logic errors. The problem with global variables (global state) is that any part of the code has access to them; thus, in case of invalid state, there are many suspects to consider, meaning functions and operators. However, this is only really a concern if many functions actually touch the global variable, and here you are the only one working on your lonely program. Global variables are only a real problem if you are doing a team project. In that case many people have access to them, writing different functions that may or may not be accessing that variable.

Closures in C++

I've found myself in a strange place, mentally. In a C++ project, I long for closures.
Background. There's a Document-type class with a public Render method which spawns a deep call tree. There's some transient state that only makes sense during rendering. Right now it resides in the class like regular member variables. However, this is not satisfactory on some levels - this data only makes sense during a Render call, why store it all the time? Passing it around in arguments would be ugly - there are around 15 variables there. Passing around a structure would add a lot of "RenderState->..." in the lower-level methods.
So what do I want? I want the world, like we all do. Specifically, a set of variables that are:
available to some methods in a class (not all of them)
accessible by name alone (no pState->... stuff - so that refactoring is easy)
not copied around on every method call
only live during a method call and up its call tree (assuming trees grow up)
live on a stack
I know I can have some of those properties with C++ - but not all of them. Tell me I'm not turning weird.
Heck, in Pascal, of all places, nested functions give you all that...
So what is a good workaround to emulate closures in C++, getting as many of the above benefits as possible?
Standard C++ since C++11 provides native lambda expressions, and several compilers (VC10+, GCC and Clang at least) implement them.
With GCC and Clang you can activate it with "--std=c++11" (or use a higher standard version if available). VC10 and later versions have it enabled without the need for flags.
By the way, boost::lambda (which is not perfect, but works with C++03) also provides lambdas in C++.
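For illustration, a minimal sketch (assuming a C++11 compiler, names invented) of how a lambda covers most of the wish list: the render-only state lives on the stack of Render, is addressed by name alone, is not copied, and dies when Render returns.

#include <iostream>
#include <string>

void Render() {
    // Transient render-only state: on the stack, gone when Render returns.
    int indentLevel = 0;
    double scale = 1.0;

    auto renderNode = [&](const char* name) {
        // captured by reference: no copies, no pState-> prefix
        std::cout << std::string(indentLevel * 2, ' ') << name
                  << " @ scale " << scale << "\n";
    };

    renderNode("root");
    ++indentLevel;
    renderNode("child");
}

int main() {
    Render();
    return 0;
}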
You don't have nested functions, but you have local classes:
void Document::Render(Param param)
{
    class RenderState
    {
    public:
        RenderState(Document&)
        {
            //...
        }
        void Go(Param);
    private:
        // "Nested" functions
        // ....
        // Data that nested functions operate on
        // ...
    };

    RenderState s(*this);
    s.Go(param);
}
See this GotW article for more information
Personally, I'd go with the RenderState approach.
Alternatively, if there's a well-defined set of Render-only functions that all require access to the same data, I'd seriously investigate pulling those into their own DocumentRenderer class that contains both the appropriate methods and the appropriate member variables. (This is similar to Fowler's "method object" refactoring.)
C++ doesn't have nested functions, but local classes can serve as an imperfect solution. (Imperfect because local classes' methods cannot access variables of the enclosing class and because they can't be used to instantiate templates.) A local class is simply a class that's declared, along with its methods, within the body of a function. Herb Sutter discusses local classes in more detail here.
Local classes are used to implement Boost's ScopeExit library. ScopeExit's reviewers noted that ScopeExit "suggests a method for creating a general closure mechanism as a library," so if you aren't happy with a RenderState or DocumentRenderer approach, ScopeExit's implementation may give you some ideas for closures in C++.
Currently there are no closures in C++ that would generate "ordinary" first-class functions (whether member or non-member). Moreover, there's no standard way to implement such closures.
Closure semantics is available for functors in template metaprogramming at compile time, but that's a completely different kind of beast. In order to obtain true run-time closure functionality for first-class functions you have no other choice but to use a non-standard, low-level implementation like this one, for example.
A functor is basically a closure.
Why the downvotes? Take Éric's comment, change void Go(Param); to void operator () (Param); and there you have it.
There is no way to keep the stack in a native application after the function has exited, but this would be necessary to make closures like the ones in Javascript. And there is no way to reference a function's stack without doing something evil. A class that acts like a function (= a functor) would have to get all the relevant information passed in somehow, but this is as close as you get in C++. It has state, it has code, and you can pass it around.
Please explain, where am I wrong?
As long as the local variables you want to bind are in scope, you can try something like the following to bind them to your inner class. Though, if you have read the above posted GotW article, it is a fragile solution.
#include <iostream>
using namespace std;

int main() {
    int x = 1;
    cout << x << endl; // 1

    class Inner {
    public:
        Inner(int& x) : bound_x(x) {}
        void do_sth() { ++bound_x; }
    private:
        int& bound_x;
    };

    Inner i(x);
    i.do_sth();
    cout << x << endl; // 2

    x = 5;
    i.do_sth();
    cout << x << endl; // 6
    return 0;
}