C/C++ Dynamic loading of functions with unknown prototype

I'm in the process of writing a kind of runtime system/interpreter, and one of things that I need to be able to do is call c/c++ functions located in external libraries.
On Linux I'm using the dlfcn.h functions to open a library and call a function located within it. The problem is that, when using dlsym(), the function pointer returned needs to be cast to an appropriate type before being called, so that the function's arguments and return type are known. However, if I'm calling some arbitrary function in a library, then obviously I will not know this prototype at compile time.
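For context, the pattern looks roughly like this (libfoo.so and foo are just placeholder names for illustration):
#include <dlfcn.h>
#include <cstdio>

int main() {
    void *handle = dlopen("libfoo.so", RTLD_LAZY);   // placeholder library
    if (!handle) { std::fprintf(stderr, "%s\n", dlerror()); return 1; }

    // dlsym() returns a plain void*; the cast below is exactly where the
    // prototype has to be known at compile time.
    typedef double (*foo_fn)(double);
    foo_fn foo = reinterpret_cast<foo_fn>(dlsym(handle, "foo"));
    if (foo) std::printf("%f\n", foo(2.0));

    dlclose(handle);
    return 0;
}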
So what I'm asking is: is there a way to call a dynamically loaded function, pass it arguments, and retrieve its return value without knowing its prototype?
So far I've come to the conclusion that there is no easy way to do this, but some workarounds that I've found are:
Ensure all the functions I want to load have the same prototype, and provide some sort of mechanism for these functions to retrieve parameters and return values. This is what I am doing currently.
Use inline asm to push the parameters onto the stack, and to read the return value. I really want to steer clear of doing this if possible!
If anyone has any ideas then it would be much appreciated.
Edit:
I have now found exactly what I was looking for:
http://sourceware.org/libffi/
"A Portable Foreign Function Interface Library"
(Although I’ll admit I could have been clearer in the original question!)
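To make the edit concrete, here is a rough sketch of how libffi lets the signature be described at runtime; it assumes the library exports double foo(double), and libfoo.so/foo are placeholder names:
#include <ffi.h>
#include <dlfcn.h>
#include <cstdio>

int main() {
    void *lib = dlopen("libfoo.so", RTLD_LAZY);   // placeholder library
    void *sym = dlsym(lib, "foo");                // placeholder function

    // Describe the signature at runtime: one double argument, double return.
    ffi_cif cif;
    ffi_type *arg_types[] = { &ffi_type_double };
    double arg = 2.0, result = 0.0;
    void *arg_values[] = { &arg };

    if (ffi_prep_cif(&cif, FFI_DEFAULT_ABI, 1, &ffi_type_double, arg_types) == FFI_OK) {
        ffi_call(&cif, FFI_FN(sym), &result, arg_values);
        std::printf("%f\n", result);
    }

    dlclose(lib);
    return 0;
}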

What you are asking for is if C/C++ supports reflection for functions (i.e. getting information about their type at runtime). Sadly the answer is no.
You will have to make the functions conform to a standard contract (as you said you were doing), or start implementing mechanics for trying to call functions at runtime without knowing their arguments.
Since having no knowledge of a function makes it impossible to call it, I assume your interpreter/"runtime system" at least has some user input or similar it can use to deduce that it's trying to call a function that will look like something taking those arguments and returning something not entirely unexpected. That lookup is hard to implement in itself, even with reflection and a decent runtime type system to work with. Mix in calling conventions, linkage styles, and platforms, and things get nasty real soon.
Stick to your plan, enforce a well-defined contract for the functions you load dynamically, and hopefully make do with that.

Can you add a dispatch function to the external libraries, e.g. one that takes a function name and N (optional) parameters of some sort of variant type and returns a variant? That way the dispatch function prototype is known. The dispatch function then does a lookup (or a switch) on the function name and calls the corresponding function.
Obviously it becomes a maintenance problem if there are a lot of functions.
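A rough sketch of what such a dispatch entry point might look like, using C++17's std::variant as a stand-in for whatever variant type you choose; the function names and the "add" example are invented for illustration:
#include <string>
#include <vector>
#include <variant>

// Some agreed-upon variant type shared by the runtime and the libraries.
using Value = std::variant<int, double, std::string>;

static Value do_add(const std::vector<Value> &args) {
    return std::get<int>(args.at(0)) + std::get<int>(args.at(1));
}

// The only prototype the runtime ever needs to know; extern "C" keeps the
// exported name unmangled so it can be found with dlsym()/GetProcAddress().
extern "C" Value dispatch(const char *name, const std::vector<Value> &args) {
    std::string fn(name);
    if (fn == "add") return do_add(args);
    // ... lookup (or switch) for the other exported functions ...
    return "error: unknown function " + fn;
}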

I believe the ruby FFI library achieves what you are asking. It can call functions
in external dynamically linked libraries without specifically linking them in.
http://wiki.github.com/ffi/ffi/
You probably can't use it directly in your scripting language, but perhaps the ideas are portable.
--
Brad Phelan
http://xtargets.heroku.com

I'm in the process of writing a kind of runtime system/interpreter, and one of things that I need to be able to do is call c/c++ functions located in external libraries.
As examples, you can check how Tcl and Python do that. If you are familiar with Perl, you can also look at Perl XS.
The general approach is to require an extra gateway library sitting between your interpreter and the target C library. From my experience with Perl XS, the main reasons are memory management/garbage collection and the C data types, which are hard or impossible to map directly onto the interpreter's language.
So what I'm asking is: is there a way to call a dynamically loaded function, pass it arguments, and retrieve its return value without knowing its prototype?
None known to me.
Ensure all the functions I want to load have the same prototype, and provide some sort of mechanism for these functions to retrieve parameters and return values. This is what I am doing currently.
This is what another team on my project is doing too. They have standardized the API for external plug-ins on something like this:
typedef std::list< std::string > string_list_t;
string_list_t func1(string_list_t stdin, string_list_t &stderr);
Common tasks for the plug-ins are to perform transformation, mapping, or expansion of the input, often using an RDBMS.
Previous versions of the interface grew unmaintainable over time, causing problems for customers, product developers, and 3rd-party plug-in developers. Frivolous use of std::string is allowed by the fact that the plug-ins are called relatively seldom (and even then the overhead is peanuts compared to the SQL used all over the place). The stdin argument is populated with input depending on the plug-in type. A plug-in call is considered failed if, in the stderr output parameter, any string starts with 'E:' ('W:' is for warnings; the rest is silently ignored and thus can be used for plug-in development/debugging).
dlsym() is used only once, on a function with a predefined name, to fetch from the shared library an array containing the function table (public function name, type, pointer, etc.).
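A sketch of that layout, with invented names:
// Shared header (invented names): one entry per exported plug-in function.
struct plugin_entry {
    const char *name;      // public name of the function
    int         type;      // tells the runtime which prototype to cast to
    void      (*func)();   // entry point, cast by the runtime according to 'type'
};

// The one symbol ever passed to dlsym(); the table ends with a null 'name'.
extern "C" const plugin_entry *get_plugin_table();

// Runtime side:
//   typedef const plugin_entry *(*get_table_fn)();
//   get_table_fn get_table =
//       reinterpret_cast<get_table_fn>(dlsym(handle, "get_plugin_table"));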

My solution is that you can define a generic proxy function which converts the dynamically loaded function to a uniform prototype, something like this:
#include <string>
#include <functional>

using result = std::function<std::string(std::string)>;

template <class F>
result proxy(F func) {
    // some type-traits technologies based on func type
}
In the user-defined file, you must add a definition to do the conversion:
double foo(double a) { /*...*/ }
auto local_foo = proxy(foo);
In your runtime system/interpreter, you can use dlsym to look up the proxied foo function; it is the user-defined function foo's responsibility to do the actual calculation.
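To make the proxy idea concrete, one possible sketch for unary functions uses stream insertion/extraction instead of explicit type traits; foo is the user function from above, everything else is illustrative:
#include <functional>
#include <sstream>
#include <string>

using result = std::function<std::string(std::string)>;

// Sketch: wrap any unary function whose argument and return types are
// streamable, converting to and from the uniform string-based prototype.
template <class R, class A>
result proxy(R (*func)(A)) {
    return [func](std::string input) {
        std::istringstream in(input);
        A arg{};
        in >> arg;                    // parse the argument from the string
        std::ostringstream out;
        out << func(arg);             // print the return value back out
        return out.str();
    };
}

double foo(double a) { return a * 2; }
auto local_foo = proxy(foo);          // local_foo("21") yields "42"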

Related

Strategy for Wrapping external library return types

I have been asked to unit test some legacy code.
Currently, the code is tightly coupled with a 3rd party library both in terms of method calls and types used.
I am planning on writing a wrapper around the library in the form of a Façade design pattern which will aid in testability, create a cleaner interface for the rest of the code and allow me to swap out the library in the future if required.
This works fine where the method calls have a void return type, because the library functions are self-contained. But what if the existing code uses library-specific types? An example is here:
LibrarySpecificType[] myVar = wrappedLibrary.DoX();
Although I have wrapped my library call in the above example, it still returns a library specific type, so it is still somewhat coupled.
Does anybody know a way around this?
You can just create wrapper classes around the types that are returned and have the wrappedLibrary return those wrapped types instead. This might be quite a lot of work if each of those types also exposes methods which accept and return other types. Something like this:
WrappedLibrarySpecificType[] myVar = wrappedLibrary.DoX();
Then the library wrapper will have to call the actual library, wrap the type the library returns, and return the wrapped type.
This ends up being a rabbit hole though and you will probably need to wrap every type.
If this is a large library, you might find some benefit in writing (or using) a tool which can generate the wrappers for you by reflecting over the types in the third-party library.
You might have some assistance in generating the delegating members, depending on your IDE.
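A sketch of the wrapping idea in C++ terms (all names here are stand-ins for the real library):
#include <vector>

// Stand-ins for the third-party library.
struct LibrarySpecificType { int value; };
struct Library { std::vector<LibrarySpecificType> DoX(); };

// Our own type: exposes only what the rest of the code actually needs and
// delegates to the wrapped instance internally.
class WrappedLibrarySpecificType {
public:
    explicit WrappedLibrarySpecificType(const LibrarySpecificType &impl) : impl_(impl) {}
    int Value() const { return impl_.value; }
private:
    LibrarySpecificType impl_;   // or a smart pointer if copying is expensive
};

// The facade converts at the boundary, so callers never see library types.
class WrappedLibrary {
public:
    std::vector<WrappedLibrarySpecificType> DoX() {
        std::vector<WrappedLibrarySpecificType> out;
        for (const auto &item : lib_.DoX())
            out.emplace_back(item);
        return out;
    }
private:
    Library lib_;
};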

Where should the user-defined parameters of a framework be ?

I am kind of a newbie and I am creating a framework to evolve objects in C++ with an evolutionary algorithm.
An evolutionary algorithm evolves objects and tests them to get the best solution (for example, evolve the weights of a neural network and test it on sample data, so that in the end you get a network which has good accuracy without having trained it).
My problem is that there are lots of parameters for the algorithm (type of selection/crossover/mutation, probabilities for each of them...) and since it is a framework, the user should be able to easily access and modify them.
CURRENT SOLUTION
For now, I created a header file parameters.h of this form:
// DON'T CHANGE THESE PARAMETERS
//mutation type
#define FLIP 1
#define ADD_CONNECTION 2
#define RM_CONNECTION 3
// USER DEFINED
static const int TYPE_OF_MUTATION = FLIP;
The user modifies the static variable TYPE_OF_MUTATION, and then my mutation function tests what the value of TYPE_OF_MUTATION is and calls the right mutation function.
This works well, but it has a few drawbacks:
when I change a parameter in this header and then call "make", no change is taken into account; I have to call "make clean" and then "make". From what I saw, this is not a problem with the makefile but with how building works. Even if it did re-build when I change a parameter, it would mean re-compiling the whole project, as these parameters are used everywhere; it is definitely not efficient.
if you want to run the genetic algorithm several times with different parameters, you have to run it a first time, save the results, change the parameters, then run it a second time, etc.
OTHER POSSIBILITIES
I thought about taking these parameters as arguments of the top-level function. The problem is that the function would then take 20 arguments or so; it doesn't seem very readable...
What I mean about the top-level function is that for now, the evolutionary algorithm is run simply by doing this:
PopulationManager myPop;
myPop.evolveIt();
If I defined the parameters as arguments, we would have something like:
PopulationManager myPop;
myPop.evolveIt(20,10,5,FLIP,9,8,2,3,TOURNAMENT,0,23,4);
You can see how hellish it may be to always define parameters in the right order!
CONCLUSION
The frameworks I know make you build your algorithm yourself from pre-defined functions, but the user shouldn't have to go through all the code to change parameters one by one.
It may be useful to indicate that this framework will be used internally, for a definite set of projects.
Any input about the best way to define these parameters is welcome !
If the options do not change I usually use a struct for this:
enum class MutationType {
    Flip,
    AddConnection,
    RemoveConnection
};

struct Options {
    // Documentation for mutation_type.
    MutationType mutation_type = MutationType::Flip;
    // Documentation for integer option.
    int integer_option = 10;
};
And then provide a constructor that takes these options.
Options options;
options.mutation_type = MutationType::AddConnection;
PopulationManager population(options);
C++11 makes this really easy, because it allows specifying defaults for the options, so a user only needs to set the options that need to be different from the default.
Also note that I used an enum for the options; this ensures that the user can only use correct values.
This is a classic example of polymorphism. In your proposed implementation you're doing a switch on constant to decide which polymorphic mutation algorithm you will choose to decide how to mutate the parameter. In C++, the corresponding mechanisms are templates (static polymorphism) or virtual functions (dynamic polymorphism) to select the appropriate mutating algorithm to apply to the parameter.
The templates way has the advantage that everything is resolvable at compile time and the resulting mutating algorithm could be inlined entirely, depending on the implementation. What you give up is the ability to dynamically select parameter mutation algorithms at runtime.
The virtual function way has the advantage that you can defer the choice of mutation algorithm until runtime, allowing this to vary based on input from the user or whatnot. The disadvantage is that the mutation algorithm can no longer be inlined and you pay the cost of a virtual function call (an extra level of indirection) when you mutate the parameter.
If you want to see a real example of how "algorithmic mutation" can work, look at evolve.cpp in my Iterated Dynamics repository on github. This is C code converted to C++ so it is neither using templates nor using virtual functions. Instead it uses function pointers and a switch-on-constant to select the appropriate code. However, the idea is the same.
My recommendation would be to see if you can use static polymorphism (templates) first. From your initial description you were fixing the mutation at compile-time anyway, so you're not giving anything up.
If that was just a prototyping phase and you intended to support switching of mutation algorithms at runtime, then look at virtual functions. As the other answer recommended, please shun C-style coding like #define constants and instead use proper enums.
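A bare-bones sketch of the virtual-function flavour, with invented names (Genome, MutationStrategy, and the concrete strategies are all illustrative):
#include <vector>

using Genome = std::vector<double>;   // placeholder for whatever gets mutated

// Strategy interface: each mutation algorithm is one implementation.
struct MutationStrategy {
    virtual ~MutationStrategy() = default;
    virtual void mutate(Genome &g) const = 0;
};

struct FlipMutation : MutationStrategy {
    void mutate(Genome &g) const override { /* flip a random gene in g */ }
};

struct AddConnectionMutation : MutationStrategy {
    void mutate(Genome &g) const override { /* add a connection to g */ }
};

// The algorithm only sees the interface; the concrete strategy can be
// chosen at runtime (user input, config file, ...).
void evolve(Genome &g, const MutationStrategy &mutation) {
    mutation.mutate(g);
}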
To solve the "long parameter list smell", the idea of packing all the parameters into a structure is a good one. You can achieve more readability on top of that by using the builder pattern to build up the structure of parameters in a more readable way than just assigning a bunch of values into a struct. In this blog post, I applied the builder pattern to the resource description structures in Direct3D. That allowed me to more directly express these "bags of data" with reasonable defaults and directly reveal my intent to override or replace default values with special values when necessary.
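As a hedged sketch, a small builder over the Options struct from the other answer (reusing its MutationType and Options, with an assumed PopulationManager constructor taking Options) might look like:
class OptionsBuilder {
public:
    OptionsBuilder &mutationType(MutationType t) { opts_.mutation_type = t; return *this; }
    OptionsBuilder &integerOption(int v) { opts_.integer_option = v; return *this; }
    Options build() const { return opts_; }
private:
    Options opts_;   // starts out holding the defaults
};

// Only the non-default values are spelled out, and each one is named:
PopulationManager population(OptionsBuilder()
                                 .mutationType(MutationType::AddConnection)
                                 .integerOption(42)
                                 .build());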

Windows programming: how to create a function in a DLL which can take all datatypes as input?

I want a common function which can take any data type as an argument and return a result in that same data type. How can I implement this via a DLL?
It seems that you would like to export a templated function in the DLL, without specifying its type.
You cannot do that, because templates are resolved at compile time (i.e. when the code is generated). As mentioned by #MSlaters, you cannot have an infinitely big template.
If you have a predefined number of data types, you can explicitly instantiate the template for each of them in your DLL code in order to have them exposed.
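As a sketch (getResult is an invented name), explicit instantiation inside the DLL forces the compiler to emit code for each supported type, which can then be exported in the usual way:
// Inside the DLL source file.
template <typename T>
T getResult(T input) {
    // ... whatever the function actually does ...
    return input;
}

// Explicit instantiations: object code for exactly these types ends up
// in the DLL, so they can be exported and linked against.
template int    getResult<int>(int);
template double getResult<double>(double);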
If you want the most generic thing possible, you can only have
void* getResult (void* inputParameter)
But unfortunately, you won't know how the memory is mapped for the object (so less of a gain and more of a pain, if you ask me).
You can't. A DLL contains compiled code, in particular the return statements. Since supporting an infinite number of types would require an infinite number of return statements, the DLL would be infinitely big.

Implications of using std::vector in a dll exported function

I have two dll-exported classes A and B. A's declaration contains a function which uses a std::vector in its signature like:
class EXPORT A {
    // ...
    std::vector<B> myFunction(std::vector<B> const &input);
};
(EXPORT is the usual macro to put in place __declspec(dllexport)/__declspec(dllimport) accordingly.)
Reading about the issues related to using STL classes in a DLL interface, I gather in summary:
Using std::vector in a DLL interface would require all the clients of that DLL to be compiled with the same version of the same compiler, because STL containers are not binary compatible. Even worse, depending on the use of that DLL by clients in conjunction with other DLLs, the 'unstable' DLL API can break these client applications when system updates are installed (e.g. Microsoft KB packages) (really?).
Despite the above, if required, std::vector can be used in a DLL API by exporting std::vector<B> like:
template class EXPORT std::allocator<B>;
template class EXPORT std::vector<B>;
though this is usually mentioned in the context where one wants to use a std::vector as a member of A (http://support.microsoft.com/kb/168958).
The following Microsoft Support Article discusses how to access std::vector objects created in a DLL through a pointer or reference from within the executable (http://support.microsoft.com/default.aspx?scid=kb;EN-US;Q172396). The above solution to use template class EXPORT ... seems to be applicable too. However, the drawback summarized under the first bullet point seems to remain.
To completely get rid of the problem, one would need to wrap the std::vector and change the signature of myFunction, use PIMPL, etc.
My questions are:
Is the above summary correct, or do I miss here something essential?
Why does compilation of my class 'A' not generate warning C4251 (class 'std::vector<_Ty>' needs to have dll-interface to be used by clients of ...)? I have not turned off any compiler warnings, and I don't get any warning about using std::vector in myFunction in the exported class A (with VS2005).
What needs to be done to correctly export myFunction in A? Is it viable to just export std::vector<B> and B's allocator?
What are the implications of returning std::vector by value? Assume a client executable which has been compiled with a different compiler (version). Does trouble persist when returning by value, where the vector is copied? I guess yes. Similarly for passing std::vector as a constant reference: could access to a std::vector<B> (which might have been constructed by an executable compiled with a different compiler version) lead to trouble within myFunction? I guess yes again...
Is the last bullet point listed above really the only clean solution?
Many thanks in advance for your feedback.
Unfortunately, your list is very much spot-on. The root cause of this is that the DLL-to-DLL or DLL-to-EXE contract is defined at the level of the operating system, while the interface between functions is defined at the level of the compiler. In a way, your task is similar (although somewhat easier) to that of client-server interaction, when the client and the server lack binary compatibility.
The compiler maps what it can to the way the DLL importing and exporting is done in a particular operating system. Since language specifications give compilers a lot of liberty when it comes to binary layout of user-defined types and sometimes even built-in types (recall that the exact size of int is compiler-dependent, as long as minimal sizing requirements are met), importing and exporting from DLLs needs to be done manually to achieve binary-level compatibility.
When you use the same version of the same compiler, this last issue above does not create a problem. However, as soon as a different compiler enters the picture, all bets are off: you need to go back to the plainly-typed interfaces, and introduce wrappers to maintain nice-looking interfaces inside your code.
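A sketch of what "plainly-typed" tends to mean in practice, reusing the question's EXPORT macro and B type (assumed to be a plain, trivially copyable struct) and keeping every std::vector on the caller's side of the boundary; the raw function name and error convention are invented:
#include <cstddef>
#include <vector>

// Exported, compiler-agnostic surface: plain pointers and lengths only.
// Returns the number of elements written, or a negative value on error.
extern "C" EXPORT int myFunctionRaw(const B *input, std::size_t input_count,
                                    B *output, std::size_t output_capacity);

// Header-only convenience wrapper compiled *in the client*, so the vector
// implementation involved is always the client's own.
inline std::vector<B> myFunction(const std::vector<B> &input) {
    std::vector<B> out(input.size());   // or some agreed maximum size
    int n = myFunctionRaw(input.data(), input.size(), out.data(), out.size());
    out.resize(n > 0 ? static_cast<std::size_t>(n) : 0);
    return out;
}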
I've been having the same problem and discovered a neat solution to it.
Instead of passing a std::vector, you can pass a QVector from the Qt library.
The problems you quote are then handled inside the Qt library and you do not need to deal with it at all.
Of course, the cost is having to use the library and accept its slightly worse performance.
In terms of the amount of coding and debugging time it saves you, this solution is well worth it.

Removing a parameter list from f(list) with preprocessor

It seems to me that I saw something weird being done in a Boost library, and it ended up being exactly what I'm trying to do now. I can't find it, though...
I want to create a macro that takes a signature and turns it into a function pointer:
void f(int,int) {}
...
void (*x)(int,int) = WHAT( (f(int,int)) );
x(2,4); // calls f()
I especially need this to work with member function pointers so that WHAT takes two params:
WHAT(ClassType, (f(int,int))); // results in static_cast<void (ClassType::*)(int,int)>(&ClassType::f)
It's not absolutely necessary in order to solve my problem, but it would make things a touch nicer.
This question has nothing, per se, to do with function pointers. What needs to be done is to use the preprocessor to take "f(int,int)" and turn it into two different parts:
'f'
'(int,int)'
Why:
I've solved the problem brought up here: Generating Qt Q_OBJECT classes pragmatically
I've started a series of articles explaining how to do it:
http://crazyeddiecpp.blogspot.com/2011/01/quest-for-sane-signals-in-qt-step-1.html
http://crazyeddiecpp.blogspot.com/2011/01/quest-for-sane-signals-in-qt-step-2.html
The signature must be evaluated from, and match exactly, the "signal" that the user is attempting to connect with. Qt users are used to expressing this as SIGNAL(fun(param,param)), so something like connect_static(SIGINFO(object,fun(param,param)), [](int,int){}) wouldn't feel too strange.
In order to construct the signature I need to be able to pull it out of the arguments supplied. There's enough information to get the member function address (using C++0x's decltype) and fetch the signature in order to generate the appropriate wrapper but I can't see how to get it out. The closest I can come up with is SIGINFO(object, fun, (param,param)), which is probably good enough but I figured I'd ask here before considering it impossible to get the exact syntax I'd prefer.
What you are trying to do is impossible using the standard preprocessor, unfortunately. There are a couple of reasons:
It is impossible to split parameters passed to a macro using a custom character; they have to be comma-delimited. Otherwise, that could solve your problem instantly.
You cannot use the preprocessor to define something that is not an identifier. Otherwise you could use double expansion, where ( and ) are each defined as a comma, split the arguments on that as if the input had been passed as f, int, int, and then process it as variadic arguments.
A function pointer definition in C++ does not allow you to deduce the name given to the defined type, unfortunately.
Going even further, even if you managed to create a function pointer, the code won't work for methods, because in order to invoke a method you need two pointers: a pointer to the method and a pointer to the class instance. This means you have to have some wrapper around this stuff.
That is why Qt uses its own tools, like moc, to generate glue code.
The closest thing you might have seen in Boost is probably the Signals, Bind, and Lambda libraries. It is ironic that those libraries are much more powerful than what you are trying to achieve, but at the same time they won't allow you to achieve it the way you want. For example, even if you could do what you want with the syntax you want, you wouldn't be able to "connect" a slot to a "signal" if the signal has a different signature. At the same time, the Boost libraries I mentioned above totally allow that. For example, if your "slot" expects more parameters than the "signal" provides, you can bind other objects to be passed when the "slot" is invoked. Those libraries can also suppress extra parameters if the "slot" does not expect them.
I'd say the best way from a C++ perspective, as of today, is to use the Boost Signals approach to implement event handling in GUI libraries. Qt doesn't use it for a number of reasons. First, it started in the '90s, when C++ was not that fancy. Plus, they have to parse your code in order to work with "slots" and "signals" in the graphical designer.
It seems to me that instead of using macros or, even worse, non-standard tools on top of C++ to generate code, and writing the following:
void (*x)(int,int) = WHAT( (f(int,int)) );
it would be much better to do something like this:
void f (int x, int y, int z);
boost::function<void (int, int)> x = boost::bind (&f, _1, _2, 3);
x (1, 2);
The above will work for both functions and methods.
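For completeness, the member-function case binds the object instance as the first argument; Widget and its f are invented here:
#include <boost/bind.hpp>
#include <boost/function.hpp>

struct Widget {
    void f(int x, int y, int z) { /* ... */ }
};

int main() {
    Widget w;
    boost::function<void (int, int)> g =
        boost::bind(&Widget::f, &w, _1, _2, 3);   // bind the instance and z
    g(1, 2);   // calls w.f(1, 2, 3)
    return 0;
}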