I'm working on a legacy code base that has this pattern:
struct sometype_t { /* ... */ };
int some_method(void *arg1) { // void pointer
((sometype_t*)arg1)->prop1; // cast
}
Is there any (common) scenario where it would be unsafe to use sometype_t * instead of void *?
int some_method(sometype_t *arg1) {
arg1->prop1;
}
The pointer isn't passed across ABIs or into 3rd-party libraries; it stays entirely within C++ code that we own.
It's usually not a good choice, but the one situation I'm aware of where it really makes sense is when you want to pass stateful callbacks into a function without using templates:
void takes_callback(void(*f)(void*), void * data);
Basically the gist is that since you aren't using templates, you have to fix the function signature you accept (of course, it can and often does take other arguments and return something as well). If you just call the function with your own parameters though, the function can only hold state between calls via global variables. So instead the contract for takes_callback promises to call f with data as a parameter.
So, if you wanted to use some_method as a callback in such an API, you would have to have it take void* and do the cast internally. Obviously, you are throwing away type safety here, and if you happen to call takes_callback with &some_method and a pointer to anything that's not a sometype_t, you have UB.
Having a C ABI is one reason to avoid templates, but it's not the only one. Maybe they were worried about code bloat, or wanted to keep the implementation in a .so so that versions could be upgraded without recompiling, etc.
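A minimal sketch of the callback pattern described above, assuming a struct with a single int member. The names some_method_impl, some_method_cb, last_seen and run_example are made up for illustration, and takes_callback is given a trivial body so the example is complete. The idea is to keep the cast in one thin adapter and do the real work in a typed function:

```cpp
#include <cassert>

struct sometype_t { int prop1; };

// Typed implementation: all real work happens here, with full type checking.
int some_method_impl(sometype_t* arg) {
    return arg->prop1;
}

// Last value seen by the callback; exists only so the sketch has a visible effect.
int last_seen = 0;

// Thin adapter with the signature the callback API demands.
// The cast is only valid if `data` really points to a sometype_t.
void some_method_cb(void* data) {
    assert(data != nullptr);
    last_seen = some_method_impl(static_cast<sometype_t*>(data));
}

// Hypothetical C-style API: forwards `data` untouched to `f`.
void takes_callback(void (*f)(void*), void* data) {
    f(data);
}

int run_example() {
    sometype_t obj{42};
    takes_callback(&some_method_cb, &obj);
    return last_seen;
}
```

Code that does not go through the callback API can call some_method_impl directly and keep the compiler's type checking.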
The obvious common scenario that immediately comes to mind is callbacks for some functions from the C standard library.
For example, the proper way to write the comparison callback for std::qsort is to declare the function with two const void * arguments and then cast them to proper specific pointer types inside the callback.
Replacing these const void * parameters with specifically-typed pointers will simply prevent the code from compiling.
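As an illustration of the point above, here is a small sketch of a comparator for std::qsort. The helper names compare_ints and sort_ints are made up for this example:

```cpp
#include <cassert>
#include <cstdlib>

// Comparator with the exact signature std::qsort requires.
int compare_ints(const void* lhs, const void* rhs) {
    const int a = *static_cast<const int*>(lhs);
    const int b = *static_cast<const int*>(rhs);
    return (a > b) - (a < b);   // avoids the overflow risk of `a - b`
}

// Sorts `count` ints in ascending order.
void sort_ints(int* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    std::qsort(values, count, sizeof(int), compare_ints);
}
```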
I am developing a VM and I would like to make it possible to call compiled functions. However, because every function may end up having a different signature, my plan is to generalize all calls to 2 possible scenarios - a call for function with no return and no parameters, and a call to a function which takes one void * parameter.
The plan is to use it similarly to thiscall - all parameters are properly aligned at the location of the passed pointer and parameters are retrieved through indirection. Shouldn't be slower than reading them from the stack, at least IMO.
So instead of:
int foo(int a, int b) { return a+b; }
I can have something like:
void foo2(void *p) {
*(int*)p = *(int*)((char*)p + 4) + *(int*)((char*)p + 8); // arithmetic on void* is a compiler extension, so go through char*
}
So my question is what could potentially go wrong using this approach? What I can tell right away is it works "in the dark" so it would be crucial to calculate the offsets correctly. It is also a little inconvenient since all the temporaries need to be provided by the user. Assuming my VM compiler will deal with those two issues, I am mostly concerned about performance - I don't want to create a normal function and for each normal function a void * wrapper - I would like to directly use that convention for all functions, so I can't help but wonder how good of a job will the compiler do of inlining the functions when used in compiled code? Are there going to be any other possible performance implications I am overlooking (excluding __fastcall which will use one more register and one less indirection)?
Performance-wise (and for ease of use) you will probably be best off with cdecl: everything goes onto the stack. In C (before C23), a function declared with empty parentheses leaves its parameters unspecified, so a single pointer type can refer to functions with arbitrary arguments:
typedef void (__cdecl * function_with_any_parameters)();
You will have to make sure to define all functions that you wish to invoke as:
void __cdecl f(type1 arg1, type2 arg2, type3 arg3); // any amount of arguments
And just invoke them with the right amount of arguments:
f(arg1, arg2, arg3);
If you wish to go through a single pointer then you do have extra overhead: the one pointer. The easiest way would be to define all functions as accepting a pointer to an anonymous struct:
void f(struct {type1 a; type2 b;} * args);
Then you can invoke the function with a pointer to the appropriate struct to avoid any misalignments.
struct {type1 a; type2 b;} args = {arg1, arg2};
f(&args);
You are effectively implementing cdecl on your own.
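A minimal sketch of the single-pointer convention using a named struct instead of hand-computed offsets. The names add_args, add_call, vm_function and invoke_add are invented for this example, and the layout (result slot first, then parameters) is just one possible choice:

```cpp
#include <cassert>

// Hypothetical argument block for a two-int function:
// result slot first, then the parameters.
struct add_args {
    int result;
    int a;
    int b;
};

// Uniform signature: every callable takes one untyped pointer to its block.
void add_call(void* p) {
    assert(p != nullptr);
    add_args* args = static_cast<add_args*>(p);
    args->result = args->a + args->b;
}

using vm_function = void (*)(void*);

// What a VM call site might do: fill the block, call through the
// uniform pointer type, read the result back.
int invoke_add(int a, int b) {
    add_args block{0, a, b};
    vm_function fn = &add_call;
    fn(&block);
    return block.result;
}
```

Letting the compiler lay out the struct takes care of alignment and padding, which the manual `p + 4` style has to get right by hand.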
After running a few benchmarks I'd say the compiler does a pretty good job of optimizing similar pointer functions. The void * function is just as fast as the add function and the regular + operator.
It seems that this convention will be useful to provide the necessary calling abstraction without hurting optimizations and overall performance. The only sacrifice is safety, which may or may not be a primary concern depending on the application context.
Let's say this is a C function to be wrapped:
void foo(int(__stdcall *callback)());
The two main pitfalls with C function pointer callbacks are:
Not being able to store bind expressions
Not being able to store capturing lambdas
I would like to know the best way to wrap functions like these to do so. The first is particularly useful for a member function callback, and the second for an inline definition that uses surrounding variables, but those are not the only uses.
The other property of these particular function pointers is that they need to use the __stdcall calling convention. This, to my knowledge, eliminates lambdas as an option completely, and is a bit of a nuisance otherwise. I'd like to allow at least __cdecl as well.
This is the best I am able to come up with without things starting to bend back to relying on support that function pointers don't have. It would typically be in a header. Here is the following example on Coliru.
#include <functional>
//C function in another header I have no control over
extern "C" void foo(int(__stdcall *callback)()) {
callback();
}
namespace detail {
std::function<int()> callback; //pretend extern and defined in cpp
//compatible with the API, but passes work to above variable
extern "C" int __stdcall proxyCallback() { //pretend defined in cpp
//possible additional processing
return callback();
}
}
template<typename F> //takes anything
void wrappedFoo(F f) {
detail::callback = f;
foo(detail::proxyCallback); //call C function with proxy
}
int main() {
wrappedFoo([&]() -> int {
return 5;
});
}
There is, however, a major flaw. This is not re-entrant. If the variable is reassigned to before it's used, the old function will never be called (not taking into account multithreading issues).
One thing I have tried that ended up doubling back on itself was storing the std::function as a data member and using objects, so each would operate on a different variable, but there was no way to pass the object to the proxy. Taking the object as a parameter would cause the signature to mismatch and binding it would not let the result be stored as a function pointer.
One idea I have, but have not played around with is a vector of std::function. However, I think the only real safe time to erase from it would be to clear it when nothing is using it. However, each entry is first added in wrappedFoo, then used in proxyCallback. I'm wondering if a counter that is incremented in the former and decremented in the latter, then checked for zero before clearing the vector would work, but it sounds like a more convoluted solution than necessary anyway.
Is there any way to wrap a C function with a function pointer callback such that the C++ wrapped version:
Allows any function object
Allows more than just the C callback's calling convention (if it's critical that it's the same, the user can pass in something with the right calling convention)
Is thread-safe/re-entrant
Note: The obvious solution, stated as part of Mikael Persson's answer, is to make use of the void * parameter that should exist. However, this is sadly not a be-all, end-all option, mostly due to incompetence. What possibilities exist for those functions that do not have this option is where this can get interesting, and is the primary route to a very useful answer.
You are, unfortunately, out of luck.
There are ways to generate code at runtime; for example, you can read up on LLVM trampoline intrinsics, where you generate a forwarding function that stores additional state, much like a lambda but defined at runtime.
Unfortunately none of those are standard, and thus you are stranded.
The simplest solution to pass state is... to actually pass state. Ah!
Well defined C callbacks will take two parameters:
A pointer to the callback function itself
A void*
The latter is unused by the code itself, and simply passed to the callback when it is called. Depending on the interface either the callback is responsible to destroy it, or the supplier, or even a 3rd "destroy" function could be passed.
With such an interface, you can effectively pass state in a thread-safe & re-entrant fashion at the C level, and thus naturally wrap this up in C++ with the same properties.
template <typename Result, typename... Args>
Result wrapper(void* state, Args... args) {
using FuncWrapper = std::function<Result(Args...)>;
FuncWrapper& w = *reinterpret_cast<FuncWrapper*>(state);
return w(args...);
}
template <typename Result, typename... Args>
auto make_wrapper(std::function<Result(Args...)>& func)
-> std::pair<Result (*)(void*, Args...), void*>
{
void* state = reinterpret_cast<void*>(&func);
return std::make_pair(&wrapper<Result, Args...>, state);
}
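To show how such a state pointer is used end to end, here is a self-contained sketch. It assumes a made-up C-style function c_style_sum_two that accepts a callback plus a void* and calls the callback twice; IntFn, trampoline and sum_two are likewise names invented for this example:

```cpp
#include <cassert>
#include <functional>

using IntFn = std::function<int(int)>;

// Hypothetical C-style API: calls cb(user_data, 1) and cb(user_data, 2)
// and returns the sum of the two results.
int c_style_sum_two(int (*cb)(void*, int), void* user_data) {
    return cb(user_data, 1) + cb(user_data, 2);
}

// Trampoline: recovers the std::function from the opaque pointer.
int trampoline(void* state, int x) {
    assert(state != nullptr);
    return (*static_cast<IntFn*>(state))(x);
}

// C++-facing wrapper: accepts anything convertible to IntFn.
// `f` lives on this stack frame, which is fine because the
// assumed API only calls back before it returns.
int sum_two(IntFn f) {
    return c_style_sum_two(&trampoline, &f);
}
```

Because the state travels with each call, two threads (or two nested calls) using different function objects do not interfere. If the C API stored the callback for later, the function object would need to outlive this frame.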
If the C interface does not provide such facilities, you can hack around a bit, but ultimately you are very limited. As was said, a possible solution is to hold the state externally, using globals, and do your best to avoid contention.
A rough sketch is here:
// The FreeList, Store and Release functions are up to you,
// you can use locks, atomics, whatever...
template <size_t N, typename Result, typename... Args>
class Callbacks {
public:
using FunctionType = Result (*)(Args...);
using FuncWrapper = std::function<Result(Args...)>;
static std::pair<FunctionType, size_t> Generate(FuncWrapper&& func) {
// 1. Using the free-list, find the index in which to store "func"
size_t const index = Store(std::move(func));
// 2. Select the appropriate "Call" function and return it
assert(index < N);
return std::make_pair(Select<0, N-1>(index), index);
} // Generate
static void Release(size_t);
private:
static size_t FreeList[N];
static FuncWrapper State[N];
static size_t Store(FuncWrapper&& func);
template <size_t I, typename = typename std::enable_if<(I < N)>::type>
static Result Call(Args... args) { // must match FunctionType exactly
return State[I](std::forward<Args>(args)...);
} // Call
template <size_t L, size_t H>
static FunctionType Select(size_t const index) {
static size_t const Middle = (L+H)/2;
if (L == H) { return Call<L>; }
return index <= Middle ? Select<L, Middle>(index)
: Select<Middle + 1, H>(index);
}
}; // class Callbacks
// Static initialization (no "static" keyword on out-of-class definitions)
template <size_t N, typename Result, typename... Args>
size_t Callbacks<N, Result, Args...>::FreeList[N] = {};
template <size_t N, typename Result, typename... Args>
typename Callbacks<N, Result, Args...>::FuncWrapper Callbacks<N, Result, Args...>::State[N] = {};
This problem has two challenges: one easy and one nearly impossible.
The first challenge is the static type transformation (mapping) from any callable "thing" to a simple function pointer. This problem is solved with a simple template, no big deal. This solves the calling convention problem (simply wrapping one kind of function with another). This is already solved by the std::function template (that's why it exists).
The main challenge is the encapsulation of a run-time state into a plain function pointer whose signature does not allow for a "user-data" void* pointer (as any half-decent C API would normally have). This problem is independent of language (C, C++03, C++11) and is nearly impossible to solve.
You have to understand a fundamental fact about any "native" language (and most others too): the code is fixed after compilation, and only the data changes at run-time. So even a class member function, which appears to be a function belonging to the object (run-time state), is not one. The code is fixed, and only the identity of the object (the this pointer) changes.
Another fundamental fact is that all external states that a function can use must either be global or passed as a parameter. If you eliminate the latter, you only have global state to use. And by definition, if the function's operation depends on a global state, it cannot be re-entrant.
So, to be able to create a (sort-of-)re-entrant* function that is callable through just a plain function pointer and that encapsulates any general (stateful) function object (bind'ed calls, lambdas, or whatever), you will need a unique piece of code (not data) for each call. In other words, you need to generate the code at run-time, and deliver a pointer to that code (the callback function-pointer) to the C function. That's where the "nearly impossible" comes from. This is not possible through any standard C++ mechanisms, I'm 100% sure of that, because if this was possible in C++, run-time reflection would also be possible (and it's not).
In theory, this could be easy. All you need is a piece of compiled "template" code (not template in the C++ sense) that you can copy, insert a pointer to your state (or function object) as a kind of hard-coded local variable, and then place that code into some dynamically allocated memory (with some reference counting or whatever to ensure it exists as long as it's needed). But making this happen is clearly very tricky and very much of a "hack". And to be honest, this is quite ahead of my skill level, so I wouldn't even be able to instruct you on how exactly you could go about doing this.
In practice, the realistic option is to not even try to do this. Your solution with the global (extern) variable that you use to pass the state (function object) is going in the right direction in terms of a compromise. You could have something like a pool of functions that each have their own global function object to call, and you keep track of which function is currently used as a callback, and allocate unused ones whenever needed. If you run out of that limited supply of functions, you'll have to throw an exception (or whatever error-reporting you prefer). This scheme would be essentially equivalent to the "in theory" solution above, but with a limited number of concurrent callbacks being used. There are other solutions in a similar vein, but that depends on the nature of the specific application.
I'm sorry that this answer is not giving you a great solution, but sometimes there just aren't any silver bullets.
Another option is to avoid using a C API that was designed by buffoons who never heard of the unavoidable and tremendously useful void* user_data parameter.
* "sort-of" re-entrant because it still refers to a "global" state, but it is re-entrant in the sense that different callbacks (that need different state) do not interfere with each other, as is your original problem.
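The "pool of functions" compromise can be sketched roughly as follows. This is only one possible shape, assuming C++14 and a callback signature of int(); CallbackPool, acquire, release and call_twice are names invented for this example. Each slot gets its own compile-time-generated trampoline, so the pool hands out a distinct plain function pointer per stored function object:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>

using Callback = int (*)();   // the stateless signature the C API wants

template <std::size_t N>
class CallbackPool {
public:
    // Stores `f` in a free slot and returns that slot's trampoline,
    // or nullptr when all N slots are in use.
    static Callback acquire(std::function<int()> f) {
        assert(f);  // an empty function would look like a free slot
        std::lock_guard<std::mutex> lock(mutex());
        for (std::size_t i = 0; i < N; ++i) {
            if (!slots()[i]) {
                slots()[i] = std::move(f);
                return table()[i];
            }
        }
        return nullptr;
    }

    // Frees the slot that belongs to `cb`. The caller must make sure
    // the C side no longer uses `cb` at this point.
    static void release(Callback cb) {
        std::lock_guard<std::mutex> lock(mutex());
        for (std::size_t i = 0; i < N; ++i) {
            if (table()[i] == cb) { slots()[i] = nullptr; }
        }
    }

private:
    static std::array<std::function<int()>, N>& slots() {
        static std::array<std::function<int()>, N> s;
        return s;
    }
    static std::mutex& mutex() { static std::mutex m; return m; }

    // One distinct function per index: this is the "unique piece of code".
    template <std::size_t I>
    static int call() { return slots()[I](); }

    template <std::size_t... Is>
    static std::array<Callback, N> make_table(std::index_sequence<Is...>) {
        return {{ &call<Is>... }};
    }
    static const std::array<Callback, N>& table() {
        static const std::array<Callback, N> t =
            make_table(std::make_index_sequence<N>{});
        return t;
    }
};

// Stand-in for a C API without a user-data parameter.
int call_twice(Callback cb) { return cb() + cb(); }
```

When the pool is exhausted, acquire returns nullptr here; throwing an exception instead would be an equally reasonable choice.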
As said before, a C function pointer does not contain any state, so a callback function called with no arguments can only access global state. Therefore, such a "stateless" callback function can be used only in one context, where the context is stored in a global variable. Then declare different callbacks for different contexts.
If the number of callbacks required changes dynamically (for example, in a GUI, where each window opened by the user requires a new callback to handle input to that window), then pre-define a large pool of simple stateless callbacks that map to a stateful callback. In C, that could be done as follows:
struct cbdata { void (*f)(void *); void *arg; } cb[10000];
void cb0000(void) { (*cb[0].f)(cb[0].arg); }
void cb0001(void) { (*cb[1].f)(cb[1].arg); }
...
void cb9999(void) { (*cb[9999].f)(cb[9999].arg); }
void (*cbfs[10000])(void) =
{ cb0000, cb0001, ... cb9999 };
Then use some higher level module to keep a list of available callbacks.
With GCC (but not with G++, so the following would need to be in a strictly C, not C++ file), you can create new callback functions even on the fly by using a not-so-well-known GCC feature, nested functions:
void makecallback(void *state, void (*cb)(void *), void (*cont)(void *, void (*)()))
{
void mycallback() { cb(state); }
cont(state, mycallback);
}
In this case, GCC generates the necessary run-time code for you. The downside is that it limits you to the GNU compiler collection, and that the NX bit can no longer be used on the stack, because your own code now requires executable code on the stack.
makecallback() is called from the high-level code to create a new anonymous callback function with encapsulated state. If this new function is called, it will call the stateful callback function cb with the argument state. The new anonymous callback function is usable only as long as makecallback() has not returned. Therefore, makecallback() hands control back to the calling code by calling the passed-in "cont" function. This example assumes that the actual callback cb() and the normal continue function cont() both use the same state, "state". It is also possible to use two different void pointers to pass different state to each.
The "cont" function may only return (and SHOULD also return to avoid memory leaks), when the callback is no longer required. If your application is multi-threaded, and requires the various callbacks mostly for its various threads, then you should be able to have each thread at startup allocate its required callback(s) via makecallback().
However, if your app is multi-threaded anyways, and if you have (or can establish) a strict callback-to-thread relationship, then you could use thread-local vars to pass the required state. Of course, that will only work, if your lib calls the callback in the right thread.
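A rough sketch of the thread-local variant, under the assumption that the library invokes the callback synchronously on the calling thread. c_api_invoke is a stand-in for such a library function, and detail::current, detail::proxy and wrapped_invoke are names made up here:

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Stand-in for a C API that takes a stateless callback and calls it
// synchronously on the calling thread.
int c_api_invoke(int (*callback)()) { return callback(); }

namespace detail {
// One pointer per thread, so concurrent threads do not overwrite each other.
thread_local const std::function<int()>* current = nullptr;

// Matches the C signature and forwards to this thread's function object.
int proxy() {
    assert(current != nullptr);  // set by wrapped_invoke on this thread
    return (*current)();
}
}  // namespace detail

template <typename F>
int wrapped_invoke(F&& f) {
    const std::function<int()> fn = std::forward<F>(f);
    // Save and restore the previous pointer so nested calls on the same
    // thread still see their own function object. Not exception-safe:
    // a throwing callback would skip the restore.
    const std::function<int()>* saved = detail::current;
    detail::current = &fn;
    const int result = c_api_invoke(&detail::proxy);
    detail::current = saved;
    return result;
}
```

This does not help when the library calls the callback later or from a different thread; in that case the state has to live somewhere that outlives the call.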
I'm having some trouble creating a static wrapper function using template parameters. I don't want to pass the function directly to the wrapper function, because it needs a specific signature int (lua_State *) so that it can be passed into the following function:
lua_pushcfunction(L, function);
(That's right, I'm going for an auto-generated lua wrapper.)
My first thought is to create a template function with a function pointer as a non-type template argument.
template <void(* f)(void)>
int luaCaller(lua_State * _luaState)
{
f();
return 0;
}
So far, this is looking pretty good. This function has the proper signature, and calls a function I pass in via template argument.
&(luaCaller<myFunc>)
My problem arises when I try to wrap this in another function. Non-type template parameters must be externally linked, and thus the following fails:
void pushFunction(lua_State * _luaState, void(* _f)(void))
{
lua_pushcfunction(_luaState, &(luaCaller<_f>));
}
Which makes sense, because the address of the function needs to be known at compile time. You can't throw in just any pointer and expect the compiler to know which functions to instantiate. Unfortunately, if I pass a function pointer that is known at compile time it still fails. The value of the function pointer is being copied into _f, and therefore _f is still technically not known at compile time. Because of this, I would expect the following to work:
void pushFunction(lua_State * _luaState, void(* const _f)(void))
{
lua_pushcfunction(_luaState, &(luaCaller<_f>));
}
or maybe
void pushFunction(lua_State * _luaState, void(* & _f)(void))
{
lua_pushcfunction(_luaState, &(luaCaller<_f>));
}
In the first case, because the value isn't allowed to change, we know that if it is externally linked, it will still technically be externally linked. In the second case it's being passed in as a reference, which would mean it should have the same linkage, no? But neither of these attempts work. Why? Is it possible to circumvent this somehow? How can I cleanly auto generate a function that calls another function?
The const qualifier means you aren't allowed to change something, not that it's a compile-time constant. The initial value of _f is determined at runtime when the function is called, and for a * const & parameter, the value can even change at runtime by some external means such as another thread, if the object underlying the reference is not const.
To make the fully templated wrapper work, you need to give the compiler enough information to compile a function for each possible template argument, and provide logic to switch among those functions. The template system generates and organizes related functions, but it's not a dynamic dispatcher.
If you can add the function pointer to the lua_State object and eliminate the template parameter, that would be one solution.
Your solution would work if you made the function pointer a template argument to doCaller, but that would defeat its purpose.
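For illustration, this is roughly what "making the function pointer a template argument" looks like. The sketch avoids the real Lua headers: lua_State, lua_CFunction and fake_pushcfunction below are stand-ins defined only for this example, and hello is a made-up function to wrap:

```cpp
#include <cassert>

// Stand-ins for the Lua C API so the sketch compiles without Lua headers.
struct lua_State;
using lua_CFunction = int (*)(lua_State*);
struct lua_State { lua_CFunction pushed = nullptr; };
void fake_pushcfunction(lua_State* L, lua_CFunction fn) { L->pushed = fn; }

// One wrapper function is generated per wrapped function.
template <void (*f)()>
int luaCaller(lua_State*) {
    f();
    return 0;   // number of Lua return values
}

// The function pointer is a template argument here, so it is a
// compile-time constant and luaCaller<f> can be instantiated.
template <void (*f)()>
void pushFunction(lua_State* L) {
    assert(L != nullptr);
    fake_pushcfunction(L, &luaCaller<f>);
}

int hello_calls = 0;
void hello() { ++hello_calls; }
```

The call site becomes pushFunction<&hello>(L) instead of pushFunction(L, &hello), which is the price of keeping the pointer visible to the compiler.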
Rather than using a non-type function-template approach in order to bind the secondary function you want to call inside the wrapper-function, you could use a struct with a static luaCaller method. That should allow you to maintain the function signature you need for passing luaCaller to lua_pushcfunction.
So for instance, you could have a struct that looks something like the following:
template<void (*f)(void)>
struct wrapper
{
static int luaCaller(lua_State * _luaState)
{
f();
return 0;
}
};
template<typename Functor>
void doCaller(lua_State * _luaState, Functor wrapper)
{
Functor::luaCaller(_luaState);
}
then call it like:
doCaller(&luaState, wrapper<my_func>());
Is it possible?
If I have understood correctly, a void pointer can point to any type. Is a pointer to a template function (whose type is not yet fixed) therefore possible? Or is a void pointer reserved for "variables" only, not functions? And what about a void function pointer?
You can cast any function pointer type to any other, but you'd better cast it to the right type before you call it. You can therefore use void(*)() as an equivalent to void* for function pointers. This also works with function templates.
template<typename T>
void f(T){}
typedef void(*voidfp)();
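voidfp fp=reinterpret_cast<voidfp>(&f<int>); // store address of f(int) in variable (static_cast cannot convert between function pointer types)
reinterpret_cast<void(*)(int)>(fp)(3); // cast back to the original type, then call the function
fp=reinterpret_cast<voidfp>(&f<std::string>); // store address of f(std::string) in variable
reinterpret_cast<void(*)(std::string)>(fp)("hello"); // cast back to the original type, then call the function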
According to the Standard, a void* is not required to be able to hold a function pointer. (It is required to hold a pointer to any kind of data). However, most cpu architectures you're likely to see these days have data pointers & function pointers that are the same size.
There is an issue here, because of the words used I am afraid.
There is a difference between pointers and function pointers, most notably they need not be the same size.
Therefore storing the address of a function in a void* is not portable: since C++11 the conversion is only conditionally-supported, and on some platforms a void* may simply be too small to hold it.
In general it is not a good idea in C++ to use void*. It is necessary in C because of the lack of a proper type system, but the C++ type system is much more evolved (even though not as evolved as that of more recent languages).
You could probably benefit from some objectification here. If you make your method an instance of a class (template) you can have this class derived from a common base class. This is quite common, those objects are called Functors.
However, without a precise description of your issue, it'll be hard to help more.
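Since the question gives no concrete use case, the following is only a guess at what the functor suggestion could look like: a class template deriving from a common base class, so that callers hold a typed base pointer instead of a void*. Action, Describe and run_all are invented names:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Common base: callers hold this instead of an untyped pointer.
struct Action {
    virtual ~Action() = default;
    virtual std::string run() const = 0;
};

// One derived class per wrapped type; T stays known inside the class,
// so no cast is needed at the point of use.
template <typename T>
class Describe : public Action {
public:
    explicit Describe(T value) : value_(std::move(value)) {}
    std::string run() const override { return describe(value_); }

private:
    static std::string describe(int v) { return "int:" + std::to_string(v); }
    static std::string describe(const std::string& s) { return "string:" + s; }
    T value_;
};

// Works on any mix of Describe<T> instances through the base interface.
std::vector<std::string> run_all(
    const std::vector<std::unique_ptr<Action>>& actions) {
    std::vector<std::string> out;
    for (const auto& a : actions) {
        assert(a != nullptr);
        out.push_back(a->run());
    }
    return out;
}
```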
To do this with templates you need some trickery, or else the compiler cannot disambiguate the function (this is really not recommended: it's horrible to read, and probably violates a few thousand programming best practices).
I.e., this does not work (at least under VS08 and GCC 3.5):
template <typename tType> tType* GetNULLType()
{
return static_cast<tType*>(0);
}
void* pf = static_cast<void*>(GetNULLType<int>);
you instead need to do:
template <typename tType> tType* GetNULLType()
{
return static_cast<tType*>(0);
}
typedef int* (*t_pointer)();
t_pointer pfGetNull = GetNULLType<int>;
void* pfVoid = (void*)(pfGetNull);
(and before purists moan, it seems C++ style 'safe' casting will not allow this)