I'm working on a legacy code base that has this pattern:
struct sometype_t { /* ... */ };
int some_method(void *arg1) { // void pointer
((sometype_t*)arg1)->prop1; // cast
}
Is there any (common) scenario where it would be unsafe to use sometype_t * instead of void *?
int some_method(sometype_t *arg1) {
arg1->prop1;
}
The pointer isn't passed across ABIs or into 3rd-party libraries; it stays entirely within C++ code that we own.
It's usually not a good choice, but the only situation I'm aware of where this really makes sense is if you want to have stateful callbacks passed into a function without using templates:
void takes_callback(void(*f)(void*), void * data);
Basically the gist is that since you aren't using templates, you have to fix the function signature you accept (of course, it can and often does take other arguments and return something as well). If you just call the function with your own parameters though, the function can only hold state between calls via global variables. So instead the contract for takes_callback promises to call f with data as a parameter.
So, if you wanted to use some_method as a callback in such an API, you would have to have it take void* and do the cast internally. Obviously, you are throwing away type safety here, and if you happen to call takes_callback with &some_method and a pointer to anything that's not a sometype_t, you have UB.
Having a C ABI is one reason to avoid templates, but it's not the only one. Maybe they were worried about code bloat, or wanted to keep the implementation in a .so so that versions could be upgraded without recompiling, etc.
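The callback contract described in this answer can be sketched as follows. This is only an illustration: the body of takes_callback, the int type of prop1, and the helper run_once are assumptions, and some_method is given a void return type so that it matches the void(*)(void*) signature shown above.

```cpp
struct sometype_t { int prop1; };

// Hypothetical C-style API: promises to call f with data as its argument.
void takes_callback(void (*f)(void*), void* data) {
    f(data);
}

// Callback with the fixed signature; it recovers the real type internally.
void some_method(void* arg1) {
    // Only valid if arg1 really points to a sometype_t; otherwise this is UB.
    sometype_t* self = static_cast<sometype_t*>(arg1);
    ++self->prop1;
}

// Usage: the state travels through the void* instead of a global variable.
int run_once(int start) {
    sometype_t state{start};
    takes_callback(&some_method, &state);
    return state.prop1;
}
```

If some_method is never used as such a callback, nothing in this pattern forces the void* parameter, and the typed signature is the safer choice.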
The obvious common scenario that immediately comes to mind is callbacks for some functions from C standard library.
For example, the proper way to write the comparison callback for std::qsort is to declare the function with two const void * arguments and then cast them to proper specific pointer types inside the callback.
Replacing these const void * parameters with specifically-typed pointers will simply prevent the code from compiling.
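A minimal sketch of such a comparison callback, assuming the array holds int values (the names compare_ints and sort_ints are made up for the example):

```cpp
#include <cstdlib>

// Comparison callback with the exact signature std::qsort requires.
int compare_ints(const void* lhs, const void* rhs) {
    const int a = *static_cast<const int*>(lhs);
    const int b = *static_cast<const int*>(rhs);
    return (a > b) - (a < b);  // avoids the overflow risk of a - b
}

// Sorts count ints in ascending order, in place.
void sort_ints(int* values, std::size_t count) {
    std::qsort(values, count, sizeof(int), compare_ints);
}
```

Declaring compare_ints with const int* parameters instead would make the call to std::qsort fail to compile, because the function pointer types no longer match.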
Related
Let's say this is a C function to be wrapped:
void foo(int(__stdcall *callback)());
The two main pitfalls with C function pointer callbacks are:
Not being able to store bind expressions
Not being able to store capturing lambdas
I would like to know the best way to wrap functions like these to do so. The first is particularly useful for a member function callback, and the second for an inline definition that uses surrounding variables, but those are not the only uses.
The other property of these particular function pointers is that they need to use the __stdcall calling convention. This, to my knowledge, eliminates lambdas as an option completely, and is a bit of a nuisance otherwise. I'd like to allow at least __cdecl as well.
This is the best I am able to come up with without things starting to bend back to relying on support that function pointers don't have. It would typically be in a header. The following example is also on Coliru.
#include <functional>
//C function in another header I have no control over
extern "C" void foo(int(__stdcall *callback)()) {
callback();
}
namespace detail {
std::function<int()> callback; //pretend extern and defined in cpp
//compatible with the API, but passes work to above variable
extern "C" int __stdcall proxyCallback() { //pretend defined in cpp
//possible additional processing
return callback();
}
}
template<typename F> //takes anything
void wrappedFoo(F f) {
detail::callback = f;
foo(detail::proxyCallback); //call C function with proxy
}
int main() {
wrappedFoo([&]() -> int {
return 5;
});
}
There is, however, a major flaw. This is not re-entrant. If the variable is reassigned to before it's used, the old function will never be called (not taking into account multithreading issues).
One thing I have tried that ended up doubling back on itself was storing the std::function as a data member and using objects, so each would operate on a different variable, but there was no way to pass the object to the proxy. Taking the object as a parameter would cause the signature to mismatch and binding it would not let the result be stored as a function pointer.
One idea I have, but have not played around with is a vector of std::function. However, I think the only real safe time to erase from it would be to clear it when nothing is using it. However, each entry is first added in wrappedFoo, then used in proxyCallback. I'm wondering if a counter that is incremented in the former and decremented in the latter, then checked for zero before clearing the vector would work, but it sounds like a more convoluted solution than necessary anyway.
Is there any way to wrap a C function with a function pointer callback such that the C++ wrapped version:
Allows any function object
Allows more than just the C callback's calling convention (if it's critical that it's the same, the user can pass in something with the right calling convention)
Is thread-safe/re-entrant
Note: The obvious solution, stated as part of Mikael Persson's answer, is to make use of the void * parameter that should exist. However, this is sadly not a be-all, end-all option, mostly due to incompetence. What possibilities exist for those functions that do not have this option is where this can get interesting, and is the primary route to a very useful answer.
You are, unfortunately, out of luck.
There are ways to generate code at runtime, for example you can read on LLVM trampoline intrinsics where you generate a forwarding function that stores additional state, very akin to lambdas but runtime defined.
Unfortunately none of those are standard, and thus you are stranded.
The simplest solution to pass state is... to actually pass state. Ah!
Well defined C callbacks will take two parameters:
A pointer to the callback function itself
A void*
The latter is unused by the code itself, and simply passed to the callback when it is called. Depending on the interface either the callback is responsible to destroy it, or the supplier, or even a 3rd "destroy" function could be passed.
With such an interface, you can effectively pass state in a thread-safe & re-entrant fashion at the C level, and thus naturally wrap this up in C++ with the same properties.
template <typename Result, typename... Args>
Result wrapper(void* state, Args... args) {
using FuncWrapper = std::function<Result(Args...)>;
FuncWrapper& w = *reinterpret_cast<FuncWrapper*>(state);
return w(args...);
}
template <typename Result, typename... Args>
auto make_wrapper(std::function<Result(Args...)>& func)
-> std::pair<Result (*)(void*, Args...), void*>
{
// func must outlive every call made through the returned pair
void* state = reinterpret_cast<void*>(&func);
return std::make_pair(&wrapper<Result, Args...>, state);
}
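As a usage illustration of the state-passing idea, here is a small self-contained sketch. The function c_style_apply merely stands in for a C API that forwards a user-data pointer; it, trampoline, apply, and demo_apply are hypothetical names, and the callback signature is fixed to int(int) to keep the example short.

```cpp
#include <functional>

// Stand-in for a C API that forwards a user-data pointer to the callback.
int c_style_apply(int (*cb)(void*, int), void* state, int value) {
    return cb(state, value);
}

// Plain function with the C signature; recovers the std::function from state.
int trampoline(void* state, int value) {
    auto& f = *static_cast<std::function<int(int)>*>(state);
    return f(value);
}

// C++ wrapper: accepts any callable, including capturing lambdas.
template <typename F>
int apply(F f, int value) {
    std::function<int(int)> holder = f;  // lives on this stack frame for the call
    return c_style_apply(&trampoline, &holder, value);
}

// Usage: the lambda captures a local variable by reference.
int demo_apply() {
    int offset = 10;
    return apply([&](int x) { return x + offset; }, 5);
}
```

Because each call keeps its state in its own stack frame, nothing is shared between concurrent or nested calls. This only holds as long as the C side invokes the callback before apply returns.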
If the C interface does not provide such facilities, you can hack around a bit, but ultimately you are very limited. As was said, a possible solution is to hold the state externally, using globals, and do your best to avoid contention.
A rough sketch is here:
// The FreeList, Store and Release functions are up to you,
// you can use locks, atomics, whatever...
template <size_t N, typename Result, typename... Args>
class Callbacks {
public:
using FunctionType = Result (*)(Args...);
using FuncWrapper = std::function<Result(Args...)>;
static std::pair<FunctionType, size_t> Generate(FuncWrapper&& func) {
// 1. Using the free-list, find the index in which to store "func"
size_t const index = Store(std::move(func));
// 2. Select the appropriate "Call" function and return it
assert(index < N);
return std::make_pair(Select<0, N-1>(index), index);
} // Generate
static void Release(size_t);
private:
static size_t FreeList[N];
static FuncWrapper State[N];
static size_t Store(FuncWrapper&& func);
template <size_t I, typename = typename std::enable_if<(I < N)>::type>
static Result Call(Args... args) { // must match FunctionType exactly
return State[I](std::forward<Args>(args)...);
} // Call
template <size_t L, size_t H>
static FunctionType Select(size_t const index) {
static size_t const Middle = (L+H)/2;
if (L == H) { return Call<L>; }
return index <= Middle ? Select<L, Middle>(index)
: Select<Middle + 1, H>(index);
}
}; // class Callbacks
// Static initialization
template <size_t N, typename Result, typename... Args>
size_t Callbacks<N, Result, Args...>::FreeList[N] = {};
template <size_t N, typename Result, typename... Args>
typename Callbacks<N, Result, Args...>::FuncWrapper Callbacks<N, Result, Args...>::State[N] = {};
This problem has two challenges: one easy and one nearly impossible.
The first challenge is the static type transformation (mapping) from any callable "thing" to a simple function pointer. This problem is solved with a simple template, no big deal. This solves the calling convention problem (simply wrapping one kind of function with another). This is already solved by the std::function template (that's why it exists).
The main challenge is the encapsulation of a run-time state into a plain function pointer whose signature does not allow for a "user-data" void* pointer (as any half-decent C API would normally have). This problem is independent of language (C, C++03, C++11) and is nearly impossible to solve.
You have to understand a fundamental fact about any "native" language (and most others too). The code is fixed after compilation, and only the data changes at run-time. So even a class member function, which looks as if it were one function belonging to each object (run-time state), is not: the code is fixed, and only the identity of the object (the this pointer) changes.
Another fundamental fact is that all external states that a function can use must either be global or passed as a parameter. If you eliminate the latter, you only have global state to use. And by definition, if the function's operation depends on a global state, it cannot be re-entrant.
So, to be able to create a (sort-of-)re-entrant* function that is callable with just a plain function pointer and that encapsulates any general (stateful) function object (bound calls, lambdas, or whatever), you will need a unique piece of code (not data) for each call. In other words, you need to generate the code at run-time, and deliver a pointer to that code (the callback function pointer) to the C function. That's where the "nearly impossible" comes from. This is not possible through any standard C++ mechanism, I'm 100% sure of that, because if this were possible in C++, run-time reflection would also be possible (and it's not).
In theory, this could be easy. All you need is a piece of compiled "template" code (not template in the C++ sense) that you can copy, insert a pointer to your state (or function object) as a kind of hard-coded local variable, and then place that code into some dynamically allocated memory (with some reference counting or whatever to ensure it exists as long as it's needed). But making this happen is clearly very tricky and very much of a "hack". And to be honest, this is quite ahead of my skill level, so I wouldn't even be able to instruct you on how exactly you could go about doing this.
In practice, the realistic option is to not even try to do this. Your solution with the global (extern) variable that you use to pass the state (function object) is going in the right direction in terms of a compromise. You could have something like a pool of functions that each have their own global function object to call, and you keep track of which function is currently used as a callback, and allocate unused ones whenever needed. If you run out of that limited supply of functions, you'll have to throw an exception (or whatever error-reporting you prefer). This scheme would be essentially equivalent to the "in theory" solution above, but with a limited number of concurrent callbacks being used. There are other solutions in a similar vein, but that depends on the nature of the specific application.
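The pool-of-functions compromise described above could look roughly like the following sketch. It assumes C++14 (for std::index_sequence), fixes the callback signature to int(), and makes no attempt at thread safety; CallbackPool, acquire, release, and demo_pool are names invented for the example.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <utility>

using CCallback = int (*)();  // plain, stateless function pointer type

template <std::size_t N>
class CallbackPool {
public:
    // Stores f in a free slot and returns the matching plain function pointer.
    static CCallback acquire(std::function<int()> f) {
        for (std::size_t i = 0; i < N; ++i) {
            if (!slots()[i]) {
                slots()[i] = std::move(f);
                return table()[i];
            }
        }
        throw std::runtime_error("callback pool exhausted");
    }

    // Frees the slot that belongs to cb so it can be reused.
    static void release(CCallback cb) {
        for (std::size_t i = 0; i < N; ++i) {
            if (table()[i] == cb) { slots()[i] = nullptr; }
        }
    }

private:
    static std::array<std::function<int()>, N>& slots() {
        static std::array<std::function<int()>, N> s;
        return s;
    }

    // One distinct piece of code per slot, generated at compile time.
    template <std::size_t I>
    static int thunk() { return slots()[I](); }

    template <std::size_t... Is>
    static std::array<CCallback, N> make_table(std::index_sequence<Is...>) {
        return {{&thunk<Is>...}};
    }

    static const std::array<CCallback, N>& table() {
        static const std::array<CCallback, N> t =
            make_table(std::make_index_sequence<N>{});
        return t;
    }
};

// Usage: two capturing lambdas become two independent plain function pointers.
int demo_pool() {
    int a = 1, b = 2;
    CCallback first = CallbackPool<4>::acquire([&] { return a; });
    CCallback second = CallbackPool<4>::acquire([&] { return b; });
    int result = first() * 10 + second();
    CallbackPool<4>::release(first);
    CallbackPool<4>::release(second);
    return result;
}
```

A real version would need locking (or atomics) around the slot bookkeeping, and a policy for what happens when all N slots are in use.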
I'm sorry that this answer is not giving you a great solution, but sometimes there just aren't any silver bullets.
Another option is to avoid using a C API that was designed by buffoons who never heard of the unavoidable and tremendously useful void* user_data parameter.
* "sort-of" re-entrant because it still refers to a "global" state, but it is re-entrant in the sense that different callbacks (that need different state) do not interfere with each other, as is your original problem.
As said before, a C function pointer does not contain any state, so a callback function called with no arguments can only access global state. Therefore, such a "stateless" callback function can be used only in one context, where the context is stored in a global variable. Then declare different callbacks for different contexts.
If the number of callbacks required changes dynamically (for example, in a GUI, where each window opened by the user requires a new callback to handle input to that window), then pre-define a large pool of simple stateless callbacks that map to a stateful callback. In C, that could be done as follows:
struct cbdata { void (*f)(void *); void *arg; } cb[10000];
void cb0000(void) { (*cb[0].f)(cb[0].arg); }
void cb0001(void) { (*cb[1].f)(cb[1].arg); }
...
void cb9999(void) { (*cb[9999].f)(cb[9999].arg); }
void (*cbfs[10000])(void) =
{ cb0000, cb0001, ... cb9999 };
Then use some higher level module to keep a list of available callbacks.
With GCC (but not with G++, so the following would need to be in a strictly C, not C++ file), you can create new callback functions even on the fly by using a not-so-well-known GCC feature, nested functions:
void makecallback(void *state, void (*cb)(void *), void (*cont)(void *, void (*)()))
{
void mycallback() { cb(state); }
cont(state, mycallback);
}
In this case, GCC takes care of the necessary code generation for you. The downside is that it limits you to the GNU Compiler Collection, and that the NX bit can no longer be used on the stack, since your own code now requires new code to be placed on the stack.
makecallback() is called from the high-level code to create a new anonymous callback function with encapsulated state. When this new function is called, it calls the stateful callback function cb with the argument state. The new anonymous callback function is usable only as long as makecallback() has not returned. Therefore, makecallback() hands control back to the calling code by calling the passed-in "cont" function. This example assumes that the actual callback cb() and the continuation function cont() both use the same state, "state". It is also possible to use two different void pointers to pass different state to each.
The "cont" function may only return (and SHOULD also return, to avoid memory leaks) when the callback is no longer required. If your application is multi-threaded and requires the various callbacks mostly for its various threads, then you should be able to have each thread allocate its required callback(s) via makecallback() at startup.
However, if your app is multi-threaded anyways, and if you have (or can establish) a strict callback-to-thread relationship, then you could use thread-local vars to pass the required state. Of course, that will only work, if your lib calls the callback in the right thread.
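The thread-local idea can be sketched in standard C++11 as follows. It assumes that the C function invokes the callback synchronously on the calling thread, before it returns; c_invoke is only a stand-in for such a function, and all names are hypothetical.

```cpp
#include <functional>
#include <utility>

// Stand-in for a C API that takes a stateless callback and calls it
// on the calling thread before returning.
int c_invoke(int (*cb)()) { return cb(); }

// Each thread has its own pointer to the currently active function object.
thread_local std::function<int()>* tls_callback = nullptr;

// Stateless proxy with the C signature; forwards to the thread's current callback.
int tls_proxy() { return (*tls_callback)(); }

template <typename F>
int wrapped_invoke(F f) {
    std::function<int()> holder = std::move(f);      // state lives on this stack frame
    std::function<int()>* previous = tls_callback;   // remember any outer call
    tls_callback = &holder;
    int result = c_invoke(&tls_proxy);
    tls_callback = previous;                         // restore for nested use
    return result;
}

// Usage: a nested call does not disturb the outer callback's state.
int demo_tls() {
    int base = 40;
    return wrapped_invoke([&] { return base + wrapped_invoke([] { return 2; }); });
}
```

If the library stores the pointer and calls it later, or calls it from a different thread, this sketch does not apply. It is also not exception-safe as written: an exception thrown through c_invoke would skip the restore step.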
I am trying to implement the use of a C++ library within my project that has not used const modifiers on its access functions. Up until now I have been using const in all of my code but this new library is causing two main problems:
Functions where the arguments are passed as const references cannot use the argument's access functions if these arguments are of a type defined by the library.
Classes with member objects of types defined by the library cannot use the access functions of these objects within a const function.
What is the best way to overcome this issue? The easiest solution would be to simply remove all use of const from my code but that would be quite frustrating to do.
Additional info: In this case I do have access to the source code and can see that the access functions do not modify anything. I omitted this information as I was interested in the more general case as well. For my scenario, const_cast appears to be the way to go.
PS The library writer is not evil! It is more a bit of rough and ready code that he has kindly open sourced. I could ditch the library and use something more professional as others have noted. However, for this small time-constrained project, the simplicity of the interface to this library has made it the best choice.
How easy is it to tell whether the functions in the library actually modify anything or not?
If it's easy to tell, and they don't, then you can const_cast your const pointer/reference to non-const and call the library function. You might want to throw a wrapper around the library classes to do this for you, which is tedious and verbose but gets that code out of your classes. This wrapper could perhaps be a subclass that adds some const accessors, depending whether the way you use the library class allows that to work.
If it's hard to tell, or they do modify things, then you need to use non-const instances and references to the library classes in your code. mutable can help with those of type (2), but for those of type (1) you just need to pass non-const arguments around.
For an example of why it might be hard, consider that the library author might have written something like this:
struct Foo {
size_t times_accessed;
int value;
int get() {
++times_accessed;
return value;
}
};
Now, if you const_cast a const instance of Foo and call get(), you have undefined behavior[*]. So you have to be sure that get really doesn't modify the object it's called on. You could mitigate this a bit, by making sure that you never create any const instances of Foo, even though you do take const references to non-const instances. That way, when you const_cast and call get you at least don't cause UB. It might make your code confusing, that fields keep changing on objects that your functions claim not to modify.
[*] Why is it undefined behavior? It has to be, in order that the language can guarantee that the value of a const object never changes in a valid program. This guarantee allows the compiler to do useful things. For example it can put static const objects in read-only data sections, and it can optimize code using known values. It also means that a const integer object with a visible initializer is a compile-time constant, which the standard makes use of to let you use it as the size of an array, or a template argument. If it wasn't UB to modify a const object, then const objects wouldn't be constant, and these things wouldn't be possible:
#include <iostream>
struct Foo {
int a;
Foo(int a) : a(a) {}
};
void nobody_knows_what_this_does1(const int *p); // defined in another TU
void nobody_knows_what_this_does2(const int *p); // defined in another TU
int main() {
const Foo f(1);
Foo g(1);
nobody_knows_what_this_does1(&f.a);
nobody_knows_what_this_does2(&g.a);
int x;
if (std::cin >> x) {
std::cout << (x / f.a); // Optimization opportunity!
std::cout << (x / g.a); // Cannot optimize!
}
}
Because f is a const object, and hence f.a is a const object, the optimizer knows that f.a has value 1 when it's used at the end of the function. It could, if it chose, optimize away the division. It doesn't know the same thing about g.a: g is not a const object, a pointer to it has been passed into unknown code, so its value might have changed. So if you're the author of nobody_knows_what_this_does1 or nobody_knows_what_this_does2, and you're thinking of const_casting p and using it to modify its referent, then you can only do it if you somehow know that the referent is non-const. Which normally you don't, so normally you don't use const_cast.
I think you have the following options:
If you are sure the library would behave the same if its functions were declared const, you could use const_cast<> to remove the const-ness of your objects when dealing with the library
Alternatively, you could make non-const copies of your const objects and pass those to the library, then copy any changes to the non-const parts back to your original objects
Search for another library that is const-correct
Remove all const from your code (not recommended)
Another option is to copy your object into a modifiable temp and pitch it. This is probably the safest thing to do if you're in the circumstance where your class offers a copy constructor and it's not too expensive. This has been my preferred method when it's available, as I know it's 100% safe. Silly example:
int getInfoFromString(String& str); //what?? why isn't str const :(
So I do
String temp(str);
int stuffINeed = getInfoFromString(temp);
//happy
If the library's interface is not big, you can create a wrapper, where you would adjust your code to expected types, by either casting or making a copy of parameters, which are passed to library's functions.
But do not degrade your code in order to use the library.
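A minimal sketch of such a wrapper, using the const_cast approach. LibWidget stands in for a library class whose accessor lacks const; it, ConstWidget, and read_value are invented for the illustration.

```cpp
// Stand-in for a library class whose accessor is not marked const.
class LibWidget {
public:
    explicit LibWidget(int v) : value_(v) {}
    int getValue() { return value_; }  // does not modify anything, but lacks const
private:
    int value_;
};

// Thin wrapper that exposes a const-correct accessor to the rest of the code.
class ConstWidget {
public:
    explicit ConstWidget(int v) : impl_(v) {}
    int value() const {
        // Acceptable only because getValue() is known not to modify the object.
        return const_cast<LibWidget&>(impl_).getValue();
    }
private:
    LibWidget impl_;
};

// Client code can keep taking const references.
int read_value(const ConstWidget& w) { return w.value(); }
```

The cast is confined to one place, so if the library is later fixed or replaced, only the wrapper has to change.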
Another suggestion: Are you familiar with the mutable keyword? If you use it correctly, it might actually accomplish exactly what you're trying to do, with exactly one word added to your code. This keyword can evoke religious opinion at the level of goto, because it's probably used as a kludge a hundred times for every time it's used because it really is the best design option. In your case, it's debatable which it is, but I think it fits the spirit: your annoying library object is something that can be fake-modified without breaking the semantic const-ness of your methods, so go ahead.
class Outer {
mutable Inner inner;
public:
void foo() const {
inner.nonConstMethod(); //compiles because inner is mutable
}
};
I have the following code:
typedef void * (__stdcall * call_generic)(...);
typedef void * (__stdcall * call_push2)(unsigned long,unsigned long);
void * pfunc;
// assume pfunc is a valid pointer to external function
// this is a logically correct way of calling, however this includes:
// add esp, 8
// after the call, and that breaks my stack.
((call_generic)pfunc)(1,1);
// however, if i use this call:
((call_push2)pfunc)(1,1);
// this does not happen and code works properly.
It's a pain to track all the calls and count args manually (there are lots of such calls ahead), I'd prefer a macro or something for this, but with that bug it's not possible.
Is there a solution? Is there another way of creating call_generic type to do such things?
I do not really understand why exactly it does that "cleanup" but that breaks my stack badly, causing previously defined variables to be lost.
((call_generic)pfunc)(1,1); is only a logically correct way of calling if the function pointed to by pfunc actually has the signature you cast to, void *(...). Your code tells the compiler to make a varargs call, so it makes a varargs call. A varargs call to a function that isn't a varargs function doesn't work (in this case, there's disagreement who has to clean up the stack, and it gets done twice).
There's no way to do this for free. You must somehow cast the function pointer to the correct signature before calling it, otherwise the calling code doesn't know how to pass the parameters in a way that the callee code can use.
One option is to ensure that all the called functions that pfunc might point to have the same signature, then cast to that type. For example, you could make them all varargs functions, although I don't particularly recommend it. It would be more type safe to do what you don't want to - make sure that all the functions that might appear here take two unsigned long, and cast to call_push2.
The trick with call_generic won't work with functions that should be called with __stdcall calling convention. This is because __stdcall implies that the function should clean the stack, OTOH variadic functions (those with ... arguments) may not do this, since they are not aware of the arguments.
So that marking a variadic function with __stdcall calling convention is like shooting yourself in the foot.
In your specific case I'd go in the macro writing direction. I don't see a trivial trick that'd accomplish what you need.
EDIT
One of the techniques may be using template classes. For instance:
// any __stdcall function returning void* and taking 2 arguments
template <typename T1, typename T2>
struct FuncCaller_2
{
typedef void * (__stdcall * FN)(T1, T2);
static void Call(PVOID pfn, T1 t1, T2 t2)
{
((FN) pfn)(t1, t2);
}
};
// call your function
FuncCaller_2<int, long>::Call(pfn, 12, 19);
You'll need to create such a class for every number of arguments (0, 1, 2, 3, ...).
Needless to say this method is "unsafe" - i.e. there is no compile-time validation of the correctness of the function call.
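With C++11 variadic templates, one template can cover every argument count. The sketch below is a portable variation rather than the code above: it omits __stdcall (which could be added to the FN alias on compilers that support it), uses void(*)() instead of a data pointer as the generic holder, and introduces the hypothetical names AnyFn, FuncCaller, add2, and demo_call.

```cpp
// Generic holder for "some function pointer"; it must be cast back before calling.
using AnyFn = void (*)();

template <typename R, typename... Args>
struct FuncCaller {
    using FN = R (*)(Args...);  // the real signature the caller claims pfn has
    static R Call(AnyFn pfn, Args... args) {
        // Still unchecked: a wrong R or Args is undefined behavior.
        return reinterpret_cast<FN>(pfn)(args...);
    }
};

// Example target function.
unsigned long add2(unsigned long a, unsigned long b) { return a + b; }

// Usage: store the pointer generically, then call it through the right signature.
unsigned long demo_call() {
    AnyFn pfn = reinterpret_cast<AnyFn>(&add2);
    return FuncCaller<unsigned long, unsigned long, unsigned long>::Call(pfn, 1, 1);
}
```

As with the fixed-arity version, the compiler cannot verify that the template arguments match the function actually stored in pfn.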
It seems to me you want a form of dynamic function binding mechanism that will bind a pointer to a prototype deduced from an invocation. This generally requires metaprogramming; for this, boost::bind or something else from the Boost function libraries is your best bet (and if they don't have it, it's doubtful that it can be done).
In C++ the only way to pass an array to a function is by pointer, considering following
functions:
void someFunc(sample input[7]){
//whatever
}
void someFunc(sample input[]){
//whatever
}
void someFunc(sample (& input)[7]){
//whatever
}
All above function parameters are identical with following function parameter when the function is not inlined:
void someFunc(sample * input){
//whatever
}
Now, to pass the array by value we have to put it in a structure like the one below:
struct SampleArray{
public:
sample sampleArray[7];
};
Now I'd like to know if anyone knows the reason behind this design in C++ standard that makes passing pure arrays by value impossible by any syntax and forces to use structs.
Backward compatibility reasons. C++ had no choice but to inherit this behavior from C.
This answer to the earlier similar question addresses the reasons why such behavior was introduced in C (or, more specifically, why arrays and arrays within structs are handled differently by the language).
C++ adopted that part of C, but added the ability to pass arrays by reference (note that your third version of someFunc passes the array by reference, and is not equivalent to using a pointer to its first element). In addition, C++ includes various container types that can be passed by value (recently including std::array)
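The difference between the three ways of passing can be seen in a short sketch. It uses int elements and made-up function names purely for illustration.

```cpp
#include <array>
#include <cstddef>

// By reference: the size is part of the type, and the caller's array is modified.
void fill_ref(int (&input)[7], int value) {
    for (int& x : input) { x = value; }
}

// By pointer (what int input[7] and int input[] decay to): the size is lost,
// so it has to be passed separately.
void fill_ptr(int* input, std::size_t count, int value) {
    for (std::size_t i = 0; i < count; ++i) { input[i] = value; }
}

// By value: std::array is copied, so the caller's object stays unchanged.
int sum_after_zeroing_first(std::array<int, 7> input) {
    input[0] = 0;
    int sum = 0;
    for (int x : input) { sum += x; }
    return sum;
}
```

Calling fill_ref with an array of any size other than 7 is a compile-time error, whereas fill_ptr accepts any pointer and trusts the count argument.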
Probably as a function of its C ancestry. Arrays being just pointers to memory it'd be kind of hard for that to be done transparently without generating extra code by the compiler. Also it's trivial to implement a container class that does lazy copy on write, which will perform better.
Is it possible?
If I have understood correctly, a void pointer can point to any type. Is it therefore possible for it to point to a template function (whose type is not yet fixed)? Or is a void pointer reserved for variables only, not functions? And what about a void function pointer?
You can cast any function pointer type to any other, but you'd better cast it to the right type before you call it. You can therefore use void(*)() as an equivalent to void* for function pointers. This also works with function templates.
#include <string>
template<typename T>
void f(T){}
typedef void(*voidfp)();
voidfp fp=reinterpret_cast<voidfp>(&f<int>); // store address of f<int> in variable
reinterpret_cast<void(*)(int)>(fp)(3); // cast back to the real type, then call
fp=reinterpret_cast<voidfp>(&f<std::string>); // store address of f<std::string> in variable
reinterpret_cast<void(*)(std::string)>(fp)("hello"); // cast back to the real type, then call
According to the Standard, a void* is not required to be able to hold a function pointer. (It is required to hold a pointer to any kind of data). However, most cpu architectures you're likely to see these days have data pointers & function pointers that are the same size.
There is an issue here, because of the words used I am afraid.
There is a difference between data pointers and function pointers; most notably, they need not be the same size.
Therefore it is not portable to use a void* to hold the address of a function (since C++11 the conversion is only conditionally-supported).
In general it is not a good idea in C++ to use void*. Those are necessary in C because of the lack of a proper type system but C++ type system is much more evolved (even though not as evolved as recent languages).
You could probably benefit from some objectification here. If you make your method an instance of a class (template) you can have this class derived from a common base class. This is quite common, those objects are called Functors.
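Since the question gives no details, the following is only a guess at what such a functor hierarchy might look like. Functor, Add, run_all, and demo_functors are hypothetical names, and std::make_unique assumes C++14.

```cpp
#include <memory>
#include <vector>

// Common base class: lets differently-typed functors share one interface.
struct Functor {
    virtual ~Functor() = default;
    virtual double operator()() const = 0;
};

// Class template deriving from the base; each instantiation is its own type.
template <typename T>
class Add : public Functor {
public:
    Add(T a, T b) : a_(a), b_(b) {}
    double operator()() const override { return static_cast<double>(a_ + b_); }
private:
    T a_, b_;
};

// Calls every functor through the base interface and sums the results.
double run_all(const std::vector<std::unique_ptr<Functor>>& fs) {
    double total = 0.0;
    for (const auto& f : fs) { total += (*f)(); }
    return total;
}

// Usage: Add<int> and Add<double> are stored side by side without void*.
double demo_functors() {
    std::vector<std::unique_ptr<Functor>> fs;
    fs.push_back(std::make_unique<Add<int>>(1, 2));
    fs.push_back(std::make_unique<Add<double>>(0.5, 1.0));
    return run_all(fs);
}
```

The virtual call replaces the cast back to the "right" function pointer type, so a mismatch becomes a compile-time error instead of undefined behavior.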
However, without a precise description of your issue, it'll be hard to help more.
To do this with templates you need some trickery, otherwise the compiler cannot disambiguate the function (this is really not recommended: it's horrible to read, and probably violates a few thousand programming best practices).
I.e., this does not work (at least under VS08 & GCC 3.5):
template <typename tType> tType* GetNULLType()
{
return static_cast<tType*>(0);
}
void* pf = static_cast<void*>(GetNULLType<int>);
you instead need to do:
template <typename tType> tType* GetNULLType()
{
return static_cast<tType*>(0);
}
typedef int* (*t_pointer)();
t_pointer pfGetNull = GetNULLType<int>;
void* pfVoid = (void*)(pfGetNull);
(and before purists moan, it seems C++ style 'safe' casting will not allow this)