C# / C++ asynchronous reverse P/Invoke?

I need to call C# code from a native C/C++ .dll asynchronously.
While searching for how to do this, I found that I could create a C# delegate and get a function pointer from it, which I would then use inside my native code.
The problem is that my native code needs to run asynchronously, i.e. in a separate thread created from the native code, which means that the native function called from C# will return to C# before the delegate is invoked.
I read in another SO question that someone had trouble with this, despite what MSDN says, because his delegate was garbage collected before it got called, due to the asynchronous nature of his task.
My question is: is it really possible to call a C# delegate through a function pointer from native code running in a thread that was created inside the native code? Thank you.

No, this is a universal bug and not specific to asynchronous code. It is just a bit more likely to bite in your case, since you don't have the machinery behind [DllImport] to keep you out of trouble. I'll explain why this goes wrong; maybe that helps.
A delegate declaration for a callback method is always required; that's how the CLR knows how to make the call to the method. You often declare it explicitly with the delegate keyword, and you might need to apply the [UnmanagedFunctionPointer] attribute if the unmanaged code is 32-bit and assumes the function was written in C or C++. The declaration matters because it is how the CLR knows how the arguments you pass from your native code need to be converted to their managed equivalents. That conversion can be intricate if your native code passes strings, arrays or structures to the callback.
The scenario is heavily optimized in the CLR, which is important because managed code inevitably runs on an unmanaged operating system. There are a lot of these transitions; you can't see them because most of them happen inside .NET Framework code. The optimization involves a thunk, a sliver of auto-generated machine code that takes care of making the call to the foreign method or function. Thunks are created on the fly, whenever you make the interop call that uses the delegate; in your case, when the C# code passes the delegate to your C++ code. Your C++ code gets a pointer to the thunk, a plain function pointer, which you store so you can make the callback later.
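To make the scenario concrete, here is a sketch of what the native side of such a callback arrangement typically looks like; the exported function, the callback signature and all names are hypothetical, not taken from the question.

    // native.cpp -- hypothetical native side of the reverse-pinvoke scenario.
    #include <thread>

    // Must match the C# delegate declaration, including the calling convention
    // (__stdcall here pairs with CallingConvention.StdCall on the managed side).
    typedef void (__stdcall *ProgressCallback)(int percent);

    static ProgressCallback g_callback = nullptr;

    extern "C" __declspec(dllexport) void StartWork(ProgressCallback cb)
    {
        // "You store it": the CLR has no idea this copy of the thunk pointer exists.
        g_callback = cb;

        std::thread([] {
            for (int i = 0; i <= 100; i += 10) {
                // If the C# delegate has been collected by now, the thunk behind
                // g_callback is gone and this call crashes or corrupts memory.
                if (g_callback)
                    g_callback(i);
            }
        }).detach();

        // StartWork returns to C# immediately; the callbacks arrive later,
        // on the thread created above.
    }
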
"You store it" is where the problem starts. The CLR is unaware that you stored the pointer to the thunk, the garbage collector cannot see it. Thunks require memory, usually just a few handful of bytes for the machine code. They don't live forever, the CLR automatically releases the memory when the thunk is no longer needed.
"Is no longer needed" is the rub, your C++ code cannot magically tell the CLR that it no longer is going to make a callback. So the simple and obvious rule it uses is that the thunk is destroyed when the delegate object is garbage collected.
Programmers forever get in trouble with that rule. They don't realize that the life-time of the delegate object is important. Especially tricky in C#, it has a lot of syntax sugar that makes it very easy to create delegate objects. You don't even have to use the new keyword or name the delegate type, just using the target method name is enough. The lifetime of such a delegate object is only the pinvoke call. After the call completes and your C++ code has stored the pointer, the delegate object isn't referenced anywhere anymore so is eligible for garbage collection.
Exactly when that happens, and the thunk is destroyed, is unpredictable. The GC runs only when needed. Could be a nanosecond after you made the call, that's unlikely of course, could be seconds. Most tricky, could be never. Happens in a typical unit test that doesn't otherwise calls GC.Collect() explicitly. Unit tests rarely put enough pressure on the GC heap to induce a collection. It is a bit more likely when you make the callback from another thread, implicit is that other code is running on other threads that make it more likely that a GC is triggered. You'll discover the problem quicker. Nevertheless, the thunk is going to get destroyed in a real program sooner or later. Kaboom when you make the callback in your C++ code after that.
So, rock-hard rule, you must store a reference to the delegate to avoid the premature collection problem. Very simple to do, just store it in a variable in your C# program that is declared static. Usually good enough, you might want to set it explicitly back to null when the C# code tells your C++ code to stop making callbacks, unlikely in your case. Very occasionally, you'd want to use GCHandle.Alloc()instead of a static variable.
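On the native side it also helps to give the managed caller a well-defined point after which no more callbacks will arrive, so it knows when it may drop that static reference (or free the GCHandle). Continuing the hypothetical sketch above:

    // Continuation of the hypothetical sketch above.
    extern "C" __declspec(dllexport) void StopWork()
    {
        // After this returns, the native code promises never to use the stored
        // pointer again, so the C# side may null its static delegate field or
        // free its GCHandle without risking a call through a destroyed thunk.
        // A real implementation would also signal and join the worker thread
        // here so no callback is still in flight.
        g_callback = nullptr;
    }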

Related

C++ calling conventions -- converting between win32 and Linux/GCC?

I'm knee deep in a project to connect Windows OpenVR applications running in Wine directly to Linux native SteamVR via a Winelib wrapper, the idea being to sidestep all the problems with trying to make what is effectively a very complicated device driver run inside Wine itself. I pretty much immediately hit a wall. The problem appears to be related to calling conventions, though I've had trouble getting meaningful information out of winedbg, so there's a chance I'm way way off.
The OpenVR API is C++ and consists primarily of classes filled with virtual methods. The application calls a getter (VR_GetGenericInterface) to acquire a derived class object implementing those methods from the (closed-source) runtime. It then calls those methods directly on the object given to it.
My current attempt goes like this: my wrapped VR_GetGenericInterface returns a custom wrapper class for the requested interface. This class's methods are defined in their own file, compiled separately with -mabi=ms. They call non-member functions in a separate file that is compiled without -mabi=ms, which finally make the corresponding calls into the actual runtime.
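A sketch of that layering, with made-up names standing in for the real OpenVR interfaces (in the actual project the two halves live in files compiled with different flags):

    // All names here are illustrative, not the real OpenVR API; the file split
    // is indicated by comments.
    #include <cstdint>

    struct DemoPose { float v[3]; };

    // bridge.cpp -- compiled WITHOUT -mabi=ms; plain functions that call into
    // the Linux SteamVR runtime (stubbed out so the sketch stands alone).
    uint32_t bridge_GetCount()            { return 0; }
    DemoPose bridge_GetPose(uint32_t idx) { (void)idx; return DemoPose{}; }

    // thunk_ms.cpp -- compiled WITH -mabi=ms, so these virtual methods use the
    // Microsoft x86-64 convention the Windows-built application expects.
    class DemoInterfaceThunk {            // mirrors the vtable layout the app saw
    public:
        virtual uint32_t GetCount()          { return bridge_GetCount(); }
        // Struct-by-value returns are where the two conventions most easily
        // disagree (hidden return pointer vs. registers), which lines up with
        // the crash described below.
        virtual DemoPose GetPose(uint32_t i) { return bridge_GetPose(i); }
    };
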
This seems to work, until the application calls a method that returns a struct of some description. Then the application segfaults on the line the call happened, apparently just after the call returned, as my code isn't anywhere on the stack at that point and I verified with printfs that my wrapped class reaches the point of returning to the app.
This leads me to think there's yet another calling-convention gotcha I haven't accounted for, related to structs and similar complex types. From googling, it appears that returning a struct by value is typically one of the more poorly-defined parts of an ABI, but I found no concrete answers or descriptions of the differences.
What is most likely happening, and how can I dig deeper to see what exactly the application is expecting?

How to pass context to a WinEventProc (callback for SetWinEventHook)?

The question is in the title: how can I give the callback for SetWinEventHook some context information? Usually functions that take callbacks also take a context parameter that they pass on to the callback, but not this one.
Using a global (including thread_local) variable as suggested in SetWinEventHook with custom data for example is almost certainly unacceptable.
There may be more than one caller of the utility function that sets the WinEvent hook, so a global is out. thread_local may work, but it also may not: what if the callback calls the same utility function and sets yet another hook? They will both get called the next time I pump messages. Using thread_local requires me to define a contract that forbids a single thread from setting more than one WinEvent hook through me, and requires my callers to abide by it. That's a pretty arbitrary and artificial limitation and I prefer not to impose it.
I'm reimplementing in C++ a little PoC I had in C#. There I could simply use capturing lambdas, and they would convert to function pointers that I could pass via P/Invoke to SetWinEventHook. C++ unfortunately doesn't have that feature (probably because it requires allocating WX memory and dynamically generating a little code, which some platforms don't allow), but perhaps there is a platform-specific way to do this?
I seem to remember that ATL did something like that in the past, and I guess I can probably implement something like this too, but maybe there's a better way? If not, perhaps somebody has already written something like that?
(Another valid solution would be a clever-enough data structure and algorithm to solve the reentrancy problem.)
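For reference, one approach that fits these constraints keys a small registry on the HWINEVENTHOOK handle the callback receives as its first parameter: SetWinEventHook returns the handle before any events can be delivered (they only arrive once messages are pumped), and a thread_local map keyed on it does not suffer from the one-hook-per-thread limitation of a single thread_local pointer. A minimal sketch with made-up names (HookContext, InstallHook):

    // Sketch: associate per-hook context with the HWINEVENTHOOK handle.
    // HookContext and InstallHook are made-up names, not an existing API.
    #include <windows.h>
    #include <functional>
    #include <map>
    #include <utility>

    struct HookContext {
        std::function<void(HWND, DWORD)> onEvent;   // whatever state the caller needs
    };

    // WinEvent hooks fire on the thread that installed them, while it pumps
    // messages, so a thread_local map needs no locking and still allows any
    // number of hooks per thread.
    static thread_local std::map<HWINEVENTHOOK, HookContext> g_hooks;

    static void CALLBACK HookProc(HWINEVENTHOOK hook, DWORD event, HWND hwnd,
                                  LONG idObject, LONG idChild,
                                  DWORD idEventThread, DWORD dwmsEventTime)
    {
        auto it = g_hooks.find(hook);               // recover the context for this hook
        if (it != g_hooks.end())
            it->second.onEvent(hwnd, event);
    }

    HWINEVENTHOOK InstallHook(DWORD eventMin, DWORD eventMax, HookContext ctx)
    {
        HWINEVENTHOOK hook = SetWinEventHook(eventMin, eventMax, nullptr, HookProc,
                                             0, 0, WINEVENT_OUTOFCONTEXT);
        if (hook)
            g_hooks[hook] = std::move(ctx);         // events only arrive after the
                                                    // message pump runs, so this is
                                                    // early enough
        return hook;
    }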

How to pass C++ object to NPAPI plugin?

I'm writing an NPAPI plugin in C++ on Windows. When my plugin's instantiated, I want to pass it some private data from my main application (specifically, I want to pass it a pointer to a C++ object). There doesn't seem to be a mechanism to do this. Am I missing something? I can't simply create my object in the plugin instance, since it's meant to exist outside of the scope of the plugin instance and persists even when the plugin instance is destroyed.
Edit:
I'm using an embedded plugin in C++ via CEF. This means that my code is essentially the browser and the plugin. Obviously, this isn't the way standard NPAPI plugins behave, so this is probably not something that's supported by NPAPI itself.
You can't pass a C++ object to javascript; what you can do is pass an NPObject that is also a C++ object and exposes things through the NPRuntime interface.
See http://npapi.com/tutorial3 for more information.
You may also want to look at the FireBreath framework, which greatly simplifies things like this.
Edit: it seems I misunderstood your question. What you want is to be able to store data linked to a plugin instance. What you need is the NPP that is given to you when your plugin is created; the NPP has two members, ndata (netscape data) and pdata (plugin data). The pdata pointer is yours to control -- you can set it to point to any arbitrary value that you want, and then cast it back to the real type whenever you want to use it. Be sure to cast it back and delete it on NPP_Destroy, of course. I usually create a struct to keep a few pieces of information in it. FireBreath uses this and sends all plugin calls into a Plugin object instance so that you can act as though it were a normal object.
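A condensed sketch of that pattern; InstanceData is a made-up struct, while NPP_New and NPP_Destroy are the standard NPAPI entry points:

    // Stashing per-instance data in NPP::pdata. InstanceData is made up.
    #include <npapi.h>

    struct InstanceData {
        NPP   npp;
        void *hostObject;   // e.g. whatever the embedding application handed over
    };

    NPError NPP_New(NPMIMEType pluginType, NPP instance, uint16_t mode,
                    int16_t argc, char *argn[], char *argv[], NPSavedData *saved)
    {
        InstanceData *data = new InstanceData();
        data->npp = instance;
        data->hostObject = nullptr;     // fill in however the host supplies it
        instance->pdata = data;         // pdata is the plugin's to use freely
        return NPERR_NO_ERROR;
    }

    NPError NPP_Destroy(NPP instance, NPSavedData **save)
    {
        delete static_cast<InstanceData *>(instance->pdata);
        instance->pdata = nullptr;
        return NPERR_NO_ERROR;
    }
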
Relevant code example from FireBreath:
https://github.com/firebreath/FireBreath/blob/master/src/NpapiCore/NpapiPluginModule_NPP.cpp#L145
Pay particular attention to NPP_New and NPP_Destroy; also pay particular attention to how the pdata member of the NPP is used.
This is also discussed in http://npapi.com/tutorial2
There is no way to do this via NPAPI, since the concept doesn't make sense in NPAPI terms. Even if you hack something up that passes a raw pointer around, that assumes everything is running in one process, so if CEF switches to the multi-process approach Chromium is designed around, the hack would break.
You would be better off pretending they are different processes, and using some non-NPAPI method of sharing what you need to between the main application and the plugin.

Getting to know which DLL the caller is from

Currently, I have a C++ exe project which dynamically loads N DLLs.
Those DLLs call functions that reside inside the exe project.
Now, within my exe project, I wish to know which DLL each caller is coming from.
Is it possible to do so using any available Windows API?
It depends on what your actual goal is. You cannot do it if you're expecting the DLLs to be possibly malicious (that is, if you're expecting them to try to trick you). But if it's just for debugging or logging or something relatively harmless like that, you can look at the stack, get the address that the ret instruction will use to return to the caller, enumerate the loaded DLLs, and test which of them that address is inside of.
To get the "return address", you can use the _ReturnAddress intrinsic in Visual C++, and then you can use the GetModuleHandleEx function, passing in GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS to get a handle to the DLL that the address is inside of.
But I must repeat: you cannot base security decisions off the results of this test. It is very easy for malicious code to fake this and "trick" your program into thinking it's a "trusted" or "safe" DLL. As I said, if it's just for debugging or logging or something, then go right ahead.
Also, this will obviously only tell you the DLL the immediate caller is inside of. You can't do it if you're 5 levels deep or something....
If you have given the same callback to multiple DLLs, then it is up to them to provide you with information about who's who. Most API callbacks have a parameter you can pass through to the callback. If this is so for your callbacks, you can use it to identify the DLLs.
It probably isn't possible considering that the call stack will come back down to your exe anyway.
EDIT: By the look of your post, is this a hypothetical situation?
Is this helpful? Check the parameter 'GetModuleBaseRoutine'
If you're architecting the exe, and you're not assuming the DLLs are hostile (see Dean's answer), you might be able to achieve the effect by providing each DLL with a different set of pointers for the callback functions, each of which in turn forwards to the actual callback function. You could then associate the calls with the calling DLL, based on which pass-through callback was actually called.
Of course, this assumes you're providing the callback addresses to the DLLs, but presumably this would be the normal design for an application where a DLL calls back into the calling exe. It won't work if the DLL is mucking around in your process memory for internal functions, of course, but then you're probably back in the hostile situation.
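A sketch of that pass-through idea, using a small function template to stamp out one forwarder per DLL slot; the callback signature and all names are made up, and the number of distinguishable slots is fixed at compile time:

    // One distinct forwarder address per DLL slot, all funnelling into a single
    // real handler that knows which slot was used. Names are illustrative.
    #include <cstdio>

    typedef void (*ExeCallback)(int value);          // what each DLL thinks it calls

    static void RealCallback(int dllIndex, int value)
    {
        std::printf("callback from DLL %d, value %d\n", dllIndex, value);
    }

    template <int DllIndex>
    static void Forwarder(int value)                 // distinct address per index
    {
        RealCallback(DllIndex, value);
    }

    // Hand &Forwarder<0> to the first DLL you load, &Forwarder<1> to the second,
    // and so on; the slot count is fixed when the exe is built.
    static const ExeCallback g_slots[] = { &Forwarder<0>, &Forwarder<1>, &Forwarder<2> };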

Problems regarding Boost::Python and Boost::Threads

A friend and I are developing an application which uses Boost::Python. I have defined an interface in C++ (well, a pure virtual class), exposed through Boost::Python to the users, who have to inherit from it and create a class, which the application takes and uses for some callback mechanism.
Everything up to that point goes pretty well. Now, the callback may take some time (the user may have programmed some heavy stuff)... but we need to repaint the window so it doesn't look "stuck". We wanted to use Boost::Thread for this. Only one callback will be running at a time (no other threads will call Python at the same time), so we thought it wouldn't be such a big deal... since we don't use threads inside Python, nor in the C++ code wrapped for Python.
What we do is call PyEval_InitThreads() just after Py_Initialize(); then, before calling the callback inside its own boost thread, we use the macro Py_BEGIN_ALLOW_THREADS, and the macro Py_END_ALLOW_THREADS when the thread has ended.
I think I don't need to say that execution never reaches the second macro. It shows several errors every time it runs... but it's always while calling the actual callback. I have googled a lot, and even read some of the PEP docs regarding threads, but they all talk about threading inside a Python extension module (which I don't do, since mine is just a pure virtual class that is exposed) or threading inside Python, not about the main application calling into Python from several threads.
Please help, this has been frustrating me for several hours.
P.S. Help!
Python can be called from multiple threads serially; I don't think that's a problem. It sounds to me like your errors are just coming from bad C++ code, since you said the errors happen after Py_BEGIN_ALLOW_THREADS and before Py_END_ALLOW_THREADS.
If you know that's not true, can you post a little more of your actual code and show exactly where it's erroring and exactly what errors it's giving?
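For what it's worth, the pattern that usually works when a thread created on the C++ side calls into Python is for that thread to take the GIL itself with PyGILState_Ensure(), while the main thread keeps the GIL released for the duration. A minimal sketch (the callback object is illustrative and error handling is kept to a minimum):

    // Calling back into Python from a thread created on the C++ side.
    #include <boost/python.hpp>
    #include <boost/thread.hpp>

    void run_callback(boost::python::object *user_callback)
    {
        // This thread was created by C++, so it must acquire the GIL itself
        // before touching any Python object.
        PyGILState_STATE gstate = PyGILState_Ensure();
        try {
            (*user_callback)();                      // the user's overridden method
        } catch (boost::python::error_already_set &) {
            PyErr_Print();
        }
        PyGILState_Release(gstate);
    }

    int main()
    {
        Py_Initialize();
        PyEval_InitThreads();                        // the main thread now owns the GIL

        boost::python::object cb;                    // ...obtained from user code...

        // Keep the GIL released on the main thread while the worker runs,
        // otherwise PyGILState_Ensure() in the worker would block forever.
        Py_BEGIN_ALLOW_THREADS
        boost::thread worker(run_callback, &cb);
        worker.join();                               // a real app would repaint /
                                                     // pump events here instead
        Py_END_ALLOW_THREADS

        Py_Finalize();
        return 0;
    }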