Running Function Inside Stub. Passing Function Pointer - c++

I'm working on creating a user-level thread library and what I want to do is run a function inside a stub and so I would like to pass the function pointer to the stub function.
Here is my stub function:
void _ut_function_stub(void (*f)(void), int id)
{
(*f)();
DeleteThread(id);
}
This is what the user calls. What I want to do is get pointer of _ut_function_stub to assign to pc and I've tried various different options including casting but the compiler keeps saying "invalid use of void expression".
int CreateThread (void (*f) (void), int weight)
{
... more code
pc = (address_t)(_ut_function_stub(f, tcb->id));
... more code
}
Any help is appreciated. Thanks!

If you're interested in implementing your own user-level-threads library, I'd suggest looking into the (now deprecated) ucontext implementation. Specifically, looking at the definitions for the structs used in ucontext.h will help you see all the stuff you actually need to capture to get a valid snapshot of the thread state.
What you're really trying to capture with the erroneous (address_t) cast in your example is the current continuation. Unfortunately, C doesn't support first-class continuations, so you're going to be stuck doing something much more low-level, like swapping stacks and dumping registers (hence why I pointed you to ucontext as a reference—it's going to be kind of complicated if you really want to get this right).

Related

Passing function pointers as an API interface to a compiled library

Dearest stack exchange,
I'm programming an MRI scanner. I won't go into too much background, but I'm fairly constrained in how much code I've got access to, and the way things have been set up is...suboptimal. I have a situation as follows:
There is a big library, written in C++. It ultimately does "transcoding" (in the worst possible way), writing out FPGA assembly that DoesThings. It provides a set of functions to "userland" that are translated into (through a mix of preprocessor macros and black magic) long strings of 16 bit and 32 bit words. The way this is done is prone to buffer overflows, and generally to falling over.*
The FPGA assembly is then strung out over a glorified serial link to the relevant electronics, which executes it (doing the scan), and returning the data back again for processing.
Programmers are expected to use the functions provided by the library to do their thing, in C (not C++) functions that are linked against the standard library. Unfortunately, in my case, I need to extend the library.
There's a fairly complicated chain of preprocessor substitution and tokenization, calling, and (in general) stuff happening between you writing doSomething() in your code, and the relevant library function actually executing it. I think I've got it figured out to some extent, but it basically means that I've got no real idea about the scope of anything...
In short, my problem is:
In the middle of a method, in a deep dark corner of many thousands of lines of code in a big blob I have little control over, with god-knows-what variable scoping going on, I need to:
Extend this method to take a function pointer (to a userland function) as an argument, but
Let this userland function, written after the library has been compiled, have access to variables that are local to both the scope of the method where it appears, as well as variables in the (C) function where it is called.
This seems like an absolute mire of memory management, and I thought I'd ask here for the "best practice" in these situations, as it's likely that there are lots of subtle issues I might run into -- and that others might have lots of relevant wisdom to impart. Debugging the system is a nightmare, and I've not really got any support from the scanner's manufacturer on this.
A brief sketch of how I plan to proceed is as follows:
In the .cpp library:
/* In something::something() /*
/* declare a pointer to a function */
void (*fp)(int*, int, int, ...);
/* by default, the pointer points to a placeholder at compile time*/
fp = &doNothing(...);
...
/* At the appropriate time, point the pointer to the userland function, whose address is supplied as an argument to something(): /*
fp= userFuncPtr;
/* Declare memory for the user function to plonk data into */
i_arr_coefficients = (int) malloc(SOMETHING_SENSIBLE);
/* Create a pointer to that array for the userland function */
i_ptr_array=&i_arr_coefficients[0];
/* define a struct of pointers to local variables for the userland function to use*/
ptrStrct=createPtrStruct();
/* Call the user's function: */
fp(i_ptr_array,ptrStrct, ...);
CarryOnWithSomethingElse();
The point of the placeholder function is to keep things ticking over if the user function isn't linked in. I get that this could be replaced with a #DEFINE, but the compiler's cleverness or stupidity might result in odd (to my ignorant mind, at least) behaviour.
In the userland function, we'd have something like:
void doUsefulThings(i_ptr_array, ptrStrct, localVariableAddresses, ...) {
double a=*ptrStrct.a;
double b=*ptrStrct.b;
double c=*localVariableAddresses.c;
double d=doMaths(a, b, c);
/* I.e. do maths using all of these numbers we've got from the different sources */
storeData(i_ptr_array, d);
/* And put the results of that maths where the C++ method can see it */
}
...
something(&doUsefulThings(i_ptr_array, ptrStrct, localVariableAddresses, ...), ...);
...
If this is as clear as mud please tell me! Thank you very much for your help. And, by the way, I sincerely wish someone would make an open hardware/source MRI system.
*As an aside, this is the primary justification the manufacturer uses to discourage us from modifying the big library in the first place!
You have full access to the C code. You have limited access to the C++ library code. The C code is defining the "doUsefullthings" function. From C code you are calling the "Something" function ( C++ class/function) with function pointer to "doUseFullThings" as the argument. Now the control goes to the C++ library. Here the various arguments are allocated memory and initialized. Then the the "doUseFullThings" is called with those arguments. Here the control transfers back to the C code. In short, the main program(C) calls the library(C++) and the library calls the C function.
One of the requirements is that the "userland function should have access to local variable from the C code where it is called". When you call "something" you are only giving the address of "doUseFullThings". There is no parameter/argument of "something" that captures the address of the local variables. So "doUseFullThings" does not have access to those variables.
malloc statement returns pointer. This has not been handled properly.( probably you were trying to give us overview ). You must be taking care to free this somewhere.
Since this is a mixture of C and C++ code, it is difficult to use RAII (taking care of allocated memory), Perfect forwarding ( avoid copying variables), Lambda functions ( to access local varibales) etc. Under the circumstances, your approach seems to be the way to go.

How to pass data to glutTimerFunc?

#include <gl/freeglut.h>
void Keyboard(int value){
glutTimerFunc(33,Keyboard,0);
}
int main(int argc, char **argv){
glutTimerFunc(33,Keyboard);
}
Is there any way to pass data from the main function to the keyboard function without having to use global variables? It seems like the function glutTimerFunc only allows functions with the signature (int value) which seems very restrictive.
so is there another way to get the data into another funcion without having them to be global?
Like other people already suggested: Don't use GLUT. GLUT never was meant to be the foundation of serious application. It's a small framework meant for small OpenGL tech demos. The problem you're running into is that C and C++ lack a concept called "closures", or sometimes also called "delegates". Other languages have them and in fact when you use the GLUT bindings in those languages you don't experience the problem.
Since C/C++'s lack of closures is so prominent a small library (that does crazy but genius things internally) has been written, that allows it, to create closures in C. It's called ffcall, unfortunately seems to be unmaintained, but is in use by projects as the GNU Common Lisp and Scheme compilers.
I wrote about using ffcall already in the StackOverflow answers https://stackoverflow.com/a/8375672/524368 and https://stackoverflow.com/a/10207698/524368
You can use a trick: pass a pointer to the object as value, after casting it to int:
i instance;
keyboard((int)&i);
If you need to call the function with the timer, consider the fact that when a function ends all the object in the stack are deallocated. So in this case you should allocate the object with new, and always have a handle to it in the case you need to deallocate it.
It's very tricky, I know. If you can switch to another library do it.

How to transmit a function to anonymous pipe WinAPI?

I need to write to anonymous pipe something like double (*fun)(double), but the following WriteFile(pipe, fun, 4, written_bytes, 0) causes an error in a pipe-receiver while ReadFile(read_pipe, fun, 4, written_bytes, 0). Are there any methods to do this?
I have an idea. I can create a struct with field of same type:
struct Foo
{
double (*f)(double);
};
And then I write it WriteFile(hWritePipe_StdIN, &to_process, sizeof(Foo), &bytes, 0);
But I have problem, that pipe-receiver never ends to read data:
ReadFile(hReadPipe, &to_process, sizeof(Foo), &bytes, 0);
There are some problems with it:
First, you should know the size of function.
If you do, you just call WriteFile(pipe, funcPtr, funcSize, ...) to transfer it.
Second, the function should contain only position-independent code, and don't address any data.
E.g. a function like this won't work:
double fun(double x)
{
int arr[10000]; // implicit function call (alloca or something like this)
printf("some");
static int some = 1;
return globalVal + (++some);
}
because function printf will have a different address and there will be no static variable and string in another process.
(Well, maybe you can transfer data as well, but there is no way you'll generate PI code.)
So, with all that limitations, you can send a function:
__declspec(naked) double fun(double x) { __asm ret }
const auto funcSize = 1;
WriteFile(pipe, &fun, funcSize, ...);
In native code you can not send function (the code) itself, neither to the same nor to different process. (You could try low-level hacking like the one #Abyx suggests, but it seriously limits functionality that the code can perform, and will probably make you resort to writing it all in assembler by hand.)
You also can't send function's address to another process, because each process has its own isolated address space; in another process, that address will contain different data.
The solution will be to create a shared library (preferably dynamic) that will contain all functions that could possibly be sent this way. Assign each function some tag (e.g. number or name), let DLL maintain a mapping between tags and addresses. Then send tags instead.
What are you trying to achieve, here? Are you really trying to write the function itself? Why? That's not something you can easily do in C++, for instance because the size of a function is not well-defined.
You should probably write the data, i.e. the number returned by fun() instead:
const double value = fun(input);
DWORD numberOfBytesWritten;
WriteFile(pipe, &value, sizeof value, &numberOfBytesWritten, NULL);
You should of course add code to check the output. Note that writing binary data like this can be brittle.
Since you're using WinAPI, the native way to send a function is via COM. In particular, expose the function as a method on a COM object, obtain a COM moniker, and send the moniker. Monikers can be serialized and sent over pipes. The other side can deserialize the moniker and get access to your object.
Under water, this works by looking up the object in the COM Running Object Table
Seeing how this is excessively complicated and error-prone to do in C++ (and only works with a very limited set of functions at all), I recommend you use a scripting language for this. Instruction caches and DEP are another two things you'd have to consider in addition to the already mentioned ones.
Really. Transmit the function as script, and run it on the other end. Save yourself that pain.
Angelscript looks and feels almost like C++, so that might be a possible candidate.
Now, if you object to this because you need something that a script cannot trivially do, knoweth: C++ will not be able to do it either, in this scenario.
Apart from the above mentioned PIC code issue (#Abyx) and the fact that you cannot safely or portably know a function's size, the only C++ functions that you could conceivably send via a pipe and execute in a meaningful manner are strictly const functions. Here, const is in the sense of e.g. GCC's __attribute__((const)), not the C++ const keyword.
That is, any such function may not examine any values except its arguments, and have no effects except the return value. The reason is obvious: A different process lives in a different address space, so anything you reference is meaningless. Anything you change is meaningless.
Now, this is just what a script can do, in a safe, straightforward manner, and reliably. The overhead is, considering you already send code through a pipe, neglegible.

Function pointers and unknown number of arguments in C++

I came across the following weird chunk of code.Imagine you have the following typedef:
typedef int (*MyFunctionPointer)(int param_1, int param_2);
And then , in a function , we are trying to run a function from a DLL in the following way:
LPCWSTR DllFileName; //Path to the dll stored here
LPCSTR _FunctionName; // (mangled) name of the function I want to test
MyFunctionPointer functionPointer;
HINSTANCE hInstLibrary = LoadLibrary( DllFileName );
FARPROC functionAddress = GetProcAddress( hInstLibrary, _FunctionName );
functionPointer = (MyFunctionPointer) functionAddress;
//The values are arbitrary
int a = 5;
int b = 10;
int result = 0;
result = functionPointer( a, b ); //Possible error?
The problem is, that there isn't any way of knowing if the functon whose address we got with LoadLibrary takes two integer arguments.The dll name is provided by the user at runtime, then the names of the exported functions are listed and the user selects the one to test ( again, at runtime :S:S ).
So, by doing the function call in the last line, aren't we opening the door to possible stack corruption? I know that this compiles, but what sort of run-time error is going to occur in the case that we are passing wrong arguments to the function we are pointing to?
There are three errors I can think of if the expected and used number or type of parameters and calling convention differ:
if the calling convention is different, wrong parameter values will be read
if the function actually expects more parameters than given, random values will be used as parameters (I'll let you imagine the consequences if pointers are involved)
in any case, the return address will be complete garbage, so random code with random data will be run as soon as the function returns.
In two words: Undefined behavior
I'm afraid there is no way to know - the programmer is required to know the prototype beforehand when getting the function pointer and using it.
If you don't know the prototype beforehand then I guess you need to implement some sort of protocol with the DLL where you can enumerate any function names and their parameters by calling known functions in the DLL. Of course, the DLL needs to be written to comply with this protocol.
If it's a __stdcall function and they've left the name mangling intact (both big ifs, but certainly possible nonetheless) the name will have #nn at the end, where nn is a number. That number is the number of bytes the function expects as arguments, and will clear off the stack before it returns.
So, if it's a major concern, you can look at the raw name of the function and check that the amount of data you're putting onto the stack matches the amount of data it's going to clear off the stack.
Note that this is still only a protection against Murphy, not Machiavelli. When you're creating a DLL, you can use an export file to change the names of functions. This is frequently used to strip off the name mangling -- but I'm pretty sure it would also let you rename a function from xxx#12 to xxx#16 (or whatever) to mislead the reader about the parameters it expects.
Edit: (primarily in reply to msalters's comment): it's true that you can't apply __stdcall to something like a member function, but you can certainly use it on things like global functions, whether they're written in C or C++.
For things like member functions, the exported name of the function will be mangled. In that case, you can use UndecorateSymbolName to get its full signature. Using that is somewhat nontrivial, but not outrageously complex either.
I do not think so, it is a good question, the only provision is that you MUST know what the parameters are for the function pointer to work, if you don't and blindly stuff the parameters and call it, it will crash or jump off into the woods never to be seen again... It is up to the programmer to convey the message on what the function expects and the type of parameters, luckily you could disassemble it and find out from looking at the stack pointer and expected address by way of the 'stack pointer' (sp) to find out the type of parameters.
Using PE Explorer for instance, you can find out what functions are used and examine the disassembly dump...
Hope this helps,
Best regards,
Tom.
It will either crash in the DLL code (since it got passed corrupt data), or: I think Visual C++ adds code in debug builds to detect this type of problem. It will say something like: "The value of ESP was not saved across a function call", and will point to code near the call. It helps but isn't totally robust - I don't think it'll stop you passing in the wrong but same-sized argument (eg. int instead of a char* parameter on x86). As other answers say, you just have to know, really.
There is no general answer. The Standard mandates that certain exceptions be thrown in certain circumstances, but aside from that describes how a conforming program will be executed, and sometimes says that certain violations must result in a diagnostic. (There may be something more specific here or there, but I certainly don't remember one.)
What the code is doing there isn't according to the Standard, and since there is a cast the compiler is entitled to go ahead and do whatever stupid thing the programmer wants without complaint. This would therefore be an implementation issue.
You could check your implementation documentation, but it's probably not there either. You could experiment, or study how function calls are done on your implementation.
Unfortunately, the answer is very likely to be that it'll screw something up without being immediately obvious.
Generally if you are calling LoadLibrary and GetProcByAddrees you have documentation that tells you the prototype. Even more commonly like with all of the windows.dll you are provided a header file. While this will cause an error if wrong its usually very easy to observe and not the kind of error that will sneak into production.
Most C/C++ compilers have the caller set up the stack before the call, and readjust the stack pointer afterwards. If the called function does not use pointer or reference arguments, there will be no memory corruption, although the results will be worthless. And as rerun says, pointer/reference mistakes almost always show up with a modicum of testing.

Does an arbitrary instruction pointer reside in a specific function?

I have a very difficult problem I'm trying to solve: Let's say I have an arbitrary instruction pointer. I need to find out if that instruction pointer resides in a specific function (let's call it "Foo").
One approach to this would be to try to find the start and ending bounds of the function and see if the IP resides in it. The starting bound is easy to find:
void *start = &Foo;
The problem is, I don't know how to get the ending address of the function (or how "long" the function is, in bytes of assembly).
Does anyone have any ideas how you would get the "length" of a function, or a completely different way of doing this?
Let's assume that there is no SEH or C++ exception handling in the function. Also note that I am on a win32 platform, and have full access to the win32 api.
This won't work. You're presuming functions are contigous in memory and that one address will map to one function. The optimizer has a lot of leeway here and can move code from functions around the image.
If you have PDB files, you can use something like the dbghelp or DIA API's to figure this out. For instance, SymFromAddr. There may be some ambiguity here as a single address can map to multiple functions.
I've seen code that tries to do this before with something like:
#pragma optimize("", off)
void Foo()
{
}
void FooEnd()
{
}
#pragma optimize("", on)
And then FooEnd-Foo was used to compute the length of function Foo. This approach is incredibly error prone and still makes a lot of assumptions about exactly how the code is generated.
Look at the *.map file which can optionally be generated by the linker when it links the program, or at the program's debug (*.pdb) file.
OK, I haven't done assembly in about 15 years. Back then, I didn't do very much. Also, it was 680x0 asm. BUT...
Don't you just need to put a label before and after the function, take their addresses, subtract them for the function length, and then just compare the IP? I've seen the former done. The latter seems obvious.
If you're doing this in C, look first for debugging support --- ChrisW is spot on with map files, but also see if your C compiler's standard library provides anything for this low-level stuff -- most compilers provide tools for analysing the stack etc., for instance, even though it's not standard. Otherwise, try just using inline assembly, or wrapping the C function with an assembly file and a empty wrapper function with those labels.
The most simple solution is maintaining a state variable:
volatile int FOO_is_running = 0;
int Foo( int par ){
FOO_is_running = 1;
/* do the work */
FOO_is_running = 0;
return 0;
}
Here's how I do it, but it's using gcc/gdb.
$ gdb ImageWithSymbols
gdb> info line * 0xYourEIPhere
Edit: Formatting is giving me fits. Time for another beer.