I am trying to print the address of a virtual member function.
If I know which class implements the function I can write:
printf("address: %p", &A::func);
But I want to do something like this:
A *b = new B();
printf("address: %p", &b->func);
printf("address: %p", &b->A::func);
However this does not compile. Is it possible to do something like this, perhaps looking up the address in the vtable at runtime?
Currently there is no standard way of doing this in C++ although the information must be available somewhere. Otherwise, how could the program call the function? However, GCC provides an extension that allows us to retrieve the address of a virtual function:
void (A::*mfp)() = &A::func;
printf("address: %p", (void*)(b->*mfp));
...assuming the member function has the prototype void func().
This can be pretty useful when you want to cache the address of a virtual function or use it in generated code. GCC will warn you about this construct unless you specify -Wno-pmf-conversions. It's unlikely that it works with any other compiler.
Pointers to member functions are not always simple memory addresses. See the table in this article showing the sizes of member function pointers on different compilers - some go up to 20 bytes.
As the article outlines a member function pointer is actually a blob of implementation-defined data to help resolve a call through the pointer. You can store and call them OK, but if you want to print them, what do you print? Best to treat it as a sequence of bytes and get its length via sizeof.
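A minimal sketch of the "treat it as a sequence of bytes" idea follows. The class Widget and the helper toHexBytes are hypothetical names introduced only for illustration; the byte layout you see is implementation-defined and will differ between compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical class used only for illustration.
struct Widget {
    virtual ~Widget() {}
    virtual void func() {}
};

// Returns the object representation of a trivially copyable value
// (here: a pointer to member function) as a hex string, byte by byte.
template <typename T>
std::string toHexBytes(const T& value) {
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));  // copy out the raw bytes
    std::string out;
    char buf[3];
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(bytes[i]));
        out += buf;
    }
    assert(out.size() == 2 * sizeof(T));  // two hex digits per byte
    return out;
}
```

Calling toHexBytes(&Widget::func) yields 2 * sizeof(&Widget::func) hex digits, which makes it obvious when the member pointer is wider than a void *.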
I found a way to do this using a disassembler (https://github.com/vmt/udis86). The steps are:
Get a pointer to the virtual function via normal C++ code
Disassemble the jmp instruction at that address
Parse the real address from the disassembled string
Here is how I did it:
// First get the raw pointer to the virtual function
auto myVirtualFuncPtr = &MyClass::myFunc;
void* myVirtualFuncPtrRaw = (void*&)myVirtualFuncPtr;
// Resolve the real function!
void* myFuncPtr = resolveVirtualFunctionAddress(myVirtualFuncPtrRaw);
...
static void* resolveVirtualFunctionAddress(void* address)
{
const int jumpInstructionSize = 5;
static ud_t ud_obj;
ud_init(&ud_obj);
ud_set_mode(&ud_obj, sizeof(void*) * 8);
ud_set_syntax(&ud_obj, UD_SYN_INTEL);
ud_set_pc(&ud_obj, (uint64_t)address);
ud_set_input_buffer(&ud_obj, (const uint8_t*)address, jumpInstructionSize);
std::string jmpInstruction = "";
if (ud_disassemble(&ud_obj))
{
jmpInstruction += ud_insn_asm(&ud_obj);
}
// TODO: Implement startsWith and leftTrim yourself
if (startsWith(jmpInstruction, "jmp "))
{
std::string jumpAddressStr = leftTrim(jmpInstruction, "jmp ");
return hexToPointer(jumpAddressStr);
}
// If the jmp instruction was not found, then we just return the original address
return address;
}
static void* hexToPointer(std::string hexString)
{
void* address;
std::stringstream ss;
ss << std::hex << hexString;
ss >> address;
return address;
}
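The TODO above leaves startsWith and leftTrim to the reader. One possible sketch is shown here; the signatures are assumptions inferred from how the two helpers are called, and the exact text the disassembler prints for a jmp operand should be checked before relying on it.

```cpp
#include <cassert>
#include <string>

// Returns true when `text` begins with `prefix`.
bool startsWith(const std::string& text, const std::string& prefix) {
    return text.compare(0, prefix.size(), prefix) == 0;
}

// Removes `prefix` from the front of `text` if present;
// otherwise returns `text` unchanged.
std::string leftTrim(const std::string& text, const std::string& prefix) {
    assert(!prefix.empty());  // trimming an empty prefix is almost certainly a bug
    return startsWith(text, prefix) ? text.substr(prefix.size()) : text;
}
```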
From what I can tell in the standard, the only time you get dynamic binding is during a virtual function call. And once you've called a function, you're executing the statements within the function (i.e., you can't "stop halfway" into the call and get the address.)
I think it's impossible.
Doesn't make a lot of sense to me. If you have a normal function:
void f( int n ) {
}
then you can take its address:
f
but you cannot take the address of a function call, which is what you seem to want to do.
Related
I am working on some legacy code where I have to make some changes in the cpp file. The cpp file contains its entire code in an extern "C" block -
I updated a function that returns a char*. The code looks something like func1() below.
Since I use std::string and stringstream, I included the sstream and string header files before the extern block.
The function below is called from both C and C++ files, so I cannot return std::string here -
char* func1(someStruct* pt){
std::string nam = somefunc(pt);
//have to append some integer in particular format
std::stringstream ss;
ss<<nam<<pt->int1 ......;
nam = ss.str();
//More code here for returning char* based on queries - (a)
}
At one of the places where this function is called -
void otherFunc(.....){
//......
char* x = func1(myptr);
if(based_on_some_condition){
char* temp = func3(x); //returns a char* to dynamically allocated array.
strcpy(x,temp); //copying (b)
}
//..........
}
Following is my query -
1) At (a) I can return char* in the following 2 forms. I have to make a decision such that copying at (b) does not cause any undefined behavior -
i) Create a char array dynamically with size = nam.length()+10 (extra 10 for some work happening in func3).
char* rtvalue = (char*)calloc(nam.length()+10, sizeof(char));
strcpy(rtvalue,nam.c_str());
return rtvalue;
And free(temp); in otherFunc() after strcpy(x,temp);
ii) Declare 'nam' as static std::string nam;
and simply return const_cast<char*>(nam.c_str());
Will defining 'nam' with static scope ensure that a correct return happen from function (ie no dangling pointer at 'x')?
More importantly, can I do this without worrying about modification happening at (b).
Which one is a better solution?
The problem is returning a char *. When you are using C++ you should avoid this type. This is not C! Use std::string or std::vector<char> instead.
If you use char * as the return type in this kind of function, it will end in undefined behavior (access to released memory) or a memory leak.
If you use static std::string nam; the function will maintain internal state, and this always leads to trouble.
For example, if you call it from multiple threads you will have undefined behavior. Even worse, if the function is called twice for some reason, the second call will affect the result of the first call (for example, a coworker may use this function without expecting hidden side effects).
If you are designing an API which should be accessible from C code, then you should design it differently. I do not know what kind of functionality you are providing, but most probably you should do something like this:
char *func1(someStruct* pt, char *result, int size){ // good name could be like this: appendStructDescription
std::string nam = somefunc(pt);
//have to append some integer in particular format
std::stringstream ss;
ss<<nam<<pt->int1 ......;
nam = ss.str();
int resultSize = std::min(size - 1, static_cast<int>(nam.length())); // both arguments must have the same type
memcpy(result, nam.c_str(), resultSize);
result[resultSize] = 0;
return result + resultSize;
}
This approach has big advantages: responsibility for a memory management goes to caller, user of the API understands what is expected.
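To show how the caller-owned buffer plays out, here is a self-contained sketch. The someStruct layout (a name plus int1) and the function name appendStructDescription are assumptions made for illustration; the real struct and formatting are not shown in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical stand-in for the question's someStruct.
struct someStruct {
    const char* name;
    int int1;
};

// Writes "<name><int1>" into the caller's buffer, truncating if needed.
// Returns a pointer to the terminating NUL so that calls can be chained.
extern "C" char* appendStructDescription(const someStruct* pt, char* result, int size) {
    assert(pt && result && size > 0);
    std::ostringstream ss;
    ss << pt->name << pt->int1;
    const std::string text = ss.str();
    const int len = std::min(size - 1, static_cast<int>(text.length()));
    std::memcpy(result, text.c_str(), len);
    result[len] = '\0';
    return result + len;
}
```

The caller decides how much room to provide (for example the extra 10 bytes func3 needs), and nothing has to be freed or kept alive behind the caller's back.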
It is true that you should return a string, but if you absolutely need to return char*, the first method is better. And don't forget to free. With the static approach, expressions like strcmp(f(pt1), f(pt2)) would return unpredictable results.
I have a specific problem I'm trying to solve: I need to find the location (in memory) of a class's method. I think I've hit a syntax constraint because a pointer to a method is handled as a member pointer. Example:
class Foo {
public:
int targetFunction(int value) { return value + 5; }
};
DWORD findLocation() { // ignore the fact that DWORD might not equal pointer size.
int (Foo::*address)(int) = &Foo::targetFunction; // member function pointer (no parentheses around the qualified name)
void* typeHide = (void*)&address; // Remove type
DWORD targetAddress = *(DWORD*)typeHide; // Convert type from void* to DWORD* and dereference to DWORD
return targetAddress;
}
int (Foo::*address)(int) = can also be written as auto address =
Now, in VS2008, it says Foo::targetFunction's address is "0x000F B890" but &Foo::targetFunction is "0x000F 1762"
First, the member pointer works correctly using the member pointer operators .* and ->*. If I cast targetAddress back to a member pointer, it still works!
Second, the location can be a thunk function!
Finally, if I use VS2008's debugger to change the value of targetFunction from the member pointer's address 1762 to the VS debugger reported value B890, my code works correctly!
Is there a C++ specific way of getting the address value (B890) instead of the member pointer value (1762)?
Upon request, here is code I'm trying to make work:
BYTE overwriteStorage[300][2];
void NOP(void)
{
// hackish, but works for now.
}
void disableOlderVersions(DWORD _address, int index)
{
//...
_address = findLocation();
DWORD protectionStorage = 0;
VirtualProtect((void *)_address, 1+4, PAGE_WRITECOPY, &protectionStorage); // windows.h: Make Read/Write the location in code
{
BYTE *edit = (BYTE*)_address;
overwriteStorage[index][0] = *(edit+0); // store previous value to revert if needed
*(edit+0) = 0XE9; // JUMP (32-bit)
overwriteStorage[index][1] = *(edit+1); // store second value
signed int correctOffset = (signed int)NOP - (signed int)_address - 5; // calculate 0xE9 relative jump
*(signed int*)(edit+1) = correctOffset; // set jump target
}
VirtualProtect((void *)_address, 1+4, PAGE_EXECUTE, &protectionStorage);
}
If I replace the first line of findLocation from a member pointer to an actual function pointer, it works perfectly. However, I need to read and write to several class methods as well, and this method is broken by the odd member pointers.
Also, I've had some local functions not report the correct address either (recently). Is there possibly another way to find function addresses without being constrained by compiler behavior?
It sounds like you're trying to compress a member-function call into a single function pointer. It's not possible.
Remember:
Object x;
x.a(1);
is actually short for
a(&x /*this*/, 1 /*arg1, ... */); //approximation, leprechauns may be involved in actual implementations.
That first argument is crucial, it's going to become "this".
So you can't do something like this:
class Object {
public:
void f(int);
};
typedef void (*FNPTR)(int);
Object a;
void (Object::* memberFuncPtr)(int) = &Object::f;
void* nerfedPtr = (void*)memberFuncPtr;
FNPTR funcPtr = static_cast<FNPTR>(nerfedPtr);
funcPtr(1);
Because you've robbed the member function of its object context.
There is no way to call an object member function without having both the address of the function and the address of the instance.
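What does work is keeping both pieces together. The following is a small sketch, assuming a hypothetical Object with an int-returning f and a hypothetical BoundCall wrapper; it is one way to bundle the instance and the member pointer, not the only one.

```cpp
#include <cassert>

struct Object {
    int base;
    int f(int x) { return base + x; }
};

// Bundles the two things a member call needs: the instance and the function.
struct BoundCall {
    Object* instance;
    int (Object::*method)(int);

    int operator()(int arg) const {
        assert(instance && method);
        return (instance->*method)(arg);  // ->* supplies the hidden this argument
    }
};
```

A BoundCall built as { &a, &Object::f } can then be invoked like an ordinary function: call(1).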
I'm trying to get function addresses which are hidden behind structures. Unfortunately, the void* basic C++ conversion doesn't work, so I used C++ template instead.
1. Basic void* C++ conversion doesn't work with functions inside structures, why?
void * lpfunction;
lpfunction = scanf; //OK
lpfunction = MessageBoxA; //OK
I made a simple structure :
struct FOO{
void PRINT(void){printf("bla bla bla");}
void SETA(int){} //nothing you can see
void SETB(int){} //nothing you can see
int GETA(void){} //nothing you can see
int GETB(void){} //nothing you can see
};
///////////////////////////////////////////
void *lpFunction = FOO::PRINT;
And the compiling error :
error C2440: 'initializing' :
cannot convert from 'void (__thiscall FOO::*)(void)' to 'void *'
2. Is getting function member addresses impossible?
Then, I made a template function which is able to convert a function member to address. Then I will call it by assembly. It should be something like this:
template <class F,void (F::*Function)()>
void * GetFunctionAddress() {
union ADDRESS
{
void (F::*func)();
void * lpdata;
}address_data;
address_data.func = Function;
return address_data.lpdata; //Address found!!!
}
And here is the code :
int main()
{
void * address = GetFunctionAddress<FOO,&FOO::PRINT>();
FOO number;
number.PRINT(); //Template call
void * lpdata = &number;
__asm mov ecx, lpdata //Attach "number" structure address
__asm call address //Call FOO::PRINT with assembly using __thiscall
printf("Done.\n");
system("pause");
return 0;
}
But, I see it is extremely specific. It looks like LOCK - KEY, and I have to make a new template for every set of argument types.
Original (OK) :
void PRINT(); //void FOO::PRINT();
Modify a bit :
void PRINT(int); //void FOO::PRINT(int);
Immediately with old template code the compiler shows :
//void (F::*func)();
//address_data.func = Function;
error C2440: '=' : cannot convert from
'void (__thiscall FOO::*)(int)' to 'void (__thiscall FOO::*)(void)'
Why? They are only addresses.
69: address_data.func = Function;
00420328 mov dword ptr [ebp-4],offset #ILT+2940(FOO::PRINT) (00401b81)
...
EDIT3 : I know the better solution :
void(FOO::*address_PRINT)(void) = &FOO::PRINT;
int(FOO::*address_GETA)(void) = &FOO::GETA;
int(FOO::*address_GETB)(void) = &FOO::GETB;
void(FOO::*address_SETA)(int) = &FOO::SETA;
void(FOO::*address_SETB)(int) = &FOO::SETB;
It's much better than template. And by the way I want to achieve the goal :
<special_definition> lpfunction;
lpfunction = FOO::PRINT; //OK
lpfunction = FOO::GETA; //OK
lpfunction = FOO::GETB; //OK
lpfunction = FOO::SETA; //OK
lpfunction = FOO::SETB; //OK
Is this possible?
Pointers to member functions are nothing like pointers to global functions or static member functions. There are many reasons for this, but I'm not sure how much you know about how C++ works, and so I'm not sure what reasons will make sense.
I do know that what you are trying in assembly simply won't work in the general case. It seems like you have a fundamental misunderstanding about the purpose of member functions and function pointers.
The thing is, you are doing some things that you would generally not do in C++. You don't generally build up tables of function pointers in C++ because the things you would use that sort of thing for are what virtual functions are for.
If you are determined to use this approach, I would suggest you not use C++ at all, and only use C.
To prove these pointer types are completely incompatible, here is a program for you:
#include <cstdio>
struct Foo {
int a;
int b;
int addThem() { return a + b; }
};
struct Bar {
int c;
int d;
int addThemAll() { return c + d; }
};
struct Qux : public Foo, public Bar {
int e;
int addAllTheThings() { return Foo::addThem() + Bar::addThemAll() + e; }
};
int addThemGlobal(Foo *foo)
{
return foo->a + foo->b;
}
int main()
{
int (Qux::*func)();
func = &Bar::addThemAll;
printf("sizeof(Foo::addThem) == %zu\n", sizeof(&Foo::addThem));
printf("sizeof(Bar::addThemAll) == %zu\n", sizeof(&Bar::addThemAll));
printf("sizeof(Qux::addAllTheThings) == %zu\n", sizeof(&Qux::addAllTheThings));
printf("sizeof(func) == %zu\n", sizeof(func));
printf("sizeof(addThemGlobal) == %zu\n", sizeof(&addThemGlobal));
printf("sizeof(void *) == %zu\n", sizeof(void *));
return 0;
}
On my system this program yields these results:
$ /tmp/a.out
sizeof(Foo::addThem) == 16
sizeof(Bar::addThemAll) == 16
sizeof(Qux::addAllTheThings) == 16
sizeof(func) == 16
sizeof(addThemGlobal) == 8
sizeof(void *) == 8
Notice how the member function pointer is 16 bytes long. It won't fit into a void *. It isn't a pointer in the normal sense. Your code and union work purely by accident.
The reason for this is that a member function pointer often needs extra data stored in it related to fixing up the object pointer it's passed in order to be correct for the function that's called. In my example, when Bar::addThemAll is called on a Qux object (which is perfectly valid because of inheritance), the pointer to the Qux object needs to be adjusted to point at the Bar sub-object before the function is called. So Qux::*s to member functions must have this adjustment encoded in them. After all, saying func = &Qux::addAllTheThings is perfectly valid, and if that function were called no pointer adjustment would be necessary. So the pointer adjustment is a part of the function pointer's value.
And that's just an example. Compilers are permitted to implement member function pointers in any way they see fit (within certain constraints). Many compilers (like the GNU C++ compiler on a 64-bit platform like I was using) will implement them in a way that does not permit any member function pointer to be treated as at all equivalent to normal function pointers.
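The pointer adjustment described above can be observed directly. This sketch reuses trimmed-down versions of Foo, Bar and Qux; the helper names subobjectDistance and callThroughQuxPointer are hypothetical, and the actual distance printed depends on the ABI.

```cpp
#include <cassert>
#include <cstddef>

struct Foo { int a; int b; int addThem() { return a + b; } };
struct Bar { int c; int d; int addThemAll() { return c + d; } };
struct Qux : public Foo, public Bar { int e; };

// Distance in bytes between the Foo and Bar sub-objects of one Qux.
std::ptrdiff_t subobjectDistance() {
    Qux q = Qux();
    const char* fooPart = reinterpret_cast<const char*>(static_cast<Foo*>(&q));
    const char* barPart = reinterpret_cast<const char*>(static_cast<Bar*>(&q));
    // Two non-empty sub-objects cannot share an address, so at least one
    // of them must sit at a non-zero offset inside Qux.
    assert(fooPart != barPart);
    return barPart - fooPart;
}

// Calls a Bar member through a pointer to member of Qux. The this pointer
// has to be moved to the Bar sub-object before addThemAll runs.
int callThroughQuxPointer(Qux& q) {
    int (Qux::*func)() = &Bar::addThemAll;
    return (q.*func)();
}
```

Because the sub-objects live at different addresses, a single code address is not enough information to make the call, which is why the member pointer carries more than one word.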
There are ways to deal with this. The swiss-army knife of dealing with member function pointers is the ::std::function template in C++11 or C++ TR1.
An example:
#include <functional>
// .... inside main
::std::function<int(Qux *)> funcob = func;
funcob can point at absolutely anything that can be called like a function and needs a Qux *. Member functions, global functions, static member functions, functors... funcob can point at it.
That example only works on a C++11 compiler though. But if your compiler is reasonably recent, but still not a C++11 compiler, this may work instead:
#include <tr1/functional>
// .... inside main
::std::tr1::function<int(Qux *)> funcob = func;
If worst comes to worst, you can use the Boost libraries, which is where this whole concept came from.
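To make the "can point at absolutely anything" claim concrete, here is a short C++11 sketch. The cut-down Qux, freeFunction and callOn are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <functional>

struct Qux {
    int e;
    int addAllTheThings() { return e; }
};

int freeFunction(Qux* q) { return q->e * 2; }

// Invokes whatever callable is stored; the caller does not need to know
// whether it is a member function, a free function or a lambda.
int callOn(const std::function<int(Qux*)>& callable, Qux* target) {
    assert(callable);  // an empty std::function would throw std::bad_function_call
    return callable(target);
}
```

The same callOn accepts std::function objects built from &Qux::addAllTheThings, from &freeFunction, or from a lambda taking a Qux *.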
But I would rethink your design. I suspect that you will get a lot more mileage out of having a well thought out inheritance hierarchy and using virtual functions than you will out of whatever it is you're doing now. With an interpreter I would have a top level abstract 'expression' class that is the base class for anything that can be evaluated. I would give it a virtual evaluate method. Then you can derive classes for different syntax elements like an addition expression, a variable, or a constant. Each of them will override the evaluate method for their specific case. Then you can build up expression trees.
Not knowing details though, that's just a vague suggestion about your design.
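As a rough illustration of that suggestion, an expression hierarchy could look like the sketch below. Every name here (Expression, Constant, Variable, Addition, Environment) is hypothetical, and using double values with a std::map environment is an assumption, since the original design is not shown.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

typedef std::map<std::string, double> Environment;

// Abstract base: anything that can be evaluated.
struct Expression {
    virtual ~Expression() {}
    virtual double evaluate(const Environment& env) const = 0;
};

struct Constant : Expression {
    double value;
    explicit Constant(double v) : value(v) {}
    double evaluate(const Environment&) const override { return value; }
};

struct Variable : Expression {
    std::string name;
    explicit Variable(const std::string& n) : name(n) {}
    double evaluate(const Environment& env) const override {
        Environment::const_iterator it = env.find(name);
        assert(it != env.end());  // sketch only: real code would report the error
        return it->second;
    }
};

struct Addition : Expression {
    std::unique_ptr<Expression> lhs, rhs;
    Addition(std::unique_ptr<Expression> l, std::unique_ptr<Expression> r)
        : lhs(std::move(l)), rhs(std::move(r)) {}
    double evaluate(const Environment& env) const override {
        return lhs->evaluate(env) + rhs->evaluate(env);  // virtual dispatch does the work
    }
};
```

No table of function pointers is needed; each node's vtable selects the right evaluate.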
Here is a clean solution. By means of a template, wrap your member function in a static member function. Then you can convert it to whatever function pointer you want:
struct T {
void f() {
}
};
template<class F, void (F::*funct)()>
struct Helper {
static void static_f(F *obj) {
((*obj).*funct)();
}
};
int main() {
void (*ptr)(T*);
ptr = &(Helper<T,&T::f>::static_f);
}
It seems that you need to convert a pointer to a member function to a void *. I presume you want to give that pointer as a "user data" to some library function and then you will get back your pointer and want to use it on some given object.
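If this is the case, note that reinterpret_cast<void *>(...) cannot be applied directly to a pointer to member function; passing the address of a variable that holds the member pointer (and casting it back when you receive it) could be the right thing... I assume that the library receiving the pointer is not using it.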
I'd like to use a pointer to member function in C++, but it doesn't work:
pointer declaration:
int (MY_NAMESPACE::Number::*parse_function)(string, int);
pointer assignation:
parse_function = &MY_NAMESPACE::Number::parse_number;
This call works perfectly (itd is an iterator to elements of a map):
printf("%s\t%p\n",itd->first.c_str(),itd->second.parse_function);
But this one doesn't work:
int ret = (itd->second.*parse_function)(str, pts);
$ error: 'parse_function' was not declared in this scope
And this one neither
int ret = (itd->second.*(MY_NAMESPACE::Number::parse_function))(str, pts);
$ [location of declaration]: error: invalid use of non-static data member 'MY_NAMESPACE::Number::parse_function'
$ [location of the call]: error: from this location
I don't understand why ...
Thx in advance !!
int (MY_NAMESPACE::Number::*parse_function)(string, int);
This shows, parse_function is a pointer to a member function of class Number.
This call works perfectly (itd is an iterator to elements of a map):
printf("%s\t%p\n",itd->first.c_str(),itd->second.parse_function);
and from this we can see parse_function is a member of itd->second, whatever this is.
For this call
int ret = (itd->second.*parse_function)(str, pts);
or this call
int ret = (itd->second.*(MY_NAMESPACE::Number::parse_function))(str, pts);
to succeed, itd->second must be of type Number, which it presumably isn't. And parse_function must be defined as either a variable in the current or enclosing scope (first case) or a static variable of class Number (second case).
So you need some Number and apply parse_function to that
Number num;
(num.*(itd->second.parse_function))(str, pts);
or with a pointer
Number *pnum;
(pnum->*(itd->second.parse_function))(str, pts);
Update:
Since itd->second is a Number, you must apply parse_function, which is a member of it, like this
int ret = (itd->second.*(itd->second.parse_function))(str, pts);
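For reference, a compilable reconstruction of that situation might look like this. The Number class body, parse_number's behavior and the helper callParse are assumptions; only the shape of the call expression comes from the answer above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical reconstruction of the class from the question.
struct Number {
    // Data member holding a pointer to one of Number's own member functions.
    int (Number::*parse_function)(std::string, int);

    Number() : parse_function(&Number::parse_number) {}

    int parse_number(std::string text, int offset) {
        return static_cast<int>(text.size()) + offset;
    }
};

int callParse(std::map<std::string, Number>& table, const std::string& key,
              const std::string& str, int pts) {
    std::map<std::string, Number>::iterator itd = table.find(key);
    assert(itd != table.end());
    // Left of .* is the object to call on; right of .* is the stored member pointer.
    return (itd->second.*(itd->second.parse_function))(str, pts);
}
```

The object appears twice because it plays two roles: it stores the pointer and it is also the instance the function is called on.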
You can define pointers to functions like so: type(*variable)() = &function;
For example:
int(*func_ptr)();
func_ptr = &myFunction;
I might just not be following your code this early in the morning, but the problem could be that parse_function is a pointer, yet you're calling it like itd->second.*parse_function.
Pointers are called with ->*, so try doing itd->second->parse_function.
It might not fix anything though; I can't really follow your code.
Please post more information; it's hard to tell from two lines of code.
Here's one example on how it's used in actual code, this one calls func() through cb() using pointers and parameters only:
int func()
{
cout << "Hello" << endl;
return 0;
}
void cb(int(*f)())
{
f();
}
int main()
{
int(*f)() = &func;
cb(f);
return 0;
}
I would like to do something like:
for(int i=0;i<10;i++)
addresses[i] = & function(){ callSomeFunction(i) };
Basically, having an array of addresses of functions with behaviours related to a list of numbers.
If it's possible with external classes like Boost.Lambda is ok.
Edit: after some discussion I've come to conclusion that I wasn't explicit enough. Please read Creating function pointers to functions created at runtime
What I really really want to do in the end is:
class X
{
void action();
}
X* objects;
for(int i=0;i<0xFFFF;i++)
addresses[i] = & function(){ objects[i]->action() };
void someFunctionUnknownAtCompileTime()
{
}
void anotherFunctionUnknownAtCompileTime()
{
}
patch someFunctionUnknownAtCompileTime() with assembly to jump to function at addresses[0]
patch anotherFunctionUnknownAtCompileTime() with assembly to jump to function at addresses[1]
sth, I don't think your method will work because of them not being real functions but my bad in not explaining exactly what I want to do.
If I understand you correctly, you're trying to fill a buffer with machine code generated at runtime and get a function pointer to that code so that you can call it.
It is possible, but challenging. You can use reinterpret_cast<> to turn a data pointer into a function pointer, but you'll need to make sure that the memory you allocated for your buffer is flagged as executable by the operating system. That will involve a system call (VirtualAlloc() with an executable page protection on Windows, mmap() with PROT_EXEC on Unix) rather than a "plain vanilla" malloc/new call.
Assuming you've got an executable block of memory, you'll have to make sure that your machine code respects the calling convention indicated by the function pointer you create. That means pushing/popping the appropriate registers at the beginning of the function, etc.
But, once you've done that, you should be able to use your function pointer just like any other function.
It might be worth looking at an open source JVM (or Mono) to see how they do it. This is the essence of JIT compilation.
Here is an example I just hacked together:
#include <stdio.h>
int func1( int op )
{
printf( "func1 %d\n", op );
return 0;
}
int func2( int op )
{
printf( "func2 %d\n", op );
return 0;
}
typedef int (*fp)(int);
int main( int argc, char* argv[] )
{
fp funcs[2] = { func1, func2 };
int i;
for ( i = 0; i < 2; i++ )
{
(*funcs[i])(i);
}
}
The easiest way should be to create a bunch of boost::function objects:
#include <boost/bind.hpp>
#include <boost/function.hpp>
// ...
std::vector< boost::function<void ()> > functors;
for (int i=0; i<10; i++)
functors.push_back(boost::bind(callSomeFunction, i));
// call one of them:
functors[3]();
Note that the elements of the vector are not "real functions" but objects with an overloaded operator(). Usually this shouldn't be a disadvantage and actually be easier to handle than real function pointers.
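With a C++11 compiler the same idea can be written without Boost, using std::function and lambdas. This is a sketch under that assumption; callSomeFunction is a hypothetical stand-in that just records its argument, and makeFunctors is a name invented for the example.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for the question's callSomeFunction; records its argument.
int lastArgument = -1;
void callSomeFunction(int i) { lastArgument = i; }

// Builds `count` callables, each remembering its own index.
std::vector<std::function<void()>> makeFunctors(int count) {
    assert(count >= 0);
    std::vector<std::function<void()>> functors;
    for (int i = 0; i < count; ++i) {
        functors.push_back([i] { callSomeFunction(i); });  // capture i by value
    }
    return functors;
}
```

As with boost::function, the elements are objects rather than raw code addresses, so they cannot be used as jump targets for patched machine code.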
You can do that simply by defining those functions by some arbitrary names in the global scope beforehand.
This is basically what is said above but modifying your code would look something like this:
std::vector<int (*) (int)> addresses(10); // sized up front so addresses[i] is valid
for(int i=0;i<10;i++) {
addresses[i] = &myFunction;
}
I'm not horribly clear by what you mean when you say functions created at run time... I don't think you can create a function at run time, but you can assign what function pointers are put into your array/vector at run time. Keep in mind using this method all of your functions need to have the same signature (same return type and parameters).
You can't invoke a member function by itself without the this pointer. All instances of a class share the function, which is stored in one location in memory. When you call p->Function(), the value of p is passed along (in a register or on the stack, depending on the calling convention) and that value is used as the base address for locating the member variables.
So this means you have to store the function pointer and the pointer to the object if you want to invoke a function on it. The general form for this would be something like this:
class MyClass {
public:
void DoStuff();
};
//on the left hand side is a declaration of a pointer to a member function of MyClass taking no parameters and returning void.
//on the right hand side we initialize the pointer to DoStuff
void (MyClass::*pVoid)() = &MyClass::DoStuff;
MyClass* pMyClass = new MyClass();
//Here we have a pointer to MyClass and we call the function pointed to by pVoid.
(pMyClass->*pVoid)();
As I understand the question, you are trying to create functions at runtime (just as we can do in Ruby). If that is the intention, I'm afraid that it is not possible in compiled languages like C++.
Note: If my understanding of question is not correct, please do not downvote :)