C++ object model: calling virtual function by vptr leads to crash

I've got this code snippet. It tries to call a virtual function through an object's vptr (which points to the virtual function table) by casting the object pointer to a pointer to the vtable, like this:
#include<iostream>
using namespace std;
struct C {
virtual int f() {
return 7;
}
};
typedef int (*pf)();
int main() {
C c1;
pf *pvtable = (pf *) &c1;
cout << (*pvtable[0])() << endl;
return 0;
}
I used clang++ 14 to compile and link. When I run it, the program returns 139 and no cout line is shown, so it seems to have crashed.
Why it doesn't work and how to fix it?

Why it doesn't work
You are casting a pointer-to-C to a pointer-to-int(*)().
This cast has no meaning in the C++ language and using the resulting pointer is explicitly Undefined Behavior.
and how to fix it?
There is no reliable way.
C++ does not promise the existence of a vtable pointer in any program, and if there is one C++ does not offer any method to access it.

Related

static_cast seems to work when it can't possibly work

In the following code I use static_cast<B*> on a void*, which points to an A object.
A and B are not related in any way. I understand the compiler cannot raise an error against this. But what I don't understand is, how come this actually seems to work when run...? I would expect a segfault or an error of some kind.
#include <iostream>
using namespace std;
class A {
public:
void f() const {
cout << "f" << endl;
}
};
class B {
public:
void q() {
cout << "q" << endl;
}
};
int main(int argc, char** argv) {
A a;
void* p = &a;
static_cast<B*>(p)->q(); // Prints "q"!
return 0;
}
What is the mechanism behind this?
The code causes undefined behaviour (because it dereferences a B * which does not point to a B object), which means anything can happen. You should not expect any particular subset of consequences.
To find out what your compiler did, you could inspect the assembly. My guess would be that the compiler generated assembly which would be correct if there were a B object there: call the function B::q() with implicit argument p.
how come this actually seems to work when run...?
Because the behaviour of the program is undefined.
I would expect a segfault or an error of some kind.
When behaviour is undefined, there is no guarantee that there would be a segfault or an error of some kind. It would generally be unreasonable to expect such. When behaviour is undefined, there are no guarantees whatsoever.

Import C++ member function at run-time with Gcc

Problem
I am currently working on a plugin library where one should be able to import not only C-linkage symbols, but everything that is exported.
It works so far; the problem is that gcc messes up member-function calls.
If I export the following:
static void member_function(Class* c)
{ c->method(); }
it works fine and I can access the class members. But if I do the following:
void (Class ::*p)() = import("Class::method");
(x.*p)();
I get the right pointer and am able to call the function, and the passed arguments arrive correctly, but the this pointer points into nirvana. I think gcc is taking it from the wrong position on the stack or something like that.
It works just fine with MSVC.
I am using mingw-w64 5.1.
Does anyone have an idea what the error could be?
Simple example:
plugin.cpp
#include <iostream>
namespace space {
class __declspec(dllexport) SomeExportThingy
{
int i = 42;
public:
virtual void __declspec(dllexport) Method(int* pi) const
{
using namespace std;
cout << "Calling Method" << endl;
cout << pi << endl;
cout << *pi << endl;
cout << this << endl;
cout << this->i << endl;
}
};
}
loader.cpp
#include <windows.h>
#include <iostream>
using namespace std;
namespace space {
class SomeExportThingy
{
///dummy to have some data in the address
int dummy[20];
};
}
int main()
{
auto h = LoadLibrary("plugin.dll");
auto p = GetProcAddress(h, "_ZNK5space16SomeExportThingy6MethodEPi");
typedef void (space::SomeExportThingy::*mptr)(int*) const;
///used because posix passed void*
auto fp = *reinterpret_cast<mptr*>(&p);
space::SomeExportThingy st;
int value = 22;
cout << "ValueLoc: " << &value << endl;
cout << "StLoc: " << &st << endl;
(st.*fp)(&value);
}
Results
What happens now is that the function is called and the pointer pi is passed correctly. However, the this pointer is completely wrong.
Again: it works with MSVC, which gets the this pointer right, but gcc gets it wrong.
I have no idea why this happens, and removing virtual from the method doesn't change it either.
Maybe someone has an idea what the ABI is doing here.
Here are the pointers I am getting:
0x00400000 == GetModuleHandleA(NULL)
0x61840000 == GetModuleHandleA("plugin.dll")
0x0029fcc4 == &st
0x00ddcd60 == this
I wasn't able to find any relation between the values
This is not going to work with GCC:
typedef void (space::SomeExportThingy::*mptr)(int*) const;
///used because posix passed void*
auto fp = *reinterpret_cast<mptr*>(&p);
The representation of a pointer-to-member is twice the size of a normal function pointer (or a void*) so you are reading two words from a memory location that only contains one word. The second word (which tells the compiler how to adjust the this pointer for the call) is garbage, it is just whatever happens to be after p on the stack.
See https://gcc.gnu.org/onlinedocs/gcc/Bound-member-functions.html:
In C++, pointer to member functions (PMFs) are implemented using a wide pointer of sorts to handle all the possible call mechanisms; the PMF needs to store information about how to adjust the ‘this’ pointer,
p is a void* so it's a memory location on the stack that occupies sizeof(void*) bytes.
&p is a pointer to that memory location.
reinterpret_cast<mptr*>(&p) is a pointer to 2*sizeof(void*) bytes at the same address.
*reinterpret_cast<mptr*>(&p) reads 2*sizeof(void*) bytes from a memory location that is only sizeof(void*) bytes in size.
Bad things happen.
For Linux, the functions for dynamic loading are dlopen(), dlsym(), and dlclose(). Please see the dlopen() man page.
Consider that C++ method names are 'mangled' and that methods take an invisible 'this' parameter passed before all the others. Together, these two issues make directly accessing C++ objects non-trivial when using dynamic linking.
The easiest solution I've found is to use 'C' function(s) that expose access to the C++ object instance.
Secondly, memory management of C++ objects is not trivial when the code that instantiates them lives in an .so library while the code that uses them lives in the user's app.
For the long answer as to why pointers to C++ member functions are difficult to work with, please see: ISO CPP Reference, Pointers to Methods.
/** File: MyClass.h **/
class MyClass;
// Explicitly ensure the loader and destroyer functions are NOT name mangled.
extern "C" MyClass* MyClassLoaderFunc(/* p1, p2, p3, etc. */);
extern "C" void MyClassDestroyerFunc(MyClass* p);
// Function pointer typedefs matching the two functions above
typedef MyClass* (*LoaderFuncPtr)(/* p1, p2, p3, etc. */);
typedef void (*DestroyFuncPtr)(MyClass*);
// Define MyClass
class MyClass
{
public:
/** methods & members for the class go here **/
char dummy[25];
virtual ~MyClass() {}
// virtual, so the call is dispatched through the vtable and the
// application does not have to link against the library for it
virtual int method(const char* data);
};
/** File: MyClass.cpp **/
#include "MyClass.h"
MyClass* MyClassLoaderFunc(/* p1, p2, p3, etc. */) {
MyClass* newInstance = new MyClass(/* p1, p2, p3, etc. */);
/** Do something with newInstance **/
return newInstance;
}
void MyClassDestroyerFunc(MyClass* p) {
// Delete in the same module that called new
delete p;
}
int MyClass::method(const char* data)
{
return data ? 0 : -1; // placeholder body
}
/** File: MyProgram.cpp **/
#include <dlfcn.h>
#include "MyClass.h"
int main()
{
// Dynamically load in the library containing the object's code.
void *myClassLibrary = dlopen("path/to/MyClass.so", RTLD_NOW | RTLD_LOCAL);
if (!myClassLibrary) return 1;
// Dynamically resolve the unmangled 'C' function names that
// provide the bootstrap access to the MyClass*
LoaderFuncPtr loaderPtr = reinterpret_cast<LoaderFuncPtr>(dlsym(myClassLibrary, "MyClassLoaderFunc"));
DestroyFuncPtr destroyerPtr = reinterpret_cast<DestroyFuncPtr>(dlsym(myClassLibrary, "MyClassDestroyerFunc"));
if (!loaderPtr || !destroyerPtr) return 1;
// Use dynamic function to retrieve an instance of MyClass.
MyClass* myClassPtr = loaderPtr(/* p1, p2, p3, etc. */);
// Do something with MyClass
myClassPtr->method("some data");
// Cleanup of object should happen within original .cpp file
destroyerPtr(myClassPtr);
myClassPtr = NULL;
// Release resources
dlclose(myClassLibrary);
return 0;
}
Hope this helps.
I also suggest a factory paradigm as a more robust solution, which I'll leave to the reader to explore.
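As Jonathan pointed out, pointers to members are bigger than normal function pointers.
The simplest solution is to reserve and initialize the extra space. Note that this relies on GCC's pointer-to-member layout and on reading a union member other than the one last written, so it is not portable.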
typedef void (space::SomeExportThingy::*mptr)(int*) const;
union {
mptr fp;
struct {
FARPROC function;
size_t offset;
};
} combFp;
combFp.function = p;
combFp.offset = 0;
auto fp = combFp.fp;

Convert an int pointer to any object pointer, then call its methods: it works

I encountered something strange in C++, and I don't know why it happens.
I have a class like this:
header file
class foo
{
public:
void call_foo();
int get_foo();
int get_foo(int val);
};
here is the cpp file
#include "foo.h"
#include <iostream>
using namespace std;
void foo::call_foo()
{
int i = 0;
int j = 33;
cout << i + j << endl;
cout << "Hello, Foo" << endl;
}
int foo::get_foo(int val)
{
int a = 345;
int rc = val + a;
cout << rc << endl;
return rc;
}
int foo::get_foo()
{
int a = 100;
int d = 23;
int rc = a + d;
cout << rc << endl;
return rc;
}
I use the following code to test it:
int main()
{
int* val = new int[100];
foo* foo_ptr;
foo_ptr = (foo*)val;
foo_ptr->call_foo();
foo_ptr->get_foo();
foo_ptr->get_foo(100);
delete [] val;
return 0;
}
Then I compile and execute it.
clang++ foo.cpp main.cpp
Apple LLVM version 5.0 (clang-500.2.79)
os x 10.9
An int pointer is converted to an object pointer, its methods are called, and it works! So weird!
Does anybody know what is going on?
I wrote an article on my blog about my understanding of why it works (object layout and the virtual function table). Thanks to all of you! It is only available in Chinese. :)
What you are experiencing is called Undefined Behavior.
Undefined Behavior means "anything can happen." Anything here includes the illusion that your code worked, did something you expected it to do, or didn't do something you expected it to do -- like crash.
Code that invokes Undefined Behavior is always faulty code. You cannot rely on Undefined Behavior, if only because you cannot predict what will happen.
Now in this case, the reason why calling the methods might appear to work is that in practice an instance of a class doesn't get its own copy of the code for each of the non-static methods. Instead, there is one copy of the code that is shared between all instances of foo. The address of that code never changes, so when you (incorrectly) dereference a pointer-to-foo and then call one of the methods through that pointer, the method you expected to call is the one that actually gets called. This is all still Undefined Behavior however, and you need to fix your code.
It is undefined behaviour, so your program is incorrect even though it compiles. As far as the language specification is concerned, anything could happen.
It just happens to appear to work because no member function accesses any data that would belong to a particular instance of foo. All they do is allocate local data and access cout.
It doesn't work, it has undefined behaviour.
However, the functions aren't virtual, and the object has no data members, so it's likely that your program won't actually touch the invalid memory, and so will have the same effect as calling the functions on a valid object.
Your class has no members and no virtual functions so when you call a member function through any arbitrary pointer it will 'work' since you have a statically bound function call and don't do any memory access that would be invalid. Bad things would happen if you tried to call a virtual function or access a member variable.

How does the Visual C++ compiler pass the this ptr to the called function?

I'm learning C++ using Eckel's "Thinking in C++". It states the following:
If a class contains virtual methods, a virtual function table is created for that class etc. The workings of the function table are explained roughly. (I know a vtable is not mandatory, but Visual C++ creates one.)
The calling object is passed to the called function as an argument. (This might not be true for Visual C++ (or any compiler).) I'm trying to find out how VC++ passes the calling object to the function.
To test both points in Visual C++, I've created the following class (using Visual Studio 2010, WinXP Home 32bit):
ByteExaminer.h:
#pragma once
class ByteExaminer
{
public:
short b[2];
ByteExaminer(void);
virtual void f() const;
virtual void g() const;
void bruteFG();
};
ByteExaminer.cpp:
#include "StdAfx.h"
#include <iostream>
#include "ByteExaminer.h"
using namespace std;
ByteExaminer::ByteExaminer(void)
{
b[0] = 25;
b[1] = 26;
}
void ByteExaminer::f(void) const
{
cout << "virtual f(); b[0]: " << hex << b[0] << endl;
}
void ByteExaminer::g(void) const
{
cout << "virtual g(); b[1]: " << hex << b[1] << endl;
}
void ByteExaminer::bruteFG(void)
{
int *mem = reinterpret_cast<int*>(this);
void (*fg[])(ByteExaminer*) = { (void (*)(ByteExaminer*))(*((int *)*mem)), (void (*)(ByteExaminer*))(*((int *)(*mem + 4))) };
fg[0](this);
fg[1](this);
}
The navigation through the vtable in bruteFG() works: when I call fg[0](this), f() is called. What does NOT work, however, is the passing of this to the function, meaning that this->b[0] is not printed correctly (garbage comes out instead; I'm actually lucky this doesn't produce a segfault).
So the actual output for
ByteExaminer be;
be.bruteFG();
is:
virtual f(); b[0]: 1307
virtual g(); b[1]: 0
So how should I proceed to get the correct result? How are the this pointers passed to functions in VC++?
(Nota bene: I'm NOT going to program this way seriously, ever. This is "for the lulz"; or for the learning experience. So don't try to convert me to proper C++ianity :))
Member functions in Visual Studio have a special calling convention, __thiscall, where this is passed in a register (ECX on 32-bit x86) instead of on the stack; MSDN has the details. You will have to go down to assembler if you want to call a function pointer which is in a vtable.
Of course, your code exhibits massively undefined behaviour- it's only OK to alias an object using a char or unsigned char pointer, and definitely not an int pointer- even ignoring the whole vtable assumptions thing.
OK using DeadMG's hint I've found a way without using assembler:
1) Remove the ByteExaminer* arg from the functions in the fg[] array
2) Add a function void callfunc(void (*)()); to ByteExaminer:
void ByteExaminer::callfunc(void (*func)())
{
func();
}
... this apparently works because func() is the first thing to be used in callfunc, so ecx is apparently not changed before. But this is a dirty trick (as you can see in the code above, I'm always on the hunt for clean code). I'm still looking for better ways.

Virtual member functions and std::tr1::function: How does this work?

Here is a sample piece of code. Note that B is a subclass of A and both provide a unique print routine. Also notice in main that both bind calls are to &A::print, though in the latter case a reference to B is passed.
#include <iostream>
#include <tr1/functional>
struct A
{
virtual void print()
{
std::cerr << "A" << std::endl;
}
};
struct B : public A
{
virtual void print()
{
std::cerr << "B" << std::endl;
}
};
int main (int argc, char * const argv[])
{
typedef std::tr1::function<void ()> proc_t;
A a;
B b;
proc_t a_print = std::tr1::bind(&A::print, std::tr1::ref(a));
proc_t b_print = std::tr1::bind(&A::print, std::tr1::ref(b));
a_print();
b_print();
return 0;
}
Here is the output I see compiling with GCC 4.2:
A
B
I would consider this correct behavior, but I am at a loss to explain how it is working properly given that the std::tr1::functions were bound to &A::print in both cases. Can someone please enlighten me?
EDIT: Thanks for the answers. I am familiar with inheritance and polymorphic types. What I am interested in is what does &A::print mean? Is it an offset into a vtable, and that vtable changes based on the referred object (in this case, a or b?) From a more nuts-and-bolts perspective, how does this code behave correctly?
This works in the same manner as it would have worked with plain member function pointers. The following produces the same output:
int main ()
{
A a;
B b;
typedef void (A::*fp)();
fp p = &A::print;
(a.*p)(); // prints A
(b.*p)(); // prints B
}
It would have been surprising if boost/tr1/std::function did anything different since they presumably store these pointers to member functions under the hood. Oh, and of course no mention of these pointers is complete without a link to the Fast Delegates article.
Because print() is declared virtual, A is a polymorphic class. By binding to the print function pointer, you will be calling through an A pointer, much in the same way as:
A* ab = &b;
ab->print();
In the ->print call above, you would expect polymorphic behavior. The same is true in your code as well. And this is a Good Thing, if you ask me. At least, most of the time. :)