C++ Pointer can call Member Function without Object - c++

Some people may call it a feature, but I would call it another bug of C++: we can call a member function through a pointer that does not point to any object. See the following example:
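#include <iostream>
using namespace std;
class A{
public:
virtual void f1(){cout<<"f1\n";}
void f2(){cout<<"f2\n";}
};
int main(){
A *p=0; // null pointer: no object is ever created
p->f2(); // undefined behavior, although it often appears to work
return 0;
}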
Output:
f2
We have checked this with different compilers and platforms, and the result is the same. However, if we call a virtual function through a pointer without an object, a run-time error occurs. The reason seems obvious: for a virtual function the object is looked up, it is not found, and so an error occurs.

This is not a bug. You triggered undefined behavior. You may get anything, including the result you expected.
Dereferencing a NULL pointer is undefined behavior.
BTW, there is no such thing as a "bug of C++". Bugs may occur in C++ compilers, not in the language itself.

As pointed out, this is undefined behaviour, so anything goes.
To answer the question in terms of the implementation, why do you see this behaviour?
The non-virtual call is implemented as just an ordinary function call, with the this pointer (value null) passed in as a parameter. The parameter is not dereferenced (as no member variables are used), so the call succeeds.
The virtual call requires a lookup in the vtable to get the address of the actual function to call. The vtable address is stored in a pointer in the data of the object itself. Thus, to read it, a dereference of the this pointer is required, which results in a segmentation fault.

When you create a class by
class A{
public:
virtual void f1(){cout<<"f1\n";}
void f2(){cout<<"f2\n";};
};
The compiler puts the code of the member functions in the text (code) segment.
When you write p->MemberFunction(), the compiler finds MemberFunction using the static type of p, which is class A. For a non-virtual function, typical implementations do not read any memory through p to do this.
Since the function's code exists in the text segment, it is called. If the function had referred to member variables, accessing them would probably have caused a segmentation fault, because there is no object. Since that is not the case here, the function appears to execute properly.
NOTE: It all depends on how a compiler implements member function access. A compiler may choose to check whether the object pointer is null before calling the member function, but the pointer may also hold a garbage value other than 0, which a compiler cannot check, so compilers generally omit this check.

You can achieve a lot with undefined behavior. You can even call a function which only takes 1 argument and receive the second one like this:
#include <cstdint>
#include <iostream>
void Func(int x)
{
uintptr_t ptr = reinterpret_cast<uintptr_t>(&x) + sizeof(x);
uintptr_t* sPtr = (uintptr_t*)ptr;
const char* secondArgument = (const char*)*sPtr;
std::cout << secondArgument << std::endl;
}
int main()
{
typedef void(*PROCADDR)(int, const char*);
PROCADDR ext_addr = reinterpret_cast<PROCADDR>(&Func);
//call the function
ext_addr(10, "arg");
return 0;
}
Compile and run this under Windows and you may get "arg" as the result for the second argument, depending on the compiler, the target and the calling convention. This is not a fault within C++, it is just plain stupid on my part :)

This will work on most compilers. When you make a call to a method (non virtual), the compiler translates:
obj.foo();
to something like:
foo(&obj);
Where &obj becomes the this pointer for the foo method. When you use a pointer:
Obj *pObj = NULL;
pObj->foo();
For the compiler it is nothing but:
foo(pObj);
i.e.:
foo(NULL);
Calling an ordinary function with a null pointer is not a crime: the null pointer (i.e. a pointer having the null value) is simply passed as an argument. It is up to the target function to check whether null was passed to it. It is like calling:
strlen(NULL);
Which will compile, and also run, if the function handles it (the standard strlen does not; the version below is hypothetical):
size_t strlen(const char* ptr) {
if (ptr==NULL) return 0;
... // rest of code if `ptr` is not null
}
Thus, this usually works in practice, even though the standard still classes it as undefined behavior:
((A*)NULL)->f2();
As long as f2 is non-virtual and doesn't read or write anything through this, including any virtual function calls. Access to static data and static functions will still be okay.
However, if the method is virtual, the function call is not as simple as it appears. The compiler emits some additional code to perform late binding of the given function. The late binding is based entirely on what the this pointer points to. It is compiler dependent, but a call like:
obj->virtual_fun();
Will involve looking up the current type of obj through a virtual function table lookup. Therefore, obj must not be null.

Related

Why program didn't crash when lifetime of temporary has ended? [duplicate]

A question came up here on SO asking "Why is this working" when a pointer became dangling. The answers were that it's UB, which means it may work or not.
I learned in a tutorial that:
#include <iostream>
struct Foo
{
int member;
void function() { std::cout << "hello";}
};
int main()
{
Foo* fooObj = nullptr;
fooObj->member = 5; // This will cause a read access violation but...
fooObj->function(); // Because this doesn't refer to any memory specific to
// the Foo object, and doesn't touch any of its members
// It will work.
}
Would this be the equivalent of:
static void function(Foo* fooObj) // Foo* essentially being the "this" pointer
{
std::cout << "Hello";
// Foo pointer, even though dangling or null, isn't touched. And so should
// run fine.
}
Am I wrong about this? Is it UB even though as I explained just calling a function and not accessing the invalid Foo pointer?
You're reasoning about what happens in practice. Undefined behavior is allowed to do the thing you expect... but it is not guaranteed.
For the non-static case, this is straightforward to prove using the rule found in [class.mfct.non-static]:
If a non-static member function of a class X is called for an object that is not of type X, or of a type derived from X, the behavior is undefined.
Note that there's no consideration about whether the non-static member function accesses *this. The object is simply required to have the correct dynamic type, and *(Foo*)nullptr certainly does not.
In particular, even on platforms which use the implementation you describe, the call
fooObj->func();
gets converted to
__assume(fooObj); Foo_func(fooObj);
and is optimization-unstable.
Here's an example which will work contrary to your expectations:
int main()
{
Foo* fooObj = nullptr;
fooObj->func();
if (fooObj) {
fooObj->member = 5; // This will cause a read access violation!
}
}
On real systems, this is likely to end up with an access violation on the commented line, because the compiler used the fact that fooObj can't be null in fooObj->func() to eliminate the if test following it.
Don't do things that are UB even if you think you know what your platform does. Optimization instability is real.
Also, the Standard is even more restrictive than you might think. This will also cause UB:
struct Foo
{
int member;
void func() { std::cout << "hello";}
static void s_func() { std::cout << "greetings";}
};
int main()
{
Foo* fooObj = nullptr;
fooObj->s_func(); // well-formed call to static member,
// but unlike Foo::s_func(), it requires *fooObj to be a valid object of type Foo
}
The relevant portions of the Standard are found in [expr.ref]:
The expression E1->E2 is converted to the equivalent form (*(E1)).E2
and the accompanying footnote
If the class member access expression is evaluated, the subexpression evaluation happens even if the result is unnecessary to determine the value of the entire postfix expression, for example if the id-expression denotes a static member.
This means that the code in question definitely evaluates (*fooObj), attempting to create a reference to a non-existent object. There have been several proposals to allow this and to forbid only the lvalue-to-rvalue conversion on such a reference, but those have been rejected thus far; even forming the reference is illegal in all versions of the Standard to date.
In practice this is usually how major compilers implement member functions, yes. This means that your test program would probably appear to run "just fine".
Having said that, dereferencing a null pointer is undefined behavior, which means that all bets are off: the whole program and its output are meaningless, and anything could happen.
You can never rely on this behavior. Optimizers in particular could break all of this code, because they're allowed to assume that fooObj is never nullptr.
The compiler isn't obliged by the standard to implement a member function by passing it a pointer to the class instance. Yes, there is the pseudo-pointer "this", but how it reaches the function is not specified; it is only guaranteed to be "understood" inside the function.
A null pointer doesn't point to any existing object, and the -> operator accesses a member of that object. From the standard's point of view this is nonsense, and the result of such an operation is undefined (and potentially catastrophic).
If function() were virtual, the call would be even more likely to fail, because the address of the function would be unavailable (the vtable pointer might be implemented as part of the object and doesn't exist if the object doesn't).
If the member function (method) behaves like that and is meant to be called like that, it should be a static member function (method). A static method doesn't access non-static fields and doesn't call non-static methods of the class. If it is static, the call could look like this as well:
Foo::function();

about void pointer, classes and casting

I have had C++ experience for about a year or two, but I code the same way I code in Java (simple OOP stuff). Now I have this sample code which I don't understand. (It's quite big, so I tried to make it shorter; I hope it's clear enough for you guys.)
//in .h file
typedef void*(*AnimalCreation)();
//in .cpp
void foo(void* p)
{
AnimalCreation ac = (AnimalCreation)p;
Animal* current_animal = reinterpret_cast<Animal*>(ac());
current_animal->init();
}
//somewhere in another class foo is called
Dog* dog = new Dog(); //Dog is a subclass of Animal
foo((void*)&dog);
What is the purpose of AnimalCreation?
And what's the difference between that and
typedef void(*AnimalCreation)(); //without asterisk after void
What's happening inside foo?
If the expected argument foo receives is always a subclass of Animal why does the programmer need to implement it like in the above and not just foo(Animal*)?
Thanks.
typedef void(*AnimalCreation)();
this declares "AnimalCreation" to be used as type-alias for a pointer to a function which doesn't return any value, while this
typedef void*(*AnimalCreation)();
declares it to be used as a type-alias for a pointer to a function which returns a void pointer, i.e. the address of something whose type you don't know.
Inside foo you're receiving such a "generic address" and you're C-casting it (potentially unsafe, and not checked at runtime) to a function pointer. This is at your own risk: you don't know what that received address is pointing to. After that you're calling the function and receiving another void pointer, which you reinterpret (dangerous) as a pointer to an Animal object. And then you use it.
A function pointer cannot be a subclass of anything, so I don't think the argument in that code is an Animal subclass... rather, the subclass of Animal is the object returned by that function. Assuming that is also a polymorphic class, you will then be able to call its methods with the usual virtual dispatch rules. If you intend to check the pointer returned by the function call and you're unsure whether it points to a subclass of Animal, you'd rather have the function return an Animal* and use dynamic_cast on it (dynamic_cast cannot be applied to a void*).
As a sidenote: converting between function pointers and void* is a bad practice in C++ since you lose valuable type information.
The typedef line defines AnimalCreation as a function pointer type.
Function foo takes in a void * argument which it casts into an AnimalCreation type (i.e. into the function pointer type). It can then invoke the function via the function pointer. This invocation returns a void * (as per the typedef: the part before the first bracket is the return type, hence void*) which is then cast to an Animal* by reinterpret_cast.
If you removed the asterisk from the typedef, it would still declare a function pointer type, but now the return value would be void instead of void * (i.e. nothing returned, rather than a pointer). You could still invoke the function via the function pointer, but it would not return anything.
All in all, this is a nice little function pointer tutorial.
EDIT : the big picture of what this code seems to be doing - this is one way of implementing a 'Factory Pattern' in C++ - abstracting the creation of an object, and returning a polymorphic base class pointer to a derived class. Casting between void * and function pointers and reinterpret_cast is not the nicest way to achieve this, for alternatives you could look here
First off, this is quite ugly C-style code.
typedef void*(*AnimalCreation)();
To interpret this, follow the general rule of C & C++ declaration reading: if you type the declaration as an expression, you'll get its type.
*AnimalCreation: this means AnimalCreation is a pointer.
(*AnimalCreation)(): this means *AnimalCreation is a function taking no arguments, so AnimalCreation is a pointer to a function taking no arguments.
void *(*AnimalCreation)(): this means (*AnimalCreation)() is a void* (= pointer to void), so AnimalCreation is a pointer to a function which takes no arguments and returns a void*.
If it was just typedef void (*AnimalCreation)();, it would be a pointer to a function taking no arguments and returning no value (i.e. returning void).
Now, foo().
That takes a void* (pointer to anything) and interprets it as AnimalCreation - as a pointer to function taking no arguments and returning a void*. If the argument passed to foo was actually of that type, all is well. If something else is passed in, the program will exhibit Undefined Behaviour, which means anything can happen. It would most likely crash, as it could be trying to interpret data as code, for example.
foo() calls that function passed in, which returns a void*. foo() then interprets that as a pointer to Animal. If that's what the function actually returned, great. If not, Undefined Behaviour again.
Finally, the call you're showing will force the Undefined Behaviour to happen, because it's passing in the address of a pointer to an object. But, as stated above, foo() will interpret that as the address of a function, and try to call that function. Hilarity ensues.
To summarize, such code is bad and its author should feel bad. The only place you'd expect to see such code is interoperability with a C-style external library, and in such case it should be extremely well documented.

Why no segfault on member function call?

I'm looking for clarification on this bit of code. The call to A::hello() works (I expected a segv). The segfault does come through on the access to member x, so it seems like the method resolution alone doesn't actually dereference bla?
I compiled with optimization off, gcc 4.6.3. Why doesn't bla->hello() blow up? Just wonderin' what's going on. Thanks.
#include <cstddef>
#include <iostream>
using namespace std;
class A
{
public:
int x;
A() { cout << "constructing a" << endl; }
void hello()
{
cout << "hello a" << endl;
}
};
int main()
{
A * bla;
bla = NULL;
bla->hello(); // prints "hello a"
bla->x = 5; // segfault
}
Your program exhibits undefined behavior. "Seems to work" is one possible manifestation of undefined behavior.
In particular, bla->hello() call appears to work because hello() doesn't actually use this in any way, so it just happens not to notice that this is not a valid pointer.
You are dereferencing a NULL pointer, i.e. trying to access an object stored at address NULL:
bla = NULL;
bla->hello();
bla->x = 5;
which results in undefined behavior, which means that anything can happen, including the seg fault while assigning 5 to the member x and also including deceptive "works as expected" effect while invoking the hello method.
At least in the typical implementation, when you call a non-virtual member function via a pointer, that pointer is not dereferenced to find the function.
The type of the pointer is used to determine the scope (context) in which to search for the name of the function, but that happens entirely at compile time. What the pointer points at (including "nothing", in the case of a null pointer) is irrelevant to finding the function itself.
After the compiler finds the correct function, it typically translates a call like a->b(c); into something roughly equivalent to: b(a, c); -- a then becomes the value of this inside the function. When the function refers to data in the object, this is dereferenced to find the correct object, then an offset is normally applied to find the correct item in that object.
If the member function never attempts to use any of the object's members, the fact that this is a null pointer doesn't affect anything.
If, on the other hand, the member function does attempt to use a member of the object, it'll attempt to dereference this to do that, and if this is a NULL pointer, that won't work. Likewise, calling a virtual function always uses the vtable pointer, which is in the object, so attempting to call a virtual function via a null pointer can be expected to fail (regardless of whether the code in the virtual function refers to data in the object or not).
As I said to start with, I'm talking about the typical implementation here. From the viewpoint of the standard, you simply have undefined behavior, and that's the end of it. In theory, an implementation doesn't have to work the way I've described. In reality, however, essentially all the reasonably popular implementations of C++ (e.g., MS VC++, g++, Clang, and Intel) all work very similarly in these respects.
As long as a member function doesn't dereference this, it's usually "safe" to call it, but you're then in undefined behavior land.
In theory, this is undefined behavior. In practice, you do not use the this pointer when calling hello(): it does not reference your object at all, and therefore it works and does not generate a memory access violation. When you do bla->x, however, you are trying to access memory through the bla pointer, which is null, and it crashes. Again, even in this case there is no guarantee that it will crash; this is undefined behavior.

error when calling virtual destructor using a function pointer in VC 6.0

I want to see the contents of the vtable of class A, especially the virtual destructor, but I cannot call it through a function pointer.
Here is my code:
#include <stdio.h>
typedef void (*fun)();
class A {
public:
virtual void func() {printf("A::func() is called\n");}
virtual ~A() {printf("A::~A() is called\n");}
};
//enter in the vtable
void *getvtable (void* p, int off){
return (void*)*((unsigned int*)p+off);
}
//off_obj is used for multiple inheritance (so not here), off_vtable is used to specify the position of the function in the vtable
fun getfun (A* obj, unsigned int off_obj,int off_vtable){
void *vptr = getvtable(obj,off_obj);
unsigned char *p = (unsigned char *)vptr;
p += sizeof(void*) * off_vtable;
return (fun)getvtable(p,0);
}
void main() {
A* ptr_a = new A;
fun pfunc = getfun(ptr_a,0,0);
(*pfunc)();
pfunc = getfun(ptr_a,0,1);
(*pfunc)(); //error occurred here, this is supposed to be the virtual destructor, why?
}
Let's suppose for the sake of argument that the vtable in question really is laid out the way you think it is, as a table of ordinary memory addresses, and that when casting those addresses to function pointers, they're callable.
You have at least two problems:
The calling convention for the member functions isn't necessarily the same as for ordinary functions. Microsoft's default calling convention is thiscall, which places a pointer to the object whose method is being called in the ECX register. There's no facility for specifying that manually; the only way to make that happen is by calling a member function in the way member functions are called, which involves syntax like obj.f() or pobj->f(). You can't do that with pointers to functions (not even member-function pointers), unless you write machine code or assembler to get all the low-level details right.
You happen not to hit this problem for func because it doesn't make reference to this (either directly or by implicit reference to other members). The destructor does, though. Destructors are special, and what's actually stored in the vtable is a pointer to a compiler-generated helper function that calls the real destructor and then checks some flags passed as a hidden parameter to determine whether it should free the object's memory. The value that happens to be in ECX doesn't matter for the func call, but it's very important to be right for the ~A call.
Destructors aren't like normal functions. As I mentioned above, the compiler can generate one or more helper functions, and they receive parameters in addition to this. You haven't accounted for that in your code. The compiler generates separate helpers for array and non-array destructors, so right now we don't even know which one you found at index 1 of the vtable. But since you didn't pass it a valid flag parameter, and there's no way to pass it the this value, it doesn't really matter what you find in the vtable anyway.
You can attempt to solve the first problem by specifying a different calling convention, like stdcall. That puts the this parameter back on the stack with the rest of the parameters, and that allows you to pass it when you call the function pointer. For func, fun would need to have a declaration like this:
typedef void (__stdcall * fun)(A*);
Invoke pfunc like this:
pfunc(ptr_a);
To solve the second problem, you'll need to determine the actual order of the vtable functions so you know to find the right destructor helper. And to call it, you'd need a different function-pointer declaration, too. Destructors don't technically have a return type, but void works well enough. You could use something like this:
typedef void (__stdcall * destr)(A*, unsigned flags);
For most of this answer, I've used an article by Igorsk about recognizing certain patterns in a program for the purpose of decompiling it back into C++. Part 2 covers classes.
You don't call the destructor yourself. You use a delete expression, and it figures out which destructor to run. Calling a destructor directly through a vtable entry like this is undefined behavior, in the same sense that dereferencing NULL is, i.e. it blows up on every platform I've seen.