Pointers to virtual member functions. How does it work? - c++

Consider the following C++ code:
class A
{
public:
virtual void f()=0;
};
int main()
{
void (A::*f)()=&A::f;
}
If I had to guess, I'd say that &A::f in this context means "the address of A's implementation of f()", since there is no explicit separation between pointers to regular member functions and pointers to virtual member functions. And since A doesn't implement f(), that would be a compile error. However, it isn't.
And not only that. The following code:
void (A::*f)()=&A::f;
A *a=new B; // B is a subclass of A, which implements f()
(a->*f)();
will actually call B::f.
How does it happen?

It works because the Standard says that's how it should happen. I did some tests with GCC, and it turns out that for virtual functions, GCC stores 1 plus the virtual table offset of the function in question, in bytes.
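#include <cstddef>
#include <iostream>
struct A { virtual void f() { } virtual void g() { } };
int main() {
// Reading pd after writing pf is not sanctioned by the Standard;
// it is done here only to inspect GCC's representation.
union insp {
void (A::*pf)();
std::ptrdiff_t pd[2];
};
insp p[] = { { &A::f }, { &A::g } };
std::cout << p[0].pd[0] << " "
<< p[1].pd[0] << std::endl;
}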
That program outputs 1 5 on a 32-bit target (1 9 on a 64-bit one): each value is 1 plus the byte offset of that function's virtual table entry. This follows the Itanium C++ ABI, which specifies that representation.

Here is way too much information about member function pointers. There's some stuff about virtual functions under "The Well-Behaved Compilers", although IIRC when I read the article I was skimming that part, since the article is actually about implementing delegates in C++.
http://www.codeproject.com/KB/cpp/FastDelegate.aspx
The short answer is that it depends on the compiler, but one possibility is that the member function pointer is implemented as a struct containing a pointer to a "thunk" function which makes the virtual call.

I'm not entirely certain, but I think it's just regular polymorphic behavior. I think that &A::f actually means the address of the function pointer in the class's vtable, and that's why you aren't getting a compiler error. The space in the vtable is still allocated, and that is the location you are actually getting back.
This makes sense because derived classes essentially overwrite these values with pointers to their functions. This is why (a->*f)() works in your second example - f is referencing the vtable that is implemented in the derived class.

Related

What is the reason non-member virtual functions are not supported in C++

I am interested to know what the reason is for there to be no non-member virtual functions in C++. Especially considering the fact that it simply increases code layers when you want to achieve it, since you can define a virtual member-function and then call it from a non-member function.
EDIT:
Just for reference, you can do that:
struct Base
{
virtual void say() const
{
std::cout << "Base\n";
}
};
struct Derived : public Base
{
void say() const final
{
std::cout << "Derived\n";
}
};
void say(Base* obj)
{
obj->say();
}
say(static_cast<Base*>(new Derived()));
Edit 2:
And there are indeed cases where you want virtual polymorphism. The overload-based version below does not behave the same way: it prints Base, whereas the code above prints Derived for the same call. I believe this summarizes the crux of the problem.
void say(Base* obj)
{
std::cout << "Base\n";
}
void say(Derived* obj)
{
std::cout << "Derived\n";
}
say(static_cast<Base*>(new Derived()));
A non-member function does not require an implicit this pointer in order to invoke it.
But virtual functions require a this pointer (i.e. an object instance) in order for polymorphism to work.
And there's the contradiction: so it's not possible to have a polymorphic non-member function.
Having virtual non-member functions is technically challenging to compile.
Virtual functions are usually implemented with a vtable. Classes with virtual member functions store a pointer to that vtable, and that vtable has all the requisite functions added to it. When a virtual function is invoked, the exact function to invoke is looked up in the vtable.
Consider this: I'm writing a library in C++. For user convenience, and to reduce compile times, the library is distributed as:
the header files for the library
binary files that provide the implementation of the functions defined in the headers.
So what's the problem?
These binary files will also contain the vtables for any classes with virtual functions within the header files. In order to add virtual functions to a base class, the compiler will have to read and process the binary representation of the library files, modifying the vtables to add the necessary functions.
This would greatly increase the complexity of linking (making the compiler partially responsible for doing so), and it would bloat executable size (any dynamically loaded libraries would have to be statically linked, since the compiler might not have permission to modify their contents).
Are there technical work-arounds?
Yes, although it would require the implementation of the class to be present in the header file, like a template. Alternatively, the new module system could provide a way to implement this feature by forgoing the need to have separate implementation files.
Even then, it would require a lot of work on the part of compiler developers, and there has not been much demand for this feature. The main benefit this feature provides is being able to quickly and easily overload functions for specific derived classes, which itself is considered something of a code smell (since you'd come close to breaking encapsulation - a library writer writing a function that returns a pointer to a base class may want to change which derived class it returns, for example).
When you want to use polymorphism in free functions you basically have two options. Either you overload the function or you call virtual functions:
#include <iostream>
struct base {
virtual void func() = 0;
};
struct foo : base { void func() { std::cout << "foo\n"; } };
struct bar : base { void func() { std::cout << "bar\n"; } };
void f(foo& f) { f.func(); }
void f(bar& f) { f.func(); }
void g(base& b) { b.func(); }
int main() {
foo a;
bar b;
f(a);
f(b);
g(a);
g(b);
}
Considering that the main difference to member functions is the implicit this parameter, g is actually rather close to what I'd call a "virtual free function". However, other than that there are no virtual non-member functions in C++.

How exactly does the C++ runtime use the vptr to choose the right function? Who takes care of this?

I am confused about how the vptr resolves a virtual function call at run time when a class has several virtual functions. Who takes care of that? And who creates the vtable? Is it the compiler?
Consider code something like this:
#include <iostream>
class A {
int x;
public:
virtual void foo() { std::cout << "base::foo()\n"; }
virtual void bar() = 0;
virtual ~A() {}
};
class B : public A {
int y;
public:
virtual void bar() { std::cout << "Derived::bar()"; }
virtual void baz() { std::cout << "Added function"; }
};
int main() {
// A a; // would not compile: A is abstract because bar() is pure virtual
B b;
}
This is going to result in a layout something on this general order:
So, each object contains its own copy of the object's data, which is an amalgamation of all the data defined in that class and all its base classes. When it contains at least one virtual function, it has a vtable pointer. That points to a table somewhere in the generated code. That table, in turn, contains pointers to the virtual functions for the class. The key to this working is that (for virtual functions that are common between them) the base class and derived class store those pointers at the same offsets in the vtable. When you invoke a virtual function, the compiler generates code to "chase" the vtable pointer, then invoke the function at the right offset in the vtable.
Although it's not shown directly here, when each non-static member function (virtual or otherwise) is called, the address of the object is typically passed as a hidden parameter that's named this inside the function. References to members can use this implicitly (so an assignment like somemember = a; is really equivalent to this->somemember = a;).
Note: this is depicting how things are typically done--at least in theory, an implementation is free to do things entirely differently, as long as what it does meets the requirements in the standard. That said, every implementation of which I'm aware works fairly similarly.

Calling a virtual function from within an inherited function?

I've tried to map it out in my head, but honestly I have no idea what's really going on here.
What exactly is happening when I add and remove the virtual keyword from the below example?
#include <iostream>
#include <string>
class A {
public:
A() { me = "From A"; }
void caller() { func(); }
virtual void func() { std::cout << me << std::endl; } // THIS LINE!
private:
std::string me;
};
class B : public A {
public:
B() { me = "From B"; }
void func() { std::cout << me << std::endl; }
private:
std::string me;
};
int main() {
A a;
a.caller();
B b;
b.caller();
return 0;
}
With the virtual keyword, it prints "From A", then "From B".
Without the virtual keyword, it prints "From A", then "From A".
So far, this is the only time I've found a use for virtual functions without pointers being involved. I thought that if the virtual keyword was removed, the compiler would do the standard thing which is to overload the inherited function and end up printing "From A", and "From B" anyway.
I think this is deeper than just the VTable, and that it's more about the way it behaves in particular circumstances. Does B even have a VTable?
The call
func()
is equivalent to
this->func()
so there is a pointer involved.
Still, there's no need to involve pointers to understand the behavior.
Even a direct call of e.g. b.func() has to work as if it's a virtual call, when func is virtual in the statically known type. The compiler can optimize it based on knowing the most derived type of b. But that's a different kind of consideration (optimizations can do just about anything).
Apart from the issue of virtual dispatch, what may cause extra confusion is that you have two members named me, one declared in A and another declared in B. These are two distinct objects.
An object of type B has two data members of type std::string: one of its own, and one incorporated into the subobject of type A. The latter one, though, is not immediately available in the member functions of B because its name is hidden by the new me introduced in this class (though you may use a qualified name, A::me, to refer to it, provided it is accessible).
Therefore, even though the bodies of A::func and B::func seem identical, the identifier me used in both of them refers to different members.
In your example, you won't see the difference:
With the virtual function, the compiler will generate a call via the VTable, and at runtime each object will call the right function for its real class.
With the non-virtual function, the compiler determines at compile time which function to call, based on the object's declared class.
Now try the following, to see the virtual function in action:
A *pa = &b; // pointer to an A: valid because b is a B, which is also an A.
pa->caller(); // guess what will be called if func is virtual, and if it is not.
You don't need pointers to experiment with virtual functions. You can observe the same effect with references as well:
A& ra = b; // create a reference to an A, but could as well be a parameter passed by reference.
ra.caller();
Virtual functions are useful for polymorphism. The idea is that you work with a general object of a class, but you don't know at compile time whether, at runtime, the object will really be of this class or a more specialized object (of a class inheriting from it).

Static function overloading?

I'll start by saying I understand that only nonstatic member functions can be virtual, but this is what I want:
A base class defining an interface: so I can use base class pointers to access functions.
For memory management purposes (this is an embedded system with limited ram) I want the overriding functions to be statically allocated. I accept the consequence that with a static function, there will be constraints on how I can manipulate data in the function.
My current thinking is that I may keep a light overloading function by making it a wrapper for a function that actually is static.
Please forbear telling me I need to re-think my design. This is why I am asking the question. If you'd like to tell me I'm better off using C and using callbacks, please direct me to some reading material that explains the pitfalls of using an object-oriented approach. Is there an object-oriented design pattern which meets the requirements I have enumerated?
Is there an object-oriented design pattern which meets the requirements I have enumerated?
Yes, plain old virtual functions. Your desire is "the overriding functions to be statically allocated." Virtual functions are statically allocated. That is, the code which implements the functions exists once, and only once, and is fixed at compile/link time. Depending upon your linker command, they are as likely to be stored in flash as any other function.
#include <iostream>
class I {
public:
virtual void doit() = 0;
virtual void undoit() = 0;
};
class A : public I {
public:
virtual void doit () {
// The code for this function is created statically and stored in the code segment
std::cout << "hello, ";
}
virtual void undoit () {
// ditto for this one
std::cout << "HELLO, ";
}
};
class B : public I {
public:
int i;
virtual void doit() {
// ditto for this one
std::cout << "world\n";
}
virtual void undoit() {
// yes, you got it.
std::cout << "WORLD\n";
}
};
int main () {
B b; // So, what is stored inside b?
// There are sizeof(int) bytes for "i",
// There are probably sizeof(void*) bytes for the vtable pointer.
// Note that the vtable pointer doesn't change size, regardless of how
// many virtual methods there are.
// sizeof(b) is probably 8 bytes or so.
}
For memory management purposes (this is an embedded system with
limited ram) I want the overriding functions to be statically
allocated.
All functions in C++ are always statically allocated. The only exception is if you manually download and utilize a JIT.
Static member functions are just plain functions (like non-member functions), that are inside the namespace of the class. That means you can treat them like plain functions, and the following solution should do:
class Interface
{
public:
void (*function) ();
};
class Implementation: public Interface
{
public:
Implementation()
{
function = impl_function;
}
private:
static void impl_function()
{
// do something
}
};
then
Implementation a;
Interface* b = &a;
b->function(); // will do something...
The problem with this approach is that you would be doing by hand almost exactly what the compiler does for you when you use virtual member functions, and the compiler does it better: virtual functions need less code, are less error-prone, and the pointers to the implementation functions are shared between objects rather than stored in each one. The main difference is that with virtual, your function would receive the (invisible) this parameter when called, and you would be able to access the member variables.
Thus, I would recommend to you to simply not do this, and use ordinary virtual methods.
The overhead with virtual functions is two-fold: besides the code for the actual implementations (which resides in the code segment, just like any other function you write), there is the virtual function table, and there are pointers to that table. The virtual function table is present once for each derived class, and its size depends on the number of virtual functions. Every object must carry a pointer to its virtual function table.
My point is, the per-object overhead of virtual functions is the same no matter how many virtual functions you have, or how much code they contain. So the way you arrange your virtual functions should have little impact on your memory consumption, once you have decided that you want some degree of polymorphism.

C++: prototype of a virtual pointer

I am not sure if this is documented anywhere.
We all know that in the case of virtual functions, each object of the class holds a vptr which points to an array of function pointers called the virtual table.
I want to know what is the prototype of the vptr.
For ex, if a class is declared as follows,
class A
{
int a;
public: A(){}
virtual void display();
virtual void setValue(int x);
};
Now we have two function pointers in the vtable of class A. How can a single vptr accommodate two functions with different prototypes?
Please let me know if my understanding is wrong.
Thx!
Rahul.
The vptr is an implementation detail, and as such, it does not have a prototype.
The virtual table (in implementations that use such a thing) is a product of "compiler magic." It doesn't need to have a specific prototype because no C++ code ever uses it directly. Instead, the compiler custom-generates one to conform to each class that needs one. The compiler also generates the code to access each element, so it can guarantee that each one is accessed in a type-safe manner.
For example, the compiler knows that the slot for the A::setValue method holds a pointer to a function that matches the setValue signature because the compiler is the one that put it there in the first place. Furthermore, the only code that directly accesses that slot is machine code that the compiler generated, and prior to generating such code, the compiler already confirmed that the original C++ code was calling the setValue function. Thus, there is no worry that the setValue slot could ever hold anything other than a setValue-conformant function pointer. Nor is there any concern that some other slot might be accessed instead; if that happened, it would be a compiler bug, never something that would happen as a result of ordinary user code.
The elements of the table are never treated as a group, so there's no requirement that they all have the same type. At best, they all have a type of "general pointer or offset suitable for the CPU to jump to." Since it's not really C++ at that point, the "type" doesn't have to fit any particular C++ type.
As Oli Charlesworth noted, virtual pointers are an implementation detail, so this question does not really make sense in terms of C++. That said, the following manual implementation of some of the functionality of virtual functions might be helpful for your understanding:
struct vtable {
void (*display)(void*);
void (*setValue)(void*, int);
};
void A_display(void *this_) { /*Cast this_ to A* and do A stuff*/ }
void A_setValue(void *this_, int x) { /*Cast this_ to A* and do A stuff*/ }
vtable A_vtable = {A_display, A_setValue};
struct A {
vtable *vptr = &A_vtable;
int a;
public: A(){}
};
void B_display(void *this_) { /*Cast this_ to B* and do B stuff*/ }
void B_setValue(void *this_, int x) { /*Cast this_ to B* and do B stuff*/ }
vtable B_vtable = {B_display, B_setValue};
struct B {
vtable *vptr = &B_vtable;
int a;
public: B(){}
};
void display(void *obj) {
((*static_cast<vtable**>(obj))->display)(obj);
}
void setValue(void *obj, int x) {
((*static_cast<vtable**>(obj))->setValue)(obj, x);
}
Of course, this only gives a small subset of the capabilities of virtual functions, but it should be fairly straightforward to see that vptrs point to collections of pointers to functions, with fixed types.