typecasting with virtual functions - c++

In the code below, pC == pA:
class A
{
};
class B : public A
{
public:
int i;
};
class C : public B
{
public:
char c;
};
int main()
{
C* pC = new C;
A* pA = (A*)pC;
return 0;
}
But when I add a pure virtual function to B and implement it in C, pA != pC:
class A
{
};
class B : public A
{
public:
int i;
virtual void Func() = 0;
};
class C : public B
{
public:
char c;
void Func() {}
};
int main()
{
C* pC = new C;
A* pA = (A*)pC;
return 0;
}
Why is pA not equal to pC in this case? Don't they both still point to the same "C" object in memory?

You're seeing a different value for your pointer because the new virtual function is causing the injection of a vtable pointer into your object. VC++ is putting the vtable pointer at the beginning of the object (which is typical, but purely an internal detail).
Let's add a new field to A so that it's easier to explain.
class A {
public:
int a;
};
// other classes unchanged
Now, in memory, your pA and A look something like this:
pA --> | a | 0x0000004
Once you add B and C into the mix, you end up with this:
pC --> | vtable | 0x0000000
pA --> | a | 0x0000004
| i | 0x0000008
| c | 0x000000C
As you can see, pA points to the data after the vtable pointer, because A doesn't know anything about the vtable or how to use it, or even that it's there. pC does know about the vtable, so it points directly at the vtable pointer at the start of the object, which simplifies its use.
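One way to observe this adjustment without a debugger is sketched below, assuming the modified A with the int a member. The helper names a_offset_in_c and same_object are made up for illustration, and the offset you get is specific to your compiler and target.

```cpp
#include <cassert>
#include <cstddef>

struct A { int a = 0; };
struct B : A {
    int i = 0;
    virtual void Func() = 0;
    virtual ~B() = default;
};
struct C : B {
    char c = 0;
    void Func() override {}
};

// Hypothetical helper: byte distance from the start of the C object
// to its A subobject. The value is implementation-specific.
std::ptrdiff_t a_offset_in_c(const C& obj) {
    const A* pa = &obj;  // implicit upcast; the compiler may adjust the address
    return reinterpret_cast<const char*>(pa) -
           reinterpret_cast<const char*>(&obj);
}

// Hypothetical helper: pc is converted to const A* before the comparison,
// so this is true whenever both pointers refer to the same object.
bool same_object(const A* pa, const C* pc) {
    return pa == pc;
}
```

Note that an expression such as pA == pC in source code converts pC to A* first, so it still yields true; the difference only shows up in the raw addresses, for example in a debugger or after converting both to void*.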

A pointer to an object is convertible to a pointer to base object and vice versa, but the conversion doesn't have to be trivial. It's entirely possible, and often necessary, that the base pointer has a different value than the derived pointer. That's why you have a strong type system and conversions. If all pointers were the same, you wouldn't need either.

Here are my assumptions, based on the question.
1) You have a case where you cast from a C to an A and you get the expected behaviour.
2) You added a virtual function, and that cast no longer works (in that you can no longer pull data from A directly after the cast to A, you get data that makes no sense to you).
If these assumptions are true the hardship you are experiencing is the insertion of the virtual table in B. This means the data in the class is no longer perfectly lined up with the data in the base class (as in the class has added bytes, the virtual table, that are hidden from you). A fun test would be to check sizeof to observe the growth of unknown bytes.
To resolve this you should not cast directly from A to C to harvest data. You should add a getter function that is in A and inherited by B and C.
Given your update in the comments, I think you should read this, it explains virtual tables and the memory layout, and how it is compiler dependent. That link explains, in more detail, what I explained above, but gives examples of the pointers being different values. Really, I had WHY you were asking the question wrong, but it seems the information is still what you wanted. The cast from C to A takes into account the virtual table at this point (note C-8 is 4, which on a 32 bit system would be the size of the address needed for the virtual table, I believe).
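The sizeof test mentioned in this answer could be sketched as follows. Plain and Polymorphic are made-up names, and the exact sizes depend on the compiler and target, so treat the result as an observation rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>

struct Plain {
    int i;
};
struct Polymorphic {
    int i;
    virtual void Func() {}
    virtual ~Polymorphic() = default;
};

// Extra bytes that the implementation adds for the hidden vtable pointer
// (plus any padding). Typical compilers report one pointer or more here.
std::size_t hidden_bytes() {
    return sizeof(Polymorphic) - sizeof(Plain);
}
```

Printing hidden_bytes() on a given toolchain shows how much the object grew once a virtual function was added.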

Related

Call a function of base class casted to derived class

Say that I have the following code, is it safe to use?
Base class:
class B
{
public:
B(bool isDerived = false) : m_isDerived(isDerived) {}
bool isDerived() { return m_isDerived; }
private:
bool m_isDerived;
};
Derived class:
class D : public B
{
public:
D() : B(true) {}
};
Code:
B* b = new B(); // Create new base class
D* unknown = static_cast<D*>(b); // Cast the base class into a derived class
if (unknown->isDerived()) // Is this allowed?
// unknown is a D and can be used as such
else
// unknown is not a D and can not be used
Can I safely call unknown->isDerived() even though unknown is really a B in this case?
We make the assumption that unknown NEVER contains anything other than a B* or D* and that we NEVER do anything with unknown until isDerived() has been checked on it.
Edit:
Given the questions I will try to explain the reason why I'm trying to do this:
So essentially I have a Windows tree control which of course can't be directly connected to the c++ tree structure I'm using to store my data. So I have to reinterpret_cast my data to a DWORD_PTR that is stored together with each node in the tree control so I have a connection between the two. My tree structure consists of either the base type (a normal node) or the derived type (a node with more info in it that should be handled differently). The pointers to these are reinterpret_cast:ed and put in the tree control.
Now, when I'm stepping through the tree control I want to act on the nodes which are of the derived type, so I want to reinterpret_cast the DWORD_PTR into a derived type. But to be entirely correct I should reinterpret_cast it to a base type first (I guess?), and then downcast it to the derived type if it is a derived type. However, I thought I could make it a bit simpler by reinterpret_casting it into a derived type immediately and then checking via the function whether it really is a derived type. If it isn't, I do nothing more with it. In my mind the base class data behind the pointer should be at the same memory location no matter how derived the object is, but I might be wrong, which is why I ask here.
I wanted to make the question clearer by not involving Windows in it, but what I want in reality would be something closer to this:
B* b1 = new B();
B* b2 = new D();
DWORD_PTR ptr1 = reinterpret_cast<DWORD_PTR>(b1);
DWORD_PTR ptr2 = reinterpret_cast<DWORD_PTR>(b2);
D* unknown = reinterpret_cast<D*>(ptr1 /* or ptr2 */); // Safe?
unknown->isDerived(); // Safe?
Essentially no matter what I do it's still unsafe at some level as I must reinterpret_cast the data.
Can I safely call unknown->isDerived() even though unknown is really a B in this case?
First of all, why would you do this? You could just call b->isDerived() and then do the downcasting.
Now, while a premature and likely invalid downcast yields undefined behavior (and should be universally despised), in this case it should work. Neither B nor D has implicit data members that might change the relative offset of m_isDerived, and the address of the isDerived member function is constant.
So yeah, it should work. If should is good enough for you.
EDIT: You can place a few tests to make sure offsets are same:
#include <cstddef> // for offsetof macro
#include <cassert> // for assert
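// byte offset of the base subobject inside derived (non-portable pointer trick)
#define offsetofclass(base, derived) (reinterpret_cast<char*>(static_cast<base*>(reinterpret_cast<derived*>(8))) - reinterpret_cast<char*>(8))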
class Test
{
public:
Test()
{
assert(offsetofclass(B, D) == 0);
// you'll need to befriend Test with B & D to make this work
// or make the field public... or just move these asserts
// to D's constructor
assert(offsetof(B, m_isDerived) == offsetof(D, m_isDerived));
}
};
Test g_test;
This will get executed on startup. I don't think it can be turned into a static assertion (executed at compile time).
Given your
class D : public B
and
B* b = new B(); // Create new base class
D* unknown = static_cast<D*>(b);
Formally this is Undefined Behavior, but, as long as only B things are accessed there's no technical reason why it should not work. Typically it's done in order to gain access to otherwise inaccessible B things, such as std::stack<T>::c. However there are better ways to do that, including the member function pointer trick.
Regarding
B(bool isDerived = false) : m_isDerived(isDerived) {}
that's very brittle and unsafe.
Instead class B should have a virtual member. Then you can use dynamic_cast and typeid. This is commonly known as RTTI, Run-time Type Information.
However, on the third and gripping hand, the relevant functionality should be made available via class B so that no downcasting is necessary.
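A minimal sketch of the RTTI route, assuming you are free to make the base class polymorphic. Node and DetailedNode are hypothetical stand-ins for the question's B and D, and weight() is an invented example of behaviour exposed through the base.

```cpp
#include <cassert>

// Hypothetical stand-ins for the question's B and D.
struct Node {
    virtual ~Node() = default;                 // makes the type polymorphic
    virtual int weight() const { return 1; }   // behaviour exposed via the base
};
struct DetailedNode : Node {
    int extra = 42;
    int weight() const override { return extra; }
};

// Returns nullptr when n does not actually point at a DetailedNode,
// so no hand-maintained "is derived" flag is needed.
DetailedNode* as_detailed(Node* n) {
    return dynamic_cast<DetailedNode*>(n);
}
```

Calling weight() through a Node* needs no cast at all; as_detailed is only for the cases where the derived interface really is required.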
As you have pointed out yourself, the only generally correct approach for storing an arbitrary hierarchy in a single, opaque pointer is to go via the base class. So, first make your hierarchy:
struct Base { virtual ~Base(){}; /* ... */ };
struct FooDerived : Base { /* ... */ };
struct BarDerived : Base { /* ... */ };
struct ZipDerived : Base { /* ... */ };
You will now exclusively transform between a Base* and whatever raw pointer type you have. Strictly speaking, you can only store pointers in either a void* or a uintptr_t, but let's assume that your DWORD_PTR is wide enough.
If we wrap everything in a function, the upcasting is already taken care of:
void marshal_pointer(Base const * p, DWORD_PTR & dst)
{
static_assert(sizeof(DWORD_PTR) == sizeof(void *), "Not implementable");
dst = reinterpret_cast<DWORD_PTR>(p);
}
The return direction is just as easy:
Base * unmarshal_pointer(DWORD_PTR src)
{
static_assert(sizeof(DWORD_PTR) == sizeof(void *), "Not implementable");
return reinterpret_cast<Base *>(src);
}
All the actual polymorphic behaviour should be implemented in terms of virtual functions if possible. Manual dynamic_casts should be your last resort (though occasionally they're appropriate):
Base * p = unmarshal_pointer(weird_native_windows_thing);
p->TakeVirtualAction();
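A round trip through the opaque integer might look like the sketch below. It uses std::uintptr_t as a stand-in for DWORD_PTR so it builds outside Windows, returns the value instead of writing to an out-parameter, and gives TakeVirtualAction an int return value purely so the result can be checked; all of that is assumption, not part of the original interface.

```cpp
#include <cassert>
#include <cstdint>

struct Base {
    virtual ~Base() = default;
    virtual int TakeVirtualAction() { return 0; }
};
struct FooDerived : Base {
    int TakeVirtualAction() override { return 1; }
};

// std::uintptr_t stands in for DWORD_PTR here.
// Taking a Base* parameter forces the upcast to happen before the cast to integer.
std::uintptr_t marshal_pointer(Base* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// Always come back as Base*, the same type that went in.
Base* unmarshal_pointer(std::uintptr_t src) {
    return reinterpret_cast<Base*>(src);
}
```

The key design point is that the pointer type going into the integer and the type coming out are identical (Base*), which is the one round trip reinterpret_cast guarantees.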

Why are pA,pB,pC not equal?

Consider the following program
#include<iostream>
using namespace std;
class ClassA
{
public:
virtual ~ClassA(){};
virtual void FunctionA(){};
};
class ClassB
{
public:
virtual void FunctionB(){};
};
class ClassC : public ClassA,public ClassB
{
};
int main()
{
ClassC aObject;
ClassA* pA = &aObject;
ClassB* pB = &aObject;
ClassC* pC = &aObject;
cout<<"pA = "<<pA<<endl;
cout<<"pB = "<<pB<<endl;
cout<<"pC = "<<pC<<endl;
}
pA, pB, and pC are supposed to be equal, but the result is
pA = 0031FD90
pB = 0031FD94
pC = 0031FD90
Why is pB = pA + 4?
And when I change
class ClassA
{
public:
virtual ~ClassA(){};
virtual void FunctionA(){};
};
class ClassB
{
public:
virtual void FunctionB(){};
};
to
class ClassA
{
};
class ClassB
{
};
the result is
pA = 0030FAA3
pB = 0030FAA4
pC = 0030FAA3
pB = pA + 1?
The multiply inherited object has two merged sub-objects. I would guess the compiler is pointing one of the pointers to an internal object.
C has two inherited subobjects, so it is the concatenation of an A object and a B object. When you have an object C, it is composed of an object A followed by an object B. They're not located at the same address, that's why. All three pointers point to the same object, but viewed as different superclasses. The compiler makes the shift for you, so you don't have to worry about that.
Now, why is there a difference of 4 in one case and 1 in the other? In the first case, both A and B have virtual functions, so each subobject must hold a pointer to its vtable (the table containing the addresses of the resolved virtual function calls). So in this case, sizeof(A) is 4. In the second case, there are no virtual functions, so no vtable. But each subobject must be addressable independently, so the compiler still has to give the class A subobject and the class B subobject different addresses. The minimum difference between two addresses is 1. I do wonder, though, whether EBO (empty base-class optimization) should have kicked in here.
That's an implementation detail of the compiler.
You hit this case because you have MI (multiple inheritance) in your code.
Think about how the compiler accesses a member of ClassB: it uses an offset from the object pointer. Say ClassB has two int members; the second one is then accessed like this:
*((int*)pb + 1) // roughly what the compiler-generated code does
If pb pointed to the start of aObject instead of to its ClassB part, this would no longer work. The compiler would have to generate several versions of the code to access the same member, depending on the inheritance structure, and pay a run-time cost.
That's why the compiler adjusts pb so that it is not equal to pa, which makes the code above work. It is the simplest and most efficient way to implement this.
That also explains why pa == pc while pb differs from both.
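The following sketch, using the question's class names with two invented helper functions, separates the two things being compared: the raw addresses of the base subobjects, and the pointers as C++ sees them after conversion.

```cpp
#include <cassert>

struct ClassA { virtual ~ClassA() = default; virtual void FunctionA() {} };
struct ClassB { virtual ~ClassB() = default; virtual void FunctionB() {} };
struct ClassC : ClassA, ClassB {};

// True when the two base subobjects start at different raw addresses.
bool bases_at_different_addresses(ClassC& obj) {
    ClassA* pA = &obj;
    ClassB* pB = &obj;
    return static_cast<void*>(pA) != static_cast<void*>(pB);
}

// pC is implicitly converted to ClassB* (with the adjustment applied)
// before the comparison, so the same object compares equal.
bool compares_equal(ClassB* pB, ClassC* pC) {
    return pB == pC;
}
```

So printing the pointers shows different numbers, while comparing them in code still reports that they refer to the same object.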

Multiple inheritance of virtual classes

Suppose I have the following code:
class a {
public:
virtual void do_a() = 0;
};
class b {
public:
virtual void do_b() = 0;
};
class c: public a, public b {
public:
virtual void do_a() {};
virtual void do_b() {};
};
a *foo = new c();
b *bar = new c();
Will foo->do_a() and bar->do_b() work? What's the memory layout here?
Will a->do_a() and b->do_b() work?
Assuming you meant foo->do_a() and bar->do_b() (a and b are types, not objects): yes, they will work. Did you try running it?
What's the memory layout here?
That is implementation-defined, mostly. Fortunately, you don't need to know about that unless you want to write non-portable code.
Why shouldn't they? The memory layout will typically be something like:
+----------+
| A part |
+----------+
| B part |
+----------+
| C part |
+----------+
If you convert your foo and bar to void* and display them, you'll get different addresses, but the compiler knows this, and will arrange for the this pointer to be correctly fixed up when calling the function.
As others have mentioned the following will work without any problems
foo->do_a();
bar->do_b();
These, however, will not compile
bar->do_a();
foo->do_b();
Since bar is of type b* it has no knowledge of do_a. The same is true for foo and do_b. If you want to make those function calls you must downcast.
static_cast<c *>(foo)->do_b();
static_cast<c *>(bar)->do_a();
The other very important thing that is not shown in your example code is, when inheriting, and referring to the derived class through base class pointer, the base class MUST have a virtual destructor. If it doesn't then the following will produce undefined behavior.
a* foo = new c();
delete foo;
The fix is simple
class a {
public:
virtual void do_a() = 0;
virtual ~a() {}
};
Of course, this change needs to be made to b as well.
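To see the effect of the virtual destructor, here is a small sketch in which the derived destructor counts its own runs. The static counter destroyed and the helper destroy_through_base are additions for illustration only.

```cpp
#include <cassert>

struct a {
    virtual void do_a() = 0;
    virtual ~a() {}
};
struct c : a {
    static int destroyed;            // counts c destructor runs (illustration only)
    void do_a() override {}
    ~c() override { ++destroyed; }
};
int c::destroyed = 0;

// Deleting through the base pointer is well-defined because ~a() is virtual,
// and it runs ~c() before ~a().
void destroy_through_base(a* p) {
    delete p;
}
```

Without the virtual destructor in a, the same delete would be undefined behavior, and in practice ~c() would typically not run.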
Yes, of course it will work. The mechanics are a bit tricky though. The object will have two vtables, one for the class a parent and one for the class b parent. The pointers will be adjusted so that they point to the subset of the object that corresponds to the pointer type, leading to this surprising result:
c * baz = new c;
a * foo = baz;
b * bar = baz;
assert((void *)foo == (void *)bar); // assertion fails!
The compiler knows the types at the time of the assignment, and knows exactly how to adjust the pointers.
This is of course completely compiler dependent; nothing in the C++ standard says it has to work this way. Only that it has to work.
foo->do_a(); // will work
bar->do_b(); // will work
bar->do_a(); // compile error (do_a() is not a member of b)
foo->do_b(); // compile error (do_b() is not a member of a)
// If you really know the types are correct:
c* pc = static_cast<c*>(foo);
pc->do_a(); // will work
pc->do_b(); // will work
// If you don't know the types, you can try at runtime:
if(c* pc2 = dynamic_cast<c*>(foo))
{
pc2->do_a(); // will work
pc2->do_b(); // will work
}
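dynamic_cast can also go sideways, from one base to the other, without naming the most derived class. A sketch under a few assumptions: the functions return int instead of void so the result is visible, and only_a and call_do_b_via_a are made-up names.

```cpp
#include <cassert>

struct a { virtual int do_a() = 0; virtual ~a() {} };
struct b { virtual int do_b() = 0; virtual ~b() {} };
struct c : a, b {
    int do_a() override { return 1; }
    int do_b() override { return 2; }
};
struct only_a : a {               // hypothetical class with no b part
    int do_a() override { return 3; }
};

// Cross-cast: returns do_b() if the object behind pa also has a b part,
// and -1 otherwise.
int call_do_b_via_a(a* pa) {
    b* pb = dynamic_cast<b*>(pa);
    return pb ? pb->do_b() : -1;
}
```

The cast consults the run-time type of the object, so it succeeds for a c and yields a null pointer for an only_a.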
Will a->do_a() and b->do_b() work?
No.
Will foo->do_a() and bar->do_b() work?
Yes. Your code is the canonical example of virtual function dispatch.
Why didn't you just try it?
What's the memory layout here?
Who cares?
(i.e. this is implementation-defined, and abstracted from you. You should not need to nor want to know.)
They will work. In terms of memory, this is implementation dependent. You have created objects on the heap, and for most systems, it is worth noting that objects on the heap grow upwards (c.f. the stack grows downwards). So possibly, you will have:
Memory:
+foo+
-----
+bar+

understanding vptr in multiple inheritance?

I am trying to make sense of the statement in book effective c++. Following is the inheritance diagram for multiple inheritance.
Now the book says separate memory in each class is required for vptr. Also it makes following statement
An oddity in the above diagram is that there are only three vptrs even though four classes are involved. Implementations are free to generate four vptrs if they like, but three suffice (it turns out that B and D can share a vptr), and most implementations take advantage of this opportunity to reduce the compiler-generated overhead.
I could not see any reason why there is requirement of separate memory in each class for vptr. I had an understanding that vptr is inherited from base class whatever may be the inheritance type. If we assume that it shown resultant memory structure with inherited vptr how can they make the statement that
B and D can share a vptr
Can somebody please clarify a bit about vptr in multiple inheritance?
Do we need separate vptr in each class ?
Also if above is true why B and D can share vptr ?
Your question is interesting, however I fear that you are aiming too big as a first question, so I will answer in several steps, if you don't mind :)
Disclaimer: I am no compiler writer, and though I have certainly studied the subject, my word should be taken with caution. There will be inaccuracies. And I am not that well versed in RTTI. Also, since this is not standard, what I describe are possibilities.
1. How to implement inheritance ?
Note: I will leave out alignment issues, they just mean that some padding could be included between the blocks
Let's leave out virtual methods for now, and concentrate on how inheritance is implemented, down below.
The truth is that inheritance and composition share a lot:
struct B { int t; int u; };
struct C { B b; int v; int w; };
struct D: B { int v; int w; };
Are going to look like:
B:
+-----+-----+
| t | u |
+-----+-----+
C:
+-----+-----+-----+-----+
| B | v | w |
+-----+-----+-----+-----+
D:
+-----+-----+-----+-----+
| B | v | w |
+-----+-----+-----+-----+
Shocking isn't it :) ?
This means, however, that multiple inheritance is quite simple to figure out:
struct A { int r; int s; };
struct M: A, B { int v; int w; };
M:
+-----+-----+-----+-----+-----+-----+
| A | B | v | w |
+-----+-----+-----+-----+-----+-----+
Using these diagrams, let's see what happens when casting a derived pointer to a base pointer:
M* pm = new M();
A* pa = pm; // points to the A subpart of M
B* pb = pm; // points to the B subpart of M
Using our previous diagram:
M:
+-----+-----+-----+-----+-----+-----+
| A | B | v | w |
+-----+-----+-----+-----+-----+-----+
^ ^
pm pb
pa
The fact that the value of pb is slightly different from that of pm is handled through pointer arithmetic automatically for you by the compiler.
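The adjustment in both directions can be checked with a short sketch. The helper names b_offset_in_m and back_to_m are invented, and the numeric offset is whatever your compiler chooses.

```cpp
#include <cassert>
#include <cstddef>

struct A { int r; int s; };
struct B { int t; int u; };
struct M : A, B { int v; int w; };

// Byte offset of the B subpart inside an M (implementation-specific value).
std::ptrdiff_t b_offset_in_m(M& m) {
    B* pb = &m;  // implicit conversion applies the adjustment
    return reinterpret_cast<char*>(pb) - reinterpret_cast<char*>(&m);
}

// static_cast undoes the adjustment when going back from B* to M*.
M* back_to_m(B* pb) {
    return static_cast<M*>(pb);
}
```

With the layout drawn above one would expect the offset to equal sizeof(A), but the sketch deliberately does not rely on that.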
2. How to implement virtual inheritance ?
Virtual inheritance is tricky: you need to ensure that a single V (for virtual) object will be shared by all the other subobjects. Let's define a simple diamond inheritance.
struct V { int t; };
struct B: virtual V { int u; };
struct C: virtual V { int v; };
struct D: B, C { int w; };
I'll leave out the representation, and concentrate on ensuring that in a D object, both the B and C subparts share the same subobject. How can it be done ?
Remember that a class size should be constant
Remember that when designed, neither B nor C can foresee whether they will be used together or not
The solution that has been found is therefore simple: B and C only reserve space for a pointer to V, and:
if you build a stand-alone B, the constructor will allocate a V on the heap, which will be handled automatically
if you build B as part of a D, the B subpart will expect the D constructor to pass the pointer to the location of V
And idem for C, obviously.
In D, an optimization allows the constructor to reserve space for V right in the object, because D does not inherit virtually from either B or C, giving the diagram you have shown (though we don't have virtual methods yet).
B: (and C is similar)
+-----+-----+
| V* | u |
+-----+-----+
D:
+-----+-----+-----+-----+-----+-----+
| B | C | w | V |
+-----+-----+-----+-----+-----+-----+
Note that casting from B to V is slightly trickier than simple pointer arithmetic: you need to follow the pointer in B rather than apply a fixed offset.
There is a worse case though: down-casting. If I give you a pointer to V, how do you know how to get back to B?
In this case, the magic is performed by dynamic_cast, but this requires some support (i.e., information) stored somewhere. This is the so-called RTTI (Run-Time Type Information). dynamic_cast will first determine that V is part of a D through some magic, then query D's runtime information to know where within D the B subobject is stored.
If we were in case where there is no B subobject, it would either return 0 (pointer form) or throw a bad_cast exception (reference form).
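Both properties of the diamond (one shared V, and the need for dynamic_cast to get back from it) can be checked with a sketch like this. V is given a virtual destructor here, which the example above does not have, because dynamic_cast needs a polymorphic source type; shares_single_v and v_to_b are invented helper names.

```cpp
#include <cassert>

struct V { int t = 0; virtual ~V() = default; };  // virtual dtor so dynamic_cast works
struct B : virtual V { int u = 0; };
struct C : virtual V { int v = 0; };
struct D : B, C { int w = 0; };

// Both paths (through B and through C) must reach the one shared V.
bool shares_single_v(D& d) {
    V* via_b = static_cast<B*>(&d);
    V* via_c = static_cast<C*>(&d);
    return via_b == via_c;
}

// Going from the virtual base back to B needs dynamic_cast;
// static_cast<B*>(pv) would not compile here.
B* v_to_b(V* pv) {
    return dynamic_cast<B*>(pv);
}
```

For a stand-alone V there is no B to find, so v_to_b returns a null pointer, which is the pointer-form failure described above.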
3. How to implement virtual methods ?
In general virtual methods are implemented through a v-table (ie, a table of pointer to functions) per class, and v-ptr to this table per-object. This is not the sole possible implementation, and it has been demonstrated that others could be faster, however it is both simple and with a predictable overhead (both in term of memory and dispatch speed).
If we take a simple base class object, with a virtual method:
struct B { virtual void foo(); };
For the computer, there is no such things as member methods, so in fact you have:
struct B { VTable* vptr; };
void Bfoo(B* b);
struct BVTable { RTTI* rtti; void (*foo)(B*); };
When you derive from B:
struct D: B { virtual void foo(); virtual void bar(); };
You now have two virtual methods, one overrides B::foo, the other is brand new. The computer representation is akin to:
struct D { VTable* vptr; }; // single table, even for two methods
void Dfoo(D* d); void Dbar(D* d);
struct DVTable { RTTI* rtti; void (*foo)(D*); void (*bar)(D*); };
Note how BVTable and DVTable are so similar (since we put foo before bar) ? It's important!
D* d = /**/;
B* b = d; // noop, no need for arithmetic
b->foo();
Let's translate the call to foo in machine language (somewhat):
// 1. get the vptr
void* vptr = b; // noop, it's stored at the first byte of B
// 2. get the pointer to foo function
void (*foo)(B*) = vptr[1]; // 0 is for RTTI
// 3. apply foo
(*foo)(b);
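The same three steps can be written as compilable C++ by building the table by hand. This is only an emulation of the idea, not what any particular compiler emits; every name in it (Object, VTable, base_foo, derived_foo, call_foo) is invented, and the RTTI slot is left out.

```cpp
#include <cassert>

struct Object;  // forward declaration so the table can mention it

// Hand-written stand-in for a compiler-generated vtable (one slot: foo).
struct VTable {
    int (*foo)(const Object*);
};

// Every "object" starts with a pointer to the table of its dynamic type.
struct Object {
    const VTable* vptr;
};

int base_foo(const Object*) { return 1; }
int derived_foo(const Object*) { return 2; }

const VTable base_vtable = { &base_foo };
const VTable derived_vtable = { &derived_foo };

// Dispatch: load the vptr, load the slot, call it with the object as argument.
int call_foo(const Object* obj) {
    return obj->vptr->foo(obj);
}
```

call_foo never needs to know which kind of object it was given; the table the object points at decides which function runs.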
Those vptrs are initialized by the constructors of the objects. When executing the constructor of D, here is what happens:
D::D() calls B::B() first and foremost, to initialize its subparts
B::B() initializes vptr to point to its vtable, then returns
D::D() initializes vptr to point to its vtable, overriding B's
Therefore, vptr here points to D's vtable, and thus the foo applied is D's. For B it is completely transparent.
Here B and D share the same vptr!
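The construction order has an observable consequence in standard C++: a virtual call made inside B's constructor reaches B's version, even when the object being built is a D. A minimal sketch, where the member seen_during_construction and the function id() are illustrative additions:

```cpp
#include <cassert>

struct B {
    int seen_during_construction = 0;
    // While B::B() runs, the object's dynamic type is still B,
    // so this call resolves to B::id().
    B() { seen_during_construction = id(); }
    virtual ~B() = default;
    virtual int id() const { return 1; }
};
struct D : B {
    int id() const override { return 2; }
};
```

Once D's constructor has finished, the same call through a B* or B& reaches D::id().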
4. Virtual tables in multi-inheritance
Unfortunately this sharing is not always possible.
First, as we have seen, in the case of virtual inheritance, the "shared" item is positioned oddly in the final complete object. It therefore has its own vptr. That's 1.
Second, in case of multi-inheritance, the first base is aligned with the complete object, but the second base cannot be (they both need space for their data), therefore it cannot share its vptr. That's 2.
Third, the first base is aligned with the complete object, thus offering us the same layout as in the case of simple inheritance (the same optimization opportunity). That's 3.
Quite simple, no ?
If a class has virtual members, one needs a way to find their addresses. Those are collected in a constant table (the vtbl) whose address is stored in a hidden field of each object (the vptr). A call to a virtual member is essentially:
obj->_vptr[member_idx](obj, params...);
A derived class which adds virtual members to its base class also needs a place for them. Thus a new vtbl and a new vptr for them. A call to an inherited virtual member is still
obj->_vptr[member_idx](obj, params...);
and a call to a new virtual member is:
obj->_vptr2[member_idx](obj, params...);
If the base is not virtual, one can arrange for the second vtbl to be put immediately after the first one, effectively increasing the size of the vtbl. The _vptr2 is then no longer needed. A call to a new virtual member is thus:
obj->_vptr[member_idx+num_inherited_members](obj, params...);
In the case of (non-virtual) multiple inheritance, one inherits two vtbls and two vptrs. They can't be merged, and calls must take care to add an offset to the object (so that the inherited data members are found at the correct place). Calls to the first base class's members will be
obj->_vptr_base1[member_idx](obj, params...);
and for the second
obj->_vptr_base2[member_idx](obj+offset, params...);
New virtual members can again either be put in a new vtbl, or appended to the vtbl of the first base (so that no offsets are added in future calls).
If a base is virtual, one cannot append the new vtbl to the inherited one, as it could lead to conflicts (in the example you gave, if both B and C append their virtual functions, how would D be able to build its version?).
Thus, A needs a vtbl. B and C need a vtbl and it can't be appended to A's one because A is a virtual base of both. D needs a vtbl but it can be appended to B's, as B is not a virtual base class of D.
It all has to do with how compiler figures out the actual addresses of method functions. The compiler assumes that virtual table pointer is located at a known offset from the base of the object (typically at offset 0). The compiler also needs to know the structure of the virtual table for each class - in other words, how to lookup pointers to functions in the virtual table.
Class B and class C will have completely different structures of Virtual Tables since they have different methods. Virtual table for class D can look like a virtual table for class B followed by additional data for methods of class C.
When you generate an object of class D, you can cast it as a pointer to B or as a pointer to C or even as a pointer to class A. You may pass these pointers to modules that are not even aware of existence of class D, but can call methods of class B or C or A. These modules need to know how to locate the pointer to the virtual table of the class and they need to know how to locate pointers to methods of class B/C/A in the virtual table. That's why you need to have separate VPTRs for each class.
Class D is well aware of existence of class B and the structure of its virtual table and therefore can extend its structure and reuse the VPTR from object B.
When you cast a pointer to object D to a pointer to object B or C or A, it will actually update the pointer by some offset, so that it starts from vptr corresponding to that specific base class.
I could not see any reason why there is requirement of separate memory in each class for vptr
At runtime, when you invoke a (virtual) method via a pointer, the CPU has no knowledge about the actual object on which the method is dispatched. If you have B* b = ...; b->some_method(); then the variable b can potentially point at an object created via new B() or via new D() or even new E() where E is some other class that inherits from (either) B or D. Each of these classes can supply its own implementation (override) for some_method(). Thus, the call b->some_method() should dispatch the implementation from either B, D or E depending on the object on which b is pointing.
The vptr of an object allows the CPU to find the address of the implementation of some_method that is in effect for that object. Each class defines its own vtbl (containing addresses of all virtual methods) and each object of the class starts with a vptr that points at that vtbl.
I think D needs 2 or 3 vptrs.
Here A may or may not require a vptr.
B needs one that should not be shared with A (because A is virtually inherited).
C needs one that should not be shared with A (ditto).
D can use B or C's vftable for its new virtual functions (if any), so it can share B's or C's.
My old paper "C++: Under the Hood" explains the Microsoft C++ implementation of virtual base classes. http://www.openrce.org/articles/files/jangrayhood.pdf
And (MS C++) you can compile with cl /d1reportAllClassLayout to get a text report of class memory layouts.
Happy hacking!

Segfault with Embedded Structs and virtual functions

I have structs like this:
struct A
{
int a;
virtual void do_stuff(A*a)
{
cout << "I'm just a boring A-struct: " << a << endl;
}
};
struct B
{
A a_part;
char * bstr;
void do_stuff(B*bptr)
{
cout << "I'm actually a B-struct! See? ..." << bptr->bstr << endl;
}
};
B * B_new(int n, char * str)
{
B * b = (B*) malloc(sizeof(struct B));
b->a_part.a = n;
b->bstr = strdup(str);
return b;
}
Now, when I do this:
char * blah = strdup("BLAAARGH");
A * b = (A*) B_new(5, blah);
free(blah);
b->do_stuff(b);
I get a segfault on the very last line when I call do_stuff and I have no idea why.
This is my first time working with virtual functions in structs like this so I'm quite lost. Any help would be greatly appreciated!
Note: the function calls MUST be in the same format as the last line in terms of argument type, which is why I'm not using classes or inheritance.
You're mixing a C idiom (embedded structs) with C++ concepts (virtual functions). In C++, the need for embedded structs is obviated by classes and inheritance. Virtual functions only affect classes in the same inheritance hierarchy. In your case, there is no inheritance relationship between A and B, so B's do_stuff never overrides A's.
Your segfault is probably caused because b is really a B that was allocated with malloc, but assigned to an A*. When the compiler sees b->do_stuff, it tries to go to a vtable to look up which version of do_stuff to call. However, no constructor ever ran on that memory, so the vtable pointer was never set up, and your program crashes.
In C++, a class without virtual functions that doesn't inherit from any other classes is laid out exactly like a C struct.
class NormalClass
{
int a;
double b;
public:
NormalClass(int x, double y);
};
looks like this:
+------------------------------------+
| a (4 bytes) | b (8 bytes) |
+------------------------------------+
However, a class (or struct) with virtual functions also has a pointer to a vtable, which enables C++'s version of polymorphism. So a class like this:
class ClassWithVTable
{
int a;
double b;
public:
ClassWithVTable();
virtual void doSomething();
};
is laid out in memory like this:
+-----------------------------------------------------------+
| vptr (sizeof(void *)) | a (4 bytes) | b (8 bytes) |
+-----------------------------------------------------------+
and vptr points to an implementation-defined table called the vtable, which is essentially an array of function pointers.
Casting a B * to an A * and then attempting to dereference it via a member function call is undefined behaviour. One possibility is a seg-fault. I'm not saying that this is definitely the cause, but it's not a good start.
I don't understand why you're not using inheritance here!
For polymorphic objects, the pointer to the vtable is stored inside the object.
So at runtime, the method to be actually called is found via dereferencing and jumping into the vtable.
In your case you cast B * to A *.
Since A is polymorphic, the method call is resolved via the vtable, but since the object being used is actually a B, the vpointer used is garbage and you get the segfault.
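For comparison, here is a sketch of the same idea written with inheritance and a real constructor call, keeping the b->do_stuff(b) call shape from the question. It is one possible rewrite under assumptions, not the asker's required design: do_stuff returns a std::string instead of printing, bstr becomes a std::string, and B_new returns a std::unique_ptr.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct A {
    int a = 0;
    virtual ~A() = default;
    virtual std::string do_stuff(A* self) {
        return "I'm just a boring A-struct: " + std::to_string(self->a);
    }
};

struct B : A {
    std::string bstr;
    B(int n, std::string str) : bstr(std::move(str)) { a = n; }
    std::string do_stuff(A* self) override {
        // Assumes self points at this same B object, as in b->do_stuff(b).
        return "I'm actually a B-struct! See? ..." + static_cast<B*>(self)->bstr;
    }
};

// new runs B's constructor, which sets up the vtable pointer; malloc does not.
std::unique_ptr<A> B_new(int n, const std::string& str) {
    return std::unique_ptr<A>(new B(n, str));
}
```

Because the object is created with new rather than malloc, the vtable pointer is valid and the call through A* reaches B's override.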