Suppose I have the following code:
class a {
public:
virtual void do_a() = 0;
};
class b {
public:
virtual void do_b() = 0;
};
class c: public a, public b {
public:
virtual void do_a() {}
virtual void do_b() {}
};
a *foo = new c();
b *bar = new c();
Will foo->do_a() and bar->do_b() work? What's the memory layout here?
Will a->do_a() and b->do_b() work?
Assuming you meant foo->do_a() and bar->do_b() (a and b are types, not objects): yes, they will work. Did you try running it?
What's the memory layout here?
That is implementation-defined, mostly. Fortunately, you don't need to know about that unless you want to write non-portable code.
Why shouldn't they? The memory layout will typically be something like:
+----------+
| A part |
+----------+
| B part |
+----------+
| C part |
+----------+
If you convert your foo and bar to void* and display them, you'll
get different addresses, but the compiler knows this, and will arrange
for the this pointer to be correctly fixed up when calling the
function.
As others have mentioned the following will work without any problems
foo->do_a();
bar->do_b();
These, however, will not compile
bar->do_a();
foo->do_b();
Since bar is of type b* it has no knowledge of do_a. The same is true for foo and do_b. If you want to make those function calls you must downcast.
static_cast<c *>(foo)->do_b();
static_cast<c *>(bar)->do_a();
The other very important thing that is not shown in your example code is, when inheriting, and referring to the derived class through base class pointer, the base class MUST have a virtual destructor. If it doesn't then the following will produce undefined behavior.
a* foo = new c();
delete foo;
The fix is simple
class a {
public:
virtual void do_a() = 0;
virtual ~a() {}
};
Of course, this change needs to be made to b as well.
Yes, of course it will work. The mechanics are a bit tricky though. The object will have two vtables, one for the class a parent and one for the class b parent. The pointers will be adjusted so that they point to the subset of the object that corresponds to the pointer type, leading to this surprising result:
c * baz = new c;
a * foo = baz;
b * bar = baz;
assert((void *)foo == (void *)bar); // assertion fails!
The compiler knows the types at the time of the assignment, and knows exactly how to adjust the pointers.
This is of course completely compiler dependent; nothing in the C++ standard says it has to work this way. Only that it has to work.
foo->do_a(); // will work
bar->do_b(); // will work
bar->do_a(); // compile error (do_a() is not a member of b)
foo->do_b(); // compile error (do_b() is not a member of a)
// If you really know the types are correct:
c* p = static_cast<c*>(foo);
p->do_a(); // will work
p->do_b(); // will work
// If you don't know the types, you can try at runtime:
if(c* q = dynamic_cast<c*>(foo))
{
q->do_a(); // will work
q->do_b(); // will work
}
Will a->do_a() and b->do_b() work?
No.
Will foo->do_a() and bar->do_b() work?
Yes. Your code is the canonical example of virtual function dispatch.
Why didn't you just try it?
What's the memory layout here?
Who cares?
(i.e. this is implementation-defined, and abstracted from you. You should not need to nor want to know.)
They will work. In terms of memory, this is implementation dependent. You have created the objects on the heap, and on most systems the heap grows upwards (cf. the stack, which grows downwards). So possibly, you will have:
Memory:
+foo+
-----
+bar+
Related
I'm pretty sure this is dangerous code. However, I wanted to check to see if anyone had an idea of what exactly would go wrong.
Suppose I have this class structure:
class A {
protected:
int a;
public:
A() { a = 0; }
int getA() { return a; }
void setA(int v) { a = v; }
};
class B: public A {
protected:
int b;
public:
B() { b = 0; }
};
And then suppose I want to have a way of automatically extending the class like so:
class Base {
public:
virtual ~Base() {}
};
template <typename T>
class Test: public T, public Base {};
One really important guarantee that I can make is that neither Base nor Test will have any other member variables or methods. They are essentially empty classes.
The (potentially) dangerous code is below:
int main() {
B *b = new B();
// dangerous part?
// forcing Test<B> to point to an address of type B
Test<B> *test = static_cast<Test<B> *>(b);
//
A *a = dynamic_cast<A *>(test);
a->setA(10);
std::cout << "result: " << a->getA() << std::endl;
}
The rationale for doing something like this is that I'm using a class similar to Test, but for it to work currently, a new instance of T (i.e. Test) necessarily has to be made, along with a copy of the instance passed in. It would be really nice if I could just point Test to T's memory address.
If Base did not add a virtual destructor, and since nothing gets added by Test, I would think this code is actually okay. However, the addition of the virtual destructor makes me worried that the type info might get added to the class. If that's the case, then it would potentially cause memory access violations.
Finally, I can say this code works fine on my computer/compiler (clang), though this of course is no guarantee that it's not doing bad things to memory and/or won't completely fail on another compiler/machine.
The virtual destructor Base::~Base will be called when you delete the pointer. Since B doesn't have the proper vtable (none at all in the code posted here) that won't end very well.
It only works in this case because you have a memory leak, you're never deleting test.
Your code produces undefined behaviour, as it violates strict aliasing. Even if it did not, you are invoking UB, as neither B nor A are polymorphic classes, and the object pointed to is not a polymorphic class, therefore dynamic_cast cannot succeed. You're attempting to access a Base object that does not exist to determine the runtime type when using dynamic_cast.
One really important guarantee that I can make is that neither Base
nor Test will have any other member variables or methods. They are
essentially empty classes.
It's not important at all; it's utterly irrelevant. The Standard would have to mandate EBO for this to even begin to matter, and it doesn't.
As long as you perform no operations with Test<B>* and avoid any magic like smart pointers or automatic memory management, you should be fine.
You should be sure to look for obscured code like debug prints or logging that will inspect the object. I've had debuggers crash on me for attempting to look into a value of a pointer set up like this. I will bet this will cause you some pain, but you should be able to make it work.
I think the real problem is maintenance. How long will it be before some developer does an operation on Test<B>*?
conv.h
class Base
{
public:
void foo();
};
class Derived: public Base
{
public:
void bar();
};
class A {};
class B
{
public:
void koko();
};
conv.cpp
#include "conv.h"
#include <iostream>
using namespace std;
void Base::foo()
{
cout<<"stamm";
}
void Derived::bar()
{
cout<<"bar shoudn't work"<<endl;
}
void B::koko()
{
cout<<"koko shoudn't work"<<endl;
}
main.cpp
#include "conv.h"
#include <iostream>
int main()
{
Base * a = new Base;
Derived * b = static_cast<Derived*>(a);
b->bar();
Derived * c = reinterpret_cast<Derived*>(a);
c->bar();
A* s1 = new A;
B* s2 = reinterpret_cast<B*>(s1);
s2->koko();
}
output:
bar shoudn't work
bar shoudn't work
koko shoudn't work
How come bar() can be called successfully at run time even though I created a Base object, not a Derived one? It works with both kinds of conversion (static_cast and reinterpret_cast).
The same question applies to the unrelated classes (A and B).
Undefined behaviour can do anything, including appear to work.
It's working (read: "compiling and not crashing") 'cause you never use the this pointer in your nominally "member" functions. If you tried to print out a member variable, for example, you'd get the garbage output or crashes you expect - but these functions as they are now don't depend on anything in the classes they're supposedly part of. this could even be NULL for all they care.
The compiler knows a Derived can use member functions foo() and bar() and knows where to find them. After you did your "tricks", you had pointers to Derived.
The fact that they were pointers of type Derived, regardless of what data was associated with those pointers, allowed them to call the functions foo() and bar() associated with Derived.
As has been mentioned, if you had actually used the data at the pointers (i.e. read or written data members of the Derived class relative to this, which you don't have in this case), you would have been accessing memory that didn't belong to your objects.
Please ignore the #include parts, assuming they are done correctly. Also, this could be implementation specific (but so is the concept of vtables); I am just curious, as it helps me visualize multiple inheritance. (I'm using MinGW 4.4.0, by the way.)
initial code:
class A {
public:
A() : a(0) {}
int a;
};
//Edit: adding this definition instead
void f(void* ptrA) {
std::cout<<((A*)ptrA)->a;
}
//end of editing of original posted code
#if 0
//this was originally posted. Edited and replaced by the above f() definition
void f(A* ptrA) {
std::cout<<ptrA->a;
}
#endif
this is compiled and Object code is generated.
in some other compilation unit i use (after inclusion of header file for above code):
class C : public B , public A {
public:
int c;
}objC;
f(&objC); // ################## Label 1
memory model for objC:
//<1> stuff from B
//<2> stuff from B
//<3> stuff from A : int a
//<4> stuff from C : int c
&objC will contain starting address of <1> in memory model assumed above
how/when will the compiler shift it to <3>? Does it happen during the inspection of call at Label 1 ?
EDIT::
Since Label 1 seems to be a giveaway, I'm making it a little more obscure for the compiler. Please see the edited code above. Now when and where does the compiler do the adjustment?
Yes, you are quite correct.
To fully understand the situation, you have to know what the compiler knows at two points:
At Label 1 (as you have already identified)
Inside function f()
(1) The compiler knows the exact binary layout of both C and A and how to convert from C* to A* and will do so at the call site (Label 1)
(2) Inside function f(), however, the compiler only (needs to) know(s) about A* and so restricts itself to members of A (int a in this case) and cannot be confused about whether the particular instance is part of anything else or not.
Short answer: Compiler will adjust pointer values during cast operations if it knows the relationship between the base and derived class.
Let's say your instance of class C is at address 100. And let's say C's own member takes 4 bytes, as do the B part and the A part.
When a cast happens such as the following:
C c;
A* pA = &c; // implicit cast, preferred for upcasting
A* pA = (A*)&c; // explicit cast old style
A* pA = static_cast<A*>(&c); // static-cast, even better
The pointer value of pA will be the memory address of c plus the offset from where "A" begins in C. In this case, pA will reference memory address 104 assuming sizeof(B) is also 4.
All of this holds true for passing a derived class pointer into a function expecting a base class pointer. The implicit cast will occur as does the pointer offset adjustment.
Likewise, for downcasting:
C* pC = (C*)(&a);
The compiler will take care of adjusting the pointer value during the assignment.
The one "gotcha" to all of this is when a class is forward declared without a full declaration:
// foo.h
class A; // same as above, base class for C
class C; // same as above, derived class from A and B
inline void foo(C* pC)
{
A* pA = (A*)pC; // oops, compiler doesn't know that C derives from A. It won't adjust the pointer value during assignment
SomeOtherFunction(pA); // bug! Function expecting A* parameter is getting garbage
}
That's a real bug!
My general rule: avoid the old "C-style" cast and favor the static_cast operator, or just rely on implicit conversion without an operator to do the right thing (for upcasts). The compiler will issue an error if the cast isn't valid.
In the code below, pC == pA:
class A
{
};
class B : public A
{
public:
int i;
};
class C : public B
{
public:
char c;
};
int main()
{
C* pC = new C;
A* pA = (A*)pC;
return 0;
}
But when I add a pure virtual function to B and implement it in C, pA != pC:
class A
{
};
class B : public A
{
public:
int i;
virtual void Func() = 0;
};
class C : public B
{
public:
char c;
void Func() {}
};
int main()
{
C* pC = new C;
A* pA = (A*)pC;
return 0;
}
Why is pA not equal to pC in this case? Don't they both still point to the same "C" object in memory?
You're seeing a different value for your pointer because the new virtual function is causing the injection of a vtable pointer into your object. VC++ is putting the vtable pointer at the beginning of the object (which is typical, but purely an internal detail).
Let's add a new field to A so that it's easier to explain.
class A {
public:
int a;
};
// other classes unchanged
Now, in memory, your pA and A look something like this:
pA --> | a | 0x0000004
Once you add B and C into the mix, you end up with this:
pC --> | vtable | 0x0000000
pA --> | a | 0x0000004
| i | 0x0000008
| c | 0x000000C
As you can see, pA is pointing to the data after the vtable, because it doesn't know anything about the vtable or how to use it, or even that it's there. pC does know about the vtable, so it points directly to the table, which simplifies its use.
A pointer to an object is convertible to a pointer to base object and vice versa, but the conversion doesn't have to be trivial. It's entirely possible, and often necessary, that the base pointer has a different value than the derived pointer. That's why you have a strong type system and conversions. If all pointers were the same, you wouldn't need either.
Here are my assumptions, based on the question.
1) You have a case where you cast from a C to an A and you get the expected behaviour.
2) You added a virtual function, and that cast no longer works (in that you can no longer pull data from A directly after the cast to A, you get data that makes no sense to you).
If these assumptions are true the hardship you are experiencing is the insertion of the virtual table in B. This means the data in the class is no longer perfectly lined up with the data in the base class (as in the class has added bytes, the virtual table, that are hidden from you). A fun test would be to check sizeof to observe the growth of unknown bytes.
To resolve this you should not cast directly from A to C to harvest data. You should add a getter function that is in A and inherited by B and C.
Given your update in the comments, I think you should read this, it explains virtual tables and the memory layout, and how it is compiler dependent. That link explains, in more detail, what I explained above, but gives examples of the pointers being different values. Really, I had WHY you were asking the question wrong, but it seems the information is still what you wanted. The cast from C to A takes into account the virtual table at this point (note C-8 is 4, which on a 32 bit system would be the size of the address needed for the virtual table, I believe).
I have following classes.
class A
{
public:
void fun();
};
class B: public A
{
};
class C: public A
{
};
A * ptr = new C;
Is it ok to do something like below? Will i have some problems if introduce some virtual functions in the baseclass?
((B *)ptr)->fun();
This may look stupid, but i have a function that calls A's function through B and i don't want to change that.
You can't cast an A* pointing to Class C as a B* because Class C doesn't have any relation with Class B. You'll get undefined behavior which will probably be the wrong function called and stack corruption.
If you intended for class C to derive from class B then you could. However, you wouldn't need to. If class C doesn't have fun() defined, it will inherit A's. You didn't declare fun() virtual, though, so you'll get strange behavior if you ever implement C::fun() or B::fun(). You almost certainly want fun() to be declared virtual.
I'm guessing here but I suspect the behavior of this might depend on the compiler you use and how it decides to organize the vf pointer table.
I'm also going to note that I think what you are doing is a bad idea and could lead to all kinds of nightmarish problems (use of things like static_cast and dynamic_cast are generally a good idea). The other thing is because fun() is defined in the base class (and it is not virtual) ptr->fun() will always call A::fun() without having to cast it to B*.
You don't have to do the cast ((B *)ptr)->fun(), since fun() is already in the base class. Objects of both class B and class C will invoke the same fun() function in your example.
I'm not sure what happens when you override the fun() function in class B...
But trying to invoke function from another class (not the base class) is bad OO, in my opinion.
You can cast from A * to B *, and it should work if the original pointer was B *.
A* p = new B;
B* q = static_cast<B*>(p); // Just work (tm)
But in this case it is a C *, and it is not guaranteed to work. You will end up with a dangling pointer; if you are lucky you will get an access violation, and if not you may end up silently corrupting your memory.
A* p = new C;
B* q = static_cast<B*>(p); // Owned (tm)