C++ casting oddness - c++

Let's use this simple class hierarchy:
class A
{
public:
virtual void Af() {};
};
class IB
{
public:
virtual void Bf() = 0;
};
class C : public A, public IB
{
public:
virtual void Bf() {}
void Cf() { printf("Cf"); }
};
An now some tests I have done, trying to understand static_cast and dynamic_cast:
1) C* c = new C();
2) A* a = static_cast<A*>(c);
3) IB* ib = static_cast<IB*>(c); //ib gets a different pointer than c because ib vtable is assigned
4) A* correctA = static_cast<A*>(static_cast<C*>(ib)); //Correct, but I must cast first to C and the to A from Interface
5) A* incorrectA = static_cast<A*>(ib); //Compiler error
6) A* correctA2 = dynamic_cast<A*>(ib); //Correct result
Now, some questions:
1) I have started to code in C++ since I moved to C# about 5 years ago. I'm surprised of the "ib" variable value in number 3. I expected it to be same pointer as "c" variable but instead the cast is assigning the value of the vtable of class "ib" in "c"
2) Why must I cast fist to C* and then to A* in 3 to get a correct value? This makes polimorphism useless in this case. Because I want to cast from the interface to the base type without knowing the real type. 5 shows that this is not possible with static_cast (I guess that this is checking the inheritance tree and concluding IB interface is not related to A* but they really are at runtime.
3) 6 gets a correct value into correctA2. I guess it does this correclty as I explain in question 2 because this can be resolved only at runtime.
Could you explain a bit this kind of behaviours and confirm my guessings? It is hard to come back from c# to c++ :D.
Cheers.

It looks like you may be trying to write C# in C++ in which case I suggest just sticking with C#. However I'll try to answer your questions:
1) (note this is implementation details that are probably right on most systems) In a multiply inherited derived class typically an implementation will have multiple virtual tables as the first items in the object memory. In this case a C would have first an A vtable and then an IB vtable. If you try to use the derived pointer as IB without changing its address, the IB would be using the A class's vtable resulting in havoc. Thus, the compiler fixes up the address for you.
2) This is just the way the language tells us static_cast will work: converting between parent/child objects, and a few other relationships like different integral types. dynamic_cast is needed to traverse sibling relationships directly.
3) Correct, since dynamic_cast offers more flexibility for polymorphic conversions you can use it to convert between a sibling relationship.
I should make a closing remark that using multiple inheritance in C++ to provide an implementation to an interface is not a common pattern. There may be alternate approaches if you ask your real question.

A static_cast requires there to be a single compile time relationship between the types that is more direct than any other relationship.
Imagine you had also defined
class D : public IB, public A
The relationship between A and IB through D would be no more nor less direct than through C. A static_cast can use the fact that the most direct relationship between IB and C is IB as a base class of C and can use the fact that the most direct relationship between C and A is A as a base class of C. But the relationship between IB and A through C cannot be know to be the most direct compile time relationship, so static_cast can't use it (by dynamic_cast can use it as the only available run time relationship).

Related

"Crossing the hierarchy" -- why not?

We have a multiple inheritance hierarchy:
// A B
// \ /
// C
//
Both A and B are abstract classes. C is actually a templated class, so downcasting is near impossible if you have a B and you want to access a member of it as an A.
All A's and B's must be C's, since that is the only concrete class in the hierarchy. Therefore, all A's must be B's, and vice versa.
I'm trying to debug something quickly where I have a B that I need to access A::name of. I can't downcast to C because I don't know the templated type of it. So I'm writing code like below and surprisingly it doesn't work; and I'm wondering what gives.
struct A { virtual void go() = 0; std::string name; };
struct B { virtual void go() = 0; };
struct C : A, B { void go() override { } };
int main()
{
C c;
c.name = "Pointer wonders";
puts(c.name.c_str()); // Fine.
B* b = (B*)&c;
//puts(b->name.c_str()); // X no from compiler.
A* a1 = (A*)&c;
puts(a1->name.c_str()); // As expected this is absolutely fine
// "Cross" the hierarchy, because the B really must be a __C__, because of the purely virtual functions.
// neither A nor B can be instantiated, so every instance of A or B must really be a C.
A* a2 = (A*)b;
puts(a2->name.c_str()); // Why not??
// If you downcast first, it works
C* c2 = (C*)b;
A* a3 = (A*)c2;
puts(a3->name.c_str()); // fine
}
First of all, stop using C style cast. The compiler won't complain if you do something wrong (C style cast usually do not works in multiple inheritance).
Any cast that cause run-time error in you example would not compile with a static_cast. While it is a bit longer to type, you get instant feedback when used improperly instead of undefined behavior that will sometime corrupt data and cause problem long afterward when that data is use.
As A and B contains virtual function, you can easily use dynamic_cast without knowing C. If you know C, you could use static_cast to C if you know there is a derived C for sure. But why not use virtual functions and not do any crossing between siblings?
The reason it does not works is because C-style cast can do any of the following cast:
static_cast
reinterperet_cast
const_cast
Also, C style cast will do a reinterpret_cast if the definition of a class is missing. You also need to be very careful with void *as you must convert back to original type.
As a simplified rule, you can imagine that C cast is like doing either a single static_cast (known child or parent class or primitive types like int) or reinterpret_cast (unknown type, not a parent/child class) followed by a const_cast if necessary.
C * --> void * --> B * won't work with any C or C++ cast.
Th primary reason that such cast don't works is that the compiler must adjust this pointer when doing a cast and multiple inheritance is used. This is required to take into account that the A and B part start at a distinct offset.
Alternatively, you can add a virtual function A * GetA() = 0 in B and implemente it in C to have your own way to navigate. That can be an option if is unknown and RTTI must be disabled (for ex. on embedded systems).
Honestly, you should avoid multiple inheritance and casting as it make the code harder to maintain as it increase coupling between classes and it can cause hard to find bug particularily when mixing both together.

Best practice with dynamic_cast and polymorphism

I have a design problem that I am not sure how to handle in the best way. I want my code to be future proof and still not be to messy and complex (the plight of a geek).
Currently my design has the following setup
derived class D derived class E
^ ^
| |
derived abstract class B -> derived class C
^ ^
| |
Abstract Base class A
class B differs from class A, class D and class E differs from class B and C differs from class A.
As these classes differ I need to use the dynamic_cast operator.
A* a1 = new class C;
C* c1 = dynamics_cast<C*>(a1);
if(c1){ //Success!!!}
...
As you can see I will be wanting to use dynamic_cast a lot! I need to try and find a nice way of doing this that is not full of if() statements extra.
Is there a neat way of doing this? I seem to remember seeing something in a Scott Meyers book, but I am away from home and can not access it for a week or two.
Any references/examples would be much appreciated.
Also if need be I could make class B into an interface which would be a sort of solution
B* d1 = new D;
B* e1 = new E;
but not ideal.
NOTE: I have added this example to show my current design failure
class A {...};
class B: public A {
public:
virtual std::vector<objectType1*>& getObjectType1();
};
class C: public B {
private:
void newFunction();
};
A* c1 = new C();
std::vector<objectType1*> example = c1->getObjectType1(); // Wrong; need dyanmic_cast :-(
can not access getObjectType1() without dynamic_cast as the static type A does not know about this function. I need to rethink my design. Any ideas?
In a properly designed object hierarchy, the use of dynamic_cast is a fair rarity. Sure, it exists to cast up & down a polymorphic tree, but that doesn't mean you have to use it.
Think about this. Why would you need to use dynamic_cast, generally speaking? Because the base pointer you have doesn't provide the facilities you need, right? And what is the point of polymorphism? To provide different behavior for similar operations, where the behavior is chosen at run-time, right?
Aren't those two statements somewhat contradictory? If your object's interface is designed correctly, then you don't really care what kind of behavior is going to be employed when you call ->Foo() -- you just want to call ->Foo().
If you need to make frequent use of dynamic_cast, it's a code smell that tells you that there's something wrong with your interfaces. Maybe you're trying to shove too much in to one interface?

c++ casting derived to void* and restoring base

I have:
class A {...};
class B : public A {...};
class C : public B {...};
Then I'm storing C instance as void*:
C *instance = new C();
void *pC = instance;
Is it ok to make this:
B *pB = reinterpret_cast<B*>(pC);
Or I have to cast to C* only?
PS: I have more classes derived from B in my program and I'm not sure if's ok to cast like i'm doing (to B*).
Why void*: I'm using void *userdata' field of physical body class in box2d engine. I can't store my class there in other way
I would suggest casting it back to the original C* then do a dynamic_cast to B* in order to follow the rules of C++ - although you should avoid casting to void* in the first place and instead use a base pointer (A*) instead of void ptr.
The general rule is that when you cast from a pointer to void * you should always cast back to the one you originally came from.
Casting to void* is sometimes a necessary evil because of old APIs.
And sometimes it is done by design to create a "light" template. A light template is where you write code that handles a collection of pointers to objects, which are all handled in the same way, and this prevents code having to be generated for every type.
Around this code you have a strongly typed template that does the simple thing of casting back and forth (likely to be inlined anyway) so the users get strongly typed code but the implementation is less bloated.
For the code you shown it should be ok to cast directly to B* and you can use static_cast for this. However, in general, pointers to base and derived classes might be different and you'll have to cast to the original type first and only then cast it to the pointer to base.
Edit: that should be ok in practice for single inheritance, but in general it's UB.
This will work fine. You'll only get access to methods of A and B in this case though. (You'll simply "hide" methods of C.)
Casting to void* is always a bad idea and so is doing a dynamic_cast. If you have to, it mostly means your code requires re-factoring as it is an object design flaw.

Downcast in a diamond hierarchy

Why static_cast cannot downcast from a virtual base ?
struct A {};
struct B : public virtual A {};
struct C : public virtual A {};
struct D : public B, public C {};
int main()
{
D d;
A& a = d;
D* p = static_cast<D*>(&a); //error
}
g++ 4.5 says:
error: cannot convert from base ‘A’ to derived type ‘D’ via virtual base ‘A’
The solution is to use dynamic_cast ? but why. What is the rational ?
-- edit --
Very good answers below. No answers detail exactly how sub objects and vtables end up to be ordered though. The following article gives some good examples for gcc:
http://www.phpcompiler.org/articles/virtualinheritance.html#Downcasting
The obvious answer is: because the standard says so. The
motivation behind this in the standard is that static_cast
should be close to trivial—at most, a simple addition or
subtraction of a constant to the pointer. Where s the downcast
to a virtual base would require more complicated code: perhaps
even with an additional entry in the vtable somewhere. (It
requires something more than constants, since the position of
D relative to A may change if there is further derivation.)
The conversion is obviously doable, since when you call
a virtual function on an A*, and the function is implemented
in D, the compiler must do it, but the additional overhead was
considered inappropriate for static_cast. (Presumably, the
only reason for using static_cast in such cases is
optimization, since dynamic_cast is normally the preferred
solution. So when static_cast is likely to be as expensive as
dynamic_cast anyway, why support it.)
Because if the object was actually of type E (derived from D), the location of A subobject relative to D subobject could be different than if the object is actually D.
It actually already happens if you consider instead casting from A to C. When you allocate C, it has to contain instance of A and it lives at some specific offset. But when you allocate D, the C subobject refers to the instance of A that came with B, so it's offset is different.

How is C++'s multiple inheritance implemented?

Single inheritance is easy to implement. For example, in C, the inheritance can be simulated as:
struct Base { int a; }
struct Descendant { Base parent; int b; }
But with multiple inheritance, the compiler has to arrange multiple parents inside newly constructed class. How is it done?
The problem I see arising is: should the parents be arranged in AB or BA, or maybe even other way? And then, if I do a cast:
SecondBase * base = (SecondBase *) &object_with_base1_and_base2_parents;
The compiler must consider whether to alter or not the original pointer. Similar tricky things are required with virtuals.
The following paper from the creator of C++ describes a possible implementation of multiple inheritance:
Multiple Inheritance for C++ - Bjarne Stroustrup
There was this pretty old MSDN article on how it was implemented in VC++.
And then, if I do a cast:
SecondBase base = (SecondBase *) object_with_base1_and_base2_parents;
The compiler must consider whether to alter or not the original pointer. Similar tricky things with virtuals.
With non-virutal inheritance this is less tricky than you might think - at the point where the cast is compiled, the compiler knows the exact layout of the derived class (after all, the compiler did the layout). Usually all that happens is a fixed offset (which may be zero for one of the base classes) is added/subtracted from the derived class pointer.
With virutal inheritance it is maybe a bit more complex - it may involve grabbing an offset from a vtbl (or similar).
Stan Lippman's book, "Inside the C++ Object Model" has very good descriptions of how this stuff might (and often actually does) work.
Parents are arranged in the order that they're specified:
class Derived : A, B {} // A comes first, then B
class Derived : B, A {} // B comes first, then A
Your second case is handled in a compiler-specific manner. One common method is using pointers that are larger than the platform's pointer size, to store extra data.
This is an interesting issue that really isn't C++ specific. Things get more complex also when you have a language with multiple dispatch as well as multiple inheritance (e.g. CLOS).
People have already noted that there are different ways to approach the problem. You might find reading a bit about Meta-Object Protocols (MOPs) interesting in this context...
Its entirely down to the compiler how it is done, but I beleive its generally done througha heirarchical structure of vtables.
I have performed simple experiment:
class BaseA { int a; };
class BaseB { int b; };
class Descendant : public BaseA, BaseB {};
int main() {
Descendant d;
BaseB * b = (BaseB*) &d;
Descendant *d2 = (Descendant *) b;
printf("Descendant: %p, casted BaseB: %p, casted back Descendant: %p\n", &d, b, d2);
}
Output is:
Descendant: 0xbfc0e3e0, casted BaseB: 0xbfc0e3e4, casted back Descendant: 0xbfc0e3e0
It's good to realise that static casting does not always mean "change the type without touching the content". (Well, when data types do not fit each other, then there will be also an interference into content, but it's different situation IMO).