Downcast in a diamond hierarchy - c++

Why static_cast cannot downcast from a virtual base ?
struct A {};
struct B : public virtual A {};
struct C : public virtual A {};
struct D : public B, public C {};
int main()
{
D d;
A& a = d;
D* p = static_cast<D*>(&a); //error
}
g++ 4.5 says:
error: cannot convert from base ‘A’ to derived type ‘D’ via virtual base ‘A’
The solution is to use dynamic_cast ? but why. What is the rational ?
-- edit --
Very good answers below. No answers detail exactly how sub objects and vtables end up to be ordered though. The following article gives some good examples for gcc:
http://www.phpcompiler.org/articles/virtualinheritance.html#Downcasting

The obvious answer is: because the standard says so. The
motivation behind this in the standard is that static_cast
should be close to trivial—at most, a simple addition or
subtraction of a constant to the pointer. Where s the downcast
to a virtual base would require more complicated code: perhaps
even with an additional entry in the vtable somewhere. (It
requires something more than constants, since the position of
D relative to A may change if there is further derivation.)
The conversion is obviously doable, since when you call
a virtual function on an A*, and the function is implemented
in D, the compiler must do it, but the additional overhead was
considered inappropriate for static_cast. (Presumably, the
only reason for using static_cast in such cases is
optimization, since dynamic_cast is normally the preferred
solution. So when static_cast is likely to be as expensive as
dynamic_cast anyway, why support it.)

Because if the object was actually of type E (derived from D), the location of A subobject relative to D subobject could be different than if the object is actually D.
It actually already happens if you consider instead casting from A to C. When you allocate C, it has to contain instance of A and it lives at some specific offset. But when you allocate D, the C subobject refers to the instance of A that came with B, so it's offset is different.

Related

static_cast to access members of another base class returns wrong if own base is empty [duplicate]

Consider the following code:
struct Base {};
struct Derived : public virtual Base {};
void f()
{
Base* b = new Derived;
Derived* d = static_cast<Derived*>(b);
}
This is prohibited by the standard ([n3290: 5.2.9/2]) so the code does not compile, because Derived virtually inherits from Base. Removing the virtual from the inheritance makes the code valid.
What's the technical reason for this rule to exist?
The technical problem is that there's no way to work out from a Base* what the offset is between the start of the Base sub-object and the start of the Derived object.
In your example it appears OK, because there's only one class in sight with a Base base, and so it appears irrelevant that the inheritance is virtual. But the compiler doesn't know whether someone defined another class Derived2 : public virtual Base, public Derived {}, and is casting a Base* pointing at the Base subobject of that. In general[*], the offset between the Base subobject and the Derived subobject within Derived2 might not be the same as the offset between the Base subobject and the complete Derived object of an object whose most-derived type is Derived, precisely because Base is virtually inherited.
So there's no way to know the dynamic type of the complete object, and different offsets between the pointer you've given the cast, and the required result, depending what that dynamic type is. Hence the cast is impossible.
Your Base has no virtual functions and hence no RTTI, so there certainly is no way to tell the type of the complete object. The cast is still banned even if Base does have RTTI (I don't immediately know why), but I guess without checking that a dynamic_cast is possible in that case.
[*] by which I mean, if this example doesn't prove the point then keep adding more virtual inheritance until you find a case where the offsets are different ;-)
static_cast can perform only those casts where memory layout between the classes is known at compile-time. dynamic_cast can check information at run-time, which allows to more accurately check for cast correctness, as well as read run-time information regarding the memory layout.
Virtual inheritance puts a run-time information into each object which specifies what is the memory layout between the Base and Derived. Is one right after another or is there an additional gap? Because static_cast cannot access such information, the compiler will act conservatively and just give a compiler error.
In more detail:
Consider a complex inheritance structure, where - due to multiple inheritance - there are multiple copies of Base. The most typical scenario is a diamond inheritance:
class Base {...};
class Left : public Base {...};
class Right : public Base {...};
class Bottom : public Left, public Right {...};
In this scenario Bottom consists of Left and Right, where each has its own copy of Base. The memory structure of all the above classes is known at compile time and static_cast can be used without a problem.
Let us now consider the similar structure but with virtual inheritance of Base:
class Base {...};
class Left : public virtual Base {...};
class Right : public virtual Base {...};
class Bottom : public Left, public Right {...};
Using the virtual inheritance ensures that when Bottom is created, it contains only one copy of Base that is shared between object parts Left and Right. The layout of Bottom object can be for example:
Base part
Left part
Right part
Bottom part
Now, consider that you cast Bottom to Right (that is a valid cast). You obtain a Right pointer to an object that is in two pieces: Base and Right have a memory gap in between, containing the (now-irrelevant) Left part. The information about this gap is stored at run-time in a hidden field of Right (typically referred to as vbase_offset). You can read the details for example here.
However, the gap would not exist if you would just create a standalone Right object.
So, if I give you just a pointer to Right you do not know at compile time if it is a standalone object, or a part of something bigger (e.g. Bottom). You need to check the run-time information to properly cast from Right to Base. That is why static_cast will fail and dynamic_cast will not.
Note on dynamic_cast:
While static_cast does not use run-time information about the object, dynamic_cast uses and requires it to exist! Thus, the latter cast can be used only on those classes which contain at least one virtual function (e.g. a virtual destructor)
Fundamentally, there's no real reason, but the intention is that
static_cast be very cheap, involving at most an addition or a
subtraction of a constant to the pointer. And there's no way to
implement the cast you want that cheaply; basically, because the
relative positions of Derived and Base within the object may change
if there is additional inheritance, the conversion would require a good
deal of the overhead of dynamic_cast; the members of the committee
probably thought that this defeats the reasons for using static_cast
instead of dynamic_cast.
Consider the following function foo:
#include <iostream>
struct A
{
int Ax;
};
struct B : virtual A
{
int Bx;
};
struct C : B, virtual A
{
int Cx;
};
void foo( const B& b )
{
const B* pb = &b;
const A* pa = &b;
std::cout << (void*)pb << ", " << (void*)pa << "\n";
const char* ca = reinterpret_cast<const char*>(pa);
const char* cb = reinterpret_cast<const char*>(pb);
std::cout << "diff " << (cb-ca) << "\n";
}
int main(int argc, const char *argv[])
{
C c;
foo(c);
B b;
foo(b);
}
Although not really portable, this function shows us the "offset" of A and B. Since the compiler can be quite liberal in placing the A subobject in case of inheritance (also remember that the most derived object calls the virtual base ctor!), the actual placement depends on the "real" type of the object. But since foo only gets a ref to B, any static_cast (which works at compile time by at most applying some offset) is bound to fail.
ideone.com (http://ideone.com/2qzQu) outputs for this:
0xbfa64ab4, 0xbfa64ac0
diff -12
0xbfa64ac4, 0xbfa64acc
diff -8
static_cast is a compile time construct. it checks for the validity of cast at compile time and gives an compilation error if invalid cast.
virtualism is a runtime phenomenon.
Both can't go together.
C++03 Standard §5.2.9/2 and §5.2.9/9 ar relevant in this case.
An rvalue of type “pointer to cv1 B”, where B is a class type, can be converted to an rvalue of type “pointer to cv2 D”, where D is a class derived (clause 10) from B, if a valid standard conversion from “pointer to D” to “pointer to B” exists (4.10), cv2 is the same cv-qualification as, or greater cv-qualification than, cv1, and B is not a virtual base class of D. The null pointer value (4.10) is converted to the null pointer value of the destination type. If the rvalue of type “pointer to cv1 B” points to a B that is actually a sub-object of an object of type D, the resulting pointer points to the enclosing object of type D. Otherwise, the result of the cast is undefined.
I suppose, this is due to classes with virtual inheritance having different memory layout. The parent has to be shared between children, therefore only one of them could be laid out continuously. That means, you are not guaranteed to be able to separate a continuous area of memory to treat it as a derived object.

Static cast allows conversion of object pointers but not integers

Why does static cast allow an upcast or downcast between pointers to objects derived or base as below, but in the case of casting between a char* and int* or vice versa int* to char*, there is a compilation error?
Casting between different pointers to objects is just as bad I believe.
// compiles fine
class Base {};
class Derived: public Base {};
Base * a = new Base;
Derived * bc = static_cast<Derived*>(a);
// Gives an invalid static cast error during compilation
char charVar = 8;
char* charPtr = &charVar;
int* intPtr = static_cast<int*>(charPtr);
C++ is strongly performance oriented. So as long as there is some use case for that you can gain performance, C++ will allow you to do it. Consider std::vector: Sure, there is the safe element access via function at, which does range checking for you. But if you know that your indices are in range (e. g. in a for loop), these range checks are just dead weight. So you additionally get the (less safe) operator[] which just omits these checks.
Similarly, if you have a pointer of type Base, it could, in reality, point to an object of type Derived. If in doubt, you would dynamic_cast from Base* to Derived*. But this comes with some overhead. But if you know 100% for sure (by whatever means...) what the sub class actually is, would you want this overhead? As there is a natural (even implicit!) way from Derived* to Base*, we want to have some low-cost way back.
On the other hand, there is no such natural cast between pointers of totally unrelated types (such as char and int or two unrelated classes) and thus no such low-cost way back (compared to dynamic_cast, which isn't available either, of course). Only way to transform in between is reinterpret_cast.
Actually, reinterpret_cast comes with no cost either, it just interprets the pointer as a different type – with all risks! And a reinterpret_cast even can fail, if instead a static_cast would have been required (right to prevent the question "why not just always use ..."):
class A { int a; };
class B { };
class C : public A, public B { };
B* b = new C();
C* c = reinterpret_cast<C*>(b); // FAILING!!!
From view of memory layout, C looks like this (even if hidden away from you):
class C
{
A baseA;
B baseB; // this is what pointer b will point to!
};
Obviously, we'll get an offset when casting between C* and B* (either direction), which is considered by both static_cast and dynamic_cast, but not by reinterpret_cast...
Why does static cast allow an upcast ...
There is no reason to prevent upcast. In fact, a derived pointer is even implicitly convertible to a base pointer - no cast is necessary (except in convoluted cases where there are multiple bases of same type). A derived class object always contains a base class sub object.
Upcasting is particularly useful because it allows runtime polymorphism through the use of virtual functions.
or downcast between pointers to objects derived or base as below
A base pointer may point to a base sub object of a derived object, as a consequence of an upcast. Like for example here:
Derived d;
Base *b = &d;
There are cases where you may want to gain access to members of the derived object that you know is being pointed at. Static cast makes that possible without the cost of run time type information.
It is not possible in general for the compiler to find out (at compile time) the concrete type of the pointed object (i.e. whether the pointer points to a sub object and if it does, what is the type of the container object). It is the responsibility of the programmer to ensure that the requirements of the cast are met. If the programmer cannot prove the correctness, then writing the static cast is a bug.
That's because what you're trying to do is a reinterpretation - for which there's a reinterpret_cast<> operator. static_cast<> for pointers is only used for down-casting:
If new_type is a pointer or reference to some class D and the type of expression is a pointer or reference to its non-virtual base B, static_cast performs a downcast. This downcast is ill-formed if B is ambiguous, inaccessible, or virtual base (or a base of a virtual base) of D. Such static_cast makes no runtime checks to ensure that the object's runtime type is actually D, and may only be used safely if this precondition is guaranteed by other means, such as when implementing static polymorphism. Safe downcast may be done with dynamic_cast.
See:
When should static_cast, dynamic_cast, const_cast and reinterpret_cast be used?
for a detailed discussion of when to use each casting operator.

Virtual multiple inheritance and casting

I tried creating a class that inherits from multiple classes as followed, getting a "diamond"
(D inherits from B and C. B and C both inherits from A virtually):
A
/ \
B C
\ /
D
Now, I have a container with a linked list that holds pointers to the base class (A).
When I tried doing explicit casting to a pointer (after typeid check) I got the following error:
"cannot convert a pointer to base class "A" to pointer to derived class "D" -- base class is virtual"
But when I use dynamic casting it seems to work just fine.
Can anyone please explain to me why I have to use dynamic casting and why does the virtual inheritance causes this error?
"Virtual" always means "determined at runtime". A virtual function is located at runtime, and a virtual base is also located at runtime. The whole point of virtuality is that the actual target in question is not knowable statically.
Therefore, it is impossible to determine the most-derived object of which you are given a virtual base at compile time, since the relationship between the base and the most-derived object is not fixed. You have to wait until you know what the actual object is before you can decide where it is in relation to the base. That's what the dynamic cast is doing.
When I tried doing explicit casting to a pointer (after typeid check)
After a successful typeid(x) == typeid(T) you know the dynamic type of x, and you could in theory avoid any other runtime checking involved in dynamic_cast at that point. OTOH, the compiler is not required to do this kind of static analysis.
static_cast<T&>(x) does not convey to the compiler the knowledge that the dynamic type of x really is T: the precondition is weaker (that a T object has x as a subobject base class).
C++ could provide a static_exact_cast<T&>(x) which is only valid if x designates an object of dynamic type T (and not some type derived from T, unlike static_cast<T&>). This hypothetical static_exact_cast<T&>(x), by assuming the dynamic type of x is T, would skip any runtime check and compute the correct address from knowledge of T object layout: because in
D d;
B &br = d;
no runtime offset computation is necessary, in static_exact_cast<D&>(br) the reverse adjustment would involve no runtime offset computation.
Given
B &D_to_B (D &dr) {
return dr;
}
a runtime offset computation is needed in D_to_B (except if whole program analysis shows that no class derived from D has different offset of base class A); given
struct E1 : virtual A
struct E2 : virtual A
struct F : E1, D, E2
the layout of the D subobject of F will be different from the layout of a D complete object: the A subobject will be at a different offset. The offset needed by D_to_B will be given by the vtable of D (or stored directly in the object); it means that D_to_B will not just involve a constant offset as a simple "static" upcast (before entering the constructor of the object, the vptr is not set up so such casting cannot work; be careful with up casts in constructor init list).
And BTW, D_to_B (d) is not different from a static_cast<B&> (d), so you see that the runtime computation of an offset can be done inside a static_cast.
Consider the following code compile naively (assuming no fancy analysis showing that ar has dynamic type F):
F f;
D &dr = f; // static offset
A &ar = dr; // runtime offset
D &dr2 = dynamic_cast<D&>(ar);
Finding the A base class subject from a reference to D (a lvalue of unknown dynamic type) requires a runtime check of the vtable (or equivalent). Going back to the D subobject requires a non trivial computation:
finding out the address of the complete object (which happens to be of type F), using the vtable of A
finding out the address of the unambiguous and publicly derived D base class subobject of f, using the vtable again
This is not trivial as dynamic_cast<D&>(ar) statically knows nothing specific about F (the layout of F, the layout of the vtable of F); everything is fetched from the vtable. All dynamic_cast knows is that there is a derived class of A and the vtable has all the information.
Of course, there is no static_exact_cast<> in C++, so you have to use dynamic_cast, with the associated runtime checks; dynamic_cast is a complex function, but the complexity covers the base class cases; when the dynamic type is given to dynamic_cast, the base classes tree walking is avoided, and the test is fairly simple.
Conclusion:
Either you name the dynamic type in dynamic_cast<T> and dynamic_cast is fast and simple anyway, or you don't and the complexity of dynamic_cast is really needed.

Can't downcast because class is not polymorphic?

Is it possible to have inheritance with no virtual methods? The compiler is saying that the following code is not polymorphic.
Example:
class A {
public:
int a;
int getA(){return a;};
}
class B : public A {
public:
int b;
int getB(){return b;};
}
In another class we are trying to downcast from an A object to a B object:
A *a = ...;
B *b = dynamic_cast<B*>(a)
but this gives the following compile-time error:
cannot dynamic_cast ... (source type is not polymorphic)
Syntax errors non-withstanding, you cannot dynamic_cast a non-polymorphic type. static_cast is the cast you would use in this case, if you know that it is in fact an object of the target type.
The reason why: static_cast basically has the compiler perform a check at compile time "Could the input be cast to the output?" This is can be used for cases where you are casting up or down an inheritance hierarchy of pointers (or references). But the check is only at compile time, and the compiler assumes you know what you are doing.
dynamic_cast can only be used in the case of a pointer or reference cast, and in addition to the compile time check, it does an additional run time check that the cast is legal. It requires that the class in question have at least 1 virtual method, which allows the compiler (if it supports RTTI) to perform this additional check. However, if the type in question does not have any virtual methods, then it cannot be used.
The simplest case, and probably worthwhile if you're passing pointers around like this, is to consider making the base class's destructor virtual. In addition to allowing you to use dynamic cast, it also allows the proper destructors to be called when a base class pointer is deleted.
You need at least one virtual method in a class for run-time type information (RTTI) to successfully apply dynamic_cast operator.
just make A destructor virtual (always do for any class just for safety).
yes, dynamic_cast for non-polymorphic types are not allowed. The base class shall have at least one virtual method. Only then that class can be called as polymorphic.
This article explains a similar example: http://www.cplusplus.com/doc/tutorial/typecasting/
A a;
B *b = dynamic_cast<B*>(a)
Here a is an object and b is a pointer.
Actually, upcasting and downcasting are both allowed in C++. But when using downcasting, 2 things should be pay attention to:
The superclass should has at least one virtual method.
Since superclass is "smaller" than subclass, one should use memory object carefully.

Why can't static_cast be used to down-cast when virtual inheritance is involved?

Consider the following code:
struct Base {};
struct Derived : public virtual Base {};
void f()
{
Base* b = new Derived;
Derived* d = static_cast<Derived*>(b);
}
This is prohibited by the standard ([n3290: 5.2.9/2]) so the code does not compile, because Derived virtually inherits from Base. Removing the virtual from the inheritance makes the code valid.
What's the technical reason for this rule to exist?
The technical problem is that there's no way to work out from a Base* what the offset is between the start of the Base sub-object and the start of the Derived object.
In your example it appears OK, because there's only one class in sight with a Base base, and so it appears irrelevant that the inheritance is virtual. But the compiler doesn't know whether someone defined another class Derived2 : public virtual Base, public Derived {}, and is casting a Base* pointing at the Base subobject of that. In general[*], the offset between the Base subobject and the Derived subobject within Derived2 might not be the same as the offset between the Base subobject and the complete Derived object of an object whose most-derived type is Derived, precisely because Base is virtually inherited.
So there's no way to know the dynamic type of the complete object, and different offsets between the pointer you've given the cast, and the required result, depending what that dynamic type is. Hence the cast is impossible.
Your Base has no virtual functions and hence no RTTI, so there certainly is no way to tell the type of the complete object. The cast is still banned even if Base does have RTTI (I don't immediately know why), but I guess without checking that a dynamic_cast is possible in that case.
[*] by which I mean, if this example doesn't prove the point then keep adding more virtual inheritance until you find a case where the offsets are different ;-)
static_cast can perform only those casts where memory layout between the classes is known at compile-time. dynamic_cast can check information at run-time, which allows to more accurately check for cast correctness, as well as read run-time information regarding the memory layout.
Virtual inheritance puts a run-time information into each object which specifies what is the memory layout between the Base and Derived. Is one right after another or is there an additional gap? Because static_cast cannot access such information, the compiler will act conservatively and just give a compiler error.
In more detail:
Consider a complex inheritance structure, where - due to multiple inheritance - there are multiple copies of Base. The most typical scenario is a diamond inheritance:
class Base {...};
class Left : public Base {...};
class Right : public Base {...};
class Bottom : public Left, public Right {...};
In this scenario Bottom consists of Left and Right, where each has its own copy of Base. The memory structure of all the above classes is known at compile time and static_cast can be used without a problem.
Let us now consider the similar structure but with virtual inheritance of Base:
class Base {...};
class Left : public virtual Base {...};
class Right : public virtual Base {...};
class Bottom : public Left, public Right {...};
Using the virtual inheritance ensures that when Bottom is created, it contains only one copy of Base that is shared between object parts Left and Right. The layout of Bottom object can be for example:
Base part
Left part
Right part
Bottom part
Now, consider that you cast Bottom to Right (that is a valid cast). You obtain a Right pointer to an object that is in two pieces: Base and Right have a memory gap in between, containing the (now-irrelevant) Left part. The information about this gap is stored at run-time in a hidden field of Right (typically referred to as vbase_offset). You can read the details for example here.
However, the gap would not exist if you would just create a standalone Right object.
So, if I give you just a pointer to Right you do not know at compile time if it is a standalone object, or a part of something bigger (e.g. Bottom). You need to check the run-time information to properly cast from Right to Base. That is why static_cast will fail and dynamic_cast will not.
Note on dynamic_cast:
While static_cast does not use run-time information about the object, dynamic_cast uses and requires it to exist! Thus, the latter cast can be used only on those classes which contain at least one virtual function (e.g. a virtual destructor)
Fundamentally, there's no real reason, but the intention is that
static_cast be very cheap, involving at most an addition or a
subtraction of a constant to the pointer. And there's no way to
implement the cast you want that cheaply; basically, because the
relative positions of Derived and Base within the object may change
if there is additional inheritance, the conversion would require a good
deal of the overhead of dynamic_cast; the members of the committee
probably thought that this defeats the reasons for using static_cast
instead of dynamic_cast.
Consider the following function foo:
#include <iostream>
struct A
{
int Ax;
};
struct B : virtual A
{
int Bx;
};
struct C : B, virtual A
{
int Cx;
};
void foo( const B& b )
{
const B* pb = &b;
const A* pa = &b;
std::cout << (void*)pb << ", " << (void*)pa << "\n";
const char* ca = reinterpret_cast<const char*>(pa);
const char* cb = reinterpret_cast<const char*>(pb);
std::cout << "diff " << (cb-ca) << "\n";
}
int main(int argc, const char *argv[])
{
C c;
foo(c);
B b;
foo(b);
}
Although not really portable, this function shows us the "offset" of A and B. Since the compiler can be quite liberal in placing the A subobject in case of inheritance (also remember that the most derived object calls the virtual base ctor!), the actual placement depends on the "real" type of the object. But since foo only gets a ref to B, any static_cast (which works at compile time by at most applying some offset) is bound to fail.
ideone.com (http://ideone.com/2qzQu) outputs for this:
0xbfa64ab4, 0xbfa64ac0
diff -12
0xbfa64ac4, 0xbfa64acc
diff -8
static_cast is a compile time construct. it checks for the validity of cast at compile time and gives an compilation error if invalid cast.
virtualism is a runtime phenomenon.
Both can't go together.
C++03 Standard §5.2.9/2 and §5.2.9/9 ar relevant in this case.
An rvalue of type “pointer to cv1 B”, where B is a class type, can be converted to an rvalue of type “pointer to cv2 D”, where D is a class derived (clause 10) from B, if a valid standard conversion from “pointer to D” to “pointer to B” exists (4.10), cv2 is the same cv-qualification as, or greater cv-qualification than, cv1, and B is not a virtual base class of D. The null pointer value (4.10) is converted to the null pointer value of the destination type. If the rvalue of type “pointer to cv1 B” points to a B that is actually a sub-object of an object of type D, the resulting pointer points to the enclosing object of type D. Otherwise, the result of the cast is undefined.
I suppose, this is due to classes with virtual inheritance having different memory layout. The parent has to be shared between children, therefore only one of them could be laid out continuously. That means, you are not guaranteed to be able to separate a continuous area of memory to treat it as a derived object.