c++ casting derived to void* and restoring base


I have:
class A {...};
class B : public A {...};
class C : public B {...};
Then I'm storing a C instance as void*:
C *instance = new C();
void *pC = instance;
Is it OK to do this:
B *pB = reinterpret_cast<B*>(pC);
Or do I have to cast to C* only?
PS: I have more classes derived from B in my program, and I'm not sure if it's OK to cast the way I'm doing (to B*).
Why void*: I'm using the void* userdata field of the physical body class in the Box2D engine. I can't store my class there any other way.

I would suggest casting it back to the original C* and then doing a dynamic_cast to B*, in order to follow the rules of C++. That said, you should avoid casting to void* in the first place and use a base pointer (A*) instead.
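One way to combine both suggestions when an API only offers a void* slot is to always store the pointer as A*, restore it as A*, and only then ask for B*. The sketch below is illustrative only: the class bodies, `Other`, `to_userdata` and `userdata_as_b` are hypothetical names, and A is assumed to be polymorphic so that dynamic_cast is available.

```cpp
#include <cassert>

// Hypothetical stand-ins for the classes in the question.
struct A { virtual ~A() = default; };
struct B : A { int b_value = 7; };
struct C : B {};
struct Other : A {};  // derived from A but not from B

// Always store the pointer as A*, so the void* round trip uses one exact type.
inline void* to_userdata(A* object) { return static_cast<void*>(object); }

// Restore as A* (the type that was stored), then let dynamic_cast look for B.
inline B* userdata_as_b(void* userdata) {
    A* base = static_cast<A*>(userdata);
    return dynamic_cast<B*>(base);  // nullptr if the object is not a B
}

// Self-check: a C is found as a B, an Other is rejected.
inline bool round_trip_works() {
    C c;
    Other other;
    B* found = userdata_as_b(to_userdata(&c));
    assert(found != nullptr);
    return found == static_cast<B*>(&c)
        && found->b_value == 7
        && userdata_as_b(to_userdata(&other)) == nullptr;
}
```

Because every object goes into the slot as A*, it does not matter which of the classes derived from B (or from A) was actually stored.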

The general rule is that when you cast from a pointer to void * you should always cast back to the one you originally came from.
Casting to void* is sometimes a necessary evil because of old APIs.
And sometimes it is done by design to create a "light" template. A light template is where you write code that handles a collection of pointers to objects, which are all handled in the same way, and this prevents code having to be generated for every type.
Around this code you have a strongly typed template that does the simple thing of casting back and forth (likely to be inlined anyway) so the users get strongly typed code but the implementation is less bloated.
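A minimal sketch of such a "light" template, assuming a simple stack of pointers as the container. `PointerStackCore` and `PointerStack` are made-up names for illustration, not taken from any library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Untyped core: compiled once, knows nothing about the element type.
class PointerStackCore {
public:
    void push(void* p) { items_.push_back(p); }
    void* pop() {
        assert(!items_.empty());
        void* p = items_.back();
        items_.pop_back();
        return p;
    }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<void*> items_;
};

// Thin typed wrapper: every void* that goes in was a T*, and comes back as a T*.
template <class T>
class PointerStack {
public:
    void push(T* p) { core_.push(static_cast<void*>(p)); }
    T* pop() { return static_cast<T*>(core_.pop()); }
    std::size_t size() const { return core_.size(); }
private:
    PointerStackCore core_;
};

// Pops the last pushed pointer and combines it with the remaining size.
inline int light_template_demo() {
    int a = 1, b = 2;
    PointerStack<int> stack;
    stack.push(&a);
    stack.push(&b);
    int* top = stack.pop();
    return *top * 10 + static_cast<int>(stack.size());
}
```

The wrapper guarantees that the void* is always cast back to the same type it came from, which is exactly the rule stated above.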

For the code you've shown it should be OK to cast directly to B*, and you can use static_cast for this. However, in general, pointers to base and derived classes might differ, so you have to cast to the original type first and only then convert to the pointer to base.
Edit: that should be ok in practice for single inheritance, but in general it's UB.

This will work fine. You'll only get access to methods of A and B in this case though. (You'll simply "hide" methods of C.)

Casting to void* is always a bad idea and so is doing a dynamic_cast. If you have to, it mostly means your code requires re-factoring as it is an object design flaw.

Related

Is casting Derived → Base wrong? What's the alternative?

Context
My goal is to have a base container class that contains and manipulates several base class objects, and then a derived container class that contains and manipulates several derived class objects. On the advice of this answer, I attempted to do this by having each contain an array of pointers (a Base** and a Derived**), and cast from Derived** to Base** when initializing the base container class.
However, I ran into a problem – despite compiling just fine, when manipulating the contained objects, I'd have segfaults or the wrong methods would be called.
Problem
I have boiled the problem down to the following minimal case:
#include <iostream>
class Base1 {
public:
virtual void doThing1() {std::cout << "Called Base1::doThing1" << std::endl;}
};
class Base2 {
public:
virtual void doThing2() {std::cout << "Called Base2::doThing2" << std::endl;}
};
// Whether this inherits "virtual public" or just "public" makes no difference.
class Derived : virtual public Base1, virtual public Base2 {};
int main() {
Derived derived;
Derived* derivedPtrs[] = {&derived};
((Base2**) derivedPtrs)[0]->doThing2();
}
You may expect this to print "Called Base2::doThing2", but…
$ g++ -Wall -Werror main.cpp -o test && ./test
Called Base1::doThing1
Indeed – the code calls Base2::doThing2, but Base1::doThing1 ends up being called. I've also had this segfault with more complex classes, so I assume it's address-related hijinks (perhaps vtable-related – the error doesn't seem to occur without virtual methods). You can run it here, and see the assembly it compiles to here.
You can see my actual structure here – it's more complex, but it ties it into the context and explains why I need something along the lines of this.
Why does the Derived**→Base** cast not work properly when Derived*→Base* does, and more importantly, what is the right way to handle an array of derived objects as an array of base objects (or, failing that, another way to make a container class that can contain multiple derived objects)?
I can't index before upcasting (((Base2*) derivedPtrs[0])->doThing2()), I'm afraid, since in the full code that array is a class member – and I'm not sure it's a good idea (or even possible) to cast manually in every place contained objects are used in the container classes. Correct me if that is the way to handle this, though.
(I don't think it makes a difference in this case, but I'm in an environment where std::vector is unavailable.)
Edit: Solution
Many of the answers suggested that casting each pointer individually is the only way to have an array that can contain derived objects – and that does indeed seem to be the case. For my particular use case, though, I managed to solve the problem using templates! By supplying a type parameter for what the container class is supposed to contain, instead of having to contain an array of derived objects, the type of the array can be set to the derived type at compile time (e.g. BaseContainer<Derived> container(length, arrayOfDerivedPtrs);).
Here's a version of the broken "actual structure" code above, fixed with templates.

There are many things that make this code pretty awful and contribute to this issue:
Why are we dealing with two-star types in the first place? If std::vector doesn't exist, why not write your own?
Don't use C-style casts. You can cast pointers to completely unrelated types into each other and the compiler is not allowed to stop you (coincidentally, this is exactly what is happening here). Use static_cast/dynamic_cast instead.
Let's assume we had std::vector, for ease of notation. You are trying to cast a std::vector<Derived*> to std::vector<Base*>. Those are unrelated types (the same is true for Derived** and Base**), and casting one to the other is not legal in any way.
Pointer casts from/to derived are not necessarily trivial. If you have struct X : A, B {}, then a pointer to the B base will be different from a pointer to the A base† (and with a vtable in play, possibly also different from a pointer to X). They must be, because the (sub)objects cannot reside at the same memory address. The compiler will adjust the pointer value when you cast the pointer. This of course does not/cannot happen for each individual pointer if you (try to) cast an array of pointers.
If you have an array of pointers to Derived and want to get an array of those pointers to Base, then you have to manually cast each one. Since the pointer values will generally be different in both arrays, there is no way to "reuse" the same array.
†(Unless the conditions for empty base optimization are fulfilled, which is not the case for you).
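The element-by-element conversion described above could look roughly like this. It is a sketch with made-up types A, B and X (both bases non-empty, so empty base optimization does not apply) and a hypothetical helper `to_base_pointers`.

```cpp
#include <cassert>
#include <cstddef>

struct A { int a = 1; };
struct B { int b = 2; };
struct X : A, B {};

// Convert each pointer individually; the compiler adjusts every value.
inline void to_base_pointers(X* const* derived, B** out, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = derived[i];  // implicit X* -> B* conversion, offset applied
    }
}

inline bool per_element_conversion_works() {
    X x1, x2;
    x2.b = 5;
    X* derived[] = {&x1, &x2};
    B* bases[2] = {nullptr, nullptr};
    to_base_pointers(derived, bases, 2);

    // The B subobject cannot share an address with the A subobject.
    const void* a_addr = static_cast<A*>(&x1);
    const void* b_addr = bases[0];
    assert(a_addr != b_addr);
    return bases[0]->b == 2 && bases[1]->b == 5;
}
```

The second array is needed precisely because the stored pointer values differ from the ones in the array of X*.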

As others have said, the problem is that you are not letting the compiler do its job of adjusting the pointers. Suppose the layout of Derived in memory is something like this (not guaranteed by the standard, just a possible implementation):
| vtable_Base1 | Base1 | vtable_Base2 | Base2 | vtable_Derived | Derived |
Then &derived points to the start of the object, when you normally do
Base2* base = static_cast<Base2*>(&derived);
the compiler knows the offset which Base2 structure has inside Derived type and adjusts the address to point to the start of it.
If, instead, you cast an array of pointers directly the compiler is coercing the type, assuming that your array is storing pointers to Base2 already, but without adjusting them.
A dirty hack which may work or not in your situation is having a method which returns the pointer to itself, eg:
class Base2 {
public:
Base2* base2() { return this; }
};
so that you can do derivedPtrs[0]->base2()->doThing2().

It doesn't work for the same reason this doesn't work:
struct Base {};
struct Derived : Base { int i; };
int main() {
Derived d[6];
Derived* d2 = d;
Base** b = &d2; // ERROR!
}
Your C-style cast is bad practice because it didn't warn you of the error. Just don't do that in your code. Your C-style cast was actually a reinterpret_cast in disguise, which is totally wrong in this case.
But why can't you cast an array of derived into an array of base? Simple: they have different layouts.
You see, when you iterate over an array of a type, the elements of the array are contiguous in memory. The Derived class may have a size of, let's say, 24 bytes, and Base a size of 8:
Derived d[4];
------------------------------------------------------
| D1 | D2 | D3 | D4 |
------------------------------------------------------
Base b[4];
---------------------
| B1 | B2 | B3 | B4 |
---------------------
As you can see, Derived[4] and Base[4] are different types with different layouts.
Then what can you do?
There are many solutions, in fact. The easiest would be to create a new array of pointers to base and convert each derived pointer to a base pointer. You have to adjust the pointer to each object anyway.
It would look like this:
// Needs <algorithm>, <iterator> and <vector>; std::size requires C++17.
std::vector<Base*> bases;
bases.reserve(std::size(derived_arr));
std::transform(
std::begin(derived_arr), std::end(derived_arr),
std::back_inserter(bases),
[](Derived* d) {
// The upcast adjusts each pointer. With virtual
// inheritance the offset is looked up at run time;
// an implicit conversion or static_cast would also work.
return dynamic_cast<Base*>(d);
}
);
The other solution, which works in place in memory, would be to create your own type of iterator that does the cast in operator* and operator->. This is a bit harder to do, but it saves you an allocation at the cost of slightly slower iteration.
On top of all that, and this may be unrelated, I would advise against virtual inheritance. It is not good practice and in my experience has caused more pain than anything else. I would suggest using the adapter pattern and wrapping non-polymorphic types instead.
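As a rough illustration of the in-place idea, here is a simplified variant that converts on indexed access instead of through a full iterator. `BasePointerView` is a hypothetical name, and the classes only mirror the shape of the ones in the question; no std::vector is required.

```cpp
#include <cassert>
#include <cstddef>

// Non-owning view over an existing array of Derived*; converts on access.
template <class Base, class Derived>
class BasePointerView {
public:
    BasePointerView(Derived* const* data, std::size_t size)
        : data_(data), size_(size) {}

    // The Derived* -> Base* conversion happens here, one element at a time.
    Base* operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }
    std::size_t size() const { return size_; }

private:
    Derived* const* data_;
    std::size_t size_;
};

struct Base1 { virtual ~Base1() = default; virtual int id1() const { return 1; } };
struct Base2 { virtual ~Base2() = default; virtual int id2() const { return 2; } };
struct Derived : virtual Base1, virtual Base2 {};

inline int view_demo() {
    Derived d;
    Derived* ptrs[] = {&d};
    BasePointerView<Base2, Derived> view(ptrs, 1);
    return view[0]->id2();  // calls Base2::id2 through an adjusted pointer
}
```

The array itself is never reinterpreted; only individual pointers are converted, so the compiler gets to apply the adjustment each time.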

When you cast from Derived* to Base* the compiler will adjust the value. When you cast Derived** to Base** you defeat that.
This is a good reason to always use static_cast. Example error with your code changed:
test.cpp:19:5: error: static_cast from 'Derived **' to 'Base2 **' is not
allowed
static_cast<Base2**>(derivedPtrs)[0]->doThing2();

(Base2 **) is in fact a reinterpret_cast (you can confirm this by trying all four casts), and that expression causes UB. The fact that you can implicitly convert a pointer to derived to a pointer to base doesn't mean that they are the same, just as int and float are implicitly convertible but not the same. Here you are referring to an object through a type that it isn't, which causes UB.
Calling a virtual function this way results in the final overrider for that object being called. How this is achieved depends on the compiler.
Assuming the compiler uses a "vtable", finding the vtable of Base2 from the memory address of a Derived is not trivial (it may involve adding an offset to the address).
Basically, you must perform a dynamic cast on every pointer, and either store the results somewhere or cast whenever needed.

Multiple inheritance cast not working as expected

I recently had a problem with casts and multiple inheritance: I needed to cast a Base* to Unrelated*, because a specific Derived class derives the Unrelated class.
This is a short example:
#include <iostream>
struct Base{
virtual ~Base() = default;
};
struct Unrelated{
float test = 111;
};
struct Derived : Base,Unrelated{};
int main(){
Base* b = new Derived;
Unrelated* u1 = (Unrelated*)b;
std::cout << u1->test << std::endl; //outputs garbage
Unrelated* y = dynamic_cast<Unrelated*>(b);
std::cout << y->test << std::endl; //outputs 111
}
The first cast clearly doesn't work, but the second one did.
My question is: why did the second cast work? Shouldn't dynamic_cast only work for casts to a related class type? I thought there wasn't any information about Unrelated at runtime because it is not polymorphic.
Edit: I used Coliru's GCC for the example.

dynamic_cast works because the dynamic type of object pointed to by a pointer to its base class is related to Unrelated.
Keep in mind that dynamic_cast requires a virtual table to inspect the inheritance tree of the object at run-time.
The C-style cast (Unrelated*)b does not work because the C-style cast does const_cast, static_cast, reinterpret_cast and more, but it does not do dynamic_cast.
I would suggest avoiding C-style casts in C++ code because they do so many things as opposed to precise C++ casts. My colleagues who insist on using C-style cast still occasionally get them wrong.

The first cast (Unrelated*)b doesn't work because you're treating the Base class sub-object, containing probably just a vtable pointer, as an Unrelated, containing a float.
Instead you can cast down and up, static_cast<Unrelated*>( static_cast<Derived*>( b ) ).
And this is what dynamic_cast does for you, since Base is a polymorphic type (at least one virtual method) which allows dynamic_cast to inspect the type of the most derived object.
In passing, dynamic_cast<void*>( b ) would give you a pointer to the most derived object.
However, since you know the types there's no need to invoke the slight overhead of a dynamic_cast: just do the down- and up-casts.
Instead of the C style cast (Unrelated*)b you should use the corresponding C++ named cast or casts, because C style casts of pointers can do things you'd not expect, and because the effect can change completely when types are changed during maintenance.
The C style cast will do at most two C++ named casts. In this case the C style cast corresponds to a reinterpret_cast. The compiler will not allow any other named cast here.
Which is a warning sign. ;-)
In contrast, the down- and up-casts are static_casts, which usually are benign casts.
All that said, the best is to almost completely avoid casts by using the top secret technique:
Don't throw away type information in the first place.
I.e., in the example code, just use Derived* as the type of the pointer.
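The down-and-up cast and the dynamic_cast described above can be compared side by side. This is a small sketch reusing the types from the question; `side_cast_static` and `side_cast_dynamic` are made-up helper names.

```cpp
#include <cassert>

struct Base { virtual ~Base() = default; };
struct Unrelated { float test = 111; };
struct Derived : Base, Unrelated {};

// Down to the known most-derived type, then up to the sibling base.
inline Unrelated* side_cast_static(Base* b) {
    return static_cast<Unrelated*>(static_cast<Derived*>(b));
}

// The same cross-cast, with the type discovered at run time.
inline Unrelated* side_cast_dynamic(Base* b) {
    return dynamic_cast<Unrelated*>(b);
}

inline bool side_casts_agree() {
    Derived d;
    Base* b = &d;
    Unrelated* u = side_cast_static(b);
    assert(u == side_cast_dynamic(b));
    // dynamic_cast<void*> yields the address of the most derived object.
    return u->test == 111.0f
        && dynamic_cast<void*>(b) == static_cast<void*>(&d);
}
```

The static version is only valid when b really points into a Derived; the dynamic version checks that and would return nullptr otherwise.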

With multiple inheritance, the Derived object consists of two sub-objects, one Base and one Unrelated. The compiler knows how to access whichever part of the object it needs, typically by adding an offset to the pointer (but that's an implementation detail). It can only do that when it knows the actual type of a pointer. By using a C-style cast, you've told the compiler to ignore the actual type and treat that pointer value as a pointer to something else. It no longer has the information necessary to properly access the sub-object you desire, and it fails.
dynamic_cast allows the compiler to use run-time information about the object to locate the proper sub-object contained within it. If you were to output or examine the pointer values themselves, you'd see that they are different.

static_cast from Derived* to void* to Base*

I would like to cast a pointer to an object of a derived class to void* and from there to a pointer to the base class, as in the example below:
#include <iostream>
class Base
{
public:
void function1(){std::cout<<"1"<<std::endl;}
virtual void function2()=0;
};
class Derived : public Base
{
public:
virtual void function2(){std::cout<<"2"<<std::endl;}
};
int main()
{
Derived d;
void* ptr = static_cast<void*>(&d);
Base* baseptr=static_cast<Base*>(ptr);
baseptr->function1();
baseptr->function2();
}
This compiles and gives the desired result (prints 1 and 2 respectively), but is it guaranteed to work? The description of static_cast I found here: http://en.cppreference.com/w/cpp/language/static_cast
only mentions conversion to void* and back to a pointer to the same class (point 10).

In the general case, converting a base to void to derived (or vice versa) via static casting is not safe.
There are cases where it will almost certainly work: if everything involved is POD or standard layout and only single inheritance is involved, then things should be fine, at least in practice. I do not have chapter and verse from the standard, but the general idea is that the base in that case is guaranteed to be a prefix of the derived, so they share an address.
If you want to start seeing this fail, mix in virtual inheritance, multiple inheritance (both virtual and not), and multiple implementation inheritance that are non trivial. Basically when the address of the different type views of this differ, the void cast from and back to a different type is doomed. I have seen this fail in practice, and the fact it can fail (due to changes in your code base far away from the point of casting) is why you want to be careful about always casting to and from void pointer with the exact same type.

In general, no, it is not safe.
Suppose that casting Derived* directly to Base* results in a different address (for example, if multiple inheritance or virtual inheritance is involved).
Now if you inserted a cast to void* in between, how would the compiler know how to convert that void* to an appropriate Base* address?
If you need to cast a Derived* to a void*, you should explicitly cast the void* back to the original Derived* type first. (And from there, the cast from Derived* to Base* is implicit anyway, so you end up with the same number of casts, and thus it's not actually any less convenient.)
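A small sketch of that rule with multiple inheritance, where the base addresses are guaranteed to differ. `First`, `Second`, `Both` and the helper functions are hypothetical names used only for illustration.

```cpp
#include <cassert>

struct First { int first = 1; };
struct Second { int second = 2; };
struct Both : First, Second {};

// Restore the exact type that was stored, then convert to the base.
inline Second* restore_second(void* stored_as_both) {
    Both* original = static_cast<Both*>(stored_as_both);
    return original;  // implicit Both* -> Second*, pointer adjusted here
}

// The two base subobjects are distinct objects, so their addresses differ;
// a void* holding one of them cannot simply be reread as the other.
inline bool bases_share_address(Both& object) {
    const void* first = static_cast<First*>(&object);
    const void* second = static_cast<Second*>(&object);
    return first == second;
}

inline bool void_round_trip_demo() {
    Both object;
    void* stored = &object;  // stored as Both*
    Second* second = restore_second(stored);
    assert(!bases_share_address(object));
    return second == static_cast<Second*>(&object) && second->second == 2;
}
```

Writing static_cast<Second*>(stored) directly would skip the adjustment, because the compiler has no way to know the void* came from a Both*.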

From the link you supplied yourself
9) A pointer to member of some class D can be upcast to a pointer to member of its base class B. This static_cast makes no checks to ensure the member actually exists in the runtime type of the pointed-to object.
Meaning that as long as you know the cast is safe before you do it, it is guaranteed to work. That's why you should be using dynamic_cast, which returns nullptr if unsuccessful.
Think of it this way. Suppose you have:
Type * t1;
Whether a static_cast of t1 to one of the classes deriving from Type is valid cannot be known at compile time without in-depth analysis of your program (which the compiler does not and should not do), so even if it ends up being correct you have no way of checking. dynamic_cast does extra work at runtime to check whether the conversion was successful, hence the prefix dynamic.

C-Style casting causing SIGILL, dynamic_cast ok

In a project of mine here I have a class that implements four interfaces:
class A : public B,
public C,
public D,
public E
{
----Implementation Code here----
};
These four interfaces contain only pure virtual functions, and none of them forms a diamond (so I don't need the virtual keyword), so I expected no problem when doing something like this:
A* var = new A;
((C*)var)->method_from_interface();
Yet something ugly is happening, because the function is jumping around to a different method of the A class, and valgrind complains about an unhandled instruction. However, doing this:
A* var = new A;
(dynamic_cast<C*>(var))->method_from_interface();
works as expected.
So I'm wondering if this is a G++ bug or some misuse of the language?
edit:
Maybe I've simplified my problem too much. I'm receiving the A object as a D* in a function call:
void do_something(provider* p) {
D* iface = p->recoverItemByName("nameThatReturnsClassA");
((B*) iface)->call_method_from_b_iface();
}
Note that I know that at this point the D* is in fact an A*, so casting to B* isn't breaking any rules. I can't static_cast it, though, as D* and B* have no relation, but I can use reinterpret_cast successfully.

A C-style cast just does a "basic" memory mapping of your pointer onto your class. If your method is at offset=42 in your class, it would do (*(A+42)) () (simplified, of course).
However, if you inherit from more than one class, you have to take into account the various classes and the order in which the compiler lays them out.
static_cast takes multiple inheritance into account.
In your example, it would probably work with one of B, C, D or E, but not with the others. However, you have no good reason to do that: you could call A->methodFromInterface() and it just works!
In C++, it's advised to use static_cast or dynamic_cast (note: the second one relies on RTTI, which might not be available) and to discard the old C-style cast.
EDIT
I tried to reproduce the issue but couldn't. The files can be found at:
class2.hpp : http://pastebin.com/CBVykqcD
class2.cpp : http://pastebin.com/Vy2YsEGP
class.cpp : http://pastebin.com/wwHpe87g
Compiled on Mac OS X 10.6 with:
g++ -fno-rtti -o truc ./class.cpp ./class2.cpp && ./truc

Note that I know that at this point the D* is in fact an A*, so casting to B* isn't breaking any rules.
Wrong.
You may know that it's actually an A*, but the compiler doesn't, at least at compile time when it's trying to figure out what code to emit to do the conversion.
A funny thing happens when you inherit from multiple classes with virtual methods. When you convert from the derived class to one of the interface classes, the pointer address changes! The compiler does adjustments to the pointer to make it valid for the pointer type you've declared. Try it yourself, start with an A* pointer and display its value, then cast to a B*, C*, and D* and display those. At most one of the base classes will be the same as the A*, the others will be different.
When you use a C-style cast you're telling the compiler "I don't care if you can't do the conversion properly, do it anyway." It duly treats the D* as a B* without doing the required fixups, so now the pointer is completely wrong. It isn't pointing to the class B vtable so the wrong methods get called.
A dynamic_cast works because it uses extra information available at run-time; it can trace the pointer to its most-derived A* and then back down to a B*.
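To try this yourself, something along the following lines should do. It is a sketch with made-up interface members (`from_b`, `from_c`, `from_d`); the printed addresses depend on the compiler, so no particular output is claimed.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical interfaces standing in for B, C and D from the question.
struct B { virtual ~B() = default; virtual int from_b() const = 0; };
struct C { virtual ~C() = default; virtual int from_c() const = 0; };
struct D { virtual ~D() = default; virtual int from_d() const = 0; };

struct A : B, C, D {
    int from_b() const override { return 1; }
    int from_c() const override { return 2; }
    int from_d() const override { return 3; }
};

// Print the address seen through each pointer type.
inline void print_addresses(A* a) {
    std::cout << "A* " << static_cast<void*>(a) << '\n'
              << "B* " << static_cast<void*>(static_cast<B*>(a)) << '\n'
              << "C* " << static_cast<void*>(static_cast<C*>(a)) << '\n'
              << "D* " << static_cast<void*>(static_cast<D*>(a)) << '\n';
}

// Cross-cast from one interface to a sibling using run-time type information.
inline int call_b_through_d(D* d) {
    B* b = dynamic_cast<B*>(d);
    assert(b != nullptr);
    return b->from_b();
}

inline int cross_cast_demo() {
    A a;
    print_addresses(&a);
    // The B and D subobjects are distinct, so their addresses differ.
    assert(static_cast<void*>(static_cast<B*>(&a)) !=
           static_cast<void*>(static_cast<D*>(&a)));
    return call_b_through_d(&a);
}
```

A C-style (B*) cast on the D* would keep the D address unchanged, which is why the wrong vtable ends up being used.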

The problem is in your function do_something.
This line is fine:
D* iface = p->recoverItemByName("nameThatReturnsClassA");
But this line is bad because B is not related to D, so you cannot cast safely. Try calling static_cast<B*>(iface) and you will see the compiler complain, which should be a warning to you.
((B*) iface)->call_method_from_b_iface();
In the line above you are encoding the knowledge that iface points not only to a D, but to an A. It is better to store a pointer to A and use that pointer. So do
A* iface = p->recoverItemByName("nameThatReturnsClassA");
iface->call_method_from_b_iface();
If you still want to stick to D* iface, then code it like this (using only one of the two options):
D* iface = p->recoverItemByName("nameThatReturnsClassA");
static_cast<A*>(iface)->call_method_from_b_iface(); // option 1, easy to understand
dynamic_cast<B*>(iface)->call_method_from_b_iface(); // option 2, will do a cross-cast
For more information about dynamic_cast and its cross-cast capability see http://msdn.microsoft.com/en-us/library/cby9kycs.aspx.
Last but not least: try to avoid old C-style cast and always try to use the C++ style cast const_cast, static_cast/dynamic_cast and reinterpret_cast (in this order).

Is casting a base pointer to a derived pointer dangerous in any way?

I have always avoided the C++ casts (static_cast, const_cast, dynamic_cast [I also avoid RTTI], etc) because I regard them as a waste of typing and I never saw any advantages, so I use C-style casts exclusively.
My question is, if you have an inheritance hierarchy and a pointer to the base type, can you safely cast a base pointer to a derived pointer with a C-style cast (provided that somehow you are absolutely sure the base pointer points to an instance of a derived type) without something happening behind the scenes that will cause seemingly inexplicable failures?
I ask this because I read in one of the comments on another question that using a C-style cast from a base to a derived type will not "adjust the pointer" or something like that. I'll try to find the exact comment again.

Using static_cast or dynamic_cast for casting ensures some kind of safety.
static_cast gives you a compile-time error if a cast is invalid, while
dynamic_cast throws an exception for references or returns a null pointer for pointers if the cast is invalid at run time.
Thus they are better than the C-style cast, mainly due to the safety they provide.
If you are absolutely sure the cast is valid, a C-style cast does just the same, but it is better to use the safety the language provides than to manage it ourselves (since we don't need to).
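The two failure modes of dynamic_cast can be seen in a short sketch; the `Shape`, `Circle` and `Square` types are invented for the example.

```cpp
#include <cassert>
#include <typeinfo>  // std::bad_cast

struct Shape { virtual ~Shape() = default; };
struct Circle : Shape {};
struct Square : Shape {};

// Pointer form: a failed cast yields nullptr.
inline bool pointer_cast_fails(Shape* s) {
    return dynamic_cast<Circle*>(s) == nullptr;
}

// Reference form: a failed cast throws std::bad_cast.
inline bool reference_cast_fails(Shape& s) {
    try {
        Circle& c = dynamic_cast<Circle&>(s);
        (void)c;  // cast succeeded
        return false;
    } catch (const std::bad_cast&) {
        return true;
    }
}

inline bool dynamic_cast_failure_modes() {
    Square square;
    Circle circle;
    assert(!pointer_cast_fails(&circle));
    return pointer_cast_fails(&square)
        && reference_cast_fails(square)
        && !reference_cast_fails(circle);
}
```

A C-style cast in the same places would compile and silently produce a Circle* or Circle& that refers to a Square.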

A dynamic_cast to a pointer (not a reference) gives you a chance to detect an error and handle it; the plain C cast won't. But, if you're absolutely sure about the types, the C-style cast - abominable though it is - should work OK; it was what had to work before there was a C++ standard.

If your derived class introduces additional members, then yes, I think things can go wrong if you try to index the pointer as an array.
Consider the following:
class base
{
public:
int _data1;
base () : _data1(0) {}
};
class derived : public base
{
public:
int _data2;
derived () : _data2(1) {}
};
int main ()
{
base *_base_ptr = new derived[10];
// this isn't going to work correctly, because the compiler will use `sizeof(base)`
// to do the pointer offsets, rather than sizeof(derived)...
int _data = _base_ptr[1]._data1;
delete[] _base_ptr;
return _data;
}
Running this snippet in ideone returns 1 rather than the expected 0.
Hope this helps.

The C++ style casts are important in template code. In template code you are much less sure of the types of the objects you are casting from or to, because those types have been supplied by the user. A C++ style cast will give you compiler errors if you end up doing an unwanted cast (for instance a static cast between unrelated pointer types), but the C style cast will just do the cast anyway.
For instance (trivial example)
template <class T>
T* safe_downcast(Base* ptr)
{
return static_cast<T*>(ptr);
}
There'd be nothing safe about this function if I used a C style cast.
But I agree, outside of template code I often use C style casts.

If you are absolutely sure that the base class pointer actually points to an instance of the derived type, why do you have a base pointer at all? Also: It sounds as if this was a classical case for virtual functions. If you want to avoid virtual calls for some reason (performance, ...), look for CRTP.
I use C++-style casts mainly because they remind me to check whether the cast is absolutely necessary or not by looking ugly. I try to avoid reinterpret_cast and dynamic_cast if at all possible (most of the time it is) and I use static_cast mostly inside CRTP template classes, where it is OK.
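For readers who have not met CRTP, a minimal sketch of the pattern follows. The names `ShapeBase`, `Rect`, `area_impl` and `doubled_area` are hypothetical; the point is only where the static_cast sits and why it is safe there.

```cpp
#include <cassert>

// CRTP base: the derived type is known at compile time via the parameter.
template <class Derived>
class ShapeBase {
public:
    // Static dispatch: the cast is valid because Derived inherits
    // from ShapeBase<Derived>, so no virtual call is needed.
    double area() const {
        return static_cast<const Derived*>(this)->area_impl();
    }
};

class Rect : public ShapeBase<Rect> {
public:
    Rect(double w, double h) : w_(w), h_(h) {
        assert(w >= 0 && h >= 0);
    }
    double area_impl() const { return w_ * h_; }
private:
    double w_, h_;
};

// Works for any CRTP shape without a vtable lookup.
template <class T>
double doubled_area(const ShapeBase<T>& shape) {
    return 2.0 * shape.area();
}

inline double crtp_demo() {
    Rect r(2.0, 3.0);
    return doubled_area(r);
}
```

The downcast never has to guess the dynamic type, which is why static_cast is considered acceptable in this setting.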
