Casting Rc<ConcreteType> to an Rc<Trait>

Horse is a struct which implements the Animal trait. I have an Rc<Horse> and a function that needs to take in an Rc<Animal>, so I want to convert from Rc<Horse> to Rc<Animal>.
I did this:
use std::rc::Rc;
struct Horse;
trait Animal {}
impl Animal for Horse {}
fn main() {
let horse = Rc::new(Horse);
let animal = unsafe {
// Consume the Rc<Horse>
let ptr = Rc::into_raw(horse);
// Now it's an Rc<Animal> pointing to the same data!
Rc::<Animal>::from_raw(ptr)
};
}
Is this a good solution? Is it correct?

The answer by Boiethios already explains that upcasting can be explicitly performed using as, or even happens implicitly in certain situations. I'd like to add a few more details on the mechanisms.
I'll start by explaining why your unsafe code works correctly.
let animal = unsafe {
let ptr = Rc::into_raw(horse);
Rc::<Animal>::from_raw(ptr)
};
The first line in the unsafe block consumes horse and returns a *const Horse, which is a pointer to a concrete type. The pointer is exactly what you'd expect it to be – the memory address of horse's data (ignoring the fact that in your example Horse is zero-sized and has no data). In the second line, we call Rc::from_raw(); let's look at the prototype of that function:
pub unsafe fn from_raw(ptr: *const T) -> Rc<T>
Since we are calling this function for Rc::<Animal>, the expected argument type is *const Animal. Yet the ptr we have has type *const Horse, so why does the compiler accept the code? The answer is that the compiler performs an unsized coercion, a kind of implicit cast that is performed in certain places for certain types. Specifically, we convert a pointer to a concrete type into a pointer to some type implementing the Animal trait. Since the exact type is no longer known statically, the pointer isn't a mere memory address anymore – it's a memory address together with an identifier of the actual type of the object, a so-called fat pointer. This way, the Rc created from the fat pointer retains the information about the underlying concrete type, and can call the correct methods for Horse's implementation of Animal (if there are any; in your example Animal doesn't have any methods, but this continues to work if it does).
We can see the difference between the two kinds of pointer by printing their sizes:
let ptr = Rc::into_raw(horse);
println!("{}", std::mem::size_of_val(&ptr));
let ptr: *const Animal = ptr;
println!("{}", std::mem::size_of_val(&ptr));
This code first makes ptr a *const Horse, prints the size of the pointer, then uses an unsized coercion to convert ptr to a *const Animal and prints its size again. On a 64-bit system, this will print
8
16
The first one is just a simple memory address, while the second one is a memory address together with information on the concrete type of the pointee. (Specifically, the fat pointer contains a pointer to the virtual method table.)
Now let's look at what happens in the code in Boiethios' answer:
let animal = horse as Rc<Animal>;
or equivalently
let animal: Rc<Animal> = horse;
Both forms perform the same unsized coercion. How does the compiler know how to do this for an Rc rather than a raw pointer? The answer is that the trait CoerceUnsized exists specifically for this purpose. You can read the RFC on coercions for dynamically sized types for further details.

I think that your solution is correct, although I'm not a specialist in unsafe code. But you do not have to use unsafe code for something as simple as upcasting:
use std::rc::Rc;
trait Animal {}
struct Horse;
impl Animal for Horse {}
fn main() {
let horse = Rc::new(Horse);
let animal = horse as Rc<Animal>;
}
If you want to pass it to a function, you do not even have to cast:
fn gimme_an_animal(_animal: Rc<Animal>) {}
fn main() {
let horse = Rc::new(Horse);
gimme_an_animal(horse);
}
Because Horse implements Animal, a horse is an animal. You do not have to do anything special to cast it. Note that this transformation is lossy: you cannot make an Rc<Horse> back from an Rc<Animal>.

Related

A function to return different derived-type object/reference not a pointer [duplicate]

I did find some questions already on StackOverflow with similar title, but when I read the answers, they were focusing on different parts of the question, which were really specific (e.g. STL/containers).
Could someone please show me why you must use pointers/references to implement polymorphism? I can understand that pointers may help, but surely references only differentiate between pass-by-value and pass-by-reference?
Surely, so long as you allocate memory on the heap so that you can have dynamic binding, that would have been enough. Obviously not.
"Surely so long as you allocate memory on the heap" - where the memory is allocated has nothing to do with it. It's all about the semantics. Take, for instance:
Derived d;
Base* b = &d;
d is on the stack (automatic memory), but polymorphism will still work on b.
If you don't have a base class pointer or reference to a derived class, polymorphism doesn't work because you no longer have a derived class. Take
Base c = Derived();
The c object isn't a Derived but a Base, because of slicing. So, technically, polymorphism still works; it's just that you no longer have a Derived object to talk about.
Now take
Base* c = new Derived();
c just points to some place in memory, and you don't really care whether that's actually a Base or a Derived, but the call to a virtual method will be resolved dynamically.
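To make the contrast concrete, here is a minimal sketch (Base, Derived and the name function are invented for illustration): the call through the sliced object binds to Base's function, while the call through the pointer is resolved at run time.
#include <iostream>
struct Base {
    virtual const char* name() const { return "Base"; }
    virtual ~Base() {}
};
struct Derived : Base {
    const char* name() const override { return "Derived"; }
};
int main() {
    Base sliced = Derived();   // copy-initialised from a temporary Derived: sliced is a plain Base
    Base* p = new Derived();   // p points at a complete Derived
    std::cout << sliced.name() << '\n';  // prints "Base"
    std::cout << p->name() << '\n';      // prints "Derived": resolved dynamically through the pointer
    delete p;
}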
In C++, an object always has a fixed type and size known at compile-time and (if it can and does have its address taken) always exists at a fixed address for the duration of its lifetime. These are features inherited from C which help make both languages suitable for low-level systems programming. (All of this is subject to the as-if rule, though: a conforming compiler is free to do whatever it pleases with code as long as it can be proven to have no detectable effect on any behavior of a conforming program that is guaranteed by the standard.)
A virtual function in C++ is defined (more or less, no need for extreme language lawyering) as executing based on the run-time type of an object; when called directly on an object this will always be the compile-time type of the object, so there is no polymorphism when a virtual function is called this way.
Note that this didn't necessarily have to be the case: object types with virtual functions are usually implemented in C++ with a per-object pointer to a table of virtual functions which is unique to each type. If so inclined, a compiler for some hypothetical variant of C++ could implement assignment on objects (such as Base b; b = Derived()) as copying both the contents of the object and the virtual table pointer along with it, which would easily work if both Base and Derived were the same size. In the case that the two were not the same size, the compiler could even insert code that pauses the program for an arbitrary amount of time in order to rearrange memory in the program and update all possible references to that memory in a way that could be proven to have no detectable effect on the semantics of the program, terminating the program if no such rearrangement could be found: this would be very inefficient, though, and could not be guaranteed to ever halt, obviously not desirable features for an assignment operator to have.
So in lieu of the above, polymorphism in C++ is accomplished by allowing references and pointers to objects to reference and point to objects of their declared compile-time types and any subtypes thereof. When a virtual function is called through a reference or pointer, and the compiler cannot prove that the object referenced or pointed to is of a run-time type with a specific known implementation of that virtual function, the compiler inserts code which looks up the correct virtual function to call at run-time. It did not have to be this way, either: references and pointers could have been defined as being non-polymorphic (disallowing them to reference or point to subtypes of their declared types), forcing the programmer to come up with alternative ways of implementing polymorphism. The latter is clearly possible since it's done all the time in C, but at that point there's not much reason to have a new language at all.
In sum, the semantics of C++ are designed in such a way to allow the high-level abstraction and encapsulation of object-oriented polymorphism while still retaining features (like low-level access and explicit management of memory) which allow it to be suitable for low-level development. You could easily design a language that had some other semantics, but it would not be C++ and would have different benefits and drawbacks.
I found it helpful to understand that a copy constructor is invoked when assigning like this:
class Base { };
class Derived : public Base { };
Derived x; /* Derived type object created */
Base y = x; /* Copy is made (using Base's copy constructor), so y really is of type Base. Copy can cause "slicing" btw. */
Since y is an actual object of class Base, rather than the original one, functions called on it are Base's functions.
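A small sketch of that point (the classes and the who function are made up for illustration); the print statement in the copy constructor shows exactly when the slicing copy happens.
#include <iostream>
struct Base {
    Base() {}
    Base(const Base&) { std::cout << "Base copy constructor\n"; }
    virtual void who() const { std::cout << "Base\n"; }
    virtual ~Base() {}
};
struct Derived : Base {
    void who() const override { std::cout << "Derived\n"; }
};
int main() {
    Derived x;
    Base y = x;   // prints "Base copy constructor": only the Base part of x is copied
    Base& r = x;  // no copy at all: r is just another name for x
    y.who();      // "Base"
    r.who();      // "Derived"
}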
Consider little endian architectures: values are stored low-order-bytes first. So, for any given unsigned integer, the values 0-255 are stored in the first byte of the value. Accessing the low 8 bits of any value simply requires a pointer to its address.
So we could implement uint8 as a class. We know that an instance of uint8 is ... one byte. If we derive from it and produce uint16, uint32, etc, the interface remains the same for purposes of abstraction, but the one most important change is size of the concrete instances of the object.
Of course, if we implemented uint8 and char, the sizes may be the same, likewise sint8.
However, operator= of uint8 and uint16 are going to move different quantities of data.
In order to create a Polymorphic function we must either be able to:
a/ receive the argument by value by copying the data into a new location of the correct size and layout,
b/ take a pointer to the object's location,
c/ take a reference to the object instance,
We can use templates to achieve a, so polymorphism can work without pointers and references. But if we are not counting templates, let's consider what happens if we implement uint128 and pass it to a function expecting uint8. Answer: 8 bits get copied instead of 128.
So what if we made our polymorphic function accept uint128 and we passed it a uint8? If the uint8 we were copying was unfortunately located, our function would attempt to copy 16 bytes, 15 of which lie outside our accessible memory -> crash.
Consider the following:
#include <cstdint>
struct A { int x; };
A fn(A a)
{
    return a;
}
struct B : public A {
    uint64_t a, b, c;
    B(int x_, uint64_t a_, uint64_t b_, uint64_t c_)
        : A{x_}, a(a_), b(b_), c(c_) {}
};
B b1 { 10, 1, 2, 3 };
A a2 = fn(b1); // only the A subobject of b1 is copied in and out
// a2.x == 10, but what happened to a, b and c?
At the time fn was compiled, there was no knowledge of B. However, B is derived from A, so polymorphism should allow us to call fn with a B. However, the object it returns should be an A comprising a single int.
If we pass an instance of B to this function, what we get back should be just a { int x; } with no a, b, c.
This is "slicing".
Even with pointers and references we don't avoid this for free. Consider:
std::vector<A*> vec;
Elements of this vector could be pointers to A or something derived from A. The language generally solves this through the use of the "vtable", a small addition to the object's instance which identifies the type and provides function pointers for virtual functions. You can think of it as something like:
template<class T>
struct PolymorphicObject {
T::vtable* __vtptr;
T __instance;
};
Rather than every object having its own distinct vtable, classes have them, and object instances merely point to the relevant vtable.
The problem now is not slicing but type correctness:
struct A { virtual const char* fn() { return "A"; } };
struct B : public A { virtual const char* fn() { return "B"; } };
#include <iostream>
#include <cstring>
int main()
{
A* a = new A();
B* b = new B();
memcpy(a, b, sizeof(A));
std::cout << "sizeof A = " << sizeof(A)
<< " a->fn(): " << a->fn() << '\n';
}
http://ideone.com/G62Cn0
sizeof A = 4 a->fn(): B
What we should have done is use *a = *b (i.e. a->operator=(*b))
http://ideone.com/Vym3Lp
but again, this is copying an A to an A and so slicing would occur:
struct A { int i; A(int i_) : i(i_) {} virtual const char* fn() { return "A"; } };
struct B : public A {
int j;
B(int i_) : A(i_), j(i_ + 10) {}
virtual const char* fn() { return "B"; }
};
#include <iostream>
#include <cstring>
int main()
{
A* a = new A(1);
B* b = new B(2);
*a = *b; // aka a->operator=(static_cast<A&>(*b));
std::cout << "sizeof A = " << sizeof(A)
<< ", a->i = " << a->i << ", a->fn(): " << a->fn() << '\n';
}
http://ideone.com/DHGwun
(i is copied, but B's j is lost)
The conclusion here is that pointers/references are required because the original instance carries member and type information with it that copying can lose.
But also, that polymorphism is not perfectly solved within C++ and one must be cognizant of their obligation to provide/block actions which could produce slicing.
You need pointers or references because, for the kind of polymorphism you are interested in (*), you need the dynamic type to be able to differ from the static type: in other words, the true type of the object can be different from the declared type. In C++ that happens only with pointers or references.
(*) Genericity, the type of polymorphism provided by templates, doesn't need pointers nor references.
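A short sketch of that distinction (Base and Derived are invented for illustration); typeid applied to a reference to a polymorphic type reports the dynamic type, while a by-value copy has lost it.
#include <iostream>
#include <typeinfo>
struct Base { virtual ~Base() {} };
struct Derived : Base {};
int main() {
    Derived d;
    Base& ref = d;   // static type Base&, dynamic type Derived
    Base  val = d;   // a copy: both static and dynamic type are Base
    // The exact spelling of the names is implementation-defined.
    std::cout << typeid(ref).name() << '\n';  // some spelling of "Derived"
    std::cout << typeid(val).name() << '\n';  // some spelling of "Base"
}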
When an object is passed by value, it's typically put on the stack. Putting something on the stack requires knowledge of just how big it is. When using polymorphism, you know that the incoming object implements a particular set of features, but you usually have no idea the size of the object (nor should you, necessarily; that's part of the benefit). Thus, you can't put it on the stack. You do, however, always know the size of a pointer.
Now, not everything goes on the stack, and there are other extenuating circumstances. In the case of virtual methods, the pointer to the object is also a pointer to the object's vtable(s), which indicate where the methods are. This allows the compiler to find and call the functions, regardless of what object it's working with.
Another cause is that very often the object is implemented outside of the calling library, and allocated with a completely different (and possibly incompatible) memory manager. It could also have members that can't be copied, or would cause problems if they were copied with a different manager. There could be side-effects to copying and all sorts of other complications.
The result is that the pointer is the only bit of information on the object that you really properly understand, and provides enough information to figure out where the other bits you need are.
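A short illustration of the size argument (the types are invented; the exact numbers are implementation-specific): object sizes vary wildly across a hierarchy, but a pointer to any of them has one known size.
#include <iostream>
struct Animal { virtual ~Animal() {} };
struct Elephant : Animal { double stats[64]; };  // much larger than a bare Animal
void feed(Animal*) { /* works for any Animal, whatever its size */ }
int main() {
    Elephant e;
    std::cout << sizeof(Animal) << ' ' << sizeof(Elephant) << '\n';    // very different sizes
    std::cout << sizeof(Animal*) << ' ' << sizeof(Elephant*) << '\n';  // both the same, known size
    feed(&e);  // the callee only needs to handle a fixed-size pointer
}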

Check if list of abstract elements contains an element of a certain derived type in C++? [duplicate]


C++ does reinterpret_cast always return the result?

I have two classes, A, and B. A is a parent class of B, and I have a function that takes in a pointer to a class of type A, checks if it is also of type B, and if so will call another function that takes in a pointer to a class of type B.
When the function calls the other function, I supply reinterpret_cast(a) as the parameter. If this seems ambiguous here is a code example:
void abc(A * a) {
if (a->IsA("B")) { //please dont worry much about this line,
//my real concern is the reinterpret_cast
def(reinterpret_cast<B *>(a));
};
};
So now that you know how I am calling "def", I am wondering if reinterpret_cast actually returns a pointer of type B to be sent off as the parameter to def.
I would appreciate any help.
Thanks
reinterpret_cast will always do what you say – it is a sledgehammer. You can do
def(reinterpret_cast<B *>(42));
or
std::string hw = "hello";
def(reinterpret_cast<B *>(&hw));
it will always return a pointer that might point at the correct type. It assumes you know what you are doing.
You will have a pointer of type B*, but reinterpret_cast isn't really great.
If you're sure the type is a B, use static_cast; if not, use dynamic_cast and test the pointer (if dynamic_cast fails, it returns nullptr).
See https://stackoverflow.com/a/332086/5303336
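If the IsA check really does guarantee the dynamic type, the static_cast variant of the asker's function would look roughly like this (IsA and def are the asker's hypothetical helpers):
void abc(A* a) {
    if (a->IsA("B")) {
        // The IsA check is assumed to guarantee that a really points at a B, so an
        // unchecked static_cast downcast is enough; it also performs any pointer
        // adjustment that reinterpret_cast would skip.
        def(static_cast<B*>(a));
    }
}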
reinterpret_cast is the result of a broken type system. Its behaviour assumes that the two types are overlaid in a union such as
union {
TypeA anA;
TypeB aB;
} a;
so
reinterpret_cast< TypeB* >( &a.anA );
assumes that the address of the anA member can simply be handed back, unchanged, as the address of the aB member.
If the type is part of the same class hierarchy, then static_cast<> would allow you to find out at compile time if there was enough information to perform the cast. This is generally when B is a base class of A (either singly or multiply).
If there is insufficient information for static_cast to work, then it may be possible to get a dynamic_cast<> to work. This is the case where the B type is derived in some way from A.
It is important to note that the dynamic_cast<B*>( a ) or static_cast< B*>( a ) may not yield the same address when they succeed.
That is because with multiple inheritance there are multiple base-class subobjects (and vtables) inside the object. When this happens, static_cast and dynamic_cast adjust the address so that it refers to the correct embedded base-class subobject.
The fact that dynamic_cast and static_cast may change the address, is why reinterpret_cast is discouraged. It can result in a value which doesn't do what you want.
reinterpret_cast will always return a pointer. It may just not be a valid pointer, in the sense that it doesn't actually point to an object of type B.
If B has more than one base class, and A is not the first base class, reinterpret_cast will do the wrong thing and fail to perform the necessary adjustment to the pointer.
For your use case you should use a static_cast, which has the advantage that the compiler will check whether B is actually derived from A and perform any needed adjustment. No runtime overhead is incurred by additional checks; however, there will be no warning if the object is not actually of type B, and the program will fail arbitrarily.
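A sketch of the adjustment those answers describe, shown with a cast to a second base class (Left, Right and Both are invented; addresses and offsets are implementation-specific):
#include <iostream>
struct Left  { int l; };
struct Right { int r; };
struct Both : Left, Right { int b; };  // Right is the second base
int main() {
    Both  obj;
    Both* pb = &obj;
    Right* via_static      = static_cast<Right*>(pb);       // adjusted to the Right subobject
    Right* via_reinterpret = reinterpret_cast<Right*>(pb);  // no adjustment: reuses pb's address
    std::cout << static_cast<void*>(pb) << '\n';
    std::cout << static_cast<void*>(via_static) << '\n';       // typically pb plus sizeof(Left)
    std::cout << static_cast<void*>(via_reinterpret) << '\n';  // same as pb: the "wrong thing"
    // static_cast<Both*>(via_static) would undo the adjustment on the way back down;
    // reinterpret_cast performs no such correction in either direction.
}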
As others have stated, reinterpret_cast is the wrong solution, use dynamic_cast instead:
void abc(A * a) {
B *b = dynamic_cast<B*>(a);
if (b) {
def(b);
}
}

Problems with casting a void** to a T**

TL;DR: I have a Derived** that I store in Lua as a void* userdata. Then I try to get it back as a Base** and stuff breaks. Is there anything I can do, or is this all madness that's doomed to failure?
Details:
I'm passing some data back and forth between Lua and C++, and Lua requires the use of void* to store userdata (that I'm using Lua isn't too important, other than that it uses void pointers). Makes sense so far. Let's say I have three classes, two of which are Base and Derived, with Derived inheriting from Base. The userdata I feed to Lua is a pointer to a pointer, like so:
template <typename T>
void lua_push(lua_State* L, T* obj) {
T** ud = (T**)lua_newuserdata(L, sizeof(T*)); // Create a new userdata
*ud = obj; // Set a pointer to my object
// rest of the function setting up other stuff omitted
}
Of course, this is in a nice templated function, so I can pass any of my three types in this way. Later on I can use another templated function to get my userdata out of Lua, like so:
template <typename T>
T* lua_to(lua_State* L, int index) {
// there's normally a special metatable check here that ensures that
// this is the type I want, I've omitted it for this example
return *(T**)lua_touserdata(L, index);
}
This works fine when I pass in and out the same type. I'm running into a problem though when trying to pull a Derived out as a Base.
In my specific case, I have a vector being stored on Base. I use lua_push<Derived>(L, obj); to push my object to Lua. Later, in another place I pull it out using Base* obj = lua_to<Base>(L, i);. I then push_back some stuff into my vector. Later on, another portion of code pulls out that exact same object (verified with pointer comparisons) except this time uses Derived* obj = lua_to<Derived>(L, i); My Derived object doesn't see that object that was pushed in. I believe I've narrowed this down to incorrect casting, and I'm probably corrupting some memory somewhere when I make my call to push_back
So my question is, is there a way to make that cast work right? I've tried the various flavors of casts. static_cast, dynamic_cast and reinterpret_cast don't seem to work, either giving me the same wrong answer or not compiling at all.
Specific example:
Base* b = lua_to<Base>(L, -1); // Both lua_to's looking at the same object
Derived* d = lua_to<Derived>(L, -1); // You can be double sure because the pointers in the output match
std::cout << "Base: " << b << " " << b->myVec.size() << std::endl;
std::cout << "Derived: " << d << " " << d->myVec.size() << std::endl;
Output:
Base: 0xa1fb470 1
Derived: 0xa1fb470 0
The code is not safe. When you cast Base * to void *, you should always cast the void * back to Base * first and then cast it again to Derived *. Like so:
Derived *obj = ...;
Base** ud = reinterpret_cast<Base **>(lua_newuserdata(L, sizeof(Base*)));
*ud = obj; // implicit cast Derived -> Base
...
Derived *obj = static_cast<Derived *>(*ud); // explicit Base -> Derived
Basically speaking,
Y -> X -> void* -> X -> Y (safe)
Y -> X -> void* -> Y (unsafe)
The reason for this is that the actual pointer value of two pointers pointing to the same object may be different if the two pointers have different types. Whether it works depends on various factors such as inheritance and virtual functions. (It always works in C since C doesn't have those facilities.)
This is all very much up to the compiler you're using, but generally the pointer to a base class is the same as a pointer to a derived class. Doing a coercive cast shouldn't hurt anything. The only exception is when there is multiple inheritance involved; a pointer to one base class won't be the same as a pointer to another base class, even with the same object. The compiler needs to know the exact type of the original pointer to properly adjust it, and the cast to void* loses that information.
In addition to Mark Ransom's answer you can also get problems with this kind of casting if Derived contains a virtual function and Base does not, but again this is compiler specific.
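A self-contained sketch of the safe and unsafe round trips described above, with Lua taken out of the picture (Base and Derived are invented; Derived is given a virtual destructor so that, on typical implementations, a Base* and a Derived* to the same object need not hold the same address):
#include <cassert>
struct Base { int base_value = 1; };
struct Derived : Base {
    virtual ~Derived() {}
    int derived_value = 2;
};
int main() {
    Derived d;
    // Safe: Derived* -> Base* -> void* -> Base* -> Derived*
    Base* as_base = &d;                            // implicit upcast (may adjust the address)
    void* stored  = as_base;                       // what would go into the userdata
    Base* back    = static_cast<Base*>(stored);    // cast back to the SAME type first
    Derived* ok   = static_cast<Derived*>(back);   // then downcast explicitly
    assert(ok == &d);
    // Unsafe: Derived* -> void* read back as if it had been stored as a Base*
    void* raw   = &d;                              // skips the upcast adjustment
    Base* wrong = static_cast<Base*>(raw);         // may not point at the Base subobject
    (void)wrong;                                   // reading through it would be undefined behaviour
}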

Safely checking the type of a variable

For a system I need to convert a pointer to a long then the long back to the pointer type. As you can guess this is very unsafe. What I wanted to do is use dynamic_cast to do the conversion so if I mixed them I'll get a null pointer. This page says http://publib.boulder.ibm.com/infocenter/lnxpcomp/v7v91/index.jsp?topic=/com.ibm.vacpp7l.doc/language/ref/clrc05keyword_dynamic_cast.htm
The dynamic_cast operator performs type conversions at run time. The dynamic_cast operator guarantees the conversion of a pointer to a base class to a pointer to a derived class, or the conversion of an lvalue referring to a base class to a reference to a derived class. A program can thereby use a class hierarchy safely. This operator and the typeid operator provide run-time type information (RTTI) support in C++.
and I'd like to get an error if it's null, so I wrote my own dynamic cast:
template<class T, class T2> T mydynamic_cast(T2 p)
{
assert(dynamic_cast<T>(p));
return reinterpret_cast<T>(p);
}
With MSVC I get the error "error C2681: 'long' : invalid expression type for dynamic_cast". It turns out this will only work with classes which have virtual functions... WTF! I know the point of a dynamic cast was for the up/down casting inheritance problem but I also thought it was to solve the type cast problem dynamically. I know I could use reinterpret_cast but that doesn't guarantee the same type of safety.
What should I use to check if my typecast are the same type? I could compare the two typeid but I would have a problem when I want to typecast a derived to its base. So how can I solve this?
dynamic_cast can be used only between classes related through inheritance. For converting a pointer to long or vice-versa, you can use reinterpret_cast. To check whether the pointer is null, you can assert(ptr != 0). However, it is usually not advisable to use reinterpret_cast. Why do you need to convert a pointer to long?
Another option is to use a union:
union U {
int* i_ptr_;
long l;
};
Again, a union is only seldom needed.
I've had to do similar things when loading C++ DLLs in apps written in languages that only support a C interface. Here is a solution that will give you an immediate error if an unexpected object type was passed in. This can make things much easier to diagnose when something goes wrong.
The trick is that every class that you pass out as a handle has to inherit from a common base class.
#include <stdexcept>
#include <typeinfo>
#include <string>
#include <iostream>
using namespace std;
// Any class that needs to be passed out as a handle must inherit from this class.
// Use virtual inheritance if needed in multiple inheritance situations.
class Base
{
public:
virtual ~Base() {} // Ensure a v-table exists for RTTI/dynamic_cast to work
};
class ClassA : public Base
{
};
class ClassB : public Base
{
};
class ClassC
{
public:
virtual ~ClassC() {}
};
// Convert a pointer to a long handle. Always use this function
// to pass handles to outside code. It ensures that T does derive
// from Base, and that things work properly in a multiple inheritance
// situation.
template <typename T>
long pointer_to_handle_cast(T ptr)
{
return reinterpret_cast<long>(static_cast<Base*>(ptr));
}
// Convert a long handle back to a pointer. This makes sure at
// compile time that T does derive from Base. Throws an exception
// if handle is NULL, or a pointer to a non-rtti object, or a pointer
// to a class not convertable to T.
template <typename T>
T safe_handle_cast(long handle)
{
if (handle == NULL)
throw invalid_argument(string("Error casting null pointer to ") + (typeid(T).name()));
Base *base = static_cast<T>(NULL); // Check at compile time that T converts to a Base *
base = reinterpret_cast<Base *>(handle);
T result = NULL;
try
{
result = dynamic_cast<T>(base);
}
catch(__non_rtti_object &)
{
throw invalid_argument(string("Error casting non-rtti object to ") + (typeid(T).name()));
}
if (!result)
throw invalid_argument(string("Error casting pointer to ") + typeid(*base).name() + " to " + (typeid(T).name()));
return result;
}
int main()
{
ClassA *a = new ClassA();
ClassB *b = new ClassB();
ClassC *c = new ClassC();
long d = 0;
long ahandle = pointer_to_handle_cast(a);
long bhandle = pointer_to_handle_cast(b);
// long chandle = pointer_to_handle_cast(c); //Won't compile
long chandle = reinterpret_cast<long>(c);
// long dhandle = pointer_to_handle_cast(&d); Won't compile
long dhandle = reinterpret_cast<long>(&d);
// send handle to library
//...
// get handle back
try
{
a = safe_handle_cast<ClassA *>(ahandle);
//a = safe_handle_cast<ClassA *>(bhandle); // fails at runtime
//a = safe_handle_cast<ClassA *>(chandle); // fails at runtime
//a = safe_handle_cast<ClassA *>(dhandle); // fails at runtime
//a = safe_handle_cast<ClassA *>(NULL); // fails at runtime
//c = safe_handle_cast<ClassC *>(chandle); // Won't compile
}
catch (invalid_argument &ex)
{
cout << ex.what() << endl;
}
return 0;
}
Remember that in Windows 64, a pointer will be a 64-bit quantity but long will still be a 32-bit quantity and your code is broken. At the very least, you need to make the choice of integer type based on the platform. I don't know whether MSVC has support for uintptr_t, the type provided in C99 for holding pointers; that would be the best type to use if it is available.
As for the rest, others have addressed the why's and wherefore's of dynamic_cast vs reinterpret_cast sufficiently.
reinterpret_cast is the correct cast to use here.
This is pretty much the only thing it can do safely.
reinterpret_cast from a pointer type to a type T and back to the original pointer type yields the original pointer. (Assuming T is a pointer or integer type that is at least as big as the original pointer type)
Note that reinterpret_cast from a pointer type to T is unspecified. There are no guarantees about the value of the T type, except that if you then reinterpret_cast it back to the original type, you get the original value. So assuming you don't try to do anything with the intermediate long value in your case, reinterpret_cast is perfectly safe and portable.
Edit: Of course this doesn't help if you don't know, at the second cast, what the original type was. In that case, you're screwed. The long can't possibly in any way carry type information about which pointer type it was converted from.
You can use reinterpret_cast to cast to an integral type and back to the pointer type. If the integral type is large enough to store the pointer value, then that conversion will not change the pointer value.
As others already say, it is not defined behavior to use dynamic_cast on a non-polymorphic class (except when you do an upcast, which is implicit anyway and be ignored here), and it also only works on pointers or references. Not on integral types.
You'd better use ::intptr_t, found in <stdint.h> on various POSIX systems. You can use that type as the intermediate type you cast to.
Regarding your check whether the conversion will succeed, you can use sizeof:
BOOST_STATIC_ASSERT(sizeof(T1) >= sizeof(T2));
will fail at compile time if the conversion couldn't be done. Or continue to use assert with that condition, and it will assert at run-time instead.
Warning: This won't prevent you from casting T* to intptr_t and back to U*, with U another type than T. Thus, this only guarantees that the cast won't change the value of the pointer if you cast from T* to intptr_t and back to T*. (Thanks to Nicola for pointing out that you might expect additional protection.)
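As a minimal sketch of the round trip those last few answers describe, using uintptr_t from <cstdint> and a plain static_assert in place of BOOST_STATIC_ASSERT (Widget is a made-up type):
#include <cassert>
#include <cstdint>
struct Widget { int value = 42; };
int main() {
    Widget w;
    Widget* original = &w;
    static_assert(sizeof(std::uintptr_t) >= sizeof(Widget*),
                  "integer type must be at least as wide as a pointer");
    // Pointer -> integer -> pointer of the SAME type is value-preserving.
    std::uintptr_t handle = reinterpret_cast<std::uintptr_t>(original);
    Widget* round_tripped  = reinterpret_cast<Widget*>(handle);
    assert(round_tripped == original);
    assert(round_tripped->value == 42);
}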
What you want to do sounds like a really bad and dangerous idea, but if you MUST do it (i.e. you're working in a legacy system or on hardware that you know will never change), then I would suggest wrapping the pointer in some kind of simple struct that contains two members: 1) a void pointer to your object instance, and 2) a string, enum, or some other kind of unique identifier that will tell you what to cast the original void* to. Here's an example of what I meant (note: I didn't bother testing this so there may be syntactical errors in it):
struct PtrWrapper {
void* m_theRealPointer;
std::string m_type;
};
void YourDangerousMethod( long argument ) {
if ( !argument )
return;
PtrWrapper& pw = *(PtrWrapper*)argument;
assert( !pw.m_type.empty() );
if ( pw.m_type == "ClassA" ) {
ClassA* a = (ClassA*)pw.m_theRealPointer;
a->DoSomething();
} else if (...) { ... }
}
dynamic_cast<> is a cast intended to be used only on convertible types (in the polymorphic sense). By forcing the cast of a pointer to a long (litb correctly suggests the static_assert to ensure the compatibility of the sizes), all the information about the type of the pointer is lost. There's no way to implement a safe_reinterpret_cast<> to obtain the pointer back: both value and type.
To clarify what I mean:
struct a_kind {};
struct b_kind {};
void function(long ptr)
{}
int
main(int argc, char *argv[])
{
a_kind * ptr1 = new a_kind;
b_kind * ptr2 = new b_kind;
function( (long)ptr1 );
function( (long)ptr2 );
return 0;
}
There's no way for function() to determine the kind of pointer passed and "down" cast it to the proper type, unless either:
the long is wrapped by an object with some information of the type.
the type itself is encoded in the referenced object.
Both solutions are ugly and should be avoided, since they are RTTI surrogates.
Also, better to use size_t instead of a long -- I think this type is ensured to be compatible with the size of the address space.
As soon as you decided to cast a pointer to a long, you threw type safety to the wind.
dynamic_cast is used to cast up & down a derivation tree. That is, from a base class pointer to a derived class pointer. If you have:
class Base
{
};
class Foo : public Base
{
};
class Bar : public Base
{
};
You can use dynamic_cast in this way...
Base* obj = new Bar;
Bar* bar = dynamic_cast<Bar*>(obj); // this returns a pointer to the derived type because obj actually is a 'Bar' object
assert( bar != 0 );
Foo* foo = dynamic_cast<Foo*>(obj); // this returns NULL because obj isn't a Foo
assert( foo == 0 );
...but you can't use dynamic_cast to cast into and out of a derivation tree. You need reinterpret_cast or C-style casts for that.