Let's say I have a type A and a derived type B. When I perform a dynamic cast from A* to B*, what kind of "runtime checks" does the environment perform? How does it know that the cast is legal?
I assume that in .NET it's possible to use the metadata attached to the object's header, but what happens in C++?
The exact algorithm is compiler-specific. Here's how it works according to the Itanium C++ ABI (2.9.7) standard (written after, and followed by, GCC).
A pointer to a base class is a pointer into the middle of the body of the "big" class. The body of a "big" class is assembled in such a way that, whichever base class your pointer points to, you can uniformly access the RTTI for the "big" class that your "base" object is in fact part of. This RTTI is a special structure that holds information about the "big" class: what type it is, what bases it has, and at what offsets they are.
In fact, it is the "metadata" of the class, but in more "binary" style.
V instance;
Base *v = &instance;
dynamic_cast<T>(v);
Dynamic cast makes use of the fact that when you write dynamic_cast<T>(v), the compiler can immediately identify the metadata for the "big" class of v, i.e. V! When you write it, you think that T is more derived than Base, so the compiler will have a hard time doing a base-to-derived cast. But the compiler can immediately (at runtime) determine the most derived type, V, and then it only has to traverse the inheritance graph contained in the metadata to check whether it can convert V to T. If it can, it just looks up the offset and adjusts the pointer. If it can't, or the conversion is ambiguous, it returns NULL.
Dynamic cast is a two-step process:
1. Given the vtable of a pointer to an object, use the offset to recover a pointer to the full class. (All adjustments will then be made from this pointer.) This is the equivalent of down-casting to the full class.
2. Search the type_info of the full class for the type we want - in other words, go through a list of all bases. If we find one, use the offset to adjust the pointer again.
If the search in step 2 fails, return NULL.
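To make the two steps concrete, here is a small sketch. The class names (Left, Right, Full, Other) are invented for illustration, and the exact object layout is up to the compiler; only the observable results of dynamic_cast are checked.

```cpp
#include <cassert>

struct Left  { virtual ~Left() = default;  int l = 1; };
struct Right { virtual ~Right() = default; int r = 2; };
struct Full  : Left, Right { int f = 3; };  // the "big" (most derived) class
struct Other : Left {};                     // a class that Full is not

inline void demonstrateDynamicCast() {
    Full full;
    Right* base = &full;  // typically points into the middle of `full`

    // Step 1: recover the address of the most derived object.
    void* whole = dynamic_cast<void*>(base);
    assert(whole == static_cast<void*>(&full));

    // Step 2: search the bases of the most derived type for the target.
    Left* sibling = dynamic_cast<Left*>(base);    // found: pointer is adjusted
    assert(sibling == static_cast<Left*>(&full));

    Other* missing = dynamic_cast<Other*>(base);  // not found: null pointer
    assert(missing == nullptr);
}
```

Note that the cast from Right* to Left* succeeds even though neither class derives from the other; the search runs over the bases of Full, not of Right.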
Dynamic cast performs a runtime check whether this is a valid and doable cast; it'll return NULL when it's not possible to perform the cast.
Refer to your favourite book on RTTI.
My base class is called Account while the derived class Businessaccount has an additional int variable called x, as well as a getter-method (int getx()) for it.
Is slicing supposed to occur in the following situation? It obviously happens in my case:
vector<shared_ptr<Account>> vec;
shared_ptr<Businessaccount> sptr = make_shared<Businessaccount>();
vec.push_back(sptr);
After that, if I do this:
(*vec.at(0)).getx();
it says that class<Account> has no member named getx()!
I'd be thankful if somebody would tell me why this occurs and how to fix it.
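(*vec.at(0)) is going to return an Account, which doesn't know about getx(). You need to cast the Account pointer to a Businessaccount pointer to reference that member.
// Requires Account to be polymorphic (at least one virtual function).
shared_ptr<Businessaccount> bA = dynamic_pointer_cast<Businessaccount>(vec.at(0));
if (bA)  // null if the element is not actually a Businessaccount
    bA->getx();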
No, slicing does not happen in this situation. Your pointer is just converted to a pointer to the base class, i.e. your data is the same and only the type of the pointer has changed. When slicing happens, you lose all the data of the derived class.
To solve the issue you either need to provide a virtual method in the base class that is properly implemented in Businessaccount, or use dynamic_cast or static_cast (if you are sure by some other means that the object has type Businessaccount). Using such a cast is usually a sign of a poorly designed program, though.
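A minimal sketch of the virtual-method option, assuming it makes sense for every Account to answer getx(). The class bodies, the default value 0 and the value 42 are invented for illustration and are not the asker's code.

```cpp
#include <cassert>
#include <memory>
#include <vector>

class Account {
public:
    virtual ~Account() = default;
    // Assumed default for accounts that have no x of their own.
    virtual int getx() const { return 0; }
};

class Businessaccount : public Account {
public:
    explicit Businessaccount(int x) : x_(x) {}
    int getx() const override { return x_; }
private:
    int x_;
};

inline int firstAccountX() {
    std::vector<std::shared_ptr<Account>> vec;
    vec.push_back(std::make_shared<Businessaccount>(42));
    // No cast needed: the call dispatches on the dynamic type.
    return vec.at(0)->getx();
}

inline void example() {
    assert(firstAccountX() == 42);
}
```

The vector still holds pointers to Account, so nothing is sliced; the virtual call just reaches the Businessaccount implementation.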
In C++ the static and dynamic type of an object pointed to can differ.
Your static type of what the shared pointers in that vec point to is Account.
The dynamic type of what the shared pointers in that vec point to varies. In your case, you put a Businessaccount in it.
When you want to access or call methods, you are given access to only the static type methods.
The static type is what you have proven to the compiler the type contains at that line.
If you know better, you can do a static_cast<Businessaccount*>(vec.at(0).get())->getx(). By doing so you are promising to the compiler that you have certain knowledge that the data at that location is actually a Businessaccount. If you are wrong, your program's behavior is undefined (if you are lucky you get a crash).
You can also use RTTI (run time type information) to ask if a particular object is a particular sub-type (in some cases, where the base class has a virtual method).
Account* paccount = vec.at(0).get();
Businessaccount* pba = dynamic_cast<Businessaccount*>(paccount);
if (pba)
    pba->getx();
The above checks whether paccount actually points to a Businessaccount, and if so calls getx on it. If not, it does nothing.
Often dynamic casting is a sign you didn't design your object use properly; having to drill down past the interface into which implementation means maybe your interface wasn't rich enough.
In some scripting and bytecode-compiled languages, you can go ahead and call getx, and the program crashes, throws an exception, etc. if that method isn't there.
C++ instead lets you use what you have claimed to be there (via the type system), then lets you write your own handler if you want dynamic type checking.
I'm working on a Box2d project.
In a particular class constructor, I do:
this->body->SetUserData(this);
where body is a member of this class. body is of type b2Body.
Later on, I call a method:
body->GetUserData();
GetUserData() returns void*
How do I determine what type of class the void* is pointing to?
EDIT: For those who don't use Box2d, you can set the user data to your wrapper class which holds all the non-physics related logic etc, while b2Body represents a physics body.
EDIT: For example, in Objective-C, one would cast void* to NSObject* and then call isMemberOf to determine whether it is of a particular type.
There's nothing intrinsic to C++ which will let you determine the type pointed to by a void*. Your best options are probably:
Make an abstract base class which all your userdata items will derive from. Maybe you already have one. Then you can assume the void* will always be a type derived from that base, and use it accordingly.
Make a discriminated union type (or use Boost.Variant), and always have the void* point to one of those.
Make a small struct which the void* will always point to an instance of, and make that struct be the first member of everything you assign to the void* (this will only work if you're doing more C-style programming, and the classes have no bases to interfere with the alignment).
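The first option could look roughly like the sketch below. Body is a hypothetical stand-in for a library class with a SetUserData/GetUserData pair like the one in the question; Entity, Player and Crate are invented names. The important detail is that the pointer is converted to Entity* before it becomes a void*, so the static_cast back to Entity* is an exact round trip.

```cpp
#include <cassert>

// Hypothetical stand-in for a library object that stores opaque user data.
class Body {
public:
    void SetUserData(void* data) { userData_ = data; }
    void* GetUserData() const { return userData_; }
private:
    void* userData_ = nullptr;
};

// Common base: every pointer stored as user data is an Entity*.
class Entity {
public:
    virtual ~Entity() = default;
};
class Player : public Entity {};
class Crate  : public Entity {};

// Taking Entity* here forces the conversion to the base before the void*.
inline void attach(Body& body, Entity* owner) { body.SetUserData(owner); }

inline Entity* ownerOf(const Body& body) {
    return static_cast<Entity*>(body.GetUserData());
}

// Once we have an Entity*, ordinary RTTI works again.
inline bool isPlayer(const Body& body) {
    return dynamic_cast<Player*>(ownerOf(body)) != nullptr;
}

inline void example() {
    Body body;
    Player player;
    attach(body, &player);
    assert(isPlayer(body));
}
```

A body with no user data yields a null Entity*, and dynamic_cast of a null pointer is simply null, so isPlayer returns false in that case.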
The short answer, as John Zwinck points out, is you can't.
The use of void* is an old C trick for adding user-configurable data to a library. It is normally used when integrating a middleware library into a bigger piece of software. The basic assumption here is that you know what type is behind the void*. In my experience with wrapping ODE and Bullet, this works out quite well.
There are basically two cases and with each you know what basic type is behind that void*:
In the first case you have a one to one correlation between a body or geometry to an object in the wrapping software. In this case you would simply reinterpret_cast to the wrapping object.
In the second case the body or geometry is contained in some "game object". This can be any object within the scene. But normally all "object within the scene" share a common base class. Here you simply assume that you can reinterpret_cast to the base class. Now you have an object in the hand, what you do from here is up to you. You can either call virtual methods on it, use dynamic_cast or some homebrew reflection.
I found the post below,
C++ polymorphism without pointers
which explains that to get polymorphism in C++ you must use pointer or reference types.
I looked into some further resources; all of them say the same, but none gives the reason.
Is there any technical difficulty in supporting polymorphism with values, or is it possible but C++ has decided not to provide that ability?
The problem with treating values polymorphically boils down to the object slicing issue: since derived objects could use more memory than their base class, declaring a value in the automatic storage (i.e. on the stack) leads to allocating memory only for the base, not for the derived object. Therefore, parts of the object that belong to the derived class may be sliced off. That is why C++ designers made a conscious decision to re-route virtual member-functions to the implementations in the base class, which cannot touch the data members of the derived class.
The difficulty comes from the fact that what you call objects are allocated in automatic memory (on the stack) and the size must be known at compile-time.
The size of a pointer is known at compile time regardless of what it points to, and references are implemented as pointers under the hood, so no worries there.
Consider objects though:
BaseObject obj = ObjectFactory::createDerived();
How much memory should be allocated for obj if createDerived() conditionally returns derived objects? To overcome this, the object returned is sliced and "converted" to a BaseObject whose size is known.
This all stems from the "pay for what you use" mentality.
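A short sketch of what the slicing looks like in practice. BaseObject and createDerived follow the snippet above (as a free function rather than a factory member); DerivedObject and the name() function are invented for illustration.

```cpp
#include <cassert>
#include <string>

struct BaseObject {
    virtual ~BaseObject() = default;
    virtual std::string name() const { return "base"; }
};

struct DerivedObject : BaseObject {
    int extra = 7;  // extra state that a BaseObject has no room for
    std::string name() const override { return "derived"; }
};

inline DerivedObject createDerived() { return DerivedObject{}; }

// Storing the result by value copies only the BaseObject part.
inline std::string nameByValue() {
    BaseObject obj = createDerived();
    return obj.name();
}

// Binding a reference keeps the whole DerivedObject alive behind it.
inline std::string nameByReference() {
    DerivedObject d = createDerived();
    const BaseObject& ref = d;
    return ref.name();
}

inline void example() {
    assert(nameByValue() == "base");
    assert(nameByReference() == "derived");
}
```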
The short answer is because the standard specifies it. But are there any insurmountable technical barriers to allowing it?
C++ data structures have known size. Polymorphism typically requires that the data structures can vary in size. In general, you cannot store a different (larger) type within the storage of a smaller type, so storing a child class with extra variables (or other reasons to be larger) within storage for a parent class is not generally possible.
Now, we can get around this. We can create a buffer larger than what is required to store the parent class, and construct child classes within that buffer: but in this case, exposure to said instance will be via references, and you will carefully wrap the class.
This is similar to the technique known as "small object optimization" used by boost::any, boost::variant and many implementations of std::string, where we store (by value) objects in a buffer within a class and manage their lifetime manually.
There is also an issue where Derived pointers to an instance can have different values than Base pointers to an instance: value instances of objects in C++ are presumed to exist where the storage for the instance starts by most implementations.
So in theory, C++ could allow polymorphic instances if we restricted it to derived classes that could be stored in the same memory footprint, with the same "pointer to" value for both Derived and Base, but this would be an extremely narrow corner case, and could reduce the kinds of optimizations and assumptions compilers could make about value instances of a class in nearly every case! (Right now, the compiler can assume that value instances of a class C have virtual methods that are not overridden elsewhere, as an example) That is a non trivial cost for an extremely marginal benefit.
What's more, we are capable of using the C++ language to emulate this corner case using existing language features (placement new, references, and manual destruction) if we really need it, without imposing the above cost.
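For what it's worth, here is a rough sketch of that emulation: a fixed-size buffer, placement new, access through a reference, and manual destruction. ShapeBox, Shape, Circle and the 64-byte capacity are all invented for illustration; a real implementation would also need to handle copying and moving, which this sketch simply forbids.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <type_traits>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};
struct Circle : Shape {
    double radius = 1.0;
    std::string name() const override { return "circle"; }
};

// Holds one Shape-derived object by value inside a fixed-size buffer.
class ShapeBox {
public:
    template <class T>
    explicit ShapeBox(const T& value) {
        static_assert(std::is_base_of<Shape, T>::value, "T must derive from Shape");
        static_assert(sizeof(T) <= kCapacity, "T does not fit in the buffer");
        static_assert(alignof(T) <= alignof(std::max_align_t), "T is over-aligned");
        shape_ = ::new (static_cast<void*>(buffer_)) T(value);  // placement new
    }
    ShapeBox(const ShapeBox&) = delete;             // copying would need a virtual clone
    ShapeBox& operator=(const ShapeBox&) = delete;
    ~ShapeBox() { shape_->~Shape(); }               // manual destruction

    Shape& get() { return *shape_; }                // access goes through a reference

private:
    static constexpr std::size_t kCapacity = 64;    // assumed upper bound on object size
    alignas(std::max_align_t) unsigned char buffer_[kCapacity];
    Shape* shape_ = nullptr;
};

inline void example() {
    ShapeBox box{Circle{}};
    assert(box.get().name() == "circle");
}
```

The box itself has a fixed size known at compile time, yet the object living inside it keeps its dynamic type.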
It is not immediately clear what you mean by "polymorphism with values". In C++ when you have an object of type A, it always behaves as an object of type A. This is a perfectly normal and logical thing to expect. I don't see how it could possibly behave in any other way. So, it is not clear what "ability" that someone decided "not to provide" you are talking about.
Polymorphism in C++ means one thing: virtual function calls made through an expression with polymorphic type are resolved in accordance with the dynamic type of that expression (as opposed to static type for non-virtual functions). That's all there is to it.
Polymorphism in C++ always works in accordance with the above rule. It works that way through pointers. It works that way through references. It works that way through immediate objects ("values" as you called them). So, it is not correct to say that polymorphism in C++ only works with pointers and references. It works with "values" as well. They all follow the same rule, as stated above.
However, for an immediate object (a "value") its dynamic type is always the same as its static type. So, even though polymorphism works for immediate values, it does not demonstrate anything truly "polymorphic". The behavior of an immediate object with polymorphism is the same as it would be without polymorphism. So, polymorphism of an immediate object is degenerate, trivial polymorphism. It exists only conceptually. This is, again, perfectly logical: an object of type A should behave as an object of type A. How else can it behave?
In order to observe the actual non-degenerate polymorphism, one needs an expression whose static type is different from its dynamic type. Non-trivial polymorphism is observed when an expression of static type A behaves (with regard to virtual function calls) as an object of different type B. For this an expression of static type A must actually refer to an object of type B. This is only possible with pointers or references. The only way to create that difference between static and dynamic type of an expression is through using pointers or references.
In other words, it's not correct to say that polymorphism in C++ only works through pointers or references. It is correct to say that with pointers or references polymorphism becomes observable and non-trivial.
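The difference between static and dynamic type can be observed with typeid, as in this small sketch (the classes A and B and their id() values are invented for illustration):

```cpp
#include <cassert>
#include <typeinfo>

struct A { virtual ~A() = default; virtual int id() const { return 1; } };
struct B : A { int id() const override { return 2; } };

// An immediate object: static type A, dynamic type A.
inline int idThroughValue(const B& b) {
    A value = b;
    assert(typeid(value) == typeid(A));
    return value.id();
}

// A reference: static type A, dynamic type whatever it refers to.
inline int idThroughReference(const B& b) {
    const A& ref = b;
    assert(typeid(ref) == typeid(B));
    return ref.id();
}

inline bool dynamicTypeIsB(const A& ref) {
    return typeid(ref) == typeid(B);
}
```

Both functions make a virtual call through an expression of static type A; only the reference version has a dynamic type that differs from it.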
I don't see what the following macro is doing. If anyone can help me understand it, it would be appreciated.
#define BASE_OFFSET(ClassName,BaseName)\
(DWORD(static_cast < BaseName* >( reinterpret_cast\
< ClassName* >(0x10000000)))-0x10000000)
If anyone is curious where it comes from: it is from the third chapter of Don Box's book Essential COM, where he builds a QueryInterface function using interface tables. The macro above is somehow used to find the pointer to the interface vtable of the class, where the class is the ClassName implementing BaseName, although I don't know how it does that.
It tells the compiler: "imagine there is a ClassName object at 0x10000000. Where would the BaseName data begin in that object, relative to 0x10000000?"
Think of a memory layout of a class object with multiple bases:
class A: B, C{};
In the memory block that constitutes an A object, there's the chunk of data that belong to B, also a chunk of data that belongs to C, and the data that are specific to A. Since the address of at least one base's data cannot be the same as the address of the class instance as a whole, the numeric value of the this pointer that you pass to different methods needs to vary. The macro retrieves the value of the difference.
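The same difference can be measured on a real object instead of a made-up address. The sketch below is not the book's macro; baseOffset is an invented helper, and the actual offsets depend on the compiler's layout.

```cpp
#include <cassert>
#include <cstddef>

struct B { int b = 0; };
struct C { int c = 0; };
struct A : B, C { int a = 0; };

// Byte distance from the start of a Derived object to its Base subobject.
template <class Base, class Derived>
std::ptrdiff_t baseOffset(const Derived& object) {
    const Base* base = &object;  // derived-to-base conversion adjusts the address
    return reinterpret_cast<const char*>(base)
         - reinterpret_cast<const char*>(&object);
}

inline void example() {
    A object;
    // B and C both hold data, so they cannot start at the same address.
    assert(baseOffset<B>(object) != baseOffset<C>(object));
}
```

On common ABIs baseOffset<B> comes out as 0 and baseOffset<C> as sizeof(B), but that is a property of the implementation, not a guarantee.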
EDIT: The pointer to the vtable is, by convention, the first data member in any class with virtual functions. So by finding the address of the base data, one finds the address of its vtable pointer.
Now, about the type conversion. Normally, when you typecast pointers, the operation is internally trivial - the numeric value of the address does not depend on what type it points to; the very notion of datatype only exists on the C level. There's one important exception though - when you cast object pointers with multiple inheritance. As we've just discussed, the this pointer that you need to pass to a base class method might be numerically different from the one of the derived object's.
So the distinction between static_cast and reinterpret_cast captures this difference neatly. When you use reinterpret_cast, you're telling the compiler: "I know better. Take this numeric value and interpret it as a pointer to what I say". This is a deliberate subversion of the type system, dangerous, but occasionally necessary. This kind of cast is by definition trivial - because you say so.
By "trivial" I mean - the numeric value of the pointer does not change.
The static_cast is a more high level construct. In this particular case, you're casting between an object and its base. That's a reasonable, safe cast under C++ class rules - BUT it might be numerically nontrivial. That's why the macro uses two different typecasts. static_cast does NOT violate the type system.
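A small sketch of the two casts side by side, using invented classes. The reinterpret_cast result is only compared numerically and never dereferenced, and the check assumes an ordinary flat address space where converting a pointer to an integer preserves its value.

```cpp
#include <cassert>
#include <cstdint>

struct B { int b = 0; };
struct C { int c = 0; };
struct A : B, C { int a = 0; };

inline std::uintptr_t address(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// reinterpret_cast: same numeric value, just relabelled as C*.
inline bool reinterpretKeepsAddress(A* a) {
    return address(reinterpret_cast<C*>(a)) == address(a);
}

// static_cast: the address of the real C subobject, which starts at its member c.
inline bool staticCastFindsSubobject(A* a) {
    return address(static_cast<C*>(a)) == address(&a->c);
}

inline void example() {
    A object;
    assert(reinterpretKeepsAddress(&object));
    assert(staticCastFindsSubobject(&object));
    // The two base subobjects hold data, so they live at different addresses.
    assert(address(static_cast<B*>(&object)) != address(static_cast<C*>(&object)));
}
```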
To recap:
reinterpret_cast<ClassName* >(0x10000000)
is an unsafe operation. It returns a bogus pointer to a bogus object, but it's OK because we never dereference it.
static_cast<BaseName*>(...)
is a safe operation (with an unsafe pointer, the irony). It's the part where the nontrivial pointer typecast happens.
(DWORD(...)-0x10000000)
is pure arithmetic. That's where the unsafety doubles back on itself: rather than use the pointer as a pointer, we cast it back to an integer and forget that it ever was a pointer.
The last stage could be equivalently rephrased as:
((char*)(...)-(char*)0x10000000)
if that makes more sense.
A remark about the magic 0x10000000 constant.
If that constant were 0, GCC would emit the -Winvalid-offsetof warning (if it is enabled, of course). Other compilers may do something similar.
I'm not talking about a pointer to an instance, I want a pointer to a class itself.
In C++, classes are not "first class objects". The closest you can get is a pointer to its type_info instance.
No. A pointer is the address of something in the memory of the computer at run-time. A class is just a set of instructions to the compiler.
As everyone else has already said, it's not possible to have a pointer to a class.
But if the point is to create a new instance from some class chosen at runtime, you might want to check out the Factory Method (or Abstract Factory) design patterns.
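A minimal sketch of such a factory, assuming the class is selected by a string key at runtime. Shape, Circle, Square, the keys and the registry function are all invented for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
};
struct Circle : Shape { std::string name() const override { return "circle"; } };
struct Square : Shape { std::string name() const override { return "square"; } };

using Creator = std::function<std::unique_ptr<Shape>()>;

// Maps a runtime key to a function that constructs the matching class.
inline const std::map<std::string, Creator>& registry() {
    static const std::map<std::string, Creator> creators = {
        {"circle", [] { return std::unique_ptr<Shape>(new Circle); }},
        {"square", [] { return std::unique_ptr<Shape>(new Square); }},
    };
    return creators;
}

// Returns a new object for `key`, or null if no class is registered under it.
inline std::unique_ptr<Shape> create(const std::string& key) {
    auto it = registry().find(key);
    if (it == registry().end()) return nullptr;
    return it->second();
}

inline void example() {
    std::unique_ptr<Shape> shape = create("circle");
    assert(shape && shape->name() == "circle");
}
```

The key plays the role that a "pointer to a class" would play in a language where classes are objects.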
Yes and No. This depends on your context of what you are trying to achieve. If you simply want a pointer to a type then no there is not a way. A type does not live in memory in the sense of a pointer.
The reason I said yes, though, is that some people would consider the virtual table a pointer to a type. It is possible to get this pointer, since the virtual table does exist in memory and can be used to invoke virtual methods with a bit of trickery.
Unlike true Object-Based languages, a class is not an object in C++, more is the pity. The closest you can come to "pointer to class" is RTTI:
const std::type_info &info = typeid(object expression);
type_info has a name() member function, and type_info objects can be compared to each other.
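type_info objects cannot be copied, but the standard std::type_index wrapper (C++11) can, which makes it usable as a kind of handle for a class. A small sketch with invented Animal classes, counting objects per dynamic type:

```cpp
#include <cassert>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <vector>

struct Animal { virtual ~Animal() = default; };
struct Dog : Animal {};
struct Cat : Animal {};

// Counts objects per dynamic class, keyed by std::type_index.
inline std::unordered_map<std::type_index, int>
countByClass(const std::vector<const Animal*>& zoo) {
    std::unordered_map<std::type_index, int> counts;
    for (const Animal* animal : zoo) {
        ++counts[std::type_index(typeid(*animal))];
    }
    return counts;
}

inline void example() {
    Dog rex, fido;
    Cat tom;
    auto counts = countByClass({&rex, &fido, &tom});
    assert(counts[std::type_index(typeid(Dog))] == 2);
    assert(counts[std::type_index(typeid(Cat))] == 1);
}
```

The string returned by name() is implementation-defined, so it is better suited to diagnostics than to comparisons.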
A "Class" does not exist. The only thing you can point to is the data.
The rest of a "Class" is actually a dispatch table. For each method in the class, the dispatch table has a pointer. That way, the class points to the correct method of your class regardless of what type it's currently cast to. This would be useless to access.
Methods in your class (the things pointed to by the dispatch table) are actually just "Functions" that are passed in your class data pointer. The definition of a method is pretty much that it's a function that takes the classes data as a parameter. In most C-style languages, that data pointer is hidden but referred to as "this".
The methods for your class may be spread all over the codebase. Because of parent classes, you're not likely even find these methods adjacent to each other.
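The "function that takes the data pointer" view can be illustrated as follows. Counter and counterAdd are invented names; the free function is only a conceptual equivalent of the member function, not how any particular compiler is required to implement it.

```cpp
#include <cassert>

struct Counter {
    int value = 0;
    void add(int n) { value += n; }  // `this` is passed implicitly
};

// The same operation as a free function with the object pointer spelled out.
inline void counterAdd(Counter* self, int n) { self->value += n; }

inline int viaMember() {
    Counter c;
    c.add(5);
    return c.value;
}

inline int viaFreeFunction() {
    Counter c;
    counterAdd(&c, 5);
    return c.value;
}

// A pointer to member function needs an object supplied at the call site.
inline int viaMemberPointer() {
    Counter c;
    void (Counter::*method)(int) = &Counter::add;
    (c.*method)(5);
    return c.value;
}

inline void example() {
    assert(viaMember() == viaFreeFunction());
    assert(viaMember() == viaMemberPointer());
}
```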
You can't have a (run-time) pointer to a class, but C++ does have a similar compile-time concept: template parameters. Boost has a library dedicated to manipulating them and a traits library for getting information about classes.
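As a small sketch of the compile-time idea, here a class is passed as a template parameter and inspected with a trait. It uses the standard <type_traits> header rather than the Boost libraries mentioned above; Widget, Button, Slider and make are invented names.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct Widget { virtual ~Widget() = default; virtual int kind() const = 0; };
struct Button : Widget { int kind() const override { return 1; } };
struct Slider : Widget { int kind() const override { return 2; } };

// The class to instantiate is chosen by the caller at compile time through T.
template <class T>
std::unique_ptr<Widget> make() {
    static_assert(std::is_base_of<Widget, T>::value, "T must derive from Widget");
    return std::unique_ptr<Widget>(new T());
}

inline void example() {
    assert(make<Button>()->kind() == 1);
    assert(make<Slider>()->kind() == 2);
}
```

Unlike a run-time pointer, T has to be known when the code is compiled; make<int>() would be rejected by the static_assert.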
Depending upon how you want to think about pointers, you can have a "pointer" to a class, if by pointer you mean some integral value. Boost allows you to register types and assign a unique integer for every type that you register. If the types you are registering are all classes then you can look up at run-time the code necessary to create an object of the type you want, as long as you have the value of the type you want. But in general, classes aren't first class objects in the language and the best you can hope for is to simulate the behavior you want to have.
True, there is no support for reflection/introspection built into C++, but there are a number of libraries that add much of the functionality of e.g. Java's Class and allow a programmer to get an object representing a class, create an instance, etc. Search for "C++ reflection".