How is dynamic_cast typically implemented? - c++

Is the type check a mere integer comparison? Or would it make sense to have a GetTypeId virtual function to distinguishing which would make it an integer comparison?
(Just don't want things to be a string comparison on the class names)
EDIT: What I mean is, if I'm often expecting the wrong type, would it make sense to use something like:
struct Token
{
enum {
AND,
OR,
IF
};
virtual std::size_t GetTokenId() = 0;
};
struct AndToken : public Token
{
std::size_t GetTokenId() { return AND; }
};
And use the GetTokenId member instead of relying on dynamic_cast.

The functionality of the dynamic_cast goes far beyond a simple type check. If it was just a type check, it would be very easy to implement (something like what you have in your original post).
In addition to type checking, dynamic_cast can perform casts to void * and hierarchical cross-casts. These kinds of casts conceptually require some ability to traverse class hierarchy in both directions (up and down). The data structures needed to support such casts are more complicated than a mere scalar type id. The information the dynamic_cast is using is a part of RTTI.
Trying to describe it here would be counterproductive. I used to have a good link that described one possible implementation of RTTI... will try to find it.

I don't know the exact implementation, but here is an idea how I would do it:
Casting from Derived* to Base* can be done in compile time. Casting between two unrelated polimorphic types can be done in compile time too (just return NULL).
Casting from Base* to Derived* needs to be done in run-time, because multiple derived classes possible. The identification of dynamic type can be done using the virtual method table bound to the object (that's why it requires polymorphic classes).
This VMT probably contains extra information about the base classes and their data offsets. These data offsets are relevant when multiple inheritance is involved and is added to the source pointer to make it point to the right location.
If the desired type was not found among the base classes, dynamic_cast would return null.

In some of the original compilers you are correct they used string comparison.
As a result dynamic_cast<> was very slow (relatively speaking) as the class hierarchy was traversed each step up/down the hierarchy chain required a string compare against the class name.
This leads to a lot of people developing their own casting techniques. This was nearly always ultimately futile as it required each class to be annotated correctly and when things went wrong it was nearly impossible to trace the error.
But that is also ancient history.
I am not sure how it is done now but it definitely does not involve string comparison. Doing it yourself is also a bad idea (never do work that the compiler is already doing). Any attempt you make will not be as fast or as accurate as the compiler, remember that years of development have gone into making the compiler code as quick as possible (and it will always be correct).

The compiler cannot divine additional information you may have and stick it in dynamic_cast. If you know certain invariants about your code and you can show that your manual casting mechanism is faster, do it yourself. It doesn't really matter how dynamic_cast is implemented in that case.

Related

dynamic_cast downcasting: How does the runtime check whether Base points to Derived?

I am interested in understanding how, generally speaking, the runtime checks whether a base class actually points to a derived class when using dynamic_cast to apply a downcast.
I know that each virtual table of a polymorphic class contains RTTI too (in the form of type_info pointers).
Every compiler is going to have slight differences in implementation, but I'm going to use MSVC as a reference as they easily supply the source with VS.
You can view all of the details on how MSVC does it by going to your Visual Studio installation and going to /Community/VS/Tools/MSVC/${VERSION}/crt/src/vcruntime/rtti.cpp
The compiler will internally convert dynamic_cast's to call the function __RTDynamicCast (or __RTCastToVoid if try to cast to a void*), this will take in the RTTI info for the target type and the source type. The key element in the _RTTITypeDescriptor structure is the fully decorated name. It will then dispatch to one of 3 different implementations depending on whether the input type has single inheritance, multiple inheritance, or virtual inheritance.
For single inheritance FindSITargetTypeInstance will walk through the the list of base class types, and if it finds a match by pointer comparison it will return the class descriptor. If it fails to find a match by pointer comparison it will try again using string comparisons.
For multiple inheritance FindMITargetTypeInstance will walk the class hierarchy in a depth-first, left-to-right order until it has seen the descriptor for both the source type and the target type. Every comparison is done through TypeidsEqual which will first try to do a pointer comparison, but will immediately fallback to string comparison, this differs from single inheritance in that single inheritance did this in 2 separate loops as it is likely the pointers would match.
For virtual inheritance FindVITargetTypeInstance will walk the entire class hierarchy. This is the slowest of the three as it cannot exit early due to potential diamond problems. Every comparison is done through TypeidsEqual.
Once the descriptor for the type has been found it will use that to calculate an offset from the source pointer to the target pointer.
Overall the code is fairly simple and incredibly well documented, though I did leave out some nuances like checking to see if the base class was publicly derived, or distinguishing between down-casts, up-casts, and cross-casts.

Void pointer in C++

I learned that templates are the void* equivalents in C++. Is this true?
I have this issue in polling "events" off some procedure, when I have an EventType variable and I may also need to pass raw data that is related to that event.
struct WindowEvent {
enum Type {WINDOW_SIZE_CHANGE, USER_CLICK, ...};
void* data;
};
The user may then cast data to the necessary type, depending on the event type.
Is this approach okay in C++? Are there any better approaches?
In C, which generally lacks support for polymorphism, void* pointers can be used to accept data of any type, along with some run-time representation of the actual type, or just knowledge that the data will be casted back to the correct type.
In C++ and other languages with support for polymorphism one will generally instead use either dynamic polymorphism (classes with virtual functions) or static polymorphism (function overloads and templates).
The main difference is that the C approach yields dynamic (run time) manual type checking, while the C++ approaches yield mostly static (compile time) and fully automated type checking. This means less time spent on testing and hunting down silly easily preventable bugs. The cost is more verbose code, and that means that there's a code size offset somewhere, under which the C approach probably rules for productivity, and above which the C++ approaches rule.
"I learned that templates are the void equivalents in C++. Is this true?"*
No - Templates maintain type safety
"Is this approach okay in C++?"
No
"Are there any better approaches?"
Depending on the use case one could use (for example)
class EventData {
public:
virtual int getData() = 0;
};
And then use the appropriate inherited class. Perhaps using smart pointers.

C++ double dispatch "extensible" without RTTI

Does anyone know a way to have double dispatch handled correctly in C++ without using RTTI and dynamic_cast<> and also a solution, in which the class hierarchy is extensible, that is the base class can be derived from further and its definition/implementation does not need to know about that?
I suspect there is no way, but I'd be glad to be proven wrong :)
The first thing to realize is that double (or higher order) dispatch doesn't scale. With single
dispatch, and n types, you need n functions; for double dispatch n^2, and so on. How you
handle this problem partially determines how you handle double dispatch. One obvious solution is to
limit the number of derived types, by creating a closed hierarchy; in that case, double dispatch can
be implemented easily using a variant of the visitor pattern. If you don't close the hierarchy,
then you have several possible approaches.
If you insist that every pair corresponds to a function, then you basically need a:
std::map<std::pair<std::type_index, std::type_index>, void (*)(Base const& lhs, Base const& rhs)>
dispatchMap;
(Adjust the function signature as necessary.) You also have to implement the n^2 functions, and
insert them into the dispatchMap. (I'm assuming here that you use free functions; there's no
logical reason to put them in one of the classes rather than the other.) After that, you call:
(*dispatchMap[std::make_pair( std::type_index( typeid( obj1 ) ), std::type_index( typeid( obj2 ) )])( obj1, obj2 );
(You'll obviously want to wrap that into a function; it's not the sort of thing you want scattered
all over the code.)
A minor variant would be to say that only certain combinations are legal. In this case, you can use
find on the dispatchMap, and generate an error if you don't find what you're looking for.
(Expect a lot of errors.) The same solution could e used if you can define some sort of default
behavior.
If you want to do it 100% correctly, with some of the functions able to handle an intermediate class
and all of its derivatives, you then need some sort of more dynamic searching, and ordering to
control overload resolution. Consider for example:
Base
/ \
/ \
I1 I2
/ \ / \
/ \ / \
D1a D1b D2a D2b
If you have an f(I1, D2a) and an f(D1a, I2), which one should be chosen. The simplest solution
is just a linear search, selecting the first which can be called (as determined by dynamic_cast on
pointers to the objects), and manually managing the order of insertion to define the overload
resolution you wish. With n^2 functions, this could become slow fairly quickly, however. Since
there is an ordering, it should be possible to use std::map, but the ordering function is going to
be decidedly non-trivial to implement (and will still have to use dynamic_cast all over the
place).
All things considered, my suggestion would be to limit double dispatch to small, closed hierarchies,
and stick to some variant of the visitor pattern.
The "visitor pattern" in C++ is often equated with double dispatch. It uses no RTTI or dynamic_casts.
See also the answers to this question.
The first problem is trivial. dynamic_cast involves two things: run-time check and a type cast. The former requires RTTI, the latter does not. All you need to do to replace dynamic_cast with a functionality that does the same without requiring RTTI is to have your own method to check the type at run-time. To do this, all you need is a simple virtual function that returns some sort of identification of what type it is or what more-specific interface it complies to (that can be an enum, an integer ID, even a string). For the cast, you can safely do a static_cast once you have already done the run-time check yourself and you are sure that the type you are casting to is in the object's hierarchy. So, that solves the problem of emulating the "full" functionality of dynamic_cast without needing the built-in RTTI. Another, more involved solution is to create your own RTTI system (like it is done in several softwares, like LLVM that Matthieu mentioned).
The second problem is a big one. How to create a double dispatch mechanism that scales well with an extensible class hierarchy. That's hard. At compile-time (static polymorphism), this can be done quite nicely with function overloads (and/or template specializations). At run-time, this is much harder. As far as I know, the only solution, as mentioned by Konrad, is to keep a dispatch table of function pointers (or something of that nature). With some use of static polymorphism and splitting dispatch functions into categories (like function signatures and stuff), you can avoid having to violate type safety, in my opinion. But, before implementing this, you should think very hard about your design to see if this double dispatch is really necessary, if it really needs to be a run-time dispatch, and if it really needs to have a separate function for each combination of two classes involved (maybe you can come up with a reduced and fixed number of abstract classes that capture all the truly distinct methods you need to implement).
You may want to check how LLVM implement isa<>, dyn_cast<> and cast<> as a template system, since it's compiled without RTTI.
It is a bit cumbersome (requires tidbits of code in every class involved) but very lightweight.
LLVM Programmer's Manual has a nice example and a reference to the implementation.
(All 3 methods share the same tidbit of code)
You can fake the behaviour by implementing the compile-time logic of multiple dispatch yourself. However, this is extremely tedious. Bjarne Stroustrup has co-authored a paper describing how this could be implemented in a compiler.
The underlying mechanism – a dispatch table – could be dynamically generated. However, using this approach you would of course lose all syntactical support. You’d need to to maintain 2-dimensional matrix of method pointers and manually look up the correct method depending on the argument types. This would render a simple (hypothetical) call
collision(foo, bar);
at least as complicated as
DynamicDispatchTable::lookup(collision_signature, FooClass, BarClass)(foo, bar);
since you didn’t want to use RTTI. And this is assuming that all your methods take only two arguments. As soon as more arguments are required (even if those aren’t part of the multiple dispatch) this becomes more complicated still, and would require circumventing type safety.

Reflexion Perfect Forwarding and the Visitor Pattern

http://codepad.org/etWqYnn3
I'm working on some form of a reflexion system for C++ despite the many who have warned against. What I'm looking at having is a set of interfaces IScope, IType, IMember, IMonikerClient and a wrapper class which contains the above say CReflexion. Ignoring all but the member which is the important part here is what I would like to do:
1) Instance the wrapper
2) Determine which type is to be used
3) Instance type
4) Overload the () and [] to access the contained member from outer(the wrapper) in code as easily as it is done when using a std::vector
I find that using 0x I can forward a method call with any type for a parameter. I can't however cast dynamically as cast doesn't take a variable(unless there are ways I am unaware of!)
I linked the rough idea above. I am currently using a switch statement to handle the varying interfaces. I would, and for obvious reasons, like to collapse this. I get type match errors in the switch cases as a cause of the call to the methods compiling against each case where only one of three work for any condition and compiler errors are thrown.
Could someone suggest anything to me here? That is aside from sticking to VARIANT :/
Thanks!
C++, even in "0x land", simply does not expose the kind of information you would need to create something like reflection.
I find that using 0x I can forward a method call with any type for a parameter.
You cannot forward a type as a parameter. You can forward the const-volatile qualifiers on a member, but that's all done in templates, at compile time. No runtime check ever is done when you're using things like forward.
Your template there for operator() is not going to compile unless T is convertable to int*, string*, and A** all at once. Think of templates as a simple find and replace algorithm that generates several functions for you -- the value of T gets replaced with the typename when the template is instantiated, and the function is compiled as normal.
Finally, you can only use dyanmic_cast to cast down the class hierarchy -- casting between the completely unrelated types A B and C isn't going to operate correctly.
You're better off taking the time to rethink your design such that it doesn't use reflection at all. It will probably be a better design anyway, considering even in language with reflection, reflection is most often used to paper over poor designs.

C++ Abstract class can't have a method with a parameter of that class

I created this .h file
#pragma once
namespace Core
{
class IComparableObject
{
public:
virtual int CompareTo(IComparableObject obj)=0;
};
}
But compiler doesn't like IComparableObject obj param if the method is virtual pure, while
virtual int CompareTo(IComparableObject obj) {}
It's ok, however I want it as virtual pure. How can I manage to do it? Is it possible?
You are trying to pass obj by value. You cannot pass an abstract class instance by value, because no abstract class can ever be instantiated (directly). To do what you want, you have to pass obj by reference, for example like so:
virtual int CompareTo(IComparableObject const &obj)=0;
It works when you give an implementation for CompareTo because then the class is not abstract any longer. But be aware that slicing occurs! You don't want to pass obj by value.
Well I have to give an unexpected answer here! Dennycrane said you can do this:
virtual int CompareTo(IComparableObject const &obj)=0;
but this is not correct either. Oh yes, it compiles, but it is useless because it can never be implemented correctly.
This issue is fundamental to the collapse of (statically typed) Object Orientation, so it is vital that programmers using OO recognize the issue. The problem has a name, it is called the covariance problem and it destroys OO utterly as a general programming paradigm; that is, a way of representing and independently implementing general abstractions.
This explanation will be a bit long and sloppy so bear with me and try to read between the lines.
First, an abstract class with a pure virtual method taking no arguments can be easily implemented in any derived class, since the method has access to the non-static data variables of the derived class via the this pointer. The this pointer has the type of a pointer to the derived class, and so we can say it varies along with the class, in fact it is covariant with the derived class type.
Let me call this kind of polymorphism first order, it clearly supports dispatching predicates on the object type. Indeed, the return type of such a method may also vary down with the object and class type, that is, the return type is covariant.
Now, I will generalise the idea of a method with no arguments to allow arbitrary scalar arguments (such as ints) claiming this changes nothing: this is merely a family of methods indexed by the scalar type. The important property here is that the scalar type is closed. In a derived class exactly the same scalar type must be used. in other words, the type is invariant.
General introduction of invariant parameters to a virtual function still permits polymorphism, but the result is still first order.
Unfortunately, such functions have limited utility, although they are very useful when the abstraction is only first order: a good example is device drivers.
But what if you want to model something which is actually interesting, that is, it is at least a relation?
The answer to this is: you cannot do it. This is a mathematical fact and has nothing to do with the programming language involved. Lets suppose you have an abstraction for say, numbers, and you want to add one number to another number, or compare them (as in the OP's example). Ignoring symmetry, if you have N implementations, you will have to write N^2 functions to perform the operations. If you add a new implementation of the abstraction, you have to write N+1 new functions.
Now, I have the first proof that OO is screwed: you cannot fit N^2 methods into a virtual dispatch schema because such a schema is linear. N classes gives you N methods you can implement and for N>1, N^2 > N, so OO is screwed, QED.
In a C++ context you can see the problem: consider :
struct MyComparable : IComparableObject {
int CompareTo(IComparableObject &other) { .. }
};
Arggg! We're screwed! We can't fill in the .. part here because we only have a reference to an abstraction, which has no data in it to compare to. Of course this must be the case, because there are an open/indeterminate/infinite number of possible implementations. There's no possible way to write a single comparison routine as an axiom.
Of course, if you have various property routines, or a common universal representation you can do it, but this does not count, because then the mapping to the universal representation is parameterless and thus the abstraction is only first order. For example if you have various integer representations and you add them by converting both to GNU gmp's data type mpz, then you are using two covariant projection functions and a single global non-polymorphic comparison or addition function.
This is not a counter example, it is a non-solution of the problem, which is to represent a relation or method which is covariant in at least two variables (at least self and other).
You may think you could solve this with:
struct MyComparable : IComparableObject {
int CompareTo(MyComparable &other) { .. }
};
After all you can implement this interface because you know the representation of other now, since it is MyComparable.
Do not laugh at this solution, because it is exactly what Bertrand Meyer did in Eiffel, and it is what many people do in C++ with a small change to try to work around the fact it isn't type safe and doesn't actually override the base-class function:
struct MyComparable : IComparableObject {
int CompareTo(IComparableObject &other) {
try
MyComparable &sibling = dynamic_cast(other);
...
catch (..) { return 0; }
}
};
This isn't a solution. It says that two things aren't equal just because they have different representations. That does not meet the requirement, which is to compare two things in the abstract. Two numbers, for example, cannot fail to be equal just because the representation used is different: zero equals zero, even if one is an mpz and the other an int. Remember: the idea is to properly represent an abstraction, and that means the behaviour must depend only on the abstract value, not the details of a particular implementation.
Some people have tried double dispatch. Clearly, that cannot work either. There is no possible escape from the basic issue here: you cannot stuff a square into a line.
virtual function dispatch is linear, second order problems are quadratic, so OO cannot represent second order problems.
Now I want to be very clear here that C++ and other statically typed OO languages are broken, not because they can't solve this problem, because it cannot be solved, and it isn't a problem: its a simple fact. The reason these languages and the OO paradigm in general are broken is because they promise to deliver general abstractions and then fail to do so. In the case of C++ this is the promise:
struct IComparableObject { virtual int CompareTo(IComparableObject obj)=0; };
and here is where the implicit contract is broken:
struct MyComparable : IComparableObject {
int CompareTo(IComparableObject &other) { throw 00; }
};
because the implementation I gave there is effectively the only possible one.
Well before leaving, you may ask: What is the right way (TM).
The answer is: use functional programming. In C++ that means templates.
template<class T, class U> int compare(T,U);
So if you have N types to compare, and you actually compare all combinations, then yes indeed you have to provide N^2 specialisations. Which shows templates deliver on the promise, at least in this respect. It's a fact: you can't dispatch at run time over an open set of types if the function is variant in more than one parameter.
BTW: in case you aren't convinced by theory .. just go look at the ISO C++ Standard library and see how much virtual function polymorphism is used there, compared to functional programming with templates..
Finally please note carefully that I am not saying classes and such like are useless, I use virtual function polymorphism myself: I'm saying that this is limited to particular problems and not a general way to represent abstractions, and therefore not worthy of being called a paradigm.
From C++03, §10.4 3:
An abstract class shall not be used as a parameter type, as a function return type, or as the type of an explicit conversion. Pointers and references to an abstract class can be declared.
Passing obj as a const reference is allowed.
When the CompareTo member function is pure virtual, IComparableObject is an abstract class.
You can't directly copy an object of an abstract class.
When you pass an object by value you're directly copying that object.
Instead of passing by value, you can pass by reference to const.
That is, formal argument type IComparableObject const&.
By the way, the function should probably be declared const so that it can be called on const object.
Also, instead of #pragma once, which is non-standard (but supported by most compilers), consider an ordinary include guard.
Also, when posting code that illustrates a problem, be sure to post exact code. In this case, there's a missing semicolon at the end, indicating manual typing of the code (and so that there could be other typos not so easily identified as such, but instead misidentified as part of your problem). Simply copy and paste real code.
Cheers & hth.,