Is using a vtable the only way to implement the virtual member function mechanism in C++? What other ways exist?
Technically, all that's required for dynamic dispatch is the ability to identify the dynamic type of an object, given a pointer to it. Thus, any sort of hidden (or not-so-hidden) typeid field would work.
Dynamic dispatch would use that typeid to find the associated functions. The association could be a hash table, an array where the typeid is the index, or any other suitable mapping. The vptr just happens to be the way to achieve this in the fewest steps.
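As a hedged illustration (this is not how any real compiler does it; the names such as `TypeId` and `kDrawTable` are invented for the sketch), a hidden type id can index into one table of function pointers per virtual function:

```cpp
#include <cstdio>

// Hand-rolled dispatch: each object carries a small hidden type id instead
// of a vptr, and there is one table per "virtual" function indexed by it.
enum TypeId { kCircle = 0, kSquare = 1 };

struct Shape {
    TypeId type;  // the hidden type id a compiler could emit
};

void draw_circle(Shape*) { std::puts("circle"); }
void draw_square(Shape*) { std::puts("square"); }

using DrawFn = void (*)(Shape*);
constexpr DrawFn kDrawTable[] = { draw_circle, draw_square };

void draw(Shape* s) {
    kDrawTable[s->type](s);  // dynamic dispatch without a vtable
}

int main() {
    Shape c{kCircle}, q{kSquare};
    draw(&c);
    draw(&q);
}
```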
Another known mechanism is type dispatch functions. Effectively, you replace the vtable pointer with a typeid (a small enum). The (dynamic) linker collects all overrides of a given virtual function and wraps them in one big switch statement on the typeid field.
The theoretical justification is that this replaces an indirect jump (hard to predict) with lots of predictable jumps. With some smarts in choosing the enum values, the switch statement can be fairly efficient too (i.e. better than linear).
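A minimal sketch of such a type dispatch function, assuming the linker has collected two overrides of `draw` (the enum and function names here are invented):

```cpp
#include <cstdio>

enum TypeId { kBase = 0, kDerived = 1 };

struct Object {
    TypeId type;  // replaces the vtable pointer
};

void draw_base(Object*)    { std::puts("Base::draw"); }
void draw_derived(Object*) { std::puts("Derived::draw"); }

// The dispatch function the linker would synthesize for draw(): one
// predictable switch instead of an indirect call through a vtable.
void draw(Object* o) {
    switch (o->type) {
        case kBase:    draw_base(o);    break;
        case kDerived: draw_derived(o); break;
    }
}

int main() {
    Object a{kBase}, b{kDerived};
    draw(&a);
    draw(&b);
}
```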
Another possible implementation would be to store the pointers to the virtual functions directly in the objects. Of course, this solution is never used in practice (at least in no implementation I'm aware of) since it would lead to a dramatic increase in memory footprint. However, it is interesting to note that code using this implementation could actually run faster, since it removes a layer of indirection by suppressing the need for the vptr.
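A hedged sketch of that layout in plain C++ (the names are invented): every object carries its own copy of each function pointer, which is exactly why the footprint grows with the number of virtual functions.

```cpp
#include <cstdio>

struct Animal {
    // Each "virtual" function is a pointer stored in the object itself,
    // so a call is a single indirection with no vptr/vtable hop.
    void (*speak)(Animal*);
    void (*move)(Animal*);
};

void dog_speak(Animal*) { std::puts("woof"); }
void dog_move(Animal*)  { std::puts("runs"); }

Animal make_dog() { return Animal{dog_speak, dog_move}; }

int main() {
    Animal d = make_dog();
    d.speak(&d);  // no vtable lookup...
    d.move(&d);   // ...but sizeof(Animal) grows by one pointer per function
}
```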
I'm not aware of any compiler which implements virtual functions without using the vtable approach.
Theoretically, however, one could maintain an internal map from object pointers to tables of pointers to virtual functions, something like map<objPtr, functionTable*>, to implement dynamic polymorphism through virtual functions. But that would make dynamic dispatch slower than the vtable approach.
It seems the vtable approach is probably the fastest mechanism for implementing dynamic polymorphism. Maybe that is why all compilers employ it!
Discussion
I'm aware that all the implementations (i.e., C++ compilers) that I know of implement the dynamic dispatch mechanism via virtual dispatch tables and virtual table pointers (i.e., the well-known vtable and vptr).
However, reading the C++ standard, I found that it does not mandate exactly how dynamic dispatch must be implemented. This means that a vendor could use an alternative method for dynamic dispatch, provided that its behaviour complies with what the C++ standard demands of dynamic dispatch.
Questions
Q1. Are there any other valid methods, besides vtables and vptrs, with which dynamic dispatch could be implemented?
Q2. If Q1 is true: What are the reasons, if any, that made implementers decide to use vtables and vptrs to implement dynamic dispatch instead of some other valid method?
Q1: Dynamic compilers can implement virtual functions faster than using a vtable. Say a method is virtual, but all objects created so far use implementation X. A dynamic compiler will produce a direct call to implementation X or even inline it. When an object using a different implementation is created, all the code that might now be wrong will be recompiled.
Even if there are two implementations, the dynamic compiler may produce code like "if (object uses implementation X) { inlined_code_for_x(); } else { recompile_this_code(); }".
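Written out by hand, the guarded call site described above looks roughly like the toy sketch below (hedged: a real dynamic compiler emits and patches machine code rather than C++ source, and the type tag and function names are invented):

```cpp
#include <cstdio>

// Toy model of a JIT's inline cache: the call site is specialized for the
// single implementation seen so far, with a guard that falls back otherwise.
struct Object { int type; };

constexpr int kTypeX = 0;

void impl_x(Object*) { std::puts("inlined implementation X"); }

void slow_generic_dispatch(Object*) {
    std::puts("new type seen: a real JIT would now recompile this call site");
}

void call_site(Object* o) {
    if (o->type == kTypeX) {
        impl_x(o);                 // fast path: effectively a direct, inlinable call
    } else {
        slow_generic_dispatch(o);  // rare path: deoptimize and recompile
    }
}

int main() {
    Object a{kTypeX}, b{1};
    call_site(&a);
    call_site(&b);
}
```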
Q2: A potential reason: if you have a base class with many virtual functions and a huge vtable, and many derived classes which rarely override any of these virtual functions, then storing an essentially identical vtable for each class is inefficient. Both from a memory point of view, and potentially from an execution point of view, because certain processor optimisations don't work if pointers to the same function are stored in different memory locations.
Considering that a pointer to a function (possibly returning another function pointer) is the mechanism used in C to introduce some runtime polymorphism/callbacks, what is the equivalent way to implement this in C++ while improving locality and lowering the cost of pointers and indirections?
For example, this syntactic sugar can help, but I'm not really interested in it, although it's a nice way to do things in a C++ way instead of a more C-ish typedef. I'm more interested in improving locality while trying to reduce the use of explicit pointers at runtime.
The real reason people use function pointers in C to emulate polymorphism is not performance but the fact that C supports neither real polymorphism nor templates. Those are the two alternatives you have in C++. All three approaches are compared in this thread.
Note that even though calling a function pointer does not require the additional vtable lookup that virtual function calls do, virtual functions and function pointers both suffer from the same major performance problem: branch prediction is not as reliable in either case, and you tend to end up with more pipeline flushes.
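For comparison, here is a hedged sketch of the two C++ alternatives mentioned: runtime polymorphism via virtual functions, and compile-time polymorphism via a template, which the compiler can resolve and inline with no indirection at all (the `Sink` types and `log_*` functions are invented for illustration):

```cpp
#include <cstdio>

// Alternative 1: real polymorphism (dynamic dispatch through a vtable).
struct Sink {
    virtual void write(int x) = 0;
    virtual ~Sink() = default;
};

struct ConsoleSink : Sink {
    void write(int x) override { std::printf("virtual: %d\n", x); }
};

void log_virtual(Sink& s, int x) { s.write(x); }  // indirect call at runtime

// Alternative 2: templates (static dispatch, resolved at compile time).
struct FastSink {
    void write(int x) { std::printf("template: %d\n", x); }  // no vtable involved
};

template <typename SinkT>
void log_template(SinkT& s, int x) { s.write(x); }  // direct call, can be inlined

int main() {
    ConsoleSink cs;
    log_virtual(cs, 1);

    FastSink fs;
    log_template(fs, 2);
}
```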
I think you can use virtual functions to meet part of the requirement.
I hope this question is not too vague, but coming from Java, I cannot think of any reason why I would use non-virtual functions in C++.
Is there a nice example which demonstrates the benefit of non-virtual functions in C++?
Virtual functions have a runtime cost associated with them. They are dispatched at runtime and thus are slower to call. They are similar to calling regular functions through a function pointer, where the address is determined at runtime according to the actual type of the object. This incurs overhead.
One of the C++ design decisions has always been that you should not pay for things you don't need. In contrast, Java does not concern itself much with this kind of low-level optimization.
It is true that calling a virtual function can be slower, but not nearly as slow as most C++ programmers think.
Modern CPUs have gotten pretty good at branch prediction. If, every time you execute a particular call to a virtual function, you are actually calling the same implementation, the CPU will figure that out and start "guessing" (speculatively executing) the call before it even computes the address. This can often hide the cost of the virtual call completely, making it exactly as fast as a non-virtual call. (If you doubt this, try it for yourself on a current-generation processor.)
If you are not calling the same implementation, then you are actually relying on the virtual dispatch, so you could not directly replace it with a non-virtual function anyway.
The only common exception to this is inlined functions, where the compiler can perform constant propagation, CSE, etc. between the caller and callee. Obviously it cannot do this if it does not know the destination of the call at compile time.
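A small, compiler-dependent illustration of that exception (hedged: actual inlining decisions vary): the plain non-virtual call below is trivially inlinable and typically folded to a constant, the call through a reference of known dynamic type can be devirtualized, while the call through an arbitrary base reference generally cannot be inlined because its destination is unknown at compile time.

```cpp
struct Base {
    virtual int scale(int x) const { return x; }
    virtual ~Base() = default;
};

struct Doubler final : Base {  // 'final' also helps compilers devirtualize
    int scale(int x) const override { return 2 * x; }
    int scale_nonvirtual(int x) const { return 2 * x; }
};

// Destination unknown in general: no cross-call constant propagation or CSE.
int through_base(const Base& b) { return b.scale(21); }

// Exact type known: the call can be devirtualized and inlined.
int through_final(const Doubler& d) { return d.scale(21); }

// Plain non-virtual call: usually inlined and folded straight to 42.
int non_virtual(const Doubler& d) { return d.scale_nonvirtual(21); }

int main() {
    Doubler d;
    return through_base(d) + through_final(d) + non_virtual(d) - 126;  // 0
}
```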
But as a rule of thumb, your instinct that you always want to use virtual functions is not all that bad. The times when the performance difference is noticeable are rare.
Very few member functions in the standard library are virtual.
Offhand I can only remember the destructor and the what() member function of the standard exceptions.
As of 2012 the only good reason to have a virtual member function is to support overriding of that member function in a derived class, i.e. a customization point, and that can often be achieved in other ways (e.g. parameterization, templating).
However, I can remember, some 15 years ago, being very frustrated with the design of Microsoft's MFC class framework. I wanted every member function to be virtual so as to be able to override the functionality and to more easily debug things, as an alternative to the non-existent or very low quality documentation. Thus, I argued that virtual should be the default, in other software as well.
I have since understood that MFC was not, and is not, representative of C++ software in general, so the MFC-specific reasons do not apply in general. :-)
The efficiency cost of virtual functions is, like, virtually non-existent. :-) See for example the international standardization committee's Technical Report on C++ Performance. However, there is a real cost in providing this freedom for derived classes, because freedom implies responsibility: any derived class then has to ensure that overriding the member function respects the contract of the base class.
Well, one of the principles on which the C++ language is based is that you should not pay for something you don't use.
A virtual function call is more expensive than a non-virtual call, since in a typical implementation it goes through two (or three) additional levels of indirection. A virtual call also cannot be inlined, meaning that the expense can grow even higher due to the fact that we have to call a full-fledged function.
Adding virtual functions to a class makes it polymorphic, thus creating some invisible internal structures inside objects of that class. These structures incur additional housekeeping overhead and preclude low-level processing of class objects.
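A quick way to see those invisible structures is to compare object sizes (the exact numbers are implementation-specific; on a typical 64-bit ABI the polymorphic class grows by one hidden pointer plus padding):

```cpp
#include <cstdio>

struct Plain {        // no virtual functions: no hidden members
    int x;
};

struct Polymorphic {  // one virtual function adds a hidden vptr
    int x;
    virtual void f() {}
};

int main() {
    // Typical 64-bit output: sizeof(Plain)=4, sizeof(Polymorphic)=16
    // (vptr plus alignment); an implementation detail, not a guarantee.
    std::printf("sizeof(Plain)=%zu sizeof(Polymorphic)=%zu\n",
                sizeof(Plain), sizeof(Polymorphic));
}
```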
Finally, separating functions into virtual and non-virtual ones (i.e. into overridable and non-overridable ones) is a matter of your design. It simply makes no sense to unconditionally make every function in a class overridable in derived classes.
C++ is meant to be both fast like C and to support OO and generic programming (templates). To achieve both of these goals, C++ member functions by default can't be overridden unless you mark them as virtual, in which case the virtual table comes into play. So you can build classes that don't involve virtual functions when they are not needed.
Although not as efficient as non-virtual calls, virtual function calls through the virtual table are very fast. You might notice the difference only within tight loops that do nothing but call a member function. So the Java way, where all member functions are "virtual", is indeed more practical IMO.
Is there any efficiency disadvantage associated with deep inheritance trees (in C++), i.e., a large set of classes A, B, C, and so on, such that B extends A, C extends B, and so on? One efficiency implication that I can think of is that when we instantiate the bottommost class, say C, then the constructors of B and A are also called, which will have performance implications.
Let's enumerate the operations we should consider:
Construction/destruction
Each constructor/destructor will call its base class equivalents. However, as James McNellis pointed out, you were obviously going to do that work anyway. You didn't derive from A just because it was there. So the work is going to get done one way or another.
Yes, it will involve a few more function calls. But function call overhead will be nothing compared to the actual work any significantly deep class hierarchy will have to actually do. If you're at the point where function call overhead is actually important for performance, I would strongly suggest that calling constructors at all is probably not what you want to be doing in that code.
Object Size
In general, the overhead for a derived class is nothing. The per-object overhead for virtual members is a pointer (the vptr), and virtual inheritance adds a comparable pointer or offset per virtual base.
Member Function Calls, Static
By this, I mean calling non-virtual member functions, or calling virtual member functions with class names (ClassName::FunctionName syntax). Both of these allow the compiler to know at compile time which function to call.
The performance of this is invariant with the size of the hierarchy, since it's compile-time determined.
Member Function Calls, Dynamic
This is calling virtual functions with the full and complete expectation of runtime calls.
Under most sane C++ implementations, this is invariant with the size of the object hierarchy. Most implementations use a v-table for each class. Each object has a v-table pointer as a member. For any particular dynamic call, the compiler accesses the v-table pointer, picks out the method, and calls it. Since the v-table is the same for each class, it won't be any slower for a class that has a deep hierarchy than one with a shallow one.
Virtual inheritance complicates this a bit.
Pointer Casts, Static
This refers to static_cast or any equivalent operation. This means the implicit cast from a derived class to a base class, the explicit use of static_cast or C-style casts, etc.
Note that this technically includes reference casting.
The performance of static casts between classes (up or down) is invariant with the size of the hierarchy. Any pointer offsets will be compile-time generated. This should be true for virtual inheritance as well as non-virtual inheritance, but I'm not 100% certain of that.
Pointer Casts, Dynamic
This obviously refers to the explicit use of dynamic_cast. This is typically used when casting from a base class to a derived one.
The performance of dynamic_cast will likely change for a large hierarchy. But sane implementations should only check the classes between the current class and the requested one. So it's simply linear in the number of classes between the two, not linear in the number of classes in the hierarchy.
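For example (hedged: the actual cost depends on how the ABI compares type_info objects and walks the hierarchy), a downcast only has to consider the classes between the static type and the requested type:

```cpp
#include <cstdio>

struct A { virtual ~A() = default; };
struct B : A {};
struct C : B {};
struct D : C {};

int main() {
    D d;
    A* base = &d;

    // A typical implementation walks/compares along the A -> B -> C -> D
    // chain; the work grows with the distance between the two classes, not
    // with the total number of classes in the program.
    if (dynamic_cast<C*>(base)) std::puts("downcast to C succeeded");
    if (dynamic_cast<D*>(base)) std::puts("downcast to D succeeded");
}
```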
Typeid
This means the use of the typeid operator to fetch the std::type_info object associated with an object.
The performance of this will be invariant with the size of the hierarchy. If the class has a vtable (because it has virtual functions or virtual base classes), then the type_info is simply pulled out of it. Otherwise, the result is determined at compile time.
Conclusion
In short, most operations are invariant with the size of the hierarchy. But even in the cases where it has an impact, it's not a problem.
I'd be more concerned with some design ethic where you felt the need to build such a hierarchy. In my experience, hierarchies like this come from two lines of design.
The Java/C# ideal of having everything derived from a common base class. This is a horrible idea in C++ and should never be used. Each object should derive from what it needs to, and only that. C++ was built on the "pay for what you use" principle, and deriving from a common base works against that. In general, anything you could do with such a common base class is either something you shouldn't be doing period, or something that could be done with function overloading (using operator<< to convert to strings, for example).
Misuse of inheritance. Using inheritance when you should be using containment. Inheritance creates an "is a" relationship between objects. More often than not, "has a" relationships (one object having another as a member) are far more useful and flexible. They make it easier to hide data, and you don't allow the user to pretend one class is another.
Make sure that your design does not fall afoul of one of these principles.
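On the second point, a small sketch of the difference between inheriting and containing (the `Stack` here is invented for illustration): with containment, only the intended operations are exposed and the representation stays hidden, whereas publicly deriving from the container would let callers bypass the invariants.

```cpp
#include <vector>

// "has a": the vector is a private member, so only the intended operations
// leak out and callers cannot pretend a Stack is a std::vector.
class Stack {
public:
    void push(int v) { data_.push_back(v); }
    int pop() { int v = data_.back(); data_.pop_back(); return v; }
    bool empty() const { return data_.empty(); }

private:
    std::vector<int> data_;  // containment hides the representation
};

// By contrast, "struct Stack : std::vector<int> {};" (an "is a" design)
// would expose insert(), operator[], etc., breaking the stack discipline.

int main() {
    Stack s;
    s.push(1);
    s.push(2);
    return s.pop() - 2;  // 0
}
```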
There will be, but not as bad as the programmer-performance implications.
As #Nicol points out, it may be doing a number of things.
If those are things that you require to be done, regardless of design, because they are all precisely necessary steps in getting the program from the call to main to exit within the fewest possible cycles, then your design is simply a matter of coding clarity (or maybe the lack of it :).
In my experience of performance tuning, as in this example, what I often see as a huge source of wasted time is over-design of data (i.e. class) structures.
Weirdly enough, the justification for the data structures is often (guess what?) performance!
In my experience, the thing to do with a data structure is to keep it as simple and as normalized as possible. If it is completely normalized, then no single change to it can make it inconsistent. You can't always achieve complete normalization, in which case you have to deal with the possibility that the data can be temporarily inconsistent.
This is why people write notification handlers, and this is encouraged in OOP.
The idea is, if you change something in one place, that can trigger notifications that "automatically" propagate the change to other places, trying to maintain consistency.
The problem with notifications is they can run away. Simply changing some boolean property from true to false can cause a fire-storm of notifications ripping through the data structure in ways no one programmer understands, updating databases, painting windows, zipping files, etc. etc. I often find this is where most clock cycles go.
I think it is simpler and far more efficient to temporarily tolerate inconsistency, and periodically repair it with some kind of sweeping process.
Another way data structures go hand in hand with huge inefficiency is when the data is effectively being interpreted by some process to produce some output.
This is very common in graphics.
If the data changes at a very slow rate, it may make sense to "compile" it rather than "interpret" it.
In other words, translate it into a simpler instruction set, or source code which is compiled "on the fly", which can then execute far more quickly to produce the desired output.
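As a toy sketch of that idea (hedged: the `Step` mini-language and the `compile` helper are invented, and a real system would generate much flatter code or actual source compiled on the fly), the slowly-changing data is translated once into a directly callable form instead of being re-walked on every use:

```cpp
#include <cstdio>
#include <functional>
#include <vector>

enum class Op { Add, Mul };
struct Step { Op op; int operand; };

// "Interpret": walk the data structure on every evaluation.
int interpret(const std::vector<Step>& program, int x) {
    for (const Step& s : program)
        x = (s.op == Op::Add) ? x + s.operand : x * s.operand;
    return x;
}

// "Compile": translate the data once into a callable, so later evaluations
// no longer re-examine the description. (A real implementation would emit
// simpler instructions or source code rather than chained std::functions.)
std::function<int(int)> compile(const std::vector<Step>& program) {
    std::function<int(int)> f = [](int x) { return x; };
    for (const Step& s : program) {
        f = [f, s](int x) {
            int y = f(x);
            return (s.op == Op::Add) ? y + s.operand : y * s.operand;
        };
    }
    return f;
}

int main() {
    const std::vector<Step> program{{Op::Add, 3}, {Op::Mul, 2}};
    const auto compiled = compile(program);
    std::printf("%d %d\n", interpret(program, 5), compiled(5));  // 16 16
}
```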
What are the memory/performance overheads of enabling RTTI in a C++ program?
Can anyone please throw some light between the internal implementation of RTTI mechanism and the relevant overheads?
I do understand how to use RTTI through typeid and dynamic_cast; what I am trying to find out is the internal implementation details of how the runtime keeps track of this information and where the overhead comes from.
Enabling RTTI typically brings only a small overhead. The usual implementation carries a pointer to the type information structure in the vtable of an object. Since the vtable must be constructed anyway, the extra time is small - it's like adding another virtual function to the class.
typeid is therefore comparable to calling a virtual function.
dynamic_cast is slower - it needs to traverse the inheritance hierarchy to do a cast. Calling dynamic_cast too frequently can be a performance bottleneck. By 'can' I mean that it usually won't …
There is a slight bloat in executable size since the typeinfo structures need to be stored somewhere. In most cases it won't be relevant.
Please read the appropriate section in this document.
To sum up:
typeid (5.3.7): find the vtable, through that find the most derived class object, then extract the type_info from that object's vtable. It is still slow compared with an ordinary function call;
dynamic_cast (5.3.8): find the type_info as described above, then determine whether the conversion is possible, then adjust the pointers. The run-time cost depends on the relative positions in the class hierarchy of the two classes involved. Down-casts and cross-casts are currently very slow (though here you can find an article about a possible, but restricted, constant-time implementation of dynamic_cast).
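For reference, a short example exercising both operations; the cost notes in the comments restate the summary above for a typical vtable-based implementation and are not guarantees:

```cpp
#include <cstdio>
#include <typeinfo>

struct Base { virtual ~Base() = default; };
struct Derived : Base {};

int main() {
    Derived d;
    Base& b = d;

    // typeid on a polymorphic object: one vtable lookup to reach the
    // type_info of the most derived class.
    std::printf("dynamic type: %s\n", typeid(b).name());

    // dynamic_cast: fetches the same type_info, then checks the hierarchy to
    // decide whether the conversion is valid and how to adjust the pointer.
    // The cost depends on where the two classes sit relative to each other.
    if (Derived* p = dynamic_cast<Derived*>(&b)) {
        (void)p;
        std::puts("downcast succeeded");
    }
}
```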
First, there is no way to say exactly how much overhead is involved without specifying a compiler and version, as it is an implementation detail. That said, it is well known that in some compilers dynamic_cast searches the class hierarchy doing string comparisons to match class names.
I wonder where you got the idea of RTTI "overhead"?
I read on the net that, in order to provide RTTI, some (early) C-to-C++ preprocessors or translators, similar tools (GObject, Qt, Objective-C, I'm not sure), and other programming languages generated some "behind the scenes" code, which added some overhead in memory and speed.
I read that eventually that overhead was reduced, and it is often considered trivial.
Maybe you would like to program in assembly or "plain C", without the RTTI overhead; that is much easier than C++.