No RTTI but still virtual methods - C++

C++ code can be compiled with run-time type information disabled, which disables dynamic_cast. But, virtual (polymorphic) methods still need to be dispatched based on the run-time type of the target. Doesn't that imply the type information is present anyway, and dynamic_cast should be able to always work?

Disabling RTTI kills dynamic_cast and typeid but has no impact on virtual functions. Virtual functions are dispatched via the "vtable" of classes which have any virtual functions; if you want to avoid having a vtable you can simply not have virtual functions.
Lots of C++ code in the wild can work without dynamic_cast and almost all of it can work without typeid, but relatively few C++ applications would survive without any virtual functions (or more to the point, functions they expected to be virtual becoming non-virtual).
Virtual dispatch costs just a per-instance pointer (the vptr) into a per-type lookup table (the vtable) of all virtual functions. You only pay for what you use (Bjarne loves this philosophy, and initially resisted RTTI). With full RTTI, on the other hand, you end up with your libraries and executables having quite a lot of elaborate strings and other information baked in that describe the name of each type and perhaps other things, like the hierarchical relations between types.
I have seen production systems where disabling RTTI shrunk the size of executables by 50%. Most of this was due to the massive string names that end up in some C++ programs which use templates heavily.
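To make that concrete, here is a minimal hand-rolled model of what a compiler might generate (purely illustrative; real vtable layouts are ABI-specific and every name here is invented). Virtual dispatch needs only one pointer per object and one table per type, with no type-name strings anywhere:
#include <cstdio>

struct Shape; // forward declaration so the table can name it

struct VTable {
    void (*draw)(const Shape*); // one slot per virtual function
};

struct Shape {
    const VTable* vptr; // the per-instance cost: a single pointer
};

static void draw_circle(const Shape*) { std::puts("circle"); }

// One table per type, shared by every instance of that type.
static const VTable circle_vtable = { &draw_circle };

int main() {
    Shape s{ &circle_vtable };
    s.vptr->draw(&s); // a "virtual call": indirect through the per-type table
}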

Related

Why aren't all functions virtual in C++? [duplicate]

Is there any real reason not to make a member function virtual in C++? Of course, there's always the performance argument, but that doesn't seem to stick in most situations since the overhead of virtual functions is fairly low.
On the other hand, I've been bitten a couple of times with forgetting to make a function virtual that should be virtual. And that seems to be a bigger argument than the performance one. So is there any reason not to make member functions virtual by default?
One way to read your question is "Why doesn't C++ make every function virtual by default, unless the programmer overrides that default?" Without consulting my copy of "Design and Evolution of C++": this would add extra storage to every object unless every member function were made non-virtual. It seems to me this would have required more effort in the compiler implementation, and slowed the adoption of C++ by providing fodder to the performance obsessed (I count myself in that group).
Another way to read your question is "Why don't C++ programmers make every function virtual unless they have very good reasons not to?" The performance excuse is probably the reason. Depending on your application and domain, it might be a good one or not. For example, part of my team works on market data ticker plants. At 100,000+ messages/second on a single stream, the virtual function overhead would be unacceptable. Other parts of my team work on complex trading infrastructure. Making most functions virtual is probably a good idea in that context, as the extra flexibility beats the micro-optimization.
Stroustrup, the designer of the language, says:
Because many classes are not designed to be used as base classes. For example, see class complex.
Also, objects of a class with a virtual function require space needed by the virtual function call mechanism - typically one word per object. This overhead can be significant, and can get in the way of layout compatibility with data from other languages (e.g. C and Fortran).
See The Design and Evolution of C++ for more design rationale.
There are several reasons.
First, performance: Yes, the overhead of a virtual function is relatively low seen in isolation. But it also prevents the compiler from inlining, and that is a huge source of optimization in C++. The C++ standard library performs as well as it does because it can inline the dozens and dozens of small one-liners it consists of. Additionally, a class with virtual methods is not a POD datatype, and so a lot of restrictions apply to it. It can't be copied just by memcpy'ing, it becomes more expensive to construct, and takes up more space. There are a lot of things that suddenly become illegal or less efficient once a non-POD type is involved.
And second, good OOP practice. The point in a class is that it makes some kind of abstraction, hides its internal details, and provides a guarantee that "this class will behave so and so, and will always maintain these invariants. It will never end up in an invalid state".
That is pretty hard to live up to if you allow others to override any member function. The member functions you defined in the class are there to ensure that the invariant is maintained. If we didn't care about that, we could just make the internal data members public, and let people manipulate them at will. But we want our class to be consistent. And that means we have to specify the behavior of its public interface. That may involve specific customizability points, by making individual functions virtual, but it almost always also involves making most methods non-virtual, so that they can do the job of ensuring that the invariant is maintained. The non-virtual interface idiom is a good example of this:
http://www.gotw.ca/publications/mill18.htm
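A minimal sketch of the idiom, with invented names: the public non-virtual function enforces the invariant, and derived classes can customize only the private virtual step:
#include <cassert>

class Connection {
public:
    // Public and non-virtual: the invariant is checked in exactly one place.
    void send(int bytes) {
        assert(bytes > 0); // invariant enforced here
        do_send(bytes);    // customization point
    }
    virtual ~Connection() = default;
private:
    virtual void do_send(int bytes) = 0; // the only overridable piece
};

class TcpConnection : public Connection {
private:
    void do_send(int bytes) override { (void)bytes; /* write to socket */ }
};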
Third, inheritance isn't often needed, especially not in C++. Templates and generic programming (static polymorphism) can in many cases do a better job than inheritance (runtime polymorphism). Yes, you sometimes still need virtual methods and inheritance, but it is certainly not the default. If it is, you're Doing It Wrong. Work with the language, rather than try to pretend it was something else. It's not Java, and unlike Java, in C++ inheritance is the exception, not the rule.
I'll ignore performance and memory cost, because I have no way to measure them for the "in general" case...
Classes with virtual member functions are non-POD. So if you want to use your class in low-level code which relies on it being POD, then (among other restrictions) any member functions must be non-virtual.
Examples of things you can portably do with an instance of a POD class:
copy it with memcpy (provided the target address has sufficient alignment).
access fields with offsetof()
in general, treat it as a sequence of char
... um
that's about it. I'm sure I've forgotten something.
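For what it's worth, std::is_trivially_copyable is the modern way to check the memcpy point; a small sketch with invented type names:
#include <cstddef>
#include <cstring>
#include <type_traits>

struct Plain { int x; double y; };            // trivially copyable, standard-layout
struct Poly  { virtual void f() {} int x; };  // one virtual function changes that

static_assert(std::is_trivially_copyable<Plain>::value, "memcpy is fine");
static_assert(!std::is_trivially_copyable<Poly>::value, "memcpy here would be UB");

int main() {
    Plain a{1, 2.0}, b;
    std::memcpy(&b, &a, sizeof a);        // portable for trivially copyable types
    std::size_t off = offsetof(Plain, y); // well-defined for standard-layout types
    (void)off;
}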
Other things people have mentioned that I agree with:
Many classes are not designed for inheritance. Making their methods virtual would be misleading, since it implies child classes might want to override the method, and there shouldn't be any child classes.
Many methods are not designed to be overridden: same thing.
Also, even when things are intended to be subclassed / overridden, they aren't necessarily intended for run-time polymorphism. Very occasionally, despite what OO best practice says, what you want inheritance for is code reuse. For example if you're using CRTP for simulated dynamic binding. So again you don't want to imply your class will play nicely with runtime polymorphism by making its methods virtual, when they should never be called that way.
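A minimal CRTP sketch with invented names: the base class reuses code by calling into the derived class, resolved at compile time with no vtable involved:
template <typename Derived>
class Counter {
public:
    void increment() {
        ++count_;
        static_cast<Derived*>(this)->on_increment(); // static "dispatch"
    }
protected:
    int count_ = 0;
};

class LoggingCounter : public Counter<LoggingCounter> {
public:
    void on_increment() { /* log the change */ }
};

int main() {
    LoggingCounter c;
    c.increment(); // no virtual call anywhere
}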
In summary, member functions intended to be overridden for runtime polymorphism should be marked virtual, and those which aren't, shouldn't be. If you find that almost all your member functions are intended to be virtual, mark them virtual unless there's a reason not to. If you find that most of them are not, don't mark them virtual unless there's a reason to do so.
It's a tricky issue when designing a public API, because flipping a method from one to the other is a breaking change, so you have to get it right first time. But you don't necessarily know before you have any users, whether your users are going to want to "polymorph" your classes. Ho hum. The STL container approach, of defining abstract interfaces and banning inheritance entirely, is safe but sometimes requires users to do more typing.
The following post is mostly opinion, but here goes:
Object-oriented design is three things, and encapsulation (information hiding) is the first of them. If a class design is not solid on this, then the rest doesn't really matter very much.
It has been said before that "inheritance breaks encapsulation" (Alan Snyder, '86). A good discussion of this is present in the Gang of Four design patterns book. A class should be designed to support inheritance in a very specific manner; otherwise, you open the possibility of misuse by inheritors.
I would make the analogy that making all of your methods virtual is akin to making all of your members public. A bit of a stretch, I know, but that's why I used the word 'analogy'.
As you are designing your class hierarchy, it may make sense to write a function that should not be overridden. One example is if you are doing the "template method" pattern, where you have a public method that calls several private virtual methods. You would not want derived classes to override that; everyone should use the base definition.
There is no "final" keyword, so the best way to communicate to other developers that a method should not be overridden is to make it non-virtual. (other than easily ignored comments)
At the class level, making the destructor non-virtual communicates that the class should not be used as a base class, such as the STL containers.
Making a method non-virtual communicates how it should be used.
The Non-Virtual Interface idiom makes use of non-virtual methods. For more information, see Herb Sutter's "Virtuality" article:
http://www.gotw.ca/publications/mill18.htm
And comments on the NVI idiom:
http://www.parashift.com/c++-faq-lite/strange-inheritance.html#faq-23.3
http://accu.org/index.php/journals/269 [See sub-section]

When do you prefer virtual functions over templates in C++?

I was interviewed by a financial company and was asked this question:
"List the case(s) when you prefer virtual functions over templates?"
It sounds weird to me, because usually we aim for the opposite, right?
All the books, articles, talks are there encouraging us to use static polymorphism instead of dynamic.
Are there any known cases I was not aware of when you should use virtual functions and avoid templates?
GUI / visualization widget toolkits are an obvious case. Re-implementing a draw method, for example, is certainly less cumbersome with virtual methods and dynamic dispatch. And since modern C++ tends to discourage raw pointer management, std::unique_ptr can manage the resource for you.
I'm sure there are plenty of other hierarchical examples you can come up with... a base enemy class for a game, with virtual methods handling the behaviour of various baddies:)
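A small sketch of the widget case (all names invented, one plausible shape of such a toolkit): the concrete types are only known at run time, so dynamic dispatch through a base interface is the natural fit, with std::unique_ptr owning the objects:
#include <memory>
#include <vector>

class Widget {
public:
    virtual void draw() const = 0;
    virtual ~Widget() = default;
};

class Button : public Widget {
public:
    void draw() const override { /* render a button */ }
};

class Slider : public Widget {
public:
    void draw() const override { /* render a slider */ }
};

int main() {
    std::vector<std::unique_ptr<Widget>> widgets; // heterogeneous container
    widgets.push_back(std::make_unique<Button>());
    widgets.push_back(std::make_unique<Slider>());
    for (const auto& w : widgets) w->draw();      // dispatched per element
}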
The whole 'overhead' argument for dynamic dispatch is completely without merit today. I'd argue that the vtable indirection implementation hasn't been a significant overhead for serious workloads in decades. There's the more interesting question that if C++ was designed today, would polymorphism be part of the language? But that's neither here nor there now.
I don't like the chances of this question remaining open, as it's not a direct programming problem and is probably too subject to opinions. It might be a better question for software engineering.
When the type of the object is not known at compile time, use virtual methods.
e.g.
void Accept(Fruit* pFruit) // supplied by external factors at runtime
{
    pFruit->eat(); // `Fruit` can be any one of `Apple`/`Blackberry`/`Chickoo`/...
}
Based on what the user enters, a fruit will be supplied to the function. Hence, there is no way to figure out at compile time which eat() is going to be called. So it's a candidate for runtime polymorphism:
class Fruit
{
public:
    virtual void eat() = 0;
    virtual ~Fruit() = default; // needed for safe deletion through a Fruit*
};
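To complete the sketch, here is one hypothetical derived type and a call site (assuming Fruit is defined before Accept):
class Apple : public Fruit
{
public:
    void eat() override { /* apple-specific behaviour */ }
};

int main()
{
    Apple apple;
    Accept(&apple); // dispatches to Apple::eat() at runtime
}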
In all other cases, always prefer static polymorphism (which includes templates). It's more deterministic and easier to maintain.
The interview question seems to be asking for your preference (and your reasoning).
I would prefer interfaces (virtual methods) for the ease of creating mocks for unit testing. We can use templates for this, but it is cumbersome (the consumer has to be templated). If profiling shows no speed degradation from the vtable lookups, prefer interfaces over trying to mock/test non-virtual methods.
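A rough sketch of the testing argument (all names invented): production code depends only on the interface, so a test can substitute a canned implementation without templating the consumer:
#include <string>

class PriceFeed {
public:
    virtual double price(const std::string& symbol) const = 0;
    virtual ~PriceFeed() = default;
};

// Production code is written against the interface...
double notional(const PriceFeed& feed, const std::string& sym, int qty) {
    return feed.price(sym) * qty;
}

// ...so a unit test can plug in a fake with canned answers.
class FakeFeed : public PriceFeed {
public:
    double price(const std::string&) const override { return 100.0; }
};

int main() {
    FakeFeed fake;
    return notional(fake, "ABC", 2) == 200.0 ? 0 : 1; // 0 on success
}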
Also, type erasure. I don't think it can be implemented with templates alone. Type erasure can be roughly thought of as a void pointer plus function pointers, which can be implemented easily using interfaces + virtual methods.
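A rough sketch of that idea (invented names, mirroring the classic "concept/model" technique rather than any particular library): a templated constructor captures the concrete type, and virtual methods then erase it:
#include <memory>
#include <utility>

class AnyDrawable {
    struct Concept {                        // the erased "interface"
        virtual void draw() const = 0;
        virtual ~Concept() = default;
    };
    template <typename T>
    struct Model : Concept {                // one instantiation per stored type
        T value;
        explicit Model(T v) : value(std::move(v)) {}
        void draw() const override { value.draw(); }
    };
    std::unique_ptr<Concept> self_;
public:
    template <typename T>
    AnyDrawable(T v) : self_(new Model<T>(std::move(v))) {}
    void draw() const { self_->draw(); }    // the caller never sees T
};

struct Circle { void draw() const { /* ... */ } };

int main() {
    AnyDrawable d(Circle{}); // any type with draw() const will do
    d.draw();
}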
Templates need their implementation to be available at the point of instantiation (typically in headers). Virtual methods of an interface, I believe, can be invoked through a base pointer with just the compiled binary (object) file.
I'm not sure whether code bloat from templates is still an issue with modern compilers, but if the executable size increases significantly, that could be a problem on embedded systems with limited memory, which would then prefer to avoid templates.

Does 'final' specifier add any overhead?

Does using specifier final on a class or on a function add any memory or cpu overhead, or is it used at compile time only?
And how does std::is_final recognise what is final?
It actually can reduce overhead. And in rare cases, increase it.
If you have a pointer to a final class A, any virtual method calls can be de-virtualized and called directly. Similarly, a call to a virtual final method can be de-virtualized. In addition, the inheritance tree of a final class is fixed, even if it contains virtual parent classes, so you can de-virtualize some parent access.
Each of these de-virtualizations reduces or eliminates the requirement that a run-time structure (the vtable) be queried.
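A sketch of the opportunity (invented names; whether this actually happens is a quality-of-implementation matter):
struct Animal {
    virtual int legs() const { return 4; }
    virtual ~Animal() = default;
};

struct Spider final : Animal {   // no further overrides can ever exist
    int legs() const override { return 8; }
};

int count(const Spider& s) {
    // The static type is final, so a compiler may call Spider::legs()
    // directly (and inline it) instead of going through the vtable.
    return s.legs();
}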
There can be a slight downside. Some coding techniques rely on vtable access to avoid direct access to a symbol, and then do not export the symbol. Accessing a vtable can be done via convention (without symbols from a library, just the header file for the classes in question), while accessing a method directly involves linking against that symbol.
This breaks one form of dynamic C++ library linking (where you avoid linking more than a dll loading symbol and/or C linkage functions that return pointers, and classes are exported via their vtables).
It is also possible that if you link against a symbol in a dynamic library, the dynamic library symbol load could be more expensive than the vtable lookup. I have not experienced or profiled this, but I have seen it claimed. The benefits should, in general, outweigh such costs. Any such cost is a quality of implementation issue, as the cost is not mandated to occur because the method is final.
Finally, final inhibits the empty base optimization trick, where someone who knows your class has no state inherits from it to reduce the overhead of "storing" an instance of your class from 1 byte to 0 bytes. If your class is empty and contains no virtual methods/inheritance, avoid final so you don't block this. There is no equivalent concern for final functions.
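A sketch of the EBO point (invented names; the standard permits rather than mandates the optimization, but mainstream ABIs all perform it here):
struct Empty {};                   // a stateless policy-style class
struct EmptyFinal final {};

struct UsesEbo : Empty {           // the empty base can occupy zero bytes
    int x;
};
// struct Blocked : EmptyFinal {}; // ill-formed: cannot derive from a final class

static_assert(sizeof(UsesEbo) == sizeof(int), "empty base optimized away");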
Other than the EBO optimization issue (which only occurs with empty types), any overhead from final comes from how other code interacts with it, and will be rare. Far more often it will make other code faster, as directly interacting with a method enables both a more direct call of the method, and can lead to knock-on optimizations (because the call can be more fully understood by the compiler).
Marking anything except an empty type as final is almost certainly harmless at run time. Doing so on classes with virtual functions and inheritance is likely to be beneficial at run time.
std::is_final and similar traits are almost all implemented via compiler built-in magic. A good number of the traits in std require such magic. See "How to detect if a class is final in C++11?" (thanks to @Csq for finding that).
No, it's only used at compile time.
Magic (see here for further info; thanks to Csq for the link).

Why is the virtual keyword needed?

In other words, why doesn't the compiler just "know" that if a function's definition is changed in a derived class, and a base-class pointer to a dynamically allocated object of that derived class calls the changed function, then the derived version in particular should be called, and not the base class's?
In what instances would not having the virtual keyword work to a programmer's benefit?
The virtual keyword tells the compiler to implement dynamic dispatch. That is how the language was designed.
Without such a keyword, the compiler would not know whether or not to implement dynamic dispatch.
The downsides of virtual (or of dynamic dispatch in general) are that:
It has a slight performance penalty. Most compilers implement dynamic dispatch using the vtable-and-vptr mechanism, where the appropriate function to call is found through the vtable, so each dynamic dispatch needs an additional indirection.
It makes your class non-POD.
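A minimal example of what the keyword changes (invented names): the same call expression resolves differently depending on whether the function is virtual:
#include <cstdio>

struct Base {
    void f()         { std::puts("Base::f"); }  // non-virtual: static dispatch
    virtual void g() { std::puts("Base::g"); }  // virtual: dynamic dispatch
    virtual ~Base() = default;
};

struct Derived : Base {
    void f()          { std::puts("Derived::f"); } // merely hides Base::f
    void g() override { std::puts("Derived::g"); } // overrides Base::g
};

int main() {
    Derived d;
    Base* p = &d;
    p->f(); // prints "Base::f":    resolved from the static type
    p->g(); // prints "Derived::g": resolved from the dynamic type
}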
One reason:
Consider base classes located in a separate module, like a library, and derived classes in your application. How would the compiler know, while compiling the library, whether a given function must be virtual?
One of the main design principles of C++ is that C++ does not incur overhead for features that are not used (the "zero-overhead principle"). This comes from its focus on high performance.
This is why you need to opt in to features like virtual functions while in languages like Java, functions are virtual by default.
The compiler doesn't know, because it can't. It might be your intention not to use virtual functions, since there's always a cost associated with every feature.

Single virtual inheritance compiler optimization in C++?

If I have this situation in C++ project:
1 base class 'Base' containing only pure virtual functions
1 class 'Derived', which is the only class which inherits (public) from 'Base'
Will the compiler generate a VTABLE?
It seems there would be no need, because the project only contains 1 class to which a Base* pointer could possibly point (Derived), so this could be resolved at compile time for all cases.
This is interesting if you want to do dependency injection for unit testing but don't want to incur the VTABLE lookup costs in production code.
I don't have hard data, but I have good reasons to say no, it won't turn virtual calls into static ones.
Usually, the compiler only sees a single compilation unit. It cannot know there's only a single subclass, because five months later you may write another subclass, compile it, get some ancient object files from the backup and link them all together.
While link-time optimizations do see the whole picture, they usually work on a far lower-level representation of the program. Such representations allow, e.g., inlining of static calls, but don't represent inheritance information (except perhaps as optional metadata) and already have the virtual calls and vtables spelt out explicitly. I know this is the case for Clang, and IIRC GCC's whole-program optimizations also work on some low-level IR (GIMPLE?).
Also note that with dynamic loading, you can still add more subclasses long after compilation and LTO. You may not need that, but if I were a compiler writer, I'd be wary of adding an optimization that lets people royally break virtual calls in very specific, hard-to-track-down circumstances.
It's rarely worth the trouble - if you don't need virtual calls (e.g. because you know you won't need any more subclasses), don't make stuff virtual. Review your design. If you need some polymorphism but not the full power of virtual, the curiously recurring template pattern may help.
The compiler doesn't have to use a vtable-based implementation of virtual function dispatch at all, so the answer to your question will be specific to the implementation you are using.
The vtable is usually used not only for virtual functions; it is also used to identify the class type when you do a dynamic_cast or when the program accesses the type_info for the class.
If the compiler detects that no virtual function ever needs dynamic dispatch and that none of these other features are used, it could just remove the vtable pointer as an optimization.
Obviously the compiler writer hasn't found it worth the trouble of doing this. Probably because it wouldn't be used very often.