Why is the virtual keyword needed? - c++

In other words, why doesn't the compiler just "know" that if the definition of a function is changed in a derived class, and a pointer to dynamically allocated memory of that derived class calls the changed function, then that function in particular should be called and not the base class's?
In what instances would not having the virtual keyword work to a programmer's benefit?

The virtual keyword tells the compiler to implement dynamic dispatch. That is how the language was designed.
Without such a keyword the compiler would not know whether or not to implement dynamic dispatch.
The downsides of virtual, or of dynamic dispatch in general, are:
It has a slight performance penalty. Most compilers implement dynamic dispatch using a vtable and vptr mechanism, where the function to call is looked up through the vtable, so each dynamically dispatched call needs an additional indirection.
It makes your class non-POD.
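A minimal sketch of the two costs listed above, using the hypothetical types Plain and WithVirtual. The type traits are standard; the size difference is typical of vtable-based implementations but is an assumption, not something the standard guarantees.

```cpp
#include <type_traits>

// Hypothetical types, for illustration only.
struct Plain {
    int value;
    int get() const { return value; }          // non-virtual: no per-object cost
};

struct WithVirtual {
    int value;
    virtual int get() const { return value; }  // virtual: the object typically gains a hidden vptr
    virtual ~WithVirtual() = default;
};

// A class with virtual functions is neither trivial nor standard-layout, so it is not POD.
static_assert(std::is_trivial<Plain>::value && std::is_standard_layout<Plain>::value,
              "Plain is a POD type");
static_assert(!std::is_trivial<WithVirtual>::value,
              "virtual functions make the type non-trivial");
static_assert(!std::is_standard_layout<WithVirtual>::value,
              "virtual functions make the type non-standard-layout");

// Usually true on vtable-based implementations (assumption, not a guarantee).
constexpr bool virtual_type_is_larger = sizeof(WithVirtual) > sizeof(Plain);
```

Both get() functions return the same value; only the way the call is bound, and the shape of the object, differ.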

One reason:
Consider base classes located in a separate module, such as a library,
and derived classes in your application.
How would the compiler know, while compiling the library, that a given function is or must be virtual?
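The situation can be sketched in a single file, with comments marking which part plays the library and which the application. Shape, Circle and describe are hypothetical names. The library function describe() is compiled knowing only Shape, so the only way it can reach application code is through a function the library author declared virtual.

```cpp
#include <string>

// --- "Library" side: compiled without any knowledge of application classes ---
struct Shape {
    std::string plain_name() const { return "Shape"; }            // bound at compile time
    virtual std::string virtual_name() const { return "Shape"; }  // bound at run time
    virtual ~Shape() = default;
};

// The library only ever sees Shape here, so the non-virtual call
// is resolved to Shape::plain_name once and for all.
std::string describe(const Shape& s) {
    return s.plain_name() + "/" + s.virtual_name();
}

// --- "Application" side: written later ---
struct Circle : Shape {
    std::string plain_name() const { return "Circle"; }             // hides, does not override
    std::string virtual_name() const override { return "Circle"; }  // overrides
};
```

Calling describe() with a Circle yields "Shape/Circle": only the virtual half of the call reaches the derived class.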

One of the main design principles of C++ is that it does not incur overhead for features that are not used (the "zero-overhead principle"). This follows from the language's focus on high performance.
This is why you need to opt in to features like virtual functions, while in languages like Java, functions are virtual by default.

The compiler doesn't know, because it can't. It might be your intention not to use virtual functions, because there's always a cost associated with every feature.

Related

No RTTI but still virtual methods

C++ code can be compiled with run-time type information disabled, which disables dynamic_cast. But, virtual (polymorphic) methods still need to be dispatched based on the run-time type of the target. Doesn't that imply the type information is present anyway, and dynamic_cast should be able to always work?
Disabling RTTI kills dynamic_cast and typeid but has no impact on virtual functions. Virtual functions are dispatched via the "vtable" of classes which have any virtual functions; if you want to avoid having a vtable you can simply not have virtual functions.
Lots of C++ code in the wild can work without dynamic_cast and almost all of it can work without typeid, but relatively few C++ applications would survive without any virtual functions (or more to the point, functions they expected to be virtual becoming non-virtual).
A virtual table (vtable) is just a per-type lookup table for all virtual functions, reached through a per-instance pointer (the vptr). You only pay for what you use (Bjarne loves this philosophy, and initially resisted RTTI). With full RTTI on the other hand, you end up with your libraries and executables having quite a lot of elaborate strings and other information baked in to describe the name of each type and perhaps other things like the hierarchical relations between types.
I have seen production systems where disabling RTTI shrunk the size of executables by 50%. Most of this was due to the massive string names that end up in some C++ programs which use templates heavily.
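As a sketch of code that relies only on virtual dispatch, the following uses neither dynamic_cast nor typeid, so it should build with RTTI disabled (for example with -fno-rtti on GCC and Clang). Node, Leaf, Branch and as_leaf are hypothetical names; as_leaf() is a hand-rolled substitute for a downcast, not a standard facility.

```cpp
// Hypothetical hierarchy; nothing below uses dynamic_cast or typeid.
struct Leaf;

struct Node {
    virtual ~Node() = default;
    virtual int weight() const { return 0; }
    // Hand-rolled stand-in for dynamic_cast<Leaf*>: resolved through the vtable.
    virtual Leaf* as_leaf() { return nullptr; }
};

struct Leaf : Node {
    int w;
    explicit Leaf(int w_) : w(w_) {}
    int weight() const override { return w; }
    Leaf* as_leaf() override { return this; }
};

struct Branch : Node {};
```

The vtable alone is enough to pick the right weight() and as_leaf(); no type names or hierarchy descriptions are consulted.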

Disable dynamic binding (virtual table creation) in c++ for virtual functions

I recently came across a C++ interview question which got me very intrigued:
Suppose you mistakenly declare some C++ member function as virtual, but (maybe for performance reasons) you want to prevent the compiler from creating a v-table for this function. That is, disable the dynamic function binding in favor of a static binding.
How would you achieve this? Also, are there some C++11 specific ways of doing so?
I'm aware of no way to coerce a C++ compiler to disable dynamic binding, short of forcing it to compile code purely as C if it supports such an option (which not all C++ compilers do, but most do). However, that is sort of throwing out the baby with the bathwater, as it notionally disables all C++ features that are not part of C.
There is, of course, the final identifier introduced in C++11, which prevents further derivation from a class or overriding of virtual members. Strictly speaking, this does not prevent dynamic dispatch though - it addresses a different problem.
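A short sketch of what final does and does not do, using hypothetical Base and Derived classes. Whether a compiler actually binds the second call statically is an optimization it is permitted, not required, to make.

```cpp
struct Base {
    virtual int foo() const { return 1; }
    virtual ~Base() = default;
};

struct Derived final : Base {            // no class may derive from Derived
    int foo() const override { return 2; }
};

// Through a Base reference the call must still be dispatched dynamically.
int call_via_base(const Base& b) { return b.foo(); }

// Through a Derived reference the compiler knows no further override can exist,
// so it may (but need not) bind the call statically.
int call_via_derived(const Derived& d) { return d.foo(); }
```

Either way the observable result is the same; final restricts the hierarchy, it does not switch dispatch off.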
One way to avoid implications (perceived or actual) of dynamic binding is to avoid using or writing any classes with virtual member functions, and to not create a class hierarchy (i.e. don't derive from classes with virtual functions). Clearly, if there are no virtual functions in play, there is no need for virtual function dispatch, and therefore no need for dynamic binding.
If you know the type of an object, it is possible to avoid use of dynamic binding by using static dispatch, i.e. explicitly naming which function to call. For example, let's say we have a class Base that provides a public virtual member named foo() and a class named Derived that inherits from Base and overrides foo(). Then the following avoids doing dynamic dispatch:
Base *b = new Derived;
b->Base::foo(); // static call; will not call `Derived::foo()`
// b->Derived::foo(); // incorrect static call: will not compile, since b is a pointer to Base, not Derived
Derived *d = new Derived;
d->Derived::foo(); // static call of Derived::foo()
d->Base::foo(); // static call of Base::foo()
Of course, if the code which uses an object relies on knowledge of the ACTUAL type of an object, or on a specific variant of foo() being called, then its design sort-of defeats the purpose of having a polymorphic base class and other classes that derive from it.
In the above, the compiler will still support virtual function calls (vtable, etc., if that is how the compiler works), and that may affect the process of creating and destroying objects.
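For completeness, here is a self-contained version of the qualified-call idea with class definitions filled in. The return strings and the three helper functions are hypothetical additions, there only to make the binding visible.

```cpp
#include <string>

struct Base {
    virtual std::string foo() const { return "Base::foo"; }
    virtual ~Base() = default;
};

struct Derived : Base {
    std::string foo() const override { return "Derived::foo"; }
};

// Unqualified call: virtual dispatch on the run-time type.
std::string dynamic_call(const Base& b) { return b.foo(); }

// Qualified call: always Base::foo, whatever the run-time type is.
std::string static_base_call(const Base& b) { return b.Base::foo(); }

// Qualified call: always Derived::foo.
std::string static_derived_call(const Derived& d) { return d.Derived::foo(); }
```

Passing the same Derived object to all three helpers gives "Derived::foo", "Base::foo" and "Derived::foo" respectively.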
Another technique to avoid dynamic dispatch (or binding) is to use templates (sometimes called compile-time polymorphism). Essentially, the template may assume a type provides some interface (or set of operations) and will work with any variable of a type with that interface. For example:
struct X
{
    void foo();
};
template<class T> void func()
{
    T x;     // relies on T being instantiable (and destructible)
    x.foo(); // relies on T having a member named foo()
}
// in some function somewhere where both X and func() are known to the compiler
func<X>();
Such templates do not need the type T to have virtual functions, so do not rely on dynamic dispatch (binding). However, there is nothing stopping such a template function working with a class that has virtual member functions, so this does not disable dynamic binding - it only allows the programmer to make choices to avoid using dynamic binding.
If I were asked this question in an interview, I'd probably point out all of the above, but leave it unsaid that the question is rather silly. An interviewer knowledgeable about C++ will realise that, and just be interested in how you think through and address such a question (after all, real-world developers are often asked by management or customers to meet silly or unrealistic requirements, and are expected to be tactful enough to avoid telling their managers or customers they are being foolish). If the interviewer has asked the question without understanding that (or without another member of the interview panel who understands that in the room), I wouldn't want to work with that employer anyway.
You could avoid the overhead of RTTI by disabling it; there is a compile-time switch for that.
Once the flag that disables RTTI is set, there won't be any overhead for dynamic_cast/typeid support. Note that this does not remove virtual table dispatch for virtual function calls.

single virtual inheritance compiler optimization in c++?

If I have this situation in C++ project:
1 base class 'Base' containing only pure virtual functions
1 class 'Derived', which is the only class which inherits (public) from 'Base'
Will the compiler generate a VTABLE?
It seems there would be no need because the project only contains 1 class to which a Base* pointer could possibly point (Derived), so this could be resolved compile time for all cases.
This is interesting if you want to do dependency injection for unit testing but don't want to incur the VTABLE lookup costs in production code.
I don't have hard data, but I have good reasons to say no, it won't turn virtual calls into static ones.
Usually, the compiler only sees a single compilation unit. It cannot know there's only a single subclass, because five months later you may write another subclass, compile it, get some ancient object files from the backup and link them all together.
While link-time optimizations do see the whole picture, they usually work on a far lower-level representation of the program. Such representations allow e.g. inlining of static calls, but don't represent inheritance information (except perhaps as optional metadata) and already have the virtual calls and vtables spelt out explicitly. I know this is the case for Clang and IIRC gcc's whole-program optimizations also work on some low-level IR (GIMPLE?).
Also note that with dynamic loading, you can still add more subclasses long after compilation and LTO. You may not need it, but if I were a compiler writer, I'd be wary of adding an optimization that allows people to royally break virtual calls in very specific, hard-to-track-down circumstances.
It's rarely worth the trouble - if you don't need virtual calls (e.g. because you know you won't need any more subclasses), don't make stuff virtual. Review your design. If you need some polymorphism but not the full power of virtual, the curiously recurring template pattern may help.
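A minimal sketch of the curiously recurring template pattern, applied to the dependency-injection use case from the question. Sensor, RealSensor, FakeSensor and doubled are hypothetical names, and the returned values are placeholders.

```cpp
// CRTP sketch: the base class is a template parameterised on the derived class.
template <class Impl>
struct Sensor {
    // Non-virtual: the cast is resolved at compile time, so no vtable is involved.
    int read() const { return static_cast<const Impl&>(*this).read_impl(); }
};

// Stand-in for the production implementation.
struct RealSensor : Sensor<RealSensor> {
    int read_impl() const { return 42; }
};

// Stand-in for the implementation injected by a unit test.
struct FakeSensor : Sensor<FakeSensor> {
    int read_impl() const { return -1; }
};

// Client code is itself a template, so each instantiation binds statically.
template <class Impl>
int doubled(const Sensor<Impl>& s) { return 2 * s.read(); }
```

The trade-off is that RealSensor and FakeSensor no longer share a common run-time base type, so the choice between them has to be made at compile time.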
The compiler doesn't have to use a vtable based implementation of virtual function dispatch at all so the answer to your question will be specific to the implementation that you are using.
The vtable is usually not only used for virtual functions, but it is also used to identify the class type when you do some dynamic_cast or when the program accesses the type_info for the class.
If the compiler detects that no virtual function ever needs dynamic dispatch and none of the other features are used, it could simply remove the vtable pointer as an optimization.
Obviously the compiler writer hasn't found it worth the trouble of doing this. Probably because it wouldn't be used very often.

When to mark a function in C++ as a virtual?

Because C++ binds methods statically by default, polymorphic calls are affected.
From Wikipedia:
Although the overhead involved in this dispatch mechanism is low, it
may still be significant for some application areas that the language
was designed to target. For this reason, Bjarne Stroustrup, the
designer of C++, elected to make dynamic dispatch optional and
non-default. Only functions declared with the virtual keyword will be
dispatched based on the runtime type of the object; other functions
will be dispatched based on the object's static type.
So the code:
Polygon* p = new Triangle;
p->area();
Provided that area() is a non-virtual function in the parent class (Polygon) that is redefined in the child class (Triangle), the code above will call the parent class's method, which might not be what the developer expects (thanks to the static binding described above).
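A sketch of that behaviour with both variants side by side. The class bodies, the member values and the virtual_area() function are hypothetical, added only to show the contrast.

```cpp
struct Polygon {
    double area() const { return 0.0; }                  // non-virtual
    virtual double virtual_area() const { return 0.0; }  // virtual
    virtual ~Polygon() = default;
};

struct Triangle : Polygon {
    double base = 4.0;
    double height = 3.0;
    // Hides Polygon::area; chosen only when the static type is Triangle.
    double area() const { return 0.5 * base * height; }
    // Overrides; chosen whenever the object really is a Triangle.
    double virtual_area() const override { return 0.5 * base * height; }
};
```

Through a Polygon* that points at a Triangle, area() runs the Polygon version while virtual_area() runs the Triangle version.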
So, if I want to write a class to be used by others (e.g. a library), should I make all my functions virtual so that code like the above runs as expected?
The simple answer is if you intend functions of your class to be overridden for runtime polymorphism you should mark them as virtual, and not if you don't intend so.
Don't mark your functions virtual just because you feel it imparts additional flexibility; rather, think about your design and the purpose of exposing an interface. For example, if your class is not designed to be inherited from, then making your member functions virtual will be misleading. A good example of this is the Standard Library containers, which are not meant to be inherited from and hence do not have virtual destructors.
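The point about the containers can be checked with standard type traits. This is a sketch that holds on the mainstream standard library implementations; Interface is a hypothetical class added for contrast.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Standard containers are not designed as polymorphic base classes.
static_assert(!std::has_virtual_destructor<std::vector<int>>::value,
              "std::vector has a non-virtual destructor");
static_assert(!std::has_virtual_destructor<std::string>::value,
              "std::string has a non-virtual destructor");
static_assert(!std::is_polymorphic<std::vector<int>>::value,
              "std::vector has no virtual functions at all");

// A class that is designed to be a base class, for contrast.
struct Interface {
    virtual ~Interface() = default;
    virtual void run() = 0;
};
static_assert(std::has_virtual_destructor<Interface>::value,
              "Interface is meant to be deleted through a base pointer");
```

Deleting a derived object through a pointer to a base without a virtual destructor is undefined behaviour, which is why publicly deriving from a container and deleting through the container type is risky.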
There are any number of reasons not to mark all your member functions virtual, such as performance penalties and the class becoming a non-POD type. But if you really intend your class to be overridden at run time, then that is the purpose of virtual, and it outweighs these so-called deficiencies.
Mark it virtual if derived classes should be able to override that method. It's as simple as that.
In terms of memory performance, you get a virtual pointer table if anything is virtual, so one way to look at it is "please one, please all". Otherwise, as the others say, mark them as virtual if you want them to be overridable such that calling that method on a base class means that the specialized versions are run.
As a general rule, you should only mark a function virtual if the class is explicitly designed to be used as a base class, and that function is designed to be overridden. In practice, most virtual functions will be pure virtual in the base class. And except in cases of call inversion, where you explicitly don't provide a contract for the overriding function, virtual functions should be private (or at the most protected), and wrapped with non-virtual functions enforcing the contract.
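A sketch of that last recommendation, often called the non-virtual interface idiom. Account, FlatFeeAccount, withdraw and do_fee are hypothetical names, and the contract checks are invented for illustration.

```cpp
#include <stdexcept>

class Account {
public:
    virtual ~Account() = default;

    // Public non-virtual wrapper: enforces the contract for every derived class.
    // Returns the total amount debited (withdrawal plus fee).
    int withdraw(int amount) {
        if (amount <= 0) throw std::invalid_argument("amount must be positive");
        const int fee = do_fee(amount);  // customisation point
        if (fee < 0) throw std::logic_error("fee must not be negative");
        return amount + fee;
    }

private:
    // Private pure virtual: derived classes override it but callers cannot call it directly.
    virtual int do_fee(int amount) const = 0;
};

class FlatFeeAccount : public Account {
private:
    int do_fee(int) const override { return 2; }
};
```

Because callers can only go through withdraw(), no override can skip the precondition or postcondition checks.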
That's basically the idea; actually, if you are using a parent class, I don't think you'll need to override every method, so just make them virtual if you think you'll use them this way.

Inheritance in C++ internals

Can someone explain to me how inheritance is implemented in C++?
Does the base class actually get copied to that location, or does the derived class just refer to that location?
What happens if a function in the base class is overridden in the derived class? Does it replace it with the new function, or is it copied to another location in the derived class's memory?
First of all, you need to understand that C++ is quite different from e.g. Java, because there is no notion of a "Class" retained at runtime. All OO features are compiled down to things which could also be achieved in plain C or assembler.
Having said this, what actually happens is that the compiler generates a kind of struct whenever you use your class definition. And when you invoke a "method" on your object, the compiler actually just encodes a call to a function which resides somewhere in the generated executable.
Now, if your class inherits from another class, the compiler somehow includes the fields of the base class in the struct it uses for the derived class. E.g. it could place these fields at the front and place the fields corresponding to the derived class after that. Please note: you must not make any assumptions regarding the concrete memory layout the C++ compiler uses. If you do so, you're basically on your own and lose any portability.
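A small sketch of the "base fields are included" idea that does not depend on any particular layout. Base, Derived and the two helper functions are hypothetical names.

```cpp
struct Base { int a; };
struct Derived : Base { int b; };

// Every Derived object contains a complete Base subobject; binding a
// Base reference to it refers to that subobject and copies nothing.
int read_a_through_base(const Derived& d) {
    const Base& base_part = d;
    return base_part.a;
}

// Copying into a separate Base object copies only that subobject ("slicing").
Base slice(const Derived& d) { return d; }
```

So the base part is embedded in each derived object rather than referred to elsewhere; where exactly it sits inside the object is up to the compiler.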
How is inheritance implemented? Well, it depends!
If you use a normal function, the compiler will use the concrete type it has figured out and just encode a jump to the right function.
If you use a virtual function, the compiler will generate a vtable and generate code to look up a function pointer from that vtable, depending on the run-time type of the object.
This distinction is very important in practice. Note that it is not true that inheritance is always implemented through a vtable in C++ (this is a common gotcha). Only if you mark a certain member function as virtual (or have done so for the same member function in a base class) will you get a call which is directed at run time to the right function. Because of this, a virtual function call is slower than a non-virtual call: the extra indirection itself is usually cheap, but it can prevent inlining, which may matter a lot in tight loops.
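To make the lookup concrete, here is a hand-written equivalent of one virtual function in C-like style. It is only an illustration of the idea: the names are hypothetical, and the layout a real compiler generates is implementation-specific.

```cpp
// Hand-rolled version of what a vtable-based compiler does behind the scenes.
struct Animal;

struct AnimalVTable {
    int (*legs)(const Animal*);   // one slot per virtual function
};

struct Animal {
    const AnimalVTable* vptr;     // per-object pointer to the per-type table
};

int dog_legs(const Animal*)  { return 4; }
int bird_legs(const Animal*) { return 2; }

// One table per "class", shared by all objects of that class.
const AnimalVTable dog_vtable  = { dog_legs };
const AnimalVTable bird_vtable = { bird_legs };

// What a virtual call boils down to: load vptr, load slot, indirect call.
int call_legs(const Animal* a) { return a->vptr->legs(a); }
```

A non-virtual call would skip both loads and call dog_legs or bird_legs directly, which is also what lets the compiler inline it.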
Inheritance in C++ is often accomplished via the vtable. The linked Wikipedia article is a good starting point for your questions. If I went into more detail in this answer, it would essentially be a regurgitation of it.