Member function templates cannot be declared virtual - From Addison Wesley: C++ Templates - c++

From Addison Wesley: C++ Templates
Member function templates cannot be
declared virtual. This constraint is
imposed because the usual
implementation of the virtual function
call mechanism uses a fixed-size table
with one entry per virtual function.
However, the number of instantiations
of a member function template is not
fixed until the entire program has
been translated.
Does the above quote mean that templates have static binding and virtual functions have dynamic binding, and that this is why there cannot be virtual function templates? An explanation in layman's terms would be appreciated.

Yes, and no.
The most popular method to resolve virtual function calls is to use a table ("vtable"), where each virtual function maps to an index in the table. This more or less requires that you know the size of the table.
With templates, new functions will be created as needed in different modules. You would then either have to convince the linker to build the table after figuring out the final number of functions, or use some kind of runtime structure to search for available functions at runtime.
On many systems, the linker is part of the OS and knows nothing about C++, so that option is limited. A runtime search would of course affect the performance negatively, perhaps for all virtual functions.
So, in the end, it was decided that it just was not worth the trouble of introducing virtual templates into the language.

Consider:
struct X
{
template <typename T>
T incr(const T& t)
{
return t + 1;
}
};
As incr() is applied to different T types, new functions are generated. Say inside app.c++ you have:
X x;
x.incr(7); // incr<int>()
x.incr(7.0); // incr<double>()
x.incr<const char*>("hello"); // incr<const char*>(); T is spelled out because deducing it from "hello" would give the array type char[6]
Then, as the compiler processes app.c++, it sees three instantiations and, if incr were allowed to be virtual, could make space for them in the virtual dispatch table for X. Now say the program loads a shared library at run time, and that library's code has two more instantiations of X::incr, for uint32_t and std::string::const_iterator. dlopen() would need to grow the existing virtual dispatch table for the already created objects to make space for the two new functions. That doesn't sound too horrible, but consider:
each bit of code calling virtual functions must know if the address of those functions was bumped along by some offset at run-time (due to dynamic loading of extra instantiations), so there's extra memory and performance cost in every virtual dispatch
when there's multiple inheritance, or a derived class is itself derived from, the compiler may want to create a single virtual dispatch table for the total set of virtual functions (one option, there are many for implementing virtual dispatch): in this case, the new virtual functions would either displace other classes' virtual functions or need to be disjoint from the existing ones. Again, more run-time overheads in any scheme to manage this.
So, the very rare occasions when this might be useful aren't worth compromising and complicating the more common case of non-templated virtuals.

Does the above quote mean that templates have static binding and virtual functions have dynamic binding, that's the reason there cannot be virtual function templates?
Basically, yes. More specifically, the static binding causes a problem when the code is being generated to support dynamic binding.
When the compiler compiles the base class, it finds a virtual function and decides to make a virtual function table - this will be used to implement dynamic binding: when a virtual function is called on a derived instance, the compiled code follows a pointer in the instance to the virtual function table for the derived class, then a pointer in that table to the implementation of the virtual function. This table has to include every possible virtual function that could be called. Now, suppose we made a templated virtual function. The function table would need an entry for every instantiation of the template, because any of those functions could conceivably be called at runtime. But the information about what types the template is instantiated with, cannot (in general) be gathered at the time that the virtual function table is generated. (At least, not without playing around with the C++ compilation model.)

Virtual functions and templates still work fine together; there is just one small special case that is not implemented.
template<class T>
class A { virtual void f()=0; }; // works fine
class A { template<class T> virtual void f(T t)=0; }; // does not work

Depends on what you mean by binding.
You can implement a virtual method by calling a member template. As long as you inline it, any compiler with tail-call optimization will eliminate the overhead.
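A minimal sketch of that idea, using hypothetical names (Base, Derived, describe, describe_impl) that are not from the original post: the virtual functions form a fixed set of overloads, so the table size is known, and each one simply forwards to a member template that holds the shared logic.

```cpp
#include <cassert>
#include <string>

// Hypothetical example: none of these names come from the original post.
class Base {
public:
    virtual ~Base() = default;
    // A fixed, known set of virtual overloads, so the vtable size is known.
    virtual std::string describe(int v) const { return describe_impl(v); }
    virtual std::string describe(double v) const { return describe_impl(v); }

private:
    // The template itself is not virtual; the virtuals above forward to it.
    template <typename T>
    std::string describe_impl(const T& v) const {
        return "Base:" + std::to_string(v);
    }
};

class Derived : public Base {
public:
    std::string describe(int v) const override { return describe_impl(v); }
    std::string describe(double v) const override { return describe_impl(v); }

private:
    template <typename T>
    std::string describe_impl(const T& v) const {
        return "Derived:" + std::to_string(v);
    }
};

// Dispatches through a Base reference; the dynamic type picks the override.
std::string call_through_base(const Base& b, int v) { return b.describe(v); }
```

Calling call_through_base with a Derived object reaches Derived's template instantiation, even though no template is ever declared virtual. The price is that every supported argument type has to be listed by hand as an overload.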

Sorta.
You can't really "override" an uninstantiated template because it doesn't even exist in the compiled application. If you instantiate it, then you're not overriding a template, but just another ordinary function. :-)

Related

Difference Between Virtual Function and Conditional Execution on Machine Code Layer

I am trying to understand the concept of virtual functions. According to Wikipedia:
In short, a virtual function defines a target function to be executed,
but the target might not be known at compile time.
My question is, how is this different from conditional execution?
void conditional_func(func_to_run) {
switch(func_to_run) {
case func1_tag: func1(); break;
case func2_tag: func2(); break;
...
}
}
int main() {
read func_to_run
conditional_func(func_to_run)
}
As you can see, the ultimate target of conditional_func is not known at compile time.
In C++, it seems that virtual function is defined as a facility to allow for "polymorphism". My definition of polymorphism: A polymorphic class is a class with objects that can have different forms (morphology) as opposed to having a static form. That is, the objects can have different actions and properties based on their subclass. (I'm avoiding mention of language specific concepts like pointers in my definition)
Therefore what is called a virtual function in C++ does not even have to be dependent on dynamic binding (runtime resolution of the target function), but can have a known target at compile time:
int main()
{
Derived d;
Base *bPtr = &d;
bPtr->func();
}
In the above example, the compiler knows that the Base pointer is pointing at a Derived object, and therefore will know the target address for the version of func to run. Therefore my conclusion is that what Wikipedia refers to as a virtual function, is the same as C++ virtual functions that are for some reason dynamically bound:
int main()
{
Derived1 d1;
Derived2 d2;
int val;
std::cin >> val; // read val
Base *bPtr; // declared outside the if/else so it is still in scope for the call
if (val == 1) bPtr = &d1;
else bPtr = &d2;
bPtr->func();
}
As you can see this is also just conditional execution. So here are my questions:
1) If virtual function is defined as a function with unknown target at compile time, how is this different than conditional execution? Are they the same in assembly level but different at higher layers of abstraction?
2) If virtual function is defined as a facility to allow for polymorphism as defined above, then does it mean that again it is only a concept of higher level languages?
1) If virtual function is defined as a function with unknown target at compile time, how is this different than conditional execution? Are they the same in assembly level but different at higher layers of abstraction?
At the assembly/machine code level, virtual functions are typically implemented as class-specific tables of function pointers (known as Virtual Dispatch Tables or VDTs), with each object of those types having a pointer to its class's table. The layout of these tables is consistent across base and derived classes, such that given a pointer to any object in the hierarchy, the function pointer for any given virtual function is always at the same position in all the classes' VDTs. This means the same machine code can take the object pointer and find the function to call.
One difference from the switch-based code you illustrate is that all code containing such switches would need to be manually updated and recompiled to support more types. With function pointers, new code for new types can simply be linked to existing code that works via pointers, without the latter being changed or recompiled.
2) If virtual function is defined as a facility to allow for polymorphism as defined above, then does it mean that again it is only a concept of higher level languages?
Firstly, your attempt to define polymorphism is not consistent with C++ terminology. You've got:
A polymorphic class is a class with objects that can have different forms (morphology) as opposed to having a static form. That is, the objects can have different actions and properties based on their subclass.
It'd be closer to the truth in C++ to say that any given class has one form, and it's different classes in an inheritance hierarchy that may have different forms / actions / properties.
Onwards. At the machine code level, you can - obviously given C++ has to output machine code - use function pointers and get the same runtime effect as virtual functions.
What virtual functions add is the convenience and reliability of having the compiler doing much of the work for you:
creating virtual-dispatch tables,
ensuring consistent ordering of function pointers,
giving objects an implicit pointer to these tables, initialising it reliably in the first non-abstract-base's constructor, updating it as construction bubbles down the class hierarchy to the actual object's runtime type, then reverting it step by step as destructors kick in,
checking that overrides have the same function signature as the virtual functions they override,
optionally optimising away runtime dispatch when the called function can be deduced at compile time.
Such assurances and compiler-generated actions make C++-style virtual functions and dispatch a higher level language feature than programmer-coordinated use of function pointers, let alone switches on runtime type. That said, there's nothing in particular stopping someone adding such support to an assembly language. Conversely, many languages that are on balance even higher level than C++ lack anything similar to virtual functions. (At the extreme, a 5GL may not even expose a notion of functions to the "programmer"/user.)
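To make the "programmer-coordinated use of function pointers" concrete, here is a rough sketch of a hand-rolled dispatch table. All names (Shape, ShapeVTable, vptr, make_rect and so on) are hypothetical, and real compilers lay things out in their own ABI-specific ways; this only mirrors the general scheme described above.

```cpp
#include <cassert>

struct Shape;

// One function pointer per "virtual function", at a fixed position that is
// the same for every type in the hierarchy.
struct ShapeVTable {
    int (*area)(const Shape*);   // slot 0
    int (*sides)(const Shape*);  // slot 1
};

struct Shape {
    const ShapeVTable* vptr;  // what a compiler would add implicitly
    int w;
    int h;
};

int rect_area(const Shape* s) { return s->w * s->h; }
int rect_sides(const Shape*) { return 4; }
int tri_area(const Shape* s) { return s->w * s->h / 2; }
int tri_sides(const Shape*) { return 3; }

// One table per "class", shared by all objects of that class.
const ShapeVTable rect_vtable = {rect_area, rect_sides};
const ShapeVTable tri_vtable = {tri_area, tri_sides};

// The "constructors" must remember to set vptr; a compiler does this for you.
Shape make_rect(int w, int h) { return Shape{&rect_vtable, w, h}; }
Shape make_tri(int w, int h) { return Shape{&tri_vtable, w, h}; }

// The same code works for any shape: load vptr, load the slot, call it.
int area_of(const Shape& s) { return s.vptr->area(&s); }
int sides_of(const Shape& s) { return s.vptr->sides(&s); }
```

Everything the list above credits to the compiler is manual here: nothing stops a caller from forgetting to set vptr, or from putting a function with the wrong meaning in a slot.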

Disable dynamic binding (virtual table creation) in c++ for virtual functions

I recently came across a C++ interview question which got me very intrigued:
Suppose you mistakenly declare some C++ member function as virtual, but (maybe for performance reasons) you want to prevent the compiler from creating a v-table for this function. That is, disable the dynamic function binding in favor of static binding.
How would you achieve this? Also, are there some C++11 specific ways of doing so?
I'm aware of no way to coerce a C++ compiler to disable dynamic binding, short of forcing it to compile code purely as C if it supports such an option (which not all C++ compilers do, but most do). However, that is sort of throwing out the baby with the bathwater, as it notionally disables all C++ features that are not part of C.
There is, of course, the final identifier introduced in C++11, which prevents further derivation from a class or overriding of virtual members. Strictly speaking, this does not prevent dynamic dispatch though - it addresses a different problem.
One way to avoid implications (perceived or actual) of dynamic binding is to avoid using or writing any classes with virtual member functions, and to not create a class hierarchy (i.e. don't derive from classes with virtual functions). Clearly, if there are no virtual functions in play, there is no need for virtual function dispatch, and therefore no need for dynamic binding.
If you know the type of an object, it is possible to avoid use of dynamic binding by using static dispatch, i.e. explicitly naming which function to call. For example, let's say we have a class Base that provides a public virtual member named foo() and a class named Derived that inherits from Base and overrides foo(). Then the following avoids doing dynamic dispatch:
Base *b = new Derived;
b->Base::foo(); // static call; will not call `Derived::foo()`
// b->Derived::foo(); // incorrect static call: would not compile, since b is a pointer to Base, not Derived
Derived *d = new Derived;
d->Derived::foo(); // static call of Derived::foo()
d->Base::foo(); // static call of Base::foo()
Of course, if the code which uses an object relies on knowledge of the ACTUAL type of an object, or on a specific variant of foo() being called, then its design sort-of defeats the purpose of having a polymorphic base class and other classes that derive from it.
In the above, the compiler will still support virtual function calls (vtable, etc, if that is how the compiler works) and that may affect the process of creating and destroying objects.
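A self-contained sketch of the qualified-call technique described above. The class bodies are assumptions made for illustration (the original only names Base, Derived and foo(); here foo() is const and returns a string so the result can be observed), and dynamic_call and static_call are hypothetical helpers.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string foo() const { return "Base::foo"; }
};

struct Derived : Base {
    std::string foo() const override { return "Derived::foo"; }
};

// Ordinary call: resolved at run time through the object's dynamic type.
std::string dynamic_call(const Base& b) { return b.foo(); }

// Qualified call: names Base::foo explicitly, so no virtual dispatch happens.
std::string static_call(const Base& b) { return b.Base::foo(); }
```

Passing a Derived object to both helpers gives different results: dynamic_call reaches Derived::foo, while static_call always runs Base::foo regardless of the object's actual type.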
Another technique to avoid dynamic dispatch (or binding) is to use templates (sometimes called compile-time polymorphism). Essentially, the template may assume a type provides some interface (or set of operations) and will work with any variable of a type with that interface. For example:
struct X
{
void foo();
};
template<class T> void func()
{
T x; // relies on T being instantiable (and destructible)
x.foo(); // relies on T having a member named foo()
}
// in some function somewhere where both X and func() are known to the compiler
func<X>();
Such templates do not need the type T to have virtual functions, so do not rely on dynamic dispatch (binding). However, there is nothing stopping such a template function working with a class that has virtual member functions, so this does not disable dynamic binding - it only allows the programmer to make choices to avoid using dynamic binding.
If I was asked this question in interview, I'd probably point out all of the above, but leave it unsaid that the question is rather silly. An interviewer knowledgeable about C++ will realise that, and just be interested in how you think through and address such a question (after all real-world developers are often asked by management or customers to meet silly or unrealistic requirements, and are expected to be tactful enough to avoid telling their managers or customers they are being foolish). If the interviewer has asked the question without understanding that (or without another member of the interview panel who understands that in the room) I wouldn't want to work with that employer anyway.
You could avoid some of the overhead by disabling RTTI; there is a compile-time switch for that.
Once RTTI is disabled, there is no run-time type information overhead for dynamic_cast/typeid. Note that this does not remove virtual table dispatch itself.

Virtual Function simply a function overloading?

So I came across something called Virtual Function in C++, which in a nutshell from what I understood is used to enable function overloading in derived/child classes.
So given that we have the following class:
class MyBase{
public:
virtual void saySomething() { /* some code */ }
};
then when we make a new class that inherits MyBase like this:
class MySubClass : public MyBase{
public:
void saySomething() { /* different code than in MyBase function */ }
};
the function in MySubClass will execute its own saySomething() function.
To understand it, isn't it same as in Java where you achieve the same by simply writing the same name of the function in the derived class, which will automatically overwrite it / overload it?
Whereas in C++, to achieve that you need an extra step, which is declaring the function in the base class as virtual?
Thank you in advance! :)
Yes you are correct. In Java, all functions are implicitly virtual. In C++ you have a choice: in order to make a function virtual, you need to mark it as such in the base class. (Some folk also repeat the virtual keyword in derived classes, but that is superfluous).
Well, in C++ a virtual function comes with a cost. To get polymorphic behaviour (overriding) you need to declare the method as virtual.
Because C++ is concerned with the layout of a program, the virtual keyword comes with an overhead which may not be desired. Java is compiled into bytecode and executed in a virtual machine, whereas C++ is compiled to native code that is executed directly on the CPU. This gives you, the developer, the possibility to fully understand and control how the code looks and executes at the assembly level (optimization aside).
Declaring anything virtual in a C++ class typically creates a vtable for that class, with one entry per virtual function, and overriding is implemented through that table.
There is also compile-time polymorphism with templates, which avoids the vtable and run-time resolution overhead and has its own set of issues and possibilities.
Let's put it this way.
MyBase *ptr; // Pointer to MyBase
ptr = new MySubClass;
ptr->saySomething();
If saySomething is not virtual in MyBase, the base class version will always be called. If it's virtual, then any derived version will be used, if available.
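The difference can be seen side by side in a small sketch. The member names plainName and virtualName are hypothetical additions; only MyBase and MySubClass come from the question.

```cpp
#include <cassert>
#include <string>

class MyBase {
public:
    virtual ~MyBase() = default;
    std::string plainName() const { return "MyBase"; }            // non-virtual
    virtual std::string virtualName() const { return "MyBase"; }  // virtual
};

class MySubClass : public MyBase {
public:
    // Same name as the non-virtual base function: hides it, does not override it.
    std::string plainName() const { return "MySubClass"; }
    // Overrides the virtual base function.
    std::string virtualName() const override { return "MySubClass"; }
};
```

Through a MyBase pointer that actually points at a MySubClass, plainName() runs the base version (chosen from the pointer's static type), while virtualName() runs the derived version (chosen from the object's dynamic type).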
Virtual Function simply a function overloading?
No. "Overloading" means providing multiple functions with the same name but different parameter types, with the appropriate function chosen at compile time. "Overriding" means providing multiple functions within a class hierarchy, with the appropriate function chosen at run time. In C++, only virtual functions can be overridden.
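A short sketch contrasting the two terms, with hypothetical names throughout (kind, Animal, Dog, speak):

```cpp
#include <cassert>
#include <string>

// Overloading: same name, different parameter types, chosen at compile time
// from the static type of the argument.
std::string kind(int) { return "int"; }
std::string kind(double) { return "double"; }
std::string kind(const std::string&) { return "string"; }

// Overriding: same signature within a hierarchy, chosen at run time from the
// dynamic type of the object.
struct Animal {
    virtual ~Animal() = default;
    virtual std::string sound() const { return "..."; }
};

struct Dog : Animal {
    std::string sound() const override { return "Woof"; }
};

// Only sees an Animal reference; which sound() runs is decided at run time.
std::string speak(const Animal& a) { return a.sound(); }
```

The override specifier (C++11) is optional, but it makes the compiler reject the declaration if it does not actually override a virtual function in a base class.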
To understand it, isn't it same as in Java where you achieve the same by simply writing the same name of the function in the derived class, which will automatically overwrite it / overload it?
Yes, assuming you mean "override". In Java, methods are overridable by default. This matches Java's (original) philosophy that we should use a 90s-style object-oriented paradigm for everything.
Whereas in C++, to achieve that you need an extra step, which is declaring the function in the base class as virtual?
Making functions overridable has a run-time cost, so C++ only does that if you specifically request it. This matches C++'s philosophy that you should choose the most appropriate paradigm for your application, and not pay for language facilities you don't need.

Inline and Virtual

I have read several threads about how inline and virtual interact when both apply to one function. In most cases, compilers won't treat such a function as inline. However, does the same principle apply when a non-virtual inline member function calls a virtual function? Say:
class ABC{
public:
void callVirtual(){IAmVirtual();}
protected:
virtual void IAmVirtual();
};
What principle? I would expect the compiler to generate a call to the virtual function. The call (in effect a jump-to-function-pointer) may be inlined but the IAmVirtual function is not.
The virtual function itself is not inline, and it is not called with qualification needed to inline it even if it were, so it can't be inlined.
The whole point of virtual functions is that the compiler generally doesn't know which of the derived class implementations will be needed at run-time, or even if extra derived classes will be dynamically loaded from shared libraries. So, in general, it's impossible to inline. The one case that the compiler can inline is when it happens to know for sure which type it's dealing with because it can see the concrete type in the code and soon afterwards - with no chance of the type having changed - see the call to the virtual function. Even then, it's not required to try to optimise or inline, it's just the only case where it's even possible.
You shouldn't try to fight this unless the profiler's proven the virtual calls are killing you. Then, first try to group a bunch of operations so one virtual call can do more work for you. If virtual dispatch is still just too slow, consider maintaining some kind of discriminated union: it's a lot less flexible and cleanly extensible, but can avoid the virtual function call overheads and allow inlining.
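As a rough illustration of the discriminated-union alternative, here is a hand-written tagged union. The names (Square, Rect, Shape, Kind, area) are hypothetical, and this is only one of several ways to build such a type.

```cpp
#include <cassert>

struct Square { int side; };
struct Rect { int w; int h; };

// The tag says which member of the union is currently meaningful.
struct Shape {
    enum class Kind { SquareKind, RectKind };
    Kind kind;
    union {
        Square square;
        Rect rect;
    };
};

Shape make_square(int side) {
    Shape s;
    s.kind = Shape::Kind::SquareKind;
    s.square = Square{side};
    return s;
}

Shape make_rect(int w, int h) {
    Shape s;
    s.kind = Shape::Kind::RectKind;
    s.rect = Rect{w, h};
    return s;
}

// A switch on the tag replaces the virtual call; every branch is visible to
// the compiler, so each one can be inlined.
int area(const Shape& s) {
    switch (s.kind) {
        case Shape::Kind::SquareKind: return s.square.side * s.square.side;
        case Shape::Kind::RectKind:   return s.rect.w * s.rect.h;
    }
    return 0;  // not reached for a valid tag
}
```

The trade-off is the one described above: adding a new shape means editing Shape and every switch over Kind, whereas a virtual hierarchy only needs a new derived class.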
All that assumes you really need dynamic dispatch: some programmers and systems over-use virtual functions just because OO was the in thing 20 years ago, or they've used an OO-only language like Java. C++ has a rich selection of compile-time polymorphic mechanisms, including templates.
In your case callVirtual() can be inlined. Any non-virtual function is a good candidate for inlining (though the final decision is up to the compiler).
Virtual functions have to be looked up in the Virtual Method Table, and as a result the compiler cannot simply move them to be inline. This is generally a runtime look up. An inline function however may call a virtual one and the compiler can put that call (the code to look up the call in the VMT) inline.

If classes with virtual functions are implemented with vtables, how is a class with no virtual functions implemented?

In particular, wouldn't there have to be some kind of function pointer in place anyway?
I think that the phrase "classes with virtual functions are implemented with vtables" is misleading you.
The phrase makes it sound like classes with virtual functions are implemented "in way A" and classes without virtual functions are implemented "in way B".
In reality, classes with virtual functions are implemented like any other class and additionally have a vtable. Another way to see it is that "'vtables' implement the 'virtual function' part of a class".
More details on how they both work:
All classes (with virtual or non-virtual methods) are structs. The only difference between a struct and a class in C++ is that, by default, members are public in structs and private in classes. Because of that, I'll use the term class here to refer to both structs and classes. Remember, they are almost synonyms!
Data Members
Classes are (as are structs) just blocks of contiguous memory where each member is stored in sequence. Note that sometimes there will be gaps between members for CPU architectural reasons (alignment), so the block can be larger than the sum of its parts.
Methods
Methods or "member functions" are an illusion. In reality, there is no such thing as a "member function". A function is always just a sequence of machine code instructions stored somewhere in memory. To make a call, the processor jumps to that position of memory and starts executing. You could say that all methods and functions are 'global', and any indication of the contrary is a convenient illusion enforced by the compiler.
Obviously, a method acts like it belongs to a specific object, so clearly there is more going on. To tie a particular call of a method (a function) to a specific object, every member method has a hidden argument that is a pointer to the object in question. The member is hidden in that you don't add it to your C++ code yourself, but there is nothing magical about it -- it's very real. When you say this:
void CMyThingy::DoSomething(int arg)
{
// do something
}
The compiler really does this:
void CMyThingy_DoSomething(CMyThingy* this, int arg)
{
// do something
}
Finally, when you write this:
myObj.DoSomething(aValue);
the compiler says:
CMyThingy_DoSomething(&myObj, aValue);
No need for function pointers anywhere! The compiler knows already which method you are calling so it calls it directly.
Static methods are even simpler. They don't have a this pointer, so they are implemented exactly as you write them.
That's it! The rest is just convenient syntactic sugar: the compiler knows which class a method belongs to, so it makes sure it doesn't let you call the function without specifying which one. It also uses that knowledge to translate myItem to this->myItem when it's unambiguous to do so.
(yeah, that's right: member access in a method is always done indirectly via a pointer, even if you don't see one)
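The equivalence can be checked with a small compilable sketch. Counter and Counter_add are hypothetical names, and the parameter is called self because this is a keyword and cannot be used as a parameter name in real C++.

```cpp
#include <cassert>

struct Counter {
    int value = 0;
    // Member function: the object is passed implicitly as `this`.
    void add(int n) { value += n; }  // really this->value += n
};

// Roughly what the compiler generates: a plain function that takes the
// object explicitly as its first argument.
void Counter_add(Counter* self, int n) { self->value += n; }
```

Calling a.add(5) on one Counter and Counter_add(&b, 5) on another leaves both with the same value; neither call goes through a function pointer.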
(Edit: Removed last sentence and posted separately so it can be criticized separately)
Non virtual member functions are really just a syntactic sugar as they are almost like an ordinary function but with access checking and an implicit object parameter.
struct A
{
void foo ();
void bar () const;
};
is basically the same as:
struct A
{
};
void foo (A * this);
void bar (A const * this);
The vtable is needed so that we call the right function for our specific object instance. For example, if we have:
struct A
{
virtual void foo ();
};
The implementation of 'foo' might approximate to something like:
void foo (A * this) {
void (*realFoo)(A *) = lookupVtable (this->vtable, "foo");
(realFoo)(this); // Make the call to the most derived version of 'foo'
}
Virtual methods are required when you want to use polymorphism. The virtual modifier puts the method in the VMT for late binding, and it is then decided at run time which class's method is executed.
If the method is not virtual, it is decided at compile time which class's method will be executed.
Function pointers are used mostly for callbacks.
If a class with a virtual function is implemented with a vtable, then a class with no virtual function is implemented without a vtable.
A vtable contains the function pointers needed to dispatch a call to the appropriate method. If the method isn't virtual, the call goes to the class's known type, and no indirection is needed.
For a non-virtual method the compiler can generate a normal function invocation (e.g., CALL to a particular address with this pointer passed as a parameter) or even inline it. For a virtual function, the compiler doesn't usually know at compile time at which address to invoke the code, therefore it generates code that looks up the address in the vtable at runtime and then invokes the method. True, even for virtual functions the compiler can sometimes correctly resolve the right code at compile time (e.g., methods on local variables invoked without a pointer/reference).
(I pulled this section from my original answer so that it can be criticized separately. It is a lot more concise and to the point of your question, so in a way it's a much better answer)
No, there are no function pointers; instead, the compiler turns the problem inside-out.
The compiler calls a global function with a pointer to the object instead of calling some pointed-to function inside the object
Why? Because it's usually a lot more efficient that way. Indirect calls are expensive instructions.
There's no need for function pointers, as the target can't change at run time.
Branches are generated directly to the compiled code for the methods; just like if you have functions that aren't in a class at all, branches are generated straight to them.
The compiler/linker links directly which methods will be invoked. No need for a vtable indirection. BTW, what does that have to do with "stack vs. heap"?