Related
I am trying to understand the concept of virtual functions. According to Wikipedia:
In short, a virtual function defines a target function to be executed,
but the target might not be known at compile time.
My question is, how is this different from conditional execution?
void conditional_func(func_to_run) {
switch(func_to_run) {
case func1_tag: func1(); break;
case func1_tag: func1(); break;
...
}
}
int main() {
read func_to_run
conditional_func(func_to_run)
}
As you can see the ultimate target of the conditional_func is not known at runtime.
In C++, it seems that virtual function is defined as a facility to allow for "polymorphism". My definition of polymorphism: A polymorphic class is a class with objects that can have different forms (morphology) as opposed to having a static form. That is, the objects can have different actions and properties based on their subclass. (I'm avoiding mention of language specific concepts like pointers in my definition)
Therefore what is called a virtual function in C++ does not even have to be dependent on dynamic binding (runtime resolution of the target function), but can have a known target at compile time:
int main()
{
Drived d;
Base *bPtr = &d;
bPtr->func();
}
In the above example, the compiler knows that the Base pointer is pointing at a Derived object, and therefore will know the target address for the version of func to run. Therefore my conclusion is that what Wikipedia refers to as a virtual function, is the same as C++ virtual functions that are for some reason dynamically bound:
int main()
{
Drived1 d1;
Drived2 d2;
read val;
if (val == 1) Base *bPtr = &d1;
else Base *bPtr = &d2;
bPtr->func();
}
As you can see this is also just conditional execution. So here are my questions:
1) If virtual function is defined as a function with unknown target at compile time, how is this different than conditional execution? Are they the same in assembly level but different at higher layers of abstraction?
2) If virtual function is defined as a facility to allow for polymorphism as defined above, then does it mean that again it is only a concept of higher level languages?
1) If virtual function is defined as a function with unknown target at compile time, how is this different than conditional execution? Are they the same in assembly level but different at higher layers of abstraction?
At the assembly/machine code level, virtual functions are typically implemented as class-specific tables of function pointers (known as Virtual Dispatch Tables or VDTs), with each object of those types having a pointer to its class's table. The layout of these tables is consistent across base and derived classes such that given a pointer to any object in the heirarchy, the function pointer for any given virtual function is always at the same position in all the classes' VDTs. This means the same machine code can take the object pointer and find the function to call.
A difference from the type of switch based code you illustrate in that all code with such switches would need to be manually updated and recompiled to support more types. With function pointers, new code for new types can simply be linked to existing code that works via pointers, without the latter being changed or recompiled.
2) If virtual function is defined as a facility to allow for polymorphism as defined above, then does it mean that again it is only a concept of higher level languages?
Firstly, your attempt to define polymorphism is not consistent with C++ terminology. You've got:
A polymorphic class is a class with objects that can have different forms (morphology) as opposed to having a static form. That is, the objects can have different actions and properties based on their subclass.
It'd be closer to the truth in C++ to say that any given class has one form, and it's different classes in an inheritance heirarchy that may have different forms / actions / properties.
Onwards. At the machine code level, you can - obviously given C++ has to output machine code - use function pointers and get the same runtime effect as virtual functions.
What virtual functions add is the convenience and reliability of having the compiler doing much of the work for you:
create virtual-dispatch tables,
ensuring consistent ordering of function pointers,
giving objects an implicit pointers to these tables and initialising it reliably in the first non-abstract-base's constructor and updating it as construction bubbles down the class heirarchy to the actual object's runtime type, then reversing the value of the pointer as destructors kick in,
checking that overrides have the same function signature as the virtual functions they override,
optionally optimising away runtime dispatch when the called function can be deduced at compile time.
Such assurances and compiler-generated actions makes C++-style virtual functions and dispatch a higher level language feature than programmer-coordinated use of function pointers, let alone switches on runtime type. That said, there's nothing in particular stopping someone adding such support to an assembly language. That said, many languages that are on balance even higher level than C++ lack anything similar to virtual functions. (At the extreme, a 5GL may not even expose a notion of functions to the "programmer"/user).
I've recently read about the Dynamic Dispatch on Wikipedia and couldn't understand the difference between dynamic dispatch and late binding in C++.
When each one of the mechanisms is used?
The exact quote from Wikipedia:
Dynamic dispatch is different from late binding (also known as dynamic binding). In the context of selecting an operation, binding refers to the process of associating a name with an operation. Dispatching refers to choosing an implementation for the operation after you have decided which operation a name refers to. With dynamic dispatch, the name may be bound to a polymorphic operation at compile time, but the implementation not be chosen until runtime (this is how dynamic dispatch works in C++). However, late binding does imply dynamic dispatching since you cannot choose which implementation of a polymorphic operation to select until you have selected the operation that the name refers to.
A fairly decent answer to this is actually incorporated into a question on late vs. early binding on programmers.stackexchange.com.
In short, late binding refers to the object-side of an eval, dynamic dispatch refers to the functional-side. In late binding the type of a variable is the variant at runtime. In dynamic-dispatch, the function or subroutine being executed is the variant.
In C++, we don't really have late binding because the type is known (not necessarily the end of the inheritance hierarchy, but at least a formal base class or interface). But we do have dynamic dispatch via virtual methods and polymorphism.
The best example I can offer for late-binding is the untyped "object" in Visual Basic. The runtime environment does all the late-binding heavy lifting for you.
Dim obj
- initialize object then..
obj.DoSomething()
The compiler will actually code the appropriate execution context for the runtime-engine to perform a named lookup of the method called DoSomething, and if discovered with the properly matching parameters, actually execute the underlying call. In reality, something about the type of the object is known (it inherits from IDispatch and supports GetIDsOfNames(), etc). but as far as the language is concerned the type of the variable is utterly unknown at compile time, and it has no idea if DoSomething is even a method for whatever obj actually is until runtime reaches the point of execution.
I won't bother dumping a C++ virtual interface et'al, as I'm confident you already know what they look like. I hope it is obvious that the C++ language simply can't do this. It is strongly-typed. It can (and does, obviously) do dynamic dispatch via the polymorphic virtual method feature.
In C++, both are same.
In C++, there are two kinds of binding:
static binding — which is done at compile-time.
dynamic binding — which is done at runtime.
Dynamic binding, since it is done at runtime, is also referred to as late binding and static binding is sometime referred to as early binding.
Using dynamic-binding, C++ supports runtime-polymorphism through virtual functions (or function pointers), and using static-binding, all other functions calls are resolved.
Late binding is calling a method by name during runtime.
You don't really have this in c++, except for importing methods from a DLL.
An example for that would be: GetProcAddress()
With dynamic dispatch, the compiler has enough information to call the right implementation of the method. This is usually done by creating a virtual table.
The link itself explained the difference:
Dynamic dispatch is different from late binding (also known as dynamic binding). In the context of selecting an operation, binding refers to the process of associating a name with an operation. Dispatching refers to choosing an implementation for the operation after you have decided which operation a name refers to.
and
With dynamic dispatch, the name may be bound to a polymorphic operation at compile time, but the implementation not be chosen until runtime (this is how dynamic dispatch works in C++). However, late binding does imply dynamic dispatching since you cannot choose which implementation of a polymorphic operation to select until you have selected the operation that the name refers to.
But they're mostly equal in C++ you can do a dynamic dispatch by virtual functions and vtables.
C++ uses early binding and offers both dynamic and static dispatch. The default form of dispatch is static. To get dynamic dispatch you must declare a method as virtual.
Binding refers to the process of associating a name with an operation.
the main thing here is function parameters these decides which function to call at runtime
Dispatching refers to choosing an implementation for the operation after you have decided which operation a name refers to.
dispatch control to that according to parameter match
http://en.wikipedia.org/wiki/Dynamic_dispatch
hope this help you
Let me give you an example of the differences because they are NOT the same. Yes, dynamic dispatch lets you choose the correct method when you are referring to an object by a superclass, but that magic is very specific to that class hierarchy, and you have to do some declarations in the base class to make it work (abstract methods fill out the vtables since the index of the method in the table cant change between specific types). So, you can call methods in Tabby and Lion and Tiger all by a generic Cat pointer and even have arrays of Cats filled with Lions and Tigers and Tabbys. It knows what indexes those methods refer to in the object's vtable at compile-time (static/early binding), even though the method is selected at run-time (dynamic dispatch).
Now, lets implement an array that contains Lions and Tigers and Bears! ((Oh My!)). Assuming we don't have a base class called Animal, in C++, you are going to have significant work to do to because the compiler isn't going to let you do any dynamic dispatch without a common base class. The indexes for the vtables need to match up, and that can't be done between unreleated classes. You'd need to have a vtable big enough to hold the virtual methods of all classes in the system. C++ programmers rarely see this as a limitation because you have been trained to think a certain way about class design. I'm not saying its better or worse.
With late binding, the run-time takes care of this without a common base class. There is normally a hash table system used to find methods in the classes with a cache system used in the dispatcher. Where in C++, the compiler knows all the types. In a late-bound language, the objects themselves know their type (its not typeless, the objects themselves know exactly who they are in most cases). This means I can have arrays of multiple types of objects if I want (Lions and Tigers and Bears). And you can implement message forwarding and prototyping (allows behaviors to be changed per object without changing the class) and all sorts of other things in ways that are much more flexible and lead to less code overhead than in languages that don't support late binding.
Ever program in Android and use findViewById()? You almost always end up casting the result to get the right type, and casting is basically lying to the compiler and giving up all the static type-checking goodness that is supposed to make static languages superior. Of course, you could instead have findTextViewById(), findEditTextById(), and a million others so that your return types match, but that is throwing polymorphism out the window; arguably the whole basis of OOP. A late-bound language would probably let you simply index by an ID, and treat it like a hash table and not care what the type was being indexed nor returned.
Here's another example. Let's say that you have your Lion class and its default behavior is to eat you when you see it. In C++, if you wanted to have a single "trained" lion, you need to make a new subclass. Prototyping would let you simply change the one or two methods of that particular Lion that need to be changed. It's class and type don't change. C++ can't do that. This is important since when you have a new "AfricanSpottedLion" that inherits from Lion, you can train it too. The prototyping doesn't change the class structure so it can be expanded. This is normally how these languages handle issues that normally require multiple inheritance, or perhaps multiple inheritance is how you handle a lack of prototyping.
FYI, Objective-C is C with SmallTalk's message passing added and SmallTalk is the original OOP, and both are late bound with all the features above and more. Late bound languages may be slightly slower from a micro-level standpoint, but can often allow the code to structured in a way that is more efficient at a macro-level, and it all boils down to preference.
Given that wordy Wikipedia definition I'd be tempted to classify dynamic dispatch as the late binding of C++
struct Base {
virtual void foo(); // Dynamic dispatch according to Wikipedia definition
void bar(); // Static dispatch according to Wikipedia definition
};
Late binding instead, for Wikipedia, seems to mean pointer-to-member dispatch of C++
(this->*mptr)();
where the selection of what is the operation being invoked (and not just which implementation) is done at runtime.
In C++ literature however late binding is normally used for what Wikipedia calls dynamic dispatch.
Dynamic dispatch is what happens when you use the virtual keyword in C++. So for example:
struct Base
{
virtual int method1() { return 1; }
virtual int method2() { return 2; } // not overridden
};
struct Derived : public Base
{
virtual int method1() { return 3; }
}
int main()
{
Base* b = new Derived;
std::cout << b->method1() << std::endl;
}
will print 3, because the method has been dynamically dispatched. The C++ standard is very careful not to specify how exactly this happens behind the scenes, but every compiler under the sun does it in the same way. They create a table of function pointers for each polymorphic type (called the virtual table or vtable), and when you call a virtual method, the "real" method is looked up from the vtable, and that version is called. So you can imaging something like this pseudocode:
struct BaseVTable
{
int (*_method1) () = &Base::method1; // real function address
int (*_method2) () = &Base::method2;
};
struct DerivedVTable
{
int (*method1) () = &Derived::method1; //overriden
int (*method2) () = &Base::method2; // not overridden
};
In this way, the compiler can be sure that a method with a particular signature exists at compile time. However, at run-time, the call might actually be dispatched via the vtable to a different function. Calls to virtual functions are a tiny bit slower than non-virtual calls, because of the extra indirection step.
On the other hand, my understanding of the term late binding is that the function pointer is looked up by name at runtime, from a hash table or something similar. This is the way things are done in Python, JavaScript and (if memory serves) Objective-C. This makes it possible to add new methods to a class at run-time, which cannot directly be done in C++. This is particularly useful for implementing things like mixins. However, the downside is that the run-time lookup is generally considerably slower than even a virtual call in C++, and the compiler is not able to perform any compile-time type checking for the newly-added methods.
This question might help you.
Dynamic dispatch generally refers to multiple dispatch.
Consider the below example. I hope it might help you.
class Base2;
class Derived2; //Derived2 class is child of Base2
class Base1 {
public:
virtual void function1 (Base2 *);
virtual void function1 (Derived2 *);
}
class Derived1: public Base1 {
public:
//override.
virtual void function1(Base2 *);
virtual void function1(Derived2 *);
};
Consider the case of below.
Derived1 * d = new Derived1;
Base2 * b = new Derived2;
//Now which function1 will be called.
d->function1(b);
It will call function1 taking Base2* not Derived2*. This is due to lack of dynamic multiple dispatch.
Late binding is one of the mechanism to implement dynamic single dispatch.
I suppose the meaning is when you have two classes B,C inherits the same father class A. so, pointer of the father (type A) can hold each of sons types. The compiler cannot know what the type holds in the pointer in certain time, because it can change during the program run.
There is special functions to determine what the type of certain object in certain time. like instanceof in java, or by if(typeid(b) == typeid(A))... in c++.
In C++, both dynamic dispatch and late binding is the same. Basically, the value of a single object determines the piece of code invoked at runtime. In languages like C++ and java dynamic dispatch is more specifically dynamic single dispatch which works as mentioned above. In this case, since the binding occurs at runtime, it is also called late binding. Languages like smalltalk allow dynamic multiple dispatch in which the runtime method is chosen at runtime based on the identities or values of more than one object.
In C++ we dont really have late binding, because the type information is known. Thus in the C++ or Java context, dynamic dispatch and late binding are the same. Actual/fully late binding, I think is in languages like python which is a method-based lookup rather than type based.
I've recently read about the Dynamic Dispatch on Wikipedia and couldn't understand the difference between dynamic dispatch and late binding in C++.
When each one of the mechanisms is used?
The exact quote from Wikipedia:
Dynamic dispatch is different from late binding (also known as dynamic binding). In the context of selecting an operation, binding refers to the process of associating a name with an operation. Dispatching refers to choosing an implementation for the operation after you have decided which operation a name refers to. With dynamic dispatch, the name may be bound to a polymorphic operation at compile time, but the implementation not be chosen until runtime (this is how dynamic dispatch works in C++). However, late binding does imply dynamic dispatching since you cannot choose which implementation of a polymorphic operation to select until you have selected the operation that the name refers to.
A fairly decent answer to this is actually incorporated into a question on late vs. early binding on programmers.stackexchange.com.
In short, late binding refers to the object-side of an eval, dynamic dispatch refers to the functional-side. In late binding the type of a variable is the variant at runtime. In dynamic-dispatch, the function or subroutine being executed is the variant.
In C++, we don't really have late binding because the type is known (not necessarily the end of the inheritance hierarchy, but at least a formal base class or interface). But we do have dynamic dispatch via virtual methods and polymorphism.
The best example I can offer for late-binding is the untyped "object" in Visual Basic. The runtime environment does all the late-binding heavy lifting for you.
Dim obj
- initialize object then..
obj.DoSomething()
The compiler will actually code the appropriate execution context for the runtime-engine to perform a named lookup of the method called DoSomething, and if discovered with the properly matching parameters, actually execute the underlying call. In reality, something about the type of the object is known (it inherits from IDispatch and supports GetIDsOfNames(), etc). but as far as the language is concerned the type of the variable is utterly unknown at compile time, and it has no idea if DoSomething is even a method for whatever obj actually is until runtime reaches the point of execution.
I won't bother dumping a C++ virtual interface et'al, as I'm confident you already know what they look like. I hope it is obvious that the C++ language simply can't do this. It is strongly-typed. It can (and does, obviously) do dynamic dispatch via the polymorphic virtual method feature.
In C++, both are same.
In C++, there are two kinds of binding:
static binding — which is done at compile-time.
dynamic binding — which is done at runtime.
Dynamic binding, since it is done at runtime, is also referred to as late binding and static binding is sometime referred to as early binding.
Using dynamic-binding, C++ supports runtime-polymorphism through virtual functions (or function pointers), and using static-binding, all other functions calls are resolved.
Late binding is calling a method by name during runtime.
You don't really have this in c++, except for importing methods from a DLL.
An example for that would be: GetProcAddress()
With dynamic dispatch, the compiler has enough information to call the right implementation of the method. This is usually done by creating a virtual table.
The link itself explained the difference:
Dynamic dispatch is different from late binding (also known as dynamic binding). In the context of selecting an operation, binding refers to the process of associating a name with an operation. Dispatching refers to choosing an implementation for the operation after you have decided which operation a name refers to.
and
With dynamic dispatch, the name may be bound to a polymorphic operation at compile time, but the implementation not be chosen until runtime (this is how dynamic dispatch works in C++). However, late binding does imply dynamic dispatching since you cannot choose which implementation of a polymorphic operation to select until you have selected the operation that the name refers to.
But they're mostly equal in C++ you can do a dynamic dispatch by virtual functions and vtables.
C++ uses early binding and offers both dynamic and static dispatch. The default form of dispatch is static. To get dynamic dispatch you must declare a method as virtual.
Binding refers to the process of associating a name with an operation.
the main thing here is function parameters these decides which function to call at runtime
Dispatching refers to choosing an implementation for the operation after you have decided which operation a name refers to.
dispatch control to that according to parameter match
http://en.wikipedia.org/wiki/Dynamic_dispatch
hope this help you
Let me give you an example of the differences because they are NOT the same. Yes, dynamic dispatch lets you choose the correct method when you are referring to an object by a superclass, but that magic is very specific to that class hierarchy, and you have to do some declarations in the base class to make it work (abstract methods fill out the vtables since the index of the method in the table cant change between specific types). So, you can call methods in Tabby and Lion and Tiger all by a generic Cat pointer and even have arrays of Cats filled with Lions and Tigers and Tabbys. It knows what indexes those methods refer to in the object's vtable at compile-time (static/early binding), even though the method is selected at run-time (dynamic dispatch).
Now, lets implement an array that contains Lions and Tigers and Bears! ((Oh My!)). Assuming we don't have a base class called Animal, in C++, you are going to have significant work to do to because the compiler isn't going to let you do any dynamic dispatch without a common base class. The indexes for the vtables need to match up, and that can't be done between unreleated classes. You'd need to have a vtable big enough to hold the virtual methods of all classes in the system. C++ programmers rarely see this as a limitation because you have been trained to think a certain way about class design. I'm not saying its better or worse.
With late binding, the run-time takes care of this without a common base class. There is normally a hash table system used to find methods in the classes with a cache system used in the dispatcher. Where in C++, the compiler knows all the types. In a late-bound language, the objects themselves know their type (its not typeless, the objects themselves know exactly who they are in most cases). This means I can have arrays of multiple types of objects if I want (Lions and Tigers and Bears). And you can implement message forwarding and prototyping (allows behaviors to be changed per object without changing the class) and all sorts of other things in ways that are much more flexible and lead to less code overhead than in languages that don't support late binding.
Ever program in Android and use findViewById()? You almost always end up casting the result to get the right type, and casting is basically lying to the compiler and giving up all the static type-checking goodness that is supposed to make static languages superior. Of course, you could instead have findTextViewById(), findEditTextById(), and a million others so that your return types match, but that is throwing polymorphism out the window; arguably the whole basis of OOP. A late-bound language would probably let you simply index by an ID, and treat it like a hash table and not care what the type was being indexed nor returned.
Here's another example. Let's say that you have your Lion class and its default behavior is to eat you when you see it. In C++, if you wanted to have a single "trained" lion, you need to make a new subclass. Prototyping would let you simply change the one or two methods of that particular Lion that need to be changed. It's class and type don't change. C++ can't do that. This is important since when you have a new "AfricanSpottedLion" that inherits from Lion, you can train it too. The prototyping doesn't change the class structure so it can be expanded. This is normally how these languages handle issues that normally require multiple inheritance, or perhaps multiple inheritance is how you handle a lack of prototyping.
FYI, Objective-C is C with SmallTalk's message passing added and SmallTalk is the original OOP, and both are late bound with all the features above and more. Late bound languages may be slightly slower from a micro-level standpoint, but can often allow the code to structured in a way that is more efficient at a macro-level, and it all boils down to preference.
Given that wordy Wikipedia definition I'd be tempted to classify dynamic dispatch as the late binding of C++
struct Base {
virtual void foo(); // Dynamic dispatch according to Wikipedia definition
void bar(); // Static dispatch according to Wikipedia definition
};
Late binding instead, for Wikipedia, seems to mean pointer-to-member dispatch of C++
(this->*mptr)();
where the selection of what is the operation being invoked (and not just which implementation) is done at runtime.
In C++ literature however late binding is normally used for what Wikipedia calls dynamic dispatch.
Dynamic dispatch is what happens when you use the virtual keyword in C++. So for example:
struct Base
{
virtual int method1() { return 1; }
virtual int method2() { return 2; } // not overridden
};
struct Derived : public Base
{
virtual int method1() { return 3; }
}
int main()
{
Base* b = new Derived;
std::cout << b->method1() << std::endl;
}
will print 3, because the method has been dynamically dispatched. The C++ standard is very careful not to specify how exactly this happens behind the scenes, but every compiler under the sun does it in the same way. They create a table of function pointers for each polymorphic type (called the virtual table or vtable), and when you call a virtual method, the "real" method is looked up from the vtable, and that version is called. So you can imaging something like this pseudocode:
struct BaseVTable
{
int (*_method1) () = &Base::method1; // real function address
int (*_method2) () = &Base::method2;
};
struct DerivedVTable
{
int (*method1) () = &Derived::method1; //overriden
int (*method2) () = &Base::method2; // not overridden
};
In this way, the compiler can be sure that a method with a particular signature exists at compile time. However, at run-time, the call might actually be dispatched via the vtable to a different function. Calls to virtual functions are a tiny bit slower than non-virtual calls, because of the extra indirection step.
On the other hand, my understanding of the term late binding is that the function pointer is looked up by name at runtime, from a hash table or something similar. This is the way things are done in Python, JavaScript and (if memory serves) Objective-C. This makes it possible to add new methods to a class at run-time, which cannot directly be done in C++. This is particularly useful for implementing things like mixins. However, the downside is that the run-time lookup is generally considerably slower than even a virtual call in C++, and the compiler is not able to perform any compile-time type checking for the newly-added methods.
This question might help you.
Dynamic dispatch generally refers to multiple dispatch.
Consider the below example. I hope it might help you.
class Base2;
class Derived2; //Derived2 class is child of Base2
class Base1 {
public:
virtual void function1 (Base2 *);
virtual void function1 (Derived2 *);
}
class Derived1: public Base1 {
public:
//override.
virtual void function1(Base2 *);
virtual void function1(Derived2 *);
};
Consider the case of below.
Derived1 * d = new Derived1;
Base2 * b = new Derived2;
//Now which function1 will be called.
d->function1(b);
It will call function1 taking Base2* not Derived2*. This is due to lack of dynamic multiple dispatch.
Late binding is one of the mechanism to implement dynamic single dispatch.
I suppose the meaning is when you have two classes B,C inherits the same father class A. so, pointer of the father (type A) can hold each of sons types. The compiler cannot know what the type holds in the pointer in certain time, because it can change during the program run.
There is special functions to determine what the type of certain object in certain time. like instanceof in java, or by if(typeid(b) == typeid(A))... in c++.
In C++, both dynamic dispatch and late binding is the same. Basically, the value of a single object determines the piece of code invoked at runtime. In languages like C++ and java dynamic dispatch is more specifically dynamic single dispatch which works as mentioned above. In this case, since the binding occurs at runtime, it is also called late binding. Languages like smalltalk allow dynamic multiple dispatch in which the runtime method is chosen at runtime based on the identities or values of more than one object.
In C++ we dont really have late binding, because the type information is known. Thus in the C++ or Java context, dynamic dispatch and late binding are the same. Actual/fully late binding, I think is in languages like python which is a method-based lookup rather than type based.
Here is what I am talking about
// some guy wrote this, used as a Policy with templates
struct MyWriter {
void write(std::vector<char> const& data) {
// ...
}
};
In some existing code, the people did not use templates, but interfaces+type-erasure
class IWriter {
public:
virtual ~IWriter() {}
public:
virtual void write(std::vector<char> const& data) = 0;
};
Someone else wanted to be usable with both approaches and writes
class MyOwnClass: private MyWriter, public IWriter {
// other stuff
};
MyOwnClass is implemented-in-terms-of MyWriter. Why doesn't MyOwnClass' inherited member functions implement the interface of IWriter automatically? Instead the user has to write forwarding functions that do nothing but call the base class versions, as in
class MyOwnClass: private MyWriter, public IWriter {
public:
void write(std::vector<char> const& data) {
MyWriter::write(data);
}
};
I know that in Java when you have a class that implements an interface and derives from a class that happens to have suitable methods, that base class automatically implements the interface for the derived class.
Why doesn't C++ do that? It seems like a natural thing to have.
This is multiple inheritance, and there are two inherited functions with the same signature, both of which have implementation. That's where C++ is different from Java.
Calling write on an expression whose static type is MyBigClass would therefore be ambiguous as to which of the inherited functions was desired.
If write is only called through base class pointers, then defining write in the derived class is NOT necessary, contrary to the claim in the question. Now that the question changed to include a pure specifier, implementing that function in the derived class is necessary to make the class concrete and instantiable.
MyWriter::write cannot be used for the virtual call mechanism of MyBigClass, because the virtual call mechanism requires a function that accepts an implicit IWriter* const this, and MyWriter::write accepts an implicit MyWriter* const this. A new function is required, which must take into account the address difference between the IWriter subobject and the MyWriter subobject.
It would be theoretically possible for the compiler to create this new function automatically, but it would be fragile, since a change in a base class could suddenly cause a new function to be chosen for forwarding. It's less fragile in Java, where only single inheritance is possible (there's only one choice for what function to forward to), but in C++, which supports full multiple inheritance, the choice is ambiguous, and we haven't even started on diamond inheritance or virtual inheritance yet.
Actually, this problem (difference between subobject addresses) is solved for virtual inheritance. But it requires additional overhead that's not necessary most of the time, and a C++ guiding principle is "you don't pay for what you don't use".
Why doesn't C++ do that? It seems like a natural thing to have.
Actually, no, it is extremely unnatural thing to have.
Please note that my reasoning is based on my own understanding of "common sense" and can be fundamentally flawed as a result.
You see, you have two different methods, first one in MyWriter, which is non virtual and second one in IWriter which is virtual. They are completely different despite "looking" similar.
I suggest to check this question. The good thing about non-virtual methods is that no matter what you do, as long as they don't call virtual methods, their behavior will never change. I.e. somebody deriving from your class with non-virtual methods will not break existing method by masking them. Virtual methods are designed to be overriden. The price of that is that it is possible to break underlying logic by improperly overriding virtual method. And this is a root of your problem.
Let's say what you propose is allowed. (automatic conversion to virtual with multiple inheritance) There two possible solutions:
Solution #1
MyWriter becomes virtual. Consequences: All existing C++ code in the world becomes easy to break via typo or name clash. MyWriter method was not supposed to be overriden initially, so suddenly turning it into virtual will (murphy's law) break underlying logic of MyWriter class when somebody derives from MyOwnClass. Which means that suddenly making MyWriter::write virtual is a bad idea.
Soluion #2
MyWriter remains static BUUUT it is included temporarily as a virtual method into IWriter, until overriden. At first glance there's nothing to worry about, but let's think about it. IWriter implements some kind of concept you had in mind, and it is supposed to do something. MyWriter implements another concept. To assign MyWriter::write as IWriter::write method you need two guarantees:
Compiler must ensure that MyWriter::write does what IWriter::write() is supposed to do.
Compiler must ensure that calling MyWriter::write from IWriter will not break existing functionality in MyWriter code programmer expects to use elsewhere.
So, the thing is that compiler cannot guarantee that. Functions have similar name and argument list, but by Murphy's law that means that they're prbably doing completely different thing. (sinf and cosf have same argument list, for example), and it is unlikely that compiler will be able to predict the future and make sure that at no point in development will MyWriter be changed in such way that it will become incompatible with IWriter. So, since machine can't make reasonable decision (no AI for that) by itself, it has to ask YOU, programmer - "What is it you wish to do?". And you say "redirect virtual method into MyWriter::write(). It totally won't break anything. I think.".
And that's why you must specify which method you want to use manually....
Doing it automatically would be unintuitive and surprising. C++ does not assume that multiple base classes are related to each other, and protects the user against name collisions between their members by defining nested name specifiers for nonstatic members. Adding implicit declarations to MyOwnClass where signatures from IWriter and MyWriter collide would be antithetical to protecting names.
However, C++11 extensions do bring us closer. Consider this:
class MyOwnClass: private MyWriter, public IWriter {
public:
void write(std::vector<char> const& data) final = MyWriter::write;
};
This mechanism would be safe because it expresses that MyWriter doesn't expect any further overrides, and convenient because it names the function signature that will be "joined" but nothing more. Also, final would be ill-formed if the function weren't implicitly virtual, so it checks that the signature matches the virtual interface.
On one hand, most interfaces don't just happen to match up this way. Defining this feature to work only with identical signatures would be safe but rarely useful. Defining it as a shortcut to a delegating function body would be useful but fragile. So it might not really be a good feature
On the other hand, this is a good design pattern to provide functionality which isn't virtual when you don't need it to be. So given this idiom, we might use it to write good code, even if it doesn't match up well with current practices.
Why doesn't C++ do that?
I'm not sure what you're asking here. Could C++ be rewritten to allow this? Yes, but to what end?
Because MyWriter and IWriter are completely different classes, it is illegal in C++ to call a member of MyWriter through an instance of IWriter. The member pointers have completely different types. And just as a MyWriter* is not convertible to a IWriter*, neither is a void (MyWriter::*)(const std::vector<char>&) convertible to a void (IWriter::*)(const std::vector<char>&).
The rules of C++ don't change just because there could be a third class that combines the two. Neither class is a direct parent/child relative of one another. Therefore, they are treated as entirely distinct classes.
Remember: member functions always take an additional parameter: a this pointer to the object that they point to. You cannot call void (MyWriter::*)(const std::vector<char>&) on an IWriter*. The third class can have a method that casts itself into the proper base class, but it must actually have this method. So either you or the C++ compiler must create it. The rules of C++ require this.
Consider what would have to happen to make this work without a derived-class method.
A function gets an IWriter*. The user calls the write member of it, using nothing more than the IWriter* pointer. So... exactly how can the compiler generate the code to call MyWriter::writer? Remember: MyWriter::writer needs a MyWriter instance. And there is no relationship between IWriter and MyWriter.
So how exactly could the compiler do the type coercion locally? The compiler would have to check the virtual function to see if the actual function to be called takes IWriter or some other type. If it takes another type, it would have to convert the pointer to its true type, then do another conversion to the type needed by the virtual function. After doing all of that, it would then be able to make the call.
All of this overhead would affect every virtual call. All of them would have to at least check to see if the actual function to be call. Every call will also have to generate the code to do the type conversions, just in case.
Every virtual function call would have a "get type" and conditional branch in it. Even if it is never possible to trigger that branch. So you would be paying for something regardless of whether you use it or not. That's not the C++ way.
Even worse, a straight v-table implementation of virtual calls is no longer possible. The fastest method of doing virtual dispatch would not be a conforming implementation. The C++ committee is not going to make any change that would make such implementations impossible.
Again, to what end? Just so that you don't have to write a simple forwarding function?
Just make MyWriter derive from IWriter, eliminate the IWriter derivation in MyOwnClass, and move on with life. This should resolve the problem and should not interfere with the template code.
Here is what I am talking about
// some guy wrote this, used as a Policy with templates
struct MyWriter {
void write(std::vector<char> const& data) {
// ...
}
};
In some existing code, the people did not use templates, but interfaces+type-erasure
class IWriter {
public:
virtual ~IWriter() {}
public:
virtual void write(std::vector<char> const& data) = 0;
};
Someone else wanted to be usable with both approaches and writes
class MyOwnClass: private MyWriter, public IWriter {
// other stuff
};
MyOwnClass is implemented-in-terms-of MyWriter. Why doesn't MyOwnClass' inherited member functions implement the interface of IWriter automatically? Instead the user has to write forwarding functions that do nothing but call the base class versions, as in
class MyOwnClass: private MyWriter, public IWriter {
public:
void write(std::vector<char> const& data) {
MyWriter::write(data);
}
};
I know that in Java when you have a class that implements an interface and derives from a class that happens to have suitable methods, that base class automatically implements the interface for the derived class.
Why doesn't C++ do that? It seems like a natural thing to have.
This is multiple inheritance, and there are two inherited functions with the same signature, both of which have implementation. That's where C++ is different from Java.
Calling write on an expression whose static type is MyBigClass would therefore be ambiguous as to which of the inherited functions was desired.
If write is only called through base class pointers, then defining write in the derived class is NOT necessary, contrary to the claim in the question. Now that the question changed to include a pure specifier, implementing that function in the derived class is necessary to make the class concrete and instantiable.
MyWriter::write cannot be used for the virtual call mechanism of MyBigClass, because the virtual call mechanism requires a function that accepts an implicit IWriter* const this, and MyWriter::write accepts an implicit MyWriter* const this. A new function is required, which must take into account the address difference between the IWriter subobject and the MyWriter subobject.
It would be theoretically possible for the compiler to create this new function automatically, but it would be fragile, since a change in a base class could suddenly cause a new function to be chosen for forwarding. It's less fragile in Java, where only single inheritance is possible (there's only one choice for what function to forward to), but in C++, which supports full multiple inheritance, the choice is ambiguous, and we haven't even started on diamond inheritance or virtual inheritance yet.
Actually, this problem (difference between subobject addresses) is solved for virtual inheritance. But it requires additional overhead that's not necessary most of the time, and a C++ guiding principle is "you don't pay for what you don't use".
Why doesn't C++ do that? It seems like a natural thing to have.
Actually, no, it is extremely unnatural thing to have.
Please note that my reasoning is based on my own understanding of "common sense" and can be fundamentally flawed as a result.
You see, you have two different methods, first one in MyWriter, which is non virtual and second one in IWriter which is virtual. They are completely different despite "looking" similar.
I suggest to check this question. The good thing about non-virtual methods is that no matter what you do, as long as they don't call virtual methods, their behavior will never change. I.e. somebody deriving from your class with non-virtual methods will not break existing method by masking them. Virtual methods are designed to be overriden. The price of that is that it is possible to break underlying logic by improperly overriding virtual method. And this is a root of your problem.
Let's say what you propose is allowed. (automatic conversion to virtual with multiple inheritance) There two possible solutions:
Solution #1
MyWriter becomes virtual. Consequences: All existing C++ code in the world becomes easy to break via typo or name clash. MyWriter method was not supposed to be overriden initially, so suddenly turning it into virtual will (murphy's law) break underlying logic of MyWriter class when somebody derives from MyOwnClass. Which means that suddenly making MyWriter::write virtual is a bad idea.
Soluion #2
MyWriter remains static BUUUT it is included temporarily as a virtual method into IWriter, until overriden. At first glance there's nothing to worry about, but let's think about it. IWriter implements some kind of concept you had in mind, and it is supposed to do something. MyWriter implements another concept. To assign MyWriter::write as IWriter::write method you need two guarantees:
Compiler must ensure that MyWriter::write does what IWriter::write() is supposed to do.
Compiler must ensure that calling MyWriter::write from IWriter will not break existing functionality in MyWriter code programmer expects to use elsewhere.
So, the thing is that compiler cannot guarantee that. Functions have similar name and argument list, but by Murphy's law that means that they're prbably doing completely different thing. (sinf and cosf have same argument list, for example), and it is unlikely that compiler will be able to predict the future and make sure that at no point in development will MyWriter be changed in such way that it will become incompatible with IWriter. So, since machine can't make reasonable decision (no AI for that) by itself, it has to ask YOU, programmer - "What is it you wish to do?". And you say "redirect virtual method into MyWriter::write(). It totally won't break anything. I think.".
And that's why you must specify which method you want to use manually....
Doing it automatically would be unintuitive and surprising. C++ does not assume that multiple base classes are related to each other, and protects the user against name collisions between their members by defining nested name specifiers for nonstatic members. Adding implicit declarations to MyOwnClass where signatures from IWriter and MyWriter collide would be antithetical to protecting names.
However, C++11 extensions do bring us closer. Consider this:
class MyOwnClass: private MyWriter, public IWriter {
public:
void write(std::vector<char> const& data) final = MyWriter::write;
};
This mechanism would be safe because it expresses that MyWriter doesn't expect any further overrides, and convenient because it names the function signature that will be "joined" but nothing more. Also, final would be ill-formed if the function weren't implicitly virtual, so it checks that the signature matches the virtual interface.
On one hand, most interfaces don't just happen to match up this way. Defining this feature to work only with identical signatures would be safe but rarely useful. Defining it as a shortcut to a delegating function body would be useful but fragile. So it might not really be a good feature
On the other hand, this is a good design pattern to provide functionality which isn't virtual when you don't need it to be. So given this idiom, we might use it to write good code, even if it doesn't match up well with current practices.
Why doesn't C++ do that?
I'm not sure what you're asking here. Could C++ be rewritten to allow this? Yes, but to what end?
Because MyWriter and IWriter are completely different classes, it is illegal in C++ to call a member of MyWriter through an instance of IWriter. The member pointers have completely different types. And just as a MyWriter* is not convertible to a IWriter*, neither is a void (MyWriter::*)(const std::vector<char>&) convertible to a void (IWriter::*)(const std::vector<char>&).
The rules of C++ don't change just because there could be a third class that combines the two. Neither class is a direct parent/child relative of one another. Therefore, they are treated as entirely distinct classes.
Remember: member functions always take an additional parameter: a this pointer to the object that they point to. You cannot call void (MyWriter::*)(const std::vector<char>&) on an IWriter*. The third class can have a method that casts itself into the proper base class, but it must actually have this method. So either you or the C++ compiler must create it. The rules of C++ require this.
Consider what would have to happen to make this work without a derived-class method.
A function gets an IWriter*. The user calls the write member of it, using nothing more than the IWriter* pointer. So... exactly how can the compiler generate the code to call MyWriter::writer? Remember: MyWriter::writer needs a MyWriter instance. And there is no relationship between IWriter and MyWriter.
So how exactly could the compiler do the type coercion locally? The compiler would have to check the virtual function to see if the actual function to be called takes IWriter or some other type. If it takes another type, it would have to convert the pointer to its true type, then do another conversion to the type needed by the virtual function. After doing all of that, it would then be able to make the call.
All of this overhead would affect every virtual call. All of them would have to at least check to see if the actual function to be call. Every call will also have to generate the code to do the type conversions, just in case.
Every virtual function call would have a "get type" and conditional branch in it. Even if it is never possible to trigger that branch. So you would be paying for something regardless of whether you use it or not. That's not the C++ way.
Even worse, a straight v-table implementation of virtual calls is no longer possible. The fastest method of doing virtual dispatch would not be a conforming implementation. The C++ committee is not going to make any change that would make such implementations impossible.
Again, to what end? Just so that you don't have to write a simple forwarding function?
Just make MyWriter derive from IWriter, eliminate the IWriter derivation in MyOwnClass, and move on with life. This should resolve the problem and should not interfere with the template code.