This is not a question about how virtual functions work or how they are declared; that much is clear to me. The question is: why implement them at all?
I suppose the practical reason is to simplify a lot of other code, which can declare its variables as the base type and still handle objects (and their specific methods) from many different subclasses?
Could the same thing be done with templating and type checking, like I do it in Objective-C? If so, which is more efficient? I find it confusing to declare an object as one class and instantiate it as another, even if it is its child.
Sorry for the naive questions, but I haven't done any real projects in C++ yet, and since I am an active Objective-C developer (a much smaller language that relies heavily on SDK functionality, like OS X and iOS), I need a clear view of the parallel mechanisms in both cousins.
Yes, this can be done with templates, but then the caller must know what the actual type of the object is (the concrete class) and this increases coupling.
With virtual functions the caller doesn't need to know the actual class: it operates through a pointer to a base class, so you can compile the client once, and the implementor can change the actual implementation as much as it wants; the client doesn't have to know about that as long as the interface is unchanged.
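As a rough illustration (the Shape/Circle names are invented for this sketch, not taken from the question), the template version bakes the concrete type into the caller, while the virtual version only needs the base interface:

#include <iostream>

struct Shape {                        // base interface
    virtual ~Shape() {}
    virtual double area() const = 0;
};

struct Circle : Shape {
    double r;
    explicit Circle(double r) : r(r) {}
    double area() const { return 3.14159265 * r * r; }
};

// Template version: the caller is compiled knowing the concrete type T.
template <typename T>
void printAreaStatic(const T& s) { std::cout << s.area() << '\n'; }

// Virtual version: the caller only knows about Shape; new derived classes
// can be added later without recompiling this function.
void printAreaDynamic(const Shape& s) { std::cout << s.area() << '\n'; }

int main() {
    Circle c(1.0);
    printAreaStatic(c);    // dispatched at compile time
    printAreaDynamic(c);   // dispatched at run time through the vtable
}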
Virtual functions implement polymorphism. I don't know Obj-C, so I cannot compare both, but the motivating use case is that you can use derived objects in place of base objects and the code will work. If you have a compiled and working function foo that operates on a reference to base you need not modify it to have it work with an instance of derived.
You could do that (assuming that you had runtime type information) by obtaining the real type of the argument and then dispatching directly to the appropriate function with a switch of sorts, but that would require either manually modifying the switch for each new type (high maintenance cost) or having reflection (unavailable in C++) to obtain the method pointer. Even then, after obtaining a method pointer you would have to call it, which is as expensive as the virtual call.
As to the cost associated with a virtual call: basically (in all implementations with a virtual method table), a call to a virtual function foo applied on object o, o.foo(), is translated to o.vptr[ 3 ](), where 3 is the position of foo in the virtual table, and that is a compile-time constant. This is basically a double indirection:
From the object o obtain the pointer to the vtable, index that table to obtain the pointer to the function, and then call it. The extra cost compared with a direct non-polymorphic call is just the table lookup. (In fact there can be other hidden costs when using multiple inheritance, as the implicit this pointer might have to be shifted.) But the cost of the virtual dispatch itself is very small.
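A minimal sketch of the equivalent hand-written double indirection (the struct layout and names are invented for illustration; real compilers generate this machinery for you):

#include <iostream>

struct Object;                          // forward declaration

struct VTable {
    void (*foo)(Object*);               // one slot per virtual function
};

struct Object {
    const VTable* vptr;                 // the hidden pointer a compiler would add
};

void derivedFoo(Object*) { std::cout << "Derived::foo\n"; }

const VTable derivedVTable = { &derivedFoo };

void callFoo(Object* o) {
    o->vptr->foo(o);                    // 1) load vptr, 2) index the table, 3) call
}

int main() {
    Object o = { &derivedVTable };
    callFoo(&o);                        // prints "Derived::foo"
}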
I don't know the first thing about Objective-C, but here's why you want to "declare an object as one class and instantiate it as another": the Liskov Substitution Principle.
Since a PDF is a document, and an OpenOffice.org document is a document, and a Word Document is a document, it's quite natural to write
Document *d;
if (ends_with(filename, ".pdf"))
    d = new PdfDocument(filename);
else if (ends_with(filename, ".doc"))
    d = new WordDocument(filename);
else
    /* ... you get the point ... */ ;

d->print();
Now, for this to work, print would have to be virtual, or be implemented using virtual functions, or be implemented using a crude hack that reinvents the virtual wheel. The program needs to know at runtime which of the various print methods to apply.
Templating solves a different problem, where you determine at compile time which of the various containers you're going to use (for example) when you want to store a bunch of elements. If you operate on those containers with template functions, then you don't need to rewrite them when you switch containers, or add another container to your program.
A virtual function is important in inheritance. Think of an example where you have a CMonster class and then a CRaidBoss and CBoss class that inherit from CMonster.
Both need to be drawn. A CMonster has a Draw() function, but the way a CRaidBoss and a CBoss are drawn is different. Thus, the implementation is left to them by utilizing the virtual function Draw.
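A minimal sketch of that idea (the class names come from the example above; the bodies are placeholders):

#include <iostream>

class CMonster {
public:
    virtual ~CMonster() {}
    virtual void Draw() const { std::cout << "generic monster\n"; }
};

class CBoss : public CMonster {
public:
    void Draw() const { std::cout << "boss with a health bar\n"; }
};

class CRaidBoss : public CMonster {
public:
    void Draw() const { std::cout << "raid boss with a fancy aura\n"; }
};

int main() {
    CBoss boss;
    CRaidBoss raidBoss;
    CMonster* monsters[] = { &boss, &raidBoss };
    for (CMonster* m : monsters)
        m->Draw();   // each monster draws itself the right way
}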
Well, the idea is simply to allow the compiler to perform checks for you.
It's like a lot of features: a way to hide what you don't want to have to do yourself. That's abstraction.
Inheritance, interfaces, etc. allow you to provide an interface to the compiler for the implementation code to match.
If you didn't have the virtual function mechanism, you would have to write:
class A
{
public:
    void do_something();
};

class B : public A
{
public:
    void do_something(); // this one "hides" A::do_something(); it replaces it for B objects
};

void DoSomething( A* object )
{
    // calling object->do_something() here would ALWAYS call A::do_something(),
    // which is not what you want if object is really a B...
    // so we have to check manually:
    // (note: dynamic_cast only compiles if A is polymorphic, i.e. has at least
    //  one virtual function; without any, you would need your own type tag)
    B* b_object = dynamic_cast<B*>( object );
    if( b_object != NULL ) // ok, it's a B: call B::do_something()
    {
        b_object->do_something();
    }
    else
    {
        object->do_something(); // it's a plain A: call A::do_something()
    }
}
Here there are several problems:
you have to write this for each function redefined in a class hierarchy.
you have one additional if for each child class.
you have to touch this function again each time you add a definition to the whole hierarchy.
it's visible code that you can easily get wrong, every time.
So, marking functions virtual does this correctly and implicitly, automatically routing the call to the right implementation at runtime, depending on the dynamic type of the object.
You don't have to write any of that logic yourself, so you can't get it wrong, and you have one less thing to worry about.
It's the kind of thing you don't want to bother with as it can be done by the compiler/runtime.
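With virtual in place, the same example collapses to this (a sketch reusing the class names from above):

#include <iostream>

class A
{
public:
    virtual ~A() {}
    virtual void do_something() { std::cout << "A::do_something\n"; }
};

class B : public A
{
public:
    void do_something() { std::cout << "B::do_something\n"; } // overrides A::do_something()
};

void DoSomething( A* object )
{
    object->do_something(); // calls A:: or B:: depending on the object's dynamic type
}

int main()
{
    A a; B b;
    DoSomething( &a ); // A::do_something
    DoSomething( &b ); // B::do_something
}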
The use of templates is also technically known as polymorphism by theorists (parametric, or static, polymorphism). Yes, both are valid approaches to the problem, and the implementation techniques employed explain the better or worse performance of each.
For example, Java implements generics through type erasure. This means it only appears to be using templates; under the surface it is plain old subtype polymorphism.
C++ has very powerful templates. The use of templates makes code quicker, though each use of a template instantiates it for the given type. This means that if you use std::vector for ints, doubles, and strings, you'll get three different vector classes, so the size of the executable will suffer.
Related
I've recently read about the Dynamic Dispatch on Wikipedia and couldn't understand the difference between dynamic dispatch and late binding in C++.
When each one of the mechanisms is used?
The exact quote from Wikipedia:
Dynamic dispatch is different from late binding (also known as dynamic binding). In the context of selecting an operation, binding refers to the process of associating a name with an operation. Dispatching refers to choosing an implementation for the operation after you have decided which operation a name refers to. With dynamic dispatch, the name may be bound to a polymorphic operation at compile time, but the implementation not be chosen until runtime (this is how dynamic dispatch works in C++). However, late binding does imply dynamic dispatching since you cannot choose which implementation of a polymorphic operation to select until you have selected the operation that the name refers to.
A fairly decent answer to this is actually incorporated into a question on late vs. early binding on programmers.stackexchange.com.
In short, late binding refers to the object side of an evaluation, and dynamic dispatch to the function side. In late binding, it is the type of the variable that varies at runtime; in dynamic dispatch, it is the function or subroutine being executed that varies.
In C++, we don't really have late binding because the type is known (not necessarily the end of the inheritance hierarchy, but at least a formal base class or interface). But we do have dynamic dispatch via virtual methods and polymorphism.
The best example I can offer for late-binding is the untyped "object" in Visual Basic. The runtime environment does all the late-binding heavy lifting for you.
Dim obj
' ...initialize obj, then...
obj.DoSomething()
The compiler will actually code the appropriate execution context for the runtime engine to perform a named lookup of the method called DoSomething, and if it is discovered with properly matching parameters, actually execute the underlying call. In reality, something about the type of the object is known (it inherits from IDispatch and supports GetIDsOfNames(), etc.), but as far as the language is concerned the type of the variable is utterly unknown at compile time, and it has no idea whether DoSomething is even a method for whatever obj actually is until runtime reaches the point of execution.
I won't bother dumping a C++ virtual interface et al., as I'm confident you already know what they look like. I hope it is obvious that the C++ language simply can't do this. It is strongly typed. It can (and does, obviously) do dynamic dispatch via the polymorphic virtual method feature.
In C++, both are the same.
In C++, there are two kinds of binding:
static binding — which is done at compile-time.
dynamic binding — which is done at runtime.
Dynamic binding, since it is done at runtime, is also referred to as late binding, and static binding is sometimes referred to as early binding.
Using dynamic binding, C++ supports runtime polymorphism through virtual functions (or function pointers); using static binding, all other function calls are resolved.
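A minimal sketch of the two kinds of binding (the Animal/Dog names are invented for the example):

#include <iostream>

struct Animal {
    void eat()           { std::cout << "Animal::eat\n"; }   // static binding
    virtual void speak() { std::cout << "Animal::speak\n"; } // dynamic binding
    virtual ~Animal() {}
};

struct Dog : Animal {
    void eat()   { std::cout << "Dog::eat\n"; }    // hides Animal::eat
    void speak() { std::cout << "Dog::speak\n"; }  // overrides Animal::speak
};

int main() {
    Dog d;
    Animal* a = &d;
    a->eat();    // resolved at compile time from the static type: Animal::eat
    a->speak();  // resolved at runtime from the dynamic type: Dog::speak
}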
Late binding is calling a method by name during runtime.
You don't really have this in C++, except when importing functions from a DLL.
An example for that would be: GetProcAddress()
With dynamic dispatch, the compiler has enough information to call the right implementation of the method. This is usually done by creating a virtual table.
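A sketch of that kind of by-name lookup on Windows (it assumes a hypothetical some.dll exporting an int add(int, int) function with C linkage; only the LoadLibraryA/GetProcAddress/FreeLibrary calls are real API):

#include <windows.h>
#include <iostream>

int main() {
    HMODULE lib = LoadLibraryA("some.dll");   // hypothetical DLL
    if (!lib) return 1;

    typedef int (*AddFn)(int, int);
    // the function is looked up by its exported name, at runtime
    AddFn add = reinterpret_cast<AddFn>(GetProcAddress(lib, "add"));
    if (add)
        std::cout << add(2, 3) << '\n';

    FreeLibrary(lib);
}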
The link itself explained the difference:
Dynamic dispatch is different from late binding (also known as dynamic binding). In the context of selecting an operation, binding refers to the process of associating a name with an operation. Dispatching refers to choosing an implementation for the operation after you have decided which operation a name refers to.
and
With dynamic dispatch, the name may be bound to a polymorphic operation at compile time, but the implementation not be chosen until runtime (this is how dynamic dispatch works in C++). However, late binding does imply dynamic dispatching since you cannot choose which implementation of a polymorphic operation to select until you have selected the operation that the name refers to.
But they're mostly equivalent in C++: you do dynamic dispatch via virtual functions and vtables.
C++ uses early binding and offers both dynamic and static dispatch. The default form of dispatch is static. To get dynamic dispatch you must declare a method as virtual.
Binding refers to the process of associating a name with an operation.
The main thing here is that the function's name and parameters decide which operation the name is bound to.
Dispatching refers to choosing an implementation for the operation after you have decided which operation a name refers to.
Dispatch then transfers control to the implementation chosen according to that match.
http://en.wikipedia.org/wiki/Dynamic_dispatch
Hope this helps you.
Let me give you an example of the differences, because they are NOT the same. Yes, dynamic dispatch lets you choose the correct method when you are referring to an object by a superclass, but that magic is very specific to that class hierarchy, and you have to do some declarations in the base class to make it work (abstract methods fill out the vtables, since the index of a method in the table can't change between specific types). So, you can call methods on Tabby and Lion and Tiger all through a generic Cat pointer, and even have arrays of Cats filled with Lions and Tigers and Tabbys. The compiler knows which indexes those methods occupy in the object's vtable at compile time (static/early binding), even though the method is selected at run time (dynamic dispatch).
Now, let's implement an array that contains Lions and Tigers and Bears! ((Oh My!)). Assuming we don't have a base class called Animal, in C++ you are going to have significant work to do, because the compiler isn't going to let you do any dynamic dispatch without a common base class. The indexes in the vtables need to match up, and that can't be done between unrelated classes. You'd need a vtable big enough to hold the virtual methods of all classes in the system. C++ programmers rarely see this as a limitation because you have been trained to think a certain way about class design. I'm not saying it's better or worse.
With late binding, the runtime takes care of this without a common base class. There is normally a hash table system used to find methods in the classes, with a cache system used in the dispatcher. Whereas in C++ the compiler knows all the types, in a late-bound language the objects themselves know their type (it's not typeless; the objects themselves know exactly who they are in most cases). This means I can have arrays of multiple types of objects if I want (Lions and Tigers and Bears). And you can implement message forwarding and prototyping (which allows behaviors to be changed per object without changing the class) and all sorts of other things in ways that are much more flexible and lead to less code overhead than in languages that don't support late binding.
Ever programmed for Android and used findViewById()? You almost always end up casting the result to get the right type, and casting is basically lying to the compiler and giving up all the static type-checking goodness that is supposed to make static languages superior. Of course, you could instead have findTextViewById(), findEditTextById(), and a million others so that your return types match, but that is throwing polymorphism out the window; arguably the whole basis of OOP. A late-bound language would probably let you simply index by an ID, treat it like a hash table, and not care what type was being indexed or returned.
Here's another example. Let's say that you have your Lion class and its default behavior is to eat you when you see it. In C++, if you wanted to have a single "trained" lion, you would need to make a new subclass. Prototyping would let you simply change the one or two methods of that particular Lion that need to be changed. Its class and type don't change. C++ can't do that. This is important since, when you have a new "AfricanSpottedLion" that inherits from Lion, you can train it too. The prototyping doesn't change the class structure, so it can be extended. This is normally how these languages handle issues that would otherwise require multiple inheritance, or perhaps multiple inheritance is how you handle a lack of prototyping.
FYI, Objective-C is C with Smalltalk's message passing added, and Smalltalk is the original OOP; both are late bound with all the features above and more. Late-bound languages may be slightly slower from a micro-level standpoint, but can often allow the code to be structured in a way that is more efficient at a macro level, and it all boils down to preference.
Given that wordy Wikipedia definition, I'd be tempted to classify dynamic dispatch as the late binding of C++:
struct Base {
virtual void foo(); // Dynamic dispatch according to Wikipedia definition
void bar(); // Static dispatch according to Wikipedia definition
};
Late binding instead, for Wikipedia, seems to mean the pointer-to-member dispatch of C++:
(this->*mptr)();
where the selection of what is the operation being invoked (and not just which implementation) is done at runtime.
In C++ literature however late binding is normally used for what Wikipedia calls dynamic dispatch.
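A sketch of that pointer-to-member case, where even the operation itself is picked at runtime (the Widget class and its members are invented for the example):

#include <ctime>
#include <iostream>

struct Widget {
    void save()  { std::cout << "save\n"; }
    void print() { std::cout << "print\n"; }
};

int main() {
    Widget w;
    // the *operation* is chosen at runtime, not just its implementation
    void (Widget::*mptr)() = (std::time(0) % 2) ? &Widget::save : &Widget::print;
    (w.*mptr)();
}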
Dynamic dispatch is what happens when you use the virtual keyword in C++. So for example:
#include <iostream>

struct Base
{
    virtual int method1() { return 1; }
    virtual int method2() { return 2; } // not overridden
};

struct Derived : public Base
{
    virtual int method1() { return 3; }
};

int main()
{
    Base* b = new Derived;
    std::cout << b->method1() << std::endl;
}
will print 3, because the method has been dynamically dispatched. The C++ standard is very careful not to specify how exactly this happens behind the scenes, but every compiler under the sun does it in the same way. They create a table of function pointers for each polymorphic type (called the virtual table or vtable), and when you call a virtual method, the "real" method is looked up from the vtable, and that version is called. So you can imagine something like this pseudocode:
struct BaseVTable
{
    int (*_method1) () = &Base::method1; // real function address
    int (*_method2) () = &Base::method2;
};

struct DerivedVTable
{
    int (*_method1) () = &Derived::method1; // overridden
    int (*_method2) () = &Base::method2;    // not overridden
};
In this way, the compiler can be sure that a method with a particular signature exists at compile time. However, at run-time, the call might actually be dispatched via the vtable to a different function. Calls to virtual functions are a tiny bit slower than non-virtual calls, because of the extra indirection step.
On the other hand, my understanding of the term late binding is that the function pointer is looked up by name at runtime, from a hash table or something similar. This is the way things are done in Python, JavaScript and (if memory serves) Objective-C. This makes it possible to add new methods to a class at run-time, which cannot directly be done in C++. This is particularly useful for implementing things like mixins. However, the downside is that the run-time lookup is generally considerably slower than even a virtual call in C++, and the compiler is not able to perform any compile-time type checking for the newly-added methods.
This question might help you.
Dynamic dispatch generally refers to multiple dispatch.
Consider the below example. I hope it might help you.
class Base2;
class Derived2; // Derived2 is a child class of Base2

class Base1 {
public:
    virtual void function1 (Base2 *);
    virtual void function1 (Derived2 *);
};

class Derived1 : public Base1 {
public:
    // overrides
    virtual void function1(Base2 *);
    virtual void function1(Derived2 *);
};
Consider the case below.
Derived1 * d = new Derived1;
Base2 * b = new Derived2;
//Now which function1 will be called.
d->function1(b);
It will call the function1 overload taking Base2*, not Derived2*. This is due to the lack of dynamic multiple dispatch.
Late binding is one of the mechanisms used to implement dynamic single dispatch.
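If you do need the dynamic type of the argument to matter as well, the usual C++ workaround is a second virtual call (double dispatch, as in the Visitor pattern). A minimal sketch, with invented class names:

#include <iostream>

struct Circle;
struct Square;

struct Shape {
    virtual ~Shape() {}
    // first dispatch: on the dynamic type of *this
    virtual void collideWith(Shape& other) = 0;
    // second dispatch: on the dynamic type of the argument
    virtual void collideWithCircle(Circle&) = 0;
    virtual void collideWithSquare(Square&) = 0;
};

struct Circle : Shape {
    void collideWith(Shape& other)  { other.collideWithCircle(*this); }
    void collideWithCircle(Circle&) { std::cout << "circle vs circle\n"; }
    void collideWithSquare(Square&) { std::cout << "circle vs square\n"; }
};

struct Square : Shape {
    void collideWith(Shape& other)  { other.collideWithSquare(*this); }
    void collideWithCircle(Circle&) { std::cout << "circle vs square\n"; }
    void collideWithSquare(Square&) { std::cout << "square vs square\n"; }
};

int main() {
    Circle c; Square s;
    Shape& a = c; Shape& b = s;
    a.collideWith(b);   // both dynamic types participate in the choice
}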
I suppose the meaning is this: when you have two classes B and C that inherit from the same parent class A, a pointer of the parent type (A*) can hold either of the child types. The compiler cannot know which type the pointer holds at any given moment, because it can change while the program runs.
There are special facilities to determine the type of a given object at a given time, like instanceof in Java, or if (typeid(b) == typeid(A)) ... in C++.
In C++, dynamic dispatch and late binding are the same. Basically, the value of a single object determines the piece of code invoked at runtime. In languages like C++ and Java, dynamic dispatch is more specifically dynamic single dispatch, which works as mentioned above. In this case, since the binding occurs at runtime, it is also called late binding. Languages like Smalltalk allow dynamic multiple dispatch, in which the runtime method is chosen based on the identities or values of more than one object.
In C++ we don't really have late binding, because the type information is known. Thus, in the C++ or Java context, dynamic dispatch and late binding are the same. Actual, fully late binding is, I think, found in languages like Python, where lookup is by method name rather than by type.
I have run into an annoying problem lately, and I am not satisfied with my own workaround: I have a program that maintains a vector of pointers to a base class, and I am storing there all kinds of child object pointers. Now, each child class has methods of its own, and the main program may or may not call these methods, depending on the type of object (note, though, that they all heavily use common methods of the base class, so this justifies inheritance).
I have found it useful to have an "object identifier" to check the class type (and then either call the method or not), which is already not very beautiful, but this is not the main inconvenience. The main inconvenience is that if I want to actually be able to call a derived class method through the base class pointer (or even just store the pointer in the pointer array), then I need to declare the derived methods as virtual in the base class.
That makes sense from the C++ coding point of view, but it is not practical in my case (from the development point of view), because I am planning to create many different child classes in different files, perhaps written by different people, and I don't want to tweak/maintain the base class each time to add virtual methods!
How can I do this? Essentially, what I am asking (I guess) is how to implement something like Objective-C's NSArray messaging: if you send a message to an object that does not implement the method, well, nothing happens.
regards
Instead of this:
// variant A: declare everything in the base class
void DoStuff_A(Base* b) {
    if (b->TypeId() == DERIVED_1)
        b->DoDerived1Stuff();
    else if (b->TypeId() == DERIVED_2)
        b->DoDerived2Stuff();
}
or this:
// variant B: declare nothing in the base class
void DoStuff_B(Base* b) {
    if (b->TypeId() == DERIVED_1)
        (dynamic_cast<Derived1*>(b))->DoDerived1Stuff();
    else if (b->TypeId() == DERIVED_2)
        (dynamic_cast<Derived2*>(b))->DoDerived2Stuff();
}
do this:
// variant C: declare the right thing in the base class
b->DoStuff();
Note that there's a single virtual function in the base class per piece of "stuff" that has to be done.
If you find yourself in a situation where you are more comfortable with variant A or B than with variant C, stop and rethink your design. You are coupling components too tightly, and in the end it will backfire.
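For completeness, a sketch of what variant C looks like in the class definitions (the names follow the snippets above):

class Base {
public:
    virtual ~Base() {}
    virtual void DoStuff() = 0;   // one virtual per piece of "stuff" to be done
};

class Derived1 : public Base {
public:
    void DoStuff() { /* Derived1-specific behaviour */ }
};

class Derived2 : public Base {
public:
    void DoStuff() { /* Derived2-specific behaviour */ }
};

// variant C: no type checks, no casts
void DoStuff_C(Base* b) {
    b->DoStuff();
}

int main() {
    Derived1 d1;
    DoStuff_C(&d1);   // calls Derived1::DoStuff
}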
I am planning to create many different children classes in different
files, perhaps made by different people, and I don't want to
tweak/maintain the base class each time, to add virtual methods!
You are OK with tweaking DoStuff each time a derived class is added, but tweaking Base is a no-no. May I ask why?
If your design does not fit the A, B, or C pattern, show what you have, for clairvoyance is a rare feat these days.
You can do what you describe in C++, but not using functions. It is, by the way, kind of horrible but I suppose there might be cases in which it's a legitimate approach.
First way of doing this:
Define a function with a signature something like boost::variant parseMessage(std::string, std::vector<boost::variant>); and perhaps a set of convenience functions with common signatures on the base class, and include a message lookup table on the base class which takes functors. In each class's constructor, add its messages to the message table; the parseMessage function then parcels off each message to the right function on the class.
It's ugly and slow but it should work.
Second way of doing this:
Define the virtual functions further down the hierarchy: if you want to add int foo(bar*);, you first add a class that declares it as virtual, and then ensure every class that wants to define int foo(bar*); inherits from it. You can then use dynamic_cast to check that the pointer you are looking at inherits from this class before trying to call int foo(bar*);. Possibly these interface-adding classes could be pure virtual so they can be mixed in at various points using multiple inheritance, but that may have its own problems.
This is less flexible than the first way and requires the classes that implement a function to be linked to each other. Oh, and it's still ugly.
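A sketch of that second approach (the class and function names are invented for the example):

#include <iostream>

class Base {
public:
    virtual ~Base() {}
};

// interface-adding mixin: only classes that want foo() inherit from it
class HasFoo {
public:
    virtual ~HasFoo() {}
    virtual int foo() = 0;
};

class Child1 : public Base, public HasFoo {
public:
    int foo() { return 42; }
};

class Child2 : public Base {};   // does not implement foo()

void callFooIfPresent(Base* b) {
    if (HasFoo* f = dynamic_cast<HasFoo*>(b))
        std::cout << f->foo() << '\n';
    // otherwise: the object doesn't understand the "message", so do nothing
}

int main() {
    Child1 c1; Child2 c2;
    callFooIfPresent(&c1);   // prints 42
    callFooIfPresent(&c2);   // silently ignored
}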
But mostly I suggest you try and write C++ code like C++ code not Objective-C code.
This can be solved by adding some sort of introspection capability and meta-object system. The talk "Metadata and reflection in C++" by Jeff Tucker demonstrates how to do this using C++ template metaprogramming.
If you don't want to go to the trouble of implementing one yourself, it would be easier to use an existing one such as Qt's meta-object system. Note that this solution does not work with multiple inheritance due to limitations in the meta-object compiler: QObject Multiple Inheritance.
With that installed, you can query for the presence of methods and call them. This is quite tedious to do by hand, so the easiest way to call such methods is via the signal and slot mechanism.
There is also GObject, which is quite similar, and there are others.
If you are planning to create many different child classes in different files, perhaps written by different people, then I would also guess you don't want to change your main code for every child class. In that case, I think what you need to do in your base class is define several (not too many) virtual functions with empty implementations, BUT those functions should mark a point in the logic where they are called, like "AfterInsert" or "BeforeSorting", etc.
Usually there are not too many places in the logic where you want derived classes to perform their own logic.
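A sketch of such hook points (the AfterInsert/BeforeSorting names come from the answer; the container logic around them is invented):

#include <vector>

class Container {
public:
    virtual ~Container() {}

    void insert(int value) {
        data_.push_back(value);
        AfterInsert(value);          // hook: does nothing by default
    }

    void sort() {
        BeforeSorting();             // hook: does nothing by default
        // ... actual sorting ...
    }

protected:
    // empty default implementations; derived classes override only what they need
    virtual void AfterInsert(int /*value*/) {}
    virtual void BeforeSorting() {}

private:
    std::vector<int> data_;
};

class LoggingContainer : public Container {
protected:
    void AfterInsert(int value) { /* log the inserted value */ (void)value; }
};

int main() {
    LoggingContainer c;
    c.insert(42);   // triggers the AfterInsert hook
    c.sort();       // triggers the BeforeSorting hook
}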
I have an interface class similar to:
class IInterface
{
public:
    virtual ~IInterface() {}
    virtual void methodA() = 0;
    virtual void methodB() = 0;
};
I then implement the interface:
class AImplementation : public IInterface
{
    // etc... implementation here
};
When I use the interface in an application, is it better to create an instance of the concrete class AImplementation? E.g.
int main()
{
    AImplementation* ai = new AImplementation();
}
Or is it better to put a factory "create" member function in the Interface like the following:
class IInterface
{
public:
    virtual ~IInterface() {}
    static std::tr1::shared_ptr<IInterface> create(); // implementation in .cpp
    virtual void methodA() = 0;
    virtual void methodB() = 0;
};
Then I would be able to use the interface in main like so:
int main()
{
std::tr1::shared_ptr<IInterface> test(IInterface::create());
}
The 1st option seems to be common practice (not to say it's right). However, the 2nd option was sourced from "Effective C++".
One of the most common reasons for using an interface is so that you can "program against an abstraction" rather than a concrete implementation.
The biggest benefit of this is that it allows you to change parts of your code while minimising the impact on the remaining code.
Therefore although we don't know the full background of what you're building, I would go for the Interface / factory approach.
Having said this, in smaller applications or prototypes I often start with concrete classes until I get a feel for where/if an interface would be desirable. Interfaces can introduce a level of indirection that may just not be necessary for the scale of app you're building.
As a result in smaller apps, I find I don't actually need my own custom interfaces. Like so many things, you need to weigh up the costs and benefits specific to your situation.
There is yet another alternative which you haven't mentioned:
int main(int argc, char* argv[])
{
//...
boost::shared_ptr<IInterface> test(new AImplementation);
//...
return 0;
}
In other words, one can use a smart pointer without using a static "create" function. I prefer this method, because a "create" function adds nothing but code bloat, while the benefits of smart pointers are obvious.
There are two separate issues in your question:
1. How to manage the storage of the created object.
2. How to create the object.
Part 1 is simple - you should use a smart pointer like std::tr1::shared_ptr to prevent memory leaks that otherwise require fancy try/catch logic.
Part 2 is more complicated.
You can't just write create() in main() like you want to - you'd have to write IInterface::create(), because otherwise the compiler will be looking for a global function called create, which isn't what you want. It might seem like having the std::tr1::shared_ptr test initialized with the value returned by create() would do what you want, but that's not how C++ compilers work.
As to whether using a factory method on the interface is a better way to do this than just using new AImplementation(), it's possible it'd be helpful in your situation, but beware of speculative complexity - if you're writing the interface so that it always creates an AImplementation and never a BImplementation or a CImplementation, it's hard to see what the extra complexity buys you.
"Better" in what sense?
The factory method doesn't buy you much if you only plan to have, say, one concrete class. (But then again, if you only plan to have one concrete class, do you really need the interface class at all? Maybe yes, if you're using COM.) In any case, if you can foresee a small, fixed limit on the number of concrete classes, then the simpler implementation may be the "better" one, on the whole.
But if there may be many concrete classes, and if you don't want to have the base class be tightly coupled to them, then the factory pattern may be useful.
And yes, this can help reduce coupling -- if the base class provides some means for the derived classes to register themselves with the base class. This would allow the factory to know which derived classes exist, and how to create them, without needing compile-time information about them.
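A minimal sketch of such a registration-based factory, reusing the question's IInterface/AImplementation names (methodB omitted for brevity, std::shared_ptr standing in for std::tr1::shared_ptr; the registration machinery itself is an illustration, not a canonical recipe):

#include <map>
#include <memory>
#include <string>

class IInterface {
public:
    virtual ~IInterface() {}
    virtual void methodA() = 0;

    typedef std::shared_ptr<IInterface> (*Creator)();

    // derived classes register a creator function under a name
    static void registerType(const std::string& name, Creator c) {
        registry()[name] = c;
    }

    // the factory knows only the registry, not the concrete classes
    static std::shared_ptr<IInterface> create(const std::string& name) {
        std::map<std::string, Creator>::iterator it = registry().find(name);
        return it != registry().end() ? it->second() : std::shared_ptr<IInterface>();
    }

private:
    static std::map<std::string, Creator>& registry() {
        static std::map<std::string, Creator> r;
        return r;
    }
};

class AImplementation : public IInterface {
public:
    void methodA() {}
    static std::shared_ptr<IInterface> make() {
        return std::shared_ptr<IInterface>(new AImplementation);
    }
};

int main() {
    IInterface::registerType("A", &AImplementation::make);
    std::shared_ptr<IInterface> p = IInterface::create("A");
    if (p) p->methodA();
}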
Use the 1st method. Your factory method in the 2nd option would have to be implemented per-concrete class and this is not possible to do in the interface. I.e., IInterface::create() has no idea exactly which concrete class you actually wish to instantiate.
A static method cannot be virtual, and implementing a non-static create() method in your concrete classes has not really won you anything in this case.
Factory methods are certainly useful, but this is not the correct use.
Which item in Effective C++ recommends the 2nd option? I don't see it in mine (though I don't also have the second book). That may clear up a mis-understanding.
I would go with the first option just because it's more common and more understandable. It's really up to you, but if you're working on a commercial app then I would ask my peers what they use.
I do have a very simple question here:
Are you sure you want to use a pointer?
This question might seem illogical, but people coming from a Java background use new much more often than required. In your example, creating the variable on the stack would be amply sufficient.
How can the behavior of an object be changed at runtime (using C++)?
I will give a simple example. I have a class Operator that contains a method operate. Let’s suppose it looks like this:
double operate(double a, double b) {
    return 0.0;
}
The user will give some input values for a and b, and will choose what operation to perform; let's say they can choose to compute addition or multiplication. Given that input, all I am allowed to do is instantiate Operator and call operate(a, b), which is written exactly as shown above.
The methods that compute multiplication or addition will be implemented somewhere (no idea where).
In conclusion I have to change the behavior of my Operator object depending on the user's input.
The standard pattern for this is to make the outer class have a pointer to an "implementation" class.
// derive multiple implementations from this:
class Implementation
{
public:
    virtual ~Implementation() {} // virtual destructor is essential here!
    virtual void foo() = 0;
};

class Switcheroo
{
    Implementation *impl_;
public:
    // constructor, destructor, copy constructor, assignment
    // must all be properly defined (any that you can't define,
    // make private)
    void foo()
    {
        impl_->foo();
    }
};
By forwarding all the member functions of Switcheroo to the impl_ member, you get the ability to switch in a different implementation whenever you need to.
There are various names for this pattern: Pimpl (short for "private implementation"), Smart Reference (as opposed to Smart Pointer, due to the forwarding member functions), and it has something in common with the Proxy and Bridge patterns.
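A self-contained sketch of how such a wrapper lets you change behaviour at runtime (the AddImpl/MulImpl classes and the setImplementation member are hypothetical additions for the example; ownership is kept as non-owning raw pointers for brevity):

#include <iostream>

class Implementation {
public:
    virtual ~Implementation() {}
    virtual void foo() = 0;
};

class AddImpl : public Implementation {
public:
    void foo() { std::cout << "adding\n"; }
};

class MulImpl : public Implementation {
public:
    void foo() { std::cout << "multiplying\n"; }
};

class Switcheroo {
    Implementation* impl_;
public:
    explicit Switcheroo(Implementation* impl) : impl_(impl) {}
    void setImplementation(Implementation* impl) { impl_ = impl; } // hypothetical setter
    void foo() { impl_->foo(); }
};

int main() {
    AddImpl add;
    MulImpl mul;
    Switcheroo s(&add);
    s.foo();                    // "adding"
    s.setImplementation(&mul);  // same object, new behaviour
    s.foo();                    // "multiplying"
}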
I'm mentioning this only as trivia and can't recommend against it strongly enough, but here we go...
WARNING DANGER!!!
A stupid trick I've seen is called clutching, I think, but it's only for the truly foolish. Basically you swap the virtual table pointer for that of another class; it works, but it could theoretically destroy the world or cause some other undefined behavior :)
Anyway, instead of this just use dynamic classing and kosher C++, but as an experiment the above is kind of fun...
Coplien's Envelope/Letter pattern (in his must-read book Advanced C++ Programming Styles and Idioms) is the classic way to do this.
Briefly, an Envelope and a Letter are both subclasses of an abstract base class/interface that defines the public interface for all subclasses.
An Envelope holds (and hides the true type of) a Letter.
A variety of Letter classes have different implementations of the abstract class's public interface.
An Envelope has no real implementation; it just forwards (delegates) to its Letter. It holds a pointer to the abstract base class and points it at a concrete Letter instance. When the implementation needs to change, the type of the Letter subclass pointed to is changed.
As users only have a reference to the Envelope, this change is invisible to them except in that the Envelope's behavior changes.
Coplien's examples are particularly clean, because it's the Letters, not the envelope that cause the change.
One example is of a Number class hierarchy. The abstract base declares certain operations over all Numbers, e.g, addition. Integer and a Complex are examples of concrete subclasses.
Adding an Integer and an Integer results in an Integer, but adding an Integer and a Complex results in a Complex.
Here's what the Envelope looks like for addition:
class Number {
public:
    virtual ~Number() {}
    virtual Number* add( const Number* n ) = 0; // abstract, deriveds override
};

class Envelope : public Number {
private:
    Number* letter;
    // ...
public:
    Number* add( const Number* rhs ) { // add a number to this
        // if letter and rhs are both Integers, letter->add returns an Integer;
        // if either letter or rhs is a Complex, what comes back is a Complex
        letter = letter->add( rhs );
        return this;
    }
};
Now the client's pointer never changes, and they never need to know what the Envelope is holding. Here's the client code:
int main() {
    // makeInteger news up the Envelope and returns a pointer to it
    Number* i = makeInteger( 1 ) ;
    // makeComplex is similar; both return Envelopes
    Number* c = makeComplex( 1, 1 ) ;
    // add c to i
    i->add(c) ;
    // to this code, i is now, for all intents and purposes, a Complex!
    // even though i still points to the same Envelope, because
    // the envelope internally points to a Complex.
}
In his book, Coplien goes into greater depth -- you'll note that the add method requires multi-dispatch of some form --, and adds syntactic sugar. But this is the gist of how you can get what's called "runtime polymorphism".
You can achieve it through dynamic binding (polymorphism)... but it all depends on what you are actually trying to achieve.
You can't change the behavior of arbitrary objects using any sane way unless the object was intended to use 'plugin' behaviour through some technique (composition, callbacks etc).
(Insane ways might be overwriting process memory where the function code lies...)
However, you can overwrite an object's behavior that lies in virtual methods by overwriting the vtable (an approach can be found in this article) without overwriting memory in executable pages. But this still is not a very sane way to do it, and it bears multiple security risks.
The safest thing to do is to change the behavior of objects that were designed to be changed by providing the appropriate hooks (callbacks, composition ...).
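A sketch of that "designed-in hook" idea using a callback; std::function is standard C++11, while the Operator class and member names echo the question and are otherwise invented:

#include <functional>
#include <iostream>

class Operator {
public:
    // behaviour is injected from outside; the default matches the question's stub
    Operator() : op_([](double, double) { return 0.0; }) {}

    void setOperation(std::function<double(double, double)> op) { op_ = op; }

    double operate(double a, double b) { return op_(a, b); }

private:
    std::function<double(double, double)> op_;
};

int main() {
    Operator o;
    o.setOperation([](double a, double b) { return a + b; });  // addition
    std::cout << o.operate(2, 3) << '\n';                       // 5

    o.setOperation([](double a, double b) { return a * b; });  // multiplication
    std::cout << o.operate(2, 3) << '\n';                       // 6
}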
Objects always have the behaviour that's defined by their class.
If you need different behaviour, you need a different class...
You could also consider the Role pattern with dynamic binding. I'm struggling with the same thing that you are. I read about the Strategy pattern, but the Role one sounds like a good solution as well...
There are many ways to do this: proxying, the pImpl idiom, polymorphism, all with pros and cons. The solution that is best for you will depend on exactly which problem you are trying to solve.
Many many ways:
Try if first. You can always change the behavior with an if statement. You'll then probably find the 'polymorphism' way more appropriate, but it depends on your task.
Create an abstract class, declaring as virtual the methods whose behavior must vary.
Create concrete classes that implement the virtual methods. There are many ways to achieve this using design patterns.
You can change an object's behavior using dynamic binding. Design patterns like Decorator and Strategy will actually help you achieve this.