I understand that the choice between dynamic and static polymorphism depends on the application's design and requirements. However, is it advisable to ALWAYS choose static polymorphism over dynamic when possible? In particular, I can see the following two design choices in my application, both of which seem to be advised against:
Static polymorphism using CRTP: no vtable lookup overhead while still providing an interface in the form of a template base class. But it uses a lot of switch statements and static_casts to access the correct class/method, which is hazardous.
Dynamic polymorphism: implement interfaces (pure virtual classes), incurring a lookup cost even for trivial functions like accessors/mutators.
My application is very time critical, so I am in favor of static polymorphism. But I need to know whether using so many static_casts indicates poor design, and how to avoid that without incurring latency.
EDIT: Thanks for the insight. Taking a specific case, which of these is a better approach?
class IMessage_Type_1
{
public:
    virtual long getQuantity() = 0;
    ...
};
class Message_Type_1_Impl : public IMessage_Type_1
{
public:
    long getQuantity() { return _qty; }
    ...
};
OR
template <class T>
class TMessage_Type_1
{
public:
    long getQuantity() { return static_cast<T*>(this)->getQuantity(); }
    ...
};
class Message_Type_1_Impl : public TMessage_Type_1<Message_Type_1_Impl>
{
public:
    long getQuantity() { return _qty; }
    ...
};
Note that there are several mutators/accessors in each class, and I do need to specify an interface in my application. With static polymorphism, I switch just once, to get the message type; with dynamic polymorphism, I pay for a virtual call on EACH method invocation. Doesn't that make a case for static polymorphism? I believe the static_cast in CRTP is quite safe and carries no performance penalty (it is resolved at compile time)?
Static and dynamic polymorphism are designed to solve different problems, so there are rarely cases where both would be appropriate. In such cases, dynamic polymorphism will result in a more flexible and easier to manage design. But most of the time, the choice will be obvious, for other reasons.
One rough categorisation of the two: virtual functions allow different implementations for a common interface; templates allow different interfaces for a common implementation.
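A minimal sketch of that categorisation (Sink, ConsoleSink and add_twice are illustrative names, not taken from the question):

#include <iostream>
#include <string>

// Dynamic polymorphism: one interface, many implementations.
struct Sink {
    virtual ~Sink() = default;
    virtual void write(const std::string& s) = 0;
};
struct ConsoleSink : Sink {
    void write(const std::string& s) override { std::cout << s << '\n'; }
};

// Static polymorphism: one generic implementation, many unrelated types,
// as long as each supports operator+.
template <typename T>
T add_twice(const T& a, const T& b) { return a + b + b; }

int main() {
    ConsoleSink c;
    Sink& s = c;                                   // resolved at run time
    s.write("hello");

    add_twice(1, 2);                               // instantiated at compile time
    add_twice(std::string("a"), std::string("b"));
}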
A switch is nothing more than a sequence of jumps that, after optimization, becomes a jump to an address looked up in a table. Exactly like a virtual function call.
If you have to jump depending on a type, you must first select the type. If the selection cannot be done at compile time (essentially because it depends on the input), you must always perform two operations: select and jump. The syntactic tool you use to select doesn't change the performance, since both optimize to the same thing.
In fact, you are reinventing the vtable.
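For illustration, both of the following typically optimize down to the same "select an index, then jump" pattern (a sketch; the shape types are made up):

// Hand-rolled dispatch: a type tag selected at run time, then a switch.
enum Kind { CIRCLE, SQUARE };
struct ShapeTagged {
    Kind   kind;
    double size;   // radius or side length
};
double area_switch(const ShapeTagged& s) {
    switch (s.kind) {                                    // select...
        case CIRCLE: return 3.14159 * s.size * s.size;   // ...then jump
        case SQUARE: return s.size * s.size;
    }
    return 0.0;
}

// The vtable version: the compiler generates the select & jump for you.
struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};
struct Circle : Shape {
    double r;
    explicit Circle(double r) : r(r) {}
    double area() const override { return 3.14159 * r * r; }
};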
You see the design issues associated with purely template-based polymorphism. While looking at a virtual base class gives you a pretty good idea of what is expected from a derived class, this gets much harder in heavily templated designs. One can easily demonstrate that by introducing a syntax error while using one of the Boost libraries.
On the other hand, you are fearful of performance issues when using virtual functions. Proving that this will actually be a problem is much harder.
IMHO this is a non-question. Stick with virtual functions until indicated otherwise. Virtual function calls are a lot faster than most people think (Calling a function from a dynamically linked library also adds a layer of indirection. No one seems to think about that).
I would only consider a templated design if it makes the code easier to read (generic algorithms), you use one of the few cases known to be slow with virtual functions (numeric algorithms) or you already identified it as a performance bottleneck.
Static polymorphism may provide a significant advantage if the called method can be inlined by the compiler.
For example, if the virtual method looks like this:
protected:
    virtual bool is_my_class_fast_enough() override { return true; }
then static polymorphism should be the preferred way (otherwise, the method should be honest and return false :).
"True" virtual calls (in most cases) can't be inlined.
Other differences (such as the additional indirection of the vtable call) are negligible.
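For comparison, a CRTP sketch of the same check (SpeedCheckBase and MyFastClass are illustrative names); because the call is statically bound, the compiler can inline it down to a constant:

template <class Derived>
class SpeedCheckBase {
public:
    bool is_my_class_fast_enough() {
        // Statically bound: resolved (and typically inlined) at compile time.
        return static_cast<Derived*>(this)->is_my_class_fast_enough_impl();
    }
};

class MyFastClass : public SpeedCheckBase<MyFastClass> {
public:
    bool is_my_class_fast_enough_impl() { return true; }   // trivially inlinable
};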
[EDIT]
However, if you really need runtime polymorphism (if the caller shouldn't know the method's implementation and, therefore, the method can't be inlined on the caller's side), then do not reinvent the vtable (as Emilio Garavaglia mentioned); just use it.
Related
I was interviewed by a financial company and was asked this question:
"List the case(s) when you prefer virtual functions over templates?"
It sounds weird to me, because usually we aim for the opposite, right? All the books, articles, and talks encourage us to use static polymorphism instead of dynamic.
Are there any known cases I am not aware of where you should use virtual functions and avoid templates?
GUI / visualization widget toolkits are an obvious case. Re-implementing a draw method, for example, is certainly less cumbersome with virtual methods and dynamic dispatch. And since modern C++ tends to discourage raw pointer management, std::unique_ptr can manage the resource for you.
I'm sure there are plenty of other hierarchical examples you can come up with... a base enemy class for a game, with virtual methods handling the behaviour of various baddies:)
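As a rough sketch of the widget case (Widget/Button/Slider are illustrative, not a real toolkit API):

#include <iostream>
#include <memory>
#include <vector>

struct Widget {
    virtual ~Widget() = default;
    virtual void draw() const = 0;                  // re-implemented per widget
};
struct Button : Widget {
    void draw() const override { std::cout << "button\n"; }
};
struct Slider : Widget {
    void draw() const override { std::cout << "slider\n"; }
};

int main() {
    std::vector<std::unique_ptr<Widget>> widgets;   // unique_ptr owns each widget
    widgets.push_back(std::make_unique<Button>());
    widgets.push_back(std::make_unique<Slider>());
    for (const auto& w : widgets) w->draw();        // dynamic dispatch per element
}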
The whole 'overhead' argument for dynamic dispatch is completely without merit today. I'd argue that the vtable indirection implementation hasn't been a significant overhead for serious workloads in decades. There's the more interesting question that if C++ was designed today, would polymorphism be part of the language? But that's neither here nor there now.
When the type of the object is not known at compile time, use virtual methods.
e.g.
void Accept (Fruit* pFruit) // supplied from external factors at runtime
{
pFruit->eat(); // `Fruit` can be anyone among `Apple/Blackberry/Chickoo/`...
}
Based on what the user enters, a fruit will be supplied to the function. Hence, there is no way we can figure out at compile time which eat() is going to be called. So it's a candidate for runtime polymorphism:
class Fruit
{
public:
    virtual void eat() = 0;
};
In all the other cases, always prefer static polymorphism (includes templates). It's more deterministic and easy to maintain.
The interview question seems to be asking for your preference (and your reasoning).
I would prefer interfaces (virtual methods) for the ease of creating mocks for unit testing. We could use templates for this, but it is cumbersome (the consumer has to be templated). If profiling shows no speed degradation from vtable lookups, prefer interfaces over mocking/testing non-virtual methods.
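For instance, a minimal sketch of the mocking point (IQuoteFeed, spread and FakeFeed are made-up names; no particular mocking framework implied):

#include <string>

// Production code depends only on the interface.
struct IQuoteFeed {
    virtual ~IQuoteFeed() = default;
    virtual double price(const std::string& symbol) = 0;
};

double spread(IQuoteFeed& feed, const std::string& a, const std::string& b) {
    return feed.price(a) - feed.price(b);   // no knowledge of the concrete feed
}

// In a test, substitute a fake at run time; the consumer needs no templates.
struct FakeFeed : IQuoteFeed {
    double price(const std::string&) override { return 42.0; }
};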
Also, type erasure. I don't think it can be implemented at all using templates. Type erasure can be roughly thought of as void ptr and function pointers, which can be implemented easily using interfaces + virtual methods.
Templates need their implementation to be available to the consumer (typically in headers). I believe virtual methods of an interface can be invoked on a derived-class pointer given just its binary (object) file.
I'm not sure whether code bloat from templates is still an issue with modern compilers, but if the executable size increases significantly, that could be a problem on embedded systems with limited memory, which would then prefer to avoid templates.
As part of a system design, we need to implement a factory pattern. In combination with the Factory pattern, we are also using CRTP, to provide a base set of functionality which can then be customized by the Derived classes.
Sample code below:
class FactoryInterface{
public:
virtual void doX() = 0;
};
//force all derived classes to implement custom_X_impl
template< typename Derived, typename Base = FactoryInterface>
class CRTP : public Base
{
public:
void doX(){
// do common processing..... then
static_cast<Derived*>(this)->custom_X_impl();
}
};
class Derived: public CRTP<Derived>
{
public:
void custom_X_impl(){
//do custom stuff
}
};
Although this design is convoluted, it does provide a few benefits: all the calls after the initial virtual function call can be inlined, and the derived class's custom_X_impl call is also made efficiently.
I wrote a comparison program for a similar implementation (tight loop, repeated calls) using function pointers versus virtual functions. This design came out triumphant with gcc 4.8 at -O2 and -O3.
A C++ guru, however, told me yesterday that any virtual function call in a large executing program can take variable time, owing to cache misses, and that I could achieve potentially better performance using C-style function table look-ups and gcc hot-listing of functions. However, I still see 2x the cost in my sample program mentioned above.
My questions are as below:
1. Is the guru's assertion true? For either answer, are there any links I can refer to?
2. Is there a low-latency implementation I can refer to, in which a base class invokes a custom function in a derived class using function pointers?
3. Any suggestions on improving the design?
Any other feedback is always welcome.
Your guru refers to the hot attribute of the gcc compiler. The effect of this attribute is:
The function is optimized more aggressively and on many targets it is
placed into a special subsection of the text section so all hot
functions appear close together, improving locality.
So yes, in a very large code base, the hot-listed function may remain in cache, ready to be executed without delay, because this placement avoids cache misses.
You can perfectly use this attribute for member functions:
struct X {
void test() __attribute__ ((hot)) {cout <<"hello, world !\n"; }
};
But...
When you use virtual functions, the compiler generally generates a vtable that is shared between all objects of the class. This table is a table of pointers to functions. And indeed, your guru is right: nothing guarantees that this table remains in cached memory.
But if you manually create a "C-style" table of function pointers, the problem is EXACTLY THE SAME. While the function may remain in cache, nothing ensures that your function table remains in cache as well.
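For concreteness, a manual "C-style" function-pointer table might look like the following sketch (Msg, handlers and dispatch are illustrative names); note that the table itself is just another piece of data that may or may not be in cache:

#include <cstdio>

struct Msg { int type; int payload; };

static void handle_a(Msg* m) { std::printf("A %d\n", m->payload); }
static void handle_b(Msg* m) { std::printf("B %d\n", m->payload); }

// "C-style" dispatch: index into a table of function pointers.
static void (*const handlers[])(Msg*) = { handle_a, handle_b };

void dispatch(Msg* m) {
    handlers[m->type](m);   // load the table entry (a memory access), then an indirect call
}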
The main difference between the two approaches is that:
in the case of virtual functions, the compiler knows that the virtual function is a hot spot, and could decide to make sure to keep the vtable in cache as well (I don't know if gcc can do this or if there are plans to do so).
in the case of the manual function pointer table, your compiler will not easily deduce that the table belongs to a hot spot. So this attempt of manual optimization might very well backfire.
My opinion: never try to optimize yourself what a compiler can do much better.
Conclusion
Trust your benchmarks. And trust your OS: if your function or your data is frequently accessed, there is a good chance that a modern OS will take this into account in its virtual memory management, whatever the compiler generates.
I know why and how virtual methods work, and most of the time people tell me I should always mark a method as virtual. But I don't understand why, if I'm not going to override it. And I also know there's a small memory cost.
Please explain why I should mark all methods virtual and what the trade-off is.
Code example:
class Point
{
int x, y;
public:
virtual void setX(int i);
virtual void setY(int i);
};
(That question is not equal to Should I mark all methods virtual? because I want to know the trade-off and because the programming language in the case is C++, not C#)
No, you should not "mark all methods as virtual".
If you want the method to be virtual, then mark it as such. If not, leave the keyword out.
There is an overhead for virtual methods compared to regular ones. If you want to read more about this, check out the Wikipedia page on virtual method tables (vtables).
The real reason to make member functions non-virtual is to enforce class invariants.
Advice to make all member functions virtual generally means that either:
The people giving the advice don't understand the class, or
the people giving the advice don't understand OO design.
Yes, there are a few cases (e.g., some abstract base classes, where the only class invariant is the existence of specific functions) in which all the functions should be virtual. Those are the exception though. In most classes, virtual functions should be restricted to those that you really intend to allow derived classes to provide new/different behavior.
As for the discussion of things like vtables and the overhead of virtual function calls, I'd say they're correct as far as they go, but they miss the really big point. Whether a particular function should or shouldn't be virtual is primarily a question of class design and only secondarily a question of function call overhead. The two don't do the same thing, so trying to compare overhead rarely makes sense.
That is not the case; i.e., if you don't need a virtual function, then don't use it. As Bjarne Stroustrup puts it, you should only pay for what you use.
In C++:
Virtual functions have a slight performance penalty. Normally it is too small to make any difference, but in a tight loop it might be significant.
A virtual function increases the size of each object by one pointer. Again, this is typically insignificant, but if you create millions of small objects it could be a factor (a small sizeof sketch follows these points).
Classes with virtual functions are generally meant to be inherited from. The derived classes may replace some, all, or none of the virtual functions. This can create additional complexity, and complexity is the programmer's mortal enemy. For example, a derived class may poorly implement a virtual function, and that may break a part of the base class that relies on it.
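To make the size point concrete, a tiny sketch (Plain and Virtual are illustrative names; the exact sizes are implementation-defined):

#include <cstdio>

struct Plain   { int x; void f() {} };
struct Virtual { int x; virtual void f() {} };

int main() {
    // On a typical 64-bit implementation this prints something like "4 16":
    // the vtable pointer plus alignment padding.
    std::printf("%zu %zu\n", sizeof(Plain), sizeof(Virtual));
}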
One of C++'s basic principles is that you don't pay for what you don't use. Virtual functions cost more than normal member functions in both time and space, so you shouldn't make methods virtual regardless of whether you'll ever actually need them to be.
Making methods virtual has slight costs (more code, more complexity, larger binaries, slower method calls), and if the class is not inherited from it brings no benefit. Classes need to be designed for inheritance, otherwise inheriting from them is just begging to shoot yourself in the foot. So no, you should not always make every method virtual. The people who tell you this are probably just too inheritance-happy.
It is not true that all functions should be marked as virtual.
Indeed, there's a pattern for enforcing pre- and postconditions (the non-virtual interface, or NVI, idiom) which explicitly requires that public members are not virtual. It works as follows:
class Whatever
{
public:
int frobnicate(int);
protected:
virtual int do_frobnicate(int);
};
int Whatever::frobnicate(int x)
{
check_preconditions(x);
int result = do_frobnicate(x);
check_postconditions(x, result);
return result;
}
Since derived classes cannot override the public function, they cannot remove the pre/postcondition checks. They can, however, override the protected do_frobnicate which does the actual work.
(Edit - I got to this question by way of a duplicate C# question, but the answer is still useful, I think! Edited to reflect that:)
Actually, C# has a good "official" reason: https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/versioning-with-the-override-and-new-keywords
The first sentence there is:
The C# language is designed so that versioning between base and derived classes in different libraries can evolve and maintain backward compatibility.
So if you're writing a class and it's possible end-users will make derived classes, and it's possible different versions will get out of sync...all of a sudden it starts to be important. If you have carefully protected core aspects of your code, then if you update things, people can still use the old derived class (hopefully).
On the other hand, if you are okay with no one being able to use derived classes until their authors have updated to your newest version, everything can be virtual. I have seen this scenario in several "early access" games that allow modding: when the game version increases, a lot of mods suddenly break because they relied on things working one way... and they changed. (Okay, not all breaking changes are related to this, but some are.)
It really depends on your usage scenario. If people can keep using your old version, maybe they don't care if you've updated it and are happy to keep using it with their derived classes. In a large business scenario, making everything virtual may very well be a recipe for breaking many things at once when someone updates something.
Does this apply to C++ as well? I don't see why not - C++ is also used for massive projects and would also face the dangers of multiple simultaneous versions.
What are the guidelines for choosing between template duck-typing and pure virtual base class inheritance? Examples:
// templates
class duck {
public:
    void sing() const { std::cout << "quack\n"; }
};

template<typename bird>
void somefunc(const bird& b) {
    b.sing();
}

// pure virtual base class
class bird {
public:
    virtual ~bird() = default;
    virtual void sing() const = 0;
};

class duck : public bird {
public:
    void sing() const override { std::cout << "quack\n"; }
};

void somefunc(const bird& b) {
    b.sing();
}
With template duck-typing, you are doing static polymorphism. Thus, you cannot do things like
std::vector<bird*> birds;
birds.push_back(new duck());
However, since you are relying on compile-time typing, you are a little more efficient (no virtual call implies no dynamic dispatch based on the dynamic type).
If having the "template nature" of things propagate widely is OK with you, templates ("compile-time duck typing") can give you blazing speed (avoiding the "level of indirection" that's implicit in a virtual-function call) though maybe at some cost in memory footprint (in theory, good C++ implementations could avoid that memory overhead related to templates, but I don't feel very confident that such high-quality compilers will necessarily be available on all platforms where you need to port;-). So, at least pragmatically, it's something of a speed/memory trade-off. If the operations you're doing are so super-slow as I/O, then maybe the relatively tiny speed gain from avoiding a virtual call isn't really material to your use case.
Compile time vs. runtime. If you want compile-time binding, you need to use templates. If you don't know the types at compile time, you should use virtual functions (runtime polymorphism).
They are two completely different things. One is not an alternative to the other. The template function provides a general operation somefunc() which applies to a whole class of types, not just birds. The type of its parameter must be known at compile-time. The virtual method provides a runtime polymorphic operation specific to birds. The exact type of the parameter (this) need not be known at compile-time.
Since they provide different functionality, and are not in conflict with each other, it's rare that you ever need to decide between the two approaches. Decide what functionality you need, and the sensible approach will be obvious. It may even be a combination of the two.
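For instance, here is a sketch of such a combination, reusing the bird/duck idea from the question: a template algorithm that works for any container, while each element is dispatched virtually:

#include <iostream>
#include <vector>

struct bird {
    virtual ~bird() = default;
    virtual void sing() const = 0;
};
struct duck : bird {
    void sing() const override { std::cout << "quack\n"; }
};

// Template (compile-time) algorithm over a runtime-polymorphic element type.
template <typename Container>
void sing_all(const Container& birds) {
    for (const bird* b : birds) b->sing();   // virtual dispatch per element
}

int main() {
    std::vector<bird*> flock;
    flock.push_back(new duck());
    sing_all(flock);
    for (bird* b : flock) delete b;
}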
(btw, the term "duck typing" is misused here. Neither approach is duck typing. You should drop the phrase from your C++ lexicon. )
#John is right. If you have two covariant type parameters, you have no choice: you have to use templates. Object-oriented techniques provide run-time dispatch, but it is only available for types whose methods have at most one variant argument (the object itself).
Most interesting problems involve relations which are N-ary with N > 1, so you will usually have no choice but to use templates. Please examine the standard library to see which technique is used most.
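As a sketch of that point (Point2D and closer_to_origin are illustrative names, not from the standard library):

// A binary (N = 2) relation where both arguments vary together. Virtual
// dispatch can vary on only one argument (this), so a template is the
// natural tool when both arguments must vary in step.
template <typename T>
bool closer_to_origin(const T& a, const T& b) {
    return a.squared_distance() < b.squared_distance();
}

struct Point2D {
    double x, y;
    double squared_distance() const { return x * x + y * y; }
};

// Usage: closer_to_origin(Point2D{1, 1}, Point2D{2, 2});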
Is it bad design to check if an object is of a particular type by having some sort of ID data member in it?
class A
{
private:
    bool m_isStub;   // renamed: a data member cannot share its name with the member function below
public:
    A(bool isStubVal) : m_isStub(isStubVal) {}
    bool isStub() { return m_isStub; }
};
class A1:public A
{
public:
A1():A(false){}
};
class AStub:public A
{
public:
AStub():A(true){}
};
EDIT 1:
The problem is that A holds a lot of virtual functions which A1 doesn't override but the stub needs to, to indicate that you are working on a stub instead of an actual object. Maintainability is the question here: for every function that I add to A, I need to override it in the stub; forgetting to do so means dangerous behaviour, as A's virtual function gets executed with the stub's data. Sure, I could add an abstract class ABase and let A and AStub inherit from it, but the design has become too rigid to allow this refactor.
A reference holder to A is held in another class B. B is initialized with the stub reference, but later, depending on some conditions, the reference holder in B is reinitialized with A1, A2, etc. So when I do BObj.GetA(), I can check in GetA() whether the holder holds a stub and give an error in that case. Not doing that check means I would have to override all functions of A in AStub with the appropriate error conditions.
Generally, yes. You're half OO, half procedural.
What are you going to do once you determine the object type? You probably should put that behavior in the object itself (perhaps in a virtual function), and have different derived classes implement that behavior differently. Then you have no reason to check the object type at all.
In your specific example you have a "stub" class. Instead of doing...
if(!stub)
{
dosomething;
}
Just call
object->DoSomething();
and have the implementation in AStub be empty.
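In code, that might look like this sketch (names illustrative):

struct A {
    virtual ~A() = default;
    virtual void DoSomething() { /* real work for ordinary objects */ }
};

struct AStub : A {
    void DoSomething() override { /* intentionally empty: the stub does nothing */ }
};

// Callers no longer branch on "is this a stub?":
//   a->DoSomething();   // the object itself decides what, if anything, happens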
Generally yes. Usually you want not to query the object, but to expect it to BEHAVE the proper way. What you suggest is basically a primitive RTTI, and this is generally frowned upon, unless there are better options.
The OO way would be to Stub the functionality, not check for it. However, in the case of a lot of functions to "stub" this may not seem optimal.
Hence, this depends on what you want the class to really do.
Also note, that in this case you don't waste space:
class A
{
public:
virtual bool isStub() = 0;
};
class A1:public A
{
public:
virtual bool isStub() { return false; };
};
class AStub:public A
{
public:
virtual bool isStub() { return true; };
};
... but you have a virtual function, which usually is not a problem unless it's a performance bottleneck.
If you want to find out the type of object at runtime you can use a dynamic_cast. You must have a pointer or reference to the object, and then check the result of the dynamic_cast. If it is not NULL, then the object is the correct type.
With polymorphic classes you can use the typeid operator to perform RTTI. Most of the time you shouldn't need to. Without polymorphism, there's no language facility for it, but you should need it even less often.
One caveat: obviously your type is going to be determined at construction time. If your determination of 'type' is a dynamic quantity, you can't solve this problem with the C++ type system; in that case you need some function. But then it is better to use the overridable/dynamic behaviour, as Terry suggested.
Can you provide some better information as what you are trying to accomplish?
This sort of thing is fine. It's generally better to put functionality in the object, so that there's no need to switch on type -- this makes the calling code simpler and localises future changes -- but there's a lot to be said for being able to check the types.
There will always be exceptions to the general case, even with the best will in the world, and being able to quickly check for the odd specific case can make the difference between having something fixed by one change in one place, a quick project-specific hack in the project-specific code, and having to make more invasive, wide-reaching changes (extra functions in the base class at the very least) -- possibly pushing project-specific concerns into shared or framework code.
For a quick solution to the problem, use dynamic_cast. As others have noted, this lets one check that an object is of a given type -- or a type derived from that (an improvement over the straightforward "check IDs" approach). For example:
bool IsStub( const A &a ) {
return bool( dynamic_cast< const AStub * >( &a ) );
}
This requires no setup, and without any effort on one's part the results will be correct. It is also template-friendly in a very straightforward and obvious manner.
Two other approaches may also suit.
If the set of derived types is fixed, or there are a set of derived types that get commonly used, one might have some functions on the base class that will perform the cast. The base class implementations return NULL:
class AStub;
class OtherDerivedClass;

class A {
public:
    virtual AStub *AsStub() { return NULL; }
    virtual OtherDerivedClass *AsOtherDerivedClass() { return NULL; }
};
Then override as appropriate, for example:
class AStub : public A {
public:
    AStub *AsStub() { return this; }
};
Again, this allows one to have objects of a derived type treated as if they were their base type -- or not, if that would be preferable. A further advantage of this is that one need not necessarily return this, but could return a pointer to some other object (a member variable perhaps). This allows a given derived class to provide multiple views of itself, or perhaps change its role at runtime.
This approach is not especially template friendly, though. It would require a bit of work, with the result either being a bit more verbose or using constructs with which not everybody is familiar.
Another approach is to reify the object type. Have an actual object that represents the type, that can be retrieved by both a virtual function and a static function. For simple type checking, this is not much better than dynamic_cast, but the cost is more predictable across a wide range of compilers, and the opportunities for storing useful data (proper class name, reflection information, navigable class hierarchy information, etc.) are much greater.
This requires a bit of infrastructure (a couple of macros, at least) to make it easy to add the virtual functions and maintain the hierarchy data, but it provides good results. Even if this is only used to store class names that are guaranteed to be useful, and to check for types, it'll pay for itself.
With all this in place, checking for a particular type of object might then go something like this example:
bool IsStub( const A &a ) {
return a.GetObjectType().IsDerivedFrom( AStub::GetClassType() );
}
(IsDerivedFrom might be table-driven, or it could simply loop through the hierarchy data. Either of these may or may not be more efficient than dynamic_cast, but the approximate runtime cost is at least predictable.)
As with dynamic_cast, this approach is also obviously amenable to automation with templates.
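In case it helps to picture it, here is a minimal hand-written sketch of the reified-type approach (ClassType, GetClassType and GetObjectType are illustrative names; a real version would hide the boilerplate behind the macros mentioned above):

struct ClassType {
    const char*      name;
    const ClassType* parent;                       // nullptr at the root

    bool IsDerivedFrom(const ClassType& base) const {
        for (const ClassType* t = this; t != nullptr; t = t->parent)
            if (t == &base) return true;           // walk up the hierarchy data
        return false;
    }
};

struct A {
    virtual ~A() = default;
    static const ClassType& GetClassType() {
        static const ClassType type{ "A", nullptr };
        return type;
    }
    virtual const ClassType& GetObjectType() const { return GetClassType(); }
};

struct AStub : A {
    static const ClassType& GetClassType() {
        static const ClassType type{ "AStub", &A::GetClassType() };
        return type;
    }
    const ClassType& GetObjectType() const override { return GetClassType(); }
};

bool IsStub( const A &a ) {
    return a.GetObjectType().IsDerivedFrom( AStub::GetClassType() );
}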
In the general case it might not be a good design, but in some specific cases it is a reasonable design choice to provide an isStub() method for the use of a specific client that would otherwise need to use RTTI. One such case is lazy loading:
class LoadingProxy : public IInterface
{
private:
    IInterface* m_delegate;       // initially points at a stub
    IInterface* loadDelegate();   // loads the real implementation
public:
    LoadingProxy(IInterface* delegate) : m_delegate(delegate) {}
    int useMe()
    {
        if (m_delegate->isStub())
        {
            m_delegate = loadDelegate();
        }
        return m_delegate->useMe();
    }
};
The problem with RTTI is that it is relatively expensive (slow) compared with a virtual method call, so that if your useMe() function is simple/quick, RTTI determines the performance. On one application that I worked on, using RTTI tests to determine if lazy loading was needed was one of the performance bottlenecks identified by profiling.
However, as many other answers have said, the application code should not need to worry about whether it has a stub or a usable instance. The test should be in one place/layer in the application. Unless you might need multiple LoadingProxy implementations there might be a case for making isStub() a friend function.