Do polymorphism or conditionals promote better design? - C++

I recently stumbled across this entry in the Google Testing Blog about guidelines for writing more testable code. I was in agreement with the author until this point:
Favor polymorphism over conditionals: If you see a switch statement you should think polymorphisms. If you see the same if condition repeated in many places in your class you should again think polymorphism. Polymorphism will break your complex class into several smaller simpler classes which clearly define which pieces of the code are related and execute together. This helps testing since simpler/smaller class is easier to test.
I simply cannot wrap my head around that. I can understand using polymorphism instead of RTTI (or DIY-RTTI, as the case may be), but that seems like such a broad statement that I can't imagine it actually being used effectively in production code. It seems to me, rather, that it would be easier to add additional test cases for methods which have switch statements, rather than breaking down the code into dozens of separate classes.
Also, I was under the impression that polymorphism can lead to all sorts of other subtle bugs and design issues, so I'm curious to know if the tradeoff here would be worth it. Can someone explain to me exactly what is meant by this testing guideline?

Actually this makes both the code and its tests easier to write.
If you have one switch statement based on an internal field you probably have the same switch in multiple places doing slightly different things. This causes problems when you add a new case as you have to update all the switch statements (if you can find them).
By using polymorphism you can use virtual functions to get the same functionality, and because a new case is a new class you don't have to search your code for things that need to be checked; it is all isolated within each class.
// AnimalType, Noise and the values Cat, Dog, Hiss, Bark, Purr are assumed
// to be enums defined elsewhere.
class Animal
{
    public:
        Noise warningNoise();
        Noise pleasureNoise();
    private:
        AnimalType type;
};

Noise Animal::warningNoise()
{
    switch(type)
    {
        case Cat: return Hiss;
        case Dog: return Bark;
    }
}

Noise Animal::pleasureNoise()
{
    switch(type)
    {
        case Cat: return Purr;
        case Dog: return Bark;
    }
}
In this simple case, every new animal requires both switch statements to be updated.
You forget one? What is the default? BANG!!
Using polymorphism
class Animal
{
    public:
        virtual ~Animal() {}
        virtual Noise warningNoise() = 0;
        virtual Noise pleasureNoise() = 0;
};

class Cat: public Animal
{
    public:
        // The compiler forces you to define both methods;
        // otherwise you can't have a Cat object.
        // All code local to the cat belongs to the cat.
        virtual Noise warningNoise()  { return Hiss; }
        virtual Noise pleasureNoise() { return Purr; }
};
By using polymorphism you can test the Animal class.
Then test each of the derived classes separately.
Also this allows you to ship the Animal class (closed for alteration) as part of your binary library. But people can still add new Animals (open for extension) by deriving new classes from the Animal header. If all this functionality had been captured inside the Animal class then all animals would need to be defined before shipping (closed/closed).
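For illustration, a rough sketch of how this might be exercised and tested through the base class (Dog is assumed to be defined analogously to Cat, and Noise/Hiss/Bark as before):
#include <cassert>

void checkWarning(Animal& animal, Noise expected)
{
    // Dispatches to Cat::warningNoise() or Dog::warningNoise() at runtime;
    // this test helper never changes when a new Animal is added.
    assert(animal.warningNoise() == expected);
}

int main()
{
    Cat cat;
    Dog dog;                 // assumed to be defined analogously to Cat
    checkWarning(cat, Hiss);
    checkWarning(dog, Bark);
}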

Do not fear...
I guess your problem lies with familiarity, not technology. Familiarize yourself with C++ OOP.
C++ is an OOP language
Among its multiple paradigms, it has OOP features and is more than able to stand comparison with most pure OO languages.
Don't let the "C part inside C++" make you believe C++ can't deal with other paradigms. C++ can handle a lot of programming paradigms quite graciously. And among them, OOP C++ is the most mature of C++ paradigms after procedural paradigm (i.e. the aforementioned "C part").
Polymorphism is Ok for production
There is no "subtle bugs" or "not suitable for production code" thing. There are developers who remain set in their ways, and developers who'll learn how to use tools and use the best tools for each task.
switch and polymorphism are [almost] similar...
... But polymorphism removes most of the errors.
The difference is that you must handle the switches manually, whereas polymorphism is more natural, once you get used to inheritance and method overriding.
With switches, you'll have to compare a type variable with different types, and handle the differences. With polymorphism, the variable itself knows how to behave. You only have to organize the variables in logical ways, and override the right methods.
But in the end, if you forget to handle a case in a switch, the compiler won't necessarily tell you, whereas it will complain if you try to instantiate a derived class that doesn't override its base's pure virtual methods. Thus most switch errors are avoided.
All in all, the two features are about making choices. But polymorphism enables you to make more complex, and at the same time more natural and thus easier, choices.
Avoid using RTTI to find an object's type
RTTI is an interesting concept, and can be useful. But most of the time (i.e. 95% of the time), method overriding and inheritance will be more than enough, and most of your code should not even know the exact type of the object handled, but trust it to do the right thing.
If you use RTTI as a glorified switch, you're missing the point.
(Disclaimer: I am a great fan of the RTTI concept and of dynamic_casts. But one must use the right tool for the task at hand, and most of the time RTTI is used as a glorified switch, which is wrong)
Compare dynamic vs. static polymorphism
If your code does not know the exact type of an object at compile time, then use dynamic polymorphism (i.e. classic inheritance, virtual methods overriding, etc.)
If your code knows the type at compile time, then perhaps you could use static polymorphism, i.e. the CRTP pattern http://en.wikipedia.org/wiki/Curiously_Recurring_Template_Pattern
The CRTP will enable you to have code that smells like dynamic polymorphism, but whose every method call will be resolved statically, which is ideal for some very critical code.
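For illustration, a minimal CRTP sketch (the Shape/Circle names here are invented for this example, not taken from any real code):
// The base calls into the derived class without any virtual dispatch,
// so the call can be inlined and resolved at compile time.
template <typename Derived>
class Shape
{
    public:
        double area() const
        {
            return static_cast<Derived const*>(this)->areaImpl();
        }
};

class Circle : public Shape<Circle>
{
    public:
        explicit Circle(double r) : radius(r) {}
        double areaImpl() const { return 3.14159265358979 * radius * radius; }
    private:
        double radius;
};

// Usage: works with any Shape<T> known at compile time.
template <typename T>
double twiceArea(Shape<T> const& s) { return 2.0 * s.area(); }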
Production code example
Code similar to this (from memory) is used in production.
The easier solution revolved around the procedure called by the message loop (a WinProc in Win32, but I wrote a simpler version, for simplicity's sake). To summarize, it was something like:
void MyProcedure(int p_iCommand, void *p_vParam)
{
// A LOT OF CODE ???
// Each case has a lot of code, with both similarities
// and differences, and of course casts p_vParam
// into something, hoping no one made a mistake
// by associating the wrong command with
// the wrong data type in p_vParam
switch(p_iCommand)
{
case COMMAND_AAA: { /* A LOT OF CODE (see above) */ } break ;
case COMMAND_BBB: { /* A LOT OF CODE (see above) */ } break ;
// etc.
case COMMAND_XXX: { /* A LOT OF CODE (see above) */ } break ;
case COMMAND_ZZZ: { /* A LOT OF CODE (see above) */ } break ;
default: { /* call default procedure */} break ;
}
}
Each added command added a case.
The problem is that some commands were similar, and partly shared their implementation.
So mixing the cases was a risk as the code evolved.
I resolved the problem by using the Command pattern, that is, creating a base Command object, with one process() method.
So I re-wrote the message procedure, minimizing the dangerous code (i.e. playing with void *, etc.) to a minimum, and wrote it to be sure I would never need to touch it again:
void MyProcedure(int p_iCommand, void *p_vParam)
{
switch(p_iCommand)
{
// Only one case. Isn't it cool?
case COMMAND:
{
Command * c = static_cast<Command *>(p_vParam) ;
c->process() ;
}
break ;
default: { /* call default procedure */} break ;
}
}
And then, for each possible command, instead of adding code in the procedure, and mixing (or worse, copy/pasting) the code from similar commands, I created a new command, and derived it either from the Command object, or one of its derived objects:
This led to the hierarchy (represented as a tree):
[+] Command
|
+--[+] CommandServer
| |
| +--[+] CommandServerInitialize
| |
| +--[+] CommandServerInsert
| |
| +--[+] CommandServerUpdate
| |
| +--[+] CommandServerDelete
|
+--[+] CommandAction
| |
| +--[+] CommandActionStart
| |
| +--[+] CommandActionPause
| |
| +--[+] CommandActionEnd
|
+--[+] CommandMessage
Now, all I needed to do was to override process for each object.
Simple, and easy to extend.
For example, say the CommandAction was supposed to do its process in three phases: "before", "while" and "after". Its code would be something like:
class CommandAction : public Command
{
    public:
        // etc.
        virtual void process() // overriding Command::process pure virtual method
        {
            this->processBefore() ;
            this->processWhile() ;
            this->processAfter() ;
        }
        virtual void processBefore() = 0 ; // To be overridden
        virtual void processWhile()
        {
            // Do something common for all CommandAction objects
        }
        virtual void processAfter() = 0 ; // To be overridden
} ;
And, for example, CommandActionStart could be coded as:
class CommandActionStart : public CommandAction
{
    public:
        // etc.
        virtual void processBefore()
        {
            // Do something common for all CommandActionStart objects
        }
        virtual void processAfter()
        {
            // Do something common for all CommandActionStart objects
        }
} ;
As I said: Easy to understand (if commented properly), and very easy to extend.
The switch is reduced to its bare minimum (i.e. if-like, because we still needed to delegate Windows commands to Windows default procedure), and no need for RTTI (or worse, in-house RTTI).
The same code inside a switch would be quite amusing, I guess (if only judging by the amount of "historical" code I saw in our app at work).

Unit testing an OO program means testing each class as a unit. A principle that you want to learn is "Open for extension, closed for modification". I got that from Head First Design Patterns. But it basically says that you want to have the ability to easily extend your code without modifying existing, tested code.
Polymorphism makes this possible by eliminating those conditional statements. Consider this example:
Suppose you have a Character object that carries a Weapon. You can write an attack method like this:
if (weapon is a rifle) then
    // code to attack with rifle
else if (weapon is a plasma gun) then
    // code to attack with plasma gun
etc.
With polymorphism the Character does not have to "know" the type of weapon, simply
weapon.attack()
would work. What happens if a new weapon is invented? Without polymorphism you will have to modify your conditional statement. With polymorphism you only have to add a new class and leave the tested Character class alone.
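A minimal sketch of that idea in C++ (the class and method names here are illustrative, not from the original):
class Weapon
{
    public:
        virtual ~Weapon() {}
        virtual void attack() = 0;
};

class Rifle : public Weapon
{
    public:
        virtual void attack() { /* rifle-specific attack code */ }
};

class PlasmaGun : public Weapon
{
    public:
        virtual void attack() { /* plasma-gun-specific attack code */ }
};

class Character
{
    public:
        explicit Character(Weapon* w) : weapon(w) {}
        // The Character never inspects the weapon's type;
        // adding a new weapon means adding a class, not editing this one.
        void attack() { weapon->attack(); }
    private:
        Weapon* weapon;
};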

I'm a bit of a skeptic: I believe inheritance often adds more complexity than it removes.
I think you are asking a good question, though, and one thing I consider is this:
Are you splitting into multiple classes because you are dealing with different things? Or is it really the same thing, acting in a different way?
If it's really a new type, then go ahead and create a new class. But if it's just an option, I generally keep it in the same class.
I believe the default solution is the single-class one, and the onus is on the programmer proposing inheritance to prove their case.

Not an expert in the implications for test cases, but from a software development perspective:
Open-closed principle -- Classes should be closed to alteration, but open to extension. If you manage conditional operations via a conditional construct, then if a new condition is added, your class needs to change. If you use polymorphism, the base class need not change.
Don't repeat yourself -- An important part of the guideline is the "same if condition." That indicates that your class has some distinct modes of operation that can be factored into a class. Then, that condition appears in one place in your code -- when you instantiate the object for that mode. And again, if a new one comes along, you only need to change one piece of code.
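As a rough sketch of what "the condition appears in one place" might look like (the Mode names below are made up for illustration):
// Hypothetical example: two "modes" of operation factored into classes.
struct Mode
{
    virtual ~Mode() {}
    virtual void run() = 0;
};

struct FastMode : Mode { virtual void run() { /* fast path */ } };
struct SafeMode : Mode { virtual void run() { /* safe path */ } };

// The condition appears exactly once: when the object for the mode is created.
Mode* makeMode(bool fast)
{
    if (fast)
        return new FastMode();
    return new SafeMode();
}
// Everywhere else the code just calls mode->run() and never re-tests the flag.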

Polymorphism is one of the cornerstones of OO and certainly is very useful.
By dividing concerns over multiple classes you create isolated and testable units.
So instead of doing a switch...case where you call methods on several different types or implementations, you create a unified interface with multiple implementations.
When you need to add an implementation, you do not need to modify the clients, as is the case with switch...case. This is very important, as it helps to avoid regressions.
You can also simplify your client algorithm by dealing with just one type: the interface.
Very important to me is that polymorphism is best used with a pure interface/implementation pattern (like the venerable Shape <- Circle, etc.).
You can also have polymorphism in concrete classes with template methods (a.k.a. hooks), but its effectiveness decreases as complexity increases.
Polymorphism is the foundation on which our company's codebase is built, so I consider it very practical.

Switches and polymorphism do the same thing.
In polymorphism (and in class-based programming in general) you group the functions by their type. When using switches you group the types by function. Decide which view is good for you.
So if your interface is fixed and you only add new types, polymorphism is your friend.
But if you add new functions to your interface you will need to update all implementations.
In certain cases you may have a fixed set of types while new functions keep coming; then switches are better. But adding new types makes you update every switch.
With switches you are duplicating sub-type lists. With polymorphism you are duplicating operation lists. You trade one problem for a different one. This is the so-called expression problem, which is not solved by any programming paradigm I know. The root of the problem is the one-dimensional nature of the text used to represent the code.
Since pro-polymorphism points are well discussed here, let me provide a pro-switch point.
OOP has design patterns to avoid common pitfalls. Procedural programming has design patterns too (but no one has written them down yet AFAIK; we need another new Gang of N to make a bestseller book of those...). One such pattern could be: always include a default case.
Switches can be done right:
switch (type)
{
case T_FOO: doFoo(); break;
case T_BAR: doBar(); break;
default:
fprintf(stderr, "You, who are reading this, add a new case for %d to the FooBar function ASAP!\n", type);
assert(0);
}
This code will point your favorite debugger to the location where you forgot to handle a case. A compiler can force you to implement your interface, but this forces you to test your code thoroughly (at least to see the new case is noticed).
Of course, if a particular switch is used in more than one place, it can be factored out into a function (don't repeat yourself).
If you want to extend these switches, just do a grep -rn 'case[ ]*T_BAR' . (on Linux) and it will spit out the locations worth looking at. Since you need to look at the code, you will see some context which helps you add the new case correctly. When you use polymorphism the call sites are hidden inside the system, and you depend on the correctness of the documentation, if it exists at all.
Extending switches does not break the OCP either, since you do not alter the existing cases, you just add a new one.
Switches also help the next guy trying to get accustomed to and understand the code:
The possible cases are before your eyes. That's a good thing when reading code (less jumping around).
But virtual method calls are just like normal method calls. One can never know if a call is virtual or normal (without looking up the class). That's bad.
But if the call is virtual, possible cases are not obvious (without finding all derived classes). That's also bad.
When you provide an interface to a third party, so they can add behavior and user data to a system, that's a different matter. (They can set callbacks and pointers to user data, and you give them handles.)
Further debate can be found here: http://c2.com/cgi/wiki?SwitchStatementsSmell
I'm afraid my "C-hacker's syndrome" and anti-OOPism will eventually burn all my reputation here. But whenever I needed or had to hack or bolt something into a procedural C system, I found it quite easy, the lack of constraints, forced encapsulation and less abstraction layers makes me "just do it". But in a C++/C#/Java system where tens of abstraction layers stacked on the top of each other in the software's lifetime, I need to spend many hours sometimes days to find out how to correctly work around all the constraints and limitations that other programmers built into their system to avoid others "messing with their class".

This is mainly to do with encapsulation of knowledge. Let's start with a really obvious example - toString(). This is Java, but easily transfers to C++. Suppose you want to print a human friendly version of an object for debugging purposes. You could do:
switch (obj.type) {
case 1: cout << "Type 1" << obj.foo <<...; break;
case 2: cout << "Type 2" << ...
This would however clearly be silly. Why should one method somewhere know how to print everything? It will often be better for the object itself to know how to print itself, e.g.:
cout << object.toString();
That way the toString() can access member fields without needing casts. They can be tested independently. They can be changed easily.
You could argue however, that how an object prints shouldn't be associated with an object, it should be associated with the print method. In this case, another design pattern comes in helpful, which is the Visitor pattern, used to fake Double Dispatch. Describing it fully is too long for this answer, but you can read a good description here.
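As a very condensed, hedged sketch of the idea (the Shape/Circle/Square names and the printer are invented for illustration, not taken from the linked description):
class Circle;
class Square;

class ShapeVisitor
{
    public:
        virtual void visit(Circle const& c) = 0;
        virtual void visit(Square const& s) = 0;
};

class Shape
{
    public:
        virtual ~Shape() {}
        virtual void accept(ShapeVisitor& v) const = 0;
};

class Circle : public Shape
{
    public:
        // First dispatch: the virtual accept() picks the concrete shape...
        virtual void accept(ShapeVisitor& v) const { v.visit(*this); }
};

class Square : public Shape
{
    public:
        virtual void accept(ShapeVisitor& v) const { v.visit(*this); }
};

// ...second dispatch: overload resolution picks the right visit().
// Printing now lives in the visitor, not in the shapes themselves.
class ShapePrinter : public ShapeVisitor
{
    public:
        virtual void visit(Circle const&) { /* print a circle */ }
        virtual void visit(Square const&) { /* print a square */ }
};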

If you are using switch statements everywhere you run into the possibility that when upgrading you miss one place that needs an update.

It works very well if you understand it.
There are also 2 flavors of polymorphism. The first is very easy to understand in java-esque:
interface A{
int foo();
}
final class B implements A{
int foo(){ print("B"); }
}
final class C implements A{
int foo(){ print("C"); }
}
B and C share a common interface. B and C in this case can't be extended, so you're always sure which foo() you're calling. Same goes for C++, just make A::foo pure virtual.
Second, and trickier, is run-time polymorphism. It doesn't look too bad in pseudo-code.
class A{
int foo(){print("A");}
}
class B extends A{
int foo(){print("B");}
}
class C extends B{
int foo(){print("C");}
}
...
class Z extends Y{
int foo(){print("Z");}
}
main(){
F* f = new Z();
A* a = f;
a->foo();
f->foo();
}
But it is a lot trickier. Especially if you're working in C++ where some of the foo declarations may be virtual, and some of the inheritance might be virtual. Also the answer to this:
A* a = new Z;
A a2 = *a;
a->foo();
a2.foo();
might not be what you expect: A a2 = *a creates a sliced copy of static type A, so a2.foo() uses A's version even if foo is virtual, whereas a->foo() dispatches to Z's version when foo is virtual.
Just keep keenly aware of what you do and don't know if you're using run-time polymorphism. Don't get overconfident, and if you're not sure what something is going to do at run-time, then test it.

I must reiterate that finding all switch statements can be a non-trivial process in a mature code base. If you miss any, then the application is likely to crash because of an unmatched case statement unless you have a default set.
Also check out "Martin Fowlers" book on "Refactoring"
Using a switch instead of polymorphism is a code smell.

It really depends on your style of programming. While this may be correct in Java or C#, I don't agree that automatically deciding to use polymorphism is correct. You can split your code into lots of little functions and perform an array lookup with function pointers (initialized at compile time), for instance. In C++, polymorphism and classes are often overused; probably the biggest design mistake made by people coming from strong OOP languages into C++ is that everything goes into a class. This is not true. A class should only contain the minimal set of things that make it work as a whole. If a subclass or friend is necessary, so be it, but they shouldn't be the norm. Any other operations on the class should be free functions in the same namespace; ADL will allow these functions to be used without qualification.
C++ is not an OOP language, don't make it one. It's as bad as programming C in C++.
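A small sketch of the free-function style (the geometry/Circle names are invented for illustration):
#include <iostream>

namespace geometry
{
    class Circle
    {
        public:
            explicit Circle(double r) : radius_(r) {}
            double radius() const { return radius_; }
        private:
            double radius_;
    };

    // Not a member: a free function in the same namespace as Circle.
    double area(Circle const& c)
    {
        return 3.14159265358979 * c.radius() * c.radius();
    }
}

int main()
{
    geometry::Circle c(2.0);
    // ADL finds geometry::area because the argument's type lives in geometry.
    std::cout << area(c) << "\n";
    return 0;
}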

Related

Programming pattern for components that are toggleable at runtime

I'm wondering if there is some kind of logical programming pattern or structure that I should be using if sometimes during runtime a component should be used and other times not. The obvious simple solution is to just use if-else statements everywhere. I'm trying to avoid littering my code with if-else statements since once the component is toggled on, it will more than likely be on for a while, and I wonder if it's worth it to recheck whether the same component is active all over the place when the answer will most likely not have changed between checks.
Thanks
A brief example of what I'm trying to avoid
class MainClass
{
public:
// constructors, destructors, etc
private:
ComponentClass m_TogglableComponent;
};
// somewhere else in the codebase
if (m_TogglableComponent.IsActive())
{
// do stuff
}
// somewhere totally different in the codebase
if (m_TogglableComponent.IsActive())
{
// do some different stuff
}
Looks like you're headed towards a feature toggle. This is a common occurrence when there's a piece of functionality that you need to be able to toggle on or off at run time. The key piece of insight with this approach is to use polymorphism instead of if/else statements, leveraging object oriented practices.
Martin Fowler details an approach here, as well as his rationale: http://martinfowler.com/articles/feature-toggles.html
But for a quick answer, instead of having state in your ComponentClass that tells observers whether it's active or not, you'll want to make a base class, AbstractComponentClass, and two derived classes, ActiveComponentClass and InactiveComponentClass. Bear in mind that m_TogglableComponent is currently an automatic member, and you'll need to make it a pointer under this new setup.
AbstractComponentClass will define pure virtual methods that both derived classes need to implement. In ActiveComponentClass you will put your normal functionality, as if it were enabled. In InactiveComponentClass you do as little as possible, enough to make the component invisible as far as MainClass is concerned. Void functions will do nothing and functions returning values will return neutral values.
The last step is creating an instance of one of these two classes. This is where you bring in dependency injection. In your constructor to MainClass, you'll take a pointer of type AbstractComponentClass. From there on it doesn't care if it's Active or Inactive, it just calls the virtual functions. Whoever owns or controls MainClass is the one that injects the kind that you want, either active or inactive, which could be read by configuration or however else your system decides when to toggle.
If you need to change the behaviour at run time, you'll also need a setter method that takes another AbstractComponentClass pointer and replaces the one from the constructor.
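A minimal sketch of that arrangement (the DoWork/Value methods are invented here; a real component would of course have its own interface):
class AbstractComponentClass
{
    public:
        virtual ~AbstractComponentClass() {}
        virtual void DoWork() = 0;
        virtual int  Value() const = 0;
};

class ActiveComponentClass : public AbstractComponentClass
{
    public:
        virtual void DoWork()      { /* normal functionality */ }
        virtual int  Value() const { return 42; /* a real value */ }
};

class InactiveComponentClass : public AbstractComponentClass
{
    public:
        virtual void DoWork()      { /* deliberately does nothing */ }
        virtual int  Value() const { return 0; /* neutral value */ }
};

class MainClass
{
    public:
        // Dependency injection: the owner decides which flavour is passed in.
        explicit MainClass(AbstractComponentClass* component)
            : m_TogglableComponent(component) {}

        // Optional setter for toggling the behaviour at run time.
        void SetComponent(AbstractComponentClass* component)
            { m_TogglableComponent = component; }

        void Run() { m_TogglableComponent->DoWork(); }  // no IsActive() checks anywhere

    private:
        AbstractComponentClass* m_TogglableComponent;
};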

Object composition promotes code reuse. (T/F, why)

I'm studying for an exam and am trying to figure this question out. The specific question is "Inheritance and object composition both promote code reuse. (T/F)", but I believe I understand the inheritance portion of the question.
I believe inheritance promotes code reuse because similar methods can be placed in an abstract base class such that the similar methods do not have to be identically implemented within multiple children classes. For example, if you have three kinds of shapes, and each shape's method "getName" simply returns a data member '_name', then why re-implement this method in each of the child classes when it can be implemented once in the abstract base class "shape".
However, my best understanding of object composition is the "has-a" relationship between objects/classes. For example, a student has a school, and a school has a number of students. This can be seen as object composition since they can't really exist without each other (a school without any students isn't exactly a school, is it? etc). But I see no way that these two objects "having" each other as a data member will promote code reuse.
Any help? Thanks!
Object composition can promote code reuse because you can delegate implementation to a different class, and include that class as a member.
Instead of putting all your code in your outermost classes' methods, you can create smaller classes with smaller scopes, and smaller methods, and reuse those classes/methods throughout your code.
class Inner
{
public:
void DoSomething();
};
class Outer1
{
public:
void DoSomethingBig()
{
// We delegate part of the call to inner, allowing us to reuse its code
inner.DoSomething();
// Todo: Do something else interesting here
}
private:
Inner inner;
};
class Outer2
{
public:
void DoSomethingElseThatIsBig()
{
// We used the implementation twice in different classes,
// without copying and pasting the code.
// This is the most basic possible case of reuse
inner.DoSomething();
// Todo: Do something else interesting here
}
private:
Inner inner;
};
As you mentioned in your question, this is one of the two most basic Object Oriented Programming principles, called a "has-a relationship". Inheritance is the other relationship, and is called an "is-a relationship".
You can also combine inheritance and composition in quite useful ways that will often multiply your code (and design) reuse potential. Any real world and well-architected application will constantly combine both of these concepts to gain as much reuse as possible. You'll find out about this when you learn about Design Patterns.
Edit:
Per Mike's request in the comments, a less abstract example:
// Assume the person class exists
#include<list>
class Bus
{
public:
void Board(Person newOccupant);
std::list<Person>& GetOccupants();
private:
std::list<Person> occupants;
};
In this example, instead of re-implementing a linked list structure, you've delegated it to a list class. Every time you use that list class, you're re-using the code that implements the list.
In fact, since list semantics are so common, the C++ standard library gave you std::list, and you just had to reuse it.
1) The student knows about a school, but this is not really a HAS-A relationship; while you would want to keep track of what school the student attends, it would not be logical to describe the school as being part of the student.
2) More people occupy the school than just students. That's where the reuse comes in. You don't have to re-define the things that make up a school each time you describe a new type of school-attendee.
I have to agree with #Karl Knechtel -- this is a pretty poor question. As he said, it's hard to explain why, but I'll give it a shot.
The first problem is that it uses a term without defining it -- and "code reuse" means a lot of different things to different people. To some people, cutting and pasting qualifies as code reuse. As little as I like it, I have to agree with them, to at least some degree. Other people define code reuse in ways that rule out cutting and pasting as being code reuse (classing another copy of the same code as separate code, not reusing the same code). I can see that viewpoint too, though I tend to think their definition is intended more to serve a specific end than be really meaningful (i.e., "code reuse"->good, "cut-n-paste"->bad, therefore "cut-n-paste"!="code reuse"). Unfortunately, what we're looking at here is right on the border, where you need a very specific definition of what code reuse means before you can answer the question.
The definition used by your professor is likely to depend heavily upon the degree of enthusiasm he has for OOP -- especially during the '90s (or so) when OOP was just becoming mainstream, many people chose to define it in ways that only included the cool new OOP "stuff". To achieve the nirvana of code reuse, you had to not only sign up for their OOP religion, but really believe in it! Something as mundane as composition couldn't possibly qualify -- no matter how strangely they had to twist the language for that to be true.
As a second major point, after decades of use of OOP, a few people have done some fairly careful studies of what code got reused and what didn't. Most that I've seen have reached a fairly simple conclusion: it's quite difficult (i.e., essentially impossible) to correlate coding style with reuse. Nearly any rule you attempt to make about what will or won't result in code reuse can and will be violated on a regular basis.
Third, and what I suspect tends to be foremost in many people's minds is the fact that asking the question at all makes it sound as if this is something that can/will affect a typical coder -- that you might want to choose between composition and inheritance (for example) based on which "promotes code reuse" more, or something on that order. The reality is that (just for example) you should choose between composition and inheritance primarily based upon which more accurately models the problem you're trying to solve and which does more to help you solve that problem.
Though I don't have any serious studies to support the contention, I would posit that the chances of that code being reused will depend heavily upon a couple of factors that are rarely even considered in most studies: 1) how similar of a problem somebody else needs to solve, and 2) whether they believe it will be easier to adapt your code to their problem than to write new code.
I should add that in some of the studies I've seen, there were factors found that seemed to affect code reuse. To the best of my recollection, the one that stuck out as being the most important/telling was not the code itself at all, but the documentation available for that code. Being able to use the code without basically reverse engineering it contributes a great deal toward its being reused. The second point was simply the quality of the code -- a number of the studies were done in places/situations where they were trying to promote code reuse. In a fair number of cases, people tried to reuse quite a bit more code than they really did, but had to give up on it simply because the code wasn't good enough -- everything from bugs to clumsy interfaces to poor portability prevented reuse.
Summary: I'll go on record as saying that code reuse has probably been the most overhyped, under-delivered promise in software engineering over at least the last couple of decades. Even at best, code reuse remains a fairly elusive goal. Trying to simplify it to the point of treating it as a true/false question based on two factors is oversimplifying the question to the point that it's not only meaningless, but utterly ridiculous. It appears to trivialize and demean nearly the entire practice of software engineering.
I have an object Car and an object Engine:
class Engine {
    int horsepower;
};

class Car {
    string make;
    Engine cars_engine;
};
A Car has an Engine; this is composition. However, I don't need to redefine Engine to put an engine in a car -- I simply say that a Car has an Engine. Thus, composition does indeed promote code reuse.
Object composition does promote code re-use. Without object composition, if I understand your definition of it properly, every class could have only primitive data members, which would be beyond awful.
Consider the classes
class Vector3
{
    double x, y, z;
    double vectorNorm;
};

class Object
{
    Vector3 position;
    Vector3 velocity;
    Vector3 acceleration;
};
Without object composition, you would be forced to have something like
class Object
{
    double positionX, positionY, positionZ, positionVectorNorm;
    double velocityX, velocityY, velocityZ, velocityVectorNorm;
    double accelerationX, accelerationY, accelerationZ, accelerationVectorNorm;
};
This is just a very simple example, but I hope you can see how even the most basic object composition promotes code reuse. Now think about what would happen if Vector3 contained 30 data members. Does this answer your question?

A C++ based variants chess engine design question

I have a chess variants engine that plays suicide chess and losers chess along with normal chess. I might, over time, add more variants to my engine. The engine is implemented completely in C++ with proper usage of OOP. My question is related to design of such a variant engine.
Initially the project started as a suicide-only engine while over time I added other flavors. For adding new variants, I experimented using polymorphism in C++ first. For instance, a MoveGenerator abstract class had two subclasses SuicideMoveGenerator and NormalMoveGenerator and depending on the type of game chosen by user, a factory would instantiate the right subclass. But I found this to be much slower - obviously because instantiating classes containing virtual functions and calling virtual functions in tight loops are both quite inefficient.
But then it occurred to me to use C++ templates with template specialization for separating logic for different variants with maximum reuse of code. This also seemed very logical because dynamic linking is not really necessary in the context as once you choose the type of game, you basically stick with it until the end of the game. C++ template specialization provides exactly this - static polymorphism. The template parameter is either SUICIDE or LOSERS or NORMAL.
enum GameType { LOSERS, NORMAL, SUICIDE };
So once the user selects a game type, the appropriate game object is instantiated and everything called from there will be appropriately templatized. For instance, if the user selects suicide chess, let's say:
ComputerPlayer<SUICIDE>
object is instantiated and that instantiation is basically linked to the whole control flow statically. Functions in ComputerPlayer<SUICIDE> would work with MoveGenerator<SUICIDE>, Board<SUICIDE> and so on, while the corresponding NORMAL ones work together in the same way.
On the whole, this lets me instantiate the right template-specialized class at the beginning, and without any other if conditions anywhere the whole thing works perfectly. The best thing is there is no performance penalty at all!
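A hedged sketch of the kind of specialization described (the class names and bodies are illustrative, not from the actual engine; the GameType enum is the one above):
enum GameType { LOSERS, NORMAL, SUICIDE };

template <GameType G>
class MoveGenerator
{
public:
    void generateMoves() { /* variant-independent move generation */ }
};

// Specialize only what differs per variant.
template <>
class MoveGenerator<SUICIDE>
{
public:
    void generateMoves() { /* captures are compulsory in suicide chess */ }
};

template <GameType G>
class ComputerPlayer
{
public:
    void think()
    {
        // Resolved at compile time: no virtual dispatch in the hot path.
        generator.generateMoves();
    }
private:
    MoveGenerator<G> generator;
};

// Chosen once, when the user picks the variant:
//   ComputerPlayer<SUICIDE> player;  player.think();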
The main downside with this approach however is that using templates makes your code a bit harder to read. Also template specialization if not appropriately handled can lead to major bugs.
I wonder what other variant engine authors normally do for separation of logic (with good reuse of code)? I found C++ template programming quite suitable, but if there's anything better out there I would be glad to embrace it. In particular, I checked the Fairymax engine by Dr. H G Muller, but that uses config files for defining game rules. I don't want to do that because many of my variants have different extensions, and by making it generic to the level of config files the engine might not grow strong. Another popular engine, Sjeng, is littered with if conditions everywhere and I personally feel that's not a good design.
Any new design insights would be very useful.
"Calling virtual functions in tight loops are inefficient"
I would actually be pretty surprised if this caused any real slowdown: if all the objects in the loop have the same dynamic type, I would expect the processor to keep the call target in its branch predictor and cache, and thus not suffer much.
However there is one part that worries me:
"obviously because instantiating classes containing virtual functions [is] quite inefficient"
Now... I am really surprised.
The cost of instantiating a class with virtual functions is nearly indistinguishable from the cost of instantiating a class without any virtual functions: it's one more pointer, and that's all (on popular compilers, this corresponds to the _vptr).
I surmise that your problem lies elsewhere. So I am going to take a wild guess:
do you, by any chance, have a lot of dynamic instantiation going on (i.e. calling new)?
If that is the case, you would gain much by removing them.
There is a design pattern called Strategy which would be eminently suitable for your precise situation. The idea of this pattern is akin, in fact, to the use of virtual functions, but it actually externalizes those functions.
Here is a simple example:
class StrategyInterface
{
    public:
        virtual ~StrategyInterface() {}
        Move GenerateMove(Player const& player) const
        {
            return GenerateMoveImpl(player); // non-virtual interface forwarding to the implementation
        }
    private:
        virtual Move GenerateMoveImpl(Player const& player) const = 0;
};

class SuicideChessStrategy: public StrategyInterface
{
    private:
        virtual Move GenerateMoveImpl(Player const& player) const; // defined in the .cpp for this variant
};
// Others
Once implemented, you need a function to get the right strategy:
StrategyInterface& GetStrategy(GameType gt)
{
static std::array<StrategyInterface*,3> strategies
= { new SuicideChessStrategy(), .... };
return *(strategies[gt]);
}
And finally, you can delegate the work without using inheritance for the other structures:
class Player
{
public:
Move GenerateMove() const { return GetStrategy(gt).GenerateMove(*this); }
private:
GameType gt;
};
The cost is pretty much similar to using virtual functions, however you do not need dynamically allocated memory for the basic objects of your game any longer, and stack allocation is a LOT faster.
I'm not quite sure if this is a fit but you may be able to achieve static polymorphism via the CRTP with some slight modifications to your original design.

dynamical binding or switch/case?

A scenario like this:
I have different objects that do similar operations, each with its respective func() implementation.
There are two kinds of solutions for func_manager() to call func() according to the object:
Solution 1: Use the virtual function mechanism of C++. func_manager works differently according to the object pointer passed in.
class Object{
public:
    virtual ~Object() {}
    virtual void func() = 0;
};

class Object_A : public Object{
public:
    void func() {}
};

class Object_B : public Object{
public:
    void func() {}
};

void func_manager(Object* a)
{
    a->func();
}
Solution 2: Use a plain switch/case. func_manager works differently according to the type passed in.
typedef enum _type_t
{
TYPE_A,
TYPE_B
}type_t;
void func_by_a()
{
// do as func() in Object_A
}
void func_by_b()
{
// do as func() in Object_B
}
void func_manager(type_t type)
{
switch(type){
case TYPE_A:
func_by_a();
break;
case TYPE_B:
func_by_b();
break;
default:
break;
}
}
My questions are two:
1. From the viewpoint of DESIGN PATTERNS, which one is better?
2. From the viewpoint of RUNTIME EFFICIENCY, which one is better? Especially as the number of kinds of Object increases, maybe up to 10-15 in total, at what point does one's overhead exceed the other's? I don't know how switch/case is implemented internally; is it just a bunch of if/else?
Thanks very much!
From the viewpoint of DESIGN PATTERNS, which one is better?
Using polymorphism (Solution 1) is better.
Just one data point: Imagine you have a huge system built around either of the two and then suddenly comes the requirement to add another type. With solution one, you add one derived class, make sure it's instantiated where required, and you're done. With solution 2 you have thousands of switch statements smeared all over the system and it is more or less impossible to guarantee you found all the places where you have to modify them for the new type.
From the viewpoint of RUNTIME EFFICIENCY, which one is better? Especially as the kinds of Object increase
That's hard to say.
I remember a footnote in Stanley Lippmann's Inside the C++ Object Model, where he says that studies have shown that virtual functions might have a small advantage against switches over types. I would be hard-pressed, however, to cite chapter and verse, and, IIRC, the advantage didn't seem big enough to make the decision dependent on it.
The first solution is better if only because it's shorter in code. It is also easier to maintain and faster to compile: if you want to add types you need only add new types as headers and compilation units, with no need to change and recompile the code that is responsible for the type mapping. In fact, this code is generated by the compiler and is likely to be as efficient or more efficient than anything you can write on your own.
Virtual functions at the lowest level cost no more than an extra dereference in a table (array). But never mind that, this sort of performance nitpicking really doesn't matter at this microscopic numbers. Your code is simpler, and that's what matters. The runtime dispatch that C++ gives you is there for a reason. Use it.
I would say the first one is better. In solution 2 func_manager will have to know about all types and be updated everytime you add a new type. If you go with solution 1 you can later add a new type and func_manager will just work.
In this simple case I would actually guess that solution 1 will be faster since it can directly look up the function address in the vtable. If you have 15 different types the switch statement will likely not end up as a jump table but basically as you say a huge if/else statement.
From the design point of view the first one is definitely better, as that's what inheritance was intended for: to make different objects behave homogeneously.
From the efficiency point of view in both alternatives you have more or less the same generated code, somewhere there must be the choice making code. Difference is that inheritance handles it for you automatically in the 1st one and you do it manually in the 2nd one.
Using "dynamic dispatch" or "virtual dispatch" (i.e. invoking a virtual function) is better both with respect to design (keeping changes in one place, extensibility) and with respect to runtime efficiency (simple dereference vs. a simple dereference where a jump table is used or an if...else ladder which is really slow). As a slight aside, "dynamic binding" doesn't mean what you think... "dynamic binding" refers to resolving a variable's value based on its most recent declaration, as opposed to "static binding" or "lexical binding", which refers to resolving a variable by the current inner-most scope in which it is declared.
Design
If another programmer comes along who doesn't have access to your source code and wants to create an implementation of Object, then that programmer is stuck... the only way to extend functionality is by adding yet another case in a very long switch statement. Whereas with virtual functions, the programmer only needs to inherit from your interface and provide definitions for the virtual methods and, voilà, it works. Also, those switch statements end up all over the place, and so adding new implementations almost always requires modifying many switch statements everywhere, while inheritance keeps the changes localized to one class.
Efficiency
Dynamic dispatch simply looks up a function in the object's virtual table and then jumps to that location. It is incredibly fast. If the switch statement uses a jump table, it will be roughly the same speed; however, if there are very few implementations, some programmer is going to be tempted to use an if...else ladder instead of a switch statement, which generally is not able to take advantage of jump tables and is, therefore, slower.
Why has nobody suggested function objects? I think kingkai is interested in solving the problem, not only in those two solutions.
I'm not experienced with them, but they do their job:
#include <iostream>
#include <string>

struct Helloer{
std::string operator() (void){
return std::string("Hello world!");
}
};
struct Byer{
std::string operator() (void){
return std::string("Good bye world!");
}
};
template< class T >
void say( T speaker){
std::cout << speaker() << std::endl;
}
int main()
{
say( Helloer() );
say( Byer() );
}
Edit: In my opinion this is a more "right" approach than classes with a single method (which is not the function call operator). Actually, I think this overloading was added to C++ to avoid such classes.
Also, function objects are more convenient to use, even if you don't want templates - just like usual functions.
In the end, consider the STL - it uses function objects everywhere and looks pretty natural. And I don't even mention Boost.

Best way to use a C++ Interface

I have an interface class similar to:
class IInterface
{
public:
virtual ~IInterface() {}
virtual void methodA() = 0;
virtual void methodB() = 0;
};
I then implement the interface:
class AImplementation : public IInterface
{
// etc... implementation here
};
When I use the interface in an application, is it better to create an instance of the concrete class AImplementation? E.g.:
int main()
{
AImplementation* ai = new AImplementation();
}
Or is it better to put a factory "create" member function in the Interface like the following:
class IInterface
{
public:
virtual ~IInterface() {}
static std::tr1::shared_ptr<IInterface> create(); // implementation in .cpp
virtual void methodA() = 0;
virtual void methodB() = 0;
};
Then I would be able to use the interface in main like so:
int main()
{
std::tr1::shared_ptr<IInterface> test(IInterface::create());
}
The 1st option seems to be common practice (not to say it's right). However, the 2nd option was sourced from "Effective C++".
One of the most common reasons for using an interface is so that you can "program against an abstraction" rather than a concrete implementation.
The biggest benefit of this is that it allows changing of parts of your code while minimising the change on the remaining code.
Therefore although we don't know the full background of what you're building, I would go for the Interface / factory approach.
Having said this, in smaller applications or prototypes I often start with concrete classes until I get a feel for where/if an interface would be desirable. Interfaces can introduce a level of indirection that may just not be necessary for the scale of app you're building.
As a result in smaller apps, I find I don't actually need my own custom interfaces. Like so many things, you need to weigh up the costs and benefits specific to your situation.
There is yet another alternative which you haven't mentioned:
int main(int argc, char* argv[])
{
//...
boost::shared_ptr<IInterface> test(new AImplementation);
//...
return 0;
}
In other words, one can use a smart pointer without using a static "create" function. I prefer this method, because a "create" function adds nothing but code bloat, while the benefits of smart pointers are obvious.
There are two separate issues in your question:
1. How to manage the storage of the created object.
2. How to create the object.
Part 1 is simple - you should use a smart pointer like std::tr1::shared_ptr to prevent memory leaks that otherwise require fancy try/catch logic.
Part 2 is more complicated.
You can't just write create() in main() like you want to - you'd have to write IInterface::create(), because otherwise the compiler will be looking for a global function called create, which isn't what you want. It might seem like having the std::tr1::shared_ptr<IInterface> test initialized with the value returned by create() would do what you want, but that's not how C++ compilers work.
As to whether using a factory method on the interface is a better way to do this than just using new AImplementation(), it's possible it'd be helpful in your situation, but beware of speculative complexity - if you're writing the interface so that it always creates an AImplementation and never a BImplementation or a CImplementation, it's hard to see what the extra complexity buys you.
"Better" in what sense?
The factory method doesn't buy you much if you only plan to have, say, one concrete class. (But then again, if you only plan to have one concrete class, do you really need the interface class at all? Maybe yes, if you're using COM.) In any case, if you can foresee a small, fixed limit on the number of concrete classes, then the simpler implementation may be the "better" one, on the whole.
But if there may be many concrete classes, and if you don't want to have the base class be tightly coupled to them, then the factory pattern may be useful.
And yes, this can help reduce coupling -- if the base class provides some means for the derived classes to register themselves with the base class. This would allow the factory to know which derived classes exist, and how to create them, without needing compile-time information about them.
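One hedged sketch of what such self-registration might look like (the registry shape and names are invented for illustration, not taken from the question):
#include <map>
#include <string>

class IInterface
{
    public:
        virtual ~IInterface() {}
        virtual void methodA() = 0;

        typedef IInterface* (*Creator)();

        // Derived classes register a creation function under a name...
        static void registerCreator(std::string const& name, Creator creator)
        {
            registry()[name] = creator;
        }

        // ...so the factory can create them without compile-time knowledge of the types.
        static IInterface* create(std::string const& name)
        {
            return registry()[name]();   // a real version would check the name exists
        }

    private:
        static std::map<std::string, Creator>& registry()
        {
            static std::map<std::string, Creator> r;
            return r;
        }
};

class AImplementation : public IInterface
{
    public:
        virtual void methodA() {}
        static IInterface* make() { return new AImplementation(); }
};

// Typically run once at startup (or from a static registration object):
//   IInterface::registerCreator("A", &AImplementation::make);
//   IInterface* obj = IInterface::create("A");   // caller owns it, e.g. via a shared_ptr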
Use the 1st method. Your factory method in the 2nd option would have to be implemented per-concrete class and this is not possible to do in the interface. I.e., IInterface::create() has no idea exactly which concrete class you actually wish to instantiate.
A static method cannot be virtual, and implementing a non-static create() method in your concrete classes has not really won you anything in this case.
Factory methods are certainly useful, but this is not the correct use.
Which item in Effective C++ recommends the 2nd option? I don't see it in mine (though I don't also have the second book). That may clear up a mis-understanding.
I would go with the first option just because it's more common and more understandable. It's really up to you, but if you're working on a commercial app then I would ask my peers what they use.
I do have a very simple question here:
Are you sure you want to use a pointer?
This question might seem illogical, but people coming from a Java background use new much more often than required. In your example, creating the variable on the stack would be amply sufficient.