I am developing a class that recognizes an object in a photo. The class is composed of several components (classes). For example:
class PhotoRecognizer
{
public:
    int perform_recognition()
    {
        pPreProcessing->do_preprocessing();
        pFeatureExtractor->do_feature_extraction();
        pClassifier->do_classification();
        return 0; // placeholder: the original snippet does not show what is returned
    }
    boost::shared_ptr<PreProcessing> pPreProcessing;
    boost::shared_ptr<FeatureExtractor> pFeatureExtractor;
    boost::shared_ptr<Classifier> pClassifier;
};
In this example, when we use this class to perform recognition, we invoke the other classes PreProcessing, FeatureExtractor and Classifier. As you can imagine, there are many different ways to implement each class. For example, for the Classifier class, we can use SVMClassifier or NeuralNetworkClassifier, each of which is derived from the base Classifier class.
class SVMClassifier: public Classifier
{
public:
void do_classification();
};
Therefore, by using different elements within the PhotoRecognizer class, we can create different kinds of PhotoRecognizer. I am now building a benchmark to find out how to combine these elements into an optimal PhotoRecognizer. One solution I can think of is to use an abstract factory:
class MethodFactory
{
public:
MethodFactory(){};
boost::shared_ptr<PreProcessing> pPreProcessing;
boost::shared_ptr<FeatureExtractor> pFeatureExtractor;
boost::shared_ptr<Classifier> pClassifier;
};
class Method1:public MethodFactory
{
public:
Method1():MethodFactory()
{
pPreProcessing.reset(new GaussianFiltering);
pFeatureExtractor.reset(new FFTStatictis);
pClassifier.reset(new SVMClassifier);
}
};
class Method2:public MethodFactory
{
public:
Method2():MethodFactory()
{
pPreProcessing.reset(new MedianFiltering);
pFeatureExtractor.reset(new WaveletStatictis);
pClassifier.reset(new NearestNeighborClassifier);
}
};
class PhotoRecognizer
{
public:
    PhotoRecognizer(MethodFactory *p):pFactory(p)
    {
    }
    int perform_recognition()
    {
        pFactory->pPreProcessing->do_preprocessing();
        pFactory->pFeatureExtractor->do_feature_extraction();
        pFactory->pClassifier->do_classification();
        return 0; // placeholder: the original snippet does not show what is returned
    }
    MethodFactory *pFactory;
};
So when I use Method1 to perform photo recognition, I can simply do the following:
Method1 med;
PhotoRecognizer recogMethod1(&med);
recogMethod1.perform_recognition();
Furthermore, I can even make the class PhotoRecognizer more compact:
enum RecMethod
{
    UseMethod1, UseMethod2 // renamed so the enumerators do not hide the classes Method1 and Method2
};
class PhotoRecognizer
{
public:
    PhotoRecognizer(RecMethod method)
    {
        switch(method)
        {
        case UseMethod1:
            pFactory.reset(new Method1());
            break;
        ...
        }
    }
    boost::shared_ptr<MethodFactory> pFactory;
};
So here is my question: is the abstract factory design pattern well justified in the situation described above? Are there alternative solutions? Thanks.
As so often, there is no single "right" way to do it, and the answer depends a lot on how the project will be used. If it is only for quick tests, done once and never looked at again, go ahead and use enums if that is your heart's desire; nobody should stop you.
However, if you plan to extend the set of possible methods over time, I would discourage your second approach with enums. The reason: every time you want to add a new method you have to change the PhotoRecognizer class, so you have to read the code and remember what it is doing. If somebody else has to do it, it will take even more time.
The design with enums violates the first two principles of SOLID (https://en.wikipedia.org/wiki/SOLID_(object-oriented_design)):
Open-Closed Principle (OCP): the PhotoRecognizer class cannot be extended (by adding a new method) without modifying its code.
Single-Responsibility Principle (SRP): the PhotoRecognizer class does not only recognize the photo, but also serves as a factory for methods.
Your first approach is better, because if you defined another Method3 you could pass it to your PhotoRecognizer and use it without changing the code of the class:
//define Method3 somewhere
Method3 med;
PhotoRecognizer recogMethod3(&med);
recogMethod3.perform_recognition();
What I don't like about your approach is that you have to write a class (MethodX) for every possible combination, which might result in a lot of joyless work. I would do the following:
struct Method
{
boost::shared_ptr<PreProcessing> pPreProcessing;
boost::shared_ptr<FeatureExtractor> pFeatureExtractor;
boost::shared_ptr<Classifier> pClassifier;
};
Think of Method as a collection of slots for the different algorithms. It exists here because it is convenient to pass the preprocessing, extractor and classifier around together in this way.
And one could use a factory function:
enum PreprocessingType {pType1, pType2, ...};
enum FeatureExtractorType {feType1, feType2, ...};
enum ClassifierType {cType1, cType2, ...};
Method createMethod(PreprocessingType p, FeatureExtractorType fe, ClassifierType ct){
    Method result;
    switch(p){
    case pType1: result.pPreProcessing.reset(new Type1Preprocessing());
        break;
    ....
    }
    //the same for the other two: fe and ct
    ....
    return result;
}
You might ask: "But how about OCP?", and you would be right! One has to change createMethod to add other (new) classes. It may not be much comfort that you can still create a Method object by hand, initialize its fields with the new classes and pass it to the PhotoRecognizer constructor.
But with C++, you have a mighty tool at your disposal - the templates:
template < typename P, typename FE, typename C>
Method createMethod(){
    Method result;
    result.pPreProcessing.reset(new P());
    result.pFeatureExtractor.reset(new FE());
    result.pClassifier.reset(new C());
    return result;
}
And you are free to choose any combination you want without changing the code:
//define P1, FE22, C2 somewhere
Method medX=createMethod<P1, FE22, C2>();
PhotoRecognizer recogMethod3(&medX);
recogMethod3.perform_recognition();
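The template factory above can be sketched as a small compilable unit. This is only an illustration: the component classes and their name() member are hypothetical stand-ins (the real interfaces are not shown in the question), WaveletStatistics is a made-up stub, and std::shared_ptr is swapped in for boost::shared_ptr.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for the interfaces from the question.
struct PreProcessing    { virtual ~PreProcessing() = default;    virtual std::string name() const = 0; };
struct FeatureExtractor { virtual ~FeatureExtractor() = default; virtual std::string name() const = 0; };
struct Classifier       { virtual ~Classifier() = default;       virtual std::string name() const = 0; };

struct MedianFiltering   : PreProcessing    { std::string name() const override { return "median"; } };
struct WaveletStatistics : FeatureExtractor { std::string name() const override { return "wavelet"; } };
struct SVMClassifier     : Classifier       { std::string name() const override { return "svm"; } };

// Bundle of the three algorithm slots.
struct Method {
    std::shared_ptr<PreProcessing> pPreProcessing;
    std::shared_ptr<FeatureExtractor> pFeatureExtractor;
    std::shared_ptr<Classifier> pClassifier;
};

// Any default-constructible combination can be requested without touching this function.
template <typename P, typename FE, typename C>
Method createMethod() {
    Method result;
    result.pPreProcessing = std::make_shared<P>();
    result.pFeatureExtractor = std::make_shared<FE>();
    result.pClassifier = std::make_shared<C>();
    return result;
}

// Helper for illustration only: joins the names of the three parts with '+'.
std::string describe(const Method& m) {
    return m.pPreProcessing->name() + "+" + m.pFeatureExtractor->name() + "+" + m.pClassifier->name();
}
```

Calling createMethod<MedianFiltering, WaveletStatistics, SVMClassifier>() yields a Method whose three slots are filled with fresh instances of those types.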
There is yet another issue: What if the class PreProcessingA can not be used with the class ClassifierB? Earlier, if there was no class MethodAB nobody could use it, but now this mistake is possible.
To handle this problem, traits can be used:
template <class A, class B>
struct Together{
    static const bool can_be_used=false;
};
template <>
struct Together<class PreprocessingA, class ClassifierA>{
    static const bool can_be_used=true;
};
template < typename P, typename FE, typename C>
Method createMethod(){
    static_assert(Together<P,C>::can_be_used, "classes cannot be used together");
    Method result;
    ....
}
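A minimal sketch of the compatibility trait in isolation, assuming four empty placeholder types (PreprocessingA/B, ClassifierA/B) and a hypothetical helper checkedCombination that only carries the static_assert:

```cpp
#include <cassert>

// Placeholder algorithm types, for illustration only.
struct PreprocessingA {};
struct PreprocessingB {};
struct ClassifierA {};
struct ClassifierB {};

// By default, two parts are assumed NOT to work together.
template <class P, class C>
struct Together { static constexpr bool can_be_used = false; };

// Explicitly allow the one combination known to work.
template <>
struct Together<PreprocessingA, ClassifierA> { static constexpr bool can_be_used = true; };

// Instantiating this with a forbidden pair fails to compile.
template <typename P, typename C>
constexpr bool checkedCombination() {
    static_assert(Together<P, C>::can_be_used, "classes cannot be used together");
    return true;
}
```

With these definitions checkedCombination<PreprocessingA, ClassifierA>() compiles, while checkedCombination<PreprocessingB, ClassifierA>() is rejected at compile time with the message from the static_assert. The trade-off is that every allowed pair needs its own specialization.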
Conclusion
This approach has the following advantages:
SRP, i.e. PhotoRecognizer - only recognizes, Method - only bundles the algorithm parts and createMethod - only creates a method.
OCP, i.e. we can add new algorithms without changing the code of other classes/functions
Thanks to traits, we can detect a wrong combination of part-algorithms at compile time.
No boilerplate code / no code duplication.
PS:
You could say, why not scratch the whole Method class? One could just as well use:
template < typename P, typename FE, typename C>
class PhotoRecognizer{
    P preprocessing;
    FE featureExtractor;
    C classifier;
    ...
};
PhotoRecognizer<P1, FE22, C2> recog; // no parentheses: "recog()" would declare a function
recog.perform_recognition();
Yes, that is true. This alternative has some advantages and disadvantages; one must know more about the project to make the right trade-off. By default, though, I would go with the more SRP-compliant approach of encapsulating the part-algorithms in the Method class.
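For comparison, here is a rough sketch of that fully templated alternative. The run() member and the string-based pipeline are assumptions made only so the example compiles; the real components would work on image data.

```cpp
#include <cassert>
#include <string>

// Hypothetical components: each appends its own tag to the input string.
struct P1   { std::string run(const std::string& in) const { return in + ">P1"; } };
struct FE22 { std::string run(const std::string& in) const { return in + ">FE22"; } };
struct C2   { std::string run(const std::string& in) const { return in + ">C2"; } };

// The parts are held by value, so no heap allocation or virtual dispatch is needed.
template <typename P, typename FE, typename C>
class PhotoRecognizer {
public:
    std::string perform_recognition(const std::string& photo) const {
        return classifier.run(featureExtractor.run(preprocessing.run(photo)));
    }
private:
    P preprocessing;
    FE featureExtractor;
    C classifier;
};
```

One consequence of this design is that PhotoRecognizer<P1, FE22, C2> and any other instantiation are unrelated types, so they cannot be stored in one container without an extra common base class or type erasure.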
I've implemented an abstract factory pattern here and there. I've always regretted the decision after revisiting the code for maintenance. There is no case I can think of where one or more factory methods wouldn't have been a better idea. Therefore, I like your second approach best. Consider ditching the Method class as ead suggested. Once your testing is complete you'll have one or more factory methods that construct exactly what you want, and best of all, you and others will be able to follow the code later. For example:
std::shared_ptr<PhotoRecognizer> CreateOptimizedPhotoRecognizer()
{
auto result = std::make_shared<PhotoRecognizer>(
CreatePreProcessing(PreProcessingMethod::MedianFiltering),
CreateFeatureExtractor(FeatureExtractionMethod::WaveletStatictis),
CreateClassifier(ClassificationMethod::NearestNeighborClassifier)
);
return result;
}
Use your factory method in code like this:
auto pPhotoRecognizer = CreateOptimizedPhotoRecognizer();
Create the enumerations as you suggested. I know, I know, open/closed principle... If you keep these enumerations in one spot you won't have a problem keeping them in sync with your factory methods. First the enumerations:
enum class PreProcessingMethod { MedianFiltering, FilteringTypeB };
enum class FeatureExtractionMethod { WaveletStatictis, FeatureExtractionTypeB };
enum class ClassificationMethod { NearestNeighborClassifier, SVMClassfier, NeuralNetworkClassifer };
Here's an example of a component factory method:
std::shared_ptr<PreProcessing> CreatePreProcessing(PreProcessingMethod method)
{
std::shared_ptr<PreProcessing> result;
switch (method)
{
case PreProcessingMethod::MedianFiltering:
result = std::make_shared<MedianFiltering>();
break;
case PreProcessingMethod::FilteringTypeB:
result = std::make_shared<FilteringTypeB>();
break;
default:
break;
}
return result;
}
To determine the best combination of algorithms you'll probably want to create some automated tests that run through all the possible combinations of components. One way to do this could be as straightforward as:
for (auto preProc = static_cast<PreProcessingMethod>(0); ;
preProc = static_cast<PreProcessingMethod>(static_cast<int>(preProc) + 1))
{
auto pPreProcessing = CreatePreProcessing(preProc);
if (!pPreProcessing)
break;
for (auto feature = static_cast<FeatureExtractionMethod>(0); ;
feature = static_cast<FeatureExtractionMethod>(static_cast<int>(feature) + 1))
{
auto pFeatureExtractor = CreateFeatureExtractor(feature);
if (!pFeatureExtractor)
break;
for (auto classifier = static_cast<ClassificationMethod>(0); ;
classifier = static_cast<ClassificationMethod>(static_cast<int>(classifier) + 1))
{
auto pClassifier = CreateClassifier(classifier);
if (!pClassifier)
break;
{
auto pPhotoRecognizer = std::make_shared<PhotoRecognizer>(
pPreProcessing,
pFeatureExtractor,
pClassifier
);
auto testResults = TestRecognizer(pPhotoRecognizer);
PrintConfigurationAndResults(pPhotoRecognizer, testResults);
}
}
}
}
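The nested loops above rely on casting integers to enum values and stopping at the first null result. A possible variation, sketched here under simplifying assumptions, keeps one list of factories per component kind and walks the Cartesian product. To keep the example short, each hypothetical factory returns only a label; in the benchmark it would return a std::shared_ptr to the component.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Simplification for illustration: a factory produces a label instead of a component.
using Factory = std::function<std::string()>;

// Visits every (preprocessing, feature extractor, classifier) triple exactly once.
std::vector<std::string> allCombinations(const std::vector<Factory>& pre,
                                         const std::vector<Factory>& feat,
                                         const std::vector<Factory>& cls) {
    std::vector<std::string> out;
    for (const auto& p : pre)
        for (const auto& f : feat)
            for (const auto& c : cls)
                out.push_back(p() + "/" + f() + "/" + c());
    return out;
}
```

Adding a new algorithm then means appending one factory to the relevant list; the loop itself does not change.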
Unless you are reusing MethodFactory, I'd recommend the following:
struct Method1 {
using PreProcessing_t = GaussianFiltering;
using FeatureExtractor_t = FFTStatictis;
using Classifier_t = SVMClassifier;
};
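class PhotoRecognizer
{
public:
    template<typename Method>
    PhotoRecognizer(Method tag) {
        pPreProcessing.reset(new typename Method::PreProcessing_t());
        pFeatureExtractor.reset(new typename Method::FeatureExtractor_t());
        pClassifier.reset(new typename Method::Classifier_t());
    }
    // members as declared in the question
    boost::shared_ptr<PreProcessing> pPreProcessing;
    boost::shared_ptr<FeatureExtractor> pFeatureExtractor;
    boost::shared_ptr<Classifier> pClassifier;
};
Usage:
PhotoRecognizer recognizer{Method1()}; // braces: "PhotoRecognizer(Method1());" would be parsed as a function declaration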
Related
I wonder if it is possible to generate a set of types from an enum class for metaprogramming purposes.
I'm originally a C# programmer and used to relying on attributes for reflection and metaprogramming. For example, I commonly write a snippet like this in C#:
public enum ComponentEnum { Component1, Component2, Component3 }
[Component(ComponentEnum.Component1)]
public class Component1
{
/* Some code */
}
public static class ComponentsMeta
{
private static Dictionary<Type, ComponentEnum> map;
static ComponentsMeta() { /* process the whole codebase via reflection, search for Component-marked classes and fill the map */ }
public static bool IsComponent<T>() => map.ContainsKey(typeof(T));
public static int GetComponentUID<T>() => (int)map[typeof(T)];
}
Of course, it is a very basic snippet without asserts and some other stuff, but I believe you get the idea.
I want the same behavior in C++. What exactly I want is a type called Components that contains utility functions like bool Components::isComponent<T>() or size_t Components::getComponentUID<T>() and related stuff. The best way I've seen so far is to write it myself, making a metaclass like
template <typename... Ts>
class ComponentsData
{
/* functions impl here */
};
typedef ComponentsData<C1, C2, C3> Components;
So, now I can ask Components<C1>::getComponentUID() and it returns the uid of that component (whether it depends on its position as a template parameter or on a constexpr value of that component doesn't matter). But this is very inconvenient, and I wonder whether I can put a macro inside the component class, or use attributes and a code generation step, or something similar. In other words, my goal is to somehow mark a class as belonging to that set of components and use that later. What can C++ offer for that purpose?
It would be fine if I could do something like the C# way: make an enum class, list all the components there, and write a constexpr value inside the component class (or somewhere near the enum class; either way is fine with me).
I mean something like that:
/* ComponentsEnum.h */
enum class ComponentsEnum { Comp1, Comp2, Comp3 };
// Here is some magic to generate Components<C1, C2, C3> metaclass.
/* another file */
#include "ComponentsEnum.h"
struct C1 { const ComponentsEnum MyValue = ComponentsEnum::Comp1; };
or something like that
/* ComponentsEnum.h */
enum class ComponentsEnum { Comp1, Comp2, Comp3 };
// Here is all the magic
// All enum members concats into `Components<Comp1, Comp2, Comp3, ...>`
ConcatAll<ComponentsEnum>();
/* another file */
#include "ComponentsEnum.h"
struct Comp1 { };
or maybe something with macro magic:
/* ComponentsEnum.h */
enum class ComponentsEnum { Comp1, Comp2, Comp3 };
#define InitMeta(ComponentsEnumMember) /* Some Magic */
/* another file */
#include "ComponentsEnum.h"
struct Comp1 { InitMeta(ComponentsEnum::Comp1) };
Thanks in advance!
Following on my comment.
You could do something like this in C++17:
// In register.hpp
int register_me();
// In register.cpp
int register_me(){
static int id = 0;
return id++;
}
// In wherever.hpp
// #include "register.hpp"
struct component{
inline static int id = register_me();
};
Pre-C++17 requires moving the definition and initialization to a .cpp for each component::id.
But I strongly recommend not using this. Rethink your design; converting types to IDs is a code smell to me. C++ is not really designed to do such things, and it can come back to haunt you later.
The code above relies on dynamic initialization of all static variables at the start of the program. The order is unspecified, so each compilation might assign different IDs.
Definitely do not put this into any shared libraries before being 100% sure you know how the compilation, linking, and loading processes work for your toolchain because these are outside the scope of C++ Standard.
Thanks to @JerryJeremiah's link and @Quimby's advice, I found the solution.
I was misled by my C# habits; the idea is quite simple but tricky.
The key difference between C# generics and C++ templates is that generics are instantiated at run time, whereas templates are instantiated at compile time. So I do not need to create a map or process the whole codebase; everything I need is generated by templates at compile time.
The solution itself:
I want an enum to generate continuous uid numbers for my components. So, define it:
enum class ComponentEnum
{
C1,
C2,
C3
};
I want a simple interface for my Components to ask for meta information. Define it too:
struct Components
{
template<typename T>
static bool isComponent() { /* Some stuff here */ }
template<typename T>
static int getComponentUID() { /* Some stuff here */ }
};
Now I can ask uid with one simple generalized call Components::getComponentUID<MyComponent>(). Nice.
The real magic. I've created a template metaclass and a macro that creates a typedef and some additional methods:
template <typename T, ComponentEnum enumMember>
struct ComponentMeta
{
static constexpr bool isComponent = true;
static constexpr int uid = static_cast<int>(enumMember);
};
#define ComponentMetaMacro(type_name, enum_name) typedef ComponentMeta<type_name, ComponentEnum::enum_name> Meta; \
static const char* toString() { return #type_name; }
So I can fill methods from my interface with simple forwarding to that metaclass:
struct Components
{
template<typename T>
static bool isComponent() { return T::Meta::isComponent; }
template<typename T>
static int getComponentUID() { return T::Meta::uid; }
};
All that is left is to include the header with the metaclass and the macro, and to invoke the macro:
struct C1
{
ComponentMetaMacro(C1, C1)
};
struct C2
{
ComponentMetaMacro(C2, C2)
};
Run a few tests:
std::cout << C1::toString() << ": " << Components::getComponentUID<C1>() << std::endl;
std::cout << C2::toString() << ": " << Components::getComponentUID<C2>() << std::endl;
C1: 0
C2: 1
Yay!
This solution has three main problems:
isComponent() acts as a static assert instead of a flag. I mean, the code won't compile if the type T is not a component. That is acceptable, but it smells.
The metadata is linked in one direction only. I can't get a component type from the index, only an index from the type. For serialization purposes a backlink could be useful.
I have to include the enum class in every component header. This means a big compile-time hit whenever I add a new enum member. I suppose there is a way to avoid it, but I can't see one. The only purpose of the enum class is to give every component the smallest possible index that stays stable between compilations. Maybe I should think about code generation or another approach, but for a small project it is OK.
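The first problem (isComponent() failing to compile for non-components) can probably be addressed with the detection idiom. The following is a sketch, not part of the original solution: HasMeta and the local void_t helper are names introduced here, and the Meta typedef is written out by hand instead of through the macro.

```cpp
#include <cassert>
#include <type_traits>

enum class ComponentEnum { C1, C2, C3 };

template <typename T, ComponentEnum enumMember>
struct ComponentMeta {
    static constexpr int uid = static_cast<int>(enumMember);
};

// C++11-compatible equivalent of C++17 std::void_t.
template <typename... Ts> struct make_void { using type = void; };
template <typename... Ts> using void_t = typename make_void<Ts...>::type;

// Primary template: used when T has no nested Meta type.
template <typename T, typename = void>
struct HasMeta : std::false_type {};

// Partial specialization: selected only when T::Meta names a type.
template <typename T>
struct HasMeta<T, void_t<typename T::Meta>> : std::true_type {};

struct Components {
    // Now a real flag: false for non-components instead of a compile error.
    template <typename T>
    static constexpr bool isComponent() { return HasMeta<T>::value; }

    template <typename T>
    static constexpr int getComponentUID() {
        static_assert(HasMeta<T>::value, "T is not a component");
        return T::Meta::uid;
    }
};

struct C1 { using Meta = ComponentMeta<C1, ComponentEnum::C1>; };
struct C2 { using Meta = ComponentMeta<C2, ComponentEnum::C2>; };
struct NotAComponent {};
```

Here Components::isComponent<NotAComponent>() simply evaluates to false, while getComponentUID<NotAComponent>() still fails at compile time with a readable message.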
I'm applying the Factory design pattern in my C++ project, and below you can see how I am doing it. I am trying to improve my code by following the "anti-if" campaign, so I want to remove the if statements that I have. Any idea how I can do it?
typedef std::map<std::string, Chip*> ChipList;
Chip* ChipFactory::createChip(const std::string& type) {
ChipList::iterator existing = Chips.find(type);
if (existing != Chips.end()) {
return (existing->second);
}
if (type == "R500") {
return Chips[type] = new ChipR500();
}
if (type == "PIC32F42") {
return Chips[type] = new ChipPIC32F42();
}
if (type == "34HC22") {
return Chips[type] = new Chip34HC22();
}
return 0;
}
I would imagine creating a map, with string as the key, and the constructor (or something to create the object). After that, I can just get the constructor from the map using the type (type are strings) and create my object without any if. (I know I'm being a bit paranoid, but I want to know if it can be done or not.)
You are right, you should use a map from key to creation-function.
In your case it would be
typedef Chip* tCreationFunc();
std::map<std::string, tCreationFunc*> microcontrollers;
For each new Chip-derived class ChipXXX, add a static function:
static Chip* CreateInstance()
{
return new ChipXXX();
}
and also register this function in the map.
Your factory function should be something like this:
Chip* ChipFactory::createChip(const std::string& type)
{
    std::map<std::string, tCreationFunc*>::iterator existing = microcontrollers.find(type);
    if (existing != microcontrollers.end())
        return existing->second();
    return NULL;
}
Note that copy constructor is not needed, as in your example.
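Put together, the map of creation functions might look like the sketch below. The registerChip member, the createInstance helper template, the name() member and the use of std::unique_ptr are assumptions added for illustration; unlike the code in the question, this version does not cache instances.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

struct Chip {
    virtual ~Chip() = default;
    virtual std::string name() const = 0;
};
struct ChipR500     : Chip { std::string name() const override { return "R500"; } };
struct ChipPIC32F42 : Chip { std::string name() const override { return "PIC32F42"; } };

class ChipFactory {
public:
    using CreationFunc = std::unique_ptr<Chip> (*)();

    // Hypothetical registration hook: associates a type name with its creator.
    void registerChip(const std::string& type, CreationFunc f) { creators_[type] = f; }

    // Looks up the creator; returns nullptr for an unknown type.
    std::unique_ptr<Chip> createChip(const std::string& type) const {
        auto it = creators_.find(type);
        if (it == creators_.end())
            return nullptr;
        return it->second();
    }

private:
    std::map<std::string, CreationFunc> creators_;
};

// One creation function per chip type, generated by the compiler.
template <typename T>
std::unique_ptr<Chip> createInstance() { return std::unique_ptr<Chip>(new T()); }
```

After factory.registerChip("R500", &createInstance<ChipR500>), a call to factory.createChip("R500") returns a new ChipR500. The per-type chain of if statements is gone; only the "not found" check remains.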
The point of the factory is not to get rid of the ifs, but to put them in a separate place of your real business logic code and not to pollute it. It is just a separation of concerns.
If you're desperate, you could write a jump table/clone() combo that would do this job with no if statements.
class Factory {
    struct ChipFunctorBase {
        virtual ~ChipFunctorBase() {}
        virtual Chip* Create() = 0;
    };
    template<typename T> struct CreateChipFunctor : ChipFunctorBase {
        Chip* Create() { return new T; }
    };
    std::unordered_map<std::string, std::unique_ptr<ChipFunctorBase>> jumptable;
public:
    Factory() {
        // unique_ptr cannot be assigned from a raw pointer, hence reset()
        jumptable["R500"].reset(new CreateChipFunctor<ChipR500>());
        jumptable["PIC32F42"].reset(new CreateChipFunctor<ChipPIC32F42>());
        jumptable["34HC22"].reset(new CreateChipFunctor<Chip34HC22>());
    }
    Chip* CreateNewChip(const std::string& type) {
        // find() avoids inserting an empty entry for an unknown type
        auto it = jumptable.find(type);
        if (it != jumptable.end())
            return it->second->Create();
        else
            return nullptr;
    }
};
However, this kind of approach only becomes valuable when you have large numbers of different Chip types. For just a few, it's more useful just to write a couple of ifs.
Quick note: I've used std::unordered_map and std::unique_ptr, which may not be part of your STL, depending on how new your compiler is. Replace with std::map/boost::unordered_map, and std::/boost::shared_ptr.
No, you cannot get rid of the ifs. The createChip method creates a new instance depending on the constant (type name) you pass as an argument.
But you may optimize your code a little by moving these two lines out of the if statements:
microcontrollers[type] = newController;
return microcontrollers[type];
To answer your question: Yes, you should make a factory with a map to functions that construct the objects you want. The objects constructed should supply and register that function with the factory themselves.
There is some reading on the subject in several other SO questions as well, so I'll let you read that instead of explaining it all here.
Generic factory in C++
Is there a way to instantiate objects from a string holding their class name?
You can have ifs in a factory - just don't have them littered throughout your code.
struct Chip{
};
struct ChipR500 : Chip{};
struct PIC32F42 : Chip{};
struct ChipCreator{
virtual Chip *make() = 0;
};
struct ChipR500Creator : ChipCreator{
Chip *make(){return new ChipR500();}
};
struct PIC32F42Creator : ChipCreator{
Chip *make(){return new PIC32F42();}
};
int main(){
ChipR500Creator m; // client code knows only the factory method interface, not the actual concrete products
Chip *p = m.make();
}
What you are asking for, essentially, is called Virtual Construction, i.e. the ability to build an object whose type is only known at runtime.
Of course C++ doesn't allow constructors to be virtual, so this requires a bit of trickery. The common OO-approach is to use the Prototype pattern:
class Chip
{
public:
virtual Chip* clone() const = 0;
};
class ChipA: public Chip
{
public:
virtual ChipA* clone() const { return new ChipA(*this); }
};
And then instantiate a map of these prototypes and use it to build your objects (std::map<std::string,Chip*>). Typically, the map is instantiated as a singleton.
The other approach, as has been illustrated so far, is similar and consists of registering creation functions directly rather than objects. Which one you choose may come down to personal preference, but registering functions is generally slightly faster (not by much, you just avoid a virtual dispatch) and the memory is easier to handle (you don't have to delete pointers to functions).
What you should pay attention to, however, is memory management. You don't want to leak, so make sure to use RAII idioms.
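As one way to combine the Prototype approach with RAII, the sketch below stores the prototypes in std::unique_ptr and has clone() return a std::unique_ptr as well. PrototypeRegistry and the name() member are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

class Chip {
public:
    virtual ~Chip() = default;
    // Returning unique_ptr makes ownership of the copy explicit.
    virtual std::unique_ptr<Chip> clone() const = 0;
    virtual std::string name() const = 0;
};

class ChipA : public Chip {
public:
    std::unique_ptr<Chip> clone() const override {
        return std::unique_ptr<Chip>(new ChipA(*this));
    }
    std::string name() const override { return "ChipA"; }
};

// Owns one prototype per key and hands out copies of it.
class PrototypeRegistry {
public:
    void add(const std::string& key, std::unique_ptr<Chip> prototype) {
        prototypes_[key] = std::move(prototype);
    }

    // Returns a fresh copy of the prototype, or nullptr if the key is unknown.
    std::unique_ptr<Chip> create(const std::string& key) const {
        auto it = prototypes_.find(key);
        if (it == prototypes_.end())
            return nullptr;
        return it->second->clone();
    }

private:
    std::map<std::string, std::unique_ptr<Chip>> prototypes_;
};
```

The price of returning std::unique_ptr<Chip> is that the covariant return type (ChipA* in the snippet above) is lost; callers that need the concrete type have to cast.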
I have a set of classes that describe a set of logical boxes that can hold things and do things to them. I have
struct IBox // all boxes do these
{
....
};
struct IBoxCanDoX // the power to do X
{
void x();
};
struct IBoxCanDoY // the power to do Y
{
void y();
};
I wonder what the 'best' (or maybe it's just the 'favorite') idiom is for a client of these classes to deal with these optional capabilities:
a)
if(typeid(box) == typeid(IBoxCanDoX))
{
IBoxCanDoX *ix = static_cast<IBoxCanDoX*>(box);
ix->x();
}
b)
IBoxCanDoX *ix = dynamic_cast<IBoxCanDoX*>(box);
if(ix)
{
ix->x();
}
c)
if(box->canDoX())
{
IBoxCanDoX *ix = static_cast<IBoxCanDoX*>(box);
ix->x();
}
d) different class struct now
struct IBox
{
void x();
void y();
};
...
box->x(); /// ignored by implementations that dont do x
e) same except
box->x() // 'not implemented' exception thrown
f) explicit test function
if(box->canDoX())
{
box->x();
}
I am sure there are others too.
EDIT:
Just to make the use case clearer
I am exposing this stuff to end users via an interactive UI. They can type 'make box do X'. I need to know whether the box can do X, or I need to disable the 'make current box do X' command.
EDIT2: Thanks to all answerers.
As Noah Roberts pointed out, (a) doesn't work (which explains some of my issues!).
I ended up doing (b) and a slight variant:
template<class T>
T* GetCurrentBox()
{
if (!current_box)
throw "current box not set";
T* ret = dynamic_cast<T*>(current_box);
if(!ret)
throw "current box doesnt support requested operation";
return ret;
}
...
IBoxCanDoX *ix = GetCurrentBox<IBoxCanDoX>();
ix->x();
and let the UI plumbing deal nicely with the exceptions (I am not really throwing naked strings).
I also intend to explore Visitor
I suggest the Visitor pattern for double-dispatch problems like this in C++:
class IVisitor
{
public:
virtual void Visit(IBoxCanDoX *pBox) = 0;
virtual void Visit(IBoxCanDoY *pBox) = 0;
virtual void Visit(IBox* pBox) = 0;
};
class IBox // all boxes do these
{
public:
virtual void Accept(IVisitor *pVisitor)
{
pVisitor->Visit(this);
}
};
class BoxCanDoY : public IBox
{
public:
virtual void Accept(IVisitor *pVisitor)
{
pVisitor->Visit(this);
}
};
class TestVisitor : public IVisitor
{
public:
// override visit methods to do tests for each type.
};
void Main()
{
BoxCanDoY y;
TestVisitor v;
y.Accept(&v);
}
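A compilable variant of this idea is sketched below. It deviates from the snippet above in a few labeled ways: the visitor overloads take the concrete box types (BoxX and BoxY, names invented here) by reference, Accept is pure virtual so that every box must implement it, and CapabilityVisitor is a hypothetical visitor that records which operation was performed.

```cpp
#include <cassert>
#include <string>

class BoxX;
class BoxY;

class IVisitor {
public:
    virtual ~IVisitor() = default;
    virtual void Visit(BoxX& box) = 0;
    virtual void Visit(BoxY& box) = 0;
};

class IBox {
public:
    virtual ~IBox() = default;
    virtual void Accept(IVisitor& visitor) = 0;
};

class BoxX : public IBox {
public:
    // *this has static type BoxX here, so the BoxX overload is chosen.
    void Accept(IVisitor& visitor) override { visitor.Visit(*this); }
    std::string x() { return "x"; }
};

class BoxY : public IBox {
public:
    void Accept(IVisitor& visitor) override { visitor.Visit(*this); }
    std::string y() { return "y"; }
};

// Calls whichever operation the visited box supports and stores its result.
class CapabilityVisitor : public IVisitor {
public:
    void Visit(BoxX& box) override { result = box.x(); }
    void Visit(BoxY& box) override { result = box.y(); }
    std::string result;
};
```

The client only holds an IBox reference and never casts; the first dispatch (Accept) selects the box type and the second (Visit) selects the operation.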
Of the options you've given, I'd say that b or d are "best". However, the need to do a lot of this sort of thing is often indicative of a poor design, or of a design that would be better implemented in a dynamically typed language rather than in C++.
If you are using the 'I' prefix to mean "interface" as it would mean in Java, which would be done with abstract bases in C++, then your first option will fail to work....so that one's out. I have used it for some things though.
Don't do 'd', it will pollute your hierarchy. Keep your interfaces clean, you'll be glad you did. Thus a Vehicle class doesn't have a pedal() function because only some vehicles can pedal. If a client needs the pedal() function then it really does need to know about those classes that can.
Stay well clear of 'e' for the same reason as 'd', PLUS it violates the Liskov Substitution Principle. If a client needs to check that a class responds to pedal() before calling it so that it doesn't explode, then the best way to do that is to attempt casting to a type that has that function. 'f' is just the same thing with the check.
'c' is superfluous. If you have your hierarchy set up the way it should be, then casting to IBoxCanDoX is sufficient to check whether the box can do x().
Thus 'b' becomes your answer from the options given. However, as Gladfelter demonstrates, there are options you haven't considered in your post.
Edit note: I did not notice that 'c' used a static_cast rather than dynamic. As I mention in an answer about that, the dynamic_cast version is cleaner and should be preferred unless specific situations dictate otherwise. It's similar to the following options in that it pollutes the base interface.
Edit 2: I should note that in regard to 'a', I have used it but I don't use types statically like you have in your post. Any time I've used typeid to split flow based on type it has always been based on something that is registered during runtime. For example, opening the correct dialog to edit some object of unknown type: the dialog governors are registered with a factory based on the type they edit. This keeps me from having to change any of the flow control code when I add/remove/change objects. I generally wouldn't use this option under different circumstances.
A and B require run-time type identification (RTTI) and might be slower if you are doing a lot of checks. Personally I don't like the "canDoX" solutions; if situations like this arise, the design probably needs an upgrade because you are exposing information that is not relevant to the class.
If you only need to execute X or Y, depending on the class, I would go for a virtual method in IBox which gets overridden in subclasses.
class IBox{
public:
    virtual void doThing();
};
class IBoxCanDoX: public IBox{
public:
    void doThing() { doX(); }
    void doX();
};
class IBoxCanDoY: public IBox{
public:
    void doThing() { doY(); }
    void doY();
};
box->doThing();
If that solution is not applicable or you need more complex logic, then look at the Visitor design pattern. But keep in mind that the visitor pattern is not very flexible when you add new classes regularly or when methods are changed, added or removed (but that also holds true for your proposed alternatives).
If you are trying to call either of these classes' actions from contingent parts of the code, I would suggest wrapping that code in a template function and giving each class's method the same name to implement duck typing. Your client code would then look like this:
template<class box>
void box_do_xory(box BOX){
BOX.xory();
}
There is no general answer to your question. Everything depends. I can only say that:
- don't use a), use b) instead
- b) is nice, requires the least code and no dummy methods, but dynamic_cast is a little slow
- c) is similar to b) but it is faster (no dynamic_cast) and requires more memory
- e) makes no sense; you still need to discover whether you can call the method so that the exception is not thrown
- d) is better than f) (less code to write)
- d), e) and f) produce more garbage code than the others, but are faster and consume less memory
I assume that you will not only be working with one object of one type here.
I would lay out the data that you are working with and try to see how you can arrange it in memory in order to do data-driven programming. A good layout in memory should reflect the way that you store the data in your classes and how the classes are laid out in memory. Once you have that basic design structured (it shouldn't take more than a napkin), I would begin organizing the objects into lists depending on the operations that you plan to do on the data. If you plan to do X() on a collection of objects { Y } in the subset X, I would probably make sure to have a static array of Y that I create from the beginning. If you wish to access the whole of X occasionally, that can be arranged by collecting the lists into a dynamic list of pointers (using std::vector or your favorite choice).
I hope that makes sense, but once implemented it gives simple straight solutions that are easy to understand and easy to work with.
There is a generic way to test whether a class supports a certain concept and then to execute the most appropriate code. It uses a SFINAE hack. This example is inspired by Abrahams and Gurtovoy's "C++ Template Metaprogramming" book. The function doIt will use the x method if it is present; otherwise it will use the y method. You can extend the CanDo structure to test for other methods as well. You can test as many methods as you wish, as long as the overloads of doIt can be resolved uniquely.
#include <iostream>
#include <boost/config.hpp>
#include <boost/utility/enable_if.hpp>
typedef char yes; // sizeof(yes) == 1
typedef char (&no)[2]; // sizeof(no) == 2
template<typename T>
struct CanDo {
template<typename U, void (U::*)()>
struct ptr_to_mem {};
template<typename U>
static yes testX(ptr_to_mem<U, &U::x>*);
template<typename U>
static no testX(...);
BOOST_STATIC_CONSTANT(bool, value = sizeof(testX<T>(0)) == sizeof(yes));
};
struct DoX {
void x() { std::cout << "doing x...\n"; }
};
struct DoAnotherX {
void x() { std::cout << "doing another x...\n"; }
};
struct DoY {
void y() { std::cout << "doing y...\n"; }
};
struct DoAnotherY {
void y() { std::cout << "doing another y...\n"; }
};
template <typename Action>
typename boost::enable_if<CanDo<Action> >::type
doIt(Action* a) {
a->x();
}
template <typename Action>
typename boost::disable_if<CanDo<Action> >::type
doIt(Action* a) {
a->y();
}
int main() {
DoX doX;
DoAnotherX doAnotherX;
DoY doY;
DoAnotherY doAnotherY;
doIt(&doX);
doIt(&doAnotherX);
doIt(&doY);
doIt(&doAnotherY);
}
I have something like the following in the header
class MsgBase
{
public:
unsigned int getMsgType() const { return type_; }
...
private:
enum Types { MSG_DERIVED_1, MSG_DERIVED_2, ... MSG_DERIVED_N };
unsigned int type_;
...
};
class MsgDerived1 : public MsgBase { ... };
class MsgDerived2 : public MsgBase { ... };
...
class MsgDerivedN : public MsgBase { ... };
and is used as
MsgBase msgHeader;
// peeks into the input stream to grab the
// base class that has the derived message type
// non-destructively
inputStream.deserializePeek( msgHeader );
unsigned int msgType = msgHeader.getMsgType();
MsgDerived1 msgDerived1;
MsgDerived2 msgDerived2;
...
MsgDerivedN msgDerivedN;
switch( msgType )
{
case MSG_DERIVED_1:
// fills out msgDerived1 from the inputStream
// destructively
inputStream.deserialize( msgDerived1 );
/* do MsgDerived1 processing */
break;
case MSG_DERIVED_2:
inputStream.deserialize( msgDerived2 );
/* do MsgDerived2 processing */
break;
...
case MSG_DERIVED_N:
inputStream.deserialize( msgDerivedN );
/* do MsgDerivedN processing */
break;
}
This seems like the type of situation which would be fairly common and well suited to refactoring. What would be the best way to apply design patterns (or basic C++ language feature redesign) to refactor this code?
I have read that the Command pattern is commonly used to refactor switch statements but that seems only applicable when choosing between algorithms to do a task. Is this a place where the factory or abstract factory pattern is applicable (I am not very familiar with either)? Double dispatch?
I've tried to leave out as much inconsequential context as possible but if I missed something important just let me know and I'll edit to include it. Also, I could not find anything similar but if this is a duplicate just redirect me to the appropriate SO question.
You could use a Factory Method pattern that creates the correct implementation of the base class (derived class) based on the value you peek from the stream.
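A minimal sketch of that idea, with several assumptions: the enum is moved out of the base class, process() is a made-up virtual standing in for the per-message handling, and the stream deserialization is left out entirely.

```cpp
#include <cassert>
#include <memory>
#include <string>

enum MsgType { MSG_DERIVED_1, MSG_DERIVED_2 };

class MsgBase {
public:
    virtual ~MsgBase() {}
    virtual std::string process() = 0;  // hypothetical per-message handling
};

class MsgDerived1 : public MsgBase {
public:
    std::string process() { return "processed MsgDerived1"; }
};

class MsgDerived2 : public MsgBase {
public:
    std::string process() { return "processed MsgDerived2"; }
};

// Factory Method: the only place that maps a type code to a concrete class.
std::unique_ptr<MsgBase> createMessage(unsigned int msgType) {
    switch (msgType) {
    case MSG_DERIVED_1: return std::unique_ptr<MsgBase>(new MsgDerived1);
    case MSG_DERIVED_2: return std::unique_ptr<MsgBase>(new MsgDerived2);
    default:            return std::unique_ptr<MsgBase>();  // unknown type
    }
}
```

The calling code then shrinks to peeking the type, calling createMessage, checking for a null result, and invoking process() on whatever comes back.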
The switch isn't all bad. It's one way to implement the factory pattern. It's easily testable, it makes it easy to understand the entire range of available objects, and it's good for coverage testing.
Another technique is to build a mapping between your enum types and factories to make the specific objects from the data stream. This turns the compile-time switch into a run-time lookup. The mapping can be built at run-time, making it possible to add new types without recompiling everything.
// You'll have multiple Factories, all using this signature.
typedef MsgBase *(*Factory)(StreamType &);
// For example:
MsgBase *CreateDerived1(StreamType &inputStream) {
MsgDerived1 *ptr = new MsgDerived1;
inputStream.deserialize(*ptr); // deserialize takes the message itself, as in the question
return ptr;
}
std::map<Types, Factory> knownTypes;
knownTypes[MSG_DERIVED_1] = CreateDerived1;
// Then, given the type, you can instantiate the correct object:
MsgBase *object = (*knownTypes[type])(inputStream);
...
delete object;
Pull Types and type_ out of MsgBase; they don't belong there.
If you want to get totally fancy, register all of your derived types with the factory along with the token (e.g. 'type') that the factory will use to know what to make. Then, the factory looks up that token on deserialize in its table, and creates the right message.
class DerivedMessage : public Message
{
public:
static Message* Create(Stream&);
bool Serialize(Stream&);
private:
static bool isRegistered;
};
// sure, turn this into a macro, use a singleton, whatever you like
bool DerivedMessage::isRegistered =
g_messageFactory.Register(Hash("DerivedMessage"), DerivedMessage::Create);
etc. The Create static method allocates a new DerivedMessage and deserializes it, the Serialize method writes the token (in this case, Hash("DerivedMessage")) and then serializes itself. One of them should probably test isRegistered so that it doesn't get dead stripped by the linker.
(Notably, this method doesn't require an enum or other "static list of everything that can ever exist". At this time I can't think of another method that doesn't require circular references to some degree.)
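To make the registration idea concrete, here is a small self-contained sketch. It is simplified under stated assumptions: the token is a plain string rather than Hash("DerivedMessage"), Create takes no Stream, name() is a made-up method used only to show which class was built, and the factory is reached through a hypothetical MessageFactory::instance() instead of a g_messageFactory global so that static initialization order is not an issue.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

class Message {
public:
    virtual ~Message() {}
    virtual std::string name() const = 0;  // illustrative only
};

class MessageFactory {
public:
    typedef Message* (*Creator)();

    // Function-local static: constructed on first use, before any registration.
    static MessageFactory& instance() {
        static MessageFactory factory;
        return factory;
    }

    // Returns true so the result can initialize a static flag.
    bool registerType(const std::string& token, Creator creator) {
        creators_[token] = creator;
        return true;
    }

    // Returns a null pointer when the token was never registered.
    std::unique_ptr<Message> create(const std::string& token) const {
        std::map<std::string, Creator>::const_iterator it = creators_.find(token);
        if (it == creators_.end()) {
            return std::unique_ptr<Message>();
        }
        return std::unique_ptr<Message>(it->second());
    }

private:
    std::map<std::string, Creator> creators_;
};

class DerivedMessage : public Message {
public:
    static Message* Create() { return new DerivedMessage; }
    std::string name() const { return "DerivedMessage"; }
private:
    static bool isRegistered;
};

// Runs during static initialization and adds the type to the factory.
bool DerivedMessage::isRegistered =
    MessageFactory::instance().registerType("DerivedMessage", DerivedMessage::Create);
```

The factory never names DerivedMessage, so adding a new message type only means adding a new class with its own registration line.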
It's generally a bad idea for a base class to have knowledge about derived classes, so a redesign is definitely in order. A factory pattern is probably what you want here as you already noted.
I have a question, though it is not limited to C++: how can I return totally different classes from one function?
f() {
in case one: return A;
in case two: return B;
in case three: return C;
}
For example, I have two balls in space. Depending on their positions and sizes, there are three ways the two balls can intersect: not at all, at a point, or in a circle. How can I return a different class for each case from one function?
Thanks.
If you can afford Boost then this sounds like a perfect application for Boost.Variant.
struct NoIntersection {
// empty
};
struct Point {
// whatever
};
struct Circle {
// whatever
};
typedef boost::variant<NoIntersection, Point, Circle> IntersectionResult;
IntersectionResult intersection_test() {
if(some_condition){
return NoIntersection();
}
if(other_condition){
return Point(x, y);
}
if(another_condition){
return Circle(c, r);
}
throw std::runtime_error("unexpected");
}
You then process your result with a static visitor:
struct process_result_visitor : public boost::static_visitor<> {
// operator() is const so the visitor can be passed as a temporary
void operator()(NoIntersection) const {
std::cout << "there was no intersection\n";
}
void operator()(Point const &pnt) const {
std::cout << "there was a point intersection\n";
}
void operator()(Circle const &circle) const {
std::cout << "there was a circle intersection\n";
}
};
IntersectionResult result = intersection_test();
boost::apply_visitor(process_result_visitor(), result);
EDIT: The visitor class must derive from boost::static_visitor
UPDATE: Prompted by some critical comments I've written a little benchmark program. Four approaches are compared:
boost::variant
union
class hierarchy
boost::any
These are the results on my home computer, when I compile in release mode with default optimizations (VC08):
test with boost::variant took 0.011 microseconds
test with union took 0.012 microseconds
test with hierarchy took 0.227 microseconds
test with boost::any took 0.188 microseconds
Using boost::variant is faster than a union and leads (IMO) to the most elegant code. I'd guess that the extremely poor performance of the class hierarchy approach is due to the need to use dynamic memory allocations and dynamic dispatch. boost::any is neither fast nor especially elegant so I wouldn't consider it for this task (it has other applications though)
The classes you want to return should be derived from a common base class, so that you can return the base type. For example (this is not real code, just an outline of the pattern; you can use an interface if your language supports that abstraction, or an abstract class. In C++ you will have to return a pointer to the common class):
class A : public Common
{
..
}
class B : public Common
{
..
}
class C : public Common
{
..
}
Common f() {
in case one: return A;
in case two: return B;
in case three: return C;
}
In addition to @Manuel's Boost.Variant suggestion, take a look at Boost.Any: it has a similar purpose to Boost.Variant but different tradeoffs and functionality.
boost::any is unbounded (it can hold any type), while boost::variant is bounded (the supported types are encoded in the variant type, so it can hold only values of those types).
// from Beyond the C++ Standard Library: An Introduction to Boost
// By Björn Karlsson
#include <algorithm> // std::for_each
#include <iostream>
#include <string>
#include <utility>
#include <vector>
#include "boost/any.hpp"
class A {
public:
void some_function() { std::cout << "A::some_function()\n"; }
};
class B {
public:
void some_function() { std::cout << "B::some_function()\n"; }
};
class C {
public:
void some_function() { std::cout << "C::some_function()\n"; }
};
int main() {
std::cout << "Example of using any.\n\n";
std::vector<boost::any> store_anything;
store_anything.push_back(A());
store_anything.push_back(B());
store_anything.push_back(C());
// While we're at it, let's add a few other things as well
store_anything.push_back(std::string("This is fantastic! "));
store_anything.push_back(3);
store_anything.push_back(std::make_pair(true, 7.92));
void print_any(boost::any& a);
// Defined later; reports on the value in a
std::for_each(
store_anything.begin(),
store_anything.end(),
print_any);
}
void print_any(boost::any& a) {
if (A* pA=boost::any_cast<A>(&a)) {
pA->some_function();
}
else if (B* pB=boost::any_cast<B>(&a)) {
pB->some_function();
}
else if (C* pC=boost::any_cast<C>(&a)) {
pC->some_function();
}
}
In order to be able to do anything useful with the result, you have to return an object which has a common base class. In your case you might want to let A, B, and C inherit from a common "intersection class": a class shared by all objects that represent some form of intersection. Your function f would then return an object of this type.
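As a rough sketch of what that could look like in C++: the class names, the describe() method, and the contacts parameter (a stand-in for the real geometric test) are all made up for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical common base for every kind of intersection result.
class Intersection {
public:
    virtual ~Intersection() {}
    virtual std::string describe() const = 0;
};

class NoIntersection : public Intersection {
public:
    std::string describe() const { return "no intersection"; }
};

class PointIntersection : public Intersection {
public:
    std::string describe() const { return "point"; }
};

class CircleIntersection : public Intersection {
public:
    std::string describe() const { return "circle"; }
};

// One declared return type; the dynamic type differs per case.
std::unique_ptr<Intersection> f(int contacts) {
    if (contacts == 0) {
        return std::unique_ptr<Intersection>(new NoIntersection);
    }
    if (contacts == 1) {
        return std::unique_ptr<Intersection>(new PointIntersection);
    }
    return std::unique_ptr<Intersection>(new CircleIntersection);
}
```

The caller works only with the Intersection interface, so anything it needs from the result has to be expressed as a virtual function on the base class.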
The classes you want to return should have a common parent class or interface.
If those classes have nothing in common (which, I suppose, is not the case here), you can return object.
This feature is also known as polymorphism.
In C++, a base class pointer can point to a derived class object. We can make use of this fact to write a function that meets your requirements:
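class shape {
public:
virtual ~shape() {} // lets callers delete through a shape*
};
class circle: public shape
{};
class square: public shape
{};
shape* function(int i){ // function returning a base class pointer.
switch(i) {
case 1: return new circle();
case 2: return new square();
default: return 0; // no matching shape
}
}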
There is one other option available. You can return a union of pointers to objects along with a tag that tells the caller which member of the union is valid. Something like:
struct result {
enum discriminant { A_member, B_member, C_member, Undefined } tag;
union result_data {
A *a_object;
B *b_object;
C *c_object;
} data;
result(): tag(Undefined) {}
explicit result(A *obj): tag(A_member) { data.a_object = obj; }
explicit result(B *obj): tag(B_member) { data.b_object = obj; }
explicit result(C *obj): tag(C_member) { data.c_object = obj; }
};
I would probably use Boost.Variant as suggested by Manuel if you have the option.
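For completeness, this is roughly how a caller would consume such a tagged union. A, B and C here are placeholder types with a made-up name() method, and describe() is a hypothetical helper; the only other change from the struct above is that the default constructor nulls the union.

```cpp
#include <cassert>
#include <string>

struct A { std::string name() const { return "A"; } };
struct B { std::string name() const { return "B"; } };
struct C { std::string name() const { return "C"; } };

struct result {
    enum discriminant { A_member, B_member, C_member, Undefined } tag;
    union result_data {
        A *a_object;
        B *b_object;
        C *c_object;
    } data;
    result(): tag(Undefined) { data.a_object = 0; }
    explicit result(A *obj): tag(A_member) { data.a_object = obj; }
    explicit result(B *obj): tag(B_member) { data.b_object = obj; }
    explicit result(C *obj): tag(C_member) { data.c_object = obj; }
};

// The caller must check the tag before reading any union member.
std::string describe(const result& r) {
    switch (r.tag) {
    case result::A_member: return r.data.a_object->name();
    case result::B_member: return r.data.b_object->name();
    case result::C_member: return r.data.c_object->name();
    default:               return "undefined";
    }
}
```

Nothing stops a caller from reading the wrong member, which is the main reason a checked variant type is usually preferable.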
You can't. You can only return a base pointer to different derived classes. If this is absolutely, 100% needed, you can use exceptions as an ugly hack, but that's obviously not recommended at all.
Even if you could return three different types of objects from the function, what would you do with the result? You need to do something like:
XXX ret_val = getIntersection();
If getIntersection returned three different types of objects, XXX would have to change based on what getIntersection was going to return. Clearly this is quite impossible.
To deal with this, you can define one type that defines enough to cover all the possibilities:
class Intersection {
public:
enum Kind { empty, point, circle, sphere };
Kind kind; // which kind of intersection this is
point3D location;
size_t radius;
};
Now getIntersection() can return an Intersection that defines what kind of intersection you have (and BTW, you need to consider the fourth possibility: with two spheres of the same radius and same center point, the intersection will be a sphere) and the size and location of that intersection.
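A sketch of how the kind could be computed, assuming the spheres are treated as surfaces and compared with a small tolerance. The Sphere struct, the KIND_ enumerators, and classify() are illustrative names, and the location and radius of the result are left out.

```cpp
#include <cassert>
#include <cmath>

struct Sphere { double x, y, z, r; };  // center and radius

enum IntersectionKind { KIND_EMPTY, KIND_POINT, KIND_CIRCLE, KIND_SPHERE };

// Classifies how two sphere surfaces intersect; eps absorbs rounding error.
IntersectionKind classify(const Sphere& a, const Sphere& b, double eps = 1e-9) {
    const double dx = b.x - a.x, dy = b.y - a.y, dz = b.z - a.z;
    const double d = std::sqrt(dx * dx + dy * dy + dz * dz);  // center distance
    const double sum = a.r + b.r;
    const double diff = std::fabs(a.r - b.r);

    if (d < eps && diff < eps) {
        return KIND_SPHERE;  // same center and radius: surfaces coincide
    }
    if (d > sum + eps || d < diff - eps) {
        return KIND_EMPTY;   // too far apart, or one strictly inside the other
    }
    if (std::fabs(d - sum) < eps || std::fabs(d - diff) < eps) {
        return KIND_POINT;   // externally or internally tangent
    }
    return KIND_CIRCLE;      // surfaces cross in a circle
}
```

The order of the checks matters: the coincident case has to be handled first, because with d equal to 0 and equal radii the tangent test would otherwise match.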
The limitation is based on the declared return type of your method. Your code states:
f() {
in case one: return A;
in case two: return B;
in case three: return C;
}
When in reality the compiler requires something like this:
FooType f() {
in case one: return A;
in case two: return B;
in case three: return C;
}
It must be possible to convert the A, B, and C to a FooType, typically through simple inheritance, though I won't get into the differences between subclasses vs subtyping.
There are approaches that can get around this. You could create a class or struct (C++) which has fields for each different type of possible return and use some flag field to indicate which field is the actual returned value.
class ReturnHolder {
public int fieldFlag;
public TypeA A;
public TypeB B;
public TypeC C;
}
The enum example in another answer is more of the same. The reason why that is a hack is that the code that handles the return from this method will need lots of code to handle each of the different possibilities, like so:
main(){
FooType *x = new FooType();
ReturnHolder ret = x->f();
switch (ret.fieldFlag) {
case 1:
//read ret.A
break;
case 2:
//read ret.B
break;
case 3:
//read ret.C
break;
}
}
And that's without even going into trying to do it with Exceptions which introduce even bigger problems. Maybe I'll add that in later as an edit.
And by the way, as you said that question "is not limited to C++":
1) dynamic languages, of course, make it a piece of cake:
# python
def func(i):
    if i == 0:
        return 0
    elif i == 1:
        return "zero"
    else:
        return ()
2) some functional languages (Haskell, OCaml, Scala, F#) provide nice built-in variants that are called Algebraic Data Types (article has good samples).
In languages that support reflection, this is easier to achieve. In C++, if you have a standard set of classes to be returned (as pointers), create an enumeration and return the enum value along with the pointer. Using this value you can infer the class type. This is a generic approach for the case where there is no common parent class.
You really shouldn't want to be doing that, and should really come up with a better design instead of forcing a square peg in a round hole. And with most languages you can't do it at all, by design. You will never really know what you are working with, and neither will the compiler ahead of time, ensuring extra bugs and weird behavior and incomprehensibility.