What is a good design to use external class on member functions? - c++

I have the following design problem and am looking for the most elegant and, more importantly, the most efficient solution, since the problem comes from a context where performance matters.
Put simply, I have a class "Function_processor" that does some calculations for real functions (e.g. calculates the roots of a real function), and another class "A" that has several such functions and needs to use the Function_processor to perform calculations on them.
The Function_processor should be as generic as possible (e.g. it should not provide interfaces for all sorts of different objects) and merely stick to its own task (doing calculations for any function).
#include "function_processor.h"
class A {
double a;
public:
A(double a) : a(a) {}
double function1(double x) {
return a*x;
}
double function2(double x){
return a*x*x;
}
double calculate_sth() {
Function_processor function_processor(3*a+1, 7);
return function_processor.do_sth(&function1);
}
};
class Function_processor {
double p1, p2;
public:
Function_processor(double parameter1, double parameter2);
double do_sth(double (*function)(double));
double do_sth_else(double (*function)(double));
};
Clearly I cannot pass the member functions A::function1/2 as in the example above (I know that, but this is roughly what I would consider readable code).
I also cannot make function1/2 static, because they use the non-static member a.
I am sure I could use something like std::bind or templates (even though I have hardly any experience with these), but then I am mostly concerned about the performance I would get.
What is the best solution (clean code and fast performance) to my problem?
Thanks for your help!

This is not really the best way to do this, either from a pure OO point of view or a functional or procedural POV. First of all, your class A is really nothing more than a namespace that has to be instantiated. Personally, I'd just put its functions as free floating C-style ones - maybe in a namespace somewhere so that you get some kind of classification.
Here's how you'd do it in pure OO:
class Function
{
public:
virtual ~Function() {}
virtual double Execute(double value) = 0;
};
class Function1 : public Function
{
public:
virtual double Execute(double value) { ... }
};
class FunctionProcessor
{
public:
void Process(Function & f)
{
...
}
};
This way, you could instantiate Function1 and FunctionProcessor and send the Function1 object to the Process method. You could derive anything from Function and pass it to Process.
A similar, but more generic way to do it is to use templates:
template <class T>
class FunctionProcessor
{
public:
void Process()
{
T function; // T must be default-constructible
...
}
};
You can pass anything at all as T, but in this case, T becomes a compile-time dependency, so you have to pass it in code. No dynamic stuff allowed here!
Here's another templated mechanism, this time using simple functions instead of classes:
template <class T>
void Process(T function) // taken by value, so temporaries such as function2() can be passed
{
...
double v1 = function(x1);
double v2 = function(x2);
...
}
You can call this thing like this:
double function1(double val)
{
return blah;
}
struct function2
{
double operator()(double val) const { return blah; }
};
// somewhere else
Process(function1);
Process(function2());
You can use this approach with anything that can be called with the right signature: simple functions, static methods in classes, functors (like struct function2 above), std::mem_fn objects (std::mem_fun before C++11), C++11 lambdas, and so on. And if you use functors, you can pass them parameters in the constructor, just like any object.
That last one is probably what I'd do; it's the fastest if you know what you're calling at compile time, and the simplest to read in the client code. If it has to be extremely loosely coupled for some reason, I'd go with the first class-based approach. I personally think that circumstance is quite rare, especially as you describe the problem.
If you still want to use your class A, make all the functions static if they don't need member access. Otherwise, look at std::mem_fn (the older std::mem_fun was deprecated in C++11 and removed in C++17). I still discourage this approach.
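Applied to the classes from the question, the templated approach might look like the sketch below. It assumes do_sth can become a member template; the body of do_sth (evaluating the function at both parameters) is a placeholder invented for illustration, as is the usage() helper. The lambda captures this, so function1 keeps its access to a without having to be static.

```cpp
#include <cassert>

// Stand-in for the question's Function_processor. The calculation in
// do_sth is a placeholder, not a real root finder.
class Function_processor {
    double p1, p2;
public:
    Function_processor(double parameter1, double parameter2)
        : p1(parameter1), p2(parameter2) {}

    // F is deduced per call site, so the target of f(...) is known at
    // compile time.
    template <class F>
    double do_sth(F f) const { return f(p1) + f(p2); }
};

class A {
    double a;
public:
    explicit A(double a) : a(a) {}
    double function1(double x) const { return a * x; }
    double function2(double x) const { return a * x * x; }

    double calculate_sth() const {
        Function_processor function_processor(3 * a + 1, 7);
        // The lambda captures this, so the member function can still use a.
        return function_processor.do_sth(
            [this](double x) { return function1(x); });
    }
};

// Hypothetical usage: with a = 2 both sample points are 7, and
// function1(7) is 14, so the placeholder calculation yields 14 + 14.
inline void usage() {
    A obj(2.0);
    assert(obj.calculate_sth() == 28.0);
}
```

Since each lambda has its own type, the compiler can usually inline the call; whether that matters in practice is something only a measurement will tell.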

If I understood correctly, what you're searching for seems to be pointer to member functions:
double do_sth(double (A::*function)(double));
For calling, you would however also need an object of class A. You could also pass that into function_processor in the constructor.
Not sure about the performance of this, though.
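A minimal sketch of that idea follows. It assumes the member functions are const (so the pointer type carries a trailing const, unlike the declaration above) and that the processor stores a reference to the A object it was constructed with; the calculation inside do_sth and the usage() helper are placeholders made up for illustration.

```cpp
#include <cassert>

class A {
    double a;
public:
    explicit A(double a) : a(a) {}
    double function1(double x) const { return a * x; }
    double function2(double x) const { return a * x * x; }
};

// Hypothetical processor tied to class A: it keeps the object and is told
// which member function to call through a pointer to member.
class Function_processor {
    const A& object;
public:
    explicit Function_processor(const A& obj) : object(obj) {}

    // Placeholder calculation: evaluates the chosen member function at x = 3.
    double do_sth(double (A::*function)(double) const) const {
        return (object.*function)(3.0);
    }
};

inline void usage() {
    A obj(2.0);
    Function_processor processor(obj);
    assert(processor.do_sth(&A::function1) == 6.0);   // 2 * 3
    assert(processor.do_sth(&A::function2) == 18.0);  // 2 * 3 * 3
}
```

The price is that Function_processor now knows about A, which is what the question wanted to avoid, and the call goes through a pointer, which is generally harder for the compiler to inline than a template argument.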

Related

virtual overloading vs `std::function` member?

I'm in a situation where I have a class, let's call it Generic. This class has members and attributes, and I plan to use it in a std::vector<Generic> or similar, processing several instances of this class.
Also, I want to specialize this class; the only difference between the generic and specialized objects would be a private method, which does not access any member of the class (but is called by other methods). My first idea was to simply declare it virtual and override it in specialized classes like this:
class Generic
{
// all other members and attributes
private:
virtual float specialFunc(float x) const =0;
};
class Specialized_one : public Generic
{
private:
virtual float specialFunc(float x) const{ return x;}
};
class Specialized_two : public Generic
{
private:
virtual float specialFunc(float x) const{ return 2*x; }
};
And thus I guess I would have to use a std::vector<Generic*>, and create and destroy the objects dynamically.
A friend suggested using a std::function<> attribute in my Generic class and passing specialFunc as an argument to the constructor, but I am not sure how to do it properly.
What would be the advantages and drawbacks of these two approaches, and are there other (better ?) ways to do the same thing ? I'm quite curious about it.
For the details, the specialization of each object I instantiate would be determined at runtime, depending on user input. And I might end up with a lot of these objects (not yet sure how many), so I would like to avoid any unnecessary overhead.
Virtual functions and overriding model an is-a relationship, while std::function models a has-a relationship.
Which one to use depends on your specific use case.
Using std::function is perhaps more flexible as you can easily modify the functionality without introducing new types.
Performance should not be the main decision point here unless this code is provably (i.e. you measured it) the tight loop bottleneck in your program.
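As a rough sketch of the has-a variant: the class below stores the special function as a std::function<float(float)> member and receives it through the constructor, which lets the objects live by value in a std::vector<Generic>. The names process, makeGeneric and usage are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

class Generic {
public:
    explicit Generic(std::function<float(float)> specialFunc)
        : specialFunc_(std::move(specialFunc)) {}

    // Other methods call the stored function like any private helper.
    float process(float x) const { return specialFunc_(x) + 1.0f; }

private:
    std::function<float(float)> specialFunc_;
};

// Hypothetical factory: the behaviour is picked at run time, for example
// from user input.
inline Generic makeGeneric(int choice) {
    if (choice == 1) return Generic([](float x) { return x; });
    return Generic([](float x) { return 2 * x; });
}

inline void usage() {
    std::vector<Generic> objects;  // stored by value, no Generic* required
    objects.push_back(makeGeneric(1));
    objects.push_back(makeGeneric(2));
    assert(objects[0].process(3.0f) == 4.0f);  // 3 + 1
    assert(objects[1].process(3.0f) == 7.0f);  // 2 * 3 + 1
}
```

Each call still goes through an indirection inside std::function, so this is comparable in spirit to a virtual call rather than free of overhead.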
First of all, let's throw performance out the window.
If you use virtual functions, as you stated, you may end up with a lot of classes with the same interface:
class generic {
virtual void f(float x);
};
class spec1 : public generic {
virtual void f(float x);
};
class spec2 : public generic {
virtual void f(float x);
};
Using std::function<void(float)> as a member would allow you to avoid all the specializations:
class meaningful_class_name {
std::function<void(float)> f;
public:
meaningful_class_name(std::function<void(float)> const& p_f) : f(p_f) {}
};
In fact, if this is the ONLY thing you're using the class for, you might as well just remove it, and use a std::function<void(float)> at the level of the caller.
Advantages of std::function:
1) Less code (1 class for N functions, whereas the virtual method requires N classes for N functions. I'm making the assumption that this function is the only thing that's going to differ between classes).
2) Much more flexibility (You can pass in capturing lambdas that hold state if you want to).
3) If you write the class as a template, you could use it for all kinds of function signatures if needed.
Using std::function solves whatever problem you're attempting to tackle with virtual functions, and it seems to do it better. However, I'm not going to assert that std::function will always be better than a bunch of virtual functions in several classes. Sometimes, these functions have to be private and virtual because their implementation has nothing to do with any outside callers, so flexibility is NOT an advantage.
Disadvantages of std::function:
1) I was about to write that you can't access the private members of the generic class, but then I realized that you can modify the std::function in the class itself with a capturing lambda that holds this. Given the way you outlined the class however, this shouldn't be a problem since it seems to be oblivious to any sort of internal state.
What would be the advantages and drawbacks of these two approaches, and are there other (better ?) ways to do the same thing ?
The issue I can see is "how do you want your class defined?" (as in, what is the public interface?)
Consider creating an API like this:
class Generic
{
public:
// all other members and attributes
explicit Generic(std::function<float(float)> specialFunc);
};
Now, you can create any instance of Generic, without care. If you have no idea what you will place in specialFunc, this is the best alternative ("you have no idea" means that clients of your code may decide in one month to place a function from another library there, an identical function ("receive x, return x"), accessing some database for the value, passing a stateful functor into your function, or whatever else).
Also, if the specialFunc can change for an existing instance (i.e. create instance with specialFunc, use it, change specialFunc, use it again, etc) you should use this variant.
This variant may also be imposed on your code base by other constraints (for example, if you want to avoid making Generic virtual, or if you need it to be final for other reasons).
If (on the other hand) your specialFunc can only be a choice from a limited number of implementations, and client code cannot decide later they want something else - i.e. you only have identical function and doubling the value - like in your example - then you should rely on specializations, like in the code in your question.
TLDR: Decide based on the usage scenarios of your class.
Edit: regarding better (or at least alternative) ways to do this: you could inject specialFunc into your class on an as-needed basis.
That is, instead of this:
class Generic
{
public:
Generic(std::function<float(float)> f) : specialFunc{f} {}
float fancy_computation2() { return 2 * specialFunc(2.f); }
float fancy_computation4() { return 4 * specialFunc(4.f); }
private:
std::function<float(float)> specialFunc;
};
You could write this:
class Generic
{
public:
Generic() {}
float fancy_computation2(std::function<float(float)> f) { return 2 * f(2.f); }
float fancy_computation4(std::function<float(float)> f) { return 4 * f(4.f); }
private:
};
This offers you more flexibility (you can use different special functions with single instance), at the cost of more complicated client code. This may also be a level of flexibility that you do not want (too much).

Dynamically construct function

I fear something like this is answered somewhere on this site, but I can't find it because I don't even know how to formulate the question. So here's the problem:
I have a voxel drawing function. First I calculate offsets, angles and so on, and then I do the drawing. But I end up making several versions of every function, because sometimes I want to copy a pixel, sometimes blit, sometimes blit a 3*3 square for every pixel for a smoothing effect, and sometimes just copy a pixel to n*n pixels on the screen if the object is resized. So there are tons of versions of that small part in the center of a function.
What can I do instead of writing 10 copies of the same function that differ only in the central part of the code? For performance reasons, passing a function pointer as an argument is not an option. I'm not sure making them inline will do the trick, because the arguments I send differ: sometimes I calculate volume (Z value), sometimes I know pixels are drawn from bottom to top.
I assume there's some way of doing this stuff in C++ everybody knows about.
Please tell me what I need to learn to do this. Thanks.
The traditional OO approaches to this are the template method pattern and the strategy pattern.
Template Method
The first is an extension of the technique described in Vincenzo's answer: instead of writing a simple non-virtual wrapper, you write a non-virtual function containing the whole algorithm. Those parts that might vary, are virtual function calls.
The specific arguments needed for a given implementation, are stored in the derived class object that provides that implementation.
eg.
class VoxelDrawer {
protected:
virtual void copy(Coord from, Coord to) = 0;
// any other functions you might want to change
public:
virtual ~VoxelDrawer() {}
void draw(arg) {
for (;;) {
// implement full algorithm
copy(a,b);
}
}
};
class SmoothedVoxelDrawer: public VoxelDrawer {
int radius; // algorithm-specific argument
void copy(Coord from, Coord to) {
blit(from.dx(-radius).dy(-radius),
to.dx(-radius).dy(-radius),
2*radius, 2*radius);
}
public:
SmoothedVoxelDrawer(int r) : radius(r) {}
};
Strategy
This is similar, but instead of using inheritance, you pass a polymorphic Copier object as an argument to your function. It's more flexible in that it decouples your various copying strategies from the specific function, and you can re-use your copying strategies in other functions.
struct VoxelCopier {
virtual void operator()(Coord from, Coord to) = 0;
};
struct SmoothedVoxelCopier: public VoxelCopier {
// etc. as for SmoothedVoxelDrawer
};
void draw_voxels(arguments, VoxelCopier &copy) {
for (;;) {
// implement full algorithm
copy(a,b);
}
}
Although tidier than passing in a function pointer, neither the template method nor the strategy is likely to perform better than just passing a function pointer: runtime polymorphism is still an indirect function call.
Policy
The modern C++ equivalent of the strategy pattern is the policy pattern. This simply replaces run-time polymorphism with compile-time polymorphism to avoid the indirect function call and enable inlining.
// you don't need a common base class for policies,
// since templates use duck typing
struct SmoothedVoxelCopier {
int radius;
void copy(Coord from, Coord to) { ... }
};
template <typename CopyPolicy>
void draw_voxels(arguments, CopyPolicy cp) {
for (;;) {
// implement full algorithm
cp.copy(a,b);
}
}
Because of type deduction, you can simply call
draw_voxels(arguments, SmoothedVoxelCopier(radius));
draw_voxels(arguments, OtherVoxelCopier(whatever));
NB. I've been slightly inconsistent here: I used operator() to make my strategy call look like a regular function, but a normal method for my policy. So long as you choose one and stick with it, this is just a matter of taste.
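To make the policy idea concrete, here is a small self-contained sketch. It swaps the voxel types for plain std::vector<int> buffers, and the policies PlainCopy and ScaledCopy as well as the function draw_pixels are hypothetical names chosen for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Two hypothetical copy policies; no common base class is needed.
struct PlainCopy {
    void copy(const std::vector<int>& src, std::vector<int>& dst,
              std::size_t i) const {
        dst[i] = src[i];
    }
};

struct ScaledCopy {
    int factor;  // policy-specific argument
    void copy(const std::vector<int>& src, std::vector<int>& dst,
              std::size_t i) const {
        dst[i] = factor * src[i];
    }
};

// The loop is written once. The policy type is fixed at compile time, so
// cp.copy is a direct call rather than an indirect one.
template <typename CopyPolicy>
void draw_pixels(const std::vector<int>& src, std::vector<int>& dst,
                 CopyPolicy cp) {
    assert(src.size() == dst.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        cp.copy(src, dst, i);
    }
}
```

A call such as draw_pixels(src, dst, ScaledCopy{3}) instantiates one version of the loop per policy type, which is essentially the hand-written duplication from the question, generated by the compiler instead.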
CRTP Template Method
There's one final mechanism, which is the compile-time polymorphism version of the template method, and uses the Curiously Recurring Template Pattern.
template <typename Impl>
class VoxelDrawerBase {
protected:
Impl& impl() { return *static_cast<Impl*>(this); }
void copy(Coord from, Coord to) {...}
// *optional* default implementation, is *not* virtual
public:
void draw(arg) {
for (;;) {
// implement full algorithm
impl().copy(a,b);
}
}
};
class SmoothedVoxelDrawer: public VoxelDrawerBase<SmoothedVoxelDrawer> {
// the base calls the private copy() through impl(), so it must be a friend
friend class VoxelDrawerBase<SmoothedVoxelDrawer>;
int radius; // algorithm-specific argument
void copy(Coord from, Coord to) {
blit(from.dx(-radius).dy(-radius),
to.dx(-radius).dy(-radius),
2*radius, 2*radius);
}
public:
SmoothedVoxelDrawer(int r) : radius(r) {}
};
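A reduced, compilable sketch of the same CRTP mechanism is shown below. It replaces the voxel code with a hypothetical accumulator (AccumulatorBase, PlainAccumulator and SquaringAccumulator are invented names) so that the default implementation and the compile-time override can both be seen in action.

```cpp
#include <cassert>

template <typename Impl>
class AccumulatorBase {
public:
    // The whole algorithm lives here; step() is resolved at compile time.
    int run(int n) {
        assert(n >= 0);
        int total = 0;
        for (int i = 1; i <= n; ++i) {
            total += impl().step(i);
        }
        return total;
    }

protected:
    int step(int i) { return i; }  // optional default, not virtual

private:
    Impl& impl() { return *static_cast<Impl*>(this); }
};

// Relies on the default step().
class PlainAccumulator : public AccumulatorBase<PlainAccumulator> {
    friend class AccumulatorBase<PlainAccumulator>;
};

// Provides its own step(), which hides the default one.
class SquaringAccumulator : public AccumulatorBase<SquaringAccumulator> {
    // The base calls step() through an Impl&, so it needs access.
    friend class AccumulatorBase<SquaringAccumulator>;
    int step(int i) { return i * i; }
};
```

With these definitions, PlainAccumulator().run(3) adds 1 + 2 + 3, while SquaringAccumulator().run(3) adds 1 + 4 + 9, and neither call involves a virtual dispatch.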
Summary
In general I'd prefer the strategy/policy patterns for their lower coupling and better reuse, and choose the template method pattern only where the top-level algorithm you're parameterizing is genuinely set in stone (ie, when you're either refactoring existing code or are really sure of your analysis of the points of variation) and reuse is genuinely not an issue.
It's also really painful to use the template method if there is more than one axis of variation (that is, you have multiple methods like copy, and want to vary their implementations independently). You either end up with code duplication or mixin inheritance.
I suggest using the NVI idiom.
You have your public method which calls a private function that implements the logic that must differ from case to case.
Derived classes will have to provide an implementation of that private function that specializes them for their particular task.
Example:
class A {
public:
void do_base() {
// [pre]
specialized_do();
// [post]
}
private:
virtual void specialized_do() = 0;
};
class B : public A {
private:
void specialized_do() {
// [implementation]
}
};
The advantage is that you can keep a common implementation in the base class and detail it as required for any subclass (which just need to reimplement the specialized_do method).
The disadvantage is that you need a different type for each implementation, but if your use case is drawing different UI elements, this is the way to go.
You could simply use the strategy pattern.
So, instead of something like
void do_something_one_way(...)
{
//blah
//blah
//blah
one_way();
//blah
//blah
}
void do_something_another_way(...)
{
//blah
//blah
//blah
another_way();
//blah
//blah
}
You will have
void do_something(...)
{
//blah
//blah
//blah
any_which_way();
//blah
//blah
}
any_which_way could be a lambda, a functor, a virtual member function of a strategy class passed in. There are many options.
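As one possible reading of this, the sketch below takes the varying step as a template parameter and passes lambdas for it. The signature of do_something (summing over a std::vector<int>) and the usage() helper are assumptions made up for the example.

```cpp
#include <cassert>
#include <vector>

// The shared parts are written once; the varying central step is whatever
// callable the caller passes in.
template <typename Step>
int do_something(const std::vector<int>& values, Step any_which_way) {
    int result = 0;
    for (int v : values) {
        result += any_which_way(v);  // the part that used to differ per copy
    }
    return result;
}

inline void usage() {
    std::vector<int> values{1, 2, 3};
    int offset = 10;
    // A plain lambda and a capturing lambda, each with its own instantiation.
    assert(do_something(values, [](int v) { return v; }) == 6);
    assert(do_something(values, [offset](int v) { return v + offset; }) == 36);
}
```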
Are you sure that
"passing a function pointer as an argument is not an option"
Does it really slow it down?
You could use higher order functions, if your 'central part' can be parameterized nicely.
Here is a simple example of a function that returns a function which adds n to its argument:
#include <iostream>
#include<functional>
std::function<int(int)> n_adder(int n)
{
return [=](int x){return x+n;};
}
int main()
{
auto add_one = n_adder(1);
std::cout<<add_one(5);
}
You can use either Template Method pattern or Strategy pattern.
Usually Template method pattern is used in white-box frameworks, when you need to know about the internal structure of a framework to correctly subclass a class.
Strategy pattern is usually used in black-box frameworks, when you should not know about the implementation of the framework, since you only need to understand the contract of the methods you should implement.
For performance reasons, passing a function pointer as an argument is not an option.
Are you sure that passing one additional parameter will cause performance problems? If it does, you may see similar performance penalties with OOP techniques like Template Method or Strategy. It is usually necessary to use a profiler to determine the source of a performance degradation. Virtual calls, passing additional parameters, and calling a function through a pointer are usually very cheap compared to complex algorithms. You may find that these techniques consume an insignificant percentage of CPU resources compared to other code.
I'm not sure making them inline will do the trick, because arguments I send differ: sometimes I calculate volume(Z value), sometimes I know pixels are drawn from bottom to top.
You could pass all the parameters required for drawing in all cases. Alternatively, if you use the Template Method pattern, a base class could provide methods that return the data required for drawing in the different cases. In the Strategy pattern, you could pass an object that provides this kind of data to the Strategy implementation.

several classes implement parent class with varying api

I have a class Feature with a pure virtual method.
class Feature {
public:
virtual ~Feature() {}
virtual const float getValue(const vector<int>& v) const = 0;
};
This class is implemented by several classes, for example FeatureA and FeatureB.
A separate class Computer (simplified) uses the getValue method to do some computation.
class Computer {
public:
const float compute(const vector<Feature*>& features, const vector<int>& v) {
float res = 0;
for (int i = 0; i < features.size(); ++i) {
res += features[i]->getValue(v);
}
return res;
}
};
Now, I would like to implement FeatureC, but I realize that I need additional information in the getValue method. The method in FeatureC looks like
const float getValue(const vector<int>& v, const vector<int>& additionalInfo) const;
I can of course modify the signature of getValue in Feature, FeatureA, FeatureB to take additionalInfo as a parameter and also add additionalInfo as a parameter in the compute method. But then I may have to modify all those signatures again later if I want to implement FeatureD that needs even more additional info. I wonder if there is a more elegant solution to this or if there is a known design pattern that you can point me to for further reading.
You have at least two options:
Instead of passing the single vector to getValue(), pass a struct. In this struct you can put the vector today, and more data tomorrow. Of course, if some concrete runs of your program don't need the extra fields, the need to compute them might be wasteful. But it will impose no performance penalty if you always need to compute all the data anyway (i.e. if there will always be one FeatureC).
Pass to getValue() a reference to an object having methods to get the necessary data. This object could be the Computer itself, or some simpler proxy. Then the getValue() implementations can request exactly what they need, and it can be lazily computed. The laziness will eliminate wasted computations in some cases, but the overall structure of doing it this way will impose some small constant overhead due to having to call (possibly virtual) functions to get the various data.
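The first option could be sketched as follows. The struct name FeatureInput, its fields, and the toy getValue bodies are assumptions for illustration; compute is written as a free function here for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bundle of everything a feature may need. New fields can be
// added later without touching any getValue signature.
struct FeatureInput {
    std::vector<int> v;
    std::vector<int> additionalInfo;
};

class Feature {
public:
    virtual ~Feature() {}
    virtual float getValue(const FeatureInput& in) const = 0;
};

// Only needs v and ignores the rest.
class FeatureA : public Feature {
public:
    float getValue(const FeatureInput& in) const override {
        return static_cast<float>(in.v.size());
    }
};

// Also reads additionalInfo.
class FeatureC : public Feature {
public:
    float getValue(const FeatureInput& in) const override {
        return static_cast<float>(in.v.size() + in.additionalInfo.size());
    }
};

float compute(const std::vector<const Feature*>& features,
              const FeatureInput& in) {
    float res = 0;
    for (std::size_t i = 0; i < features.size(); ++i) {
        assert(features[i] != nullptr);
        res += features[i]->getValue(in);
    }
    return res;
}
```

Adding a FeatureD later would then mean adding a field to FeatureInput and filling it in at the call site, while FeatureA and FeatureC stay untouched.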
Requiring the user of your Feature class hierarchy to call different methods based on class defeats polymorphism. Once you start doing dynamic_cast<>() you know you should be rethinking your design.
If a subclass requires information that it can only get from its caller, you should change the getValue() method to take an additionalInfo argument, and simply ignore that information in classes where it doesn't matter.
If FeatureC can get additionalInfo by calling another class or function, that's usually a better approach, as it limits the number of classes that need to know about it. Perhaps the data is available from an object which FeatureC is given access to via its constructor, or from a singleton object, or it can be calculated by calling a function. Finding the best approach requires a bit more knowledge about the case.
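A sketch of the constructor-based variant, assuming additionalInfo is known when FeatureC is created: the getValue signature from the question stays as it is, and only FeatureC knows about the extra data. The toy getValue bodies and the usage() helper are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class Feature {
public:
    virtual ~Feature() {}
    virtual float getValue(const std::vector<int>& v) const = 0;
};

class FeatureA : public Feature {
public:
    float getValue(const std::vector<int>& v) const override {
        return static_cast<float>(v.size());
    }
};

// FeatureC receives its extra data once, at construction, so the caller
// of getValue never sees it.
class FeatureC : public Feature {
public:
    explicit FeatureC(std::vector<int> additionalInfo)
        : additionalInfo_(std::move(additionalInfo)) {}

    float getValue(const std::vector<int>& v) const override {
        return static_cast<float>(v.size() + additionalInfo_.size());
    }

private:
    std::vector<int> additionalInfo_;
};

// Same loop as Computer::compute in the question, as a free function.
float compute(const std::vector<Feature*>& features,
              const std::vector<int>& v) {
    float res = 0;
    for (std::size_t i = 0; i < features.size(); ++i) {
        res += features[i]->getValue(v);
    }
    return res;
}

inline void usage() {
    FeatureA a;
    FeatureC c(std::vector<int>{4, 5});
    std::vector<Feature*> features{&a, &c};
    assert(compute(features, std::vector<int>{1, 2, 3}) == 8.0f);  // 3 + (3 + 2)
}
```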
This problem is addressed in item 39 of C++ Coding Standards (Sutter, Alexandrescu), which is titled "Consider making virtual functions nonpublic, and public functions nonvirtual."
In particular, one of the motivations for following the Non-Virtual-Interface design pattern (this is what the item is all about) is stated as
Each interface can take its natural shape: When we separate the public interface
from the customization interface, each can easily take the form it naturally
wants to take instead of trying to find a compromise that forces them to look
identical. Often, the two interfaces want different numbers of functions and/or
different parameters; [...]
This is particularly useful in base classes with a high cost of change.
Another design pattern that is very useful in this case is the Visitor pattern. Like NVI, it applies when base classes (as well as the whole hierarchy) have a high cost of change. You can find plenty of discussion about this design pattern; I suggest reading the related chapter in Modern C++ Design (Alexandrescu), which also gives great insight into how to use the (very easy to use) Visitor facilities in Loki.
I suggest for you to read all of this material and then edit the question so that we can give you a better answer. We can come up with all sort of solutions (e.g. use an additional method which gives the class the additional parameters, if needed) which might well not suit your case.
Try to address the following questions:
would a template-based solution fit the problem?
would it be feasible to add a new layer of indirection when calling the function?
would a "push argument"-"push argument"-...-"push argument"-"call function" method be of help? (this might seem very odd at first, but think of something like "cout << arg << arg << arg << endl", where "endl" is the "call function")
how do you intend to distinguish how to call the function in Computer::compute?
Now that we had some "theory", let's aim for the practice using the Visitor pattern:
#include <iostream>
using namespace std;
class FeatureA;
class FeatureB;
class Computer{
public:
int visitA(FeatureA& f);
int visitB(FeatureB& f);
};
class Feature {
public:
virtual ~Feature() {}
virtual int accept(Computer&) = 0;
};
class FeatureA : public Feature {
public:
int accept(Computer& c){
return c.visitA(*this);
}
int compute(int a){
return a+1;
}
};
class FeatureB : public Feature {
public:
int accept(Computer& c){
return c.visitB(*this);
}
int compute(int a, int b){
return a+b;
}
};
int Computer::visitA(FeatureA& f){
return f.compute(1);
}
int Computer::visitB(FeatureB& f){
return f.compute(1, 2);
}
int main()
{
FeatureA a;
FeatureB b;
Computer c;
cout << a.accept(c) << '\t' << b.accept(c) << endl;
}
This is a rough implementation of the Visitor pattern which, as you can see, solves your problem. I strongly advise you not to implement it this way: there are obvious dependency problems, which can be solved by means of a refinement called the Acyclic Visitor. It is already implemented in Loki, so there is no need to worry about implementing it.
Apart from implementation, as you can see you are not relying on type switches (which, as somebody else pointed out, you should avoid whenever possible) and you are not requiring the classes to have any particular interface (e.g. one argument for the compute function). Moreover, if the visitor class is a hierarchy (make Computer a base class in the example), you won't need to add any new function to the hierarchy when you want to add functionalities of this sort.
If you don't like the visitA, visitB, ... "pattern", worry not: this is just a trivial implementation and you don't need that. Basically, in a real implementation you use template specialization of a visit function.
Hope this helped, I had put a lot of effort into it :)
Virtual functions, to work correctly, need to have exactly the same "signature" (same parameters and same return type, covariant return types aside). Otherwise, you just get a "new member function", which isn't what you want.
The real question here is "how does the calling code know it needs the extra information".
You can solve this in a few different ways - the first one is to always pass in const vector <int>& additionalInfo, whether it's needed or not.
If that's not possible, because there isn't any additionalInfo except for in the case of FeatureC, you could have an "optional" parameter - which means use a pointer to vector (vector<int>* additionalInfo), which is NULL when the value is not available.
Of course if additionalInfo is a value that is something that can be stored in the FeatureC class, then that would also work.
Another option is to extend the base class Feature to have two more options:
class Feature {
public:
virtual ~Feature() {}
virtual const float getValue(const vector<int>& v) const = 0;
virtual const float getValue(const vector<int>& v, const vector<int>& additionalInfo) const { return -1.0; }
virtual bool useAdditionalInfo() const { return false; }
};
and then make your loop something like this:
for (int i = 0; i < features.size(); ++i) {
if (features[i]->useAdditionalInfo())
{
res += features[i]->getValue(v, additionalInfo);
}
else
{
res += features[i]->getValue(v);
}
}

pattern to avoid dynamic_cast

I have a class:
class A
{
public:
virtual void func() {…}
virtual void func2() {…}
};
And some classes derived from this one, let's say B, C, D... In 95 % of the cases, I want to go through all objects and call func() or func2(), so I keep them in a vector, like:
std::vector<std::shared_ptr<A> > myVec;
…
for (auto it = myVec.begin(); it != myVec.end(); ++it)
(*it)->func();
However, in the remaining 5 % of the cases I want to do something different to the objects depending on their subclass. And I mean totally different, like calling functions that take other parameters, or not calling any function at all for some subclasses. I have thought of some options to solve this, none of which I really like:
Use dynamic_cast to determine the subclass. Not good: too slow, as I make these calls very often and on limited hardware.
Use a flag in each subclass, like an enum {IS_SUBCLASS_B, IS_SUBCLASS_C}. Not good, as it doesn't feel OO.
Also put the objects in other vectors, one for each specific task. This doesn't feel really OO either, but maybe I'm wrong here. Like:
std::vector<std::shared_ptr<B> > vecForDoingSpecificOperation;
std::vector<std::shared_ptr<C> > vecForDoingAnotherSpecificOperation;
So, can someone suggest a style/pattern that achieves what I want?
Someone intelligent (unfortunately I forgot who) once said about OOP in C++: The only reason for switch-ing over types (which is what all your suggestions propose) is fear of virtual functions. (That's para-paraphrasing.) Add virtual functions to your base class which derived classes can override, and you're set.
Now, I know there are cases where this is hard or unwieldy. For that we have the visitor pattern.
There are cases where one is better, and cases where the other is. Usually, the rule of thumb goes like this:
If you have a rather fixed set of operations, but keep adding types, use virtual functions.
Operations are hard to add to/remove from a big inheritance hierarchy, but new types are easy to add by simply having them override the appropriate virtual functions.
If you have a rather fixed set of types, but keep adding operations, use the visitor pattern.
Adding new types to a large set of visitors is a serious pain in the neck, but adding a new visitor to a fixed set of types is easy.
(If both change, you're doomed either way.)
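For the question's "5 % case", a visitor could look roughly like the sketch below. Only two subclasses are shown, and the weight member, the SumWeights operation and the sumWeights helper are hypothetical; the point is that B is treated specially and C is skipped without any dynamic_cast or type flag.

```cpp
#include <cassert>
#include <memory>
#include <vector>

class B;
class C;

// One visit overload per concrete type; each operation is one visitor.
class Visitor {
public:
    virtual ~Visitor() {}
    virtual void visit(B& b) = 0;
    virtual void visit(C& c) = 0;
};

class A {
public:
    virtual ~A() {}
    virtual void accept(Visitor& v) = 0;
};

class B : public A {
public:
    int weight = 2;  // hypothetical B-only data
    void accept(Visitor& v) override { v.visit(*this); }
};

class C : public A {
public:
    void accept(Visitor& v) override { v.visit(*this); }
};

// An operation that uses B-specific data and does nothing for C.
class SumWeights : public Visitor {
public:
    int total = 0;
    void visit(B& b) override { total += b.weight; }
    void visit(C&) override {}
};

inline int sumWeights(const std::vector<std::shared_ptr<A>>& objects) {
    SumWeights op;
    for (const auto& p : objects) {
        assert(p);
        p->accept(op);  // first dispatch on the object, second on the visitor
    }
    return op.total;
}
```

Each element costs two virtual calls here, and adding a subclass D means adding a visit(D&) overload to every visitor, which is the trade-off described in the rule of thumb above.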
According to your comments, what you have stumbled upon is known (dubiously) as the Expression Problem, as expressed by Philip Wadler:
The Expression Problem is a new name for an old problem. The goal is to define a datatype by cases, where one can add new cases to the datatype and new functions over the datatype, without recompiling existing code, and while retaining static type safety (e.g., no casts).
That is, extending both "vertically" (adding types to the hierarchy) and "horizontally" (adding functions to be overridden to the base class) is hard on the programmer.
There was a long (as always) discussion about it on Reddit in which I proposed a solution in C++.
It is a bridge between OO (great at adding new types) and generic programming (great at adding new functions). The idea is to have a hierarchy of pure interfaces and a set of non-polymorphic types. Free functions are defined on the concrete types as needed, and the bridge with the pure interfaces is provided by a single template class for each interface (supplemented by a template function for automatic deduction).
I have found a single limitation to date: if a function returns a Base interface, it may have been generated as-is, even though the actual type wrapped supports more operations, now. This is typical of a modular design (the new functions were not available at the call site). I think it illustrates a clean design, however I understand one could want to "recast" it to a more verbose interface. Go can, with language support (basically, runtime introspection of the available methods). I don't want to code this in C++.
As I already explained on Reddit, I'll just reproduce and tweak the code I submitted there.
So, let's start with 2 types and a single operation.
struct Square { double side; };
double area(Square const s);
struct Circle { double radius; };
double area(Circle const c);
Now, let's make a Shape interface:
class Shape {
public:
virtual ~Shape();
virtual double area() const = 0;
protected:
Shape(Shape const&) {}
Shape& operator=(Shape const&) { return *this; }
};
typedef std::unique_ptr<Shape> ShapePtr;
template <typename T>
class ShapeT: public Shape {
public:
explicit ShapeT(T const t): _shape(t) {}
virtual double area() const { using ::area; return area(_shape); } // the using-declaration keeps the member area() from hiding the free functions
private:
T _shape;
};
template <typename T>
ShapePtr newShape(T t) { return ShapePtr(new ShapeT<T>(t)); }
Okay, C++ is verbose. Let's check the use immediately:
double totalArea(std::vector<ShapePtr> const& shapes) {
double total = 0.0;
for (ShapePtr const& s: shapes) { total += s->area(); }
return total;
}
int main() {
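std::vector<ShapePtr> shapes; // unique_ptr is move-only, so it cannot go through an initializer list
shapes.push_back(newShape(Square{5.0}));
shapes.push_back(newShape(Circle{3.0}));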
std::cout << totalArea(shapes) << "\n";
}
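To make the snippets above easier to try, here is one way they might be assembled into a single file. This is only a sketch, not the exact code from the Reddit thread: the bodies of the two area functions and the demoTotalArea helper are assumptions added for illustration, and the protected copy operations are left out for brevity. The using-declaration inside ShapeT::area is there because a member named area would otherwise hide the free functions.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Plain value types with free functions, as in the answer.
struct Square { double side; };
struct Circle { double radius; };

// Assumed implementations; the answer only declares these.
double area(Square const& s) { return s.side * s.side; }
double area(Circle const& c) { return 3.141592653589793 * c.radius * c.radius; }

// Pure interface.
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

using ShapePtr = std::unique_ptr<Shape>;

// Bridge: adapts any type that has a free area() to the Shape interface.
template <typename T>
class ShapeT : public Shape {
public:
    explicit ShapeT(T t) : shape_(std::move(t)) {}
    double area() const override {
        using ::area;        // make the free functions visible again
        return area(shape_); // overload resolution (plus ADL) picks the right one
    }
private:
    T shape_;
};

template <typename T>
ShapePtr newShape(T t) { return ShapePtr(new ShapeT<T>(std::move(t))); }

double totalArea(std::vector<ShapePtr> const& shapes) {
    double total = 0.0;
    for (ShapePtr const& s : shapes) { total += s->area(); }
    return total;
}

// Hypothetical helper standing in for main().
double demoTotalArea() {
    std::vector<ShapePtr> shapes;
    shapes.push_back(newShape(Square{5.0}));
    shapes.push_back(newShape(Circle{3.0}));
    assert(shapes.size() == 2);
    return totalArea(shapes);
}
```

Calling demoTotalArea() from main and printing the result should give the area of the square plus the area of the circle.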
So, first exercise, let's add a shape (yep, it's all):
struct Rectangle { double length, height; };
double area(Rectangle const r);
Okay, so far so good, let's add a new function. We have two options.
The first is to modify Shape if it is in our power. This is source compatible, but not binary compatible.
// 1. We need to extend Shape:
virtual double perimeter() const = 0;
// 2. And its adapter: ShapeT
virtual double perimeter() const { using ::perimeter; return perimeter(_shape); }
// 3. And provide the method for each Shape (obviously)
double perimeter(Square const s);
double perimeter(Circle const c);
double perimeter(Rectangle const r);
It may seem that we fall into the Expression Problem here, but we don't. We needed to add the perimeter for each (already known) class because there is no way to automatically infer it; however it did not require editing each class either!
Therefore, the combination of External Interface and free functions let us neatly (well, it is C++...) sidestep the issue.
sodraz noticed in comments that the addition of a function touched the original interface which may need to be frozen (provided by a 3rd party, or for binary compatibility issues).
The second option is therefore non-intrusive, at the cost of being slightly more verbose:
class ExtendedShape: public Shape {
public:
virtual double perimeter() const = 0;
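protected:
ExtendedShape() {} // needed: declaring the copy constructor suppresses the implicit default constructor
ExtendedShape(ExtendedShape const&) {}
ExtendedShape& operator=(ExtendedShape const&) { return *this; }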
};
typedef std::unique_ptr<ExtendedShape> ExtendedShapePtr;
template <typename T>
class ExtendedShapeT: public ExtendedShape {
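public:
explicit ExtendedShapeT(T const t): _data(t) {} // newExtendedShape below needs this constructor
virtual double area() const { using ::area; return area(_data); }
virtual double perimeter() const { using ::perimeter; return perimeter(_data); }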
private:
T _data;
};
template <typename T>
ExtendedShapePtr newExtendedShape(T t) { return ExtendedShapePtr(new ExtendedShapeT<T>(t)); }
And then, define the perimeter function for all those Shape we would like to use with the ExtendedShape.
The old code, compiled to work against Shape, still works. It does not need the new function anyway.
The new code can make use of the new functionality, and still interface painlessly with the old code. (*)
There is only one slight issue: if the old code returns a ShapePtr, we do not know whether the shape actually has a perimeter function (note: if the pointer is generated internally, it has not been generated with the newExtendedShape mechanism). This is the limitation of the design mentioned at the beginning. Oops :)
(*) Note: painlessly implies that you know who the owner is. A std::unique_ptr<Derived>& and a std::unique_ptr<Base>& are not compatible; however, a std::unique_ptr<Base> can be built from a std::unique_ptr<Derived>, and a Base* from a Derived*, so make sure your functions are clean ownership-wise and you're golden.
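The ownership note can be illustrated with a small sketch. The types Base and Derived and the functions consume, observe and demoOwnership are hypothetical names chosen for this example; they stand in for Shape, ExtendedShape and "old code" that was compiled against the base interface.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for Shape and ExtendedShape.
struct Base {
    virtual ~Base() = default;
    virtual int id() const { return 1; }
};
struct Derived : Base {
    int id() const override { return 2; }
};

// "Old" code that takes ownership through the base interface.
int consume(std::unique_ptr<Base> p) { return p->id(); }

// "Old" code that only observes the object.
int observe(Base const& b) { return b.id(); }

int demoOwnership() {
    std::unique_ptr<Derived> d(new Derived);

    // Borrowing works directly: Derived& converts to Base const&.
    int seen = observe(*d);

    // std::unique_ptr<Base>& r = d;  // would not compile: the two
    //                                // unique_ptr types are unrelated

    // Transferring ownership converts unique_ptr<Derived> to unique_ptr<Base>.
    int taken = consume(std::move(d));
    assert(d == nullptr);  // d gave up ownership

    return seen * 10 + taken;
}
```

In this sketch both calls dispatch to Derived::id, so the helper returns 22; the point is that either lending a reference or handing over ownership works, while sharing a reference to the smart pointer itself does not.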

How to implement the factory method pattern in C++ correctly

There's this one thing in C++ which has been making me feel uncomfortable for quite a long time, because I honestly don't know how to do it, even though it sounds simple:
How do I implement Factory Method in C++ correctly?
Goal: to make it possible to allow the client to instantiate some object using factory methods instead of the object's constructors, without unacceptable consequences and a performance hit.
By "Factory method pattern", I mean both static factory methods inside an object or methods defined in another class, or global functions. Just generally "the concept of redirecting the normal way of instantiation of class X to anywhere else than the constructor".
Let me skim through some possible answers which I have thought of.
0) Don't make factories, make constructors.
This sounds nice (and is indeed often the best solution), but is not a general remedy. First of all, there are cases when object construction is a task complex enough to justify its extraction to another class. But even putting that fact aside, even for simple objects, using just constructors often won't do.
The simplest example I know is a 2-D Vector class. So simple, yet tricky. I want to be able to construct it from both Cartesian and polar coordinates. Obviously, I cannot do:
struct Vec2 {
Vec2(float x, float y);
Vec2(float angle, float magnitude); // not a valid overload!
// ...
};
My natural way of thinking is then:
struct Vec2 {
static Vec2 fromLinear(float x, float y);
static Vec2 fromPolar(float angle, float magnitude);
// ...
};
Which, instead of constructors, leads me to usage of static factory methods... which essentially means that I'm implementing the factory pattern, in some way ("the class becomes its own factory"). This looks nice (and would suit this particular case), but fails in some cases, which I'm going to describe in point 2. Do read on.
Another case: trying to overload on two opaque typedefs of some API (such as GUIDs of unrelated domains, or a GUID and a bitfield). The types are semantically totally different (so, in theory, valid overloads) but actually turn out to be the same thing, like unsigned ints or void pointers.
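Going back to the Vec2 case, the static factory methods could be fleshed out roughly as follows. This is a minimal sketch: the private constructor, the accessors x() and y(), and the convention that the angle is in radians are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

class Vec2 {
public:
    // Named construction: the method name says which coordinates are meant.
    static Vec2 fromLinear(float x, float y) { return Vec2(x, y); }

    // Assumes the angle is given in radians.
    static Vec2 fromPolar(float angle, float magnitude) {
        return Vec2(magnitude * std::cos(angle), magnitude * std::sin(angle));
    }

    float x() const { return x_; }
    float y() const { return y_; }

private:
    // The only "real" constructor is private, so clients must pick a name.
    Vec2(float x, float y) : x_(x), y_(y) {}
    float x_, y_;
};
```

Usage would then read as Vec2 v = Vec2::fromPolar(0.0f, 2.0f); which makes the intent visible at the call site.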
1) The Java Way
Java has it simple, as we only have dynamic-allocated objects. Making a factory is as trivial as:
class FooFactory {
public Foo createFooInSomeWay() {
// can be a static method as well,
// if we don't need the factory to provide its own object semantics
// and just serve as a group of methods
return new Foo(some, args);
}
}
In C++, this translates to:
class FooFactory {
public:
Foo* createFooInSomeWay() {
return new Foo(some, args);
}
};
Cool? Often, indeed. But then, this forces the user to only use dynamic allocation. Static allocation is what makes C++ complex, but is also what often makes it powerful. Also, I believe that there exist some targets (keyword: embedded) which don't allow for dynamic allocation. And that doesn't imply that the users of those platforms like to write clean OOP.
Anyway, philosophy aside: In the general case, I don't want to force the users of the factory to be restrained to dynamic allocation.
2) Return-by-value
OK, so we know that 1) is cool when we want dynamic allocation. Why won't we add static allocation on top of that?
class FooFactory {
public:
Foo* createFooInSomeWay() {
return new Foo(some, args);
}
Foo createFooInSomeWay() {
return Foo(some, args);
}
};
What? We can't overload by the return type? Oh, of course we can't. So let's change the method names to reflect that. And yes, I've written the invalid code example above just to stress how much I dislike the need to change the method name, for example because we cannot implement a language-agnostic factory design properly now, since we have to change names - and every user of this code will need to remember that difference of the implementation from the specification.
class FooFactory {
public:
Foo* createDynamicFooInSomeWay() {
return new Foo(some, args);
}
Foo createFooObjectInSomeWay() {
return Foo(some, args);
}
};
OK... there we have it. It's ugly, as we need to change the method name. It's imperfect, since we need to write the same code twice. But once done, it works. Right?
Well, usually. But sometimes it does not. When creating Foo, we actually depend on the compiler to perform the return value optimisation for us, because the C++ standard is benevolent enough towards compiler vendors not to specify when the object will be created in place and when it will be copied when returning a temporary object by value. So if Foo is expensive to copy, this approach is risky.
And what if Foo is not copyable at all? Well, doh. (Note that in C++17, with guaranteed copy elision, not being copyable is no longer a problem for the code above.)
Conclusion: Making a factory by returning an object is indeed a solution for some cases (such as the 2-D vector previously mentioned), but still not a general replacement for constructors.
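As a small illustration of the return-by-value variant, the sketch below uses a Foo that is not copyable but is movable, which is enough for returning it by value since C++11. The int payload, the friend declaration and demoReturnByValue are assumptions made for this example; "some, args" from the question is replaced by a fixed value.

```cpp
#include <cassert>

class Foo {
public:
    Foo(Foo const&) = delete;             // not copyable
    Foo& operator=(Foo const&) = delete;
    Foo(Foo&&) = default;                 // but movable

    int value() const { return value_; }

private:
    friend class FooFactory;              // only the factory may construct
    explicit Foo(int v) : value_(v) {}
    int value_;
};

class FooFactory {
public:
    // Returns by value; the compiler may elide the move entirely.
    Foo createFooInSomeWay() const { return Foo(42); }
};

int demoReturnByValue() {
    FooFactory factory;
    Foo foo = factory.createFooInSomeWay();  // lives on the stack
    return foo.value();
}
```

A type that is neither copyable nor movable would still need the C++17 rules mentioned above for this to compile.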
3) Two-phase construction
Another thing that someone would probably come up with is separating the issue of object allocation and its initialisation. This usually results in code like this:
class Foo {
public:
Foo() {
// empty or almost empty
}
// ...
};
class FooFactory {
public:
void createFooInSomeWay(Foo& foo, some, args);
};
void clientCode() {
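Foo staticFoo;
std::unique_ptr<Foo> dynamicFoo(new Foo()); // the original used auto_ptr, which was removed in C++17
FooFactory factory;
factory.createFooInSomeWay(staticFoo, some, args);   // pass by reference, as declared above
factory.createFooInSomeWay(*dynamicFoo, some, args); // dereference the smart pointer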
// ...
}
One may think it works like a charm. The only price we pay for in our code...
Since I've written all of this and left this as the last, I must dislike it too. :) Why?
First of all... I sincerely dislike the concept of two-phase construction and I feel guilty when I use it. If I design my objects with the assertion that "if it exists, it is in valid state", I feel that my code is safer and less error-prone. I like it that way.
Having to drop that convention AND changing the design of my object just for the purpose of making factory of it is.. well, unwieldy.
I know that the above won't convince many people, so let me give some more solid arguments. Using two-phase construction, you cannot:
initialise const or reference member variables,
pass arguments to base class constructors and member object constructors.
And probably there could be some more drawbacks which I can't think of right now, and I don't even feel particularly obliged to since the above bullet points convince me already.
So: not even close to a good general solution for implementing a factory.
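The two drawbacks listed above can be seen in a short sketch. Session and Logger are hypothetical types invented for this example; the point is only that a const member and a reference member have to be set in the constructor's initializer list, so a later init() step cannot do it.

```cpp
#include <cassert>

// Hypothetical collaborator that the object must refer to.
struct Logger {
    int lines = 0;
};

class Session {
public:
    // Both members have to be initialised here; there is no later chance.
    Session(int id, Logger& log) : id_(id), log_(log) { ++log_.lines; }

    // A two-phase init() could not do the same job:
    // void init(int id, Logger& log) {
    //     id_ = id;    // error: id_ is const
    //     log_ = log;  // would assign through the reference, not rebind it
    // }

    int id() const { return id_; }

private:
    const int id_;  // const member
    Logger& log_;   // reference member
};
```

With two-phase construction, Session would need a default constructor, which cannot exist here without giving up const and the reference.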
Conclusions:
We want to have a way of object instantiation which would:
allow for uniform instantiation regardless of allocation,
give different, meaningful names to construction methods (thus not relying on by-argument overloading),
not introduce a significant performance hit and, preferably, a significant code bloat hit, especially at client side,
be general, as in: possible to be introduced for any class.
I believe I have proven that the ways I have mentioned don't fulfil those requirements.
Any hints? Please provide me with a solution, I don't want to think that this language won't allow me to properly implement such a trivial concept.
First of all, there are cases when object construction is a task complex enough to justify its extraction to another class.
I believe this point is incorrect. The complexity doesn't really matter. The relevance is what does. If an object can be constructed in one step (not like in the builder pattern), the constructor is the right place to do it. If you really need another class to perform the job, then it should be a helper class that is used from the constructor anyway.
Vec2(float x, float y);
Vec2(float angle, float magnitude); // not a valid overload!
There is an easy workaround for this:
struct Cartesian {
inline Cartesian(float x, float y): x(x), y(y) {}
float x, y;
};
struct Polar {
inline Polar(float angle, float magnitude): angle(angle), magnitude(magnitude) {}
float angle, magnitude;
};
Vec2(const Cartesian &cartesian);
Vec2(const Polar &polar);
The only disadvantage is that it looks a bit verbose:
Vec2 v2(Vec2::Cartesian(3.0f, 4.0f));
But the good thing is that you can immediately see what coordinate type you're using, and at the same time you don't have to worry about copying. If you want copying, and it's expensive (as proven by profiling, of course), you may wish to use something like Qt's shared classes to avoid copying overhead.
As for the allocation type, the main reason to use the factory pattern is usually polymorphism. Constructors can't be virtual, and even if they could, it wouldn't make much sense. When using static or stack allocation, you can't create objects in a polymorphic way because the compiler needs to know the exact size. So it works only with pointers and references. Returning a reference from a factory doesn't work either, because while an object technically can be deleted by reference, it could be rather confusing and bug-prone; see Is the practice of returning a C++ reference variable, evil? for example. So pointers are the only thing that's left, and that includes smart pointers too. In other words, factories are most useful when used with dynamic allocation, so you can do things like this:
class Abstract {
public:
virtual void execute() = 0; // renamed: "do" is a reserved keyword in C++
};
class Factory {
public:
Abstract *create();
};
Factory f;
Abstract *a = f.create();
a->execute();
In other cases, factories just help to solve minor problems like those with overloads you have mentioned. It would be nice if it was possible to use them in a uniform way, but it doesn't hurt much that it is probably impossible.
Simple Factory Example:
// Factory returns object and ownership
// Caller responsible for deletion.
#include <memory>
class FactoryReleaseOwnership{
public:
std::unique_ptr<Foo> createFooInSomeWay(){
return std::unique_ptr<Foo>(new Foo(some, args));
}
};
// Factory retains object ownership
// Thus returning a reference.
#include <boost/ptr_container/ptr_vector.hpp>
class FactoryRetainOwnership{
boost::ptr_vector<Foo> myFoo;
public:
Foo& createFooInSomeWay(){
// Must take care that the factory lasts longer than all references.
// Could make myFoo static so it lasts as long as the application.
myFoo.push_back(new Foo(some, args));
return myFoo.back();
}
};
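If Boost is not available, the ownership-retaining variant could also be sketched with standard containers. This is an assumption-laden alternative, not the answer's code: it swaps boost::ptr_vector for std::vector<std::unique_ptr<Foo>>, and Foo with its int payload and demoRetainedOwnership are made up for the example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical product type.
struct Foo {
    explicit Foo(int v) : value(v) {}
    int value;
};

class FactoryRetainOwnership {
public:
    // The factory keeps ownership; callers only get a reference.
    Foo& createFooInSomeWay(int v) {
        foos_.push_back(std::unique_ptr<Foo>(new Foo(v)));
        return *foos_.back();
    }

private:
    // Each Foo lives in its own heap allocation, so references stay valid
    // even when the vector itself reallocates.
    std::vector<std::unique_ptr<Foo>> foos_;
};

int demoRetainedOwnership() {
    FactoryRetainOwnership factory;
    Foo& first = factory.createFooInSomeWay(1);
    for (int i = 2; i <= 100; ++i) {
        factory.createFooInSomeWay(i);  // forces several reallocations
    }
    return first.value;  // the first reference is still usable
}
```

The same caveat applies as in the Boost version: the factory has to outlive every reference it has handed out.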
Have you thought about not using a factory at all, and instead making nice use of the type system? I can think of two different approaches which do this sort of thing:
Option 1:
struct linear {
linear(float x, float y) : x_(x), y_(y){}
float x_;
float y_;
};
struct polar {
polar(float angle, float magnitude) : angle_(angle), magnitude_(magnitude) {}
float angle_;
float magnitude_;
};
struct Vec2 {
explicit Vec2(const linear &l) { /* ... */ }
explicit Vec2(const polar &p) { /* ... */ }
};
Which lets you write things like:
Vec2 v(linear(1.0, 2.0));
Option 2:
you can use "tags" like the STL does with iterators and such. For example:
struct linear_coord_tag {} linear_coord; // declare type and a global
struct polar_coord_tag {} polar_coord;
struct Vec2 {
Vec2(float x, float y, const linear_coord_tag &) { /* ... */ }
Vec2(float angle, float magnitude, const polar_coord_tag &) { /* ... */ }
};
This second approach lets you write code which looks like this:
Vec2 v(1.0, 2.0, linear_coord);
which is also nice and expressive while allowing you to have unique prototypes for each constructor.
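For completeness, here is how the tag-based variant might look once the constructor bodies are filled in. The member names x and y, the radians convention, and the use of const tag objects are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cmath>

// Empty tag types plus one global object of each.
struct linear_coord_tag {};
struct polar_coord_tag {};
const linear_coord_tag linear_coord = {};
const polar_coord_tag polar_coord = {};

struct Vec2 {
    // The tag parameter is unnamed: it only selects the overload.
    Vec2(float x, float y, linear_coord_tag) : x(x), y(y) {}

    // Assumes the angle is given in radians.
    Vec2(float angle, float magnitude, polar_coord_tag)
        : x(magnitude * std::cos(angle)), y(magnitude * std::sin(angle)) {}

    float x, y;
};
```

Because the tags are empty types passed by value, a compiler can typically optimise them away, so the extra argument should cost nothing at run time.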
You can read a very good solution in: http://www.codeproject.com/Articles/363338/Factory-Pattern-in-Cplusplus
The best solution is in the "comments and discussions" section; see "No need for static Create methods".
From this idea, I've done a factory. Note that I'm using Qt, but you can change QMap and QString for std equivalents.
#ifndef FACTORY_H
#define FACTORY_H
#include <QMap>
#include <QString>
#include <type_traits> // for std::is_base_of
template <typename T>
class Factory
{
public:
template <typename TDerived>
void registerType(QString name)
{
static_assert(std::is_base_of<T, TDerived>::value, "Factory::registerType doesn't accept this type because doesn't derive from base class");
_createFuncs[name] = &createFunc<TDerived>;
}
T* create(QString name) {
typename QMap<QString,PCreateFunc>::const_iterator it = _createFuncs.find(name);
if (it != _createFuncs.end()) {
return it.value()();
}
return nullptr;
}
private:
template <typename TDerived>
static T* createFunc()
{
return new TDerived();
}
typedef T* (*PCreateFunc)();
QMap<QString,PCreateFunc> _createFuncs;
};
#endif // FACTORY_H
Sample usage:
Factory<BaseClass> f;
f.registerType<Descendant1>("Descendant1");
f.registerType<Descendant2>("Descendant2");
Descendant1* d1 = static_cast<Descendant1*>(f.create("Descendant1"));
Descendant2* d2 = static_cast<Descendant2*>(f.create("Descendant2"));
BaseClass *b1 = f.create("Descendant1");
BaseClass *b2 = f.create("Descendant2");
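As mentioned, QMap and QString can be swapped for standard equivalents. The sketch below is one possible translation, assuming std::map and std::string; it also returns std::unique_ptr instead of a raw pointer, which is a deliberate change from the code above. BaseClass, Descendant1, Descendant2 and the kind() method are hypothetical and exist only to make the example complete.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <type_traits>

template <typename T>
class Factory {
public:
    template <typename TDerived>
    void registerType(const std::string& name) {
        static_assert(std::is_base_of<T, TDerived>::value,
                      "TDerived must derive from T");
        createFuncs_[name] = &createFunc<TDerived>;
    }

    // Returns an empty pointer when the name was never registered.
    std::unique_ptr<T> create(const std::string& name) const {
        auto it = createFuncs_.find(name);
        if (it == createFuncs_.end()) {
            return nullptr;
        }
        return std::unique_ptr<T>(it->second());
    }

private:
    template <typename TDerived>
    static T* createFunc() { return new TDerived(); }

    using PCreateFunc = T* (*)();
    std::map<std::string, PCreateFunc> createFuncs_;
};

// Hypothetical hierarchy for the usage below.
struct BaseClass {
    virtual ~BaseClass() = default;
    virtual int kind() const = 0;
};
struct Descendant1 : BaseClass { int kind() const override { return 1; } };
struct Descendant2 : BaseClass { int kind() const override { return 2; } };
```

Usage mirrors the Qt version: register each type under a name, then call create with that name and check the result for null.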
I mostly agree with the accepted answer, but there is a C++11 option that has not been covered in existing answers:
Return factory method results by value, and
Provide a cheap move constructor.
Example:
struct sandwich {
// Factory methods.
static sandwich ham();
static sandwich spam();
// Move constructor.
sandwich(sandwich &&);
// etc.
};
Then you can construct objects on the stack:
sandwich mine{sandwich::ham()};
As subobjects of other things:
auto lunch = std::make_pair(sandwich::spam(), apple{});
Or dynamically allocated:
auto ptr = std::make_shared<sandwich>(sandwich::ham());
When might I use this?
If, on a public constructor, it is not possible to give meaningful initialisers for all class members without some preliminary calculation, then I might convert that constructor to a static method. The static method performs the preliminary calculations, then returns a value result via a private constructor which just does a member-wise initialisation.
I say 'might' because it depends on which approach gives the clearest code without being unnecessarily inefficient.
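A sketch of that last case: a static method does the preliminary calculation and then hands the results to a private, member-wise constructor. Interval and its around() method are hypothetical names picked for illustration.

```cpp
#include <cassert>

class Interval {
public:
    // Factory method: derives both bounds from a centre and a width.
    static Interval around(double center, double width) {
        const double half = width / 2.0;  // preliminary calculation
        return Interval(center - half, center + half);
    }

    double lo() const { return lo_; }
    double hi() const { return hi_; }

private:
    // Private constructor: plain member-wise initialisation.
    Interval(double lo, double hi) : lo_(lo), hi_(hi) {}

    const double lo_;
    const double hi_;
};
```

A call such as Interval::around(10.0, 4.0) then yields an object that is fully initialised (here with bounds 8 and 12) from the moment it exists, const members included.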
Loki has both a Factory Method and an Abstract Factory. Both are documented (extensively) in Modern C++ Design, by Andrei Alexandrescu. The factory method is probably closer to what you seem to be after, though it's still a bit different (at least if memory serves, it requires you to register a type before the factory can create objects of that type).
I won't try to answer all of your questions, as I believe that would be too broad. Just a couple of notes:
there are cases when object construction is a task complex enough to justify its extraction to another class.
That class is in fact a Builder, rather than a Factory.
In the general case, I don't want to force the users of the factory to be restrained to dynamic allocation.
Then you could have your factory encapsulate it in a smart pointer. I believe this way you can have your cake and eat it too.
This also eliminates the issues related to return-by-value.
Conclusion: Making a factory by returning an object is indeed a solution for some cases (such as the 2-D vector previously mentioned), but still not a general replacement for constructors.
Indeed. All design patterns have their (language specific) constraints and drawbacks. It is recommended to use them only when they help you solve your problem, not for their own sake.
If you are after the "perfect" factory implementation, well, good luck.
This is my C++11-style solution (the deduced auto return types below actually need C++14). The parameter 'base' is the base class of all sub-classes. The creators are std::function objects that create sub-class instances; each might be a binding to a sub-class's static member function 'create(some args)'. This may not be perfect, but it works for me, and it is a kind of 'general' solution.
template <class base, class... params> class factory {
public:
factory() {}
factory(const factory &) = delete;
factory &operator=(const factory &) = delete;
auto create(const std::string name, params... args) {
auto key = your_hash_func(name.c_str(), name.size());
return std::move(create(key, args...));
}
auto create(key_t key, params... args) {
std::unique_ptr<base> obj{creators_[key](args...)};
return obj;
}
void register_creator(const std::string name,
std::function<base *(params...)> &&creator) {
auto key = your_hash_func(name.c_str(), name.size());
creators_[key] = std::move(creator);
}
protected:
std::unordered_map<key_t, std::function<base *(params...)>> creators_;
};
An example on usage.
class base {
public:
base(int val) : val_(val) {}
virtual ~base() { std::cout << "base destroyed\n"; }
protected:
int val_ = 0;
};
class foo : public base {
public:
foo(int val) : base(val) { std::cout << "foo " << val << " \n"; }
static foo *create(int val) { return new foo(val); }
virtual ~foo() { std::cout << "foo destroyed\n"; }
};
class bar : public base {
public:
bar(int val) : base(val) { std::cout << "bar " << val << "\n"; }
static bar *create(int val) { return new bar(val); }
virtual ~bar() { std::cout << "bar destroyed\n"; }
};
int main() {
common::factory<base, int> factory; // assumes the factory template above is declared in namespace common
auto foo_creator = std::bind(&foo::create, std::placeholders::_1);
auto bar_creator = std::bind(&bar::create, std::placeholders::_1);
factory.register_creator("foo", foo_creator);
factory.register_creator("bar", bar_creator);
{
auto foo_obj = std::move(factory.create("foo", 80));
foo_obj.reset();
}
{
auto bar_obj = std::move(factory.create("bar", 90));
bar_obj.reset();
}
}
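Since your_hash_func and key_t are left as placeholders above, a self-contained variation may be easier to experiment with. The sketch below is not the same code: it assumes the name itself can serve as the key, has the creators return std::unique_ptr directly, and returns an empty pointer for unknown names. The base and foo types and demoFactory are hypothetical, simplified versions of the ones in the example.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

template <class Base, class... Params>
class factory {
public:
    using creator = std::function<std::unique_ptr<Base>(Params...)>;

    void register_creator(const std::string& name, creator c) {
        creators_[name] = std::move(c);
    }

    // Returns an empty pointer if no creator is registered under name.
    std::unique_ptr<Base> create(const std::string& name, Params... args) const {
        auto it = creators_.find(name);
        if (it == creators_.end()) {
            return nullptr;
        }
        return it->second(std::forward<Params>(args)...);
    }

private:
    std::unordered_map<std::string, creator> creators_;
};

// Hypothetical hierarchy.
struct base {
    virtual ~base() = default;
    virtual int val() const = 0;
};
struct foo : base {
    explicit foo(int v) : v_(v) {}
    int val() const override { return v_; }
    int v_;
};

int demoFactory() {
    factory<base, int> f;
    f.register_creator("foo", [](int v) {
        return std::unique_ptr<base>(new foo(v));
    });
    assert(f.create("missing", 1) == nullptr);
    return f.create("foo", 80)->val();
}
```

A lambda is used here instead of std::bind; either would work, since both convert to the std::function type stored in the map.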
Factory Pattern
class Point
{
public:
static Point Cartesian(double x, double y);
private:
};
And if your compiler does not support Return Value Optimization, ditch it; it probably does not do much optimization at all...
extern std::pair<std::string_view, Base*(*)()> const factories[2];
decltype(factories) factories{
{"blah", []() -> Base*{return new Blah;}},
{"foo", []() -> Base*{return new Foo;}}
};
I know this question was answered 3 years ago, but this may be what you were looking for.
A couple of weeks ago, Google released a library allowing easy and flexible dynamic object allocation. Here it is: http://google-opensource.blogspot.fr/2014/01/introducing-infact-library.html